Before we move on to manipulating array values stored in the database, it is useful to first learn the basics.
Representing a list
SELECT '{a,b,c}'::text[];
text
----------
{a,b,c}
(1 row)
We have to specify the type. Else postgres would not know the type since it looks like a string.
Or we could also use the ARRAY[]
style
# SELECT ARRAY['a','b','c'];
array
-------
{a,b,c}
SELECT
ing FROM
an array
So why doesn’t the following work?
# SELECT * FROM ARRAY['a','b','c']
ERROR: 42601: syntax error at or near "ARRAY"
LINE 1: SELECT * FROM ARRAY['a','b','c'];
^
LOCATION: scanner_yyerror, scan.l:1134
SELECT FROM
expects that the data source is:
- a set of rows
- OR points to a set of word (table)
ARRAY[]
returns an atomic value. Which is why it works with just SELECT
but fails when FROM
is used.
Convert arrays into rows using UNNEST
# SELECT UNNEST(ARRAY['a','b']);
unnest
--------
a
b
(2 rows)
Convert rows into arrays using ARRAY()
Watch out. We are not using ARRAY[]
. We are using array()
.
SELECT array(
SELECT UNNEST(ARRAY['a','b'])
);
array
-------
{a,b}
(1 row)
To summarize:
UNNEST
converts array into a series of rows.array()
converts a bunch of rows into an array.ARRAY[]
used to represent/define an array.
Concatenate arrays using array_cat()
# SELECT array_cat('{a,b}'::text[], '{b,c}'::text[]);
array_cat
-----------
{a,b,b,c}
(1 row)
We coerced the argument into text arrays to indicate to array_cat
what data types it was dealing with. If we used the ARRAY[]
initializer, this would not be required. Example below.
SELECT array_cat(ARRAY['a','b'], ARRAY['b', 'c']);
Notice that the final array contains b
twice. We would love to remove duplicates. There are two ways to do it and we’ll explore both below.
Removing duplicates from an array using DISTINCT
This method works with integers, text, float, etc
DISTINCT
helps remove duplicates from all data types. But it requires rows as input. Not arrays.
- So first convert an array to rows using
UNNEST
- Pass it to DISTINCT
- And then if you still require an array as output, use
array()
SELECT DISTINCT UNNEST(
'{apple, banana, cat, apple}'::text[]
);
unnest
--------
banana
cat
apple
(3 rows)
SELECT DISTINCT UNNEST(
'{1.245, 1.3, 1.5, 1.2}'::float[]
) ORDER BY 1;
And we have columns. But we need an array back as output. So convert using array()
SELECT array(
SELECT DISTINCT UNNEST(
'{apple, banana, cat, apple}'::text[]
)
);
array
--------------------
{banana,cat,apple}
(1 row)
What if we want it sorted? Use ORDER BY 1
SELECT array(
SELECT DISTINCT UNNEST(
'{apple, banana, cat, apple}'::text[]
) ORDER BY 1
);
ORDER BY 1
would sort by the first column (We could starting with1
here). In this case we are working with on-the-fly arrays, without giving them column names. So we use column number references.
Want to try this with a float array? Let’s do it.
SELECT array(
SELECT DISTINCT UNNEST(
'{1.2, 1.245, 2.3, 1.245, 1.246}'::text[]
)
);
array
-----------------------
{1.2,1.246,2.3,1.245}
(1 row)
Removing duplicates from integer arrays using sort
and uniq
Caveat: This method only works for integer arrays (yeah so no floats too).
sort
and uniq
are functions available in the intarray
extension. Enable the extension in your database by running the following statement.
CREATE EXTENSION IF NOT EXISTS intarray;
Try using the uniq
function.
SELECT uniq(
'{1,2,1,3}'::int[]
);
uniq
-----------
{1,2,1,3}
(1 row)
Oops. Does not work. uniq
only works on a sorted list. So use the sort
function to sort it first.
SELECT uniq(
sort(
'{1,2,1,3}'::int[]
)
);
uniq
---------
{1,2,3}
(1 row)
I haven’t yet benchmarked in detail which method works the fastest for sorting integers. I used both methods as a part of another benchmarking to update 100k rows (coming soon). For now, all I can say is, both aren’t slow. I’ll try a more detailed benchmarking another day.
What else?
These are essentials that I find useful to know. There are a lot more array functions that Postgres supports. Go check it out.
I’ll be sharing more from my notes on PostgreSQL. Subscribe to get updates.
I’m running psql —version 9.4.1
Also, it may be worth noting that rawdata.deptname and rawdata.deptcode are
both text data types.
The errors I’m getting are:
ERROR: syntax error at or near «json_build_object»
LINE 1: insert into json(data) json_build_object(SELECT DISTINCT dep…
and
ERROR: syntax error at or near «row_to_json»
LINE 1: insert into json(data) row_to_json(SELECT DISTINCT deptname,…
As per this advice
<http://stackoverflow.com/questions/24006291/postgresql-return-result-set-as-json-array>
from stack overflow, I also tried running:
INSERT INTO json(data) array_to_json(array_agg(SELECT DISTINCT deptcode,
deptname FROM rawdata));
Which returned:
ERROR: syntax error at or near «array_to_json»
LINE 1: INSERT INTO json(data) array_to_json(array_agg(SELECT DISTIN…
Also tried json_build_array with the same result.
I did try to use commands from the documentation page you linked but I just
don’t understand how I should be building the query. In my head, the query
should basically say, «Build objects from distinct rows in rawdata, push
each object into the array, and then insert the array into the json table.»
I could do it in javascript or python but I’m pretty green when it comes to
SQL. I know it’s probably simple, but I’m having a hell of a time trying to
figure it out.
Anyway, thanks for the suggestion and letting me rubber duck debug off of
you.
On Fri, Mar 27, 2015 at 12:47 PM, Adrian Klaver <adrian(dot)klaver(at)aklaver(dot)com>
wrote:
> On 03/27/2015 10:40 AM, Eli Murray wrote:
>
>> Hi All,
>>
>> I’m trying to create an array of JSON objects from an existing table.
>> I’ve created a new table:
>>
>> «CREATE TABLE json(data json[]);»
>>
>> Now i need to populate it with key value pairs from another table. I’d
>> like to create the objects to look like:
>>
>> { «code»: rawdata.deptcode, «name»: rawdata.deptname }
>>
>> But I’m having difficulty understanding the syntax to do this because I
>> find the docs to be rather difficult to parse.
>>
>> I’ve tried running the query:
>>
>> «INSERT INTO json(data) json_build_object(SELECT DISTINCT deptcode,
>> deptname AS code, name FROM rawdata);»
>>
>> but it returns a syntax error.
>>
>
> That would be?
>
> What version of Postgres?
>
>
>> I’ve also tried running:
>>
>> «INSERT INTO json(data) row_to_json(SELECT DISTINCT deptcode, deptname
>> FROM rawdata);»
>>
>> but no dice.
>>
>> Does anyone know the proper syntax to create an array of JSON objects
>> from an existing table?
>>
>
> json_build_array?
> http://www.postgresql.org/docs/9.4/interactive/functions-json.html
>
>
>> —
>> Senior Web Developer at The Daily Illini
>> ejmurra2(at)illinimedia(dot)com <mailto:ejmurra2(at)illinimedia(dot)com>
>> (815) 985-8760
>>
>
>
> —
> Adrian Klaver
> adrian(dot)klaver(at)aklaver(dot)com
>
—
Senior Web Developer at The Daily Illini
ejmurra2(at)illinimedia(dot)com
(815) 985-8760
Comments
bobbywang000
added a commit
to bobbywang000/nemosyne
that referenced
this issue
Jun 5, 2020
See typeorm/typeorm#1532, since simple-array is stored as text under the hood, we need to set the default for `events` to an empty string, rather than an empty array.
bobbywang000
added a commit
to bobbywang000/nemosyne
that referenced
this issue
Jun 5, 2020
* Add Uniqueness Constraints for DateRange, Entry, Tag DateRange: start/end dates have to be unique Entry: written date and subject date have to be unique Tag: name has to be unique * Make DateRange.title optional * Set default for DateRange.events to empty string See typeorm/typeorm#1532, since simple-array is stored as text under the hood, we need to set the default for `events` to an empty string, rather than an empty array. * Add cascades Set most cascades to ['insert', 'update'] to reduce risk of data loss if a single deletion causes a TON of data to be deleted. Exception: deleting a `Tag` deletes all the associated `Note`s.
kauz
added a commit
to kauz/typeorm
that referenced
this issue
May 25, 2021
AlexMesser
pushed a commit
that referenced
this issue
May 30, 2021
…7681) * fix: correctly strip type conversion in postgres for default values Use better regex to remove only type conversion, not the whole string to the end Closes: 7110 * fix: better regex * fix: modify postgres default regex to not break #1532 * fix: modify postgres default regex to support user-defined types This makes sure updated regex does not break migrations chnages introduced in #7647 * test: add test case for #5132