Wednesday, October 24, 2012

More uses for PostgreSQL arrays

Arrays and the Pg extensions to them are very useful for solving SQL problems that are otherwise tricky to deal with without procedural functions or tortured SQL. There are some good tricks with arrays that're worth knowing about, but aren't always immediately obvious from the documentation. I want to show you a few involving ANY and ALL, intarray, and array indexing.

ANY and ALL are more powerful than you probably expect

The row/array comparison pseudo-operators ANY and ALL are powerful.

They accept arrays or rowsets, so you can apply them directly to an array or to a set-returning subquery. There's no need to unnest an array, nor any need to array_agg a set-returning subquery.

They also work on many more operators than the simple <, >, <=, >=, = and <>. For example, you can use LIKE.

That gives rise to queries like this:

FROM pg_catalog.pg_namespace

Note that NOT LIKE ANY will have potentially unexpected effects; you probably want NOT (a LIKE ANY (...)).

In this particular case a ~ regular-expression match is likely to be faster - but this trick is good for when you have wildcard patterns supplied from elsewhere, like the classic "how do I match a value against LIKE expressions / regular expressions in a column" question.

I learned about LIKE ANY from an answer by maniek on Stack Overflow. It's an informative place.

There's built-in GIN and GiST index support for arrays

There's GIN index support for arrays, allowing indexed searches for some array operators like the contains operator @>.

This index support can be used to do an indexed search for arrays using certain array operators.

For example, you can search for arrays containing a particular element.

intarray has index opclasses and its own query language

The intarray extension is a very useful tool, containing functions like idx(...) to get the position of an element within an integer array.

Less well known is the fact that it also has its own basic query language for searching integer arrays, allowing you to express criteria like "contains 1 and 4 but not 7":

SELECT ARRAY[1,2,3,4,5,6] @@ '1&4&(!7)';

It provides no less than three different index methods (two GiST, one GIN) to assist with these queries and some other intarray operators.

More commonly known but important tools

More widely used and known are the useful arrays tools unnest and array_agg, the array constructor ARRAY[...], support for multidimensional arrays and arrays of composite types, array slices like some_array[1:3], and Pg's ability to sort and compare arrays directly. You probably know about all these already, but if you don't they're really useful, so check them out.

What're your favourite array-related tricks, features and tools?

No comments:

Post a Comment