Software Development

The Awesome PostgreSQL 9.4 / SQL:2003 FILTER Clause for Aggregate Functions

Sometimes when aggregating data with SQL, we’d love to add some additional filters. For instance, consider the following world bank data:

 
 
 
 
 
 
 
 

GDP per capita (current US$)

2009    2010    2011    2012
CA      40,764  47,465  51,791  52,409	
DE      40,270  40,408  44,355  42,598	
FR      40,488  39,448  42,578  39,759	
GB      35,455  36,573  38,927  38,649	
IT      35,724  34,673  36,988  33,814	
JP      39,473  43,118  46,204  46,548	
RU       8,616  10,710  13,324  14,091	
US      46,999  48,358  49,855  51,755

And the table structure:

CREATE TABLE countries (
  code CHAR(2) NOT NULL,
  year INT NOT NULL,
  gdp_per_capita DECIMAL(10, 2) NOT NULL
);

Now, let’s assume we’d like to find the number of countries with a GDP higher than 40,000 for each year.

With standard SQL:2003, and now also with the newly released PostgreSQL 9.4, we can now take advantage of the new FILTER clause, which allows us to write the following query:

SELECT
  year,
  count(*) FILTER (WHERE gdp_per_capita >= 40000)
FROM
  countries
GROUP BY
  year

The above query will now yield:

year   count
------------
2012   4
2011   5
2010   4
2009   4

And that’s not it! As always, you can use any aggregate function also as a window function simply by adding an OVER() clause to the end:

SELECT
  year,
  code,
  gdp_per_capita,
  count(*) 
    FILTER (WHERE gdp_per_capita >= 40000) 
    OVER   (PARTITION BY year)
FROM
  countries

The result would then look something like this:

year   code   gdp_per_capita   count
------------------------------------
2009   CA           40764.00       4
2009   DE           40270.00       4
2009   FR           40488.00       4
2009   GB           35455.00       4

jOOQ 3.6 will also support the new FILTER clause for aggregate functions

Good news for jOOQ users. You can write the same query with jOOQ intuitively as such:

DSL.using(configuration)
   .select(
        COUNTRIES.YEAR,
        count().filterWhere(
            COUNTRIES.GDP_PER_CAPITA.ge(40000)
        ))
   .from(COUNTRIES)
   .groupBy(COUNTRIES.YEAR)
   .fetch();

… and

DSL.using(configuration)
   .select(
        COUNTRIES.YEAR,
        COUNTRIES.CODE,
        COUNTRIES.GDP_PER_CAPITA
        count().filterWhere(
                   COUNTRIES.GDP_PER_CAPITA.ge(40000))
               .over(partitionBy(COUNTRIES.YEAR)))
   .from(COUNTRIES)
   .fetch();

And the best thing is that jOOQ (as usual) emulates the above clause for you if you’re not using PostgreSQL. The equivalent query would be:

SELECT
  year,
  count(CASE WHEN gdp_per_capita >= 40000 THEN 1 END)
FROM
  countries
GROUP BY
  year

jooq-the-best-way-to-write-sql-in-java-small

Read more about what’s new in PostgreSQL 9.4 here

Lukas Eder

Lukas is a Java and SQL enthusiast developer. He created the Data Geekery GmbH. He is the creator of jOOQ, a comprehensive SQL library for Java, and he is blogging mostly about these three topics: Java, SQL and jOOQ.
Subscribe
Notify of
guest

This site uses Akismet to reduce spam. Learn how your comment data is processed.

0 Comments
Inline Feedbacks
View all comments
Back to top button