How to use first_value() window function to get last non null value?

Question

I want to perform a fill-down operation in Postgres.

DDL:

create table brands (
id int,
category varchar(20),
brand_name varchar(20)
);

insert into brands values
 (1,'chocolates','5-star')
,(2,null,'dairy milk')
,(3,null,'perk')
,(4,null,'eclair')
,(5,'Biscuits','britannia')
,(6,null,'good day')
,(7,null,'boost')
,(8,'shampoo','h&s')
,(9,null,'dove')
;

Expected output is:

category	brand_name
chocolates	5-star
chocolates	dairy milk
chocolates	perk
chocolates	eclair
Biscuits	britannia
Biscuits	good day
Biscuits	boost
Shampoo	h&s
Shampoo	dove

I tried using the following script but it doesn't seem to work.

select id,
      first_value(category)
      over(order by case when category is not null then id end desc nulls last) as category,
      brand_name
from brands

Can someone suggest a fix?
In MS SQL the following snippet seems to work fine:

select id,
       first_value (category) IGNORE NULLS
       over(order by id desc
        rows between current row and unbounded following) as category,
       brand_name
FROM brands
ORDER BY id

Take a look at Migrate Your Mindset Too. That just might be the most important thing when migrating to another Database manager. — Belayer, Commented Nov 28, 2022 at 20:44

JHH · Accepted Answer · 2022-11-28 07:06:20Z

5

with cte as (
select id,
       category,
       count(category) over (order by id) as category_id,
       brand_name
  from brands)
select id,
       first_value(category) over (partition by category_id order by id) as category,
       brand_name
  from cte;

UPDATE: added query without CTE per request:

select id,
       (array_agg(category) over (order by id))[max(case when category is null then 0 else id end) over (order by id)] as category,
       brand_name
  from brands;

edited Nov 28, 2022 at 7:06

answered Nov 28, 2022 at 5:12

JHH

1,3891 gold badge3 silver badges9 bronze badges

Thank you for the solution. Although I'm aware of this approach, but what if I do not want to use a cte as an intermediate step. I would just like to use a single window function to solve this(just as the ms sql solution i mentioned)
– Rishav Ghosh
Commented Nov 28, 2022 at 5:37
2

@RishavGhosh: what's wrong with using a CTE?
– user330315
Commented Nov 28, 2022 at 6:44

Add a comment |

Jonas Metzler · Accepted Answer · 2022-11-28 06:46:24Z

I think there's nothing wrong with using a CTE (see JHH's answer) and I would prefer that.

Postgres DB's don't provide this IGNORE NULLS concept of SQLServer DB's, so I guess you should stop thinking you will get a nearly identic query for Postgres DB's like in MS SQL.

Anyway, if you don't want to use a CTE or complex subquery, you can define your own function and aggregation and run this.

Function creation:

-- CREATE your function
CREATE FUNCTION yourFunction(STATE anyelement, VALUE anyelement)
    RETURNS anyelement
    IMMUTABLE PARALLEL safe
AS
$$
SELECT COALESCE(VALUE, STATE); -- Replace NULL values here
$$ LANGUAGE SQL;

Aggregate creation using the function:

-- CREATE your aggregate
CREATE AGGREGATE yourAggregate(ANYELEMENT) (
    sfunc = yourFunction, -- Call your function here
    stype = ANYELEMENT
);

Your query using this aggregate:

SELECT id, 
  yourAggregate(category) -- Call your aggregate here
  OVER (ORDER BY id, category), 
  brand_name
FROM brands
ORDER BY id;

Of course, you should rename both function and aggregate and use more meaningful names.

This will produce the same outcome like the CTE version.

Try out: db<>fiddle

If you are keen on defining and using own functions and you will use that often, you could do that.

Otherwise, just use a CTE, that's quite fine. There is no reason to do not use CTE's.

Always be aware that while using own functions, you are on risk of bad performance, so you should check whether this query is too slow.

Erwin Brandstetter · Accepted Answer · 2022-11-28 06:46:10Z

I am afraid that's not implemented in Postgres (at least up to Postgres 15). The manual about window functions:

The SQL standard defines a RESPECT NULLS or IGNORE NULLS option for lead, lag, first_value, last_value, and nth_value. This is not implemented in PostgreSQL: the behavior is always the same as the standard's default, namely RESPECT NULLS.

So you have to use a workaround with a CTE or subquery like JHH suggested, or roll your own window function (which will be comparatively slow).

See (answer to similar question on dba.SE):

Carry over long sequence of missing values with Postgres

General Grievance · Accepted Answer · 2023-03-08 20:20:09Z

-2

select *, max(category) over (partition by r) 
from(
select *,sum(case when category is not null then 1 end) over (order by rn) r
from
(select *,row_number() over() as rn from brands) l
)p

edited Mar 8, 2023 at 20:20

General Grievance

4,90736 gold badges35 silver badges50 bronze badges

answered Mar 2, 2023 at 7:08

ashin

1

Add a comment |

Collectives™ on Stack Overflow

How to use first_value() window function to get last non null value?

4 Answers 4

Not the answer you're looking for? Browse other questions tagged
sql
postgresql
window-functions
or ask your own question.

Linked

Hot Network Questions

Collectives™ on Stack Overflow

4 Answers 4

Not the answer you're looking for? Browse other questions tagged sqlpostgresqlwindow-functions or ask your own question.

Linked

Related

Not the answer you're looking for? Browse other questions tagged
sql
postgresql
window-functions
or ask your own question.