4

I want to perform a fill-down operation in Postgres.

DDL:

create table brands (
id int,
category varchar(20),
brand_name varchar(20)
);

insert into brands values
 (1,'chocolates','5-star')
,(2,null,'dairy milk')
,(3,null,'perk')
,(4,null,'eclair')
,(5,'Biscuits','britannia')
,(6,null,'good day')
,(7,null,'boost')
,(8,'shampoo','h&s')
,(9,null,'dove')
;

Expected output is:

category brand_name
chocolates 5-star
chocolates dairy milk
chocolates perk
chocolates eclair
Biscuits britannia
Biscuits good day
Biscuits boost
Shampoo h&s
Shampoo dove

I tried using the following script but it doesn't seem to work.

select id,
      first_value(category)
      over(order by case when category is not null then id end desc nulls last) as category,
      brand_name
from brands

Can someone suggest a fix?
In MS SQL the following snippet seems to work fine:

select id,
       first_value (category) IGNORE NULLS
       over(order by id desc
        rows between current row and unbounded following) as category,
       brand_name
FROM brands
ORDER BY id
1
  • 1
    Take a look at Migrate Your Mindset Too. That just might be the most important thing when migrating to another Database manager.
    – Belayer
    Commented Nov 28, 2022 at 20:44

4 Answers 4

5
with cte as (
select id,
       category,
       count(category) over (order by id) as category_id,
       brand_name
  from brands)
select id,
       first_value(category) over (partition by category_id order by id) as category,
       brand_name
  from cte;

UPDATE: added query without CTE per request:

select id,
       (array_agg(category) over (order by id))[max(case when category is null then 0 else id end) over (order by id)] as category,
       brand_name
  from brands;
2
  • Thank you for the solution. Although I'm aware of this approach, but what if I do not want to use a cte as an intermediate step. I would just like to use a single window function to solve this(just as the ms sql solution i mentioned) Commented Nov 28, 2022 at 5:37
  • 2
    @RishavGhosh: what's wrong with using a CTE?
    – user330315
    Commented Nov 28, 2022 at 6:44
2

I think there's nothing wrong with using a CTE (see JHH's answer) and I would prefer that.

Postgres DB's don't provide this IGNORE NULLS concept of SQLServer DB's, so I guess you should stop thinking you will get a nearly identic query for Postgres DB's like in MS SQL.

Anyway, if you don't want to use a CTE or complex subquery, you can define your own function and aggregation and run this.

Function creation:

-- CREATE your function
CREATE FUNCTION yourFunction(STATE anyelement, VALUE anyelement)
    RETURNS anyelement
    IMMUTABLE PARALLEL safe
AS
$$
SELECT COALESCE(VALUE, STATE); -- Replace NULL values here
$$ LANGUAGE SQL;

Aggregate creation using the function:

-- CREATE your aggregate
CREATE AGGREGATE yourAggregate(ANYELEMENT) (
    sfunc = yourFunction, -- Call your function here
    stype = ANYELEMENT
);

Your query using this aggregate:

SELECT id, 
  yourAggregate(category) -- Call your aggregate here
  OVER (ORDER BY id, category), 
  brand_name
FROM brands
ORDER BY id;

Of course, you should rename both function and aggregate and use more meaningful names.

This will produce the same outcome like the CTE version.

Try out: db<>fiddle

If you are keen on defining and using own functions and you will use that often, you could do that.

Otherwise, just use a CTE, that's quite fine. There is no reason to do not use CTE's.

Always be aware that while using own functions, you are on risk of bad performance, so you should check whether this query is too slow.

1

I am afraid that's not implemented in Postgres (at least up to Postgres 15). The manual about window functions:

The SQL standard defines a RESPECT NULLS or IGNORE NULLS option for lead, lag, first_value, last_value, and nth_value. This is not implemented in PostgreSQL: the behavior is always the same as the standard's default, namely RESPECT NULLS.

So you have to use a workaround with a CTE or subquery like JHH suggested, or roll your own window function (which will be comparatively slow).

See (answer to similar question on dba.SE):

-2
select *, max(category) over (partition by r) 
from(
select *,sum(case when category is not null then 1 end) over (order by rn) r
from
(select *,row_number() over() as rn from brands) l
)p

Not the answer you're looking for? Browse other questions tagged or ask your own question.