7

Sample data

CREATE TABLE test
    (id integer, session_ID integer, value integer)
;

INSERT INTO test
    (id, session_ID, value)
VALUES
    (0, 2, 100),
    (1, 2, 120),
    (2, 2, 140),
    (3, 1, 900),
    (4, 1, 800),
    (5, 1, 500)
;

Current query

select 
id,
last_value(value) over (partition by session_ID order by id) as last_value_window,
last_value(value) over (partition by session_ID order by id desc) as last_value_window_desc
from test
ORDER BY id

I was running into a problem with the last_value() window function: http://sqlfiddle.com/#!15/bcec0/2

In the fiddle I am trying to work with the sort direction within the last_value() query.

Edit: The question is not why I don't get the all-time last value, or how to use the frame clause (unbounded preceding and unbounded following). I know about the difference between first_value(desc) and last_value(), and about the problem that last_value() does not give you the all-time last value:

The default frame clause is unbounded preceding until current row. So first_value() always gives the first row within the frame. It doesn't matter whether there is just one row (the frame includes only this one) or one hundred (the frame includes all hundred): the result is always the first one. In DESC order it is the same: DESC changes the sort order, and then the first row is the last value, no matter how many rows you get.

With last_value() the behavior is very similar: If you have one row, it gives you the last value of the default frame clause: This one row. At the second row, the frame clause contains the two rows, the last one is the second. That's why last_value() does not give you the last row of all rows but only the last row until the current row.
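To make the default frame visible, here is a minimal sketch against the sample table above. It assumes PostgreSQL, where a window with ORDER BY and no frame clause behaves like RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW. The column aliases are made up for illustration.

```sql
-- Both columns should be identical: the second one only spells out
-- the frame that the first one uses implicitly.
SELECT
    id,
    last_value(value) OVER (
        PARTITION BY session_ID
        ORDER BY id
    ) AS lv_default_frame,
    last_value(value) OVER (
        PARTITION BY session_ID
        ORDER BY id
        RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
    ) AS lv_explicit_frame
FROM test
ORDER BY id;
```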

But if I change the order to DESC, I expect the last row of all to come first, so I get that one at the first row, then the second-to-last one at the second row, and so on. But that's not the result. Why?

For the current example these are the results for first_value(), first_value(desc), last_value(), last_value(desc) and what I am expecting for the last_value(desc):

 id | fv_asc | fv_desc | lv_asc | lv_desc | lv_desc(expecting)
----+--------+---------+--------+---------+--------------------
  0 |    100 |     140 |    100 |     100 |    140
  1 |    100 |     140 |    120 |     120 |    120
  2 |    100 |     140 |    140 |     140 |    100
  3 |    900 |     500 |    900 |     900 |    500
  4 |    900 |     500 |    800 |     800 |    800
  5 |    900 |     500 |    500 |     500 |    900

To me it seems that the DESC flag of the ORDER BY is ignored in the last_value() call with the default frame clause, but not in the first_value() call. So my question is: Why is the last_value() result the same as the last_value(desc) result?

3
  • Are you using MS SQL Server or Postgresql?
    – jarlh
    Commented Feb 17, 2017 at 13:24
  • (1) Tag with the database you are really using. (2) There is something about last_value() that I don't quite remember, but I always use first_value(). Commented Feb 17, 2017 at 13:29
  • @jarlh Usually I am using PostgreSQL, but it is the same thing on SQL Server.
    – S-Man
    Commented Feb 17, 2017 at 14:28

3 Answers

12

The problem with LAST_VALUE() is that the default rules for windowing clauses remove the values that you really want. This is a very subtle problem and is true in all databases that support this functionality.

This comes from an Oracle blog:

Whilst we are on the topic of windowing clauses, the implicit and unchangeable window clause for the FIRST and LAST functions is ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING, in other words all rows in our partition. For FIRST_VALUE and LAST_VALUE the default but changeable windowing clause is ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW, in other words we exclude rows after the current one. Dropping rows off the bottom of a list makes no difference when we are looking for the first row in the list (FIRST_VALUE) but it does make a difference when we are looking for the last row in the list (LAST_VALUE) so you will usually need either to specify ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING explicitly when using LAST_VALUE or just use FIRST_VALUE and reverse the sort order.

Hence, just use FIRST_VALUE(). This does what you want:

with test (id, session_ID, value) as (
      (VALUES (0, 2, 100),
              (1, 2, 120),
              (2, 2, 140),
              (3, 1, 900),
              (4, 1, 800),
              (5, 1, 500)
      )
     )
select id,
       first_value(value) over (partition by session_ID order by id) as first_value_window,
       first_value(value) over (partition by session_ID order by id desc) as first_value_window_desc
from test
order by id
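
The two options from the quote can be compared side by side. This is only a sketch, assuming the test table from the question and PostgreSQL syntax; the column aliases are made up for illustration. For the sample data both columns return 140 for session 2 and 500 for session 1.

```sql
SELECT
    id,
    -- Option 1: keep last_value() but widen the frame to the whole partition.
    last_value(value) OVER (
        PARTITION BY session_ID
        ORDER BY id
        ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
    ) AS last_value_full_frame,
    -- Option 2: use first_value() and reverse the sort order.
    first_value(value) OVER (
        PARTITION BY session_ID
        ORDER BY id DESC
    ) AS first_value_desc
FROM test
ORDER BY id;
```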
2
  • Hi, thanks, but this problem is already explained, for example here: dba.stackexchange.com/questions/76726/… That wasn't my question, though. I was not asking about the difference between first_value() and last_value(), but about the difference between last_value(order by ... asc) and last_value(order by ... desc). I'll update the question to make it clearer.
    – S-Man
    Commented Feb 17, 2017 at 14:34
  • @S-Man . . . That is the problem you are encountering. Really, it is better to use first_value() all the time and just adjust the sort order to get what you want. The definition of last_value() conflicts with the semantics of the windowing functions. Commented Feb 18, 2017 at 14:28
5

After one year I found the solution:

Take this statement:

SELECT
    id,
    array_accum(value) over (partition BY session_ID ORDER BY id)      AS window_asc,
    first_value(value) over (partition BY session_ID ORDER BY id)      AS first_value_window_asc,
    last_value(value) over (partition BY session_ID ORDER BY id)       AS last_value_window_asc,
    array_accum(value) over (partition BY session_ID ORDER BY id DESC) AS window_desc,
    first_value(value) over (partition BY session_ID ORDER BY id DESC) AS first_value_window_desc,
    last_value(value) over (partition BY session_ID ORDER BY id DESC)  AS last_value_window_desc
FROM
    test
ORDER BY
    id

This gives

id  window_asc     first_value_window_asc  last_value_window_asc  window_desc    first_value_window_desc  last_value_window_desc  
--  -------------  ----------------------  ---------------------  -------------  -----------------------  ----------------------  
0   {100}          100                     100                    {140,120,100}  140                      100                     
1   {100,120}      100                     120                    {140,120}      140                      120                     
2   {100,120,140}  100                     140                    {140}          140                      140                     
3   {900}          900                     900                    {500,800,900}  500                      900                     
4   {900,800}      900                     800                    {500,800}      500                      800                     
5   {900,800,500}  900                     500                    {500}          500                      500           

The array_accum columns show the window that was used. There you can see the first value and the current last value of each window.

The execution plan shows what happens:
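
Note that array_accum is not a built-in PostgreSQL aggregate, so it has to be defined before the statement above will run. Assuming it does nothing more than collect the values into an array, the built-in array_agg should show the same windows. A minimal sketch:

```sql
-- array_agg used as a window aggregate: each row shows the content
-- of its own frame, in the order given by the window's ORDER BY.
SELECT
    id,
    array_agg(value) OVER (PARTITION BY session_ID ORDER BY id)      AS window_asc,
    array_agg(value) OVER (PARTITION BY session_ID ORDER BY id DESC) AS window_desc
FROM test
ORDER BY id;
```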

"Sort  (cost=444.23..449.08 rows=1940 width=12)"
"  Sort Key: id"
"  ->  WindowAgg  (cost=289.78..338.28 rows=1940 width=12)"
"        ->  Sort  (cost=289.78..294.63 rows=1940 width=12)"
"              Sort Key: session_id, id"
"              ->  WindowAgg  (cost=135.34..183.84 rows=1940 width=12)"
"                    ->  Sort  (cost=135.34..140.19 rows=1940 width=12)"
"                          Sort Key: session_id, id"
"                          ->  Seq Scan on test  (cost=0.00..29.40 rows=1940 width=12)"

There you can see that the rows are first sorted by session_id and id for the three ASC window functions.

This gives (as stated in the question):

id  window_asc     first_value_window_asc  last_value_window_asc  
--  -------------  ----------------------  ---------------------  
3   {900}          900                     900                    
4   {900,800}      900                     800                    
5   {900,800,500}  900                     500                    
0   {100}          100                     100                    
1   {100,120}      100                     120                    
2   {100,120,140}  100                     140    

Then you can see another sort: ORDER BY id DESC for the next three window functions. This sort gives:

id  window_asc     first_value_window_asc  last_value_window_asc  
--  -------------  ----------------------  ---------------------  
5   {900,800,500}  900                     500                    
4   {900,800}      900                     800                    
3   {900}          900                     900                    
2   {100,120,140}  100                     140                    
1   {100,120}      100                     120                    
0   {100}          100                     100                        

With this sorting the DESC window functions are executed. The array_accum column shows the resulting windows:

id  window_desc    
--  -------------  
5   {500}          
4   {500,800}      
3   {500,800,900}  
2   {140}          
1   {140,120}      
0   {140,120,100}  

The resulting (first_value DESC and) last_value DESC is now absolutely identical to the last_value ASC:

id  window_asc     last_value_window_asc  window_desc    last_value_window_desc  
--  -------------  ---------------------  -------------  ----------------------  
5   {900,800,500}  500                    {500}          500                     
4   {900,800}      800                    {500,800}      800                     
3   {900}          900                    {500,800,900}  900                     
2   {100,120,140}  140                    {140}          140                     
1   {100,120}      120                    {140,120}      120                     
0   {100}          100                    {140,120,100}  100    

Now it became clear to me why last_value ASC is equal to last_value DESC. It is because of the second sort for the window functions, which gives an inverted window.

(The last sort in the execution plan is the final ORDER BY of the statement.)

As a little bonus: This query shows a little optimization potential. If you call the DESC windows first and then the ASC ones, you do not need the third sort, because the rows are already in the right order at that point.
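
The same observation can be put another way. With the default frame the window always ends at the current row, so last_value() returns the current row's own value whichever direction the window is sorted in. A minimal sketch, assuming id is unique within each session (as in the sample data); the column aliases are made up for illustration:

```sql
-- Both comparison columns should be true for every row, because the
-- last row of each default frame is the current row itself.
SELECT
    id,
    value,
    last_value(value) OVER (PARTITION BY session_ID ORDER BY id)      = value AS asc_is_current_row,
    last_value(value) OVER (PARTITION BY session_ID ORDER BY id DESC) = value AS desc_is_current_row
FROM test
ORDER BY id;
```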

0

Check how the window frame is defined. This example might help:

select 
    id,
    last_value(value) over (
        partition by session_id
        order by id
    ) as lv_asc,
    last_value(value) over (
        partition by session_id
        order by id desc
    ) as lv_desc,
    last_value(value) over (
        partition by session_id
        order by id
        rows between unbounded preceding and unbounded following
    ) as lv_asc_unbounded,
    last_value(value) over (
        partition by session_id
        order by id desc
        rows between unbounded preceding and unbounded following
    ) as lv_desc_unbounded
from test
order by id;

 id | lv_asc | lv_desc | lv_asc_unbounded | lv_desc_unbounded 
----+--------+---------+------------------+-------------------
  0 |    100 |     100 |              140 |               100
  1 |    120 |     120 |              140 |               100
  2 |    140 |     140 |              140 |               100
  3 |    900 |     900 |              500 |               900
  4 |    800 |     800 |              500 |               900
  5 |    500 |     500 |              500 |               900
1
  • 1
    Hi, thanks. My problem is: Why do the columns lv_asc and lv_desc give the same result? For lv_desc I am expecting the order 140, 120, 100, 500, 800, 900, because the window is ordered the other way round.
    – S-Man
    Commented Feb 20, 2017 at 7:24
