12

Say I have table gps like:

CREATE TABLE gps(
   gps_id serial primary key,
   measured_timestamp timestamp, 
   geom Geometry(Point,4326),
   stop_id int --Null
);

And I have a table of stops like:

CREATE TABLE stops(
   stop_id serial primary key,
   geom Geometry(Point,4326)
);

If I want to do an UPDATE on gps to find the nearest stop to each point, is there a way to use a LATERAL query?

I tried something like

UPDATE gps
SET stop_id = nearest.stop_id
FROM LATERAL(SELECT stop_id FROM stops ORDER BY stops.geom <-> gps.geom LIMIT 1) nearest

but that told me

ERROR:  invalid reference to FROM-clause entry for table "gps"
                                               ^
HINT:  There is an entry for table "gps", but it cannot be referenced from this part of the query.

So is the only way to do?

UPDATE gps
SET stop_id = nearest.stop_id
FROM (SELECT gps.gps_id, stop_id 
      FROM gps
      LATERAL JOIN (SELECT stop_id FROM stops ORDER BY stops.geom <-> gps.geom LIMIT 1) stops) nearest
WHERE nearest.gps_id = gps.gps_id

This feels like joining the same table to itself, which wouldn't need to happen with a SELECT INTO

2 Answers 2

19

No need for JOIN LATERAL (or do you really just want to use it?); an UPDATE will pass each processing row to the following query, which is the same concept as using a JOIN LATERAL.[*]

Try

UPDATE gps
  SET  stop_id = (
    SELECT stops.stop_id
    FROM   stops
    ORDER BY
           gps.geom <-> stops.geom
    LIMIT  1
  )
;

[*] Just to give an example on that; consider a SELECT instead to find the closest stop to each gps point using JOIN LATERAL:

SELECT a.gps_id,
       a.measured_timestamp,
       a.geom,
       b.stop_id
FROM   gps AS a
CROSS JOIN LATERAL (
  SELECT stops.stop_id
  FROM   stops
  ORDER BY
         a.geom <-> stops.geom
  LIMIT  1
) AS b
;

Each row in gps is now passed individually and subsequentially to the JOIN LATERAL sub-query to be processed; this (sort of) mimicks the UPDATE command (note how it is the same sub-query).

8
  • 1
    Follow up question: if I wanted to SET multiple columns this way would I move the subquery to FROM or would I repeat the subquery for each column?
    – raphael
    Commented Apr 5, 2018 at 18:33
  • 1
    @raphael damn, accidentally deleted my comment, so again: move to FROM (...you got me there, I'm not 100% sure, but I guess that you can assign multiple SET columns if the subquery returns only one distinct value per column, as with the stop_id in the example, without a WHERE condition)
    – geozelot
    Commented Apr 5, 2018 at 19:18
  • 1
    I'll play around with it and report back.
    – raphael
    Commented Apr 5, 2018 at 19:59
  • 2
    @raphael got it: ... SET (<col1>, <col2>) = (<query>) where <query> (as above) returns two columns in respective order, having one distinct value each..
    – geozelot
    Commented Apr 6, 2018 at 10:53
  • 1
    @DPSSpatial I think that´s in fact a common practice for large tables since an UPDATE is resource heavy most of the time. The JOIN LATERAL will add up to this, maybe using a CTE can be helpful? I´m no expert here, but maybe it´s worth checking your work_mem value to minimize temporary disk storage etc. Also, on large tables, it´s at least faster, maybe even less demanding, to create a new table instead (I read that a long while ago, can´t find source)
    – geozelot
    Commented May 24, 2018 at 18:27
0

And if you want to store this minimal distance in your gps table you can use this code:

ALTER TABLE gps
ADD min_distance float8;

UPDATE gps
SET min_distance = list.distance
FROM(
    SELECT gps.gps_id AS name, MIN(gps.geom <-> stops.geom) AS distance
    FROM gps
    CROSS JOIN stops
    GROUP BY gps.gps_id
    ORDER BY gps.gps_id)
    AS list
WHERE gps.gps_id = list.name;

Not the answer you're looking for? Browse other questions tagged or ask your own question.