In a point layer with the fields "POB" (the population value) and "CATEG" (the municipality id), my goal is to connect each point of a municipality to the closer of the two points with the highest "POB" values in that same municipality.

The expression I'm working with is the following, but it lacks two features: grouping by "CATEG" and limiting the targets to the two highest-value points:

make_line(
  $geometry,
  closest_point(
    aggregate(
      layer:='POINTSGROUP',
      aggregate:='collect' ,
      expression:=$geometry,
      filter:="POB" < maximum( attribute(@parent, 'POB'))
    ),
    $geometry
  )
)

The result it gives is the following but it is wrong:

enter image description here

I have simulated the goal with the correct result:

enter image description here

The idea behind this objective, I think, can be synthesized in two steps:

  1. For each municipality (A,B), detect the two points with the highest value of the attribute "POB"
  2. Then, connect the other points of each municipality (A,B) to one of the two points (with higher value) according to their proximity
  • 1
    Can you share the dataset you used to create the screenshot? To be clear: each point is connected to just one other point - right? And you want to connect only to points inside the same polygon (A,B)? So A and B are represented in the CATEG attribute? And: why is e.g. point 1627 (left side of your screenshot) connected to 2359 and not to 1660? Please make the criteria more explicit: under which conditions should a point be connected to which other point?
    – Babel
    Commented Jun 6, 2023 at 9:55
  • 1
    I add all the requested information: Data: drive.google.com/file/d/1cGQgr5fTGYygcwBwC7w-kdSxGXLuOAfl/… Clarifications: - Correct, each point in the municipality (A or B) is connected to one of the two points with the highest value. - Correct, A and B are represented in the "CATEG" attribute - Point 1627 is connected to point 2359 because it is the point with the highest value closest point Commented Jun 6, 2023 at 10:47
  • 2
    Is 1928, in municipality B, as one of the "hubs" a mistake, should it not be 2030?
    – Matt
    Commented Jun 6, 2023 at 13:46
  • Oh yes, my mistake in not looking carefully and drawing the simulation. In municipality B the two highest values are 2030 and 2359. I'm sorry if this has complicated the understanding of the goal Commented Jun 6, 2023 at 14:33

1 Answer

Use this expression. Line 3 stores the second-largest pob value per category in the variable @max: it collects an array of all pob values for the current CATEG value, sorts it in descending order and takes the 2nd value (index [1]). The filter later keeps only features whose pob is greater than or equal to this value. The overlay_nearest() function then finds the nearest of those features:

with_variable(
    'max',
    array_sort (array_agg (pob,group_by:=CATEG),0)[1],  
with_variable(
    'cat',
    CATEG,
make_line(
    $geometry,
    eval('
        overlay_nearest(
            @layer,
            $geometry,
            filter:=pob >= ' || @max || ' and CATEG = ''' || @cat || '''
        )
    ')[0]
)))

The expression applied to your dataset (blue = category A, red = category B; black dotted line = mid-line between the two largest values per category):

enter image description here


Edit:

The challenge in this expression is how to include a filter condition that compares the attribute value of the feature currently evaluated inside the overlay_nearest() function not to a fixed value, but to a dynamic expression which, as here, is based on aggregate functions. In other words, the challenge is to include the parent feature or other features (when aggregating). You can't include this directly in the filter, so you have to use a trick: concatenate the whole overlay_nearest() call as a text string and then evaluate it with eval() - see here: gis.stackexchange.com/a/415248/88814.

Especially tricky is that the dynamically calculated part (referring to the parent/aggregated features) has to remain outside the string so that it is calculated correctly (on the parent feature, or on other features when using aggregate functions) and returns the desired value(s). So for clarity, the values are created as variables @max and @cat outside the overlay_nearest() function, and the variables are then inserted between the string parts by concatenating the different parts with pipes ||.

On top of this, the value stored in the @cat variable has to be passed as a quoted string inside the concatenated text, so you have to use no fewer than three single quotes ''' in a row: two single quotes inside a string produce a literal quote without ending the string, and the third one closes (or opens) the string part.
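To see what text eval() ends up receiving, the concatenation can be mimicked in plain Python. This is only an illustrative sketch, not QGIS code: quote_literal and build_overlay_expression are hypothetical helper names, and 2030 / 'B' are example values for municipality B taken from the comments. Doubling embedded quotes is an extra precaution that the expression in this answer does not take.

```python
def quote_literal(value: str) -> str:
    """Wrap a value in single quotes, doubling any embedded single quote.

    This mirrors the quoting rule described above, where two single
    quotes inside a string stand for one literal quote.
    """
    return "'" + value.replace("'", "''") + "'"


def build_overlay_expression(threshold: float, category: str) -> str:
    """Return the text that would be handed to eval() for one feature.

    threshold plays the role of @max and category the role of @cat;
    both are computed outside the string and only then pasted in.
    """
    return (
        "overlay_nearest(@layer, $geometry, "
        f"filter:=pob >= {threshold} and CATEG = {quote_literal(category)})"
    )


expr = build_overlay_expression(2030, "B")
print(expr)
```

The printed text contains the already-computed threshold and the quoted category, which is why the filter no longer needs to reach back to the parent feature.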

An alternative expression, equivalent to the first one, that avoids the variables and puts everything into the string concatenation is:

make_line(
    $geometry,
    eval('
        overlay_nearest(
            @layer,
            $geometry,
            filter:= pob >= ' || to_string(array_sort (array_agg (pob,group_by:=CATEG),0)[1])|| 
            ' and CATEG = ''' || CATEG || '''
        )
    ')[0]
)
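To cross-check the result outside QGIS, the underlying logic (per category, take the two highest pob values as hubs, then link every other point to the nearer hub) can be sketched in plain Python. This is a rough sketch under assumptions: the function name connect_to_hubs, the dictionary keys and the sample coordinates are all made up for illustration, distances are planar, ties in pob are not treated specially, and the hubs themselves are left unconnected, which may differ from what the expression draws.

```python
from collections import defaultdict
from math import hypot


def connect_to_hubs(points):
    """Map each non-hub point id to the id of the nearest hub in its category.

    points: list of dicts with the (hypothetical) keys id, x, y, pob, categ.
    """
    by_cat = defaultdict(list)
    for p in points:
        by_cat[p["categ"]].append(p)

    links = {}
    for members in by_cat.values():
        # The two points with the highest pob in this category act as hubs.
        hubs = sorted(members, key=lambda p: p["pob"], reverse=True)[:2]
        hub_ids = {h["id"] for h in hubs}
        for p in members:
            if p["id"] in hub_ids:
                continue  # hubs are not linked in this sketch
            nearest = min(
                hubs, key=lambda h: hypot(h["x"] - p["x"], h["y"] - p["y"])
            )
            links[p["id"]] = nearest["id"]
    return links


# Made-up sample data, not the asker's dataset.
points = [
    {"id": 1, "x": 0, "y": 0, "pob": 100, "categ": "A"},
    {"id": 2, "x": 10, "y": 0, "pob": 90, "categ": "A"},
    {"id": 3, "x": 1, "y": 1, "pob": 10, "categ": "A"},
    {"id": 4, "x": 9, "y": 1, "pob": 20, "categ": "A"},
    {"id": 5, "x": 0, "y": 5, "pob": 500, "categ": "B"},
    {"id": 6, "x": 0, "y": 9, "pob": 400, "categ": "B"},
    {"id": 7, "x": 9.5, "y": 0, "pob": 5, "categ": "B"},
]
links = connect_to_hubs(points)
print(links)
```

In the sample, point 7 lies right next to hub 2 but belongs to category B, so it is linked to a B hub instead; this is the grouping by CATEG that the original attempt in the question was missing.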
  • 1
    Needless to say, this solution is perfectly suited to my project. In this solution and in other experts' contributions, I find the use of the general function 'with_variable' in combination with the geometric functions 'overlay_nearest' and 'make_line' both great and a learning challenge. Commented Jun 6, 2023 at 18:55
  • 1
    Excellent explanation of how this dynamic expression works and how we can use its valuable tricks. Thank you very much for sharing this explanation, which will surely help us start to understand the possibilities of this type of expression. Commented Jun 6, 2023 at 19:45
  • 1
    I included the additional explanations in the revised answer and deleted it here in the comments so that it can be found easier.
    – Babel
    Commented Jun 7, 2023 at 7:14
