Pairing cartesian coordinates for minimum distance

Question

I have n start locations, defined by their x,y coordinates on a two-dimensional plane.

I have a further n destination locations (again, as x,y coordinates) that differ from the start locations.

All locations are effectively random.

I'd like to pair each start location with one of the destination locations, in such a way as to minimize the sum of the distances between the start and destination locations for each pair.

I have multiple sets of points I'd like to solve, with N<=15.

I've tried to use Excel to solve. I can calculate the distance between any pair of x,y coordinates by: =SQRT((x1-x2)^2+(y1-y2)^2)) I thought I'd just list the start locations and then permutate the list of destination locations while summing the distance results. The trouble with that approach is that the number of permutations for the sort order of a list of 15 items is 15 factorial which is a discouragingly huge number (over a trillion). Any ideas how to approach this?

Welcome to MSE. For some basic information about writing mathematics at this site see, e.g., basic help on mathjax notation, mathjax tutorial and quick reference, main meta site math tutorial and equation editing how-to. — José Carlos Santos, Commented Aug 22, 2022 at 7:17
This looks more like a programming / algorithmic problem than just a mathemtical one. Did you consider asking the question at Stack Overflow? — CiaPan, Commented Aug 22, 2022 at 7:18
Thanks for these comments. I agree it's more of algorithmic problem so I may ask at Stack Overflow. Also I'll use the recommended mathjax notation in future instead of Excel formulas. — Okanagan, Commented Aug 22, 2022 at 7:27
The sum of minimum distances is also a minimum. Any restrictions about this pairing? The destination point can be the same for two start locations? — Cesareo, Commented Aug 22, 2022 at 7:50
@Cesareo Given equal numbers of 'sources' and 'destinations' I presumed the connections need to be one-to-one. This also agrees with the OP's approach of permuting one of the two vectors. Anyway, you're right that this should be stated more clearly. — CiaPan, Commented Aug 22, 2022 at 11:57

Cesareo · Accepted Answer · 2022-08-25 15:23:37Z

Follows a python scrip implementing a Genetic Algorithm to handle this combinatorics optimization problem. The script has a minimum documentation and the main program bulk was obtained from a repository. Essential changes were introduced modeling the crossover operation. The script was tested with the furnished example. No guarantee of the best solution because GA's have random nature but the best (or a near best) solution can be obtained after some trials. Sorry for the heresies in programming: I am a novice regarding python programming.

# genetic algorithm search of the one min optimization problem
from numpy.random import randint
from numpy.random import rand
import numpy as np
import random
import math
import matplotlib.pyplot as plt


VOID = -1
count = 1
history = []
max_value = 100000000

np.random.seed(0)
# define the total iterations
n_iter = 1000
# bits
n_bits = 12
# define the population size
n_pop = 500
# crossover rate
r_cross = 0.9
# mutation rate
basis  = list(range(n_bits))
r_mut  = 1.0 / float(n_bits) / 2
#start  = [rand(2).tolist() for _ in range(n_bits)]
#end    = [rand(2).tolist() for _ in range(n_bits)]

start =  [[0,7], [0,17], [6,1], [8,7], [16,17], [24,7], [24,17], [40,7], [50,7], [50,15], [52,20], [60,11]]
end   =  [[2,5], [2,16], [6,24], [10,22], [16,26], [28,16], [20,26], [40,22], [44,6], [48,28], [54,30], [38,24]]
# objective function
def sod(ind):
    global count
    global history
    global max_value
    (ind_s, ind_e) = ind
    s = 0   
    for k in range(n_bits):
        (xs, ys) = start[ind_s[k]]
        (xe, ye) = end[ind_e[k]]
        dx   = xs - xe
        dy   = ys - ye
        dist = math.sqrt(dx*dx + dy*dy)
        s   += dist
    
    if s < max_value:
        max_value = s
        history.append([np.log(count), max_value])
    
    count += 1
    return s

# tournament selection
def selection(pop, scores, k=3):
    # first random selection
    selection_ix = randint(len(pop))
    for ix in randint(0, len(pop), k-1):
        # check if better (e.g. perform a tournament)
        if scores[ix] < scores[selection_ix]:
            selection_ix = ix
    return pop[selection_ix]

def maintain(sc2r, c1):
    hit = len(sc2r)*[0]
    k = 0
    for bit in sc2r:
        if bit in c1:
            hit[k] = bit
            c1 = list(set(c1)-set([bit]))
        else:
            hit[k] = VOID
        k += 1
    if len(c1) == 0:
        return (hit, c1)
    j = 0
    c1 = random.sample(c1,len(c1))
    for k in range(len(hit)):
        if hit[k] == VOID:
            hit[k] = c1[j]
            j += 1
    return (hit, c1)


def crossover(p1, p2, r_cross):
    (inds1,inde1) = p1
    (inds2,inde2) = p2
    # check for recombination
    if rand() < r_cross:
        # select crossover point that is not on the end of the string
        pt = randint(1, len(inds1)-2)
        # perform crossover
        sc1 = inds1[:pt] 
        sc2 = inds2[:pt]
        sc1r = inds1[pt:] 
        sc2r = inds2[pt:]
        c1 = list(set(basis) - set(sc1))
        c2 = list(set(basis) - set(sc2))
        (hit1, c1) = maintain(sc2r,c1)
        (hit2, c2) = maintain(sc1r,c2)
        inds1 = sc2 + hit2
        inds2 = sc1 + hit1
        ec1 = inde1[:pt] 
        ec2 = inde2[:pt]
        ec1r = inde1[pt:]
        ec2r = inde2[pt:]
        c1 = list(set(basis) - set(ec1))
        c2 = list(set(basis) - set(ec2))
        (hit1, c1) = maintain(ec2r,c1)
        (hit2, c2) = maintain(ec1r,c2)
        inde1 = ec2 + hit2
        inde2 = ec1 + hit1
    return [[inds1, inde1],[inds2, inde2]]

# mutation operator
def mutation(bitstring, r_mut):
    (inds, inde) = bitstring
    for i in range(n_bits):
        # check for a mutation
        if rand() < r_mut:
            # flip the bit
            temp = inds[i]
            inds[i] = inds[temp]
            inds[temp] = temp
    for i in range(n_bits):
        # check for a mutation
        if rand() < r_mut:
            # flip the bit
            temp = inde[i]
            inde[i] = inde[temp]
            inde[temp] = temp

# genetic algorithm
def genetic_algorithm(objective, n_bits, n_iter, n_pop, r_cross, r_mut):
    # initial population of random bitstring
    pop = []
    for k in range(n_pop):
        pop.append([random.sample(basis,n_bits),random.sample(basis,n_bits)])
    # keep track of best solution
    (best, best_eval) = (0, objective(pop[0]))
    # enumerate generations
    for gen in range(n_iter):
        # evaluate all candidates in the population
        scores = [objective(c) for c in pop]
        # check for new best solution
        for i in range(n_pop):
            if scores[i] < best_eval:
                (best, best_eval) = (pop[i], scores[i])
                #print(">%d, new best f(%s) = %.3f" % (gen,  pop[i], scores[i]))
        # select parents
        selected = [selection(pop, scores) for _ in range(n_pop)]
        # create the next generation
        children = list()
        for i in range(0, n_pop, 2):
            # get selected parents in pairs
            (p1, p2) = (selected[i], selected[i+1])
            # crossover and mutation
            for c in crossover(p1, p2, r_cross):
                # mutation
                mutation(c, r_mut)
                # store for next generation
                children.append(c)
        # replace population
        pop = children
    return [best, best_eval]


# perform the genetic algorithm search
(best, score) = genetic_algorithm(sod, n_bits, n_iter, n_pop, r_cross, r_mut)
print('Done!')
print('f(%s) = %f' % (best, score))


startT = np.array(start).T.tolist()  
startx = startT[0] 
starry = startT[1]    
plt.plot(startx, starty, 'ro')
endT = np.array(end).T.tolist()  
endx = endT[0] 
endy = endT[1]    
pkt.plot(endx, endy, 'bo')
bs = best[0]
be = best[1]

for k in range(n_bits):
    inds = bs[k]
    inde = be[k]
    plt.plot([startx[inds], endx[inde]],[starty[inds],endy[inde]],'k-')

plt.axis([-1, 61, -1, 31])
plt.show()

historyT = np.array(history).T.tolist() 
hx = historyT[0]
hy = historyT[1]
plt.plot(hx,hy,'k-')
plt.show()

Follows the pairing

$$ [[2, 0, 1, 3, 4, 5, 6, 9, 8, 7, 11, 10], [0, 1, 2, 3, 4, 5, 6, 7, 10, 11, 8, 9]] $$

RobPratt · Accepted Answer · 2022-08-25 15:55:58Z

This is the linear assignment problem, and it can be solved in polynomial time via specialized algorithms or linear programming.

Here is SAS code to solve the problem for your sample data. The first approach uses the linear programming solver, and the second approach uses the linear assignment problem algorithm implemented in the network solver.

data SourceData;
   input x y @@;
   source = compress('Source_'||_N_);
   datalines;
0 7  0 17  6 1  8 7  16 17  24 7  24 17  40 7  50 7  50 15  52 20  60 11
;

data DestinationData;
   input x y @@;
   destination = compress('Destination_'||_N_);
   datalines;
2 5  2 16  6 24  10 22  16 26  28 16  20 26  40 22  44 6  48 28  54 30  38 24
;

data PlotData;
   set SourceData(in=IsSource) DestinationData;
   group = IsSource;
run;

proc optmodel;
   /* declare parameters and read data */
   set <str> SOURCES;
   set <str> DESTINATIONS;
   num xs {SOURCES}, ys {SOURCES};
   num xd {DESTINATIONS}, yd {DESTINATIONS};
   read data SourceData into SOURCES=[source] xs=x ys=y;
   read data DestinationData into DESTINATIONS=[destination] xd=x yd=y;
   num cost {s in SOURCES, d in DESTINATIONS} = sqrt((xs[s] - xd[d])^2 + (ys[s] - yd[d])^2);

   /* declare linear programming (LP) problem */
   var Assign {SOURCES, DESTINATIONS} >= 0;
   min TotalCost = sum {s in SOURCES, d in DESTINATIONS} cost[s,d] * Assign[s,d];
   con AssignSource {s in SOURCES}:
      sum {d in DESTINATIONS} Assign[s,d] = 1;
   con AssignDestination {d in DESTINATIONS}:
      sum {s in SOURCES} Assign[s,d] = 1;

   /* call LP solver */ 
   solve;

   /* create output data set */
   create data SolutionData from 
      [source destination]={s in SOURCES, d in DESTINATIONS: Assign[s,d].sol > 0.5}
      x1=xs[s] y1=ys[s] x2=xd[d] y2=yd[d]
      function='line' drawspace='datavalue';
quit;

/* plot solution */
proc sgplot data=PlotData sganno=SolutionData noautolegend;
   scatter x=x y=y / group=group;
run;



proc optmodel;
   /* declare parameters and read data */
   set <str> SOURCES;
   set <str> DESTINATIONS;
   num xs {SOURCES}, ys {SOURCES};
   num xd {DESTINATIONS}, yd {DESTINATIONS};
   read data SourceData into SOURCES=[source] xs=x ys=y;
   read data DestinationData into DESTINATIONS=[destination] xd=x yd=y;
   set LINKS = SOURCES cross DESTINATIONS;
   num cost {<s,d> in LINKS} = sqrt((xs[s] - xd[d])^2 + (ys[s] - yd[d])^2);
   set <str,str> ASSIGNMENTS;

   /* call network solver */
   solve with network / linear_assignment direction=directed links=(weight=cost) out=(assignments=ASSIGNMENTS);

   /* create output data set */
   create data SolutionData from 
      [source destination]={<s,d> in ASSIGNMENTS}
      x1=xs[s] y1=ys[s] x2=xd[d] y2=yd[d]
      function='line' drawspace='datavalue';
quit;

/* plot solution */
proc sgplot data=PlotData sganno=SolutionData noautolegend;
   scatter x=x y=y / group=group;
run;

The solution found by @Cesareo is indeed optimal:

Thanks for the linear assignment problem link. It led me to Combinatorial optimization which is a more general description of my problem. But all this has helped me realise that I'd best just stick with what I had already done: plot the sets of points and connect pairs by visually looking for what appears to give a minimum total distance. That will be good enough for my purposes. — Okanagan, Commented Aug 23, 2022 at 11:04
If you provide one such set of points, I’ll find an optimal solution. — RobPratt, Commented Aug 23, 2022 at 12:56
That would be great, thanks. Here's one set of 12 source points: 0,7 0,17 6,1 8,7 16,17 24,7 24,17 40,7 50,7 50,15 52,20 60,11 And their destination points: 2,5 2,16 6,24 10,22 16,26 28,16 20,26 40,22 44,6 48,28 54,30 38,24 — Okanagan, Commented Aug 25, 2022 at 5:32
I chose the Python solution because I don't have access to SAS. — Okanagan, Commented Aug 27, 2022 at 2:33

Stack Exchange Network

Pairing cartesian coordinates for minimum distance

2 Answers 2

You must log in to answer this question.

Not the answer you're looking for? Browse other questions tagged
optimization
permutations
discrete-optimization
matching-theory
coordinates
.

Hot Network Questions

Pairing cartesian coordinates for minimum distance

2 Answers 2

You must log in to answer this question.

Not the answer you're looking for? Browse other questions tagged optimizationpermutationsdiscrete-optimizationmatching-theorycoordinates.

Related

Hot Network Questions

Not the answer you're looking for? Browse other questions tagged
optimization
permutations
discrete-optimization
matching-theory
coordinates
.