Google-OR Tools: Assignment minimizing overlapping members

Question

I'm rusty on constraint optimization and am looking for help in this particular use case. There are individuals who are each member to several teams. This is a fixed many-to-many relationship and is determined a-priori. There are 3 time slots where the teams can be scheduled to conduct a business meeting, but if an individual is a member of more than one team which are both meeting at a given time slot, they'll only be able to attend one. The objective is to schedule the teams into the time slots, minimizing the number of overlaps of individuals.

I've started with a simple Boolean array for the model:

    var model = new CpModel();

    var x = new BoolVar[_timeSlots.Length, _teams.Length];
    foreach (var s in _timeSlots)
    {
        foreach (var t in _teams)
        {

            x[s, t] = model.NewBoolVar($"teamMeets[{s},{t}]");
        }
    }

Teams may only meet once:

    foreach (var t in _teams)
    {
        var teamMeet = new List<ILiteral>();
        foreach (var s in _timeSlots)
        {
            teamMeet.Add(x[s, t]);
        }
        model.AddExactlyOne(teamMeet);
    }

I'm hazy on how to proceed to minimize the overlap - I'm thinking it should be an IntVar containing the number of individuals overlapping between the teams, and then add a model.Minimize objective. But I could use a hint on how I might correlate the time slots, teams, and individuals in this.

I struggle to see the complete picture. Are members a-priori assigned to teams or is this a decision to be made? Sounds like it's fixed. But then it doesn't make much sense to throw a solver at it as there are only 3 time-slots and enumeration will be much more efficient or not? What are those team-meetings? A tournament-schedule? Round-robin? I guess there isn't much to achieve with only 3 time-slots? Finally: Wouldn't it be easier to reason about the problem starting with a bool-(3d-)tensor of time, team, member: i guess all cons are then 1d/2d indexing sum-bounds. A-priori fixings possible. — sascha, Commented Oct 15, 2023 at 3:03
Thanks @sascha, I’ve adjusted the phrasing to clarify. The individual-team relationship is fixed beforehand, and the meetings are internal to each team - like a business meeting, not a sports meet. Hope that clarifies it. I was considering adding the third dimension of ‘member’ to the BoolVar, but unsure of how to go about keeping constrains set between ‘team’ and ‘member’ when solving. — James Kolpack, Commented Oct 15, 2023 at 15:15

Erwin Kalvelagen · Accepted Answer · 2023-10-30 13:20:08Z

It is often a good idea to split this task into two parts:

Developing a mathematical model
Implement the model (coding)

Often, if you do a good job on (1), (2) becomes a breeze.

Here is my attempt.

Indices: $$\begin{aligned}& i && \text{individuals}\\& t &&\text{teams} \\ & s && \text{time slots} \end{aligned}$$

Data: $$\color{darkblue}m_{i,t}=\begin{cases}1 & \text{if individual $i$ is member of team $t$} \\ 0 & \text{otherwise}\end{cases}$$

Binary variables: $$\color{darkred}x_{t,s} = \begin{cases}1 & \text{if meeting of team $t$ takes place at time slot $s$} \\ 0 & \text{otherwise}\end{cases}$$ and $$\color{darkred}y_{i,s} = \begin{cases}1 & \text{if individual $i$ has meetings at timeslot $t$} \\ 0 & \text{otherwise}\end{cases}$$

My idea is to use as objective: maximize the number of meetings each individual can attend. This is the same as minimizing the number of overlaps, except that it is easier to model. The complete MIP model can look like:

$$ \begin{aligned}\max &\sum_{i,s}\color{darkred}y_{i,s} \\ & \sum_s \color{darkred}x_{t,s}=1&&\forall t \\ & \color{darkred}y_{i,s}\le \sum_t \color{darkblue}m_{i,t}\cdot \color{darkred}x_{t,s}&&\forall i,s \\ & \color{darkred}x_{t,s} \in \{0,1\} \\ &\color{darkred}y_{i,s} \in \{0,1\} \end{aligned} $$

The first constraint says: a team meeting has to occur exactly once. The second constraint says: if not a member of any teams that meet during time slot $s$, then individual $i$ is not attending that time slot.

Using some random data:

----     23 SET member  team membership

                   team1       team2       team3       team4       team5       team6

individual1          YES                                                         YES
individual2                                  YES
individual3                                  YES                     YES
individual4                                                          YES         YES
individual5                                  YES
individual6          YES                     YES
individual7                                                                      YES
individual8                      YES                     YES
individual10         YES         YES                                 YES
individual11         YES         YES                                 YES         YES
individual12                                                         YES         YES
individual13         YES                                                         YES
individual14                     YES         YES         YES
individual15         YES         YES
individual17         YES                     YES                     YES         YES
individual18                                             YES
individual19         YES                                 YES
individual20                                 YES                     YES

I get the following solution:

----     46 VARIABLE x.L  schedule team meetings at time slot

        timeslot1   timeslot2   timeslot3

team1           1
team2                       1
team3           1
team4                                   1
team5                                   1
team6                       1

----     46 VARIABLE y.L  individual is in meeting at time slot

               timeslot1   timeslot2   timeslot3

individual1            1           1
individual2            1
individual3            1                       1
individual4                        1           1
individual5            1
individual6            1
individual7                        1
individual8                        1           1
individual10           1           1           1
individual11           1           1           1
individual12                       1           1
individual13           1           1
individual14           1           1           1
individual15           1           1
individual17           1           1           1
individual18                                   1
individual19           1                       1
individual20           1                       1


----     51 PARAMETER meetingcount  number of meetings

               timeslot1   timeslot2   timeslot3

individual1            1           1
individual2            1
individual3            1                       1
individual4                        1           1
individual5            1
individual6            2
individual7                        1
individual8                        1           1
individual10           1           1           1
individual11           1           2           1
individual12                       1           1
individual13           1           1
individual14           1           1           1
individual15           1           1
individual17           2           1           1
individual18                                   1
individual19           1                       1
individual20           1                       1

There are just a few double bookings. The meetingcount is $\sum_t \color{darkblue}m_{i,t}\cdot \color{darkred}x^*_{t,s}$.

There is nothing particularly complex in the model, so it should be very easy to implement in code. Of course, this problem is small enough to do a complete enumeration.

A MIP model to minimize the "overlaps" can be developed as follows.

Let $\color{darkred}c^{\ge 2}_{i,s}$ be the number meetings at time slot $s$ individual $i$ would like to attend exceeding 1 (i.e. 2 or more; this is in other words the number of overlaps). Then we can write:

$$ \begin{aligned}\min &\sum_{i,s}\color{darkred}c^{\ge 2}_{i,s} \\ & \sum_s \color{darkred}x_{t,s}=1&&\forall t \\ & 1+\color{darkred}c^{\ge 2}_{i,s}\ge \sum_t \color{darkblue}m_{i,t}\cdot \color{darkred}x_{t,s}&&\forall i,s \\ & \color{darkred}x_{t,s} \in \{0,1\} \\ &\color{darkred}c^{\ge 2}_{i,s} \ge 0 \end{aligned} $$

In other words, $$ \color{darkred}c^{\ge 2}_{i,s} = \max\{0,\sum_t \color{darkblue}m_{i,t}\cdot \color{darkred}x_{t,s}-1\} $$

In the Minizinc answer a MIP model is created from a high-level CP model. The statistics with the HiGHS solver are also given. When I ran my MIP model and the MiniZinc model, I see some differences:

                              MIP   MiniZinc
   rows before presolve        67      174 
   columns before presolve     79      141
   nz before presolve         253       na
   rows after presolve         57       87 
   columns after presolve      69       69   (minizinc generates some general integer variables)
   nz after presolve          195      261
   Timing                     0.02     0.1
   Nodes                        1        9
   LP iterations              103     1200

So, it looks like my MIP model is a little bit more efficient than the generated Minizinc model. Of course, in this case, this is totally inconsequential. But I suspect this can also be true for large, difficult models.

There is no way of knowing the Minizinc CP->MIP models are poorly performing, except by developing and comparing against a properly formulated MIP model.

Thank you very much! I think this definitely fills a few conceptual gaps I’ve been lacking and also so well organized and presented. I actually went and implemented this with a rather crude and ugly decision tree which kinda-works, I’ll implement this and report back! — James Kolpack, Commented Oct 22, 2023 at 13:14
For comparisons of the MiniZinc model, it is worth commenting out the conflicts variables, as they are not actually used for solving, only for output. Added a note in my answer to clarify that. — Zayenz, Commented Oct 25, 2023 at 5:12

Zayenz · Accepted Answer · 2023-10-25 05:10:10Z

The MIP model by Erwin Kalvelagen is a good way to solve the problem, and is probably quite efficient.

As an alternative, the following MiniZinc model is a fairly straight-forward model based on the problem description. Additionaly, it can be used with different types of solvers.

% The data-set of a problem
enum Teams;
enum Individuals;
enum Timeslots;
array[Teams] of set of Individuals: team_members;

% The meeting time for each team
array[Teams] of var Timeslots: meeting_time;

% The count of overlaps
var int: overlaps = sum(t1, t2 in Teams where t1 < t2) (
    if meeting_time[t1] = meeting_time[t2] then
        card(team_members[t1] intersect team_members[t2])
    else
        0
    endif
);

% Solve minimizing the number of overlaps using the solvers default strategy
solve minimize overlaps;

% Variables for counting the conflicts for each individual, used only for output
array[Individuals, Timeslots] of var int: conflicts = array2d(Individuals, Timeslots, [
    let {
      var int: meetings = count(t in Teams where i in team_members[t]) (
          meeting_time[t] = m
      )
    } in
    meetings
  | i in Individuals, m in Timeslots
]);

output [
    "Times: \(meeting_time)\n",
    "Total overlaps: \(overlaps)\n",
    "Conflicts: \n", show2d(conflicts),"\n",
];

The data can be supplied in a dzn-file. Here is a file corresponding to the instance in Kalvelagens reply

Timeslots = {M1, M2, M3};
Individuals = {I1, I2, I3, I4, I5, I6, I7, I8, I9, I10, I11, I12, I13, I14, I15, I16, I17, I18, I19, I20};
Teams = {T1, T2, T3, T4, T5, T6};

team_members = [
    {I1, I6, I10, I11, I13, I15, I17, I19},
    {I8, I10, I11, I14, I15},
    {I2, I3, I5, I6, I14, I17, I20},
    {I8, I14, I18, I19},
    {I3, I4, I10, I11, I12, I17, I20},
    {I1, I4, I7, I11, I12, I13, I17},
];

In MiniZinc version 2.7.6 with the default solver (Gecode 6.3.0, with statistics), the output is as follows

[...]
----------
Times: [M2, M1, M1, M3, M3, M1]
Total overlaps: 3
Conflicts: 
[|      M1: M2: M3: 
 |  I1:  1,  1,  0
 |  I2:  1,  0,  0
 |  I3:  1,  0,  1
 |  I4:  1,  0,  1
 |  I5:  1,  0,  0
 |  I6:  1,  1,  0
 |  I7:  1,  0,  0
 |  I8:  1,  0,  1
 |  I9:  0,  0,  0
 | I10:  1,  1,  1
 | I11:  2,  1,  1
 | I12:  1,  0,  1
 | I13:  1,  1,  0
 | I14:  2,  0,  1
 | I15:  1,  1,  0
 | I16:  0,  0,  0
 | I17:  2,  1,  1
 | I18:  0,  0,  1
 | I19:  0,  1,  1
 | I20:  1,  0,  1
 |]
----------
==========
%%%mzn-stat: failures=23
%%%mzn-stat: initTime=0.000784
%%%mzn-stat: nodes=61
%%%mzn-stat: peakDepth=7
%%%mzn-stat: propagations=3096
%%%mzn-stat: propagators=122
%%%mzn-stat: restarts=0
%%%mzn-stat: solutions=8
%%%mzn-stat: solveTime=0.000591
%%%mzn-stat: variables=121
%%%mzn-stat-end
%%%mzn-stat: nSolutions=8
%%%mzn-stat-end
Finished in 68msec.

Using OR-Tools 9.7 as the solver, gives the following statistics

[...]
Times: [M1, M2, M1, M3, M3, M2]
Total overlaps: 3
Conflicts: 
[|      M1: M2: M3: 
 |  I1:  1,  1,  0
 |  I2:  1,  0,  0
 |  I3:  1,  0,  1
 |  I4:  0,  1,  1
 |  I5:  1,  0,  0
 |  I6:  2,  0,  0
 |  I7:  0,  1,  0
 |  I8:  0,  1,  1
 |  I9:  0,  0,  0
 | I10:  1,  1,  1
 | I11:  1,  2,  1
 | I12:  0,  1,  1
 | I13:  1,  1,  0
 | I14:  1,  1,  1
 | I15:  1,  1,  0
 | I16:  0,  0,  0
 | I17:  2,  1,  1
 | I18:  0,  0,  1
 | I19:  1,  0,  1
 | I20:  1,  0,  1
 |]
----------
==========
%%%mzn-stat: nSolutions=9
%%%mzn-stat-end
%%%mzn-stat: boolVariables=63
%%%mzn-stat: failures=26
%%%mzn-stat: objective=3
%%%mzn-stat: objectiveBound=3
%%%mzn-stat: propagations=5882
%%%mzn-stat: solveTime=0.009147
%%%mzn-stat-end
Finished in 98msec.

Using the included HiGHS MIP solver gives the following statistics

%%%mzn-stat: nodes=7
%%%mzn-stat: objective=3
%%%mzn-stat: objectiveBound=3
%%%mzn-stat: solveTime=0.2083
%%%mzn-stat-end
Times: [M3, M2, M3, M1, M1, M2]
Total overlaps: 3
Conflicts: 
[|      M1: M2: M3: 
 |  I1:  0,  1,  1
 |  I2:  0,  0,  1
 |  I3:  1,  0,  1
 |  I4:  1,  1,  0
 |  I5:  0,  0,  1
 |  I6:  0,  0,  2
 |  I7:  0,  1,  0
 |  I8:  1,  1,  0
 |  I9:  0,  0,  0
 | I10:  1,  1,  1
 | I11:  1,  2,  1
 | I12:  1,  1,  0
 | I13:  0,  1,  1
 | I14:  1,  1,  1
 | I15:  0,  1,  1
 | I16:  0,  0,  0
 | I17:  1,  1,  2
 | I18:  1,  0,  0
 | I19:  1,  0,  1
 | I20:  1,  0,  1
 |]
----------
==========
%%%mzn-stat: nodes=7
%%%mzn-stat: objective=3
%%%mzn-stat: objectiveBound=3
%%%mzn-stat: solveTime=0.2083
%%%mzn-stat-end
%%%mzn-stat: nSolutions=1
%%%mzn-stat-end
Finished in 291msec.

Since this is a fairly simple instance, there is no real insight in testing the different solvers. With larger instances, there is probably a lot more to be gained from choosing the right solver and adding parallel search.

In addition, the conflicts array of variables in the model is just added to get some pretty output, it is not useful for solving the problem. To get the best speed, the variables (and the output statement that uses them) should probably be removed.

Optimization team · Accepted Answer · 2023-10-30 14:10:24Z

The original data mises the team 9,16 (these two do not belong to any team) The code in OR-TOOLS

Data

from ortools.sat.python import cp_model
Timeslots = [1,2,3]
Individuals = [i+1 for i in range(20)]
Teams = [team+1 for team in range(6)]

    team_members = {
        1:[1, 6, 10, 11, 13, 15, 17, 19],
        2:[8, 10, 11, 14, 15],
        3:[2, 3, 5, 6, 14, 17, 20],
        4:[8, 14, 18, 19],
        5:[3, 4, 10, 11, 12, 17, 20],
        6:[1, 4, 7, 11, 12, 13, 17],
        }

The code

class VarArraySolutionPrinter(cp_model.CpSolverSolutionCallback):
    """Print intermediate solutions."""

    def __init__(self, variables):
        cp_model.CpSolverSolutionCallback.__init__(self)
        self.__variables = variables
        self.__solution_count = 0


    def on_solution_callback(self):
        self.__solution_count += 1

    def solution_count(self):
        return self.__solution_count


def SearchForAllSolutionsSampleSat():
    """Showcases calling the solver to search for all solutions."""
    # Creates the model.
    model = cp_model.CpModel()
    
    x = {(team,t):model.NewBoolVar(f"x_{team}_{t}") for team in Teams for t in Timeslots}
    over_pt = {(p,t):model.NewIntVar(0,len(Teams)-1, f"over_{p}_{t}") for p in Individuals for t in Timeslots}
    
    # Create the constraints.
    for team in Teams:
        model.AddExactlyOne([x[team,t] for t in Timeslots])

    for t in Timeslots:
        for p in Individuals:
            expressions = [x[team,t] for team in Teams if p in team_members[team] ]
            model.Add(sum(expressions)-1<= over_pt[p,t])            

    of_expr = sum([over_pt[p,t] for p in Individuals for t in Timeslots])
    model.Minimize(of_expr)
    # Create a solver and solve.
    solver = cp_model.CpSolver()
    solution_printer = VarArraySolutionPrinter([x,over_pt])
    # Enumerate all solutions.
    solver.parameters.enumerate_all_solutions = False
    # Solve.
    status = solver.Solve(model, solution_printer)
    print(f"Status = {solver.StatusName(status)}")
    
    print('Overassigned' , [i[0] for i in over_pt if solver.Value(over_pt[i]) ])


    team_assignment = {}    
    for (team,t),v in x.items():
        if solver.Value(v)>0:
            team_assignment[team,t] =1 
            print(f"Team {team} ------>  time {t}")
    print(f"Team member   {1} {2} {3} ")    

    for p in Individuals:
        time_assign = []
        for t in Timeslots:
            a = [team for team in Teams if p in team_members[team] and (team,t) in team_assignment]
            if len(a)>0:
                time_assign.append(len(a))
            else:
                time_assign.append(0)
        print(f"{p}             {time_assign[0]} {time_assign[1]} {time_assign[2]} ")    

SearchForAllSolutionsSampleSat()

Results

Status = OPTIMAL
Overassigned [11, 14, 17]
Team 1 ------>  time 3
Team 2 ------>  time 1
Team 3 ------>  time 1
Team 4 ------>  time 2
Team 5 ------>  time 2
Team 6 ------>  time 1
Team member   1 2 3 
1             1 0 1 
2             1 0 0 
3             1 1 0 
4             1 1 0 
5             1 0 0 
6             1 0 1 
7             1 0 0 
8             1 1 0 
9             0 0 0 
10             1 1 1 
11             2 1 1 
12             1 1 0 
13             1 0 1 
14             2 1 0 
15             1 0 1 
16             0 0 0 
17             2 1 1 
18             0 1 0 
19             0 1 1 
20             1 1 0

The code has an upper bound of 2 on the number of duplicates over_pt. (I.e. this problem with just 1 time period will be infeasible). I assume this code was copy-pasted which is almost always a recipe for disaster. Better than copy-paste is to copy the concepts but enter the code by typing. It forces you to pay attention to the details. — Erwin Kalvelagen, Commented Oct 30, 2023 at 12:31

James Kolpack · Accepted Answer · 2024-01-30 16:00:45Z

Thank you Erwin Kalvelagen and Zayenz for the detailed and insightful responses. I was able to get the model coded in the .NET version of Google OR Tools, I'm pasting that below.

One peculiarity was that the optimization for the individuals time slots had to be minimized for it to work - model.Minimize(indTimeSlotVars);. I had expected this to be use "Maximize" based on the model, but that ends up assigning all teams to time slot one. I may be missing something still, but this way works.

    // Model.
    var model = new CpModel();

    // Data
    int[,] m =
    {
        { 1, 0, 0, 0, 0, 1 }, // 1
        { 0, 0, 1, 0, 0, 0 }, // 2
        { 0, 0, 1, 0, 1, 0 }, // 3
        { 0, 0, 0, 0, 1, 1 }, // 4
        { 0, 0, 1, 0, 0, 0 }, // 5
        { 1, 0, 1, 0, 0, 0 }, // 6
        { 0, 0, 0, 0, 0, 1 }, // 7
        { 0, 1, 0, 1, 0, 0 }, // 8
        { 0, 0, 0, 0, 0, 0 }, // 9
        { 1, 1, 0, 0, 1, 0 }, // 10
        { 1, 1, 0, 0, 1, 1 }, // 11
        { 0, 0, 0, 0, 1, 1 }, // 12
        { 1, 0, 0, 0, 0, 1 }, // 13
        { 0, 1, 1, 1, 0, 0 }, // 14
        { 1, 1, 0, 0, 0, 0 }, // 15
        { 0, 0, 0, 0, 0, 0 }, // 16
        { 1, 0, 1, 0, 1, 1 }, // 17
        { 0, 0, 0, 1, 0, 0 }, // 18
        { 1, 0, 0, 1, 0, 0 }, // 19
        { 0, 0, 1, 0, 1, 0 }, // 20
    };

    // Variables.
    int[] individuals = Enumerable.Range(0, 20).ToArray();
    int[] teams = Enumerable.Range(0, 6).ToArray();
    int[] timeSlots = Enumerable.Range(0, 3).ToArray();

    // x - 1 if meeting of team t takes place at time slot s, else 0
    var x = new IntVar[teams.Length, timeSlots.Length];
    foreach (var t in teams)
    foreach (var s in timeSlots)
        x[t, s] = model.NewIntVar(0, 1,$"team time slots[{t},{s}]");

    // y - 1 if individual i has meetings at time slot s, 0 otherwise
    var y = new IntVar[individuals.Length, timeSlots.Length];
    foreach (var i in individuals)
    foreach (var s in timeSlots)
        y[i, s] = model.NewIntVar(0, 1, $"individual time slots[{i},{s}]");

    // each team meets exactly one time
    foreach (var t in teams)
        model.AddLinearConstraint(LinearExpr.Sum(timeSlots.Select(s => x[t, s])), 1L, 1L);

    //individual must have at least one team meeting at the given time slot to attend
    foreach (var i in individuals)
    foreach (var s in timeSlots)
        model.Add(
            new BoundedLinearExpression(
                y[i, s],
                LinearExpr.Sum(teams.Select(t => m[i, t] * x[t, s])),
                false));

    // maximize number of times individuals meet
    var indTimeSlotVars = LinearExpr.NewBuilder();
    foreach (var i in individuals)
    foreach (var s in timeSlots)
        indTimeSlotVars.Add(y[i, s]);
    model.Minimize(indTimeSlotVars);

    var solver = new CpSolver();
    var cpSolverStatus = solver.Solve(model);
    Console.WriteLine($"Solver status: {cpSolverStatus}");

Stack Exchange Network

Google-OR Tools: Assignment minimizing overlapping members

4 Answers 4

Not the answer you're looking for? Browse other questions tagged
or-tools
assignment-problem
cp-sat
or ask your own question.

Hot Network Questions

Google-OR Tools: Assignment minimizing overlapping members

4 Answers 4

Not the answer you're looking for? Browse other questions tagged or-toolsassignment-problemcp-sat or ask your own question.

Related

Hot Network Questions

Not the answer you're looking for? Browse other questions tagged
or-tools
assignment-problem
cp-sat
or ask your own question.