I have a constraint optimization problem as follows:
- I need to assign $m$ tasks to $n$ days, with $n \geq m$.
- Each day can host 0 to $m$ tasks.
- Each task either belongs to type $A$ or $B$.
- I want to assign the tasks to the days such that tasks of the same type are grouped as much as possible on the same days (minimize mixing tasks of different types on the same day)
- I have a set of constraints indicating the days on which each task can occur.
- Moreover, the number of days that is used to host all the tasks should also be minimized. For instance: If the constraints allow hosting 2 tasks of the same type either on 2 different days or otherwise on the same day, hosting the 2 tasks on the same day should have preference.
I'm having issues finding a formulation of this problem that can be solved. My first thought is to use an $m\times n$ matrix $M$ where if $M(i,j) \neq 0$ it means that task $i$ occurs on day $j$. $M(i,j) = 1$ if the task is of type $A$, $M(i,j) = -1$ if the task is of type $B$.
I think part of my optimization objective should express that I want to minimize Shannon entropy: $-\sum p \dot \log(p)$. The zeros in my matrix are not relevant for the entropy calculation so when I compute $p$ I need to divide 2 variables: The number of tasks of group $A$ or $B$ divided by the total number of tasks on that day. I don't know the number of tasks on a given day beforehand since this is part of the optimization so I get a division of 2 variables.
I also tried an alternative for entropy in the objective function: I sum all the 1's and -1's on each day and then take the absolute value: The larger this sum, the higher the orderliness of that day. But again, when wanting to compare the 'orderliness' of 2 days with a different number of activities on them, I have to divide by the number of activities on that day, which is again division of 2 variables.
I've tried solving this problem with OR-tools and CVXPY but I get error messages like: The problem does not adhere to "disciplined convex programming". As I understand it, the division of 2 variables is a problem. Even worse, it is possible that a day has 0 activities and then I get a division by zero.
Is there any other formulation of this optimization problem that would not cause these problems?