5
$\begingroup$

I have the following objective that I want to maximize:

\begin{equation} \max_{U_T\in \mathbb{R}, x\in\mathbb{R}^T} J_\alpha(U_T) = \frac{\alpha}{\alpha-1}\log\left(\frac{\cosh(U_T)}{\cosh(\alpha U_T)^\frac{1}{\alpha}}\right) \,,\qquad \text{s.t:} \qquad U_T= Ax+b. \end{equation} where $A \in \mathbb{R}^{1\times T}$, $b\in\mathbb{R}$ and $\alpha>1$.

It is easy to show that this is, in fact, a pseudo-concave function by checking that: \begin{equation} \eta'_\alpha(u)(v-u)\leq0 \implies \eta_\alpha(v)\leq \eta_\alpha(u), \end{equation} where $\eta_\alpha(u) = \frac{\cosh(u)}{\cosh(\alpha u)^\frac{1}{\alpha}}$.

This, of course, implies the quasiconcavity of $J_\alpha(U_T)$.

Here's a plot of the function that I made with Wolfram Mathematica: Plot of <span class=$J_\alpha(x)$ for $\alpha=2$." />

-My questions are:

  1. Is there a way to write this objective function as a DQCP compliant program in CVXPY?
  2. When $\alpha \to 1^+$, $J_1(u) := -\log[\frac{e^{u\tanh(u)}}{\cosh(u)} ]$ pointwise. Is there a way to write that as a DQCP compliant program in CVXPY as well?
  3. If it can't be put into a DCP compliant formulation, what would you recommend that I do to solve this problem numerically (especially question 2)?

I'm using CVXPY since I'll add more complicated constraints later, but maybe the problem has a simple analytical solution as well.


-What I have so far:

I am especially interested in question 2, so I'm trying to solve that first.

Using the definitions of $\cosh$, write: \begin{align} J_1(u) &:= \lim_{\alpha\to 1^+}J_\alpha(u)\\ &= \log(\cosh(u)) -u \tanh(u)\\ &= \log\left(e^u + e^{-u}\right)-u \tanh(u) + K\\ &= -u\left(\log\left(\frac{1}{1 + e^{-2u}}\right)+ \tanh(u)\right) + K\\ &= -\frac{1}{2}f^{-1}(f(2u))(\log(f(2u))- \tanh(u)) + K, \end{align} where $K$ is a constant and $f(y) := 1/(1+e^{-y})$ is the logistic function, with inverse given by $f^{-1}(y) = \log(\frac{y}{1-y})$. By the well-known relation (Wiki): \begin{equation}\tanh(y) = 2f(2y) - 1,\end{equation} and we can conclude that: \begin{equation} J_1(u) = T(f(2u)) + K, \end{equation} where $T:[0,1]\to\mathbb{R}$ is: \begin{equation}T(z) = -\frac{1}{2}\log\left(\frac{z}{1-z}\right)\left(\log(z) - 2z-1\right).\end{equation}

T is in fact concave, as we can see from the graph of its second derivative in Mathematica:

Plot of <span class=$T(z)$" />

By further manipulation of $T$, we get the expression:

\begin{align} -2T(z) &= \log\left(\frac{z}{1-z}\right)(z + z - 1 + \log(z))\\ &= z\log\left(\frac{z}{1-z}\right) + (1-z)\log\left(\frac{1-z}{z}\right) + \log\left(\frac{z}{1-z}\right)\log(z)\\ &= 2\text{JS}(t \vert\vert 1-t) + \log^2(z) - \log(z)\log(1-z), \end{align} where $\text{JS}$ is the Jensen-Shannon entropy (simmetrized KL divergence).

Note the right-hand side is a sum of convex functions, as $\log^2(z)$ and $-\log(z)\log(1-z)$ are convex in $(0,1)$).

Therefore, I think I would be able to write the original program by optimizing $T$ and adding a (quasilinear) constraint to account for the change of variable. Question 2 finally reduces to finding a DCP compliant representation of $T$.


Update: Progress on trying to find a DCP representation of $\log^2(t)$ and $-\log(t)\log(1-t)$:

I found some interesting things:

a. $\log^2(t)$:

We can write this as: \begin{equation} \min_{z} z \qquad\text{s.t.: } w\geq -\log(t),\,w\geq 0,\,z\geq w^2, \end{equation} for fixed $t \in (0,1)$.

b. $-\log(t)\log(1-t)$:

This is the tricky one. I had given up already, when I found this wikipedia page on the Dilogarithm function.

\begin{equation} \text{Li}_2(z) = -\int_{0}^{z}\frac{\log(1-t)}{t}dt = -\int_{0}^{1}\frac{\log(1-z t)}{t}dt, \end{equation}

and:

\begin{equation} \text{Li}_2(z) + \text{Li}_2(1-z) = \frac{1}{6}\pi^2 - \log(z)\log(1-z) \end{equation}

Now we can approximate the integral with numerical integration (trapezoidal rule): \begin{equation} -\int_{0}^{z}\frac{\log(1-t)}{t}dt \approx -\frac{1}{N_t}\sum_{k=1}^{N_t}\frac{1}{2}\left(\frac{\log(1-z t_k)}{t_k} + \frac{\log(1-z t_{k-1})}{t_{k-1}}\right), \end{equation} where $t_k$ for $k\in\{0,N_t\}$ discretizes the interval $[0,1]$.

Now this approximation is a convex DCP expression.

I guess the only thing missing now is to write the remaining constraint for the change of variable.

$\endgroup$
7
  • 1
    $\begingroup$ I doubt there is a DCP representation for T. But if you post at ask.cvxr.com only about a DCP representation for T, and don't even mention anything else, you will get focused attention from a collection of DCP formulation wizards. I suggest putting the expression in the title. T seems a likely candidate for @ErlingMOSEK 's challenge ask.cvxr.com/t/express-1-a-2-x-2-in-cvx/5642/4 . $\endgroup$ Commented Nov 1, 2023 at 23:21
  • 1
    $\begingroup$ Those additive terms aren't convex or concave over the entirety of their natural domain, therefore they are unlikely to have a DCP representation. The same with T. $\endgroup$ Commented Nov 2, 2023 at 17:48
  • 2
    $\begingroup$ Have you considered the trick of rewriting this quasiconvex optimization problem via a family of convex functions as in section 3.4.5. of Boyd and Vandenberghe? $\endgroup$ Commented Nov 2, 2023 at 22:29
  • $\begingroup$ @BrianBorchers I didn't know that was a thing. Optimization is not my expertise. Do you think this would help? How would that work in this case? If it solves the question practically I'm more than willing to accept it as an answer! I think I found a way to represent it as a DCP compliant program, but I get ill-posedness in this reformulation. I'll check my calculations and edit the post if it is correct. Thanks again for all the help! $\endgroup$
    – Uomond
    Commented Nov 3, 2023 at 14:25
  • 1
    $\begingroup$ @Uomond If you have a DCP compliant representation, you should post it here and at ask.cvxr.com/t/dcp-representation-of-1-2-log-z-1-z-log-z-2z-1/… $\endgroup$ Commented Nov 3, 2023 at 18:46

0