How to prove that the proposed solution of this "Camels and Bananas" problem is optimal? And how to generalize it?

Question

I found the video "YAHOO Interview Puzzle || Camel and Bananas || Logic + Optimization" on the LOGICALLY YOURS YouTube channel.

Here's the essence of it:

you must transport bananas from an initial location to a market 1000 km away
you can only use a camel for transporting them
the camel requires (= consumes) 1 banana for each km it travels, and can carry at most 1000 bananas at any one time
if you have 3000 bananas to start with, at the initial location, what is the max number of bananas you can eventually get to the market?

In the video, they address it using a lot of heuristics / not much explanation, which is the part that I am trying to figure out / reduce to a more systematic or formal approach.
Here is their approach:

You need to transport the bananas using 2 intermediate stops between the initial location and the market (proof?). Starting a forward journey with fewer than 1000 bananas is wasteful (proof?). Therefore, 5 trips are necessary between the initial storage and the first stop (proof?), and 2000 bananas must be at the first stop when the initial storage is empty, which implies the first stop must be at 200 km (this they actually do prove). Therefore, 3 trips are necessary between the first stop and the second stop (proof?), and 1000 bananas must be at the second stop when the first stop is empty, which implies the second stop must be at 533 km, and the final answer is $1000 - (1000 - 533) = 533$ bananas at the market (this is actually wrong under the assumption that no 'fractional' bananas can be consumed, as it would leave 1 banana at the second stop; the correct answer is 534 km, leaving 532 bananas at the market).

I've been trying to frame this problem in several ways, e.g. as a linear program (see below), but I keep circling back to having to use the assumptions or heuristics they used in the video.
Is it possible that one cannot solve this without making all these 'guesses'? Is there a more 'formal' way of tackling this? And especially is this problem a 'known' one in mathematics, with an 'official' solution or approach that guarantees its optimality?

PS : describing here my (failed) attempts to solve this without guessing.

One interesting aspect of this problem, I find, is that without any intermediate stops one would have to travel 5000 km to empty the initial storage (and it couldn't even be done, as one would be stuck without fuel after the first forward trip of 1000 km with 1000 bananas).
The use of intermediate stops not only allows to carry at least some bananas to the destination, but also drastically reduces the total distance travelled.
This made me realize that in fact this is also a minimization of the total distance travelled, because the number of bananas left at the end is 3000 minus that.

So my first naive attempt (under the assumption that 5 trips were required to carry all the bananas to the first stop) was to look for a value of $x$ (distance of the first stop from the start) that maximized the product of the forward distance reached by the number of bananas left:

$x \cdot (3000 - 5 \cdot x)$

This turns out to be $x = 300$, but leaves only 1500 bananas at the first stop.
Repeating the same approach for $y$ (distance of the second stop from the first stop), i.e. maximizing $y \cdot (1500 - 3 \cdot y)$, results in $y = 250$ and 750 bananas left.
450 km still need to be covered in the last forward trip, leaving 300 bananas overall.
Much fewer than the 532 they got by their heuristic method.

My second idea was to find $x$ such that the number of bananas left at the first stop would be maximal, when all the initial stock had been moved.
Obviously this didn't work, as $x = 1$ would be the best solution, leaving 2995 bananas, but then regardless of the value of $y$, the final number of bananas would be much lower than 532.

This made me think that the values of $x$ and $y$ cannot be determined independently (i.e. one cannot solve this problem by finding a 'best' value for $x$, ignoring all the rest, and then doing the same for $y$).
But I am not sure of this conclusion, and I cannot find an expression to link $x$ and $y$.

Unless I am mistaken, the number of trips one needs to make to remove a total of $N_0$ bananas, by taking $n$ bananas at a time, is:

$2 \cdot \lceil \frac {N_0} n \rceil - 1$

As bananas are consumed at each trip, it would seem advantageous to keep $n$ as large as possible.
This seems to prove the video's assumption that the maximal possible number of bananas should be taken on any forward trip, i.e. $n = 1000$, and to imply that $5$ trips (3 forward and 2 backward) must be used to empty the initial storage.
At that point, however, $x$ is not known yet, and it is not clear to me what implies that one should choose $y$ so that only 3 trips are left to make.

If I assume that up to 3 trips must be made (2 forward, 1 backward), transporting $n_1, n_2$ bananas, respectively, in the forward trips from $x$ to $x+y$, and that a single final forward trip needs to be made from $y$ to the market, then for a given choice of $y$:

So this could potentially be seen as a linear program in $x, y, n_1, n_2$:

$argmax \ n1 + n2 - 3 \cdot y - (1000 - x - y)$

with several constraints (I am probably putting some redundant ones in) :

$x \ge 1$

$y \ge 1$

$n_1 \le 1000$

$n_2 \le 1000$

$3000 - 5 \cdot x - n_1 - n_2 = 0$

$n_1 - 2 \cdot y \ge 1$

$n_1 + n_2 - 3 \cdot y \ge 1$

$n_1 + n_2 - 3 \cdot y \le 1000$

This works (solved using R's lpSolve).
It yields $x = 200, y = 334, n_1 = n_2 = 1000$.

But is it the correct approach?
And how do I know that it is best to have only stop $x+y$ between $x$ and the market, rather than 2 or more stops?

If I had a different initial number of bananas, and/or a different maximal carrying capacity, and/or a different total distance, I would be left guessing, as I have developed no 'method' to tackle this.

Any ideas/suggestions?

EDIT based on Paul Sinclair's suggestions

Let:
$m \in \mathbb Z^+$ = maximal # bananas that can be carried at any one time by the camel
$x_0 = 0$ (position of the initial storage)
$x_i \in \mathbb Z^+, \forall i \ge 1$ = distance between stop $i-1$ and stop $i$
$b_0 \in \mathbb Z^+$ = # bananas at the initial storage at the start
$b_i, \forall i \ge 1$ = # bananas left at stop $i$ when stop $i-1$ is empty
$D\in \mathbb Z^+$ = distance of the market from the initial storage
$\sum_{i=1}^k = D$ (meaning that the market is exactly at stop $k$)

Assuming that the initial storage and all intermediate stops before the final destination (market) must be empty at the end, what are the required number of stops $k$, and the corresponding $k$ values of $x_i$, which result in the maximal $x_k$ ?

As per Paul's suggestion (except for the indices):

$b_i = b_{i-1} - (2 \cdot \lceil \frac {b_{i-1}} m \rceil - 1) \cdot x_i$

[editing halted - apparently the hypotheses I am using are incorrect / do not accurately represent the linked Youtube problem]

One thought (not sure if this would lead to a full formal proof or not): If you could use multiple camels, then it would be clearer the optimal strategy would be to start with 3 camels; pause, redistribute the load, and leave one camel behind once 1000 bananas are consumed; and then again, pause, redistribute the load, and leave another camel behind once another 1000 bananas are consumed. But without having multiple camels, the need to backtrack is equivalent to needing 5 camels on the first leg, 3 camels on the second portion, and 1 camel on the last. — Daniel Schepler, Commented Nov 16, 2023 at 20:58
Another observation: the proposed solution is certainly not a unique optimum. You could instead imagine leaving one banana behind once every mile on the first two forward trips of 200 miles, for example, and then on the return trips leave the camel unladen. — Daniel Schepler, Commented Nov 16, 2023 at 21:12
Another possible approach to get a full formal proof might be along the lines of the typical proof that a greedy algorithm is optimal: try to transform any solution into another solution which is at least as good, and such that it starts with using 1 banana to get the other 2999 bananas transported 1/5 of a mile (for example). — Daniel Schepler, Commented Nov 16, 2023 at 21:42

Daniel Schepler · Accepted Answer · 2023-11-18 01:21:58Z

To show that the given solution is optimal, let me add a tracking mechanism: I will suppose that as the camel goes from kilometer marker $n$ to $n+1$, a banana peel is left behind at $n+0.2$; and for the reverse trip from $n+1$ to $n$, a banana peel is left behind at $n+0.8$. Now, suppose at the end of any process, there is exactly one banana peel at $0.2$. Then the camel can only have taken at most 1000 bananas from 0 to 1, so there are at least 2000 bananas (or their peels) left at kilometer marker 0 or less. Therefore, at most 999 bananas (or their peels) end up at kilometer marker 1 or greater. Similarly, if there are two banana peels at $0.2$, then there is at least one banana peel at $0.8$, and there are at least 1000 bananas left at kilometer marker 0 or less. Therefore, in this case, at most 1997 bananas end up at kilometer marker 1 or greater. Finally, if there are three or more banana peels at $0.2$, then there are at least two banana peels at $0.8$, so at most 2995 bananas end up at kilometer marker 1 or greater in this case. (And if there are no banana peels at $0.2$, then it is impossible for any bananas to end up at kilometer marker 1 or greater.) Combining these cases, we see that in any case, at most 2995 bananas end up at kilometer marker 1 or greater.

Now similarly, at most 2990 bananas will end up at kilometer marker 2 or greater, and so on until we can conclude at most 2000 bananas will end up at kilometer marker 200 or greater. From this point, you can make a similar argument to before, but with fewer cases, to show that at most 1997 bananas will end up at kilometer marker 201 or greater. Continuing, we get that at most 1001 bananas will end up at kilometer marker 533 or greater.

At this point, we have a special edge case to argue, and you will get that at most 999 bananas end up at kilometer marker 534 or greater. From here on, it's easy to argue that at most 998 bananas end up at kilometer marker 535 or greater, etc., so at most 533 bananas end up at kilometer marker 1000 or greater. As an easy corollary, at most 533 bananas end up at the market exactly at kilometer marker 1000. (And contrary to what you wrote, 533 is in fact achievable, if you use one staging point at 200 km, a second staging point at 533 km, and leave one banana behind at 533 km.)

Thanks, this is very interesting. I will think about it and see if/how it can be generalised to $N_0$ initial bananas and a total distance $D$. Regarding 533 vs 534, I was applying the video's own statement that no fractional bananas could be consumed, and that no bananas could be left behind. So in fact the initial statement of the problem is incomplete. It should also say 'and at the end no bananas must be left anywhere except at the market'. But I suppose that would be a hint/giveaway to the existence of other places where the bananas could be. — user6376297, Commented Nov 18, 2023 at 9:04
@user6376297 - I finally took the time to watch that video, and I see you have been making several claims about it that are not true, including your claim here that at no time can a banana be left behind. The reason was not a rule of the solution, but because common sense says leaving a bunch of bananas behind means you can't get them to market. But that "common sense" breaks down when taking a few more bananas would require an additional trip that would eat more bananas than it would bring. Their own solution intentionally leaves 1 banana at the last stop for this reason, just like Daniel. — Paul Sinclair, Commented Nov 21, 2023 at 14:35
OK. For the record, I did not intentionally try to mislead anyone or make false claims. Anything I stated was based on my genuine (but not necessarily perfect) understanding of the problem as it was presented in the video. I may have been myself mislead by some of the comments under the video, e.g. one saying that 533 was not achievable due to the impossibility of fractional bananas. But yes, of course it is possible under different assumptions or hypotheses. — user6376297, Commented Nov 21, 2023 at 20:30
I did not mean to imply you were trying to deceive, only that certain claims you made didn't agree with what I saw there. I fully expected this was because you mis-interpreted (from my point-of-view) several things that the video said. This being one of them. (Another was that the video did not justify 5 trips on the first stage. It seemed very cleary explained to me.) — Paul Sinclair, Commented Nov 21, 2023 at 21:13

Paul Sinclair · Accepted Answer · 2023-11-18 13:03:50Z

You need to transport the bananas using 2 intermediate stops between the initial location and the market (proof?).

As you have noted, trying to carry bananas all the way leaves you with $0$ bananas at the market and a camel that won't go back for the others as it has no bananas to eat. So you have to move forward in stages. (how many stages comes out from working the problem. They have mentioned the answer at the start, but that number is not a necessary consideration.)

So make your first stop at some distance $x$.

Starting a forward journey with fewer than 1000 bananas is wasteful (proof?).

The fewer bananas the camel carries on a trip, the more trips it has to make to get all $3000$ bananas moved (or eaten). So maximizing the number of bananas carried per trip gives you the fewest eaten bananas.

Therefore, 5 trips are necessary between the initial storage and the first stop (proof?)

To move the 3000 bananas will require three trips out, and therefore two trips back to pick up more bananas. Thus five trips total, eating $5x$ bananas. The highest $x$ can be is $600$. Otherwise the camel would run out of bananas

and 2000 bananas must be at the first stop when the initial storage is empty,

The idea here is that in the stage you want the camel to start off with full loads too. So the amount of bananas at the end of the first stage should be a multiple of $1000$, and since you cannot make it all the way in a second stage either (which you can check by the same logic as shows that going all the way in one trip will not work), a third stage should also be a multiple of $1000$. This can only be if the camel consumes multiples of $1000$ bananas a stage (except the final one), and with only $3000$ bananas total, that will have to be $1000$ bananas consumed for the first two stages.

However, while this sounds reasonable, I do not find it fully credible without closer examination. I would rather leave the first day's journey as some variable distance $x$, then optimize once I have a full expression for the cost.

which implies the first stop must be at 200 km (this they actually do prove). Therefore, 3 trips are necessary between the first stop and the second stop (proof?),

Accepting that there are $2000$ bananas left at the first stop, the reasoning is the same as before: the camel will need two trips out to carry the $2000$ bananas, and one trip back to pick up the second load.

and 1000 bananas must be at the second stop when the first stop is empty, which implies the second stop must be at 533 km, and the final answer is 1000−(1000−533)=533 bananas at the market (this is actually wrong under the assumption that no 'fractional' bananas can be consumed, as it would leave 1 banana at the second stop; the correct answer is 534 km, leaving 532 bananas at the market).

Not quite. If the camel eats $1000$ bananas for its second stage, then it can move $333\frac 13$ km for the second stage. Putting that stop at $200 + 333\frac 13 = 533\frac13$ km from the initial location, not $533$ km. I see nothing that restricts the stages to an integral number of kilometers. So the final day the camel travels $1000 - 533\frac 13 = 466\frac23$ km and eats that many bananas. Feed it the remaining $\frac 13$ of a banana as a reward, and you are left with $533$ bananas, and a camel to sell, since taking it back home would turn your profit into a loss.

As I stated earlier, the assumption that letting the camel eat exactly $1000$ bananas a stage would result in an optimal solution is a bit too much for me to just take unexamined. Instead I would assume $k$ stages of length $x_1, x_2, \dots, x_k$ with $\sum_{i=1}^k x_k = 1000$. At the start of each stage there are $b_n$ remaining bananas. So $b_1 = 3000$ and $$b_i = b_{i-1} - \left(2\left\lceil \frac {b_{i-1}}{1000}\right\rceil - 1\right)x_{i-1}$$ (where $\lceil t \rceil$ denotes the least integer $\ge t$, and there are $\left\lceil \frac {b_i}{1000}\right\rceil$ outward trips in stage $i$ with one fewer trips back). So $$\frac{\partial b_i}{\partial x_{i-1}} = 2\left\lceil \frac {b_{i-1}}{1000}\right\rceil.$$ Also for $j \ge i, \frac{\partial b_i}{\partial x_j} = 0$ since $b_i$ is determined by the distances travelled before the $i$-th stage, not after. For $j < i -1, \frac{\partial b_i}{\partial x_j} = \frac{\partial b_{i-1}}{\partial x_j}$, except when $b_{i-1}$ is an exact multiple of $1000$, where the partial derivative is undefined. This is because $\left\lceil \frac {b_{i-1}}{1000}\right\rceil$ is constant away from such points.

The number of bananas available to sell at market is $b_{k+1}$. From the relations above, it follows that for all $j \le k,$ $$\frac{\partial b_{k+1}}{x_j} = 2\left\lceil \frac {b_j}{1000}\right\rceil$$

except where the $b_j$ are multiples of $1000$, where it is undefined. At the maximum of $b_{k+1}$, if any such derivative is defined, it has to be $0$. But the only way for it to be $o$ is for $b_j = 0$, which means the camel has failed to arrive with bananas. So the maximum must occur where the derivatives are undefined. I.e., where all the $b_j$ (except the arrival value $b_{k+1}$) are multiples of $1000$.

So for the maximal strategy, either the first stage ends with $1000$ bananas or with $2000$ bananas. We've already seen that $2000$ bananas leads to $533$ bananas to sell. If we allow the camel to eat $2000$ bananas that first stage, we can go $\frac{2000}5 = 400$ km the first stage, and will have $600$ km to go on future stages, leaving only $400$ bananas to sell. So the optimal strategy leaves $2000$ bananas after the first stage and $1000$ after the second stage.

Thank you! I am still processing the information you provided. Just an initial comment on "since you cannot make it all the way in a second stage either". I could stop at $x = 400$, leaving me with $1000$ bananas after the necessary $5$ trips, and then a single last trip of $600$ km, leaving me with $400$ bananas at the end. So yes, the main point I am trying to understand here is what makes $x=200$ the 'obvious' choice. I will read on and see if I get it. — user6376297, Commented Nov 18, 2023 at 11:44
As you've discovered I go on to cover that. While was discussing the idea at that point, I admitted that I did not find it obvious and therefore ended the post with a more analytical demonstration. I've just modified that part a little, as I realized now that I started thinking about the ceiling function as a floor function, and described the derivative with respect to $x_j$ could be non-zero if $b_j < 1000$. This is now fixed (even though the conclusion was still the same anyway). — Paul Sinclair, Commented Nov 18, 2023 at 13:09
Thanks again. I am working on it, trying to understand your approach. One thing that puzzles me is that if I plot the number of bananas left at stop $x+y$ as a function of $x$, $y$ being constant and set to $334$, the function is increasing up to $x=200$, then it is decreasing. This is of course consistent with the fact that $x=200$ gives the maximal number of bananas left at $x+y$ before the last forward trip is taken, but not with the fact that the derivative of that quantity is $0$ everywhere except in $x=200$, although I agree that theoretically it should be. I'll keep investigating. — user6376297, Commented Nov 21, 2023 at 8:15
Thederivative of the number of bananas left does not exist at $x = 200$. The result that maximums only occur where the derivative is $0$ requires the derivative to exist everywhere. The full result is that the extrema of a function occur either where the derivative is $0$, or the derivative does not exist, or at a boundary of the domain. You have to check them all to find the maximum. — Paul Sinclair, Commented Nov 21, 2023 at 12:53
I see. I am working under the assumption that the various $x_i$ can only be integers, so for me the derivatives should be differences. I will edit my post showing what comes out of that. — user6376297, Commented Nov 21, 2023 at 14:08

Stack Exchange Network

How to prove that the proposed solution of this "Camels and Bananas" problem is optimal? And how to generalize it?

2 Answers 2

You must log in to answer this question.

Not the answer you're looking for? Browse other questions tagged
optimization
puzzle
.

Hot Network Questions

How to prove that the proposed solution of this "Camels and Bananas" problem is optimal? And how to generalize it?

2 Answers 2

You must log in to answer this question.

Not the answer you're looking for? Browse other questions tagged optimizationpuzzle.

Related

Hot Network Questions

Not the answer you're looking for? Browse other questions tagged
optimization
puzzle
.