With free choice of hand size, the problem is NP-hard
We will reduce the decision version of vertex cover to this.
- The deck contains one unique card $i$ for every vertex $i$ in the graph.
- Let each hand represent the complement of a prospective cover, i.e. the cards represent the vertices not in the cover.
- Let the hand size equal the number of vertices in the graph minus the target cover size.
- The value of a combo is $\lambda_{ij} = 1$ if edge $ij$ exists in the graph, and $\lambda_{ij} = 0$ otherwise.
Then the value of a hand is $1$ iff some edge in the graph is not covered, i.e. both endpoints of that edge were selected to be outside the prospective cover. If there is no vertex cover of the target size, then every hand is worth $1$ and the expected value of the max combo is $1$; if such a cover exists, then the corresponding hand is worth $0$ (no edge has both endpoints outside the cover) and the expected value of the max combo is less than $1$.
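To make the correspondence concrete, here is a brute-force illustration of the reduction (a sketch for intuition, not an efficient algorithm): each hand of size $|V| - k$ plays the role of the complement of a candidate cover, and its value is $1$ exactly when some edge has both endpoints in the hand.

```python
from itertools import combinations

def has_vertex_cover(vertices, edges, k):
    # Each hand of size len(vertices) - k is the complement of a
    # candidate cover; its value is 1 iff some edge has both
    # endpoints in the hand (i.e. is left uncovered).
    hand_size = len(vertices) - k
    for hand in combinations(vertices, hand_size):
        value = max((1 if u in hand and v in hand else 0
                     for u, v in edges), default=0)
        if value == 0:
            return True  # the complement of this hand is a vertex cover
    return False
```

For example, the triangle graph has a vertex cover of size 2 but not of size 1, which matches whether any hand achieves value $0$.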
"Single-pass" evaluation
We will want to find some exploitable structure in "typical" combo functions. I recently published a paper that may help, depending on the structure of the payoff matrix (or tensor) $\lambda$.
Consider the following process of evaluating a hand of cards:
- First, I tell you how many aces you drew.
- Then, I tell you how many twos you drew.
- Then, I tell you how many threes you drew.
- ...
You can remember things between each step, and update your memory at each step, such that at the end you can determine the max payoff of the hand. In terms of Python code with a transition function `next_state`:
state = None
state = next_state(state, 1, num_ones)
state = next_state(state, 2, num_twos)
state = next_state(state, 3, num_threes)
# ...
Given such a transition function, the output of the algorithm is the distribution of max payoffs of a random hand. From here, the expected value can be computed using the weighted mean.
In the worst case, you would have to remember every previous card you drew, in which case the algorithm has no advantage over iterating over all possible hands with the multivariate hypergeometric formula. However, if you can determine the payoff of a hand while retaining less information than this, the algorithm can find the distribution of payoffs more efficiently.
Examples of exploitable structure
Longest straight
For example, this transition function looks for the longest straight in your hand:
def next_state(state, outcome, count):
    # state = (best run length, outcome ending the best run, current run length)
    best_run, best_run_outcome, run = state or (0, outcome, 0)
    if count >= 1:
        run += 1
    else:
        run = 0
    return max((run, outcome), (best_run, best_run_outcome)) + (run,)
In this case, you don't need to remember every previous card, only the length of the current run and the best run you've seen so far.
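Driving this transition function over a concrete (hypothetical) hand shows the state staying small; the function is repeated so the snippet is self-contained:

```python
def next_state(state, outcome, count):
    # state = (best run length, outcome ending the best run, current run length)
    best_run, best_run_outcome, run = state or (0, outcome, 0)
    if count >= 1:
        run += 1
    else:
        run = 0
    return max((run, outcome), (best_run, best_run_outcome)) + (run,)

# Hypothetical hand: one each of ranks 1, 2, 4 and two 5s.
counts = {1: 1, 2: 1, 3: 0, 4: 1, 5: 2}
state = None
for outcome in range(1, 6):
    state = next_state(state, outcome, counts[outcome])
best_run, best_run_end, _ = state
# The longest straight has length 2 (either 1-2 or 4-5; ties go to
# the higher ending rank, so best_run_end is 5).
```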
Sums
Another example, which may be applicable to "tech" cards, is to simply sum the value of each card:
def next_state(state, outcome, count):
    return (state or 0) + outcome * count
Ideally the number of possible unique sums is small, e.g. if the payoff of each tech card is a small integer.
Band matrix
If the payoff matrix $\lambda$ can be permuted into a band matrix with bandwidth $b$ (or analogously for tensors), then you can check for the best combo while only remembering whether the most recent $b$ classes have been drawn. For $p$ unique payoff values, the number of unique states would then be at most $2^b p$.
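As a sketch of the band-matrix case with bandwidth $b = 1$ (the `PAYOFF` values and `B` here are assumptions for illustration): the state only needs the presence flags of the last $b$ classes plus the best payoff seen so far, giving at most $2^b p$ states.

```python
B = 1
# Nonzero payoffs only pair a class with the class immediately before it.
PAYOFF = {(1, 2): 5, (2, 3): 2, (3, 4): 7}  # payoff for drawing both i and j

def next_state(state, outcome, count):
    # state = (presence flags for the last B classes, best payoff so far)
    window, best = state or ((False,) * B, 0)
    present = count >= 1
    if present:
        for offset, other_present in enumerate(reversed(window)):
            if other_present:
                other = outcome - offset - 1
                best = max(best, PAYOFF.get((other, outcome), 0))
    return (window[1:] + (present,), best)

counts = {1: 1, 2: 1, 3: 0, 4: 1}
state = None
for outcome in [1, 2, 3, 4]:
    state = next_state(state, outcome, counts[outcome])
_, best = state
# Only classes 1 and 2 were drawn together, so the best combo pays 5.
```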
Count uniques by combo
If the number of combos is small, you could try counting the number of unique cards drawn in each combo:
def next_state(state, outcome, count):
    # state[combo] = number of unique cards drawn so far in that combo
    state = list(state or [0] * NUM_COMBOS)
    if count > 0:
        for combo in COMBOS[outcome]:
            state[combo] += 1
    return tuple(state)
Then at the end you would determine which (possibly partial) combo gives the best payoff.
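For example, with a small hypothetical setup (the `COMBOS`, sizes, and values below are assumptions for illustration), the final scoring step might award only completed combos:

```python
COMBOS = {1: [0], 2: [0, 1], 3: [1]}  # rank -> combos it belongs to
COMBO_SIZES = [2, 2]                  # unique cards needed to complete each combo
COMBO_VALUES = [3, 5]                 # payoff for a completed combo
NUM_COMBOS = 2

def next_state(state, outcome, count):
    # state[combo] = number of unique cards drawn so far in that combo
    state = list(state or [0] * NUM_COMBOS)
    if count > 0:
        for combo in COMBOS[outcome]:
            state[combo] += 1
    return tuple(state)

counts = {1: 1, 2: 2, 3: 0}
state = None
for outcome in [1, 2, 3]:
    state = next_state(state, outcome, counts[outcome])

# Score only completed combos; here combo 0 is complete and pays 3.
best = max((value
            for got, size, value in zip(state, COMBO_SIZES, COMBO_VALUES)
            if got == size), default=0)
```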
Linearity of expectation
If all you are interested in is the expected value, and you can factor out a purely additive component, then you can take advantage of linearity of expectation. For example, suppose you can split the payoff into the sum of:
- A fixed "tech" value per card, which are unconditionally summed.
- The value-above-tech of the best combo.
Then you can sum the expected value of the two components to get the overall expected value. The first one is just the expected tech value per card times the hand size, and the second one can be computed more efficiently since you no longer need the joint distribution with the tech value.
Code example
This uses my Icepool Python library. The idea is to split into three components:
- Base values for each card, which are added together unconditionally.
- Inclusive combos, which can each be scored at most once.
- Exclusive combos, which only the highest is scored.
The three components are not independent, but this doesn't matter for the expected value. They could be computed jointly for some extra cost.
There are some differences in payoff matrix compared to the example question; I tried to follow the text rather than the literal numbers provided.
from icepool import Deck, ContainsSubsetEvaluator, JointEvaluator, SumEvaluator

deck = Deck({'G': 3, 'S': 3, 'F': 3, 'JJ': 3, 'T': 28})
deal = deck.deal(6)

# Each card unconditionally contributes this value.
# Duplicates count multiple times.
base_values = {
    'G': 2,
    'S': 1,
    'F': 1,
    'JJ': 1,
    'T': 2,
}

base_score = deal.sum(sub=base_values)
print('Base score:')
print(base_score)
print('Mean:', base_score.mean())
print()

# Each of these combos can be scored up to once.
inclusive_combos = [
    ({'F'}, 0.7),
]

inclusive_score = sum(value * deal.contains_subset(combo)
                      for combo, value in inclusive_combos)
print('Inclusive combo score:')
print(inclusive_score)
print('Mean:', inclusive_score.mean())
print()

# Only the max of these combos is scored.
exclusive_combos = [
    ({'G', 'S'}, 3),
    ({'S', 'F'}, 4),
    ({'G', 'JJ'}, 1),
]

# Jointly evaluate whether each combo was gotten.
# This is not the most efficient possible,
# though it is more efficient than enumerating
# all possible hands.
# IntersectionSizeEvaluator could be used in place
# of ContainsSubsetEvaluator.
exclusive_evaluator = JointEvaluator(
    *(ContainsSubsetEvaluator(combo)
      for combo, _ in exclusive_combos))
exclusive_gots = exclusive_evaluator.evaluate(deal)

# Then pick the max gotten combo value.
def score_exclusive(gots):
    return max(
        got * value
        for got, (_, value) in zip(gots, exclusive_combos))

exclusive_score = exclusive_gots.sub(score_exclusive)
print('Exclusive combos:')
print(exclusive_score)
print('Mean:', exclusive_score.mean())
print()

print('Overall mean:',
      base_score.mean() + inclusive_score.mean() + exclusive_score.mean())
Result:
Base score:
Die with denominator 3838380

Outcome | Quantity | Probability
------- | -------- | -----------
      6 |       84 |   0.002188%
      7 |     3906 |   0.101762%
      8 |    58590 |   1.526425%
      9 |   377580 |   9.836962%
     10 |  1132740 |  29.510887%
     11 |  1529199 |  39.839698%
     12 |   736281 |  19.182077%

Mean: 10.65

Inclusive combo score:
Die with denominator 3838380

Outcome | Quantity | Probability
------- | -------- | -----------
    0.0 |  2324784 |  60.566802%
    0.7 |  1513596 |  39.433198%

Mean: 0.2760323886639676

Exclusive combos:
Die with denominator 3838380

Outcome | Quantity | Probability
------- | -------- | -----------
      0 |  2562150 |  66.750817%
      1 |   371257 |   9.672232%
      3 |   371257 |   9.672232%
      4 |   533716 |  13.904720%

Mean: 0.9430780693938589

Overall mean: 11.869110458057827
The algorithm
The algorithm itself is based on the decomposition of the multivariate hypergeometric formula as a product of binomials. Combined with the transition function formulation above, this allows us to memoize intermediate results at each step, in other words, "what would the state distribution be if I drew only $0 \ldots k$ cards, all of which are class $1 \ldots M$ or lower". For the details, you can read my paper.
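For intuition, here is a minimal, unoptimized sketch of that dynamic program (not the library implementation): it propagates a distribution over (cards drawn so far, state) pairs one class at a time, weighting each step by a binomial coefficient, and at the end keeps only the entries where a full hand was drawn.

```python
from math import comb

def state_distribution(deck_counts, hand_size, next_state):
    # deck_counts: list of (outcome, copies of that outcome in the deck)
    # dist maps (cards drawn so far, state) -> number of ways to reach it
    dist = {(0, None): 1}
    for outcome, copies in deck_counts:
        new_dist = {}
        for (drawn, state), ways in dist.items():
            # Draw 0..count copies of this class; each choice of which
            # copies were drawn contributes a binomial coefficient.
            for count in range(min(copies, hand_size - drawn) + 1):
                key = (drawn + count, next_state(state, outcome, count))
                new_dist[key] = new_dist.get(key, 0) + ways * comb(copies, count)
        dist = new_dist
    total = comb(sum(copies for _, copies in deck_counts), hand_size)
    return {state: ways / total
            for (drawn, state), ways in dist.items()
            if drawn == hand_size}

# Using the summing transition function from above:
def sum_state(state, outcome, count):
    return (state or 0) + outcome * count

# Deck of two 1s and two 2s, hand size 2.
dist = state_distribution([(1, 2), (2, 2)], 2, sum_state)
# dist maps each possible hand sum to its probability.
```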
@inproceedings{liu2022icepool,
    title={Icepool: Efficient Computation of Dice Pool Probabilities},
    author={Albert Julius Liu},
    booktitle={Eighteenth AAAI Conference on Artificial Intelligence and Interactive Digital Entertainment},
    volume={18},
    number={1},
    pages={258-265},
    year={2022},
    month={Oct.},
    eventdate={2022-10-24/2022-10-28},
    venue={Pomona, California},
    url={https://ojs.aaai.org/index.php/AIIDE/article/view/21971},
    doi={10.1609/aiide.v18i1.21971}
}