No general solution to k-combination of multiset?

Question

The problem to solve

Solve k-combination whose input may contain repeated items, and return only unique combinations.

e.g input is [0,1,2,2,4], k = 2, the result is:

{(0, 1), (2, 4), (1, 2), (0, 4), (1, 4), (0, 2), (2, 2)}

The solution I can think of

The only general solution I can think is to use a set or map to extract unique results.

e.g with python

from itertools import combinations
print (set(combinations([0, 1, 2, 2, 4], 2)))

Questions

Is there a general solution to do this, without using a set/map for filtering.

I can generate combinations of unique items, in order (refer: https://stackoverflow.com/a/76048486), but if there are repeated items, then there will be repeated combinations.

I've read this: https://math.stackexchange.com/a/554237 , seems it says there is no general solution to get all unique combinations, though there do have a formula to get count of unique combinations.

But, I'm not sure.

@Update - Summary & tips

According to the answers, there are ways both via iteration & recursion to do this.
When translating from python to go:
- yield in python, need chan in go, or u need return all combinations (which can be huge for large input).
- python's else for for and while is omitted when break.
- list slicing in python will make a copy (of reference ?), while slicing a go slice will reuse the same underlying array, (so in go, if the slice being copied will be modified later, and u don't want that change in the copy, then u should copy/append it into a new slice).

Does this answer your question? How to generate all distinct combinations (where input elements are repeated) in Python(using Itertools)? — Joseph Wood, Commented Apr 19, 2023 at 23:49
@JosephWood This is an algorithm question, and I don't see any fitting answer there. (I had btw made the same mistake, flagged as duplicate of this, and retracted when I realized this is not a python question...) — Kelly Bundy, Commented Apr 20, 2023 at 0:00
@KellyBundy, btw didn't know about more_itertools.distinct_combinations. I'll have to check it out. — Joseph Wood, Commented Apr 20, 2023 at 0:44
@JosephWood To me, sympy.multiset_combinations doesn't sound general. Can you use in in Java or Go? — Kelly Bundy, Commented Apr 20, 2023 at 0:50
Using "set" simplifies the problem drastically, removes unnecessary complexity, so it is viable way. — MBo, Commented Apr 20, 2023 at 3:23

David Eisenstat · Accepted Answer · 2023-04-20 14:16:43Z

Here’s a non-recursive enumerator for combinations in lexicographic order, optimized for simplicity over speed. Knuth’s The Art of Computer Programming almost certainly has a speed-optimized version.

def combinations(a, k):
    a = sorted(a)
    while True:
        yield a[:k]
        for i in range(k - 1, -1, -1):
            # a[i + 1 :] is sorted. Sort a[i:].
            x = a[i]
            j = i + 1
            while j < len(a):
                if x < a[j]:
                    break
                a[j - 1] = a[j]
                j += 1
            a[j - 1] = x
            # a[j] is the next greatest element after x. If there are enough
            # elements after it, rotate them into position.
            if j + (k - i) <= len(a):
                rotate(a, i, j, j + (k - i))
                break
        else:
            break


def rotate(a, i, j, k):
    reverse(a, i, j)
    reverse(a, j, k)
    reverse(a, i, k)


def reverse(a, i, k):
    for j in range((k - i) // 2):
        a[i + j], a[(k - 1) - j] = a[(k - 1) - j], a[i + j]


print(*combinations([0, 1, 2, 2, 4], 2))
print(*combinations([0, 1, 2, 2, 4], 3))
print(*combinations([0, 0, 0, 1, 1, 1], 3))

Output:

[0, 1] [0, 2] [0, 4] [1, 2] [1, 4] [2, 2] [2, 4]
[0, 1, 2] [0, 1, 4] [0, 2, 2] [0, 2, 4] [1, 2, 2] [1, 2, 4] [2, 2, 4]
[0, 0, 0] [0, 0, 1] [0, 1, 1] [1, 1, 1]

I've translated your answer to go: go.dev/play/p/RyanyA38Vuc , as a generic version. Another go translation, that send combinations to channel: go.dev/play/p/P1_ik7w8j-Q , which is more like the yield feature in python. — Eric, Commented Apr 21, 2023 at 9:29

Joseph Wood · Accepted Answer · 2023-04-21 11:15:40Z

Below is an algorithm I developed some time back for this task:

def multiset_combinations(v, m):
    
    v = sorted(v)
    f = [0]
    j = 0

    # Populating f, a list of expanded indices. E.g. For
    #
    #    v = [2, 2, 33, 33, 33, 40]
    #
    # f would be:
    #
    #    [0, 0, 1, 1, 1, 2]
    
    for i in range(1, len(v)):
        if v[i] != v[i - 1]:
            j += 1

        f.append(j)

    v = list(set(v))
    r = [0] * m
    n = len(v)
    
    z   = f[:m]
    idx = [0] * n
    p   = len(f) - m
    
    last = f[-m:]
    last[-1] += 1
    
    # Find the first occurence of each index. E.g. For the
    # same example above, idx would be:
    #
    #   [0, 2, 5]
    
    for i in range(n):
        idx[i] = f.index(i)
    
    # The main algorithm

    while True:
        while z[-1] < n:
            for j in range(m):
                r[j] = v[z[j]]

            z[-1] += 1
            yield r.copy()

        if z == last:
            break
        
        for i in range(m - 2, -1, -1):
            if z[i] != f[p + i]:
                z[i] += 1
                
                for j, k in zip(range(i + 1, m),
                                range(idx[z[i]] + 1, idx[z[i]] + m)):
                    z[j] = f[k]
                
                break

Calling it:

print(*multiset_combination([0, 1, 2, 2, 4], 2))
#> [0, 1] [0, 2] [0, 4] [1, 2] [1, 4] [2, 2] [2, 4]

print(*multiset_combination([0, 1, 2, 2, 4], 3))
#> [0, 1, 2] [0, 1, 4] [0, 2, 2] [0, 2, 4] [1, 2, 2] [1, 2, 4] [2, 2, 4]

print(*multiset_combination([0, 0, 0, 1, 1, 1], 3))
#> [0, 0, 0] [0, 0, 1] [0, 1, 1] [1, 1, 1]

It is pretty efficient as well. It can generate 753564 in just under a second:

v1 = [1,
      2, 2,
      3, 3, 3,
      4, 4, 4, 4,
      5, 5, 5, 5, 5,
      6,
      7, 7,
      8, 8, 8,
      9, 9, 9, 9,
      10, 10, 10, 10, 10,
      11,
      12, 12,
      13, 13, 13,
      14, 14, 14, 14,
      15, 15, 15, 15, 15]
      
def get_time(v, m):
    t1 = time.time()
    list(multiset_combinations(v, m))
    t2 = time.time()
    return t2 - t1
    
get_time(v1, 10)
#> 0.8702290058135986

Your python version finishes in 1.38 s, the one from another answer stackoverflow.com/a/76064642 finishes in 9.93 s, on my laptop, so indeed yours is faster. Though go is much faster than python seems. — Eric, Commented Apr 21, 2023 at 13:33
I've translated your answer into go: go.dev/play/p/jdQvr6c5b0g , it finishes in 0.12 s without print, while my translation for the other answer go.dev/play/p/RyanyA38Vuc takes 0.20 s on my laptop. So, yours is still faster. BTW, the above link can't finish in go.dev/play probably due to it takes too much resource, need to run it locally. — Eric, Commented Apr 21, 2023 at 18:06
I added another go translation which send combinations to go channel, which is more like yield in python: go.dev/play/p/3WivvzV2E0u , now it can be run online, it takes 0.15 s on my laptop, and it's generic. — Eric, Commented Apr 21, 2023 at 18:29

btilly · Accepted Answer · 2023-04-20 01:27:38Z

1

Here is code that generates these one at a time in Python.

I'm using iterators. Translating away from that would be the hardest part of translating this to another language.

def unique_multisets (multiset, k):
    # Find the repetition of each item.
    count = {}
    for x in multiset:
        count[x] = 1 + count.get(x, 0)

    # Turn this into (value, count) pairs.  The reversed is
    # because we will reverse it later in generating the multisets.
    pairs = list(reversed(count.items()))

    # Calculate a running total.
    total = 0
    totals = []
    for p in pairs:
        total += p[1]
        totals.append(total)

    def recurse (partial, idx):
        if k == len(partial):
            yield partial
        elif idx < 0 or totals[idx] < k - len(partial):
            pass # Can't make combinations.
        else:
            (value, repeat) = pairs[idx]
            max_used = min(repeat, k - len(partial))
            for i in range(1 + max_used):
                partial.append(value)
            for i in range(1 + max_used):
                partial.pop()
                yield from recurse(partial, idx - 1)

    yield from recurse([], len(pairs)-1)

for x in unique_multisets([0, 1, 2, 2, 4], 2):
    print(x)

answered Apr 20, 2023 at 1:27

btilly

45.8k3 gold badges64 silver badges94 bronze badges

Looks correct, wondering is it possible to make it an iteration, (I'll try after fully understand it later). I head that every recursion have an iteration version, but not sure.
– Eric
Commented Apr 20, 2023 at 7:56
@Eric It goes the other way. Every iteration can be rewritten as recursion. But there are recursive functions that cannot be rewritten as iteration, the Ackermann function being the standard example. I can solve this one iteratively, but it will be a very different solution.
– btilly
Commented Apr 20, 2023 at 13:49
@btilly That doesn't sound right. Do you mean some restricted kind of "iteration"?
– Kelly Bundy
Commented Apr 20, 2023 at 14:12
@KellyBundy that exactly- you can compute any recursive function iteratively by emulating a call stack with segmenting the entry/exit points appropriately. Primitive recursive functions don't require this "hack"- you can determine the number of loops before entering the loop. Non-primitive recursive functions (like the ackermann function) require an explicit stack.
– Dillon Davis
Commented Apr 20, 2023 at 14:27
@KellyBundy I meant "bunch of for loops". With a call stack you can emulate anything - that's how we can run recursive functions in the first place.
– btilly
Commented Apr 20, 2023 at 14:52

| Show 2 more comments

Collectives™ on Stack Overflow

No general solution to k-combination of multiset?

The problem to solve

The solution I can think of

Questions

@Update - Summary & tips

3 Answers 3

Not the answer you're looking for? Browse other questions tagged
algorithm
combinations
or ask your own question.

Linked

Hot Network Questions

Collectives™ on Stack Overflow

The problem to solve

The solution I can think of

Questions

@Update - Summary & tips

3 Answers 3

Not the answer you're looking for? Browse other questions tagged algorithmcombinations or ask your own question.

Linked

Related

Not the answer you're looking for? Browse other questions tagged
algorithm
combinations
or ask your own question.