2
\$\begingroup\$

I present my solution for Project Euler problem 23 written in Python (2.7).

The problem statement is:

A perfect number is a number for which the sum of its proper divisors is exactly equal to the number. For example, the sum of the proper divisors of 28 would be 1 + 2 + 4 + 7 + 14 = 28, which means that 28 is a perfect number.

A number whose proper divisors are less than the number is called deficient and a number whose proper divisors exceed the number is called abundant.

As 12 is the smallest abundant number, 1 + 2 + 3 + 4 + 6 = 16, the smallest number that can be written as the sum of two abundant numbers is 24. By mathematical analysis, it can be shown that all integers greater than 28123 can be written as the sum of two abundant numbers. However, this upper limit cannot be reduced any further by analysis even though it is known that the greatest number that cannot be expressed as the sum of two abundant numbers is less than this limit.

Find the sum of all the positive integers which cannot be written as the sum of two abundant numbers.

I have done everything to my knowledge to make the code run as fast as possible. It takes about 600 milliseconds on my machine.

It is one of the first times that I actually put the else part of the for statement to use. Using any alternatively turned out to be about 1.5× as slow.

SMALLEST_ABUNDANT = 12

# largest number for which we have to test if it can be expressed as the
# sum of two abundant numbers
MAX_TEST_SUM_OF_TWO_ABUNDANT = 28123

# largest number for which we have to know if it is abundant
MAX_TEST_ABUNDANT = MAX_TEST_SUM_OF_TWO_ABUNDANT - SMALLEST_ABUNDANT

# largest possible divisor that we have to consider
MAX_DIVISOR = MAX_TEST_ABUNDANT / 2

from collections import defaultdict

def solve_euler23():
    """Return the sum of all positive integers which cannot be written
    as the sum of two abundant numbers.
    """
    # mapping from a number to the set of its proper divisors (i.e.,
    # excluding the number itself)
    divisors = defaultdict(set)
    for divisor in range(1, MAX_DIVISOR+1):
        for number in range(2*divisor, MAX_TEST_ABUNDANT+1, divisor):
            divisors[number].add(divisor)

    def is_abundant(number):
        return sum(divisors[number]) > number

    abundant_numbers = sorted(filter(is_abundant, divisors))

    # make a set so that membership can be tested efficiently
    abundant_numbers_set = set(abundant_numbers)

    impossible_sum_of_two_abundant_numbers = []
    for number in range(1, MAX_TEST_SUM_OF_TWO_ABUNDANT+1):
        for abundant in abundant_numbers:
            if number - abundant in abundant_numbers_set:
                break # is the sum of two abundant numbers
        else:
            impossible_sum_of_two_abundant_numbers.append(number)

    return sum(impossible_sum_of_two_abundant_numbers)

if __name__=='__main__':
    print solve_euler23()

My questions are:

  • Is there a better algorithm?
  • Did I overlook something which would make the implementation of my chosen algorithm more efficient?
  • Can I make the code shorter or clearer?
\$\endgroup\$

2 Answers 2

4
\$\begingroup\$

1. Sieving for sums

The computation of the sum of divisors can be improved using a bit of math. Let's call the sum-of-divisors function \$σ\$, and consider \$σ(60)\$: $$ σ(60) = 1 + 2 + 3 + 4 + 5 + 6 + 10 + 12 + 15 + 20 + 30 + 60. $$ Collect together the divisors by powers of 2: $$ \eqalign{σ(60) &= (1 + 3 + 5 + 15) + (2 + 6 + 10 + 30) + (4 + 12 + 20 + 60) \\ &= (1 + 2 + 4)(1 + 3 + 5 + 15) \\ &= (1 + 2 + 4)σ(15)}. $$ Then do the same thing for \$σ(15)\$, collecting together the divisors by powers of 3: $$ \eqalign{σ(15) &= (1 + 5) + (3 + 15) \\ &= (1 + 3)(1 + 5) \\ &= (1 + 3)σ(5)}. $$ And so on, getting \$σ(60) = (1 + 2 + 4)(1 + 3)(1 + 5)\$. In general, if we can factorize \$n\$ as: $$ n = 2^a3^b5^c\dotsm $$ then $$ σ(n) = (1 + 2 + \dotsb + 2^a)(1 + 3 + \dotsb + 3^b)(1 + 5 + \dotsb + 5^c)\dotsm $$ These multipliers occur many times, for example \$(1 + 2 + 4)\$ occurs in the sum of divisors of every number divisible by 4 but not by 8, so it's most efficient to sieve for the sums of many divisors at once, like this:

def all_sum_divisors(n):
    """Return a list of the sums of divisors for the numbers below n.

    >>> all_sum_divisors(10) # https://oeis.org/A000203
    [1, 1, 3, 4, 7, 6, 12, 8, 15, 13]

    """
    result = [1] * n
    for p in range(2, n):
        if result[p] == 1: # p is prime
            p_power, last_m = p, 1
            while p_power < n:
                m = last_m + p_power
                for i in range(p_power, n, p_power):
                    result[i] //= last_m
                    result[i] *= m
                last_m = m
                p_power *= p
    return result

2. Review

  1. I would change the signature of the function as follows:

    def solve_euler23(n=28124):
        """Return the sum of all positive integers below n which cannot be
        written as the sum of two abundant numbers."""
    

    The reason for doing this is to make it possible to test the function on small instances: for example, you could test the claim in the problem statement (that 24 is the smallest number that cannot be written etc.), by running solve_euler23(24) and checking that it returns \$ \sum_{1≤i<24}i=276 \$, and then running solve_euler23(25) and checking that it returns the same result.

  2. The division operator / returns a float in Python 3, so for portability it would be worth writing:

    MAX_DIVISOR = MAX_TEST_ABUNDANT // 2
    
  3. The efficiency gain of avoiding computing the sum of divisors of the numbers up to SMALLEST_ABUNDANT (12) is negligible, and it deprives you of the chance to check that your sum-of-divisors algorithm is correct on small numbers (for example, an off-by-one error might lead it to deduce that 6 is abundant, but if you started at 12 you might have trouble spotting the bug).

  4. The question only asks for the sum of the positive integers which cannot be written as the sum of two abundant numbers, so it's not necessary to keep a list of the numbers themselves.

  5. When looking for sums adding up to number, the code tests every abundant number:

    for number in range(1, MAX_TEST_SUM_OF_TWO_ABUNDANT+1):
        for abundant in abundant_numbers:
            if number - abundant in abundant_numbers_set:
                break
    

    but we only need to test abundant numbers up to number // 2 (if two abundant numbers add up to number, then one of them will be no more than number // 2). We could test for this in the loop:

    for number in range(1, MAX_TEST_SUM_OF_TWO_ABUNDANT+1):
        for abundant in abundant_numbers:
            if number - abundant in abundant_numbers_set:
                break
            else if abundant >= number // 2:
                impossible_sum_of_two_abundant_numbers.append(number)
                break
    

    but because abundant_numbers is sorted, we can quickly calculate how far along this list we need to go, using bisect.bisect and itertools.islice:

    for number in range(1, MAX_TEST_SUM_OF_TWO_ABUNDANT+1):
        for abundant in islice(abundant_numbers,
                               bisect(abundant_numbers, number // 2)):
            if number - abundant in abundant_numbers_set:
                break
    

3. Revised code

from bisect import bisect
from itertools import islice

def solve_euler23(n=28124):
    """Return the sum of all positive integers below n which cannot be
    written as the sum of two abundant numbers.

    """
    sum_divisors = all_sum_divisors(n)
    abundant = [i for i in range(1, n) if sum_divisors[i] > 2 * i]
    abundant_set = set(abundant)
    def unsums():
        for i in range(1, n):
            for j in islice(abundant, bisect(abundant, i // 2)):
                if i - j in abundant_set:
                    break
            else:
                yield i
    return sum(unsums())

I find this is about three times as fast as the code in the original post.

\$\endgroup\$
3
\$\begingroup\$

A few things :

  • the 4 constants defined at the top look like some attempt to perform some tiny optimisations. I am not sure it is worth the pain. Also, if you want to go for micro optimisations, you'd be happy to know that local variable lookup is usually faster than global variable lookup.

  • more generally, your variable names are quite long which is usually a good thing but a bit painful and not very natural when writing/reading code related to math. For instance, n conveys just a much information as number.

  • it is usually a good idea to split your logic into smaller functions that can be tested individually. This is especially relevant for project euler code as you'll keep on computing divisor list, prime number list and so on.

  • as a pretty small optimisation, you can define divisors as being an array associating number to its divisors instead of a dictionnary. Now you don't need to use sorted anymore and lookup for a particular number is faster (I think). Also, you don't need to use sets of divisors because you'll generate them in order anyway.

  • as a real optimisation, when looking if a number n can be written as a sum of two number, you are iterating over all abundant numbers. You can actually stop once you've reached n/2 because if you were to find something, you'd have found it already.

Now, the adapted code looks like :

def divisors_sieve(lim):
    """Computes the list of divisors for values up to lim included."""
    div = [[1] for i in range(lim + 1)]
    for i in range(2, 1 + lim // 2):
        for j in range(2 * i, lim + 1, i):
            div[j].append(i)
    return div

def solve_euler23():
    """Return the sum of all positive integers which cannot be written
    as the sum of two abundant numbers.
    """
    # mapping from a number to the set of its proper divisors (i.e.,
    # excluding the number itself)
    MAX = 28123
    abundant_numbers = [i for i, d in enumerate(divisors_sieve(MAX)) if i and sum(d) > i]

    # make a set so that membership can be tested efficiently
    abundant_numbers_set = set(abundant_numbers)

    impossible_sum_of_two_abundant_numbers = []
    for n in range(1, MAX+1):
        lim = n / 2
        for abundant in abundant_numbers:
            if n - abundant in abundant_numbers_set:
                break # is the sum of two abundant numbers
            if abundant > lim:
                impossible_sum_of_two_abundant_numbers.append(n)
                break
        else:
            impossible_sum_of_two_abundant_numbers.append(n)

    return sum(impossible_sum_of_two_abundant_numbers)

if __name__=='__main__':
    print solve_euler23()
\$\endgroup\$
0

Not the answer you're looking for? Browse other questions tagged or ask your own question.