Optimizing solution for Project Euler Problem #23 (non-abundant sums)

Question

A perfect number is a number for which the sum of its proper divisors is exactly equal to the number. For example, the sum of the proper divisors of 28 would be 1 + 2 + 4 + 7 + 14 = 28, which means that 28 is a perfect number.

A number n is called deficient if the sum of its proper divisors is less than n and it is called abundant if this sum exceeds n.

As 12 is the smallest abundant number, 1 + 2 + 3 + 4 + 6 = 16, the smallest number that can be written as the sum of two abundant numbers is 24. By mathematical analysis, it can be shown that all integers greater than 28123 can be written as the sum of two abundant numbers. However, this upper limit cannot be reduced any further by analysis even though it is known that the greatest number that cannot be expressed as the sum of two abundant numbers is less than this limit.

Find the sum of all the positive integers which cannot be written as the sum of two abundant numbers.

I have been trying to optimize my solution for almost a day now but my program is not ready to get small and optimized. Can anyone please tell me how I can do that?

def isabundant(n): return sum(list(x for x in range(1, int(n/2)+1) if n % x == 0)) > n
abundants = list(x for x in range(1, 28123) if isabundant(x) == True)
sums = 0
for i in range(12, 28123):
    for abundant in abundants:
        if abundant >= i and isabundant(i+abundant) == True: sums += i
print(sums)

amon · Accepted Answer · 2014-08-24 19:41:27Z

Your first problem is that you're trying to cram too much information onto one line. As a result, you loose the overview. Here is a simple refactoring:

def is_abundant(n):
    max_divisor = int(n / 2) + 1
    sum = 0
    for x in range(1, max_divisor):
        if n % x == 0:
            sum += x  
    return sum > n

abundants = list(x for x in range(1, 28123) if is_abundant(x))

sums = 0
for i in range(12, 28123):
    for abundant in abundants:
        if abundant >= i and is_abundant(i + abundant):
            sums += i
print(sums)

The == True tests are unecessary and were removed.
Naming was improved: isabundant → is_abundant.
The long statement in is_abundant was split up.

Now we can think about how this could be optimized.

One subproblem is calculating all divisors of a number. We could put the relevant code into its own function. We can furthermore exploit that if n % i == 0, then there must be another integer k so that n == i * k. This means that we only have to look at a lot less numbers: only the range(2, 1 + int(sqrt(n))) is interesting.

def divisors(n):
    """
    Returns all nontrivial divisors of an integer, but makes no guarantees on the order.
    """
    # "1" is always a divisor (at least for our purposes)
    yield 1

    largest = int(math.sqrt(n))

    # special-case square numbers to avoid yielding the same divisor twice
    if largest * largest == n:
        yield largest
    else:
        largest += 1

    # all other divisors
    for i in range(2, largest):
        if n % i == 0:
            yield i
            yield n / i

We can now rewrite our is_abundant to the simpler:

def is_abundant(n):
    if n < 12:
        return False
    return sum(divisors(n)) > n

Later, in your main loop, you are doing a rather weird calculation. What were we supposed to do?

Find the sum of all the positive integers which cannot be written as the sum of two abundant numbers.

We furthermore know that all integers above 28123 can be written as such a sum. Thus we have to look at the range(1, 28123 + 1)! How can we decide if a number n can be written as a sum of abundant numbers i and k? There exists any abundant number i with the constraint i < n, and another abundant number with the constraints k < n and n - i == k. Here is one clever way to write this:

def is_abundant_sum(n):
   for i in abundants_smaller_than_n:
       if (n - i) in abundants_smaller_than_n:
           return True
   return False

Because we don't want to calculate the abundants_smaller_than_n each time, we just take all possible abundants and bail out if we get larger than n:

def is_abundant_sum(n):
   for i in abundants:
       if i > n:  # assume "abundants" is ordered
         return False
       if (n - i) in abundants:
           return True
   return False

where abundants = [x for x in range(1, 28123 + 1) if is_abundant(x)].

Now, all that is left to do is to sum those numbers where this condition does not hold true:

sum_of_non_abundants = sum(x for x in range(1, 28123 + 1) if not is_abundant_sum(x))

We could perform one optimization: abundants is a list, which is an ordered data structure. If we search for an element that is not contained in the list, all elements would have to be searched. The set() data structure is faster, so:

abundants_set = set(abundants)
def is_abundant_sum(n):
   for i in abundants:
       if i > n:  # assume "abundants" is ordered
         return False
       if (n - i) in abundants_set:
           return True
   return False

@MohammadAreebSiddiqui Thanks for pointing that out. It seems I was too distracted by interesting number theory to write a correct divisors() implementation. That's kind of embarrassing. Anyway, it's fixed now and passed the projecteuler.net test. — amon, Commented Aug 24, 2014 at 19:44
you could've just added a check to yield n / i only if i < n / i :) — Mohammad Areeb Siddiqui, Commented Aug 24, 2014 at 19:48

Wcampos · Accepted Answer · 2016-09-20 20:38:50Z

A simple optimization would be to do this instead of iterating trough every single division:

def GetSumOfDivs(n):
    i = 2
    upper = n
    total = 1
    while i < upper:
        if n%i == 0:
            upper = n/i
            total += upper
            if upper != i: total += i
        i += 1
    return total

then you can check if the returned value is greater than the actual number to populate your list.

this works like this: lets say you want to get the sum of the divisors of 28... instead of iterating through each number: +1, +2, +4, +7, +14 it will add 2 numbers at once like this: 3, +4+7, +14 because you keep decreasing the upper limit for the loop for each divisor.

keep in mind that for low numbers it will actually take more time to run through, but for big number you have a massive improvement.

Time difference: (19.381) - (11.419) = 7.962 seconds faster while generating the abundant list

another optimization is to search in the already generated list of abundants instead of checking if the number is abundant all over again, but you should use a dictionary instead of a list to make the search substantially faster.

My solution to your problem:

def GetSumOfDivs(n):
    i = 2
    upper = n
    total = 1
    while i < upper:
        if n%i == 0:
            upper = n/i
            total += upper
            if upper != i: total += i
        i += 1
    return total


def isabundant(n): return GetSumOfDivs(n) > n
lAbundants = [x for x in range(12, 28123) if isabundant(x) == True]
dAbundants = {x:x for x in lAbundants}

sums = 1
for i in range(2, 28123):
    boo = True
    for k in lAbundants:
        if k < i:
            if (i-k) in dAbundants:
                boo = False
                break
        else : break
    if boo == True: sums += i

print(sums)

Why a list and a dictionary? the list is ordered and the dictionary for fast indexing.

Total execution time: 12.08 seconds

now try it in C++... I'm sure its under 2 seconds... XD good luck

Stack Exchange Network

Optimizing solution for Project Euler Problem #23 (non-abundant sums)

2 Answers 2

Not the answer you're looking for? Browse other questions tagged
python
optimization
programming-challenge
python-3.x
or ask your own question.

Hot Network Questions

Optimizing solution for Project Euler Problem #23 (non-abundant sums)

2 Answers 2

Not the answer you're looking for? Browse other questions tagged pythonoptimizationprogramming-challengepython-3.x or ask your own question.

Related

Hot Network Questions

Not the answer you're looking for? Browse other questions tagged
python
optimization
programming-challenge
python-3.x
or ask your own question.