0
\$\begingroup\$

The problem statement: The prime factors of 13195 are 5, 7, 13 and 29.

What is the largest prime factor of the number 600851475143 ?

This code takes 9 seconds!, is there a better way?

from time import time


def is_prime(number):
    for divisor in range(2, int(number ** 0.5) + 1):
        if number % divisor == 0:
            return False
    return True


def calculate_largest(number):
    # checking backwards to get the largest number first
    for divisor in range(int(number ** 0.5) + 1, 0, -1):   
        if is_prime(divisor) and number % divisor == 0:
            return divisor


if __name__ == '__main__':
    time1 = time()
    print(calculate_largest(600851475143))
    print(time() - time1)
\$\endgroup\$
4
  • 3
    \$\begingroup\$ That is not the most efficient solution. I suggest to check some related questions \$\endgroup\$
    – Martin R
    Commented Jul 13, 2019 at 19:21
  • \$\begingroup\$ I have rolled back your last edit. Please don't change or add to the code in your question after you have received answers. See What should I do when someone answers my question? Thank you. \$\endgroup\$
    – Peilonrayz
    Commented Jul 13, 2019 at 20:05
  • \$\begingroup\$ didn't know that, thanks for letting me know, anyway I want to mark it as answered or so. \$\endgroup\$
    – user203258
    Commented Jul 13, 2019 at 20:09
  • \$\begingroup\$ @emadboctor If you want to mark the question as answered then you should upvote answers if they helped you. If you think on of the answers helped the most, then you should press the tick next to that answer, which marks it as the best answer. \$\endgroup\$
    – Peilonrayz
    Commented Jul 16, 2019 at 0:51

2 Answers 2

1
\$\begingroup\$

order

You wrote:

    if is_prime(divisor) and number % divisor == 0:

that is, if A and B:.

But B is typically false and is quickly computed. Prefer if B and A:

stride

Rather than

    for divisor in range(int(number ** 0.5) + 1, 0, -1):

you might want to begin with an odd number and then use a stride of -2. You needn't special case for when number is a power of two, since the input clearly isn't.

sieve

Consider sieving for the solution.

\$\endgroup\$
3
  • 2
    \$\begingroup\$ Yeah I don't think sieving is the way to go with this one. Whilst you can use similar fundamentals they achieve fundamentally different things. \$\endgroup\$
    – Peilonrayz
    Commented Jul 13, 2019 at 19:51
  • 2
    \$\begingroup\$ Thanks, I edited the code and ran it again, time decreased dramatically (0.11 secs) \$\endgroup\$
    – user203258
    Commented Jul 13, 2019 at 20:06
  • 1
    \$\begingroup\$ Sieving is one of the easiest ways to speed up searching for small primes, and anyone can understand how it works. I think it's a fine way of speeding up this code. EDIT: I should note that the way to use a sieve in this case is to make a cache of primes, so you don't have to check every time if a number is prime. \$\endgroup\$
    – Turksarama
    Commented Jul 14, 2019 at 12:14
1
\$\begingroup\$

Problems

There is one correcntness and two performance problems with your approach, both connected with running through the problem from the high numbers and down.

For the correctness problem consider the number 91, it has the factorization of 7*13, but your approach would find 7 to be the largest, since 9 < 91**0.5 < 10 < 13, but the correct one would be 13. The reason we usually only need to consider primes up to the square-root of the target, is that whatever number is left when we have factorized it would be a prime number itself. This includes the number itself if we have found no prime factors.

The first performance problem is that you have no cached data in is_prime, so you have to run over all of the numbers to check if it is a prime every time, this is as opposed to building up knowledge of the primes in the region, which would likely save you a lot of time, as you will likely need to test is_prime for a lot of numbers before you hit the right one.

The second performance problem is that since you are running from above, you cannot make use of decreases to the target when you find smaller prime factors, which would have reduced the space you needed to look at.

Alternative

It is much simpler to write out the entire factorization, and then return the largest factor, than to try to go strictly after the last one.

Another thing to notice is that primes will come before any other numbers that is a factor of that prime, so we can simply factorize from the lowest number and not try to remove non-primes, because we will already have factored out the parts those numbers would factor out, then they will never show up in the factorization.

Note that if you want to do factorization for many numbers, you might want to construct the set of relevant prime numbers, while for a single use it is hard to eliminate a number faster than a single modulus operation.

Here is an implementation of the above principles:

def factorize(target):
    factors = []
    while target % 2 == 0 and target > 1:
        factors.append(2)
        target = target // 2
    max_factor = target **0.5
    factor = 3
    while target > 1 and factor <= max_factor:
        while target % factor == 0 and target > 1:
            factors.append(factor)
            target = target // factor
            max_factor = target ** 0.5
        factor += 2
    if target > 1:
        factors.append(target)
    return factors

print(factorize(600851475143)[-1])

When I measure the time of the above code, it takes roughly 0.3ms to run (a bit more for a single run and a bit less when I run it in a loop, about 0.35ms vs 0.29ms per run).

\$\endgroup\$
2
  • \$\begingroup\$ In the nested while loop, shouldn't target = target // 2 be target = target // factor? \$\endgroup\$
    – RootTwo
    Commented Jul 20, 2019 at 3:09
  • \$\begingroup\$ @RootTwo That was indeed my intent, and I somehow ended up writing it wrong. I have corrected it now. \$\endgroup\$
    – Ninetails
    Commented Jul 20, 2019 at 6:44