Problems
There is one correcntness and two performance problems with your approach, both connected with running through the problem from the high numbers and down.
For the correctness problem consider the number 91
, it has the factorization of 7*13
, but your approach would find 7
to be the largest, since 9 < 91**0.5 < 10 < 13
, but the correct one would be 13
. The reason we usually only need to consider primes up to the square-root of the target, is that whatever number is left when we have factorized it would be a prime number itself. This includes the number itself if we have found no prime factors.
The first performance problem is that you have no cached data in is_prime
, so you have to run over all of the numbers to check if it is a prime every time, this is as opposed to building up knowledge of the primes in the region, which would likely save you a lot of time, as you will likely need to test is_prime
for a lot of numbers before you hit the right one.
The second performance problem is that since you are running from above, you cannot make use of decreases to the target when you find smaller prime factors, which would have reduced the space you needed to look at.
Alternative
It is much simpler to write out the entire factorization, and then return the largest factor, than to try to go strictly after the last one.
Another thing to notice is that primes will come before any other numbers that is a factor of that prime, so we can simply factorize from the lowest number and not try to remove non-primes, because we will already have factored out the parts those numbers would factor out, then they will never show up in the factorization.
Note that if you want to do factorization for many numbers, you might want to construct the set of relevant prime numbers, while for a single use it is hard to eliminate a number faster than a single modulus operation.
Here is an implementation of the above principles:
def factorize(target):
factors = []
while target % 2 == 0 and target > 1:
factors.append(2)
target = target // 2
max_factor = target **0.5
factor = 3
while target > 1 and factor <= max_factor:
while target % factor == 0 and target > 1:
factors.append(factor)
target = target // factor
max_factor = target ** 0.5
factor += 2
if target > 1:
factors.append(target)
return factors
print(factorize(600851475143)[-1])
When I measure the time of the above code, it takes roughly 0.3ms
to run (a bit more for a single run and a bit less when I run it in a loop, about 0.35ms
vs 0.29ms
per run).