Counting numbers smaller than $N$ with exactly $k$ distinct prime factors

Question

Using common notation, $\omega(n)$ is the number of distinct prime factors on $n$. Similiarly, $\Omega(n)$ is the number of prime factors of $n$, not necessarily distinct: $120=2^{3}\cdot 3 \cdot 5$ , therefore $\omega(120)=3$ , $\Omega(120) = 5$.

If we let $W_k(N)$ be the count of numbers not exceeding $N$ with exactly $k$ distinct prime factors {$i | i \leq N , \omega (i) = k$}, is there any way or formulation to calculate it? Or at least a relatively coincise algorithm for it, which does not include factoring each number and checking for $\omega$ individually?

For example, one can easily find that $W_3(100)=8$ , $W_3(1000000)=379720$ , $W_4(1000000)=208034$.

Please note here I am not looking for an asymptotic approximation but rather an exact count. And $N$ can be large, with order of magnitude of $10^{12}$ and over.

I have found that a solution exists for the $\Omega$ counterpart, known as k-Almost Primes, which admits closed-form formulas, but as for $\omega$, I have found non.

Algorithm idea: recursively enumerate the sets of at least $k$ distinct primes having product no more than $N$, and for each such set, count the $n\le N$ divisible by the product, and apply the inclusion-exclusion principle to exclude the multiples of four or more distinct primes. — Karl, Commented Mar 1, 2021 at 19:17
BTW, you should never factor each number individually to compute $\omega(n)$ over a large range. It's far more efficient to iterate over all possible prime divisors $p$ and mark off multiples (which only costs $O(N/p)$ time). That gets you to the $N=10^9$ range with very little effort, and you could compute the exact values of $\omega(n)$ for all $n \le 10^{12}$ in a matter of days, maybe hours. That should be your baseline from which you're looking to target specific values of $\omega$ more efficiently. — Erick Wong, Commented Mar 2, 2021 at 0:44
@MCFromScratch I’m not sure why you say that a segmented sieve is inaccessible. To reach $10^{12}$ you need only iterate over the primes up to $10^6$, so fewer than $100000$ iterations per segment. And the number of different values of $k$ is small, about $\log N$. I agree that a different approach is a worthwhile endeavor. — Erick Wong, Commented Mar 3, 2021 at 7:34
I'd love to see a nice formula like the k-almost prime count. Meanwhile, ranged sieves for the number of factors in a range exist. Perl/ntheory has one (in C) that does either omega or bigomega. Getting the count or iterating over them takes under a second for N=$10^8$, but 1m22s for N=$10^{10}$ so not blazing. Memory use is trivial just as Erick mentioned, and the time grows linearly. A couple hours for N=$10^{12}$. — DanaJ, Commented Mar 3, 2021 at 9:20
@EricSnyder False, we only need to look at small primes. The solution is to keep track of the remainder for each $n$ after dividing out the small prime powers (this requires only a very small amount of additional iteration, as well as a constant factor more memory). At the end, this remainder will either be exactly 1, or it will be a large factor that is necessarily prime. Either way we can deduce the exact value of $\omega(n)$. — Erick Wong, Commented Mar 8, 2021 at 0:29

Eric Snyder · Accepted Answer · 2021-03-09 01:48:18Z

2

The system asked if I really wanted to post another answer, but I really do feel like this belongs separately. Two things, actually. First, a simple (thought not algebraic) expression for $W_1(N)$, in terms of the prime counting function:

$$W_1(N) = \sum_{i = 1}^{\lfloor \log_2 N \rfloor} \pi \left(N^{\frac{1}{i}} \right)$$

$W_1(N)$ is just all the primes and their prime powers. Primes $\sqrt{N} \leq p$ will only have their first power on the list. Primes $\sqrt[3]{N} \leq p \leq \sqrt{N}$ will also have their second power on the list, and are counted twice, and so on to $i = \lfloor \log_2 N \rfloor$, for which $\pi(N^{\frac1i}) = 1$, that is, only $2$.

Second, a different take on algorithmic sieving. We start with:

$$|\{i | i \leq N , \omega (i) = k\}| = |\{i | i \leq N , \omega (i) \geq k\}| - |\{i | i \leq N , \omega (i) \geq k+1\}|$$

That is, for say $k=4$, if we find the set of integers where $\omega(i) \geq 4$, then remove everything that has $\omega(i) \geq 5$, what remains must have $\omega(i) = 4$. We do this because it means we can just use multiples without worrying about whether the multiples have specific factors. So for instance, $2\cdot3\cdot5\cdot7 = 210$ has four factors. Every multiple of it must have at least as many factors, in some cases many more.

I first thought "Hey we can just divide to find how many multiples there are," but that doesn't work--we end up counting the same thing multiple times. But then it struck me that the only numbers we'll count multiple times are the ones that have at least five distinct factors. So we only need to run this enumeration once, then remove all copies of any values that appear more than once in the list.

There are almost certainly optimizations to be done once you get to larger primes, see the above discussion. Right now my issue seems to be the time used to find and remove the multiples is dominating the other calculations. Weird. Will comment later with Sage code.

answered Mar 9, 2021 at 1:48

Eric Snyder

3,0499 silver badges23 bronze badges

$\begingroup$ Indeed, if we have a fast $\omega(n) \ge k$ function, this can give us a reasonable way to get $\omega(n) = k$. Nothing immediately occurred to me however. Code for k=1 / prime_power_count, or similar OEIS A267712. Also of possible interest is page 12 of arxiv paper though other than k=1 it just concerns asymptotic behaviour. $\endgroup$
– DanaJ
Commented Mar 9, 2021 at 6:42
$\begingroup$ We do have $N = \prod (N) + |\{i | i \leq N , \omega (i) \geq 2\}| +1$ so count for $\omega (n) \geq 2$ is reachable (with $\prod (N)$ being the prime powers counting function). Perhaps extention can be made. $\endgroup$
– MC From Scratch
Commented Mar 9, 2021 at 16:50
$\begingroup$ Having tried a few ways of making this happen in SageMath, I think the algorithm in the second part I wrote here looks great on paper but runs in some ridiculous time like $O(e^{e^n})$. The issue is that even working with the smallest $100$ primes and $k=4$, ${}_{100}C_4$ is nearly 4 million, and the combinatorics stack up super-exponentially from there. Iterating over 30 trillion combinations, even at a microsecond each, still takes a year... $\endgroup$
– Eric Snyder
Commented Mar 9, 2021 at 23:23
$\begingroup$ OK apparently my problem is that no matter which of the various algorithms talked about above that people have suggested, the implementation of factorization in SageMath is so highly optimized that, at least for $N < 10^6$ or so, just using the builtin factorization is much faster than anything else. I assume on any other system this would be a terrible idea, but such is life. $\endgroup$
– Eric Snyder
Commented Mar 10, 2021 at 4:13
$\begingroup$ @EricSnyder What are you using? Using something like sum([1 for i in [1..n] if len(factor(i))==k]) is very slow for me. Maybe I'm missing something, but it shouldn't be that much slower than Pari or Perl. $N_4(10^8)$ in 553s using simple SAGE loop, 40s using simple Pari loop, 21s using simple Perl loop, 0.6s using sieve. All the latter go quickly to C code, so not "fair" in some sense, but SAGE should be going to Pari for the factoring. $\endgroup$
– DanaJ
Commented Mar 10, 2021 at 14:41

| Show 2 more comments

DanaJ · Accepted Answer · 2021-03-16 06:13:55Z

$k=0$

$$N_0(n) = \begin{cases} 0 & \text{if $n < 1$} \\ 1 & \text{if $n >= 1$} \end{cases}$$

Since $n=1$ is the only non-negative integer with $\omega(n) = 0$, this is simple. Any $n >= 1$ returns 1, otherwise return 0.

$k=1$

$$ \begin{align} N_1(n) & = \text{prime_power_count}(n) \\ & = \sum_p^n \sum_{x=1}^{\lfloor\log_{p}(n)\rfloor} 1\\ & = \sum_p^n \lfloor\log_{p}(n)\rfloor\\ & = \pi\left(\lfloor n^{1/2} \rfloor\right) + \sum_{p}^{\lfloor\sqrt n\rfloor} \lfloor\log_{p}(n)\rfloor - 1\\ & = \sum_{k=1}^{\lfloor\log_2(n)\rfloor} \pi\left(\lfloor n^{1/k}\rfloor\right) \end{align}$$

These are the prime powers. We can do a sum over the primes, then the prime powers, or just do the obvious thing and turn the prime power loop into a single integer log. The second to last equation is noting that after $\sqrt(n)$ all the logs are 1, so we can do a single prime count for all the $p^1$ cases, leaving us only having to examine the primes up to $\sqrt(n)$. We can continue this logic to its end and get the last equation which is a series of prime counts. This is what a number of software packages implement, and it is quite fast given a modern prime count method.

$k=2$

$$ \begin{align} N_2(n) & = \sum_{p}^{\lfloor\sqrt n\rfloor} \sum_{i=1}^{\lfloor\log_{p}(n)\rfloor} \sum_{q = p+1}^{\lfloor n/{p^i} \rfloor} \lfloor\log_{q}(\lfloor n/{p^i} \rfloor)\rfloor && \text{$p$ and $q$ over primes} \\ & = \sum_{p}^{\lfloor\sqrt n\rfloor} \sum_{i=1}^{\lfloor\log_{p}(n)\rfloor} \bigl( \pi\left(\lfloor n/{p^i} \rfloor\right) - \sum_q^p \lfloor\log_{q}(\lfloor n/{p^i} \rfloor)\rfloor \bigr) && \text{$p$ and $q$ over primes} \\ \end{align}$$

The first two sums are for summing over the prime powers. In an implementation it's still two loops, but a bit more efficient. We'd also calculate $n/{p^i}$ once rather than doing it over and over inside the innermost loop. The second version is much faster given a decent prime count method, because the final sum is over a small number of primes rather than many.

$k=3$

$$ \begin{align} N_3(n) & = \sum_{p}^{\lfloor n^{1/3} \rfloor} \sum_{i=1}^{\lfloor\log_{p}(n)\rfloor} \sum_{q=p+1}^{\lfloor n^{1/2} \rfloor} \sum_{j=1}^{\lfloor\log_{q}(n/{p^i})\rfloor} \bigl( \pi\left(\lfloor n/{(p^i p^j)} \rfloor\right) - \sum_r^q \lfloor\log_{q}(\lfloor n/{(p^i p^j} \rfloor)\rfloor \bigr) \\ \end{align}$$

Just showing the more efficient innermost loop here. $p$, $q$, and $r$ run over the primes. This looks more complicated than it really is, as again we have two sums indicating the prime powers. We are trying to count the $p^i q^j r^k <= n$. We are going to count in order, so $p < q < r$. So as we loop through the powers $p^i$ we then find $q^j r^k <= n/p^i$, and so on.

The solution for $k = 4$ should be clear at this point, as we just add another outer sum pair, and continue for higher $k$.

It is possible this can be written in a better style, especially turning the double sum for primes powers into a single one, though note we use the actual prime for the start of the next loop. It is also not unlikely this can be further simplified.

In C or Perl code, this turns into a sub-20 line recursive function that handles arbitrary k. Albeit this is with a library providing logint, rootint, prime_power_count, prime_count, and fast looping over primes.

As a timing example:

time say omega_prime_count(2,powint(10,9)):

5.9s  efficient ranged nfactor sieve
1.2s  first double summation method
0.03s second double summation method

time for n=10^10:  0.13 seconds.  (15s for the first, 61s for the sieve)

For $k=8$, I get omega_prime_count(8,10^8) = 719 in 66 microseconds, vs. a naive loop in Pari/GP taking 38 seconds. omega_prime_count(8,10^12) = 953640790 takes 2.8 seconds.

Some interesting papers:

For $k=2$, I can't follow how you can write $$ \begin{align} \sum_{q = p+1}^{\lfloor n/{p^i} \rfloor} \lfloor\log_{q}(\lfloor n/{p^i} \rfloor)\rfloor && \text{$p$ and $q$ over primes} \\ = \bigl( \pi\left(\lfloor n/{p^i} \rfloor\right) - \sum_q^p \lfloor\log_{q}(\lfloor n/{p^i} \rfloor)\rfloor \bigr) && \text{$p$ and $q$ over primes} \\ \end{align}$$ I even wote a small script to test and found its gving different results. — user1057393, Commented Oct 24, 2022 at 14:28
instead of $$\pi\left(\lfloor n/{p^i} \rfloor\right)$$ shouldn't it be $$N1\left(\lfloor n/{p^i} \rfloor\right)$$ ? — user1057393, Commented Oct 24, 2022 at 15:11

Severus' Constant · Accepted Answer · 2024-06-08 08:56:48Z

3 years later.

I am currently working this in my master's thesis and I obtain some results.

I found an asymptotic approximation for numbers where $\omega(i) = 2$, but instead of covering all $i$ values, you get to choose which prime factors to consider. For example, let the prime factors be $p_1$ and $p_2$. Then, we have $\pi_{p_1,p_2}(n) \sim \frac{\log^2 n - (\log n)(\log p_1 + \log p_2)}{2 \log p_1 \log p_2}$. For instance, if we choose $p_1=2$, $p_2=3$, and $n=1000$, the graph will look like the one shown below.

If you sum up these approximation for other prime pairs, you can obtain your approximation. I tried this but I failed because of big error terms.

Generalizing this, I derived the following formula:

$$\pi_{p_1,p_2,...,p_k}(n) \sim \frac{1}{k!} \prod_{i=1}^k \left( \lfloor \frac{\log n}{\log p_i} \rfloor + 2 \right)$$

However, I could not prove this. It remains as my hypothesis. Still, it yields strong results when calculated computationally. For example, selecting $p_1=2, p_2=3, p_3=5, p_4=7$ results in the graph shown below.

My thesis written in Turkish. Blue curve is exact value, and red one is our approximation.

Eric Snyder · Accepted Answer · 2021-03-02 00:16:11Z

I think Karl above has a good start to an algorithmic answer here. But I don't think you even look at divisibility or inclusion-exclusion. I think you actually find every value of $i$ but in a faster way, without checking every $i < N$--just enumerate the possible values of $\{i \mid \omega(i) = k\}$ via multiplication. For the sake of an algorithm, I think you treat the factorizations in their exponent array/vector form; for instance

$$1260 = 2^2\cdot3^2\cdot 5^1\cdot7^1 = [2,2,1,1] \\43472 = 2^4\cdot3^0\cdot 5^0\cdot7^0\cdot11^1\cdot13^1\cdot 17^0\cdot19^1 = [4,0,0,0,1,1,0,1] $$

I'm not an algorithm-optimization person. But here's what I'd do (for $k=4$), because I think it gets you to stop points in the shortest time.

Start with the initial array $[a_2, a_3, a_5, a_7] = [1,1,1,1] = 2\cdot3\cdot5\cdot7 = 210$. Determine the number of different lists of values where all of the elements are $1 \leq a_i \leq 2$ and at least one value is $a_i = 2$. These are: one $2$, two $2$s, three $2$s, and four $2$s.

For each of those possible lists, generate all the permutations, in reverse lexicographical order. This would get you (eliminating brackets and commas for space reasons):

$$2111, 1211, 1121, 1112; 2211, 2121, 2112, 1221, 1212, 1122; 2221, 2212, 2122, 1222; 2222$$

At each step, of course, generate the actual numbers:

$$420, 630, 1050, 1470;\\ 1260, 2100, 2940, 3150, 4410, 7350;\\ 6300, 8820, 14700, 22050, 44100$$

and add to a counter if the number is less than $N$. In each group you can stop when you hit a number that's too high.

Now do the same thing for $1 \leq a_i \leq 3$, with at least one value $a_i = 3$, and so on. You'll want some recursion, and also some recursion. For $n = 10^{16}$, for instance, at the very least $[46,1,1,1]$ and $[44,2,1,1]$ are possibilities.

But eventually you'll hit a point where none of the possibilities is less than $N$. I suggested reverse lexicographical order, as you should hit the smaller numbers earlier each time around. You'll never even try, for instance, $[10,10,10,10]$

Now you add another element to the array to represent $11$, and do the same thing, with the constraint that only four elements can be non-zero. Then add another element for $13$, etc. Eventually you should hit a point where nothing is under $N$, probably when your array looks like $[0,0,....,0,1,1,1,1]$ and the last four elements are the last four primes before the fourth root of $N$.

I may come back later and try to pseudocode this.

Edit: some code available here for lex order, there's code at the bottom for anti-lex order but only in Java, but I'm sure you can generate similar code in your personal favorite language.

Having had this in my brain for a few days, realize there's at least one large issue. There will be a solution for every $30p$ for $p=7,11,...p_n$, where $p_n$ is the largest prime $p < N/30$. So for $10^{16}$, you're looking at using all the primes up to $3\times10^{14}$ or so.\\ There's probably a way to speed this up with seiving, but this'll be true for $42,66,70,105,110...$ I guess the good news is you only need to use those large primes a few times, but primes $N/42 < p < N/30$ get used twice, $N/66 < p < N/42$ three times, etc. — Eric Snyder, Commented Mar 7, 2021 at 23:15

Stack Exchange Network

Counting numbers smaller than $N$ with exactly $k$ distinct prime factors

4 Answers 4

You must log in to answer this question.

Not the answer you're looking for? Browse other questions tagged
combinatorics
number-theory
prime-numbers
prime-factorization
.

Linked

Hot Network Questions

Counting numbers smaller than $N$ with exactly $k$ *distinct* prime factors

4 Answers 4

You must log in to answer this question.

Not the answer you're looking for? Browse other questions tagged combinatoricsnumber-theoryprime-numbersprime-factorization.

Linked

Related

Hot Network Questions

Counting numbers smaller than $N$ with exactly $k$ distinct prime factors

Not the answer you're looking for? Browse other questions tagged
combinatorics
number-theory
prime-numbers
prime-factorization
.