2
$\begingroup$

Background

Some crypto algorithms rely principally on the difficulty of factoring two prime numbers $p$ and $q$. For the purpose of this discussion, assume $p<q$.

(The top of this article is background and and explanation with examples to frame my question; the questions for the SE community are at the bottom.)

What can we learn if we setup a quadratic equation "to win" as follows:

1. $(x-p)(x-q)=0$ such that the zeros of the function represent $p$ and $q$.

This factors and simplifies as:

2. $x^2-(p+q)x-pq$ where the term $pq$ is publicly known, but $p$ and $q$ are unknown.

We can solve for $p$ and $q$ by getting zeros from the quadratic equation, so solving for $p$ and $q$ is reduced to finding the unknown value of $(p+q)$:

3. $ -b \pm\sqrt{b^2-4pq}\over{2} $ where $b$ is $(p+q)$.

Thus, if we can find $(p+q)$ given $pq$ then we can find $p$ and $q$. Since $p$, $q$, and $(p+q)$ are unknown, lets rename for convenience as follows:

3a. $b = p+q$

Noting that pq is public, lets define it as $N$:

3b. $N = pq$

Thus, $b$ is the only unknown to solve for $p$ and $q$:

4a. $p = -b - \frac{\sqrt{b^2-4N}}{2} $

4b. $q = -b + \frac{\sqrt{b^2-4N}}{2} $

Since we don't know $b$, lets create function $F_p(b)$, which is the same as (4a) when $b=p+q$:

4c. $F_p(b) = -b + \frac{\sqrt{b^2-4N}}{2} $

Let us say that $b$ is "optimal" when $b=p+q$ since we can, then, trivially solve for $q$. Thus, when we guess $b$, $F_p(b)$ provides a guess of what $p$ would be if $b$ were optimal.

To understand the behavior of $F_p(b)$, we graph over $b=0..(p+q)$ with contrived and known values of $p$ and $q$ to see if any information about $(p+q)$ can be discerned. Of course when $b$ is optimal (we guess right), then $F_p(p+q) = p$ and we can trivially solve for $q$.

Example 1: When $\frac{q}{p} < 4$, we find $q < 2\sqrt{N}$, thus $q$ is bound by an asymptote, reducing the factoring search space:

For the purposes of graphing, lets choose:

  • $p = 131$
  • $q = 337$

Therefore, optimally:

  • $b = 131+337 = 468$

The graph of $y=F_p(b)$ from $b=0..468$ is as folows:

graph of <span class=$y=F_p(b)$ from 0..468" />

The inflection point can be found by taking the derivative and finding the asymptote:

5. $\frac{b}{2\sqrt{b^2 - 4N}} - \frac{1}{2}$ : derivative of $F_p(b)$

The derivative's asymptote is infinite as the bottom approaches 0, so solving $2\sqrt{b^2 - 4N}=0$ for $b$ we get:

6. $2\sqrt{N}$ : Asymptote of $F_p(b)$ given $N$.

Plugging in $N$ into (6) we find that the asymptote is at $b\approx 420.22$

Note that both $p$ and $q$ are on the left side of the asymptote which reduces our search space for $b$ as $b < 2\sqrt{N}$. This is possible because we can find the asymptote by (6) without actually knowing $p$ and $q$.

Example 2: When $\frac{q}{p} > 4$, we find $q > 2\sqrt{N}$, thus $q$ is unbound by an asymptote:

For this example lets choose $q$ as the next larger prime than $4p$:

  • $p = 131$
  • $q = 541$

Therefore, optimally:

  • $b = 131+541 = 671$

Plugging in $N$ into (6) we find that the asymptote is at $b\approx 532.43$. Thus, $q$ is beyond the asymptote.

Summary

  • In Example 1 we could simply take $2\sqrt{pq}$ from (6) and, since $\frac{q}{p}<4$ then the search space for $q$ is limited by the asymptope from (6).

  • In Example 2, $q$ is beyond the asymptote from (6) and thus $q$ is harder to find as it is unbound by this evaluation.

  • In both examples, $2\sqrt{pq} < (p+q)$ so a good lower starting guess for $b$ is $2\sqrt{pq}$

Questions:

I know in practice that $p$ and $q$ are chosen on the order of $2^{2048}$ or bigger, so it is the relationship between the size of $p$ and $q$ with respect to each other I would like to understand, and therein lies my question:

  • Is this behavior of $\frac{q}{p}<4$ that leads to a weaker product a known property?
    • Is it related to quadratic residues? If so, how?
    • If this is a weakness, has an attack been proposed?
  • Does $2\sqrt{pq} < (p+q)$ always hold true as a lower-bound to begin the search for $b$? Counter-example?
  • In modern cryptography is it already best practice to choose $q$ at least some notably larger multiple from $p$?
    • What is the recommended practice? If different that $4p$, how was the proposed distance chosen?
  • Is the search for $(p+q)$ going to be more, less, or equal in complexity to current prime factorization algorithms?
$\endgroup$

1 Answer 1

1
$\begingroup$

Nice question.

You don't factor $p,q$ they are primes! You (attempt) to factor the product $n=pq.$

The sizes of the choice of primes constrain what can be done. Usually we want to choose the two primes not too different than each other in size. In fact we don't choose primes, we choose random odd numbers and then apply primality tests to the chosen numbers.

A related factoring method to what you are suggesting is quadratic sieve, invented by Pomerance many decades ago.

Idea behind Quadratic Sieve: If $n$ is to be factored, the Quadratic Sieve attempts to find two numbers, $x,y$ such that $x$ is not $\pm y$ modulo $n$ and $x^2=y^2 \pmod n.$ This then implies that $(x-y)(x+y)=0\pmod n$ and we can then find $gcd(x-y,n)$ using the Euclidean algorithm to see if $x-y$ is actually a nontrivial divisor of $n.$

That is not even the best known factoring method, it is the number field sieve once $n=pq$ is large enough, well before we consider RSA size primes of roughly $2048$ bits.

Both of these methods are explained reasonably well in Wikipedia, with sources provided.

$\endgroup$
1
  • $\begingroup$ Can you comment about the (q/p) < 4 behavior? I found that rather curious, what does the inflection point have to do with q/p, and why 4? $\endgroup$
    – KJ7LNW
    Commented Oct 25, 2021 at 20:35

You must log in to answer this question.

Not the answer you're looking for? Browse other questions tagged .