
Some of you may have been following this question, which was closed for not being research-level. So I'm extracting the part of the question that is at a research level.

Beyond the "simpler" techniques, such as reducing to sorting or an EXPTIME-complete problem, what techniques have been used to prove lower bounds for the time complexity of a problem?

In particular:

  • What are the "cutting-edge" techniques that have been developed in the last decade?
  • Can techniques from Abstract Algebra, Category Theory, or other branches of typically "pure" mathematics be applied? (For example, I often hear mention of the "algebraic structure" of sorting, without any real explanation of what this means.)
  • What are significant but lesser-known results for lower-bound complexity?
  • Are you interested in lower bounds for function computation problems, or lower bounds for anything, including distributed computing, data structures, etc.? – Kaveh, Jul 22, 2013 at 21:22
  • I'm primarily interested in computing functions. I'm sure once you go parallel, that's a whole different kettle of fish. – Jul 22, 2013 at 21:29
  • Distributed is not the same as parallel. :) – Kaveh, Jul 22, 2013 at 21:30
  • True, true. I mean, it's not what I had in mind, but it's not like I'm against answers for distributed computation. – Jul 22, 2013 at 21:31
  • Sure, I just asked because there are lower bound results in distributed computing which use quite advanced mathematics. – Kaveh, Jul 22, 2013 at 21:35

6 Answers


Lower bounds for algebraic circuits

In the setting of algebraic circuits, where a lower bound on circuit size is analogous to a lower bound on time, many results are known, but there are only a few core techniques in the more modern results. I know you asked for time lower bounds, but I think in many cases the hope is that the algebraic lower bounds will one day lead to Boolean/Turing machine lower bounds. These results often use deeper techniques from "pure math" as you put it.

I. The degree bound.

Strassen showed that the log of the degree of a certain algebraic variety associated to a (set of) function(s) is a lower bound on the algebraic circuit size of computing those functions.
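
A standard illustration, if I recall the computation correctly: for the map $x \mapsto (x_1^d, \dots, x_n^d)$, the associated variety (the graph of the map) has degree $d^n$, so the degree bound gives a circuit-size lower bound of roughly $\log d^n = n \log d$, which repeated squaring shows is tight up to constant factors.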

II. Connected components (or more generally the dimension of any higher homology group).

Ben-Or showed that the size of a real algebraic decision tree deciding membership in a (semi-algebraic) set is at least $\log C$ where $C$ is the number of connected components of that set. Ben-Or used this to prove an $\Omega(n \log n)$ lower bound on sorting (well, element distinctness, but element distinctness reduces to sorting) in the real algebraic decision tree model. Yao extended this from connected components to the sum of the Betti numbers and proved optimal lower bounds for other problems (like $k$-equals). In a different paper, Yao extended this to algebraic decision trees over the integers.
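
For the element-distinctness application, the count is short: the set $W = \{x \in \mathbb{R}^n : x_i \neq x_j \text{ for all } i \neq j\}$ splits into one connected component for each ordering of the coordinates, so $C = n!$ and the bound above gives $\log C = \log n! = \Omega(n \log n)$ by Stirling's formula.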

III. Partial derivatives.

This has been the workhorse of many of the modern algebraic circuit lower bounds. I believe partial derivatives were first used to prove a lower bound by Baur-Strassen, where they showed that computing all the first partials of $f$ can be done in size $5s$ where $s$ is the size needed to compute $f$. Combined with Strassen's degree bound, this gave $\Omega(n \log n)$ size lower bounds on various functions, which are still the strongest lower bounds on the size of unrestricted arithmetic circuits for an explicit function.

The more recent use of partial derivatives seems to stem from a paper of Nisan's in which he proved lower bounds on noncommutative circuits by considering the dimension of the space of all partial derivatives. This was used to prove lower bounds on restricted kinds of depth-3 circuits by Nisan-Wigderson, and similar ideas were used to prove lower bounds on multilinear formula size by Raz (and related models by Raz and collaborators). The very recent depth 4 and depth 3 lower bounds by Gupta, Kayal, Kamath, and Saptharishi use a generalization of this idea, to count the dimension of the space of "shifted partial derivatives" - where you can take partial derivatives and then multiply by any monomials of a given degree. Improving the GKKS result to a super-polynomial size lower bound on permanent (which would imply $\mathrm{VP} \neq \mathrm{VNP}$) may "just" be a matter of getting a better understanding of the ideal generated by permanental minors (see the conjecture at the end of their paper).
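
As a toy illustration of the partial-derivative measure, here is a minimal SymPy sketch (my own illustration with an illustrative function name, not taken from any of the papers above) that computes the dimension of the span of all order-$k$ partial derivatives of a polynomial:

    from itertools import combinations_with_replacement
    import sympy as sp

    def partial_derivative_dimension(f, variables, k):
        """Dimension of the span of all order-k partial derivatives of f."""
        derivs = [sp.expand(sp.diff(f, *combo))
                  for combo in combinations_with_replacement(variables, k)]
        polys = [sp.Poly(d, *variables) for d in derivs if d != 0]
        if not polys:
            return 0
        # Write each derivative as a coefficient vector over the monomial basis
        # spanned by all monomials appearing in the derivatives.
        monomials = sorted({m for p in polys for m in p.monoms()})
        index = {m: i for i, m in enumerate(monomials)}
        rows = []
        for p in polys:
            row = [0] * len(monomials)
            for mono, coeff in zip(p.monoms(), p.coeffs()):
                row[index[mono]] = coeff
            rows.append(row)
        return sp.Matrix(rows).rank()

    # The multilinear monomial x1*x2*x3*x4 has C(4,2) = 6 linearly independent
    # second-order partials, while the 4th power of a linear form has only one
    # up to scaling.
    xs = sp.symbols('x1:5')
    print(partial_derivative_dimension(sp.Mul(*xs), xs, 2))   # 6
    print(partial_derivative_dimension(sum(xs)**4, xs, 2))    # 1

The gap between the two example values (6 versus 1) is the kind of gap these proofs exploit: the measure stays small for everything the restricted circuit class can compute, but is large for the target polynomial.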

IV. Defining equations for varieties.

The idea here is to associate to "easy functions" a certain algebraic variety, find equations that vanish on this variety, and show that these equations do not vanish on your "hard function." (Hence proving that your hard function is not in the variety of easy functions, so that it's actually hard.) Especially useful in lower bounds on matrix multiplication. See Landsberg--Ottaviani on the arXiv for the latest, and references to prior lower bounds.

(In fact, I, II, and III above can all be viewed as special cases of finding defining equations for certain varieties, though the proofs that use I,II,III are essentially never phrased that way, as there wasn't really a need to.)

V. Representation theory, esp. as in geometric complexity theory.

Actually, also used by Landsberg--Ottaviani to find some equations for a certain variety. Also used by Burgisser-Ikenmeyer to get a "purely" representation-theoretic proof of a slightly weaker lower bound on matrix multiplication. Conjectured by Mulmuley and Sohoni (cf. "Geometric Complexity Theory I & II") to be useful to resolve $\mathrm{VP}$ vs $\mathrm{VNP}$ and ultimately $\mathrm{NP}$ vs. $\mathrm{P/poly}$.


Kaveh has gently suggested in his answer that I should say something. I don't have much else to contribute to this nicely comprehensive list of answers. I can add a few generic words about how "structural complexity" lower bounds have evolved over the past ten years or so. (I use the name "structural complexity" simply to distinguish from algebraic, communication complexity, etc.)

The current approaches are still largely based on diagonalization, and in particular the following basic paradigm: Start by assuming the opposite of the lower bound. This gives you a nice algorithm for some problem. Try to use that algorithm to contradict some hierarchy theorem based on diagonalization, such as the time hierarchy or space hierarchy. As diagonalization arguments alone are not enough to prove new lower bounds, other ingredients are added to the mix in order to obtain the contradictory recipe.

I should say that many arguments from the 70s and 80s can also be said to follow the above pattern; the primary difference nowadays is the "other ingredients" -- there are many ingredients to choose from, and the ways in which ingredients can be applied seem to be limited only by your own creativity. Sometimes, when you don't know how to mix particular ingredients to get any better recipe, but you understand very well how they can mix, it helps to code up a computer program that suggests new recipes for you.

It would be very interesting to obtain new proofs of recent lower bounds that definitively do not follow this paradigm. For instance, can $NEXP \not\subset ACC$ be proved without any reference to a diagonalization argument? To start with, can it be proved without invoking the nondeterministic time hierarchy theorem? (Could one use the "circuit size hierarchy" instead, for example?)


The techniques depend on the model and the type of resource we want to get a lower bound on. Note that to prove a lower bound on the complexity of a problem we first have to fix a mathematical model of computation: a lower bound for a problem states that no algorithm using some amount of resources can solve the problem, i.e. we are quantifying universally over algorithms, so we need a mathematical definition of the domain of quantification. (This is generally true for impossibility results.) Therefore, lower bound results hold only for a particular model of computation. For example, the $\Omega(n \log n)$ lower bound for sorting only works for comparison-based sorting algorithms; without this restriction, in more general models of computation it might be possible to sort faster, even in linear time. (See Josh's comment below.)
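
For the comparison model the counting is a one-liner: a comparison tree of depth $t$ has at most $2^t$ leaves, and a correct sorting algorithm must be able to produce all $n!$ output permutations, so $2^t \geq n!$, i.e. $t \geq \log_2 n! = \Omega(n \log n)$.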

Here are a few basic direct methods of proving lower bounds in computational complexity theory for the more general models of computation (Turing machines and circuits).

I. Counting:

Idea: We show that there are more functions than algorithms.

Ex: There are functions which require exponentially large circuits.

The problem with this method is that it is purely existential: it gives neither an explicit hard function nor even an upper bound on the complexity of the functions it proves to be difficult.
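
To sketch the counting for the example above (with constants suppressed): there are $2^{2^n}$ Boolean functions on $n$ bits, while the number of circuits of size $s$ over a fixed finite basis is only $2^{O(s \log s)}$, so

$$2^{O(s \log s)} < 2^{2^n} \quad\text{whenever } s = o(2^n/n),$$

and hence some function on $n$ bits requires circuits of size $\Omega(2^n/n)$; but the argument does not tell us which function.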

II. Combinatorial/Algebraic:

Idea: We analyze the circuits and show that they have a particular property, e.g. that the functions computed by them can be approximated by some nice class of mathematical objects, while the target function does not have that property.

Ex: Håstad's switching lemma and its variants use decision trees to approximate $\mathsf{AC^0}$, Razborov-Smolensky use polynomials over finite fields to approximate functions in $\mathsf{AC^0}[p]$, etc.
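
Roughly, and from memory with the exact constants suppressed, the Razborov-Smolensky template is: every function computed by an $\mathsf{AC^0}[p]$ circuit of size $s$ and depth $d$ agrees with some polynomial over $\mathbb{F}_p$ of degree $(\log s)^{O(d)}$ on all but a small fraction of inputs, while any $\mathbb{F}_p$-polynomial agreeing with $\mathsf{MOD}_q$ (for a prime $q \neq p$) on a large enough constant fraction of inputs must have degree $\Omega(\sqrt{n})$; comparing the two degree bounds forces $s \geq 2^{n^{\Omega(1/d)}}$.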

The issue with this method is that in practice it has only worked for small and relatively easy-to-analyze classes. There is also Razborov and Rudich's Natural Proofs barrier, which in a way formalizes why simple properties by themselves are unlikely to be sufficient for proving more general circuit lower bounds.

Razborov's paper "On the method of approximation" argues that the approximation method is complete for proving lower bounds in a sense.

III. Diagonalization:

Idea: We diagonalize against the functions in the smaller class. The idea goes back to Gödel (and even Cantor).

Ex: Time hierarchy theorems, space hierarchy theorems, etc.
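
To make the idea concrete, here is a minimal sketch of the diagonal machine behind the time hierarchy theorem; `universal_simulate` is a hypothetical stand-in for a resource-bounded universal Turing machine, not a real library function:

    # A sketch, not runnable end-to-end: universal_simulate(code, x, budget) is
    # assumed to run the machine encoded by `code` on input `x` for at most
    # `budget` steps and return True iff it accepts within that budget.
    def diagonalizer(code, budget, universal_simulate):
        # Run the encoded machine on its own description within the budget ...
        accepted = universal_simulate(code, code, budget)
        # ... and answer the opposite.  The resulting language differs from the
        # language of every machine running within the budget, yet it can be
        # decided with slightly more time (the overhead of the simulation),
        # which is exactly the separation the hierarchy theorem states.
        return not accepted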

The main issue with this method is that to get an upper bound on the diagonalizing machine we need a universal simulator for the smaller class, and it is difficult to find good nontrivial simulators. For example, to separate $\mathsf{P}$ from $\mathsf{PSpace}$ we would need a simulator for $\mathsf{P}$ inside $\mathsf{PSpace}$, and there are results showing that if such simulators exist they are not going to be nice. Therefore, we usually end up separating classes that use the same type of resource, where with a bit more of that resource we can universally simulate the smaller class.

We also have the relativization barrier (going back to Baker, Gill, and Solovay) and the algebrization barrier (due to Aaronson and Wigderson), which state that particular types of diagonalization arguments transfer to other settings where the result is provably false.

Note that these barriers do not apply to more general diagonalization arguments. In fact, by Dexter Kozen's paper "Indexing of subrecursive classes", diagonalization is, in a sense, complete for proving lower bounds.

As you have probably noticed, there is a strong relation between finding good universal simulators for a complexity class and separating that complexity class from larger classes (for a formal statement see Kozen's paper).

Recent work

For recent advances, check Ryan Williams's recent papers. I don't discuss them in this answer as I hope Ryan himself will write an answer.

  • Re sorting: In fact, in the RAM model one can beat $n \log n$, though $O(n)$ time is not yet known. Also, re: III (diagonalization): it's worth mentioning that Ryan Williams's NEXP vs ACC^0 result ultimately relies on the nondeterministic time hierarchy theorem (a diagonalization argument), but to get there it combines many different results and algorithms in clever ways. – Jul 23, 2013 at 0:26
  • Every lower bound works only in a particular model of computation, not just the sorting lower bound. Turing machines and Boolean circuits are models of computation, too. – JɛffE, Jul 23, 2013 at 1:09
  • @JɛffE, I think that is implicit in the first sentence of my answer but I will clarify it. – Kaveh, Jul 23, 2013 at 2:17
  • I think this point should be explicit. It's too often ignored. – JɛffE, Jul 23, 2013 at 2:22

Algebraic decision trees

This is not a recent technique, but one that is quite powerful for certain problems.

The algebraic decision tree model is a powerful generalization of comparison trees. In this model, an algorithm is modeled as a non-uniform family of decision trees, one for each input size $n$. Specifically, a $d$th-order algebraic decision tree is a rooted ternary tree with the following structure:

  • Each non-leaf node $v$ is labeled with a multivariate query polynomial $q_v(x_1, \dots, x_n)$ of degree at most $d$. For example, in a comparison tree, every query polynomial has the form $x_i-x_j$ for some indices $i$ and $j$.

  • The edges leaving every non-leaf node are labeled $-1$, $0$, and $+1$.

  • Each leaf is labeled with a possible output description. For example, for the sorting problem, each leaf is labeled with a permutation of the set $\{1,2,\dots,n\}$. For decision problems, each leaf is labeled "yes" or "no".

Given an input vector $\vec{x}\in\mathbb{R}^n$, we compute by traversing a path downward from the root, branching according to the sign of the query polynomials in the visited nodes. The traversal eventually reaches a leaf; the label of that leaf is the output. The "running time" of the algorithm is defined to be the length of the traversed path; thus, the worst-case running time is the depth of the decision tree.

Note in particular that each query polynomial may have $\Omega(n^d)$ distinct terms; nevertheless, the model assumes that we can evaluate the sign of any query polynomial in constant time.
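
To make the model concrete, here is a minimal sketch of how one tree in such a family computes, using SymPy expressions for the query polynomials (the class and function names are just for illustration):

    import sympy as sp

    class Node:
        def __init__(self, query=None, children=None, output=None):
            self.query = query        # query polynomial at an internal node
            self.children = children  # dict mapping sign -1 / 0 / +1 to a child
            self.output = output      # output label at a leaf

    def evaluate(tree, point):
        """Branch on the sign of each query polynomial at `point` until a leaf."""
        node, steps = tree, 0
        while node.output is None:
            sign = int(sp.sign(node.query.subs(point)))
            node, steps = node.children[sign], steps + 1
        return node.output, steps   # output label and path length ("running time")

    # A depth-1 comparison tree: the single query polynomial is x1 - x2.
    x1, x2 = sp.symbols('x1 x2')
    tree = Node(query=x1 - x2,
                children={-1: Node(output="x1 < x2"),
                           0: Node(output="x1 = x2"),
                          +1: Node(output="x1 > x2")})
    print(evaluate(tree, {x1: 3, x2: 5}))   # ('x1 < x2', 1)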

For each leaf $\ell$, let $R(\ell) \subseteq \mathbb{R}^n$ denote the set of input vectors for which execution reaches $\ell$. By construction, $R(\ell)$ is a semi-algebraic subset of $\mathbb{R}^n$ defined by at most $t = \text{depth}(\ell)$ polynomial inequalities of degree at most $d$, for some constant $d$. A classical theorem independently proved by Petrovskiĭ and Oleĭnik, Thom, and Milnor implies that such a semi-algebraic set has at most $(dt)^{O(n)}$ components.

Suppose we want to decide if the input vector lies in a subset $W\subseteq\mathbb{R}^n$. We can make this decision using a $d$th order decision tree with depth $t$ only if $W$ has at most $3^t (dt)^{O(n)}$ components. Equivalently, we have the lower bound $t = \Omega(\log \#W - n\log d)$.

For example, suppose we want to determine whether all $n$ coordinates of the input vector are distinct. The set $W$ of "yes" instances has exactly $n!$ components, one for each permutation of size $n$, so we immediately have a lower bound of $\Omega(n\log n)$.

Note that this lower bound strengthens the classical $\Omega(n\log n)$ comparison lower bound for sorting in two ways. First, the model of computation allows much more complex queries than comparisons at unit cost. Second, and more importantly, the lower bound applies to a decision problem—there are only two distinct outputs, so the naïve information-theoretic bound is trivial.

Extensions of this argument use more interesting complexity measures than the number of components, such as higher Betti numbers, the Euler characteristic, volume, or number of lower-dimensional faces. In all cases, generalizations of the Petrovskiĭ-Oleĭnik-Thom-Milnor theorem imply that each set $R(\ell)$ has "complexity" at most $(dt)^{O(n)}$.

This lower-bound technique has two significant downsides. First, consider any problem that can be solved with a family of algebraic decision trees with polynomial depth. The Petrovskiĭ-Oleĭnik-Thom-Milnor theorem and its generalizations imply that the semi-algebraic sets that define such a problem have complexity at most $n^{O(n)}$. Thus, this technique cannot be used to prove lower bounds bigger than $n\log n$ in this model, for any problem that can be solved in polynomial time.

It is possible to prove $\Omega(n^2)$ lower bounds for certain NP-hard problems, but for similar reasons, there is no hope of anything better. In fact, Meyer auf der Heide proved that some NP-hard problems can actually be solved using linear decision trees with only polynomial depth; specifically, Knapsack can be solved in $O(n^4\log n)$ "time". Meyer auf der Heide's algorithm was later adapted by Meiser to any problem whose solution space is the union of cells in an arrangement of $2^{O(n)}$ hyperplanes in $\mathbb{R}^n$. Thus, it is impossible to prove lower bounds bigger than $n^4\log n$ in this model, for any such problem. An example of such a problem is $k$-SUM (Do any $k$ elements of this set sum to zero?) for any constant $k$; the fastest uniform algorithm for $k$-SUM runs in roughly $O(n^{k/2})$ time. Actually executing Meyer auf der Heide's algorithm would require repeatedly solving several NP-hard problems to find the appropriate $O(n^4\log n)$ query polynomials; this construction time is free in the lower-bound model.

Hooray for double-negative results!


Manindra Agrawal has a nice paper, "Proving Lower Bounds via Pseudorandom Generators". This might be considered a "dark horse" in the running to prove lower bounds, but the paper is interesting.

  • Can you give some more details to make your answer self-contained? – JɛffE, Jul 23, 2013 at 5:14
  • @JɛffE: I wouldn't dream of trying to write a capsule summary of a paper written by a Gödel Prize winner, but I will try and go you one better. I will email Mr. Agrawal and see if he would like to comment here; he may have new insights as to why he thinks PRGs can/cannot be used to prove lower bounds. – Jul 23, 2013 at 21:02
  • Pseudorandom generators based on linear feedback shift registers have well-studied algebraic properties. It might be possible to use Geometric Complexity Theory to show that some generator is next-bit unpredictable, and according to Mr. Agrawal, such a strong pseudorandom generator will give you a lower bound. – Jul 27, 2013 at 15:13

This is a 32-page survey that just appeared on the subject, focusing on the circuit lower bounds angle (there is strong overlap in content with the other answers here).

Different techniques have been used to prove several transference theorems of the form "nontrivial algorithms for a circuit class C yield circuit lower bounds against C". In this survey we revisit many of these results. We discuss how circuit lower bounds can be obtained from derandomization, compression, learning, and satisfiability algorithms. We also cover the connection between circuit lower bounds and useful properties, a notion that turns out to be fundamental in the context of these transference theorems. Along the way, we obtain a few new results, simplify several proofs, and show connections involving different frameworks. We hope that our presentation will serve as a self-contained introduction for those interested in pursuing research in this area.
