11
$\begingroup$

I am using MATLAB to solve a problem that involves solving $\mathbf{A} \mathbf{x}=\mathbf{b}$ at every timestep, where $\mathbf{b}$ changes with time. Right now, I am accomplishing this using MATLAB's mldivide:

x = A\b

I have the flexibility to make as many precomputations as needed, so I am wondering if there is a faster and/or more accurate method than mldivide. What is typically done here? Thanks all!

$\endgroup$
3
  • 1
    $\begingroup$ Do you have specific knowledge about the structure of $A$? For instance, is it symmetric? Positive definite? Tridiagonal? Orthogonal? $\endgroup$
    – Dominique
    Commented Feb 16, 2013 at 3:44
  • $\begingroup$ The matrix $A$ is a dense square matrix. $\endgroup$
    – Doubt
    Commented Feb 16, 2013 at 22:23
  • 3
    $\begingroup$ If you have no other knowledge on $A$, the $LU$ factorization as described in the answer below is your best bet. $\endgroup$
    – Dominique
    Commented Feb 16, 2013 at 22:26

5 Answers

14
$\begingroup$

The most obvious thing you can do is to precompute

[L,U] = lu(A);    % ~ O(n^3), done once

Then you just compute

x = U \ (L \ b);  % ~ O(2n^2), per right-hand side

This reduces the per-timestep cost enormously: the $O(n^3)$ factorization is paid once, and each subsequent solve costs only $O(n^2)$. Accuracy is the same as with backslash.
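To make the factor-once/solve-many idea concrete, here is a minimal pure-Python sketch (illustration only; the helper names lu_factor/lu_solve are mine). It mirrors what lu and backslash do in MATLAB: LU with partial pivoting once, then two $O(n^2)$ triangular sweeps per right-hand side.

```python
def lu_factor(A):
    """Return (LU, p): combined L\\U factors and the row-permutation vector."""
    n = len(A)
    LU = [row[:] for row in A]          # work on a copy of A
    p = list(range(n))                  # permutation vector, like lu(A,'vector')
    for k in range(n):
        # Partial pivoting: bring the row with the largest |pivot| to row k.
        piv = max(range(k, n), key=lambda i: abs(LU[i][k]))
        LU[k], LU[piv] = LU[piv], LU[k]
        p[k], p[piv] = p[piv], p[k]
        for i in range(k + 1, n):
            LU[i][k] /= LU[k][k]        # multiplier, stored in place of L
            for j in range(k + 1, n):
                LU[i][j] -= LU[i][k] * LU[k][j]
    return LU, p

def lu_solve(LU, p, b):
    """Solve A x = b from precomputed factors: two O(n^2) sweeps."""
    n = len(LU)
    y = [b[p[i]] for i in range(n)]     # apply permutation, as in b(p)
    for i in range(n):                  # forward substitution with unit-diagonal L
        for j in range(i):
            y[i] -= LU[i][j] * y[j]
    x = y[:]
    for i in reversed(range(n)):        # back substitution with U
        for j in range(i + 1, n):
            x[i] -= LU[i][j] * x[j]
        x[i] /= LU[i][i]
    return x

A = [[4.0, 3.0, 0.0],
     [3.0, 4.0, -1.0],
     [0.0, -1.0, 4.0]]
LU, p = lu_factor(A)                        # O(n^3), done once
x = lu_solve(LU, p, [24.0, 30.0, -24.0])    # O(n^2), per b; x -> [3, 4, -5]
```

Every call to lu_solve after the one-time factorization costs only the two substitution sweeps, which is exactly the saving described above.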

$\endgroup$
6
  • 1
    $\begingroup$ Note, from the documentation, that L isn't necessarily lower triangular. This approach will likely be faster than a direct solve; however, I would make sure that the L \ b command is smart enough to solve with L in the correct order (it probably is, but the documentation doesn't say for certain). $\endgroup$ Commented Feb 16, 2013 at 2:49
    $\begingroup$ Yeah, you're right: L is the product of a lower-triangular matrix and a permutation matrix. But I'll be damned if it doesn't recognize that all it has to do is forward substitution with L\b. I have seen this exact line used in high-performance code by people I consider experts. $\endgroup$
    – Milind R
    Commented Feb 16, 2013 at 3:14
  • 8
    $\begingroup$ mldivide recognizes permuted triangular matrices and does the right $O(n^{2})$ thing in solving such a system. However, in my experiments, this makes the solution process slower by a factor of 10 or so for matrices of size 2000 by 2000 to 10000 by 10000. Thus you'd be better off keeping track of the permutation explicitly by using [L,U,P]=lu(A). $\endgroup$ Commented Feb 16, 2013 at 3:26
  • 1
    $\begingroup$ Also, if your matrix is sparse, you should take advantage of the sparsity in solving the system. The simplest way to do this is to make sure that $A$ is stored in sparse format by using A=sparse(A) before computing the LU factorization. You can also try to permute the rows of A so as to reduce fill-in during the LU factorization. $\endgroup$ Commented Feb 16, 2013 at 3:28
  • 3
    $\begingroup$ @BrianBorcher As far as I know, the best way to keep track of permutation is [L,U,p] = lu(A,'vector'); x = U\(L\b(p)); See example 3 in the lu docs. $\endgroup$
    – Stefano M
    Commented Feb 18, 2013 at 14:05
5
$\begingroup$

We did some extensive computer labs on this topic in our scientific computing courses. For the "small" computations we did there, Matlab's backslash operator was always faster than anything else, even after we had optimized our code as much as possible and re-ordered all matrices beforehand (for instance with Reverse Cuthill–McKee ordering for sparse matrices).

You can check out one of our lab instructions; the answer to your question is covered (briefly) on page 4.

A good book on the topic is, for instance, the one by Cheney.

$\endgroup$
4
$\begingroup$

Suppose $A$ is an $n\times n$ dense matrix and you have to solve $Ax_i = b_i$, $i=1\dots m$. If $m$ is big enough, then there is nothing wrong with

V = inv(A);
...
x = V*b;

Flops are $O(n^3)$ for inv(A) and $O(n^2)$ for V*b, so some experimentation is needed to determine the break-even value for $m$...

>> n = 5000;
>> A = randn(n,n);
>> x = randn(n,1);
>> b = A*x;
>> rcond(A)
ans =
   1.3837e-06
>> tic, xm = A\b; toc
Elapsed time is 1.907102 seconds.
>> tic, [L,U] = lu(A); toc
Elapsed time is 1.818247 seconds.
>> tic, xl = U\(L\b); toc
Elapsed time is 0.399051 seconds.
>> tic, [L,U,p] = lu(A,'vector'); toc
Elapsed time is 1.581756 seconds.
>> tic, xp = U\(L\b(p)); toc
Elapsed time is 0.060203 seconds.
>> tic, V=inv(A); toc
Elapsed time is 7.614582 seconds.
>> tic, xv = V*b; toc     
Elapsed time is 0.011499 seconds.
>> [norm(xm-x), norm(xp-x), norm(xl-x), norm(xv-x)] ./ norm(x)
ans =
   1.0e-11 *
    0.1912    0.1912    0.1912    0.6183

In this trivial example $A^{-1}$ pre-computation is better than $LU$ forward and backward solution for $m>125$.
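That break-even figure can be reproduced directly from the timings above (a back-of-the-envelope check in Python; the variable names are mine):

```python
# Timings from the MATLAB session above (seconds).
t_lu_factor = 1.581756   # [L,U,p] = lu(A,'vector'), one-off
t_lu_solve  = 0.060203   # xp = U\(L\b(p)), per right-hand side
t_inv       = 7.614582   # V = inv(A), one-off
t_inv_apply = 0.011499   # xv = V*b, per right-hand side

# Total cost of m solves: one-off cost plus m per-solve costs.
# inv wins when t_inv + m*t_inv_apply < t_lu_factor + m*t_lu_solve.
m_break_even = (t_inv - t_lu_factor) / (t_lu_solve - t_inv_apply)
print(round(m_break_even))  # ~ 124, in line with the m > 125 quoted above
```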

Some notes

For stability and error analysis please see the comments to this different answer, especially the one by VictorLiu.

The proposed timings are not "scientific" at all, but are meant to show that the approach proposed in the answer by Milind R, while it makes perfect sense if implemented in C or Fortran by calling relevant LAPACK and BLAS subroutines, may prove not so effective in Matlab, even for $m\ll n$.

Timings were performed with Matlab R2011b on a 12-core computer with a fairly constant UNIX load average of 5; best tic, toc time of three runs.

$\endgroup$
9
  • $\begingroup$ Indeed, there is far more parallelism available in a matrix-vector multiply than a triangular solver, so this should be even more apparent if the computations are done in parallel (multicore/GPU/etc...) in any way. $\endgroup$ Commented Feb 18, 2013 at 9:24
  • $\begingroup$ @AronAhmadia I agree: break-even point estimates based only on operation count makes sense only for a serial implementation. $\endgroup$
    – Stefano M
    Commented Feb 18, 2013 at 9:47
  • 1
    $\begingroup$ Note that things will be vastly different if the A matrix is sparse- the inverse will typically be quite dense, while the LU factors are typically reasonably sparse, tipping things back in the direction of LU being faster. $\endgroup$ Commented Feb 18, 2013 at 15:34
  • 1
    $\begingroup$ @Doubt accuracy was already addressed in the comments to this answer. In general there should be only a negligible loss of accuracy, unless you have a pathological $A$ matrix. $\endgroup$
    – Stefano M
    Commented Feb 18, 2013 at 16:38
  • 1
    $\begingroup$ @MilindR of course precomputing inv(A) makes sense only if you have to repeatedly solve $Ax=b$ for different $b$'s that are not all known at the same time. Usually, if you have multiple right-hand sides, just collect them in a matrix $B$ and go with A\B. $\endgroup$
    – Stefano M
    Commented Feb 19, 2013 at 7:50
3
$\begingroup$

Take a look at this question: the answers show that mldivide is quite clever, and they also give suggestions as to how to see what Matlab uses to solve A\b. This may give you a hint regarding optimization options.

$\endgroup$
0
$\begingroup$

Using backslash is more or less equivalent to inv(A)*B; if you are writing the computation out yourself, the latter may feel more intuitive. They give about the same result (they just differ in how the computation is carried out), though you should check the Matlab documentation for the details.

To answer your question: backslash is generally fine, but how well it performs does depend on the properties of your system matrix.

$\endgroup$
5
  • 1
    $\begingroup$ Mathematically, inv(A)*b is the same as \; numerically, however, actually forming the inverse is both less efficient and less accurate. If you are working to learn the linear algebra, this may be acceptable, but I would argue you need a very good reason to form the inverse. $\endgroup$ Commented Feb 16, 2013 at 14:49
  • $\begingroup$ But why would you ever compute inv(A) since that alone is more expensive than A\b? $\endgroup$
    – Dominique
    Commented Feb 16, 2013 at 18:25
  • 7
    $\begingroup$ @Godric: There's a recent paper that discusses the "myth" that inv(A)*b is less accurate: on ArXiv. Not saying that there's usually reason to compute the actual inverse, but just sayin'. $\endgroup$
    – Victor Liu
    Commented Feb 17, 2013 at 4:30
  • 3
    $\begingroup$ @Dominique: Triangular solves are much less parallelizable than matrix-vector multiplication, and sophisticated preconditioned iterative methods often make use of direct methods on subdomains. It is often useful to explicitly form the inverses of a few modest-sized dense triangular matrices in order to improve the parallelism. $\endgroup$ Commented Feb 17, 2013 at 5:27
    $\begingroup$ @VictorLiu: Thank you for the article. I stand corrected on my accuracy statement (at least for smart implementations of inv(A)). $\endgroup$ Commented Feb 17, 2013 at 18:40
