1"""Sparse Equations and Least Squares. 

2 

3The original Fortran code was written by C. C. Paige and M. A. Saunders as 

4described in 

5 

6C. C. Paige and M. A. Saunders, LSQR: An algorithm for sparse linear 

7equations and sparse least squares, TOMS 8(1), 43--71 (1982). 

8 

9C. C. Paige and M. A. Saunders, Algorithm 583; LSQR: Sparse linear 

10equations and least-squares problems, TOMS 8(2), 195--209 (1982). 

11 

12It is licensed under the following BSD license: 

13 

14Copyright (c) 2006, Systems Optimization Laboratory 

15All rights reserved. 

16 

17Redistribution and use in source and binary forms, with or without 

18modification, are permitted provided that the following conditions are 

19met: 

20 

21 * Redistributions of source code must retain the above copyright 

22 notice, this list of conditions and the following disclaimer. 

23 

24 * Redistributions in binary form must reproduce the above 

25 copyright notice, this list of conditions and the following 

26 disclaimer in the documentation and/or other materials provided 

27 with the distribution. 

28 

29 * Neither the name of Stanford University nor the names of its 

30 contributors may be used to endorse or promote products derived 

31 from this software without specific prior written permission. 

32 

33THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS 

34"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT 

35LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR 

36A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT 

37OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, 

38SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT 

39LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, 

40DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY 

41THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT 

42(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE 

43OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 

44 

45The Fortran code was translated to Python for use in CVXOPT by Jeffery 

46Kline with contributions by Mridul Aanjaneya and Bob Myhill. 

47 

48Adapted for SciPy by Stefan van der Walt. 

49 

50""" 

51 

52__all__ = ['lsqr'] 

53 

54import numpy as np 

55from math import sqrt 

56from scipy.sparse.linalg.interface import aslinearoperator 

57 

58eps = np.finfo(np.float64).eps 

59 

60 

def _sym_ortho(a, b):
    """
    Stable implementation of Givens rotation.

    Notes
    -----
    The routine 'SymOrtho' was added for numerical stability. This is
    recommended by S.-C. Choi in [1]_. It removes the unpleasant potential of
    ``1/eps`` in some important places (see, for example, the text following
    "Compute the next plane rotation Qk" in minres.py).

    References
    ----------
    .. [1] S.-C. Choi, "Iterative Methods for Singular Linear Equations
           and Least-Squares Problems", Dissertation,
           http://www.stanford.edu/group/SOL/dissertations/sou-cheng-choi-thesis.pdf

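    Examples
    --------
    A minimal, illustrative check of the rotation property, namely that
    ``c*a + s*b`` equals ``r`` and ``-s*a + c*b`` vanishes up to rounding:

    >>> import numpy as np
    >>> c, s, r = _sym_ortho(3.0, 4.0)
    >>> np.allclose([c * 3.0 + s * 4.0, -s * 3.0 + c * 4.0], [r, 0.0])
    True
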

78 """ 

79 if b == 0: 

80 return np.sign(a), 0, abs(a) 

81 elif a == 0: 

82 return 0, np.sign(b), abs(b) 

83 elif abs(b) > abs(a): 

84 tau = a / b 

85 s = np.sign(b) / sqrt(1 + tau * tau) 

86 c = s * tau 

87 r = b / s 

88 else: 

89 tau = b / a 

90 c = np.sign(a) / sqrt(1+tau*tau) 

91 s = c * tau 

92 r = a / c 

93 return c, s, r 


def lsqr(A, b, damp=0.0, atol=1e-8, btol=1e-8, conlim=1e8,
         iter_lim=None, show=False, calc_var=False, x0=None):
    """Find the least-squares solution to a large, sparse, linear system
    of equations.

    The function solves ``Ax = b`` or ``min ||Ax - b||^2`` or
    ``min ||Ax - b||^2 + d^2 ||x||^2``.

    The matrix A may be square or rectangular (over-determined or
    under-determined), and may have any rank.

    ::

      1. Unsymmetric equations --    solve  A*x = b

      2. Linear least squares  --    solve  A*x = b
                                     in the least-squares sense

      3. Damped least squares  --    solve  (   A    )*x = ( b )
                                            ( damp*I )     ( 0 )
                                     in the least-squares sense

    Parameters
    ----------
    A : {sparse matrix, ndarray, LinearOperator}
        Representation of an m-by-n matrix.
        Alternatively, ``A`` can be a linear operator which can
        produce ``Ax`` and ``A^T x`` using, e.g.,
        ``scipy.sparse.linalg.LinearOperator``.
    b : array_like, shape (m,)
        Right-hand side vector ``b``.
    damp : float
        Damping coefficient.
    atol, btol : float, optional
        Stopping tolerances. If both are 1.0e-9 (say), the final
        residual norm should be accurate to about 9 digits. (The
        final x will usually have fewer correct digits, depending on
        cond(A) and the size of damp.)
    conlim : float, optional
        Another stopping tolerance. lsqr terminates if an estimate of
        ``cond(A)`` exceeds `conlim`. For compatible systems ``Ax =
        b``, `conlim` could be as large as 1.0e+12 (say). For
        least-squares problems, conlim should be less than 1.0e+8.
        Maximum precision can be obtained by setting ``atol = btol =
        conlim = zero``, but the number of iterations may then be
        excessive.
    iter_lim : int, optional
        Explicit limitation on number of iterations (for safety).
    show : bool, optional
        Display an iteration log.
    calc_var : bool, optional
        Whether to estimate diagonals of ``(A'A + damp^2*I)^{-1}``.
    x0 : array_like, shape (n,), optional
        Initial guess of ``x``; if None, zeros are used.

        .. versionadded:: 1.0.0

    Returns
    -------
    x : ndarray of float
        The final solution.
    istop : int
        Gives the reason for termination.
        1 means x is an approximate solution to Ax = b.
        2 means x approximately solves the least-squares problem.
    itn : int
        Iteration number upon termination.
    r1norm : float
        ``norm(r)``, where ``r = b - Ax``.
    r2norm : float
        ``sqrt( norm(r)^2 + damp^2 * norm(x)^2 )``. Equal to `r1norm` if
        ``damp == 0``.
    anorm : float
        Estimate of Frobenius norm of ``Abar = [[A]; [damp*I]]``.
    acond : float
        Estimate of ``cond(Abar)``.
    arnorm : float
        Estimate of ``norm(A'*r - damp^2*x)``.
    xnorm : float
        ``norm(x)``
    var : ndarray of float
        If ``calc_var`` is True, estimates all diagonals of
        ``(A'A)^{-1}`` (if ``damp == 0``) or more generally ``(A'A +
        damp^2*I)^{-1}``. This is well defined if A has full column
        rank or ``damp > 0``. (Not sure what var means if ``rank(A)
        < n`` and ``damp = 0.``)

    Notes
    -----
    LSQR uses an iterative method to approximate the solution. The
    number of iterations required to reach a certain accuracy depends
    strongly on the scaling of the problem. Poor scaling of the rows
    or columns of A should therefore be avoided where possible.

    For example, in problem 1 the solution is unaltered by
    row-scaling. If a row of A is very small or large compared to
    the other rows of A, the corresponding row of ( A b ) should be
    scaled up or down.

    In problems 1 and 2, the solution x is easily recovered
    following column-scaling. Unless better information is known,
    the nonzero columns of A should be scaled so that they all have
    the same Euclidean norm (e.g., 1.0).
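
    A rough sketch of such column scaling, for illustration only (the names
    ``col_norms``, ``As`` and ``y`` below are ad hoc, not part of LSQR)::

        import numpy as np
        from scipy.sparse import csc_matrix, diags
        from scipy.sparse.linalg import lsqr

        A = csc_matrix([[1., 0.], [1., 1.], [0., 1.]])
        b = np.array([1., 0., -1.])

        # Euclidean norm of each column (assumed nonzero here).
        col_norms = np.sqrt(np.asarray(A.power(2).sum(axis=0)).ravel())
        As = A @ diags(1.0 / col_norms)   # scaled matrix with unit-norm columns
        y = lsqr(As, b)[0]                # solve the scaled problem
        x = y / col_norms                 # recover the original unknowns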


    In problem 3, there is no freedom to re-scale if damp is
    nonzero. However, the value of damp should be assigned only
    after attention has been paid to the scaling of A.

    The parameter damp is intended to help regularize
    ill-conditioned systems, by preventing the true solution from
    being very large. Another aid to regularization is provided by
    the parameter acond, which may be used to terminate iterations
    before the computed solution becomes very large.

    If some initial estimate ``x0`` is known and if ``damp == 0``,
    one could proceed as follows (see the sketch after this list):

      1. Compute a residual vector ``r0 = b - A*x0``.
      2. Use LSQR to solve the system ``A*dx = r0``.
      3. Add the correction dx to obtain a final solution ``x = x0 + dx``.
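
    A minimal sketch of these three steps, for illustration only (the data
    below are arbitrary and ``x0``, ``r0``, ``dx`` are ad hoc names)::

        import numpy as np
        from scipy.sparse import csc_matrix
        from scipy.sparse.linalg import lsqr

        A = csc_matrix([[1., 0.], [1., 1.], [0., 1.]])
        b = np.array([1., 0., -1.])
        x0 = np.array([0.9, -0.9])   # some known initial estimate

        r0 = b - A @ x0              # 1. residual of the initial estimate
        dx = lsqr(A, r0)[0]          # 2. solve A*dx = r0
        x = x0 + dx                  # 3. corrected solution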

    This requires that ``x0`` be available before and after the call
    to LSQR. To judge the benefits, suppose LSQR takes k1 iterations
    to solve A*x = b and k2 iterations to solve A*dx = r0.
    If x0 is "good", norm(r0) will be smaller than norm(b).
    If the same stopping tolerances atol and btol are used for each
    system, k1 and k2 will be similar, but the final solution x0 + dx
    should be more accurate. The only way to reduce the total work
    is to use a larger stopping tolerance for the second system.
    If some value btol is suitable for A*x = b, the larger value
    btol*norm(b)/norm(r0) should be suitable for A*dx = r0.
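
    For illustration, continuing the sketch above (with ``btol`` the
    tolerance one would have used for ``A*x = b``, and ``btol2`` an ad hoc
    name for the relaxed tolerance)::

        btol2 = btol * np.linalg.norm(b) / np.linalg.norm(r0)
        dx = lsqr(A, r0, btol=btol2)[0]
        x = x0 + dx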

    Preconditioning is another way to reduce the number of iterations.
    If it is possible to solve a related system ``M*x = b``
    efficiently, where M approximates A in some helpful way (e.g. M -
    A has low rank or its elements are small relative to those of A),
    LSQR may converge more rapidly on the system ``A*M(inverse)*z =
    b``, after which x can be recovered by solving M*x = z.
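
    A rough sketch of this right-preconditioning mechanism, for illustration
    only (``M`` below is just a small, square, easily factorized stand-in,
    not a recommended preconditioner)::

        import numpy as np
        from scipy.sparse import csc_matrix
        from scipy.sparse.linalg import LinearOperator, lsqr, splu

        A = csc_matrix([[1., 0.], [1., 1.], [0., 1.]])
        b = np.array([1., 0., -1.])
        M = csc_matrix([[1., 0.], [1., 1.]])   # square, invertible, cheap to factor
        Mlu = splu(M)                          # factor M once

        # Operator for A*M^{-1}; its adjoint applies M^{-T}*A^T.
        AMinv = LinearOperator(A.shape, dtype=float,
                               matvec=lambda z: A @ Mlu.solve(z),
                               rmatvec=lambda u: Mlu.solve(A.T @ u, trans='T'))
        z = lsqr(AMinv, b)[0]
        x = Mlu.solve(z)                       # recover x from M*x = z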

    If A is symmetric, LSQR should not be used!

    Alternatives are the symmetric conjugate-gradient method (cg)
    and/or SYMMLQ. SYMMLQ is an implementation of symmetric cg that
    applies to any symmetric A and will converge more rapidly than
    LSQR. If A is positive definite, there are other implementations
    of symmetric cg that require slightly less work per iteration than
    SYMMLQ (but will take the same number of iterations).

    References
    ----------
    .. [1] C. C. Paige and M. A. Saunders (1982a).
           "LSQR: An algorithm for sparse linear equations and
           sparse least squares", ACM TOMS 8(1), 43-71.
    .. [2] C. C. Paige and M. A. Saunders (1982b).
           "Algorithm 583. LSQR: Sparse linear equations and least
           squares problems", ACM TOMS 8(2), 195-209.
    .. [3] M. A. Saunders (1995). "Solution of sparse rectangular
           systems using LSQR and CRAIG", BIT 35, 588-604.

    Examples
    --------
    >>> import numpy as np
    >>> from scipy.sparse import csc_matrix
    >>> from scipy.sparse.linalg import lsqr
    >>> A = csc_matrix([[1., 0.], [1., 1.], [0., 1.]], dtype=float)

    The first example has the trivial solution `[0, 0]`.

    >>> b = np.array([0., 0., 0.], dtype=float)
    >>> x, istop, itn, normr = lsqr(A, b)[:4]
    The exact solution is x = 0
    >>> istop
    0
    >>> x
    array([ 0.,  0.])

    The returned stopping code `istop=0` indicates that a vector of zeros was
    found as a solution. The returned solution `x` indeed contains `[0., 0.]`.
    The next example has a non-trivial solution:

    >>> b = np.array([1., 0., -1.], dtype=float)
    >>> x, istop, itn, r1norm = lsqr(A, b)[:4]
    >>> istop
    1
    >>> x
    array([ 1., -1.])
    >>> itn
    1
    >>> r1norm
    4.440892098500627e-16

    As indicated by `istop=1`, `lsqr` found a solution obeying the tolerance
    limits. The given solution `[1., -1.]` obviously solves the equation. The
    remaining return values include information about the number of iterations
    (`itn=1`) and the remaining difference between the left- and right-hand
    sides of the solved equation.

    The final example demonstrates the behavior in the case where there is no
    solution for the equation:

    >>> b = np.array([1., 0.01, -1.], dtype=float)
    >>> x, istop, itn, r1norm = lsqr(A, b)[:4]
    >>> istop
    2
    >>> x
    array([ 1.00333333, -0.99666667])
    >>> A.dot(x)-b
    array([ 0.00333333, -0.00333333,  0.00333333])
    >>> r1norm
    0.005773502691896255

    `istop` indicates that the system is inconsistent and thus `x` is an
    approximate solution to the corresponding least-squares problem rather
    than an exact solution of the equation. `r1norm` contains the norm of
    the minimal residual that was found.
    """
    A = aslinearoperator(A)
    b = np.atleast_1d(b)
    if b.ndim > 1:
        b = b.squeeze()

    m, n = A.shape
    if iter_lim is None:
        iter_lim = 2 * n
    var = np.zeros(n)

    msg = ('The exact solution is x = 0 ',
           'Ax - b is small enough, given atol, btol ',
           'The least-squares solution is good enough, given atol ',
           'The estimate of cond(Abar) has exceeded conlim ',
           'Ax - b is small enough for this machine ',
           'The least-squares solution is good enough for this machine',
           'Cond(Abar) seems to be too large for this machine ',
           'The iteration limit has been reached ')

    if show:
        print(' ')
        print('LSQR Least-squares solution of Ax = b')
        str1 = f'The matrix A has {m} rows and {n} columns'
        str2 = 'damp = %20.14e calc_var = %8g' % (damp, calc_var)
        str3 = 'atol = %8.2e conlim = %8.2e' % (atol, conlim)
        str4 = 'btol = %8.2e iter_lim = %8g' % (btol, iter_lim)
        print(str1)
        print(str2)
        print(str3)
        print(str4)

    itn = 0
    istop = 0
    ctol = 0
    if conlim > 0:
        ctol = 1/conlim
    anorm = 0
    acond = 0
    dampsq = damp**2
    ddnorm = 0
    res2 = 0
    xnorm = 0
    xxnorm = 0
    z = 0
    cs2 = -1
    sn2 = 0

356 """ 

357 Set up the first vectors u and v for the bidiagonalization. 

358 These satisfy beta*u = b - A*x, alfa*v = A'*u. 

359 """ 

360 u = b 

361 bnorm = np.linalg.norm(b) 

362 if x0 is None: 

363 x = np.zeros(n) 

364 beta = bnorm.copy() 

365 else: 

366 x = np.asarray(x0) 

367 u = u - A.matvec(x) 

368 beta = np.linalg.norm(u) 

369 

370 if beta > 0: 

371 u = (1/beta) * u 

372 v = A.rmatvec(u) 

373 alfa = np.linalg.norm(v) 

374 else: 

375 v = x.copy() 

376 alfa = 0 

377 

378 if alfa > 0: 

379 v = (1/alfa) * v 

380 w = v.copy() 

381 

382 rhobar = alfa 

383 phibar = beta 

384 rnorm = beta 

385 r1norm = rnorm 

386 r2norm = rnorm 

387 

388 # Reverse the order here from the original matlab code because 

389 # there was an error on return when arnorm==0 

390 arnorm = alfa * beta 

391 if arnorm == 0: 

392 print(msg[0]) 

393 return x, istop, itn, r1norm, r2norm, anorm, acond, arnorm, xnorm, var 

394 

    head1 = ' Itn x[0] r1norm r2norm '
    head2 = ' Compatible LS Norm A Cond A'

    if show:
        print(' ')
        print(head1, head2)
        test1 = 1
        test2 = alfa / beta
        str1 = '%6g %12.5e' % (itn, x[0])
        str2 = ' %10.3e %10.3e' % (r1norm, r2norm)
        str3 = ' %8.1e %8.1e' % (test1, test2)
        print(str1, str2, str3)

    # Main iteration loop.
    while itn < iter_lim:
        itn = itn + 1
        """
        % Perform the next step of the bidiagonalization to obtain the
        % next beta, u, alfa, v. These satisfy the relations
        %     beta*u = A*v - alfa*u,
        %     alfa*v = A'*u - beta*v.
        """
        u = A.matvec(v) - alfa * u
        beta = np.linalg.norm(u)

        if beta > 0:
            u = (1/beta) * u
            anorm = sqrt(anorm**2 + alfa**2 + beta**2 + damp**2)
            v = A.rmatvec(u) - beta * v
            alfa = np.linalg.norm(v)
            if alfa > 0:
                v = (1 / alfa) * v

        # Use a plane rotation to eliminate the damping parameter.
        # This alters the diagonal (rhobar) of the lower-bidiagonal matrix.
        rhobar1 = sqrt(rhobar**2 + damp**2)
        cs1 = rhobar / rhobar1
        sn1 = damp / rhobar1
        psi = sn1 * phibar
        phibar = cs1 * phibar

        # Use a plane rotation to eliminate the subdiagonal element (beta)
        # of the lower-bidiagonal matrix, giving an upper-bidiagonal matrix.
        cs, sn, rho = _sym_ortho(rhobar1, beta)

        theta = sn * alfa
        rhobar = -cs * alfa
        phi = cs * phibar
        phibar = sn * phibar
        tau = sn * phi

        # Update x and w.
        t1 = phi / rho
        t2 = -theta / rho
        dk = (1 / rho) * w

        x = x + t1 * w
        w = v + t2 * w
        ddnorm = ddnorm + np.linalg.norm(dk)**2

        if calc_var:
            var = var + dk**2

        # Use a plane rotation on the right to eliminate the
        # super-diagonal element (theta) of the upper-bidiagonal matrix.
        # Then use the result to estimate norm(x).
        delta = sn2 * rho
        gambar = -cs2 * rho
        rhs = phi - delta * z
        zbar = rhs / gambar
        xnorm = sqrt(xxnorm + zbar**2)
        gamma = sqrt(gambar**2 + theta**2)
        cs2 = gambar / gamma
        sn2 = theta / gamma
        z = rhs / gamma
        xxnorm = xxnorm + z**2

        # Test for convergence.
        # First, estimate the condition of the matrix Abar,
        # and the norms of rbar and Abar'rbar.
        acond = anorm * sqrt(ddnorm)
        res1 = phibar**2
        res2 = res2 + psi**2
        rnorm = sqrt(res1 + res2)
        arnorm = alfa * abs(tau)

        # Distinguish between
        #    r1norm = ||b - Ax|| and
        #    r2norm = rnorm in current code
        #           = sqrt(r1norm^2 + damp^2*||x||^2).
        # Estimate r1norm from
        #    r1norm = sqrt(r2norm^2 - damp^2*||x||^2).
        # Although there is cancellation, it might be accurate enough.
        r1sq = rnorm**2 - dampsq * xxnorm
        r1norm = sqrt(abs(r1sq))
        if r1sq < 0:
            r1norm = -r1norm
        r2norm = rnorm

        # Now use these norms to estimate certain other quantities,
        # some of which will be small near a solution.
        test1 = rnorm / bnorm
        test2 = arnorm / (anorm * rnorm + eps)
        test3 = 1 / (acond + eps)
        t1 = test1 / (1 + anorm * xnorm / bnorm)
        rtol = btol + atol * anorm * xnorm / bnorm

        # The following tests guard against extremely small values of
        # atol, btol or ctol. (The user may have set any or all of
        # the parameters atol, btol, conlim to 0.)
        # The effect is equivalent to the normal tests using
        # atol = eps, btol = eps, conlim = 1/eps.
        if itn >= iter_lim:
            istop = 7
        if 1 + test3 <= 1:
            istop = 6
        if 1 + test2 <= 1:
            istop = 5
        if 1 + t1 <= 1:
            istop = 4

        # Allow for tolerances set by the user.
        if test3 <= ctol:
            istop = 3
        if test2 <= atol:
            istop = 2
        if test1 <= rtol:
            istop = 1

        # See if it is time to print something.
        prnt = False
        if n <= 40:
            prnt = True
        if itn <= 10:
            prnt = True
        if itn >= iter_lim-10:
            prnt = True
        # if itn%10 == 0: prnt = True
        if test3 <= 2*ctol:
            prnt = True
        if test2 <= 10*atol:
            prnt = True
        if test1 <= 10*rtol:
            prnt = True
        if istop != 0:
            prnt = True

        if prnt:
            if show:
                str1 = '%6g %12.5e' % (itn, x[0])
                str2 = ' %10.3e %10.3e' % (r1norm, r2norm)
                str3 = ' %8.1e %8.1e' % (test1, test2)
                str4 = ' %8.1e %8.1e' % (anorm, acond)
                print(str1, str2, str3, str4)

        if istop != 0:
            break

    # End of iteration loop.
    # Print the stopping condition.
    if show:
        print(' ')
        print('LSQR finished')
        print(msg[istop])
        print(' ')
        str1 = 'istop =%8g r1norm =%8.1e' % (istop, r1norm)
        str2 = 'anorm =%8.1e arnorm =%8.1e' % (anorm, arnorm)
        str3 = 'itn =%8g r2norm =%8.1e' % (itn, r2norm)
        str4 = 'acond =%8.1e xnorm =%8.1e' % (acond, xnorm)
        print(str1 + ' ' + str2)
        print(str3 + ' ' + str4)
        print(' ')

    return x, istop, itn, r1norm, r2norm, anorm, acond, arnorm, xnorm, var