LSTSQ
Overview
The LSTSQ function computes the least-squares solution to a linear matrix equation Ax = b, finding the vector x that minimizes the Euclidean 2-norm \|b - Ax\|_2. This is particularly useful for solving overdetermined systems where there are more equations than unknowns and no exact solution exists.
The least squares method, first published by Adrien-Marie Legendre in 1805 and independently developed by Carl Friedrich Gauss, is a fundamental technique in regression analysis, curve fitting, and data modeling. Given an M \times N coefficient matrix A and an M \times K right-hand side matrix b, the function finds the solution that minimizes the sum of squared residuals.
The minimization problem is formally expressed as:
\min_x \|b - Ax\|_2^2
This implementation uses SciPy’s linalg.lstsq function, which provides access to three LAPACK driver routines: gelsd (divide-and-conquer SVD, the default), gelsy (complete orthogonal factorization), and gelss (standard SVD). The gelsd driver is generally recommended for accuracy, while gelsy can be faster for some problems.
In addition to the solution vector, the function can return:
- Residuals: The squared 2-norm for each column in b - Ax (when M > N and A has full rank)
- Rank: The effective rank of matrix A
- Singular values: The singular values of A, useful for assessing the condition number via \sigma_{\text{max}} / \sigma_{\text{min}}
The cond parameter allows control over numerical rank determination by treating singular values below cond × largest_singular_value as zero, which is important for handling ill-conditioned systems. For more details, see the SciPy linear algebra documentation and the SciPy GitHub repository.
This example function is provided as-is without any representation of accuracy.
Excel Usage
=LSTSQ(a, b, cond, lstsq_driver, lstsq_return)
a(list[list], required): Coefficient matrix (M x N) for the linear equation systemb(list[list], required): Right-hand side matrix (M x K) with same number of rows as acond(float, optional, default: null): Cutoff for small singular values; values below cond * largest_singular_value are treated as zerolstsq_driver(str, optional, default: “gelsd”): LAPACK driver to use for the computationlstsq_return(str, optional, default: “solution”): The type of result to return from the computation
Returns (list[list]): 2D list of least squares result, or error message string.
Examples
Example 1: Least-squares solution using GELSD driver
Inputs:
| a | b | cond | lstsq_driver | lstsq_return | ||
|---|---|---|---|---|---|---|
| 1 | 1 | 0 | 2 | 1e-10 | gelsd | solution |
| 1 | 0 | 1 | 2 | |||
| 1 | 1 | 1 | 3 | |||
| 1 | 0 | 0 | 1 |
Excel formula:
=LSTSQ({1,1,0;1,0,1;1,1,1;1,0,0}, {2;2;3;1}, 1e-10, "gelsd", "solution")
Expected output:
| Result |
|---|
| 1 |
| 1 |
| 1 |
Example 2: Sum of squared residuals for overdetermined system
Inputs:
| a | b | cond | lstsq_driver | lstsq_return | ||
|---|---|---|---|---|---|---|
| 1 | 1 | 0 | 2 | 1e-10 | gelsd | residuals |
| 1 | 0 | 1 | 2 | |||
| 1 | 1 | 1 | 3 | |||
| 1 | 0 | 0 | 1 |
Excel formula:
=LSTSQ({1,1,0;1,0,1;1,1,1;1,0,0}, {2;2;3;1}, 1e-10, "gelsd", "residuals")
Expected output:
| Result |
|---|
| 0 |
Example 3: Effective rank of coefficient matrix
Inputs:
| a | b | cond | lstsq_driver | lstsq_return | ||
|---|---|---|---|---|---|---|
| 1 | 1 | 0 | 2 | 1e-10 | gelsd | rank |
| 1 | 0 | 1 | 2 | |||
| 1 | 1 | 1 | 3 | |||
| 1 | 0 | 0 | 1 |
Excel formula:
=LSTSQ({1,1,0;1,0,1;1,1,1;1,0,0}, {2;2;3;1}, 1e-10, "gelsd", "rank")
Expected output:
| Result |
|---|
| 3 |
Example 4: Singular values of coefficient matrix
Inputs:
| a | b | cond | lstsq_driver | lstsq_return | ||
|---|---|---|---|---|---|---|
| 1 | 1 | 0 | 2 | 1e-10 | gelsd | singular_values |
| 1 | 0 | 1 | 2 | |||
| 1 | 1 | 1 | 3 | |||
| 1 | 0 | 0 | 1 |
Excel formula:
=LSTSQ({1,1,0;1,0,1;1,1,1;1,0,0}, {2;2;3;1}, 1e-10, "gelsd", "singular_values")
Expected output:
| Result | ||
|---|---|---|
| 2.524 | 1 | 0.792 |
Python Code
import numpy as np
from scipy.linalg import lstsq as scipy_linalg_lstsq
def lstsq(a, b, cond=None, lstsq_driver='gelsd', lstsq_return='solution'):
"""
Compute the least-squares solution to Ax = B using scipy.linalg.lstsq.
See: https://docs.scipy.org/doc/scipy/reference/generated/scipy.linalg.lstsq.html
This example function is provided as-is without any representation of accuracy.
Args:
a (list[list]): Coefficient matrix (M x N) for the linear equation system
b (list[list]): Right-hand side matrix (M x K) with same number of rows as a
cond (float, optional): Cutoff for small singular values; values below cond * largest_singular_value are treated as zero Default is None.
lstsq_driver (str, optional): LAPACK driver to use for the computation Valid options: GELSD, GELSY, GELSS. Default is 'gelsd'.
lstsq_return (str, optional): The type of result to return from the computation Valid options: Solution, Residuals, Rank, Singular Values. Default is 'solution'.
Returns:
list[list]: 2D list of least squares result, or error message string.
"""
# Helper to normalize scalar/single-element inputs to 2D lists
def to2d(x):
return [[x]] if not isinstance(x, list) else x
# Wrap scalar inputs coming from Excel into 2D lists
a = to2d(a)
b = to2d(b)
# Validate matrix structure for a and b
for name, matrix in (("a", a), ("b", b)):
if not isinstance(matrix, list) or len(matrix) == 0:
return [[f"Invalid input: {name} must be a 2D list with at least one row."]]
row_length = len(matrix[0]) if isinstance(matrix[0], list) else 0
if row_length == 0:
return [[f"Invalid input: {name} must be a 2D list with at least one column."]]
for row in matrix:
if not isinstance(row, list):
return [[f"Invalid input: {name} must be a 2D list with rows represented as lists."]]
if len(row) != row_length:
return [[f"Invalid input: each row in {name} must have the same length."]]
if len(a) != len(b):
return [["Invalid input: a and b must have the same number of rows."]]
# Convert matrices to numeric numpy arrays
try:
a_arr = np.array([[float(value) for value in row] for row in a], dtype=float)
b_arr = np.array([[float(value) for value in row] for row in b], dtype=float)
except Exception:
return [["Invalid input: a and b must contain numeric values."]]
# Validate optional parameters
if cond is not None:
try:
cond = float(cond)
except Exception:
return [["Invalid input: cond must be a float or None."]]
driver = str(lstsq_driver).strip().lower()
if driver not in {"gelsd", "gelsy", "gelss"}:
return [["Invalid input: lstsq_driver must be 'gelsd', 'gelsy', or 'gelss'."]]
result_type = str(lstsq_return).strip().lower()
if result_type not in {"solution", "residuals", "rank", "singular_values"}:
return [["Invalid input: lstsq_return must be one of 'solution', 'residuals', 'rank', 'singular_values'."]]
# Execute SciPy least-squares solver
try:
solution, residuals, effective_rank, singular_values = scipy_linalg_lstsq(
a_arr,
b_arr,
cond=cond,
lapack_driver=driver,
)
except Exception as exc:
return [[f"scipy.linalg.lstsq error: {exc}"]]
# Helper to ensure outputs are finite and 2D
def finite_matrix(values, as_column=False):
array = np.asarray(values, dtype=float)
if array.ndim == 0:
array = array.reshape(1, 1)
elif array.ndim == 1:
if as_column:
array = array.reshape(-1, 1)
else:
array = array.reshape(1, -1)
if array.size == 0:
return [[]]
if not np.isfinite(array).all():
return [["scipy.linalg.lstsq error: non-finite result encountered."]]
return array.tolist()
if result_type == "solution":
return finite_matrix(solution, as_column=True)
if result_type == "residuals":
return finite_matrix(residuals)
if result_type == "rank":
return finite_matrix(float(effective_rank))
if result_type == "singular_values":
if singular_values is None:
return [["Invalid input: singular values are unavailable for the selected LAPACK driver."]]
return finite_matrix(singular_values)
return [["Invalid input: lstsq_return must be one of 'solution', 'residuals', 'rank', 'singular_values'."]]