|
In this manual you will learn how to solve different kinds of equations involving matrices and vectors.
Given a matrix A and a vector y, the types of equations we like to solve are of the form y=A·x, where A and y are given. The task is to find the vector x. The first two subsections show how to deal with this equation with no constraints on the solution vector x. Section 4 will show how to deal with the situation when the solution vector x is constrained.
In the equation above, the matrix and the vectors have to conform. This means that the number of rows of A should equal the size of the y vector and the size of the solution vector x will always equal the number of colums of A.
Note: In Praat scripting we don't distinguish in notation between a vector and its transpose.
In this case an exact solution for x is possible because if A is "well behaved" we can calculate its inverse and the solution will be x = A−1·y. The function solve#
(a##
, y#
) is for this type of problem.
The following example shows an exactly solvable system because A is a square matrix and "well behaved":
a## = {{0, 2, 0, 1},
... {2, 2, 3, 2},
... {4,-3, 0, 1},
... {6, 1,-6,-5}}
y# = {0,-2,-7,6}
x# = solve# (a##, y#)
writeInfoLine: x#
My info window shows:
-0.49999999999999967 1.000000000000001 0.3333333333333344 -2.0000000000000027
As a check we can calculate the norm of the difference between y and A·x, which should be zero for a perfect solution.
appendInfoLine: norm (y# - mul# (a##, x#))
My info window shows 1.0777744118960794e-14, which is almost zero (it is not exactly zero due to rounding errors because real numbers cannot be represented exactly in a computer.
If the matrix has more rows than colums an exact solution is generally not possible and the best thing we can do is to find a solution vector x that minimizes the squared distance between the vectors y and A·x. The problem now can be posed as: find the vector x that minimizes ||y − A·x||2. This is a regression problem which can be solved using singular value decomposition.
The following example shows a 5×2 matrix A. The solution therefore is a vector with two elements.
a## = {{-0.84,-0.184},
... { 0.09, 1.26},
... { 0.62,-0.20},
... {-1.48,-1.03},
... { 1.29, 0.03}}
y# = {-0.19, -0.90, -1.53, -2.30, 0.58}
x# = solve# (a##, y#)
writeInfoLine: x#
My info window shows:
0.5881230621928452 0.21643159712237164
We can calculate the norm of the difference:
appendInfoLine: norm (y# - mul# (a##, x#))
My info window shows:
2.556992185298919
This means that no other vector x can be found that has ||y − A·x|| < 2.556992185298919!
If the number of colums is larger than the number of rows the system in general is underdetermined, i.e. we can not determine a complete solution vector.
If there are additional constraints on the vector x we can handle three different cases:
These can be expressed in the following form: minimize ||y − A·x||2 + α(x′x − δ) for some α > 0 and δ ≥ 0. Here the constraint on the squared norm x′x of x is enforced by adding a penalty function with weighing factor α.
The function to use is solveWeaklyConstrained#
(a##
, y#
, alpha
, delta
)
The function is called "weakly constrained" because the penalty function prohibits a relatively large departure of x′x from δ. If we let α go to infinity we have a constrained least squares regression problem of minimizing ||y − A·x||2 subject to x′x = δ. The algorithm we have implemented is described by Ten Berge (1991).
a## = {{ 4, 1, 0.5},
... { 4,-1,-0.5},
... {-4, 1,-0.5},
... {-4,-1, 0.5},
... { 2, 0, 0},
... {-2, 0, 0}}
y# = {-1,-1,-1,-1,-1,1}
alpha = 6.0
delta = 2 / 3
x# = solveWeaklyConstrained# (a##, y#, alpha, delta)
writeInfoLine: x#
My info window shows:
-0.0563380281690141 -3.341667830688472e-17 0.7616819283108669
No constraints are involved if we set α = 0
x# = solveWeaklyConstrained# (a##, y#, 0.0, delta)
writeInfoLine: x#
My info window shows:
-0.05555555555555557 -5.696494054485392e-17 3.0458443711512004e-16
The same solution would have appeared if we had used the following code:
x# = solve# (a##, y#)
writeInfoLine: x#
We choose a very large value for α and set δ to 1.0.
x# = solveWeaklyConstrained# (a##, y#, 1e100, 1.0)
writeInfoLine: x#
My info window shows:
-0.05633802816901411 -3.341667830688472e-17 0.9984117520251988
Here we constrain the elements of the solution to be nonnegative. The problem is stated as minimize ||y − A·x||2 where all xi ≥ 0.
This problem can be solved by an iterative alternating least squares method as described by Borg & Groenen (1997). The function to use is solveNonnegative#
(a##
, y#
[, x#
], maximumNumberOfIterations
, tolerance
, infoLevel
)
The parameters have the following meaning:
a##
, y#
x#
]
maximumNumberOfIterations
tolerance
infoLevel
As an example consider:
a## = {{-4, 2, 2},
{ 2, 4, 2},
{ 1, 1, 1},
{ 2,-1, 3}}
y# = {1,2,1,3}
result# = solveNonnegative# (a##, y#, 100, 1e-17, 0)
writeInfoLine: result#
My info window shows:
0.17687074830212887 0 0.8594104308385341
The same a##
and y#
with extra output and only three iterations:
result# = solveNonnegative# (a##, y#, 3, 1e-17, 2)
writeInfoLine: result#
Now my info window shows:
Iteration: 1, error: 2.7083144168962345
Iteration: 2, error: 0.22835187182198863
Iteration: 3, error: 0.019415103584264275
Number of iterations: 3; Minimum: 0.019415103584264275
0.18686771227425772 0.0063553925295192215 0.8542134965490019
The solution is not reached after only 3 iterations. We use the found solution to start a new iteration:
result# = solveNonnegative# (a##, y#, result#, 100, 1e-17, 1)
writeInfoLine: result#
My info window shows
Number of iterations: 6; Minimum: 0.011337868480725613
0.17687074830212887 0 0.8594104308385341
The final solution has been reached after 6 extra iterations.
As we have seen above, if the number of columns is larger than the number of rows then a unique solution does not exist in general because the number of unknowns is larger than the number of equations. However, there is an exception:
If the matrix A has some special properties and the solution vector has to be sparse, i.e. most of its values are zero, then we can find the x that minimizes ||y − A·x||2.
In general an iterative procedure is needed for the minimization. We have implemented one based on iterative hard thresholding which is described by Blumensath & Davies (2010).
solveSparse#
(a##
, y#
[, x#
], maximumNumberOfNonzeros
, maximumNumberOfIterations
, tolerance
, infoLevel
)
The parameters have the following meaning:
a##
, y#
x#
]
maximumNumberOfNonzeros
maximumNumberOfIterations
tolerance
infoLevel
In the following example we first construct a sparse vector x, with random numbers between 0.1 and 10, and a random Gaussian matrix A. From these two we construct our y. We then solve the sparse system and compare its solution xs to the constructed x.
nrow = 100
ncol = 1000
x# = zero# (ncol)
for i to size (x#)
x# [i] = if randomUniform (0,1) < 0.005 then randomUniform (0.1, 10) else 0.0 fi
endfor
# On average in x# 5 out of 1000 will be unequal zero.
a# = randomGauss## (nrow, ncol, 0.0, 1.0 / nrow)
y# = mul# (a##, x#)
maximumNumberOfNonzeros = 10
maximumNumberOfIterations = 200
tolerance = 1e-17
info = 0 ; no info
xs# = solveSparse# (a##, y#, maximumNumberOfNonzeros, maximumNumberOfIterations, tolerance, info)
# We have found the solution now check
dif# = x# - xs#
inner = inner (dif#, dif#)
writeInfoLine: if inner > 1e-7 then "error" else "OK" endif
My info window shows: OK.
© djmw 20230801