Quote:
Originally Posted by davidaustin
I've attached my A and S matrices along with the ordered list of team numbers.
OK, I crunched those numbers with Octave.
Here's the Octave script I used:
format long                  % print results with full double precision
A = load('A.txt');           % match matrix (one column per team)
S = load('S.txt');           % score vector
T = load('teams.txt');       % ordered list of team numbers
x = A\S;                     % least-squares solution of A*x ≈ S (the OPRs)
Tx = [T, x];                 % pair each team number with its OPR
save Tx.txt Tx;              % write the team/OPR pairs to Tx.txt
Attached is the output list of team OPRs generated by the above script using your data.
Quote:
Thanks for your help with this. I'd really like to understand this better. I'll also look at numpy's least square result.
Glad to help. From what I can see, I think you understand it pretty well so far.
When you have a set of overdetermined linear equations:
[A][x] ≈ [b] (note the approximately-equal sign "≈": an overdetermined system generally has no exact solution)
...if you multiply both sides by [A]^T, you get:
[A]^T[A][x] = [A]^T[b]  =>  [N][x] = [d]
...which is a set of n equations in n unknowns, where "n" in this case is the number of columns in [A] (the number of teams).
Notice the "≈" sign has changed to "=" since the system now has an exact solution. The above are called the normal equations (of the overdetermined linear system), and solving these normal equations for [x] gives the least-squares solution to the original [A][x]≈[b] problem.
But if you're using a language with a linear algebra library, you can solve the overdetermined [A][x]≈[b] for the least squares solution directly. In Octave the syntax is called "left division" (x = A\b). In Python you use np.linalg.lstsq(A,b).
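If you want to compare numpy's answer against the Octave output above, here's a quick sketch that reads the same three files (assuming they're plain whitespace-delimited text that np.loadtxt can parse):

import numpy as np

A = np.loadtxt('A.txt')
S = np.loadtxt('S.txt')
T = np.loadtxt('teams.txt')

# Least-squares solve; the numpy counterpart of Octave's x = A\S
x, residuals, rank, sv = np.linalg.lstsq(A, S, rcond=None)

# Write team numbers alongside their OPRs, like the Tx.txt output above
np.savetxt('Tx.txt', np.column_stack((T, x)))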
Be aware that "least squares" (min L2 norm of residuals) is only one of many possible "best fit" solutions to the overdetermined system [A][x]≈[b].
For example, there's the "Least Absolute Deviations (LAD)" solution (min L1 norm of residuals). And there are other "robust regression" methods.