calling all Python gurus...

**I have an M by N matrix A and an M by 1 column vector b.

I want to find the N by 1 column vector x which minimizes sum(abs(A*x-b)).

What is the recommended way to do this in Python?**

This can be formulated as a linear program (introduce auxiliary variables t and minimize sum(t) subject to -t <= Ax - b <= t) and solved with an LP-capable package such as PyGLPK (http://tfinley.net/software/pyglpk/) or cvxopt (http://cvxopt.org/). Alternatively, a general-purpose constrained optimizer like SciPy's scipy.optimize.fmin_cobyla (http://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.fmin_cobyla.html#scipy.optimize.fmin_cobyla) can be applied directly to the nonsmooth objective.

I used the cvxopt example implementation of L1 norm minimization on this page.

I exported the A and b (revised) matrices to CSV files and used the following script:


import argparse

import numpy as np
import cvxopt

from l1 import l1  # l1.py from the cvxopt examples


def main():
    # Usage: python <this_script>.py A.csv b.csv
    parser = argparse.ArgumentParser(description="Find the 'x' with minimum abs error for Ax-b=0.")
    parser.add_argument("a_file", help="The CSV file containing the 'A' matrix.")
    parser.add_argument("b_file", help="The CSV file containing the 'b' vector.")
    args = parser.parse_args()

    # Load A (M x N) and b (M x 1) from the CSV files.
    a_mat = np.genfromtxt(args.a_file, delimiter=',')
    b_vec = np.genfromtxt(args.b_file, delimiter=',')

    # Solve min sum(abs(A*x - b)) via the example's LP formulation.
    x = l1(cvxopt.matrix(a_mat), cvxopt.matrix(b_vec))
    print(np.array(x))


if __name__ == '__main__':
    main()

Result:


     pcost       dcost       gap    pres   dres   k/t
 0:  8.7743e+03  2.9247e+03  6e+03  0e+00  3e-15  1e+00
 1:  8.9318e+03  4.9739e+03  4e+03  1e-16  8e-15  2e+00
 2:  8.3896e+03  6.1401e+03  2e+03  3e-16  5e-15  2e+00
 3:  7.8324e+03  6.8913e+03  9e+02  2e-16  2e-15  1e+00
 4:  7.4882e+03  7.1498e+03  3e+02  3e-16  9e-16  5e-01
 5:  7.3762e+03  7.2315e+03  1e+02  2e-16  7e-16  2e-01
 6:  7.3177e+03  7.2699e+03  5e+01  2e-16  1e-15  8e-02
 7:  7.2992e+03  7.2823e+03  2e+01  2e-16  2e-15  3e-02
 8:  7.2918e+03  7.2869e+03  5e+00  3e-16  3e-15  9e-03
 9:  7.2895e+03  7.2884e+03  1e+00  2e-16  4e-15  2e-03
10:  7.2889e+03  7.2887e+03  1e-01  2e-16  3e-15  2e-04
11:  7.2888e+03  7.2888e+03  2e-02  2e-16  2e-14  5e-05
12:  7.2888e+03  7.2888e+03  6e-03  2e-16  2e-13  1e-05
Optimal solution found.
[[   40.85472072]
 [   54.50362792]
 [   57.5105164 ]
 [   75.75089537]
 [  121.94860251]
 [   46.52666963]
 [   58.57712428]
 [   30.90475117]
 [   36.04659762]
 [   73.62493218]
 [   51.12644449]
 [   51.248826  ]
 [   42.37621978]
 [    3.94520428]
 [   78.67860326]
 [   37.77162505]
 [   38.81927895]
 [   29.40909155]
 [   44.89222214]
 [   86.02462292]
 [   64.98284056]
 [   65.17634379]
 [   78.69920606]
 [   57.15841432]
 [   57.00407176]
 [   98.82612034]
 [   35.16978323]
 [   68.48203503]
 [   91.06996332]
 [   40.08103217]
 [   56.89136308]
 [   48.42120165]
 [   30.41951755]
 [   38.14178871]
 [   38.78992582]
 [   55.06826228]
 [   23.35460718]
 [   43.32600518]
 [   46.55423128]
 [   61.66252831]
 [   31.49935873]
 [   35.06424312]
 [   96.31845687]
 [   51.16348017]
 [   53.45403994]
 [   52.65396522]
 [   42.75751957]
 [   57.49749418]
 [  108.39757194]
 [   81.67409568]
 [   53.09710197]
 [   90.24598119]
 [   66.6702275 ]
 [   46.93615142]
 [   61.9347642 ]
 [   56.08362874]
 [   19.47767745]
 [   45.29674652]
 [   47.34223078]
 [   63.39464624]
 [  -48.796162  ]
 [   47.35556574]
 [   87.12906013]
 [   50.71256405]]

I also got an implementation using fmin_cobyla working, but it is much slower (and/or needs some parameter tuning): minutes instead of sub-second. Here are the results I got from fmin_cobyla after 5000 objective-function evaluations (a sketch of this approach follows the results):


[  40.85566343   54.50291262   57.51132686   74.48562198  121.75464378
   46.52686084   58.57605178   30.90550162   36.04789644   73.62653722
   51.12556634   51.24789644   42.37605178    3.94692557   78.67702265
   37.77087379   38.81812298   29.41100324   45.26634529   86.02524272
   64.98252427   65.17540453   78.69838188   57.1605178    57.0038835
   98.82653722   35.16957929   67.45066719   91.06860841   40.08220065
   56.89061489   48.42006472   30.42006472   38.14174757   38.78964401
   55.06860841   23.35533981   43.32621359   46.55469256   61.66213592
   31.49773463   35.06213592   96.31909385   51.16245955   53.45436893
   52.65501618   43.26409911   57.49838188  105.8154998    81.67443366
   53.09708738   90.24530744   66.67055016   46.93527508   61.93656958
   56.08478964   19.47702265   45.29773463   47.34174757   56.6887101
  -48.7961165    47.35469256   87.12815534   50.71132686]
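For reference, the fmin_cobyla version boils down to handing the nonsmooth objective to the optimizer directly. Here is a minimal sketch, not necessarily the exact code used above; the least-squares warm start and the trivially satisfied constraint are my own choices:

import numpy as np
from scipy.optimize import fmin_cobyla

def l1_error(x, A, b):
    # Nonsmooth objective: sum of absolute residuals.
    return np.sum(np.abs(A.dot(x) - b))

def solve_lad_cobyla(A, b, maxfun=5000):
    # Warm start from the least-squares solution to cut down iterations.
    x0 = np.linalg.lstsq(A, b, rcond=None)[0]
    # COBYLA is a constrained optimizer; a trivially satisfied constraint
    # leaves the problem effectively unconstrained.
    return fmin_cobyla(l1_error, x0, cons=[lambda x: 1.0],
                       args=(A, b), consargs=(), maxfun=maxfun)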

Just from a cursory glance, it looks like you’re talking about implementing linear least squares, no?

No. L1 norm minimization.

Least Squares is L2 norm minimization, which is what is used for “OPR” computation here on CD. I am postulating that min L1 norm may be a better metric for FRC scouting purposes.

In the attached example, only 19% of the min_L2_norm computed alliance scores fall within +/-10 points of the actual alliance scores, whereas 41% of the min_L1_norm computed ones do.
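For anyone who wants to compute that metric on their own data, it boils down to something like this (a sketch only; A, b, and the two solution vectors x_l1 and x_l2 are assumed to be NumPy arrays from the fits above):

import numpy as np

def frac_within(x, A, b, tol=10.0):
    # Fraction of computed alliance scores A.dot(x) that land
    # within +/- tol points of the actual scores b.
    return np.mean(np.abs(A.dot(x) - b) <= tol)

# Hypothetical usage, with x_l2 from least squares and x_l1 from LAD:
# x_l2 = np.linalg.lstsq(A, b, rcond=None)[0]
# print(frac_within(x_l2, A, b))  # ~0.19 in the attached example
# print(frac_within(x_l1, A, b))  # ~0.41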

Hi Jared,

As I mentioned to you privately, I was able to duplicate your Python/cvxopt results exactly on a Debian Linux installation of Python/cvxopt.

Since then I’ve had some free time and have come back to the challenge of getting it to work in Octave (MATLAB).

The solution turned out to be to convert the original nonlinear, nondifferentiable problem, min(sum(abs(A*x-b))), into a linear program and solve it with Octave’s glpk (or MATLAB’s linprog) LP solver.
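For the Python-minded, the same reformulation is easy to reproduce with SciPy's linprog. A minimal sketch, assuming A and b are NumPy arrays (the function name is mine, and I have not tuned solver options):

import numpy as np
from scipy.optimize import linprog

def lad_lp(A, b):
    # min sum(abs(A*x - b)) as an LP over variables [x (n), t (m)]:
    #   minimize sum(t)   subject to   -t <= A*x - b <= t
    m, n = A.shape
    c = np.concatenate([np.zeros(n), np.ones(m)])
    A_ub = np.block([[ A, -np.eye(m)],   #  A*x - t <=  b
                     [-A, -np.eye(m)]])  # -A*x - t <= -b
    b_ub = np.concatenate([b, -b])
    bounds = [(None, None)] * (n + m)    # x free; t >= 0 is implied
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=bounds)
    return res.x[:n]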

According to tic/toc, Octave/glpk took 0.125 seconds to solve on my 8-year-old desktop running XP.

The resulting solution, however, is interestingly different from yours. Although the L1 norms of the residuals of the two solutions are virtually identical (to the 4th decimal place), the L2 norm of the glpk LAD residuals is 0.2 smaller than cvxopt’s.

Although the min L2 norm solution of an overdetermined system is mathematically unique, the min L1 norm solution need not be: fit a single constant c to the two data points 0 and 1, and every c between 0 and 1 minimizes |c-0| + |c-1|, while least squares gives the unique answer c = 0.5. I assume that accounts for the difference.

From all the data that I’ve looked at so far, min L1 norm of residuals (Least Absolute Deviation, or LAD) appears to be a slightly better metric than L2 (Least Squares). Now that efficient methods of computing LAD have been identified, perhaps that opens a whole new world of fun number crunching for the stats mavens here on CD.

Are there any R, Mathematica, SciLab, etc gurus out there who would be interested in doing the LAD computation and posting results?

**I also got an implementation using fmin_cobyla working, but it is much slower (and/or needs some parameter tuning): minutes instead of sub-second.**

That doesn’t surprise me. I think any approach that doesn’t first convert the LAD problem to an LP is going to be much more computationally intensive.

Well, it’s not any of the above, but I generated solutions with MOSEK, Gurobi, OOQP, and XpressMP using the online NEOS solvers.

NEOS - Least Absolute Deviation via LP.zip (5.15 KB)

Just generated the solution using the Windows version of the stand-alone console app glpsol.exe from the GLPK package, available here:

It took 0.1 seconds to generate the LP and solve it.
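For anyone trying to reproduce this, the invocation looks something like the following (the file names are hypothetical; --lp reads a CPLEX-format LP file and -o writes the solution report):

glpsol --lp lad.lp -o lad_solution.txt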

Still looking for any gurus who might be able to help with the syntax for solving this using R.