Chief Delphi

Chief Delphi (http://www.chiefdelphi.com/forums/index.php)
-   Python (http://www.chiefdelphi.com/forums/forumdisplay.php?f=187)
-   -   calling all Python gurus... (http://www.chiefdelphi.com/forums/showthread.php?t=128822)

Ether 04-15-2014 06:41 PM

calling all Python gurus...
 


I have an M by N matrix A and an M by 1 column vector b.

I want to find the N by 1 column vector x which minimizes sum(abs(A*x-b)).

What is the recommended way to do this in Python?




Jared Russell 04-15-2014 08:55 PM

Re: calling all Python gurus...
 
This can be formulated as a linear program and solved with something like SciPy's scipy.optimize.fmin_cobyla (http://docs.scipy.org/doc/scipy/refe...fmi n_cobyla), PyGLPK (http://tfinley.net/software/pyglpk/), or cvxopt (http://cvxopt.org/).

Jared Russell 04-17-2014 06:56 PM

Re: calling all Python gurus...
 
I used the cvxopt example implementation of L1 norm minimization on this page. fmin_cobyla

I exported the A and b (revised) matrices to CSV files and used the following script:

Code:

import argparse
import numpy as np

from l1 import l1
import cvxopt

def main():
    parser = argparse.ArgumentParser(description="Find the 'x' with minimum abs error for Ax-b=0.")
    parser.add_argument("a_file", help="The CSV file containing the 'A' matrix.")
    parser.add_argument("b_file", help="The CSV file containing the 'b' vector.")
    args = parser.parse_args()

    a_mat = np.genfromtxt(args.a_file, delimiter=',')
    b_vec = np.genfromtxt(args.b_file, delimiter=',')

    x = l1(cvxopt.matrix(a_mat), cvxopt.matrix(b_vec))
    print np.array(x)


if __name__ == '__main__':
    main()

Result:

Code:

    pcost      dcost      gap    pres  dres  k/t
 0:  8.7743e+03  2.9247e+03  6e+03  0e+00  3e-15  1e+00
 1:  8.9318e+03  4.9739e+03  4e+03  1e-16  8e-15  2e+00
 2:  8.3896e+03  6.1401e+03  2e+03  3e-16  5e-15  2e+00
 3:  7.8324e+03  6.8913e+03  9e+02  2e-16  2e-15  1e+00
 4:  7.4882e+03  7.1498e+03  3e+02  3e-16  9e-16  5e-01
 5:  7.3762e+03  7.2315e+03  1e+02  2e-16  7e-16  2e-01
 6:  7.3177e+03  7.2699e+03  5e+01  2e-16  1e-15  8e-02
 7:  7.2992e+03  7.2823e+03  2e+01  2e-16  2e-15  3e-02
 8:  7.2918e+03  7.2869e+03  5e+00  3e-16  3e-15  9e-03
 9:  7.2895e+03  7.2884e+03  1e+00  2e-16  4e-15  2e-03
10:  7.2889e+03  7.2887e+03  1e-01  2e-16  3e-15  2e-04
11:  7.2888e+03  7.2888e+03  2e-02  2e-16  2e-14  5e-05
12:  7.2888e+03  7.2888e+03  6e-03  2e-16  2e-13  1e-05
Optimal solution found.
[[  40.85472072]
 [  54.50362792]
 [  57.5105164 ]
 [  75.75089537]
 [ 121.94860251]
 [  46.52666963]
 [  58.57712428]
 [  30.90475117]
 [  36.04659762]
 [  73.62493218]
 [  51.12644449]
 [  51.248826  ]
 [  42.37621978]
 [  3.94520428]
 [  78.67860326]
 [  37.77162505]
 [  38.81927895]
 [  29.40909155]
 [  44.89222214]
 [  86.02462292]
 [  64.98284056]
 [  65.17634379]
 [  78.69920606]
 [  57.15841432]
 [  57.00407176]
 [  98.82612034]
 [  35.16978323]
 [  68.48203503]
 [  91.06996332]
 [  40.08103217]
 [  56.89136308]
 [  48.42120165]
 [  30.41951755]
 [  38.14178871]
 [  38.78992582]
 [  55.06826228]
 [  23.35460718]
 [  43.32600518]
 [  46.55423128]
 [  61.66252831]
 [  31.49935873]
 [  35.06424312]
 [  96.31845687]
 [  51.16348017]
 [  53.45403994]
 [  52.65396522]
 [  42.75751957]
 [  57.49749418]
 [ 108.39757194]
 [  81.67409568]
 [  53.09710197]
 [  90.24598119]
 [  66.6702275 ]
 [  46.93615142]
 [  61.9347642 ]
 [  56.08362874]
 [  19.47767745]
 [  45.29674652]
 [  47.34223078]
 [  63.39464624]
 [ -48.796162  ]
 [  47.35556574]
 [  87.12906013]
 [  50.71256405]]

I also got an implementation using fmin_cobyla working, but it is much slower (and/or needs some parameter tuning). Minutes instead of sub-second. Here are the results I got from fmin_cobyla after 5000 objective function evaluations:

Code:

[  40.85566343  54.50291262  57.51132686  74.48562198  121.75464378
  46.52686084  58.57605178  30.90550162  36.04789644  73.62653722
  51.12556634  51.24789644  42.37605178    3.94692557  78.67702265
  37.77087379  38.81812298  29.41100324  45.26634529  86.02524272
  64.98252427  65.17540453  78.69838188  57.1605178    57.0038835
  98.82653722  35.16957929  67.45066719  91.06860841  40.08220065
  56.89061489  48.42006472  30.42006472  38.14174757  38.78964401
  55.06860841  23.35533981  43.32621359  46.55469256  61.66213592
  31.49773463  35.06213592  96.31909385  51.16245955  53.45436893
  52.65501618  43.26409911  57.49838188  105.8154998    81.67443366
  53.09708738  90.24530744  66.67055016  46.93527508  61.93656958
  56.08478964  19.47702265  45.29773463  47.34174757  56.6887101
  -48.7961165    47.35469256  87.12815534  50.71132686]


Spoam 04-19-2014 04:08 PM

Re: calling all Python gurus...
 
Just from a cursory glance, it looks like you're talking about implementing linear least squares, no?

Ether 04-19-2014 05:08 PM

Re: calling all Python gurus...
 
1 Attachment(s)
Quote:

Originally Posted by Spoam (Post 1376433)
Just from a cursory glance, it looks like you're talking about implementing linear least squares, no?

No. L1 norm minimization.

Least Squares is L2 norm minimization, which is what is used for "OPR" computation here on CD. I am postulating that min L1 norm may be a better metric for FRC scouting purposes.

In the attached example, only 19% of the min_L2_norm computed alliance scores fall within +/-10 points of the actual alliance scores, whereas 41% of the min_L1_norm computed ones do.



Ether 05-04-2014 03:58 PM

Re: calling all Python gurus...
 
Quote:

Originally Posted by Jared Russell (Post 1375592)
I used the cvxopt example implementation of L1 norm minimization on this page.

Hi Jared,

As I mentioned to you privately, I was able to duplicate your Python/cvxopt results exactly on a Debian Linux installation of Python/cvxopt.

Since then I've had some free time and came back to the challenge of getting it to work in Octave (MatLab).

The solution turned out to be to convert the original nonlinear/nondifferentiable problem -- min(sum(abs(Ax-b))) -- into a linear program and solve it with Octave's glpk (or MatLab's linprog) LP solver.

According to tic/toc, Octave/glpk took 0.125 seconds to solve on my 8-year-old desktop running XP.

The resulting solution, however, is interestingly different from yours. Although the L1 norm of the residuals of both solutions are virtually identical (to the 4th decimal place), the L2 norm of the glpk LAD residuals is 0.2 smaller than cvxopt's.

Although min L2 norm of residuals is mathematically unique, L1 is not. I assume that accounts for the difference.

From all the data that I've looked at so far, min L1 norm of residuals (Least Absolute Deviation or LAD) appears to be a slightly better metric than L2 (Least Squares). Now that efficient methods of computing LAD have been identified perhaps that opens a whole new world of fun number crunching for the stats mavens here on CD.

Are there any R, Mathematica, SciLab, etc gurus out there who would be interested in doing the LAD computation and posting results?


Quote:

I also got an implementation using fmin_cobyla working, but it is much slower (and/or needs some parameter tuning). Minutes instead of sub-second.
That doesn't surprise me. I think any approach that doesn't first convert the LAD problem to an LP problem is going to be much more computation-intensive.



Ether 05-08-2014 05:00 PM

Re: calling all Python gurus...
 
1 Attachment(s)
Quote:

Originally Posted by Ether (Post 1382548)
Are there any R, Mathematica, SciLab, etc gurus out there who would be interested in doing the LAD computation and posting results?

Well it's not any of the above, but I generated solutions with MOSEK, Gurobi, OOQP, and XpressMP using the online NEOS solvers.



Ether 05-26-2014 05:00 PM

Re: calling all Python gurus...
 
Quote:

Originally Posted by Ether (Post 1383901)
Well it's not any of the above, but I generated solutions with MOSEK, Gurobi, OOQP, and XpressMP using the online NEOS solvers

Just generated the solution using Windows version of stand-alone console app glpsol.exe from GLPK package available here:

http://sourceforge.net/projects/winglpk/

It 0.1 sec to generate the LP and solve it.

Still looking for any gurus who might be able to help with the syntax for solving this using R.




All times are GMT -5. The time now is 04:51 AM.

Powered by vBulletin® Version 3.6.4
Copyright ©2000 - 2017, Jelsoft Enterprises Ltd.
Copyright © Chief Delphi