#37
27-11-2014, 11:03
mr_yes
Registered User
FRC #1248 (The TITANium Allies)
Team Role: Mentor
 
Join Date: Apr 2007
Rookie Year: 2005
Location: Berea, Ohio
Posts: 33
Re: FIRST Choice 2015 Window Open

Quote:
Originally Posted by Michael Hill View Post
I've made a spreadsheet comparing credits to $ amount.

https://docs.google.com/spreadsheets...it?usp=sharing
I'm impressed! Yesterday I worked up the following Python code, which harvests the FC offerings from the website and outputs them as a comma-separated values (CSV) list. It's really nice to see these items compared with their commercial counterparts, though. The one column in my list that is not in your spreadsheet is "Max Qty", for those items that are quantity-limited.

I imported a run from this morning into Google Sheets at https://docs.google.com/spreadsheets...it?usp=sharing

Code:
# -*- coding: utf-8 -*-
"""
Created on Wed Nov 26 21:39:26 2014

@author: John
"""

from __future__ import division # for python 2
import urllib # for urlopen (python 2)
import nltk, re, pprint
from nltk import word_tokenize
url = "http://www.firstchoicebyandymark.com/everything/"
html=urllib.urlopen(url).read()
#print len(html)," characters read from ",url
#raw=nltk.clean_html(html)
#print len(raw)," characters in cleaned html"
# cut out the useless stuff at the start and end
#istart=raw.find('<div class="product-item"')
#iend=raw.find('<div class="footer">')
#raw = raw[istart:iend]
# turn into tokens
tokens = word_tokenize(html)
text = nltk.Text(tokens)
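def findy(searchFor,searchIn):
    # Return the index of the first token equal to searchFor.
    # Returns 0 when there is no match, so a match at index 0
    # is indistinguishable from "not found".
    for num,value in enumerate(searchIn):
        if value==searchFor:
            return num
    return 0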
qx=[]
tokens=tokens[findy('product-item',tokens):] # Advance to first item
while findy('data-productid=',tokens)>0:
    prodid=tokens[findy('data-productid=',tokens)+2]
    link="http://www.firstchoicebyandymark.com"+tokens[findy('href=',tokens)+2]
    if1=findy('Show',tokens)+3
    if2=findy("''",tokens[if1:])
    descrip=' '.join(tokens[if1:if1+if2])
    if1=findy('actual-price',tokens)+3
    price=tokens[if1]
    if2=findy('maxqty',tokens)
    if if2==0 or if2>if1:
        maxqty='none'
    else:
        maxqty=tokens[if2+11]
    end=findy('/div',tokens[if1:])+if1
    qx.append([prodid,descrip,price,maxqty,link])
    tokens=tokens[end:] # Advance to next item
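# Header row, then one line per item; joining with ',' avoids the
# extra spaces that print's comma-separated arguments would insert
print 'product id,description,price,max qty,link'
for x in qx:
    print ','.join([x[0],'"'+x[1]+'"',x[2],x[3],x[4]])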
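
For anyone adapting this, the output step could also be handed to Python's standard csv module, which takes care of quoting fields that contain commas or double quotes. Below is a minimal sketch of that idea, not what I actually ran. It is written for Python 3 (the script above is Python 2), and both the rows_to_csv helper name and the sample rows are made up for illustration; they are not real FIRST Choice data.

```python
import csv
import io

def rows_to_csv(rows):
    """Return CSV text: a header row followed by the given rows."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["product id", "description", "price", "max qty", "link"])
    writer.writerows(rows)
    return buf.getvalue()

# Hypothetical rows in the same column order as qx (not real catalog data)
sample = [
    ["1001", 'Widget, 6" wheel', "10", "none", "http://example.com/widget"],
    ["1002", "Gearbox", "25", "2", "http://example.com/gearbox"],
]

print(rows_to_csv(sample))
```

The csv writer only quotes a field when it needs to, so the description with a comma in it gets wrapped in quotes and the plain ones are left alone. Google Sheets should import either form.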