Quote:
Originally Posted by Greg Marra
FRCLinks uses a Javascript redirect. I am pointing at FRCLinks for links to team pages right now on TBA, but I need to do a full scrape of FIRST's pages to update Team Names to be accurate now. Wget doesn't follow Javascript redirects - I may need to bake up something a bit fancier to either parse these out of FRCLinks, or parse them out of FIRST's team data when I am scraping event attendance.
|
"window.location.*?=.*?\"(.*)\"" as a regex on the content of the FRCLinks page is a pretty simple way of grabbing Pat's redirect. That is how frcfeed is doing it; just grab the content of group 1.
In Python:
Code:
import re
URL = re.search("window.location.*?=.*?\"(.*)\"", Content).group(1)
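One thing to watch with the one-liner: re.search returns None when the page has no redirect in it, so calling .group(1) on the result raises AttributeError. Here is a rough sketch of a more defensive version. The helper name extract_redirect and the sample page body are made up for illustration, and the real FRCLinks markup may well look different.

```python
import re

# Same pattern as in the post, written as a raw string so the quotes need no escaping.
REDIRECT_RE = re.compile(r'window.location.*?=.*?"(.*)"')


def extract_redirect(content):
    """Return the redirect target found in the page source, or None if there is none."""
    match = REDIRECT_RE.search(content)
    return match.group(1) if match else None


# Hypothetical page body, invented for this example (not copied from FRCLinks).
sample = '<script>window.location = "http://example.com/team/177";</script>'
print(extract_redirect(sample))
```

Note that (.*) is greedy, so if the same line has more double quotes after the URL, group 1 will run up to the last one. Swapping it for ([^"]*) stops at the first closing quote instead.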