#1
Re: [TBA]: TBATV v4 Development Log
I assume you're using Python?
I've run scripts on App Engine that created a few hundred objects in one go, and they seem to work okay (in the cloud, at least; it totally kills the local development server, whose datastore implementation just can't compete with Bigtable). I don't have any experience with this, but I'm pretty sure you can pass an optional RPC object, which can specify a callback, to both the URL fetcher and the datastore to make them act asynchronously. That way, you should be able to run operations simultaneously.
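For reference, a minimal sketch of the pattern described above for the URL fetcher, assuming the RPC-style calls in `google.appengine.api.urlfetch` (`create_rpc`, `make_fetch_call`, and `get_result` on the RPC object). The module is passed in as a parameter so the function can be tried without the SDK; `fetch_all` is a hypothetical helper name, not something from the project.

```python
def fetch_all(urlfetch, urls, deadline=10):
    """Start every fetch before waiting on any of them.

    `urlfetch` is expected to behave like google.appengine.api.urlfetch.
    """
    rpcs = []
    for url in urls:
        rpc = urlfetch.create_rpc(deadline=deadline)
        urlfetch.make_fetch_call(rpc, url)  # starts the request, does not wait
        rpcs.append(rpc)
    # get_result() blocks until that particular fetch has finished.
    return [rpc.get_result() for rpc in rpcs]
```

On App Engine this would be called as `fetch_all(urlfetch, urls)` after `from google.appengine.api import urlfetch`. Whether the datastore calls overlap in the same way is not something this sketch demonstrates.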
#2
Re: [TBA]: TBATV v4 Development Log
Quote:
Making datastore tasks asynchronous and non-blocking is very attractive. I'll have to read up on RPCs. Thanks for the pointer!
#3
Re: [TBA]: TBATV v4 Development Log
I discovered that App Engine has a feature called "key_names". Instead of having App Engine assign a numeric ID to a Model in the datastore, you can specify a string to use instead. Later, you can call Model.get_by_key_name(key) instead of Model.all().filter('property =', value).get(). The key lookup is faster because it fetches the entity directly instead of running a query.
We are going to use these for Models with obvious canonical names. For instance, Teams will have key_names like 'frc177', and Events will have key_names like '2010ct'. Matches get key_names like '2010ct_sf2m1'. Since we can guess the key_name for a Model we expect to exist, we can find it faster. Neat.
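A small sketch of how such key_names could be built consistently. The helper names are hypothetical. The 'qm' and 'sf' codes come from the examples in this thread, while 'qf' and 'f' are assumptions, as is lower-casing the event abbreviation.

```python
# 'qm' and 'sf' appear in the thread's examples; 'qf' and 'f' are assumed.
COMP_LEVEL_CODES = {
    "Qualifications": "qm",
    "Quarterfinals": "qf",
    "Semifinals": "sf",
    "Finals": "f",
}


def team_key_name(team_number):
    return "frc%d" % team_number


def event_key_name(year, event_short):
    return "%d%s" % (year, event_short.lower())


def match_key_name(year, event_short, comp_level, set_number, match_number):
    event_key = event_key_name(year, event_short)
    code = COMP_LEVEL_CODES[comp_level]
    if comp_level == "Qualifications":
        # Qualification matches have no set number in the key.
        return "%s_%s%d" % (event_key, code, match_number)
    return "%s_%s%dm%d" % (event_key, code, set_number, match_number)
```

A lookup would then read something like `Match.get_by_key_name(match_key_name(2010, "CT", "Semifinals", 2, 1))`.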
#4
Re: [TBA]: TBATV v4 Development Log
I have not had time to work on the project much lately because of a trip to Boston, but I figured I would post our Models. Maybe people can give some feedback on the schema we're moving forward with presently.
Code:
from google.appengine.ext import db


class Team(db.Model):
    """
    Teams represent FIRST Robotics Competition teams.
    key_name is like 'frc177'
    """
    team_number = db.IntegerProperty(required=True)
    name = db.StringProperty()
    nickname = db.StringProperty()
    address = db.PostalAddressProperty()  # If we can scrape this.
    website = db.LinkProperty()
    first_tpid = db.IntegerProperty()  # from USFIRST. FIRST team ID number. -greg 5/20/2010


class Event(db.Model):
    """
    Events represent FIRST Robotics Competition events, both official and unofficial.
    key_name is like '2010ct'
    """
    name = db.StringProperty()
    event_type = db.StringProperty()  # From USFIRST
    short_name = db.StringProperty()  # Should not contain "Regional" or "Division", like "Hartford"
    event_short = db.StringProperty(required=True)  # Smaller abbreviation like "CT"
    year = db.IntegerProperty(required=True)
    start_date = db.DateTimeProperty()
    end_date = db.DateTimeProperty()
    venue = db.StringProperty()
    venue_address = db.PostalAddressProperty()  # We can scrape this.
    location = db.StringProperty()
    official = db.BooleanProperty(default=False)  # Is the event FIRST-official?
    first_eid = db.StringProperty()  # from USFIRST
    website = db.StringProperty()


class EventTeam(db.Model):
    """
    EventTeam serves as a join model between Events and Teams, indicating that
    a team will or has competed in an Event.
    """
    event = db.ReferenceProperty(Event,
                                 collection_name='teams')
    team = db.ReferenceProperty(Team,
                                collection_name='events')


class Match(db.Model):
    """
    Matches represent individual matches at Events.
    Matches have many Videos.
    Matches have many Alliances.
    key_name is like 2010ct_qm10 or 2010ct_sf1m2
    """
    event = db.ReferenceProperty(Event,
                                 collection_name='matches',
                                 required=True)
    time = db.DateTimeProperty()
    # This choices set should probably become a global Constant somewhere.
    # How do you do that in Python properly? -greg 5/20/2010
    comp_level = db.StringProperty(required=True,
                                   choices=set(["Qualifications", "Quarterfinals",
                                                "Semifinals", "Finals"]))
    set_number = db.IntegerProperty(required=True)
    match_number = db.IntegerProperty(required=True)


class MatchTeam(db.Model):
    """
    A join class between Teams and Matches. Serves to store alliance information
    Based on code from: http://code.google.com/appengine/articles/modeling.html
    """
    match = db.ReferenceProperty(Match,
                                 collection_name='teams',
                                 required=True)
    team = db.ReferenceProperty(Team,
                                collection_name='matches',
                                required=True)
    alliance = db.StringProperty(choices=set(["red", "blue"]),
                                 required=True)
    substitute = db.BooleanProperty(default=False)  # indicate the team was a substitute on the Alliance


class MatchScore(db.Model):
    """
    A one to many relationship class that stores alliance scores for each Match
    """
    match = db.ReferenceProperty(Match,
                                 collection_name='scores',
                                 required=True)
    alliance = db.StringProperty(choices=set(["red", "blue"]),
                                 required=True)
    score = db.IntegerProperty()


class TBAVideo(db.Model):
    """
    Store information related to videos of Matches hosted on
    The Blue Alliance.
    """
    match = db.ReferenceProperty(Match,
                                 collection_name='tba_videos',
                                 required=True)
    location = db.StringProperty()


class YoutubeVideo(db.Model):
    """
    Store information related to videos of Matches hosted on YouTube.
    """
    match = db.ReferenceProperty(Match,
                                 collection_name='youtube_videos',
                                 required=True)
    youtube_id = db.StringProperty()
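On the question in the comp_level comment: one common Python convention is a module-level name in capitals, defined once and imported wherever it is needed. A sketch, assuming hypothetical names `COMP_LEVELS`, `ALLIANCE_COLORS`, and `is_valid_comp_level`; using `frozenset` keeps the values from being modified by accident.

```python
# Could live in a shared module (e.g. a hypothetical constants.py).
COMP_LEVELS = frozenset(
    ["Qualifications", "Quarterfinals", "Semifinals", "Finals"])
ALLIANCE_COLORS = frozenset(["red", "blue"])


def is_valid_comp_level(value):
    """Return True if value is one of the known competition levels."""
    return value in COMP_LEVELS
```

The models should then be able to use `choices=COMP_LEVELS` and `choices=ALLIANCE_COLORS` in place of the repeated `set([...])` literals.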
#5
Re: [TBA]: TBATV v4 Development Log
I am not sure if this is the right thread for the Blue Alliance "wish list", but I will throw this in here anyway.
I would love a W/L/T column on the page I get after searching for a particular team. Yes, technically I can just search through the scores, but a W/L/T column would save me a lot of time when doing pre-scouting for the championship.
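As a rough illustration of what that column would compute, here is a sketch that tallies a record from a team's match scores. The function name and the input shape (pairs of the team's alliance score and the opposing alliance score) are assumptions, not part of the site's code.

```python
def win_loss_tie(results):
    """Tally (wins, losses, ties) from (own_score, opponent_score) pairs."""
    wins = losses = ties = 0
    for own_score, opponent_score in results:
        if own_score > opponent_score:
            wins += 1
        elif own_score < opponent_score:
            losses += 1
        else:
            ties += 1
    return wins, losses, ties
```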
#6
Re: [TBA]: TBATV v4 Development Log
Quote:
#7
Re: [TBA]: TBATV v4 Development Log
I put in some more work this afternoon, and the Datafeed system is to the point that we can get all Events, Teams, and Matches from 2010 regionals posted on the FIRST website. We'll be importing this data from the TBAv3 SQL rather than re-scraping from FIRST, but this system is important moving forward for 2011.
There are still issues with how long datastore actions take. Some of these will be improved by using Model.get_or_insert() instead of separately looking things up before trying to modify them. This will particularly improve Match insertion, as we need to create MatchTeam and MatchScore objects to accompany each match.
Memcache will play a large role in reducing CPU usage as well. By memcaching certain requests like "give me the HTML for Connecticut 2010's matches", we can dodge hitting the datastore at all, instead making a single memcache call. We'll expire these cached objects when we update the underlying data they hold (such as getting new Match results), but this should be particularly useful for page views (fast!) and API calls from other apps (repetitive!).
I'm still not 100% certain that the (Match, MatchTeam, MatchScore) object model is what we will ultimately go with. It creates nine objects per Match (one Match, six MatchTeams, and two MatchScores), which is a bit expensive. It's also expensive to make sure that we don't have straggling objects if Teams were erroneously attached to matches. It really does provide nice flexibility going forward though.
The next tasks are to build out mock pages for Events, Teams, and Matches. This will make sure that our object model contains everything we think will go in the final pages. We're hoping to do a redesign of the site, so we're not going to put a lot of effort into the visual design at this point. Bare HTML will do.
As we get things more towards "works without elaborate setup" we'll push the code into either a Google Code or GitHub repository (any opinions here?). We hope the community will help suggest performance improvements or even commit patches!
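The caching idea above could look roughly like the following. This is only a sketch: `cache` is assumed to behave like `google.appengine.api.memcache` (its `get`, `set`, and `delete` functions), and the helper names, the key prefix, and the `render` callback are all hypothetical.

```python
CACHE_PREFIX = "event_matches_html:"  # hypothetical key prefix


def cached_event_html(cache, event_key, render):
    """Return cached HTML for an event, rendering it only on a cache miss."""
    cache_key = CACHE_PREFIX + event_key
    html = cache.get(cache_key)
    if html is None:
        html = render(event_key)  # slow path: hits the datastore
        cache.set(cache_key, html)
    return html


def invalidate_event_html(cache, event_key):
    """Drop the cached HTML after the event's underlying data changes."""
    cache.delete(CACHE_PREFIX + event_key)
```

The importer would call `invalidate_event_html` whenever it stores new Match results for that event, so the next page view re-renders once and later views are served from the cache.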
#8
Re: [TBA]: TBATV v4 Development Log
Quote:
#9
Re: [TBA]: TBATV v4 Development Log
This is probably what we'll end up doing. I don't think rendering templates is very expensive compared to everything else, so that should be clean and easy.
#10
Re: [TBA]: TBATV v4 Development Log
Here is a screenshot of AppStats showing the problem with the Match, MatchTeam, MatchScore object model. All of the database gets and puts to insert a Match add up to a lot of CPU time.
I wonder if storing everything in Match using a red1, red2, red3, blue1, blue2, blue3 system or a red_teams, blue_teams list reference property is a better idea. I'd really love for these relation objects to be faster, but when we render a full event's worth of matches, it takes almost 20 seconds.
Does anyone know if it is possible to non-lazily fetch a bunch of objects' ReferenceProperties all at once? Like say, "I want all the Matches, and their MatchTeam objects, and the Team objects on the other end of those"?
Last edited by Greg Marra : 02-07-2010 at 21:22.
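One possible approach to the question above is to collect the referenced keys first and resolve them with a single batched lookup, instead of letting each ReferenceProperty load on access. The sketch below is generic so it runs without the SDK. On App Engine, `batch_get` could be `db.get` (which accepts a list of keys) and `get_ref_key` could be something like `lambda mt: MatchTeam.team.get_value_for_datastore(mt)`, which reads the stored key without loading the Team. The function name `prefetch_references` is hypothetical.

```python
def prefetch_references(entities, get_ref_key, batch_get):
    """Resolve one reference per entity with a single batched lookup.

    get_ref_key(entity) returns the referenced key without loading it.
    batch_get(keys) returns the referenced objects in the same order.
    Returns a dict mapping each distinct key to its referenced object.
    """
    keys = []
    seen = set()
    for entity in entities:
        key = get_ref_key(entity)
        if key not in seen:  # fetch each referenced object only once
            seen.add(key)
            keys.append(key)
    return dict(zip(keys, batch_get(keys)))
```

Rendering code would then look Teams up in the returned dict rather than touching `match_team.team`, which would trigger one datastore get per access.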
#11
Re: [TBA]: TBATV v4 Development Log
I am going to change the Match model to be less flexible so we can speed up datastore performance.
I am thinking a Match will have Match.teams, a ListProperty that stores the Teams in the match. Separately, we will store Match.alliances, which will contain a dictionary shaped like {"red": ["frc177", "frc195", "frc125"], "blue": ["frc433", "frc190", "frc222"]}. The teams property will basically be an index to let us quickly search by team, and the alliances property will store the actual structure of the alliances.
We'll add another property to Matches called "game", where we write down which FRC game was being played. This way, if they change the game structure in the future (or we get data on past games), we'll easily be able to adjust Controllers to handle it without having to muck with the model.
New concept:
Code:
class Match(db.Model):
    """
    Matches represent individual matches at Events.
    Matches have many Videos.
    key_name is like 2010ct_qm10 or 2010ct_sf1m2
    """
    event = db.ReferenceProperty(Event,
                                 required=True)
    time = db.DateTimeProperty()
    # This choices set should probably become a global Constant somewhere.
    # How do you do that in Python properly? -greg 5/20/2010
    comp_level = db.StringProperty(required=True,
                                   choices=set(["Qualifications", "Quarterfinals",
                                                "Semifinals", "Finals"]))
    set_number = db.IntegerProperty(required=True)
    match_number = db.IntegerProperty(required=True)
    # Primarily for indexing and searching. Holds Team keys, since a
    # ListProperty cannot hold Model instances directly.
    teams = db.ListProperty(db.Key)
    alliances = db.StringProperty()  # Store a Dictionary as a JSON string
    scores = db.StringProperty()  # Store a Dictionary as a JSON string
Similarly, instead of having a bunch of EventTeam objects, we could make teams a ListProperty of an Event. Getting all of the Teams at an Event would then require just finding the Event, instead of finding all of the EventTeams. This switches from a many-to-many relationship stored in join objects to lists stored on the Event and Match themselves.
I haven't had much time to work on development, but I think these new ideas will remove some of the major roadblocks that existed.
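To make the alliances-as-JSON idea concrete, here is a small sketch using the standard `json` module. The helper names are hypothetical, and the team lists are the example values from the post; nothing here depends on App Engine.

```python
import json


def encode_alliances(alliances):
    """Serialize the alliances dictionary for storage in a string property."""
    return json.dumps(alliances, sort_keys=True)


def teams_index(alliances):
    """Flatten the alliances into one list of team key_names for searching."""
    teams = []
    for color in sorted(alliances):
        teams.extend(alliances[color])
    return teams


def alliance_of(alliances_json, team_key_name):
    """Return the alliance color a team played on, or None if not in the match."""
    alliances = json.loads(alliances_json)
    for color, members in alliances.items():
        if team_key_name in members:
            return color
    return None
```

The flat list plays the role of Match.teams (the search index), while the JSON string keeps the red/blue structure that the Controllers would decode when rendering.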