2020 Zebra Data Parser

Hello all, this is the official thread for my 2020 Zebra Data Parser tool! I will be providing updates to its GitHub page as needed throughout the season, and making note of updates in this thread.

If you want some context on what my parser is and what it does, check out my 3 TBA blog posts on it:
Part 1
Part 2
Part 3

This season, I will be using zones as defined in this thread. I have uploaded images showing the zones and zone IDs onto GitHub.

My first update is actually just a quality-of-life update to the 2019 parser. I will be incorporating 2020-specific zones soon. Changes since v4 (the version used in blog post 3) include:
Added support for direct data download from the TBA API. Thanks to the TBA team for developing this API, and make sure to check out the TBA match playback tool; it’s pretty sweet. Note that TBA handles 2019 start times differently than I did for the local data, so local data will lag TBA data by 1.7 seconds. For 2020, both the TBA and csv match start times should be synced with the FMS-reported start time, so hopefully there will be no discrepancy moving forward. (See the download sketch after this list.)
Added support for up to 10-point convex polygon zones. The upper limit is also now easily adjustable, so if I need to go higher I can. This was an important update for 2020, as there are a few pentagon and hexagon zones I am looking to use, and now I have that capability (see the point-in-polygon sketch after this list). I doubt I’m ever going to add support for concave zones, as it would likely cause a noticeable increase in processing time just to build zones that aren’t intuitive to me.
Finally, I added two options for data aggregation/smoothing. These work by combining multiple time-adjacent datapoints to achieve better location accuracy and reduce noise. Since we are not processing in real time, I can cheat a little bit and use moving averages that incorporate both past and future points. This allows for easy lagless smoothing without the difficulty of building something like a Kalman filter; I may build one in the future if I see a need. I’ve added 3-point and 5-point moving average options. My current recommendation is to use the 5-point moving average, although I would like to investigate more. This option dramatically reduces the noise in the higher movement derivatives (acceleration and jerk) with what I consider to be minimal positional loss. The 5-point moving average effectively eliminates the frequency content we might see between 1 Hz and 5 Hz, but I think these kinds of movements are generally unimportant for the high-level analysis we are currently doing with the Zebra trackers. (See the smoothing sketch after this list.)
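For anyone who wants to grab the same data outside the spreadsheet, here’s a minimal Python sketch of the TBA endpoint I’m pulling from. The auth key is a placeholder (generate a read key on your TBA account page), and double-check the exact response shape against the TBA API docs:

```python
import requests

TBA_API = "https://www.thebluealliance.com/api/v3"
AUTH_KEY = "YOUR_TBA_AUTH_KEY"  # placeholder; use your own TBA read key

def fetch_zebra(match_key):
    """Download the Zebra MotionWorks data for one match from the TBA API."""
    resp = requests.get(
        f"{TBA_API}/match/{match_key}/zebra_motionworks",
        headers={"X-TBA-Auth-Key": AUTH_KEY},
    )
    resp.raise_for_status()
    # Payload carries a shared "times" array plus per-team "xs"/"ys" tracks.
    return resp.json()

data = fetch_zebra("2019cc_qm1")  # match 1 at Chezy Champs
```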
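The convex zone check itself is simple; conceptually it’s just a same-side test against every edge. This is my own illustrative version, not the workbook’s exact formula:

```python
def in_convex_polygon(x, y, verts):
    """Return True if (x, y) lies inside the convex polygon given by verts,
    an ordered list of (x, y) vertices (up to 10 in my zones). The point is
    inside iff it sits on the same side of every edge."""
    sign = 0
    n = len(verts)
    for i in range(n):
        x1, y1 = verts[i]
        x2, y2 = verts[(i + 1) % n]
        cross = (x2 - x1) * (y - y1) - (y2 - y1) * (x - x1)
        if cross != 0:
            if sign == 0:
                sign = 1 if cross > 0 else -1
            elif (cross > 0) != (sign > 0):
                return False  # point is on the wrong side of this edge
    return True
```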
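And here’s the idea behind the lagless smoothing: a centered window that averages both past and future samples. How the window is handled at the start and end of a match is just one reasonable choice here, not necessarily what the spreadsheet does:

```python
import numpy as np

def centered_moving_average(values, window):
    """Centered (lagless) moving average; window should be odd (3 or 5).
    The window shrinks at the endpoints so output length matches input."""
    half = window // 2
    out = np.empty(len(values))
    for i in range(len(values)):
        lo = max(0, i - half)
        hi = min(len(values), i + half + 1)
        out[i] = np.mean(values[lo:hi])
    return out
```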

Here are some plots showing the speed, acceleration, and jerk measurements for the first 12 seconds of 604 in match 1 at Chezy Champs. If you watch that match, you’ll see that 604 sits still for this time, so these graphs should all stay near 0 throughout.
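For context, speed, acceleration, and jerk all come from repeatedly differencing the 10 Hz position samples, something like the sketch below (the parser’s exact difference scheme may differ):

```python
import numpy as np

DT = 0.1  # Zebra samples arrive at 10 Hz

def derivatives(xs, ys):
    """Estimate speed, acceleration, and jerk magnitudes from position
    samples via repeated finite differences. Each derivative is one
    sample shorter than the last."""
    vx, vy = np.diff(xs) / DT, np.diff(ys) / DT
    ax, ay = np.diff(vx) / DT, np.diff(vy) / DT
    jx, jy = np.diff(ax) / DT, np.diff(ay) / DT
    return np.hypot(vx, vy), np.hypot(ax, ay), np.hypot(jx, jy)
```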

First we have speed:

[image: speed plot]
The 5-point average cuts the average speed in half, from 0.5 ft/s to 0.25 ft/s, or 6 inches per second to 3 inches per second. I think the important thing to note is that you can now have a cleaner cutoff threshold to distinguish stationary robots from moving robots. Unsmoothed, this threshold would probably need to be about 2 ft/s, but with the 5-point average this can be reduced to around 1 ft/s.

Next we have acceleration:

[image: acceleration plot]

The gains start to become much more obvious here. The average acceleration drops from 6 ft/s^2 unsmoothed to 1.5 ft/s^2 when smoothed. Remember that acceleration due to gravity is 32 ft/s^2, so unsmoothed, the robot can have a measured acceleration fluctuating as much as 0.5g in random directions. Can you imagine how disorienting that would be if you felt gravity’s direction changing by up to about 30 degrees? (A 0.5g horizontal component on top of 1g downward tilts the apparent gravity vector by arctan(0.5) ≈ 27 degrees.) That’s how the robot would “feel” with unsmoothed data. For another reference point, here is some data on the DC Metro Red Line braking to a stop. The maximum acceleration in that case was around 7 ft/s^2, so I think it’s wise to smooth measurements until they get at least below that threshold for a stationary robot.

Finally we have jerk:

[image: jerk plot]

The smoothing really shows its value here. The average unsmoothed jerk is 100 ft/s^3; smoothed, it’s only 16 ft/s^3. Using the same paper linked above, the highest jerk felt from a train stop was 40 ft/s^3.

I may do even more aggressive smoothing in the future, as even the values derived from 5-point averages seem pretty noisy to me. I’m unsure exactly how much smoothing to do, though. The best approach I can think of would be to correlate Zebra speed and acceleration data with a robot’s own internal odometry measurements from encoders, accelerometers, and the like. If any team at one of the Zebra events has good match odometry logs and is willing to share them with me, please reach out.
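If I do get odometry logs, step one would be aligning the two traces in time. A brute-force correlation scan like this sketch would probably do; the function names and the ±2 s search window are my own assumptions:

```python
import numpy as np

def best_lag(zebra, odom, max_lag=20):
    """Scan sample offsets (±max_lag samples, i.e. ±2 s at 10 Hz) and
    return the lag at which the two speed traces correlate best.
    Positive lag means the Zebra trace is delayed relative to odometry."""
    n = min(len(zebra), len(odom))
    zebra, odom = np.asarray(zebra[:n]), np.asarray(odom[:n])
    best, best_corr = 0, -np.inf
    for lag in range(-max_lag, max_lag + 1):
        if lag >= 0:
            a, b = zebra[lag:], odom[:n - lag]
        else:
            a, b = zebra[:n + lag], odom[-lag:]
        corr = np.corrcoef(a, b)[0, 1]
        if corr > best_corr:
            best, best_corr = lag, corr
    return best, best_corr
```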

That’s it for now; I’ll be adding 2020-specific metrics soon.


@Will_Toth is this something easy for us to share?

Well, I’ve already got an update. Here is v2019.8.

When I was cleaning up the code before publishing v2019.7, I made a minor untested change that broke TBA imports, which was one of the key features I had just added. That is now fixed.

Here are some more graphs showing speed, acceleration, and jerk percentile data for 254 at CC:

Speed:

[image: speed percentile plot]

Very minimal difference; there’s a bit more of a gap at the extremely high (>95th) percentiles, but even that is not enormous.

Acceleration:

[image: acceleration percentile plot]

We start to see a much more noticeable difference in accelerations, though. So much so that I’d say much of my analysis of acceleration in my first blog post is pointless. Oops, that’s my bad; the excuse I’m going with is that I had to start somewhere.

Here’s the new list of 2019 CC teams sorted by stdev of acceleration (my opinion is that higher tends to be better):

Team stdev acceleration (ft/s^2)
4414 5.71
2910 4.91
1678 4.68
846 4.53
649 4.48
6443 4.33
5818 4.25
973 4.16
2046 4.15
971 4.11
1710 4.04
5199 4.01
1983 4.01
2928 3.99
2930 3.93
3309 3.90
1619 3.88
254 3.82
1072 3.80
115 3.80
604 3.79
2733 3.78
3476 3.68
114 3.68
3218 3.63
1197 3.62
2659 3.57
1671 3.55
4183 3.55
696 3.48
5507 3.32
498 3.24
3647 3.19
5026 3.07
2557 3.06
2102 2.98
1868 2.83
5700 2.78
5940 2.56

Jerk:

[image: jerk percentile plot]

Jerk shows an enormous difference. In fact, the entirety of the graph is below the 80 ft/s^3 noise threshold I mentioned in the linked blog post. So I’d say you should completely ignore the conclusions from that section of the blog post (although, to be fair, I did mention high noise as a concern).

Here’s the new list of average jerk for CC 2019 teams (my opinion is that lower tends to be better):

Team average jerk (ft/s^3)
5940 16.78
5026 16.93
2557 17.22
1868 17.32
2102 18.44
2659 18.58
3218 19.10
5700 19.64
3647 19.75
973 20.17
3309 21.35
1072 21.36
971 21.77
3476 21.79
696 22.17
5507 22.75
498 23.20
1983 24.22
5818 24.24
1619 24.51
254 24.55
2928 24.82
1671 24.97
2046 25.51
1197 25.76
114 26.62
649 27.04
1678 27.87
2733 27.95
5199 28.45
2930 28.97
115 29.00
846 29.43
604 29.44
1710 29.73
2910 30.03
4183 30.54
6443 35.29
4414 41.54

My hope is that I can get the noise in jerk measurements down low enough that I can easily identify the big collisions in 2020, since those are fun for me (especially since I don’t have a robot I have to worry about getting smashed :sweat_smile:). We’ll see if that’s achievable or not.
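The detection itself could be as simple as flagging jerk spikes above some cutoff once the noise floor is low enough. A rough sketch; the 80 ft/s^3 default is just the noise figure from the blog post and would need tuning against real 2020 data:

```python
import numpy as np

def collision_candidates(times, jerk, threshold=80.0):
    """Return the timestamps where the smoothed jerk magnitude (ft/s^3)
    spikes above the threshold; times must already be aligned with the
    jerk samples (jerk is three samples shorter than the raw track)."""
    spikes = np.flatnonzero(np.asarray(jerk) > threshold)
    return [times[i] for i in spikes]
```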


Wow, it’s amazing how much you can get done if you’re trapped in an airport all day! Here’s the initial upload of the 2020 parser. I added zone definitions, equivalent zones, zone groups, defense types, and penalties. Please review and let me know if any definitions in there look off. Also, if you have ideas for zone groups or defense types that I haven’t listed, definitely let me know. I’d prefer not to change much past week 1/2 for consistency’s sake, so the clock’s ticking if you have ideas.
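For anyone reviewing, a zone definition boils down to roughly this shape. This is a hypothetical Python mirror with made-up field names and coordinates, just to show how zones, groups, and equivalents relate; the real definitions live in the workbook and the images on GitHub:

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

@dataclass
class Zone:
    """Hypothetical mirror of a parser zone: an ID, a convex outline of up
    to 10 (x, y) vertices in feet, the groups it belongs to, and the
    equivalent (mirrored) zone on the other alliance's side."""
    zone_id: str
    vertices: List[Tuple[float, float]]
    groups: List[str] = field(default_factory=list)
    equivalent_zone: Optional[str] = None

# Made-up ID and coordinates, purely for illustration.
red_trench = Zone("red_trench", [(0.0, 0.0), (18.0, 0.0), (18.0, 4.5), (0.0, 4.5)],
                  groups=["trench"], equivalent_zone="blue_trench")
```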

We can’t really test the advanced analytics yet, as no 2020 datasets exist. Feeding in 2019 datasets causes errors since the field boundaries are smaller this year, and I have enough to work on that I don’t think I’m going to hack together a workaround. Hopefully this data comes out soon and I can test on it.

I’ve added a few simple auto routes, but I’m not sure how I want to handle the more complicated ones; I’m open to suggestions. There are such a ridiculous number of potential multi-ball auto paths that I feel like if I try to individually cover them all, the interesting data will get pretty diluted, and I’m sure teams will take routes I hadn’t even considered. Idk, I haven’t really read up on the scouting threads talking about auto paths, so maybe I’ll start there to get some ideas.


So lower jerk and smoother acceleration mean a better driver?

That’s the theory; I’m not sure if it holds any value in practice.

Updated to v2020.2: https://github.com/inkling16/ZebraDataParser/raw/master/Zebra_Data_Parser_2020.2.xlsm

Key changes:
Now allows TBA import from 2020 events
Better handling of null values from TBA
Extended zone boundaries on the edge of the field to go 0.5 feet outside of the field
Added 2020 Auto Routes
Added Auto Route priorities

We’re getting there for 2020; I still probably need one more update before I can start putting data into my scouting database.
TBA imports now work, which is great. A lot of teams are missing the very first t=0 data point in their matches. I think I’ll likely copy the t=0.1 data right into the t=0 position, because right now I’m throwing out entire team datasets just because they start recording a bit late, which seems excessively harsh. There are also quite a few datapoints that land outside the field boundaries. I didn’t run into this last year, but I’ve already seen it a few times this year. I’ve extended the edge zone boundaries half a foot outside the field to compensate, but that unfortunately won’t always be sufficient; I think I’ll push out another half foot in my next update.
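In sketch form, both cleanups are cheap. This is my own Python illustration, not the workbook’s code; the field dimensions are the approximate 2020 interior, and clamping stray points is just one alternative to widening the zones like the parser currently does:

```python
FIELD_X, FIELD_Y = 52.44, 26.94  # approximate 2020 field interior, in feet
MARGIN = 0.5                     # how far outside the field I currently tolerate

def clean_track(times, xs, ys):
    """Backfill a missing t=0 sample from the t=0.1 sample, and pull stray
    points back to just outside the field instead of discarding the team."""
    if times and times[0] > 0:   # team started recording late
        times.insert(0, 0.0)
        xs.insert(0, xs[0])
        ys.insert(0, ys[0])
    xs[:] = [min(max(x, -MARGIN), FIELD_X + MARGIN) for x in xs]
    ys[:] = [min(max(y, -MARGIN), FIELD_Y + MARGIN) for y in ys]
    return times, xs, ys
```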

I’ve now added 2020 auto routes, as well as an ordered evaluation value. Essentially, this allows for easier auto route definitions by letting you choose which autos are evaluated first. For example, the “Unknown” auto route covers the entire field, so all auto routes technically meet its definition. By evaluating it last, though, it is only used if none of the other routes match a team’s movement.
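The mechanism is basically a priority-sorted loop. A sketch with hypothetical names, treating each route as a (name, priority, predicate) tuple:

```python
def classify_auto(path, routes):
    """routes: list of (name, priority, matches) tuples, where matches is
    a predicate over the robot's auto-period path. Lower priority values
    are evaluated first, so the catch-all "Unknown" route carries the
    highest priority value and only wins as a fallback."""
    for name, _priority, matches in sorted(routes, key=lambda r: r[1]):
        if matches(path):
            return name
    return "Unknown"
```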


In Duluth at the Northern Lights Regional, we utilized a low-cost version of the Zebra…

2052 got a ball caught in their drivetrain and their drive wheels shredded the ball in a way that left a trail of lemon zest everywhere they went! The picture doesn’t do it justice but it was awesome to see. You can find the match video where it happened here.

@pntbll1313 might have a picture of the ball…


Confirmed. One Power Cell is cheaper than 2 Zebra tags.

It turns out blue nitrile makes for a great zester.


Looks great! I was just having one small problem: whenever I try to put in a match from The Blue Alliance, it gives me an error like this. I have tried it with 3 different PNW events and Texas Greenville; all 4 of them have given me an error, but the Texas Plano event seems to work fine.