We found a bug in the software that was used today to assign teams to events. I apologize for this. Approximately 5% of teams were not assigned the correct event.
We kept the lottery numbers teams had been assigned during the first run of this process and re-ran the process with the bug corrected. We then compared the first-run results with what the results would have been if the bug had not been present.
The majority of the teams affected would have been assigned to events higher on their preference lists. A handful would have been assigned to events lower on their preference lists.
We will be making no changes to teams who would have been assigned an event lower on their preference list.
We are currently reviewing the data for teams that would have been assigned an event higher on their preference list if the bug had not been present. We intend to work to find space at the events those teams should have been assigned to and offer those slots to teams. As we expect some teams may be fine with staying at the event where they are currently assigned, we will be making no re-assignments without permission.
The re-assignment process will take time as we work through the data, find space, and offer slots to teams. We will have more information tomorrow.
I apologize again for this. We are working hard to make it right.
Well, it sucks for that 5%, but I think far more than 5% were watching “website not available” screens last year. Hopefully the errors are fixed for them soon.
For those talking about the impact and the numbers, yes, that’s an issue.
Here’s the bigger picture, though.
Event assignments went smoothly for MOST people, and with sanity kept. This was the primary goal of the change.
It’s been ~12 hours or less since the mistake was noticed, and FIRST has already noted a problem (probably flagged by CD discussion or team emails/phone calls), corrected it, rechecked their work, identified discrepancies caused by the problem, and made an announcement on the path forward. All that for a problem that affects 5% of the program (which means 95% unaffected, an A).
And their solution involves working with teams, and making sure that every team goes to the events they want to.
*tips hat towards NH* That’s what I call amazing customer service.
I don’t want to ruin your dreams, but my semi-educated guess is that the FRC Software Engineers are not working on the website, or event registration, or VIMS, or really anything outside of the control system for FRC. FIRST has a separate IT team that (I assume) does this sort of development/implementation.
Let’s be real here, this is an embarrassing mistake.
This is not exactly a difficult software engineering problem. This is taking teams, checking their ranked lists against event capacity in a certain order, and assigning teams to events. This is a script I think a lot of FRC students could write. Basic software robustness practices, like unit testing, would catch stuff like this. Or even basic sanity checks done at the end of the script, like making sure that events that aren’t full admitted everyone who picked them first, would have caught this. It’s not just embarrassing that this happened, but that they released the team list before they spotted it.
And when 5% of all teams are affected, but only ~52% of teams even registered for more than one event, that’s really about 1 in 10 (5/52 ≈ 9.6%) of the teams who made a list. That’s a pretty alarming failure rate.
Of course, FIRST’s handling of the issue since the mistake was spotted is top-notch, and that deserves praise, but frankly this is not rocket science.
I had what I like to call an “I’m an idiot” moment yesterday where I spent hours trying to debug an issue to find that I had reversed some logic in a similar manner but with booleans.
Think about it. This is a new process for FIRST. The very first time they ran it with actual team data, it completed with 95% success. The remaining 5% will be massaged into submission shortly.
We had our results just 2.5 hours after the Priority List process closed.
Honestly, with respect to some of the issues we have seen in the last couple years, this was a HUGE success!
Good, the world needs more dedicated software testers.
I can pretty much tell you what happened. There’s an ordered list of event preferences created as you select preferences, and it’s stored indexed by creation order. Each preference also has a position on the list, stored as a property on the object that contains the event preference, and that position changes as the user reorders their preferences.
All well and good, but when the next process ran, rather than sorting them by preference order it ran through them in creation order.
At a pseudocode level you’re looking at
Foreach (eventPreference in preferenceList)
instead of
Foreach (eventPreference in (sort preferenceList by preference))
I’ve made exactly this mistake myself.
It likely got through “testing” because there was no dedicated tester with a specific test to create a list of preferences, reorder it, and then run it through assignment.
As unfortunate as the issue is, FIRST is working to make it right.
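To make the pseudocode above concrete, here is a minimal Python sketch of the same mistake. This is only a guess at the shape of the problem, not FIRST's actual code: `EventPreference`, `position`, `pick_event_buggy`, and `pick_event_fixed` are all hypothetical names made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class EventPreference:
    event: str
    position: int  # 1 = top choice; changes when the user reorders the list


def pick_event_buggy(preference_list, open_events):
    # Bug: walks the list in creation order and never looks at `position`.
    for pref in preference_list:
        if pref.event in open_events:
            return pref.event
    return None


def pick_event_fixed(preference_list, open_events):
    # Fix: sort by the user's preference order before walking the list.
    for pref in sorted(preference_list, key=lambda p: p.position):
        if pref.event in open_events:
            return pref.event
    return None


# The team added Event A first, then Event B, then moved Event B to the top.
preference_list = [EventPreference("Event A", 2), EventPreference("Event B", 1)]
open_events = {"Event A", "Event B"}

print(pick_event_buggy(preference_list, open_events))  # Event A
print(pick_event_fixed(preference_list, open_events))  # Event B
```

The two functions only disagree when a list has been reordered after creation, which is exactly the case a test would need to set up on purpose.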
This is a plausible explanation. All the system testing in the world isn’t going to expose this issue if nobody thinks to create the necessary conditions in the first place. That’s a pretty easy mistake to make.
That said, this isn’t a real-time system. FIRST can and should validate the assignment before publishing it to teams, which would have caught this issue. There are a number of invariants we expect to hold true for any set of event preferences:
All events at a higher preference than the assigned event for a given team must be full.
All teams are assigned to an event or waitlist on their preference list.
The lottery numbers of all teams assigned to an event should be less (higher priority) than the lottery number of any team that could not be assigned to this event because it was full.
Testing for all of these (and I’m sure there are more you could add) would be very quick…O(# teams + # events).
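A rough sketch of what such a post-run check could look like in Python. Everything here is an assumption for illustration: the data shapes (`lottery`, `preferences`, `assignment`, `capacity` as plain dicts), the function name `validate_assignment`, and the simplification that waitlists are not modeled (an unassigned team is `None`). Lottery numbers are assumed to be positive, with lower meaning higher priority.

```python
from collections import Counter


def validate_assignment(lottery, preferences, assignment, capacity):
    """Return a list of invariant violations (empty if the run looks sane).

    lottery:     team -> lottery number (positive, lower = higher priority)
    preferences: team -> list of events, best choice first
    assignment:  team -> assigned event, or None if unassigned
    capacity:    event -> number of slots
    """
    problems = []
    filled = Counter(e for e in assignment.values() if e is not None)

    # Highest (worst) lottery number admitted to each event.
    worst_admitted = {}
    for team, event in assignment.items():
        if event is not None:
            worst_admitted[event] = max(worst_admitted.get(event, 0), lottery[team])

    for event, count in filled.items():
        if count > capacity.get(event, 0):
            problems.append(f"{event}: over capacity")

    for team, ranked in preferences.items():
        event = assignment.get(team)
        if event is not None and event not in ranked:
            problems.append(f"{team}: assigned to {event}, not on its list")
            continue
        # Events the team ranked above the one it actually got.
        passed_over = ranked if event is None else ranked[:ranked.index(event)]
        for better in passed_over:
            if filled[better] < capacity.get(better, 0):
                problems.append(f"{team}: skipped {better}, which still has space")
            elif lottery[team] < worst_admitted.get(better, 0):
                problems.append(f"{team}: skipped {better}, but a later pick got in")
    return problems


lottery = {"T1": 1, "T2": 2}
preferences = {"T1": ["A", "B"], "T2": ["B", "A"]}
capacity = {"A": 1, "B": 1}

print(validate_assignment(lottery, preferences, {"T1": "A", "T2": "B"}, capacity))
print(validate_assignment(lottery, preferences, {"T1": "B", "T2": "A"}, capacity))
```

The first call returns an empty list. The second flags T1, who was pushed to its second choice even though a team with a worse lottery number holds a slot at its first choice.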
So in my day job, I’m a product owner for a relatively complex piece of software that I thought could be implemented quickly (hint: it can’t). In light of that experience, I’ve written two vertical slices of this system (the registration UI and the actual draft process) in Gherkin syntax. Now tell me how easy this is. Notice all the corner cases that I thought of in an hour, and that’s without the rest of my team helping me write this! Note that I’ve weaseled out of some of it here, and I’m sure I’m missing some stuff. This is merely illustrative of the complexity.
As a FRC team main contact
I want to select event preferences
So that I can attend the event of my choice
Acceptance criteria:
Given a team that has paid,
And I am a main or alternate contact,
When I log into the system,
Then I will see a list of events to select from.
Given a team that has paid,
And I am not the main or alternate contact,
When I log into the system,
Then I should not be able to select my events.
Given a team that has not paid,
And I am the main or alternate contact,
When I log into the system,
Then I should not be able to select my events.
Given that I have logged into the system,
And have the ability to select events,
When I select my events,
Then they need to be in priority order.
Given that I have selected my events,
When I log out of the system,
And then log back in,
Then I should see the events that I have selected in the order that I have placed them in priority.
Given that I have an event selected,
When I attempt to add the event again,
Then I should get an error.
Given that I have more than one event selected,
When I change the priority of them,
And log out,
Then when I log in again, I should see them in the new priority order.
----
As a FRC Staff person,
I want to be able to run a lottery process,
So that teams may be assigned to events that they desire.
NOTE: the lottery process is a business rule, serpentine draft with teams
placed into their top selection that has remaining capacity.
Acceptance criteria:
Given that the period for event selection has concluded,
When I run the process for the draft,
Then I should see the lottery numbers assigned to each team.
Given the list of lottery numbers and the teams assigned to them,
When I run the assignment process,
Then the teams should be assigned to events in the order of the lottery numbers.
Given the process is running and has encountered a team,
When there is an event selection for that team,
And that event selection has a higher priority than the remaining event selections for that team,
And there is available capacity at that event,
Then they should be assigned to that event.
(note: repeat until selections are exhausted)
Given that there are teams that have made selections,
And they have not been assigned to any event,
When the process is complete,
Then I should see a report of those teams.
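For comparison, the core of the assignment criteria above fits in a few lines of Python. This is a minimal sketch under stated assumptions, not the real process: it does a single pass in lottery order with one event per team (so the serpentine part of the business rule is not modeled), and `run_assignment` and the dict-based inputs are hypothetical. The corner cases in the stories (payment, contacts, persistence) are exactly what it leaves out.

```python
def run_assignment(lottery, preferences, capacity):
    """Assign each team to its top remaining choice, in lottery order.

    lottery:     team -> lottery number (lower = picks earlier)
    preferences: team -> list of events, already in priority order
    capacity:    event -> number of slots
    Returns (assignment, unassigned): assignment maps team -> event or None,
    and unassigned lists the teams to include in the staff report.
    """
    remaining = dict(capacity)  # copy so the caller's capacities are untouched
    assignment = {}
    unassigned = []
    for team in sorted(preferences, key=lambda t: lottery[t]):
        assignment[team] = None
        for event in preferences[team]:
            if remaining.get(event, 0) > 0:
                remaining[event] -= 1
                assignment[team] = event
                break  # selections below this one are not needed
        if assignment[team] is None:
            unassigned.append(team)
    return assignment, unassigned


lottery = {"T1": 2, "T2": 1, "T3": 3}
preferences = {"T1": ["X", "Y"], "T2": ["X"], "T3": ["X"]}
capacity = {"X": 1, "Y": 1}

assignment, unassigned = run_assignment(lottery, preferences, capacity)
print(assignment)   # {'T2': 'X', 'T1': 'Y', 'T3': None}
print(unassigned)   # ['T3']
```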
The registration was hugely successful this year, seemingly with only one bug that impacted registration order*. For a first-year changeover to a new process, that’s pretty impressive.