View Full Version : Team 548 Einstein Statement


Nick Lawrence
20-08-2012, 12:43
Team 548 has released a statement regarding the events on Einstein.

You can read it here, along with FIRST's official response. (http://www.usfirst.org/roboticsprograms/frc/calendar/insert_blog_title_here)

Please keep this civil!

-Nick

IndySam
20-08-2012, 12:44
Team 548 has released a statement regarding the events on Einstein.

http://www.usfirst.org/roboticsprogr...log_title_here

Please keep this civil!

-Nick

You need to work on your url skills, how civil was that?

Kris Verdeyen
20-08-2012, 12:45
The statement is here:

http://www.usfirst.org/sites/default/files/uploadedFiles/Robotics_Programs/FRC/Game_and_Season__Info/2013/FRC_Team_548_Einstein_statement.pdf

Gregor
20-08-2012, 12:46
You need to work on your url skills, how civil was that?

Seems like the first thread he posted was deleted, and the new one has the incorrect link. Original link is here. (http://www.usfirst.org/roboticsprograms/frc/calendar/insert_blog_title_here)

JB987
20-08-2012, 12:47
For additional consideration, from Frank:

Blog Date:
Monday, August 20, 2012 - 09:38
Hello Teams,

Team 548 contacted me recently and asked if I would be willing to publish a statement from them regarding the events on Einstein on the FRC Blog. You can find their statement here. This statement represents, in part, their Steering Committee’s understanding of events that took place during the Einstein matches. To be clear, there are some differences between this understanding of events and the events as presented in the Einstein Report. FIRST continues to stand by its report. (My emphasis).

Taylor
20-08-2012, 13:19
For additional consideration, from Frank:

Blog Date:
Monday, August 20, 2012 - 09:38
Hello Teams,

Team 548 contacted me recently and asked if I would be willing to publish a statement from them regarding the events on Einstein on the FRC Blog. You can find their statement here. This statement represents, in part, their Steering Committee’s understanding of events that took place during the Einstein matches. To be clear, there are some differences between this understanding of events and the events as presented in the Einstein Report. FIRST continues to stand by its report. (My emphasis).

A dozen people can watch the same bank robbery and provide twelve different accounts of what happened (and likely twelve different descriptions of the burglar's appearance).

quinxorin
20-08-2012, 14:13
It gets you thinking, though, doesn't it? Was banning him from ever participating in FIRST again really the right thing to do?

BigJ
20-08-2012, 14:16
It gets you thinking, though, doesn't it? Was banning him from ever participating in FIRST again really the right thing to do?

(At least) one match was tampered with knowingly and purposefully. I'm sure whoever made the final decision regarding consequences did not take it the least bit lightly.

Libby K
20-08-2012, 14:17
Unfortunately, creating an interruption is not the way to 'make a point'. Sorry, I'm not giving anyone a pass on this one. You're supposed to listen to staff and volunteers, and this person didn't.

Good on 548 for coming forward, although the major discrepancies between their statement and FIRST's report still leave questions for me.

dodar
20-08-2012, 14:18
It gets you thinking, though, doesn't it? Was banning him from ever participating in FIRST again really the right thing to do?

I don't remember ever reading that he got a lifetime ban from FIRST.

BigJ
20-08-2012, 14:22
I don't remember ever reading that he got a lifetime ban from FIRST.

From http://www3.usfirst.org/node/2426

In addition, FIRST has prohibited the individual from participating in any future FIRST event as a coach, mentor, volunteer or in any other capacity. This is the penalty associated with an intentional act of interference.

Jay O'Donnell
20-08-2012, 14:25
I think it's a good thing 548 came out and said this. I very much hope that they stay a respected team in our FIRST community and continue to be successful. In regards to the individual, what's been said has already been said and I don't need to go into that. Great job 548, you guys have done nothing wrong in this process, and have my respect for coming out about it.

Nick Lawrence
20-08-2012, 14:38
There are some major discrepancies present here. While I don't want to start another famous "CD-Massacre," the report and this letter do not match up. I applaud 548 for coming forth with this; it takes a lot of bravery to do that. However, the information they have been given and the conclusions they came to simply don't make sense when you stack them up against the report and individual accounts of that afternoon.

Still, kudos 548 for at least making the statement in the first place.

Now we can move on a little more.

-Nick

quinxorin
20-08-2012, 14:38
Unfortunately, creating an interruption is not the way to 'make a point'. Sorry, I'm not giving anyone a pass on this one. You're supposed to listen to staff and volunteers, and this person didn't.

Good on 548 for coming forward, although the major discrepancies between their statement and FIRST's report still leave questions for me.
Assuming 548's account to be accurate, what would have happened if he hadn't made his point? The field personnel appear to have brushed him off originally. There may have never been an Einstein investigation, and we never would have known what happened. We also wouldn't have known that there was a vulnerability, and as such it may have been years before it was fixed.

Because the individual did this only after attempting to interact with the field personnel, I feel his actions were entirely warranted and correct.

Of course, this is all predicated on 548's version of events being correct. It seems believable to me, primarily because FIRST was so vague in this area.

Nick Lawrence
20-08-2012, 14:41
Because the individual did this only after attempting to interact with the field personnel, I feel his actions were entirely warranted and correct.

So, intentionally attempting to alter the course of the finals (or any match via interference methods) is okay to you for the sake of protest?

Wow.

-Nick

thefro526
20-08-2012, 14:45
So, intentionally attempting to alter the course of the finals (or any match via interference methods) is okay to you for the sake of protest?

Wow.

-Nick

It's a slippery slope.

Before passing judgment on this, I had to ask myself 'Had he been listened to, would things have played out differently?'

I'm not sure of the answer, and in turn not sure how I feel.

akoscielski3
20-08-2012, 14:46
So, intentionally attempting to alter the course of the finals (or any match via interference methods) is okay to you for the sake of protest?

Wow.

-Nick

I think he was trying to say that the action of telling the field personnel about the issue was the right thing to do. Or that coming forward after the matches was the right thing to do.




PS: ^ look at this Lawrence guy.

quinxorin
20-08-2012, 14:51
So, intentionally attempting to alter the course of the finals (or any match via interference methods) is okay to you for the sake of protest?

Wow.

-Nick
No. That would not be okay with me. However, a brief (three second) interruption that does not influence match outcome is.

Nick Lawrence
20-08-2012, 14:54
The interruption lasted for three seconds, according to the 548 report. Furthermore, according to the report, he did not intend to alter the outcome of the match, but instead wanted to show the field personnel that field hacking was in fact possible. That's okay with me.

The intent is okay with me too, don't get me wrong. But proving it on the biggest stage possible? That's not cool with me.

No. That would not be okay with me. However, a brief (three second) interruption that does not influence match outcome is.

Even a three second interruption interferes with the outcome of the match. For many teams, it takes them only that much time to shoot three baskets.

PS: ^ look at this Lawerence guy.

^ Look at this Koscielski guy, he can't even spell my name right.

Okay, before this thread crashes, let's go back to praising 548 for saying "we're sorry."

-Nick

Akash Rastogi
20-08-2012, 14:57
So, intentionally attempting to alter the course of the finals (or any match via interference methods) is okay to you for the sake of protest?

Wow.

-Nick

I think Nick is saying what most of us are thinking.

This mentor did not tell his team the full story, in my opinion. And this statement makes him look even worse (to me) because none of the other accusations that many are thinking about are addressed. Maybe he said this was the reasoning and that it only happened on Einstein and was a method of protest to protect team 548's wins last season? Maybe he did this at other events to gain competitive advantage? Those are my beliefs, and I'd like to see this same person address those beliefs. Even if he confirms or denies this, I honestly think more than just the Einstein teams deserve an apology. Yes these are bold statements, but I am perfectly confident that I am not the only one with these beliefs.

Kudos to the rest of 548 for releasing this statement. A bit late, but that is understandable. Yes you still have the support of other teams in FIRST (at least mine).

quinxorin
20-08-2012, 14:59
548 shouldn't have had to apologize. Regardless of what one individual did, he acted alone and not as a member of the team; whether right or wrong, it wasn't the team's fault.

Nick Lawrence
20-08-2012, 15:02
548 shouldn't have had to apologize. Regardless of what one individual did, he acted alone and not as a member of the team; whether right or wrong, it wasn't the team's fault.

I think they had to say something. I mean, a lot of people knew it was a 548 member for a long time. They had to dissociate themselves from the mentor in question.

-Nick

JVN
20-08-2012, 15:06
Because the individual did this only after attempting to interact with the field personnel, I feel his actions were entirely warranted and correct.


No. No. No.
There are any number of things which could have been done after Einstein to fix this issue. Don't fall into the trap of "he spoke up and was ignored so he had to make his point." There are plenty of ways to get "unignored" (later on) without knowingly sabotaging an event.

The existence of this vulnerability could have been made known, and fixed, after the fact. Suspecting that someone else is exploiting it, is not a valid reason for exploiting it yourself.

Kudos to 548 for coming forward, I expect nothing less from such a well regarded team.
Their team leadership stepped forward. They communicated the facts they have, without editorializing. They apologized.

-John

Ryan Dognaux
20-08-2012, 15:11
548 shouldn't have had to apologize.

What you do reflects upon your team. If someone on my team had done this, you better believe I'd be apologizing for that person's actions.

I have a very hard time believing everything in 548's statement just because of this single sentence - "The actions of the individual were not intended to harm a team or alliance, nor intended to alter the outcome of the matches on Einstein."

That's exactly what it did. Regardless of the intention, that's what happened. Teams were harmed, match outcomes were affected and the 2012 FRC season was damaged just to prove a point. There's a time and a place for this kind of thing, but not like this. Common sense - use it.

IndySam
20-08-2012, 15:11
I am glad that they came out with this statement, but there is way too much mitigating going on and it leaves a bad taste in my mouth.

I am hoping that it's just because some lawyers told them how to say it (after all, it is a very litigious world) and not the team's true feelings.

Nick Lawrence
20-08-2012, 15:13
It's a slippery slope.

Before passing judgment on this, I had to ask myself 'Had he been listened to, would things have played out differently?'

I'm not sure of the answer, and in turn not sure how I feel.

Had he been listened to, I'm not sure the finals would actually have taken place once they tested his theory and, likely, did not know at the time how to fix it. I mean, it probably took a lot of smart people A LOT of time to trace the actual cause of the FCA attacks to the field AP firmware.

-Nick

RobotsVsKittens
20-08-2012, 15:14
This is poorly written and a less than ideal admission of guilt.

Unfortunately, to further demonstrate the issue, and making a poor decision, they created a 3 second field interruption in match 2.

Who is 'they'?

Jon Stratis
20-08-2012, 15:20
548 shouldn't have had to apologize. Regardless of what one individual did, he acted alone and not as a member of the team; whether right or wrong, it wasn't the team's fault.

548's apology was exactly the correct thing to do.

This is the same "rule" that most organizations and companies have. If I were to walk onto a plane wearing a company shirt and start talking about something bad for the company, it reflects on the company. If I happen to encounter one of our customers and start having a discussion about our products, I'm required to file a report about the encounter.

In this case, an individual who could be identified as belonging to their team performed actions of which the community disapproves. Whether or not the team made a public statement, enough people witnessed the incident to ensure that something would be said and spread via rumor. In such a situation, it would go directly against the team and many people would blame the team. By stepping forward as a team, publicly apologizing, and clearly stating that the individual's actions were not representative of the team, this team is performing the necessary PR to move past the incident.

All that said, I personally believe this was the act of an individual, and not something that was sanctioned by the team. I think it really helps to highlight the fact that poor choices can be made by anyone, even a 7-year veteran mentor of a well respected team. I hope this doesn't tarnish the team's reputation in years to come, and I really hope the immediate community they interact with at off season events, districts, and such maintains respect for this team through these difficult times.

Taylor
20-08-2012, 15:20
This is poorly written and a less than ideal admission of guilt.

Who is 'they'?

While I don't like it either, the usually plural "they" is commonly accepted (http://oxforddictionaries.com/words/he-or-she-versus-they) as a singular, gender-neutral pronoun.
I'm like you - I personally don't use it on principle, and I cringe when I see it, but it's not incorrect per se.
I don't know that it's poorly written - like SAM said, it reeks of lawyerspeak - and understandably so. This team has strong ties to its sponsors, as many do (http://www.chiefdelphi.com/forums/showthread.php?t=107553), and when people's livelihoods are at stake, you bet the lawyers are called in.

EricH
20-08-2012, 15:22
Who is 'they'?
In this case, "they" refers to the individual who caused the interference. I could go into all the grammar, but in this case it's the best word if you want to keep anonymity for the person in question.

quinxorin
20-08-2012, 15:24
This is poorly written and a less than ideal admission of guilt.



Who is 'they'?
Presumably 548 was using "they" as a singular pronoun, to prevent revealing whether the individual was male or female.

Jared Russell
20-08-2012, 15:32
This is poorly written and a less than ideal admission of guilt.

I cannot disagree more. 548 did not have to release this statement at all - and I'm sure it was a difficult thing for them to write and distribute. But they chose to do it, because it was right, and that means it is time to put down the pitchforks and torches.

It takes balls to associate one's team or company with an incident like this. The team wrote and released this statement with the full knowledge that (fair or not) some people might look at them a little differently for a while (it's just human nature...and yes I am aware that a large portion of the FRC community already knew/thought they knew the team anyhow).

Hopefully now we can move forward.

JesseK
20-08-2012, 15:45
Without locking down the entire field environment (i.e. banning personal laptops for driver's stations), how could FIRST prevent this type of issue in the future? This is more of an industry-directed question rather than a FIRST-directed question.

quinxorin
20-08-2012, 15:49
Without locking down the entire field environment (i.e. banning personal laptops for driver's stations), how could FIRST prevent this type of issue in the future? This is more of an industry-directed question rather than a FIRST-directed question.
There are many ways to prevent this issue. The Einstein Report details FIRST's plans on how to secure the field.
Furthermore, it took twenty one years for someone to do this. I expect it to take just as long before the next incident.

steverk
20-08-2012, 15:55
it took twenty one years for someone to do this. I expect it to take just as long before the next incident.

Let's hope there is never another incident.

Andrew Schreiber
20-08-2012, 15:55
I'm going to agree with two of the posts in here just to clarify some points based on experience at one of my jobs (I help teach cyber security and ethics is a huge part of it).
Unfortunately, creating an interruption is not the way to 'make a point'. Sorry, I'm not giving anyone a pass on this one. You're supposed to listen to staff and volunteers, and this person didn't.


This is absolutely correct. When you are doing security audits and penetration tests, there are very specific rules for how you do things. And executing an attack during a very visible time is NOT one of those ways to do things.

No. No. No.
There are any number of things which could have been done after Einstein to fix this issue. Don't fall into the trap of "he spoke up and was ignored so he had to make his point." There are plenty of ways to get "unignored" (later on) without knowingly sabotaging an event.

The existence of this vulnerability could have been made known, and fixed, after the fact. Suspecting that someone else is exploiting it, is not a valid reason for exploiting it yourself.


(please note, all genders are generic)

THIS is the correct process, the person raised the issue at the time. It was not addressed. He should have documented his findings and sent them to FIRST. After giving FIRST a period of time to respond or fix the issue (think 6 months) he could have published a paper documenting his findings. At the end he should have included his original communication with FIRST and any steps they took or responses.


As it stands the person went from doing the right thing to being an attacker when they tried to "demonstrate" the vulnerability.

JesseK
20-08-2012, 16:22
There are many ways to prevent this issue. The Einstein Report details FIRST's plans on how to secure the field.
Furthermore, it took twenty one years for someone to do this. I expect it to take just as long before the next incident.

Correction -- it took only 3 years for it to happen on the field. The new control system started in 2009. Taking the report results and looking back, I believe one of my former students happened upon something similar in 2009 when he was figuring out how to wrap data into packets for use on a driver's station custom Java display. (For the record, he didn't tell us he found it and he graduated in '09. While his software was brilliant our robot had fundamental mechanical flaws that year). The problem I foresee is FIRST losing trustworthiness in any team that breaks a small rule on the field (namely, no cell phones for the guys who are the pit crew).

From an IT/IA perspective, the plans FIRST described in the report are vague at best, yet it's probably best that way. If we openly crowd-sourced amongst our intelligent community engineers to figure out how the FRC system could be vulnerable, then the companies working on securing the field would be better-equipped to understand what 0-day issues need to be addressed.

@Alec:
I too dislike putting my 6 vacation days, 100s of hours, and several dollars of support at the mercy of GP in such a competitive program. Yet at this point we should contribute to the solution rather than further highlighting the problem.

shawnz
20-08-2012, 16:47
What knocked it down was BAD engineering. [...] We need FIRST to be rock-solid in order to make a lasting impact. In my opinion, we still have a long way to go.

These are awfully harsh words. Remember that hindsight is 20/20. There will never be a day where nothing will have been overlooked, or every potential mistake will have been guaranteed against. FIRST is a volunteer organization, after all; they're doing the best they can. Although I agree with the general premise that blame isn't going to get anybody anywhere here.

BrendanB
20-08-2012, 16:52
Kudos to 548 for coming out and releasing a statement. I still love your team! ;)

Let's not rehash all of this again guys as we still don't know what happened. 548's report differs from FIRST's report but that doesn't tell us which one stands true at the end of the day. There were still other factors that played into this aside from the individuals action(s).

Jon Stratis
20-08-2012, 16:57
What knocked it down was BAD engineering. The loophole that allowed a smartphone, PC, or anything with a WiFi connection to intentionally or unintentionally disrupt a system that should have been rock solid, knocked it down. An organization that seeks legitimacy in the mainstream fell victim to a stupid mistake.

This is very much overly critical of FIRST and the job they did with the FMS. Keep in mind, the bug was actually from a vendor-provided firmware update, not something FIRST developed on its own.

Companies fall victim to situations like this all the time. In FIRST's case, it results in a disrupted competition. For other companies, it results in stolen consumer credit card information, a hacked website that installs a virus or trojan on consumers' computers, a defaced website in general, or any number of other "bad" things. No company is immune from outside attacks... why should FIRST be any different?

Nick Lawrence
20-08-2012, 17:03
Remember, FIRST did not cause this. It was a bug in the newer Field AP firmware that created this security hole.

-Nick

bardd
20-08-2012, 17:03
Thank you, 548, for stepping up. Even though it wasn't the team's fault, it was the right thing to do, I believe.
It takes real guts to do that. I don't know if I could have done the same.
You didn't lose any of the respect I had for you. If anything, I now appreciate you more for coming forward, and I believe there are many others who feel the same way.

As for this discussion... I think it is too early to discuss this. All that could've been said about the field system was said when the report came out.
The things that can be said about the apology will now be all mixed up with emotions (namely anger from what I've seen in some comments). I think this discussion should be paused, and re-started in a week or so, so that everyone has a chance to think, relax, and digest.

Travis Hoffman
20-08-2012, 17:13
Given this admission/apology, I do wonder how this may affect the status of 548's paid entry into the 2013 Championship.

AlecMataloni
20-08-2012, 17:14
These are awfully harsh words. Remember that hindsight is 20/20. There will never be a day where nothing will have been overlooked, or every potential mistake will have been guaranteed against. FIRST is a volunteer organization, after all; they're doing the best they can. Although I agree with the general premise that blame isn't going to get anybody anywhere here.

I agree that I was a bit too harsh. FIRST has done great things with the cards they have been dealt. Unfortunately, there are limits to the reach of a volunteer organization, but when FIRST strives to be on the same level as sports organizations, they should expect the same scrutiny held to established "sports" by the general public.

Gregor
20-08-2012, 17:27
Given this admission/apology, I do wonder how this may affect the status of 548's paid entry into the 2013 Championship.

Given that the mentor in question has been excluded from all future FIRST events, I would hope the paid admission to the 2013 Championship would continue to be extended to 548. This team was hurt just as much as the 11 other Einstein teams.

Renee Becker-Blau
20-08-2012, 17:28
I think that 548's Steering Committee did a good job at responding to and handling the situation. Any mentor or student on a team is a representative of that team and officially associated with the team. This isn't just because you're wearing a team's t-shirt or branding, it's also because students and mentors officially register with a team through FIRST. If an individual on a team is involved in a negative situation, the leadership of the team is brought into the situation as well (ex-Football and Basketball players acting inappropriately).

Jon made a good point:

By stepping forward as a team, publicly apologizing, and clearly stating that the individual's actions were not representative of the team, this team is performing the necessary PR to move past the incident.

I'm glad that FRC 548 has come forward and publicly apologized for the actions of the individual. I hope this will help to defuse any potential negativity that could occur at future events towards students and mentors on the team.

As for the individual, Jon and Andrew make great points:

There are any number of things which could have been done after Einstein to fix this issue. Don't fall into the trap of "he spoke up and was ignored so he had to make his point." There are plenty of ways to get "unignored" (later on) without knowingly sabotaging an event.

As it stands the person went from doing the right thing to being an attacker when they tried to "demonstrate" the vulnerability.

Renee

Siri
20-08-2012, 17:33
I just wanted to thank 548 for taking the courageous step of publishing this piece. Having committed no fault of their own, they've admirably given our community further impetus to move past the individual's conduct and embrace the challenges and opportunities this situation has exposed in our future. We owe it to ourselves to honor just that.


I am impressed with the general tone of this thread at this point. I hope our students -- and adults -- can continue to learn from the commendable behavior of 548, all the Einstein teams, and everyone involved in the investigation. (Certainly some in their echelons may still be recovering, but hopefully this helps the process on tragically affected teams.) I know I will work to develop and retain this culture change, and while I hope we can avoid or preempt such incidents in the future, I believe we'll be better equipped as a community if we must handle one again.

Cory
20-08-2012, 17:42
This team was hurt just as much as the 11 other Einstein teams.

No, they were the only alliance NOT hurt.

Gregor
20-08-2012, 17:47
No, they were the only alliance NOT hurt.

Any team that participated on the Einstein field played on a field that had become a tarnished playing ground. They may not have been interfered with directly, but being involved with Einstein must have been a heartbreaking experience. Can you imagine being on the field, not knowing if your robot was next to go down?

DonRotolo
20-08-2012, 18:01
The actions of a single person do indeed reflect upon a team, but in this case it is very clear that this person acted alone. Certainly an error in judgment to take that action.

Team 548 is a class act all the way. Every family has its Black Sheep, so I do not put any blame on the team. So, their coming out and issuing an apology was above and beyond the call of duty.

Travis Hoffman
20-08-2012, 18:02
They may not have been interfered with directly...

Of course they weren't. Why would the mentor use his/her own team or alliance partners as the target for making a point? He/she wouldn't. Whether they knew it at the time or not, 548 received an advantage over the other teams, just for this alone.

Kims Robot
20-08-2012, 18:45
People pushed & pushed for "the team" and/or "the person" to finally come forward, and I really hope that we can leave it at 548's statement. It may not be 100% what everyone dreamed of... but I don't think anything will ever make the situation right.

The team could and SHOULD only have issued their understanding of what happened, which means they are 100% reliant on what the mentor told them happened. Whether the mentor told them the entire truth or not, what are we so worried about? The person is banned, the vulnerability fixed, tons more issues were found and tediously documented, so let's move on. People that are worried about "discrepancies": are you looking to call the team or the mentor a liar? What good does that do? Or are you legitimately interested to know if there was "a second attacker"? And if so, are we just on our next witch hunt?

I'm not sure people are fully understanding the team's statement. I pieced this together long before the team's statement, and I'm not sure people are getting it...

I'm going to stop dancing around the numbers/"vaguery"....
1. During the first match on Einstein, there was a robot failure in the alliance that included Team 548.
2. An individual mentor from Team 548 believed the failure was likely caused by an interruption.
Translating into English... When 118 went down, the individual assumed someone else was using the attack they knew to be possible. They thought THEIR alliance was being attacked...
3. Acting on their own accord, they entered the field in an attempt to notify FIRST personnel of their belief.
4. The FIRST Technical staff did not pursue the suggestion by the individual and asked for them to leave the field area in which they complied.
How frustrating would it be to think that your alliance was denied the opportunity to compete fairly because of a security hole? They thought 118 had been targeted and that they had lost a key part of their alliance to this attack. Just putting myself in the mentor's shoes, I can see how heartbroken and distraught I would feel. We have all said numerous times that the frustration with this whole thing is that so many of us feel that the Einstein teams never got a "fair shot" to see who really could have been the winner. In that exact moment, this mentor really just wanted a fair shot... perhaps an opportunity at a replay with the bug fixed, or attacker identified... but when the mentor was asked to leave and disregarded, the mentor had no idea what else to do.
5. Unfortunately, to further demonstrate the issue, and making a poor decision, they created a 3 second field interruption in match 2.
548 acknowledges that this was a poor decision, and we can all see how it most definitely was the wrong way to go about it, but even I can acknowledge that being in the same exact situation, the thought would cross my mind. But I would hope even in the heat of the moment, I would make the right decision and let it go. But with how heated debates get here on CD, I would be willing to bet that probably 10% of the FIRST population may have done the exact same thing if they were put in the exact same circumstances and had the exact same knowledge. I'd like to say I hold us all to higher standards, but many of us crack under pressure and none of us has made the absolute right decision every single day of our lives.

I don't think this person was doing it to intentionally harm an alliance or to prove their ultimate hacking skills... I think it was a sad, last ditch effort to get the attention of FIRST and get their alliance "a fair shot" at competing.

Let's let go of the details and all move on. The issue has been fixed, the team has come out into the open & apologized, and many other good things have resulted from all of this. So let's focus on moving forward.

Andrew Lawrence
20-08-2012, 18:54
Kudos to 548 for coming forward and releasing this information. Don't worry, we still love you guys, and if anything, respect you even more for doing a difficult task such as this.

Also, I would like to wholeheartedly thank (even though I may get some crap for this) the individual who caused all of this. What he/she did was a good thing at the wrong time, the worst time. However, I'm going through this with an optimistic viewpoint. From what I'm reading, this individual's best interest was to show the problem to the FTA before it became a larger issue (based on the claim the individual saw some other interference). The individual also came forward to FIRST and admitted to the crime committed, as well as cooperated with FIRST to identify and ultimately come closer to solving the problem.

How the individual affected Einstein was devastating to the students who worked hard to get there, but I believe it wasn't done with a cruel heart. From what I hear, this mentor was a fun and enthusiastic person geared towards inspiring and teaching today's youth, exactly what an ideal mentor would be like. Life banishment from the place where he can help and inspire students is probably one of the worst ideas ever. Maybe a temporary banishment (a few years or so) to let them think about what they've done. And then if Team 548 wants this mentor back, I think he/she should be allowed back, to continue inspiring and teaching students.

What the individual did is completely terrible, but is it something forgivable?

Gregor
20-08-2012, 19:07
Of course they weren't. Why would the mentor use his/her own team or alliance partners as the target for making a point? He/she wouldn't. Whether they knew it at the time or not, 548 received an advantage over the other teams, just for this alone.

...but being involved with Einstein must have been heartbreaking experience. Please read the entire sentence ::rtm::

Lil' Lavery
20-08-2012, 19:23
No, they were the only alliance NOT hurt.

Because there were no other issues on Einstein outside of the intentional act of interference?

IanW
20-08-2012, 19:27
This mentor did not tell his team the full story, in my opinion. And this statement makes him look even worse (to me) because none of the other accusations that many are thinking about are addressed. Maybe he said this was the reasoning and that it only happened on Einstein and was a method of protest to protect team 548's wins last season? Maybe he did this at other events to gain competitive advantage? Those are my beliefs, and I'd like to see this same person address those beliefs. Even if he confirms or denies this, I honestly think more than just the Einstein teams deserve an apology. Yes these are bold statements, but I am perfectly confident that I am not the only one with these beliefs.

Even though you seem to be aware of the weight of your words, I think this indictment of Team 548 is too harsh. Your conjectures bring into question the integrity of their entire team, which I don't think is warranted. Especially considering that they apologized to the community as a whole in one of the most public manners they could manage. I truly hope that there are not many "with these beliefs," as it would indicate to me that the community has lost faith in the integrity of its peers, regardless of their reputation.

Otherwise, Kim's statement accurately sums up my thoughts:

People pushed & pushed for "the team" and/or "the person" to finally come forward, and I really hope that we can leave it at 548's statement. It may not be 100% what everyone dreamed of... but I don't think anything will ever make the situation right.

The team could and SHOULD only have issued their understanding of what happened, which means they are 100% reliant on what the mentor told them happened. Whether the mentor told them the entire truth or not, what are we so worried about? The person is banned, the vulnerability fixed, tons of more issues were found and tediously documented, so let's move on. People that are worried about "discrepancies": are you looking to call the team or the mentor a liar? What good does that do? Or are you legitimately interested to know if there was "a second attacker"? And if so, are we just on our next witch hunt?

Ekcrbe
20-08-2012, 20:28
Disclaimer: The following is a hopeful opinion which is not proven. This disclaimer takes the place of all references to the fact that this is only one potential version of the situation, but deserves consideration nonetheless.

I can't really get mad about the "discrepancies" because I don't think the team statement is intended to be that deceitful. The series of events before the purposeful interference as we know them sound very emotionally stressful to the individual. High emotional stress inherently leads to poor judgement, and, in the long term, poor memory and recall. Even before he/she got busted, I'm willing to bet it would have been hard for him/her to recall the whole story. After the release of the Einstein Report, and the individual's subsequent ban for life, his/her story becomes shoddy at best. This is compounded by the possibility that the individual tried to convince him/herself that he/she isn't as guilty as he/she really is, leading to a real belief that is different from reality. By the time Steering Committee was told the story, it was probably far diverged from the truth.

On the other hand, the well-documented unreliability of witnesses probably means the Einstein Report's version isn't all true, either. Like so many things, two sides of the same story are neither the truth nor lies, and the reality lies somewhere in between them.

A couple other things:
From what I'm reading, this individual's best interest was to show the problem to the FTA before it became a larger issue (based on the claim the individual saw some other interference).

...

Life banishment from the place where he can help and inspire students is probably one of the worst ideas ever. Maybe a temporary banishment (a few years or so) to let them think about what they've done. And then if Team 548 wants this mentor back, I think he/she should be allowed back, to continue inspiring and teaching students.

What the individual did is completely terrible, but is it something forgivable?

1. There are plenty of ways to make a statement. The decision the individual made was THE WRONG WAY to do so. I'm not directing anything at you or disparaging your opinion, because you're largely correct. My comment is that voluntary manslaughter (provocation) (http://en.wikipedia.org/wiki/Manslaughter#Voluntary_manslaughter) isn't murder (http://en.wikipedia.org/wiki/Murder), but it's still not permissible. Being mad doesn't give you any and all rights you want, especially when you don't know the full story (and actually have it wrong).
I'm not comparing the magnitudes of each situation, just the framework. I'm also not calling the individual a murderer at all, I truly believe this was a good individual who made a bad choice.

2. Is it really one of the "worst ideas ever"? It's harsh, but you have to set a precedent and say "This is not acceptable in FIRST."

3. Not yet, it seems. But that doesn't mean it never will be. After a while, everyone can look back differently.

IndySam
20-08-2012, 21:26
The team could and SHOULD only have issued their understanding of what happened, which means they are 100% reliant on what the mentor told them happened.


I'm gonna have to totally disagree with this statement. The team should have only apologized and left it at that. No other information was necessary. There was no need to add to the discussion of what happened.

IndySam
20-08-2012, 21:31
What the individual did is completely terrible, but is it something forgivable?
Yes it's forgivable if the person is honest and truly seeks forgiveness but it is not just about forgiveness. The penalty needs to be so harsh that no one ever considers doing something like this again.

Lifetime ban is not only appropriate it's necessary.

Travis Hoffman
20-08-2012, 21:47
...but being involved with Einstein must have been heartbreaking experience. Please read the entire sentence ::rtm::

I did read it, the first time. :)

connor.worley
20-08-2012, 22:01
I simply don't believe the "protest" idea. The attacker could have indefinitely delayed the match by disconnecting a robot before the match itself started. This would have been an equally effective protest, but would not have risked affecting the outcome of the matches.

Alan Anderson
20-08-2012, 22:01
The team should have only apologized and left it at that. No other information was necessary.

That is my opinion as well.

Based on multiple other reports, I'm going to give little weight to what the now-banned party says happened, and I'm not going to apologize for that. But I will accept Team 548's statement at face value, put it behind me, move forward, and strongly encourage everyone else to do the same.

RobotsVsKittens
20-08-2012, 22:36
Lack of grammar is not the only way something can be poorly written. It's poorly written because it is not explicitly clear who is being referred to. Not explicit enough for the circumstances.

It makes no sense why anyone should talk about forgiveness since no party here has sought it or made extremely explicitly clear who is at fault.

On a personal level, I find the use of words like 'unfortunately' in an apology to be less than genuine. Stating the intent of someone while simultaneously not specifying who it is we are talking about is laughable. As is the double standard of an individual not representing a team at a competition, but we're all such loyal team players to preserving anonymity.

On a related note, genuine apologies are rare in our society, so it is with a complete lack of surprise that I find many cannot identify one or misidentify it. Ah, but I digress.

Ekcrbe
20-08-2012, 23:12
Lack of grammar is not the only way something can be poorly written. It's poorly written because it is not explicitly clear who is being referred to. Not explicit enough for the circumstances.

It makes no sense why anyone should talk about forgiveness since no party here has sought it or made extremely explicitly clear who is at fault.

On a personal level, I find the use of words like 'unfortunately' in an apology to be less than genuine. Stating the intent of someone while simultaneously not specifying who it is we are talking about is laughable. As is the double standard of an individual not representing a team at a competition, but we're all such loyal team players to preserving anonymity.

On a related note, genuine apologies are rare in our society, so it is with a complete lack of surprise that I find many cannot identify one or misidentify it. Ah, but I digress.

I seriously doubt the use of improper grammar is a big deal. Let us not forget that the party apologizing is not the party at fault. If I tell you the "individual" is John Smith* of Team 548, is that better on any level than if I say it was a member of Team 548? You answer that. If you think the apology is "less than genuine", take a line from Taylor, one he used on me regarding this very subject:

"I've found that being outraged on behalf of others is often a misuse of energy."

Whether you're mad at the Robostangs, the singular individual, FIRST, or life, lamenting about the state of the world's apology writing is not going to help.

*I don't know if there is a member of Team 548 named John Smith, nor do I intend to accuse anyone on the team of being the anonymous individual.

Akash Rastogi
20-08-2012, 23:33
Even though you seem to be aware of the weight of your words, I think this indictment of Team 548 is too harsh. Your conjectures bring into question the integrity of their entire team, which I don't think is warranted. Especially considering that they apologized to the community as a whole in one of the most public manners they could manage. I truly hope that there are not many "with these beliefs," as it would indicate to me that the community has lost faith in the integrity of its peers, regardless of their reputation.


My post was and is only directed at this one mentor. Not the entire team, I praised the rest of the team for their apology. I am not questioning the integrity of 548, I am continuing to question the integrity of this one mentor's words.

IanW
21-08-2012, 00:01
My post was and is only directed at this one mentor. Not the entire team, I praised the rest of the team for their apology. I am not questioning the integrity of 548, I am continuing to question the integrity of this one mentor's words.

Sorry, I guess I misread/misunderstood the intent of your post then. The point about protecting Team 548's wins made me think you were referring to more than just the individual.

JackS
21-08-2012, 01:01
Good on 548 for coming forward, although the major discrepancies between their statement and FIRST's report still leave questions for me.

Emphasis mine.

I am continuing to question the integrity of this one mentor's words.

I am a bit disappointed by this sentiment for two reasons. First, a lot of the data in the Einstein Report is inconclusive.

Over the course of these tests, FRC Engineering was able to determine how to identify a failed client authentication through the log data recorded in the field access point. However, the configuration of the field access points used during the 2012 FRC competitions, including the matches on Einstein, is such that log data is not retained when the access point is powered off.

This statement, directly from the report, essentially states that the exact number of times the individual from 548 made his or her attack cannot be known, because the logs no longer exist. It is perfectly "plausible" that another individual repeated the same attack elsewhere in the dome, or some sort of other interference occurred.

Secondly, whether the individual made one attack or 100 attacks is a moot point. The individual's actions (regardless of intent) were malicious and he or she was punished accordingly. The job of the CD community is not to further scapegoat the individual for more attacks than he admitted to, as no proof exists. Instead, we should collectively be accepting of 548's generous apology (one they by no means had to provide) and we should all encourage FIRST to try and eliminate dead robots (due to control system failures) almost completely by 2014.

Seth Mallory
21-08-2012, 01:14
I for one am quite satisfied with Team 548's statement. Team 548 is also a victim in all of this. Having a "mentor out of control" can tear the guts out of a team. You have scars inside and outside of the team that take years to recover from. It is time to let Team 548 work through this and end this thread.

Ian Curtis
21-08-2012, 02:01
Remember, FIRST did not cause this. It was a bug in the newer Field AP firmware that created this security hole.

-Nick

If your car breaks, do you blame Delphi? Unless you are a huge car dork or work for an OEM, probably not. There are plenty of examples in modern industry where the supplier is the cause of an issue, but everyone still points the finger at the final assembler. Since it is your brand attached to the final product, you've got to ensure that you want your brand on it, even if you didn't build all the parts (and these days, no one builds all the parts).

What he/she did was a good thing at the wrong time, the worst time.

This is absolutely a bridge too far. Ethics are important.

Gray Adams
21-08-2012, 02:58
I am a bit disappointed by this sentiment for two reasons. First, a lot of the data in the Einstein Report is inconclusive.

This statement, directly from the report, essentially states that the exact number of times the individual from 548 made his or her attack cannot be known, because the logs no longer exist. It is perfectly "plausible" that another individual repeated the same attack elsewhere in the dome, or some sort of other interference occurred.

I want to echo this point. By the mentor's own admission, he used the attack, but why should we believe his admission of guilt isn't the full story from his perspective? Every single one of us has been looking for someone or something to blame for what happened on Einstein. The full report has brought up a multitude of points of failure during the finals, and it's really not hard to believe the answer to all of this is not as simple as blaming this all on one mentor. As soon as news broke that there was an attack during play, all of the failures on the field were attributed to that. But things just aren't that simple, and we discovered how many root causes for all the different problems there really were. But I firmly believe we still know far too little to place all of the blame on this one attacker. With thousands of incredibly smart people in the dome, it's entirely possible that someone else used this attack, whether or not their team was on Einstein, and whether or not they were fully aware of their actions.

We've heard 2 sides of the story so far, and unless someone would like to point out something I missed that puts them in direct conflict, I think it's only fair to evaluate this based on what we know.

Everyone was feeling a lot of emotions at the moment, and the attack in response could have been from a moment of desperation. I'm not condoning what happened, but I am trying to understand it.

jason701802
21-08-2012, 04:14
The individual's actions (regardless of intent) were malicious and he or she was punished accordingly.

Malice is entirely dependent upon intent; I think 'destructive' might be closer to what you were looking for.

Taylor
21-08-2012, 08:05
Two thoughts:

1. I can't imagine what next year will be like for the rookie members of 548. How does that conversation go?

2. I've yet to see a post from any of the directly affected Einstein teams in this thread (There is one on the first page from a Robonaut; it points to the article and offers no opinion on the subject). My first inclination is that they are coming together privately as teams to determine exactly how they feel about it; when they've grokked it in fullness, they'll make public statements as they see fit.



My second inclination is simply there's nothing left to say.

JosephC
21-08-2012, 08:14
I'd like to start off by thanking the Robostangs for their statement. It takes a lot of guts to put yourself up in front of the Chief Delphi community. Your team still has as much respect from me as it did before.

One thing that no one has really thought of is the effect this has on the students that are part of that team. Regardless of what actually happened, how do you think they feel? I know that if one of my trusted mentors did something like this it'd take a long time for me to hold my head up high at a competition again.

Arguing about whether or not the individual's acts were in good taste is pointless; nothing we say or do now can change what happened on Einstein. The same goes for whether or not 548's apology was written by lawyers. Does it matter in the grand scheme of things? It is, after all, still an apology to the community.

DISCLAIMER: This post is filled with my own thoughts and opinions and does not necessarily reflect those of my team.

Gregor
21-08-2012, 11:21
I've yet to see a post from any of the directly affected Einstein teams in this thread (There is one on the first page from a Robonaut; it points to the article and offers no opinion on the subject). My first inclination is that they are coming together privately as teams to determine exactly how they feel about it; when they've grokked it in fullness, they'll make public statements as they see fit.

http://www.chiefdelphi.com/forums/showpost.php?p=1182304&postcount=5


My second inclination is simply there's nothing left to say.

Bolded for emphasis.

techhelpbb
21-08-2012, 12:16
Yes it's forgivable if the person is honest and truly seeks forgiveness but it is not just about forgiveness. The penalty needs to be so harsh that no one ever considers doing something like this again.

Lifetime ban is not only appropriate it's necessary.

No one ever notes a problem again?

No one ever clicks on a list of networks again and misses the button?

No one ever asks why documenting issues has to reach the public level?

No one is ever curious again?

No one ever considers using this particular ISM band again like this?

I would feel much more comfortable with harsh punishment if you couldn't trip over this.

Jon Stratis
21-08-2012, 12:26
No one ever notes a problem again?

No one ever clicks on a list of networks again and misses the button?

No one ever asks why documenting issues has to reach the public level?

No one is ever curious again?

No one ever considers using this particular ISM band again like this?

I would feel much more comfortable with harsh punishment if you couldn't trip over this.

The issue wasn't what you listed... the issue was the intentional interference with the game play. All the items you listed are something an individual can pursue, so long as they do so appropriately. Doing so during a match is not appropriate.

techhelpbb
21-08-2012, 12:26
THIS is the correct process, the person raised the issue at the time. It was not addressed. He should have documented his findings and sent them to FIRST. After giving FIRST a period of time to respond or fix the issue (think 6 months) he could have published a paper documenting his findings. At the end he should have included his original communication with FIRST and any steps they took or responses.

As it stands the person went from doing the right thing to being an attacker when they tried to "demonstrate" the vulnerability.

I also work with security and I agree.

Unfortunately the back story in this case seems to flow in a direction that you'd end up making the public report.

I and others I know have since submitted concerns and vulnerabilities to FIRST and frankly no one I know has received so much as a confirmation e-mail.

So what this will lead to is a pretty serious problem. FIRST has an investment in this control system for a while and that while definitely includes this upcoming year.

I know for a fact that these vulnerabilities remain and their mitigation procedure will not address them so long as the control system remains essentially as it is.

In 6 months if I publish my results publicly I can't with a straight face ever look at a hard to explain robot failure and not assume that I provided the core bit of knowledge that someone of less skill used to possibly cause that.

This is a very bad situation. It does not excuse the interloper at all. It may not have been apparent to the interloper they would face this additional level of inertia in handling the security issues.

There have been moments in my long involvement with FIRST that I felt I was utterly and sometimes quite wrongly ignored. Even that said I can think of a dozen ways in 1 minute that I can get my point across without using Einstein like that and compounding the existing issues with harm to every aspect of FIRST.

I appreciate curiosity but I appreciate the value of the scientific method to satisfy that curiosity. There was no careful control for this experiment and therefore it's not an experiment. What it really is is a bunch of intelligent people chasing individual agendas, not working *together*, and in the process making the situation much worse.

Worse, Einstein has become the distraction for who knows how many other possible interruptions that could have been caused accidentally or with intent. There's nothing in that report that closes that door; worse, the lack of logs literally blows that door wide open.

techhelpbb
21-08-2012, 12:29
The issue wasn't what you listed... the issue was the intentional interference with the game play. All the items you listed are something an individual can pursue, so long as they do so appropriately. Doing so during a match is not appropriate.

Let's consider that.

The real fields are almost only available during competitions.

This leaves I suppose the initial practice matches before the actual competition venues.

One of the items I listed you could do quite utterly by mistake (I'm not saying this person didn't have intention to try it, I'm just saying we have no idea how many other people did that by mistake).

EricH
21-08-2012, 12:44
Let's consider that.

The real fields are almost only available during competitions.

This leaves I suppose the initial practice matches before the actual competition venues.
You could also approach the FTA and say, "I know you're busy, but could you leave the field up for a few minutes at the end of the day? I've got something that you need to know about." You could also try in the morning before matches.

Let's think about it this way: You have a practice day (well, if you aren't in the districts, you do--even then you have some practice time). Do it to your own team then, it doesn't affect anybody else then--just make sure your team knows you're doing it. Typically, there's about an hour before matches start on any given competition day (depending on opening ceremony start time in relation to pit opening time--don't try anything during the ceremony!). And there is often a couple hours at the end of the day, with the exception being the last day.

If you think that there is a problem with field vulnerability, or other system problems, Do Not Wait. Talk to the FTA during any of those "down" time periods--or ask in a shorter break, say between matches, if you can demonstrate the issue during them. If you are invited to demonstrate it, that's when you should do it--during lunch may also be an option. You can bet that if the vulnerability issue had been demonstrated to an FTA before Einstein, it would have been fixed or blocked before Einstein--it's one of those cases where "one guy knows, so we don't know how many others know".

techhelpbb
21-08-2012, 12:50
You could also approach the FTA and say, "I know you're busy, but could you leave the field up for a few minutes at the end of the day? I've got something that you need to know about." You could also try in the morning before matches.

Let's think about it this way: You have a practice day (well, if you aren't in the districts, you do--even then you have some practice time). Do it to your own team then, it doesn't affect anybody else then--just make sure your team knows you're doing it. Typically, there's about an hour before matches start on any given competition day (depending on opening ceremony start time in relation to pit opening time--don't try anything during the ceremony!). And there is often a couple hours at the end of the day, with the exception being the last day.


I agree with this completely.


If you think that there is a problem with field vulnerability, or other system problems, Do Not Wait. Talk to the FTA during any of those "down" time periods--or ask in a shorter break, say between matches, if you can demonstrate the issue during them. If you are invited to demonstrate it, that's when you should do it--during lunch may also be an option. You can bet that if the vulnerability issue had been demonstrated to an FTA before Einstein, it would have been fixed or blocked before Einstein--it's one of those cases where "one guy knows, so we don't know how many others know".

I disagree with this. The level of testing required to deal with the interloper's actions was/is really beyond what I believe is practical for field testing. Having now set up and broken down a field for this year's competition 2 times I cannot see how sufficient time and resources would be available to scientifically and properly do anything more than trip over the solution.

Great if they trip over it. Not so great if they don't.

Additionally I can demonstrate additional issues right now. I know for a fact that several FIRST people know about them. Following only the reporting advice to e-mail the address on the report a person would literally be left in a vacuum. I have made it a point to make this harder to ignore because I expect that someone will do something about it. I'm growing ever more concerned that is not the case.

By September FIRST is hard at work generating the documents and written parameters for 2013 in their final form.
It's now August 21, 2012. So logistically when and where is this exploration going to get done?

rick.oliver
21-08-2012, 13:13
I will open by sharing that I feel good about the way FIRST has conducted themselves throughout this process. I believe that FIRST and the volunteers who participated in the investigation have demonstrated FIRST's values of Gracious Professionalism and Coopertition.

FIRST has shown respect for all of the individuals involved and the FRC community in their transparency and communications of the process and outcomes. They have investigated, learned and put plans in place to correct and improve their hardware, systems and processes. They have maintained their integrity and sensitivity to the Einstein teams and the FRC community throughout the process.

What concerns me about some of the FRC community's response and the FIRST FRC Team 548 Einstein Statement is what it reveals about the FRC community's culture. I have read some comments in this thread suggesting that the interference of the Einstein matches was somehow excusable or justifiable. After reading the report, I come away with the sense that the document actually minimizes the egregiousness of the action.

Certainly folks may and should be forgiven for failures. However, that does not remove the consequences, nor does it restore trust.

GP means that we compete like crazy and at the same time play fair, maintain our integrity, while showing respect for our partners and opponents. I know that there have been times when I have not been a gracious professional. When I recognize it, I admit it, apologize, ask for forgiveness from the person I offended and resolve to do better. I see something like that in their statement and I hope that they do come out of this stronger and better.

But ... what does it say about our culture that this happened and that there are attempts to excuse, justify or minimize it? I would echo what someone said in a previous post, albeit perhaps in a different context. We still have a long way to go.

BigJ
21-08-2012, 13:15
No one ever notes a problem again?

No one ever clicks on a list of networks again and misses the button?

No one ever asks why documenting issues has to reach the public level?

No one is ever curious again?

No one ever considers using this particular ISM band again like this?

I would feel much more comfortable with harsh punishment if you couldn't trip over this.

No one decides to bypass responsible disclosure (one method is mentioned earlier in Andrew's post (http://www.chiefdelphi.com/forums/showpost.php?p=1182343&postcount=36)) and takes it upon themselves to demonstrate vulnerabilities during competition matches again.

EDIT: whoops, there was a 6th page and at least two people already said relatively the same thing:o

techhelpbb
21-08-2012, 14:52
No one decides to bypass responsible disclosure (one method is mentioned earlier in Andrew's post (http://www.chiefdelphi.com/forums/showpost.php?p=1182343&postcount=36)) and takes it upon themselves to demonstrate vulnerabilities during competition matches again.

EDIT: whoops, there was a 6th page and at least two people already said relatively the same thing:o

Starting today it's been 30 days since I sent my first e-mail about this.
6 months is the end of January 2013.

If I follow through with the 6 month process as it stands now I'll be giving the next interloper the perfect window of opportunity for 2013 by publishing in late January. FIRST who might do nothing with the knowledge till then would have little time to react. Worse FIRST will have solidified all their purchases and shipped all the kits of parts.

Suffice it to say I'm not thrilled with this. Worse, even if I don't point it out, then depending on a number of likely factors these exploits will be readily available to any interlopers we don't know about who have stumbled on them.

If that's not a house of cards I don't know what is.

So if I publish that information I risk FIRST responding by sanctioning me.
If I don't publish that information who knows if or when it'll get exploited.

For those who get the reference:
'The only way to win is not to play' and unfortunately I don't mean looking for security problems.

Cory
21-08-2012, 14:53
I have read some comments in this thread suggesting that the interference of the Einstein matches was somehow excusable or justifiable. After reading the report, I come away with the sense that the document actually minimizes the egregiousness of the action.

I think a lot of people want to believe FIRST is a utopia where everyone is good and would never do anything wrong simply because we are all participating in a great activity. As such, incidents where bad things happen can be trivialized because people will think "Oh, there must have been a misunderstanding here, so and so would never do anything to harm anyone", when in reality FIRST has bad apples just like any large community.

BigJ
21-08-2012, 14:58
Starting today it's been 30 days since I sent my first e-mail about this.
6 months is the end of January 2013.

If I follow through with the 6-month process as it stands now, I'll be giving the next interloper the perfect window of opportunity for 2013 by publishing in late January. FIRST, who might do nothing with the knowledge till then, would have little time to react. Worse, FIRST will have solidified all their purchases and shipped all the kits of parts.

Suffice it to say I'm not thrilled with this. Worse, even if I don't point it out, then depending on a number of likely factors these exploits will be readily available to any interlopers we don't know about who have stumbled on them.

If that's not a house of cards I don't know what is.

So if I publish that information I risk FIRST responding by sanctioning me.
If I don't publish that information who knows if or when it'll get exploited.

For those who get the reference:
'The only way to win is not to play' and unfortunately I don't mean looking for security problems.

It doesn't have to be exactly 6 months. One might contact them and say "I will publish these findings on X date unless this is followed up with and another effective course of action is carried out". I don't think anyone here would be against one who did that, or support the powers that be for sanctioning such an individual. The point is that it is responsible disclosure.

techhelpbb
21-08-2012, 15:04
It doesn't have to be exactly 6 months. One might contact them and say "I will publish these findings on X date unless this is followed up with and another effective course of action is carried out". I don't think anyone here would be against one who did that, or support the powers that be for sanctioning such an individual. The point is that it is responsible disclosure.

I understand your point. However, the issue remains. FIRST, and not just your robots but the entire contest, is a problem too big for the time it's given.

August leaves 10 days.
September they build the documents and the rules.
October and November they set up the kits of parts.
December is anything that rolls over and of course countless holidays.
January, February and March are already too late.

So in reality I've disclosed them to FIRST now.
If I wait until after next season who knows what might happen.

If I levy that sort of consequence on FIRST what might they do?
Cause clearly other people have openly declared risk before that was not mitigated.

It's not just about shifting a few days. It's about the body politic (http://en.wikipedia.org/wiki/Body_politic).

Andrew Schreiber
21-08-2012, 15:06
Starting today it's been 30 days since I sent my first e-mail about this.
6 months is the end of January 2013.

If I follow through with the 6-month process as it stands now, I'll be giving the next interloper the perfect window of opportunity for 2013 by publishing in late January. FIRST, who might do nothing with the knowledge till then, would have little time to react. Worse, FIRST will have solidified all their purchases and shipped all the kits of parts.

Suffice it to say I'm not thrilled with this. Worse, even if I don't point it out, then depending on a number of likely factors these exploits will be readily available to any interlopers we don't know about who have stumbled on them.

If that's not a house of cards I don't know what is.

So if I publish that information I risk FIRST responding by sanctioning me.
If I don't publish that information who knows if or when it'll get exploited.

For those who get the reference:
'The only way to win is not to play' and unfortunately I don't mean looking for security problems.

You took the number 6 months entirely too seriously. I quite literally pulled that number out of thin air just to let people know that 2 weeks is NOT an appropriate period of time. Obviously publishing just before another round of competitions might not be good. But I was assuming that if a person is intelligent enough to discover the vulnerability and be wise enough to know how to go about exposing it they would have SOME common sense. I guess that's asking too much from people though.

steelerborn
21-08-2012, 15:18
I think the 548 statement was the right thing to do, they should be proud of what they did.

I would also like to point out that I see FIRST as a "sport". Back in high school I was on the varsity football team and there was some "cheating" going on there too. But I would like to say that I have seen more backstabbing in FRC than I did in football. People are people and that will never change, if you have a person who is willing to talk behind your back, then they will do it in FRC too. I had some team-mates who are my friends do this to me and it really hindered the way people see me, and still do to this day. But I am working hard to fix it still almost 3 years later.

techhelpbb
21-08-2012, 15:19
You took the number 6 months entirely too seriously. I quite literally pulled that number out of thin air just to let people know that 2 weeks is NOT an appropriate period of time. Obviously publishing just before another round of competitions might not be good. But I was assuming that if a person is intelligent enough to discover the vulnerability and be wise enough to know how to go about exposing it they would have SOME common sense. I guess that's asking too much from people though.

Common sense is anything but. After all so many wish so many others had it.

This is a situation in which you have on one hand a vulnerability and a certain set of skills, resources and knowledge to outline it.

The other you have an organization pushed to the limits exposed to that vulnerability and perhaps not inclined to deal with it.

There's no reason...literally at all...to expect that I or any other researcher have the ability to influence FIRST corporate. That's the point.

The implied threat of exposure is a weak threat with FIRST because FIRST is a corporation with hundreds of thousands of kids impacted by it. You're not just costing their corporate bottom line or reputation. As all of these similar topics show, you're messing with the kids, and it's not one step removed like disclosing some banking data.

Unfortunately this matters. There are too many disclosures I'm aware of and the costs on the other side of that big stick are too great.

Andrew Schreiber
21-08-2012, 15:31
There's no reason...literally at all...to expect that I or any other researcher have the ability to influence FIRST corporate. That's the point.


In my experience the notion that FIRST doesn't listen to people is incorrect.

The notion that one is threatening FIRST with disclosure is incorrect as well. FIRST should want to fix this issue (if they don't, there are other issues that are completely irrelevant to the discussion) and by letting them know you plan on publishing the findings at a later date you are simply being courteous and giving them a chance to fix the issue before it becomes public. No threats implied at all.

techhelpbb
21-08-2012, 15:46
In my experience the notion that FIRST doesn't listen to people is incorrect.

The notion that one is threatening FIRST with disclosure is incorrect as well. FIRST should want to fix this issue (if they don't, there are other issues that are completely irrelevant to the discussion) and by letting them know you plan on publishing the findings at a later date you are simply being courteous and giving them a chance to fix the issue before it becomes public. No threats implied at all.

No one I know that has so far commented has gotten so much as an auto response (a courtesy).

In 17 years my experience calling the FIRST switch board is dismal.

Asking questions in the actual Q&A forum has often been criticized above and beyond this point (to the point I know people who intentionally avoid it).

My experience obviously differs from your own.

You might consider it not a threat to make such a disclosure with lots of time to resolve it, but under the current circumstances I see nothing, at all, that prevents FIRST from viewing your eventual disclosure as an open challenge to their authority.

Right on topic the last person that pointed out something was asked to leave.
One could argue that it would have been subsequently followed up.

However, nowhere in any discussion that I have seen (or in the reports) was it indicated what the process for that follow-up was, or whether it was ever outlined to the reporting party.

So I bring this back full circle. There are disclosures of issues I am aware of. What is the process by which these courtesies are reciprocated? I posed that same question weeks ago as well.

linuxboy
21-08-2012, 18:54
One perspective that I think has not been brought up, that I think deserves attention is the competition rules. [T14] states:

"If a team needs clarification on a ruling or score, a pre-college student from that team should address the Head Referee after a field reset has been signaled. A team signals their desire to speak with the Head Referee by standing in the red or blue Question Box which will be placed on the floor at each end of the scoring table. Depending on timing, the Head Referee may postpone any requested discussion until the end of the subsequent Match."

While that does not mention the FTA, it is the closest thing I could find to how an official interaction is made concerning the results of a match. I'm not saying this would have affected how staff reacted but I'd like to point out that, from my interpretation of that rule, the proper way for the mentor to bring this up at the field is not at all. If (s)he wanted, (s)he could have revealed this vulnerability to a team member, the team member would have stood in the question box and voiced these concerns to the Head Referee, who would (hopefully) confer with the technical staff present, and things could have played out differently. I'm not saying they necessarily would have, but we do have rules about who engages field staff, it clearly indicates that only pre-college students may do so, and I know, when I'm volunteering on the field, I would rather talk to a student than a mentor.

DampRobot
22-08-2012, 01:36
I've been watching this thread with much interest lately, and a few interesting points that (I believe) have not been addressed are still fresh in my mind.

First, aren't we forgetting the second person who brought down communications? The story that is corroborated both by the 548 mentor and the official report implies that there was a second attacker, who interestingly attacked the wifi network only after the 548 mentor did his three second demo attack. Most people appear to be assuming that the 548 mentor did all of the wifi attacks, which just doesn't appear to add up. Why did the second attacker act? Did they believe something similar to the first attacker, that they were being attacked? Or did they simply have a malicious intent?

Second, was there institutional knowledge of this security hole? It appears that at least two (and probably more, if this thread is any indicator) FRC members knew of this specific hole. Did no one on the official FRC team know of this? This seems unlikely to me, but depending on the extent of the knowledge of this hole, it certainly could be true. If so, why didn't they attempt to patch it? If not, does this point to an institutional problem in a lack of focus on security? In either case, more needs to be done to recognize and address future security holes.

Third, why did we never learn about this hole until Einstein, where it's relatively unlikely that two separate people coincidentally used this technique to bring down a match? Were there smaller incidents at regionals and division championships that simply did not get noticed until Einstein? Were people with knowledge of this quiet until then, or simply unnoticed? And why did a thread never appear on CD with information about this? Surely, unless there was malicious intent, any loyal FIRSTer would rather report this than use it in a match. Were malicious (or simply very quiet) people the only ones who ever knew or suspected an exploit of this type?

Hopefully, my questions were constructive and not offensive. I'm just a little surprised that I've never seen them asked or answered yet.

EricH
22-08-2012, 02:28
While that does not mention the FTA, it is the closest thing I could find to how an official interaction is made concerning the results of a match. I'm not saying this would have affected how staff reacted but I'd like to point out that, from my interpretation of that rule, the proper way for the mentor to bring this up at the field is not at all. If (s)he wanted, (s)he could have revealed this vulnerability to a team member, the team member would have stood in the question box and voiced these concerns to the Head Referee, who would (hopefully) confer with the technical staff present, and things could have played out differently. I'm not saying they necessarily would have, but we do have rules about who engages field staff, it clearly indicates that only pre-college students may do so, and I know, when I'm volunteering on the field, I would rather talk to a student than a mentor.
You are forgetting one thing: T14 ONLY addresses Ref interaction! So your interpretation is that the head ref is the only person on the field that questions can be asked of. Have you or any member of your drive team asked a field resetter anything? How about discussing why your robot isn't connecting with the FTA or FTAA? I'm so sorry, but by your interpretation, you just did something illegal. Move along, you can't discuss that with that person.

Now, would it have been helpful to send a message by that route? Maybe--but that involves a) finding a student who isn't trying to fix something and b) having said student wait until they could get the head ref's attention. Then the head ref has to decide that it's important enough to call the FTA or FTAA away from whatever he's doing (probably trying to fix the problem with 118, in the case of 548's matches), oh and did I mention that by now it's second- or third-hand information, or rather suspicion (which, if you're paying attention, you may have figured out is roughly equivalent to a rumor). In other words, chances are fairly high that going that route you'll either be ignored, or if you do get through, the FTA will want to talk to the originator (in this case, the mentor), and we're right back where we started.


@DampRobot: I didn't pick up the implication of a second person involved in the official report. I got that only from 548's account. Also, a 3 second attack like that one would result in needing to reconnect the wifi, which can take a little bit of time, regardless of if there's another attacker or not. I think a lot of the questions you have are going to be very difficult to answer without putting people under suspicion of cheating or of total ignorance, either of which I'm reluctant to do.

Siri
22-08-2012, 04:54
Second, was there institutional knowledge of this security hole?...

While you bring up good points, are you underestimating how difficult this was to purposefully discover and/or how lucky you'd have to be to find it? I honestly don't know, but as I understand it the Cisco firmware with the hole was only implemented in Week 4, and even then the hole only manifested in one of the D-Link revisions. While FIRST tested the new firmware thoroughly for the issue it was meant to address, it's not so surprising they didn't test for FCA (page 7). Conceding (as the wireless experts did) that it's not an obvious issue to test for, I'd be somewhat surprised if FIRST officials managed to trip on it in the intervening weeks. Granted, this definitely isn't my area of expertise.

I missed any implication of a second person in the Report. Where are you referring?

You are forgetting one thing: T14 ONLY addresses Ref interaction! So your interpretation is that the head ref is the only person on the field that questions can be asked of. Have you or any member of your drive team asked a field resetter anything? How about discussing why your robot isn't connecting with the FTA or FTAA? I'm so sorry, but by your interpretation, you just did something illegal. Move along, you can't discuss that with that person.

I certainly don't take T14 to be the only allowable interaction (having talked to enough FTAs in my day), but it is the only guaranteed interaction. While I've never done it on Einstein, head refs--even busy ones--seem to listen to polite students in the box. I think you'd be hard-pressed to find a ref that wouldn't listen twice to "I know what's wrong; please let me show you how anyone in the stadium can shut down any robot on this field". As I understand it, the demonstration is rather quick (pull up the network list and show you can send a client authorization). If so, the student could show this directly to the ref for added clout.

I know what's done is done, but hopefully an earnest examination will help anyone thinking of doing something like this in the future. No matter how helpless you feel thinking someone else is targeting your team, there are always other ways. In fact, you can't count on anyone even listening to you, much less getting a replay, if you try to interfere yourself. (Not that this is the key reason against interference.)

Al Skierkiewicz
22-08-2012, 08:51
First, aren't we forgetting the second person who brought down communications? The story that is corroborated both by the 548 mentor and the official report implies that there was a second attacker, who interestingly attacked the wifi network only after the 548 mentor did his three second demo attack. Most people appear to be assuming that the 548 mentor did all of the wifi attacks, which just doesn't appear to add up. Why did the second attacker act? Did they believe something similar to the first attacker, that they were being attacked? Or did they simply have a malicious intent?
There was no evidence of a second attack. The original attacker suspected that other failures (for known and documented reasons) were being caused by the attack method that had been discovered. As to the three second attack, please read the report again! Once a device had attempted to communicate with a robot, the disruption could last the entire match. The attacker could easily move on to another robot(s) after the first disruption.
Also note, the robot remained connected to the field and in those cases where the team was using video from the robot, all status and video continued to be displayed at the driver's station. The robot was connected, just the command link from driver's station to robot was interrupted.
Second, was there institutional knowledge of this security hole? It appears that at least two (and probably more, if this thread is any indicator) FRC members knew of this specific hole. Did no one on the official FRC team know of this? This seems unlikely to me, but depending on the extent of the knowledge of this hole, it certainly could be true. If so, why didn't they attempt to patch it? If not, does this point to an institutional problem in a lack of focus on security? In either case, more needs to be done to recognize and address future security holes.
There was no knowledge of this weakness prior to the mentor coming forward and explaining what had actually taken place after the Champs. The mentor was observed on Einstein doing something suspicious with a phone. Anyone repeatedly punching a phone within feet of Einstein while a match is going on is suspect because they are not observing the match at hand. However, the problems did not take on the typical signs of a DOS attack. Had anyone been knowledgeable of the hole (or if the problem had been communicated to the engineering staff), a simple revert to previous firmware, a change in wireless access points on the robot or a combination of the above would have simply fixed the issue. Those changes could easily be made during other closing ceremonies.

Third, why did we never learn about this hole until Einstein, where it's relatively unlikely that two separate people coincidentally used this technique to bring down a match? Were there smaller incidents at regionals and division championships that simply did not get noticed until Einstein? Were people with knowledge of this quiet until then, or simply unnoticed? And why did a thread never appear on CD with information about this? Surely, unless there was malicious intent, any loyal FIRSTer would rather report this than use it in a match. Were malicious (or simply very quiet) people the only ones who ever knew or suspected an exploit of this type?
If others knew or suspected an issue at other events, they did not come forward with that info. The Einstein Investigation had a clear set of goals and that was to determine what caused so many failures on the Einstein Field. We were not tasked with investigation outside of Einstein and the twelve robots involved in that part of the competition.

To be absolutely clear, there are many people on or near the field during events. Some of these are non-technical volunteers and some have been tech volunteers in the past and some are volunteers who are also on teams competing on the field. Approaching one of those volunteers and expecting the same response as a field expert to a technical issue like this is a bad use of time. At every event there is a crew of volunteers whose directive is to make every robot play, that is the Robot Inspectors. During Champs finals, (all divisions and Einstein) there are inspectors assigned to the field to assist teams with problems and work with the head referee and FTA. There were two experienced division LRIs on Einstein, one on each side of the field during the matches and in the pit area assisting teams between matches. If you have a problem and cannot get resolution, please check in with an inspector or LRI. We want everyone to play, as often as they wish, within the rules of the competition.

Astrokid248
22-08-2012, 09:06
While you bring up good points, are you underestimating how difficult this was to purposefully discover and/or how lucky you'd have to be to find it? I honestly don't know, but as I understand it the Cisco firmware with the hole was only implemented in Week 4, and even then the hole only manifested in one of the D-Link revisions. While FIRST tested the new firmware thoroughly for the issue it was meant to address, it's not so surprising they didn't test for FCA (page 7). Conceding (as the wireless experts did) that it's not an obvious issue to test for, I'd be somewhat surprised if FIRST officials managed to trip on it in the intervening weeks. Granted, this definitely isn't my area of expertise.


You wouldn't necessarily have to know the cause of the issue to happen upon the exploit. With the growing number of applications that can control any number of robots with a smartphone, it's really not surprising that between week 4 and Einstein someone whipped out a phone and thought, "What if I connect in during a match?"

It's the "1000 monkeys with 1000 typewriters" postulate at work, and I think it would be wise of FIRST to challenge all teams to try and find these exploits and notify FIRST as they appear. Crowd-source the troubleshooting of these systems, and allow teams to have active feedback throughout the season. It would solve a lot of problems. And I agree with the idea that FIRST should have some kind of pre-written response to let teams know that emails are at least going through.

JamesCH95
22-08-2012, 09:48
You wouldn't necessarily have to know the cause of the issue to happen upon the exploit. With the growing number of applications that can control any number of robots with a smartphone, it's really not surprising that between week 4 and Einstein someone whipped out a phone and thought, "What if I connect in during a match?"

It's the "1000 monkeys with 1000 typewriters" postulate at work, and I think it would be wise of FIRST to challenge all teams to try and find these exploits and notify FIRST as they appear. Crowd-source the troubleshooting of these systems, and allow teams to have active feedback throughout the season. It would solve a lot of problems. And I agree with the idea that FIRST should have some kind of pre-written response to let teams know that emails are at least going through.

That's a great idea in theory. In practice, however, FIRST would be completely overwhelmed with nonsense results from uncontrolled situations that bear little or no relevance to a competition field setup.

Simply put: the problem with the "1,000 monkeys with 1,000 typewriters" postulate in reality is filtering out the 99%+ gibberish content they've created.

Alan Anderson
22-08-2012, 09:54
You wouldn't necessarily have to know the cause of the issue to happen upon the exploit. With the growing number of applications that can control any number of robots with a smartphone, it's really not surprising that between week 4 and Einstein someone whipped out a phone and thought, "What if I connect in during a match?"

To "happen upon the exploit" requires specific hardware. If someone had tried to connect without using one of the exceedingly few handheld devices capable of 5 GHz WiFi, nothing would have happened. That's a good enough reason for me to accept the idea that nobody but the admitted culprit knew about the problem.

techhelpbb
22-08-2012, 10:51
That's a great idea in theory. In practice, however, FIRST would be completely overwhelmed with nonsense results from uncontrolled situations that bear little or no relevance to a competition field setup.

Simply put: the problem with the "1,000 monkeys with 1,000 typewriters" postulate in reality is filtering out the 99%+ gibberish content they've created.

The simple way to find the non-gibberish is request a proof of concept either in video or in front of field personnel.

This would be easier to accomplish with more open documentation about the field (so it can be more readily replicated) and more access to fields (itself not a trivial request).

Of course all of that is useless without clear lines of communications and process.

Also there are probably more devices than one might realize at any one event that can use 5GHz because they are not line of sight to the field. Consider all the driver's station laptops in the pits. I'll assume that no one on the field with a 5GHz laptop has time to be doing anything but what is expected of them.

With Windows Vista and above it would be very simple to craft a background script running as system that would exploit the failed connect attempt hole totally hidden from all but the most experienced eyes even on a driver's station on the field (in effect malware for the field). This wouldn't seem out of place at all because of the driver station software reliance on Windows. Also if someone had a COTS computing device on the robot a similar tactic with wider OS selection would be possible. I am comfortable making this statement because this particular vulnerability is much easier to remedy than others I am aware of.

DampRobot
22-08-2012, 10:54
There was no evidence of a second attack. The original attacker suspected that other failures (for known and documented reasons) were being caused by the attack method that had been discovered. As to the three second attack, please read the report again! Once a device had attempted to communicate with a robot, the disruption could last the entire match. The attacker could easily move on to another robot(s) after the first disruption.


If others knew or suspected an issue at other events, they did not come forward with that info. The Einstein Investigation had a clear set of goals and that was to determine what caused so many failures on the Einstein Field. We were not tasked with investigation outside of Einstein and the twelve robots involved in that part of the competition.

Al Skierkiewicz, thank you for pointing out that what might seem obvious to me might be completely contrary to others' points of view. To address your comments using my interpretation of the report:

First, the official FRC report states that a Galaxy Nexus running Android 4.0.4 was probably used for at least one attack ("Failed Client Authentication on Einstein"), which we recently learned was committed by the 548 mentor. Another section of the report ("Alternative Source Testing") describes in detail the attempts to bring down communications with the failed client authentication attack, and notes that downtimes in communications could be as low as three seconds with that device when using a specific strategy. Especially if the mentor had tried this before (which I'm certainly not trying to imply!), he certainly could have brought down communications for only three seconds.

The second attacker was, to me, implied by the fact that the mentor left the field before Final 1 and 2 and that continued attacks occurred. Also, witnesses saw an individual selecting teams to take down from a cell phone, who may or may not have been the same mentor. Although they believe they are one and the same, the mentor repeatedly denies doing this attack more than once (and if he had, why wouldn't he have used the strategy that would have resulted in only 3-second downtimes? Malicious intent?). He certainly may have been lying, but the fact of the continued attacks considerably longer than three seconds and their continuance even after this person left the field remains.

I think the question of whether there was knowledge in FIRST about this type of hole is a fair question. It states in the Einstein report that they only discovered this error accidentally after championships. Shouldn't the actions of this individual, as well as their attempt to contact field personnel, have given them at least a hint that something was up? Did someone know about this, and was not heard? I certainly don't know, and I don't really expect that anyone on CD can answer all of my questions conclusively.

As always, no offense meant. Hopefully my comments are seen as constructive.

Alan Anderson
22-08-2012, 11:44
Also there are probably more devices than one might realize at any one event that can use 5GHz because they are not line of sight to the field. Consider all the driver's station laptops in the pits...

The number of driver station laptops in the pits capable of 5 GHz WiFi was vanishingly small. As a robot inspector, checking for wireless networking of teams' laptops was part of my job. I saw exactly zero with 5 GHz radios in three regional competitions and a championship division.

techhelpbb
22-08-2012, 11:53
The number of driver station laptops in the pits capable of 5 GHz WiFi was vanishingly small. As a robot inspector, checking for wireless networking of teams' laptops was part of my job. I saw exactly zero with 5 GHz radios in three regional competitions and a championship division.

Fair enough but it can be added in a second with a USB port or card if they choose. Also what about the other laptops often in the pits:

Apple laptops, most all of them since 2006, have dual band.

Including the MacBook, the MacBook Pro, and the MacBook Air.

I know I saw a few of those in my trips into the pits at various events even if they weren't driver's stations.

Al Skierkiewicz
22-08-2012, 11:55
Damp,
The three seconds referred to in the report is the response to a specific set of steps taken and observed by the FIRST engineering team while testing the Samsung Galaxy Nexus phone at HQ. The report does not suggest that this is what took place on Einstein; it was merely an additional failure observed with that phone during testing. The alternative testing was performed after it was noted that a 5 GHz enabled wireless device had caused some issues on Einstein. FIRST engineering noted that devices tend to 'phone home' once they see a wireless network that they recognize. That is the "repeat interval" listed in that part of the report.
In addition, from the report: "Each of these authentication attempts has the potential to cause working communication to drop and a dropped connection to be reestablished between the driver station and the robot. Repeated attempts to connect to multiple SSID’s can result in robots that are drivable and robots that are not over the course of the match."

Siri
22-08-2012, 12:21
You wouldn't necessarily have to know the cause of the issue to happen upon the exploit. With the growing number of applications that can control any number of robots with a smartphone, it's really not surprising that between week 4 and Einstein someone whipped out a phone and thought, "What if I connect in during a match?"

It's the "1000 monkeys with 1000 typewriters" postulate at work, and I think it would be wise of FIRST to challenge all teams to try and find these exploits and notify FIRST as they appear. Crowd-source the troubleshooting of these systems, and allow teams to have active feedback throughout the season. It would solve a lot of problems. And I agree with the idea that FIRST should have some kind of pre-written response to let teams know that emails are at least going through.

I agree with you--in "1000 people" [likely more] that were around fields on/after Week 4, it seems somewhat plausible to me that someone else who happened to have 5GHz WiFi happened to try to connect to a robot who happened to have Revision A, and happened to try entering a password and cause FCA, and happened to be one of the people that would keep it to themselves. Not likely, but plausible.

What I find significantly less plausible is that FIRST officials happened to do so. Not only is the sample size many, many times smaller, but they are naturally quite busy during matches and additionally have every reason to trust in FIRST's testing. (I acknowledge the potential for complacency.) I cannot picture an FTA or FTAA (etc), much less Dean or Woodie, whipping out their phone in the middle of a match. They have every reason to be among the most busy people in the stadium and no reason to distrust their own selections. This is my argument against DampRobot's question of institutional knowledge.

Al Skierkiewicz
22-08-2012, 12:26
What I find significantly less plausible is that FIRST officials happened to do so. Not only is the sample size many, many times smaller, but they are naturally quite busy during matches and additionally have every reason to trust in FIRST's testing. (I acknowledge the potential for complacency.) I cannot picture an FTA or FTAA (etc), much less Dean or Woodie, whipping out their phone in the middle of a match. They have every reason to be among the most busy people in the stadium and no reason to distrust their own selections. This is my argument against DampRobot's question of institutional knowledge.

HUH?????

Astrokid248
22-08-2012, 12:38
What I find significantly less plausible is that FIRST officials happened to do so. Not only is the sample size many, many times smaller, but they are naturally quite busy during matches and additionally have every reason to trust in FIRST's testing. (I acknowledge the potential for complacency.) I cannot picture an FTA or FTAA (etc), much less Dean or Woodie, whipping out their phone in the middle of a match. They have every reason to be among the most busy people in the stadium and no reason to distrust their own selections. This is my argument against DampRobot's question of institutional knowledge.

Should've clarified a bit. I'm not at all surprised FIRST didn't find it. There is no scenario in which they could've found the issue before Einstein. I'm saying that a) if they implement some sort of a feedback system, maybe the troubleshooting will be more comprehensive and b) the mystery mentor probably isn't the only guy who was aware of the problem before he tried to alert the FTAs. Just my 2¢.

Siri
22-08-2012, 13:09
HUH?????

uh oh. What'd I do, Al? :eek:


Should've clarified a bit. I'm not at all surprised FIRST didn't find it. There is no scenario in which they could've found the issue before Einstein. I'm saying that a) if they implement some sort of a feedback system, maybe the troubleshooting will be more comprehensive and b) the mystery mentor probably isn't the only guy who was aware of the problem before he tried to alert the FTAs. Just my 2¢.

Oh. Yeah, then we totally agree with each other. (Does that make it 4¢?)

Al Skierkiewicz
22-08-2012, 13:15
I am trying to figure what you are saying in that post.

Jon Stratis
22-08-2012, 13:27
Al, I think his point was that the likelihood of FIRST officials stumbling onto the FCA issue prior to Einstein was extremely small. The FIRST officials who would be most likely to recognize it for what it is (like the FTA's) are too busy during competition and matches to be flipping through their phones, so accidentally stumbling on it would be difficult for them.

I would add one more item... those individuals are probably the last ones who would actually "try" to connect to the field if the option pops up on their phone. They would just cancel out of the option and go about their business.

JesseK
22-08-2012, 14:00
Also there are probably more devices than one might realize at any one event that can use 5GHz because they are not line of sight to the field. Consider all the driver's station laptops in the pits. I'll assume that no one on the field with a 5GHz laptop has time to be doing anything but what is expected of them.

The assumption is a bit naive.

While I agree that 5 GHz wireless cards on battery-powered mission-critical laptops are few and far between (energy mongers...), any individual that tries to interfere from a driver's station laptop will probably not rely on a driver to do so. It's conceivable that the drive team wouldn't know it's happening. Most likely it'd go in a batch file or background script (rundll32.exe anyone?) that doesn't show up. Additionally, it could happen from the queue rather than on the field.

Now that an exploit is public knowledge, it's only a matter of creativity for how it's attempted to be abused. FIRST needs to find a solution for the root cause (sounds like they are). Turning wireless off for the laptops is a start.

techhelpbb
22-08-2012, 14:19
The assumption is a bit naive.

While I agree that 5 GHz wireless cards on battery-powered mission-critical laptops are few and far between (energy mongers...), any individual that tries to interfere from a driver's station laptop will probably not rely on a driver to do so. It's conceivable that the drive team wouldn't know it's happening. Most likely it'd go in a batch file or background script (rundll32.exe anyone?) that doesn't show up. Additionally, it could happen from the queue rather than on the field.

Now that an exploit is public knowledge, it's only a matter of creativity for how it's attempted to be abused. FIRST needs to find a solution for the root cause (sounds like they are). Turning wireless off for the laptops is a start.

It's hard to really enforce the zone around a field by just policing devices that are off.

You can't jam, because if you do you will probably jam yourself unless you use a very well designed jamming system. Plus FIRST is a publicly visible corporation and you're taking your legal chances jamming like that. You can't count on the devices staying off after you look at them (if we assume no trust, it's no problem to just turn one on, or for an attacker to use resource kit tools to turn it back on). You can't even count on a spectrum analyzer and a near field antenna to find the devices because a device could be disabled when you look. You can't rely on denial of service detection because wireless by its very nature is prone to short service disruptions, which makes any channel disruption short of a complete denial of service harder to detect. You can't even sort the process with a Bayesian filter (http://en.wikipedia.org/wiki/Recursive_Bayesian_estimation) because there are layers of complication and that requires some amount of repetition.
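
To put rough shape on the repetition point, here is a minimal sketch of a recursive Bayesian update in Python. Every number in it is made up for illustration (the 1% prior and the 60%/20% drop probabilities are assumptions, not measurements from any field), and the function names are hypothetical. It only shows why a single communication drop says very little, while a run of drops moves the estimate a lot.

```python
def update(prior, p_drop_if_attack, p_drop_if_benign, dropped):
    """One recursive Bayes step: return P(attack | this observation)."""
    # Likelihood of what we saw under each hypothesis.
    like_attack = p_drop_if_attack if dropped else 1.0 - p_drop_if_attack
    like_benign = p_drop_if_benign if dropped else 1.0 - p_drop_if_benign
    evidence = like_attack * prior + like_benign * (1.0 - prior)
    return like_attack * prior / evidence


def posterior_after(observations, prior=0.01, p_attack=0.6, p_benign=0.2):
    """Fold a sequence of drop / no-drop observations into one belief.

    All default probabilities are illustrative assumptions.
    """
    belief = prior
    for dropped in observations:
        belief = update(belief, p_attack, p_benign, dropped)
    return belief


if __name__ == "__main__":
    print("after 1 drop :", round(posterior_after([True]), 3))
    print("after 5 drops:", round(posterior_after([True] * 5), 3))
```

With these assumed numbers one drop leaves the belief under 5%, and it takes several consecutive drops before the filter favors interference, which is the repetition a one-off disruption never provides.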

So in reality your choices to prevent future issues get quickly more difficult.

One could track communications losses per match and replay those that don't seem to be due to power issues to the radio (assuming we consider power issues to the radio to be a build quality issue). However, that does not fit with the current process that seems to be at work. Given the current process if an interloper can interfere and not get caught the match outcomes stand. So all it takes is someone with the knowledge and the willingness to absorb the risk.
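
A rough sketch of that bookkeeping in Python, under loud assumptions: the `CommLoss` record, the `radio_power_dip` flag, and the one-second threshold are all hypothetical, and it presumes robot-side radio power logging that teams mostly did not have. It is not how the field system works; it only shows the filtering idea.

```python
from dataclasses import dataclass


@dataclass
class CommLoss:
    match: str             # match identifier, e.g. "F1" (hypothetical format)
    team: int              # team number
    seconds: float         # how long communication was lost
    radio_power_dip: bool  # True if logs show the radio itself lost power


def matches_to_review(events, min_seconds=1.0):
    """Return match IDs with comm losses not explained by radio power.

    min_seconds is an assumed threshold to ignore very short blips.
    """
    flagged = []
    for e in events:
        # Power-related losses are treated as a build quality issue.
        if e.radio_power_dip or e.seconds < min_seconds:
            continue
        if e.match not in flagged:
            flagged.append(e.match)
    return flagged


if __name__ == "__main__":
    sample = [
        CommLoss("SF1", 9001, 4.0, True),
        CommLoss("F1", 9002, 6.5, False),
        CommLoss("F1", 9003, 0.3, False),
    ]
    print(matches_to_review(sample))
```

A match that lands in the list is only a candidate for a closer look; the sketch says nothing about what actually caused the loss.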

Stick your head in on a DEFCON or Black Hat convention discussion some time. They'll pull stunts that obviously are pushing or breaking the law right in front of the authorities they know are watching them in the very same room. They aren't shy about it. It's going to be really hard to deny what they were doing if they get busted with a video of them doing it with an audience. At least they aren't concealing their efforts with what they know.

Jon Stratis
22-08-2012, 14:37
There are simply too many ways for a robot to fail (as we saw in the Einstein report) for the refs or FTAs to make a snap call to replay a match unless there is conclusive evidence that the cause was out of the team's control. When there is such evidence, matches are replayed. I've seen it happen when the field has had issues, which does occasionally happen. The issue is identifying the root cause of the failure.

So they're really already doing what you suggest in the last paragraph... it's just rare to be able to make the decision as to root cause.

techhelpbb
22-08-2012, 14:54
There are simply too many ways for a robot to fail (as we saw in the Einstein report) for the refs or FTAs to make a snap call to replay a match unless there is conclusive evidence that the cause was out of the team's control. When there is such evidence, matches are replayed. I've seen it happen when the field has had issues, which does occasionally happen. The issue is identifying the root cause of the failure.

So they're really already doing what you suggest in the last paragraph... it's just rare to be able to make the decision as to root cause.

I agree. With logging on the field communications devices turned off, and with the robots mostly not logging power to the radios (because we were forbidden to do so per the official answer to my Q&A from 2012), there was no way anyone on the FIRST field crew would have had a good, quick way to even narrow down that issue.

Even if they monitor that radio power there are known and not well known programming pitfalls that can swamp the radios so I admit even the above wouldn't be entirely complete.

Quality of service monitoring on field side isn't a perfect solution either because the field channels can be swamped robot side or by the already mentioned wireless issues. One could track packet communications on the WiFi bridge as well but that'll probably require some custom firmware and some place to stick the data.

Alan Anderson
22-08-2012, 15:00
Now that an exploit is public knowledge, it's only a matter of creativity for how it's attempted to be abused.

The specific exploit in question is no longer possible. The access point firmware bug that permitted it will not be present in the future.

Other exploits do still exist. Some are essentially impossible to prevent because they are inherent in the nature of 802.11 wireless networking and established security protocols, but they are detectable.

techhelpbb
22-08-2012, 15:29
There is no way I can publicly state my case, in this FIRST related forum or any other FIRST forum, that the remedies presented in the Einstein report will not be sufficient to prevent exploits. If I make my case, eventually escalating to a successful public proof of concept, all I'll be doing is enabling people with bad intentions. Proving my point is not worth the harm it will probably cause to hundreds of thousands of kids.

There is clearly no time remaining to do anything about the issues anyway.

Come what may. I'm glad that having the highest score is not my highest priority.

Siri
22-08-2012, 15:30
I am trying to figure what you are saying in that post.

I seem to have done a pretty poor job with this one. This is what I thought I was saying:

@Post 93 (http://www.chiefdelphi.com/forums/showpost.php?p=1182650&postcount=93) DampRobot questions how no one on the official FRC team could have known about this FCA hole.

@Post 95 (http://www.chiefdelphi.com/forums/showpost.php?p=1182656&postcount=95) I point out that it only occurs under limited circumstances, and say I'd therefore be surprised if someone from FIRST tripped on it themselves (while sharing the understanding that they would/should have acted had someone told them).

@Post 97 (http://www.chiefdelphi.com/forums/showpost.php?p=1182672&postcount=97) Astrokid points out that you don't need to know the cause of the issue to happen upon it, and says it's not surprising that someone just thought "What if I connect in during a match?"

@Post 105 (http://www.chiefdelphi.com/forums/showpost.php?p=1182696&postcount=105) (the "HUH?" one), I agree but draw a distinction--which, unbeknownst to me, Astrokid agrees with--between "someone" accidentally discovering it, and a FIRST official happening to do so. I draw this distinction because there are a lot more random "someones" than FIRST officials, FIRST officials tend to be rather preoccupied during matches, and given the otherwise extensive testing of the new Cisco firmware, FIRST officials have every reason to trust their selection and the system as a whole.

@Post 106 (http://www.chiefdelphi.com/forums/showpost.php?p=1182697&postcount=106), Al asks me what on Earth I'm talking about.



On a totally different note, does anyone know if the field saves the data records from the spectrum analyzer, or is it solely live feed?

BigJ
22-08-2012, 15:44
There is no way I can publicly state my case, in this FIRST related forum or any other FIRST forum, that the remedies presented in the Einstein report will not be sufficient to prevent exploits. If I make my case, eventually escalating to a successful public proof of concept, all I'll be doing is enabling people with bad intentions. Proving my point is not worth the harm it will probably cause to hundreds of thousands of kids.

There is clearly no time remaining to do anything about the issues anyway.

Come what may. I'm glad that having the highest score is not my highest priority.

Those with bad enough intentions will probably discover it sooner or later (or have already figured it out. Many exploits in software end up working this way). Disclosure is not always a problem. If you believe there is a reasonable mitigation (such as a firmware update, or more stringent procedures in pits+field) that could be made I'm sure many would appreciate it being public knowledge, especially if you have tried reaching out to FIRST already.

However, if you believe it is an issue with no easy mitigation that shakes the current technology foundation of the field and robot control systems to its core, disclosure might not be the best idea unless you are reasonably sure someone is using it.

Just my two cents.

techhelpbb
22-08-2012, 15:53
Those with bad enough intentions will probably discover it sooner or later (or have already figured it out. Many exploits in software end up working this way). Disclosure is not always a problem. If you believe there is a reasonable mitigation (such as a firmware update, or more stringent procedures in pits+field) that could be made I'm sure many would appreciate it being public knowledge, especially if you have tried reaching out to FIRST already.

However, if you believe it is an issue with no easy mitigation that shakes the current technology foundation of the field and robot control systems to its core, disclosure might not be the best idea unless you are reasonably sure someone is using it.

Just my two cents.

I have both sorts of exploits and I have already disclosed this to FIRST 30 days ago so let's start with this:

For one, the problem is the way the fields are laid out geometrically and the way areas of common play are positioned. I won't say why this is a problem, but I will say that a single WIPS sensor per field is not sufficient because of it.

There should be a minimum of 2 of those sensors per field, diagonal from each other across the long dimension of the field. Take a good look at where the current AirTight sensor generally ends up and its proximity to the Cisco hardware.

By the way, this was the very first thought to run through my head given the fact that one alliance or another seemed to be disproportionally likely to have issues.

Al Skierkiewicz
22-08-2012, 16:26
Siri,
I read your post and thought that you were indicating that FIRST engineering had already made the attempt to connect to robots by the time Einstein occurred. Then I read further and became more and more confused as to what point you were trying to make. So let me make a few statements. No one at FIRST, to my knowledge, had attempted to connect to a robot during competition. Of course they performed all kinds of testing in the off season and during other events. They constantly take in info from team members, even though they may not acknowledge that they received the info. They perform tests at HQ and ask FTAs to try things in the field. FIRST also reaches out to trusted technical volunteers for their input and testing when needed, to ensure that a good cross section of robot design and programming platforms are tested.
Teams attempting to control their own robots at home on their practice fields using 2.4 GHz wifi bands are common. I have done it with my phone and my robot. There are several apps available for Android and iPhone that identify available networks and some actually will do spectrum display showing network ID and signal strength. I checked my phone (I can turn wifi access on and off) between matches on Einstein and found a total of three at 2.4 GHz in addition to any robot radios that were on, and two of them were house Fan network points for use during games. Not what you would expect with all those phones and tablets out there in the stands. There was a spectrum analyzer in place to check for networks coming on line and searching for available connections. To my knowledge that does not keep a log but I can tell you several people were checking that during matches, myself included.
I would like to point people to the list of experts that were present during the Einstein weekend. In that list you will find people from Qualcomm who were part of the design team for 802.11 communications and set the specifications, people from Cisco, RF experts from DEKA, the wireless consultant that designs systems in the Boston area and worked on the FRC wifi design, and a variety of FIRST engineering staff, computer experts and RF engineers. All of them brought or ordered up whatever tools they felt would be needed to analyze the field and robot communications. Their intention was to find what caused the failures on Einstein and to make an attempt to break the control system in use. I have not seen a group of people more anxious to break something and show off than those assembled. Yes, they found that there are some things that can be done to improve the wifi configuration, improve data transfers and prevent outages. However, and I can't stress this enough, during Einstein the only repeatable failure of wifi control that is supported by robot logs, observation, robot action, etc. was that of the admitted intrusion by a mentor on the field.

ratdude747
22-08-2012, 16:30
The number of driver station laptops in the pits capable of 5 GHz WiFi was vanishingly small. As a robot inspector, checking for wireless networking of teams' laptops was part of my job. I saw exactly zero with 5 GHz radios in three regional competitions and a championship division.

I find that hard to believe... In my house there are 3 Dell Latitudes with 5GHZ capability:

D400- My old laptop, has a Broadcom BCM4306 chip that can do WPA2 and B/G/A.
D800- My dad's laptop, has an older version of ^ that has the same capabilities.
D630- My current laptop. Used to have an Intel 3945 B/G/A, I later upgraded it to an Intel 4965 B/G/N/A.

I've seen those models in pits before... I've seen a couple D400s used as driver stations as well. Not every D400 has a dualband chip but the BCM 4306 was very common in the D_00 units (Dell offered it as a free upgrade from the base Intel B chip).

IIRC they make USB/PCMCIA/ExpressCard adapters that are dual band that one could hide and later plug in when nobody was looking.

EricVanWyk
22-08-2012, 16:38
I have both sorts of exploits and I have already disclosed this to FIRST 30 days ago so let's start with this:

For one, the problem is the way the fields are laid out geometrically and the way areas of common play are positioned. I won't say why this is a problem, but I will say that a single WIPS sensor per field is not sufficient because of it.

There should be a minimum of 2 of those sensors per field, diagonal from each other across the long dimension of the field. Take a good look at where the current AirTight sensor generally ends up and its proximity to the Cisco hardware.

By the way, this was the very first thought to run through my head given the fact that one alliance or another seemed to be disproportionally likely to have issues.

Brian, please stop spreading FUD. I can already see the direction you are aiming, and quite simply physics does not work that way. You are simultaneously crying that the sky is falling and threatening to make the sky fall.

I ask you to consider why you feel that FRCHQ is unresponsive, and why others do not feel that way. Is it HQ? Is it the others? Or is it you?

Al Skierkiewicz
22-08-2012, 16:38
Larry,
Not all devices that claim full 802.11 wifi can actually do 5 GHz. Most devices, phones especially, are very difficult to determine as to what frequencies they can operate at.

DMetalKong
22-08-2012, 16:40
As far as I understand the extent of the problems, and as far as I understand the OSI model, the attacks that people are talking about are mostly happening on the network layer, which means that they would have to be resolved on the network layer or above. Since I doubt we will be moving away from 802.11 as the physical layer, and since I doubt we will be messing with MAC addressing and whatnot on the data link layer, this means that issues would have to be resolved at the network layer*.

So, possible solution time: what if FIRST developed custom firmware for the routers that would require a handshake using PKI in addition to the normal procedures for connecting to the field AP? Give every team an SD card or flash drive that contains a signed public-private keypair belonging to the team, as well as the certificate for the field APs. As long as every team's private key remains private, this would ensure that any request to connect to the field by a team would be irrevocably linked to that specific team (so no posing as team XXX trying to disrupt field communications), and any request to connect to the field that is not signed could safely be ignored. MITM should be mitigated in this scenario as well. Denial-of-service or other types of jamming would still be possible, but I am assuming they would be more easily detected, because blocking out a user's communication entirely should require more bandwidth than simply impersonating them (I think? Even the FCA attack described did not stop communications on the physical layer, it only made the router ignore a valid connection attempt)*.

* I am by no means an expert, I am just spouting off from my understanding of a couple of networking courses in school.
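
A minimal sketch of the challenge-response idea in Python, with one big substitution named plainly: the standard library has no public-key signing, so this uses a per-team shared secret with HMAC as a stand-in for the signed keypair proposed above. All function names, the message layout, and the team numbers are hypothetical; this is not FIRST's firmware or any real field protocol.

```python
import hashlib
import hmac
import secrets


def new_challenge() -> bytes:
    """Field side: a fresh random nonce so old responses cannot be replayed."""
    return secrets.token_bytes(16)


def sign_challenge(team_key: bytes, team: int, challenge: bytes) -> bytes:
    """Team side: bind the response to both the team number and the nonce."""
    msg = str(team).encode() + b"|" + challenge
    return hmac.new(team_key, msg, hashlib.sha256).digest()


def field_accepts(team_keys: dict, team: int, challenge: bytes,
                  response: bytes) -> bool:
    """Field side: ignore any request that is not correctly authenticated."""
    key = team_keys.get(team)
    if key is None:
        return False  # unknown team number
    expected = sign_challenge(key, team, challenge)
    # Constant-time comparison to avoid leaking how many bytes matched.
    return hmac.compare_digest(expected, response)


if __name__ == "__main__":
    keys = {9001: secrets.token_bytes(32), 9002: secrets.token_bytes(32)}
    nonce = new_challenge()
    good = sign_challenge(keys[9001], 9001, nonce)
    forged = sign_challenge(keys[9002], 9001, nonce)  # wrong key, posing as 9001
    print(field_accepts(keys, 9001, nonce, good))
    print(field_accepts(keys, 9001, nonce, forged))
```

With real public-key signatures the field would hold only each team's public key, so a leak on the field side would not let anyone impersonate a team; the shared-secret version above does not have that property.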

techhelpbb
22-08-2012, 16:45
Brian, please stop spreading FUD. I can already see the direction you are aiming, and quite simply physics does not work that way. You are simultaneously crying that the sky is falling and threatening to make the sky fall.

I ask you to consider why you feel that FRCHQ is unresponsive, and why others do not feel that way. Is it HQ? Is it the others? Or is it you?

Eric you did not address the point. You could have addressed the point but instead you went directly for me as the problem.

Yep, there's the response I already predicted in this very topic (look back a page or two, or ask me to quote it).

You are simultaneously saying you want help and information then simultaneously being highly selective of who offers that help without a second thought to the point they make or any proof they may offer.

I asked weeks ago for merely a description of the process for these additional concerns. None has been provided.
I asked again in this topic and none has been provided.

I asked why people that send e-mails to the designated address aren't even granted the courtesy of an auto-responder and got no response.

I asked people at FIRST and the mere response I got was they were 'looking into it' which is often the response I get when you're not getting a call back.

The argument you think counters my point isn't as strong as you'd like to believe.

Now what am I supposed to do to refute your commentary Eric? Show you this works publicly?
Then what? What's going to be the process then, demand I resign as a mentor, or go after the team I helped start?


Here's what I'm going to do for this forum. I'm not posting again in here today.
Come what may I don't play this contest to score the most points, so in the end the threat to my priorities is trivial.

I do this to help kids and to honor what I do for a living...whether or not we can score the most points has little to do
with that. Even the years with the worst robots the kids still come out the winners and that's fine in my score book.

Akash Rastogi
22-08-2012, 17:32
Eric you did not address the point. You could have addressed the point but instead you went directly for me as the problem.

Yep, there's the response I already predicted in this very topic (look back a page or two, or ask me to quote it).

You are simultaneously saying you want help and information then simultaneously being highly selective of who offers that help without a second thought to the point they make or any proof they may offer.

I asked weeks ago for merely a description of the process for these additional concerns. None has been provided.
I asked again in this topic and none has been provided.

I asked why people that send e-mails to the designated address aren't even granted the courtesy of an auto-responder and got no response.

I asked people at FIRST and the mere response I got was they were 'looking into it' which is often the response I get when you're not getting a call back.

The argument you think counters my point isn't as strong as you'd like to believe.

Now what am I supposed to do to refute your commentary Eric? Show you this works publicly?
Then what? What's going to be the process then, demand I resign as a mentor, or go after the team I helped start?


Here's what I'm going to do for this forum. I'm not posting again in here today.
Come what may I don't play this contest to score the most points, so in the end the threat to my priorities is trivial.

I do this to help kids and to honor what I do for a living...whether or not we can score the most points has little to do
with that. Even the years with the worst robots the kids still come out the winners and that's fine in my score book.

Brian,

Please take a step back from your own commentary as well. I am not sure how you came to some of these conclusions from Eric's post. If you two want to argue, carry it to a PM. Sometimes "we're looking into it" has to be taken as good enough. Please avoid drawing random conclusions from what others say on here. But yes, please do take a few days off from this thread.

Thank you,
Akash

Al Skierkiewicz
22-08-2012, 17:32
David,
The specific phone attack only occurred when a 5 GHz enabled device attempted to connect to a robot. No data transfers took place, no handshaking, no virus like attacks, no special apps or software, no involvement with the FMS. Just the simple operation of attempting to connect to the robot access point.

DampRobot
22-08-2012, 18:06
Now what am I supposed to do to refute your commentary Eric? Show you this works publicly?
Then what? What's going to be the process then, demand I resign as a mentor, or go after the team I helped start?


Someone needed to say this (although perhaps a bit less vehemently). There needs to be an official route for security holes that simply does not exist now. I understand that the good folks at FRC have a ton on their plate already, but there is no incentive structure that exists to make sure these types of problems get reported and solved before they cause havoc at the world championships.

This is what I was getting at with my question about institutional knowledge. Either someone at FIRST knew about this hole, and there was an error in communications, or no one found out about this, because there was no reason for someone outside the small FRC team to go an official route.

I think there needs to be an official way to report bugs and to encourage people to report this type of exploit. An official FRC award for work in security, where as part of the submission process there would be a demonstration of the exploit discovered, would help these problems come out officially rather than being used maliciously. Instead of trying to fight "hackers" by ignorance and fear of persecution, give them a reason to strengthen the system, not destroy it.

linuxboy
22-08-2012, 18:35
I certainly don't take T14 to be the only allowable interaction (having talked to enough FTAs in my day), but it is the only guaranteed interaction. While I've never done it on Einstein, head refs--even busy ones--seem to listen to polite students in the box. I think you'd be hard-pressed to find a ref that wouldn't listen twice to "I know what's wrong; please let me show you how anyone in the stadium can shut down any robot on this field". As I understand it, the demonstration is rather quick (pull up the network list and show you can send a client authorization). If so, the student could show this directly to the ref for added clout.


Thanks, this is pretty much what I meant to say. While it is totally valid to talk to the other volunteers, the "official" route for raising an issue is in the question box (and after a match with connection issues, FTAs tend to get to the person in the question box just as soon as the head ref in my experience).

EricH, while it seems that going to the head ref could have yielded the same result, I think it's just as likely that the ref (along with the FTA) may have chosen to hear the student out and see a demonstration. That's completely my opinion; there's no way of knowing what would have happened.

EricH
22-08-2012, 19:59
EricH, while it seems that going to the head ref could have yielded the same result, I think it's just as likely that the ref (along with the FTA) may have chosen to hear the student out and see a demonstration.

It's just as likely, yes. But what you missed is this:

By the time the student has told the ref, who has told the FTA, you have the following chain:

1) Mentor thinks there may have been a DoS attack. (or other issue)
2) Mentor tells student to tell the ref that there may have been a DoS attack.
3) Student tells ref that there may have been a DoS attack, and the FTA may want to know about it.
4) Ref tells FTA (if the FTA isn't already there listening).

That's a minimum of twice removed, on a suspicion. The FTA is going crazy trying to figure out what's going on--and remember, all eyes are on the FTA and his crew (normally they blend into the background, or are supposed to). And, remember, there's an alert that is supposed to catch DoS attacks and it hasn't gone off.

If I'm the FTA, I'm likely to go, "Tell your mentor that there wasn't one detected and we're trying to get to the bottom of this" and get back to trying to get to the bottom of the problem. It won't be until the second match at least that I look at it and go "Hey, there might be something to what that kid was saying his mentor thought. Now what team was he on again?"


Now, if the student was there and said, "We think someone tampered with a robot during a match by this process, which you might not be able to detect", the FTA would be a whole lot more likely to take action, because a) they now have an idea that their detectors aren't working and b) they have something concrete that they can look for if the logs haven't disappeared yet. But that whole thing involves a mentor explaining the process to a student, which takes time.

ratdude747
22-08-2012, 21:13
Larry,
Not all devices that claim full 802.11 wifi can actually do 5 GHz. Most devices, phones especially, are very difficult to determine as to what frequencies they can operate at.

I know... I'm just saying there were popular laptops out there that COULD.

How do I know? My router is a dualband N (two APs) and all 3 laptops can see and connect to my 5ghz Network (set to 5ghz only) just fine.

DMetalKong
22-08-2012, 22:22
David,
The specific phone attack only occurred when a 5 GHz enabled device attempted to connect to a robot. No data transfers took place, no handshaking, no virus like attacks, no special apps or software, no involvement with the FMS. Just the simple operation of attempting to connect to the robot access point.

Al,

Correct me if I misunderstand though, but for 802.11 there is a standard protocol for the router (or other device) to attempt to make the connection. What I was suggesting was modifying this protocol through the router/AP firmware so that the routers/APs that are part of the field network could ignore unauthorized connection attempts.

I see so much discussion of problems with the field without much discussion of solutions. That is not to say that people do not have solutions; I think it is easier to focus on what went wrong than on plans for the future (especially when I get the impression that people feel like they do not have a means of influencing change in the organization as a whole). As much as this discussion is veering from the original intent of the thread (the apology), I would rather see it derailed in a constructive fashion focusing on possible solutions, even if those solutions won't necessarily work.

Siri
22-08-2012, 22:37
Siri,
I read your post and thought that you were indicating that First engineering had already made the attempt to connect to robots by the time Einstein occurred. Then I read further and became more and more confused as to what point you were trying to make. So let me make a few statements...

Ok, that was the exact opposite of what I meant/said, so I'm glad we cleared that up. Thank you and thanks for the statements, too. I know I can't understand what it's like working inside something so complex and critically-viewed, much less when it's a volunteer organization. At the same time, your point about FIRST constantly collecting information from teams even if they don't say so worries me somewhat. As may have been noticed on this thread and others, the lack of two-way communication before and at events is difficult to handle in some cases. Community members are left to feel they have little recourse, whether or not we actually do. Nothing good seems to happen when officials are overwhelmed with advice (or complaints) and members feel overwhelmed with things to advise about. (I've also been on both sides of this in FIRST and neither is easy or pleasant.)

I do argue with others on this thread that we need a more consistent/accepted/responsive/official/useful/publicized/whathaveyou reporting channel for these sorts of things. So I ask as nicely and respectfully as physically possible towards both parties: how do we do this?

Alan Anderson
22-08-2012, 23:12
Correct me if I misunderstand though, but for 802.11 there is a standard protocol for the router (or other device) to attempt to make the connection. What I was suggesting was modifying this protocol through the router/AP firmware so that the routers/APs that are part of the field network could ignore unauthorized connection attempts.

There's probably no need to modify the protocol. It already dismisses failed client authentication attempts. The disruption to the field network seen on Einstein was due to a bug in the access point firmware, which combined with one version of robot router hardware to cause an unexpected loss of the network connection. That bug is no longer an issue.

An 802.11 protocol change that encrypts "management packets" could probably prevent deauthorization flood attacks from succeeding. It would also break a lot of things in the process.

I see so much discussion of problems with the field without much discussion of solutions. That is not to say that people do not have solutions; I think it is easier to focus on what went wrong than on plans for the future (especially when I get the impression that people feel like they do not have a means of influencing change in the organization as a whole). As much as this discussion is veering from the original intent of the thread (the apology), I would rather see it derailed in a constructive fashion focusing on possible solutions, even if those solutions won't necessarily work.

Did you read the Einstein investigation report through to the end? The last two pages are all about planned possible changes, with a half dozen of them as specific solutions to observed problems.

EricVanWyk
22-08-2012, 23:20
I do argue with others on this thread that we need a more consistent/accepted/responsive/official/useful/publicized/whathaveyou reporting channel for these sorts of things. So I ask as nicely and respectfully as physically possible towards both parties: how do we do this?

At an event, the "question box" is the best way to begin communication, you just need to be patient as your question gets routed to the best person to answer it. Outside an event, email is your best bet. Specific to these types of situations, you can use 2012frcfeedback@usfirst.org (as stated in the Einstein report). Please note that many people are currently on vacation, and the ones that aren't are buried in work.

The important thing to remember is that the hardest part of engineering is communication. The value of your ideas is limited to the people you can influence with them. As a volunteer I've been cursed out several times by people trying to influence me with their ideas, and it turns out that screaming in someone's face isn't very effective persuasion. By the time they've finished commenting on my heritage and IQ, they could have instead told me their idea and provided supporting information.

So, when you "attempt to notify FIRST personnel of [your] belief", please be clear, concise, and civil.

DMetalKong
22-08-2012, 23:22
An 802.11 protocol change that encrypts "management packets" could probably prevent deauthorization flood attacks from succeeding. It would also break a lot of things in the process.


I think that breaking things could be acceptable for use in FRC if the need is strong enough. As long as only firmware is changing, and not hardware, the cost of deployment would not be as great as an entirely custom solution.



Did you read the Einstein investigation report through to the end? The last two pages are all about planned possible changes, with a half dozen of them as specific solutions to observed problems.

I did read the report :) I was referring more to the various threads on CD that have been started about the topic.

stjonl
22-08-2012, 23:27
I want to echo this point. By the mentor's own admission, he used the attack, but why should we believe his admission of guilt isn't the full story from his perspective? Every single one of us has been looking for someone or something to blame for what happened on Einstein. The full report has brought up a multitude of points of failure during the finals, and it's really not hard to believe the answer to all of this is not as simple as blaming this all on one mentor. As soon as news broke that there was an attack during play, all of the failures on the field were attributed to that. But things just aren't that simple, and we discovered how many root causes for all the different problems there really were. But I firmly believe we still know far too little to place all of the blame on this one attacker. With thousands of incredibly smart people in the dome, it's entirely possible that someone else used this attack, whether or not their team was on Einstein, and whether or not they were fully aware of their actions.

We've heard 2 sides of the story so far, and unless someone would like to point out something I missed that puts them in direct conflict, I think it's only fair to evaluate this based on what we know.

Everyone was feeling a lot of emotions at the moment, and the attack in response could have been from a moment of desperation. I'm not condoning what happened, but I am trying to understand it.


I think part of the story happened before St. Louis. At MSC the finals were 469, 67 and 830 against 2054, 548 and 245. The red alliance won the first match, but the second match ended on a most unusual note. 2054, 548 and 245 were attempting a triple balance. A little before that, 67 was in the blue alley and died about two feet in front of the blue bridge. The blue alliance charlie browned the bridge and contact was made with 67 a few times. At the end of the match, the blue bridge was level, but one robot on the blue alliance side was half on the bridge and half on the floor. The referees looked it over and huddled up. I believe they called a few 3-point penalties during the match for contact in the alley. The final score was close enough that a blue bridge balance would have given them the match win and forced a third final match. I can only assume the referees were discussing whether there was bridge balancing interference. As they discussed the issue, the winning teams were called to the floor to cut down the nets. Referees still discussing. Some of the nets have now been cut down. One referee leaves the huddle, removes his striped shirt within a couple steps and is clearly not happy. The rest of the nets are cut down. I cannot remember exactly when the referee huddle ended, but at some point the MC then explained that the balanced blue bridge did not count because of the one robot half on the bridge and half on the floor. No mention of the possible bridge balancing interference. Many others and I believe that the bridge balance was interfered with, OK, not deliberately, but just the same. That's not how the referees called it. 2054, 548 and 245 could have been denied a chance to win the tournament because of that call. Referees are human and they do the best they can; do not blame them unless you want to fill their shoes.
I was really surprised that the referee huddle was still in progress when they began the ending ceremonies. That really dampened the mood for everyone there.

Siri
23-08-2012, 07:58
At an event, the "question box" is the best way to begin communication, you just need to be patient as your question gets routed to the best person to answer it. Outside an event, email is your best bet. Specific to these types of situations, you can use 2012frcfeedback@usfirst.org (as stated in the Einstein report). Please note that many people are currently on vacation, and the ones that aren't are buried in work...

While I agree with you, most significantly on your discussion of communication, the outside-event lack-of-communication issues described lately on CD are far from the first time I've heard intelligent, articulate and patient people report silence from FIRST. Without saying they're right (or wrong), I will contend that it's widespread enough to attract the attention of myself and others.

As for at-event, I've already put in my defense of the question box, but my experience is somewhere between yours and EricH's. I've witnessed and experienced mentors communicating issues to students for the question box from both sides, and I'm not sure it's routinely as slow or unwieldy as some are concerned (though it certainly has the potential to be...as a coach I actually review question box procedure with my drivers). In this situation, I do believe officials would listen, especially after the second failure.

At the same time, there hasn't been a year go by (as driver, coach or volunteer) that I haven't seen at least 2-3+ patient, articulate, clear-but-not-obnoxious students, some with documentation or other teams to back them up, never be given the chance to talk and instead get sent away from the box. Nor is it altogether uncommon for these students to be independently proven correct, but only later than necessary and in some cases too late. (The obvious best-known example this year was at championship quals, though thankfully then it wasn't too late.) How many more have had a valid point, and how much difficulty on both sides did the breakdown in communication cost? I understand and feel very dearly for FTAs and my fellow refs, but I can't help but try for alternatives. As this situation demonstrates, there's something to be gained for everyone.

Am I making a mountain of a molehill? Maybe, but if so it's far from an uncommon hallucination.

Adam Freeman
23-08-2012, 08:04
I think part of the story happened before St. Louis. At MSC the finals were 469, 67 and 830 against 2054, 548 and 245. The red alliance won the first match, but the second match ended on a most unusual note. 2054, 548 and 245 were attempting a triple balance. A little before that, 67 was in the blue alley and died about two feet in front of the blue bridge. The blue alliance charlie browned the bridge and contact was made with 67 a few times. At the end of the match, the blue bridge was level, but one robot on the blue alliance side was half on the bridge and half on the floor. The referees looked it over and huddled up. I believe they called a few 3-point penalties during the match for contact in the alley. The final score was close enough that a blue bridge balance would have given them the match win and forced a third final match. I can only assume the referees were discussing whether there was bridge balancing interference. As they discussed the issue, the winning teams were called to the floor to cut down the nets. Referees still discussing. Some of the nets have now been cut down. One referee leaves the huddle, removes his striped shirt within a couple steps and is clearly not happy. The rest of the nets are cut down. I cannot remember exactly when the referee huddle ended, but at some point the MC then explained that the balanced blue bridge did not count because of the one robot half on the bridge and half on the floor. No mention of the possible bridge balancing interference. Many others and I believe that the bridge balance was interfered with, OK, not deliberately, but just the same. That's not how the referees called it. 2054, 548 and 245 could have been denied a chance to win the tournament because of that call. Referees are human and they do the best they can; do not blame them unless you want to fill their shoes.
I was really surprised that the referee huddle was still in progress when they began the ending ceremonies. That really dampened the mood for everyone there.

I am not quite sure what your post has to do with this topic. It's not even correct. The field was given the "all clear" signal and the scores/winners were announced before the nets were cut. Not sure what was going on with the refs (maybe Gary Voshol can clarify). I know there were some upset mentors from the blue alliance. Heck, even the drive coaches for the winning alliance were less than enthusiastic with the way things ended.

Al Skierkiewicz
23-08-2012, 08:54
In case everyone doesn't know, I will state this again. Key volunteers are asked to report on their events following each of those events. We have weekly phone conferences to answer questions and pass along the latest data. Specifically, that is the FTA, LRI, and Head Ref as well as others. Those reports will contain information that is beneficial to each of the groups, to improve the next week's events and FRC in general. In some cases, these reports will prompt a Team Update and/or a rule change. As an LRI I can assure you that one of the persons who hears my report is a member of the GDC and I have access via email and phone to First Engineering. If one of us comes across something that we can verify at the event, then we make the tests and ensure that there is a real problem that exists. Then we all document it and report it through our individual lines of communication with HQ along with fixes that may have been found. Reports that come to HQ through other means are also evaluated and checked. The belief of some people on this thread that they are not listened to or that they are ignored is simply and categorically untrue. They are opinions, not fact. While I can't really speak for the other volunteers, I can tell you that robot inspectors, FTAs, Refs and First staff are dedicated to improving this competition environment. Those outside of the robot key volunteer organization, i.e. event staff, coordinators, pit admin and safety advisors, are also dedicated to improving things. If you want to know who those people are, you only have to look here at CD and see who is answering certain questions in a helpful, open manner without trying to be condescending, boastful or argumentative. I can tell you from personal experience, trying to evaluate a problem with few staff and come up with a solution in a week or less is very difficult. That pushes confirmations, phone or email, way down on the list of priorities.
Most of the First staff are at events throughout the season so that leaves even less time to come up with solutions. First staff and GDC are always looking at CD for input and they are reading exactly what you write here even if they don't respond.
As Alan pointed out earlier, the Einstein report hints that the 802.11 protocol allows for various types of security and those suggestions from the experts are being employed for next season. Other suggestions related to antenna designs, placement of components and other issues with the wifi infrastructure are also being implemented to ensure secure communications with your robot.
David and Siri, let me know if this didn't answer your questions. Alan and Eric thanks for your input.

qnetjoe
23-08-2012, 15:31
I think that breaking things could be acceptable for use in FRC if the need is strong enough. As long as only firmware is changing, and not hardware, the cost of deployment would not be as great as an entirely custom solution.

There is really no sense in reinventing the wheel. IEEE has been working on protected management frames for a long time. There is a standard called 802.11w-2009 that does this and it was ratified in 2009 and was recently superseded by 802.11-2012 which was just the merging of ten amendments together to help prevent forking.

The next step in the process would be to find hardware that fits FIRST's budget and meets these standards.

just my two cents

If you want to track the 802.11 WG progress/projects, here is a good link to start with:

http://grouper.ieee.org/groups/802/11/Reports/802.11_Timelines.htm
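
To make the idea of protected management frames a little more concrete, here is a minimal Python sketch of the general principle only: a management frame carries a sequence number and an integrity tag computed with a key that outsiders do not have, so a forged or replayed deauthentication frame can be dropped. Everything in it is an assumption for illustration. HMAC-SHA256 stands in for whatever integrity algorithm the standard actually specifies, the key is generated locally rather than derived during association, and the frame layout and the names `protect` and `accept` are hypothetical.

```python
import hashlib
import hmac
import os

# Hypothetical session key. In a real network the integrity key would be
# derived when the client associates, not generated like this.
SESSION_KEY = os.urandom(32)


def protect(frame_body: bytes, sequence: int, key: bytes) -> bytes:
    """Prefix a sequence number and append an integrity tag to a frame."""
    header = sequence.to_bytes(6, "big")
    tag = hmac.new(key, header + frame_body, hashlib.sha256).digest()[:16]
    return header + frame_body + tag


def accept(frame: bytes, key: bytes, last_sequence: int) -> bool:
    """Return True only for untampered frames with a fresh sequence number."""
    if len(frame) < 6 + 16:
        return False
    header, body, tag = frame[:6], frame[6:-16], frame[-16:]
    expected = hmac.new(key, header + body, hashlib.sha256).digest()[:16]
    if not hmac.compare_digest(tag, expected):
        return False  # forged or corrupted frame
    return int.from_bytes(header, "big") > last_sequence  # reject replays


genuine = protect(b"DEAUTH station=robot-1", sequence=1, key=SESSION_KEY)
# An outsider does not know SESSION_KEY, so it can only guess at a key.
forged = protect(b"DEAUTH station=robot-1", sequence=2, key=os.urandom(32))

genuine_ok = accept(genuine, SESSION_KEY, last_sequence=0)
forged_ok = accept(forged, SESSION_KEY, last_sequence=0)
replay_ok = accept(genuine, SESSION_KEY, last_sequence=1)
print(genuine_ok, forged_ok, replay_ok)
```

The point of the sketch is only that both ends need to support the scheme, which is why hardware support on the field side and the robot side matters.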

Racer26
23-08-2012, 17:08
I'm admittedly a bit late to the party on this one, but here goes anyway.

IMO, 548's statement is what was needed to get the community to put away the pitchforks. BUT, I don't believe the story told to them by the mentor in question. I personally feel that both the story told to 548's committee by the mentor, AND the story in the Einstein Report BOTH overlook the giant elephant in the room, that being the mysterious comms losses through MSC and Newton Elims, exhibiting symptoms consistent with the FCA attack (based on my viewing of the match video I can find), and only to teams that would pose a threat to 548's success in future matches (or current matches, in a desperation effort to avoid elimination).

I'm content to take 548's committee at face value, that this mentor acted alone, and without the knowledge of the rest of the team. However, I feel that the individual who interfered with Einstein isn't telling the whole truth. It is my opinion, and one I'm sure I'm not alone in, that the individual had been using the FCA vulnerability since at least MSC.

Since I believe this individual was acting alone, I hold no grudge against 548 as a team.

Another thing I haven't seen mentioned is that there was a Week 4 event held in 548's school. It's entirely likely that the mentor discovered the vulnerability there, when they would have had decidedly greater access to a real field than anywhere else.

While I agree with many of the posters in the first couple of pages of this thread that interfering at the highest stage to demonstrate your vulnerability isn't cool, if this person hadn't done that, we probably never would have had the Einstein Investigation, and many of the issues uncovered by it may have continued to go unnoticed.

Ekcrbe
23-08-2012, 18:13
I personally feel that both the story told to 548's committee by the mentor, AND the story in the Einstein Report BOTH overlook the giant elephant in the room, that being the mysterious comms losses through MSC and Newton Elims, exhibiting symptoms consistent with the FCA attack (based on my viewing of the match video I can find), and only to teams that would pose a threat to 548's success in future matches (or current matches, in a desperation effort to avoid elimination).

It's not really an overlook, because the Einstein Report reported on Einstein, and was not obligated to anything else. I believe (anybody confirm?) that FiM is investigating other potential instances of FCA attacks outside of Einstein.

Edit: It seems there is some investigation into matches at MSC by FiM members.

techhelpbb
23-08-2012, 18:28
There is really no sense in reinventing the wheel. IEEE has been working on protected management frames for a long time. There is a standard called 802.11w-2009 that does this and it was ratified in 2009 and was recently superseded by 802.11-2012 which was just the merging of ten amendments together to help prevent forking.

The next step in the process would be to find hardware that fits FIRST's budget and meets these standards.

just my two cents

If you want to track the 802.11 WG progress/projects, here is a good link to start with:

http://grouper.ieee.org/groups/802/11/Reports/802.11_Timelines.htm

The Cisco 1250 & 1260 series APs (with 32MB) already have support for Cisco Management Frame Protection (MFP). It's a similar idea, but it predates the availability of the full standard. It requires a specific configuration and a Cisco Compatible Extensions (CCX) version 5.0 compliant device. There is support for CCX v5 from Ralink, Atheros and Broadcom, though mostly in peripherals and not in routers, APs or bridges that I could find. So it saves the field side but not the robot side.

Rosewill sells a USB device claiming CCX v5 support but I can't vouch for that personally. The model is:
RNX-N600UBE

I'm not making any recommendations here, if anyone is really interested please discuss the matter with AirTight regarding any caveats.

Liz Smith
23-08-2012, 19:57
There is a big issue I see that has been echoed many times in these Einstein discussions. There are problems with assumptions that are not necessarily correct, speculation without any data, and research without the proper tools. I will use myself as an example, but I feel this applies to everyone.

It would be presumptuous for me to consider myself, for example, a FRC wifi interference researcher just by sitting here at home watching YouTube videos of matches. Any information I gain is anecdotal at best. I do not have a full set of data, I do not have a full FRC field in my house (yet??) to conduct my own research—and that’s not necessarily a bad thing.

But... let's say I’m really concerned about the second to last match in the “Regional State District Division Championship” where my team stopped moving during the match. It is way too easy for me to watch a video replay of that match 100 times and convince myself without a doubt that I know why that robot failed.

This is a problem. I can theorize all I want, but it is counterproductive for me to arrive at absolute conclusions because I have neither a full set of data nor the proper tools to investigate. It would be much more productive for me to report my information and concerns and maybe a link to the video, but then let go of it and let someone else with the tools to fully investigate the issue look into it. If I pursue it further, I just end up making more assumptions and run the risk of assuming that every robot that stops moving in every match I ever watch is because of this one reason… which in reality can’t possibly be true. Going further, if I then go out and present myself as an expert on these matters all I am doing is spreading rumors without any real empirical data to back it up.

I can definitely voice my concerns, but I have to accept the fact that the people whose full time job it is to solve these problems are going to be better equipped than me to draw these conclusions. I know that FRC staff, volunteers, mentors and students... we all want every single robot on the field to function properly 100% of the time but when theories are presented as factual evidence and people make statements that are just speculation it just causes unnecessary confusion.

Basel A
23-08-2012, 23:40
I'm admittedly a bit late to the party on this one, but here goes anyway.

IMO, 548's statement is what was needed to get the community to put away the pitchforks. BUT, I don't believe the story told to them by the mentor in question. I personally feel that both the story told to 548's committee by the mentor, AND the story in the Einstein Report BOTH overlook the giant elephant in the room, that being the mysterious comms losses through MSC and Newton Elims, exhibiting symptoms consistent with the FCA attack (based on my viewing of the match video I can find), and only to teams that would pose a threat to 548's success in future matches (or current matches, in a desperation effort to avoid elimination).



Based on my understanding, it is impossible to reasonably diagnose a failure as FCA based on match video. Indeed, the symptoms consistent with FCA are consistent with roughly a million other problems. Even after FIRST's thorough investigation, all but one case of likely FCA cases were only considered "likely."

I find it disturbing that you're prepared not only to diagnose a robot failure as a complex problem based on minimal evidence, but also ready to indict an individual, about whom you know exactly one thing, of match-fixing at the highest level.

Ekcrbe
23-08-2012, 23:53
Even after FIRST's thorough investigation, all but one case of likely FCA cases were only considered "likely."

And that case (SF 2-1) was only confirmed because of the individual admitting to it, so there was no testing-based confirmation of FCA being an absolute conclusion.

Gregor
24-08-2012, 01:49
I find it disturbing that you're prepared not only to diagnose a robot failure as a complex problem based on minimal evidence, but also ready to indict an individual, about whom you know exactly one thing, of match-fixing at the highest level.

I don't believe that he is diagnosing them as FCA attacks, only pointing out the possibility of more FCA attacks that might have happened before Einstein.

Greg McKaskle
24-08-2012, 07:51
Diagnosis based solely on video is highly speculative. But knowing more about the behavior of any dashboard feed, the diagnostics on the DS, and any after-match troubleshooting results can move the needle as far as likely. It should require hard evidence or admission to move it to confirmed.

Greg McKaskle

techhelpbb
24-08-2012, 12:00
I'm trying to make this my last post in this topic...

This is a summary of the advice I've given my team:

Wireless networks like this are assumed to occasionally be unreliable and in order to handle the added complexity they implement solutions that may or may not be sufficient to make them as reliable as possible.

It is wonderful what the specifications for these networks would lead you to believe as a selling point for that technology. It is wonderful what demonstrations you can make to test those specifications in one circumstance or another.

However, at the core this technology creates a link subject to some unreliability even when you don't have someone trying to make it unreliable intentionally.

It's wonderful that FIRST is trying to make these links as reliable as possible but we as the robot builders can help by making our robots less dependent on the wireless network being entirely reliable for every instant we use it.

If the parts of the robot we as the robot builders control are less dependent on the reliability of the wireless network it will be much harder for an unforeseen situation over a short period of time to decrease our competitive performance. Regardless of whether that short interruption is from someone trying to cause trouble or an unforeseen circumstance.

Our team is student-led. I'll let them decide how to deal with that. I'm confident there are many things they can do with that advice to improve the competition performance of a robot.

This advice leaves FIRST additional room to have undetected problems in their network for short periods of time for a large number of possible reasons. So this is intended to be constructive and positive leaning guidance. I think it's fair to point it out because I don't expect that it's entirely level headed to charge FIRST with doing something quite hard with a great number of variables and expect there to be no issues along the way.

Alan Anderson
24-08-2012, 17:10
It's wonderful that FIRST is trying to make these links as reliable as possible but we as the robot builders can help by making our robots less dependent on the wireless network being entirely reliable for every instant we use it.

Unfortunately, "we" can't do much at all about the robot's dependence on the network. When the cRIO isn't getting continuous "enabled" signals from the Driver Station, it shuts down all the motors and other actuators. That's something completely beyond the control of robot builders.

What kind of help were you thinking of?

GaryVoshol
24-08-2012, 17:32
I am not quite sure what your post has to do with this topic. It's not even correct. The field was given the "all clear" signal and the scores/winners were announced before the nets were cut. Not sure what was going on with the refs (maybe Gary Voshol can clarify). I know there were some upset mentors from the blue alliance. Heck, even the drive coaches for the winning alliance were less than enthusiastic with the way things ended.

Without disclosing any confidences, all I can say is that the refs did not have an extended discussion about F-2 and the results. Any observers may have seen us talking generally among ourselves as we ended the event.

techhelpbb
24-08-2012, 21:55
Unfortunately, "we" can't do much at all about the robot's dependence on the network. When the cRIO isn't getting continuous "enabled" signals from the Driver Station, it shuts down all the motors and other actuators. That's something completely beyond the control of robot builders.

What kind of help were you thinking of?

The next line of the post you quoted:
"If the parts of the robot we as the robot builders control are less dependent on the reliability of the wireless network it will be much harder for an unforeseen situation over a short period of time to decrease our competitive performance."

I am giving FIRST credit that no error on the scale of the FCA will escape the FRC development process in the future. If it does I assume FIRST will find a way to replay the matches or figure out the resolution as quickly as possible. Obviously to some extent the existing robot signal/status lights (RSL) and driver's station diagnostics are a start to let the people on the field find problems.

Last I checked the disable state does shut down all the motors and actuators in the hardware of the digital sidecar but not all the digital I/O (GPIO), or the I2C, or the User1 light on the cRIO. The rules seem to consider this with how things that can cause movement are connected to the digital sidecar. However I don't think the rules prohibit indicator lights you can see off the field correctly designed to be connected to the GPIO pins or I2C.

Certainly code can execute in the cRIO regardless of enable. If you lose communications you lose your ability to send status back to your driver's station over that link. Of course you lose your ability to move for the sake of safety. However you retain the ability to manipulate those I/O and can use those to deliver status information that might be valuable even if you can't communicate to the driver's station or get near the robot. You'd be able to use the flash memory as well while disabled and then you can retain that data even after the robot is powered off (of course it is flash memory so wear leveling could be a concern).

That doesn't improve your movement situation if you end up losing communications long enough to miss enable. Still, using something like that along with the information communicated by the RSL and the driver's station would give you a lot of information that your code is doing what you think it is, when you think it is, when the field thinks it is, even on a competition field where you can't get near the robot or even communicate with it. In fact if the robot can't communicate with you it could signal that (some might think that redundant but you never know, it might have helped Team 118 on Einstein).

Separate from the issue of not getting enabled:

Sending large amounts of data back and forth consistently using TCP and making that critical to the control of the robot is going to increase the chance that a momentary interruption or delay will cause adverse consequences. TCP is going to try to deliver that data but who knows how long it'll take.

UDP, although it isn't a reliable protocol, will still generate useful communications. Someone can create a transaction system with UDP that can actually lose messages and ignore messages, unlike TCP, which tries to help by pretending the link is reliable (when it might be busy or experience some wireless issue). The FMS seems to use UDP a lot itself.
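
As a rough illustration of that idea, here is a minimal Python sketch of a receiver that tolerates lost datagrams and ignores stale ones. The 4-byte sequence-number header and the names `pack_message` and `LossTolerantReceiver` are made up for this example; this is not the FMS or driver's station protocol, and sequence wraparound is not handled.

```python
import struct

HEADER = struct.Struct("!I")  # hypothetical header: 32-bit sequence number


def pack_message(seq, payload):
    """Prefix a payload (bytes) with its sequence number."""
    return HEADER.pack(seq) + payload


class LossTolerantReceiver:
    """Accepts only datagrams newer than the last one seen."""

    def __init__(self):
        self.last_seq = None
        self.lost = 0     # sequence gaps (never arrived, or arrived too late)
        self.ignored = 0  # stale or duplicate datagrams that were dropped

    def handle(self, datagram):
        """Return the payload if the datagram is fresh, else None."""
        (seq,) = HEADER.unpack_from(datagram)
        if self.last_seq is not None and seq <= self.last_seq:
            self.ignored += 1
            return None
        if self.last_seq is not None:
            self.lost += seq - self.last_seq - 1
        self.last_seq = seq
        return datagram[HEADER.size:]
```

The receiver never waits for a missing message. It acts on the newest state it has and keeps count of what went missing, which is the opposite of TCP holding everything up until the gap is filled.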

My concerns fall along the same lines as what happens when a critical sensor has become disconnected and you don't detect that it's been disconnected. However the code expects input from that now disconnected sensor in a loop from which you cannot escape and so everything is stuck (it blocks).
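
A minimal Python sketch of the alternative to that kind of blocking loop: wait for the condition, but give up after a deadline so the caller can fall back. The helper name `wait_for` and its parameters are hypothetical, and this is plain Python rather than anything from the cRIO libraries.

```python
import time


def wait_for(condition, timeout_s, poll_s=0.01):
    """Poll condition() until it is true or timeout_s elapses.

    Returns True if the condition was met and False on timeout, so the
    caller can carry on without the sensor instead of blocking forever.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        if condition():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_s)  # yield the CPU between polls
```

With something like `if not wait_for(sensor_ready, 0.5): use_fallback()` (both names being placeholders), a disconnected sensor costs half a second rather than the whole match, and the sleep keeps the loop from pegging the CPU.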

What will happen if you lose your camera feed to the driver's station or it suddenly starts getting really dysfunctional and that's the only choice you have for some critical function? What will happen if your driver's station is running code to process that video and the camera feed is disrupted? What will happen if your robot is enabled and keeps waiting for information from the driver's station and that information is delayed? What will happen if you put a lot of debugging information in to send back to the driver's station and it takes longer than you expected based on tests back home? What would happen if you send a lot of packets to the cRIO and your code didn't read them fast enough and you start to overflow the input buffer (buffer overflow 'exploit' right from the Einstein report starting on page 13)?

Obviously if someone can actually defeat your ability to see the enable, your movement-driving outputs from the digital sidecar will disable for safety (excusing momentum). Then you have to consider the physical status of the actuators that stopped if you return to the enabled state from that unexpected disabled state.

However, the system can obviously lose packets, so the idea of continuous enable transmission seems to give the wrong impression (it is continuous but you can lose some packets and not get disabled). There's even a counter for missed packets in the Field Management System (FMS), and the manual says: "Typically there are some lost packets. In a very tame wireless environment, this number will be less than 100." (Page 49, Rebound Rumble FMS manual, Rev. 0). The FMS also shows the average time it takes for traffic to go from the driver's station to the robot and back (average meaning not necessarily instantaneous round trip time). That information comes from the driver's stations to the FMS about every 100ms from what I've researched. Unfortunately every interruption to the link is going to delay delivery of TCP packets and might actually lose your UDP packets entirely. Obviously the counter existing with that note in the documentation for the field operators indicates that this can happen up to about 100 times even in a very tame environment, so what about a not so tame environment? Also that counter is for each team.

I can provide the links to back this up but I'm not sure I want to be linking the FMS manuals to this site. It might not stand the test of time and I'm not sure if there are rules about it.

Greg McKaskle
25-08-2012, 00:48
..You'd be able to use the flash memory as well while disabled and then you can retain that data even after the robot is powered off (course it is flash memory so wear leveling could be a concern)..

The flash drivers already implement wear leveling. The cRIO was designed as a monitoring/control device with a highly reliable file system and is used by industry to log data in remote and harsh conditions. Log files that detail how your robot operates are a good technique independent of any communications issues. Knowing whether the robot leaves auto, extends the arm too far, or dies entirely is helpful to everyone. Please keep in mind that the logging isn't free and it is possible to log so much data that the cRIO will not have the CPU needed to drive the robot.

In fact if the robot can't communicate with you it could signal that (some might think that redundant but you never know it might have helped Team 118 on Einstein).
118 DS logs clearly showed what was happening on Einstein regarding communications. They showed that the robot was being told to enter auto, the CPU spiked to 100%, and the robot stayed in communication for several seconds longer, responding with its voltage and other fields but never indicating that it completed processing the auto command. There were plenty of LEDs on 118, and if the code had been executing as expected and there had been a comms issue, they could have been used to show extra info, and logging could have helped as well. The difficulty with 118 was identifying how and why the CPU went to 100%.

Someone can create a transaction system with UDP that can actually lose messages and ignore messages unlike TCP trying to help by pretending the link is reliable (when it might be busy or experience some wireless issue). The FMS seems to use UDP a lot itself.
All traffic from FMS to DS to Robot and back is implemented using UDP with redundant info and some tracking data to calculate trip times and lost packets. TCP is used for smart dashboard and by dashboard cameras.

... However the code expects input from that now disconnected sensor in a loop from which you cannot escape and so everything is stuck (it blocks).

The code on 118 was unique to their gyro reset done as auto began. I don't think anyone would recommend putting a tight loop into the code waiting for a sensor condition. The 118 SW mentor didn't know the code had been added. If the CPU hadn't pegged in the blocking loop, the dashboard and robot behavior would have helped identify that the gyro was disconnected.

The buffer issue mentioned was a secondary issue that explained why 118 couldn't be rebooted from the DS. It didn't directly contribute to the failure. It is an artifact of the version of VxWorks that runs on the cRIO. It allows for improperly written code in one task to impact the communication of other tasks. The buffer was full, not overflowing, and there was no exploit.

The robot disable occurs when no DS commands have been received for 100ms. The packets are sent every 20ms. So it will take 5 sequential packet losses to trigger a disable. The robot will be enabled as soon as another packet arrives, perhaps as short as 20ms. The Einstein communications, as measured and logged by the DS, were very quiet, almost equal to an ethernet cable, except for a field-wide burst in the final match. This may have been external noise such as a lightning strike. Logs of the Einstein robots during qualifications showed far more interference but no disabling caused by it.
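
The timing rule above can be sketched in a few lines of Python. This is a simplified model, not the cRIO's watchdog code: it assumes perfectly regular 20ms send slots, a robot that starts enabled, and a disable on the fifth consecutive missed slot. The function name `count_disables` is made up for the example.

```python
def count_disables(slots, period_ms=20, timeout_ms=100):
    """Count enabled -> disabled transitions for a sequence of send slots.

    slots: iterable of booleans, one per send period. True means the
    control packet for that slot arrived.
    """
    losses_to_disable = timeout_ms // period_ms  # 5 with the figures above
    run = 0        # current streak of consecutive lost packets
    disables = 0
    for arrived in slots:
        if arrived:
            run = 0  # any packet re-enables the robot
        else:
            run += 1
            if run == losses_to_disable:
                disables += 1  # count the streak once, when it crosses the limit
    return disables
```

Under this model four losses in a row followed by a received packet never disables the robot, while five in a row does, however few packets are lost in total.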

Please ask if there are other questions about the Einstein Report.
Greg McKaskle

techhelpbb
25-08-2012, 03:41
The robot disable occurs when no DS commands have been received for 100ms. The packets are sent every 20ms. So it will take 5 sequential packet losses to trigger a disable. The robot will be enabled as soon as another packet arrives, perhaps as short as 20ms. The Einstein communications, as measured and logged by the DS, were very quiet, almost equal to an ethernet cable, except for a field-wide burst in the final match. This may have been external noise such as a lightning strike. Logs of the Einstein robots during qualifications showed far more interference but no disabling caused by it.


I'm unclear on this:

The shortest time delay in the 5 possible RSL light status patterns is 100ms for the off time of the teleop enabled mode.

So if you miss 100ms of communications, become disabled, then 20ms or even 60ms or 80ms passes before you re-enable from a DS packet, you might not notice the change in the RSL pattern even though you've disabled briefly.

The charts tab in the DS shows when the robot is enabled or disabled even for short periods of time.

The DS sends data to the FMS every 100ms and the FMS logs every 500ms in the match review.

So is it possible for the DS to notice that the robot transitioned from enabled to disabled back to enabled between these 100ms bursts back to the FMS and not report the robot state transition because it happened between reporting intervals to the FMS?
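
To make the question concrete, here is a small Python sketch of point sampling. It assumes the reporter forwards only the instantaneous state at each report time and does not latch transitions, which is exactly the assumption being asked about. The function name and the numbers in the note below are illustrative.

```python
def sampled_states(disable_start_ms, disable_len_ms, sample_period_ms, duration_ms):
    """Return the state (True = enabled) seen at each sample instant.

    The robot is enabled except during the half-open interval
    [disable_start_ms, disable_start_ms + disable_len_ms).
    """
    end = disable_start_ms + disable_len_ms
    return [
        not (disable_start_ms <= t < end)
        for t in range(0, duration_ms, sample_period_ms)
    ]
```

A 60ms disable starting at 110ms is invisible when sampled every 100ms (samples land at 100ms and 200ms), but shows up in three samples when sampled every 20ms.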

Greg McKaskle
25-08-2012, 08:43
.. you might not notice ..
Correct. The RSL is a pretty crude indicator of the robot state. Keep in mind that a human blink is at least 100ms. I've also reviewed the logs with the drive coach and shown them brief disables that neither they nor the drivers noticed during a match. I've also seen robot logs, very successful robots, that only process the teleop every 60ms and they seem fine with the rate. In other words, they choose to ignore two out of three control packets even though the CPU usage was quite low.

Actually, the FMS<-->DS comms are at 20ms as well. The FMS logs are somewhat slow from what I've seen -- between 2 and 4 points in a second. The DS reports everything it knows to the field. But at this point, the DS log data is the best indication of what took place on the robot and with the comms.

Greg McKaskle

techhelpbb
27-08-2012, 10:52
Am I correct that the missing packet indicators on the FMS and the lost packet counters in the charts tab of the driver's station are counting only the UDP packets that FIRST is using for DS<->Robot communications? It's clear that the average round trip calculation depends on those packets.

Is there any additional monitoring in place on the current fields to track bottlenecks, lost packets, and other TCP/IP behavior while the field operates besides those counters? I mean besides one of the driver's station operators peeking at that with System Monitor?

Is there any kind of prioritization for the UDP traffic imposed by the field and D-Link AP?

What process is in place to prevent the improper configuration of the Windows TCP/IP stack in the driver's station? Specifically with respect to TCP sliding windows and window scaling (http://www.tcpipguide.com/free/t_TCPWindowSizeAdjustmentandFlowControl.htm)?

I ask these questions because of situations where UDP packet traffic sees the unintended side effects of TCP bottlenecks. The effect that concerns me is discussed at length in this link:
Characteristics of UDP Packet Loss: Effect of TCP Traffic (http://www.isoc.org/inet97/proceedings/F3/F3_1.HTM)

If we can see 100 UDP packets disappear during a match in a very tame wireless environment, how much TCP bottlenecking (and packet loss) is really going on impacting the Smart Dashboards and TCP based web cameras?

You can write software to get to all the raw counters you can see in the System Monitor on Windows like this:
Raw performance data class (http://msdn.microsoft.com/en-us/library/windows/desktop/aa394340%28v=vs.85%29.aspx)

It's not clear to me that the driver's station is using the Windows API to collect the lost packet information.

Though even if you did use that source of information you could only monitor with respect to the TCP/IP stack of the driver's station. I suppose using the UDP packets to track performance like this was easier than modifying the D-Link AP to run DD-WRT or OpenWRT and passing back its TCP/IP status statistics to the driver's stations and back to the field. Keep in mind that the cRIO can't see the traffic generated by the other devices not addressed to the cRIO and plugged into the D-Link AP switch (the D-Link AP doesn't seem to support any kind of port tap to bypass the switch and I doubt it would be wise to ARP poison it).

Also which IP stack for VxWorks is in the cRIO: the BSD stack or the Interpeak stack? The older BSD stack source code supports the features that concern me as can be read here:
Wind River VxWorks TCP/IP stack (http://www.codeforge.com/read/150599/tcp_input.c__html)

It sounds from the description of the buffer configuration above it might have the Interpeak stack in it?
I'm curious to see if there's more than RFC2581 in that TCP/IP stack for congestion control.

Brian

Racer26
27-08-2012, 17:59
Even after FIRST's thorough investigation, all but one case of likely FCA cases were only considered "likely."

I find it disturbing that you're prepared not only to diagnose a robot failure as a complex problem based on minimal evidence, but also ready to indict an individual, about whom you know exactly one thing, of match-fixing at the highest level.

As the earlier poster mentioned: The only "confirmed" case was the one admitted to.

The "likely" cases are ones that the Einstein committee (18 industry experts, plus the 12 teams, 548 included) agreed were reasonably likely to ALSO be caused by the FCA exploit, based on evidence available in terms of match video and DS logs, and the circumstantial evidence of multiple eyewitness accounts stating that they had viewed the individual punching away on the Galaxy Nexus phone at a screen containing numbers of the teams on field at various points throughout Einstein, one reporter distinctly remembering 1114 being targeted. Many people seem to be overlooking (or at least glazing over for the purposes of peacekeeping) this part of the report, when in actuality, it's relatively damning.

As for the OTHER cases, outside of Einstein? Nobody investigated them in a proper investigation (at least not yet, to my knowledge, I read in this thread that FiM is conducting something related to MSC), so we may never know for sure.

The match video I've found exhibits the same symptoms as those seen on Einstein and documented in the Einstein report as what would be visible to an astute observer watching a video from a distance (a flashing RSL indicating robot power is still present, and a Flashing alliance station wall light indicating a lack of communications with the robot). I fully agree that these alone do not a complete diagnosis of FCA make.

However, there is the circumstantial evidence that a 548 mentor was tampering with the system in one admitted case, plus several other likely cases, according to 42 experts (18 industry + 2x12 team reps). This individual was presumably intending to influence the outcome of the matches, and that makes it reasonably believable to me that these other matches I can find with FCA-like symptoms, being cases where the robot(s) being disabled would give a distinct advantage to 548, would probably also be attributable to the FCA exploit.

Am I ready to indict this individual of match fixing at the highest-level? YES! They ADMITTED to that, and that's why they're no longer welcome at FIRST events! However, yes, I further believe that they fixed many more matches than they've admitted to, and I know I'm not alone in that belief.

As I stated in my earlier post though, I hold no grudge against 548, because I'm willing to take the TEAM at their word that this INDIVIDUAL was acting ALONE and without the team's knowledge. It's not fair to the present and future students of 548 to have to be punished for something a mentor of theirs did sometime in the past. They're a 3-time district chairman's award team. They are doing good things, and the kids whom they're trying to have an impact on don't deserve to be chastised by the community at large for the actions of someone they trusted. I'm sure they're probably MORE devastated than the rest of the FIRST world, since this mentor violated their trust and damaged their team's hard-earned image as a leader in possibly irreparable ways.

Greg McKaskle
27-08-2012, 19:39
Edited to condense the questions. Answers marked with ***'s.
---------------------
Am I correct that the missing packet indicators on the FMS and the lost packet counters in the charts tab of the driver's station are counting only the UDP packets that FIRST is using for DS<->Robot communications?
*** Yes. The trip time and lost packets refer to the control/status loop between DS and robot.

Is there any additional monitoring in place on the current fields to track bottlenecks, lost packets, and other TCP/IP behavior while the field operates besides those counters? I mean besides one of the driver's station operators peeking at that with System Monitor?
*** If those other aspects impact the control/status loop, then the CSA, inspector, or FTA will use other system tools to determine what is causing the problem. The DS monitors a few cRIO factors such as CPU.

Is there any kind of prioritization for the UDP traffic imposed by the field and D-Link AP?
*** In 2012, default settings were used. The report indicates that QOS may be configured in coming seasons.

What process is in place to prevent the improper configuration of the Windows TCP/IP stack in the driver's station? Specifically with respect to TCP sliding windows and window scaling (http://www.tcpipguide.com/free/t_TCPWindowSizeAdjustmentandFlowControl.htm)?
*** Nothing except for overall monitoring of the control/status loop. If that is working poorly, the CSA, inspector, or FTA may decide to look at TCP configuration, but honestly, that is getting pretty obscure.

If we can see 100 UDP packets disappear during a match in a very tame wireless environment, how much TCP bottlenecking (and packet loss) is really going on impacting the Smart Dashboards and TCP based web cameras?
*** That is less than one packet per second. If TCP is having issues and retransmitting, that will likely impact the UDP and the FTA or others would look into it.

It's not clear to me that the driver's station is using the Windows API to collect the lost packet information.
*** It is not. If you believe that information would be helpful instead or in addition, I'm sure it can be added.

Also which IP stack for VxWorks is in the cRIO: the BSD stack or the Interpeak stack?
*** I don't have a cRIO with me, so I can't answer your question.

Greg McKaskle

techhelpbb
28-08-2012, 12:14
If those other aspects impact the control/status loop, then the CSA, inspector, or FTA will use other system tools to determine what is causing the problem. The DS monitors a few cRIO factors such as CPU.


What tools besides the Windows Performance/System Monitor and ping are available to everyone to diagnose such a situation?

Traceroute won't do much good considering the robot is bridged.

Ping works to some extent because it's ICMP echo on layer 3; while it's wrapped in IP, it's not really TCP or UDP. So if you start ping toward the cRIO you'll see the congestion that is impacting TCP and UDP, which are on layer 4. Unfortunately, if you ping the cRIO from the driver's station you'll see the congestion but not necessarily at which point in the communications path the congestion exists. In fact ICMP has not just the ability to detect congestion, it also has the ability to throttle inbound traffic that causes congestion of the local receive buffer with the source quench (http://tools.ietf.org/html/rfc1122#section-3.2.2.3) message, which, if the sender responds to it (and it should), should cause it to back off.

To my knowledge current Microsoft Windows TCP/IP stacks honor source quench requests if they are doing the sending but do not generate them when they receive. Instead it's common for devices that have filled their input buffers to simply drop packets. This behavior appears to be the same in the older VxWorks stack(s).

VxWorks BSD TCP stack (http://www.codeforge.com/read/75229/tcp_subr.c__html)

/*
* When a source quench is received, close congestion window
* to one segment. We will gradually open it again as we proceed.
*/


If someone were to create some ICMP source quench packets they could throttle the remote senders back when their receive buffer is almost full, completely full or just because they need to alter the status of the network communications. For example, they could force a sender with a large window to reduce it ASAP so it doesn't impact other traffic. (There's nothing stopping a DiffServ QoS as described directly below from using the ICMP source quench in its own way. No idea which specific hardware FIRST might use for the QoS function so no idea if ICMP source quench will be present or how it will be used.)

In 2012, default settings were used. The report indicates that QOS may be configured in coming seasons.

I presume FIRST will implement DiffServ which is stateless? I suppose one could use Intserv but that requires reservations (RFC2210) to operate and while it's possible VxWorks in the cRIO could pull that off the Axis cameras do not support it. Axis themselves did confirm the cameras support differentiated services code point (DSCP - RFC 2474) per function (audio, video, alarm...)
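
For anyone who hasn't dealt with DSCP marking: the 6-bit code point sits in the upper bits of the IPv4 TOS byte. A minimal Python sketch, assuming a plain UDP socket and the Expedited Forwarding code point (46) as an example choice; nothing here reflects how FIRST or Axis actually configure it, and some operating systems may ignore this option unless a QoS policy allows it.

```python
import socket

DSCP_EF = 46  # Expedited Forwarding code point, used here only as an example


def dscp_to_tos(dscp):
    """Place a 6-bit DSCP value in the upper six bits of the TOS byte."""
    if not 0 <= dscp <= 63:
        raise ValueError("DSCP must be in 0..63")
    return dscp << 2


def open_marked_udp_socket(dscp=DSCP_EF):
    """Create a UDP socket that asks the OS to mark outgoing packets."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_TOS, dscp_to_tos(dscp))
    return sock
```

The marking alone does nothing. It only matters if a device along the path is configured to classify and queue traffic by DSCP.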

DiffServ has some end to end issues that are worthy of noting:

1. DiffServ doesn't track all statistics for all open flows (plus side it needs less space to operate, downside it isn't as aware of long term quality issues).

2. DiffServ in high packet loss situations tends to give you the choice to scalp one class to get additional bandwidth for another class (but can't be sure that in the long term that the scalped class is assured bandwidth either).

3. DiffServ has a compensation class but that only helps you if you have some idea of the limits of the uncertainty in the network and if the devices can handle that behavior. In short, if you give the compensation class a large amount of bandwidth to pull from it'll allow more flow to another class to make up for a shortage, but if the packet loss is high you need to make the compensation class larger and that still won't assure that high packet loss over a short period won't reduce the ability of traffic to flow at all.

This is only made worse because the TCP sliding windows and window scaling I noted above will very likely not be smart enough to differentiate between a congestion issue and a packet loss. This congestion issue with TCP has existed for a very long time. It's the equivalent of a hole the size of a dime and the need to pass a half dollar. Sure, if you grind the half dollar down long enough it'll fit through the hole, but it's going to be unpleasant. The solutions to this problem are called TCP congestion avoidance algorithms. The choice of which algorithm you use can have a dramatic impact on your network performance. TCP-Vegas (http://en.wikipedia.org/wiki/TCP_congestion_avoidance_algorithm) as implemented in DD-WRT, OpenWRT, Linux and BSD can more effectively respond to packet loss from congestion and packet loss from the radio layer on largely unidirectional links (i.e. the TCP video is a large amount of bandwidth headed one way). It is unclear to me at this time if Axis cameras, Microsoft Windows, or VxWorks support TCP-Vegas in their TCP/IP stacks. Keep in mind that sometimes you need to tune the queues with TCP-Vegas.

On the one hand a DiffServ QoS unit sitting on the field side will throttle back the senders it can actually communicate with over a short duration which should impact their maximum flow rate over the longer duration. On the other hand, when the packet loss of the network due to WiFi issues pops through (or someone causes packets to be dropped at the radio level) the senders can't see the QoS unit on the field side so they'll resort to their TCP congestion avoidance algorithm of choice. With TCP-Reno/New Reno (http://en.wikipedia.org/wiki/TCP_congestion_avoidance_algorithm) (we are certainly using one of these now) depending on how that sits it could still cause flooding moments after a packet loss. A handy example:

Performance Evaluation of TCP Variants In WiFi Network Using Cross Layer Design Protocol and Explicit Congestion Notification (http://airccse.org/journal/jwmn/0611wmn09.pdf)


Nothing except for overall monitoring of the control/status loop. If that is working poorly, the CSA, inspector, or FTA may decide to look at TCP configuration, but honestly, that is getting pretty obscure.

Why do an analysis at all? Why not just set up the field and optimize the TCP-IP parameters as a baseline for the acceptable Windows OS for the driver's stations and the cRIO? Then distribute those settings in a simple 'registry file' export from RegEdit or RegEdt32 for installation or comparison.

Probably should also note the following:

1. It is very unlikely the FIRST driver's station will need to become the local master browser. One could turn that feature off in the LanMan parameters in the Windows registry. Even if the students take that laptop off the field and use it elsewhere it would only really be an issue on a network in which they are the only Windows computer and no Samba is running. This NetBIOS feature serves no purpose in the current field and robot systems but it'll generate a handy election and quite likely use NetBIOS over TCP-IP to do it. As an alternative one could turn off NetBIOS over TCP-IP, which does have a GUI option to change. If anyone is interested in more details look at the Samba (http://www.samba.org/samba/docs/man/Samba-HOWTO-Collection/NetworkBrowsing.html#id2581357) project.

2. IPv6 ought to be turned off, especially in Windows 7. Windows 7 has a perverse tendency to use IPv6 first and IPv4 later. Not only does IPv6 have so many security concerns that I could fill a book, I don't think any device we have supports it unless FIRST is using the InterPeak stack configured for it in the cRIO. I'd be interested to know if there is IPv6 usage in the FIRST ecosystem.

One could easily take this a step further. They could write a simple program or even a script to back up the relevant local Windows system registry entries, make these changes in preparation for the driver's station function, then return the original settings when the driver's station is done. In point of fact one could even set a System Restore point but that might get a bit out of hand with regards to storage since you only need to alter a trivial number of keys. On the plus side a program could easily locate the keys to disable the IPv6 protocol for the wired adapter you'll be using.
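
The back-up, change, restore pattern is simple enough to sketch in Python. In this sketch a plain dict stands in for the registry so the logic can be shown and tested anywhere; a real tool on Windows would read and write through something like the standard `winreg` module, and the name `temporary_settings` is made up for the example.

```python
from contextlib import contextmanager

_MISSING = object()  # marks a key that did not exist before the change


@contextmanager
def temporary_settings(store, changes):
    """Apply changes to a dict-like store, then restore the originals.

    store is a stand-in for the registry; changes maps key -> new value.
    """
    backup = {key: store.get(key, _MISSING) for key in changes}
    store.update(changes)
    try:
        yield backup
    finally:
        # Runs even if the body raises, so the old values always come back.
        for key, old in backup.items():
            if old is _MISSING:
                store.pop(key, None)  # key was newly created: remove it
            else:
                store[key] = old
```

Running the driver's station inside a `with temporary_settings(...)` block would put the original values back when it exits, including values that did not exist beforehand.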

That is less than one packet per second. If TCP is having issues and retransmitting, that will likely impact the UDP and the FTA or others would look into it.

In a 135 second long match (2 minutes, 15 seconds) you're absolutely correct that if the UDP packet loss of 100 packets per match was distributed evenly that would be less than one packet per second. However, as we agree, your enable/disable timer in the cRIO will time out in 100ms. So you can lose 3-4 driver's station generated UDP packets with the enable/disable state in them in a row before the robot runs the 100ms timer out and disables. In 1 second you have fifty 20ms intervals. In theory, if the timing is perfect (and let's face it, the Windows TCP-IP stack will not reliably send those UDP packets precisely every 20ms, and the latency of the link to the cRIO will impact the timing), you can lose 40 of those UDP packets per second and still not be disabled, and you have 135 seconds in which you could do that.

The reason for the math is that if you look at the link below again with my concerns, the TCP sliding window and window scaling functions have their effect over a duration that can easily be in seconds (see Figure 2-5 in link below). So it's possible for the trouble to start, build, drop a UDP packet or a bunch, cycle back, start, build, drop another UDP packet or a bunch. Meanwhile the entire time packets are dropping and devices are making their choices of congestion avoidance process (each algorithm has a set of processes at work) during and after each packet drop. This happens not just because of radio level issues but also because of congestion, and with TCP-Reno/New Reno there really is no way to tell the difference unless there is some mitigation inserted like ICMP source quench.
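
The arithmetic can be checked with a short Python sketch. It assumes perfectly regular 20ms sends and that a disable needs five consecutive losses (100ms divided by 20ms), so the worst case without a disable is four lost, one received, repeated. The function name is made up for the example.

```python
def worst_case_loss(match_s=135, period_ms=20, timeout_ms=100):
    """Return (packets sent, most that can be lost without a disable)."""
    sent = match_s * 1000 // period_ms        # packets sent in the match
    run_to_disable = timeout_ms // period_ms  # consecutive losses that disable
    # Worst case: lose run_to_disable - 1 packets, receive one, repeat.
    full_cycles, remainder = divmod(sent, run_to_disable)
    lost = full_cycles * (run_to_disable - 1) + min(remainder, run_to_disable - 1)
    return sent, lost
```

With the default figures this gives 6750 packets sent and up to 5400 lost, which is the 40 per second above multiplied over 135 seconds. For comparison, 100 lost packets spread evenly over the match is about 0.74 per second.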

Characteristics of UDP Packet Loss: Effect of TCP Traffic
(http://www.isoc.org/inet97/proceedings/F3/F3_1.HTM)

It is not. If you believe that information would be helpful instead or in addition, I'm sure it can be added.

I think the best visual representation of that data is a difference from data point to data point from the original start values of the TCP statistics to the final values over time. Windows System Monitor pretty much already provides this facility. At least it's something to help people diagnose their own issues. Perhaps highlighting its value will be helpful to some people.
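
A minimal Python sketch of that point-to-point difference, assuming the cumulative counters have already been collected into timestamped snapshots by some other means. The function name and the counter name used in the note below are placeholders, not actual Windows counter names.

```python
def counter_deltas(snapshots):
    """Turn cumulative counter snapshots into per-interval differences.

    snapshots: list of (timestamp_s, {counter_name: cumulative_value}).
    Returns a list of (elapsed_s, {counter_name: increase}).
    """
    deltas = []
    for (t0, before), (t1, after) in zip(snapshots, snapshots[1:]):
        # Only compare counters present in both snapshots.
        change = {name: after[name] - before[name] for name in after if name in before}
        deltas.append((t1 - t0, change))
    return deltas
```

Given snapshots of a retransmission counter reading 10, 10 and 17 at one second intervals, the deltas are 0 and then 7, which makes the moment the trouble started stand out in a way the raw totals do not.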

Unfortunately, neither the driver's station charts tab nor the Windows local TCP/IP stack statistics show the end users precisely where along the communications path congestion or momentary packet loss occurs, in the same way that the average round trip time doesn't represent the instantaneous round trip time, or even the time to get to the robot versus the time to get to the driver's station. The devices that can currently best determine whether the trouble is due to packet loss in the radio layer or congestion at the wired sides of the radio links are the APs. There is no facility in TCP-Reno/New Reno itself to calculate round trip time (RTT) that would reveal that data is disappearing at any given moment (it uses a timer, usually 200ms-500ms, hence the several-second escalation). Even if there were an IntServ QoS unit in the field side of the communications path, it wouldn't be able to determine the cause of packet loss from that vantage point (IntServ QoS does actually keep information over time about the flows through it). In a way, the UDP traffic DS<->robot with the round trip timer represents an addition over TCP-Reno/New Reno that could better arbitrate issues, but that driver's station UDP traffic is both too slow to really force down the TCP sliding windows and window scaling for its own benefit (I write this in relation to the UDP/TCP link I posted above, so if it's not apparent please reread that link) and doesn't implement ICMP source quench that I can see. If the driver's station implemented ICMP echo (aka 'ping') and ICMP source quench, you could interleave the UDP packets and ICMP echo requests and monitor congestion on layer 3 and layer 4 to every IP device on the robot (to which you could then send individual ICMP source quench messages).
Of course, if DD-WRT or OpenWRT were on the robot AP, we could not only send the statistics back to the driver's stations and through them to the field, we could also use TCP-Vegas as the TCP congestion avoidance algorithm under the right circumstances, which would need to include support on the Cisco end. The big drawback of TCP-Vegas that I know of shouldn't matter to a FIRST system: TCP-Vegas doesn't play well in live routed environments if you change the routes, because that action invalidates the round trip data.

Windows does have Compound-TCP as a congestion avoidance algorithm since Windows Vista and it's been partially backported, and Linux has support for it since 2.6.17 but I'm not sure it works for Linux at this time. So far as I know it's disabled in Windows Vista and up by default. It can be enabled like this:
How to enable CTCP (http://www.tomstricks.com/how-to-enable-ctcpcompound-tcp-and-ecn-explicit-congestion-notification-in-windows-vista/)

CTCP helps pull down the TCP-Reno congestion avoidance algorithm by maintaining 2 windows. Again, it's something else on the driver's stations to consider. Also please take note that you might have to enable more than CTCP, and the timestamps might be an issue for the devices on the robot. The older VxWorks stack seems to support it, so the newer InterPeak stack should as well. The bad part of CTCP is that it would more appropriately mitigate the Windows driver's station's effect on the network, but it wouldn't necessarily impact all the other devices, the congestion avoidance algorithms they'll use, and the issues those cause. Unlike the field and robot APs, this CTCP algorithm can only affect the communications between the Windows driver's station and whatever else it talks to, so usually the cRIO and FMS. So in effect it takes one source of trouble out of the picture but leaves the rest, which will use whatever TCP congestion avoidance algorithm they like with whatever consequences might follow.

I would have suggested TCP-Cubic, which is implemented in Linux Kernel 2.6 and backported to 2.4, but there's an issue with it that concerns me regarding its remaining ability to burst after a packet is lost over the radio. Without testing it's hard to say, but the nature of all the video data going back to the driver's station might favor TCP-Cubic. I'm just concerned that it's not going to behave as well as it could if TCP-Reno/New Reno remains in operation on the same network, and that may not be entirely avoidable. Mind you, TCP-Reno/New Reno on the same network as TCP-Vegas will still be unfair to TCP-Vegas, but from what I've seen not quite as bad (of course, that's on wire).


*SUPER SIMPLE VERSION:*

You have a hole big enough for a dime but you want to put a half dollar through it. So you grind up the half dollar and put it through that hole little by little. You have a choice of processes to make sure that as much of the half dollar gets through that hole as quickly as possible. I'm merely suggesting a different way to react to losing little pieces of the half dollar, which you eventually find. I think the way it's handled is slower than it needs to be to get just as much of the half dollar through that hole.

Just to make that even more interesting more than one person is trying to send their own half dollar through that same hole at the same time.

The half dollar is your data.
The hole is your network.
The multiple people are the multiple network devices.
There are lots of reasons you all lose some of the ground up half dollars (which are the packets).
There are different solutions (TCP congestion avoidance algorithms).

Now I'm hiding all my half dollars before someone wants a demo.

Brian

techhelpbb
29-08-2012, 12:23
I'm no longer able to edit the post above. So I'll append additional information like this:

If someone reads the section above where I described DiffServ they'll probably wonder why I keep suggesting field side QoS. The D-Link AP supports some QoS in the form of WMM. The Cisco 1252 supports some QoS in the form of 802.11e. As far as I can tell that was not turned on this year or any previous year. The reason I've discounted it as an option is that a full DiffServ implementation already ignores a lot of information to reduce the resources it requires to perform QoS. 802.11e and WMM ignore even more information. On a wire-based DiffServ implementation there are technologies that look at packet flows not tagged with DSCP by the network devices sending them, or that can guess at the proper QoS class for traffic from the source or destination information in the packets (like Cisco NBAR). This information is not acted upon with 802.11e or WMM because it would require serious stateful packet inspection and the resources that usually demands. So even if someone turned on 802.11e and WMM, the devices that send to the AP link would need to tag their packets with DSCP so the APs on either end would know what class the traffic is supposed to be. Such tagging is supported by the Axis camera as noted above. I see no reason that VxWorks can't do the DSCP tagging either. However, any other network devices that might send traffic over the wireless link would be questionable. For example, the COTS rules allow a laptop on the robot, and that laptop might be sending video, images, or even streaming from VideoLAN back to the driver's station. Such a laptop generally wouldn't have an easily added ability to tag its traffic with DSCP. The 802.11e and WMM devices would have to default that untagged traffic as classless, and the contention mitigation process they use in the QoS might therefore be unfair to the people that implement such solutions. Worse, it's hard for FIRST to know whether or not you'll use certain types of traffic.
For example you might not use a video feed back to the driver's station. So it would be harder on FIRST to explain why they might blindly impact the performance of your robot devices to enforce a QoS policy with 802.11e and WMM that considers devices you don't actually have. FIRST could alter the QoS parameters on the robot AP and the field AP if your robot doesn't use certain classes but then that adds further convolution to the configuration process for both (and remember a lot of people cycle through the fields with their robots).
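For anyone wondering what "tagging with DSCP" looks like from the sending side, here is a minimal Python sketch using the standard sockets API. The DSCP value sits in the upper 6 bits of the old IP TOS byte, so it gets shifted left by 2 before being set. The function names are mine for illustration, and this is an assumption-laden sketch: some operating systems (Windows in particular) may ignore or override this socket option unless QoS policy allows it, so check on the wire before trusting it.

```python
import socket

DSCP_EF = 46  # Expedited Forwarding code point (RFC 3246)


def dscp_to_tos(dscp):
    """Convert a 6-bit DSCP value to the 8-bit TOS/traffic-class byte."""
    if not 0 <= dscp <= 63:
        raise ValueError("DSCP must be between 0 and 63")
    return dscp << 2  # the low 2 bits are reserved for ECN


def open_tagged_udp_socket(dscp):
    """Open a UDP socket whose outgoing packets request the given DSCP.

    Whether the tag actually appears on the wire depends on the OS and
    its QoS policy; this only asks for it.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_TOS, dscp_to_tos(dscp))
    return sock


if __name__ == "__main__":
    print("EF as a TOS byte:", hex(dscp_to_tos(DSCP_EF)))
```

This is the easy part. The hard part described above is that every device sending over the radio link has to do it, and the APs have to be configured to act on it.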

To extend my concerns about DiffServ to 802.11e and WMM. In the most abstract simple sense 802.11e and WMM categorize the classes defined by the DSCP tagging into a smaller number of classes. Those classes exist before the radio layer. So the radio layer doesn't really change. It's just that the data presented to the radio layer to communicate is prioritized differently than it might have been with a single queue going into the radio layer. So now the radio can spend more time trying to effectively communicate certain classes of traffic over certain other classes of traffic. Again, if the radio layer is interrupted or has a difficult time finding time to send certain classes of traffic the QoS function of the APs has to decide what to do with all the data coming into it that it has no way to rid itself of. Most devices as noted above will simply start ignoring and therefore dropping inbound packets. That packet dropping behavior triggers one of the TCP congestion avoidance algorithms mentioned above. Which in turn means that: if you have several devices in different DiffServ classes lumped into a smaller number of categories the devices within each smaller number of categories will impact each other's access to that category of service going over the radio link. Essentially the QoS process will really only help some devices rise above others when the communications link actually can pass traffic. Since DiffServ isn't completely aware of all flow statistics it may not notice that it's robbing from one class to give priority to another class when you account for packets lost in the radio layer (someone that knows exactly what I'm talking about will consider MAC parameter tuning to mitigate that but that's tuning and that's a lot of extra considerations for a lot of different robot designs). These caveats justify why the TCP-Variant (TCP-Reno/New Reno, TCP-Vegas, TCP-Cubic, Compound-TCP) still matters even with 802.11e and WMM.
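Here is a short Python sketch of the "smaller number of classes" point. The user priority to access category table is the usual 802.11e/WMM one. The step from DSCP to user priority (taking the top 3 bits) is an ASSUMPTION: it's a common default on APs, but real devices may map differently or let you configure it. The names are mine for illustration.

```python
# 802.11e/WMM: 802.1D user priority (0-7) to one of four access categories.
UP_TO_AC = {
    1: "AC_BK", 2: "AC_BK",   # background
    0: "AC_BE", 3: "AC_BE",   # best effort
    4: "AC_VI", 5: "AC_VI",   # video
    6: "AC_VO", 7: "AC_VO",   # voice
}

# A few standard DSCP code points.
DSCP = {"CS0": 0, "CS1": 8, "AF11": 10, "AF21": 18,
        "AF31": 26, "AF41": 34, "EF": 46, "CS6": 48}


def access_category(dscp):
    """Map a DSCP value to a WMM access category.

    ASSUMPTION: user priority = top 3 bits of the DSCP value, which is a
    common default but not something every AP is guaranteed to do.
    """
    return UP_TO_AC[dscp >> 3]


def group_by_category(names):
    """Group DSCP class names by the access category they land in."""
    groups = {}
    for name in names:
        groups.setdefault(access_category(DSCP[name]), []).append(name)
    return groups


if __name__ == "__main__":
    for category, members in sorted(group_by_category(DSCP).items()):
        print(category, members)
```

Under that assumed mapping, eight DiffServ classes collapse into four radio queues, and classes sharing a queue (AF41 and EF, for example) compete with each other exactly as described above.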

So basically I figure implementing 802.11e and WMM represents way more work than FIRST is willing to put up with during field operations (and worse, really tuning that would be a field by field, robot by robot process, so it's even more painful than it sounds). More of that work could be absorbed in a field side wire implementation of DiffServ for QoS. Of course, if something at the application layer (the driver's station software) were only smart enough to look back down the links to all the network devices (which it can get because the teams can configure it as such), one might be able to craft a flow control process that works even without any QoS device. Such a flow control process could use ICMP source quenches as outlined above and could be designed to be aware of the priority of communications for specific network device functions on the robot as dictated by the application layer uses (so a driver's station would carry all the specific tailor-made TCP/IP traffic priorities for a robot, and the teams could start to figure out those priorities without a field to work with).

The process of using ICMP source quench is a primitive form of Explicit Congestion Notification (ECN). For about 20 years it's been known that messages such as ICMP source quench can be used to cause issues for a device that supports them. From the perspective of Internet security, allowing anonymous ICMP source quenches to slow down your ability to send data obviously is not the best idea. Most firewalls and routers for the Internet allow one to block ICMP entirely, including both echo (aka 'ping') and source quench (there's no reason it can't be used in a private network or by FIRST as long as it works). In comparison, there's a newer standard, less available to the application layer, defined in RFC3168 and called specifically Explicit Congestion Notification. I did not suggest that anyone use the newer ECN even though it's been around for more than 10 years (though my instructions to turn on CTCP above mention how to enable the newer ECN in Windows Vista and up). There are a bunch of issues with the newer ECN, and some of them impact the performance of and connectivity to D-Link devices and certain websites. Opening a hole in a driver's station firewall for ICMP source quench on the local network is one thing (odds are good your local Internet routers will block attempts to pass it to a driver's station while it may be connected to the Internet). Leaving strange behavior in the driver's station that might pseudo-randomly appear in other use is quite another matter, and that might happen with the newer ECN. Another upside of implementing ICMP in the driver's station software is that you can use both echo and source quench just by doing that, so it's likely to be more compatible and quicker to develop.

On a separate topic from what I just wrote about but still related to my post directly above:

It might benefit FIRST to log the information from the Windows local TCP/IP stack in the driver's station software as I suggested and provided information to do above. This would enable FIRST to have an authoritative set of entries in the match log synchronized with the driver's station software communications with the field and robot. A log that is also retained by the same process as the current DS charts and logs. In that previous post I only considered the visualization of that data when it really does matter if you can get that data in a reliable way long after the matches end.

Again on a separate topic from what I just wrote about but still related to my post directly above:

Also, as a bit of a footnote, I suggested disabling IPv6 above on the driver's station. Please be aware that if you disable IPv6 and try to create a new Windows 7 Home Group, it'll fail. Windows 7 Home Groups use Peer Name Resolution Protocol (PNRP), and that requires IPv6. You can disable and re-enable IPv6; it'll just break Home Group's ability to resolve names while it's off (there's still the cache). I doubt any device in the FIRST ecosystem depends on a Windows 7 Home Group, and there are plenty of better ways to move your stuff without resorting to that.

Brian

Greg McKaskle
30-08-2012, 07:56
Sorry it took so long to reply. Had to read all the links ...

Backing up a few steps, I think the first thing is to monitor and log what goes on. At an event, things get a bit simpler since FIRST owns the AP and accompanying management SW and also has an Airtight. Channel level monitoring was already common, and I suspect the displays will be enhanced to include bandwidth monitoring per team.

In a build situation, it is more difficult, but if the team AP on the robot is instrumented, that info can be used when not on the field.

Is there a TCP bottleneck problem? It doesn't seem common. Perhaps with better measures we will know for sure. Until then, if it isn't broke, let's not fix it.

For QOS, FIRST is still looking at options, and allowing experts to help with the selection. I'll try to get the experts to include your input.

Greg McKaskle

techhelpbb
30-08-2012, 12:02
Sorry it took so long to reply. Had to read all the links ...

Backing up a few steps, I think the first thing is to monitor and log what goes on. At an event, things get a bit simpler since FIRST owns the AP and accompanying management SW and also has an Airtight. Channel level monitoring was already common, and I suspect the displays will be enhanced to include bandwidth monitoring per team.

In a build situation, it is more difficult, but if the team AP on the robot is instrumented, that info can be used when not on the field.


I presume you mean with SNMP support on the robot AP and a suitable MIB to identify the OIDs (I know I've been asked about that MIB a few times as people try to work this out). For those reading this who don't know: the Object Identifiers (OIDs) are fields in a table of collected status data about the device, and the Management Information Bases (MIBs) define what those fields are. I don't think the DAP-1522 supports RMON (custom traps) even if it does support SNMP and telnet. On the plus side, using SNMP would at least give you some statistical information about what the robot AP can see, unless you lose access to that interface during the polling. With the matches being 135 seconds long and some devices having limits on polling speed, I would assume that one would try to SNMP poll frequently enough that the inability to do custom internal polling isn't an issue, that short packet losses wouldn't leave no data collected for the entire match, and that the polling doesn't slow the network with its own resource demands. If you did have an SNMP (RMON if you have it) trap for the OIDs that effectively indicate radio link losses, you'd only be telling yourself that there's trouble with the radio after the radio link recovers. On the plus side, such a delayed trap notification might be better than waiting for a long SNMP external polling timer to expire; on the downside, if the SNMP external polling is frequent enough, it's probably not worth wasting the robot AP's time on that. One could also expand the existing DS<->Robot communications systems that use UDP to send only the data they want, when they want it, how they want it. Looking back at my previous 2 posts above, I had suggested collecting much of the same data in a custom fashion, with a script on routers that support that.

Without even modifying the DS software one could already enable the SNMP service in Windows and use one of the many SNMP managers/monitors like Dart (http://www.dart.com/snmp-free-manager.aspx).

SNMP is a UDP-based protocol on layer 4. Turning this on has the benefit/downside of generating UDP traffic that contains far more data than the DS<->robot UDP packets. Just doing that SNMP polling will help push down the TCP congestion from the TCP sliding windows and window scaling going toward the robot AP from the field side. Just don't poll too much or you'll basically flood the network, and the congestion on the robot side going into the robot AP will get worse. Of course, if you had SNMP to the devices on the robot, then the effect of pushing back TCP would extend into the robot, because this traffic is bidirectional and it sends back to the field more data than is required to start the process.

Though again: the upside of sending more UDP traffic is that the congestion has less effect on the UDP packets you send (per that link I keep pointing back to). If you send more traffic for the critical UDP functions in the DS<->Robot communications they'll likely get through a congested link more often. One can send more traffic by making the UDP packets larger (say filling them with statistical data...minding the observations about performance and larger UDP packets) or by sending them more often (reduce the 20ms timer between the UDP packets to 5-10ms...you could still use autonomous/teleop modes the same way and still time out enable in 100ms). So while SNMP does offer the same capability this is a distinguishing point to make.
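To put a rough number on "send them more often", here is a small Python sketch. It makes a big simplifying ASSUMPTION that I'll flag loudly: each packet is lost independently with the same probability. Real radio loss is bursty, so treat the results as an optimistic illustration, not a prediction. The 100ms timer and the 20ms/10ms/5ms intervals come from the text; the 50% loss figure is invented for the example.

```python
WATCHDOG_MS = 100  # enable/disable timer from the discussion above


def packets_in_window(interval_ms, watchdog_ms=WATCHDOG_MS):
    """Number of packets sent during one watchdog period."""
    return watchdog_ms // interval_ms


def p_window_all_lost(loss_prob, interval_ms, watchdog_ms=WATCHDOG_MS):
    """Chance that every packet in one watchdog period is lost.

    ASSUMPTION: losses are independent with probability loss_prob each.
    Bursty loss on a real radio link will be worse than this.
    """
    return loss_prob ** packets_in_window(interval_ms, watchdog_ms)


if __name__ == "__main__":
    loss = 0.5  # invented, deliberately terrible link
    for interval in (20, 10, 5):
        print(interval, "ms interval:",
              packets_in_window(interval), "packets per window,",
              "P(all lost) =", p_window_all_lost(loss, interval))
```

Even on that invented coin-flip link, going from 20ms to 5ms takes the chance of a fully lost 100ms window from about 3% to under one in a million, at the cost of four times the UDP traffic.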

The only additional concern I would add is to make sure someone changes the SNMP community password (string) on the robot AP. On the field it's not a big issue, but off the field it could be used to craft a DoS by over-polling.

Additionally on this subject, even if the robot AP doesn't support RMON the Cisco 1252 does (there may be caveats to this support for instance it might not work in the older VxWorks firmwares).

Is there a TCP bottleneck problem? It doesn't seem common. Perhaps with better measures we will know for sure. Until then, if it isn't broke, let's not fix it.

That's fair enough. I do see a number of people reporting issues with the Smart DashBoard and Webcam performance which are both TCP but there's no absolute proof that TCP congestion caused by issues on the field causes that to happen more on the field than in private environments. Proper data collection would go a very long way to figure this out.

I should also note that reducing the number of packets lost in the radio layer will seriously improve the situation. If the radio link quality improves, there will be more immunity to the sort of short jamming interruptions in communication that I know are possible and that AirTight won't alert about. From AirTight's perspective, a short interruption is not what most people perceive as a denial of service (DoS) attack. Normally if you lose a few packets your web page loads a little slower or your video quality goes down. Most people don't expect to use the full bandwidth of their radio link in 135 second intervals. Even expertly configured radio links lose information; it's not generally a desired trait, and as all this shows it can be very complicated to deal with the consequences of losing that information. So in this regard, efforts on the part of FIRST to tighten the detection net for the radio link and to improve the quality (antennas, robot AP placement, etc.) go a long way to throttle back on the unusual and obscure network efforts needed to compensate. It's not like we can ask the robots to not move while we make adjustments during a match. (Attach antenna to arm and make program to find best signal. Robotic rabbit-ear adjuster.)

Obviously with so many TCP/IP implementers using the IETF RFC standards and via that implementing TCP-Reno/New Reno (Microsoft (default in XP TCP-SAck which is very close, Vista and up offer TCP-TSAck as well), Mac OSX (it's a default), Axis, VxWorks, BSD (it's a default), Linux being notable exceptions as they go beyond that with TCP-Cubic) they are probably doing that for a reason. The reason is that, across a wide range of bidirectional network traffic carried on wire in a wide variety of circumstances, TCP-Reno/New Reno behavior is a good compromise when congestion occurs (TCP-Westwood/Westwood+ is just a tweak to TCP-New Reno, TCP-SAck adds selective acknowledge to reduce retransmission, and TSAck uses timestamps (not to be confused with CTCP)). The only reason I'm suggesting otherwise is that this is a specific set of circumstances. Personally I dislike when I see people turn this stuff on without a good idea of what it might help and what it might hurt in their specific circumstances. (Once in a while there's a fuss about how DD-WRT, OpenWRT and Tomato advertised TCP-Vegas and how people used it without a clue about its specifics just because it was a hot topic.)

As a backup for my advocating of TCP-Vegas for this mobile robot application consider this:
VEGAS: Better Performance Than Other TCP Congestion Control Algorithms on MANETs (http://www.cscjournals.org/csc/manuscript/Journals/IJCN/volume3/Issue2/IJCN-139.pdf)

For QOS, FIRST is still looking at options, and allowing experts to help with the selection. I'll try to get the experts to include your input.

I appreciate it.
Sorry about the length of the posts; it's a whole lot of detail in a small space.

BTW you might want to show someone this:
Cisco End of Sale / End of Life Announcement - 1250 Series (http://www.cisco.com/en/US/prod/collateral/wireless/ps5678/ps6973/end_of_life_notice_c51-681596.html)

If FIRST is interested in replacing that Cisco 1252 here's a quick suggestion to consider:

6 individual APs of the same make and model as the robots use (nice and modular).
All the APs (field and robot) running DD-WRT.
Enable SSH to configure them all.
Configure TCP-Vegas across the wireless link between them (pay attention to where the TCP endpoints are).
Tune the queues (they often default to 1000 packets) using floods of UDP and TCP individually and together.
Keep in mind that small or huge queues are not a great idea, especially considering 802.11n packet aggregation.
Try disabling channel bonding (there are 6 robots, 4 channels in 5GHz, reduce the contention).
Consider that disabling channel bonding might impact the queue changes.
Instrument them all with custom code.
Make up for any missing features in the managed switch on the field side.
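On the queue tuning item, here is a quick Python sketch of why a 1000-packet default queue matters. The 1000-packet figure is from the list above; the 1500-byte packet size and the link rates are ASSUMPTIONS I picked for illustration, and the function name is mine.

```python
def queue_drain_ms(queue_packets, packet_bytes, link_mbps):
    """Time in milliseconds to drain a completely full queue.

    link_mbps * 1000 is the link rate in bits per millisecond.
    """
    bits = queue_packets * packet_bytes * 8
    return bits / (link_mbps * 1000)


if __name__ == "__main__":
    QUEUE_PACKETS = 1000   # common default mentioned above
    PACKET_BYTES = 1500    # ASSUMPTION: full-size Ethernet payloads
    for mbps in (50, 100, 200):   # ASSUMPTION: illustrative link rates
        print(mbps, "Mbps:",
              queue_drain_ms(QUEUE_PACKETS, PACKET_BYTES, mbps), "ms")
```

With those assumed numbers, a full 1000-packet queue takes 120ms to drain at 100 Mbps, which is already longer than the 100ms enable/disable timer. A packet stuck at the back of that queue is late even though nothing was technically lost, which is the reason to tune the queue length rather than accept the default.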

If someone would like me to demonstrate what I outlined above: it's easy enough so just ask.

Brian

qnetjoe
31-08-2012, 13:56
BTW you might want to show someone this:
Cisco End of Sale / End of Life Announcement - 1250 Series (http://www.cisco.com/en/US/prod/collateral/wireless/ps5678/ps6973/end_of_life_notice_c51-681596.html)

If FIRST is interested in replacing that Cisco 1252 here's a quick suggestion to consider:

6 individual APs of the same make and model as the robots use (nice and modular).
All the APs (field and robot) running DD-WRT.
Enable SSH to configure them all.
Configure TCP-Vegas across the wireless link between them (pay attention to where the TCP endpoints are).
Tune the queues (they often default to 1000 packets) using floods of UDP and TCP individually and together.
Keep in mind that small or huge queues are not a great idea, especially considering 802.11n packet aggregation.
Try disabling channel bonding (there are 6 robots, 4 channels in 5GHz, reduce the contention).
Consider that disabling channel bonding might impact the queue changes.
Instrument them all with custom code.
Make up for any missing features in the managed switch on the field side.

If someone would like me to demonstrate what I outlined above: it's easy enough so just ask.

Brian


I just want to caution everyone about going down this road with regards to the wireless system; there are a million tangents that you can take a wireless system design, but you will need to be very methodical and listen to the larger scale requirements. The 2015 Control System RFP is a good read just to understand the larger issues that the field has to face.

Section WRC1 states "Capable of controlling 4 co-located active fields with up to 6 robots on each field." This means that we cannot have 6 independent access points per field, because even with 20MHz channel widths there are only 20 non-overlapping channels. On top of that, the channels in the middle of the 5 GHz band are required to use dynamic frequency selection (DFS) and transmit power control (TPC), because that is the same band used by weather radar and military applications. It would be unfair for any team to be on one of these channels when other teams can use non-DFS/TPC channels. There are only 9 non-DFS non-overlapping channels (in the US).
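The channel budget can be written out as a few lines of Python. All four numbers come from the paragraph above; the function names are just for illustration.

```python
FIELDS = 4                    # co-located active fields (WRC1)
ROBOTS_PER_FIELD = 6          # robots per field (WRC1)
NON_OVERLAPPING_20MHZ = 20    # 20MHz non-overlapping channels in 5 GHz
NON_DFS_CHANNELS = 9          # of those, channels not subject to DFS/TPC (US)


def aps_needed(fields, robots_per_field):
    """One independent AP (and channel) per robot."""
    return fields * robots_per_field


def shortfall(needed, available):
    """How many channels short we are (0 if there are enough)."""
    return max(0, needed - available)


if __name__ == "__main__":
    needed = aps_needed(FIELDS, ROBOTS_PER_FIELD)
    print("channels needed:", needed)
    print("short using all channels:",
          shortfall(needed, NON_OVERLAPPING_20MHZ))
    print("short using non-DFS only:",
          shortfall(needed, NON_DFS_CHANNELS))
```

One channel per robot needs 24 channels, which is 4 more than exist at 20MHz and 15 more than the non-DFS set.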

We all have to remember that there is a big difference between a concept/prototype and production. FIRST needs to have a production grade wireless system. 10% of my job is prototyping and the other 90% is taking a prototype and turning it into something production grade. I recommend that if you are going to go down this road you need to have a good model, preferably something based on the OSI model. At the end of the day FIRST can and should only do two things:

* Provide a rock solid production grade media layer (OSI Layers 1-3)
* Provide a method for detecting issues in the host layer (OSI Layers 4-7)

I really think that this thread has moved away from the original purpose of 548's Einstein Statement into a topic about wireless design. If this is something that you would like to talk about further, can you create a new thread?

techhelpbb
31-08-2012, 14:10
I just want to caution everyone about going down this road with regards to the wireless system; there are a million tangents that you can take a wireless system design, but you will need to be very methodical and listen to the larger scale requirements. The 2015 Control System RFP is a good read just to understand the larger issues that the field has to face.

Section WRC1 states "Capable of controlling 4 co-located active fields with up to 6 robots on each field." This means that we cannot have 6 independent access points per field, because even with 20MHz channel widths there are only 20 non-overlapping channels. On top of that, the channels in the middle of the 5 GHz band are required to use dynamic frequency selection (DFS) and transmit power control (TPC), because that is the same band used by weather radar and military applications. It would be unfair for any team to be on one of these channels when other teams can use non-DFS/TPC channels. There are only 9 non-DFS non-overlapping channels (in the US).

We all have to remember that there is a big difference between a concept/prototype and production. FIRST needs to have a production grade wireless system. 10% of my job is prototyping and the other 90% is taking a prototype and turning it into something production grade. I recommend that if you are going to go down this road you need to have a good model, preferably something based on the OSI model. At the end of the day FIRST can and should only do two things:

* Provide a rock solid production grade media layer (OSI Layers 1-3)
* Provide a method for detecting issues in the host layer (OSI Layers 4-7)


Fair enough but:

802.11n implements Clear Channel Assessment (CCA) to mitigate busy channels. The robots move, so their proximity to the other end of the radio link changes. That could be ignored, which would be a bad thing, or it could be exploited to improve things. One could also use more directional antennas to adjust some of this. Not to mention that in DD-WRT you can sometimes change the radio output power, and sometimes without rebooting the AP (depends on manufacturer). 802.11n uses multipath, so if the antenna placements were better perhaps FIRST wouldn't need so much transmit power, because of the improvement in the ability to receive. Moreover, DD-WRT allows you to adjust the threshold for the CCA (assuming device support for this adjustment). This would be handy as well if you know you've got channel overlap, and in this case we are lucky enough that we know it might be there.

In the current 802.11n system, the maximum bandwidth of 300Mbps (actual throughput will be 60-70% of that) at 5GHz is achieved with radio channel bonding (which I advised turning off). I should have been clearer above about why there are only 4 300Mbps communications channels available.

The only way you can't have overlap with channel bonding on is if you only use a maximum of 2 radio channels per 4 fields using multiple SSID. Then you have contention at the network level because the radio layer will be time shared between 6 robots. This is the tradeoff FIRST made already but all radio configurations were available to them as they control both ends during a match. The layer 3 and 4 network traffic beyond the UDP DS<->Robot is beyond FIRST's control to a much greater degree. Field side QoS will help to a point. Robot side QoS will just restrict what can be sent and when but in a stand alone environment things might be very different so how is anyone to test? I would think that someone could design their robot to be much less fair to the others using that contention for the radio resources with just multiple SSIDs on a dual channel radio layer especially as it is now (even if there are VLAN bandwidth limits). Even with multipath in the current environment with the robot APs as they are (badly placed) there's risk for hidden nodes (one robot checks to see if another is transmitting and doesn't get a clear reception so it transmits at the same time causing a collision).

Also if so much as one additional network is created that uses channel bonding you have overlap. Never mind adjacent channel interference which you will have if you use 8 of 9 radio channels. 802.11ac with quad radio channel bonding isn't going to improve this situation either. Something running 802.11ac as a hidden node and ignoring 'good neighbor' because it can't receive the transmit from a moving robot would be a real pain.

In that regard:

What happens when you use 802.11n radio channels next to each other in the radio spectrum and physically too close together:
Reinvestigating Channel Orthogonality - Adjacent Channel Interference in IEEE 802.11n Networks (http://hwl.hu-berlin.de/fileadmin/user_upload/documents/aci_80211n.pdf)

How close together is close, what is the effect on UDP (important for DS<->Robot), and what happens if you turn down
the radio output power (the results of this can be used to mitigate that link at the top of this list):
Understanding the Effects of Output Power Settings When Evaluating 802.11n Reference Designs (http://www.quantenna.com/pdf/80211nPowerSettings.pdf)

Why channel bonding is not always the best idea:
The Impact of Channel Bonding on 802.11n Network Management (http://www.cs.ucsb.edu/~laradeek/CoNEXT11.pdf)

One can mitigate the issue of proximity to a wireless radio and overlapping channels (not to mention adjacent channels) with nearby similar networks (well within the distances FIRST is subject to) by reducing the output power of the radios (manually, by script, or frequently by code all of which are options with DD-WRT). It works in 802.11n on 5GHz and if you start reading this attached thesis from Rutgers you'll save me a lot of time typing because all the justification for my statement is basically there (start on page 45 to save yourself time). If one looks at the graphs they'll see that in the tests the writer saved a fair amount of power (always handy for a battery powered robot) and still maintained radio throughput with UDP traffic (the traffic that FIRST's DS<->Robot communications depends upon) but the effective range of the communications was reduced (handy if you have fields near each other). So by extension of this information, in an environment that is not adapting my point is: with proper antenna placement one should be able to reduce the radio power and address the concerns you've presented (in fact it's highly probable with such controls you could have even more than 4 fields). After all, even with these channel limits the reason these devices sell as they do is that the signal doesn't extend such great distance that you couldn't litter these units in nearby homes and not even notice them from the perspective of one home to the next.

Adaptive Transmit Power Control Based on Signal Strength and Frame Loss Measurements For WLANs (http://mss3.libraries.rutgers.edu/dlr/showfed.php?pid=rutgers-lib:26644)
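The idea of stepping transmit power up or down from measurements can be sketched as a single control step in Python. To be clear, this is not the algorithm from the thesis, just a minimal illustration of the concept. Every threshold below (target RSSI, hysteresis, loss limit, step size, power limits) is an assumed placeholder that would have to come from real measurements.

```python
# One step of a simple adaptive transmit power loop.
# NOT the thesis algorithm; all default thresholds are assumptions.

def next_tx_power_dbm(current_dbm, rssi_dbm, loss_ratio,
                      target_rssi_dbm=-65.0, hysteresis_db=5.0,
                      max_loss=0.05, step_db=1.0,
                      min_dbm=1.0, max_dbm=20.0):
    """Return the transmit power to use for the next interval.

    rssi_dbm   -- signal strength the far end reports for our frames
    loss_ratio -- fraction of frames lost in the last interval (0..1)
    """
    if loss_ratio > max_loss or rssi_dbm < target_rssi_dbm:
        # Link is struggling: speak up a little.
        proposed = current_dbm + step_db
    elif rssi_dbm > target_rssi_dbm + hysteresis_db:
        # Link has plenty of margin: quiet down to disturb neighbors less.
        proposed = current_dbm - step_db
    else:
        # Inside the comfort band: leave the power alone.
        proposed = current_dbm
    # Never leave the range the radio (or the rules) allow.
    return max(min_dbm, min(max_dbm, proposed))
```

A script could call this once per interval and push the result to the radio. Frame loss is checked before RSSI because a strong signal with heavy loss usually means interference rather than distance.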

Which is worse? What I suggested isn't all that hard to test.

Additionally, there's nothing stopping anyone from using multiple SSIDs on a single device with DD-WRT either. One could work up a fingerprint of robot bandwidth and pair teams or alliances off to fewer than 6 field APs based on that metric. In fact, such an automatically field-tested bandwidth requirement fingerprint might be handy for a bunch of reasons beyond its value for that (instead of leaving everyone looking at each other...just push the report to their driver's station for later review).
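As a rough illustration of pairing robots to APs by a bandwidth fingerprint, here is a small greedy grouping in Python. The team labels and Mbps figures are invented for the example, and the greedy "heaviest robot to the least loaded AP" rule is just one simple choice, not something FIRST is known to use.

```python
# Greedy grouping of robots onto a smaller number of field APs,
# balancing measured bandwidth. Team names and numbers are made up.

def assign_robots(bandwidth_mbps, n_aps):
    """Return (groups, loads): robots per AP and total Mbps per AP."""
    groups = [[] for _ in range(n_aps)]
    loads = [0.0] * n_aps
    # Place the hungriest robots first; break ties by name for repeatability.
    ordered = sorted(bandwidth_mbps.items(), key=lambda kv: (-kv[1], kv[0]))
    for team, mbps in ordered:
        # Pick the AP carrying the least traffic so far.
        ap = min(range(n_aps), key=lambda k: (loads[k], k))
        groups[ap].append(team)
        loads[ap] += mbps
    return groups, loads


fingerprint = {"A": 6.0, "B": 5.0, "C": 4.0, "D": 3.0, "E": 2.0, "F": 1.0}
groups, loads = assign_robots(fingerprint, 3)
print(groups, loads)
```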


I really think that this thread has moved away from the original purpose of 548's Einstein Statement into a topic about wireless design. If this is something that you would like to talk about further can you create a new thread?

I'm fine with that as well. I'm even open to this conversation in private.

I just want to add that, as of this post, we are now discussing exactly the sort of balance I already wrote to FIRST about in private. So, while this has ventured into great detail and engineering matters, it did not venture away from the issue of what happened on Einstein or, for that matter, what *may* have happened elsewhere. I hope that if someone has issues with my points or my point of view they'll discuss it with me. It's better to be respectfully challenged than to be above all challenges. Also, my apologies for the crazy way I've had to edit all this. I did not originally intend to present a formal thesis of my own, and it took some time for me to adjust the presentation of my ideas. I am still quite happy to demonstrate that this works, and I'm going to leave this topic at this point. Thanks for your time.

Al Skierkiewicz
04-09-2012, 14:01
Everyone should check out the Cisco site for actual tech specifications and in situ performance for the field AP. This unit is designed to cover the entire floor in large buildings with typical coverage of up to 375 ft and with well over 100 users connected. First engineering, DEKA engineers and the wireless consultants all have extensive experience with the units used.
If anything, there should be a caveat to teams to mount the radio away from large metal objects, near the outside of the robot, with a secured power connector and without having robot appendages move against the case while operating. Just this year alone, I have found teams with the radio mounted on the bottom of the robot, or behind the bumper supports, or underneath or behind 2"x4" box tube that is part of an appendage, or with the radio sandwiched between two pieces of metal plate. I even found one team that had constructed an aluminum box out of perf stock to "protect" the radio.

Jon Stratis
04-09-2012, 14:31
Now there's an idea Al... can we just surround the field + field AP with a giant Faraday cage? That should resolve any and all concerns stemming from interference from outside sources!


Note: I'm not really being serious here :)

techhelpbb
04-09-2012, 14:59
Everyone should check out the Cisco site for actual tech specifications and in situ performance for the field AP. This unit is designed to cover the entire floor in large buildings with typical coverage of up to 375 ft and with well over 100 users connected. First engineering, DEKA engineers and the wireless consultants all have extensive experience with the units used.
If anything, there should be a caveat to teams to mount the radio away from large metal objects, near the outside of the robot, with a secured power connector and without having robot appendages move against the case while operating. Just this year alone, I have found teams with the radio mounted on the bottom of the robot, or behind the bumper supports, or underneath or behind 2"x4" box tube that is part of an appendage, or with the radio sandwiched between two pieces of metal plate. I even found one team that had constructed an aluminum box out of perf stock to "protect" the radio.

I have no doubt that the Cisco 1252 has excellent range; I've used them outside of FIRST. I have no doubt that the experts here would respond based on the data they've collected. In the end, the question really becomes: by what evidence does one determine that the radio power settings used are appropriate at any moment in the system's operation (the field and the robots being the system)? There's a high density of APs in a relatively small area, so no matter what they'll impact the radio performance of each other. It's not clear to me that anything currently measures even RSSI (the thesis linked in my previous post detailed why RSSI alone isn't really the most thorough indicator with regard to radio power control). So really I'm not sure how anyone can know how the power of that radio signal is behaving moment to moment with the various robots involved during a match. I've never seen the experts go around to the robots adjusting them to mitigate their concerns like they could do with the fields during setup. One would have to actually do that because the robot APs are all on the same channel and they can be very close or 50 feet apart from one another near moving parts like rotating metal shooter hoods.

In an infrastructure environment (by far the most common usage of this technology) usually the transmitter power being set as high as possible without it causing other issues is a good thing. Even in an ad hoc network the number of devices actually moving is usually small. In this environment that may not be the case considering the devices being designed to move around. I can easily see that there's a balancing act where the power level one wants is just enough to do the job while keeping the side effects minimized and that power level will change moment to moment (it may be more power than is being used now, but I suspect it's actually less power than is used now a great deal of the time). Logging of the relevant factors seems critical as otherwise I can't see how anyone can have the data to determine the fit of the solution.

Al Skierkiewicz
04-09-2012, 15:33
Now there's an idea Al... can we just surround the field + field AP with a giant Faraday cage? That should resolve any and all concerns stemming from interference from outside sources!


Note: I'm not really being serious here :)
Isn't that Battlebots?



Brian,
You are implying that FIRST and/or its vendors have not made these measurements or know the RF levels on the field. I have to ask, no demand, that you cease making statements based solely on your own experience without any knowledge of what is taking place on FIRST fields. All you are doing is seeding doubt in the minds of those who have no experience in the field. No matter how long your posts, in my mind you are simply throwing rocks at FIRST. The engineering staff has made these measurements, they know what the coverage contour is on fields, and they know the fade margins caused by objects, robots and people on or near the field.

While the RF output level of the Cisco router is adjustable, as you know, setting devices to maximum is rarely the best solution, depending on the environment. There is no doubt that the RF level is sufficient to reach 50 ft. However, with outside interference, it is not the transmit power but the receiver sensitivity that needs to be considered. In normal environments, high RF levels are likely to saturate the receivers, causing front end overload and intermod products in the demod process. The Cisco device is capable of making more than one watt ERP with the antennas currently used. I have worked the world on less than that, often achieving distances of greater than 800 miles on about 0.5 watt ERP, not calculating for losses in antenna lobes, ground, transmission cable or atmospherics.
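For readers who want to put rough numbers on "one watt ERP" and "50 ft", here is a small Python sketch of the usual unit conversion and the free-space path loss formula. It is only an idealized estimate: free-space loss ignores reflections, robot bodies and antenna patterns, and the 5.5GHz carrier and the -70 dBm receiver sensitivity are assumptions picked for illustration, not measured values from any FIRST field.

```python
import math

SPEED_OF_LIGHT_M_S = 299_792_458.0


def watts_to_dbm(watts):
    """Convert power in watts to dBm (decibels relative to 1 milliwatt)."""
    return 10.0 * math.log10(watts * 1000.0)


def free_space_path_loss_db(distance_m, freq_hz):
    """Ideal free-space loss between two isotropic antennas."""
    return 20.0 * math.log10(4.0 * math.pi * distance_m * freq_hz
                             / SPEED_OF_LIGHT_M_S)


def link_margin_db(tx_dbm, distance_m, freq_hz, rx_sensitivity_dbm):
    """Received level minus the weakest level the receiver can decode."""
    received_dbm = tx_dbm - free_space_path_loss_db(distance_m, freq_hz)
    return received_dbm - rx_sensitivity_dbm


fifty_feet_m = 50 * 0.3048            # 50 ft expressed in meters
tx_dbm = watts_to_dbm(1.0)            # one watt ERP
loss_db = free_space_path_loss_db(fifty_feet_m, 5.5e9)   # assumed carrier
margin_db = link_margin_db(tx_dbm, fifty_feet_m, 5.5e9, -70.0)  # assumed sensitivity
print(round(tx_dbm, 1), round(loss_db, 1), round(margin_db, 1))
```

The margin printed here is the headroom left over for fades and obstructions under these assumptions; a real figure needs the actual receiver sensitivity at the data rate in use.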

techhelpbb
04-09-2012, 16:03
Brian,
You are implying that FIRST and/or its vendors have not made these measurements or know the RF levels on the field. I have to ask, no demand, that you cease making statements based solely on your own experience without any knowledge of what is taking place on FIRST fields. All you are doing is seeding doubt in the minds of those who have no experience in the field. No matter how long your posts, in my mind you are simply throwing rocks at FIRST. The engineering staff has made these measurements, they know what the coverage contour is on fields, and they know the fade margins caused by objects, robots and people on or near the field.


With the deepest respect, that is/was literally impossible on a continuous, quality-maintaining basis. The Einstein reports clearly indicate that not even the logs from the Cisco were available for review during testing. That was also a long report, but it still didn't address this point either.

Also, you failed to consider that the fade margins are entirely dependent on robot AP placement or it would be impossible for a team to find a placement of the AP that would interfere with communications. There is no way for FIRST to predict all robot designs sufficiently to test on that level.


While the RF output level of the Cisco router is adjustable, as you know, setting devices to maximum is rarely the best solution, depending on the environment. There is no doubt that the RF level is sufficient to reach 50 ft. However, with outside interference, it is not the transmit power but the receiver sensitivity that needs to be considered. In normal environments, high RF levels are likely to saturate the receivers, causing front end overload and intermod products in the demod process. The Cisco device is capable of making more than one watt ERP with the antennas currently used. I have worked the world on less than that, often achieving distances of greater than 800 miles on about 0.5 watt ERP, not calculating for losses in antenna lobes, ground, transmission cable or atmospherics.

1. The robot APs also transmit.

2. The robot APs have been positioned in such a way that they don't communicate clearly despite the Cisco 1252's radio output power. It's not a question of could be...there are sufficient examples.

3. All of the APs both field and robot are capable of interfering with each other. Not just when they are on the same radio channel, but when they are on radio channels adjacent to one another. So the radio interference concerns are not just from outside sources.

4. Logging for the relevant issues should be simple enough to do. If one doubts the validity of their position, or values the correctness of their own that's a great way to mitigate both ends of the concerns.

5. I'm not saying that the field AP can't go a further distance, I'm saying that it does not need to. In fact should not unless it's absolutely necessary based on measurements. This goes for the robot APs as well. Right now all evidence supports that the radio power levels are fixed.

6. You are absolutely correct about the radio receiver sensitivity being important. The entire clear channel assessment (CCA) process that allows the robots to be on the same radio channel depends on that sensitivity. In fact, it's adjustable for that reason. It's also impacted by the robot AP placement, regardless of diversity, MIMO, or RTS/CTS. If the receiver can't receive the transmissions on its radio channels, a robot may as well be a rolling jamming device. A robot that can't receive other transmissions it might interfere with, and with a demand to send, will just send on the same radio channels that are probably busy with other robots, causing a collision.
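The hidden node case in point 6 can be shown with a toy model in Python. This is a deliberately simplified picture of clear channel assessment (no backoff timers, no RTS/CTS), and the robot names and the "who can hear whom" table are invented for the example.

```python
# Toy model of clear channel assessment (CCA) with a hidden node.
# Robot names and the hearing table are made up for illustration.

def colliders(on_air, backlog, can_hear):
    """Return the nodes that will transmit on top of `on_air`.

    on_air   -- the node currently transmitting
    backlog  -- nodes that have data queued and want to send
    can_hear -- maps each node to the set of nodes it can receive
    """
    talking_over = []
    for node in backlog:
        if node == on_air:
            continue
        if on_air in can_hear.get(node, set()):
            # CCA works: the node hears the transmission and defers.
            continue
        # CCA fails: the channel looks clear to this node, so it sends.
        talking_over.append(node)
    return sorted(talking_over)


hearing = {
    "robot2": {"robot1"},   # well placed radio, hears robot1
    "robot3": set(),        # radio buried in metal, hears nobody
}
print(colliders("robot1", ["robot2", "robot3"], hearing))
```

In this model the badly placed radio is the only one that collides, which is the "rolling jamming device" behavior described above.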

craigcd
04-09-2012, 16:15
First of all I would like to say that the previous FMS discussions are extremely interesting. I kind’a like the “SUPER SIMPLE VERSION” the best. This is a very impressive analysis and the more I read the more confused I get. That is probably because my experience is not with electronics and code and “stuff”. Secondly they seem to have moved from the original purpose of the apology from team 548. This is all important information and maybe a new thread needs to be started and let the Team 548 Einstein Statement pass into history.

EricVanWyk
04-09-2012, 16:43
First of all I would like to say that the previous FMS discussions are extremely interesting. I kind’a like the “super simple explanations” the best. This is a very impressive analysis and the more I read the more confused I get. That is probably because my experience is not with electronics and code and “stuff”. Secondly they seem to have moved from the original purpose of the apology from team 548. This is all important information and maybe a new thread needs to be started and let the Team 548 Einstein Statement pass into history.

There is nothing in the entirety of the field of engineering that cannot be understood by every single member of this board. Unfortunately, 'engineer-speak' is a fractured language with countless dialects that all use different words for the same meanings. If something seems confusing, ask the person you are talking to to rephrase it into the right dialect for you. Two of my patents are from rephrasing what a mechanical engineer said into electrical engineer speak.

Alternately, you may find that they are trying to hide their own confusion in a cloud of vocabulary and acronyms. Don't be impressed, ask questions.

techhelpbb
04-09-2012, 18:20
First of all I would like to say that the previous FMS discussions are extremely interesting. I kind’a like the “SUPER SIMPLE VERSION” the best. This is a very impressive analysis and the more I read the more confused I get. That is probably because my experience is not with electronics and code and “stuff”. Secondly they seem to have moved from the original purpose of the apology from team 548. This is all important information and maybe a new thread needs to be started and let the Team 548 Einstein Statement pass into history.

In addition to that SUPER SIMPLE VERSION here's another:

The situation (The Town Hall With Musical Chairs):
1. You have 7 blind people in a room and there are possibly 4 rooms.
2. They tend to speak at the same volume and they hear just as well as each other.
3. One of those people is the person that everyone is talking to (the key person).
4. That person stands with their back against the wall of the room looking into the space of the room.
5. The other 6 people move around the room blindly.
6. Everyone is trying to be polite and only speak to this key person one at a time.
7. There are invisible portions of the room that make it harder to hear each other (not only are the sounds from each other's perspective too quiet but the voices are too hard for some other people to hear).
8. Any moving person not heard from in a short period of time must stop moving till they hear someone.
9. When they talk it's basically one sentence of some random length at a time then they stop.
10. If someone can't talk for a while they experience a pile up of sentences they must communicate later.

Knowing this:
1. If they knew they were in one of the quiet invisible portions they could just not talk and hurry out but they can't see that so they might talk when they should not.
2. If they all scream confusion will set in because just before they start screaming they might not hear someone else or they might be hard to understand at that extreme volume or they might disturb nearby rooms.
3. If they all whisper sometimes someone won't be heard when they are talking but at least someone will be heard if they are close enough to the key person.
4. If they all could just find the right volume they could all talk and hear each other but that volume changes as they move and they all move blindly.

My solution:
Let everyone talk at different volumes and adjust their volumes as they move. To do it requires communication about the perceived volume as each person talks. Sometimes someone will talk over someone else, but if they all start off slowly increasing volume between their movements it'll be less often they talk over each other and at least someone will get a clear word in edgewise. As long as the balance between volume changing, movement and time is set properly no one should be stuck anywhere for long or continuously talk over anyone else. Let's refer to that balance as being fair to one another.

A. The key person is the Cisco 1252 field AP.
B. The 6 blind people are the robots and robot APs.
C. The voice volumes are the radio transmit powers.
D. The rooms are the fields.
E. The invisible portions are things that make the robot AP occasionally not receive, or keep other APs from receiving its transmissions.
F. The short period of silence before which the people must stop is the robot enable that times out in 100ms.
G. The sentences they speak of different lengths are the data communicated over the radios.
H. The pile up of data they must send when they can't is a network congestion problem.
I. When more than one person talks by accident at the same time it's a collision.
J. A person in a quiet portion of the room talking because they can't hear someone else talking is a hidden node.
K. The restriction on each person to listen for another talking before they talk is clear channel assessment.
L. If someone screams into that room that would be a jammer but these people talking over each other serve the same function as that jammer would serve.

I'm suggesting they are talking over each other too often right now.

Further to link this super simple example and the other:
The communications in this example dictates the size of that hole from the other example with the half dollars.
If the communications was better perhaps that hole would be quarter sized instead of dime sized.
Then it would be easier in the first place to send the half dollars.
If the hole was a little bigger and the people stuffing the bits of ground up half dollar were more clearly and quickly communicating the whole problem gets easier.


The alternate situation (The Moving Study Hall):
I also suggested before that we have 12 people in the room.
6 key people and 6 moving people.
Each key person talks with one moving person.
In fact, we could strategically place some of the key people against the wall to keep them closer to a certain moving person.
The concern that people have with 12 people is that the volume in the room would disturb the rooms next door.
They can all be in that room and talk at the same time if they control their voice volumes properly.
Sure sometimes they might disturb each other but they already disturb each other in the other example anyway.


The basic conclusion:
I figure some of this is just a question of controlling that 'volume' of the communications either way.
If we can't manage to at least control the 'volume' of those 'voices', I think we should at least record the whole mess so we can find better solutions later.
If we can't record everything, at least record the 'volumes' of the 'individual people' and how it is perceived by 'everyone' else.
At least then we'll know there were 'quiet spots' in the 'room(s)' and 'who' was in those spots at what times.
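As a minimal sketch of what recording the 'volumes' could look like, here is a small Python logger that writes radio samples to CSV and picks out the 'quiet spots'. The field names, the sample values and the -80 dBm floor are all assumptions for illustration; nothing here reflects what the FMS actually records.

```python
import csv
import io
from dataclasses import dataclass


@dataclass
class RadioSample:
    t_ms: int          # time since match start, in milliseconds
    robot: str         # which robot AP the sample is about
    tx_dbm: float      # transmit power the robot was using
    rssi_dbm: float    # how loud the field AP heard that robot


def write_log(samples, fh):
    """Write samples as CSV rows to any file-like object."""
    writer = csv.writer(fh)
    writer.writerow(["t_ms", "robot", "tx_dbm", "rssi_dbm"])
    for s in samples:
        writer.writerow([s.t_ms, s.robot, s.tx_dbm, s.rssi_dbm])


def quiet_spots(samples, floor_dbm=-80.0):
    """Return (time, robot) pairs where the robot was barely heard."""
    return [(s.t_ms, s.robot) for s in samples if s.rssi_dbm < floor_dbm]


samples = [
    RadioSample(20, "robot1", 17.0, -55.0),
    RadioSample(40, "robot3", 17.0, -86.0),   # invented weak reading
]
buffer = io.StringIO()
write_log(samples, buffer)
print(buffer.getvalue())
print(quiet_spots(samples))
```

Even a log this small answers the two questions raised above: who was in a quiet spot, and at what time.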


Sorry this is long but I hope despite its length that it is very easy to relate to.

Greg McKaskle
04-09-2012, 21:19
I'm by no means an 802.11 or antennae expert, and I have seen engineers go for hours on trying to critique a single aspect of a lower-level layer of the OSI model for network communications.

Rather than debate network power, what if I instead just measure the efficiency of communication -- how long does it take for a given amount of data to be communicated. The radio tap header contains the data rate and encoding scheme. If it is low, well it could be for any number of reasons, but if it is high, approaching the theoretical limit, then that must mean that things are clicking along just fine. It isn't that hard to measure or even to log. Unless you have strong evidence that shows signal strength to be a root cause of many robot failures, I don't think this discussion will bear fruit. It can easily eat up many forum pages, but no fruit.

Greg McKaskle

techhelpbb
04-09-2012, 22:05
I'm by no means an 802.11 or antennae expert, and I have seen engineers go for hours on trying to critique a single aspect of a lower-level layer of the OSI model for network communications.

Rather than debate network power, what if I instead just measure the efficiency of communication -- how long does it take for a given amount of data to be communicated. The radio tap header contains the data rate and encoding scheme. If it is low, well it could be for any number of reasons, but if it is high, approaching the theoretical limit, then that must mean that things are clicking along just fine. It isn't that hard to measure or even to log. Unless you have strong evidence that shows signal strength to be a root cause of many robot failures, I don't think this discussion will bear fruit. It can easily eat up many forum pages, but no fruit.

Greg McKaskle

Just a couple of points:

1. The necessary signal strength to balance distance, throughput and interference will always be changing. Creating such a test was already presented in the thesis linked on the last page. Anything less than adapting (even if the adaptation is a shell script making the adjustment once a second) will surely have a shortcoming somewhere. That is just the additional consequence of the movement of the robot APs.

2. It's not just the signal strength from the radio output but the antennas, the antenna placements and the competition for the channels (so which way one divides the 9 available radio channels matters as well as the distances between the users of each channel).

So the only way I can envision finding the optimum or at least the 'good enough' for FIRST is active data collection and response.
I have tried this with a bunch of APs just as a test and it worked fine. However, I'm not sure I consider my experiment to be a great proof of anything other than possibilities.
I didn't design it to be comparative against FIRST just as a demonstration.

I think Cisco now offers per-packet information headers (http://www.cacetech.com/documents/PPI_Header_format_1.0.1.pdf) (PPI headers) for 802.11n instead of radiotap (http://www.radiotap.org/) (see also this (http://netbsd.gw.com/cgi-bin/man-cgi?ieee80211_radiotap+9+NetBSD-current)).

Radiotap offers IEEE80211_RADIOTAP_RATE, but I'm not sure about the active encoding. PPI headers offer rate as well, but aren't these in units of 500kbps? I know they are making additions for VHT to accommodate 802.11ac.

Also are the D-Link 1522s capable of tagging packets with information with the stock firmware?
OpenWRT has some development work for radiotap not so sure about PPI headers.

Not saying I'm against doing this just pointing out pros/cons.
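For anyone who wants to try reading the rate out of a capture, here is a minimal Python sketch that pulls the legacy Rate field from a radiotap header. It relies on the published radiotap layout (little-endian header, "present" bitmask, TSFT then Flags then Rate, with Rate in units of 500kbps). It only handles those first three fields, and as far as I know 802.11n frames usually report an MCS field instead, so treat this as a starting point rather than a complete parser. The function name is made up.

```python
import struct

TSFT_BIT = 0    # 8-byte timestamp, aligned to 8 bytes
FLAGS_BIT = 1   # 1-byte flags
RATE_BIT = 2    # 1-byte rate, in units of 500kbps
EXT_BIT = 31    # set when another 32-bit "present" word follows


def radiotap_rate_mbps(frame):
    """Return the legacy data rate in Mbps, or None if no Rate field."""
    if len(frame) < 8:
        raise ValueError("too short for a radiotap header")
    version, _pad, length, present = struct.unpack_from("<BBHI", frame, 0)
    if version != 0 or length > len(frame):
        raise ValueError("not a valid radiotap header")
    offset = 8
    word = present
    # Skip any extra "present" words; only the first one is used here.
    while word & (1 << EXT_BIT):
        if offset + 4 > length:
            raise ValueError("truncated present bitmap")
        (word,) = struct.unpack_from("<I", frame, offset)
        offset += 4
    if present & (1 << TSFT_BIT):
        offset = (offset + 7) & ~7   # align to 8 bytes
        offset += 8
    if present & (1 << FLAGS_BIT):
        offset += 1
    if not present & (1 << RATE_BIT):
        return None
    if offset >= length:
        raise ValueError("truncated radiotap fields")
    return frame[offset] * 0.5


# Hand-built header: Flags + Rate present, rate byte 108 (500kbps units).
header = struct.pack("<BBHI", 0, 0, 10, 0b110) + bytes([0x00, 108])
print(radiotap_rate_mbps(header))
```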

Greg McKaskle
04-09-2012, 22:20
... So the only way I can envision finding the optimum or at least the 'good enough' for FIRST is active data collection and response. ...

Exactly, and that is what 802.11 participants do. The algorithms for adapting to changes in orientation and interference are part of the standard. They don't just broadcast, but listen, measure, adapt, and communicate status to the AP.

I was using radio tap, but I'm sure there are other standards, and it will continue to evolve and improve as it plays a larger role in our everyday lives.

Greg McKaskle

Alan Anderson
04-09-2012, 22:25
...active data collection and response...

...is built in to 802.11n.

techhelpbb
04-09-2012, 22:36
Exactly, and that is what 802.11 participants do. The algorithms for adapting to changes in orientation and interference are part of the standard. They don't just broadcast, but listen, measure, adapt, and communicate status to the AP.

I was using radio tap, but I'm sure there are other standards, and it will continue to evolve and improve as it plays a larger role in our everyday lives.

Greg McKaskle

Unfortunately not all the 802.11n participants have the features of 'beamforming' found in the newer devices.

This is not to say they don't have layers of responsive adaptation, but as the thesis link demonstrated with Atheros chipset 802.11n development boards, the responsive adjustment of transmit power to strike the best balance is not an existing feature. You can control the transmit power, and it is affected by various existing settings, but not in the manner I'm describing.

Relinked as it's now a page back:
Adaptive Transmit Power Control Based on Signal Strength and Frame Loss Measurements For WLANs (http://mss3.libraries.rutgers.edu/dlr/showfed.php?pid=rutgers-lib:26644)

Perhaps someone could find a device that has those features but that's a whole separate issue. Generally what I am describing is closest to: Aruba Adaptive Resource Management (http://www.arubanetworks.com/pdf/solutions/TB_ARM.pdf) (ARM), Dynamic Radio Management (http://www.extremenetworks.com/libraries/techbriefs/TBDynamicRadioMgmt_1067.pdf) (DRM), Radio Resource Management (http://www.cisco.com/en/US/tech/tk722/tk809/technologies_tech_note09186a008072c759.shtml) (RRM), and anyone else with their own WiFi architecture generally (my apologies Greg if that was your grander point). Of course, if all the devices were from Aruba, Cisco or Extreme we could exploit their infrastructure as they describe, but I figure FIRST is not interested in spending that sort of money considering the robot APs. Also, some of these adaptive infrastructures are probably a bit too slow at minute intervals given the duration of a FIRST match. Again, they usually make the reasonable assumption your AP isn't bolted to a robot and dancing around.

802.11F does provide a channel for similar communications via inter-access point protocol (http://en.wikipedia.org/wiki/Inter-Access_Point_Protocol). Though it's not clear to me if that is currently extended from the Cisco 1252 in any way to mitigate the specific power concerns I've highlighted with any other vendor. I'm sure Cisco's RRM works just great with other Cisco devices (I'm using it right now). However, so far as I know, Cisco currently uses LWAPP (http://en.wikipedia.org/wiki/Lightweight_Access_Point_Protocol), not really 802.11F, for their RRM feature set.

Given I have this feature working as a shell script pulling down the maximum radio power right now on some DD-WRT access points I know that FIRST doesn't need anything very fancy to achieve this basic balance. However, it's certainly not a feature you'll just get with random 802.11n hardware.

EricVanWyk
04-09-2012, 23:58
This sounds like a great conversation to have on the 802.11 board. This thread is the wrong place to have it. This thread is about Team 548's Einstein Statement. This thread is not about your crackpot theories on beamforming or adaptive power control. Please stop hijacking threads.

techhelpbb
05-09-2012, 01:00
This sounds like a great conversation to have on the 802.11 board. This thread is the wrong place to have it. This thread is about Team 548's Einstein Statement. This thread is not about your crackpot theories on beamforming or adaptive power control. Please stop hijacking threads.

This is all about going after me personally again instead of presenting evidence, which I already did you the courtesy of doing, along with offering repeatedly to exit this topic.

My crackpot theories as you put it are backed by PhD level work yours are backed by....vapor.

I'll write this again, The Einstein reports CLEARLY indicated that insufficient logs were kept. I did not make that mistake I merely pointed out where additional logging should be implemented. If you doubt my points log the data and prove it.

Shifting blame like with 548's statement instead of being open and accountable is how this all got started and clearly some of you learned nothing. I've been more than tolerant of some of your blatant and often obvious discrimination. To the rest of you who treated me with some respect thanks for some consideration.

Akash Rastogi
05-09-2012, 01:32
Brian,

Stop instigating more conflicts. It does not help your cause or your image which reflects back on your reliability as a source of information. Instigation and calling someone out does not help you earn respect from readers.

Eric is right, this thread is for one thing and one thing only. Leave it be.

Sincerely,
Akash

techhelpbb
05-09-2012, 01:44
Brian,

Stop instigating more conflicts. It does not help your cause or your image. Eric is right, this thread is for one thing and one thing only.

Sincerely,
Akash

I've offered to exit already repeatedly as I just did and I shall. My image is irrelevant to any point. None of you has the ability to tarnish it with these tactics in any capacity. Just like I hope Team 548 understands that they should never let anyone else tell them who they are or dictate their abilities. I hope they move forward into a bright future.

As to the extra stuff you added the facts are the facts. I did not argue from my authority. I argued based on links and data I provided.
Anyone can not like me all they like but the facts are what the facts are. One can deal with evidence or accept risk.

Al Skierkiewicz
05-09-2012, 08:54
Brian,
As I pointed out, your statements that FIRST and the Einstein weekend experts don't know the RF levels on the field are simply false. While it is true that logs on the 1252 were not retained from the actual Einstein event, there has been a lot of data collected in this area both prior to their initial use and after, most recently during the Einstein weekend. Experiments were performed in situ, with all robots and in various configurations and orientations. While RF levels vary on the field and while robots are moving, there was no specific and repeatable indication that RF levels on the field were or are a contributing factor to data throughput or loss of communications. While it is easy for you to state that there are many APs providing signal and interference on a FIRST field, the fact is that there are very few at 5GHz. In the event that there is an issue, FIRST has a solution to swap out the Dlinks with another device.
You stated that directive, high gain antennas should be used. These devices were discussed and are being evaluated. However, they carry significant issues when used for short distance communications. Side lobes, hot spots and excessive signal may not be able to be controlled through the adaptive processing used in 802.11 especially considering the amount of moving and stationary reflective surfaces on the playing field.
You state that RF levels should be made as high as possible. The 1252 is capable of signal levels in excess of +30 dBm per output. While the 1522 output is less, over the distances covered by these devices even with teams locating the radios inside a robot, the fade margins are greater than 30 dB and typically 50 dB.
You state that we need to take action on RF problems on FIRST fields when in fact RF is not the demon you state it is. You are stating the worst-case scenarios that are used in system design in the harshest environments as the norm for a field that is less than 50 feet in actual signal path distance. The majority of the readers of this forum may take away from these statements that any problems they experience are caused by RF level issues when in fact they are not. If anything, the general reader should take away from this discussion that RF levels, maximum throughput, connectivity, reflections and multipath, changing RF environments and interference have all been thought through by the designers of the 802.11 specifications and associated hardware to make this a robust communications link.
Those that oppose your statements are not attacking you and I hope you realize I am not attacking you. We are merely opposing those statements that mislead, misinform and confuse the readers of this forum. I have seen the tests performed, and the equipment used. I sat in on discussions with the engineers at DEKA, Cisco and with other consultants where all of these things were discussed and some of those people were on the IEEE 802.11 committee. Participants communicated electronically both prior and following the Einstein weekend while we analyzed data collected during Einstein matches and at the test weekend. During the Einstein weekend we actually went as far as to open a 1522 and measure antenna parameters while attempting to detune the antennas as a team might if it opened a Dlink to repair a connector.
While I am not an employee of FIRST, I am defending the organization in this area simply because I am aware of the work they have done and dedication they have shown in making sure the wireless communications work. It is my intent that teams should not immediately jump to the conclusion that the field is at fault and thereby fail to continue to check for other issues they have actually caused on their own robot. That is to say, teams should be checking that power, mounting, software and sensors are all working properly.

I neglected to mention the Qualcomm team's contributions during Einstein testing and discussion. That is a major flaw on my part, sorry guys. The Qualcomm input during field testing was invaluable and those people also were fluent in 802.11.
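
For readers who want to follow the fade margin argument above with numbers, here is a rough sketch of the arithmetic. It is only an idealized free-space estimate, not a measurement from any field. The +30 dBm transmit level and the 50 foot path come from the post above; the 5.2 GHz channel frequency, the -80 dBm receiver sensitivity and the 0 dBi antenna gains are illustrative assumptions, not figures for the 1252 or the 1522.

```python
import math

SPEED_OF_LIGHT_M_S = 299_792_458.0


def free_space_path_loss_db(distance_m: float, freq_hz: float) -> float:
    """Free-space path loss in dB for an ideal, unobstructed path."""
    return 20 * math.log10(4 * math.pi * distance_m * freq_hz / SPEED_OF_LIGHT_M_S)


def fade_margin_db(tx_dbm: float, path_loss_db: float,
                   rx_sensitivity_dbm: float, other_losses_db: float = 0.0) -> float:
    """Margin between the received level and the weakest usable level."""
    received_dbm = tx_dbm - path_loss_db - other_losses_db
    return received_dbm - rx_sensitivity_dbm


distance_m = 50 * 0.3048          # 50 feet, the path length quoted in the post
loss = free_space_path_loss_db(distance_m, 5.2e9)   # 5.2 GHz is an assumed channel
margin = fade_margin_db(
    tx_dbm=30.0,                  # transmit level quoted in the post
    path_loss_db=loss,
    rx_sensitivity_dbm=-80.0,     # assumed placeholder, not a device spec
)
print(f"path loss ~{loss:.1f} dB, fade margin ~{margin:.1f} dB")
```

Under these assumptions the path loss is roughly 70 dB, which leaves a margin in the tens of dB. Enclosure losses, reflections and a different sensitivity figure would all move the result, so treat it as a sanity check on the order of magnitude only.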

techhelpbb
05-09-2012, 13:30
...


Al,

I don't really think that we are coming from all that different a place. I am not making excuses for teams that place their robot AP in a bad location. I am also not suggesting that a bad location for the robot AP should be accommodated.

I am merely suggesting a fully consistent monitored system for the field and robot APs that can detect both the optimal transmitter powers and any unusual cases system wide that cause collisions or congestion (the 2 being related).

I do not deny that FIRST supplemented the logged data for Einstein with the evidence collected for that match or during the report nor do I question the integrity or skill of those people or yourself. I am saying that such data was not in the logs generated and as the reports suggest more logging should be done.

I recognize that having the availability of talent and tools to perform the Einstein analysis measurements was a special case. A unique experience for the teams at Einstein to be sure (upside one learns some things, down side one has to deal with this). I feel that's an experience in troubleshooting that could be somewhat distilled and offer value to other people.

The Aruba (ARM), Extreme Networks (DRM) and Cisco (RMM) technology would perform a similar quantitative system level analysis and radio transmit power management/control (TPM (http://en.wikipedia.org/wiki/IEEE_802.11h)/TPC (http://en.wikipedia.org/wiki/Transmit_Power_Control)) continuously regardless of where the fields were setup or which robot is using them. This is not to say it has to be one of those technologies nor that any of them is a perfect fit for FIRST. I additionally suggested a cheaper alternative in DD-WRT. All of these technologies provide insight into the APs with the perspective to the data I would like everyone to be able to analyze.

I am perfectly willing to accept that if there's doubt of my concerns that other cheaper logging solutions like Greg seems to be offering are a fine compromise. I also don't want FIRST to waste resources fixing a problem they have no real life data on yet. My point was not to demand solutions immediately but to frame concerns leading to exploration of the critical aspects. One may not know what to look for if they don't consider what problems might exist. Even if the consensus is that the Einstein report indicates these problems do not exist on that field at that time with those robots. That does not make a very large test sample considering the growing size of FIRST.

In my opinion the goal is not to just keep turning up the maximum field AP transmit power. The goal is to use no more, and no less power than is required to achieve the field system communications at any time. Personally I suspect that all of the APs already have too high a maximum transmit power setting (that's my opinion). I previously linked a paper on the risks of raising the radio transmit power so I think it would be a bad idea to use the Cisco 1252 turned up to its highest settings. That's a sledgehammer when you need a frequently calibrated and maintained instrument.
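
To make the "no more, and no less power than is required" idea concrete, here is a minimal sketch of a generic closed-loop power adjustment. It is not how Cisco, Aruba or Extreme Networks implement their products; the function name, the target level, the step size and the power limits are all hypothetical values chosen for illustration.

```python
def next_tx_power_dbm(current_dbm: float, measured_rssi_dbm: float,
                      target_rssi_dbm: float, min_dbm: float = 0.0,
                      max_dbm: float = 20.0, step_db: float = 1.0,
                      deadband_db: float = 2.0) -> float:
    """One iteration of a simple transmit power control loop (hypothetical)."""
    error_db = target_rssi_dbm - measured_rssi_dbm
    if error_db > deadband_db:
        current_dbm += step_db    # link is weaker than the target: raise power
    elif error_db < -deadband_db:
        current_dbm -= step_db    # link is stronger than needed: back off
    # Never leave the allowed power range, whatever the measurement says.
    return max(min_dbm, min(max_dbm, current_dbm))


# Example: the far end reports a stronger signal than needed, so power steps down.
power = 15.0
for reported_rssi in (-50.0, -52.0, -55.0):
    power = next_tx_power_dbm(power, reported_rssi, target_rssi_dbm=-65.0)
print(power)
```

The deadband keeps the loop from hunting up and down around the target, and the clamp is what separates this from simply turning the power up to maximum.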

I also would like to clarify that I am not writing that there needed to be a bunch of extra 5GHz networks to consider the existence of these problems. There is sufficient network hardware with just the fields that one can generate interference (adjacent radio channel and hidden node). I did point out before that there is additional opportunity for someone to compound that at any event with a 5GHz capable laptop nearby.

I understand that the selection of directional antennas for the Cisco 1252 at 5GHz may be problematic at these distances. I merely offer that perhaps if MIMO antennas were used on the field AP the side lobes could be better minimized. I'm not sure what selection of MIMO antennas FIRST has for the Cisco 1252 under the circumstances. Perhaps other APs would broaden the available antenna options.


I would really like FIRST to be in a strong position to hand quantifiable measures of radio power and throughput issues to all teams. Having data like this would be a wonderful critical thinking exercise for the teams and reduce the feeling of trial and error. There's not a lot of time for trial and error during real matches, so I'd rather people relocate APs or fix software for a reason. I'm fine with approaching these matters methodically and I don't really think I've asked for significantly more than some slight input into things FIRST may soon monitor and log which is one of their published mitigations from the Einstein report. There's really nothing extreme about what I'm asking considering it would illuminate the real risks of what I've described if they actually exist in this system. If all that monitoring from all those fields and robots shows that these problems are not common then that's a great sample set and definitive.

I don't want people running around looking for ghosts or laying blame or suspicion when the Einstein reports clearly show there are plenty of issues to go around. On topic, I would also rather not have extreme perspectives of people's participation clouding what might be otherwise easy process adjustments, team reputations, or FIRST's reputation. None of it is necessary or scientific.

Taylor
05-09-2012, 13:40
Moderators: A request. I think I've seen it done in the past, if it's too labor-intensive, then disregard.

Can we please split this discussion in two separate threads, one discussing 548's involvement and remarks regarding Einstein 2012, and another thread discussing the non-human interactions, interferences, and possibilities ? I'm sure the Robostangs (as well as many other members of the community) wish to put this incident behind them, but it is hard to do so when this "Team 548"-titled thread is hovering near the top of the portal.

EricH
05-09-2012, 13:52
Moderators: A request. I think I've seen it done in the past, if it's too labor-intensive, then disregard.

Can we please split this discussion in two separate threads, one discussing 548's involvement and remarks regarding Einstein 2012, and another thread discussing the non-human interactions, interferences, and possibilities ? I'm sure the Robostangs (as well as many other members of the community) wish to put this incident behind them, but it is hard to do so when this "Team 548"-titled thread is hovering near the top of the portal.
Split, or lock. Either way, I think it's time this thread disappeared into technical-land.

Brian, Eric, and to some extent, Al. You guys are using rather high-level terms; if you want to do that, that's great, but put it in, say, the control system forum. I don't think I could understand much of what you're saying unless I took extra time to sit down with a copy of "802.11 for Dummies", if such a publication exists, and decipher. Add to that your attacks on each other (and not each other's methods or theories), and if I were a mod I'd have locked or split this thread several pages ago because of the redirect and hostility shown on multiple occasions, as well as this forum rule: ChiefDelphi.com reserves the right to remove a post which does not relate to the topic being discussed in the forum. In addition, ChiefDelphi reserves the right to reorganize discussion forums in order to best serve the majority of our members. (ie: topics may, at a moderators discretion, be relocated to a more appropriate discussion forum, or deleted entirely).

Al Skierkiewicz
05-09-2012, 16:21
I would really like FIRST to be in a strong position to hand quantifiable measures of radio power and throughput issues to all teams.

To what end? Radio output power is not an issue and teams cannot modify the radio to make changes if it were an issue.

I merely offer that perhaps if MIMO antennas were used on the field AP the side lobes could be better minimized.
The antennas in use are MIMO antennas per the 802.11 standards and produce the least side lobing of any antenna available from Cisco for this radio. They are vertical dipoles and essentially have a uniform horizontal dispersion. Higher gain antennas do have side lobes and could prove problematic if for no other reason than the deep nulls between lobes. The antennas mount on standard TNC connectors and the 1252 is designed such that the case of the box becomes the ground plane for the antennas.

techhelpbb
05-09-2012, 16:36
To what end? Radio output power is not an issue and teams cannot modify the radio to make changes if it were an issue.


For the simple reason of finding the optimal placement of the robot AP.
Not to adjust the radio output power levels themselves.

Cisco refers to the transmit power control technology I'm describing as dynamic transmit power control (DTPC). It's part of the features of Cisco's CCX in their product line. Link to save space (http://wirelessccie.blogspot.com/2009/07/tpc-vs-dtpc-vs-world-mode.html). This is the same CCX technology that offers protection from the WiFi management frame hacks I mentioned previously in this topic.

Thanks for the information on the antennas, but these are omnidirectional antennas you are describing, correct?
I thought the discussion was about a more directional antenna with a reduced side lobe.
I was thinking more along the lines of a MIMO panel antenna or are you describing the inside of a panel antenna?

Jon Stratis
05-09-2012, 16:52
For the simple reason of finding the optimal placement of the robot AP.
Not to adjust the radio output power levels themselves.


Is there any doubt that the optimal location for the robot AP is going to be high up, away from metal frame components, and away from motors?

Speaking for a team entering its 7th year... we've followed that general guideline as much as possible, and have never had problems with field connection. Is there anything more that is really needed for teams? We don't need to make this more complicated than it absolutely needs to be...

techhelpbb
05-09-2012, 17:07
Is there any doubt that the optimal location for the robot AP is going to be high up, away from metal frame components, and away from motors?

Speaking for a team entering its 7th year... we've followed that general guideline as much as possible, and have never had problems with field connection. Is there anything more that is really needed for teams? We don't need to make this more complicated than it absolutely needs to be...

The only remedial option a team might have is to improve their AP placement and connect it properly.
The reason for my concerns is more than just merely to serve that purpose.
It's one benefit.

Also, 2012 is a prime example that putting the D-Link AP near a rotating assembly with metal in it for say a shooter, might not be a great idea. That might be the top of the robot and the D-Link AP might be far away from motors.

The other benefit is to have a log of more information about the field should someone have any concerns. Such information about a wide variety of robots and fields could be used to find strange AP behavior or determine if there was any other source of interference (just examples).

Jon Stratis
05-09-2012, 17:37
For the simple reason of finding the optimal placement of the robot AP.
Not to adjust the radio output power levels themselves.


The only remedial option a team might have is to improve their AP placement and connect it properly.
The reason for my concerns is more than just merely to serve that purpose.
It's one benefit.


Now I'm confused... those seem like two contradictory statements to me. The only thing teams have the ability to influence is their radio placement and wiring on the robot, and it seems to me that they already have enough information to optimize that as best they can. As for your other concerns... I think you've explained them, and I think others with more insight into the inner workings of FIRST have pretty clearly indicated that those concerns are things that FIRST is looking at or has looked at.

Playing chicken little with the wireless setup as you've been doing here on CD doesn't really help. All it can do is get teams worried and concerned, with no useful way to alleviate those concerns. It's clear that you're a knowledgeable and passionate person from your posts here, and I would encourage you to direct that passion in a constructive way with the appropriate audience.

techhelpbb
05-09-2012, 19:01
Now I'm confused... those seem like two contradictory statements to me. The only thing teams have the ability to influence is their radio placement and wiring on the robot, and it seems to me that they already have enough information to optimize that as best they can. As for your other concerns... I think you've explained them, and I think others with more insight into the inner workings of FIRST have pretty clearly indicated that those concerns are things that FIRST is or has looked at.

Playing chicken little with the wireless setup as you've been doing here on CD doesn't really help. All it can do is get teams worried and concerned, with no useful way to alleviate those concerns. It's clear that you're a knowledgeable and passionate person from your posts here, and I would encourage you to direct that passion in a constructive way with the appropriate audience.

It's true there's nothing more a team can do than move their robot AP to improve its radio signal quality, besides perhaps limiting the robot's range of movement on the field (that would be a real worst case).

Pages ago (page 11 first post from me at the top) I suggested teams use the GPIO and I2C to flash status information with LEDs while on the field. That would help teams find problems even if they can't communicate to the robot. Obviously the Einstein reports make it clear that a fair amount of problems this could help diagnose remain.
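
As an illustration of the LED idea, here is a minimal sketch of turning a status code into a blink pattern that can be read off the robot from the side of the field. Everything in it is hypothetical: the status names, the timings and the function are made up for the example, and the step that actually drives a GPIO pin is left out because it depends on the control system and language a team uses.

```python
from enum import Enum


class RobotStatus(Enum):
    """Hypothetical status codes; the value is the number of flashes."""
    OK = 1
    NO_COMMS = 2
    LOW_BATTERY = 3
    CAMERA_LOST = 4


def blink_pattern(status: RobotStatus, flash_s: float = 0.2,
                  gap_s: float = 0.2, pause_s: float = 1.0):
    """Return (led_on, seconds) steps: N short flashes, then one long pause."""
    steps = []
    for i in range(status.value):
        steps.append((True, flash_s))
        is_last_flash = i == status.value - 1
        # A long pause after the final flash marks the end of one repetition.
        steps.append((False, pause_s if is_last_flash else gap_s))
    return steps


for led_on, seconds in blink_pattern(RobotStatus.NO_COMMS):
    print("ON " if led_on else "off", seconds)
```

Counting flashes between long pauses works even when the driver station has no link to the robot, which is the case this suggestion is aimed at.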

Pages ago I suggested getting better control over TCP/UDP communications and being more careful about how it is used.
With QoS next year the bandwidth limits will be better controlled by the fields. However, that might impact how robots perform if teams do not consider those limits. This also implies being careful about network links to IP devices like COTS devices and network congestion. Further, the additional bandwidth issues I've been writing about may spawn logged entries that will help teams find these issues.
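
One way a team could consider a bandwidth limit ahead of time is a quick budget estimate for its own traffic. The sketch below is only that, an estimate: the frame size, frame rate, overhead, headroom and especially the 7 Mbit/s budget are placeholder assumptions, since this topic does not state what the actual field limit will be.

```python
def stream_mbps(frame_bytes: int, frames_per_second: float,
                overhead_fraction: float = 0.05) -> float:
    """Approximate bandwidth of a video stream in Mbit/s, with protocol overhead."""
    bits_per_second = frame_bytes * 8 * frames_per_second
    return bits_per_second * (1 + overhead_fraction) / 1_000_000


def fits_budget(streams_mbps, budget_mbps: float,
                headroom_fraction: float = 0.25) -> bool:
    """True if all streams together leave the requested headroom under the budget."""
    return sum(streams_mbps) <= budget_mbps * (1 - headroom_fraction)


# Hypothetical numbers: 20 kB compressed frames at 30 fps, plus 1 Mbit/s of other data.
camera = stream_mbps(frame_bytes=20_000, frames_per_second=30)
print(round(camera, 2), fits_budget([camera, 1.0], budget_mbps=7.0))
```

If the check fails, lowering the frame rate or the image size changes the estimate directly, which gives a reason for the adjustment instead of trial and error during a match.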

Pages ago I also suggested testing the robots for a variety of things that are well within the scope of what a team should do, such as losing a camera feed to the driver's station or losing enable for short periods of time.

All other partial solutions require cooperation and communications with people involved in the fields and FIRST. I did send 2 e-mails to FIRST which still haven't received an acknowledgement. Those 2 e-mails are still relevant (as are the other students and Mentors that I know sent e-mails). Since then Al has clearly stated that while I might not get an acknowledgement they should be listening to this topic and elsewhere.

Clearly Greg's posts in this topic mark some interest in possible improvements in the logging in relation to my concerns. I appreciate his patience. Just going over all the possible points of interest to bring some of these concerns to that sort of attention was the reason for doing it. This way the detail is there to reference and perhaps at a later date reconsider.

I have no control over how powerless teams may feel in the face of this technical information and I'm trying to help them feel less powerless when things go wrong and they want to dig (I did twice frame the arguments in plain simple examples minus all the acronyms and abbreviations). I offered repeatedly to take this private or take it elsewhere. If that was the concern I would have done so without a hesitation as long as someone can be bothered to facilitate it. I have no ability to open new topics on this forum. I have no interest in being saddled with another forum just to discuss this right now.

This is all relevant to this topic because it shows how one gets a response when they have concerns.
One of my key points remains that this is a confusing and inefficient way to facilitate this.

Al Skierkiewicz
05-09-2012, 19:10
Brian,
Also checked during the Einstein weekend. While placing the radio deep inside the robot, behind metallic parts, behind the bumper, on the floor of the chassis pan or behind an arm are all very bad locations, the orientation of the radio with respect to the field, the Cisco router or other robots varied the received signal level by less than 10dB. Even if the radio was located low on the robot and on the outside, turning the robot so that the radio was facing away from the Cisco router only made about a 10 dB difference, far less than I would have thought knowing the interior construction of the radio and placement of the PIFA antennas. In fact nearby objects only started to affect signal strength when within 2" of the top or sides of the radio.
So for teams, in general, mount the radio where it is protected from contact with other robots and mounted so that the LEDs are visible. The radio should not be mounted near high noise devices like the leads to CIM motors or FP motors or near the 5 volt regulator. The bottom of the radio already has shielding so mounting it on metal should not be a problem but if mounted vertically, perf stock or lexan would be the preferred backing material. There didn't seem to be a vast difference in signal between horizontal and vertical mounting although horizontal will likely give the best overall coverage. When looking at the face of the radio, there is an antenna on both the left and right sides so don't mount the sides against the robot frame. Both antennas are used all of the time for both receive and transmit to achieve the highest throughput per 802.11 specifications. Secure the power lead in some fashion so that it won't wiggle. A simple stick on tie point mounted on the top of the radio, with a wire tie securing a loop of the power cord is the best option. I do not recommend hot glue as this makes repairs almost impossible when applied correctly. When not done correctly, the glue will give you false hope, likely mis-align the connector or damage the jack on the radio and the connector will fall out when you need it the most. If you choose to use a Radio Shack connector, ensure it has the right dimensions. Often teams will use a connector meant for a larger diameter center pin. The result is noise on the power line during robot movement. Over time, as the connection becomes dirty, a radio reset will be the result. If you are placing the radio near moving parts, check clearances for all positions of the moving part. The metal of the moving part should not cover the radio when at rest or fixed position and it should not pass within two inches of the top or sides of the radio.
Do not make severe bends in the ethernet cables, the max spec as I remember is 1" minimum bend radius for full bandwidth. Secure the ethernet cables near the radio so that they do not put strain on the jacks on the radio or pull out with robot movement. Putting a small loop in the cable will prevent any strain on cable or radio. Above all, make sure the 5 volt regulator is connected to the radio output on the PD and all wires are secure and insulated. Mount the regulator where it will be protected and does not move within the robot.

techhelpbb
05-09-2012, 19:52
I have additional questions about some of this.
This is not to say that I doubt the skill of the testers.
I still foresee other interactions but they are not issues teams can solve.

I think the last 2 posts (Al's and mine) summarize the basics of what a team might get from this topic. I suspect the other points are of only minor value compared to the posts for the last 2 pages before this which are more general. So I think I'll leave this here and see if I can find another place to discuss those details.

My hope is that what is already here will impact the available logged data decisions at the least. I am perfectly fine with that as a compromise. If problems such as those I touched on do appear at least this will serve as something to review.