Has anyone noted differences between running autonomous / hybrid mode while tethered (practice floor) versus running over the radio (competition)?
We have been seeing very odd behavior on the competition floor at the start of autonomous: our robot will move out from the wall 1 to 6 feet or so, then make an immediate left turn into the wrong quadrant of the track (into oncoming traffic). We ran literally dozens of times on the practice floor at the Finger Lakes Regional and did not see this behavior once (the robot always went the correct distance before the left turn, regardless of lane settings, starting delays, etc.).
We tried a number of “experiments” to identify the cause (disabling IR command handling, changing from gyro-controlled operation to just programmed values for “straight”, slowing the acceleration of the drive wheels off the line to minimize any wheelspin). The behavior was still bad, although the programmed “straight” approach did occasionally change the hard left turn into a 45-degree turn to the left… We use geartooth sensors to determine distances, but those have not been suspect, since the behavior tracks so consistently with tethered/practice versus radio/competition…
We worked with the software mentor (Mike) from Team SparX to see if anything we were doing from a software perspective was obviously wrong. He had many great suggestions, which we had either already tried or subsequently tried, but nothing that modified our robot’s behavior.
Mike made one point that, although surprising, could be causing our problems in some form. Specifically, we may be experiencing procedural differences in the way our robot is enabled on the practice floor versus how it is done on the competition floor. He indicated that on the competition floor, the un-asserting of “disabled” and the asserting of “autonomous” are done as two discrete operations. In our code, this allows for a short period of teleop between the disable going away and autonomous being asserted (potentially both tele-init and tele-op will run). We definitely do not do this on the practice floor (we just set autonomous and disabled on our competition box, reset the robot, then un-assert disabled).
Although this is surprising, the code looks like it should handle it without problems/conflicts. I did put an interlock into the code to prevent this, just for the interest factor. Unfortunately, we only had one match left at Finger Lakes to see if it affected the behavior, and we were bumped by another robot on a delayed start, so no useful data was gathered.
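For the curious, the interlock amounts to something like the sketch below. It assumes the disabled_mode macro and the Getdata()/Putdata() calls from the IFI default code; the auton_seen flag and neutral values are illustrative, not our exact implementation:

static unsigned char auton_seen = 0;  /* set to 1 at the top of User_Autonomous_Code() */

void Process_Data_From_Master_uP(void)  /* the normal (teleop) packet handler */
{
  Getdata(&rxdata);

  if (!auton_seen && !disabled_mode)
  {
    /* Enabled, but autonomous has not yet been asserted: this is the
       suspect window between "disabled" going away and "autonomous"
       arriving.  Hold the drive at neutral instead of running teleop.
       (Bypass this check for teleop-only practice sessions.) */
    pwm01 = pwm02 = 127;
    Putdata(&txdata);
    return;
  }

  /* ...normal teleop code here... */

  Putdata(&txdata);
}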
We are going to the Buckeye Regional next week, and are REALLY trying to avoid getting hit with the 10-point penalty each time this occurs (for going the wrong way on the track). Disabling the autonomous / hybrid mode is only a last resort, since it works so well (when it works!).
I am going to have to make this a quick post since I’m short on time.
What are your IR commands? We had major problems with IR on the FLR field; I can elaborate on them if you want.
Gyro biasing is very important, as mentioned before.
Do you reset your distance count at the beginning of auton? (alignment adjustments before the match could affect your count)
Try printing out your distance in the user mode byte (after shifting). Train your drive team to watch it during practice round auton and tell you the value when it turns.
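Something like this in the auton loop is all it takes (a sketch; User_Mode_byte is the OI user-display alias from the IFI default code, and distance_ticks stands in for whatever your counter is called):

/* Shift the 16-bit count down to a single byte for the OI display. */
User_Mode_byte = (unsigned char)(distance_ticks >> 4);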
From the terms you are using, it sounds like you are using Kevin’s code, correct?
If so, try enabling Process_Gyro_Data(); in the Teleop_Spin loop.
I know you may not be using the gyro in teleop, but it may eliminate your problem if the assertion of autonomous does in fact happen after a loop or two of teleop. (I really would like to be wrong here; we are doing exactly the same thing you are, we just haven’t been to a competition yet to see these errors.)
Secondly, how and where are you calculating the gyro bias?
One thing to note is that on the field, the enable comes quite quickly after autonomous, and on your manual box it may not be so quick. This can foul any timer that is not checking for both autonomous and enabled. I can’t think of anything else that I would expect to be different, other than potential radio modem trouble. By measuring the time between packet loops you can detect missed packets and show them with a counter on the user display.
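A sketch of such a counter, assuming the packet number field in rxdata from the IFI default code (check ifi_default.h for the exact field name):

static unsigned char last_packet_num = 0;
static unsigned char missed_packets = 0;
unsigned char delta;

delta = (unsigned char)(rxdata.packet_num - last_packet_num);  /* wraps correctly at 255 */
if (delta > 1)
  missed_packets += delta - 1;    /* packets that never arrived */
last_packet_num = rxdata.packet_num;
User_Mode_byte = missed_packets;  /* show the tally on the OI user display */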
We have had trouble, during testing with our sensor test mule, with measured run lengths being short and long. Our test mule used a shaved sprocket to trigger the sensor, and the gap had a lot of variation in it. We straightened it and closed the gap as much as possible. After that the measured distances seemed much more repeatable, but not totally flawless. The wheel counters in our competition robot are different, running directly on the output gear in the toughbox. The gap to the gear in this case is critical because the teeth are small. I suppose that the next time we run into this we should try the tether and see if it makes a difference.
There is always the possibility of residual gremlins in the 8722. We have had more than our fair share of problems with it. If the rules allowed the 2005 controller, upgraded with the latest firmware, we would be using it. We never had gremlins appear on the prior controller.
We do use the gyro during teleop mode (and hence do process the gyro data in the teleop_spin), and it seems to be working OK there. The bias calculation (we are using Kevin’s default code) occurs during the disabled-mode handling. One of our experiments was to ensure we weren’t losing the bias value by calculating it just before a match and hard-coding it into the gyro initialization. Not a perfect solution, but it showed that the bias calculation (or the lack of it) wasn’t causing the bad robot behavior…
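In outline, the hard-coding experiment looked like this sketch (call names per Kevin’s gyro.c; verify them against your copy, and note the constant is a placeholder rather than our measured number):

#define PREMATCH_GYRO_BIAS 970   /* placeholder; measured just before the match */

Initialize_Gyro();
Set_Gyro_Bias(PREMATCH_GYRO_BIAS);  /* skip the disabled-mode bias calculation */
Reset_Gyro_Angle();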
The IR commands were suspected early, as one of our IR commands was “turn left 30 degrees”. We disabled the IR handling (disabled the sensor in hardware and removed the code that looked for the IR and processed the commands). We ran that way for a couple of matches, and the hard left turn was still there…
As for distance calculations, as I mentioned, we use geartooth sensors on both wheels. Our autonomous is implemented as a state machine that runs discrete functions used to “play the game”: things like “go straight”, “turn 90 right”, “turn 90 left”, “delay X seconds”, “raise lift”, etc. These are called in a specific order and with lane-specific values to traverse the track. The straight function sets up the desired distance and heading, starts the wheels at the desired speed, and polls in the 26.2ms loop for completion. Part of setting up the distance is resetting the distance counters. We suspected this early as well, so we did a lot of tracing on it (but of course, it works on the practice floor!).
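In outline, the straight function looks like the following sketch (the names and the completion handshake are illustrative, simplified from our actual code):

typedef enum { ST_SETUP, ST_RUNNING, ST_DONE } straight_state_t;

/* Returns 1 when the programmed distance has been covered. */
unsigned char Drive_Straight(int ticks, unsigned char speed)
{
  static straight_state_t state = ST_SETUP;
  static int target_ticks;

  switch (state)
  {
    case ST_SETUP:
      Reset_Distance_Counters();   /* zero both geartooth counts */
      target_ticks = ticks;
      state = ST_RUNNING;
      /* fall through and start driving this loop */
    case ST_RUNNING:
      pwm01 = pwm02 = speed;       /* plus gyro-based heading correction */
      if (Average_Distance_Ticks() >= target_ticks)
      {
        pwm01 = pwm02 = 127;       /* stop the drive */
        state = ST_DONE;
      }
      break;
    case ST_DONE:
      state = ST_SETUP;            /* re-arm for the next leg */
      return 1;                    /* tell the state machine we're done */
  }
  return 0;                        /* still driving */
}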
The suggestion to put the distance out in the user bytes for display on the OI is excellent! We will implement that for testing during the practice rounds at the Buckeye Regional. That should give us enough info to either identify it as the culprit or remove it from suspicion…
We even went so far as to discuss the issue with the IFI representative at the FLR. They went back through the data they keep and noted no dropped/corrupted packets to our radio modem. I didn’t really suspect that as a cause, but needed to cover all the bases looking for clues.
Mike,
Thanks for the update. Please keep us posted. The last two years our team has had a similar problem where autonomous behavior while tethered was far different than what happened on the competition field. We had other, more important, demons to extricate from our machines, however, so we weren’t able to pursue the matter. It’s a frustrating problem, though, because it can only be tested during real competition.
At our regional, we had been encountering similar problems in autonomous mode, with the robot not behaving as expected. We had a very difficult time debugging it. The core difficulty was that our assumptions about where the problem was occurring were fundamentally wrong, so we were looking in the wrong place(s), which made it very hard to find! Taking a step back and asking others on CD for advice is a great way to get a fresh perspective. Hopefully one of the many replies you get here will help you identify the real problem. I would caution you to consider all the suggestions, even the ones you “already tried” – on our team, quick conclusions arrived at while debugging “under the gun” in the little time between matches have sometimes turned out to be wrong because they were based on bad data.
I think the key issue is that you need to determine what the robot is doing when it goes astray. It sounds like the three most likely candidates are:
1 - The robot is trying to drive straight, guided by the gyro, but has an incorrect bias because the bias calculation was not invoked as expected, due to differences in the disabled -> autonomous transition sequence between your practice-field testing and the real field.
2 - The robot is trying to drive straight, but suddenly starts receiving incorrect sensor data that makes it think one wheel has stopped turning, and it then “remedies” the situation by turning sharply in the other direction.
3 - The robot thinks it has finished driving straight and is prematurely performing the left turn planned for the other end of the field.
Determining which of these is actually occurring (or whether some fourth event is taking place) should be a priority.
We grappled with the same issue (identifying what the robot is trying to do without telemetry information) at our regional this year. We ended up locating the problem by testing in the pits (our problem turned out to be a flaky sensor) when we could use “printf()” output to our laptop to identify what the robot was trying to do at the time the problem occurred. However, in a real match, you don’t have the luxury of live “printf()” output from the controller.
Having learned a few things since our regional, though, I can now suggest to you two ways to be able to get “printf()” output during operation on the real field:
1 - Use a “serial port logging device” on the robot to record serial port information. Phil Malone of Team 1629 wrote a white paper on doing this using the SparkFun Logomatic Serial Datalogger. His white paper explaining how to use this product is published as a PDF file on the SparkFun site. We have since ordered one of these from SparkFun but haven’t yet received it to play with it. This device has the advantage of being fully FIRST legal for competition rounds.
2 - If you have a practice round (this wouldn’t be legal for a real round) and you don’t have time to order and receive the above datalogger, we learned of another approach to getting “live data” from Team 1307 who used a similar approach when testing their autonomous modes… You can strap a laptop to the robot, and have it connected to the programming port of the controller. I’d suggest using Hyperterminal (or your favorite terminal emulation program) to log all the output data to a file. By adding a lot of “debugging output” to your autonomous program, this will enable you to determine exactly what the controller thinks it is doing (and what sensor input it is seeing) when the problem occurs during the actual practice match. If you try this, make sure that the laptop is extremely well attached to the robot, and preferably in a highly protected spot, as you really don’t want it to come loose during the match or have another robot’s arm stuck through it!
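Either way, one debug line per 26.2ms loop goes a long way, along these lines (a sketch; the variable names are illustrative):

/* One line per loop: auton state, wheel counts, and gyro angle. */
printf("st=%d L=%d R=%d gyro=%d\r\n",
       (int)auton_state, (int)left_ticks, (int)right_ticks, (int)Get_Gyro_Angle());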
Now that I’ve spoken my piece on trying to figure out what the robot is actually doing when the problem occurs, I’ll add a few specific suggestions on some of the possibilities that you and others have mentioned.
When you made this fix, are you sure you completely turned off the bias calculation code, so that if and when it did run, it didn’t simply overwrite the hard-coded initial value? (I know that sounds like a stupid error, but I made exactly that mistake during the build season while trying to debug the gyro bias calculation…) One way to see what happens when the gyro bias is not calculated is to turn off the bias calculation and run the resulting code on the practice field. When we do this with our robot (i.e., simulating a failed gyro bias calculation), we get a hard turn to one side (I forget which) when trying to drive straight, because the program thinks the robot is spinning quickly due to the radically incorrect bias.
This sounds like exactly how our autonomous software works, except that we used absolute magnetic encoders on our driven axles rather than geartooth sensors. We ended up having multiple contributory causes for our problem. One root cause was that we were sometimes getting flaky values from one of the magnetic encoders. The symptom was that the robot would drive beautifully for a somewhat random distance and then turn sharply to the right. This turned out to be because the left encoder would stop reporting new values, which led the program to think the robot was turning sharply to the left; it responded with an immediate decrease in right motor power, which caused a sharp turn to the right. So even though our robot was trying to “drive straight,” the bad encoder input made it turn sharply to the right, since the incorrect sensor information led to the (incorrect) conclusion that it needed to compensate for a stopped left wheel.
The other likely scenario is that the robot thinks it has completed traveling the desired distance and so makes the left turn. You mentioned that the turn works correctly on a practice field, but is that with the robot driving the entire 40 feet and making a left turn, or a shorter “test distance”? The reason I ask is that, depending upon the resolution of your encoders, the precision of the variables you are using, and the order of operations in your assignments, you may be seeing overflow on your “40 feet” distance. This problem could manifest as the code thinking it only needs to go a much smaller distance to cover 40 feet. We’ve been bitten many times by mathematical operations like the following:
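/* A minimal reconstruction of the kind of code in question;
   the macro and variable names are illustrative. */
#define DRIVE_STRAIGHT_DIST_IN_FEET 40

int dist_to_travel;   /* 16-bit signed int on the PIC18 */

dist_to_travel = DRIVE_STRAIGHT_DIST_IN_FEET * 12;   /* convert feet to inches */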
At first glance, and with most compilers, the above looks perfectly fine. dist_to_travel is a 16-bit signed integer, and 40 * 12 = 480 fits very easily into a 16-bit signed integer. However, when generating the code for the above, the Microchip compiler looks at the 40 and the 12, notes that they both fit nicely into 8-bit math, and thus performs the 40 * 12 calculation in 8 bits, which overflows and gives a very unexpected value. However, if you were to test this code by setting DRIVE_STRAIGHT_DIST_IN_FEET to 6 feet in order to have enough room in the practice area, you would have 6 * 12 = 72, which fits nicely in 8-bit math and works perfectly. Only when trying to drive the full 40 feet does the problem occur, resulting in mayhem during the real round, even though everything worked great on the practice field!
The above, which looks a lot like a bug to those (like myself) accustomed to ANSI-standard C compilers, is actually a documented feature of the Microchip compiler:
2.7 ISO DIVERGENCES
2.7.1 Integer Promotions
ISO mandates that all arithmetic be performed at int precision or greater. By default, MPLAB C18 will perform arithmetic at the size of the largest operand, even if both operands are smaller than an int. The ISO mandated behavior can be instated via the -Oi command-line option.
For example:
unsigned char a, b;
unsigned i;
a = b = 0x80;
i = a + b; /* ISO requires that i == 0x100, but in C18 i == 0 */
Note that this divergence also applies to constant literals. The chosen type for constant literals is the first one from the appropriate group that can represent the value of the constant without overflow.
For example:
#define A 0x10 /* A will be considered a char unless -Oi specified */
#define B 0x10 /* B will be considered a char unless -Oi specified */
#define C (A) * (B)
unsigned i;
i = C; /* ISO requires that i == 0x100, but in C18 i == 0 */
Hopefully at least one of these suggestions will pan out for you!
Thanks for your insight! I definitely agree with you about not eliminating possibilities too early in the analysis of a problem. It’s not so much about eliminating possibilities as about narrowing the investigation toward higher-probability causes. You need to keep revisiting the bigger picture…
Of the three candidates you mentioned as causes of our problem, 1 and 3 are the more likely ones (I certainly have had my eye on the gyro, as it was a problem throughout the build season but seemed to be behaving toward the end). The geartooth sensors were less likely for us, as we use them exclusively for determining when to shift our two-speed transmissions (which are locked into low for hybrid) and for distance traveled. In our case, if one failed during the distance calculations, it would make us go further than we wanted, since we average the distances from both wheels to get the centerline distance traveled. We haven’t eliminated the going-short possibility, but it’s less likely to be caused by the geartooth sensors misbehaving…
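The averaging in question is just one line; the point is that if one count stops increasing, the average climbs at half the true rate, so we reach the target late and run long, never short (variable names illustrative):

centerline_ticks = (left_ticks + right_ticks) / 2;  /* a dead sensor halves the climb rate */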
The one datapoint that doesn’t fit with the gyro being the cause is that when we removed the gyro from the equation entirely (by doing a “programmed straight” with hard values written to the drive motor PWMs), we still saw the left turn, just at a 45-degree angle (into the center wall). We looked at video of one of those instances and noted that there seemed to be a difference in the starting time of each wheel (the left seemed slow off the line compared to the right). So the “programmed straight” misbehavior may have a different cause…
Your suggestions about the data logger possibilities are excellent! We had considered investigating this during the off-season this year, as the “tether to get debug output” approach gets old really quick (especially when the problems only occur after the robot has been running for a while). The picture someone painted of “forever chasing the 'bot” while carrying a laptop is very apt! I’m not sure about the ability to put a laptop on our robot (it’s pretty exposed on top, with no “inside” room), but I’ll bring it up at our team meeting today. We will definitely purchase a data logger, however!
My suspicion about the gyro goes beyond the bias calculation. I’m sure the calculation was totally disabled (the only trace output we generate when ready for competition is for the bias calculation, just so we look “normal”, and it didn’t print before the matches where we had the hard-coded bias value). My real concern is that if the gyro can report bad values during the bias calculation, it’s likely the same thing can occur after the bias is calculated. If that happens, no matter how good the bias value is, the gyro will report inaccurate/false angles of turn and, hence, we’ll be attempting to correct for a non-existent turn… We haven’t proved this is the case yet, but we have been brainstorming with the team about putting indicators (LEDs) on the OI to tell the drive team that a gyro problem has occurred…
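Something as simple as this sketch would do it (Pwm1_red is one of the OI LED aliases in the IFI default code; gyro_fault is an illustrative flag set wherever out-of-range readings are detected):

if (gyro_fault)
  Pwm1_red = 1;   /* light the OI LED so the drive team knows the gyro is suspect */
else
  Pwm1_red = 0;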
The point you brought up about the compiler’s divergence from the ANSI standard is something all teams should be aware of. We discovered it about two weeks before the end of the build season, in exactly the way you stated: we had two seemingly simple values that, when multiplied together, were used to bound an input value (12 * 54). The check kept failing, and when we printed the value of 12 * 54, it was being computed as -120! The fact that the compiler by default doesn’t promote to int before doing math is definitely not what I was used to, and enabling the option to make it do so and recompiling didn’t help either: the resulting code also misbehaved, just in different ways. So we reverted to the original options and fixed all instances of byte-valued math by hand (our mini-autonomous is where we saw it: with no distance being executed, the robot just spun in place!).
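For anyone else who hits this, the usual one-off repair is to force one operand up to int width with a cast, along these lines (a sketch):

int limit;
limit = (int)12 * 54;   /* 648 as intended; without the cast, C18 computes -120 */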
Thanks again for your (and everyone’s) insight! We’ll keep working the problem and report what we find.