Robot Code Error issues

We are trying to get our practice bot up and running so our driver can get some practice before the week 3 regionals, but we are running into some issues and our lead programmer is out of town. When we drive the robot, we get an error on the dashboard saying that there is an error in the RobotDrive loop. After some investigating, this turns out to come from the blocks in the WPI library. No matter what code we run, we get this error. We even remade a basic version of the code and had the same issue. We have reflashed the cRIO and it still doesn’t work. We have run both the holonomic drive blocks and the arcade drive blocks, and nothing works. Also, we are using the CAN Jaguar setup.
Any thoughts or suggestions?

Thanks in advance,
Austin

If you don’t have a programmer to debug the issue (and it should be checked out), you can disable the error by going to Begin.vi, double-clicking the “Open 2/4 Motor” VI, and changing the “enabled” box to “disabled.”

The error can mean your code is running too slowly, but there are other reasons too.

Team 241 had problems with CAN Jaguar timeouts on CAN messages, but only occasionally. (We were using a 2CAN connected to 4 Jaguars using the 6P4C connections.)

The main thing we noticed is that these occasional errors, which happened when we operated tethered or in a local mode without the FMS, had no visible impact on the performance of the robot.

HOWEVER, when we connected to the FMS, these errors became much more frequent and the robot was unresponsive. We are guessing that these errors cause different behavior under FMS control than under simple tether control.

We debugged the code and found we had removed the wait(0.05) after each invocation of TankDrive().
We fixed the code to use wait(0.1), which dramatically reduced the occurrence of these errors to maybe 5 per 3-minute interval when tethered in the pits with the robot up on blocks.
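
For anyone following along, here is a minimal sketch of the loop shape described above: one drive update followed by a fixed wait. It is not our actual robot code. The names RunDriveLoop, LoopStats, driveOnce, and wait are made up for illustration, and the drive call and delay are passed in as stand-ins so the sketch compiles without the robot libraries.

```cpp
#include <cassert>
#include <functional>

// Summary of one run of the loop, returned so the pacing can be checked.
struct LoopStats {
    int iterations;
    double secondsWaited;
};

// driveOnce and wait are hypothetical stand-ins for the robot's drive update
// (e.g. a TankDrive() call) and its delay function; they are injected here
// so the sketch compiles without the robot libraries.
LoopStats RunDriveLoop(int iterations,
                       double periodSeconds,
                       const std::function<void()>& driveOnce,
                       const std::function<void(double)>& wait) {
    // A missing (zero-length) wait is the mistake described above.
    assert(periodSeconds > 0.0);
    LoopStats stats{0, 0.0};
    for (int i = 0; i < iterations; ++i) {
        driveOnce();          // one drive update, which sends CAN messages
        wait(periodSeconds);  // pause before the next update
        ++stats.iterations;
        stats.secondsWaited += periodSeconds;
    }
    return stats;
}
```

With a 0.1-second period, 30 iterations correspond to roughly 3 seconds of driving, so the number of drive updates per match is easy to estimate.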

When we connected again in our matches, we saw the error continuously throughout the match and the robot was very sluggish.

I started reading around and noticed that the FMS sends special messages to robots that use CAN (over and above the messages it sends to the cRIO).

QUESTIONS:

  1. Is there a known set of errors that causes a robot to behave dramatically differently under FMS control versus tethered control?

  2. If so, is there additional hardware/software a team can typically use to get behavior that more consistently emulates FMS? (I.e., if the FMS is sending periodic watchdog commands and causing robots to halt for certain errors and causing greater loads, is there a simple way that can be done by a team in tethered mode?)

  3. Is there a list of known coding bugs that cause performance problems in FMS mode but not tethered mode?

I’m not sure what you were reading, but there is no traffic between the FMS and the robot. The FMS is not aware of the type of motor control the robot is using and is not responsible for any CAN specific safety feature.

There may be a way for the field communications to impact the robot code, but not through the mechanism you are describing.

Greg McKaskle

Greg, As to your point, in browsing around, I had found…
http://www.chiefdelphi.com/forums/archive/index.php/t-86259.html
Radical Pi
07-14-2010, 09:25 AM

I think I may have a reason for the slowdown and the jitter. In the C++ code, all requests to the CAN driver are routed through either FRC_NetworkCommunication_JaguarCANDriver_sendMessage, or FRC_NetworkCommunication_JaguarCANDriver_receiveMessage. Also, when the driver is loaded by the OS, the init calls FRC_NetworkCommunication_JaguarCANDriver_registerInterface with a pointer to the driver class. By the names, I’d assume that all messages are run through the NetworkCommunication library before they get to the driver itself, most likely so a disabled robot cannot continue to send CAN messages. NetworkCommunication is searching for a driver station and communicating with the driver station, so it could account for the delay and the jitter.

I could not find source code for FRC_NetworkCommunication_JaguarCANDriver_sendMessage().
Does this communicate to the NetworkCommunication task?
Could Jaguar messages potentially put a significant extra load on the NetworkCommunication task (especially if a team is trying to retrieve data from the Jaguars)?

For PWM-controlled Jaguars, I did not see an equivalent to FRC_NetworkCommunications*sendMessage or anything like it.

The behavior we observed was different on the competition field than the behavior we observed during practice on tether or using a radio on the practice field.
The primary observation was an order of magnitude more Set and Get CAN message timeouts in the message log and a robot that acted extremely sluggish (or did not move at all).

We suspect motor noise affecting the DAP-1522 or the CAN bus. It might also be that we had a bad CAN cable or a bad termination.

Other than a more hostile radio environment, what are the variables on the competition field using FMS that can result in a change in observed behavior as compared to tethered or practice field operation?
What is the expected additional delay and jitter on transferred Ethernet messages when using the competition field?

After we failed to make the elimination round, we spent some time rewiring our competition robot for PWM and will use that in Boston.
We will move our DAP-1522 to a less radio hostile position/orientation on the robot.
We might spend a very small amount of time to try to set up FMS Light and our practice bot and create radio hostility to see if we can recreate the observed bad behavior outside a competition FMS field.

Most likely, we will not use CAN and 2CAN again to control Jaguars on a competition robot until we know what caused our problems AND we can create a normal test environment where we can more accurately and consistently produce the FMS competition environment outside the time stress of a competition.

We will likely continue to simplify until we get robustness, and try to make some of our test environments more hostile than we expect the competition field to be (add noise in proximity to the radio, lower the radio signal through shielding, inject expected worst-case latency and jitter into the communication path, monitor behavior with Wireshark).

We will also likely add software to log behaviors per second and see if we can correlate errors to other monitored conditions.
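
A rough sketch of the kind of per-second logging we have in mind is below. Nothing here exists yet: the PerSecondLog class and the event names such as "can_timeout" are hypothetical. The idea is only to bucket events by whole second so that counts of one kind of error can be lined up against other conditions recorded in the same second.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical helper: counts named events in one-second buckets.
class PerSecondLog {
public:
    // Record one occurrence of `event` at `timeSeconds` since the log started.
    void Record(double timeSeconds, const std::string& event) {
        assert(timeSeconds >= 0.0);
        long second = static_cast<long>(timeSeconds);  // truncate to whole second
        ++counts_[second][event];
    }

    // Number of times `event` was recorded during the given whole second.
    int Count(long second, const std::string& event) const {
        auto bucket = counts_.find(second);
        if (bucket == counts_.end()) return 0;
        auto entry = bucket->second.find(event);
        return entry == bucket->second.end() ? 0 : entry->second;
    }

private:
    std::map<long, std::map<std::string, int>> counts_;
};
```

Comparing, say, Count(t, "can_timeout") with Count(t, "packet_gap") for each second t would show whether the two tend to spike together.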

This is not quite how it works. The Network Communication task does not search for a DS. It simply blocks waiting for a packet. A variable in the Network Communication object keeps track of status of the communication and if the robot has been enabled. The SendMessage entry-point simply reads that variable to decide if a message should be sent. It should not affect timing.
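
To illustrate the gating described here, a minimal sketch follows. This is not the NetworkCommunication source: CommStatus and SendIfEnabled are made-up names, and the actual hand-off to the CAN driver is reduced to a counter. The point is only that the send path reads a flag and returns; it does not wait on a driver station packet.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical stand-in for the status kept by the communication object.
// The task that blocks waiting for packets would be the one updating it.
struct CommStatus {
    std::atomic<bool> enabled{false};
};

// Returns true if the message would be handed to the CAN driver.
// sentCount stands in for the driver call so the sketch is self-contained.
bool SendIfEnabled(const CommStatus& status, int& sentCount) {
    if (!status.enabled.load()) {
        return false;  // robot not enabled: the message is not sent
    }
    ++sentCount;       // robot enabled: pass the message along
    return true;
}
```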

There is additional load on the system in general when using CAN, but I believe we’ve minimized a lot of that since last year.

That’s because the PWM is controlled in the FPGA. NetworkCommunication is responsible for keeping the FPGA watchdog fed so that the PWM will continue to output.

I’m not aware of anything in the parts that I have access to that even distinguishes that the FMS is involved, except for NetConsole deciding to squelch output if the FMS is attached and the robot is enabled (to reduce extraneous traffic on the field radio).

I’m not certain that this is the cause, but I can’t think of anything at this point that I would expect to make a difference between field operation and practice operation.

-Joe

If the CAN Message is flowing through the Network Communication task and the NC task is blocked waiting for a DS message, does that imply the CAN messages won’t be processed through the NC task until the next DS message comes in?
If so, that could be a serious bottleneck when the DS message flow is intermittent due to software errors or radio problems. Is it possible that this multiplies the impact far beyond what a PWM setup would see?

What does “squelch” mean in this context? Does it slow the NetConsole flow down or does it just not produce messages for NetConsole?
(We did have NetConsole.out populated on our cRIO for monitoring in our pits; we did not remove it during competition runs.)

If the attached FMS is intentionally delaying NetConsole messages (as opposed to eliminating all of them), does that mean the errors are getting delayed if NetConsole is active, and that the entire processing is getting delayed?
If that were the case, that certainly could explain a mechanism that could cause different behavior within FMS versus a practice field.

Is there any user documentation on “* Squelch the NetConsole output when the FMS is attached and the robot is enabled”?
Is the NetConsole output NOT squelched while robot is disabled but still attached to the FMS?

Internet search points to
http://firstforge.wpi.edu/sf/scm/do/viewCommit/projects.wpilib/scm.wpilibcpp/cmmt6497;jsessionid=A9593FD856B2533BC545E58FC99BE6C5
which indicates that the source for that change was committed on Feb 7th 2011 at 1:02am. (working late?)

When the FMS is attached and the robot is enabled, the NetConsole server will read any data from the pseudo terminal and then do nothing with the data, immediately going back to read from the PTY. The result is that anything traced to the console while the FMS is attached and the robot is enabled will be lost forever. It is not delayed or saved or anything like that. I don’t think the NetConsole behavior will do anything other than reduce the CPU load and network load.
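
In sketch form, the decision looks something like the following. This is an illustration, not the NetConsole source: HandleConsoleChunk is a made-up name, and the network send is replaced by appending to a vector so the example stands alone.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical handler for one chunk of data already read from the PTY.
// Returns true if the chunk was forwarded, false if it was discarded.
bool HandleConsoleChunk(const std::string& chunk,
                        bool fmsAttached,
                        bool robotEnabled,
                        std::vector<std::string>& sentToNetwork) {
    if (fmsAttached && robotEnabled) {
        // Squelched: the data is dropped, not queued or delayed.
        return false;
    }
    sentToNetwork.push_back(chunk);  // normal case: forward the console output
    return true;
}
```

Note that the read from the PTY still happens in both cases; only the forwarding is skipped.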

No. Fundamentally it is an attempt to not waste radio bandwidth while competing. In a field configured to spec, the NetConsole ports will be blocked at the router anyway (though I’ve seen them open).

Correct. When the robot is disabled, the NetConsole output is enabled. The idea here was if you get lucky and the port is not blocked (or you are debugging the FMS), it would be nice to be able to see the cRIO console on the field while preparing for the match.

Always working late! :wink: …but that commit log entry is dated for when the change was replicated from the internal SVN server to the public one, not when the actual change was submitted. If there were actually files for this on the public server (as there are in the WPILib for C++ source), I believe that log would show you the correct date of check-in and the person who checked it in. The distinction here is that the record you are looking at is part of the source forge data, not the SVN data.

-Joe