Unexplained intermittent CAN / 2CAN Jaguar problems at GSR
We had a couple of interesting and still unexplained catastrophic CAN problems at the Granite State Regional this weekend that we are still puzzling over. We use the 2CAN interface to connect 5 ‘tan’ Jaguars and everything seemed to work fine while we were tethered and also well during most of our matches.
We had two matches, however, where we failed to move at all, with an endless stream of CAN transaction timeouts being reported on the Driver Station diagnostics tab. Our drivers reported that they were able to recover well into one of these matches by rebooting the robot, and they did successfully drive a bit in the little time remaining. Everything was always fine once we were tethered back at the pits, and we were only able to reproduce this stream of CAN errors by actually unplugging one of the CAN cables. Interestingly, we were also able to reproduce the same sequence by disconnecting the Ethernet cable from the 2CAN interface. I was hoping to see a different error in each case to help differentiate a problem on the actual CAN bus from a problem with the 2CAN adapter.
While the error messages were consistent with a cable disconnect of some type, it seems unlikely that a robot reboot would have resolved that kind of problem, and we were unable to locate any intermittent connections despite our best efforts. Our robot is now bagged and tagged and headed to the Hartford regional, so we are unable to do any further experimentation and troubleshooting. There is now some pressure to abandon the CAN interface and rewire everything to our historical PWM interfaces unless this problem can be understood and solved.
Here are a few questions for anybody familiar with the Jaguar – 2CAN interface. My recollections are from memory but hopefully accurate. Any thoughts would be most appreciated.
1) How do folks recommend powering the actual 2CAN device? There are references to ensuring the D-Link radio is powered by the boosted 12V (white Wago connector) to help avoid excessive voltage drops and possible problems. The 2CAN itself has a wide voltage range (6.5 to 28V), but should it be connected to the raw 12V fused power bus? Our autonomous start is probably the worst case for power consumption, since we simultaneously power up 3 big CIM motors to near full power.
2) Accessing the 2CAN web page while our robot code is executing does result in a number of our robot CAN transactions failing (~ 1/1000). This is more of an annoyance for us than a problem, but it is suggestive of some type of system vulnerability when the 2CAN is being accessed from multiple sources. The WPI libraries seem to have access protection, but the 2CAN itself may have some issue when it is being accessed by both the cRIO robot code and a browser on the 2CAN page.
3) There is some mention that the FRC system does send additional CAN-related messages to CAN-based robots. Can anybody confirm this possibility? Could there be some type of negative interaction with the FMS system?
4) I did speak to a few CAN Jaguar users who spoke of possibly overloading the CAN bus and needing to reduce how often they access it. The current WPI library interfaces, however, look like simple blocking calls where all access is nicely serialized and each call completes before the next is initiated. While all of these calls may eventually slow down our control loops, it seems unlikely that we could overload the bus.
5) We did see some type of “too much error data!” message once in a while when we were playing, but could not locate this error message anywhere in our code or in the WPI libraries. Any idea as to the source of this unique message?
6) There is a rev 66 (1/29/2011) FRC plug-in update posted on the Cross the Road website. We bagged our robot before I could confirm which version of the 2CAN plug-in is incorporated in the 2/11/2011 V28 cRIO update. Is there a need to install this file manually, or is the latest included with the V28 update?
Thanks in advance for any CAN thoughts… John |
Re: Unexplained intermittent CAN / 2CAN Jaguar problems at GSR
We noticed another issue in the CAN libraries (serial) that may be related. This only happened in the pits for us, but occasionally NetConsole would show an error about the FRC_CANJaguar_ReceiveTask failing, which would subsequently cause the user code to fail quite violently. As with you, a soft reboot of the cRIO fixed the problem.
Judging by the similar symptoms between Serial and 2CAN interfaces, I'm thinking there's something wrong with the FRC_NetworkCommunication portion of the CAN library. |
Re: Unexplained intermittent CAN / 2CAN Jaguar problems at GSR
Hi John,
We also used CAN, but with all black Jags and a 2CAN. We didn't have any issues at all with it. I can answer a few of your questions based on how we did things.
1) We power the 2CAN from the raw 12V on the PD. We didn't experience any brownouts, even when applying full power to 4 CIMs and 4 BaneBots simultaneously. I will say that low battery voltage, bad terminators, and faulty wiring are the biggest contributors to CAN network problems.
2) We haven't experienced that issue. However, the 2CAN web page is only supposed to be accessible when the 2CAN is attached to cRIO Ethernet port 1, which is great for diagnostics but is not allowed in competition. The 2CAN must be on port 2 in competition mode; it's a clearly defined rule. The only question that comes to mind is whether your 2CAN is connected to the wireless bridge/AP. If it is, that could explain some of the packet loss.
3) There is a "TRUSTED" mode that CAN uses for FIRST, and it requires FIRST-specific firmware on the Jags. Level 92, I believe, is the latest. It shouldn't have any negative impact on the FMS. We certainly didn't experience any through 9 qualifying rounds plus QFs, SFs and finals.
4 & 5) I haven't seen or experienced either of those.
6) I took no chances and FTP'd the latest 2CAN driver to the cRIO after I installed V28. |
Re: Unexplained intermittent CAN / 2CAN Jaguar problems at GSR
Quote:
<R59> If CAN-bus communications are used, the CAN-bus must be connected to the cRIO-FRC through either the Ethernet network connected to Port 1, Port 2, or the DB-9 RS-232 port connection.
And
<R50> Connections to the cRIO-FRC Ethernet ports must be compliant with the following parameters:
A. The DAP-1522 radio is connected to the cRIO-FRC Ethernet port 1 (either directly or via a CAT5 Ethernet pigtail).
B. Ethernet-connected COTS devices or custom circuits may connect to either cRIO-FRC Ethernet port; however, these devices may not transmit or receive UDP packets using ports 1100-1200 except for ports 1130 and 1140.
That seems to be two conflicting rules, but the 2CAN would probably count as a "pigtail".
Also, the 2CAN must be connected to the PD board through a dedicated 20 A breaker, i.e. not on the end part.
<R39> F. Custom circuits and sensors powered via the cRIO-FRC or the Digital Sidecar are protected by the breaker on the circuit(s) supplying those devices. Power feeds to all other custom circuits must be protected with a dedicated 20-amp circuit breaker on the PD Board.
Hooray for rules quotations. I am confused about why rule <R50> says the radio may be connected directly OR with a pigtail. Them female to female connections, eh? (Yeah, yeah I know, a pigtail is a male-female...) |
Re: Unexplained intermittent CAN / 2CAN Jaguar problems at GSR
As another data point-
95 also had CAN bus issues at GSR, as did several other teams. We used all black Jags and the serial port. It was enough of an issue that FIRST personnel polled CAN teams and advised them that they couldn't say what was going on, but that it seemed like CAN wasn't always working. We elected to switch to PWM control (which also brought a half-dead sidecar to our attention). The last I had heard, a common factor in all this was that the affected teams were using Java as their programming language. Beyond that, there didn't seem to be any commonality. I heard some claims that it only occurred on one driver station or one side of the field, but I can't confirm whether that was true. |
Re: Unexplained intermittent CAN / 2CAN Jaguar problems at GSR
1511 also saw similar failures during our time at FLR this past weekend. Fortunately for us this never manifested out on the playing field, only when testing in our pit.
I was able to attach the Wind River debugger a few times when we caught the error, and I have some technical details to compile and share with the developers (on my list of stuff to do, hopefully today). It appears almost certainly to be a race condition during startup: when the failure is tripped, a task started for CAN handling terminates after following a bad pointer. For completeness, our setup is serial-based with all black Jags in the chain, C++ with the latest (2/16) update, and a cRIO with image v28. |
Re: Unexplained intermittent CAN / 2CAN Jaguar problems at GSR
Quote:
I also remember reading it somewhere else, like the FIRST Forums or a Team Update, but I can't remember exactly where. If/when I find or remember it, I'll update this.
Quote:
Quote:
We intentionally chose not to use the serial interface for the following reasons: a 128 kbps transfer rate over serial versus a 1 Mbps transfer rate using the 2CAN (CAN is typically a 1 Mbps network). I also noticed that under heavy processing load the serial interface on the cRIO would lag, or appear to garble some data, etc. So we opted to avoid that potential pitfall. |
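To put rough numbers on that comparison, here is a back-of-the-envelope sketch in C++. It takes the two rates quoted above at face value (128 kbps serial, 1 Mbps CAN) and ignores all protocol overhead. The function name `TransferTimeMs` and the `bitsOnWirePerByte` parameter are made up for illustration and are not from WPILib or the 2CAN driver.

```cpp
#include <cassert>

// Hypothetical helper: time in milliseconds to move `payloadBytes` over a link
// running at `bitsPerSecond`. `bitsOnWirePerByte` lets you account for framing
// (for example 10 for a serial link with one start and one stop bit); the
// default of 8 assumes no overhead at all, which is optimistic for both links.
double TransferTimeMs(double payloadBytes, double bitsPerSecond,
                      double bitsOnWirePerByte = 8.0) {
    assert(bitsPerSecond > 0.0);
    return payloadBytes * bitsOnWirePerByte / bitsPerSecond * 1000.0;
}
```

Under these assumptions, 1000 bytes of status data would take about 62.5 ms at 128 kbps and about 8 ms at 1 Mbps, which gives a feel for why a serial gateway can eat a large part of a control loop period when many Jaguars are polled.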
Re: Unexplained intermittent CAN / 2CAN Jaguar problems at GSR
Add Team 241 to the list of teams at GSR with CAN/2CAN problems.
But we were using C++. We did not ask around early enough and did not get the word that it was a general problem. We did switch our robot over to PWM before it was crated for Boston. http://www.chiefdelphi.com/forums/sh...13&postcount=3 |
Re: Unexplained intermittent CAN / 2CAN Jaguar problems at GSR
Just call me curious...
Everyone who has the Jaguar problem on the CAN bus, please consider telling us: Have you put capacitors on the backs of your motors to dampen the noise? Have you considered putting capacitors across the encoder power wires? It's possible that noise from your electrical sources and from the brushed motors attached to the motor side of the Jaguars is causing issues for the CAN bus. We've noticed this issue as well, and we have two solutions: we detect the failure and force a reset, and we are trying to get rid of all the noise that could cause things like this to happen. |
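The post above does not say how the failure is detected or what gets reset, so the following is only a minimal C++ sketch of one way a "detect and force a reset" policy could look. It assumes a failure means several CAN transactions timing out in a row. The class `CanFaultWatchdog` and its methods are hypothetical and not part of WPILib; the actual reset action (reinitializing the CAN objects, rebooting, etc.) is left to the caller.

```cpp
#include <cassert>

// Hypothetical helper, not part of WPILib: counts consecutive CAN transaction
// timeouts and tells the caller when a reset should be forced.
class CanFaultWatchdog {
public:
    explicit CanFaultWatchdog(int maxConsecutiveTimeouts)
        : limit_(maxConsecutiveTimeouts) {
        assert(limit_ > 0);
    }

    // Call once per CAN transaction with its outcome.
    // Returns true when the consecutive-timeout limit has just been reached.
    bool Record(bool timedOut) {
        if (!timedOut) {
            consecutive_ = 0;  // any successful transaction clears the streak
            return false;
        }
        ++total_;
        if (++consecutive_ < limit_) {
            return false;
        }
        consecutive_ = 0;  // start a fresh streak after requesting a reset
        ++resets_;
        return true;
    }

    int TotalTimeouts() const { return total_; }
    int ResetsRequested() const { return resets_; }

private:
    int limit_;
    int consecutive_ = 0;
    int total_ = 0;
    int resets_ = 0;
};
```

In a robot loop you would feed it the result of each transaction, for example `if (watchdog.Record(timedOut)) { /* force the reset here */ }`. Requiring a streak rather than a single timeout keeps an occasional dropped transaction from triggering a reset mid-match.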
Re: Unexplained intermittent CAN / 2CAN Jaguar problems at GSR
We have also experienced the -44087 error along with the same host of symptoms. It happened once during a practice match and never during an actual match. We of course are using a 2CAN, and we have only seen the issue when it is used in FRC (cRIO, FRC Jag firmware 92). We have not seen the issue when the 2CAN is being used in the Crosslink Control System. It does seem to be a race condition, based on the observed behavior and the difficulty of reproducing the failure. I have reported the problem to Omar, and he has been unable to reproduce the issue. The fact that a soft reboot of the cRIO fixes the issue, and that the problem has been seen with both the serial and Ethernet gateways, leads me to believe the problem is above the gateway level. We will continue to test using the 2CAN and will advise if any bugs are found on the 2CAN side.
|
Re: Unexplained intermittent CAN / 2CAN Jaguar problems at GSR
Last year we experienced several issues using CAN and lag was one of them.
We connected an oscilloscope to two digital outputs. One scope channel was controlled by software code that turned on at the start and off at the end of our periodic loop. The other channel was driven by a toggle function at the beginning of our code. This helped to define the state and made a good trigger signal.
We found that our code was taking longer than our periodic time. Not a good situation. The problem occurred because of the CAN communications. We were driving 8 Jaguars and polling for just about all the data we could, so we could log the data. As a result, we multiplexed the data out over 4 periods. This reduced our code run time to about 55 milliseconds. We run our periodic loop at 100 milliseconds.
This arrangement has been institutionalized this year with a special cable that connects directly to the breakout board and scope. We regularly verify that the code is running under the period.
A change we made from last year was ground loop isolation. Remove the ground wire in the CAN cable. This is a direct ground loop mess. If you are using one encoder to drive two Jaguars, you will need optical isolation to remove the ground loop.
Another change is that we added filtering on the Jaguar at the encoder connector. We were seeing resets on the Jaguars, and the filter seems to be the solution. Resets will cause all kinds of problems. I have posts regarding these issues on other threads.
We are using a 2CAN and C++. The serial port baud rate of 115k was just not fast enough to transfer all the data we wanted to transfer. We don’t have all of the issues fixed, but I wanted to share these in hopes that it will help others using CAN Jaguars.
Good luck! -Hugh |
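As a rough illustration of the multiplexing idea described above, here is a minimal C++ sketch that spreads status polling for a set of controllers over several loop periods, plus a trivial overrun check. It assumes the split is done per device (the post does not say whether the data was split per Jaguar or per data type). `PollScheduler` and `LoopOverran` are hypothetical names, not WPILib APIs, and the timing would in practice come from a scope or a timer rather than being passed in by hand.

```cpp
#include <cassert>
#include <vector>

// Hypothetical scheduler: each call to Next() returns the indices of the
// devices whose status should be polled in the current loop period, so that
// every device is polled exactly once every `periods` iterations.
class PollScheduler {
public:
    PollScheduler(int deviceCount, int periods)
        : deviceCount_(deviceCount), periods_(periods) {
        assert(deviceCount_ >= 0 && periods_ > 0);
    }

    std::vector<int> Next() {
        std::vector<int> due;
        // Device i is polled in the slot equal to i modulo the period count.
        for (int i = slot_; i < deviceCount_; i += periods_) {
            due.push_back(i);
        }
        slot_ = (slot_ + 1) % periods_;
        return due;
    }

private:
    int deviceCount_;
    int periods_;
    int slot_ = 0;
};

// True when the measured loop body time exceeds the loop period.
bool LoopOverran(double elapsedMs, double periodMs) {
    return elapsedMs > periodMs;
}
```

With 8 devices and 4 periods, each iteration polls two devices (0 and 4, then 1 and 5, and so on), so the logged data for any one Jaguar is refreshed every fourth loop. That trade-off between log freshness and loop time is the point of the technique.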
Re: Unexplained intermittent CAN / 2CAN Jaguar problems at GSR
Thanks for the detailed write-up!
Quote:
Quote:
Quote:
Quote:
Quote:
Quote:
Please let me know if you come up with any more details that might lead to the issue. Thanks, -Joe |
Re: Unexplained intermittent CAN / 2CAN Jaguar problems at GSR
Quote:
Quote:
-Joe |
Re: Unexplained intermittent CAN / 2CAN Jaguar problems at GSR
Quote:
On the practice field teams must unplug their DLink and replace it with the Practice field DLink. |
Re: Unexplained intermittent CAN / 2CAN Jaguar problems at GSR
Quote:
|
Copyright © Chief Delphi