Chief Delphi

Unexplained intermittent CAN / 2CAN Jaguar problems at GSR (http://www.chiefdelphi.com/forums/showthread.php?t=93338)

John Heden 04-04-2011 19:45

Re: Unexplained intermittent CAN / 2CAN Jaguar problems at GSR
 
Quote:

Originally Posted by jtdowney (Post 1049248)
R50A states "The DAP-1522 radio is connected to the cRIO-FRC Ethernet port 1 (either directly or via a CAT5 Ethernet pigtail)." We took this to mean that under no circumstance can an active device (2CAN) sit between the DAP-1522 and the cRIO. That leaves two choices, connect the 2CAN to the DAP-1522 or connect the 2CAN to Ethernet port 2 on the cRIO.

My team's robot (programmed in Java using IterativeRobot) has gone through one event with our 2CAN plugged into our DAP-1522 and has had no CAN-related trouble. We were running cRIO v28 (v29 wasn't out at the time) and 2CAN firmware v2.5 with the SVN rev 66 plugin on the cRIO. We have 6 Black Jaguars on the CAN bus with no sensor inputs or limit switches.

Perhaps we were very fortunate during our regional, but we have not had any serious CAN issues (knock on wood) since build. When we did have problems back then, all of them could be traced back to poorly made cables.

I am hoping our luck carries us through championship.

Greetings,

R59 adds some expansion to R50 stating:
<R59> If CAN-bus communications are used, the CAN-bus must be connected to the cRIO-FRC through either the Ethernet network connected to Port 1, Port 2, or the DB-9 RS-232 port connection.

Our 2CAN connection directly to the radio is probably a rule violation (inadvertent) but was an attempt to prevent errors when we were connected (legally) to port 1. A port 2 to 2CAN connection will be our next experiment to try to avoid this issue.

john

John Heden 04-04-2011 20:00

Re: Unexplained intermittent CAN / 2CAN Jaguar problems at GSR
 
Quote:

Did you by chance happen to have Windriver's target debugger connection open when this occurred in your pits? If so, did it report any abnormal terminations or errors?

If not, you might consider doing so as this can provide a wealth of information when things go awry. You don't have to be "debugging" the code at the time -- you can be running a "deployed" program. The nice part about it is that it will report any task failures/terminations to you as well as stack information if available.

No, we didn't. Good idea for us to try if we can replicate this when we get our robot back.

Quote:

Are you able to go into a bit more detail on what your dashboard task is doing exactly? Particularly in relation to CANJaguar objects, as well as frequency of iteration.
We send dashboard data during disable as well, but do it as part of the normal disable processing routine rather than as a separate task.

I'm not familiar with the "normal disable processing routine". Is that a Java or LabVIEW construct? We use C++ and spawn a separate simple polling loop that polls our 5 Jaguars for encoder values, limit switch states, current draw, etc. There are probably 15 or so Jaguar calls in a polling loop with a 50 ms sleep (a rate of roughly 20 Hz at most). This thread is launched immediately at the end of our robot constructor.

Thanks,
john

John Heden 04-04-2011 20:16

Re: Unexplained intermittent CAN / 2CAN Jaguar problems at GSR
 
Quote:

Originally Posted by nuttle (Post 1049236)
Do you know if your 2CAN has had the firmware updated to version 2.5 or not? We have a regional this coming week/end and will try out v29 on the cRIO. We've been using serial, but it is easy enough to switch back and forth that we might try the 2CAN again, at least for practice matches. We have not yet had a chance to try either of these updates.

Yes! We updated to v29 and had v2.5. We started to try a switch to serial but ran into some difficulties trying to craft a proper RS-232 to Black Jag cable during the competition, as well as finding an RS-232 to USB converter. There were a number of folks at Hartford who were suspicious of the 2CAN and suggested that an RS-232 conversion might solve our intermittent problem. We remain optimistic that a robust 2CAN solution is possible.

jtdowney 04-04-2011 21:16

Re: Unexplained intermittent CAN / 2CAN Jaguar problems at GSR
 
Quote:

Originally Posted by John Heden (Post 1049417)
Greetings,

R59 adds some expansion to R50 stating:
<R59> If CAN-bus communications are used, the CAN-bus must be connected to the cRIO-FRC through either the Ethernet network connected to Port 1, Port 2, or the DB-9 RS-232 port connection.

Our 2CAN connection directly to the radio is probably a rule violation (inadvertent) but was an attempt to prevent errors when we were connected (legally) to port 1. A port 2 to 2CAN connection will be our next experiment to try to avoid this issue.

john

What R59 means to me is that the 2CAN (or any custom circuit) is allowed to be on the 10.0.0.0/8 subnet and communicate through port 1 by being connected to the DAP-1522. It does not mean you can directly wire the 2CAN between the cRIO and the DAP-1522 because that would violate R50.
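
As a small aside on the subnet point, here is a minimal Python sketch of what "on the 10.0.0.0/8 subnet" means in practice. The two addresses are hypothetical and only for illustration; they are not taken from any robot in this thread.

```python
import ipaddress

# The address block mentioned in the post above.
ROBOT_NETWORK = ipaddress.ip_network("10.0.0.0/8")

def on_robot_network(address: str) -> bool:
    """Return True if the address falls inside 10.0.0.0/8."""
    return ipaddress.ip_address(address) in ROBOT_NETWORK

# Hypothetical device addresses, for illustration only.
print(on_robot_network("10.12.34.10"))   # inside the block
print(on_robot_network("192.168.0.10"))  # outside the block
```

Any device whose address passes this check can talk to the cRIO through the radio without being wired in between the two.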

MikeE 04-04-2011 22:05

Re: Unexplained intermittent CAN / 2CAN Jaguar problems at GSR
 
I was experimenting with the CAN bus on our spare electronics board earlier and discovered some characteristics caused by termination problems which might be helpful.
I'm not claiming this is the cause of many, or indeed any, CAN issues for other teams but it is one potential failure mode.

Our practice board uses the documented terminator: an RJ-12 with a 100 ohm resistor crimped between pins 3 & 4. (We also added a short length of telephone wire crimped into pins 1 & 6 as a handle for insertion / extraction, which I don't believe has any bearing on the issues.)
DMM testing showed a resistance of ~100 ohm, as expected. However, at some stage the resistor leads had bent towards each other to the point where they were almost touching, creating a potential short between the CANH and CANL lines. Mechanical shock could cause them to touch for an instant.

I discovered that when the lines were shorted even briefly the CAN bus failed completely. Interestingly, removing the short did not restore CAN, and neither did removing the short and rebooting the cRIO. However removing the short and power-cycling the robot did restore CAN. I predict that just power-cycling the Jaguars would have the same effect, but I didn't test this (although our code would still need to be rerun to initialize CAN properly).

We are using serial-CAN so we could still control the bridging BlackJag using RS232 while the CAN bus was out. I conclude that only seeing the bridging Jaguar is a useful diagnostic indicator for a potential terminator problem - sorry 2CAN users.

drakesword 05-04-2011 00:03

Re: Unexplained intermittent CAN / 2CAN Jaguar problems at GSR
 
Did some experimenting this weekend

We used last year's robot with a 2CAN on port 1 and tested the following with 4 Jags. Using code to time each run, we ran the robot for two minutes and printed out the total number of errors.

A) 6 Conductor wires with bad routing (had a few cables wrapped around power cables)
B) 6 Conductor wires with good routing (not near power wires)
C) 2 Conductor wires with bad routing
D) 2 Conductor wires with good routing
E) 120 Ohm Termination
F) 100 Ohm Termination
G) Using PID
H) Using just voltage control
I) Having a dead battery

A, C, E, F had an average of 16 errors
B had an average of 12 errors
*G had an average of 17 errors*
H had an average of 8 errors
I had an average of 29 errors

Now, an interesting trend we saw with PID (marked with * above):
Not many errors were seen with just normal movements. We used current control, first with the maximum set to 40 amps and then with the maximum set to 20 amps. With 40 amps the robot drove without a problem. With 20 amps we had issues turning (the wheels are out of alignment), but more interesting was that when the robot stalled or nearly stalled (with 20 amps) we threw many, many more errors than when we didn't stall. So at a 20 amp stall there were more CAN errors, but with draws approaching 40 amps (without stalling) there were fewer errors.

This seems to point to an issue either with the firmware on the Jag when running PID and doing heavier calculation, or with the driver on the cRIO waiting for a response that cannot be returned in time because of those calculations.

Obviously with a low battery or high draws you will have more errors due to Jags browning out.

Hugh Meyer 05-04-2011 09:27

Re: Unexplained intermittent CAN / 2CAN Jaguar problems at GSR
 
Bryant,

How did the "D" option turn out? I didn't see it in the results.

-Hugh

drakesword 05-04-2011 11:16

Re: Unexplained intermittent CAN / 2CAN Jaguar problems at GSR
 
Sorry forgot to list it.

The average for D was 11 errors, which we didn't consider a significant improvement over B.


Another small note I forgot to add: with B and D we encountered these errors even if the robot was not moving at all for the entire 2 minutes.

I do believe this is a case of processor starvation.

I have not looked at the source of the Jag firmware, but does the Jag ever decide to "drop" a frame to keep up with the timing of the PID? Or could there be an interrupt overriding the sending of a frame?

When I was messing with the jags on my own I was sending one frame to each Jag I wanted to control. Then the jags would execute the command until another one was sent.

So in that situation I would say "Move the motor X rotations" and not do anything else. The Jag happily went on its job making the motor go X revolutions.

It's my understanding that this has been modified for safety (which is understandable) to require the "trusted" heartbeat. Does anyone know the interval within which the heartbeat must be received in order to keep the Jag alive? And does anyone know the actual rate at which the heartbeat is being sent?

mjcoss 05-04-2011 12:35

Re: Unexplained intermittent CAN / 2CAN Jaguar problems at GSR
 
I'm all in favor of trying to standardize information and see if any picture emerges.

1) Correct/Proper wiring both CAN and Power.
CAN wiring is correct, and has proper termination
2) Ground isolation of components.
Not sure about this.
3) CAN Interface used, Serial or 2CAN
2CAN connected to port 2 of CRIO. Currently running firmware 2.1, but think that updating to 2.5 would be a good idea.
4) Type of JAGS used, Black, Grey, Both
9 Black Jags all running firmware 92.
5) The programming language used
C++
6) The control mode(s) used.
Voltage mode
7) encoders, pots, limit switches are being used.
limit switches are connected to 3 Jags

We built two robots, and so I've been playing with our B bot to see if I can find the root cause of our timeouts.

Some observations:
1) Even in disabled mode, I occasionally see timeouts. There is some polling done in disabled mode to check the status of the limit switches, but the traffic should be small, and there really shouldn't be any problem at all.

2) I did accidentally make the problem worse, and this led me to believe that it was a CAN bus bandwidth issue. Instead of using the Jaguars to implement a PID loop for our lifter, we are using the cRIO. I created a separate task to monitor the setpoint and make speed adjustments. Originally, the task ran 100 times a second. This triggered a large number of timeout messages. Slowing the task down reduced the timeout messages. But after reading through this thread, it's not clear to me why this should be the case.

3) Autonomous mode seems to be much more sensitive to timeout issues than teleop.

And finally one question: I've seen posts in this thread that suggest adding capacitors to reduce noise. Is there any consensus on doing that? What size capacitors? How exactly should they be wired?

Thanks.

kamocat 05-04-2011 23:09

Re: Unexplained intermittent CAN / 2CAN Jaguar problems at GSR
 
A heartbeat or trusted command is required every 100ms for a Jaguar to stay enabled.
(The "set output" messages are trusted commands. The heartbeat will keep all the Jaguars enabled, but if you want to use trusted commands, you will need to update the output of every Jaguar that you want to keep enabled.)
I believe resuming the heartbeat will NOT cause your motors to move again if the output is not re-sent. (This makes sense with the behavior I remember, but I have not explicitly tested for this behavior.)

EDIT:
mjcoss, I can probably answer some of your questions.

Yes, in updating all your motors every 10ms, you will see many timeout issues. Setting the output takes 5-10ms (often 8ms) for a single Jaguar.

Autonomous is likely more sensitive than Teleop because Teleop sends the output continuously. (Theoretically this is every 20ms, but with 4 Jaguars it could be as slow as every 40ms. In LabVIEW, the drive motors are dealt with in near-parallel, which reduces the time by about 50%. I don't know if the other languages do this as well.)
In Autonomous, the output is usually only updated when it changes. This means that if the heartbeat is temporarily prevented from getting through, it will not do anything when it finally DOES get through, unless you've used a trusted output command.
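
The arithmetic behind the 10ms observation can be sketched in a few lines of Python. This is only a rough model: it assumes set-output commands are sent strictly one after another (so it ignores the near-parallel handling mentioned for LabVIEW) and uses the typical 8ms figure quoted above. The function names are made up for illustration.

```python
def update_time_ms(num_jaguars: int, per_set_ms: float) -> float:
    """Time to send one set-output command to each Jaguar, back to back."""
    return num_jaguars * per_set_ms

def fits_in_period(num_jaguars: int, per_set_ms: float, period_ms: float) -> bool:
    """True if every Jaguar can be updated before the next loop iteration."""
    return update_time_ms(num_jaguars, per_set_ms) <= period_ms

# 8 ms per command is the "often" figure from the post above.
for n in (1, 2, 4):
    total = update_time_ms(n, 8.0)
    print(n, "Jaguars:", total, "ms per pass;",
          "fits 10 ms loop:", fits_in_period(n, 8.0, 10.0),
          "fits 100 ms loop:", fits_in_period(n, 8.0, 100.0))
```

Under these assumptions even two Jaguars need about 16ms per pass, so a 100 Hz loop keeps asking for more bus time than is available, while a slower loop leaves headroom.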

Radical Pi 06-04-2011 01:21

Re: Unexplained intermittent CAN / 2CAN Jaguar problems at GSR
 
Quote:

Originally Posted by kamocat (Post 1050137)
Yes, in updating all your motors every 10ms, you will see many timeout issues. Setting the output takes 5-10ms (often 8ms) for a single Jaguar.

You wouldn't happen to have any numbers of what parts of the system these times are coming from, would you? I've been digging through the jaguar source code again and found some interesting stuff in the CAN interface.

CAN/Serial have the lowest priority interrupts, being called whenever new data is on the bus. This interrupt also has the responsibility of interpreting the message and either returning an ACK or any relevant data over the bus. Above that is an interrupt timed every 1ms which reads in any queued commands over CAN, runs a PID tick, checks limit switches, updates LED, and updates the outputs. At an even higher level is the ADC data reader. At the top of the regularly called interrupts is an interrupt for an edge on either the PWM or encoder lines. Obviously above that is the internal watchdog.

So, from what I gather, if a message is sent at an unlucky time, the internal interrupts could slow down the response over CAN, with some system conditions making this more likely than others (e.g. PID makes the 1ms interrupt run a little longer, an encoder causes more interrupts to be called, etc.).

Jumping back to the cRIO, a jaguar has 10ms to ACK before an error is returned. Judging by the 8ms (that was without PID, right?) average posted by kamocat, there isn't much wiggle-room in the system. Here's another interesting fact: last year the timeout was 50ms.

Now, here's a test I'd run if I had the hardware to do so. Connect a jaguar over CAN, attach an encoder and a motor to it. Record an average response time of the jaguar with and without the encoder being spun. If the response time increases by a significant figure with the encoder attached, then possibly the interrupts in the jaguars are causing these problems.
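
A test like this could be scripted roughly as follows. This is a hedged Python sketch, not code from WPILib or the Jaguar tools: `request` stands for a hypothetical function that sends one CAN message and blocks until the reply arrives, and the summary simply compares two runs (for example, encoder still versus encoder spinning).

```python
import statistics
import time
from typing import Callable, Dict, List

def measure_latencies_ms(request: Callable[[], None], samples: int) -> List[float]:
    """Time `samples` request/response round trips and return them in ms."""
    results = []
    for _ in range(samples):
        start = time.perf_counter()
        request()  # hypothetical: send one message, block until the ACK
        results.append((time.perf_counter() - start) * 1000.0)
    return results

def compare(baseline: List[float], loaded: List[float]) -> Dict[str, float]:
    """Summarise two latency runs so an increase under load stands out."""
    base_mean = statistics.mean(baseline)
    load_mean = statistics.mean(loaded)
    return {
        "baseline_mean_ms": base_mean,
        "loaded_mean_ms": load_mean,
        "increase_ms": load_mean - base_mean,
        "baseline_max_ms": max(baseline),
        "loaded_max_ms": max(loaded),
    }
```

Looking at the maximum as well as the mean matters here, because a timeout is triggered by the slowest reply rather than the average one.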

Sorry if I rambled a bit up there, time for sleep

kamocat 06-04-2011 02:01

Re: Unexplained intermittent CAN / 2CAN Jaguar problems at GSR
 
Sorry, I couldn't fish any info out of TI about why this is. My questions on the TI forum were largely left unanswered.

Once I get some more time, I could do some additional testing. That may not be until June.

mjcoss 06-04-2011 14:35

Re: Unexplained intermittent CAN / 2CAN Jaguar problems at GSR
 
Thanks for the average update times, Marshal. Just as a side note, We're not updating all of the motors, just the ones controlled by the PID loop (2 in this case).

I really would like to understand why I'm seeing intermittent timeouts in disabled mode. I'm going to update the 2CAN firmware tonight and hope that it addresses some of the issues.

drakesword 07-04-2011 22:17

Re: Unexplained intermittent CAN / 2CAN Jaguar problems at GSR
 
Haha, bigger issues today! Oh boy. There was a delay in starting the field (roughly 12 minutes), and when the match started we were only able to drive with one Jag. WOO. We brought the robot back (with the power still on) and opened the 2CAN page: we only saw 1 Jag. We power cycled and then we saw all 5!

ozrien 08-04-2011 00:49

Re: Unexplained intermittent CAN / 2CAN Jaguar problems at GSR
 
I've definitely seen this symptom before, where a Jaguar becomes unresponsive until it is power cycled, so I ran some tests tonight and I've managed to reproduce this a number of ways....

Test 1 - 1 2CAN and 1 Black Jaguar.
Take a 2CAN and connect CAN cable from it to single Jaguar. Put termination resistor on Jaguar.
Power 2CAN, 2CAN LED will be orange. [ok]
Bridge CAN High and CAN Low together, hold until 2CAN LED turns red. [ok]
Let go of CAN High and CAN Low and 2CAN LED will return to orange. [ok]
Repeat until 2CAN LED stays red even after unbridging the CAN lines. [not ok]
Power cycle JUST the Jaguar and LED returns to orange. [not ok]

What happens here is when there is no cRIO, webdash, or host of any kind, the 2CAN periodically sends an enum CAN request to see who's out there. If no one responds with a CAN frame of any kind, 2CAN LED goes red to indicate a CAN problem. If 2CAN gets CAN frames within a timeout, LED is orange to indicate no CAN problems and that it's ready for any Ethernet traffic from cRIO/WebDash.

After bridging and unbridging several times the Jaguar no longer responds to the enum request from the 2CAN, which becomes apparent when the LED stays red.

Test 2 - 2 X Black Jag, one is Serial Gateway
Connect Black Jag Serial to PC using serial connection.
Connect Black Jag #2 to Serial Black Jag using CAN.
Open BDC comm
Drive Black Jag #2, setting full throttle. No need for motors.
Begin bridging and unbridging CAN H and CAN L until Black Jag#2 no longer drives (blinks orange). [ok].
Press enum on BDC comm to re-find the Jaguars.
You will only see the Serial Black Jaguar ID in the drop down [not ok].
Close/open BDC comm and you still will not be able to connect to Black Jag #2 [not ok].
Power cycle Jags and you will be able to resume CAN communication.

Test 3 - CAN tool and 1 Black Jag.
Connect a USB CAN tool to a black Jag.
I used Intrepid Control System's ValueCAN 3, and Vehicle Spy 3 software.
Transmit a 29 bit ID frame with DLC set to zero and arbitration id set to 0x240. This is an enum request.
Confirm Jaguar responds with CAN frame. [ok].
Transmit frame periodically, say 20 ms.
Each frame should have a response frame after each request (few milliseconds). [ok]
Begin bridging and unbridging CAN High and CAN low.
Confirm that CAN tool's transmit error count increases then decreases back to zero each time you bridge and unbridge.
Eventually the Jaguar will no longer respond even after unbridging CAN lines [not ok].

Conclusion
Bridging and unbridging the CAN lines can put the Jaguars into a state that's unrecoverable until you power cycle them. This can happen if the legs of your termination resistor are close together.
I suspect that momentarily losing the termination resistor because of a bad cable can also cause this symptom. Maybe the Jaguar can't handle error frames gracefully or gets stuck in the bus-off state (see the error states in the CAN spec: 128 occurrences of 11 consecutive recessive bits should bring a bus-off CAN node back into the error-active state). Or maybe there is a race condition in the ISRs (mentioned earlier in the thread).
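
For readers unfamiliar with the CAN error states mentioned here, the following Python sketch models the transmit side of CAN fault confinement in a simplified way. It is an illustration of the general CAN 2.0 rules (transmit error counter up by 8 on an error, down by 1 on success, error-passive at 128, bus-off above 255, recovery after 128 runs of 11 consecutive recessive bits), not a model of the Jaguar firmware, and it leaves out the receive counter and several special cases.

```python
class CanNode:
    """Simplified transmit-side fault confinement for one CAN node."""

    def __init__(self) -> None:
        self.tec = 0             # transmit error counter
        self.bus_off = False
        self.idle_sequences = 0  # runs of 11 recessive bits seen while bus-off

    @property
    def state(self) -> str:
        if self.bus_off:
            return "bus-off"
        return "error-passive" if self.tec >= 128 else "error-active"

    def transmit(self, ok: bool) -> None:
        """Record the outcome of one transmission attempt."""
        if self.bus_off:
            return  # a bus-off node does not take part in bus traffic
        if ok:
            self.tec = max(0, self.tec - 1)
        else:
            self.tec += 8
            if self.tec > 255:
                self.bus_off = True
                self.idle_sequences = 0

    def see_idle_sequence(self) -> None:
        """Call once per observed run of 11 consecutive recessive bits."""
        if not self.bus_off:
            return
        self.idle_sequences += 1
        if self.idle_sequences >= 128:
            self.bus_off = False
            self.tec = 0
```

In this model a node that is held in a failing state for 32 consecutive transmissions goes bus-off, and a compliant node should then recover on its own once the bus is quiet again. A device that stays silent until it is power cycled is consistent with a node that never completes that recovery, which matches the symptom described in the tests above.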

mjcoss 09-04-2011 06:06

Re: Unexplained intermittent CAN / 2CAN Jaguar problems at GSR
 
The problems continue. We are at the Philly regional where, in several of our matches, we've been stalled due to CAN bus timeouts. I switched our bus terminator with a spare and that seems to have helped. I love that they publish an update stating that if you are having cascading timeout errors you can request a reboot, only to be told by the field personnel, "Sorry, we are behind schedule."

We're running v29 on the cRIO and 2.5 on the 2CAN. I'm just about to throw in the towel and switch to PWM, and may do so for St. Louis if there isn't some progress towards a solution. The above post certainly points to a bug in the Jags; however, it requires repeated shorts on the CAN bus, if I parse it correctly.

drakesword 09-04-2011 09:34

Re: Unexplained intermittent CAN / 2CAN Jaguar problems at GSR
 
I do believe that the firmware and driver should not require a response from the jag for every message sent. Instead it should be based off of the Heartbeat only and it should increase the timeout time. This would reduce the amount of bandwidth needed to communicate with the cRIO.

For the jag to respond to every command is crazy.

Is there any way we can look at the FRC jag firmware source?

kamocat 09-04-2011 10:25

Re: Unexplained intermittent CAN / 2CAN Jaguar problems at GSR
 
The Jag does respond to every message. I know this from looking at the CAN subVIs.
On the "send" messages, it does an ACK as well. (Not sure what this is for)

You can find the Jag firmware source here:
http://www.luminarymicro.com/products/rdk-bdc24.html
or here:
http://www.luminarymicro.com/products/rdk_bdc.html

I believe what you want is in the firmware development package.

drakesword 09-04-2011 12:52

Re: Unexplained intermittent CAN / 2CAN Jaguar problems at GSR
 
Quote:

Originally Posted by kamocat (Post 1050899)
The Jag does respond to every message. I know this from looking at the CAN subVIs.
On the "send" messages, it does an ACK as well. (Not sure what this is for)

You can find the Jag firmware source here:
http://www.luminarymicro.com/products/rdk-bdc24.html
or here:
http://www.luminarymicro.com/products/rdk_bdc.html

I believe what you want is in the firmware development package.


Thanks. Yes I know that the jag responds to every message, I was saying maybe it shouldn't need to ...

jhersh 11-04-2011 02:02

Re: Unexplained intermittent CAN / 2CAN Jaguar problems at GSR
 
Quote:

Originally Posted by drakesword (Post 1050913)
Thanks. Yes I know that the jag responds to every message, I was saying maybe it shouldn't need to ...

The main reason it needs to handshake is that the CAN peripheral in the Jag handles each message based on interrupts... if a message isn't handled before a new message comes in, it will lose the packet and not act on the expected command. To prevent this, the cRIO waits until the Jag claims to be done (via an ACK).

jhersh 11-04-2011 02:11

Re: Unexplained intermittent CAN / 2CAN Jaguar problems at GSR
 
Quote:

Originally Posted by drakesword (Post 1049721)
It's my understanding that this has been modified for safety (which is understandable) to require the "trusted" heartbeat. Does anyone know the interval within which the heartbeat must be received in order to keep the Jag alive? And does anyone know the actual rate at which the heartbeat is being sent?

Either a heartbeat or a trusted set output command must be received by a Jag within 100ms of the last one to keep the Jag running. Every 15ms, the cRIO checks if the user has sent a trusted set output command within 75ms of the last heartbeat or trusted set output command. If not, it sends a heartbeat. This means that the cRIO will send a heartbeat to keep the Jag running somewhere between 75ms and 90ms (if needed).
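
The 75ms to 90ms window follows directly from the two numbers given. A minimal Python sketch of that timing is below; it assumes the periodic check runs at exact multiples of 15ms, that no further trusted commands arrive, and that "within 75ms" means the last trusted message is less than 75ms old. The function name is made up for illustration.

```python
CHECK_PERIOD_MS = 15      # how often the cRIO checks
HEARTBEAT_AFTER_MS = 75   # age of the last trusted message that triggers a heartbeat
JAGUAR_TIMEOUT_MS = 100   # the Jag stops if nothing trusted arrives in this time

def next_heartbeat_ms(last_trusted_ms: int) -> int:
    """Time of the first periodic check at which the last trusted
    message is at least 75 ms old (checks run at multiples of 15 ms)."""
    # First check strictly after the trusted message was sent.
    t = (last_trusted_ms // CHECK_PERIOD_MS + 1) * CHECK_PERIOD_MS
    while t - last_trusted_ms < HEARTBEAT_AFTER_MS:
        t += CHECK_PERIOD_MS
    return t

# Try every phase of the trusted message relative to the 15 ms check.
gaps = [next_heartbeat_ms(t0) - t0 for t0 in range(0, CHECK_PERIOD_MS)]
print(min(gaps), max(gaps))  # prints: 75 89
```

With whole-millisecond phases the gap ranges from 75ms to 89ms, which leaves at least 10ms of margin before the 100ms Jaguar timeout. That margin is what a delayed heartbeat has to fit into.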

jhersh 11-04-2011 02:18

Re: Unexplained intermittent CAN / 2CAN Jaguar problems at GSR
 
Quote:

Originally Posted by mjcoss (Post 1049372)
One thing that I have seen which is causing no end of issues is that if you get a timeout on the messages, the API returns no indication of the failure. So, for example, if the GetForwardLimitOK() function is called and times out, you get back false. There is no way to know that this has happened, and if you are making decisions based on these results... We have an encoder on our lift mechanism. To zero the encoder, we drive to the bottom limit switch, and when we get there, we set the encoder to 0. This works fine until we lose the message due to a timeout. From that point on the lift is offset by wherever the timeout occurred. There really needs to be a way within the API to detect that the transaction timed out.

Yes... that's a definite shortcoming of the C++ API. Both the LabVIEW and Java CANJaguar APIs have a mechanism for the user's program to know that there has been an error. I intended to remedy that this season, but didn't have the time. It will be fixed next year.

Quote:

Originally Posted by mjcoss (Post 1049372)
All in all, I'm really regretting the decision to use the CAN bus. And for the most part all of the features that I really wanted to use, that were provided by the CAN bus, proved to be unusable.

I'm sorry to hear that you aren't happy with the CAN implementation. What features did you want to use that you aren't using? What made them unusable for you?

mjcoss 11-04-2011 14:46

Re: Unexplained intermittent CAN / 2CAN Jaguar problems at GSR
 
Here's what we were planning:

1) We are using a two-speed transmission from AndyMark. It uses 2 motors, and has a single optical encoder built into the gearbox (on the output shaft). I had originally intended to use speed and position control on the drive subsystem: position control for autonomous, and speed control during teleop. We first tried just using a Y splitter to feed both Jaguars, and then tried a master/slave arrangement. While we got it to work, it seemed unstable. I'm sure that with more time we might have been able to get it to work reliably, but there really wasn't enough time.

2) Our lifter mechanism had the same problem. We had intended to use position control to put the arm at preset heights. Again, we had two BaneBots motors driving a custom gearbox with an encoder on the output shaft. The position control would work for some positions, but would oscillate and not find the set point at others. I tried tuning it several times and couldn't get something that was 100% stable. Again we tried a Y split and a master/slave configuration. We ended up moving the encoder to the cRIO and ran the PID loop on the cRIO. This gave me better control of the loop, and was immediately stable.

3) We have limit switches directly connected to the Jaguars, for the lifter mechanism and the claw. The bottom one on the lift mechanism was used to home the encoder. We drop to the bottom limit switch and set the encoder count to 0. This works fine unless we time out on the get-limit message. If that message times out, the API returns "at limit", and we stop. The premature stop of the motor put zero in the wrong place, throwing all subsequent offsets off by upwards of 6 inches. Apparently, in Java you can catch the exception, but I didn't see anything in the C++ code to allow the caller to know that the result of the call was bogus. I've modified my version of the limit switch calls to return an error status so that I can decide what to do if an error occurs.

Now add into the mix that, under some unknown set of circumstances, we get a flood of CAN bus timeout messages, making the robot completely unusable. Given all that, one has to wonder if using the CAN bus is a good idea. In addition, we get some number of intermittent timeouts which, while not fatal, cause weird behavior, because the code can't determine that a timeout has occurred: the caller gets back a valid-looking return value even if the call times out.
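
The fix described for the limit switch calls (returning an error status) can be sketched like this. The Python below only illustrates the idea and is not the WPILib API: `query` is a hypothetical function that returns True or False when the Jaguar answers and None when the transaction times out, and the retry count is an arbitrary choice.

```python
from enum import Enum
from typing import Callable, Optional

class Limit(Enum):
    OK = "ok"              # switch not tripped, travel allowed
    AT_LIMIT = "at_limit"  # switch tripped
    UNKNOWN = "unknown"    # every attempt timed out

def read_limit(query: Callable[[], Optional[bool]], retries: int = 3) -> Limit:
    """Read a limit switch, keeping 'timed out' separate from 'at limit'.

    query() is hypothetical: True means the limit is OK, False means the
    switch is tripped, None means the CAN transaction timed out.
    """
    for _ in range(retries):
        reply = query()
        if reply is not None:
            return Limit.OK if reply else Limit.AT_LIMIT
    return Limit.UNKNOWN

def should_zero_encoder(query: Callable[[], Optional[bool]]) -> bool:
    """Zero the encoder only on a confirmed limit hit, never on a timeout."""
    return read_limit(query) is Limit.AT_LIMIT
```

The key design point is the third state: with only true/false available, a timeout has to be reported as one of the two, and in the homing case that silently moves the zero position.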

I have updated the firmware to the latest versions. I've tried two different bus terminators. And new cables.

One item that we've seen that has led me to point a finger at the 2CAN is that our robot has an onboard compressor. We have noticed that if, during autonomous, the air pressure in the system is low and the compressor needs to run immediately, we get the flood of CAN bus timeout messages. It is as if running all 4 CIM motors, plus rehoming the lifter, plus dropping the arm, plus running the compressor is a trigger for the failure. My theory is that the voltage drop is causing a failure in the 2CAN. Of course, it could be something completely unrelated.

We are going to St. Louis and I'm seriously debating whether we should rewire the bot to use PWM cables.

drakesword 12-04-2011 18:22

Re: Unexplained intermittent CAN / 2CAN Jaguar problems at GSR
 
Quote:

Originally Posted by mjcoss (Post 1051739)
Here's what we were planning:

One item that we've seen that has led me to point a finger at the 2CAN is that our robot has an onboard compressor. We have noticed that if, during autonomous, the air pressure in the system is low and the compressor needs to run immediately, we get the flood of CAN bus timeout messages. It is as if running all 4 CIM motors, plus rehoming the lifter, plus dropping the arm, plus running the compressor is a trigger for the failure. My theory is that the voltage drop is causing a failure in the 2CAN. Of course, it could be something completely unrelated.

We are going to St. Louis and I'm seriously debating whether we should rewire the bot to use PWM cables.

It's my understanding that the 2CAN won't drop out until about 6 volts. Regardless, are you wiring it to the termination blocks, to the 12 V regulated supply, or to the 24 V regulated supply?

We had ours wired on the 24 V line in parallel with the solenoid bumper. We had dropouts fairly often, and when it happened we saw the PHY lights go out on the connector. We switched the 2CAN to port 2 and had another dropout some time later. We then swapped out our radios (we had issues) and have not had a dropout since. We also developed a pre-match procedure where one of the drivers logged into the 2CAN and checked that all the Jags were online. Once, when we started, only one Jag showed up. A quick power cycle and all of them were back online.

I'm going to write a quick Java applet that will display the 2CAN status information without the need to open a browser (i.e. have it pop up with the driver station on load).
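
The core of that pre-match check is just a comparison between the Jaguars that should be on the bus and the ones that actually show up. A minimal Python sketch is below; how the list of online IDs is obtained from the 2CAN is not shown, and the function names and any IDs passed in are hypothetical.

```python
from typing import Iterable, Set

def missing_jaguars(expected_ids: Iterable[int], online_ids: Iterable[int]) -> Set[int]:
    """Return the device IDs that should be on the bus but were not seen."""
    return set(expected_ids) - set(online_ids)

def ready_for_match(expected_ids: Iterable[int], online_ids: Iterable[int]) -> bool:
    """True only when every expected Jaguar has been seen."""
    return not missing_jaguars(expected_ids, online_ids)
```

Reporting which IDs are missing, rather than only a pass/fail flag, makes it quicker to tell a single bad cable from a bus-wide failure like the one-Jag case described above.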

mjcoss 13-04-2011 12:19

Re: Unexplained intermittent CAN / 2CAN Jaguar problems at GSR
 
Offhand I don't know for certain how the 2CAN is connected (I'm more of a software guy), but I believe that it is powered via the termination blocks. Should it be moved?

Just as an aside, how are people connecting the 2CAN? We have the cRIO connected to the D-Link via the cRIO's port 1 and to the 2CAN via the cRIO's port 2. The second connector on the 2CAN is unused. We had thought of adding a camera to the robot, and I was planning on plugging the camera into the unused port of the 2CAN.

However, when I first wired it up, I had the 2CAN connected to port 1 of the cRIO and the D-Link connected to the second port of the 2CAN. It seemed to me that in this configuration the 2CAN was going to have a lot more unnecessary work to do, as it would have to bridge all the traffic between the cRIO and the driver station.

drakesword 15-04-2011 18:49

Re: Unexplained intermittent CAN / 2CAN Jaguar problems at GSR
 
Preliminary release for the app is available. Go to this post

rrossbach 15-04-2011 23:31

Re: Unexplained intermittent CAN / 2CAN Jaguar problems at GSR
 
Our robot has 8 Jags (4 black, 4 grey) running off of 2CAN, and we do a fair amount of runtime data logging on the cRIO for post-match analysis. We see occasional timeouts (approx every 20sec or so) but catch them so it has never really caused a control problem.

Interestingly, even when seeing timeouts we have never seen any CAN errors reported on the 2CAN webdash. Assuming that the webdash is working correctly, that leads us to suspect that the occasional timeouts are not caused by any wiring/comms issues with the CAN bus itself, but rather by something else going on in the software stack within the cRIO.

Once nationals are over we're planning to test a variety of scenarios (running Jags off the 2CAN both with and without the cRIO, requesting tons of status info from the Jags continuously, etc) to try to collect very specific data on what scenarios provoke the timeouts. It's a great learning exercise for the students, and hopefully the data will help narrow down exactly what's going on. Can't promise exactly when we'll dig into this but we plan to post the "scenario design" we come up with and seek input from the community here.

- Ron
Team #2607 controls mentor

mjcoss 21-04-2011 09:55

Re: Unexplained intermittent CAN / 2CAN Jaguar problems at GSR
 
So I finally got a chance to run some tests on our second bot that was made in parallel with our competition bot. It is not exactly the same but we were seeing similar CAN bus timeouts.

I replaced the bus termination with a new one using a 120 ohm resistor (our old ones were 100 ohm), and we moved the 2CAN power supply to the 24 volt regulated power off the power distribution block (I think that's what the electrical guy said). Over a 10 minute session, cycling through teleop, autonomous, and disabled states, we saw only 2 timeout messages.

Unfortunately, I can't recreate the one "known" failure case that we have seen on our competition bot. That is, if we need to run the compressor at the start of autonomous mode, we get a complete CAN bus failure. The problem is that our second bot was partially disassembled, and so not all of the components are running/attached.

But it seems like a good sign, although why it would lose even one message is a bit of a mystery. Maybe we should increase the wait time by a small step to see whether the message was just delayed or was really lost.
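
One way to separate "delayed" from "lost" is to keep waiting past the normal timeout and record when, if ever, the reply shows up. The Python sketch below classifies such measurements. It assumes each sample is either a reply latency in milliseconds or None when nothing arrived within the extended window; the function name is made up, and the 10ms and 50ms figures in the usage line are the current and previous-season timeouts mentioned earlier in the thread.

```python
from typing import Dict, Iterable, Optional

def classify(latencies_ms: Iterable[Optional[float]],
             timeout_ms: float,
             extended_ms: float) -> Dict[str, int]:
    """Count replies that were on time, late but present, or never seen."""
    counts = {"on_time": 0, "delayed": 0, "lost": 0}
    for latency in latencies_ms:
        if latency is not None and latency <= timeout_ms:
            counts["on_time"] += 1
        elif latency is not None and latency <= extended_ms:
            counts["delayed"] += 1
        else:
            counts["lost"] += 1  # no reply inside the extended window
    return counts

# Usage with logged samples: classify(samples, timeout_ms=10.0, extended_ms=50.0)
```

If most timeouts land in the "delayed" bucket, a slightly longer wait would hide them; if they land in "lost", the cause is more likely on the bus or in the Jaguar itself.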

jtdowney 21-04-2011 10:23

Re: Unexplained intermittent CAN / 2CAN Jaguar problems at GSR
 
Quote:

Originally Posted by mjcoss (Post 1055134)
we move the 2CAN power supply to the 24volt regulated power off the power distribution block

Just a heads up, this is not a competition legal setup per R38A.

techhelpbb 21-04-2011 12:06

Re: Unexplained intermittent CAN / 2CAN Jaguar problems at GSR
 
We rebuilt the frame for our 2nd robot and once again, with an entirely rebuilt electrical board and new mounting for the cRIO (and some heavier wiring in some places), we still have CAN timeout errors occasionally, though we can still manage in voltage mode as we have done so far. So this is now 3 entire reworks of that system, all producing similar results. I should point out, however, that we can literally stall this current robot, and we can't demonstrate that the timeouts we see coincide with that stall event.

After addressing some issues with the crimps of the RJ connectors on this build, I'm adding to my list of projects a Jaguar CAN cable tester, maybe one that can produce a 1 or 2 MHz test signal for testing the wires, perhaps with a counter to see if pulses are being dropped.

We have set up a collection of test equipment that can be mounted on our robot (while it drives around freely), and when we are done with competition we'll hopefully get some nice readings of the signals at various points in this system and clear up some of the possible issues that have been suggested. I'd really like to get a long recording of the CAN bus communications prior to the timeouts, and an analog recording of the power supply voltages, the CAN bus, and a few other things. At the very least it'll be highly educational for the students and handy to have the tools (we'll share as we get it going). Hopefully what we work out will be budget-conscious enough that it can be replicated by other teams without undue cost (so far I feel that's the case).


All times are GMT -5.