#106
Unread 05-04-2011, 00:03
drakesword drakesword is offline
Registered User
AKA: Bryant
FRC #0346 (Robohawks)
Team Role: Mentor
 
Join Date: Jan 2006
Rookie Year: 2004
Location: USA
Posts: 200
drakesword is on a distinguished road
Re: Unexplained intermittent CAN / 2CAN Jaguar problems at GSR

Did some experimenting this weekend.

We used last year's robot with a 2CAN on port 1 and tested the following with 4 Jags. Using code to time each run, we ran the robot for two minutes and printed out the total number of errors.

A) 6 Conductor wires with bad routing (had a few cables wrapped around power cables)
B) 6 Conductor wires with good routing (not near power wires)
C) 2 Conductor wires with bad routing
D) 2 Conductor wires with good routing
E) 120 Ohm Termination
F) 100 Ohm Termination
G) Using PID
H) Using just voltage control
I) Having a dead battery

A, C, E, and F had an average of 16 errors
B had an average of 12 errors
*G had an average of 17 errors*
H had an average of 8 errors
I had an average of 29 errors
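
For anyone who wants to repeat the measurement, here is a minimal sketch of the "run for a fixed time and count errors" approach in Python. The `send_once` callable is hypothetical: it stands in for whatever one CAN transaction looks like in your own code and is assumed to return False when that transaction reports an error. This is not the code used for the numbers above.

```python
import time
from typing import Callable


def count_errors(send_once: Callable[[], bool],
                 duration_s: float = 120.0,
                 clock: Callable[[], float] = time.monotonic) -> int:
    """Call send_once() repeatedly for duration_s seconds and count failures."""
    errors = 0
    deadline = clock() + duration_s
    while clock() < deadline:
        # send_once() is assumed to return True on success, False on a CAN error.
        if not send_once():
            errors += 1
    return errors
```

Passing the clock in as a parameter makes it possible to check the counting logic on a desk without waiting two minutes per run.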

Now, an interesting trend we saw with PID (marked with *):
Not many errors were seen with just normal movements. We used current control, first with the maximum set to 40 amps and then with it set to 20 amps. With 40 amps the robot drove without a problem. With 20 amps we had issues turning (the wheels are out of alignment), but more interesting was that when the robot stalled or nearly stalled (with 20 amps) we threw MANY, MANY MORE errors than if we didn't stall. So at a 20 amp stall there were more CAN errors, but with draws approaching 40 amps (without stalling) there were fewer errors.

This seems to point to an issue either with the Jag firmware, when PID adds to its calculation load, or with the driver on the cRIO waiting for a response that cannot be returned in time because of those calculations . . .

Obviously with a low battery or high draws you will have more errors due to Jags browning out.
#107
Unread 05-04-2011, 09:27
Hugh Meyer Hugh Meyer is offline
Registered User
FRC #1741 (Red Alert Robotics)
Team Role: Mentor
 
Join Date: Feb 2009
Rookie Year: 2008
Location: Greenwood Indiana
Posts: 158
Hugh Meyer has much to be proud of
Re: Unexplained intermittent CAN / 2CAN Jaguar problems at GSR

Bryant,

How did the "D" option turn out? I didn't see it in the results.

-Hugh
#108
Unread 05-04-2011, 11:16
drakesword drakesword is offline
Registered User
AKA: Bryant
FRC #0346 (Robohawks)
Team Role: Mentor
 
Join Date: Jan 2006
Rookie Year: 2004
Location: USA
Posts: 200
drakesword is on a distinguished road
Re: Unexplained intermittent CAN / 2CAN Jaguar problems at GSR

Sorry forgot to list it.

The average for D was 11 errors, which we didn't consider a significant improvement over B.


Another small note I forgot to add: with B and D we encountered these errors even if the robot was not moving at all for the entire 2 minutes.

I do believe this is a case of processor starvation.

I have not looked at the source of the Jag firmware, but does the Jag ever decide to "drop" a frame to keep up with the timing of the PID? Or could there be an interrupt overriding the sending of a frame?

When I was messing with the jags on my own I was sending one frame to each Jag I wanted to control. Then the jags would execute the command until another one was sent.

So in that situation I would say "Move the motor X rotations" and not do anything else. The Jag happily went on its job making the motor go X revolutions.

It's my understanding that this has been modified for safety (which is understandable) to require the "trusted" heartbeat. Does anyone know the interval within which the heartbeat must be received in order to keep the Jag alive? And does anyone know the actual rate at which the heartbeat is being sent?
#109
Unread 05-04-2011, 12:35
mjcoss mjcoss is offline
Registered User
FRC #0303
 
Join Date: Jan 2009
Location: Bridgewater,NJ
Posts: 70
mjcoss is a jewel in the rough
Re: Unexplained intermittent CAN / 2CAN Jaguar problems at GSR

I'm all in favor of trying to standardize information and see if any picture emerges.

1) Correct/Proper wiring both CAN and Power.
CAN wiring is correct, and has proper termination
2) Ground isolation of components.
Not sure about this.
3) CAN Interface used, Serial or 2CAN
2CAN connected to port 2 of CRIO. Currently running firmware 2.1, but think that updating to 2.5 would be a good idea.
4) Type of JAGS used, Black, Grey, Both
9 Black Jags all running firmware 92.
5) The programming language used
C++
6) The control mode(s) used.
Voltage mode
7) encoders, pots, limit switches are being used.
limit switches are connected to 3 Jags

We built two robots, and so I've been playing with our B bot to see if I can find the root cause of our timeouts.

Some observations:
1) Even in disabled mode, I occasionally see timeouts. Now there is some polling done in disabled mode to check the status of the limit switches, but the traffic should be small, and there really shouldn't be any problem at all.

2) I did accidentally make the problem worse, and this led me to believe that it was a CAN bus bandwidth issue. Instead of using the Jaguars to implement a PID loop for our lifter, we are using the CRIO. I created a separate task to monitor the setpoint and make speed adjustments. Originally, the task ran 100 times a second. This triggered a large number of timeout messages. Slowing the task down reduced the timeout messages. But after reading through this thread, it's not clear to me why this should be the case.

3) Autonomous mode seems to be much more sensitive to timeout issues than teleop.

And finally one question: I've seen posts in this thread that suggest adding capacitors to reduce noise. Is there any consensus on doing that? What size capacitors? How exactly should they be wired?

Thanks.
#110
Unread 05-04-2011, 23:09
kamocat kamocat is offline
Test Engineer
AKA: Marshal Horn
FRC #3213 (Thunder Tech)
Team Role: Mentor
 
Join Date: May 2008
Rookie Year: 2008
Location: Tacoma
Posts: 894
kamocat is just really nice
Re: Unexplained intermittent CAN / 2CAN Jaguar problems at GSR

A heartbeat or trusted command is required every 100ms for a Jaguar to stay enabled.
(The "set output" messages are trusted commands. The heartbeat will keep all the Jaguars enabled, but if you want to use trusted commands, you will need to update the output of every Jaguar that you want to keep enabled.)
I believe resuming the heartbeat will NOT cause your motors to move again if the output is not re-sent. (This makes sense with the behavior I remember, but I have not explicitly tested for this behavior.)
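
To make the behavior described above concrete, here is a toy Python model of the enable watchdog. It is only a sketch, not the Jaguar firmware: the class and method names are made up for illustration, and the "output is dropped on timeout and not restored by a later heartbeat" behavior is the untested belief stated above, taken here as an assumption.

```python
from dataclasses import dataclass

TIMEOUT_MS = 100  # from the post: a heartbeat or trusted command every 100 ms


@dataclass
class JaguarModel:
    """Toy model of the enable watchdog (hypothetical, not real firmware)."""
    last_keepalive_ms: int = 0
    output: float = 0.0

    def tick(self, now_ms: int) -> None:
        # Watchdog expired: disable and forget the commanded output (assumed).
        if now_ms - self.last_keepalive_ms > TIMEOUT_MS:
            self.output = 0.0

    def heartbeat(self, now_ms: int) -> None:
        # A heartbeat feeds the watchdog but carries no setpoint.
        self.tick(now_ms)
        self.last_keepalive_ms = now_ms

    def trusted_set(self, now_ms: int, value: float) -> None:
        # A trusted "set output" feeds the watchdog and sets the output.
        self.last_keepalive_ms = now_ms
        self.output = value
```

In this model a late heartbeat leaves the output at zero until a new trusted set arrives, which is the case that matters if your code only sends the output when it changes.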

EDIT:
mjcoss, I can probably answer some of your questions.

Yes, in updating all your motors every 10ms, you will see many timeout issues. Setting the output takes 5-10ms (often 8ms) for a single Jaguar.

Autonomous is likely more sensitive than Teleop because Teleop sends the output continuously. (Theoretically this is every 20ms, but with 4 Jaguars it could be as slow as every 40ms. In LabVIEW, the drive motors are dealt with in near-parallel, which reduces the time by about 50%. I don't know if the other languages do this as well.)
In Autonomous, the output is usually only updated when it changes. This means that if the heartbeat is temporarily prevented from getting through, it will not do anything when it finally DOES get through, unless you've used a trusted output command.
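
The timing argument above is simple arithmetic, sketched here in Python. The function names are made up for illustration, and the model assumes the sets are sent strictly one after another with nothing else on the bus, which is a simplification.

```python
def min_update_period_ms(num_jaguars: int, per_set_ms: float) -> float:
    """Time to update every Jaguar once when sets are sent back to back."""
    return num_jaguars * per_set_ms


def fits(period_ms: float, num_jaguars: int, per_set_ms: float) -> bool:
    """True if one full round of updates fits inside the loop period."""
    return min_update_period_ms(num_jaguars, per_set_ms) <= period_ms


# 4 Jaguars at the worst-case 10 ms each: the "as slow as every 40ms" figure.
print(min_update_period_ms(4, 10))  # 40
# Two Jaguars at a typical 8 ms each do not fit in a 10 ms (100 Hz) task...
print(fits(10, 2, 8))               # False
# ...but they do fit in a 20 ms loop.
print(fits(20, 2, 8))               # True
```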
__________________
-- Marshal Horn

Last edited by kamocat : 05-04-2011 at 23:21.
#111
Unread 06-04-2011, 01:21
Radical Pi Radical Pi is offline
Putting the Jumper in the Bumper
AKA: Ian Thompson
FRC #0639 (Code Red Robotics)
Team Role: Programmer
 
Join Date: Jan 2010
Rookie Year: 2010
Location: New York
Posts: 655
Radical Pi has a spectacular aura about
Re: Unexplained intermittent CAN / 2CAN Jaguar problems at GSR

Quote:
Originally Posted by kamocat View Post
Yes, in updating all your motors every 10ms, you will see many timeout issues. Setting the output takes 5-10ms (often 8ms) for a single Jaguar.
You wouldn't happen to have any numbers of what parts of the system these times are coming from, would you? I've been digging through the jaguar source code again and found some interesting stuff in the CAN interface.

CAN/Serial have the lowest priority interrupts, being called whenever new data is on the bus. This interrupt also has the responsibility of interpreting the message and either returning an ACK or any relevant data over the bus. Above that is an interrupt timed every 1ms which reads in any queued commands over CAN, runs a PID tick, checks limit switches, updates LED, and updates the outputs. At an even higher level is the ADC data reader. At the top of the regularly called interrupts is an interrupt for an edge on either the PWM or encoder lines. Obviously above that is the internal watchdog.

So, from what I gather, if a message is sent at an unlucky time, the internal interrupts could be slowing down the response over CAN, with some system conditions making this more likely than others (e.g., PID makes the 1ms interrupt run a little longer, an encoder causes more interrupts to be called, etc.).

Jumping back to the cRIO, a jaguar has 10ms to ACK before an error is returned. Judging by the 8ms (that was without PID, right?) average posted by kamocat, there isn't much wiggle-room in the system. Here's another interesting fact: last year the timeout was 50ms.

Now, here's a test I'd run if I had the hardware to do so. Connect a jaguar over CAN, attach an encoder and a motor to it. Record an average response time of the jaguar with and without the encoder being spun. If the response time increases by a significant figure with the encoder attached, then possibly the interrupts in the jaguars are causing these problems.
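
A rough Python sketch of the measurement side of that test follows. The `request` callable is hypothetical: it stands for one blocking CAN round trip to the Jaguar in whatever API you have, and nothing here talks to real hardware.

```python
import time
from statistics import mean
from typing import Callable


def average_response_ms(request: Callable[[], None],
                        samples: int = 100,
                        clock: Callable[[], float] = time.perf_counter) -> float:
    """Average wall-clock time of `samples` calls to request(), in ms."""
    times = []
    for _ in range(samples):
        start = clock()
        request()  # assumed to block until the Jaguar replies or times out
        times.append((clock() - start) * 1000.0)
    return mean(times)
```

Run it once with the encoder stationary and once with it spinning, then compare the two averages. Recording the maximum as well as the mean would probably be worthwhile, since it is the slow outliers that would hit the 10ms timeout.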

Sorry if I rambled a bit up there, time for sleep
__________________

"To have no errors would be life without meaning. No strugle, no joy"
"A network is only as strong as it's weakest linksys"
#112
Unread 06-04-2011, 02:01
kamocat kamocat is offline
Test Engineer
AKA: Marshal Horn
FRC #3213 (Thunder Tech)
Team Role: Mentor
 
Join Date: May 2008
Rookie Year: 2008
Location: Tacoma
Posts: 894
kamocat is just really nice
Re: Unexplained intermittent CAN / 2CAN Jaguar problems at GSR

Sorry, I couldn't fish any info out of TI about why this is. My questions on the TI forum were largely left unanswered.

Once I get some more time, I could do some additional testing. That may not be until June.
__________________
-- Marshal Horn
#113
Unread 06-04-2011, 14:35
mjcoss mjcoss is offline
Registered User
FRC #0303
 
Join Date: Jan 2009
Location: Bridgewater,NJ
Posts: 70
mjcoss is a jewel in the rough
Re: Unexplained intermittent CAN / 2CAN Jaguar problems at GSR

Thanks for the average update times, Marshal. Just as a side note, we're not updating all of the motors, just the ones controlled by the PID loop (2 in this case).

I really would like to understand why I'm seeing intermittent timeouts in disabled mode. I'm going to update the 2CAN firmware tonight and hope that it addresses some of the issues.
#114
Unread 07-04-2011, 22:17
drakesword drakesword is offline
Registered User
AKA: Bryant
FRC #0346 (Robohawks)
Team Role: Mentor
 
Join Date: Jan 2006
Rookie Year: 2004
Location: USA
Posts: 200
drakesword is on a distinguished road
Re: Unexplained intermittent CAN / 2CAN Jaguar problems at GSR

Haha, bigger issues today! Oh boy. There was a delay in starting the field (roughly 12 mins), and when the match started we were only able to drive with one Jag. WOO. We brought it back (with the power still on) and opened the 2CAN page, and we only saw 1 Jag. Power cycled and then we saw all 5!
#115
Unread 08-04-2011, 00:49
ozrien ozrien is offline
Omar Zrien
AKA: Omar
no team
Team Role: Mentor
 
Join Date: Sep 2006
Rookie Year: 2003
Location: Sterling Heights, MI
Posts: 521
ozrien has a brilliant future
Re: Unexplained intermittent CAN / 2CAN Jaguar problems at GSR

I've definitely seen this symptom before, where a Jaguar becomes unresponsive until it is power cycled, so I ran some tests tonight and I've managed to reproduce this a number of ways....

Test 1 - 1 2CAN and 1 Black Jaguar.
Take a 2CAN and connect CAN cable from it to single Jaguar. Put termination resistor on Jaguar.
Power 2CAN, 2CAN LED will be orange. [ok]
Bridge CAN High and CAN low together, hold until 2CAN LED turns red. [ok]
Let go of CAN High and CAN low and 2CAN LED will return to orange. [ok]
Repeat until 2CAN LED stays red even after unbridging the CAN lines. [not ok]
Power cycle JUST the Jaguar and LED returns to orange. [not ok]

What happens here is when there is no cRIO, webdash, or host of any kind, the 2CAN periodically sends an enum CAN request to see who's out there. If no one responds with a CAN frame of any kind, 2CAN LED goes red to indicate a CAN problem. If 2CAN gets CAN frames within a timeout, LED is orange to indicate no CAN problems and that it's ready for any Ethernet traffic from cRIO/WebDash.

After bridging and unbridging several times the Jaguar no longer responds to the enum request from the 2CAN, which becomes apparent when the LED stays red.
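
The LED rule described above can be written down as a small Python sketch. This is only an illustration of the described behavior, not 2CAN firmware; the class name is made up and the timeout value is a parameter because the real one is not given here.

```python
from typing import Optional


class TwoCanLed:
    """Toy model: orange if any CAN frame arrived within the timeout, else red."""

    def __init__(self, timeout_ms: float) -> None:
        self.timeout_ms = timeout_ms  # hypothetical value, not from the 2CAN docs
        self.last_frame_ms: Optional[float] = None

    def on_frame(self, now_ms: float) -> None:
        # Any CAN frame counts, not just a reply to the enum request.
        self.last_frame_ms = now_ms

    def color(self, now_ms: float) -> str:
        if self.last_frame_ms is None:
            return "red"
        if now_ms - self.last_frame_ms > self.timeout_ms:
            return "red"
        return "orange"
```

With a single Jaguar on the bus, a stuck Jaguar means `on_frame` is never called again, so the model stays red, matching the [not ok] step in Test 1.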

Test 2 - 2 X Black Jag, one is Serial Gateway
Connect Black Jag Serial to PC using serial connection.
Connect Black Jag #2 to Serial Black Jag using CAN.
Open BDC Comm
Drive Black Jag #2, setting full throttle. No need for motors.
Begin bridging and unbridging CAN H and CAN L until Black Jag #2 no longer drives (blinks orange). [ok].
Press enum on BDC Comm to re-find Jaguars
You will only see the Serial Black Jaguar ID in the drop down [not ok].
Close/open BDC Comm and you still will not be able to connect to Black Jag #2 [not ok].
Power cycle Jags and you will be able to resume CAN communication.

Test 3 - CAN tool and 1 Black Jag.
Connect a USB CAN tool to a black Jag.
I used Intrepid Control System's ValueCAN 3, and Vehicle Spy 3 software.
Transmit a 29 bit ID frame with DLC set to zero and arbitration id set to 0x240. This is an enum request.
Confirm Jaguar responds with CAN frame. [ok].
Transmit frame periodically, say 20 ms.
Each frame should have a response frame after each request (few milliseconds). [ok]
Begin bridging and unbridging CAN High and CAN low.
Confirm that CAN tool's transmit error count increases then decreases back to zero each time you bridge and unbridge.
Eventually the Jaguar will no longer respond even after unbridging CAN lines [not ok].
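
For those without Vehicle Spy, the bookkeeping in Test 3 can be sketched in plain Python. The frame class and the log format are made up for illustration and no CAN hardware or library is involved; only the 29-bit ID of 0x240 with a DLC of zero comes from the steps above.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class CanFrame:
    """Minimal CAN frame description (illustrative only)."""
    arbitration_id: int
    extended: bool = True  # True means a 29-bit identifier
    data: bytes = b""

    def __post_init__(self) -> None:
        limit = 1 << (29 if self.extended else 11)
        if not 0 <= self.arbitration_id < limit:
            raise ValueError("arbitration ID out of range")
        if len(self.data) > 8:
            raise ValueError("classic CAN carries at most 8 data bytes")

    @property
    def dlc(self) -> int:
        return len(self.data)


# The enum request from Test 3: 29-bit ID 0x240, no data bytes.
ENUM_REQUEST = CanFrame(arbitration_id=0x240)


def unanswered(log: List[Tuple[float, str]], window_ms: float) -> List[float]:
    """Return the times of 'tx' entries with no 'rx' entry within window_ms."""
    rx_times = [t for t, kind in log if kind == "rx"]
    return [t for t, kind in log
            if kind == "tx"
            and not any(t < r <= t + window_ms for r in rx_times)]
```

Feeding it a log exported from the CAN tool as (time in ms, "tx" or "rx") pairs shows the point at which the Jaguar stopped answering.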

Conclusion
Bridging and unbridging the CAN lines can put the Jaguars into a state that's unrecoverable until you power cycle them. This can happen if the legs of your termination resistor are close together.
I suspect that momentarily losing the termination resistor because of a bad cable can also cause this symptom. Maybe the Jaguar can't handle Error frames gracefully or gets stuck in the Bus off state (see error states in the CAN spec: 128 occurrences of 11 consecutive recessive bits should bring a CAN node back into the error active state). Or maybe there is a race condition in the ISRs (mentioned earlier in the thread).
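
For reference, the CAN fault confinement rules behind the Bus off hypothesis can be sketched in Python as below. This is a simplified model of the transmit side only: it ignores the receive error counter and the special cases in the spec. The thresholds (error passive above 127, bus off above 255, +8 per transmit error, -1 per successful transmission, recovery after 128 sequences of 11 consecutive recessive bits) are from the CAN 2.0 specification. The class itself is hypothetical and says nothing about what the Jaguar firmware actually does.

```python
class TxErrorCounter:
    """Simplified CAN transmit error counter and error state."""

    def __init__(self) -> None:
        self.tec = 0

    @property
    def state(self) -> str:
        if self.tec > 255:
            return "bus-off"
        if self.tec > 127:
            return "error-passive"
        return "error-active"

    def tx_error(self) -> None:
        # A node in bus-off no longer transmits, so the counter is frozen.
        if self.state != "bus-off":
            self.tec += 8

    def tx_ok(self) -> None:
        if self.state != "bus-off" and self.tec > 0:
            self.tec -= 1

    def recover(self, idle_sequences: int) -> None:
        # Bus-off ends after 128 occurrences of 11 consecutive recessive bits.
        if self.state == "bus-off" and idle_sequences >= 128:
            self.tec = 0
```

In this model 32 consecutive transmit errors are enough to reach bus off, and a controller that never performs the recovery step stays there until it is reset, which would look like the symptom described above.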
#116
Unread 09-04-2011, 06:06
mjcoss mjcoss is offline
Registered User
FRC #0303
 
Join Date: Jan 2009
Location: Bridgewater,NJ
Posts: 70
mjcoss is a jewel in the rough
Re: Unexplained intermittent CAN / 2CAN Jaguar problems at GSR

The problems continue. We are at the Philly regional where, in several of our matches, we've been stalled due to CAN bus timeouts. I switched our bus terminator with a spare and that seems to have helped. I love that they publish an update stating that if you are having cascading timeout errors you can request a reboot, only to be told by the field personnel, "sorry, we are behind schedule".

We're running v29 on the CRIO and 2.5 on the 2CAN. I'm just about to throw in the towel and switch to PWM, and may do so for St. Louis if there isn't some progress towards a solution. The above post certainly points to a bug in the Jags; however, it requires that there are repeated shorts on the CAN bus, if I parse it correctly.
#117
Unread 09-04-2011, 09:34
drakesword drakesword is offline
Registered User
AKA: Bryant
FRC #0346 (Robohawks)
Team Role: Mentor
 
Join Date: Jan 2006
Rookie Year: 2004
Location: USA
Posts: 200
drakesword is on a distinguished road
Re: Unexplained intermittent CAN / 2CAN Jaguar problems at GSR

I do believe that the firmware and driver should not require a response from the Jag for every message sent. Instead it should be based on the heartbeat only, and the timeout should be increased. This would reduce the amount of bandwidth needed to communicate with the cRIO.

For the jag to respond to every command is crazy.

Is there any way we can look at the FRC jag firmware source?
#118
Unread 09-04-2011, 10:25
kamocat kamocat is offline
Test Engineer
AKA: Marshal Horn
FRC #3213 (Thunder Tech)
Team Role: Mentor
 
Join Date: May 2008
Rookie Year: 2008
Location: Tacoma
Posts: 894
kamocat is just really nice
Re: Unexplained intermittent CAN / 2CAN Jaguar problems at GSR

The Jag does respond to every message. I know this from looking at the CAN subVIs.
On the "send" messages, it does an ACK as well. (Not sure what this is for)

You can find the Jag firmware source here:
http://www.luminarymicro.com/products/rdk-bdc24.html
or here:
http://www.luminarymicro.com/products/rdk_bdc.html

I believe what you want is in the firmware development package.
__________________
-- Marshal Horn
#119
Unread 09-04-2011, 12:52
drakesword drakesword is offline
Registered User
AKA: Bryant
FRC #0346 (Robohawks)
Team Role: Mentor
 
Join Date: Jan 2006
Rookie Year: 2004
Location: USA
Posts: 200
drakesword is on a distinguished road
Re: Unexplained intermittent CAN / 2CAN Jaguar problems at GSR

Quote:
Originally Posted by kamocat View Post
The Jag does respond to every message. I know this from looking at the CAN subVIs.
On the "send" messages, it does an ACK as well. (Not sure what this is for)

You can find the Jag firmware source here:
http://www.luminarymicro.com/products/rdk-bdc24.html
or here:
http://www.luminarymicro.com/products/rdk_bdc.html

I believe what you want is in the firmware development package.

Thanks. Yes I know that the jag responds to every message, I was saying maybe it shouldn't need to ...
#120
Unread 11-04-2011, 02:02
jhersh jhersh is offline
National Instruments
AKA: Joe Hershberger
FRC #2468 (Appreciate)
Team Role: Mentor
 
Join Date: May 2008
Rookie Year: 1997
Location: Austin, TX
Posts: 1,006
jhersh has a reputation beyond repute
Re: Unexplained intermittent CAN / 2CAN Jaguar problems at GSR

Quote:
Originally Posted by drakesword View Post
Thanks. Yes I know that the jag responds to every message, I was saying maybe it shouldn't need to ...
The main reason it needs to handshake is that the CAN peripheral in the Jag handles each message based on interrupts... if a message isn't handled before a new message comes in, it will lose the packet and not act on the expected command. To prevent this, the cRIO waits until the Jag claims to be done (via an ACK).
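
A toy Python model of why the handshake matters is below. It assumes a receiver with a single message slot that is overwritten if a new message arrives before the old one is handled; that is a simplification of the explanation above, not a description of the real CAN peripheral, and all names are made up for illustration.

```python
from typing import List, Optional


class SingleSlotReceiver:
    """One-message mailbox: a new message overwrites an unhandled one."""

    def __init__(self) -> None:
        self.slot: Optional[int] = None
        self.handled: List[int] = []
        self.lost = 0

    def receive(self, msg: int) -> None:
        if self.slot is not None:
            self.lost += 1  # the previous message was never acted on
        self.slot = msg

    def service(self) -> bool:
        """Handle the pending message; the return value plays the role of the ACK."""
        if self.slot is None:
            return False
        self.handled.append(self.slot)
        self.slot = None
        return True


def send_blind(rx: SingleSlotReceiver, msgs: List[int], service_every: int) -> None:
    # Sender does not wait; the receiver only gets serviced now and then.
    for i, msg in enumerate(msgs):
        rx.receive(msg)
        if (i + 1) % service_every == 0:
            rx.service()


def send_with_handshake(rx: SingleSlotReceiver, msgs: List[int]) -> None:
    # Sender waits for the ACK before sending the next message.
    for msg in msgs:
        rx.receive(msg)
        while not rx.service():
            pass
```

With four messages and the receiver serviced only after every second one, the blind sender loses two of them, while the handshaking sender loses none at the cost of waiting on every message.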