One of the things I like about CAN is that it gives us the ability to detect failures (loss of power, loss of communication, etc.).
I made a program to do just that and log what it finds to a file. After running it for an hour, I got some interesting results. I’ve linked the code and the log file, so I’ll explain the log format.
First off, the file is created in the top directory and named “CAN_status_” + timestamp. (In the file I’ve provided, the timestamp is 5:33pm on July 6th.)
Anyway, each entry (one per line) has the status, the list of devices to which the status applies, and the timestamp.
The possible statuses are:
- lost (communication with this Jaguar has been lost)
- power up (power has been cycled since this was last checked)
- got comms (communication with Jaguar has resumed, but the interruption was not due to loss of power)
- brown out (voltage fault)
- over temp (temperature fault)
- over current (current fault)
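
For anyone who wants the gist without opening the linked code, here is a rough Python sketch of how those statuses could be derived from one poll to the next. It is not the linked program: the `Snapshot` type, its field names, and the `classify` function are all hypothetical, and it assumes each poll tells you whether the Jaguar answered, whether it reports a power cycle, and which fault flags are set.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Snapshot:
    """Hypothetical result of polling one Jaguar (field names are illustrative)."""
    responding: bool             # did the device answer this poll?
    power_cycled: bool = False   # device reports a power cycle since last check
    brown_out: bool = False      # voltage fault flag
    over_temp: bool = False      # temperature fault flag
    over_current: bool = False   # current fault flag


def classify(prev_responding: bool, now: Snapshot) -> List[str]:
    """Return the status strings to log for one device on this poll."""
    events: List[str] = []
    if not now.responding:
        # Only log "lost" on the transition, not on every silent poll.
        if prev_responding:
            events.append("lost")
        return events
    if now.power_cycled:
        events.append("power up")
    elif not prev_responding:
        # Back on the bus without a power cycle in between.
        events.append("got comms")
    if now.brown_out:
        events.append("brown out")
    if now.over_temp:
        events.append("over temp")
    if now.over_current:
        events.append("over current")
    return events
```

The one design choice worth noting is that "power up" takes priority over "got comms", which matches the distinction in the list above: "got comms" is only logged when the interruption was not due to loss of power.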
I was surprised by how many interruptions there were, given that the robot was undisturbed for the whole hour. Each of the black Jaguars (devices 10, 11, and 12) seems to have had a communication interruption about once a minute, while the tan Jag (device 13) had none at all.
The interruptions seem to be on the order of 200 ms.
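
If you want to check the interruption lengths yourself, one way is to pair each "lost" entry with the next entry that shows the same device back on the bus. Below is a rough Python sketch of that idea. It assumes the log lines have already been parsed into (status, device list, time in seconds) tuples; the `outage_durations` name and the tuple layout are my own for illustration, not something from the linked code.

```python
from typing import Dict, Iterable, List, Tuple

# Hypothetical parsed log entry: (status, device IDs, time in seconds).
Entry = Tuple[str, List[int], float]


def outage_durations(entries: Iterable[Entry]) -> Dict[int, List[float]]:
    """Map each device ID to the lengths (in seconds) of its outages."""
    lost_at: Dict[int, float] = {}          # device -> time it went silent
    durations: Dict[int, List[float]] = {}
    for status, devices, t in entries:
        for dev in devices:
            if status == "lost":
                # Keep the earliest "lost" time if it is somehow repeated.
                lost_at.setdefault(dev, t)
            elif status in ("got comms", "power up") and dev in lost_at:
                durations.setdefault(dev, []).append(t - lost_at.pop(dev))
    return durations


# Made-up entries purely to show the call; these are not from the real log.
example = [
    ("lost", [10, 11], 0.0),
    ("got comms", [10], 0.25),
    ("got comms", [11], 0.5),
]
print(outage_durations(example))
```

Note that the durations this gives are only as fine-grained as the polling interval, so a reading near 200 ms is an upper-ish estimate of the real gap, not an exact measurement.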
Why might there be communication interruptions on an undisturbed robot? Is enumeration a flaky thing?