RoboRIO WPILib Logging

In several threads (mostly this one) it has been brought up that we need to have default logging software.

Adding this to the RoboRIO firmware would be great, but having it as part of WPILib and allowing teams to modify it may be more useful.

What example programs have people written that implement logging? Please provide links to published code.

What should be logged on the RoboRIO?

How should we format the logs?

What is the best way for the logs to be easily retrieved, viewed, and used by teams?

What problems do we need to avoid when implementing logging on all robots running WPILib?

This SO post has some good discussion on the issues surrounding logging and power failure, as they relate to filesystems and unsaved data.

We log whatever we think is important, depending on the bot. Our public release of the 2016 bot is posted here.

We program in LabVIEW. The logger is at the bottom of the robot main diagram. Features…
We create a file on the roboRIO that stores the number of the last log filename; each time you enable the robot, it creates a new log file. If the counter file does not exist, it is initialized to 100000.

This is handy, as you can use an FTP client to manipulate the file as needed; each day you can start a new sequence if you like.

It would be handy to make this use the time and date for the filename, but for that to have meaning you have to keep the roboRIO clock accurate…
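
That post describes a LabVIEW implementation, but the counter-file idea translates directly. A rough Java sketch, assuming a counter file in the lvuser home directory and the 100000 starting value from the post (paths and class names are my assumptions):

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class LogFileNamer {
    private static final Path COUNTER_FILE = Paths.get("/home/lvuser/lognumber.txt");

    /** Reads the last log number, increments it, writes it back, and returns the new log path. */
    public static Path nextLogFile() throws IOException {
        int number = 100000; // initialize the sequence if the counter file doesn't exist yet
        if (Files.exists(COUNTER_FILE)) {
            number = Integer.parseInt(Files.readAllLines(COUNTER_FILE).get(0).trim()) + 1;
        }
        Files.write(COUNTER_FILE, Integer.toString(number).getBytes(StandardCharsets.UTF_8));
        return Paths.get("/home/lvuser/log" + number + ".csv");
    }
}
```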

We loop and log to the file every 100 ms, logging a bunch of data; this can be modified as needed. One good feature is that we log only when the robot is enabled in either auto or teleop, plus 30 seconds after it is disabled. This is handy if you get disconnected from the field.
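
That enable-gated window with a 30-second tail is easy to sketch in Java using the 2017-era DriverStation/Timer calls; the class and method names below are otherwise made up for illustration:

```java
import edu.wpi.first.wpilibj.DriverStation;
import edu.wpi.first.wpilibj.Timer;

public class LogGate {
    private double lastEnabledTime = Double.NEGATIVE_INFINITY;

    /** Call every loop; returns true while the robot is enabled or within 30 s of being disabled. */
    public boolean shouldLog() {
        if (DriverStation.getInstance().isEnabled()) {
            lastEnabledTime = Timer.getFPGATimestamp();
        }
        return Timer.getFPGATimestamp() - lastEnabledTime < 30.0;
    }
}
```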

Things we log that would be good every year…
Robot mode, error status, loop time of our last 10 ms timing loop, motor commands, motor distances, sensor status, navX data, etc.

There are saved log files in the GitHub repo.

This is our trace logging class:
https://github.com/trc492/Frc2017FirstSteamWorks/blob/master/src/trclib/TrcDbgTrace.java
Here is what a typical autonomous log file looks like:
https://github.com/trc492/Frc2017FirstSteamWorks/blob/master/TraceLog/20170419!Houston/Qualification098_2017-04-21%4015-36-01.log

If anyone is looking for a simple logging solution, the HuskieBoard MXP expansion board has a micro SD card slot. If you use the firmware and corresponding LabVIEW API located here: https://github.com/HuskieRobotics/HuskieBoardPublic, then logging is as simple as passing a string into a VI.
Note: I’m the one who wrote the firmware and API for the HuskieBoard

We’ve used our own logging framework for a few years. We log to two different files: one is events, like 492’s, while the other is a CSV file of periodic data. The log files are named with a date/timestamp. We look for jumps in the roboRIO date and rename the log file if there is a jump; this makes sure the log files are named correctly after the DS connects. We also write a third file containing an appended list of log file names, so that the latest file can easily be found (which is helpful since the RTC in our DS laptop is dying). We spent a long time (years) getting the file naming right, and in retrospect we should have just used an incrementing number.

If a flash drive was installed, we logged to the flash drive. If it was not installed (or was removed in the middle of logging), we fell back to logging in the roboRIO home directory. We flushed each file every periodic loop.
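
Their code is linked at the end of the post; purely as an illustration of the flash-drive fallback idea, here is a sketch (the /u mount point and the lvuser home directory are my assumptions, not taken from BeachbotLib):

```java
import java.io.File;

public class LogLocation {
    /** Prefer a USB flash drive if one is mounted and writable; otherwise fall back to home. */
    public static File logDirectory() {
        File usb = new File("/u"); // typical roboRIO mount point for the first USB drive
        if (usb.exists() && usb.isDirectory() && usb.canWrite()) {
            return usb;
        }
        return new File("/home/lvuser");
    }
}
```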

For the event log, we log the time/date and the priority of the event (Error, Warning, Info, Debug). Errors and warnings are sent to the DS using the report error and report warning methods. Debug is printed to System.out, and Info is only written to the log. We use the command framework and automatically log each time a command is initialized, ended (and, if it ended, whether it was because of a timeout), and interrupted. We also had around 100 things that were logged.
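
A minimal sketch of that routing, using WPILib's DriverStation.reportError/reportWarning calls (the enum and helper names here are invented for illustration, not their actual class):

```java
import edu.wpi.first.wpilibj.DriverStation;

public class EventLogger {
    public enum Level { ERROR, WARNING, INFO, DEBUG }

    public void log(Level level, String message) {
        writeToLogFile(level, message); // every level goes to the event log file
        switch (level) {
            case ERROR:
                DriverStation.reportError(message, false);   // shows up on the DS
                break;
            case WARNING:
                DriverStation.reportWarning(message, false); // shows up on the DS
                break;
            case DEBUG:
                System.out.println(message);                 // console only
                break;
            case INFO:
            default:
                break;                                       // log file only
        }
    }

    private void writeToLogFile(Level level, String message) {
        // timestamped write to the event log file goes here
    }
}
```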

For the CSV data, we used a functional interface to supply each value to be stored, which was written every periodic loop. We logged ~40 different items.
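
Something along these lines, assuming a DoubleSupplier per column (class and method names here are hypothetical, not BeachbotLib's actual API):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.DoubleSupplier;

public class CsvLogger {
    private final List<String> names = new ArrayList<>();
    private final List<DoubleSupplier> sources = new ArrayList<>();

    /** Register a column once at startup; the supplier is polled every periodic loop. */
    public void addColumn(String name, DoubleSupplier source) {
        names.add(name);
        sources.add(source);
    }

    public String header() {
        return String.join(",", names);
    }

    /** Build one CSV row by sampling every registered supplier. */
    public String sampleRow() {
        StringBuilder row = new StringBuilder();
        for (int i = 0; i < sources.size(); i++) {
            if (i > 0) row.append(',');
            row.append(sources.get(i).getAsDouble());
        }
        return row.toString();
    }
}
```

Registration would then look like logger.addColumn("leftDistance", leftEncoder::getDistance) somewhere in robot init, where leftEncoder is a hypothetical sensor object.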

The text and CSV formats made it very easy to open the logs on any computer. You could either take the flash drive out and put it in your computer or FTP to the roboRIO to get the files. The flash drive was FAT-formatted. We never had total corruption of the flash drive. Once or twice we couldn't find the file we were looking for, and it was common for Windows to complain about errors in the file system. The roboRIO gets very slow at file operations when there are a few thousand files in a directory. There are probably "better" ways of handling the data reliably, but ours was pretty reliable for a minimum time investment.

Our code is here: https://github.com/Beachbot330/BeachbotLib/tree/master/java/src/org/usfirst/frc330/util

Here are a couple of tidbits I’ve shared before:

Java method-handles-based CSV logger for periodic data.

Python script for grabbing all log files. Uses FTP, and makes "get all files" a single-click operation.

HTML/Javascript viewer for graphing and analyzing our log files.

The console is logged, right?

Can you just customize your own logs of whatever events / variables you choose by putting print statements into your code?

This seems like a really dumb question, I’m sure I’m missing something / barely know what I’m talking about.

A log/trace system is typically separate from the console, because you log/trace SO MUCH stuff, and you want to be able to sift through it later. Basically print out everything so you can diagnose a problem later.

Console (where you print) is typically treated as ephemeral and not as much is printed out so you can actually see what you want during operation (especially with a control-loop based system like an FRC bot). Here you might start putting prints in to diagnose a problem after it happens.

If the console output is logged anywhere, I'm not sure where.

We’ve implemented data logging in our robot framework code, so that each individual input and output state can have its state logged every cycle. It’s also possible to log other state (calculated values/variables etc.) from elsewhere in your code, using the same mechanism.

This all results in a JSON file being written to the RIO, which you can grab via FTP or similar afterwards. Our log viewer tool allows you to choose which I/O channels to view, which means you can correlate anything and everything. You can also save 'views' of configured channels, so if you are trying to debug a problem, you can apply the same channel view selection to new log files (saving reconfiguration time).
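
Their own post (linked below) describes the exact JSON format they use; purely as a generic illustration of the one-record-per-cycle idea (the format here is made up, not theirs):

```java
import java.io.PrintWriter;
import java.util.Map;

public class JsonStateLogger {
    private final PrintWriter out;

    public JsonStateLogger(PrintWriter out) {
        this.out = out;
    }

    /** Write one JSON object per robot cycle, keyed by channel name. */
    public void logCycle(double timestamp, Map<String, Double> channels) {
        StringBuilder json = new StringBuilder("{\"t\":" + timestamp);
        for (Map.Entry<String, Double> e : channels.entrySet()) {
            json.append(",\"").append(e.getKey()).append("\":").append(e.getValue());
        }
        json.append("}");
        out.println(json);
    }
}
```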

This tool has been more than helpful for debugging issues and for getting detailed data on things such as our drivetrain efficiency.

We wrote a post about this a while back which is here. It also describes the JSON format used, so if you want to write your own state in that format (rather than using our framework), you can, and then use the viewer tool.

We should be sure we are all talking about the same thing. "Logging" in a robotics context may refer to trace-like logging (print statements scattered through code to help debug the codepaths taken), time-series metric data logging (things like CSVs), logging raw sensor data (like camera frames), or some mix of any or all of the above.

Use-cases for logged data may include manual inspection of text data, plotting of time-series data, visualization of logged data (think retroactive Smart Dashboard that lets you scrub through time), or even testing code snippets by playing back logged data as inputs.

Trace-level logging is already supported (sort of) via driver station logs. You will see your print statements show up on a timeline with other key match events, state transitions, battery voltage, etc. The downside, of course, is that if your robot loses comms, you don't get this data. There also isn't a ton of headroom for verbose logging before you bog down your code/network link.

Other forms of logging aren't yet officially supported but, as many others have posted here, are often implemented by teams. There are also tons of open source tools available to help with this.

Most of the industrial and academic robotics frameworks I have used have converged on pretty similar logging designs built around some form of message-passing architecture. This can support all of the variants of logging and use-cases described above. The idea is to establish a standard protocol for message serialization and transport, and then provide simple APIs for "publishing" or "subscribing" to data on various named channels. For example, your user application regularly publishes messages about its internal state that get broadcast to anything that is listening, and a dashboard program (perhaps running elsewhere on the network) subscribes and renders widgets on the screen every time it gets an update. For logging, an additional program runs that subscribes to updates and simply dumps them to a file. If you want to visualize logged data with your visualization program, you use a log playback tool that regurgitates the logged data (typically with some sort of "time scrubbing" interface) - the dashboard doesn't care that it's coming from a log file rather than your code.
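
As a toy sketch of that publish/subscribe shape (interfaces and names invented here, not any particular framework's API):

```java
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.function.Consumer;

public class MessageBus {
    private final Map<String, List<Consumer<double[]>>> subscribers = new ConcurrentHashMap<>();

    /** Anything interested in a channel registers a callback. */
    public void subscribe(String channel, Consumer<double[]> callback) {
        subscribers.computeIfAbsent(channel, k -> new CopyOnWriteArrayList<>()).add(callback);
    }

    /** Robot code publishes state; a dashboard, a file logger, or both may be listening. */
    public void publish(String channel, double[] values) {
        for (Consumer<double[]> callback : subscribers.getOrDefault(channel, Collections.emptyList())) {
            callback.accept(values);
        }
    }
}
```

A file logger is then just another subscriber that appends every message it receives to disk, and a playback tool re-publishes from the file through the same bus.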

WPILib already provides (somewhat limited) versions of two of the three necessary pieces above: a transport and serialization mechanism (Network Tables) and a dashboard (Smart Dashboard). Writing a program that dumps Network Tables traffic to a file and can play it back is trivial. Both Network Tables and Smart Dashboard need a lot of love (or should be replaced by something better, which I think is already on the WPILib team's radar), but to me it makes sense to build the ecosystem out in this direction.
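
For example, a dumper that just polls a few known entries and appends them to a CSV, assuming the NetworkTableInstance-era Java API (the keys and team number are placeholders):

```java
import edu.wpi.first.networktables.NetworkTable;
import edu.wpi.first.networktables.NetworkTableInstance;
import java.io.PrintWriter;

public class NetworkTablesDumper {
    public static void main(String[] args) throws Exception {
        NetworkTableInstance inst = NetworkTableInstance.getDefault();
        inst.startClientTeam(1234);                 // connect to the robot as an NT client
        NetworkTable table = inst.getTable("SmartDashboard");

        // Runs until killed; polls two hypothetical keys and appends them to a CSV.
        try (PrintWriter out = new PrintWriter("nt_dump.csv")) {
            out.println("time,heading,leftDistance");
            while (true) {
                out.printf("%f,%f,%f%n",
                        System.currentTimeMillis() / 1000.0,
                        table.getEntry("heading").getDouble(0.0),
                        table.getEntry("leftDistance").getDouble(0.0));
                out.flush();
                Thread.sleep(20); // ~50 Hz poll
            }
        }
    }
}
```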

So I'm curious about how some of these systems impact timing on the RIO. I suspect a lot of the responses will be "it's minimal," but we are talking about milliseconds here, and they can add up with improper logging.

I don’t have a lot of experience with logging on embedded systems but from an IT operations perspective, you can waste a lot of compute resources churning out logs inefficiently.

Has anyone done a thorough investigation into how best to log while dealing with timing? Maybe Austin from 971 can chime in; I suspect he has some knowledge on the subject.

ROS does a lot of this (though it is far from an ideal platform for FRC at the moment).

There’s always going to be an upper bound on what you can log no matter how you do it, but we were logging O(5,000) double values per second to an ASCII text file at times with a ~5% RoboRIO CPU hit. The most important part is putting all file I/O in its own thread, so the only price paid in your timing-critical code is copying the data.

A well-constructed logging framework should also offer safeguards such as throttling to keep you from swamping the network or disk.
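
One common way to structure this: the robot loop only copies data into a bounded queue, a background thread does all the file I/O, and the bounded queue doubles as a crude throttle by dropping samples when the disk can't keep up. This is a generic sketch under those assumptions, not 971's code:

```java
import java.io.IOException;
import java.io.PrintWriter;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

public class BackgroundLogger {
    private final BlockingQueue<String> queue = new ArrayBlockingQueue<>(4096);

    public BackgroundLogger(String path) throws IOException {
        PrintWriter out = new PrintWriter(path);
        Thread writer = new Thread(() -> {
            try {
                while (true) {
                    out.println(queue.take()); // blocks until a line is available
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } finally {
                out.close();
            }
        }, "log-writer");
        writer.setDaemon(true);
        writer.start();
    }

    /** Called from timing-critical code: only copies the line, never touches the disk.
     *  If the queue is full (disk can't keep up), the line is dropped instead of blocking. */
    public void log(String line) {
        queue.offer(line);
    }
}
```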

Yep - the rostopic/rosbag/rviz ecosystem is a great pattern. I wouldn’t use ROS for FRC, but I’d definitely advocate borrowing this aspect of its design.

Which is kinda why I was hoping someone would open-source a well-constructed one. Logging is one of those things that's REALLY useful and looks really simple on the surface, but is actually really hard. Most teams would benefit from having some sort of decent logging but lack the skill set to create the framework.

For the logging in our framework, the file writes are on a separate thread so that file I/O is not in the main code path. We write chunks every 100 ms so as not to hold too much memory at a time.

This year, measured code execution time per cycle was in the 19-22 ms range with logging of some 50 state values enabled, and around 9 ms with no state logging occurring. We did not measure the CPU% impact.

Andrew covered it but yeah, logging can be a problem when you do it wrong. I think a framework would be great and I’d love to see one.

Yeah, your prior post on CD was one of the few Google results when searching for RIO work with ROS. I'm torn on using ROS for FRC… I mean, we did this year… kinda… it's more like we used the ROS protocols to talk to the RIO… which I guess is using ROS? Kinda hard to say where the line is, because ROS is very fluid at the moment… it has a full stack but you don't have to run the full stack. ROS is going to enable a lot of really cool stuff, but there are a lot of downsides and baggage that come with it at the moment. I suspect 2.0, given its emphasis on real-time aspects, is going to be a better fit for FRC.

If you want this functionality built into WPILib, WPILib is open source. Alternatively, make your own community project; it's been a shame we haven't seen many of them recently.

You'd likely want something with two different types of logs: telemetry logs and event logs.

Telemetry logs are periodic; they're kind of like a history of your motor outputs, sensor inputs, joystick readings, stuff like that.

Event logs are more like your printf statements, telling you a command has started running, or something has errored, something like that.

I’d recommend keeping a ring buffer for your telemetry logs, and writing them out on a regular basis from a new thread with a lower priority. I typically use .csv format as the storage.

For event logs, you generally want to write these to a buffer and wake up another thread to write them out to a file. Again, I usually store these as .csv (w/ timestamp, caller, etc.).

It seriously spikes CPU utilization, even more so if you're doing it a lot. I've seen a team's utilization increase by 40% when they were writing extremely often (I'd have to pull the log up, but it was on the order of milliseconds between writes).

Since I’m moving to software lead for my team this year, this is something I have a huge interest in.

To that end, I'm more than willing to work on this in order to facilitate better FTA-student interactions and bring teams to a better understanding of "what's going on" on the field. I think the initial capability should be things that help FTAs diagnose a mid-match lost connection on the field, mid-match "my climb stopped working!" situations, etc. Emphasis should be on showing the evidence to the team in an easily consumed manner. Therefore it seems like we should be able to define a default set of values to 'log' and then send them to the driver's station at whatever rate is plausible. The PDB has a plethora of valuable information.

One thing that would facilitate better consumption of the data is more automated metadata about a match from FMS. The live score may be a bit of a stretch, but if we could at least get the match number and match type (practice, qual, non-field modes, etc.), it'd go a long way towards automating saving a file, finding the right log for a match, etc.

One final note: this has to be tested on a live field using the 'default' configs teams set (static IP vs. mDNS, etc.). Not getting custom data through the field network using the allowed ports has been hugely frustrating the last two seasons.

Now for opinions on 'how'…

One of the tricks for live logging of a numerical system through a bandwidth-limited connection is a strict array of doubles. If you think about it, a number in an array has two inherent pieces of information: position (order = metadata) and value. During the 'handshake' phase of a comms link, the mapping of the arrays is sent: a byte identifier identifying the type of array, followed by an array of IDs (we can get away with string identifiers, but edge-case systems use 1-byte identifiers) which state what the value at each position of the array represents. Think of it like the header row of a spreadsheet.

A ā€œmessageā€ would be comprised of an array-type identifier, time-since-origin (microseconds), and then the array of values. Types of messages would be raw sensor values, raw input values, state machines, etc - and even a re-mapping of what any given array type means. We could skip decoding if we want, and simply us an existing messaging framework so long as the data is just an array of doubles. In our system, we could even log as quickly as we want on the robot in-memory, and then send the messages over in batches using whatever message framework we want. Most *MQ frameworks support batched messages, as-would custom byte decoders.

We could use JSON for the message format, but I wouldn’t recommend it. Everything would be converted from a double to a string, thereby reducing how much logging we could do for a single match.

Using this setup we got approximately 30 Hz logging to the driver's station in 2012-2014, ranging from 13 fields in 2012 to 40-ish in 2014. We had live feeds each year, and as we learned more, we were able to get the bandwidth of the system down to just a few dozen kbps. The eventual limiting factor was student understanding of how not to kludge up the logging system on the robot side (i.e., CPU usage). This is something I plan to get right this coming season.