Writing to CSV file

Our team is trying to record encoder values to a file in the robot code itself. How would you write the encoder values, along with the current time, to a CSV file? If there is a better way to do this, please feel free to let me know.

WPILib should have the Jackson library automatically available to import. You can check out the 1-minute or 3-minute tutorials here: https://github.com/FasterXML/jackson-databind

Also keep in mind that Shuffleboard may be able to accomplish your goal. If you could be more specific about what your goal is, we might be able to help you more.

1 Like

The Jackson library is available, but JSON may not be the best choice for data logging. The issue with JSON is that you can’t write partial data (you always need a closing “]” or “}”). You might want to consider using CSV instead, which is very simple to write to and doesn’t have this problem (as each “set” of data is a line in the file, and no closing brackets are required).

1 Like

Thank you, I’ll look into it!

Thank you for the recommendation! How would you use a CSV file in this case?

Here’s one of the common classes we wrote a while back to write a set of data values to a CSV file.
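The class itself is linked above; a stripped-down sketch of the same general idea (illustrative names, plain java.io - not the actual linked class) looks something like this:

import java.io.BufferedWriter;
import java.io.FileWriter;
import java.io.IOException;

// Minimal CSV logger sketch: one header row, then one appended line per sample.
public class CsvLogger {
    private BufferedWriter writer;

    // Open the file and write the header row once.
    public void open(String path, String... columnNames) throws IOException {
        writer = new BufferedWriter(new FileWriter(path));
        writer.write("TIME," + String.join(",", columnNames));
        writer.newLine();
    }

    // Append one sample: a timestamp plus one value per column.
    public void writeRow(double timeSec, double... values) throws IOException {
        StringBuilder sb = new StringBuilder();
        sb.append(timeSec);
        for (double v : values) {
            sb.append(',').append(v);
        }
        writer.write(sb.toString());
        writer.newLine();
    }

    // Flush and close when the run ends.
    public void close() throws IOException {
        writer.flush();
        writer.close();
    }
}

Call writeRow() once per loop iteration. Because each row is self-contained, a crash or power loss mid-match still leaves every earlier row readable - the CSV advantage mentioned above.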

relevant XKCD:

2 Likes

Hahaha, thank you for the relevant xkcd, and for the help!

Not a .csv… but this may be getting at the deeper reason for wanting that data.

1 Like

Thank you for the link!

Sorry, here’s some more useful followup.

The whole reason we log CSV files is to ensure we have detailed performance data for each match. It’s particularly useful for debugging non-reproducible issues that were seen once on the field. Even the mechanical and electrical teams like them - by logging currents, voltages, and speeds, the SW team can often feed them info about loose connectors, hot motors, tight chain, etc.

In terms of doing this practically:

  1. I definitely recommend implementing your CSV creation such that it logs to an external USB drive (it should mount to /U/ if you plug into the “outside” USB port). These files can get big, and the RIO doesn’t have a ton of internal storage (literally 0.5GB, and a good chunk of that is taken up by the OS). We use these low-profile drives to ensure there’s not a bunch of mass cantilevered off the USB port, which might vibrate around, weaken the connection, come loose, etc.

  2. Here’s a python script to get the files off of the roboRIO and onto your local computer. It’s not doing anything super fancy, just sequencing the Unix SSH/SCP tools to find and copy all the files on the USB drive to the PC. We wrote it purely because it was easier than digging around in the electrical board to unplug the USB drive; it also helps prevent that port from wearing out from repeated use. But in a pinch, unplugging the drive from the RIO, plugging it into your laptop, and copying the files with Windows works just fine.

  3. Excel works just fine for viewing some of the data, but can get a bit tedious if you’re wanting to graph the same thing repeatedly. We usually view our log CSV files with this viewer. It assumes a particular format of CSV file (a sketch of writing this format from robot code follows the example below):

TIME,    value_name_1, value_name_2, value_name_3, ...
sec,     units_for_1,  units_for_2,  units_for_3,  ...
0,       1,            4,            7.93,         ...
0.02,    2,            4,            7.94,         ...
0.04,    1,            4,            7.98,         ...
0.05435, 1,            4,            7.97,         ...
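For concreteness, producing that two-row header (signal names, then units) on the USB drive might look like this sketch - the path, signal names, and units are made-up examples:

import java.io.BufferedWriter;
import java.io.FileWriter;
import java.io.IOException;

// Sketch: create a log file on the USB drive with a name row and a units row,
// then append one line per sample.
public class UsbCsvExample {
    public static void main(String[] args) throws IOException {
        try (BufferedWriter w = new BufferedWriter(new FileWriter("/U/log_test.csv"))) {
            w.write("TIME,left_encoder_speed,right_encoder_speed,battery_voltage");
            w.newLine();
            w.write("sec,RPM,RPM,V");
            w.newLine();
            for (int i = 0; i < 3; i++) {
                double t = i * 0.02; // one row per 20 ms loop iteration
                w.write(t + "," + (100 + i) + "," + (100 - i) + ",12.5");
                w.newLine();
            }
        }
    }
}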

Amongst the many projects I have running right now is a formal write-up on all the forms of logging and calibration we do. Long story short, there’s a distinct set of robot-development activities that I call “Calibration” - they’re not really traditional software development, and as such require a specific set of tools. One of those main tools is something to record and plot time-varying values which represent physical things on the robot. For the asynchronous workflow (robot runs, SW team comes in after the fact to do the postmortem), this CSV method is the toolset we use.

3 Likes

Is there a reason you guys built a custom CSV logging system as opposed to just putting things on Shuffleboard and using the recording feature? I’ve never used the recording feature myself, but it sounds like it might be a simpler solution here, since it’s intended for playing back data from a match.

I guess CSV has the advantage of being very simple, but I’m just wondering if you’ve considered other options.

Also, I wonder how well InfluxDB (or another timeseries database) plus Grafana would work. That might be something useful to try out, but it’s a whole other topic in itself.

1 Like

The biggest reason was that we wanted a system guaranteed to capture one sample per loop of code execution, at a well-defined point in that execution. From what I understand, when interacting over NetworkTables, the behavior is more asynchronous. I think that with an asynchronous producer/consumer model like that, the producer would need to source a queue of data to meet the requirement… which, to my knowledge, isn’t part of the current NetworkTables implementation.

Additionally, there’s some technical inertia - we started developing these systems back in 2016 as workarounds to the limitations of SmartDashboard. They kept working, and people know how to use them, so there’s resistance to “change for the sake of change” during the build season. Evaluating other options hasn’t risen high enough on anyone’s summer to-do list to actually force a change.

I’ve looked into other database formats in the past, but nothing beat the ease of implementing CSV (since, largely, we weren’t 100% satisfied with any of the off-the-shelf tool offerings, and knew we’d be at least partially implementing things ourselves). In a broader sense, our goal is really to emulate parts of the functionality present in ATI’s Vision or Vector’s CANape suite of tools, which can drive some subtle differences from more CS-y data visualizations.

Totally not to say we’ve picked the best option. Technical inertia is a very real thing.

1 Like

Yes, NetworkTables is asynchronous by default and doesn’t have a built-in queue to track multiple changes (I have a draft spec for doing this, but I’ve not implemented it yet).

However, if your main loop is slower than 10 ms, and you only care about one set of values each loop iteration, you can call NetworkTableInstance.flush() every loop to get effectively synchronous behavior. This also improves latency (mainly useful for things like vision processing on coprocessors).
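For example, a sketch of that pattern (assuming the NetworkTableEntry-based Java API and the default 20 ms TimedRobot loop; the table and entry names are arbitrary):

import edu.wpi.first.networktables.NetworkTableEntry;
import edu.wpi.first.networktables.NetworkTableInstance;
import edu.wpi.first.wpilibj.TimedRobot;

public class Robot extends TimedRobot {
    private final NetworkTableInstance inst = NetworkTableInstance.getDefault();
    private final NetworkTableEntry leftSpeed =
        inst.getTable("log").getEntry("left_speed");

    @Override
    public void robotPeriodic() {
        // Publish this loop's sample...
        leftSpeed.setDouble(readLeftEncoderRate());
        // ...then push it out immediately instead of waiting for the ~100 ms
        // dispatch period. Safe here because the 20 ms loop is slower than
        // the 10 ms flush rate limit.
        inst.flush();
    }

    // Placeholder for a real sensor read.
    private double readLeftEncoderRate() {
        return 0.0;
    }
}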

1 Like

Gotcha. I believe this would also drive the requirement that the consuming end needs to poll NT at least once per data sample? Or does flush() push the data out to clients? You’d probably also want a loop counter in the dataset to help the consuming end tell whether it has read duplicate or missed samples.
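On the consuming side, that loop-counter check might look like this sketch (assuming the robot publishes an incrementing “loop_count” entry alongside the data):

import edu.wpi.first.networktables.NetworkTableEntry;
import edu.wpi.first.networktables.NetworkTableInstance;

// Consumer-side sketch: a loop counter published by the robot reveals
// duplicate reads (counter unchanged) and missed samples (counter jumped).
public class SampleChecker {
    private final NetworkTableEntry loopCount =
        NetworkTableInstance.getDefault().getTable("log").getEntry("loop_count");
    private long lastCount = -1;

    // Call once per consumer poll.
    public void check() {
        long count = (long) loopCount.getDouble(-1);
        if (count == lastCount) {
            System.out.println("Duplicate read: no new sample published");
        } else if (lastCount >= 0 && count > lastCount + 1) {
            System.out.println("Missed " + (count - lastCount - 1) + " samples");
        }
        lastCount = count;
    }
}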

1 Like

flush() pushes the data out to clients. If the consuming end is using NetworkTables listeners rather than polling (I think all dashboards do this), it will see every value change. However, there is no mechanism in the network stream to indicate to the remote end that a flush has occurred; if you want to group multiple value changes, you’ll have to do it heuristically based on the arrival times of individual value changes (each individual value change is timestamped as it arrives and is processed).

This doesn’t work for the client-to-client case (as it passes asynchronously through the server update dispatch loop), but it will work for the more common cases of server-to-client (e.g. Rio to dashboard) and client-to-server (e.g. coprocessor to Rio).

A bit more detail: NT keeps a local table of all values. There’s a dispatch loop that runs every 100 ms by default and calls flush() to send all value changes to clients. Calling flush() yourself sends all value changes immediately (and resets this timer). Flush is rate-limited to every 10 ms to protect the network (we don’t like Christmas trees on the field).

1 Like

Sweet, makes sense. Tacking it on to our list of stuff to do.

For broader context, I’m mixing together two parts of our whole data logging & calibration setup. One piece is logging data to a file on the robot. A second piece is transferring data in real time to various web-based interfaces on a computer being used to calibrate the robot. I tend to view it all as one suite of functionality.

I am fairly tied to the visualizations we’ve created - we know how to use them and I really like how they work. By far the most annoying portion of our setup is the implementation of the websockets/JSON data streaming and keeping it robust.

All that being said - we probably will still continue to log local CSVs. If the radio or the network goes down for some reason, I want the data to keep recording to help the postmortem. Definitely a self-driven, paranoid requirement though.

1 Like

Bringing this thread back to the topic of data logging on the Rio: I have a draft WPILib PR for high-speed binary data logging on the Rio. It needs some JNI work before it’s usable from Java, but the concept is for it to provide the low-level binary storage bits for a higher-level framework. The reason for going binary is that it’s far faster than text, especially when logging floating-point values; I’ve benchmarked approximately 200 ns per timestamped double with this PR. The file format is straightforward, so it should be easy to convert to CSV with a small Python or Java program on the desktop.
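To give a feel for why binary wins: a timestamped double can be a fixed 16-byte record, with no string formatting on write and no parsing on read. A generic illustration of the idea (this is not the PR’s actual file format):

import java.io.BufferedOutputStream;
import java.io.DataOutputStream;
import java.io.FileOutputStream;
import java.io.IOException;

// Generic illustration of timestamped binary records (NOT the PR's format).
// Each record is 16 bytes: an 8-byte microsecond timestamp + an 8-byte double.
public class BinaryLogSketch {
    public static void main(String[] args) throws IOException {
        try (DataOutputStream out = new DataOutputStream(
                new BufferedOutputStream(new FileOutputStream("log.bin")))) {
            long timestampMicros = 1_234_567L;
            double value = 7.93;
            out.writeLong(timestampMicros); // fixed width, no string conversion
            out.writeDouble(value);         // raw IEEE-754 bits
        }
    }
}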

3 Likes