Whitepaper: Drivetrain Algorithm Model

I’ve been thinking for a while about how to break up drivetrain control algorithms into logical layers. I know a lot of teams already do this in one way or another; for example, having a unified pose controller that can be commanded by either driver input or autonomous code. This paper is my attempt to create a conceptual model that’s general enough to describe any drivetrain control system, both to give a common footing when discussing algorithms, and to suggest ways to organize the actual code.

Comments and suggestions are welcome.

Drivetrain_Model.pdf (180.7 KB)


This aligns with the concept of pipelines I’ve been trying to push.

This style of architecture encourages testable code, since only the top- and bottom-most layers require hardware, either in the loop or mocked out.

Personally, I’d really like to see some standard for the output of kinematic functions, maybe built around the geometry_msgs/Twist messages, with some sort of custom state object alongside it.

The really important part to me is separating the input-reading and output-writing segments. Cool work. Any intention to turn it into something a bit more usable for teams?


Do you mean a standard way of representing robot state, like \vec x=\{x, y, \theta,\dot x, \dot y, \dot \theta, \ddot x, \ddot y\} or something? That would definitely be useful, but there would be some complications. While I was working on this paper, I was originally trying to represent driver commands as \Delta \vec x, or d\vec x/dt, but I don’t think that works, at least not with normal velocity-command driving. What you’re fundamentally trying to do is set one part of the state vector (the velocities) while not caring about what the other members end up being.
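One way to express "set the velocities, don't care about the rest" is a partial state command where each member is either a commanded value or empty. This is just a hypothetical sketch of that idea (the class and factory names are made up, not from the whitepaper):

```java
import java.util.OptionalDouble;

// A partial state command: each state member is either a commanded value
// or "don't care" (empty). Velocity-command driving sets only velocities.
public class StateCommand {
    public final OptionalDouble x, y, theta;   // pose members
    public final OptionalDouble vx, vy, omega; // velocity members

    private StateCommand(OptionalDouble x, OptionalDouble y, OptionalDouble theta,
                         OptionalDouble vx, OptionalDouble vy, OptionalDouble omega) {
        this.x = x; this.y = y; this.theta = theta;
        this.vx = vx; this.vy = vy; this.omega = omega;
    }

    // Typical teleop driving: command velocities, leave the pose unconstrained.
    public static StateCommand velocity(double vx, double vy, double omega) {
        return new StateCommand(OptionalDouble.empty(), OptionalDouble.empty(),
                OptionalDouble.empty(), OptionalDouble.of(vx),
                OptionalDouble.of(vy), OptionalDouble.of(omega));
    }
}
```

A consumer can then check which members are actually constrained before closing a loop on them.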

Well, the problem with this is that it’s so generic that it doesn’t really lend itself to implementation as a software library or anything; I intentionally leave the datatypes and stuff undefined. If you just mean providing more in-depth examples of how to write code that conforms to this model, sure, I could do that. I’m also implementing it in my FRC SysML library, but that’s probably even less useful for most teams than the whitepaper…

Yup, I specifically mentioned the ROS message types since they seem to work for them.


Re a more concrete implementation - I’ll toss some function headers for what I had in mind later tonight when I’m not on my phone.

SysML - tell me more.

*Bit of a side-note*

This “pipeline” concept is one of the big positives of LabVIEW IMO. With text-based programming languages I find it pretty difficult to tell what parts of what code feed into what other parts. Even with an IDE that lets you search for variable declarations and usages around the project, it can still be confusing if you don’t plan and document properly.

With LabVIEW, variables are carried by wires that literally connect the variable declaration to its use. This makes the program naturally flow from start to end much like the example models in the whitepaper. A lot of times, especially on here, LabVIEW gets bad-mouthed because it’s seen as a lesser programming language, but every language is better at some things than others and this is one of LabVIEW’s biggest positives IMO.

For example, here’s the 2015 robot drive code I just found in my old archives (ignore what the code actually does, it’s probably full of mistakes). It has a clear flow from joystick inputs on the left side all the way to motor outputs on the right.


Just another of my pet projects I work on from time to time. Basically, I set up some base libraries for the FRC domain, with relevant components and properties and stuff:

Then, I can use them to model different aspects of robots:

The end goal is to be able to run analyses and rule-checks on the complete mechanical/electrical/software model of the robot, using Cameo for more complex mathematical relationships. So far, I just have some scripts that let you roll up mass and stuff.

All of this is in MagicDraw, because that’s what I learned to use at work. One of these days I’ll transform the libraries to OWL or something, so other people can use them. And I should probably rebuild them based on our foundation ontologies, since they have some pretty nice features for modeling components and functional relationships and stuff: https://trs.jpl.nasa.gov/bitstream/handle/2014/42082/11-1269.pdf?sequence=1

public abstract class DrivetrainController {
    // Layer 1: read raw joystick state from the hardware
    public abstract JoystickOutputs[] GetControllerInput();
    // Layer 2: apply deadbands, curves, etc. to the raw inputs
    public abstract JoystickOutputs[] GetFilteredControllerInput();
    // Layer 3: turn shaped inputs into a desired robot motion
    public abstract GeomTwist GetDesiredTransform();
    // Layer 4: kinematics, desired motion to per-motor setpoints
    public abstract MotorOutput[] GetMotorOutputs();
    // Layer 5: convert setpoints to outputs and send them out
    public abstract void WriteMotorOutputs();
}

Then subclass it with predefined standard implementations that have custom constructors.


Hmm, interesting. In the case where you’re doing closed-loop velocity control onboard a motor controller, would you just create an implementation where GetMotorOutputs() sends the new velocity setpoint to the motor controller, and make WriteMotorOutputs() a do-nothing function?

Also, you’d want each function to take the output of the preceding function as an argument, no? And would there just be a run() method or something that calls all of them in order?

The way I’ve typically done things like this would be that GetMotorOutputs() generates a structure containing the information that you’d need to write to the other controllers. Then WriteMotorOutputs() does the actual writing. In FRC cases this would be PWM or CAN messages, but there’s nothing specific about it should you want to do something funky.

Yeah, some sort of step() function. I just skipped calling that out because it just has to be there to call everything in the right order; I can’t come up with a reason for anyone to override it.
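The variant discussed above could be sketched like this: each layer takes the previous layer's output as an argument, and a final, non-overridable step() runs the pipeline in order. Type and method names here are placeholders (double[] standing in for the real message types), not the actual proposed API:

```java
// Sketch: each layer consumes the previous layer's output, and step()
// chains all five in order. Subclasses only fill in the layers.
public abstract class DrivetrainPipeline {
    protected abstract double[] readSticks();                     // Layer 1
    protected abstract double[] filterSticks(double[] raw);       // Layer 2
    protected abstract double[] toDesiredTwist(double[] sticks);  // Layer 3
    protected abstract double[] toMotorSetpoints(double[] twist); // Layer 4
    protected abstract void writeMotors(double[] setpoints);      // Layer 5

    // Runs the layers in order; final because there is no reason to override.
    public final void step() {
        writeMotors(toMotorSetpoints(toDesiredTwist(filterSticks(readSticks()))));
    }
}
```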

Since we have a field relative swerve drive on ROS, figured I’d comment on how we did things.

On the joystick end, it is kinda weird because we have to support both a real DS and simulated joysticks. For the real DS, we have code in our hardware interface to periodically read from the DS and publish a joystick message. That’s fairly straightforward. For the sim version, we hook up the sim hardware interface to read from a ROS joystick input node and publish to the sim interface. The sim interface then calls the various HALSIM joystick set code, which is then read by the common part of the hardware interface code as if it were just like a real DS input.

In either case, we have a joystick message out, which goes to what we call the teleop node. That’s kind of a combo of your layer 2 and 3. We shape the raw stick inputs - ramping, power curves, deadband, etc. This is also where we apply the field-centric offset. That comes from a node which reads from our IMU and republishes a sensor_msgs/Imu message, rotated by the amount needed to correctly set 0 heading to be away from our driver. There’s a service call we make to the node which sets zero, and this captures a z angle offset which is applied to every subsequent IMU message passed through the node.
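The shaping steps described above (deadband, power curve, field-centric rotation by the captured heading) can be sketched roughly like this. All names are illustrative, not the actual node code, and the math assumes the usual convention of rotating the translation command by minus the heading:

```java
// Minimal sketch of teleop input shaping for field-centric driving.
public class TeleopShaping {
    // Zero output inside the deadband; rescale the rest back to [0, 1].
    public static double deadband(double v, double band) {
        if (Math.abs(v) < band) return 0.0;
        return Math.copySign((Math.abs(v) - band) / (1.0 - band), v);
    }

    // Power curve: keeps sign, softens response near center for exp > 1.
    public static double powerCurve(double v, double exp) {
        return Math.copySign(Math.pow(Math.abs(v), exp), v);
    }

    // Rotate the (x, y) stick vector by -heading so "forward" stays
    // away from the driver regardless of robot orientation.
    public static double[] toFieldCentric(double x, double y, double headingRad) {
        double cos = Math.cos(-headingRad), sin = Math.sin(-headingRad);
        return new double[] { x * cos - y * sin, x * sin + y * cos };
    }
}
```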

The teleop node publishes a Twist message to a software mux. This is set up to take inputs from multiple sources of Twist messages - mainly teleop nodes and several different auto sequencing nodes. The mux picks the highest priority message and passes it through to the swerve drive controller.
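The mux behavior described above, pick the highest-priority Twist source that is still publishing, could be sketched like this. This is a simplification of the real ROS twist_mux (which also handles locks and per-topic config), with made-up names:

```java
import java.util.TreeMap;

// Sketch of a priority mux: each source publishes at a fixed priority,
// and the mux forwards the highest-priority message that is still fresh.
public class TwistMux {
    private final TreeMap<Integer, double[]> latest = new TreeMap<>();
    private final TreeMap<Integer, Long> stamps = new TreeMap<>();
    private final long timeoutMs;

    public TwistMux(long timeoutMs) { this.timeoutMs = timeoutMs; }

    public void publish(int priority, double[] twist, long nowMs) {
        latest.put(priority, twist);
        stamps.put(priority, nowMs);
    }

    // Highest-priority fresh twist, or null if every source has gone stale.
    public double[] select(long nowMs) {
        for (Integer pri : latest.descendingKeySet()) {
            if (nowMs - stamps.get(pri) <= timeoutMs) return latest.get(pri);
        }
        return null;
    }
}
```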

The swerve drive controller converts robot velocity commands (both strafe and rotation) into motor commands. It also reads encoder values to calculate odometry (which in ROS is a nav_msgs/Odometry message). And it also publishes a TwistStamped so other nodes can know what it is doing (useful for things like localization).
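The core of that velocity-command-to-motor-command step is standard swerve inverse kinematics; a minimal sketch for a single module (not the actual controller code) looks like this:

```java
// Sketch of swerve inverse kinematics: convert a robot-velocity Twist
// (vx, vy, omega) into speed and steering angle for one module located
// at offset (mx, my) from the robot center.
public class SwerveKinematics {
    public static double[] moduleState(double vx, double vy, double omega,
                                       double mx, double my) {
        // Contact-point velocity = chassis velocity + (omega cross r)
        double wx = vx - omega * my;
        double wy = vy + omega * mx;
        return new double[] { Math.hypot(wx, wy), Math.atan2(wy, wx) };
    }
}
```

The forward version of the same math (module velocities back to a chassis twist) is what gives you the odometry mentioned above.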

The way ROS does hardware control is it sets up a read - update - write loop. read and write are generic - they know that e.g. motors exist, but not what they’re used for. Each time read is called, the generic part of the hardware interface reads from all of the configured hardware and stores the state of each in buffers. Write does the opposite - takes commands queued up in command buffers and sends them to the actual hardware. This has two benefits - one, read and write are single threaded, and two, these buffers can just as easily be read/written to/from simulated devices as real hardware.
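The read/update/write pattern with shared buffers could be sketched like this. This is a toy version with flat double arrays standing in for the typed buffers; the real ros_control interfaces are considerably richer:

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of a read-update-write loop: read() fills the state buffer from
// hardware (or sim), controllers only touch buffers in update(), and
// write() flushes the command buffer back out.
public class ControlLoop {
    public interface Controller { void update(double[] state, double[] command); }

    private final double[] state;
    private final double[] command;
    private final List<Controller> controllers = new ArrayList<>();

    public ControlLoop(int nSensors, int nActuators) {
        state = new double[nSensors];
        command = new double[nActuators];
    }

    public void addController(Controller c) { controllers.add(c); }

    // read(): generic, copies hardware state into the buffer.
    public void read(double[] sensorValues) {
        System.arraycopy(sensorValues, 0, state, 0, state.length);
    }

    // update(): each controller reads state and queues commands.
    public void update() {
        for (Controller c : controllers) c.update(state, command);
    }

    // write(): flush the command buffer to hardware (here, just returned).
    public double[] write() { return command.clone(); }
}
```

Because controllers never touch hardware directly, the same buffers can be backed by simulated devices, which is exactly the testability benefit described above.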

Each controller is a shared library, and there’s an interface to get devices by name, and then also to read and/or write the shared buffers accessed by the read() and write() steps of the control loop. There’s code in place to prevent multiple controllers from getting write access to hardware, but multiple controllers can read specific hardware’s state. Each controller’s update() function is called once per control loop, in arbitrary order (potentially in parallel as well), but due to the no-multiple-writers rule it works out.

Update generally reads a command from a topic (in our case, a Twist message) and converts it into individual commands to motors. Or pneumatics. Or digital outputs - whatever, same process. These writes can be actual motor commands, but we’re also set up to write e.g. config values to talons (switch modes, set PIDF, arb feed forward, most anything else the controller can handle). This is also where a layer of watchdogs is implemented - no Twist messages within 0.x seconds means the robot stops.
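The watchdog part is simple enough to sketch directly; this is an illustrative stand-in, not the actual implementation, with the Twist again flattened to a double[]:

```java
// Minimal command watchdog: if no Twist message has arrived within the
// timeout, force the output to a zero twist so the robot stops.
public class TwistWatchdog {
    private final long timeoutMs;
    private long lastMsgMs = Long.MIN_VALUE;
    private double[] lastTwist = {0, 0, 0};

    public TwistWatchdog(long timeoutMs) { this.timeoutMs = timeoutMs; }

    public void onMessage(double[] twist, long nowMs) {
        lastTwist = twist;
        lastMsgMs = nowMs;
    }

    // Last command while fresh; zero twist once the stream goes stale.
    public double[] output(long nowMs) {
        if (nowMs - lastMsgMs > timeoutMs) return new double[]{0, 0, 0};
        return lastTwist;
    }
}
```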


Dinner’s still cooking, so onwards

Auto code (or automated sequences) is run using http://wiki.ros.org/actionlib. Think command-based programming, more or less - it lets us run async automated actions.

Encoder values are published so any nodes can get to them. ROS provides PID nodes, so for e.g. swerve path following, we have 3 separate PID nodes, each taking an axis position to move to along with the current axis position. The output of each is combined by a node which merges 3 separate PID outputs into a single Twist command, and this goes into the Twist mux. Pretty much everything from auto code is robot-relative (or converted to it for e.g. map-relative paths), so there’s no harm in having the field relative code limited to the teleop node.
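The merge step above, three per-axis controller outputs combined into one Twist, can be sketched like this. The P-only control here is a stand-in for the full ROS PID nodes, and all names are made up:

```java
// Sketch of holonomic path following: one controller per axis (x, y,
// rotation), with the three efforts merged into a single robot-relative
// Twist command (vx, vy, omega).
public class HolonomicPathFollower {
    // P-only stand-in for each ROS PID node.
    public static double axisEffort(double target, double current, double kP) {
        return kP * (target - current);
    }

    // Merge the three axis efforts into one Twist.
    public static double[] follow(double[] targetPose, double[] currentPose,
                                  double kP) {
        return new double[] {
            axisEffort(targetPose[0], currentPose[0], kP),
            axisEffort(targetPose[1], currentPose[1], kP),
            axisEffort(targetPose[2], currentPose[2], kP)
        };
    }
}
```

The resulting Twist then feeds the same mux as the teleop node, which is why auto and teleop can share the entire downstream pipeline.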

Normally only 1 topic is publishing at once but the mux is there just in case to make sure we’re not mixing two Twist message streams together.

Pose estimation can come from the swerve controller, as mentioned above. It could be a separate node, but typically odom is done as part of update(), simply because the controller already has access to all of the encoder info anyway, so why not?

We’ve also had luck using visual odometry / mapping from a camera, fused with IMU for heading. I know there’s also the non-linear state estimation meme going around; that’s good too - basically, feed camera data into the /stateestimator oval on the ROS diagram. In our case, localization doesn’t use the odom calculated by the robot, just the input Twist message plus camera data and IMU rotation.


That’s perfectly reasonable, but I’m worried about conflating WriteMotorOutput with my layer 5, because they’re not quite the same thing. Layer 5 is where motor speed (or position) commands get translated into motor throttle, not necessarily where commands get sent out of the RIO. So if you’re doing motor PID on the RIO, that PID controller is layer 5. If you’re doing PID on a motor controller, then that’s layer 5, and there is no layer 5 code on the RIO.
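That distinction could be made explicit in code by giving layer 5 its own interface, with one implementation per place the loop is closed. This is just an illustrative sketch (P-plus-feedforward standing in for a full PID, all names hypothetical):

```java
// Sketch of the layer-5 distinction: the same velocity setpoint is either
// turned into a throttle by RIO-side code, or passed straight through to a
// smart motor controller whose firmware closes the loop.
public interface Layer5 {
    double apply(double setpointVelocity, double measuredVelocity);
}

// PID (here, P + feedforward) on the RIO: layer 5 lives in robot code.
class RioVelocityPid implements Layer5 {
    private final double kP, kF;
    RioVelocityPid(double kP, double kF) { this.kP = kP; this.kF = kF; }
    public double apply(double setpoint, double measured) {
        return kF * setpoint + kP * (setpoint - measured); // throttle out
    }
}

// PID on the motor controller: no layer-5 code on the RIO, the setpoint
// is forwarded as-is and the controller firmware is layer 5.
class OnboardVelocityPid implements Layer5 {
    public double apply(double setpoint, double measured) {
        return setpoint;
    }
}
```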

That’s interesting. If I’m understanding correctly, you’re not actually doing any kind of corrections based on camera localization etc. when following trajectories, and are relying completely on swerve odometry?

Ah, yes, I was conflating those two as “this is the actual data write” (hence the different name). I was including the PID as a step 4.5, which maybe should exist anyway, outside of the step 4 kinematics, to do output filtering and transformation. Most teams likely wouldn’t need this, but for things like traction control systems it would be an extremely handy hook. It would also be a sound place for the setpoint → PID output conversion.

Gah, one of these days I’ll read what’s actually there.

Yeah, I missed a true Layer 5 in my quick version; what I described as 4.5 is exactly your Layer 5. Though I’m still not sure a separate filtering layer isn’t a good idea anyway…

Not quite. We’re also using visual odom from our camera, which does correct based on what it sees around it.

But what I meant here by field relative was just orientation to make driving field-centric. We do all of the field->robot coordinate transforms when paths are generated, and the IMU doesn’t drift enough over the course of driving a trajectory to have to correct it while following a path.