# Velocity PID Programming: Terms dependent on time

Several of the simplified position PIDs I’ve seen fail to mention that you need to incorporate time into your formulas for the I and D terms.

So, it seems that would carry over to a velocity PID.

Let’s say I have a simple velocity PID in pseudo-code (note - this pseudocode is not correct).

```
Prop_Term = (Desired_Velocity - Actual_Velocity) * Prop_Gain
Der_Term = (This_Velocity_Error - Last_Velocity_Error) * Der_Gain
Feed_Forward = Last_PID_Output

PID_Output = Prop_Term + Der_Term + Int_Term + Feed_Forward
```

Which of these terms need to be normalized by dividing them by the loop timing?

Ok, so I’m going to put the corrected velocity PID here, to make sure no one uses the incorrect one above:

```
Correct pseudocode
Prop = (setpoint - measured_value) * Prop Gain
Der = (error - previous_error)/dt * Der Gain
Feed_Forward = Last_Output
Output = prop + int + Feed_Forward
```

You’re missing parentheses in the term for proportional gain. Anyway, if the loop always runs at the same frequency, dividing by the loop time is unnecessary, since the tuned gains can absorb the loop time. If it is not constant, the I and D terms have to account for the loop time explicitly.

Thanks - the pseudocode is corrected.

While I understand running in a forced timing loop will get good results, forced timing in LabVIEW results in very high processor utilization.

“Waited” loops are only a minimum. For instance, if you have a 20ms loop and your timing control is a 20 ms wait, the actual loop execution time is never shorter than 20ms, but can be as long as it likes.

Forcing priority so that a loop ran at consistent timing added 10-15% processor utilization last year. As a result, we’d like to use the more elegant solution and account for time in our velocity PID calculation.

After glancing back at my notes from last year to confirm it: both the integral and the derivative of a position PID loop require time normalization. I would guess that carries over, so the velocity term itself in a velocity PID should need it as well (and I’m guessing on that); the integral should not (the integral of velocity is position); and I don’t know about the derivative term, acceleration, though I would guess it should as well.

The internet is woefully light on velocity PID documents that a layman can get info from.

What the heck is the integral term in a position PID anyway? How do you integrate position?

I believe the Integral value would be multiplied, not divided. The longer the time interval, the bigger the integral. The Derivative term, as you said, is divided, because a longer time interval means a lower derivative (less slope).

You’re not integrating position, you’re integrating the error of the position. So when the system actually achieves the set position, the error becomes zero, and the Integral term stops changing. That’s the great thing that Integral adds (compared to just Proportional). Without it, the system always needs to have some error, otherwise the output goes to zero.
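To illustrate that point, here’s a tiny numeric sketch (Python, with made-up error samples): the integral accumulates while the error is nonzero, then holds its value once the system reaches the setpoint.

```python
# Hypothetical error samples: the system converges to the setpoint.
errors = [4.0, 2.0, 1.0, 0.0, 0.0, 0.0]
dt = 0.02   # 20 ms loop time
ki = 0.5    # made-up integral gain

integral = 0.0
history = []
for error in errors:
    integral += error * dt * ki   # integrating the *error*, not the position
    history.append(integral)

# Once the error hits zero, the integral stops changing but keeps
# contributing its accumulated value to the output.
```

That held-over value is exactly the “constant power with zero error” behavior that a P-only controller can’t give you.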

Integral term:

i += (error * dt * ki)

Alternatively, integral could be

i += (error * dt * kp) / ti, using ti as the integral time instead of an integral gain.

Derivative term:

d = ((error - error_last) / dt) * kd

Alternatively, derivative could be

d = ((error - error_last) * kp * td) / dt, using td as the derivative time constant instead of kd as the derivative gain.
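A quick numeric check (Python, with toy values I picked for illustration) that the two parameterizations agree when ki = kp/ti and kd = kp*td:

```python
# Toy values; ti and td are the integral/derivative time constants.
kp, ti, td = 2.0, 0.5, 0.1
ki = kp / ti   # equivalent integral gain
kd = kp * td   # equivalent derivative gain

error, error_last, dt = 3.0, 1.0, 0.02

# Integral contribution, both forms.
i_gain_form = error * dt * ki
i_time_form = (error * dt * kp) / ti

# Derivative contribution, both forms.
d_gain_form = ((error - error_last) / dt) * kd
d_time_form = ((error - error_last) * kp * td) / dt
```

Either way you write it, the terms come out identical; the time-constant form just makes kp the single “master” gain.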

In a positional controller, I deals with steady-state error:
-If the P and D terms land the output just a little bit from where it wants to be, eventually I will wind up and push it there.
-If the output requires a constant power to hold position, the I term will find it and stay there (once again, not instantly but over time).

This has issues:
-In a pure implementation, the I term HAS to overshoot to unwind.
-Most of the complex logic in implementing PID deals with limiting the integral term.
I’ve tried various integral limiting methods including:
-Reset I when state setpoint changes (ALWAYS)
-Hold I in reset when disabled (ALWAYS)
-Limit absolute boundaries of I
-Reset I when error zero-crosses
-Limit boundaries of I based on either a linear equation or lookup curve based on error (e.g. at error 0, I boundary is set at ±1, but at error 15, I boundary is set to ±0.25, and at error 30, I is completely zeroed out).
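As a concrete sketch of two of those methods in Python (the clamp value and samples here are invented for illustration):

```python
def update_integral(integral, error, last_error, dt, i_limit):
    """Accumulate the I term with zero-cross reset and an absolute clamp."""
    if error * last_error < 0:        # error changed sign: reset I
        integral = 0.0
    integral += error * dt
    # Limit absolute boundaries of I.
    return max(-i_limit, min(i_limit, integral))

integral = 0.0
last_error = 0.0
for error in [5.0, 5.0, 5.0, -1.0]:   # the last sample overshoots
    integral = update_integral(integral, error, last_error,
                               dt=0.02, i_limit=0.25)
    last_error = error
```

Here the accumulator saturates at the clamp on the third sample, then the zero-cross on the fourth sample dumps it entirely, so the controller doesn’t have to slowly unwind a big windup.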

BUT, for a speed controller, the control is slightly different. Think of it this way: the input parameter is the derivative of position (which it is), so to get back to a motor power output you need to integrate the output. Then, you end up with:
P becomes I -> I does the work of P in a normal controller, since it has to find steady-state power and essentially has to do the heavy lifting
I becomes double-integrated and is useless
D becomes P -> P does the work of D and deals with transient response
Simply, if you implement a normal PI controller (with feed-forward), and calibrate P as if it was D and I as if it was P, you will end up with a velocity controller.
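Here’s a rough Python sketch of that idea (the gains and the toy plant are invented, not from the post): the output accumulates each cycle (feed-forward = last output), so the P-style term ends up doing the steady-state work an I term normally does, and the D-style term handles transients the way P normally would.

```python
KP = 0.005   # tuned like an I gain: does the steady-state heavy lifting
KD = 0.001   # tuned like a P gain: handles transient response

def velocity_pid_step(setpoint, measured, dt, state):
    """One iteration of the accumulating (velocity-form) controller.

    state is (last_output, last_error); returns (output, new_state).
    """
    last_output, last_error = state
    error = setpoint - measured
    prop = KP * error                      # steady-state work
    der = KD * (error - last_error) / dt   # transient response
    output = last_output + prop + der      # feed-forward = last output
    return output, (output, error)
```

Feeding this a toy plant where speed is proportional to power, the measured speed converges to the setpoint even though there is no explicit I term in sight.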

If you feed forward, then the ideal I contribution is 0 at steady state, so you can limit the I term in various ways except zero-cross reset without worrying too much. You can also cal the I gain really low since it doesn’t have to lift very much. P gain would stay the same with or without feed-forward control.

As for timing, I’ve had good luck with the RT timed loops. They do eat a lot of CPU usage, but they are worth it. A few tricks I’ve learned since we switched from LV 8.6 in terms of CPU load:
-Any VI call should be considered substantial overhead and avoided when possible.
-Any VI call that can be a subroutine (set in VI Properties->Execution->Priority) is substantially less overhead than a normal VI call and should be considered zero overhead. BUT every subVI called from a subroutine has to also be a subroutine, and you can’t debug subroutines in realtime.
-Any VI call that can be re-entrant and is small (e.g. math functions, button mapping functions, etc.) should be inlined (set re-entrant state to ‘Preallocate clones’, uncheck error handling and allow debugging, and check Inline SubVI, all under VI Properties->Execution)
-The WPIlib should be treated as horribly inefficient and any call to it should be thought out very carefully. I’ve gone through the trouble of reworking the majority of it to be more efficient; it’s a huge PITA and time waster, but it does save a ton of CPU time.
–If you actually read most of the VI’s, you will likely cry.

And, while debugging in realtime
-Any front panel indicator is an inefficiency when the VI is open on your laptop while the code is running. The CPU penalty also goes up with faster loop times.

Ok, I made a couple of edits to the original post in the hopes of preventing someone who stumbles onto this thread from using the first, incorrect one.

Please double check my writing on the ‘corrected’ pseudocode.

You left out the “int” calculation

```
Correct pseudocode
Prop = (setpoint - measured_value) * Prop Gain
Der = (error - previous_error)/dt * Der Gain
Feed_Forward = Last_Output
Output = prop + int + Feed_Forward
```

We’re getting closer.

1. The “int” term needs to be defined. I think you want:
Int += error * dt * IntGain
2. I don’t think we usually want Feed_Forward for positional control, and if we did, it wouldn’t just be the last output. This implementation looks like another integration (it keeps adding the output to the last output). Feed Forward is where you have a predicted function of what input is needed to achieve the desired setting. The great thing about Feed Forward is that the rest of the PID algorithm only needs to account for the non-idealness of the prediction function and the other errors introduced in the system.

Why do RT timed loops “eat a lot of CPU usage” ? Is the context switching overhead for a timed loop really that much different from a “waited” loop?

For example, let’s use Tom’s 15% number. Let “X” be the actual execution time, in ms, of the code in his 20ms task. Then we have:

X/20 - X/(20+X) = 0.15

… which gives X = 9.4 ms.

So Tom’s “20 ms” waited loop is actually running at a period of 29.4 ms, not 20. No wonder it uses more CPU time when you actually run it at 20 ms (using a timed loop).
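For the record, the algebra checks out. Multiplying the equation through by 20(20 + X) gives X² − 3X − 60 = 0, and taking the positive root:

```python
import math

# X/20 - X/(20+X) = 0.15  =>  X^2 - 3*X - 60 = 0 (positive root)
x = (3 + math.sqrt(9 + 4 * 60)) / 2
period = 20 + x
print(round(x, 1), round(period, 1))  # → 9.4 29.4
```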

This is what we do:
m_error = (setpoint - input) * dTime_s; //Using dTime_s will keep the errors consistent if time is erratic

… we compute the time deltas and pass them throughout the entire loop. We usually stay between 10-12 ms, as there really shouldn’t be much overhead.

This means we have large P, I, and D terms… so 100 for our P is about 1 in the usual scaling. This may be a bit unorthodox but quite effective if you give it a try, as we found in last year’s game with the shooter on non-linearized Victor 884s (ouch). Comments and criticism are welcome.

P.S. They were linearized in the off-season with help from the Simbot’s poly, but now all this may not matter with the new 888s.

The short answer, in my OPINION is:

The Proportional and Feed Forward terms need no time normalization. They are instantaneous terms.

The Integral term must be multiplied by the time interval. You should be doing this before accumulating the error each cycle (accumulator += error * t). Don’t multiply your accumulated value by the time interval (my mistake earlier).

The Derivative term must be divided by the time interval ((error - lasterror) / t).

This will allow you to keep consistent P, I and D constants that are independent of your PID loop interval.

I say in my OPINION because there are countless ways to play with these to try and optimize performance, and if you ask two people, you’ll probably get two different answers.

This results in the following pseudo-code:

```
FF = Kff * setpoint
P = Kp * error
accumulator += error * timeInterval
I = Ki * accumulator
D = Kd * (error - lasterror) / timeInterval
Output = FF + P + I + D
```
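That pseudo-code translates nearly line-for-line into working code. A Python sketch (the class wrapper and state handling are mine; the math follows the post):

```python
class PIDF:
    """Time-normalized PID with feed-forward, per the pseudo-code above."""

    def __init__(self, kff, kp, ki, kd):
        self.kff, self.kp, self.ki, self.kd = kff, kp, ki, kd
        self.accumulator = 0.0
        self.last_error = 0.0

    def update(self, setpoint, measured, dt):
        error = setpoint - measured
        ff = self.kff * setpoint
        p = self.kp * error
        self.accumulator += error * dt                 # I: multiply by dt
        i = self.ki * self.accumulator
        d = self.kd * (error - self.last_error) / dt   # D: divide by dt
        self.last_error = error
        return ff + p + i + d
```

Because dt appears in the I and D terms, the same gains should behave the same whether you run the loop at 10 ms or 20 ms.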

For velocity PID (at least that’s what the title is), a feed forward algorithm would calculate the steady-state motor power required and add it to the end.

Algorithm implementations I’ve thought of for velocity control feed-forward:
-Equation to solve for motor power given RPM, assuming a 12 V battery (the I term takes care of battery differences).
-Equation to solve for motor power given RPM and vBatt.
-2d curve table (interpolated lookup table w/ 1 input x 1 output) from RPM to find motor power.
-3d map table (interpolated lookup table w/ 2 inputs x 1 output) from RPM and vBatt to find motor power.
I’m sure there are others.
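The 2d curve-table option might look like this in Python (the RPM-to-power calibration points here are completely invented; a real table would come from measuring your mechanism):

```python
import bisect

# Invented calibration points: (RPM, motor power at ~12 V).
CURVE = [(0, 0.0), (1000, 0.25), (2000, 0.45), (3000, 0.70), (4000, 1.0)]
RPMS = [pt[0] for pt in CURVE]

def feed_forward(rpm):
    """Linearly interpolate motor power from the RPM curve table."""
    if rpm <= RPMS[0]:
        return CURVE[0][1]
    if rpm >= RPMS[-1]:
        return CURVE[-1][1]
    i = bisect.bisect_right(RPMS, rpm)
    (x0, y0), (x1, y1) = CURVE[i - 1], CURVE[i]
    return y0 + (y1 - y0) * (rpm - x0) / (x1 - x0)
```

The 3d map version is the same idea with a second interpolation axis for vBatt.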

-The standard LabVIEW loop runs at default priority and is timed by a Wait call at the end. At the end of a task, LV will do a context switch somewhere else for the duration of the wait and come back when it’s done. So your actual loop time equals the wait time plus the actual execution time; this really sucks for timing determinism but is really easy to implement.
-An alternative LV loop is timed to the UDP packets. A wait-on-occurrence (basically a synchronous blocking wait that waits for an event or times out) is triggered when a control packet is received and parsed. This removes the actual execution time from the timing equation but introduces timing non-determinism from the driver station laptop, network overhead and latency variations, and network load. This is generally not good either.
-LV can also use an ‘RT Timed Task’, which basically guarantees perfect timing if the CPU is not overloaded with tasks of the same or higher priority (which I would expect of any task, but this is the only way to implement a highly deterministic task in LV).
-Our normal loops generally run at 22ms or so but we have many of them (so they still manage to run up to ~90% CPU under worst-case timing conditions).

With a lot of architecture discussions and efficiency improvements, we are currently running all code in a single 10ms RT task with no worries (yet). If we have issues, then we can step down to 12ms or 16ms or 20ms.

In all honesty, I have no idea how we can be so inefficient that we’re running up to the limits of a 400mhz PowerPC with what we’re doing, at the relatively slow speeds we run at, with the relatively small amount of code, without thinking about running vision on that processor as well.

You’re right. I saw the first line of the original post, and went with that. Sorry if I caused any confusion. The rest of my comment regarding Feed Forward still stands, and I think it agrees with your post.

Because of this, I tested the current code build on a cRIO (4-slot) and the code is using ~38% CPU right now. Since our test chassis (joysticks to 2-PWM drive with servo shifters only) used ~33-34%, but was reading a lot of FPGA IO for display, I’m guessing the overhead (OS + Netcomm) is around 26-30%, and the FPGA transfers use around 3% depending on the number of transfer objects.

Both tests were done with a 10ms RT task at priority 2000 with only the highest-level VI front panel open displaying an iteration timer and RT CPU loads on the front panel. Number is combined CPU load from all priorities.

We’ve really optimized the architecture and all of the calls to maximize flexibility in calibration and debugging while minimizing the front panel objects on code blocks. This seems to be working really really well.

Why do RT timed loops “eat a lot of CPU usage” ? Is the context switching overhead for a timed loop really that much different from a “waited” loop?

I’m not really sure.

In all tests I’ve done, they just do.

It’s likely the fact that the code is actually running faster than a waited loop.

Running faster, as in “task period is shorter”. Makes sense.

I’m not certain about LabVIEW’s implementation, but in C++ you can request the system clock’s time since power-up. This has a resolution of 1 ms. Before you do your dt calculation, you can figure out how much time has passed by asking for the current system time and subtracting the last time you asked. I think this is much better than any loop where you just assert how long it takes. This will account for anything that changes the CPU load, perhaps vision code or the garbage collector.

```
new_time = current_sys_time
do pid calculations with dt = new_time - old_time
old_time = new_time
```
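The Python analogue, just to make the pattern concrete (time.monotonic() is the equivalent clock source there; it only moves forward, so wall-clock adjustments can’t produce a negative dt):

```python
import time

class LoopTimer:
    """Measure the actual dt between control-loop iterations."""

    def __init__(self):
        self.old_time = time.monotonic()

    def dt(self):
        # new_time = current system time; dt = new_time - old_time
        new_time = time.monotonic()
        elapsed = new_time - self.old_time
        self.old_time = new_time
        return elapsed
```

You construct one timer and call dt() once per loop iteration, feeding the result straight into the I and D terms.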

The same thing exists in LabVIEW. It’s the ‘Tick Count’ block with the same 1ms accuracy.

I believe there’s a more accurate Tick Count in the RT section that can measure from other clock sources. BUT, I haven’t ever needed more than ms accuracy.

In an ideal realtime world there is no garbage collector, all of the RAM is statically allocated in blocks and never dynamically allocated. In an ideal realtime world, people also don’t try to run vision on the same processor that runs hard RT control loops without actual RT control loops, when the RT task is already using the majority of the CPU, and expect the non-RT control loop to be reasonably consistent.

If the resolution of the “system clock” is truly 1 millisecond (as you said), then if you’re running a control loop at 10ms, you could have an error up to 20%.

There’s a much more accurate system timer that is available:

Timer::GetPPCTimestamp() has low overhead and returns time in seconds in double precision with 1/3 microsecond resolution. According to the WPILibC++Source20120204rev3111 source code (dated February 4, 2012), it uses a 64-bit counter incrementing at 33MHz, which if true wouldn’t overflow in your lifetime. Warning: I’m told that earlier versions of GetPPCTimestamp() used a 32-bit counter instead of 64, and would roll over every 130 seconds. Check the source code on your machine to make sure.
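Those rollover figures are easy to sanity-check, assuming the 33 MHz rate quoted above:

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600

rollover_32 = 2**32 / 33e6                     # seconds until a 32-bit counter wraps
rollover_64 = 2**64 / 33e6 / SECONDS_PER_YEAR  # years until a 64-bit counter wraps

# Roughly 130 seconds for 32 bits, and on the order of 18,000 years for 64.
print(round(rollover_32), round(rollover_64))
```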