I’d like to continue a discussion that started in another thread. I made a new thread here as a courtesy to the OP of the original thread, who expressed his desire not to discuss the larger issues involved.
Here are the things I’d like to have a discussion about. I’d like to hear from FRC programming gurus.
Scenario 1:
A team has code which uses 100ms and 20ms periodic tasks, and they are running low on throughput margin (i.e. CPU usage is too high). They think that they can reduce CPU usage by using interrupts instead of periodic tasks. What say you?
Scenario 2:
A team has subsystems that need to respond to sensor data changes within 1ms. Should they: a) use Interrupt Service Routines to control these subsystems, or b) carefully investigate other approaches (such as a hardware limit switch directly connected to a motor controller, for example)?
The timing of this thread is impeccable. We actually have a situation where we are going to add a sensor that we think will only be active for 20-24 msecs. I was worried that we would need an interrupt to make sure we capture the input condition of the sensor, and was asking around so I can advise my software team what to expect. So we have a situation where we want to run some code based on a quick input condition, and triggering a task or loop from an interrupt is how we would like to do it.
If I recall correctly, the “best practice” was to use the interrupt to set a flag that the timed routine then acts on. That way the interrupt took very few clock cycles and the context switching wasn’t too bad. It also minimized the possibility of getting interrupted during an interrupt.
Obviously this wouldn’t work if the sensor had to respond to a condition in an amount of time less than the timer cycle. And from experience, debugging any interrupt routine can be a real pain.
If you had a timed task that took as long as in scenario 1, you may want to consider making it into a separate thread (using the Task class in C++ or similar). If any task doesn’t require much communication with the main control loop, you could put it in a separate Task.
Now, thoughts on Scenario 1:
If you offload these timed actions into ISRs, they would take up as much (or more) CPU time as they would running in the main process. The main disadvantage is that if an interrupt fires during a time-critical operation (like sending data to the host), it could cause problems.
Scenario 2:
If the team in question has code that can respond within 1ms, ISRs are a good choice. A hardwired solution would be much more difficult to make, debug, and change than an ISR.
What kind of signal does the sensor provide? If it’s a digital input, consider using a counter to capture the occurrence. The FPGA will detect the signal as quickly as it can do anything, and the robot code can watch for the counter to change value on whatever schedule is appropriate for responding to the signal.
Nice suggestion. We will give that a shot on Monday. You might see it work next Saturday…lol…sounds like we are coming to Kokomo on Saturday to the TechnoKats field at 1pm.
For 100ms and 20ms loops, I would first profile to see what is taking CPU. I would probably decide to lighten the processing of the 20ms and 100ms loops by removing Smart Dashboard access, refnum by name access, and perhaps some of the scaling and wrappers of WPILib. If using LV, I’d serialize some of the items in the loop to avoid context switches that I really don’t need/want. Unless I have tons of these loops, I suspect that would be good enough.
If I’m trying to respond to something in less than 1ms, I’m probably in trouble anyway since the motor controllers only pay attention to their PWM power value every 5 or 10 ms depending on the model. Other actuators seem slow as well, but I don’t know the values. The limit switch and other Jag inputs are polled at 1ms by comparison. CAN messages bridged over serial or enet might be fast enough.
Anyway, I’d have to compare the timed loop with very little inside it versus the interrupt.
If I was relatively certain the code wouldn’t change, and had the FPGA tools, I’d move the polling into the FPGA. Depending on the complexity of the response, it may be possible to put it in the FPGA too. But again, if the code is going to change, you will shave many times before you complete the project.
Doing the polling in the FPGA, and the response in RT is essentially what FRC interrupts are. If the polling fits the envelope of what is available, it allows high speed monitoring and flexible response for FRC.
We haven’t used the interrupt handling abilities of WPILib (C++) on our team – is there any guidance around as to what works/doesn’t work from inside an ISR? For example, I don’t remember seeing any special logic in WPILib to allow concurrent access to the actuator or sensor classes.
The period of the tasks is well within the bandwidth of the OS and hardware we use in FRC. The real question is the run time in each period. If we have 3 tasks that run at periods t1, t2, t3 and require c1, c2, c3 of computation time per period, then you calculate c1/t1 + c2/t2 + c3/t3 and have some feeling for how much bandwidth is being used. Of course there are other things going on. The OS consumes bandwidth for its clock and scheduler. In FRC robots some bandwidth is used by the task receiving DS messages, plus one might have PID callbacks, etc. If the sum of all terms for the OS, app, and WPI tasks gets anywhere near 1 then you are in trouble (actually lower than 1; see Rate-monotonic scheduling - Wikipedia).
If the events that generate the interrupts are periodic and you are running the same-ish code to handle the event, then you are not saving much time. However, if the events are aperiodic, using interrupts may save considerable bandwidth. If they use interrupts, the best practice is to quickly 1) determine if your hardware caused the interrupt, 2) clear/reset the interrupt cause, and 3) set a flag, give a semaphore, etc., to synchronize with code running in a task that reacts to the event.
1ms interrupts can be handled by the hardware and OS we use in FRC. The interrupt latency of our system is measured in single microseconds. Again the real question is how long does it take to “handle” the interrupt. If the interrupt is periodic or the worst case is nearly periodic you can analyze the effect on the system just as you would a task.
BTW, handling a mechanical limit switch using interrupts is a little tricky if the hardware does not debounce the input. I don’t know that our DIO inputs do this.
We have used interrupts on the cRIO before in previous years, and my advice is to run away screaming. Most of the time they work quite well, but occasionally we would get a full cRIO reboot without any debug information coming back from the cRIO about why. Our programmers very carefully read the docs on what instructions were and weren’t allowed in the interrupt, and were handling the interrupt by incrementing a semaphore which would then wake up a high priority task to do the actual work. The code was looked over with a fine tooth comb. Maybe we were still managing to do something wrong, but given the number of matches that it cost us where we either sat dead or were stuck in an infinite reboot loop, we won’t be doing it again.
In our case, we were using the interrupts to capture an encoder value when a magnet would pass over a zeroing hall effect. In my opinion, this is a perfect use case for using an interrupt where you want a very fast response time.
Did you actually care about the encoder value? Or, as the “zeroing” description implies, were you just reading it in order to use as a reference for later readings of the delta? If that’s all you were using it for, you wouldn’t have needed to do any special programming. The FPGA lets you define a digital input pin as an encoder reset signal.
I tried to use interrupts this year and concur - something is flaky. But it is somewhere in the library or the FPGA. Interrupts are used on FRC robots for the main timer, for the arrival of network data and many other purposes - so they do work. In my case interrupts would fire for a while and then stop firing. The rest of the code continued to run (not crash as in Austin’s case) but the limit switch I tied to the interrupt pin just stopped generating interrupts.
There is a list of OS functions (not to call) in the VxWorks Programmer’s Guide. Basically do not call anything that only makes sense from a task context (no taking semaphores, no delays, no memory allocation, no formatted I/O, no mutexes, doing anything that might block etc) plus no floating point math (unless you save/restore the floating point context).
In Austin’s example my (wild) guess is that the source of the interrupts (occasionally) did not get turned off. If more interrupts occur than VxWorks can handle, its internal queue will overflow (something like a Windoze BSoD).
I used interrupts in the first year that the cRio was used, and didn’t have any problems. YMMV though…
I would probably recommend restructuring the code so that everything can run in a single thread. In an FRC context, my opinion is that typically threads cause more trouble than they’re worth.
A team has subsystems that need to respond to sensor data changes within 1ms. Should they: a) use Interrupt Service Routines to control these subsystems, or b) carefully investigate other approaches (such as a hardware limit switch directly connected to a motor controller, for example)?
First, I would ask if they really need 1ms response time. If possible, I would try to reframe the problem so that the response time isn’t required.
I like Alan’s suggestion to use a counter, that sounds like a real winner.
As a side note, when possible, I connect my limit switches to Jaguars.
Yikes - a little conservative, maybe? We have a world-class real-time multi-tasking OS at our disposal and we don’t use it? In years where our students don’t “get” the multi-tasking from the start I create templates that prevent them from messing up the timing. WPILib hides synchronization and mutual exclusion pretty well and the messaging API is straightforward. By the end of build season the students always seem to “get it”.
In '08, for the introduction of the cRIO, some AEs built an ambitious robot named NItro. It had three omni wheels with encoders and they decided to use NI Soft Motion and interrupts to calculate the Jacobian for the motor control values for path planning.
I came on late to help them finish the project and found that they weren’t acknowledging the interrupts but were still running fine. I also found that they were running over 15k interrupt handlers per second, but again, running fine. At the booth, they ran the robot all day long in its wooden pen going through battery after battery.
This was with LV, so I cannot comment on the C++ handlers, and this was before the FRC FPGA was complete, so it is not apples to apples.
Anyway, interrupts are in the product and should work. If you have issues, please provide details so that we can improve them or their documentation.
It’s obvious and straightforward in LabVIEW. The External Reset input terminal, and its associated polarity select, are at the top of the Encoder Open function icon.
In Java, some of the Encoder constructors accept a third Digital Input to specify the index signal.
In C++, it looks like the reset/index feature is not exposed in the Encoder class. That’s unfortunate. I don’t know enough about the low-level resource interface to the FPGA to be comfortable trying to suggest adding a new constructor that includes it.
Is this edge triggered or will it clamp to 0 whenever the input is active? The former seems better as you can travel over the sensor and be fine (so long as you don’t go past the sensor. :ahh: )
The LabVIEW help suggests that the External Reset is level sensitive, meaning it should continuously zero the encoder value as long as it is active. I can’t verify that with complete certainty using the documentation I have access to.