PIDController causing code to fail silently

Hola,

We attempted to add a PIDController to our RobotPy code:

self.left_motors = wpilib.Talon(2)
self.left_encoder = wpilib.Encoder(2, 3)
self.left_pid_controller = wpilib.PIDController(self.p, self.i, self.d, self.left_encoder, self.left_motors)

But having the PIDController constructor line causes very strange behavior:

We raise a SystemExit when a joystick button is pressed in order to have RobotPy hot reload the code. This works fine, but if we include the PIDController line the code completely stops running. By “stops running” I mean Robot Comms are green, Robot Code is green, SmartDashboard says it’s connected, but the none of the code is run. Print statements won’t work, SmartDashboard updates on every tick aren’t happening, etc. There is no Python running on the cRio.

Any idea what’s going on or how to fix it?

Is anything printed to the console?

It sounds like a crash, although it’s hard to know for sure without a NetConsole dump (which should print something if this happens). Assuming it’s a crash, I’ll explain what’s likely going on. There’s a workaround (reimplement the PIDController class in Python), but naturally that’s a bit ugly. I don’t have a real fix at the moment, although I’ll try to reproduce this locally.

Basically the problem is that hot reloads on the cRio VxWorks OS are a hack with resultant ugly corner cases, mostly because both Python and WPILib don’t have good cleanup code. The hot reload implementation will be much cleaner and work much better on the 2015 control system (which is based on Linux instead of VxWorks), as it will be straightforward to restart the entire robot process (yes, this means that we can get really fancy and auto-restart immediately on file edits, without even a button press).

In short, when you set up a PIDController:

  • RobotPy’s PIDController is a wrapper around the WPILib C++ PIDController class.
  • The WPILib C++ PIDController class uses the WPILib Notifier class and asks for the Calculate callback to get called every X ms.
  • The Notifier class uses the FPGA timer and low-level interrupts to call back into the PIDController.

When you do a hot reload, the RollbackImporter tries to delete all of the loaded modules, and all of them are reloaded the next time your code imports them. The modules include wpilib itself, and I’m guessing that for some reason the Notifier itself or the PIDController callback is not getting cleaned up properly, and when the next timer tick comes along X ms later, the interrupt handler references a dangling pointer, resulting in a crash. It’s also conceivable a race condition exists somewhere in the cleanup process.

Harumph. The netconsole did not indicate any crashes. With a different problem we had a crash and the netconsole output something like “fatal exception at 0x00000000”. This did not happen with the PIDController line.

The output we had was strange, now that I think more about it. If I remember correctly it printed:

System exit called

and nothing after that.

This is part of our boot.py:

    ...
    while True:
        rollback = RollbackImporter()
        robot = None
        try:
            print("Importing user code.")
            robot = __import__("robot")
            print("Running user code.")
            robot.run()
            #runpy.run_module("robot", run_name="__main__")
        except SystemExit:
            print("System exit called")
            pass
        except:
            print("-"*60)
            traceback.print_exc(file=sys.stdout)
            print("-"*60)

    print("User code raised SystemExit; waiting 2 seconds before restart")
    time.sleep(2)
    ...

We would expect to see “User code raised SystemExit;…” after “System exit called”, but we didn’t.

Hmmm… I think we’ve abandoned the PIDController for this season, but I’d like to figure out this problem as soon as I get the chance.

For what it’s worth, I’ve never trusted the hot reload functionality, and we just always reboot our robot when we’re reloading code. It’s only 30 seconds or so.

Also, I’ve had very odd problems in the past using the PIDController, so I’ve tried to avoid using it for most things. When we’ve needed that level of control, we tend to use the PID control available on the Jaguar, instead of using the PIDController class.

One thing worth noting too is you can get more consistent control on your main thread if you control the amount of delay more precisely than using wpilib.Wait() at the end of each loop. We use this object to do it.


try:
    import wpilib
except ImportError:
    from pyfrc import wpilib


class PreciseDelay(object):
    '''
        Used to synchronize a timing loop.

        Usage:

            delay = PreciseDelay(time_to_delay)
            while something:
                # do things here
                delay.wait()

        TODO: Does this add unwanted overhead?
    '''
    
    def __init__(self, delay_period):
        '''
            :param delay_period: The amount of time to do a delay
        '''
        
        self.timer = wpilib.Timer()
        self.delay_period = delay_period
        
        self.timer.Start()
        
    def wait(self):
        '''Waits until the delay period has passed'''
        
        # we must *always* yield here, so other things can run
        wpilib.Wait(0.001)
        
        while not self.timer.HasPeriodPassed(self.delay_period):
            wpilib.Wait(0.001)


With something like this controlling your delay, you can implement PID-like control in your main loop without too many problems.

Awesome PreciseDelay. Thanks, Dustin