paper: Low Latency Interrupt Framework

Thread created automatically to discuss a document in CD-Media.

Low Latency Interrupt Framework
by: dcbrown

Proposed 3-tier interrupt framework that supports low latency while still allowing general user-written code to be associated with, and executed on, asynchronous events/interrupts.

Work in progress.

InterruptFramework-V0_5.doc (1.63 MB)
InterruptFramework-V0_6.doc (1.67 MB)
myLib_Interrupts.zip (19.9 KB)

I just read through it and it seems very interesting.

I quite like the example driver, and the explanation was very good.

I have a quick question: how would you integrate this with code written for other frameworks (e.g. Kevin’s code)?

Looks really cool, and I hope updates are forthcoming.

Samuel Harrington

You’d do a functional driver port. Once a driver is ported into the framework, it can be used in projects under both MPLAB and EasyC/WPILIB: one set of code from that point forward.

The new framework binds each interrupt to its ISR/data structures at run time, vs. the compile-time binding in Kevin’s code. This is what permits drivers to be delivered as a library. You can tweak the “standard” driver if you want, or just call it and use it as is. The tweaking could get you all the way back to the point that the driver you are using for your robot is essentially equivalent to compile-time binding… or you could choose an available driver package that is more generic, such as the example to follow.
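To make the run-time binding idea concrete, here is a minimal sketch that uses a plain table of function pointers. The names (`sketch_bind`, `sketch_dispatch`, `isr_table`) are hypothetical and are not the framework's API; the table size of 20 comes from the interrupt count mentioned later in this thread. The framework itself ended up flashing a jump table in program memory rather than using a RAM table like this one.

```c
#include <stddef.h>

#define NUM_INTERRUPTS 20   /* interrupt count mentioned later in the thread */

typedef void (*isr_fn)(void);

/* Hypothetical dispatch table: one handler slot per interrupt. */
static isr_fn isr_table[NUM_INTERRUPTS];

/* Bind a handler at run time. Returns 0 on success, 1 on a bad interrupt number. */
unsigned char sketch_bind(unsigned int interrupt, isr_fn handler)
{
    if (interrupt >= NUM_INTERRUPTS) return 1;
    isr_table[interrupt] = handler;
    return 0;
}

/* Called when an interrupt fires: run whatever is bound, if anything. */
void sketch_dispatch(unsigned int interrupt)
{
    if (interrupt < NUM_INTERRUPTS && isr_table[interrupt] != NULL)
        isr_table[interrupt]();
}

/* Example handler that a driver library might ship. */
static int left_wheel_ticks;
void left_wheel_isr(void)   { left_wheel_ticks++; }
int  left_wheel_count(void) { return left_wheel_ticks; }
```

Because the handler is looked up in a table rather than named directly in the ISR source, the same compiled library can serve robots that wire the encoder to different interrupts. The cost is an indirect call on every interrupt.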

For example, the quad encoder is bound to the interrupt (phase_a) and phase_b pins via the “package” routine supplied by the driver writer. The user calls this “package” to declare/set up the system. Typically this is done only once, so it is acceptable that Bind/unBind reflash the program memory jump tables for the appropriate physical and logical service routines. Typically only Binds are done, to hook or register the driver we want invoked by the respective interrupt, as this physical configuration usually doesn’t change while the robot is running. The user of the driver doesn’t need to know how it works, just that pin ‘X’ is the phase A encoder pin and pin ‘Y’ is the phase B encoder pin. The driver in this case only supports 2 channels, but the driver programmer could easily expand it to 6 channels.

The user of the driver then calls the start/read/set/stop routines. Start enables the interrupt and Stop disables it, to control loading on the processor. As part of Start, the caller specifies whether to invert the counting or not. Read returns the current count value for the channel and Set changes the value to the number provided. These routines are provided by the driver writer. The driver writer could have decided to only invert the data at Read time, but in this case does it within the logical ISR handler.

Two handlers are provided: one that runs at interrupt level while interrupts are disabled (the physical driver), and a second half of the driver that runs while interrupts are enabled (the logical driver). The user could have bound a third, user driver to the same interrupt to do other processing (like changing the motor speed to accelerate or stop the robot after so many counts).
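The tiering can be pictured with the sketch below. It is only a simulation under stated assumptions: `interrupts_enabled` is a stand-in variable for the hardware's global interrupt enable, `struct tier_slots` and `sketch_service` are hypothetical names, and running the user tier with interrupts enabled is my assumption (the post only states it for the logical tier).

```c
#include <stddef.h>

typedef void (*isr_fn)(void);

/* Stand-in for the hardware's global interrupt enable bit. */
int interrupts_enabled = 1;

/* Hypothetical per-interrupt slots, one per tier. */
struct tier_slots {
    isr_fn physical;  /* runs with interrupts disabled: grab hardware state fast */
    isr_fn logical;   /* runs with interrupts enabled: process the saved state  */
    isr_fn user;      /* optional user code, assumed to run with interrupts enabled */
};

/* Service one interrupt by walking the tiers in order. */
void sketch_service(const struct tier_slots *s)
{
    interrupts_enabled = 0;            /* as on interrupt entry */
    if (s->physical) s->physical();
    interrupts_enabled = 1;            /* re-enable before the slower tiers */
    if (s->logical) s->logical();
    if (s->user) s->user();
}

/* Demo handlers that record the enable state they observed. */
int seen[3] = { -1, -1, -1 };
void demo_phy(void) { seen[0] = interrupts_enabled; }
void demo_log(void) { seen[1] = interrupts_enabled; }
void demo_usr(void) { seen[2] = interrupts_enabled; }
```

The point of the split is that only the physical tier adds to interrupt latency for everything else; the logical and user tiers can take longer without blocking other interrupts.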

Anyway, the physical driver grabs the phase ‘a’ and phase ‘b’ values from the pins and saves them away. You wouldn’t need phase ‘a’ if this was a programmable edge interrupt, but we don’t know which interrupt line the user will bind the driver to, so it’s generic.

The logical driver then processes this information while interrupts are enabled. This logical driver inverts the count here, but that code could easily be moved to the Read user API routine to reduce the logical driver cpu run-time footprint.


/*
 * User provided routines... bind a user driver to encoder
 */
void QuadEncoder0_MyCode( void )
{
      // Tie my function to encoder0 which is my left wheel
      UsrISR_Bind( &MyLeftWheel_LogISR, XYZ_interrupt );
}

/*
 * Driver writer supplied routines...
  *   bind or configure driver to desired robot-specific interrupts/pins...
 */
void QuadEncoder_Package( channel, interrupt, phase_b_pin )
{
    if (channel == 0) {
        PhyISR_Bind( &QuadEncoder0_PhyISR, interrupt, RISING_EDGE );
        LogISR_Bind( &QuadEncoder0_LogISR, interrupt                     );
	
        QuadEncoder_Initialize( channel, interrupt, phase_b_pin );
        return;
    }

    if (channel == 1) {
        PhyISR_Bind( &QuadEncoder1_PhyISR, interrupt, RISING_EDGE );
        LogISR_Bind( &QuadEncoder1_LogISR, interrupt                     );
	
        QuadEncoder_Initialize( channel, interrupt, phase_b_pin );
        return;
    }

}
void QuadEncoder_unPackage( channel, interrupt, phase_b_pin )
{
    if (channel == 0) {
        PhyISR_unBind( &QuadEncoder0_PhyISR, interrupt );
        LogISR_unBind( &QuadEncoder0_LogISR, interrupt );
	
        QuadEncoder_UnInitialize( channel, interrupt, phase_b_pin );
        return;
    }

    if (channel == 1) {
        PhyISR_unBind( &QuadEncoder1_PhyISR, interrupt );
        LogISR_unBind( &QuadEncoder1_LogISR, interrupt );
	
        QuadEncoder_UnInitialize( channel, interrupt, phase_b_pin );
        return;
    }
}


/*
 * Driver writer supplied API (user callable) routines...
 */
unsigned char QuadEncoder_Start( channel, invert )
{
	if (channel > QUADENCODER_CHANNEL_MAX) return(1);
	switch(channel)
	{
		case 0: quadencoder0.invert = invert;
			InterruptEnable( quadencoder0.interrupt );
			return(0);
		case 1: quadencoder1.invert = invert;
			InterruptEnable( quadencoder1.interrupt );
			return(0);
		default: return(1);
	}
	return(0);
}
unsigned char QuadEncoder_Stop( channel )
{
	if (channel > QUADENCODER_CHANNEL_MAX) return(1);
	switch(channel)
	{
	  case 0: DisableInterrupt( quadencoder0.interrupt ); return(0);
	  case 1: DisableInterrupt( quadencoder1.interrupt ); return(0);
	  default: return(1);
	}
	return(0);
}
	
long QuadEncoder_Read( channel )
{
	long tmp;

	if (channel > QUADENCODER_CHANNEL_MAX) return(0);
	switch(channel)
	{
	  case 0: CRITICAL_REGION_QUADENCODER0_BGN;
		tmp = quadencoder0.count;
		CRITICAL_REGION_QUADENCODER0_END;
		break;
	  case 1: CRITICAL_REGION_QUADENCODER1_BGN;
		tmp = quadencoder1.count;
		CRITICAL_REGION_QUADENCODER1_END;
		break;
	  default: tmp = 0;
	}
	return(tmp);
}
unsigned char QuadEncoder_Set( channel, value )
{
	// out-of-range channel is an error (1), same as Start/Stop and the default case
	if (channel > QUADENCODER_CHANNEL_MAX) return(1);
	switch(channel)
	{
	  case 0: CRITICAL_REGION_QUADENCODER0_BGN;
		quadencoder0.count = value;
		CRITICAL_REGION_QUADENCODER0_END;
		break;
	  case 1: CRITICAL_REGION_QUADENCODER1_BGN;
		quadencoder1.count = value;
		CRITICAL_REGION_QUADENCODER1_END;
		break;
	  default: return(1);
	}
	return(0);
}

/* 
 * Driver writer supplied handler routines...
 */
void QuadEncoder0_PhyISR( void )
{
  quadencoder0.a_phase = ReadInterruptState( quadencoder0.interrupt   );
  quadencoder0.b_phase = ReadPin               ( quadencoder0.phase_b_pin );
  return;
}
void QuadEncoder0_LogISR( void )
{
long delta;

  if (quadencoder0.lastphase == quadencoder0.a_phase) return;
  quadencoder0.lastphase = quadencoder0.a_phase;

  if (quadencoder0.a_phase == 0) return;	// 1->0 transition, don't care

  delta = 1;
  if (quadencoder0.invert == 0)
  {
      if (quadencoder0.b_phase == 0)
	delta = -1;
  }
  else
  {
      if (quadencoder0.b_phase != 0)
	delta = -1;
  }
  CRITICAL_REGION_QUADENCODER0_BGN;
  quadencoder0.count += delta;
  CRITICAL_REGION_QUADENCODER0_END;

  return;
}
:
.

This is just an example of what a driver writer could choose to do within the framework. They could also decide to tie the channel number to the phase A pin, i.e. channel 0 is always portb<2> or something similar.
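As a sketch of the other option mentioned above (inverting at Read time instead of inside the logical handler), something like the following could work. The struct and function names are hypothetical, the handlers take a pointer rather than using the globals from the posted code so that the sketch can be exercised on its own, and the critical-region macros are left out for brevity.

```c
/* Hypothetical per-channel state; mirrors the fields used in the posted code. */
struct quad_sketch {
    long raw_count;            /* accumulated without regard to invert */
    unsigned char invert;      /* applied only when the count is read   */
    unsigned char a_phase;
    unsigned char b_phase;
    unsigned char lastphase;
};

/* Logical half: same edge filtering as the posted handler, but no invert test. */
void sketch_log_isr(struct quad_sketch *q)
{
    if (q->lastphase == q->a_phase) return;   /* no change on phase A */
    q->lastphase = q->a_phase;
    if (q->a_phase == 0) return;              /* 1->0 transition, don't care */
    q->raw_count += (q->b_phase == 0) ? -1 : 1;
}

/* Read applies the inversion, so the handler stays as short as possible. */
long sketch_read(const struct quad_sketch *q)
{
    return q->invert ? -q->raw_count : q->raw_count;
}
```

With this layout a Set routine would also have to negate the stored value when invert is on, otherwise Set followed by Read would not round-trip.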

Another update to the document, V0.6.

I added a brief section on how a device driver writer would make something for the framework.

I’ve changed how I’m invoking the user registered device drivers. I got rid of the RAM function tables because they seemed wasteful of a limited resource. I’m setting up to use program memory and flash it as needed. A block of program memory is set aside for each of the 20 interrupts in the framework. When a device driver is bound to a particular interrupt, the function routine address is flashed into the program memory jump table as part of setup. The next time that interrupt is invoked, the supplied device driver is branched to via the dispatch table. This has a couple of positives and negatives.

The negative is that you don’t want to be dynamically changing which drivers are attached to which interrupts while running. The good news is this shouldn’t happen: a robot’s configuration is pretty stable while running (no one is running alongside swapping I/O wires). Although it is possible you’d like to share an I/O pin between a couple of different drivers, the best way of handling something like that is to write your own software mux routine to multiplex between the two drivers. It just seems a bit odd-ish, so I didn’t worry about it too much, and there are ways of programming around it within the custom drivers you’d need anyway.

Another negative is that the programmable jump table code sucks up chunks of program space. I thought about reserving a flash block for each interrupt per service layer, but that’s something like 2k instructions with the 64 byte flash blocks of the 8722. No matter how you work the issue, a chunk of program memory gets used; currently a little under 1k bytes of program space goes to the dispatch tables.

The code to set up and flash the jump table entries is a bit, ummm, interesting, which is also a negative. The time required to flash the tables is in the ms range, but since this only has to be done once per code image the overhead isn’t too bad. The second time a bind is done for the same driver and interrupt, the code finds the appropriate address already there, so it doesn’t have to do anything. Yeah, it borders on self-modifying code, but what the heck, you only live once.
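The "already there, so nothing to do" behaviour can be sketched as below. This is a simulation only: an ordinary array stands in for the flash jump table, a counter stands in for the expensive erase/write, and every name (`sketch_flash_bind`, `jump_table`, `ERASED`) is hypothetical rather than taken from the framework source.

```c
#include <stdint.h>

#define NUM_INTERRUPTS 20            /* interrupt count from the post */
#define ERASED 0xFFFFFFFFu           /* stand-in for an erased table entry */

static uint32_t jump_table[NUM_INTERRUPTS];  /* simulated program-memory table */
static unsigned flash_writes;                /* counts simulated flash writes  */

/* Put the simulated table into its erased state. */
void sketch_table_init(void)
{
    unsigned i;
    for (i = 0; i < NUM_INTERRUPTS; i++) jump_table[i] = ERASED;
    flash_writes = 0;
}

/* Bind a handler address to an interrupt. Returns 0 on success, 1 on bad input.
 * If the entry already holds this address, no write is performed. */
unsigned char sketch_flash_bind(unsigned interrupt, uint32_t handler_addr)
{
    if (interrupt >= NUM_INTERRUPTS) return 1;
    if (jump_table[interrupt] == handler_addr) return 0;  /* already bound */
    jump_table[interrupt] = handler_addr;  /* real code would erase and write a flash block */
    flash_writes++;
    return 0;
}

unsigned sketch_flash_write_count(void) { return flash_writes; }
```

Since the same bind calls run on every boot of a given code image, the compare-before-write check means the ms-range flash cost is paid only on the first boot after a download.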

The positives of the program memory jump tables are improved latency and reduced RAM usage. The jump tables bring the framework back toward being more like compile-time bindings. The three PCLAT registers don’t need to be saved either, since the code no longer jumps through RAM. This saves context time within the ISR (a total of 12 instruction cycles), and every bit helps. The jump code is also cleaner, with a few fewer instruction cycles: only 1 pipeline break vs the three when using RAM function pointers. I’m holding back on the final step of putting the jump tables in-line within the ISR proper. That would save 2 instruction cycles but overly complicate the flash code. With these changes the maximum execution path is just under 100 cycles, and the average ISR time is about 20 cycles less than that. So, for common interrupts the latency is under 10 usec, which was my original goal.
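For anyone wanting to check the cycle-to-time arithmetic, here is a small sketch. The clock figures are an assumption on my part, not something stated in the post: a 40 MHz oscillator with 4 oscillator clocks per instruction cycle, i.e. 100 ns per instruction cycle. The function name is hypothetical.

```c
/* Assumed clocking (not stated in the post): 40 MHz oscillator,
 * 4 oscillator clocks per instruction cycle. */
#define FOSC_HZ                40000000UL
#define CLOCKS_PER_INSTRUCTION 4UL

/* Convert a number of instruction cycles to nanoseconds. */
unsigned long sketch_cycles_to_ns(unsigned long cycles)
{
    unsigned long instr_per_sec = FOSC_HZ / CLOCKS_PER_INSTRUCTION; /* 10,000,000 */
    unsigned long ns_per_cycle  = 1000000000UL / instr_per_sec;     /* 100 ns */
    return cycles * ns_per_cycle;
}
```

Under that assumption a 100 cycle path is 10 usec, an 80 cycle path is 8 usec, and the 12 cycles saved by not saving the PCLAT registers are 1.2 usec.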

Along the way I found the compiler being “helpful”: it took logic that was optimally laid out and reorganized the code in program memory so that it ran 2.5x longer than it should. I haven’t found which optimization did it, but sticking in an asm nop turned it all off and generated the expected code. I keep looking at coding the main chunk of the physical hardware ISR layer in assembly. It’s not that big and I’m getting tired of fighting the compiler.

Bud

Uploaded interrupt framework source code.

Thanks for posting this. I found your writeup well-written and easy to follow. It’s been quite a while since I’ve looked at driver-level code, so I’m curious to look at the source.