Unexpected results from Encoder::GetRate()

I’m trying to set up a speed controller using the US Digital E4P-250-250-D-D-D-B encoder and the WPI Encoder class. Everything basically works fine but I’m seeing an odd discrepency between the values out of GetRate() and GetDistance(). Spinning the encoder at a constant rate, I would expect that the rate returned by GetRate() (units/sec) would more or less match the one-second delta returned by GetDistance(). But what I see is this: the distance returned for one second is almost exactly half of GetRate(). Because GetRate() is fairly noisy, I’m testing using a 20 sample moving average (with a sampling rate of 40hz). I’m sure I’m missing something obvious… Can one of the WPIlib experts here clue me in on what it is? Thanks!

Update, possible clue:

Looking through the WPI source I noticed this comment in Counter.cpp (used by 1x and 2x Encoder):

/*

  • Get the Period of the most recent count.
  • Returns the time interval of the most recent count. This can be used for velocity calculations
  • to determine shaft speed.
  • @returns The period of the last two pulses in units of seconds.
    */
    double Counter::GetPeriod()

Just now seeing the same problem when running x1 with the kit encoders.

GetDistance() gives the correct value but the GetRate() gives double rate.

Anybody using this function with success?

We have all the latest updates in SW.

Update:

I decided to check the GetPeriod() in Encoder.cpp. I don’t see any problem with the LyraS question but something else seems strange to me here and maybe someone can tell me why its correct.

double Encoder::GetPeriod()
{
if (m_counter)
{
return m_counter->GetPeriod() * DecodingScaleFactor(); //****THIS IS WHERE I SEE A PROBLEM ***
}
else
{
tEncoder::tTimerOutput output = m_encoder->readTimerOutput(&status);
double value;
if (output.Stalled)
{
// Return infinity
double zero = 0.0;
value = 1.0 / zero;
}
else
{
value = (double)output.Period / (double)output.Count;
}
wpi_assertCleanStatus(status);
return value * 1.0e-6 / (DecodingScaleFactor() * 4.0);
}
}

In the 1x and 2x the counters are used. The period that the counters return should be for the actual pulses not the spec pulse frame. Hence, the periods should be shorter when running 2x. So when

if (m_counter)
{
return m_counter->GetPeriod() * DecodingScaleFactor(); //****THIS IS WHERE I SEE A PROBLEM ***
}

occurs, it should correct the counter period to the spec period which would be longer. This statement makes the period the same or shorter since DecodingScaleFactor() returns 1 for 1x , .5 for 2x.

It seems that this should read

if (m_counter)
{
return m_counter->GetPeriod() / DecodingScaleFactor(); //****USE DIVISION TO CORRECT PERIOD ***
}

The division by DecodingScaleFactor() would then be similar to that used in the 4x return value.
If my reasoning is correct, then the 2x should be off by a factor of 4 but the 1x should be ok. So unfortunately this doesn’t explain why we still are double the rate when running at 1x and maybe when I get some time, I’ll check this hypothesis. Where is Joe Hershberger when we need him?

Have you guys tried 2x and 4x modes? Do you get what you expect in those modes?

At a glance, it looks like the period should be divided by DecodingScaleFactor(), as suggested. I don’t think the 4x version should have the “4.0” in the denominator either. I need to try it out when I’m at a system that has an encoder. I’m at the regional in Cleveland this weekend.

-Joe

Not yet…competing in LA regional this weekend also.

I don’t think the 4x version should have the “4.0” in the denominator either.

Yes, I could see no reason for that either. Thought this was a hold over from the 4x error days. Hard to believe no one has reported any problems with this function. Guess everyone is doing their own rate derivation which we were going to do , but I wanted to give the wpi lib a chance since I didn’t remember these being a problem last year. We are using the GetRate() with a .5 multiplier. Seems to work ok then.

Encoder::GetRate() in 4x mode returns crap. And by crap I mean the value returned is proportional to the speed of the robot, but I know our bot is not going 80 feet per second (55 mph)… I tried changing (4x mode) to return:

value * 1.0e-6  / DecodingScaleFactor()

Instead of:

value * 1.0e-6  / (DecodingScaleFactor() * 4)

Which lowered the error, however it is still way too high.

I know I can just multiply the returned value by some gain, but I’d rather have the code actually be right. I’m about to go chuck the dang thing up in the lathe so I know what the value returned should be, and work from that.

Good to have another data point. Couple of questions.

Is the GetDistance() correct?

If so, can you use it to make a quantitative estimate on what the ratio of encoder rate to actual rate in x4 after you eliminated the 4 factor in the denominator? I’m guessing it might be off by a factor of two like the 1x. If it is, that could provide a clue to whats going on.

In my experience, using 4x to get rate always introduces extra noise due to inherent phase errors built into the code wheels.
http://www.chiefdelphi.com/forums/showthread.php?t=74763&highlight=encoder+noise

Did you actually call SetDistancePerPulse() with the correct scale? The API can’t magically know your gear ratios or wheel circumferances.

Yes, GetDistance returns the correct distance when the bot is pushed.

1x and 2x are not as noisy. And are normally distributed. 4x is distributed towards the center and ± some value. It’s either a code problem or an encoder problem. I’m going to hook it up to a function generator and figure out which it is.

The function gen is a good idea, but you can use your GetDistance() to make an accurate estimate of the average rate. Could save you a little time.

If you used a large enough dt, true. However, 1x or 2x using GetRate returns perfectly fine data and takes less time to implement. The point of using a function generator is simply to verify that it is indeed the encoder, not the FPGA code, that is the problem.

I could also just fix my stupid 2nd probe and look at the encoder output on the scope…

BTW, the FPGA code could be changed to account for the phase lag. It’s actually a very easy fix, although it would require you to experimentally determine the phase lag and pass it to the FPGA. Determining this would simply require spinning the encoder at a constant speed and look at the channel B phase offset.

Ok, then you are not seeing the factor of 2 on 1x rate that we are and determining avg rate would be unnecessary. How did you independently verified the 1x rate was ok?

It is hard to envision that the encoder could be in error if GetDistance() is working, so I vote for the code.

The speed is still off by a constant factor, but that’s not what I’m focusing on at the moment. I agree that this is a code issue because GetDistance is correct.

The encoders are the reason for the noisy data returned by 4x decoding. GetDistance isn’t greatly effected by this because every rising/falling edge is offset an equal but opposite amount, so it averages out to be correct. The problem comes in when you try to take the derivative (GetSpeed), because now these offsets are no longer averaged out.

I did not follow the entire thread of discussion so if I misunderstood the issue, I apologize. But I do want to mention my experience on dealing with the accelerometer by integrating it twice to get distance. How did you differentiate the distance value to get speed? In particular, how did you get dt? The reason I ask is because I discovered that if you have code that is supposed to be run periodically at a fixed interval, don’t count on the interval being accurate. For example, my integrator code is supposed to run every 10 msec, eventually I found out that it got called every 20+ msec instead. So don’t use the period as your dt because it is not accurate. Your speed data will be all over the place. I scratched my head on the accelerometer for a while until I finally timed the interval of my code being called and found that I can’t depend on it being constant. So I now wrote the code to measure the interval and use it as my dt instead. Then my integrator is now quite accurate.

So I was hoping that you would use GetDistance() with a large dt (say 10 revolutions) to determine this constant to verify that it matches our estimated factor of 2 constant.

The encoders are the reason for the noisy data returned by 4x decoding. GetDistance isn’t greatly effected by this because every rising/falling edge is offset an equal but opposite amount, so it averages out to be correct. The problem comes in when you try to take the derivative (GetSpeed), because now these offsets are no longer averaged out.

We are in full agreement here and as I stated before, our team never uses x4 because it really is unnecessary with 1 deg resolution encoders and why introduce the asymmetric 4x phase errors.

So the noise is expected, but the mean values being off by a constant is what I’m focused on. This smacks of a FPGA counter average period problem.

This is a little off topic in that the encoders use pulse interval timers to derive speed and do not depend on periodic loops. But I suspect that the FPGA interval timer that measures an average pulse period may have a bug in it.

In support of your comments, we use the iterative robot class and a navigation function that integrates encoder rate resolved to field coordinates. Distance tests were off by a factor of 2. Either the speeds were double or the loop iteration rate was double. We integrated dt = 1/(Update_Rate) and checked that the computed time matched real time so this led us to the double encoder GetRate() error which we verified independently using motor rpm measurements.

But what is Update_Rate? If it’s derived from the period of some periodic loop, you need to verify the real interval of the periodic loop. How did you compute real time? I use the following code to measure my periodic intervals:


UINT32 timeCurr = GetFPGATime(); // in usec
float period = (float)(timeCurr - m_timestamp)/1000000.0;
m_timestamp = timeCurr;

Update_Rate is a define… say 100. hz.
dt = 1/Update_rate;

We use iterative class SetPeriod(dt);

Then we measure cumulative time rather than loop period:
We put this in the loop : time = time + dt :
and simply compare time print out with a stop watch. Since we were looking for a factor of 2 this was accurate enough. We easily checked within a few %.

I am sorry. I am probably not helping with your particular problem. I am merely trying to point out that in the iterative robot class, don’t count on the period being accurate and use it as dt for differentiation or integration. You said your Update_Rate is 100 Hz. That makes the period 10 ms. If you put “time = time + 1/Update_Rate” in your loop, your time is really summing a constant 10 ms on every loop period, not real time. If you do this instead:


class MyRobot: public IterativeRobot
{
private:
    UINT32 m_timestamp;
    ....

public:
    void AutonomousInit()
    {
        m_timestamp = GetFPGATime();
        ....
    }
    void AutonomousPeriodic()
    {
        UINT32 timeCurr = GetFPGATime(); // in usec
        float period = (float)(timeCurr - m_timestamp)/1000000.0;
        m_timestamp = timeCurr;
        printf("loop period is %f
", period);
        ...
    }
};

“period” will be the accurate loop interval that can be used as dt in your differentiation or integration. In theory period should print out 0.01s (10 ms), but in our case, it is way off. It could be because our loop execution took longer than the 10ms to execute that may cause a period to be skipped in some cases. Therefore, as a good practice, never use the period you set in any time critical calculations.

I’ve done a lot of real time control systems and never had a problem with interrupt driven loops because we always verify that execution cannot exceed the loop period. So I suspect you have a problem delay in your loop and indeed if you are using a debug console printf(…) in a 10 ms loop then this can easily cause you a problem since it requires about 1ms per char and once you exceed the buffer, this becomes a realized delay.

Therefore, as a good practice, never use the period you set in any time critical calculations
… unless you have verified your computations can’t exceed the loop time:)

Edit: Actually this is a design issue rather than good practice. Either you design it as a periodic loop or not. Often programmers don’t understand the hardware fully and run into these types of problems and wind up adding the type of compensation programming that you are suggesting. In fact, IMHO, when things are time critical, the periodic loop is often preferred since it is more predictable than non periodic solutions.

Ok, back to Encoder Problem:
I got hold of a CRIO finally and ran some encoder outputs while we bench drove the robot. I set the iterative loop rate to 10 hz and printed to the console :
printf( … ", GetDistance(), GetRate(), Derived rate) where derived rate was the avg rate over the 100ms period.

Ran full forward, full reverse, partial forward, partial reverse cases for 1x,2x,4x.

Data can be downloaded from


I recommend downloading and viewing the plots since they are cutoff on scrid site.

For some reason the encoder counts were about 1/5.7 of what they should have been… ie the avg derived rate was 1.2 fps rather than 6.8 fps. So the encoder might be broken but it did register linear GetDistance().
Aside from this, the GetRate() was consistent for 1x,2x,4x and showed plots with basically two interwoven levels. One close to the derived rate of 1.2 fps and the other around 12 fps when running max speed. So there is definitely something odd for all modes.

So, today, I tried some more direct debug of encoder.cpp. I disconnected the encoder inputs and replaced them with a periodic pulse on the A channel and open (pull up value) on the B channel. We figured we could manually cause the counters to increment by alternating between open and ground on the A input. Well, something really weird happened. The 0 input caused a 1 count and the 1 input caused the count to decrement to 0 count again. Huh?? So the GetRaw() just alternated between 1 and 0 rather than accumulate.

So must be some good explaination… right?

Maybe more on Monday.