Thought I'd add some pictures from the PIC16 architecture guide. It defines the two-stage pipeline used within the processor, with each instruction cycle made up of 4 clock cycles they name Q1..Q4. You can see in the diagram that the instruction fetched after a branch/call has to be flushed, since it was fetched from the wrong PC.
Quote:
|
The clock input (from OSC1) is internally divided by four to generate four non-overlapping quadrature clocks, namely Q1, Q2, Q3, and Q4. Internally, the program counter (PC) is incremented every Q1, and the instruction is fetched from the program memory and latched into the instruction register in Q4. The instruction is decoded and executed during the following Q1 through Q4.
|
Quote:
An “Instruction Cycle” consists of four Q cycles (Q1, Q2, Q3, and Q4). Fetch takes one instruction cycle while decode and execute takes another instruction cycle. However, due to Pipelining, each instruction effectively executes in one cycle. If an instruction causes the program counter to change (e.g. GOTO ) then an extra cycle is required to complete the instruction.
The instruction fetch begins with the program counter incrementing in Q1. In the execution cycle, the fetched instruction is latched into the “Instruction Register (IR)” in cycle Q1.
This instruction is then decoded and executed during the Q2, Q3, and Q4 cycles. Data memory is read during Q2 (operand read) and written during Q4 (destination write). The diagram shows the operation of the two stage pipeline for the instruction sequence shown.
At time TCY0, the first instruction is fetched from program memory. During TCY1, the first instruction executes while the second instruction is fetched. During TCY2, the second instruction executes while the third instruction is fetched. During TCY3, the fourth instruction is fetched while the third instruction (CALL SUB_1) is executed. When the third instruction completes execution, the CPU forces the address of instruction four onto the Stack and then changes the Program Counter (PC) to the address of SUB_1. This means that the instruction that was fetched during TCY3 needs to be “flushed” from the pipeline. During TCY4, instruction four is flushed (executed as a NOP) and the instruction at address SUB_1 is fetched. Finally during TCY5, instruction five is executed and the instruction at address SUB_1+1 is fetched.
|
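To put rough numbers on the quoted timing, here is a small sketch. The function name `instruction_time_us` and the 4 MHz clock are my own example choices, not something from the guide. It only encodes the two facts quoted above: one instruction cycle is four oscillator clocks, and an instruction that changes the PC takes one extra cycle.

```python
def instruction_time_us(fosc_hz, changes_pc=False):
    """Approximate execution time of one instruction, in microseconds.

    fosc_hz    -- oscillator frequency on OSC1 (assumed value, pick your own)
    changes_pc -- True for instructions such as GOTO or CALL
    """
    # One instruction cycle (TCY) is four Q cycles, i.e. four oscillator clocks.
    tcy_us = 4 * 1_000_000 / fosc_hz
    # A PC-changing instruction needs an extra cycle for the flushed fetch.
    cycles = 2 if changes_pc else 1
    return cycles * tcy_us


print(instruction_time_us(4_000_000))                   # ordinary instruction
print(instruction_time_us(4_000_000, changes_pc=True))  # GOTO / CALL
```

With the assumed 4 MHz clock, one instruction cycle works out to 1 µs, so a GOTO or CALL costs 2 µs.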
So, the PC is incremented and latched in Q1, and the next instruction fetch is initiated during Q1. The instruction returned from program memory is then latched in Q4 for execution during the next instruction cycle. If the current instruction changes the PC by writing new data in Q3, as with a branch or call, that new PC won't be used until the following instruction cycle's Q1 period. By then the next instruction fetch is already under way, so that instruction is flushed by executing it as a no-operation.
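If it helps, here is a toy model of that behaviour at instruction-cycle granularity. It is only a sketch under my own assumptions: the addresses, the placeholder mnemonics (`INSTR_1`, `SUB_1_INSTR`, and so on) and the `simulate` function are made up for illustration, the Q1..Q4 phases are not modelled, and the return-address push done by CALL is left out.

```python
# Hypothetical program memory: address -> (mnemonic, new PC or None).
# SUB_1 is placed at address 10 purely for illustration.
PROGRAM = {
    0: ("INSTR_1", None),
    1: ("INSTR_2", None),
    2: ("CALL SUB_1", 10),
    3: ("INSTR_4", None),
    10: ("SUB_1_INSTR", None),
    11: ("SUB_1_NEXT", None),
}


def simulate(program, cycles):
    """Return a list of (tcy, executed mnemonic or None, fetched mnemonic)."""
    pc = 0
    in_flight = None  # instruction fetched last cycle, executes this cycle
    trace = []
    for tcy in range(cycles):
        # Fetch stage: read the instruction at PC, then advance PC.
        fetched = program.get(pc, ("NOP", None))
        pc += 1
        # Execute stage: run the instruction fetched in the previous cycle.
        executed = in_flight
        flush = False
        if executed is not None and executed[1] is not None:
            pc = executed[1]  # branch/call overwrites the PC
            flush = True      # the fetch already under way is from the wrong PC
        trace.append((tcy, executed[0] if executed else None, fetched[0]))
        # A flushed instruction goes through the execute stage as a NOP.
        in_flight = ("NOP (flushed)", None) if flush else fetched
    return trace


for tcy, executed, fetched in simulate(PROGRAM, 6):
    print(f"TCY{tcy}: execute={executed}, fetch={fetched}")
```

The trace follows the quoted walkthrough: in TCY3 the CALL executes while instruction four is fetched, in TCY4 the flushed slot runs as a NOP while the first instruction of SUB_1 is fetched, and in TCY5 that instruction finally executes.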