Differences

This shows you the differences between two versions of the page.

cpu_fuse [2016/05/23 00:45]
flype
cpu_fuse [2016/05/29 19:20] (current)
flype
Line 1: Line 1:
-====== APOLLO Core - Bonding Feature ======
 +
 +====== APOLLO Core - Fusing Feature ======
  
 ===== Overview =====
  
-The APOLLO Core is a [[cpu_superscalar|SuperScalar]] 68k processor as was the 68060. "SuperScalar" means that the processor can schedule more than one instruction per cycle. Under some circumstances the 68060 could schedule two simple instructions in one cycle where the 68040 had to execute the two instructions one after the other which then took two cycles.
 +The APOLLO CPU supports a feature that increases the number of instructions executed per clock. The core is [[cpu_superscalar|SuperScalar]], which means that it can execute two instructions per clock cycle if the destination operand of one instruction isn't the same as the source operand of the subsequent instruction. In addition, it can **bundle two instructions to be executed as a single instruction**. This is possible because the ALU (Arithmetic and Logical Unit) of the APOLLO Core is internally a 3-operand ALU while the 68k only has 2-operand code.
 + 
 +---- 
 + 
 +===== Examples ===== 
 + 
 +__2-operand code :__ 
 + 
 +''add.l d0,d1'' means you add the number in register d0 to the number in d1 and store the result in d1. In C syntax: ''d1 = d0 + d1;''.
 + 
 +__3-operand code :__ 
 + 
 +''add.l d0,d1,d2'' means you add the number in d0 to the number in d1 and store the result in a third register d2 (or d1 if you wanted to do exactly what ''add.l d0,d1'' does on a 68k processor). In C syntax: ''d2 = d0 + d1;''.
 + 
 +In addition, the APOLLO ALU internally supports more operations than the 68k instruction set officially provides.
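
The difference between the 2-operand and 3-operand forms described above can be made concrete with a small register model. This is only an illustrative sketch: the function names ''add_l_2op'' and ''add_l_3op'' are hypothetical, registers are modelled as a plain dictionary, and condition codes are ignored.

```python
MASK32 = 0xFFFFFFFF  # .l operations work on 32-bit values


def add_l_2op(regs, src, dst):
    """Model of 68k 'add.l src,dst': dst = src + dst, so dst is overwritten."""
    regs[dst] = (regs[src] + regs[dst]) & MASK32


def add_l_3op(regs, src_a, src_b, dst):
    """Hypothetical model of the internal 3-operand add: dst = src_a + src_b."""
    regs[dst] = (regs[src_a] + regs[src_b]) & MASK32


regs = {"d0": 5, "d1": 7, "d2": 0}
add_l_3op(regs, "d0", "d1", "d2")  # d2 = d0 + d1; d0 and d1 keep their values
```

Passing the same register as second source and destination, ''add_l_3op(regs, "d0", "d1", "d1")'', gives the same result as ''add_l_2op(regs, "d0", "d1")'' in this model.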
   ​   ​
-A typical pair of instructions that can be executed on a superscalar processor such as the 68060 requires the instructions to be scheduled in parallel to be independent. This means the result of the first instruction must not be used in the subsequent instruction:
 +We can now exploit these two internal "extras" for instruction bundling and "fuse" two instructions into a single (internal) instruction.
 + 
 +ASM example ​:
   ​   ​
-    add.l d0,d1
-    add.l d2,d3
-
-The 68060 can execute these two instructions in the same cycle and so can the Apollo. But the Apollo core has many advantages over the 68060 which is also why it is faster than the 68060 when running at the same clock rate.
-
-As already mentioned above, the Apollo core is also superscalar. However, it can execute a much higher variety of instruction in its different pipelines than the 68060. Only a few complex instructions such as DIV and MOVEM are always executed in the first pipeline. Already this feature alone means that far more combinations of two independent instructions can be scheduled in the same cycle on Apollo core as compared to the 68060.
-
-But the Apollo core can also execute some instruction pairs in parallel that the 68060 cannot. This is called "Instruction Bonding". By bonding two instructions, the Apollo core can execute some combinations of dependent instructions in parallel on two of its pipelines. One example:
-
-    move.l #1234,d1
-    add.l d1,d2
-
-This updates two registers, d1 and d2 (C syntax: d1 = 1234; d2 += d1;), which cannot be done by the instruction fusing explained in another thread which will execute two instructions as one in a single pipeline. Fusing instructions requires both instructions to operate on the same destination register and thus can be executed as one instruction in one pipeline. Bonding instructions, on the other hand, updates two different destination registers and therefore will be executed by two pipelines. However, a traditional superscalar processor such as the 68060 could not do this as the two instructions are dependent on each other.
-
-Using instruction fusing and instruction bonding in addition to normal superscalar execution the Apollo core can execute two instructions in parallel more often than a 68060 could. This is one of the reasons why the Apollo core is faster than even higher clocked 68060 processors.
 +  move.l d0,d2
 +  add.l  d1,d2
 +
 +In C syntax :
 +
 +  d2 = d0 + d1;
 +
 +If you check again what I wrote about 3-operand code, you will see that this is precisely what the single 3-operand instruction ''add.l d0,d1,d2'' does! The APOLLO Core recognises such bundles of 68k instructions and executes them together in a single clock cycle. This is not the same as standard SuperScalar execution because the second 68k instruction depends on the result of the first instruction and thus could not be executed in a single cycle on the 68060.
 +
 +Since APOLLO is SuperScalar, it can execute these two instructions in addition to yet another instruction or bundle of instructions, increasing instructions per clock dramatically.
 +
 +This also means that in order to optimise code for the APOLLO you would sometimes get even better results by not separating instructions that depend on each other. While on a 68060 you would try to fit an extra independent instruction between the two instructions mentioned above, you should not do so on the APOLLO and just leave it to the core to execute the two instructions together.
 +
 +----
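
As a rough check of the ''move.l'' / ''add.l'' example from the Examples section, the sketch below compares running the two 68k instructions one after the other with a single 3-operand step. It is a toy model under stated assumptions: ''run_pair'' and ''run_fused'' are hypothetical names, registers live in a dictionary, condition codes are ignored, and nothing here describes how the core actually detects or executes a bundle.

```python
MASK32 = 0xFFFFFFFF  # .l operations work on 32-bit values


def run_pair(regs, a, b, dst):
    """Execute 'move.l a,dst' followed by 'add.l b,dst' sequentially."""
    out = dict(regs)
    out[dst] = out[a]                          # move.l a,dst
    out[dst] = (out[b] + out[dst]) & MASK32    # add.l  b,dst
    return out


def run_fused(regs, a, b, dst):
    """Execute the assumed fused form in one step: dst = a + b."""
    out = dict(regs)
    out[dst] = (regs[a] + regs[b]) & MASK32
    return out


start = {"d0": 3, "d1": 4, "d2": 99}
same = run_pair(start, "d0", "d1", "d2") == run_fused(start, "d0", "d1", "d2")
```

In this model the two paths agree as long as the second source register differs from the destination. With ''add.l d2,d2'' as the second instruction the sequential pair doubles the moved value, so a naive ''dst = a + b'' substitution would no longer match; whether and how the real core handles that case is not stated in the source.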
 +===== Supported fusing combinations ===== 
 + 
 +In the latest public core (SILVER2):
 + 
 + (1) 
 + MOVE.L (An)+,(Am)+ 
 + MOVE.L (An)+,(Am)+ 
 + => 
 + MOVE.Q (An)+,(Am)+ 
 + 
 + (2) 
 + MOVE.B (d16,An),Dn
 + EXTB.L Dn
 + =>
 + MVS.B  (d16,An),Dn
 +
 + (3)
 + MOVE.W (d16,An),Dn
 + EXT.L  Dn
 + =>
 + MVS.W  (d16,An),Dn
 + 
 + (4) 
 + MOVE.L Dn,Dm
 + NOT.X  Dm
 +
 + (5)
 + MOVE.L Dn,Dm
 + NEG.X  Dm
 +
 + (6)
 + MOVE.L Dn,Dm
 + ADDQ.X #,Dm
 +
 + (7)
 + MOVE.L Dn,Dm
 + SUBQ.X #,Dm
 +
 + (8)
 + MOVE.L Dn,Dm
 + ANDI.W #,Dm
 +
 + (9)
 + MOVE.L Dn,Dm
 + OR.X   Do,Dm
 +
 + (10)
 + MOVE.L Dn,Dm
 + AND.X  Do,Dm
 +
 + (11)
 + MOVE.L Dn,Dm
 + ADD.X  Do,Dm
 +
 + (12)
 + MOVE.L Dn,Dm
 + SUB.X ​ Do,Dm 
 + 
 + (13) 
 + MOVEQ ​ #,Dn 
 + OR.X   ​Dm,​Dn 
 + 
 + (14) 
 + MOVEQ ​ #,Dn 
 + AND.X ​ Dm,Dn
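
To see which adjacent pairs in a listing match the register-only combinations (4) to (14) above, a simple scanner can be sketched as follows. Everything here is an assumption made for illustration: the names ''parse'', ''is_fusable'' and ''fuse_candidates'' are hypothetical, operand sizes on the second instruction are not checked, the memory forms (1) to (3) are not handled, and the real core's detection logic may differ.

```python
import re

DREG = re.compile(r"d[0-7]$")

# Second instruction after "MOVE.L Dn,Dm": op -> required source operand shape
AFTER_MOVE_L = {
    "not": None, "neg": None,                      # single operand: Dm
    "addq": "imm", "subq": "imm", "andi": "imm",   # #,Dm
    "or": "dreg", "and": "dreg", "add": "dreg", "sub": "dreg",  # Do,Dm
}
AFTER_MOVEQ = {"or", "and"}                        # Dm,Dn after "MOVEQ #,Dn"


def parse(line):
    """Split 'op.size src,dst' into (mnemonic, operands), lower-cased."""
    parts = line.strip().lower().split(None, 1)
    if not parts:
        return "", []
    operands = [o.strip() for o in parts[1].split(",")] if len(parts) > 1 else []
    return parts[0], operands


def shape(operand):
    if operand.startswith("#"):
        return "imm"
    return "dreg" if DREG.match(operand) else "other"


def is_fusable(first, second):
    """True if the pair matches one of the register-only patterns in the list."""
    mn1, ops1 = parse(first)
    mn2, ops2 = parse(second)
    op2 = mn2.split(".")[0]
    if len(ops1) != 2 or not 1 <= len(ops2) <= 2:
        return False
    if shape(ops1[1]) != "dreg" or ops2[-1] != ops1[1]:
        return False  # both instructions must write the same data register
    src2 = shape(ops2[0]) if len(ops2) == 2 else None
    if mn1 == "move.l" and shape(ops1[0]) == "dreg":
        return op2 in AFTER_MOVE_L and AFTER_MOVE_L[op2] == src2
    if mn1 == "moveq" and shape(ops1[0]) == "imm":
        return op2 in AFTER_MOVEQ and src2 == "dreg"
    return False


def fuse_candidates(listing):
    """Return (i, i+1) index pairs of adjacent instructions that match."""
    pairs, i = [], 0
    while i < len(listing) - 1:
        if is_fusable(listing[i], listing[i + 1]):
            pairs.append((i, i + 1))
            i += 2
        else:
            i += 1
    return pairs


listing = ["move.l d0,d2", "add.l d1,d2", "moveq #1,d3", "and.l d4,d3", "sub.l d5,d6"]
print(fuse_candidates(listing))  # [(0, 1), (2, 3)]
```

A scanner like this could be used to check hand-written assembly for pairs that were split apart by 68060-style scheduling and might be better left adjacent on the APOLLO.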
  
 ----
Line 28: Line 110:
 [[links|Links]] |
 [[apollo_core|APOLLO]] |
 +
  
  
  • cpu_fuse.txt
  • Last modified: 2016/05/29 19:20
  • by flype