

## **MIPS Open Developer Day**

Saraj Mudigonda, Yuri Panchul, Daniel Bowman, Siobhan Lyons







- 12:30-1:15pm Welcome & Introduction
- 1:15-1:45pm Demo: MIPS Components in Action
- 1:45-2:00pm Break
- 2:00-3:30pm Hands-on Labs & Exercises
- 3:30-3:45pm Break
- 3:45-4:45pm Build Your Own MIPS-based SoC
- 4:45-5:15pm Present Your Innovation
- 5:15-5:30pm Wrap-up



- Workshop Prerequisites
  - Sign up and activate a MIPS Open account
  - Accept the License Agreement and request the MIPS Open FPGA package in the downloads section
  - A Windows or Linux notebook
- Loaner Hardware
  - Altera/Xilinx FPGA Boards
  - USB Hub
  - Cables
  - SSD Drive
- What is included on the SSD drive?
  - MIPS Open FPGA Developer Day package
  - Altera Quartus and Xilinx Vivado tools
- Housekeeping
  - Please delete the Altera Quartus and Xilinx Vivado tools on the SSD!!!
  - When you leave the class, please leave the FPGA boards, cables and USB hub.
  - The SSD is yours to take home



## **Welcome & Introduction**





## **MIPS Open Milestones**







## **MIPS Open Components**











- No license fees, no royalties, non-exclusive, worldwide license
- Latest MIPS R6 architecture, microAptiv cores, Tools
- Right and license under R6 architecture patents to design, build and sell cores
- Use of the "MIPS Certified" trademark logo for certified cores





## **MIPS Open Advisory Committee Membership Levels**

Individual Membership


#### Free per year

- Allows participation in all working groups
- Represents individual academics and non-profits


Silver Membership

#### \$10,000 per year

- Can be appointed to lead a working group
- Votes as a Silver-class representative on the board of representatives







## **MIPS Open™ Development**




## **MIPSOpen.com** Now Live


## Wave's First MIPS Open Program Components Now Live

Immediate Access to the Proven, Industry-Standard and Patent-Protected MIPS RISC Architecture







## 3 Easy Steps to Download, Innovate, Design & Build





Accept License Agreement



| MIPS Open Architecture | MIPS Open IDE | MIPS Open FPGA | MIPS Open Cores |
|------------------------|---------------|----------------|-----------------|
| MIPS Open Architecture | MIPS Open IDE - Linux | MIPS Open FPGA Getting Started Package | microAptiv UP Core |
|                        | MIPS Open IDE - Windows | MIPS Open FPGA Labs | microAptiv UC Core |
|                        |               | MIPS Open FPGA SOC |                 |


## MIPSOpen<sup>™</sup> Open Use cores

## MIPS class-leading, performance-efficient microAptiv cores

- Improved 5 stage pipeline architecture
- 32 GPRs, with up to 16 shadow register sets
- Minimal interrupt latency
- Integrated DSP ASE outperforms Cortex-M4
- 3.5 CoreMarks/MHz, 1.7 DMIPS/MHz

#### **Higher performance & scalability**

- Shadow Registers for faster context switching
- Mostly single operation instructions
- Simpler memory addressing modes
- User Defined Instructions (UDI) for custom ISA





Wave Computing © 2019: MIPS Open Developer Day, 4 June 2019



## **Demo: MIPS Components in Action**







## MIPS microAptiv UP and its interface options

- System bus, AHB-Lite, with optional bridge to AXI
- CorExtend / UDI User-Defined Instructions
- Cop2: an older, more flexible and more complicated coprocessor interface
- Data ScratchPad RAM, DSPRAM
  - Custom block, can be used as fixed-latency memory or high-speed I/O
- Instruction ScratchPad RAM, ISPRAM





## MIPS Open FPGA system bus uses AHB-Lite








- Serial loader
  - A hardware block that receives a file in Motorola S-Record format, parses it using a state machine and writes it into system memory
- Slow clock for run-time debug
  - A clock divider that allows running the processor at a frequency of a few hertz
    - Useful to observe the cycle-by-cycle behavior of the processor in real time
- External SDRAM memory controller
- External interrupt controller
- Extra wiring to support the labs that observe cache misses and CPU pipeline bypasses
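
The serial loader's parsing step can be illustrated in software. The following is a minimal C sketch, not the hardware state machine itself: it assumes records of type S3 (32-bit address), and the names `hex_digit` and `srec_parse_s3` are hypothetical. It checks the byte count and the checksum (the low byte of the sum of all record bytes, checksum included, must be 0xFF) and extracts the load address and data.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Value of one hex digit, or -1 if the character is not a hex digit. */
static int hex_digit(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    return -1;
}

/*
 * Parse one S3 record (assumed record type, 32-bit address). On success,
 * store the load address and data bytes and return the number of data
 * bytes; return -1 on any format or checksum error.
 */
static int srec_parse_s3(const char *line, uint32_t *addr,
                         uint8_t *data, size_t capacity)
{
    uint8_t  bytes[256];
    size_t   nbytes = 0;
    unsigned sum = 0;

    assert(line && addr && data);
    if (strlen(line) < 4 || line[0] != 'S' || line[1] != '3')
        return -1;

    /* Decode hex pairs until the first non-hex character (e.g. newline). */
    for (size_t i = 2; nbytes < sizeof bytes; i += 2) {
        int hi = hex_digit(line[i]);
        int lo = hi < 0 ? -1 : hex_digit(line[i + 1]);
        if (hi < 0 || lo < 0)
            break;
        bytes[nbytes++] = (uint8_t)(hi * 16 + lo);
    }

    /* bytes[0] counts the bytes after it: 4 address + data + 1 checksum. */
    if (nbytes < 6 || (size_t)bytes[0] != nbytes - 1)
        return -1;

    /* Low byte of the sum over count, address, data and checksum is 0xFF. */
    for (size_t i = 0; i < nbytes; i++)
        sum += bytes[i];
    if ((sum & 0xFF) != 0xFF)
        return -1;

    size_t ndata = nbytes - 6;
    if (ndata > capacity)
        return -1;

    *addr = ((uint32_t)bytes[1] << 24) | ((uint32_t)bytes[2] << 16)
          | ((uint32_t)bytes[3] << 8)  |  (uint32_t)bytes[4];
    memcpy(data, &bytes[5], ndata);
    return (int)ndata;
}
```

For example, the record `S30980000000DEADBEEF3E` carries four data bytes for address 0x80000000.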







## Take a Break





## Hands-on Labs & Exercises: The workflow





- programs/01\_light\_sensor
- programs/02\_interrupts
- programs/03\_cache\_misses
- programs/04\_pipeline\_bypasses





- For this workshop you will need to connect 3 USB devices (SSD drive, FPGA download cable, USB-to-UART serial cable). We have a 4 port USB hub if you need one.
- Connect the drive to USB port
- Reboot or start your laptop
- Hit key F12 prior to booting your normal OS
- Select external USB drive as the new source
- You now have Lubuntu loaded with Intel FPGA Quartus and Xilinx Vivado tools





- cd ~/mipsopen/boards/de10\_lite (or another board)
- make all load
- Press reset (or KEY 0 on some boards) to reset the processor
- The default hardcoded program should start to work
- cd ~/mipsopen/programs/00\_counter (or other program)
- make program srecord uart
- If computer uses serial connection other than ttyUSB0 (the default), then:
  - make program srecord uart UART=1 (or 2, 3, etc)
- The program uploaded via USB-to-UART is now running





- cd ~/mipsopen/boards/de10\_lite (or another board)
- For Intel FPGA boards, run ./make\_project.sh to create a scratch directory project
- Run synthesis and FPGA configuration in scratch directory
- Press reset (or KEY 0 on some boards) to reset the processor.
- The default hardcoded program should start to work.
- cd ~/mipsopen/programs/00\_counter (or other program)
- make program srecord uart [UART=0,1,2...]
- The program uploaded via USB-to-UART is now running.





## **Integration with Light Sensor**

programs/01\_light\_sensor



## Digilent PmodALS - Ambient Light Sensor







## Connecting Light Sensor to Terasic DE10-Lite

USB-to-UART (needed to upload the program into the MIPSfpga SoC): the green TX jumper goes into the 3rd pin from the upper right corner, and the black GND jumper goes into the 6th pin from the upper right corner. The Light Sensor is connected to the second row of pins as shown with color-coded jumpers.






## Connecting Light Sensor to Digilent Nexys A7









**SPI** Protocol

https://reference.digilentinc.com/pmod:communication\_protocols:spi







## Light Sensor SPI interface module

#### system\_rtl/mfp\_pmod\_als\_spi\_receiver.v

```
module mfp_pmod_als_spi_receiver
(
    input             clock,
    input             reset_n,
    output            cs,
    output            sck,
    input             sdo,
    output reg [15:0] value
);

    reg [ 8:0] cnt;
    reg [15:0] shift;

    // Free-running counter: bit 3 drives sck, bit 8 drives cs
    always @ (posedge clock or negedge reset_n)
    begin
        if (! reset_n)
            cnt <= 8'b100;
        else
            cnt <= cnt + 8'b1;
    end

    assign sck = ~ cnt [3];
    assign cs  = cnt [8];

    wire sample_bit = ( cs == 1'b0 && cnt [3:0] == 4'b1111 );
    wire value_done = ( cs == 1'b1 && cnt [7:0] == 8'b0 );

    always @ (posedge clock or negedge reset_n)
    begin
        if (! reset_n)
        begin
            shift <= 16'h0000;
            value <= 16'h0000;
        end
        else if (sample_bit)
        begin
            shift <= (shift << 1) | sdo;
        end
        else if (value_done)
        begin
            value <= shift;
        end
    end

endmodule
```
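
To see how the counter sequences a transfer, the receiver can be modeled cycle by cycle in C. This is a rough sketch under simplifying assumptions: the sensor is modeled as presenting bit 15 of a 16-bit frame at the first sample and bit 0 at the last, and real sensor timing is not modeled. The function name `spi_receiver_model` is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Software model of the receiver: run `clocks` rising clock edges after
 * reset and return the latched `value`. The sensor is assumed to present
 * the 16-bit `frame` MSB first, one bit per sample.
 */
static uint16_t spi_receiver_model(uint16_t frame, unsigned clocks)
{
    unsigned cnt = 4;         /* reset value of the 9-bit counter (8'b100) */
    uint16_t shift = 0;
    uint16_t value = 0;
    unsigned bit_index = 0;   /* bits already sampled in the current frame */

    for (unsigned t = 0; t < clocks; t++) {
        unsigned cs = (cnt >> 8) & 1u;
        int sample_bit = (cs == 0) && ((cnt & 0xFu) == 0xFu);
        int value_done = (cs == 1) && ((cnt & 0xFFu) == 0);

        if (sample_bit) {
            unsigned sdo = (frame >> (15u - bit_index)) & 1u;
            shift = (uint16_t)((shift << 1) | sdo);
            bit_index = (bit_index + 1u) % 16u;
        } else if (value_done) {
            value = shift;    /* latch the completed frame */
        }

        cnt = (cnt + 1u) & 0x1FFu;   /* 9-bit counter wraps around */
    }
    return value;
}
```

With cs low for counter values below 256 and one sample every 16 clocks, exactly 16 bits are shifted in per frame, and the result is latched on the first clock after cs goes high.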



#### programs/01\_light\_sensor/main.c

```
int main ()
{
    int n = 0;

    for (;;)
    {
        MFP_RED_LEDS      = MFP_LIGHT_SENSOR >> 4;
        MFP_7_SEGMENT_HEX = MFP_LIGHT_SENSOR;
        MFP_GREEN_LEDS    = n ++;
        delay();
    }

    return 0;
}
```
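
The shift by 4 in the program above drops the low bits of the 16-bit frame. As a hedged illustration, assuming the frame layout described in the Digilent PmodALS documentation (leading zeros, then 8 data bits MSB first, then 4 trailing zeros), the 8-bit reading could be extracted as follows. The helper name `als_reading` is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Extract the 8-bit light reading from a 16-bit frame. Assumed layout:
 * bits 11:4 hold the reading, bits 3:0 are trailing zeros.
 */
static uint8_t als_reading(uint16_t frame)
{
    return (uint8_t)((frame >> 4) & 0xFFu);
}
```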

## Header that defines uncached I/O addresses

#### programs/01\_light\_sensor/mfp\_memory\_mapped\_registers.h

| Register macro      | Address macro            | Address    |
|---------------------|--------------------------|------------|
| `MFP_RED_LEDS`      | `MFP_RED_LEDS_ADDR`      | 0xBF800000 |
| `MFP_GREEN_LEDS`    | `MFP_GREEN_LEDS_ADDR`    | 0xBF800004 |
| `MFP_SWITCHES`      | `MFP_SWITCHES_ADDR`      | 0xBF800008 |
| `MFP_BUTTONS`       | `MFP_BUTTONS_ADDR`       | 0xBF80000C |
| `MFP_7_SEGMENT_HEX` | `MFP_7_SEGMENT_HEX_ADDR` | 0xBF800010 |
| `MFP_LIGHT_SENSOR`  | `MFP_LIGHT_SENSOR_ADDR`  | 0xB0404000 |

Each address macro is a plain `#define`, and each register macro dereferences its address as a volatile unsigned word, for example `#define MFP_RED_LEDS (* (volatile unsigned *) MFP_RED_LEDS_ADDR )`.
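
These addresses are "uncached" because they fall in the MIPS32 kseg1 segment (0xA0000000 to 0xBFFFFFFF), which is unmapped and uncached; the physical address is the virtual address with the top three bits cleared. A small C sketch of that mapping, with hypothetical helper names:

```c
#include <assert.h>
#include <stdint.h>

/* True if the virtual address lies in kseg1 (unmapped, uncached). */
static int is_kseg1(uint32_t vaddr)
{
    return vaddr >= 0xA0000000u && vaddr <= 0xBFFFFFFFu;
}

/* Physical address for a kseg0 or kseg1 virtual address. */
static uint32_t kseg_to_phys(uint32_t vaddr)
{
    return vaddr & 0x1FFFFFFFu;
}
```

For instance, the red LED register at virtual address 0xBF800000 corresponds to physical address 0x1F800000.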




- programs/01\_light\_sensor/main.c
- programs/01\_light\_sensor/mfp\_memory\_mapped\_registers.h
- system\_rtl/mfp\_pmod\_als\_spi\_receiver.v
- system\_rtl/mfp\_ahb\_lite\_pmod\_als.v
- system\_rtl/mfp\_ahb\_lite\_matrix\_config.vh
- system\_rtl/mfp\_ahb\_lite\_matrix.v
- system\_rtl/mfp\_ahb\_lite\_matrix\_with\_loader.v
- system\_rtl/mfp\_system.v
- boards/de10\_lite/de10\_lite.v
- boards/nexys4\_ddr/nexys4\_ddr.v



## Other sensors from Digilent




## Interrupts

programs/02\_interrupts



## The action of an I/O interrupt




The source of the figure: http://virtualirfan.com/history-of-interrupts



Connecting buttons to interrupt signals inside system\_rtl/mfp\_system.v

```
`ifdef MFP_DEMO_INTERRUPTS
assign SI_Int[2:0] = IO_Buttons [2:0];
`else
assign SI_Int[2:0] = 3'b0;
`endif
```







#### Default general exception handler in programs/02\_interrupts/exceptions.S

.section .exceptions

.org 0x180

general\_exception\_vector:

.type general\_exception\_vector, @function

j general\_exception\_handler





#### Custom general exception handler in programs/02\_interrupts/main.c

```
#include <mips/cpu.h>
#include "mfp_memory_mapped_registers.h"

volatile int n;

void __attribute__ ((interrupt, keep_interrupts_masked)) general_exception_handler ()
{
    unsigned cause = mips32_getcr ();  // Coprocessor 0 Cause register

    if (cause & CR_HINT0)              // Checking whether interrupt 0 is pending
        n = 0;
    else if (cause & CR_HINT1)         // Checking whether interrupt 1 is pending
        n = 0x100000;
}
```
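
The handler tests individual bits of the Cause register. As a host-side sketch of that decoding, assuming the standard MIPS32 layout (ExcCode in bits 6:2, pending interrupts IP7..IP0 in bits 15:8, with the six hardware interrupts in IP7..IP2), the following hypothetical helpers extract the fields from a raw Cause value. A real handler would typically also mask the pending bits with the interrupt mask in the Status register.

```c
#include <assert.h>
#include <stdint.h>

#define CAUSE_EXCCODE(cause)  (((cause) >> 2) & 0x1Fu)   /* ExcCode, bits 6:2   */
#define CAUSE_IP(cause)       (((cause) >> 8) & 0xFFu)   /* IP7..IP0, bits 15:8 */

/* Index (0..5) of the lowest-numbered pending hardware interrupt, or -1. */
static int lowest_pending_hw_interrupt(uint32_t cause)
{
    unsigned hw = CAUSE_IP(cause) >> 2;   /* drop the two software interrupts */

    for (int k = 0; k < 6; k++)
        if (hw & (1u << k))
            return k;
    return -1;
}
```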

#### Software side

Setting up the interrupts in programs/02\_interrupts/main.c

```
// Clear boot interrupt vector bit in Coprocessor 0 Status register
mips32_bicsr (SR_BEV);

// Set master interrupt enable bit, as well as individual interrupt enable bits
// in Coprocessor 0 Status register
mips32_bissr (SR_IE | SR_HINT0 | SR_HINT1 | SR_HINT2 | SR_HINT3 | SR_HINT4 | SR_HINT5);

// Count with interrupts, without polling buttons in the loop
for (n = 0;;)
{
    MFP_7_SEGMENT_HEX = ((n >> 8) & 0xffffff00) | (n & 0xff);
    __asm__ volatile ("di");  // Disable interrupts
    n ++;
    __asm__ volatile ("ei");  // Enable interrupts
}
```

Changes in linker script in programs/02\_interrupts/program.ld:

SECTIONS { /\*\*\*\* Exception vectors \*\*\*\*/ .exceptions 0x80000000 : /\* Exception vectors. \*/ { \*(.exceptions) /\* For some reason the following is necessary for this section to be kept when .rec is produced \*/ BYTE(0) = 0



### **Observing CPU L1 cache in action**

programs/03\_cache\_misses



### Causing different patterns of cache misses

- Caches exploit temporal and spatial locality of instructions and data to improve the processor's performance
- MIPS microAptiv UP core has several cache configurations:
  - I-cache and D-cache
  - 1, 2, 3 or 4 way set associative
  - 1, 2, 4, 8, 16 KB

- The MIPS Open Day Package lets you observe cache behavior directly on the LEDs using the slow clock feature

Memory access patterns from Computer Architecture course by David Wentzlaff from Princeton University. 2011.





programs/03\_cache\_misses/main.c

```
int a [8][8];

int main ()
{
    int n = 0;
    int i, j;

    // Wait for switch 2
    while ((MFP_SWITCHES & 4) == 0)
        ;

    // (the rest of main is not shown on the slide)
}
```





Connecting cache miss signals to LED inside system\_rtl/mfp\_system.v

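```
wire burst = (HTRANS == `HTRANS_NONSEQ && HBURST == `HBURST_WRAP4);

assign IO_GreenLEDs =
{
    { `MFP_N_GREEN_LEDS - (1 + 1 + 4 + 4) { 1'b0 } },
    HCLK,
    burst,
    HADDR [7:4],
    mpc_aselwr_e,   // Bypass res_w as src A
    mpc_bselall_e,  // Bypass res_w as src B
    mpc_aselres_e,  // Bypass res_m as src A
    mpc_bselres_e   // Bypass res_m as src B
};
```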

Pattern for data, row-by-row traversal:

- Miss/Blink, Hit/Nothing, Nothing, Nothing
- Miss/Blink, Hit/Nothing, Nothing, Nothing

Pattern for data, column-by-column traversal:

- Series of 8 misses, then 24 hits
- Series of 8 misses, then 24 hits again

Watch for misses because of instructions.

Access order, row-by-row traversal (each row is one cache line of four words):

| Line starts at | +0 | +1 | +2 | +3 |
|----------------|----|----|----|----|
| a [0][0]       | 0  | 1  | 2  | 3  |
| a [0][4]       | 4  | 5  | 6  | 7  |
| a [1][0]       | 8  | 9  | 10 | 11 |
| a [1][4]       | 12 | 13 | 14 | 15 |
| a [2][0]       | 16 | 17 | 18 | 19 |

Access order, column-by-column traversal:

| Line starts at | +0 | +1 | +2 | +3 |
|----------------|----|----|----|----|
| a [0][0]       | 0  | 8  | 16 | 24 |
| a [0][4]       | 32 | 40 | 48 | 56 |
| a [1][0]       | 1  | 9  | 17 | 25 |
| a [1][4]       | 33 | 41 | 49 | 57 |
| a [2][0]       | 2  | 10 | 18 | 26 |




## **Exposing CPU Pipeline Bypasses**

programs/04\_pipeline\_bypasses



### Pipeline bypasses and code that exposes them






Connecting pipeline bypass signals to LEDs inside system\_rtl/mfp\_system.v

```
assign IO_GreenLEDs =
{
    { `MFP_N_GREEN_LEDS - (1 + 1 + 4 + 4) { 1'b0 } },
    HCLK,
    burst,
    HADDR [7:4],
    mpc_aselwr_e,   // Bypass res_w as src A
    mpc_bselall_e,  // Bypass res_w as src B
    mpc_aselres_e,  // Bypass res_m as src A
    mpc_bselres_e   // Bypass res_m as src B
};
```

### More pipelining to explore in microAptiv UP core

- GPR: general-purpose registers

- DSP extension for accelerating Digital Signal Processing algorithms
  - Such as digital filters, FFT
  - Uses light vector operations
  - Options for saturation and rounding
- MDU Multiply / Divide Unit
  - Different area/performance options
  - Configurable in MIPS Open core
  - Fixed in MIPSfpga

Note: DSP and MDU options are available in full MIPS microAptiv UP core, not in basic configuration of MIPSfpga. User can configure full core and replace core source in MIPSfpga.









# Take a Break





### Build Your own SoC Present Your Innovation





# Build Your own SoC Present Your Innovation

https://www.mipsopen.com/forums/forum/mips-open-developerday-june-4th-2019/





https://www.mipsopen.com/forums/forum/mips-opendeveloper-day-june-4th-2019/





#### Extending CPU with CorExtend interface UDI – User Defined Instructions





- CorExtend is another name for UDI, User-Defined Instructions.
- Easy to use mechanism for adding new instructions
- User implements a custom block in Verilog.
- Instructions read from two general-purpose registers and write back into a specified register.
- The added instructions do not have to stall the pipe.
- The added instructions can stall the pipe if necessary.
- There is a mechanism to kill the instructions in event of a processor exception.
- The block can have internal state and connect to outside logic.







http://zatslogic.blogspot.com/2016/01/using-mips-microaptiv-up-processor.html





#### CorExtend instruction processing

http://zatslogic.blogspot.com/2016/01/using-mips-microaptiv-up-processor.html






- 120 cores working on Terasic DE5-Net board with Altera / Intel FPGA
- Custom instructions send messages between the processors in a non-coherent mesh





http://www.isfpga.org/fpga2017/slides/D2\_S3\_04.pdf and http://nachiket.github.io/publications/mips\_fpga2017.pdf








### A proof of concept using CorExtend for AI

- Analyze an AI algorithm
- Define a formula to compute in hardware
  - D0 \* W0 + D1 \* W1 + D2 \* W2 + D3 \* W3
- Define the format of CorExtend instructions to accelerate the computation
- Implement a custom CorExtend block
- Create a set of C macros for the programming convenience
- Create two implementations of the algorithm
  - Pure software
  - Mixed software-hardware
- Analyze the generated assembly code
- Run the comparison benchmark


#### A test example: the algorithm in software

```
#include <stdint.h>

#define ms(n0, w0, n1, w1, n2, w2, n3, w3)  \
    (((n0) * (w0) + (n1) * (w1) + (n2) * (w2) + (n3) * (w3)) & 0xff)

uint32_t __attribute__ ((noinline)) software_only_implementation
(
    uint32_t n0,
    uint32_t n1,
    uint32_t n2,
    uint32_t n3
)
{
    return ms
           (
               ms (n0, 0, n1, 4, n2,  8, n3, 12),
               7,
               ms (n1, 1, n2, 5, n3,  9, n0, 13),
               5,
               ms (n2, 2, n3, 6, n0, 10, n1, 14),
               3,
               ms (n3, 3, n0, 7, n1, 11, n2, 15),
               1   // assumed: the last weight is missing on the slide
           );
}
```



#### Three formats of UDI instructions

| Bits  | 31:26             | 25:21         | 20:16         | 15:11     | 10:6       | 5:0                 |
|-------|-------------------|---------------|---------------|-----------|------------|---------------------|
| Field | SPECIAL2 `011100` | rs (optional) | rt (optional) | rd/ud/imm | op `xxxxx` | UDI opcode `01xxxx` |
| Width | 6                 | 5             | 5             | 5         | 5          | 6                   |

| Format                                                  | Example                    |
|---------------------------------------------------------|----------------------------|
| UDIop<sub>[3:0]</sub> rs, rt, rd, imm5<sub>10:6</sub>   | UDI5 \$7, \$3, \$15, 30    |
| UDIop<sub>[3:0]</sub> rs, rt, imm10<sub>15:6</sub>      | UDI0 t0, t1, 665           |
| UDIop<sub>[3:0]</sub> rs, imm15<sub>20:6</sub>          | UDI15 a0, a1, v0, 0x7133   |

You can make up to a 19-bit immediate by reducing the number of UDI instructions.
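
As an illustration of the field layout, the following C sketch assembles the 32-bit instruction word for two of the formats. The function names are hypothetical, and the encoding simply follows the bit positions in the table above (SPECIAL2 major opcode `011100`, UDI opcode `01xxxx` with the instruction number in the low four bits).

```c
#include <assert.h>
#include <stdint.h>

#define UDI_SPECIAL2   (0x1Cu << 26)   /* major opcode 011100, bits 31:26 */
#define UDI_FUNCT(n)   (0x10u | (n))   /* function field 01xxxx, bits 5:0 */

/* Format "rs, rt, rd, imm5": three registers and a 5-bit immediate. */
static uint32_t udi_encode_rs_rt_rd_imm5(unsigned n, unsigned rs, unsigned rt,
                                         unsigned rd, unsigned imm5)
{
    assert(n < 16 && rs < 32 && rt < 32 && rd < 32 && imm5 < 32);
    return UDI_SPECIAL2 | ((uint32_t)rs << 21) | ((uint32_t)rt << 16)
         | ((uint32_t)rd << 11) | ((uint32_t)imm5 << 6) | UDI_FUNCT(n);
}

/* Format "rs, imm15": one register and a 15-bit immediate in bits 20:6. */
static uint32_t udi_encode_rs_imm15(unsigned n, unsigned rs, unsigned imm15)
{
    assert(n < 16 && rs < 32 && imm15 < (1u << 15));
    return UDI_SPECIAL2 | ((uint32_t)rs << 21)
         | ((uint32_t)imm15 << 6) | UDI_FUNCT(n);
}
```

Under these assumptions, `UDI5 $7, $3, $15, 30` would encode as 0x70E37F95.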



#### C macro for UDI using GCC \_\_asm\_\_ extension

```
#define mips_udi_rs_rt_rd_imm5_v(n, rs, rt, rd, imm5)  \
    __extension__                                      \
    ({                                                 \
        unsigned __rs = (rs);                          \
        unsigned __rt = (rt);                          \
        unsigned __rd;                                 \
        __asm__ __volatile__                           \
        (                                              \
            "udi%1 %2, %3, %0, %4"                     \
            : "=r" (__rd)                              \
            : "K"  (n)                                 \
            , "r"  (__rs)                              \
            , "r"  (__rt)                              \
            , "K"  (imm5)                              \
        );                                             \
        __rd;                                          \
    })
```

Version without \_\_volatile\_\_ to use additional GCC optimizations





#### Three formats of UDI instructions

- A format with one register
- 16-bit immediate
- The register is both source and destination
- Number of UDI instructions is reduced to 8
- But we have extra bit for immediate

```
#define mips_udi_rs_imm15(n, rs, imm15)  \
    __extension__                        \
    ({                                   \
        unsigned __rs = (rs);            \
        __asm__                          \
        (                                \
            "udi%1 %0, %2"               \
            : "+r" (__rs)                \
            : "K"  (n)                   \
            , "K"  (imm15)               \
        );                               \
        __rs;                            \
    })

#define mips_udi_rs_imm16(n, rs, imm16)  \
    mips_udi_rs_imm15                    \
    (                                    \
        (n) + ((imm16) >> 15),           \
        rs,                              \
        (imm16) & 0x7fff                 \
    )
```

```
#define mh(n0, w0, n1, w1, n2, w2, n3, w3)                          \
    __extension__                                                   \
    ({                                                              \
        uint32_t w;                                                 \
        w = mips32r2_ins (n0, n1,  8, 8);                           \
        w = mips32r2_ins (w,  n2, 16, 8);                           \
        w = mips32r2_ins (w,  n3, 24, 8);                           \
        w = mips_udi_rs_imm16 (0, w,                                \
                (w0) | ((w1) << 4) | ((w2) << 8) | ((w3) << 12));   \
        w;                                                          \
    })
```
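
The packing done by `mh` can be checked on a host machine. The sketch below is an assumption-labeled model, not the toolchain's implementation: `ins_field` imitates the bit-field insert used above (insert the low `size` bits of `val` into `tgt` at bit `pos`), and `pack_data` and `pack_weights` are hypothetical helpers that mirror the two packing steps.

```c
#include <assert.h>
#include <stdint.h>

/* Model of a bit-field insert: low `size` bits of val go to tgt[pos+size-1:pos]. */
static uint32_t ins_field(uint32_t tgt, uint32_t val, unsigned pos, unsigned size)
{
    assert(size >= 1 && pos + size <= 32);
    uint32_t field = (size == 32) ? 0xFFFFFFFFu : ((1u << size) - 1u);
    uint32_t mask  = field << pos;
    return (tgt & ~mask) | ((val << pos) & mask);
}

/* Pack four data bytes into one register, n0 in the lowest byte. */
static uint32_t pack_data(uint32_t n0, uint32_t n1, uint32_t n2, uint32_t n3)
{
    uint32_t w = ins_field(n0, n1, 8, 8);
    w = ins_field(w, n2, 16, 8);
    w = ins_field(w, n3, 24, 8);
    return w;
}

/* Pack four 4-bit weights into the 16-bit immediate, w0 in the lowest nibble. */
static uint16_t pack_weights(unsigned w0, unsigned w1, unsigned w2, unsigned w3)
{
    assert(w0 < 16 && w1 < 16 && w2 < 16 && w3 < 16);
    return (uint16_t)(w0 | (w1 << 4) | (w2 << 8) | (w3 << 12));
}
```

With weights 1, 5, 9, 13 the immediate is 0xD951: bit 15 selects instruction number 1 and the remaining 15 bits are 22865.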





```
uint32_t __attribute__ ((noinline)) hardware_accelerated_implementation
(
    uint32_t n0,
    uint32_t n1,
    uint32_t n2,
    uint32_t n3
)
{
    return mh
           (
               mh (n0, 0, n1, 4, n2,  8, n3, 12),
               7,
               mh (n1, 1, n2, 5, n3,  9, n0, 13),
               5,
               mh (n2, 2, n3, 6, n0, 10, n1, 14),
               3,
               mh (n3, 3, n0, 7, n1, 11, n2, 15),
               1   // assumed: the last weight is missing on the slide
           );
}
```



| software_only_implementation: | hardware_accelerated_implementation: |
|-------------------------------|--------------------------------------|
| `sll $2,$4,1`                 | `move $2,$4`                         |
| `addu $2,$2,$4`               | `move $8,$5`                         |
| `sll $3,$5,3`                 | `ins $2,$5,8,8`                      |
| `sll $8,$4,2`                 | `ins $8,$6,8,8`                      |
| `sll $2,$2,2`                 | `ins $2,$6,16,8`                     |
| `subu $12,$3,$5`              | `ins $8,$7,16,8`                     |
| `addu $2,$2,$4`               | `ins $2,$7,24,8`                     |
| `addu $8,$8,$4`               | `ins $8,$4,24,8`                     |
| `sll $9,$7,1`                 | `move $3,$6`                         |
| `sll $11,$7,3`                | `udi1 $8,22865`                      |
| `addu $9,$9,$7`               | `ins $3,$7,8,8`                      |
| `addu $3,$2,$5`               | `udi1 $2,18496`                      |

47 instructions without UDI, 25 instructions with UDI




```
int main ()
{
    uint32_t i, os, oh;

    for (;;)
    {
        i = MFP_SWITCHES ^ 0x123;

        MFP_GREEN_LEDS = 0x11;

        os = software_only_implementation
             (
                  i       & 0xff,
                 (i >> 1) & 0xff,
                 (i >> 2) & 0xff,
                 (i >> 3) & 0xff
             );

        MFP_GREEN_LEDS = 0x22;

        oh = hardware_accelerated_implementation
             (
                  i       & 0xff,
                 (i >> 1) & 0xff,
                 (i >> 2) & 0xff,
                 (i >> 3) & 0xff
             );

        MFP_GREEN_LEDS = 0x33;

        MFP_7_SEGMENT_HEX = (oh << 8) | os;
    }

    return 0;
}
```





```
`include "m14k_const.vh"

module m14k_udi_mipsfpga_ai
(
    input         UDI_gclk,         // Clock
    input         UDI_greset,       // Reset
    input         UDI_gscanenable,  // Scan enable

    // Static signals

    input         UDI_endianb_e,    // Endian : 0 = little , 1 = big
    input         UDI_kd_mode_e,    // Mode : 0 = user , 1 = kernel or debug
    output        UDI_present,      // UDI module is present

    // E-Stage signals (in the order of their timing)

    input  [31:0] UDI_ir_e,         // Instruction register
    input         UDI_irvalid_e,    // Instruction register valid
    output        UDI_ri_e,         // The instruction is illegal
    input  [31:0] UDI_rs_e,         // Value of register RS from register file
    input  [31:0] UDI_rt_e,         // Value of register RT from register file
    output [ 4:0] UDI_wrreg_e,      // Register file index to write the result
                                    // Zero index indicates don't write
    input         UDI_start_e,      // Values of RS and RT valid

    // M-stage signals (in the order of their timing)

    output        UDI_stall_m,      // Stall the pipeline
    output [31:0] UDI_rd_m,         // Result to write back
    input         UDI_run_m,        // Qualify UDI_kill_m.
    input         UDI_kill_m,       // Kill the instruction

    // Other signals

    output        UDI_honor_cee,    // UDI module has local state

    input  [`M14K_UDI_EXT_TOUDI_WIDTH   - 1:0] UDI_toudi,   // External
    output [`M14K_UDI_EXT_FROMUDI_WIDTH - 1:0] UDI_fromudi  // Output f
);

    // The assignments to UDI_present, UDI_wrreg_e, UDI_stall_m,
    // UDI_honor_cee and UDI_fromudi are not legible on the slide

    wire spc2        = ( UDI_ir_e [31:26] == 6'b011100 );
    wire udi         = ( UDI_ir_e [ 5: 4] == 2'b01     );
    wire usr_udi_vld = ( UDI_ir_e [ 3: 1] == 3'b000    );

    wire   ir_ok     = spc2 & udi & usr_udi_vld & UDI_irvalid_e;
    assign UDI_ri_e  = ! ir_ok;

    wire run_instr   = ir_ok & UDI_start_e;

    wire [15:0] imm16 = { UDI_ir_e [0], UDI_ir_e [20:6] };

    wire [31:0] e_res, e_res_q;

    // E stage: four 8-bit lanes, each multiplied by a 4-bit weight
    assign e_res [ 7: 0] = UDI_rs_e [ 7: 0] * imm16 [ 3: 0];
    assign e_res [15: 8] = UDI_rs_e [15: 8] * imm16 [ 7: 4];
    assign e_res [23:16] = UDI_rs_e [23:16] * imm16 [11: 8];
    assign e_res [31:24] = UDI_rs_e [31:24] * imm16 [15:12];

    mvp_cregister_wide # (32) e_res_r
    (
        .q          ( e_res_q         ),
        .scanenable ( UDI_gscanenable ),
        .cond       ( run_instr       ),
        .clk        ( UDI_gclk        ),
        .d          ( e_res           )
    );

    // M stage: sum the four registered products
    wire [7:0] m_res_01 = e_res_q [15: 8] + e_res_q [ 7: 0];
    wire [7:0] m_res_23 = e_res_q [31:24] + e_res_q [23:16];
    wire [7:0] m_res    = m_res_01 + m_res_23;

    assign UDI_rd_m = { 24'b0, m_res };

endmodule
```
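
The datapath above can be cross-checked against the software formula with a short host-side model. This is a sketch under the assumption that each product and each sum is truncated to 8 bits, as the Verilog bit widths suggest; the function names `udi_ai_model` and `software_reference` are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Model of the UDI block: four byte lanes times four 4-bit weights, summed. */
static uint8_t udi_ai_model(uint32_t rs, uint16_t imm16)
{
    uint8_t lane[4];

    /* E stage: each 8-bit lane keeps only the low 8 bits of its product. */
    for (int k = 0; k < 4; k++) {
        uint8_t d = (uint8_t)((rs >> (8 * k)) & 0xFFu);
        uint8_t w = (uint8_t)((imm16 >> (4 * k)) & 0xFu);
        lane[k] = (uint8_t)(d * w);
    }

    /* M stage: pairwise 8-bit sums, then the final 8-bit sum. */
    uint8_t m_res_01 = (uint8_t)(lane[1] + lane[0]);
    uint8_t m_res_23 = (uint8_t)(lane[3] + lane[2]);
    return (uint8_t)(m_res_01 + m_res_23);
}

/* The same weighted sum computed as in the pure-software macro. */
static uint8_t software_reference(uint32_t n0, uint32_t w0, uint32_t n1, uint32_t w1,
                                  uint32_t n2, uint32_t w2, uint32_t n3, uint32_t w3)
{
    return (uint8_t)((n0 * w0 + n1 * w1 + n2 * w2 + n3 * w3) & 0xFFu);
}
```

Because all truncations are modulo 256, the two functions agree for any byte-sized inputs and 4-bit weights. For `rs = 0x91232448` and `imm16 = 0xea62`, both give 0xb4.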


#### Simulate MIPS Open system using Verilog simulator

Waveform (simulator screenshot): `/mfp_testbench` signals `SI_ClkIn`, `cycle`, `opstr`, `IO_7_SegmentHEX`, and the UDI signals `UDI_ir_e`, `UDI_rs_e`, `UDI_rt_e`, `UDI_wrreg_e`, `UDI_rd_m`, `run_instr`, `imm16`, `e_res`, `e_res_q`, `m_res_01`, `m_res_23`, `m_res` over cycles 19775 to 19779. In the captured window, `UDI_rs_e = 32'h91232448` with `imm16 = 16'hea62` gives `e_res = 32'hee5ed890`, then `m_res_01 = 8'h68`, `m_res_23 = 8'h4c`, `m_res = 8'hb4` and `UDI_rd_m = 32'h000000b4`.





- The computation results match
- Software computation takes 62 cycles
- Software-hardware computation takes 30 cycles, about half as many
- The result can be improved by orders of magnitude by building a more complex AI engine that uses both the CorExtend and DSPRAM interfaces as a highly parallel, multi-functional computational unit



### Synthesis – the whole MIPS Open FPGA system








#### UDI submodule





#### Synthesis for the system: fits in 65% of the Terasic DE10-Lite board






#### An Fmax of 33 MHz is practical even for Linux debug

Screenshot: Quartus Prime Lite Edition compilation report for the `de10_lite` project (device MAX 10 10M50DAF484C7G), TimeQuest Timing Analyzer, "Slow 1200mV 85C Model Fmax Summary" panel.

| Fmax | Restricted Fmax | Clock Name |
|---|---|---|
| 32.84 MHz | 32.84 MHz | MAX10_CLK1_50 |
| 116.04 MHz | 116.04 MHz | GPIO[17] |
|                                                                                                | Removal Summary                                              |                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                     |                         |                    | only computed for pa<br>ports are driven by th |            |
| < III >                                                                                        | < III                                                        |                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                     |                         | -                  | ng gonorated clocks                            |            |



- Return Hardware
  - Altera/Xilinx FPGA Boards
  - USB Hub
  - Cables
- Housekeeping
  - Please delete the Altera Quartus and Xilinx Vivado tools on the SSD!!!
  - When you leave the class, please leave the FPGA boards, cables and USB hub.
  - The SSD is yours to take home





**Thank You** 

