### **MIPS-Classic Cores**

Presented by Yuri Panchul, MIPS Open Technical Lead. MIPS Open Meetup in Moscow, April 15, 2019.



### **MIPS IP Cores - Features Summary**

|                        | microAptiv                     | M51xx        | M62xx        | interAptiv    | I7200                                  | I6500/-F                            | P5600           | P6600           |
|------------------------|--------------------------------|--------------|--------------|---------------|----------------------------------------|-------------------------------------|-----------------|-----------------|
| MIPS Primary ISA       | MIPS32 r5                      | MIPS32 r5    | MIPS32 r6    | MIPS32 r5     | nanoMIPS32                             | MIPS64 r6                           | MIPS32 r5       | MIPS64 r6       |
| Virtual/Phys Addr Bits | 32/32                          | 32/32        | 32/32        | 32/32         | 32/32                                  | 48/48                               | 32/40           | 48/40           |
| FPU                    | $\checkmark$ (UC version only) | $\checkmark$ | -            | MT            | -                                      | MT w/SIMD                           | Hi Perf w/SIMD  | Hi Perf w/SIMD  |
| DSP/SIMD extensions    | DSPASE r2                      | DSPASE r2    | DSPASE r2    | DSPASE r2     | DSPASE r2                              | MSA 128-bit                         | MSA 128-bit     | MSA 128-bit     |
| Virtualization         | -                              | $\checkmark$ | -            | -             | -                                      | $\checkmark$                        | $\checkmark$    | $\checkmark$    |
| Small code size ISA    | microMIPS32                    | microMIPS32  | microMIPS32  | MIPS16e2 ASE  | nanoMIPS32                             | -                                   | -               | -               |
| Multi-threading        | -                              | -            | -            | 2 VPE, 9 TC   | 3 VPE, 9 TC                            | 4 VPE                               | -               | -               |
| SuperScalar            | -                              | -            | -            | -             | Dual-issue in order                    | Dual-issue in order                 | Multi-issue OoO | Multi-issue OoO |
| Pipeline stages        | 5                              | 5            | 6            | 9             | 9                                      | 9                                   | 16              | 16              |
| Relative Frequency*    | 0.6x                           | 0.6x         | 0.75x        | 1x            | 0.95x                                  | 0.90x                               | 1.10x           | 1.10x           |
| SPRAMs (I/D/U)         | ✓ / ✓ / -                      | ✓ / ✓ / -    | ✓ / ✓ / -    | ✓ / ✓ / -     | ✓ / ✓ / ✓                              | - / ✓ / -                           | - / - / -       | - / - / -       |
| L1 caches              | $\checkmark$                   | $\checkmark$ | $\checkmark$ | $\checkmark$  | $\checkmark$                           | $\checkmark$                        | $\checkmark$    | $\checkmark$    |
| L2 cache               | -                              | -            |              | $\checkmark$  | $\checkmark$                           | $\checkmark$                        | $\checkmark$    | $\checkmark$    |
| Coherent Multi-Core    | -                              | -            | -            | Up to 4 cores | Up to 4 cores                          | Up to 6 cores,<br>Up to 64 clusters | Up to 6 cores   | Up to 6 cores   |
| Native System Bus I/F  | AHB-Lite                       | AHB-Lite     | AXI          | OCP 2 or AXI  | AXI                                    | AXI or ACE                          | AXI             | AXI             |

\* Relative frequencies are approximate, are provided for rough guidance only, and will vary to some extent across process nodes.
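To put the "Virtual/Phys Addr Bits" row in concrete terms, the short sketch below converts address widths into addressable sizes. The bit widths are copied from the table. The helper name `addressable` and the selection of cores are illustrative choices, and the calculation assumes byte addressing.

```python
# Address widths (virtual, physical) in bits, copied from the features table.
ADDRESS_BITS = {
    "microAptiv": (32, 32),
    "I6500/-F": (48, 48),
    "P5600": (32, 40),
    "P6600": (48, 40),
}

UNITS = ["B", "KiB", "MiB", "GiB", "TiB", "PiB"]


def addressable(bits: int) -> str:
    """Return the size of a 2**bits byte address space in binary units."""
    size = 1 << bits
    unit = 0
    # Divide by 1024 until the value fits the next larger unit.
    while size >= 1024 and unit < len(UNITS) - 1:
        size //= 1024
        unit += 1
    return f"{size} {UNITS[unit]}"


for core, (virt, phys) in ADDRESS_BITS.items():
    print(f"{core}: virtual {addressable(virt)}, physical {addressable(phys)}")
```

For example, 32 bits give 4 GiB, 40 bits give 1 TiB, and 48 bits give 256 TiB.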



### **MIPS** technology differentiation

MIPS architecture and IP cores offer powerful, unique capabilities



#### **MIPS IP cores**

1. Offer leading Power Performance Area (PPA) across the range
2. Provide ultimate scalability: multi-thread, multi-core, multi-cluster
3. Address Functional Safety to ISO 26262 for automotive and IEC 61508 for industrial



#### **MIPS® I-Class Processor Core Roadmap**

#### MIPS32 and MIPS64 evolution





Setting a new standard in mainstream 64-bit processing



### Key Components of a Multi-Core Cluster





#### How to build a Multi-Core Cluster?

I6500



- Multi-threaded, multi-core 32-bit processor IP
- Designed for high-performance embedded systems with real-time requirements






#### nanoMIPS ISA – advancing small code size

Achieves outstanding code density, targeting small-footprint applications and constrained devices





gcc compiler, with indicated optimization target



**Success Stories** 

Market specific use-cases

#### **Classical Application Segments**





### Application Segment 1: ADAS

#### **I6500-F lead customer: Mobileye ADAS Platform**





[Figure: per-core benchmark results with Simultaneous Multi-Threading, normalized to A53 values at the same frequency. Panels: DMIPS (32-bit), CoreMark, SPECint2000 (rate).]

Based on Cortex A53 data reported by ARM on website and/or in presentation materials, plus benchmarked results on Linaro (HiSilicon Octa A53 Kirin620) with Linux kernel: 3.18.0-linaro-hikey SMP preempt, RFS Debian squeeze, with GCC-based 5.0.0 toolchain. A7 scores are ARM claims. Measured results are lower.

I6400 results are based on production-released RTL and FPGA platform benchmarking and, in the case of SPEC, on one enhancement for the next release (performance models and testing).



### Application Segment 2: LTE/5G



#### Success Story, Multi-threading & Mediatek

#### I7200


|                   | MediaTek 5G LTE modem | About MIPS in the application |
|-------------------|-----------------------|-------------------------------|
| LTE               | <ul> <li>MediaTek designed MIPS into its modem across multiple products</li> <li>First version, Helio X30, in production</li> </ul> | |
| MIPS<br>Advantage | <ul> <li>Hardware Multi-Threading (MT)</li> <li>Fast inter-thread communications</li> <li>Higher performance/high processing efficiency</li> <li>Scalability <ul> <li>1-4 cores, 2 threads per core</li> </ul> </li> <li>Deterministic; real-time interrupts</li> </ul> | Diagram labels: dual-issue, multi-threading CPUs; L1 (control)/L2/L3 (protocol CPUs); L1 (baseband) |



"MIPS CPUs, with their powerful multi-threading capability, offer a combination of efficiency and high throughput for LTE modems that contributes significantly to system performance," *said Dr. Kevin Jou, SVP and CTO, MediaTek*, 2018.



### **I7200 Performance Advantage**



Wave Computing Confidential © 2018



### Application Segment 3: Data Center




- Scalable Multi-threading to Multi-core to Multi-cluster
- Configurable Heterogeneous inside & outside
- Optimized for High-throughput data processing applications
- Real-time, secure, deterministic and low latency
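The scaling path from threads to cores to clusters can be made concrete with a little arithmetic. The sketch below multiplies the I6500/-F maxima listed in the features table (4 VPE, up to 6 cores, up to 64 clusters). It assumes that each VPE counts as one hardware thread and that the three maxima can be combined in a single system, which the slides do not state explicitly. The `Config` class is a hypothetical helper, not part of any MIPS tooling.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Config:
    """Hypothetical description of a multi-threaded, multi-core, multi-cluster system."""
    threads_per_core: int
    cores_per_cluster: int
    clusters: int = 1

    def hardware_threads(self) -> int:
        # Total hardware threads = threads/core * cores/cluster * clusters.
        return self.threads_per_core * self.cores_per_cluster * self.clusters


# Maxima taken from the I6500/-F column of the features table
# (assumption: one hardware thread per VPE).
i6500_max = Config(threads_per_core=4, cores_per_cluster=6, clusters=64)
single_cluster = Config(threads_per_core=4, cores_per_cluster=6)

print(single_cluster.hardware_threads())  # 24
print(i6500_max.hardware_threads())       # 1536
```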

"The MIPS Simultaneous Multi-Threading architecture is an important technique to ensure that such workloads run efficiently as measured by the CPU's instructions per clock, or IPC. We have seen that this efficiency translates directly into a smaller area as well as lower power for silicon implementations based on MIPS." *Pradeep Sindhu, CEO of Fungible*, 2018




### I6400 & I6500 – Why Multi-Threading?

A powerful differentiator among CPU IP cores




#### Why MT?

- A path to higher performance, and higher efficiency
- 30%-60% higher performance for a 10% increase per thread
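The efficiency claim can be checked with simple arithmetic. The slide does not say what the 10% increase measures. The sketch below assumes it is a per-thread implementation cost such as area or power, and computes performance per unit cost relative to a single-threaded baseline. The function name `efficiency_gain` is illustrative.

```python
def efficiency_gain(perf_gain: float, cost_increase: float) -> float:
    """Performance per unit cost, relative to a single-threaded baseline of 1.0."""
    return (1.0 + perf_gain) / (1.0 + cost_increase)


# Assumption: the "10% increase" is a cost (for example area or power).
COST_INCREASE = 0.10

for perf_gain in (0.30, 0.60):
    ratio = efficiency_gain(perf_gain, COST_INCREASE)
    print(f"+{perf_gain:.0%} performance at +{COST_INCREASE:.0%} cost "
          f"-> {ratio:.2f}x performance per unit cost")
```

Under that assumption, the quoted range corresponds to roughly 1.18x to 1.45x better performance per unit cost.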

#### Easy to use – programming model is same as multi-core

Each hardware thread looks like a core to a standard SMP OS
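Because an SMP operating system counts each hardware thread as a logical CPU, ordinary multi-core code needs no changes to use it. The sketch below sizes a worker pool from the logical CPU count reported by the OS. It is a generic illustration, not MIPS-specific code, and the `work` function is a hypothetical placeholder task.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def work(worker_id: int) -> int:
    """Hypothetical placeholder task: sum a small range of integers."""
    return sum(range(worker_id * 1000, (worker_id + 1) * 1000))


# The OS reports logical CPUs; a hardware thread is counted the same
# way as a separate core would be.
logical_cpus = os.cpu_count() or 1

# One worker per logical CPU, exactly as on a multi-core system.
with ThreadPoolExecutor(max_workers=logical_cpus) as pool:
    results = list(pool.map(work, range(logical_cpus)))

print(f"{logical_cpus} logical CPUs, {len(results)} results")
```

In standard CPython builds the global interpreter lock limits parallel speedup for threads, so this shows the programming model rather than a performance measurement.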

#### Simultaneous / concurrent execution

Zero-cycle-overhead context switching






### MIPS IP Cores – Mapping to ARM

|                        | microAptiv                     | M51xx        | M62xx        | interAptiv              | <b>I7200</b>                           | <i>I6500/-F</i>                     | P5600           | P6600           |
|------------------------|--------------------------------|--------------|--------------|-------------------------|----------------------------------------|-------------------------------------|-----------------|-----------------|
| MIPS Primary ISA       | MIPS32 r5                      | MIPS32 r5    | MIPS32 r6    | MIPS32 r5               | nanoMIPS32                             | MIPS64 r6                           | MIPS32 r5       | MIPS64 r6       |
| Virtual/Phys Addr Bits | 32/32                          | 32/32        | 32/32        | 32/32                   | 32/32                                  | 48/48                               | 32/40           | 48/40           |
| FPU                    | $\checkmark$ (UC version only) | $\checkmark$ | -            | MT                      | -                                      | MT w/SIMD                           | Hi Perf w/SIMD  | Hi Perf w/SIMD  |
| DSP/SIMD extensions    | DSPASE r2                      | DSPASE r2    | DSPASE r2    | DSPASE r2               | DSPASE r2                              | MSA 128-bit                         | MSA 128-bit     | MSA 128-bit     |
| Virtualization         | -                              | $\checkmark$ | -            | -                       | -                                      | $\checkmark$                        | $\checkmark$    | $\checkmark$    |
| Small code size ISA    | microMIPS32                    | microMIPS32  | microMIPS32  | MIPS16e2 ASE            | nanoMIPS32                             | -                                   | -               | -               |
| Multi-threading        | -                              | -            | -            | 2 VPE, 9 TC             | 3 VPE, 9 TC                            | 4 VPE                               | -               | -               |
| SuperScalar            | -                              | -            | -            | -                       | Dual-issue in order                    | Dual-issue in order                 | Multi-issue OoO | Multi-issue OoO |
| Pipeline stages        | 5                              | 5            | 6            | 9                       | 9                                      | 9                                   | 16              | 16              |
| Relative Frequency*    | 0.6x                           | 0.6x         | 0.75x        | 1x                      | 0.95x                                  | 0.90x                               | 1.10x           | 1.10x           |
| SPRAMs (I/D/U)         | ✓ / ✓ / -                      | ✓ / ✓ / -    | ✓ / ✓ / -    | ✓ / ✓ / -               | ✓ / ✓ / ✓                              | - / ✓ / -                           | - / - / -       | - / - / -       |
| L1 caches              | $\checkmark$                   | $\checkmark$ | $\checkmark$ | $\checkmark$            | $\checkmark$                           | $\checkmark$                        | $\checkmark$    | $\checkmark$    |
| L2 cache               | -                              | -            |              | $\checkmark$            | $\checkmark$                           | $\checkmark$                        | $\checkmark$    | $\checkmark$    |
| Coherent Multi-Core    | -                              | -            | -            | Up to 4 cores           | Up to 4 cores                          | Up to 6 cores,<br>Up to 64 clusters | Up to 6 cores   | Up to 6 cores   |
| Native System Bus I/F  | AHB-Lite                       | AHB-Lite     | AXI          | OCP 2 or AXI            | AXI                                    | AXI or ACE                          | AXI             | AXI             |
| Sample ARM<br>Mapping  | M3                             | M4           | M23 M33      |                         | R52 R7 R8                              | A53                                 | A57             | <b>A72</b>      |



**Thank You** 



Wave Computing © 2019