Enabling Better Products
Mirabilis Design
EDA Software Company based in Silicon Valley
Integrating sub-system teams to the mission using System-Level Design
Highly experienced management and engineering team
Over 150 man-years of background in semiconductors, automotive and
aerospace
VisualSim Architect – Design the Right Product
Graphical modeling and simulation platform with complete set of system-level modeling IP
Eliminate all surprises prior to integration
Optimizing specification, collaboration between mission, sub-systems
and suppliers, evaluating use-cases and identify test scenarios for
system validation
2008: Company Incorporated
2011: First Engagement with HP and ISRO
2013: Announced VisualSim
2014: University Program, 10th Customer
2015: Stochastic and Network modeling
2016, 2018, 2019: Networking; 18th companies & 32nd universities; Electronics Modeling; 35th customer; Automotive & Avionics
2020: System-level IP, Open API
2021: Requirements Tracking, 50th customer
2022/23: Re-engineered; AI, DNN, Power, GPU
VisualSim- The Product
Spend time designing … not working on Word/Excel/PowerPoint
VisualSim IP Library
Custom Creator: Scripting, RegEx, Task graph, Use cases, Hardware Builder, C/C++/Java/Python, MATLAB, STK
Communication: RF, Baseband, Channels, Communication systems, A/D transceivers, Antenna, Analog, Signal/Audio/Image Processing
Power: Power States, Allocation, Transition, Loss, Battery, Consumption, Management, Generation, Distribution, and Thermal
Traffic: Sensors, Interfaces, Distribution, Traces, Software, VCD, ML, DNN
Reports: Latency, Throughput, Utilization, Ave/peak power (instant, ave), hit-ratio, Heat, Temp
RISC-V and Chiplets: SiFive, In-Order/Out-of-Order Generator, TileLink
RTOS and Software: Generic RTOS, ARINC 653, AUTOSAR, Task Graph
SoC: AMBA (AHB/APB/AXI/CHI), TileLink, CoreLink (600, 700), NoC (Generic, Arteris, Signature, OpenEdges), Virtual Channel, DMA, Crossbar, Serial Switch, Bridge, UCIe
Board-Level: VME, PCI/PCI-X/PCIe 6.0, SPI 3.0, 1553B, FlexRay, CAN-FD/XL, AFDX, TTEthernet, OpenVPX
Processors: ARM (M0-55), R5, Cortex (A8, A72, A53, A76, A77, A65, A78, A720), Nvidia (Pascal to Ampere), Generic GPU, µC, Leon, Power, x86, DSP (TI and ADI), Tensilica, Renesas SH, AI Engine, TPU
Stochastic: Queue, Time Queue, Quantity Queue, Resources, Scheduler
Storage: Flash, NVMe, Disk, SSD, NAS, Fibre Channel, FireWire
Networking: TSN, AVB, 10BaseT1S, Switched Ethernet, Resilient Packet Ring, RP3, WiFi 802.11, Bluetooth, PAN, SpaceWire, SpaceFibre, IEEE 802.1Q, Time-Triggered Ethernet, AFDX, 5G
Memory: Memory Controller, SDR, DDR DRAM 2, 3, 4, 5, LPDDR 2, 3, 4, 5, HBM2.0, HMC, QDR, RDRAM, MPMC, cache, Coherent cache
FPGA: Xilinx (Versal, Zynq, Ultrascale, Kintex), Altera (Stratix, Arria), Microsemi (Smartfusion), Programmable logic generator
Trade-Off: Requirements, Thermal, Power, Performance, Failure Verification, Upgrade
Assemble System Model using Pre-Built System-Level IP
Scheduling/Arbitration: proportional share, WFQ, static, dynamic, fixed priority, EDF, TDMA, FCFS
Communication Templates
Computation Templates: DSP, AI, GPU, DRAM, CPU, FPGA, µE
Architecture # 1 and Architecture # 2 (diagram labels): DSP, TDMA, Priority, EDF, WFQ, RISC, DSP, LookUp, Cipher, AI, DSP, CPU, GPU, µE, DDR, static
Which architecture is better suited
for our application?
Add the Task Graph to Define the Workload
I/O
DSP
CPU1
CPU2
task1 task2 task3 task4
Contention
- limited resources
- scheduling/arbitration
Interference of multiple
applications
- limited resources
- scheduling/arbitration
- anomalies
Complex behavior
- input stream
- data dependent behavior
Analyze the Results
System with faster Bus is slower in places
Unpredictable System Response
Impact of System Architecture Exploration
• System sizing and topology design
• Power consumption, cooling & management
• Device distribution across one/multi-die
• Application mapping on CPU, GPU, TPU, DSP
• SW, firmware, scheduler and network tuning
• Merges Shift-Left and Shift-Right
• System-level model integrates requirements, creates a single model of the entire system, trades off power, performance and area, and generates tests
• To optimize associated area
• To design thermal structure
• To create Chiplet IP industry
• To meet timing and power
• To meet mission requirements
• Single platform from Concept to End-of-Life
• Collaboration between design teams,
suppliers, customers
ARM Cortex A53
Benchmark   FPGA        VisualSim   Difference   Comments
ED1         5.94 ms     6.425 ms    7.55%        Integer processing
MM          12.084 ms   11.863 ms   1.08%        Most load operations with random addresses
MM_st       13.984 ms   14.65 ms    4.5%         Most store operations with random addresses
Test System
Xilinx Zynq® UltraScale+™ XCZU9EG-2FFVB1156E MPSoC running on the ZCU102 board
Specification: 4-core ARM Cortex A53 at 1200 MHz; 32 KiB i-cache; 32 KiB d-cache; 1 MiB L2; 2 GB DDR4 DRAM 2400
Comparing Power for ARM Cortex A53
Frequency    VisualSim Simulated Power   Measured Power (as reported by Anandtech)   Delta percentage
500.0 MHz    0.037 W                     0.038 W                                      2.63%
600.0 MHz    0.053 W                     0.051 W                                     -3.92%
700.0 MHz    0.073 W                     0.080 W                                      8.75%
800.0 MHz    0.097 W                     0.090 W                                     -7.77%
1000.0 MHz   0.157 W                     0.159 W                                      1.25%
1100.0 MHz   0.193 W                     0.188 W                                     -2.65%
1200.0 MHz   0.233 W                     0.227 W                                     -2.64%
1300.0 MHz   0.277 W                     0.269 W                                     -2.97%
Source: Anandtech.com
Over 97% accuracy
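The delta column in the power comparison can be reproduced from the two power columns. The sketch below assumes the delta is defined as (measured - simulated) / measured, which agrees with the listed values to within about 0.01 percentage points; the names ROWS, delta_percent and mean_abs_delta are illustrative and not part of VisualSim.

```python
# Rows from the power comparison: (frequency in MHz, simulated W, measured W).
ROWS = [
    (500, 0.037, 0.038),
    (600, 0.053, 0.051),
    (700, 0.073, 0.080),
    (800, 0.097, 0.090),
    (1000, 0.157, 0.159),
    (1100, 0.193, 0.188),
    (1200, 0.233, 0.227),
    (1300, 0.277, 0.269),
]


def delta_percent(simulated_w, measured_w):
    """Assumed definition: error relative to the measured value, in percent."""
    return (measured_w - simulated_w) / measured_w * 100.0


deltas = {freq: delta_percent(sim, meas) for freq, sim, meas in ROWS}
mean_abs_delta = sum(abs(d) for d in deltas.values()) / len(deltas)

for freq, d in deltas.items():
    print(f"{freq:>5} MHz  delta = {d:+.2f}%")
print(f"mean |delta| = {mean_abs_delta:.2f}%")
```

Printing the mean absolute delta next to the per-row values makes it easy to see which operating points (here 700 MHz and 800 MHz) deviate most from the measurement.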
Comparing different Cores- Dhrystone
Processor        MoP Hit Ratio   MoP Mean Latency   I1 Hit Ratio   I1 Mean Latency   D1 Hit Ratio   D1 Mean Latency   L2 Hit Ratio   L2 Mean Latency   DSU Hit Ratio   DSU Mean Latency
ARM Cortex A53   -               -                  99.97          1.93E-09          99.98          2.02E-09          18.75          9.33E-08          -               -
ARM Cortex A77   99.90           1.75E-09           67.22          6.25E-08          99.96          7.32E-10          14.19          1.82E-07          6.96            2.05E-09
RISC-V u74       -               -                  99.98          4.15E-09          99.98          1.86E-09          39.58          5.25E-08          -               -

Processor        Instructions   Latency     Max MIPS
ARM Cortex A53   ~5,666,000     0.0055846   ~1039
ARM Cortex A77   ~4,478,000     0.0011795   ~3960
RISC-V u74       ~6,058,000     0.007726    ~797
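As a cross-check on the instruction and latency figures for the three cores, an average instruction rate can be derived per core. This is a rough sketch under two assumptions: the latency values are in seconds, and the approximate instruction counts are read as 5,666,000, 4,478,000 and 6,058,000. The result is an average over the run, so it is expected to sit somewhat below the reported Max MIPS.

```python
# (instructions, latency in seconds [assumed unit], reported Max MIPS)
RUNS = {
    "ARM Cortex A53": (5_666_000, 0.0055846, 1039),
    "ARM Cortex A77": (4_478_000, 0.0011795, 3960),
    "RISC-V u74": (6_058_000, 0.007726, 797),
}


def average_mips(instructions, latency_s):
    """Average millions of instructions per second over the whole run."""
    return instructions / latency_s / 1e6


averages = {name: average_mips(instr, lat) for name, (instr, lat, _) in RUNS.items()}

for name, (_, _, max_mips) in RUNS.items():
    print(f"{name}: average ~{averages[name]:.0f} MIPS, reported max ~{max_mips}")
```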
VisualSim drives Efficiency & Productivity
Model Creation (6)
Implementation (18)
Using Current Design Methodology
Project Schedule
Implementation (12)
Using VisualSim Design Methodology
Time savings based on 24 month project is 20-40%
Note: All times in months
Communication and Refinement (4)
Analysis (2.5)
Model Creation (0.5)
Analysis (1.5)
Communication and Refinement (6)
Advantageous over generic modeling environment due to Shorter duration & greater applicability
VisualSim System Model using UCIe in ADAS SoC
Vary Compute, Interconnect and Traffic
Package_Type = Advanced
Max_Link_Speed_GTps = 32
Number of Modules = 4
Tx_Buffer_Size = 8192 ( No packets dropped)
Protocol = PCIe_Gen6
Flit_Size = 256 Bytes
Num_of_Flits_per_Flow_Control_Check =8
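The parameters above are enough for a back-of-the-envelope estimate of raw link bandwidth. The sketch below is not a VisualSim API; the class and method names are hypothetical. It assumes 64 data lanes per module per direction for an Advanced package and 16 for a Standard package, and it ignores protocol, CRC and flow-control overhead, so the numbers are upper bounds.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UcieConfig:
    """Parameter values copied from the slide; field names are illustrative."""
    package_type: str = "Advanced"
    max_link_speed_gtps: float = 32.0
    num_modules: int = 4
    flit_size_bytes: int = 256

    def lanes_per_module(self):
        # Assumption: x64 for an advanced package, x16 for a standard package.
        return 64 if self.package_type == "Advanced" else 16

    def raw_bandwidth_gbps(self):
        # One bit per transfer per lane, so GT/s * lanes * modules gives Gb/s.
        return self.max_link_speed_gtps * self.lanes_per_module() * self.num_modules

    def flit_time_ns(self):
        # Bits divided by Gb/s yields nanoseconds.
        return self.flit_size_bytes * 8 / self.raw_bandwidth_gbps()


cfg = UcieConfig()
print(f"raw bandwidth per direction: {cfg.raw_bandwidth_gbps():.0f} Gb/s")
print(f"time to move one {cfg.flit_size_bytes}-byte flit: {cfg.flit_time_ns():.2f} ns")
```

Changing package_type to "Standard" or lowering max_link_speed_gtps shows how quickly the raw budget shrinks before any simulation is run.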
Run Simulation with Different Configurations and Topology
1. Power Generation
• 4 types of power generators in VisualSim: constant, variable, motor, solar charge
• Charge sent to battery
2. Power Storage
• Different charging schemes
• Impact of surge and shocks
• Battery lifecycle
• Battery consumption
• Statistics
3. Power Consumption
• State-based power consumption of electronics (controller, SoC) and mechanical parts (brakes, wheels)
• Average, instant and cumulative
• Power per device and application
4. Power Management
• Change in power state controlled by time, utilization, temperature and expected activity
5. Thermal Management
• Heat and temperature
• Impact of cooling strategy
• Add impact of power spikes
6. Verification and Debugging
• Optimize and test the power management algorithms
• Sizing of power generators and battery
• Optimize the schedule, supplynet and voltage
• Estimate power consumed by the software application
7. Downstream Integration
• Generate UPF file with power domains and associated voltage levels
• Generate SystemVerilog power testbench
• Generate powerState change VCD dump
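State-based power consumption and a power-management policy driven by utilization, temperature and idle time can be sketched in a few lines. This is a minimal illustration, not VisualSim's power model: the power table values, the thresholds and the function names are all hypothetical.

```python
# Hypothetical power table: watts drawn by one device in each power state.
POWER_TABLE_W = {"Active": 1.0, "Standby": 0.25, "Off": 0.0}


def summarize(timeline):
    """timeline is a list of (state, duration in seconds) pairs."""
    total_time = sum(duration for _, duration in timeline)
    energy_j = sum(POWER_TABLE_W[state] * duration for state, duration in timeline)
    peak_w = max(POWER_TABLE_W[state] for state, _ in timeline)
    return {
        "cumulative_J": energy_j,
        "average_W": energy_j / total_time,
        "peak_W": peak_w,
    }


def next_state(utilization, temperature_c, idle_s):
    """Pick the next power state; all thresholds are illustrative assumptions."""
    if temperature_c > 85.0:
        return "Standby"      # thermal throttling takes priority
    if utilization == 0.0 and idle_s > 1.0:
        return "Off"          # idle long enough to power down
    if utilization < 0.1:
        return "Standby"
    return "Active"


stats = summarize([("Active", 2.0), ("Standby", 6.0), ("Off", 2.0)])
print(stats)
print(next_state(utilization=0.5, temperature_c=90.0, idle_s=0.0))
```

Keeping the power table separate from the policy mirrors the split between a power table and a power management unit: the table can be swapped per device while the policy is tuned independently.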
Add the Power and Thermal
Behavior Task Graph
Power Table
Power management Unit
SystemVerilog Output for Power System Test
VCD Waveform for Verification
create_power_domain PD_Top -include_scope
create_power_domain -name PD_1_2.0 -elements {"CLKMUX"}
create_power_domain -name PD_1_1.0 -elements {"PLL","G2","G3"}
create_power_domain -name PD_1_3.0 -elements {"PROC"}
create_supply_port -port VDD_1.0 -direction in -domain PD_Top
create_supply_port -port VDD_2.0 -direction in -domain PD_Top
create_supply_port -port VDD_3.0 -direction in -domain PD_Top
create_supply_port -port VSS_0.0 -direction in -domain PD_Top
create_supply_net VDD_1.0 -domain PD_Top
create_supply_net VDD_2.0 -domain PD_Top
create_supply_net VDD_3.0 -domain PD_Top
create_supply_net VSS_0.0 -domain PD_Top
connect_supply_net VDD_1.0 -ports VDD_1.0
connect_supply_net VDD_2.0 -ports VDD_2.0
connect_supply_net VDD_3.0 -ports VDD_3.0
connect_supply_net VSS_0.0 -ports VSS_0.0
add_power_state PD_1_2.0 -state Active {-supply_expr (VDD_2.0 == {ON, 2.0}) && (VSS_0.0 == {ON, 0.0})}
add_power_state PD_1_2.0 -state OFF {-supply_expr (VDD_2.0 == {OFF, 0.0}) && (VSS_0.0 == {ON, 0.0})}
add_power_state PD_1_1.0 -state Active {-supply_expr (VDD_1.0 == {ON, 1.0}) && (VSS_0.0 == {ON, 0.0})}
add_power_state PD_1_1.0 -state OFF {-supply_expr (VDD_1.0 == {OFF, 0.0}) && (VSS_0.0 == {ON, 0.0})}
add_power_state PD_1_3.0 -state Active {-supply_expr (VDD_3.0 == {ON, 3.0}) && (VSS_0.0 == {ON, 0.0})}
add_power_state PD_1_3.0 -state OFF {-supply_expr (VDD_3.0 == {OFF, 0.0}) && (VSS_0.0 == {ON, 0.0})}
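To show how a listing like the one above could be derived from a table of voltage levels and elements, here is a small generator sketch. It is not VisualSim's exporter: the function name and input layout are hypothetical, and the command text simply mirrors the layout of the listing above without being validated against the IEEE 1801 (UPF) grammar.

```python
def generate_upf(domains, top="PD_Top", ground="VSS_0.0"):
    """domains is a list of (voltage string, [element names]) pairs."""
    lines = [f"create_power_domain {top} -include_scope"]
    for volts, elements in domains:
        quoted = ",".join(f'"{name}"' for name in elements)
        lines.append(f"create_power_domain -name PD_1_{volts} -elements {{{quoted}}}")

    # One supply per voltage level, plus the shared ground.
    supplies = [f"VDD_{v}" for v in sorted(v for v, _ in domains)] + [ground]
    lines += [f"create_supply_port -port {s} -direction in -domain {top}" for s in supplies]
    lines += [f"create_supply_net {s} -domain {top}" for s in supplies]
    lines += [f"connect_supply_net {s} -ports {s}" for s in supplies]

    # Each domain gets an Active state at its voltage and an OFF state at 0.0.
    for volts, _ in domains:
        for state, level, value in (("Active", "ON", volts), ("OFF", "OFF", "0.0")):
            lines.append(
                f"add_power_state PD_1_{volts} -state {state} "
                f"{{-supply_expr (VDD_{volts} == {{{level}, {value}}}) "
                f"&& ({ground} == {{ON, 0.0}})}}"
            )
    return lines


upf_lines = generate_upf([
    ("2.0", ["CLKMUX"]),
    ("1.0", ["PLL", "G2", "G3"]),
    ("3.0", ["PROC"]),
])
print("\n".join(upf_lines))
```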
Power Modeling Integration
System Verification
• Validate product not just HW/SW
• Application relevant test vectors
• Generate test cases and run against RTL
• Compare simulation output against RTL
• Match architecture timing within range
• Verify functional correctness
• Task sequencing @ DSP/uP
• Resource contention
Eliminate product failure by maximizing relevant verification
Diagram: Golden Reference, Comparator, Match Tag, Architecture model of IP, Verilog/C/Hardware
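The comparator step (matching tagged outputs of the architecture model against the RTL and checking that timing falls within a range) can be sketched as follows. The data layout, the function name and the 10% tolerance are assumptions made for illustration only.

```python
def compare(reference, candidate, timing_tolerance):
    """Compare two traces keyed by tag.

    Each trace maps tag -> (timestamp in seconds, value). Returns a list of
    (tag, reason) pairs for every tag that fails the check.
    """
    report = []
    for tag, (ref_time, ref_value) in reference.items():
        if tag not in candidate:
            report.append((tag, "missing"))
            continue
        time, value = candidate[tag]
        if value != ref_value:
            report.append((tag, "value mismatch"))
        elif abs(time - ref_time) > timing_tolerance * ref_time:
            report.append((tag, "timing out of range"))
    return report


# Hypothetical traces: golden reference from the model, candidate from RTL.
golden = {"pkt0": (1.0e-6, 0x1A), "pkt1": (2.0e-6, 0x2B), "pkt2": (3.0e-6, 0x3C)}
rtl = {"pkt0": (1.05e-6, 0x1A), "pkt1": (2.6e-6, 0x2B), "pkt2": (3.0e-6, 0x3D)}

mismatches = compare(golden, rtl, timing_tolerance=0.10)
print(mismatches)
```

Checking values before timing keeps functional errors from being reported as timing drift.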
Reference Data
Example: Infotainment
Architecting Hardware-Software for Infotainment System
Mirabilis Design Confidential
Block diagram: DRAM, Display, IO, AMBA AXI Bus, CPU, GPU, Display Ctrl, PCIe, Video Camera, SRAM, Packet
• System Overview
• Camera: 30 fps, VGA equivalent
• CPU: Multi-core ARM Cortex-A53, 1.2 GHz
• GPU: 64 cores (8 warps × 8 PEs), 32 threads, 1 GHz
• DisplayCtrl: display buffer of 293,888 bytes
• SRAM: SDR, 64 MB, 1.0 GHz
• DRAM: DDR3, 64 MB, 2.4 GHz
Explore at the board- and semiconductor-level to size uP/GPU, memory bandwidth and bus/switch configuration
System Model of an Infotainment System
Model blocks: NXP i.MX6 / NVIDIA Drive PX, Xilinx FPGA Kintex 8, Discrete, DMA, ARM A53, GPU, Display Ctrl, SRAM3, DRAM3, Video IN, Parameters, Video OUT
Conducting Architecture Trade-off
• By changing the amount of video input data (number of packets), observe the SRAM -> DRAM transfer performance and find the upper limit of video input that the system can tolerate.
Plot annotations: 210 packets/s, 12 ms; 21 packets/s, 41.4 µs; 300 packets/s
• 250 packets/s is the system limit.
• At 300 packets/s, the simulation cannot complete because the FIFO buffer overflows.
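The overflow behaviour can be illustrated with a small queue model. This is a minimal sketch, not the VisualSim model: it assumes packets arrive at a fixed rate, that the SRAM -> DRAM path drains them at 250 packets/s (the system limit quoted above), and that the FIFO holds at most 16 packets, which is a hypothetical depth.

```python
from collections import deque


def first_overflow_time(arrival_rate, service_rate, depth, duration_s):
    """Return the arrival time of the first dropped packet, or None.

    Packets arrive every 1/arrival_rate seconds and are drained one at a
    time, each taking 1/service_rate seconds. 'depth' bounds the number of
    packets held (waiting or being transferred).
    """
    service_time = 1.0 / service_rate
    departures = deque()  # departure times of packets still in the FIFO
    for k in range(int(duration_s * arrival_rate)):
        now = k / arrival_rate
        while departures and departures[0] <= now:
            departures.popleft()          # these packets have already left
        if len(departures) >= depth:
            return now                    # FIFO full: this packet is dropped
        start = departures[-1] if departures else now
        departures.append(max(start, now) + service_time)
    return None


for rate in (21, 210, 300):
    t = first_overflow_time(rate, service_rate=250, depth=16, duration_s=10.0)
    status = "no overflow" if t is None else f"overflow at {t:.3f} s"
    print(f"{rate:>3} packets/s: {status}")
```

Below the drain rate the queue never builds up; above it the backlog grows steadily, so any finite buffer eventually overflows regardless of its depth.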
Reference Data:
Mapping Applications onto SoC
Mapping Algorithm to Multi-Resources
Standard HW
Library
Component
Basic/Starting Configuration
Grayscale_Conversion - PS [A72 Core 1]
IIR – Logic (PL)
FFT – AI Engine Tile
Edge_Image - Logic (PL)
iFFT – AI Engine Tile
Edge_Image_Enhancement – Logic (PL)
Segmentation – PS [A72 Core 2]
Image
Processing
Algorithm
Experiments with Different Implementations
Run 1 – Base Configuration Mapped to Logic and ARM
• Application latency increases over time.
• Latency increases due to Segmentation.
• Remap Segmentation task to AI Tiles.
Run 2 – Segmentation Mapped to AI Engine
• Application latency stays in a bounded range.
• NoC utilization is high.
• Changed interconnect for Segmentation from NoC to Direct.
Run 3 – Using Direct Path between Logic and AI
• Latency is deterministic.
• Latency requirement (App latency < 80 msec) is met.
• Utilization across NoC is acceptable.
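The three runs differ only in where Segmentation executes and how it is connected, which can be captured as data. The sketch below is illustrative: the dictionary layout and helper names are hypothetical, and the "NoC" link for Run 1 is an assumption (the slides only state that the link was changed from NoC to Direct for the last run).

```python
from collections import Counter

# Basic/starting configuration: task -> resource, as listed in the deck.
BASE_MAPPING = {
    "Grayscale_Conversion": "PS [A72 Core 1]",
    "IIR": "Logic (PL)",
    "FFT": "AI Engine Tile",
    "Edge_Image": "Logic (PL)",
    "iFFT": "AI Engine Tile",
    "Edge_Image_Enhancement": "Logic (PL)",
    "Segmentation": "PS [A72 Core 2]",
}


def remap(mapping, task, resource):
    """Return a copy of the mapping with one task moved to another resource."""
    return {**mapping, task: resource}


run1 = {"mapping": BASE_MAPPING, "segmentation_link": "NoC"}
run2 = {"mapping": remap(BASE_MAPPING, "Segmentation", "AI Engine Tile"),
        "segmentation_link": "NoC"}
run3 = {**run2, "segmentation_link": "Direct"}

for name, run in (("Run 1", run1), ("Run 2", run2), ("Run 3", run3)):
    load = Counter(run["mapping"].values())
    print(name, dict(load), "link:", run["segmentation_link"])
```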
VisualSim Chiplet
Solution
Using the Chiplet Library to Design SoC
ADAS SoC Block Diagram
UCIe
AI Engine Tiles
Warp
Scheduler
PE
PE
PE
PE
Local Mem
GPU
Memory chiplet
ADC
DDR5
Processor subsystem
Core L1
Bus
SLC
• Optimal mesh size (m×n)?
• Best sample size (16 bytes vs 32 bytes, etc.)?
• Use a single protocol stack or a multi-protocol stack?
• Do we need PCIe Gen6, or can we still use Gen5 to meet application requirements?
VisualSim System Model using UCIe in ADAS SoC
Statistics for Multi-Die SoC
• Note the AI Engine latency spikes.
• For multi-protocol operation, each protocol gets half the bandwidth.
• Older-generation protocols are mixed with PCIe 6.
• Lower FLIT size increases latency.
Comparing Different Configurations using UCIe Interface
All Die Adapters using PCIe 6.0
Die Adapters using PCIe 6.0
and Streaming Protocols (AXI)
Lower latency when using PCIe 6.0
Reference Data
Example: Deep Neural Network
Mask Region-CNN (MR-CNN) for object detection and image
segmentation
Overall representation of Mask
R-CNN model
Network Architecture of Mask R-CNN
output
CPU Preprocessing
CPU Postprocessing
Using ChatGPT to translate AI model (Mask R-CNN) into VisualSim
Task Graph
• Each layer is defined as a separate task in the task graph, and the dependencies between them are modeled.
• A database lists the layers/functions and the parameters associated with them.
• These parameters are used to determine the number of Multiply-Accumulate (MAC) operations for each layer/function.
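The step from layer parameters to MAC counts uses standard formulas: a convolution needs one MAC per output element per kernel weight, and a fully connected layer needs one MAC per weight. The sketch below shows that arithmetic on a tiny layer database; the two layers and their shapes are hypothetical examples, not the actual Mask R-CNN layers used in the model.

```python
def conv2d_macs(out_h, out_w, out_ch, k_h, k_w, in_ch):
    """MACs for a standard 2D convolution (bias ignored)."""
    return out_h * out_w * out_ch * k_h * k_w * in_ch


def dense_macs(in_features, out_features):
    """MACs for a fully connected layer (bias ignored)."""
    return in_features * out_features


# Hypothetical layer database: one record per layer/function.
LAYERS = [
    {"name": "conv_example", "type": "conv2d",
     "out_h": 56, "out_w": 56, "out_ch": 64, "k_h": 3, "k_w": 3, "in_ch": 64},
    {"name": "fc_example", "type": "dense",
     "in_features": 1024, "out_features": 81},
]


def layer_macs(layer):
    if layer["type"] == "conv2d":
        return conv2d_macs(layer["out_h"], layer["out_w"], layer["out_ch"],
                           layer["k_h"], layer["k_w"], layer["in_ch"])
    if layer["type"] == "dense":
        return dense_macs(layer["in_features"], layer["out_features"])
    raise ValueError(f"unknown layer type: {layer['type']}")


macs = {layer["name"]: layer_macs(layer) for layer in LAYERS}
print(macs)
```

Each entry in macs would then become the workload of one task in the task graph.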
Class, box
mask
VisualSim Model of DNN Hardware and Task Graph
Application sequence from
Task Graph is mapped to
HW architecture
• PE – 12x14
• 4-level memory hierarchy
• Power computation per PE, buses and memory
Results – Base model (168 AI cores, 90% data availability at SRAM)
• Peak power consumption at around 10.8 W
• Obtained FPS = 0.414
Results – 8x8 (64) cores, 90% data availability at SRAM
• Peak power consumption at around 5.6 W, as the number of cores was reduced
• Obtained FPS = 0.29, lower than the base model because fewer resources were available for MAC operations
Results – 100% data availability at SRAM, 168 cores
• The number of off-chip memory accesses was reduced. The only accesses made were to load the images and weights into the SRAM
• Obtained FPS = 9.93, higher than the base model because the number of off-chip memory accesses was reduced
• Peak power consumption (10.4 W) is lower, as off-chip memory accesses were reduced
Results – 60% data availability at SRAM, 168 cores
• The number of off-chip memory accesses was increased
• Obtained FPS = 0.04, lower than the base model because the number of off-chip memory accesses was increased
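The four result sets are easier to compare when expressed relative to the base model. The sketch below uses only the FPS and peak-power figures quoted above; the configuration labels and function names are illustrative. Dividing peak power by FPS gives an upper bound on energy per frame rather than a measured value, because peak power is used in place of average power.

```python
# FPS and peak power as quoted in the results; None where no figure is given.
RESULTS = {
    "168 cores, 90% (base)": {"fps": 0.414, "peak_w": 10.8},
    "64 cores, 90%": {"fps": 0.29, "peak_w": 5.6},
    "168 cores, 100%": {"fps": 9.93, "peak_w": 10.4},
    "168 cores, 60%": {"fps": 0.04, "peak_w": None},
}
BASE = "168 cores, 90% (base)"


def relative_fps(name):
    """Throughput relative to the base configuration."""
    return RESULTS[name]["fps"] / RESULTS[BASE]["fps"]


def energy_bound_j_per_frame(name):
    """Upper bound on joules per frame: peak power times seconds per frame."""
    peak = RESULTS[name]["peak_w"]
    return None if peak is None else peak / RESULTS[name]["fps"]


for name in RESULTS:
    bound = energy_bound_j_per_frame(name)
    bound_text = "n/a" if bound is None else f"<= {bound:.1f} J/frame"
    print(f"{name:<24} {relative_fps(name):6.2f}x base FPS   {bound_text}")
```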
Reference Data:
Hardware-Software Partitioning SoC Architecture Design
SoC System Specification
Processor Core – RISC-V or ARM A53 core
Processor Speed – 1200 MHz
L1 cache:
I Cache: 32 KB, 2-way set associative
D Cache: 32 KB, 4-way set associative
L2 Cache: 1 MB, 16-way associative
Ext DRAM: 4 GB, DDR4, 2400 MHz
HW Accelerator: 100 MHz
Software
Multimedia task
Stochastic instruction trace
Goals
Peak Power < 1.0W
Number of Matrices > 19K
VisualSim SoC Model
MPEG Application
IP or RISC-V level
• Evaluate pipeline stages
• Width, Speed
• Number of execution units, Levels of cache
SoC
• Number of RISC-V cores
• Accelerators
• Cache memory hierarchy and coherence
System level
• Development of an IoT device, ECU or an
integrated platform
Behavior
Hardware
Bus Topology
CASE 1: All SW tasks
Observations:
1. Avg power consumption within requirements (<1.0 W)
2. Performance requirement not achieved (only a max of 9.4K frames)
Sequence diagram: Rotate Frame task is found to be resource intensive
CASE 2: Run Rotate Frame Task on HW Accelerator
Observations:
1. Avg power consumption requirement not met (> 1.3 W)
2. Performance requirement achieved (max of 19.9K frames)
CASE 3: Run Rotate Frame task on HW Accelerator + Power management
Observations:
1. Avg power consumption requirement met (<1.0 W)
2. Performance requirement achieved (max of 19.8K frames)
Comparing different
Processor Cores
ARM, RISC-V
Generated Statistics
• Per-execution-unit stats, stall percentages, and buffer occupancies are reported.
• Detailed cache, bus and memory stats are generated per simulation.
• Stats include hit ratio, throughput, latency, number of write-backs, evictions, etc.
ARM Cortex M4
ARM Cortex M55
Use cases
Run Num   Description                                              M4 (Latency)      M55 (Latency)     U74 (Latency)
1         Running Dhrystone on core. No cache/bus/memory access    5.576700039E-4    9.47200014E-5     1.77875568E-5
2         Cache/Bus/Memory access                                  8.7438000752E-4   1.6319750281E-4   5.05307708E-5
* Number of loops are different for each core
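Because the number of loops differs per core, the absolute latencies are not directly comparable across cores, but the ratio of run 2 to run 1 for the same core is. The short sketch below computes that ratio as a rough indicator of how much cache/bus/memory access adds on each core; it assumes both runs of a given core used the same loop count, which the slide does not state explicitly.

```python
# (run 1 latency, run 2 latency) per core, copied from the use-case table.
LATENCY = {
    "M4": (5.576700039e-4, 8.7438000752e-4),
    "M55": (9.47200014e-5, 1.6319750281e-4),
    "U74": (1.77875568e-5, 5.05307708e-5),
}

# Ratio > 1 means the memory hierarchy lengthens the run by that factor.
overhead = {core: with_mem / core_only for core, (core_only, with_mem) in LATENCY.items()}

for core, ratio in overhead.items():
    print(f"{core}: run 2 takes {ratio:.2f}x the run 1 latency")
```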
Automotive applications
Mapping tasks to RISC-V
ECU Performance Analysis under Different Use Cases
Demo environment
1. Brake ECU integrated to a CAN Network
2. Sensors write data to the memory
3. Brake Pedal or Proximity sensor triggers the braking action from the Brake ECU
ECU
Using a RISC-V processor for the Brake ECU
Analysis
1. Latency (Time taken for the signal to reach all the wheels from the Brake ECU)
2. Processor performance (MIPS)
3. Power Consumption (Braking activity, ECU usage and Network activity)
6/28/2024 Mirabilis Design Inc. 52
System Overview
Gateway
Transfer messages between different CAN
networks
CAN Bus
CAN bus is the network that connects sensors and ECUs
Diagram nodes: Wheel 1, Wheel 2, Wheel 3, Wheel 4, Gateway, Engine, Proximity Sensor, Brake Pedal, Gyro Sensor, Road condition sensor, ECU, CAN Bus (three segments)
Automotive Network System
Legend: N (CAN Node), CAN Wire
Diagram nodes: Wheel1, Wheel2, Wheel3, Wheel4, Brake Pedal, Proximity Sensor, Gyro Sensor, Gateway, ECU, Road condition sensor, Engine, CAN BUS (three segments)
VisualSim Model
RISC-V
Model location: VS_ARdemo automotiveBrake_Model_With_ECU_A53 Brake_CAN_model_ECU_new_RISC-V.xml
Configuration of the ECU/Processor
Processor Spec
1. Processor (ECU): RISC-V, 5 pipeline stages
2. Number of cores: 1 - 2
3. Processor Speed: 100 MHz - 1.2 GHz
4. DRAM Type: DDR3 SDRAM (Synchronous DRAM)
5. DRAM Speed Range: 400 – 1066 MHz
6. Cache Speed: 500 MHz
7. Cache Size: 64 KB
8. Memory Controller: DDR3, 750 MHz
9. Bus: CAN
ECU Data input
1. Wheels 2. Engine 3. Proximity Sensor 4. Brake Pedal
5. Gyro Sensor 6. Road Condition Sensor
Designing Brake ECU using Single Core – RISC-V
Results – single core RISC-V
Slight improvement in processor task latency at a few instances
Enabling Better Products

More Related Content

Similar to Mirabilis_Presentation_DAC_June_2024.pptx

Accelerated development in Automotive E/E Systems using VisualSim Architect
Accelerated development in Automotive E/E Systems using VisualSim ArchitectAccelerated development in Automotive E/E Systems using VisualSim Architect
Accelerated development in Automotive E/E Systems using VisualSim Architect
Deepak Shankar
 
Webinar on Latency and throughput computation of automotive EE network
Webinar on Latency and throughput computation of automotive EE networkWebinar on Latency and throughput computation of automotive EE network
Webinar on Latency and throughput computation of automotive EE network
Deepak Shankar
 
Webinar on radar
Webinar on radarWebinar on radar
Webinar on radar
Deepak Shankar
 
HMI Replacement_GE MARK V, ABB Procontrol 13, MHI MIDAS 8000, SIEMEN TXP
HMI Replacement_GE MARK V, ABB Procontrol 13, MHI MIDAS 8000, SIEMEN TXP HMI Replacement_GE MARK V, ABB Procontrol 13, MHI MIDAS 8000, SIEMEN TXP
HMI Replacement_GE MARK V, ABB Procontrol 13, MHI MIDAS 8000, SIEMEN TXP
Hyemin Hwang
 
Modeling Abstraction
Modeling AbstractionModeling Abstraction
Modeling Abstraction
Deepak Shankar
 
Dc
DcDc
Dc
rkonte
 
Traditional vs. SoC FPGA Design Flow A Video Pipeline Case Study
Traditional vs. SoC FPGA Design Flow A Video Pipeline Case StudyTraditional vs. SoC FPGA Design Flow A Video Pipeline Case Study
Traditional vs. SoC FPGA Design Flow A Video Pipeline Case Study
Altera Corporation
 
The Cloud & Its Impact on IT
The Cloud & Its Impact on ITThe Cloud & Its Impact on IT
The Cloud & Its Impact on IT
Anand Haridass
 
Task allocation on many core-multi processor distributed system
Task allocation on many core-multi processor distributed systemTask allocation on many core-multi processor distributed system
Task allocation on many core-multi processor distributed system
Deepak Shankar
 
Using VisualSim Architect for Semiconductor System Analysis
Using VisualSim Architect for Semiconductor System AnalysisUsing VisualSim Architect for Semiconductor System Analysis
Using VisualSim Architect for Semiconductor System Analysis
Deepak Shankar
 
Reconfigurable Computing
Reconfigurable ComputingReconfigurable Computing
Reconfigurable Computing
ppd1961
 
ES-Basics.pdf
ES-Basics.pdfES-Basics.pdf
ES-Basics.pdf
Srisurya26
 
Mastering IoT Design: Sense, Process, Connect: Processing: Turning IoT Data i...
Mastering IoT Design: Sense, Process, Connect: Processing: Turning IoT Data i...Mastering IoT Design: Sense, Process, Connect: Processing: Turning IoT Data i...
Mastering IoT Design: Sense, Process, Connect: Processing: Turning IoT Data i...
Deepak Shankar
 
Psim brochure press
Psim brochure pressPsim brochure press
Psim brochure press
Elizabeth Gannett
 
HPC Impact: EDA Telemetry Neural Networks
HPC Impact: EDA Telemetry Neural NetworksHPC Impact: EDA Telemetry Neural Networks
HPC Impact: EDA Telemetry Neural Networks
inside-BigData.com
 
hyperlynx_compress.pdf
hyperlynx_compress.pdfhyperlynx_compress.pdf
hyperlynx_compress.pdf
raimonribal
 
Track B- Advanced ESL verification - Mentor
Track B- Advanced ESL verification - MentorTrack B- Advanced ESL verification - Mentor
Track B- Advanced ESL verification - Mentor
chiportal
 
ROLE OF DIGITAL SIMULATION IN CONFIGURING NETWORK PARAMETERS
ROLE OF DIGITAL SIMULATION IN CONFIGURING NETWORK PARAMETERSROLE OF DIGITAL SIMULATION IN CONFIGURING NETWORK PARAMETERS
ROLE OF DIGITAL SIMULATION IN CONFIGURING NETWORK PARAMETERS
Deepak Shankar
 
Pratik Shah_Revised Resume
Pratik Shah_Revised ResumePratik Shah_Revised Resume
Pratik Shah_Revised Resume
Pratik Shah
 
Sudheer vaddi Resume
Sudheer vaddi ResumeSudheer vaddi Resume
Sudheer vaddi Resume
Sudheer Vaddi
 

Similar to Mirabilis_Presentation_DAC_June_2024.pptx (20)

Accelerated development in Automotive E/E Systems using VisualSim Architect
Accelerated development in Automotive E/E Systems using VisualSim ArchitectAccelerated development in Automotive E/E Systems using VisualSim Architect
Accelerated development in Automotive E/E Systems using VisualSim Architect
 
Webinar on Latency and throughput computation of automotive EE network
Webinar on Latency and throughput computation of automotive EE networkWebinar on Latency and throughput computation of automotive EE network
Webinar on Latency and throughput computation of automotive EE network
 
Webinar on radar
Webinar on radarWebinar on radar
Webinar on radar
 
HMI Replacement_GE MARK V, ABB Procontrol 13, MHI MIDAS 8000, SIEMEN TXP
HMI Replacement_GE MARK V, ABB Procontrol 13, MHI MIDAS 8000, SIEMEN TXP HMI Replacement_GE MARK V, ABB Procontrol 13, MHI MIDAS 8000, SIEMEN TXP
HMI Replacement_GE MARK V, ABB Procontrol 13, MHI MIDAS 8000, SIEMEN TXP
 
Modeling Abstraction
Modeling AbstractionModeling Abstraction
Modeling Abstraction
 
Dc
DcDc
Dc
 
Traditional vs. SoC FPGA Design Flow A Video Pipeline Case Study
Traditional vs. SoC FPGA Design Flow A Video Pipeline Case StudyTraditional vs. SoC FPGA Design Flow A Video Pipeline Case Study
Traditional vs. SoC FPGA Design Flow A Video Pipeline Case Study
 
The Cloud & Its Impact on IT
The Cloud & Its Impact on ITThe Cloud & Its Impact on IT
The Cloud & Its Impact on IT
 
Task allocation on many core-multi processor distributed system
Task allocation on many core-multi processor distributed systemTask allocation on many core-multi processor distributed system
Task allocation on many core-multi processor distributed system
 
Using VisualSim Architect for Semiconductor System Analysis
Using VisualSim Architect for Semiconductor System AnalysisUsing VisualSim Architect for Semiconductor System Analysis
Using VisualSim Architect for Semiconductor System Analysis
 
Reconfigurable Computing
Reconfigurable ComputingReconfigurable Computing
Reconfigurable Computing
 
ES-Basics.pdf
ES-Basics.pdfES-Basics.pdf
ES-Basics.pdf
 
Mastering IoT Design: Sense, Process, Connect: Processing: Turning IoT Data i...
Mastering IoT Design: Sense, Process, Connect: Processing: Turning IoT Data i...Mastering IoT Design: Sense, Process, Connect: Processing: Turning IoT Data i...
Mastering IoT Design: Sense, Process, Connect: Processing: Turning IoT Data i...
 
Psim brochure press
Psim brochure pressPsim brochure press
Psim brochure press
 
HPC Impact: EDA Telemetry Neural Networks
HPC Impact: EDA Telemetry Neural NetworksHPC Impact: EDA Telemetry Neural Networks
HPC Impact: EDA Telemetry Neural Networks
 
hyperlynx_compress.pdf
hyperlynx_compress.pdfhyperlynx_compress.pdf
hyperlynx_compress.pdf
 
Track B- Advanced ESL verification - Mentor
Track B- Advanced ESL verification - MentorTrack B- Advanced ESL verification - Mentor
Track B- Advanced ESL verification - Mentor
 
ROLE OF DIGITAL SIMULATION IN CONFIGURING NETWORK PARAMETERS
ROLE OF DIGITAL SIMULATION IN CONFIGURING NETWORK PARAMETERSROLE OF DIGITAL SIMULATION IN CONFIGURING NETWORK PARAMETERS
ROLE OF DIGITAL SIMULATION IN CONFIGURING NETWORK PARAMETERS
 
Pratik Shah_Revised Resume
Pratik Shah_Revised ResumePratik Shah_Revised Resume
Pratik Shah_Revised Resume
 
Sudheer vaddi Resume
Sudheer vaddi ResumeSudheer vaddi Resume
Sudheer vaddi Resume
 

More from Deepak Shankar

Evaluating UCIe based multi-die SoC to meet timing and power
Evaluating UCIe based multi-die SoC to meet timing and power Evaluating UCIe based multi-die SoC to meet timing and power
Evaluating UCIe based multi-die SoC to meet timing and power
Deepak Shankar
 
Compare Performance-power of Arm Cortex vs RISC-V for AI applications_oct_2021
Compare Performance-power of Arm Cortex vs RISC-V for AI applications_oct_2021Compare Performance-power of Arm Cortex vs RISC-V for AI applications_oct_2021
Compare Performance-power of Arm Cortex vs RISC-V for AI applications_oct_2021
Deepak Shankar
 
Capacity Planning and Power Management of Data Centers.
Capacity Planning and Power Management of Data Centers. Capacity Planning and Power Management of Data Centers.
Capacity Planning and Power Management of Data Centers.
Deepak Shankar
 
Automotive network and gateway simulation
Automotive network and gateway simulationAutomotive network and gateway simulation
Automotive network and gateway simulation
Deepak Shankar
 
Using ai for optimal time sensitive networking in avionics
Using ai for optimal time sensitive networking in avionicsUsing ai for optimal time sensitive networking in avionics
Using ai for optimal time sensitive networking in avionics
Deepak Shankar
 
Designing memory controller for ddr5 and hbm2.0
Designing memory controller for ddr5 and hbm2.0Designing memory controller for ddr5 and hbm2.0
Designing memory controller for ddr5 and hbm2.0
Deepak Shankar
 
Introduction to Architecture Exploration of Semiconductor, Embedded Systems, ...
Introduction to Architecture Exploration of Semiconductor, Embedded Systems, ...Introduction to Architecture Exploration of Semiconductor, Embedded Systems, ...
Introduction to Architecture Exploration of Semiconductor, Embedded Systems, ...
Deepak Shankar
 
Develop High-bandwidth/low latency electronic systems for AI/ML application
Develop High-bandwidth/low latency electronic systems for AI/ML applicationDevelop High-bandwidth/low latency electronic systems for AI/ML application
Develop High-bandwidth/low latency electronic systems for AI/ML application
Deepak Shankar
 
Webinar: Detecting Deadlocks in Electronic Systems using Time-based Simulation
Webinar: Detecting Deadlocks in Electronic Systems using Time-based SimulationWebinar: Detecting Deadlocks in Electronic Systems using Time-based Simulation
Webinar: Detecting Deadlocks in Electronic Systems using Time-based Simulation
Deepak Shankar
 
Webinar on Functional Safety Analysis using Model-based System Analysis
Webinar on Functional Safety Analysis using Model-based System AnalysisWebinar on Functional Safety Analysis using Model-based System Analysis
Webinar on Functional Safety Analysis using Model-based System Analysis
Deepak Shankar
 
How to create innovative architecture using VisualSim?
How to create innovative architecture using VisualSim?How to create innovative architecture using VisualSim?
How to create innovative architecture using VisualSim?
Deepak Shankar
 
How to create innovative architecture using VisualSim?
How to create innovative architecture using VisualSim?How to create innovative architecture using VisualSim?
How to create innovative architecture using VisualSim?
Deepak Shankar
 
How to create innovative architecture using ViualSim?
How to create innovative architecture using ViualSim?How to create innovative architecture using ViualSim?
How to create innovative architecture using ViualSim?
Deepak Shankar
 
Is accurate system-level power measurement challenging? Check this out!
Is accurate system-level power measurement challenging? Check this out!Is accurate system-level power measurement challenging? Check this out!
Is accurate system-level power measurement challenging? Check this out!
Deepak Shankar
 
Architectural tricks to maximize memory bandwidth
Architectural tricks to maximize memory bandwidthArchitectural tricks to maximize memory bandwidth
Architectural tricks to maximize memory bandwidth
Deepak Shankar
 

More from Deepak Shankar (15)

Evaluating UCIe based multi-die SoC to meet timing and power
Evaluating UCIe based multi-die SoC to meet timing and power Evaluating UCIe based multi-die SoC to meet timing and power
Evaluating UCIe based multi-die SoC to meet timing and power
 
Compare Performance-power of Arm Cortex vs RISC-V for AI applications_oct_2021



Mirabilis_Presentation_DAC_June_2024.pptx

  • 2. Mirabilis Design: EDA software company based in Silicon Valley. Integrating sub-system teams to the mission using System-Level Design. Highly experienced management and engineering team with over 150 man-years of background in semiconductors, automotive and aerospace. VisualSim Architect: Design the Right Product. Graphical modeling and simulation platform with a complete set of system-level modeling IP. Eliminate all surprises prior to integration: optimize the specification, collaborate between mission, sub-systems and suppliers, evaluate use-cases and identify test scenarios for system validation. Timeline: Networking, 18th company and 32nd university; Electronics Modeling, 35th customer; 2008 Company Incorporated; 2011 First Engagement with HP and ISRO; 2013 Announced VisualSim; 2014 University Program, 10th Customer; 2015 Stochastic and Network modeling; 2016, 2018, 2019 Automotive & Avionics; 2020 System-level IP, Open API; 2022/23 Re-engineered AI, DNN, Power, GPU; 2021 Requirements Tracking, 50th customer.
  • 3. VisualSim - The Product: Spend time designing … not working on Word/Excel/PowerPoint
  • 4. VisualSim IP Library
      Communication: RF, Baseband, Channels, Communication systems, A/D transceivers, Antenna, Analog, Signal/audio/Image Processing
      Power: Power States, Allocation, Transition, Loss, Battery, Consumption, Management, Generation, Distribution, and Thermal
      Traffic: Sensors, Interfaces, Distribution, Traces, Software, VCD, ML, DNN
      Reports: Latency, Throughput, Utilization, Ave/peak power (instant, ave), hit-ratio, Heat, Temp
      RISC-V and Chiplets: SiFive, In-Order/Out-of-Order Generator, TileLink
      RTOS and Software: Generic RTOS, ARINC 653, AUTOSAR, Task Graph
      SoC: AMBA (AHB/APB/AXI/CHI), TileLink, CoreLink (600, 700), NoC (Generic, Arteris, Signature, OpenEdges), Virtual Channel, DMA, Crossbar, Serial Switch, Bridge, UCIe
      Board-Level: VME, PCI/PCI-X/PCIe 6.0, SPI 3.0, 1553B, FlexRay, CAN-FD/XL, AFDX, TTEthernet, OpenVPX
      Processors: ARM (M0-55), R5, Cortex (A8, A72, A53, A76, A77, A65, A78, A720), Nvidia (Pascal to Ampere), Generic GPU, µC, Leon, Power, X86, DSP (TI and ADI), Tensilica, Renesas SH, AI Engine, TPU
      Stochastic: Queue, Time Queue, Quantity Queue, Resources, Scheduler
      Custom Creator: Scripting, RegEx, Task graph, Use cases, Hardware Builder, C/C++/Java/Python, MATLAB, STK
      Storage: Flash, NVMe, Disk, SSD, NAS, Fibre Channel, FireWire
      Networking: TSN, AVB, 10BaseT1S, Switched Ethernet, Resilient Packet Ring, RP3, WiFi 802.11, Bluetooth, PAN, SpaceWire, SpaceFibre, IEEE 802.1Q, Time-Triggered Ethernet, AFDX, 5G
      Memory: Memory Controller, SDR, DDR DRAM 2, 3, 4, 5, LPDDR 2, 3, 4, 5, HBM2.0, HMC, QDR, RDRAM, MPMC, cache, Coherent cache
      FPGA: Xilinx (Versal, Zynq, UltraScale, Kintex), Altera (Stratix, Arria), Microsemi (SmartFusion), Programmable logic generator
      Trade-Off: Requirements, Thermal, Power, Performance, Failure Verification, Upgrade
  • 5. Assemble System Model using Pre-Built System-Level IP
      Scheduling/Arbitration: proportional share, WFQ, static, dynamic, fixed priority, EDF, TDMA, FCFS
      Communication Templates: Architecture #1, Architecture #2
      Computation Templates: DSP, AI, GPU, DRAM, CPU, FPGA, µE
      Blocks in the example architectures: RISC, DSP, LookUp, Cipher, AI, CPU, GPU, µE, DDR, arbitrated with TDMA, Priority, EDF, WFQ and static schemes
      Which architecture is better suited for our application?
  • 6. Add the Task Graph to Define the Workload
      Tasks task1 to task4 are mapped onto I/O, DSP, CPU1 and CPU2.
      Contention: limited resources; scheduling/arbitration
      Interference of multiple applications: limited resources; scheduling/arbitration; anomalies
      Complex behavior: input stream; data-dependent behavior
  • 7. Analyze the Results: a system with a faster bus is slower in places; unpredictable system response
  • 8. Impact of System Architecture Exploration
      • System sizing and topology design
      • Power consumption, cooling and management
      • Device distribution across one or multiple dies
      • Application mapping on CPU, GPU, TPU, DSP
      • SW, firmware, scheduler and network tuning
      • Merges Shift-Left and Shift-Right
      • The system-level model integrates requirements, creates a single model of the entire system, trades off power, performance and area, and generates tests
      • To optimize associated area
      • To design thermal structure
      • To create Chiplet IP industry
      • To meet timing and power
      • To meet mission requirements
      • Single platform from Concept to End-of-Life
      • Collaboration between design teams, suppliers, customers
  • 9. ARM Cortex A53
      Benchmark | FPGA | VisualSim | Difference | Comments
      ED1 | 5.94 ms | 6.425 ms | 7.55% | Integer processing
      MM | 12.084 ms | 11.863 ms | 1.08% | Mostly load operations with random addresses
      MM_st | 13.984 ms | 14.65 ms | 4.5% | Mostly store operations with random addresses
      Test system: Xilinx Zynq UltraScale+ XCZU9EG-2FFVB1156E MPSoC running on the ZCU102 board
      Specification: 4-core ARM Cortex A53 at 1200 MHz; 32 KiB I-cache; 32 KiB D-cache; 1 MiB L2; 2 GB DDR4-2400 DRAM
  • 10. Comparing Power for ARM Cortex A53
      Frequency | VisualSim Simulated Power | Measured Power (as reported by Anandtech) | Delta
      500 MHz | 0.037 W | 0.038 W | 2.63%
      600 MHz | 0.053 W | 0.051 W | -3.92%
      700 MHz | 0.073 W | 0.080 W | 8.75%
      800 MHz | 0.097 W | 0.090 W | -7.77%
      1000 MHz | 0.157 W | 0.159 W | 1.25%
      1100 MHz | 0.193 W | 0.188 W | -2.65%
      1200 MHz | 0.233 W | 0.227 W | -2.64%
      1300 MHz | 0.277 W | 0.269 W | -2.97%
      Source: Anandtech.com. Over 97% accuracy.
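The delta column on this slide appears consistent with (measured - simulated) / measured, with the result truncated to two decimals. The slide does not state the formula, so the Python sketch below treats it as an assumption and only checks the listed values against it; the names `A53_POWER`, `delta_percent` and `check_table` are illustrative.

```python
# Frequency (MHz) -> (VisualSim simulated W, measured W, delta % as listed on the slide)
A53_POWER = {
    500: (0.037, 0.038, 2.63),
    600: (0.053, 0.051, -3.92),
    700: (0.073, 0.080, 8.75),
    800: (0.097, 0.090, -7.77),
    1000: (0.157, 0.159, 1.25),
    1100: (0.193, 0.188, -2.65),
    1200: (0.233, 0.227, -2.64),
    1300: (0.277, 0.269, -2.97),
}


def delta_percent(simulated_w, measured_w):
    """Difference between measurement and simulation, in percent of the measurement.

    Assumption: the slide uses (measured - simulated) / measured.
    """
    return (measured_w - simulated_w) / measured_w * 100.0


def check_table(table, tolerance=0.01):
    """Return True if every listed delta matches the assumed formula within tolerance."""
    return all(
        abs(delta_percent(sim, meas) - listed) <= tolerance
        for sim, meas, listed in table.values()
    )


if __name__ == "__main__":
    for mhz, (sim, meas, listed) in sorted(A53_POWER.items()):
        computed = delta_percent(sim, meas)
        print(f"{mhz} MHz: computed {computed:+.2f}%, listed {listed:+.2f}%")
```

The 0.01 tolerance absorbs the difference between rounding and the truncation the slide seems to use.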
  • 11. Comparing Different Cores - Dhrystone
      Processor | MoP Hit Ratio (%) | MoP Mean Latency | I1 Hit Ratio (%) | I1 Mean Latency | D1 Hit Ratio (%) | D1 Mean Latency | L2 Hit Ratio (%) | L2 Mean Latency | DSU Hit Ratio (%) | DSU Mean Latency
      ARM Cortex A53 | - | - | 99.97 | 1.93E-09 | 99.98 | 2.02E-09 | 18.75 | 9.33E-08 | - | -
      ARM Cortex A77 | 99.90 | 1.75E-09 | 67.22 | 6.25E-08 | 99.96 | 7.32E-10 | 14.19 | 1.82E-07 | 6.96 | 2.05E-09
      RISC-V u74 | - | - | 99.98 | 4.15E-09 | 99.98 | 1.86E-09 | 39.58 | 5.25E-08 | - | -

      Processor | Instructions | Latency | Max MIPS
      ARM Cortex A53 | ~5,666,000 | 0.0055846 | ~1039
      ARM Cortex A77 | ~4,478,000 | 0.0011795 | ~3960
      RISC-V u74 | ~6,058,000 | 0.007726 | ~797
  • 12. VisualSim Drives Efficiency & Productivity. Project schedule (all times in months). Using Current Design Methodology: Model Creation (6), Implementation (18). Using VisualSim Design Methodology: Implementation (12). Communication and Refinement (4), Analysis (2.5), Model Creation (0.5), Analysis (1.5), Communication and Refinement (6). Time savings based on a 24-month project are 20-40%. Advantageous over a generic modeling environment due to shorter duration and greater applicability.
  • 13. VisualSim System Model using UCIe in ADAS SoC
  • 14. Vary Compute, Interconnect and Traffic
      Package_Type = Advanced
      Max_Link_Speed_GTps = 32
      Number of Modules = 4
      Tx_Buffer_Size = 8192 (no packets dropped)
      Protocol = PCIe_Gen6
      Flit_Size = 256 Bytes
      Num_of_Flits_per_Flow_Control_Check = 8
      Run Simulation with Different Configurations and Topology
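A minimal sketch of how such a configuration sweep could be enumerated before handing each variant to the simulator. The baseline values come from the slide; the alternative values in `SWEEP`, the `PCIe_Gen5` label, the underscored `Number_of_Modules` key and the `configurations` helper are illustrative assumptions, and the step that actually runs a VisualSim simulation is deliberately left out.

```python
from itertools import product

# Baseline taken from the slide (units: GT/s, bytes).
BASELINE = {
    "Package_Type": "Advanced",
    "Max_Link_Speed_GTps": 32,
    "Number_of_Modules": 4,
    "Tx_Buffer_Size": 8192,
    "Protocol": "PCIe_Gen6",
    "Flit_Size": 256,
    "Num_of_Flits_per_Flow_Control_Check": 8,
}

# Hypothetical alternatives to explore; these values are NOT from the slide.
SWEEP = {
    "Max_Link_Speed_GTps": [16, 32],
    "Flit_Size": [68, 256],
    "Protocol": ["PCIe_Gen5", "PCIe_Gen6"],
}


def configurations(baseline, sweep):
    """Yield one full parameter set per combination of swept values."""
    keys = list(sweep)
    for values in product(*(sweep[key] for key in keys)):
        config = dict(baseline)          # copy so the baseline stays untouched
        config.update(zip(keys, values))
        yield config


if __name__ == "__main__":
    for index, config in enumerate(configurations(BASELINE, SWEEP), start=1):
        print(index, config["Protocol"], config["Max_Link_Speed_GTps"], config["Flit_Size"])
```

Enumerating the full cross product keeps every run reproducible; for larger sweeps a sampled subset may be more practical.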
  • 15. Add the Power and Thermal
      Power Generation: 4 types of power generators in VisualSim (constant, variable, motor, solar charge); charge sent to battery
      Power Storage: different charging schemes; impact of surge and shocks; battery lifecycle; battery consumption; statistics
      Power Consumption: state-based power consumption of electronics (controller, SoC) and mechanical parts (brakes, wheels); average, instant and cumulative; power per device and application
      Power Management: change in power state controlled by time, utilization, temperature and expected activity
      Thermal Management: heat and temperature; impact of cooling strategy; add impact of power spikes
      Verification and Debugging: optimize and test the power management algorithms; sizing of power generators and battery; optimize the schedule, supply net and voltage; estimate power consumed by the software application
      Downstream Integration: generate UPF file with power domains and associated voltage levels; generate SystemVerilog power testbench; generate power-state change VCD dump
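State-based power consumption, with average, peak and cumulative figures, can be illustrated with a small calculation. This is a sketch under stated assumptions, not VisualSim's implementation: the state names, the wattages in `POWER_TABLE_W` and the `power_stats` helper are hypothetical.

```python
# Hypothetical power table: watts drawn in each power state (illustrative values only).
POWER_TABLE_W = {"Active": 1.0, "Standby": 0.25, "Off": 0.0}


def power_stats(timeline, power_table):
    """Compute (average_w, peak_w, energy_j) for a list of (state, duration_s) entries."""
    total_time_s = sum(duration for _, duration in timeline)
    if total_time_s <= 0:
        raise ValueError("timeline must cover a positive duration")
    # Cumulative energy: instantaneous power of each state times the time spent in it.
    energy_j = sum(power_table[state] * duration for state, duration in timeline)
    # Peak power: the highest instantaneous power of any state actually visited.
    peak_w = max(power_table[state] for state, duration in timeline if duration > 0)
    return energy_j / total_time_s, peak_w, energy_j


if __name__ == "__main__":
    timeline = [("Active", 2.0), ("Standby", 6.0), ("Off", 2.0)]
    average_w, peak_w, energy_j = power_stats(timeline, POWER_TABLE_W)
    print(f"average {average_w:.2f} W, peak {peak_w:.2f} W, cumulative {energy_j:.2f} J")
```

A power management policy then amounts to choosing when the timeline switches state, for example on idle time or temperature, and comparing the resulting averages.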
  • 16. Power Modeling Integration
      Inputs: Behavior Task Graph, Power Table, Power Management Unit
      Outputs: SystemVerilog Output for Power System Test, VCD Waveform for Verification, and a UPF file such as:
      create_power_domain PD_Top -include_scope
      create_power_domain -name PD_1_2.0 -elements {"CLKMUX"}
      create_power_domain -name PD_1_1.0 -elements {"PLL","G2","G3"}
      create_power_domain -name PD_1_3.0 -elements {"PROC"}
      create_supply_port -port VDD_1.0 -direction in -domain PD_Top
      create_supply_port -port VDD_2.0 -direction in -domain PD_Top
      create_supply_port -port VDD_3.0 -direction in -domain PD_Top
      create_supply_port -port VSS_0.0 -direction in -domain PD_Top
      create_supply_net VDD_1.0 -domain PD_Top
      create_supply_net VDD_2.0 -domain PD_Top
      create_supply_net VDD_3.0 -domain PD_Top
      create_supply_net VSS_0.0 -domain PD_Top
      connect_supply_net VDD_1.0 -ports VDD_1.0
      connect_supply_net VDD_2.0 -ports VDD_2.0
      connect_supply_net VDD_3.0 -ports VDD_3.0
      connect_supply_net VSS_0.0 -ports VSS_0.0
      add_power_state PD_1_2.0 -state Active {-supply_expr (VDD_2.0 == {ON, 2.0}) && (VSS_0.0 =={ON,0.0})}
      add_power_state PD_1_2.0 -state OFF {-supply_expr (VDD_2.0 == {OFF, 0.0}) && (VSS_0.0 =={ON,0.0})}
      add_power_state PD_1_1.0 -state Active {-supply_expr (VDD_1.0 == {ON, 1.0}) && (VSS_0.0 =={ON,0.0})}
      add_power_state PD_1_1.0 -state OFF {-supply_expr (VDD_1.0 == {OFF, 0.0}) && (VSS_0.0 =={ON,0.0})}
      add_power_state PD_1_3.0 -state Active {-supply_expr (VDD_3.0 == {ON, 3.0}) && (VSS_0.0 =={ON,0.0})}
      add_power_state PD_1_3.0 -state OFF {-supply_expr (VDD_3.0 == {OFF, 0.0}) && (VSS_0.0 =={ON,0.0})}
  • 17. System Verification
      • Validate the product, not just HW/SW
      • Application-relevant test vectors
      • Generate test cases and run against RTL
      • Compare simulation output against RTL
      • Match architecture timing within range
      • Verify functional correctness
      • Task sequencing @ DSP/uP
      • Resource contention
      Eliminate product failure by maximizing relevant verification.
      Flow: Architecture model of IP (Golden Reference) and Verilog/C/Hardware feed a Comparator that matches tags.
  • 19. Architecting Hardware-Software for Infotainment System (Mirabilis Design Confidential)
      Block diagram: Video Camera, PCIe, SRAM, CPU, GPU, Display Ctrl, Display IO and DRAM on an AMBA AXI Bus, carrying packets
      System Overview
      • Camera: 30 fps, VGA equivalent
      • CPU: Multi-core ARM Cortex-A53, 1.2 GHz
      • GPU: 64 cores (8 warps × 8 PEs), 32 threads, 1 GHz
      • DisplayCtrl: DisplayBuffer 293,888 bytes
      • SRAM: SDR, 64 MB, 1.0 GHz
      • DRAM: DDR3, 64 MB, 2.4 GHz
      Explore at the board and semiconductor level to size uP/GPU, memory bandwidth and bus/switch configuration
  • 20. System Model of an Infotainment System: NXP i.MX6 / nVIDIA Drive PX, Xilinx FPGA Kintex 8, Discrete DMA, ARM A53, GPU, Display Ctrl, SRAM3, DRAM3, Video IN, Video OUT, Parameters
  • 21. Conducting Architecture Trade-off
      • By changing the amount of video input data (number of packets), observe the SRAM -> DRAM transfer performance and determine the maximum video input rate the system can tolerate.
      • Plot annotations: 210 Packet/Sec, 12 ms; 21 Packet/Sec, 41.4 us; 300 Packet/Sec
      • 250 Packet/Sec is the system limit
      • At 300 Packet/Sec, the simulation cannot be executed due to FIFO buffer overflow.
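The overflow behaviour described on this slide can be illustrated with a small single-server FIFO model. This is a sketch, not the VisualSim model: the service rate of 250 packets/sec is taken from the stated system limit, while the FIFO depth of 64, the 10-second window, the evenly spaced arrivals and the function names are assumptions.

```python
from collections import deque


def peak_fifo_occupancy(arrival_rate, service_rate, duration_s):
    """Peak number of packets held in a FIFO fed at a constant rate.

    Packets arrive evenly spaced and are drained one at a time,
    each taking 1 / service_rate seconds.
    """
    in_fifo = deque()      # departure times of packets still queued or in service
    last_finish = 0.0
    peak = 0
    for k in range(int(arrival_rate * duration_s)):
        now = k / arrival_rate
        # Drop packets that have already left the FIFO by this arrival time.
        while in_fifo and in_fifo[0] <= now:
            in_fifo.popleft()
        start = max(now, last_finish)
        last_finish = start + 1.0 / service_rate
        in_fifo.append(last_finish)
        peak = max(peak, len(in_fifo))
    return peak


def overflows(arrival_rate, service_rate=250.0, depth=64, duration_s=10.0):
    """True if the FIFO would need more than `depth` entries (depth is hypothetical)."""
    return peak_fifo_occupancy(arrival_rate, service_rate, duration_s) > depth


if __name__ == "__main__":
    for rate in (21, 210, 250, 300):
        print(rate, "packets/sec ->", "overflow" if overflows(rate) else "ok")
```

Once the arrival rate exceeds the drain rate the backlog grows without bound, so any finite FIFO eventually overflows; the depth only sets how long that takes.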
  • 23. Mapping Algorithm to Multi-Resources
      Image Processing Algorithm mapped onto Standard HW Library Components. Basic/Starting Configuration:
      Grayscale_Conversion - PS [A72 Core 1]
      IIR - Logic (PL)
      FFT - AI Engine Tile
      Edge_Image - Logic (PL)
      iFFT - AI Engine Tile
      Edge_Image_Enhancement - Logic (PL)
      Segmentation - PS [A72 Core 2]
  • 24. Experiments with Different Implementations
      Run 1 - Base Configuration Mapped to Logic and ARM: application latency increases over time, driven by Segmentation. Next step: remap the Segmentation task to AI Tiles.
      Run 2 - Segmentation Mapped to AI Engine: application latency stays in a bounded range, but NoC utilization is high. Next step: change the interconnect for Segmentation from NoC to Direct.
      Run 3 - Using Direct Path between Logic and AI: latency is deterministic, the latency requirement (app latency < 80 msec) is met, and utilization across the NoC is acceptable.
  • 25. VisualSim Chiplet Solution Using the Chiplet Library to Design SoC
  • 26. ADAS SoC Block Diagram
      Blocks: UCIe, AI Engine Tiles, Warp Scheduler, PE, Local Mem, GPU, Memory chiplet, ADC, DDR5, Processor subsystem, Core, L1, Bus, SLC
      • Optimal mesh size (m x n)?
      • Best sample size (16 bytes vs 32 bytes, etc.)?
      • Use a single protocol stack or a multi-protocol stack?
      • Do we need PCIe Gen6, or can Gen5 still meet the application requirements?
  • 27. VisualSim System Model using UCIe in ADAS SoC
  • 28. Statistics for Multi-Die SoC
      • Note the AI Engine latency spikes.
      • For multi-protocol, each protocol gets half the bandwidth.
      • Older-generation protocols are mixed with PCIe 6.
      • A lower FLIT size increases latency.
  • 29. Comparing Different Configurations using UCIe Interface. Configuration 1: all Die Adapters using PCIe 6.0. Configuration 2: Die Adapters using PCIe 6.0 and Streaming Protocols (AXI). Latency is lower when using PCIe 6.0.
  • 31. Mask Region-CNN (MR-CNN) for object detection and image segmentation. Figures: overall representation of the Mask R-CNN model; network architecture of Mask R-CNN, from CPU preprocessing through to output and CPU postprocessing.
  • 32. Using ChatGPT to translate the AI model (Mask R-CNN) into a VisualSim Task Graph
      • Each layer is defined as a separate task in the task graph, and the dependencies between them are modeled.
      • A database lists the layers/functions and the parameters associated with them.
      • These are used to determine the number of Multiply-Accumulate (MAC) operations for each layer/function.
      • Outputs: class, box, mask
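As a sketch of how per-layer MAC counts can be derived from such a layer database, the Python below uses the standard counts for convolution and fully connected layers. The layer names and shapes are illustrative placeholders, not the actual Mask R-CNN database from the slides, and the class names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConvLayer:
    name: str
    out_h: int      # output feature-map height
    out_w: int      # output feature-map width
    in_ch: int      # input channels
    out_ch: int     # output channels (number of filters)
    kernel: int     # square kernel size

    def macs(self):
        # One MAC per kernel element, per input channel, per output element.
        return self.out_h * self.out_w * self.out_ch * self.in_ch * self.kernel ** 2


@dataclass(frozen=True)
class FullyConnected:
    name: str
    in_features: int
    out_features: int

    def macs(self):
        # One MAC per weight in the dense matrix.
        return self.in_features * self.out_features


# Illustrative layer database; shapes are placeholders, not taken from the slides.
LAYERS = [
    ConvLayer("conv1", out_h=112, out_w=112, in_ch=3, out_ch=64, kernel=7),
    FullyConnected("class_head", in_features=1024, out_features=81),
]

if __name__ == "__main__":
    for layer in LAYERS:
        print(f"{layer.name}: {layer.macs():,} MACs")
    print(f"total: {sum(layer.macs() for layer in LAYERS):,} MACs")
```

Each task in the task graph can then carry its MAC count as the workload that is mapped onto the processing elements.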
  • 33. VisualSim Model of DNN Hardware and Task Graph. The application sequence from the Task Graph is mapped to the HW architecture.
      • PE array: 12x14
      • 4-level memory hierarchy
      • Power computation per PE, buses and memory
  • 34. Results - Base model (168 AI cores, 90% data availability at SRAM)
      • Peak power consumption is around 10.8 W
      • Obtained FPS = 0.414
  • 35. Results - 8x8 (64) cores, 90% data availability at SRAM
      • Peak power consumption is around 5.6 W because the number of cores was reduced
      • Obtained FPS = 0.29, lower than the base model because fewer resources were available for MAC operations
  • 36. Results - 100% data availability at SRAM, 168 cores
      • The number of off-chip memory accesses was reduced. The only accesses made were to load the images and weights into the SRAM
      • Obtained FPS = 9.93, higher than the base model because the number of off-chip memory accesses was reduced
      • Peak power consumption (10.4 W) is lower because off-chip memory accesses were reduced
  • 37. Results - 60% data availability at SRAM, 168 cores
      • The number of off-chip memory accesses was increased
      • Obtained FPS = 0.04, lower than the base model because the number of off-chip memory accesses was increased
  • 39. SoC System Specification
      Processor Core: RISC-V or ARM A53 core
      Processor Speed: 1200 MHz
      L1 Cache: I-Cache 32 KB, 2-way set associative; D-Cache 32 KB, 4-way set associative
      L2 Cache: Size 1 MB; Associativity 16-way
      Ext DRAM: Size 4 GB; Type DDR4; Speed 2400 MHz
      HW Accelerator: Speed 100 MHz
      Software: Multimedia task; Stochastic instruction trace
      Goals: Peak Power < 1.0 W; Number of Matrices > 19K
  • 40. VisualSim SoC Model (MPEG Application): Behavior, Hardware, Bus Topology
      IP or RISC-V level: evaluate pipeline stages; width, speed; number of execution units, levels of cache
      SoC: number of RISC-V cores; accelerators; cache memory hierarchy and coherence
      System level: development of an IoT device, ECU or an integrated platform
  • 41. CASE 1: All SW tasks. Observations: 1. Avg power consumption within requirements (< 1.0 W) 2. Performance requirement not achieved (only a max of 9.4K frames)
  • 42. Sequence diagram: the Rotate Frame task is found to be resource intensive
  • 43. CASE 2: Run Rotate Frame task on HW Accelerator. Observations: 1. Avg power consumption requirement not met (> 1.3 W) 2. Performance requirement achieved (max of 19.9K frames)
  • 44. CASE 3: Run Rotate Frame task on HW Accelerator + Power management. Observations: 1. Avg power consumption requirement met (< 1.0 W) 2. Performance requirement achieved (max of 19.8K frames)
  • 46. Generated Statistics
      • Per-execution-unit stats, stall percentages and buffer occupancies are reported.
      • Detailed cache, bus and memory stats are generated per simulation.
      • Stats include hit ratio, throughput, latency, number of write-backs, evictions, etc.
  • 49. Use cases
      Run Num | Description | M4 (Latency) | M55 (Latency) | U74 (Latency)
      1 | Running Dhrystone on core. No cache/bus/memory access | 5.576700039E-4 | 9.47200014E-5 | 1.77875568E-5
      2 | Cache/Bus/Memory access | 8.7438000752E-4 | 1.6319750281E-4 | 5.05307708E-5
      * The number of loops is different for each core
  • 51. ECU Performance Analysis under Different Use Cases
      Demo environment: 1. Brake ECU integrated into a CAN network 2. Sensors write data to the memory 3. Brake Pedal or Proximity Sensor triggers the braking action from the Brake ECU
      ECU: uses a RISC-V processor for the Brake ECU
      Analysis: 1. Latency (time taken for the signal to reach all the wheels from the Brake ECU) 2. Processor performance (MIPS) 3. Power consumption (braking activity, ECU usage and network activity)
      6/28/2024 Mirabilis Design Inc. 52
  • 52. System Overview
      Gateway: transfers messages between different CAN networks
      CAN Bus: the network that connects sensors and ECUs
      Nodes: Wheel 1, Wheel 2, Wheel 3, Wheel 4, Gateway, Engine, Proximity Sensor, Brake Pedal, Gyro Sensor, Road condition sensor and ECU, connected over CAN buses
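For a rough sense of the network contribution to brake latency, the worst-case transmission time of a single classical CAN frame can be estimated from the bit rate. The sketch below uses the commonly cited worst-case bit count for 11-bit identifiers including bit stuffing; the 500 kbit/s bit rate and 8-byte payload are assumptions, since the slides do not state them, and queuing and gateway delays are ignored.

```python
def worst_case_frame_bits(payload_bytes):
    """Worst-case bits on the wire for a classical CAN frame with an 11-bit identifier.

    47 framing bits (including the 3-bit interframe space) plus the payload,
    plus the maximum number of stuff bits over the 34 + 8n stuffable bits.
    """
    if not 0 <= payload_bytes <= 8:
        raise ValueError("classical CAN carries 0 to 8 data bytes")
    data_bits = 8 * payload_bytes
    stuff_bits = (34 + data_bits - 1) // 4
    return 47 + data_bits + stuff_bits


def frame_time_s(payload_bytes, bit_rate_bps=500_000):
    """Transmission time of one frame; the 500 kbit/s default is an assumed bit rate."""
    return worst_case_frame_bits(payload_bytes) / bit_rate_bps


if __name__ == "__main__":
    bits = worst_case_frame_bits(8)
    print(f"{bits} bits, {frame_time_s(8) * 1e6:.0f} us per frame at 500 kbit/s")
```

A brake command crossing the gateway to a wheel node would pay at least this time on each CAN segment it traverses, before arbitration and processing delays.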
  • 53. Automotive Network System
      Legend: N = CAN Node; lines = CAN Wire
      Nodes: Wheel1, Wheel2, Wheel3, Wheel4, Brake Pedal, Proximity Sensor, Gyro Sensor, Road condition sensor, Engine, Gateway and ECU, each attached through a CAN node to one of three CAN buses
  • 54. VisualSim Model RISC-V Model location: VS_ARdemo automotiveBr ake_Model_W ith_ECU_A53 Brake_CAN_m odel_ECU_ne w_RISC-V.xml
  • 55. Configuration of the ECU/Processor
      Processor Spec
      1. Processor (ECU): RISC-V, 5 pipeline stages
      2. Number of cores: 1-2
      3. Processor Speed: 100 MHz - 1.2 GHz
      4. DRAM Type: DDR3 SDRAM (Synchronous DRAM)
      5. DRAM Speed Range: 400 - 1066 MHz
      6. Cache Speed: 500 MHz
      7. Cache Size: 64 KB
      8. Memory Controller: DDR3, 750 MHz
      9. Bus: CAN
      ECU Data Input
      1. Wheels 2. Engine 3. Proximity Sensor 4. Brake Pedal 5. Gyro Sensor 6. Road Condition Sensor
  • 56. Designing Brake ECU using Single Core - RISC-V
  • 57. Results - single core RISC-V: slight improvement in Processor Task Latency at a few instances