Architecture for Massively Parallel
             HDL Simulations
                Rich Porter
               Art of Silicon




1
Art of Silicon
●   Founded in 2005
●   Bristol based
●   Multimedia centric Silicon IP
●   Bespoke IP creation
●   Consultancy




2
It's all about MONEY
●   Engineer productivity is key to success
       –   Employment cost is $$$s
       –   Tooling cost
       –   Computers
●   Makes good business sense to improve
    productivity
       –   Less time per chip
       –   More chips per man year
●   Productive engineers are happy engineers!
3
Coding Time
●   Engineers do NOT spend all their time
    writing code
       –   5% writing code
       –   95% debugging code
               ●   Thinking & waiting for waves, regressions
●   Highly desirable to quantify the effect of
    any design delta
       –   As quickly & as easily as possible
       –   For all engineers & management

4
Minimizing Idle Time
●   Scalable test engine
         –   40+ instances per engineer
         –   Up to as many as they can use
●   Large compute farm of dense elements
    –   8, 12, 16, 24 core boxes
●   Reduce iterations during the day
●   Provide full regression bandwidth at night,
    weekends
●   So how do I do this?
5
Verilator
●   Used to generate standalone license-free
    test executable
●   Execute as many as your compute
    infrastructure can support
●   Deliver to other teams
       –   Architecture
       –   Systems
       –   Toolchain
       –   Customer
6
Signoff
●   Event driven simulators are gold standard
    for signoff
       –   Gate level simulations
●   HAVE to run in this environment too
●   Spending engineer time to create 2
    environments costs $s
       –   Gratuitous creation of extra work



7
Architecture
●   Single testbench architecture
       –   Ease of maintenance
       –   Issue replication
       –   Portable to silicon
●   It is essential that test stimulus and cycle
    level behaviour are identical across
    platforms
       –   Even though one platform is event driven
            and the other is cycle based

8
What AoS did
●   Stimulus & checker
       –   in C++
       –   Identical code
●   'Gaskets' used
       –   Provides uniform I/F
       –   interface->signal = value;
       –   value = interface->signal->signed();
●   All design code in regular verilog
[Diagram: DUV, checker, model]
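
A gasket of this kind might look like the following C++ sketch. The class names (SignalBackend, FakeBackend, Signal) and the method as_signed() are hypothetical; the slide's signed() is renamed here because `signed` is a C++ keyword. The deck does not show the actual AoS implementation, so this only illustrates the idea of one interface with a per-simulator backend behind it.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <utility>

// Hypothetical backend interface. One implementation could wrap the ports of
// a Verilated model, another could wrap PLI handles in an event driven simulator.
class SignalBackend {
public:
    virtual ~SignalBackend() = default;
    virtual void write(const std::string& name, uint64_t value) = 0;
    virtual uint64_t read(const std::string& name) const = 0;
};

// Stand-in backend for illustration only: it stores values in a map.
class FakeBackend : public SignalBackend {
public:
    void write(const std::string& name, uint64_t value) override { values_[name] = value; }
    uint64_t read(const std::string& name) const override {
        auto it = values_.find(name);
        return it == values_.end() ? 0 : it->second;
    }
private:
    std::map<std::string, uint64_t> values_;
};

// A named signal of a given bit width (assumed to be 1..64) seen through the gasket.
class Signal {
public:
    Signal(SignalBackend& backend, std::string name, unsigned width)
        : backend_(backend), name_(std::move(name)), width_(width) {}

    // Drive the signal; the value is truncated to the signal width.
    Signal& operator=(uint64_t value) {
        backend_.write(name_, value & mask());
        return *this;
    }

    // Sample the signal as an unsigned value.
    uint64_t as_unsigned() const { return backend_.read(name_) & mask(); }

    // Sample the signal as a sign-extended value.
    int64_t as_signed() const {
        const uint64_t raw = as_unsigned();
        const uint64_t sign = uint64_t{1} << (width_ - 1);
        return static_cast<int64_t>((raw ^ sign) - sign);
    }

private:
    uint64_t mask() const {
        return width_ >= 64 ? ~uint64_t{0} : (uint64_t{1} << width_) - 1;
    }
    SignalBackend& backend_;
    std::string name_;
    unsigned width_;
};
```

Stimulus and checker code written against Signal never touches a simulator API directly, which is what would allow the same test source to be linked against either platform.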
9
Example – verilog
     initial begin
     `ifdef verilator
       $c("aos_simctrl_init(&", rst_cnt,
        ", &", cycles,
        ", &", std_simctrl_finish_r, ");"
       );
     `else
       $aos_simctrl_init(
         rst_cnt,
         cycles,
         std_simctrl_finish_r
       );
     `endif
     end

     always @(posedge clk1)
     `ifdef verilator
       $c("aos_simctrl_clk();");
     `else
       $aos_simctrl_clk;
     `endif




10
Autogeneration
●    Single sourced from SPIRIT XML
     description
     –   Verilog
                   ●   module declarations
                   ●   grey box instantiation
                   ●   verilog side gasket instantiation
     –   C++
                   ●   headers
                   ●   port descriptions (name, size, direction)
                   ●   PLI wrappers

11
Test Configuration
●    Tests were configured at execution using
     plusargs
        –   +argument=value
●    Scanning code was identical for both simulators
        –   To ensure consistency
●    Regression script
        –   Created these configurations
        –   Passed them to batch scheduler
        –   Collated results for web presentation
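
As a rough illustration of a shared plusarg scanner, the sketch below collects +argument=value pairs from a command line into a map. The function name scan_plusargs and the argument names used in the usage note are hypothetical, not taken from the AoS code base.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Collect every "+name=value" (or bare "+name") argument into a map.
// Arguments that do not start with '+' or that have an empty name are ignored.
std::map<std::string, std::string> scan_plusargs(const std::vector<std::string>& args) {
    std::map<std::string, std::string> found;
    for (const std::string& arg : args) {
        if (arg.size() < 2 || arg[0] != '+') continue;
        const std::size_t eq = arg.find('=');
        if (eq == 1) continue;                    // "+=value" has no name
        if (eq == std::string::npos) {
            found[arg.substr(1)] = "";            // flag with no value
        } else {
            found[arg.substr(1, eq - 1)] = arg.substr(eq + 1);
        }
    }
    return found;
}
```

Called with the arguments sim +seed=42 +waves, this returns a map in which seed maps to "42" and waves maps to an empty string. Keeping one such function in the common C++ layer is one way to make both platforms interpret a configuration identically.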
12
Logging
●    Unified logging interface
        –   Consistent output
        –   Captured in markup to preserve semantics
        –   No kludgy parsing of text post mortem
             that varies from platform to platform
●    Test fail produces identical logs across
     platforms
        –   No problems with spaces, split lines,
             mixed streams, output ordering
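
One way to capture log output in markup is sketched below. The deck does not say which markup AoS used; the XML-like element and attribute names here (record, severity, cycle, file, line) are assumptions made purely for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Escape the characters that are special in XML text and attribute values.
std::string xml_escape(const std::string& text) {
    std::string out;
    for (char c : text) {
        switch (c) {
            case '&': out += "&amp;"; break;
            case '<': out += "&lt;"; break;
            case '>': out += "&gt;"; break;
            case '"': out += "&quot;"; break;
            default:  out += c;
        }
    }
    return out;
}

// Format one log record as a single element, so that a later tool can read
// the fields directly instead of pattern matching free-form text.
std::string log_record(const std::string& severity, uint64_t cycle,
                       const std::string& file, int line,
                       const std::string& message) {
    return "<record severity=\"" + xml_escape(severity) +
           "\" cycle=\"" + std::to_string(cycle) +
           "\" file=\"" + xml_escape(file) +
           "\" line=\"" + std::to_string(line) + "\">" +
           xml_escape(message) + "</record>";
}
```

Because each record carries its own fields, spacing, line splitting and stream interleaving in the raw output stop mattering to the tools that read it.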

13
Triage
●    Group messages by content, code, file,
     line
        –   Placing most numerous first
●    Order by cycles within
        –   Increasing cycle count to failure
●    Failure at the top is debug candidate
●    Web I/F can provide rich set of
     filters/collation options
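
The grouping and ordering described on this slide can be sketched as follows. The Failure struct and its field names are hypothetical; the real record layout is not shown in the deck.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <tuple>
#include <utility>
#include <vector>

// One failing message as it might be read back from a log (hypothetical fields).
struct Failure {
    std::string test;     // test that produced the message
    std::string message;  // message content
    int code;             // message code
    std::string file;     // source file that raised it
    int line;             // source line that raised it
    uint64_t cycle;       // cycle count at which the failure occurred
};

using Key = std::tuple<std::string, int, std::string, int>;  // content, code, file, line

// Group failures by key, sort each group by increasing cycle count, then
// order the groups with the most numerous first.
std::vector<std::vector<Failure>> triage(const std::vector<Failure>& failures) {
    std::map<Key, std::vector<Failure>> groups;
    for (const Failure& f : failures) {
        groups[Key{f.message, f.code, f.file, f.line}].push_back(f);
    }
    std::vector<std::vector<Failure>> ordered;
    for (auto& entry : groups) {
        std::vector<Failure>& group = entry.second;
        std::stable_sort(group.begin(), group.end(),
                         [](const Failure& a, const Failure& b) { return a.cycle < b.cycle; });
        ordered.push_back(std::move(group));
    }
    std::stable_sort(ordered.begin(), ordered.end(),
                     [](const std::vector<Failure>& a, const std::vector<Failure>& b) {
                         return a.size() > b.size();
                     });
    return ordered;
}
```

The first element of the first group is then the debug candidate: the most common failure signature, in the test that reaches it in the fewest cycles.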

14
Performance
●    Performance has steadily increased

      [Bar chart: simulation speed in kHz (axis 0 to 200) for
       E5405 @ 2000MHz, E5345 @ 2333MHz, X5472 @ 3000MHz and
       X3430 @ 2400MHz; bar values not recoverable]

●    Memory footprint remains low
            –    No simulation kernel
15
Key Features
●    Single testbench providing cycle accurate
     stimulus
●    Autogeneration from single source
●    Single test configuration
●    Unified logging




16
Conclusion / AoS Experience
●    Little time spent waiting for simulations
        –   60-wide queue gave 6 MHz effective
        –   Engineers spent time really debugging
●    Small amount of time spent bringing up
     each simulator
        –   Not much, but was very early on
●    Verilator proved robust
●    See no reason why this cannot scale to 1000s
     of simulations
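
The 6 MHz figure is an aggregate rate. As a back-of-envelope check, 60 concurrent simulations averaging about 100 kHz each would give that number; the per-simulation average is inferred from the two quoted figures and is not stated in the deck. The helper name effective_mhz is hypothetical.

```cpp
#include <cassert>

// Aggregate simulated clock rate when `slots` simulations run concurrently,
// each advancing at `per_sim_khz` simulated kHz. Ignores scheduling overhead.
double effective_mhz(int slots, double per_sim_khz) {
    return slots * per_sim_khz / 1000.0;
}
```

Under the same assumption, a 1000-wide queue would correspond to roughly 100 MHz effective, provided the farm and the scheduler keep every slot busy.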
17
Recommendations
●    Get a compute farm
●    Use a queue
●    Replace CPUs regularly
●    Prioritize high value jobs
     –   $ licenses
     –   interactive jobs
●    Evaluate what verilator can do for you
●    Profile, record and report
18
Questions
●    Questions



                  Rich Porter
          rich.porter@artofsilicon.com




19

