Accelerator Memory Subsystem Architecture and Performance Modeling

(US) Santa Clara CA, Austin TX, Portland OR, Fort Collins CO
Engineering – Silicon Engineering
Full-time
Join a cutting-edge, well-funded hardware startup in Silicon Valley as an Accelerator Performance Architect focused on memory subsystems. Our mission is to reimagine silicon and create RISC-V based accelerated computing platforms that will transform the industry. You will have the opportunity to work with some of the most talented and passionate engineers in the world to create designs that push the envelope on performance, energy efficiency, and scalability. We offer a fun, creative, and flexible work environment, with a shared vision to build products that change the world.


    • In-depth knowledge of memory subsystem architecture, microarchitecture, and design, including caches, NoC, and LPDDR/DDR/HBM.
    • Expert in performance modeling using C/C++, with experience across modeling techniques, from analytical modeling to event-driven and cycle-accurate modeling.
    • Knowledge and experience with common performance benchmarks and workloads in the ML space.
    • Ability to work well in a team and be productive under aggressive schedules.
    • Proficiency in SystemVerilog, C or C++, and scripting languages such as Python.
    • Experience with high-level simulators for power estimation is a plus.
    • Excellent problem-solving, written and verbal communication, and organizational skills; highly self-motivated.


    • Define the architecture and microarchitecture of memory subsystem components for data-parallel accelerators, including caches, interconnects, HBM, DDR, and LPDDR.
    • Develop Performance Models in C++ for design space exploration to achieve optimal performance under area and power constraints.
    • Performance exploration and correlation: explore high-performance strategies and validate that the RTL design meets its performance targets.
    • Develop performance verification tests to ensure the quality of the model and design.
    • Analyze ML workloads with a focus on improving memory subsystem performance.
Education and Experience

Bachelor’s degree plus 4 years of industry experience.
Master’s degree plus 2 years of industry experience.
Ph.D. with internship experience.