Software Engineer, Model Inference

Palo Alto / R&D / Full-Time Employment / On-site
Lamini AI is at the forefront of bringing LLMs to production. We are on a mission to help every company unlock the power of generative AI by putting its own data to work. Our team is made up of highly experienced ML engineers and tech industry veterans, and we’re backed by leading computing and technology companies.

About the role:
We are looking for a systems expert to join us as a founding engineer: someone who’s excited to reinvent programming languages with AI and enable an AI-powered software engineering revolution. We can teach you the AI part, as we’ve taught over 100k people around the world.

You’ll be working directly with the founders, influencing the product direction, and playing a key role in leading and growing our team. 

Most of all, we’re looking for someone who is eager to iterate fast on building new ML cloud systems and is hungry to make and own major contributions.

Must have:
2+ years of experience building and rapidly prototyping production cloud-based software

2+ years of experience in researching or contributing to ML/DL systems and frameworks

Strong coding skills (in at least one of Python and C++)

Solid fundamentals in machine learning and deep learning topics

Demonstrated fluency with data structures, algorithms, architecture, and agile software best practices in any language

Desire to work in an inclusive and collaborative environment

An interest in continually learning from others, teaching others, and digging into new challenges

Nice to have: 
Desire to create speed-of-light training and inference systems for next-generation AI

Deep technical expertise in machine learning systems, e.g., TinyML, Triton, CUDA, ROCm, Exo, MLIR, or Halide

Experience as the software architect of a programming system or language

Solid fundamentals in other computer science and computer engineering topics: algorithms and data structures, operating systems, computer architecture, etc.

Experience with GPU architecture and programming: CUDA and its related libraries and toolkits (e.g., cuDNN, cuBLAS, CUTLASS, nvprof, Nsight Compute, Nsight Systems); ROCm and its related libraries and toolkits.
At Lamini AI, we are committed to providing an environment of mutual respect where equal employment opportunities are available to all applicants without regard to race, color, religion, sex, pregnancy (including childbirth, lactation and related medical conditions), national origin, age, physical and mental disability, marital status, sexual orientation, gender identity, gender expression, genetic information (including characteristics and testing), military and veteran status, and any other characteristic protected by applicable law. Lamini AI believes that diversity and inclusion among our employees is critical to our success as a company, and we seek to recruit, develop and retain the most talented people from a diverse candidate pool. Selection for employment is decided on the basis of qualifications, merit, and business need.