(Senior) Research Scientist - Large Language Models for Genomics
Toronto, Ontario
Systems and Target Biology / Full-time / Hybrid
About Us
Deep Genomics is at the forefront of using artificial intelligence to transform drug discovery. Our proprietary AI platform decodes the complexity of RNA biology to identify novel drug targets, mechanisms, and therapeutics inaccessible through traditional methods. With expertise spanning machine learning, bioinformatics, data science, engineering, and drug development, our multidisciplinary team in Toronto and Cambridge, MA is revolutionizing how new medicines are created.
About the Role
Join us in building the future of AI-driven therapeutic design as a (Senior) Research Scientist specializing in large language models for genomics within our Systems and Target Biology group. You will develop and implement systems that use large language models, together with Deep Genomics’ foundation models, to find and characterize evidence supporting new biological discoveries. In this role, you will work closely with the machine learning team developing foundation models, with the engineering and infrastructure teams to build scalable systems, and with the statistical genetics and experimental groups to synthesize evidence for therapeutic actionability.
Ideal Candidate
We are looking for someone with 3+ years of experience using LLMs to develop solutions for complex, multi-agent workflows. The ideal candidate is passionate about leveraging AI and foundation models to disrupt therapeutic design workflows and is adept at translating complex scientific requirements into robust computational solutions. A background in genomics or computational biology is highly desirable, as is clear experience in MLOps and in architecting agentic solutions from the ground up.
Key Responsibilities
- Design and implement multi-agent workflows that integrate internal foundation models (e.g., BigRNA, REPRESS, FlashRNA) and external tools to identify new biological hypotheses.
- Develop systems that leverage Retrieval-Augmented Generation (RAG) by connecting LLMs to internal scientific documents, SOPs, and structured biological databases.
- Collaborate with the machine learning team on model distillation strategies to create smaller, faster models suitable for a real-time, interactive chat interface.
- Build out and maintain the infrastructure for the LLM agent, including databases and Model Context Protocol (MCP) endpoints.
- Work closely with end-users in therapeutic design, target discovery, and experimental biology to identify key use cases, gather feedback, and rapidly iterate on the product.
- Ensure the system is transparent and trustworthy by building "explainable AI" features that help users understand and verify the AI's outputs and decisions.
Basic Qualifications
- MSc or PhD in Computer Science, Computational Biology, Bioinformatics, or a related field.
- 3+ years of hands-on experience architecting and building complex applications using Large Language Models.
- Expert knowledge of Python and modern MLOps frameworks and tools; experience with agentic frameworks like LangChain is essential.
- Demonstrated experience in building multi-agent systems that can plan, execute tasks, and interact with external tools and APIs.
- Familiarity with high-performance computing environments and cloud services (e.g., AWS, GCP).
- Excellent communication skills and the ability to work effectively in a multidisciplinary team, translating the needs of biologists and drug developers into technical solutions.
- Intellectual curiosity, critical thinking, and a commitment to innovation and scientific rigor.
Preferred Qualifications
- A strong background in genomics, computational biology, or bioinformatics, including experience with NGS data analysis or large-scale biological datasets.
- Prior experience in the biotech or pharmaceutical industry, particularly in a drug discovery context.
- Experience with model distillation or creating smaller, specialized models from larger foundation models.
- Familiarity with scientific workflow management systems and reproducible-environment tools (e.g., Docker, Conda).
What we offer
- A collaborative and innovative environment at the frontier of computational biology, machine learning, and drug discovery.
- Highly competitive compensation, including meaningful stock ownership.
- Comprehensive benefits, including health, vision, and dental coverage for employees and their families, and an employee and family assistance program.
- Flexible work environment, including flexible hours, extended long weekends, a holiday shutdown, and unlimited personal days.
- Maternity and parental leave top-up coverage, as well as new parent paid time off.
- Focus on learning and growth for all employees, including a learning and development budget and lunch-and-learn sessions.
- Facilities located in the heart of Toronto - the epicenter of machine learning and AI research and development - and in Kendall Square, Cambridge, MA - a global center of biotechnology and life sciences.
Deep Genomics welcomes and encourages applications from people with disabilities. Accommodations are available on request for candidates taking part in all aspects of the selection process.
Deep Genomics thanks all applicants; however, only those selected for an interview will be contacted.