Site Reliability Engineer
Software Engineering
We are looking for a self-directed Site Reliability / Senior DevSecOps Engineer with cloud IT administration experience to join our growing team. You will support our engineering team and AI scientists by configuring secure, scalable, and highly available systems that conform to HIPAA safeguards and HiTrust requirements.
This is a fast-paced Agile environment with a strong focus on adopting modern cloud-native technologies. Strong communication skills, stellar Infrastructure as Code and troubleshooting skills, and an obsession with security are key. This is a terrific opportunity for a technologist to lead Site Reliability Engineering and DevSecOps at Overjet.
Responsibilities
- Build CI/CD pipelines.
- Set up and maintain Kubernetes clusters.
- Set up infrastructure monitoring that alerts on customer impact and supports tiered escalation.
- Configure secure systems that protect PHI across dev, test and production environments.
- Set up systems for backup and recovery.
- Configure a SIEM system in support of HiTrust certification.
- Perform user management and device endpoint management tasks in support of HiTrust certification.
- Develop and continually improve guidelines and procedures for system management, including process and efficiency improvements.
- Continuously improve the reliability of Overjet's services, and hence our quality of service (QoS), through a data-driven approach.
- Enthusiastically participate in on-call rotations.
Required Skills and Experience
- Bachelor’s degree in Computer Science or equivalent experience.
- Experience in a distributed systems environment and the ability to troubleshoot across the whole stack, spanning multiple independent services.
- Strong problem-solving skills and an endless desire for automation.
- At least 3 years of experience working on public cloud infrastructure with automation.
- At least 3 years of experience with containerization and orchestration technologies such as Docker and Kubernetes.
- At least 3 years of experience with Terraform or other infrastructure as code tools.
- Demonstrated proficiency with Python.
- Experience with monitoring and alerting tools.
Preferred Skills, Certifications and Experience
- At least one of the following certifications: GCP Professional Cloud DevOps Engineer or AWS Certified DevOps Engineer.
- Excellent written and verbal communication skills.
- Ability to multitask and respond quickly to surprises.
- Experience with mentoring and knowledge dissemination.
- HiTrust implementation experience.