DevOps Engineer

Bangalore / Engineering / Full-Time / Hybrid
About SafeAI
SafeAI sees a future in which off-road autonomous vehicles reshape heavy industries such as construction and mining. We are pioneering a new approach to autonomous off-road equipment with AI-powered, vehicle- and manufacturer-agnostic technology that enables heavy industry operations to retrofit any piece of equipment with autonomous technology. Since 2017, we’ve been steadily establishing a foundation for the future, targeting the most important, heavily used off-road vehicles and industry segments to drive meaningful impact across safety, productivity and cost reduction. We are now entering an era of massive industry adoption and are excited to be a commanding force in accelerating this movement to transform heavy industry with connected autonomy.
 
About the Team 
We are a fast-paced, high-growth company serving a very important mission, and our talented team is a huge part of bringing that mission to reality. The work that you do at SafeAI will give you a unique perspective on developing and deploying leading-edge autonomous technology and solutions while working with top-tier participants in the industry across the globe. The leadership team at SafeAI brings a unique blend of autonomous technology and industry-specific experience from some of the top companies in the world, such as Google, Apple, Tesla, Continental, Uber, Caterpillar, BHP and Rio Tinto. We are strategically headquartered in Silicon Valley, with team members and offices in Perth, Australia; Tokyo, Japan; and New Delhi, India, with whom you would collaborate on a regular basis.

As a DevOps Engineer at SafeAI, you'll be instrumental in ensuring the reliability and robustness of our robotic fleet management and autonomy team. You'll design, implement, and optimize infrastructure solutions that ensure seamless integration and communication within our autonomous fleet. Your expertise will be crucial in enhancing system uptime, automating deployment processes, and ensuring a scalable and secure environment. You'll collaborate with a vibrant global team, use cutting-edge technologies, and play a key role in driving the operational excellence that underpins the exciting journey of autonomy.
 
Key Responsibilities:
·      Infrastructure Management: Design, set up, and maintain environments for both software and machine learning workflows using cloud platforms like AWS, Azure, or GCP
·      Continuous Integration & Deployment: Implement CI/CD pipelines for software as well as ML models, ensuring quick and reliable deployments
·      Model Deployment: Automate the deployment of ML models into production, ensuring scalability and real-time performance
·      System Monitoring: Monitor both system health and ML model performance, ensuring maximum uptime and model accuracy
·      Scalability & Performance: Ensure infrastructure and models are scalable, optimizing for peak loads and high availability
·      Security: Implement best practices for system and data security, especially when handling sensitive data used in ML models
·      Automation: Automate repetitive tasks, from code deployments to model retraining and scaling
·      Collaboration: Work closely with data scientists, ML engineers, and software developers to ensure seamless integration of new features, services, and models
·      Backup & Disaster Recovery: Set up and maintain backup procedures and disaster recovery solutions for both software and ML models
·      Research & Development: Stay updated with the latest DevOps and MLOps trends and technologies, recommending and implementing improvements
 
Qualifications:
·      Extensive Experience: A minimum of 10 years in DevOps, with a significant portion dedicated to MLOps or related roles
·      Bachelor's degree in a relevant field or equivalent experience
·      Cloud Certification: Certification in cloud platforms, with experience in ML services like AWS SageMaker, Azure ML, or GCP AI Platform
·      Knowledge of CI/CD: Familiarity with CI/CD principles, especially as they apply to ML workflows, plus experience with Jenkins
·      ML Frameworks: Experience with ML frameworks like TensorFlow, PyTorch, or Scikit-learn
·      Data Management: Understanding of data versioning tools like DVC and data storage solutions suitable for ML
·      Containerization Concepts: Knowledge of containerization as it applies to ML, using tools like Docker and Kubernetes
·      Monitoring Tools: Experience with monitoring ML models in production using tools like MLflow or TFX
·      Security Best Practices: Awareness of security challenges specific to ML, like model security and data privacy
·      Scripting Skills: Proficiency in scripting for automation in both software and ML contexts
·      Expertise in Python
·      Team Collaboration: Demonstrated ability to work with data scientists, ML engineers, and software developers, showcasing strong communication skills
 
In addition to a very competitive compensation and benefits package, we offer a fantastic culture and a fun place to work within an established start-up environment. As an Equal Opportunity Employer M/F/D/V/SO, we do not discriminate in employment and personnel practices on the basis of race, sex, age, handicap, religion, national origin or any other basis prohibited by applicable law.
 
We hope that you’re a great candidate for this position and look forward to speaking with you.