Lead/Principal DevOps Engineer

London, UK
Software – Infrastructure Engineering
The Software Infrastructure team is responsible for developing and delivering secure, scalable, highly available services that support all of all.health’s technology and services. In addition to supporting millions of user devices streaming data into our systems, we run the massive-scale systems that power all.health's unique insight system, as well as the machine learning and big data analytics platforms used to test and develop our next generation of algorithms and devices.


    • Identify projects required to improve availability and reduce operational expense.
    • Design and implement systems to safely enable all.health's continued data growth without incurring incremental operational overhead or loss of availability.
    • Design and implement highly available services to reduce complexity in legacy systems and enable fault isolation and improved reliability.
    • Design and implement changes to the network and system infrastructure of our cloud services to improve the security, flexibility and availability of the overall system.
    • Diagnose and resolve performance issues in complex systems to ensure reliable user-facing performance of the overall all.health system.
    • Participate in a 24/7 on-call rotation, responsible for the stable and reliable operation of all cloud services.
    • Perform postmortem analysis in response to operational incidents, analyzing system telemetry to drive continual improvement.
    • Participate in and contribute to architectural discussions, reviews, improvements and future projects.
    • Increase automation and environment-agnostic infrastructure, keeping an eye on new tools and techniques where possible.


    • BA/BS in Computer Science or equivalent experience
    • Experience operating a complex, highly available system
    • Experience with high-volume services
    • Experience with distributed systems
    • Experience with Docker, Kubernetes, Azure and Google Cloud Platform
    • Strong understanding of configuration management: Puppet, SaltStack, Ansible
    • Experience with environment-agnostic, transportable applications and infrastructure
    • Experience with complex deployments and data structures
    • Must be able to work in London; this role is not eligible for visa sponsorship