Site Reliability Engineer

Irvine, CA / Software – Tower Software / Full-time

Anduril is a defense technology company, bringing Silicon Valley talent and funding to the defense sector. Our technology helps our customers solve their toughest challenges by enabling them to make better, more informed decisions in life-and-death situations. We’ve assembled a diverse team of experts in artificial intelligence, computer vision, sensor fusion, optics, and data analysis who are creating software and hardware solutions to radically evolve the capabilities of the United States and our allies. If you are passionate about solving problems that have real impact, come join Anduril and build the future of defense!

Anduril’s Lattice framework integrates deployed sensors into a single operational platform. From a dozen Sentry Towers to swarms of Ghost Drones, Lattice fuses sensor data to produce a single view and control interface for human operators.

Lattice requires a robust distributed infrastructure of thousands of interconnected devices on multiple hardware platforms across a variety of secure networks.

Site Reliability Engineers at Anduril are responsible for these thousands of critical Lattice nodes. From virtual clusters in secure clouds to ruggedized industrial PCs in austere environments to low-power SoCs flying in the air, SREs work closely with product, platform, and hardware teams to support the full lifecycle of all Lattice nodes: development, imaging, deployment, operation, monitoring, troubleshooting, and upgrading.

Responsibilities:

    • Work with all involved teams on the lifecycle of the baseline Lattice node operating system
    • Work with product teams to instrument and monitor the health of deployed systems
    • Propose and implement solutions for issues, processes, and architecture
    • Remotely fix software, hardware, and networking issues in sophisticated environments
    • Participate in a 24/7 on-call rotation

Skills:

    • 2+ years of Linux administration
    • 2+ years of scripting and/or programming experience
    • 1+ years directly supporting a 24/7 production environment
    • Ability to quickly understand and navigate sophisticated networked systems
    • Comfort operating in a dynamic environment

U.S. Person status is required, as this position requires access to export-controlled data.