Site Reliability

San Francisco

Houseparty’s mission is to connect people in the most human way possible when they are physically apart. We believe that our greatest opportunity to improve people’s lives is to reinvent how we communicate. Houseparty empowers people to have more frequent conversations with the people they care about most, meet new friends, and have fun together.

As a Site Reliability Engineer at Houseparty, you'll drive a variety of projects involving scalability and performance monitoring, internal tooling, system and process automation, and much more. If this sounds interesting, we'd love to talk.

Role and Responsibilities:

    • You'll be responsible for all things infrastructure: design, architecture, communication, monitoring, availability, scalability (obviously), failover and recovery
    • You'll identify, analyze, and mitigate risk exposure at all levels
    • You'll manage team backlogs and ensure forward project momentum

You Ideally:

    • Have excellent systems-level analytical skills and an ability to move up and down the stack
    • Have a generally data-conscious, creative attitude toward systems and infrastructure
    • Have a passion for automation, and knowledge of system automation frameworks
    • Have basic sysadmin skills (install a host, upgrade a box, format a disk, etc.)
    • Have a deep understanding of Linux
    • Have experience with AWS at scale (>100 nodes)
    • Have experience with relational and non-relational databases (we like Postgres, Redis)
    • Have experience with cluster / container frameworks (Kubernetes, Docker, Mesos, etc.)
    • Have kernel experience
    • Have experience debugging the JVM
    • Have experience setting up and using common CI tools (we like Jenkins, Travis)
    • Understand the causes of load and latency, the differences between them, and how to minimize both
    • Understand configuration management systems and config-as-code (we use Terraform)
    • Know several scripting and programming languages (Bash, Go, Python, Scala, etc.)
    • Know how not to do things that take down our entire system (but seriously)
    • Enjoy a high-growth environment


    • Are committed to standard engineering methodologies, including automated testing and continuous integration
    • Are willing to take risks with product features and functionality
    • Are dedicated to creative problem solving
    • Are an excellent written and verbal communicator
    • Are prepared for diverse challenges without fear of failure
    • Are passionate about crafting the future of digital communication
    • Are a social person. We are a social app, and we like social people.