Site Reliability Engineer
As a key member of the site reliability engineering team, you will be responsible for ensuring availability, reliability, responsiveness and scalability across all Contrast production services. You will be deeply hands-on with our cloud-based infrastructure, Linux systems, automation, monitoring and systems telemetry.
You will work with our engineering team to design and build system infrastructure that is automated, elastic, and reliable. The tools you build and provide will increase automation, consistency, and confidence in all platforms. Every change to the environment must be checked into Git and deployed. We hope you will challenge conventional wisdom and encourage everyone to find simpler solutions. You will be an opinionated teammate who is interested in the products we work on and has a passion for making the Internet a safer place.
Ideal candidates have a background in, or an immense interest in working with, Ansible, AWS, Tomcat/Java, MongoDB, Redis, RabbitMQ, Kafka, ZooKeeper, MySQL and RESTful API development. If you're amazing but missing some of these, email us your résumé and cover letter anyway: firstname.lastname@example.org. Please include a link to your GitHub or Bitbucket account, as well as links to some of your projects if available.
- Ability to harness all that AWS has to offer - you'll be spinning up new scalable environments quickly and keeping our AWS accounts tidy and efficient.
- A passion for working with AWS Lambda gets you an extra cookie each day!
- A love of New Relic, ThreatStack, SumoLogic and any other telemetry available.
- A high degree of familiarity with Linux containers and container orchestration tools like Kubernetes or Docker Swarm
- Strong understanding of HTTP, REST, networking concepts and global load-balancing
- A passion for automation - we’re looking to automate “all the things”
- Work cross-functionally within a service team and be a core contributor in every significant engineering solution that is delivered
- Debug production issues across services and levels of the stack
- Participate in on-call rotations, along with every member of the engineering team
- Solid understanding of system design, including the operational trade-offs of various designs
- Solid programming and troubleshooting skills. You may be called upon to help with systems written in Java, Go, Python and Node.js. You won’t be expected to know everything, but we are looking for people who can dig through a codebase for debugging and commit tactical fixes as needed.
What We Offer
- Competitive compensation
- Daily team lunches
- Meaningful stock plans
- Medical, dental, and vision benefits
- Flexible paid time off
- You can join us in our office in Baltimore, but we would also consider candidates who can work in Austin, TX or Palo Alto, CA, as we have staff in those markets.
- You love to code and deploy at scale.
- Desire to make the Internet a safer place.
- Passion for configuration management with Ansible. Automating every environment is a requirement.
- You approach problems from a product perspective, thinking through how the user will interact with what you're building.
- You have strong communication skills. You ask questions, let others know when you need help, and tell others what you need.
- You're a problem solver. You believe the best work is the result of finding the simplest solution to complex challenges.
- You see the big picture. You understand how the code you write interacts with systems and services, both internally and externally.