Site Reliability Engineer
San Diego, CA
Engineering – TE2 - Development
TE2 - The Experience Engine™ Inc, a division of accesso, is the leader in experience-driven, personalized advertisement and content delivery for connected consumers, bridging the physical and digital brand experience across mobile, wearables and other digital technologies. TE2 is designed for industries where an in-person experience is a critical engagement opportunity, including hospitality, resorts, theme parks, food, travel, education and healthcare. For more information about TE2, please visit www.theexperienceengine.com.
The Site Reliability Engineer's objective is, in essence, to "make things scale." This includes building software that automates experiences, developing utilities that provide insights and metrics, and providing instrumentation that helps the Engineering teams scale the TE2 platform's performance more efficiently.
Red Hat, Docker, Kubernetes, AWS, Jenkins, and Ansible make up the main internal tech stack you will be working with.
Challenges that you may tackle include:
- Instrumentation and metrics collection from AWS Lambda (FaaS) or otherwise immutable containers
- Minimizing and hardening the attack surface of microservices and the public-facing API gateway
- Continuous delivery using tools such as Jenkins pipelines, Docker, Kubernetes
- Observability, capacity planning, system and service performance analysis and tuning
- Orchestration of AWS VPC resources using tools such as Terraform, boto, Consul
Some of the technologies you will be working with:
- Configuration management: Ansible, aws-cli, Git
- Operating Systems: mostly Red Hat-derived Linux
- Containerization and virtualization technologies: Docker Enterprise, Kubernetes
- Metrics and monitoring: statsd, ELK, PagerDuty, Slack chatops
- Messaging: Kafka, RabbitMQ
- Microservices patterns: Eureka, Ribbon, Hystrix, nginx
- Databases: Couchbase (NoSQL, N1QL), memcached, Elasticsearch, PostgreSQL, Oracle
- L2-L7 frame/packet/session inspection: netflow, WAF, pcap
What we're looking for:
- 5+ years of site reliability engineering or systems administration in highly available or high-volume environments
- 3+ years of infrastructure automation, configuration management or container orchestration
- Strong proficiency in one or more languages (Go, Python, Java, Ruby, Perl or Bash) and Git
- BA/BS in Computer Science or a related technical field (preferred, but not necessary)
- Periodic participation in an after-hours on-call rotation supporting production environments 24x7
- Willingness to embrace an agile devops culture
We're seeking deep expertise in one or more of the following:
- Deploying, configuring, scaling, debugging, and maintaining Kafka message brokers and ZooKeeper clusters. In-depth knowledge of Kafka/ZooKeeper internals is a plus.
- Managing Couchbase database clusters, encompassing provisioning, scaling, monitoring, and debugging. Expertise in optimizing indices and queries is desired, as is experience with backup and recovery.
- Container orchestration in Docker Enterprise and/or Kubernetes environments. Managing, deploying, and configuring clusters running Swarm or Kubernetes, diagnosing networking issues, planning and implementing cluster upgrades.
What We Offer:
- Competitive compensation package including discretionary annual bonus opportunity;
- 4 weeks of Paid Time Off for employees with up to 3 years of tenure (higher accrual thereafter);
- 8 hours of paid Volunteer Time Off to give back to organizations and groups you feel most passionate about;
- Three different medical insurance plans to choose from, including an employer-contributed HSA;
- Employer-paid short- and long-term disability and life insurance;
- Matching 401(k);
- Unlimited access to Udemy for Business for continued learning and career development.
- We are an E-Verify organization. Eligible candidates must be authorized to work in the US without requiring visa sponsorship.
- accesso is a drug-free company.
If you are interested in joining a team that values Passion, Commitment, Teamwork, Innovation and Integrity, and what we’ve described above is YOU, then apply today and let’s talk!