Site Reliability Engineer - Couchbase
Engineering – TE2 - Development
TE2 - The Experience Engine™ Inc, a division of accesso, is the leader in experience-driven, personalized advertisement and content delivery for connected consumers, bridging the physical and digital brand experience across mobile, wearables and other digital technologies. TE2 is designed for industries where an in-person experience is a critical engagement opportunity, including hospitality, resorts, theme parks, food, travel, education and healthcare. For more information about TE2, please visit www.theexperienceengine.com.
The Couchbase Site Reliability Engineer's objective is, essentially, to "make things scale." This includes building software that automates experiences, developing utilities that provide insights and metrics, and providing instrumentation that helps the Engineering teams scale the TE2 platform's performance more efficiently.
Red Hat, Docker, Kubernetes, AWS, Jenkins, and Ansible make up the main internal tech stack you will be working with.
The Couchbase Site Reliability Engineer brings deep expertise in supporting Couchbase database clusters as part of complex TE2 deployments. You will serve as a subject matter expert on all aspects of our use of Couchbase, including deployment, configuration, scaling (MDS), and upgrades. You will debug problems in production and test environments, advise developers on best practices for using Couchbase, including key-value operations and N1QL queries, and maintain high-volume clusters in multiple datacenters. You will develop automation that improves the deployment speed and service reliability of Couchbase clusters.
Challenges that you may tackle include:
- Instrumentation and metrics collection from AWS Lambda (FaaS) or otherwise immutable containers
- Minimizing and hardening the attack surface of microservices and the public-facing API gateway
- Continuous delivery using tools such as Jenkins pipelines, Docker, Kubernetes
- Observability, capacity planning, system and service performance analysis and tuning
- Orchestration of AWS VPC resources using tools such as Terraform, boto, Consul
Some of the technologies you will be working with:
- Configuration management: Ansible, aws-cli, Git
- Operating Systems: mostly Red Hat-derived Linux
- Containerization and virtualization technologies: Docker Enterprise, Kubernetes
- Metrics and monitoring: statsd, ELK, PagerDuty, Slack chatops
- Messaging: Kafka, RabbitMQ
- Microservices patterns: Eureka, Ribbon, Hystrix, nginx
- Databases: Couchbase (NoSQL, N1QL), memcached, Elasticsearch, PostgreSQL, Oracle
- L2-L7 frame/packet/session inspection: netflow, WAF, pcap
Requirements:
- 5+ years of highly-available or high-volume site reliability engineering or systems administration
- 3+ years of infrastructure automation, configuration management or container orchestration
- Strong proficiency in one or more languages (Go, Python, Java, Ruby, Perl or Bash) and Git
- BA/BS in Computer Science or a related technical field (preferred but not required)
- Periodic participation in an after-hours on-call rotation supporting production environments 24x7
- Willingness to embrace an agile devops culture
What We Offer:
- Competitive compensation package including discretionary annual bonus opportunity;
- 26 days of paid annual leave for employees (paid leave increases with tenure);
- 8 hours of paid Volunteer Time Off to give back to the organizations and groups you feel most passionate about;
- Robust health insurance scheme with the opportunity to participate in a private medical scheme after satisfactory performance;
- Matching pension scheme (up to 8%);
- Unlimited access to Udemy for Business for continued learning and career development;
- A flexible work schedule around our core business hours.
- Eligibility to work in the UK is required.
- accesso is a drug-free company.
If you are interested in joining a team that values Passion, Commitment, Teamwork, Innovation and Integrity, and what we’ve described above is YOU, then apply today and let’s talk!