Sr. DevOps Cloud Engineer

New York, NY
Engineering
Full-time
Ladders is coming off a terrific 2017 and looking forward to an even better 2018.  The results speak for themselves:
#1 fastest-growing career site in the USA in 2017 according to comScore.  We grew traffic an astounding 128% and are back in the Top Ten largest career sites nationally.
#1 for Facebook Engagement among all career sites.  With the launch of Ladders News in April, we dominated the world’s largest social network and beat Indeed, LinkedIn, and everybody else in our space.
#1 highest average income of all national career sites.  The average Member at Ladders earns $149K / year, which makes us a top place for companies and recruiters to find their best professionals.
#1 Best-seller on Amazon: our career guides have taken the top spot in the Careers, Resumes, and Job-hunting categories on Amazon books.

Over the past year, we’ve upgraded our systems, people, business, and product.  We’ve added new leaders who have brought new energy and new momentum.

If you’re looking for a fast-paced, growth environment, with the chance for significant responsibility and career growth, we’d love to talk.

You will work closely with coworkers in other offices and time zones on the constantly evolving mission to achieve zero downtime of services. The team continually develops new approaches to improve efficiency, automate repetitive work, and reduce manual tasks.

RESPONSIBILITIES:

    • Deploying and configuring production systems, with a particular focus on ease of configuration, reduction of human error, repeatability, and security.
    • Day-to-day operational support of cloud infrastructure as well as internal company IT offerings, such as remote access, proxies and software hosted by the operations team
    • Participate in projects to design and implement new technology solutions, meet specific business needs, solve problems, or improve Global Operations offerings
    • Deployment and management of AWS-based services and infrastructure.
    • Participate in 24/7 on-call responsibilities to respond to emergency situations and perform scheduled maintenance
    • Contribute to and maintain documentation for systems, processes, procedures and infrastructure configuration
    • Drive to master emerging technologies and share experiences with team members
    • Proven problem solving and critical thinking skills
    • Take ownership of the infrastructure you build

BASIC QUALIFICATIONS:

    • 3+ years of established track record in architecting and implementing scalable, distributed, and highly available systems on cloud/hybrid environments
    • Expert knowledge of AWS resources including but not limited to VPC, subnets, security groups, EC2 instances, S3 buckets, IAM, Route 53, CloudFront, load balancers, CloudWatch, Lambda, CloudFormation (or Terraform)
    • Networking experience and an understanding of network protocols, DNS, VPN, and load balancing.
    • Experience with Docker, and container orchestration with Kubernetes, Mesos, Docker Swarm, or equivalent.
    • Experience with Puppet, Chef or Ansible.
    • Hands-on experience with, and expert-level knowledge of, Linux/Unix
    • Experience supporting service-based architectures with REST APIs, Spring, Node.js, Java, and Python
    • Scripting experience in Bash, Perl, or shell
    • Experience in monitoring and setting up alerting for Cloud components
    • Talent at diagnosing and remedying serious issues promptly and effectively.
    • Strong communication skills and an ability to interact with all levels of technical and non-technical personnel
    • Ability to work independently and take initiative.

PREFERRED QUALIFICATIONS:

    • Experience solving problems with distributed systems including logging, monitoring, tracing
    • Experience with Kafka at scale
    • Good understanding of technologies such as Kafka and Couchbase
    • Experience monitoring infrastructure costs and helping to manage and reduce them.

Keywords:

    • Kubernetes, Docker, Jenkins, Elasticsearch, Logstash, Kibana, Kafka, MariaDB, Couchbase, Bash, Java, Clojure, Scala, Selenium, AWS, EC2, Site Reliability Engineer, DevOps