· Maintain and enhance the SLA of 99.999% for offered services and managed platforms.
· Participate in 24x7 on-call duty for mission-critical services on a rotation basis.
· Design systems architecture for projects using Linux and Linux application stacks (LAMP, Ruby, MySQL, Redis, Aerospike, Java, Python, etc.).
· Perform capacity planning.
· Design, implement and enhance CI and CD platforms.
· Design, implement, enhance and manage internal cloud offerings.
· Understand the causes of outages and downtime, and automate permanent fixes to prevent them.
· Architect deployments for high availability, scalability and reliability.
· Design and implement platforms for monitoring, log processing, metrics collection and data visualization.
· Write automation tools (in Shell/Perl/Ruby/Python, etc.) for efficient management of sites and products.
· Infrastructure and platform security.
· Puppet configuration management.
· Lead and mentor a team of Operations Engineers.
· Liaise with application development teams to drive operational best practices.
· You take pride in calling yourself an expert in most of the technologies and skills below:
· Minimum 3 years of relevant experience in most of the following
· Proven track record of managing high-traffic internet applications, especially in the e-commerce domain.
· Linux: In-depth Linux/Unix fundamentals; good understanding of the various Linux kernel subsystems (memory, storage, network, etc.); understanding of the nuances of various distributions (Ubuntu/Fedora/CentOS, etc.); package management.
· Fundamentals: DNS & Networking Fundamentals, TCP/UDP, IP Routing, HA & Load Balancing Concepts.
· Application Stacks: LAMP, OpenResty/Nginx/HAProxy/ATS, Wackamole, Email Platforms, Tomcat.
· Cloud Infrastructure: OpenStack.
· Edge Cache: Redis and Aerospike.
· Databases: SQL/RDBMS, MySQL/NDB, MongoDB, Cassandra.
· Configuration Management: Ansible/Chef/Puppet.
· Tools/Utilities: Nagios, Zabbix, Cacti, Ganglia, Kickstart/Cobbler, MCollective, Yum, RPM, Git/SVN.
· Scripting/Programming: Extensive experience in two or more of these scripting/programming languages: Bash/Perl/Ruby/Python/PHP.
· Others: Regular expressions, excellent troubleshooting skills.
Good to have:
· Systems/Hardware: LOM/IPMI/IP KVMs, Dell hardware.
· Deep knowledge of AWS cloud services would be an added advantage.