You should have most of the following:
· Experience running and maintaining a 24x7 Internet-oriented production environment
· Demonstrable expertise in specifying, designing, maintaining and/or implementing tools and systems to monitor, repair and update software and hardware systems
· Experience in building dashboards and automation to track system health and performance of software systems for 24x7 environments
· A solid grasp of networking fundamentals, preferably including hands-on experience with load balancers, switches, routers, VPCs, VPNs, etc.
· Demonstrable expertise architecting and operating solutions on AWS
You will be expected to deliver on the following types of initiatives:
· Participate in all phases of the development of a large distributed system, providing hardware, manageability, operability and performance perspectives on all aspects of the system
· Define and/or refine hardware requirements and selected designs, balancing raw up-front dollar cost with operability and TCO, from the data center infrastructure up
· Specify and participate in the development and delivery of operability-related features such as system health monitoring, diagnostics, repair, and other self-healing automation
· Develop new or extend existing application and system management tools and processes that reduce manual effort and increase overall efficiency
· Adapt and improve operations management systems and processes to accommodate rapid and increasing growth in systems and traffic
· Maintain fleet inventory management, including producing, maintaining, and evolving capacity plans for various components
· Monitor the health of the fleet, automating system health checks, maintenance tasks, and reporting systems as needed
· BS Computer Science or other technical degree and related experience
· Solid experience running and maintaining a 24x7 production environment
· Extensive experience in *NIX systems administration or development
· Experience scripting in Python, Ruby, Perl, or similar languages
· Experience with support procedures and methodologies for production computing environments
· Experience with service-oriented architecture and web services