Amazon is currently seeking adventurous engineering operations and maintenance professionals for our mission-critical Data Centers in the Greater London Area. These future Amazonians must be willing to travel to one of three locations to complete specialized training in preparation for deployment in a data center region. Once training is completed, individuals will be relocated to their permanent location. Amazon Web Services invites best-in-class engineering professionals to join an elite team to build and maintain one of the world's largest information system infrastructures.
Our Data Center Engineering Operations Team is a committed group that works to maintain the critical physical infrastructure that supports Amazon Web Services. Specifically, this team works to ensure that the data center's mechanical, electrical, and plumbing (MEP) systems operate at 100% availability while providing first-class customer service to the teams and groups within the data centers.
The Data Center Chief Engineer (CE) is responsible for ensuring that all electrical, mechanical, and fire/life safety equipment within the data center is operating at peak efficiency. This involves planned preventative maintenance of equipment, daily corrective work, and emergency response to emergent issues. The CE serves as an expert technical resource reporting to a site’s Data Center Facility Manager and interacting with onsite Engineering Operations Technicians (EOT) and any third-party vendors. They are expected to be a singular focal point for all facility operations within a given data center and to support Amazon within its owned and operated data centers. Data center equipment that supports mission-critical servers must maintain better than 99.9999% uptime.
The CE is also expected to manage small-to-medium-impact projects and new hall builds from conception to completion. These projects involve large amounts of independent work as well as collaboration with external support groups including engineering, automation, procurement, and finance in both local and global settings. The CE will be tasked with creating and delivering on key milestones, obtaining and tracking quotes for all necessary costs, and documenting project results for future implementation at other facilities. The goals of such projects are for the CE to drive innovation and resiliency while reducing operational costs in the facilities.
The CE directs, trains, and supports EOTs in their role of providing hands-on electrical and mechanical equipment troubleshooting and operations. This equipment includes, but is not limited to, stand-by diesel generators, switchgear, UPSs, PDUs, AHUs, chillers, cooling towers, chemical treatment systems, pumps, motors, VFDs, and building automation systems. The CE is also expected to implement and run site- and equipment-specific training exercises.
• Oversee the day-to-day operations and maintenance of mechanical and electrical equipment in a data center.
• Operate independently with limited direct management
• Act as an escalation point for all facilities-related issues within the data center, escalating to the Data Center Facility Manager as needed, and work overtime hours as needed to support site stability.
• Perform root cause analysis of equipment failures
• Troubleshoot and report on facility and data server-level events within internal SLAs
• Create and deploy new standard practices for Engineering Operations Technicians, Chief Engineers, and vendor support teams
• Provide training and guidance to Engineering Operations Technicians and assist in recruiting efforts for the same
• Ensure all safety procedures are adhered to by vendor and Amazon staff
• Utilize internal Change Management Systems to manage building workflows
• Communicate complex technical information to a non-technical audience
The successful candidate must be able to demonstrate the following competencies / behaviours:
- Ability to solve problems at their root, stepping back to understand the broader context.
- Aptitude for troubleshooting and problem solving.
- Ability to maintain SLAs through the implementation of proactive issue detection and immediate response.
- Ability to follow support procedures and system documentation, and to log issue-tracking entries in a trouble ticket system.
- Good judgment and instincts in decision making.
- Ability to prioritize in a complex, fast-paced environment.
- Ability to take ownership of technical issues brought to them by their customer base.
- Willingness to actively engage other support teams to drive issues to resolution.