Data Center Operations Lead - Amazon Web Services, Singapore

Data Center Operations designs, installs & maintains the world’s largest Cloud Computing Infrastructure. We are looking for skilled Data Center Technicians with a passion for technology to help us expand our Cloud to the next level.

Amazon Web Services (AWS) offers an exciting, dynamic and challenging environment encouraging creativity and personal development while maintaining AWS computing environments in a secure, scalable, and cost-effective manner.

To keep up with demand for both disk and network capacity, we continue to expand our Data Centers in every region. In addition, CloudFront, our AWS content delivery service, has expanded its Data Center presence by over 50% worldwide in the last 12 months and is expected to grow by a similar amount over the next 12 months. This requires talented people to build & manage it. We hope that is you!

At Amazon, career progression is part of our environment. We want you to progress. Whether your career path is in Systems, Network or Database Engineering, Software Development, AWS Support, Technical Operations or Project Management, we will create a development plan to help you reach those goals. This begins on Day One!

Come and work for the world’s most Customer Centric Company.

The Opportunity: Data Center Operations Lead

This role is a unique opportunity to work in some of the most cutting-edge data centers in the world. Amazon data centers are large-scale, high-density facilities where you will help change the face of Cloud technology in the region.

A Data Center lead may be the primary point of contact for both internal customers (for example: Network Engineers, Systems Engineers, Software Developers, Database Engineers, Technical Operations) and external customers (Hardware Vendors, Contractors, Service Providers among others).


  • Ensure effective and efficient technical management of day-to-day Data Center Operations

  • Improve the workflows and throughput of Data Center Operations

  • Become a subject matter expert in Data Center Operations

  • Ensure all operational KPIs and metrics are measured and met inside the Data Center

  • Be passionate about the quality and quantity of services provided by the Data Center Operations team and continuously strive to improve our Customer Experience

  • Participate in an on-call rotation for Data Center issue escalations

  • Ensure the Data Center is compliant with all relevant security & safety policies and procedures

  • Problem Solving:

  • Maintain a high level of system reliability by prioritizing and resolving trouble tickets efficiently

  • Intermediate Linux Troubleshooting

  • Primary point of contact for all Systems and Network hardware problems

  • Troubleshoot technical issues on various platforms ranging from Systems to Networking

  • Remediation of physical layer outages, both Systems & Network

  • Remediation and recovery of physical power issues on racks

  • Participate in Data Center power and cooling incidents and escalate to the relevant team

  • Operations:

  • Prepare and manage the team's 24x7 shift schedule, including on-call requirements and response during shift rotations, and be able to work shifts on an occasional basis

  • Install server racks in line with internal SLAs

  • Data Center Technical point of contact for all high severity issues

  • Hardware diagnostics and replacement of server and network devices and parts

  • Meet SLAs for customer uptime and be the technical escalation point for the site

  • Enforce Amazon's security best practices

  • Interact with third party vendors & contractors

  • Contribute ideas to improve operational efficiency

  • Project Management & Operational Excellence:

  • Deliver small to mid-scale projects including Data Center roll-out

  • Participate in team meetings for metric analysis and project status updates

  • Help build the world’s largest Cloud infrastructure

  • Share knowledge with other technical staff on the best practices related to all service owner issues

  • Write and enhance technical and operational processes and procedures

  • Take a data-driven, analytical approach to resolving problems

  • Understand operational metrics and drive actions to achieve the targets

  • Perform tasks needed with minimal supervision

  • Participate in global conference calls and report status to management (tracking action items for cluster)

  • Make the team more effective by proposing and introducing process changes

  • Proactively propose solutions to problems found rather than just reporting issues

  • 5+ years of server Operating Systems experience

  • 3+ years of Data Center experience

  • Server Hardware Troubleshooting experience

  • Basic Network operation / support experience

  • Strong written & verbal communication skills

  • Understanding of Data Center hygiene, safety, and security

  • Passionate about IT infrastructure and hardware

  • This position also has a physical component requiring the ability to lift & rack equipment up to 20 kg; it may require working in cramped spaces or in elevated locations while adhering to health & safety guidelines

  • This role involves covering a 24x7 shift rotation

  • Linux Administration

  • Remote access tools (console routers, IPMI) and ticketing systems such as BMC Remedy

  • Intermediate network administration and support experience

  • Understanding of cabling infrastructure (copper and fiber), including troubleshooting

  • Good knowledge of Data Center facilities (power and cooling)

  • Knowledge of a programming language


AMZR Req ID: 465939
