Amazon.com Data Center Operations Lead - Amazon Web Services, Singapore
Data Center Operations designs, installs & maintains the world’s largest Cloud Computing Infrastructure. We are looking for skilled Data Center Technicians with a passion for technology to help us expand our Cloud to the next level.
Amazon Web Services (AWS) offers an exciting, dynamic and challenging environment encouraging creativity and personal development while maintaining AWS computing environments in a secure, scalable, and cost-effective manner.
To keep up with demand for both disk and network capacity, we continue to expand our Data Centers in every region. In addition, CloudFront, our AWS content delivery service, has expanded its Data Center presence by over 50% worldwide in the last 12 months, and we expect it to grow by a similar amount over the next 12 months. This requires talented people to build & manage it. We hope that is you!
At Amazon, career progression is part of our environment. We want you to progress. Whether your career path is in Systems, Network or Database Engineering, Software Development, AWS Support, Technical Operations or Project Management, we will create a development plan to help you reach those goals. This begins on Day One!
Come and work for the world’s most Customer Centric Company.
The Opportunity: Data Center Operations Lead
This role is a unique opportunity to work in some of the most cutting-edge data centers in the world. Amazon data centers are large-scale, high-density facilities where you will help change the face of Cloud technology in the region.
A Data Center lead may be the primary point of contact for both internal customers (for example: Network Engineers, Systems Engineers, Software Developers, Database Engineers, Technical Operations) and external customers (Hardware Vendors, Contractors, Service Providers among others).
Ensure effective and efficient technical management of day-to-day Data Center Operations.
Improve the workflows and throughput for Data Center Operations.
Become a subject matter expert in Data Center Operations.
Ensure all operational KPIs and Metrics are being measured and met inside the Data Center.
Be passionate about the quality and quantity of services being provided by the Data Center Operations team and continuously strive to improve our Customer Experience.
Take part in an on-call rotation for Data Center issue escalations.
Ensure the Data Center is compliant with all relevant security & safety policies and procedures.
Maintain a high level of system reliability by prioritizing and resolving trouble tickets efficiently.
Intermediate Linux Troubleshooting
Primary point of contact for all Systems and Network hardware problems
Troubleshoot technical issues on various platforms ranging from Systems to Networking
Remediation of physical layer outages, both Systems & Network
Remediation and recovery of physical power issues on racks
Participate in Data Center power and cooling incidents and escalate to the relevant team
Prepare and manage the team's 24x7 shift schedule, including on-call requirements and response during shift rotations, with the ability to work shifts on an occasional basis
Install server racks in line with internal SLAs
Data Center Technical point of contact for all high severity issues
Hardware diagnostics and replacement of server and network devices and parts
Meet SLAs for our customers' uptime and be the technical escalation point for the site
Enforce Amazon's security best practices
Interact with third party vendors & contractors
Contribute ideas to improve operational efficiency
Project Management & Operational Excellence:
Deliver small to mid-scale projects including Data Center roll-out
Participate in team meetings for metric analysis and project status updates
Help build the world’s largest Cloud infrastructure
Share knowledge with other technical staff on the best practices related to all service owner issues
Write and enhance technical and operational processes and procedures
Take a data-driven, analytical approach to resolving problems
Understand operational metrics and drive actions to achieve the targets
Perform tasks needed with minimal supervision
Participate in global conference calls and report status to management (tracking action items for cluster)
Make the team more effective by proposing and introducing process changes
Proactively propose solutions to problems found rather than just reporting issues
5+ years of server Operating Systems experience
3+ years of Data Center experience
Server Hardware Troubleshooting experience
Basic Network operation / support experience
Strong written & verbal communication skills
Understanding of Data Center hygiene, safety, and security
Passionate about IT infrastructure and hardware
This position also has a physical component requiring the ability to lift & rack equipment weighing up to 20 kg; it may require working in cramped spaces or in elevated locations while adhering to health & safety guidelines
This role involves covering a 24x7 shift rotation
Remote Access: Console routers, IPMI, Ticketing System such as BMC Remedy
Intermediate network administration and support experience
Understanding of cabling infrastructure (copper and fiber), including troubleshooting
Good knowledge of Data Center facilities (power and cooling)
Knowledge of a programming language
AMZR Req ID: 465939
External Company URL: www.amazon.com