Senior Data Center Engineer at TEKsystems
Ashburn, VA 20146
About the Job
Top Skills Details
Candidates need 5-6 years of experience in systems and data center activities, plus some scripting experience. The role includes maintaining the overall uptime, performance, and capacity of the Pandora service.
Focus areas:
(1) 4+ years data center infrastructure experience (mid-to-expert level experience)
(2) some scripting in Bash, Python, or Perl (junior-to-mid level experience)
(3) Unix/Linux (mid-to-expert level experience)
***Qualified candidates must live within 45 minutes of the worksite to be considered, so that they can respond to on-call needs in a timely manner.
Secondary Skills - Nice to Haves
- Cabling
- Ansible
- Red Hat
- Server+ certification
- CCNA
Job Description
What you’ll do:
100% hands-on with datacenter infrastructure provisioning and server/network equipment deployments.
Rack/Cable/Provision a large inventory of servers, switches, PDUs and consoles.
Perform initial configuration of systems as defined by our standard operating procedures (BIOS configuration, PXE OS installs, DNS updates, etc.).
Troubleshoot and remedy hardware issues, document findings and provide detailed RCA reports.
Assist in decommissioning and retiring aging hardware and provision replacements.
Manage RMA processes with various vendors.
Maintain an up-to-date inventory list of all hardware equipment across all our datacenters.
Implement best-practice methodology for maintaining a datacenter environment.
Document and track all assigned datacenter related issues and tasks via our internal ticketing system in a timely fashion.
What you’ll need
BA/BS in Information Technology, Computer Science, or a related field, or equivalent experience.
IT certifications such as Server+, RHCSA/RHCE, CCNA, or similar are a plus.
Minimum 6 years’ experience working in enterprise-scale datacenters.
Responsible for datacenter capacity and growth planning as well as power and cooling aspects.
Work alongside a group of data-center operations engineers in accomplishing the various day-to-day datacenter related tasks and facilitating new build-outs.
Strong understanding of x86 server hardware architecture and subsystems as they relate to configuration, triage, and certification in a large-scale server environment.
Familiarity with monitoring stacks such as Prometheus, Alertmanager, and Grafana.
Familiarity with executing Ansible playbooks.
Hands-on experience with PXE boot, UEFI, AMI BIOS distributions, BMC/AMI/iDRAC implementation and troubleshooting.
Practical professional knowledge of Linux and full network stack from NIC firmware to TCP/IP.
Familiar with SAN and NAS connectivity utilizing FC HBAs.
Familiar with version control systems such as Bitbucket and Git.
Maintain an up-to-date inventory of hardware equipment across all our co-locations.
Familiarity with performance testing and reporting tools, such as Phoronix, FIO, Stream, and other benchmarking tools.
Knowledgeable in datacenter best practices including cabling, power balancing, inventory tracking, and more.
Excellent time management skills, with the ability to prioritize and multitask, and work under shifting deadlines in a fast-paced environment.
Provide mentoring support to datacenter engineers.
Participate in a 24x7x365 on-call rotation.
Up to 15% travel.
Must have legal right to work in the US
Technical Skills:
Demonstrated proficiency in monitoring stacks such as Prometheus, Alertmanager, and Grafana.
Hands-on experience with PXE boot, UEFI, AMI BIOS distributions, BMC/iDRAC implementation.
Experience creating and executing Ansible playbooks.
Experience with Docker containers.
Basic understanding of HashiCorp Nomad/Consul/Vault.
Practical professional knowledge of Linux and full network stack from NIC firmware to TCP/IP.
Expertise with SAN and NAS arrays such as NetApp, Isilon, Pure Storage, and Brocade.
Familiarity with Bitbucket and Git.
Familiarity with performance testing and reporting tools, such as Phoronix, FIO, Stream and others.
Experience with ISC DHCP and BIND DNS operations.
Intermediate scripting skills in Python and familiarity with OOP concepts.
Significant knowledge of Linux kernel drivers, kernel tuning, and debugging hardware compatibility issues.
Basic understanding of subnetting, DHCP Relays, network load balancing, and ARP.
Working knowledge of package management tools such as APT and RPM.
Additional Skills & Qualifications
Additional Skills:
• Familiarity with version control systems such as Bitbucket and Git.
• Familiarity with performance testing and reporting tools, such as Phoronix, FIO, Stream and others.
• No need to be a scripting language expert; no requirement to script from scratch.
• No database specific experience required.
Employee Value Proposition (EVP)
The candidate will join a team of 8 employees who have all had long tenures at Pandora (ranging from 5 to 13 years). Pandora provides opportunities to learn and grow in this role; the career path is strong and well mapped out. Another perk is "flexible Fridays": if you have no meetings or outstanding work, you can sign off for the week at 12:30 pm.
Work Environment
Full time, on-site at Pandora's data center facilities in San Jose and Santa Clara, CA. The role involves 5-6 hours a day on the data center floor, with responsibilities such as receiving shipments, asset tagging, inventorying equipment, racking and cabling equipment, and making connections for servers.
Business Drivers/Customer Impact
The team needs an 8th member to keep up with the day-to-day operations of its data centers and maintain its high level of site reliability for large-scale consumer online services.
Skills:
Data Center Experience, Python or Bash Scripting, Rack and Stack, Hardware Troubleshooting, Prometheus, NAS, Linux, Cabling, Ansible, Red Hat, Server+ certification, CCNA
Top Skills Details:
Data Center Experience, Python or Bash Scripting, Rack and Stack, Hardware Troubleshooting, Prometheus, NAS, Linux
Experience Level:
Expert Level
About TEKsystems:
We're partners in transformation. We help clients activate ideas and solutions to take advantage of a new world of opportunity. We are a team of 80,000 strong, working with over 6,000 clients, including 80% of the Fortune 500, across North America, Europe and Asia. As an industry leader in Full-Stack Technology Services, Talent Services, and real-world application, we work with progressive leaders to drive change. That's the power of true partnership. TEKsystems is an Allegis Group company.
The company is an equal opportunity employer and will consider all applications without regard to race, sex, age, color, religion, national origin, veteran status, disability, sexual orientation, gender identity, genetic information or any characteristic protected by law.