Lead Site Reliability Engineer - Umanist Staffing LLC
Fort Worth, TX
About the Job
Overview:
TekWissen Group is a workforce management provider throughout the USA and many other countries in the world. Our client is a multinational information technology services and consulting company and a leading provider of information technology, consulting, and business process outsourcing services, dedicated to helping the world's leading companies build stronger businesses.
Title: Lead Site Reliability Engineer (SRE)
Work Location: Fort Worth, TX
Job Type: Contract
Work Type: Hybrid - 3 days onsite weekly
Duration: 4 Months
Job Description:
A Site Reliability Engineer is responsible for monitoring, automating, and improving the reliability, performance, and availability of the client's TechOps supply chain application.
We are seeking a highly skilled SRE with a strong background in Database Design, Azure DevOps, SRE, GitHub, and Infra CI/CD Pipelines.
With 12-15 years of total experience, the ideal candidate will lead our efforts in managing geospatial data and infrastructure, ensuring high-quality and efficient project delivery.
Experience:
12 to 15 years of total experience.
7 to 8 years of solid SRE working experience.
Required Skills:
Monitoring and Metrics in Dynatrace, Prometheus, Grafana and integrations with Moogsoft/xMatters
Implement GitHub, GitHub Actions CI/CD, and ADO cloud for automation
Implement monitoring and observability in AKS, Kubernetes, and Azure cloud
Open source logging infrastructure
2 years of experience working in an environment with Node.js and GraphQL (GQL)
Hands-on experience with Infrastructure as a Service (IaaS) and Platform as a Service (PaaS) tools and platforms, and with containers and container orchestration platforms (e.g., Docker, Kubernetes)
Expertise in one or more cloud native relational databases such as MySQL and PostgreSQL, and NoSQL databases such as Cassandra and MongoDB, highly desired
Strong technical knowledge and skills that are broad and deep, covering various hardware, software, and technology platforms
Develop, implement, and maintain applications and systems that integrate MongoDB
Dynatrace
Mezmo
Security Vulnerabilities (remediation/compliance)
Nice to have skills:
Terraform in Azure and on-prem infrastructure resources
Load balancing the application, including proxies and CDN (automated)
Able to script automated performance testing scenarios for APIs and web front ends and embed them in CI/CD pipelines; dashboarding/reporting query languages
Airline industry experience helpful
TypeScript, JavaScript
Database and persistence frameworks: Mongo, Oracle, Object/Relational Mapping, Query performance tuning
Experience with Mongo Schema Design and Mongo Aggregation Framework
Web Services: GraphQL, REST/SOAP (JSON/WSDL/XML)
DB Admin/SQL Server
Terraform
SysAdmin
Troubleshooting Network Issues
VM Management
Roles and Responsibilities:
Make monitoring and alerting notify on symptoms and not on outages.
Document findings, turn them into repeatable actions, and then into automation.
Improve the deployment, change management, and release management processes to make them efficient and streamlined.
Debug production issues across services and levels of the stack.
Propose ideas and solutions within the product team to improve resiliency, availability, and security.
Plan and execute configuration change operations at both the application and the infrastructure level.
Actively look for opportunities to improve the availability and performance of the system by applying the learnings from monitoring and observation.
Complete Root Cause Analysis (RCA) investigations.
Improve DevSecOps practices, accelerate delivery, and take a lead role in troubleshooting technical issues.
Assist in providing inputs to develop strategic technology roadmaps.
Respond to incidents and provide support for customer incidents.
TekWissen® Group is an equal opportunity employer supporting workforce diversity.
Source : Umanist Staffing LLC