Full Stack/Site Reliability Engineer - ICONMA, LLC
Dearborn, MI 48120
About the Job
Full Stack/Site Reliability Engineer
Location: Dearborn, MI/Hybrid
Duration: 24 months
Description:
We are seeking a talented Full Stack / Site Reliability Engineer to play a key role in developing a comprehensive Internal Developer Platform (IDP) that includes CI/CD pipelines, managed infrastructure, observability, and a developer portal. The primary focus of this role will be on ensuring the stability and scalability of the Internal Developer Platform that hosts the cloud applications.
The secondary focus of this role will be to facilitate the enablement of our product teams developing and supporting these cloud applications.
Responsibilities:
Bring a strong background in software development and systems administration, as well as excellent problem-solving and communication skills.
Measure and optimize system performance, with an eye toward pushing our capabilities forward, getting ahead of customer needs, and innovating to continually improve.
Provide primary operational and engineering support for multiple large, distributed software applications.
Participate in an on-call rotation for incident response and support.
Skills Required:
Understanding of gRPC & RESTful APIs and of microservices platforms.
Experience Required:
5-6 years’ experience with Golang, Java, J2EE, NoSQL/SQL datastores, Spring Boot, GCP/AWS/Azure, and Docker/Kubernetes (K8s) in the development and maintenance of multi-tier applications.
4-5 years of experience with APM and other monitoring tools such as Grafana Cloud, Dynatrace, New Relic, ELK, Splunk, Prometheus, Sensu, Nagios, Kafka, DataDog, and PagerDuty.
Strong experience working with product & development teams to establish error budgets by identifying the right SLOs (service level objectives), SLIs (service level indicators), and KPIs (key performance indicators), and effectively driving the use of those budgets to ensure maximum domain availability/uptime.
Nice to have: Google Cloud Platform Engineer
Experience Preferred:
Regularly review key site technical metrics such as transaction errors, logging, response times, caching strategies, conversion/bounce rates, and capacity & resource utilization.
Proactively identify stability risks & work with engineering leadership to establish appropriate mitigation plans.
Experience solving complex architecture/design & business problems, working to simplify, optimize, and remove bottlenecks.
Architect, design & develop automation to reduce toil and improve the recoverability, availability, latency & scalability of supported applications, with an understanding of MTTD (Mean Time to Detection) & MTTR (Mean Time to Resolution).
Maintain a knowledge repository that includes standard operating procedures, release checklists, and runbooks for incident recovery.
Education Required:
4-year college degree in Computer Science or equivalent experience.
As an equal opportunity employer, ICONMA provides an employment environment that supports and encourages the abilities of all persons without regard to race, color, religion, gender, sexual orientation, gender identity or expression, ethnicity, national origin, age, disability status, political affiliation, genetics, marital status, protected veteran status, or any other characteristic protected by federal, state, or local laws.
Source: ICONMA, LLC