Site Reliability Engineer
Company: S3
Location: Irving
Posted on: April 2, 2026
Job Description:
Strategic Staffing Solutions is currently looking for a Site Reliability Engineer for a W2 contract opportunity with one of our largest clients. Candidates must be willing to work on our W2 only; no C2C.

Site Reliability Engineer
Location: Irving, TX / Charlotte, NC / Phoenix, AZ
Type: W2 contract, 12 months
Work Schedule: Hybrid, 3 days in office

Top Skills:
- Strong Kubernetes (K8s) experience (OCP/OpenShift preferred)
- Hands-on Harness (CD tool) experience
- DevOps / SRE background (5 years overall)
- CI/CD platform support (NOT just usage)
- Cloud exposure (OCP primary; Azure/GCP acceptable)

Platform Ownership & Reliability (SRE):
- Support end-to-end reliability, availability, and performance of the Harness CD platform across non-prod, prod, and BCP environments
- Maintain and report on SLIs, SLOs, error budgets, deployment success rates, and platform health metrics
- Lead incident response, troubleshooting, and RCA for deployment failures, delegate outages, and platform performance issues
- Identify and remediate scaling, performance, and capacity constraints across delegates, pipelines, Kubernetes clusters, and cloud integrations

Automation & Engineering Excellence:
- Develop automation for provisioning, configuration, scaling, upgrades, and maintenance of Harness components
- Build Infrastructure as Code (IaC) using Terraform, Ansible, Helm, or equivalent tools
- Automate common operational tasks, including delegate lifecycle, cluster onboarding, secret rotation, and pipeline validation
- Reduce manual work by implementing resilient, repeatable, and self-service automation workflows

DevOps & CI/CD Integration:
- Maintain and enhance Harness integrations with GitHub, Jenkins, Azure DevOps, Kubernetes/OpenShift clusters, and cloud providers
- Ensure an efficient developer experience through well-optimized pipelines and reliable deployment mechanisms
- Partner with DevOps teams to optimize deployment strategies (blue/green, canary, rolling)
- Work with Security teams to embed DevSecOps controls such as policy enforcement, governance pipelines, and security checks

Observability & Monitoring:
- Implement and maintain monitoring, logging, dashboards, and alerting for all Harness components
- Use Splunk, Prometheus, Grafana, AppDynamics, or similar tools to build actionable alerts
- Detect and escalate issues such as delegate saturation, pipeline slowdowns, API failures, and Kubernetes resource constraints
- Support proactive monitoring to reduce mean time to detection and resolution

Modernization & Continuous Improvement:
- Assist with Harness upgrades, hotfixes, patching, and vendor-recommended lifecycle activities
- Contribute to modernization efforts, including containerization, cloud-native deployments, and multi-cloud expansion
- Support resiliency improvements such as BCP validation, backup verification, and BCP readiness
- Evaluate new Harness features, modules, and platform capabilities for enterprise usage

Technical Leadership:
- Act as a technical SME for Harness platform operations and enhancements
- Provide platform guidance, documentation, architecture details, and runbook development
- Partner with senior engineers to improve standards, automation patterns, and operational excellence

Required Qualifications:

Core Technical Skills:
- 5-7 years of experience in DevOps, SRE, Platform Engineering, or Cloud Engineering roles
- Hands-on experience with Harness CD
- Strong experience with Kubernetes/OpenShift, Linux, cloud services, and deployment best practices
- Solid understanding of CI/CD workflows and software release automation

SRE & Automation:
- Experience applying SRE concepts such as SLIs/SLOs, error budgets, and operational maturity improvements
- Strong automation/scripting skills using Python, Bash, or PowerShell
- Infrastructure as Code experience with Terraform, Ansible, Helm, or equivalent tooling

Observability & Troubleshooting:
- Experience with observability tools (Prometheus, Grafana, Splunk, ELK, AppDynamics, etc.)
- Strong troubleshooting skills across container, OS, networking, platform, and cloud technology layers

Preferred Qualifications:
- Experience supporting CD platforms at enterprise scale (hundreds of teams, multi-region deployments)
- Experience in cloud-native and hybrid cloud environments (Azure, GCP)
- Familiarity with DevSecOps practices, policy automation frameworks, and governance models
- Experience supporting complex upgrades, platform migrations, or modernization projects

“Beware of scams. S3 never asks for money during its onboarding process.”
Keywords: S3, Wylie, Site Reliability Engineer, IT / Software / Systems, Irving, Texas