Senior Site Reliability Engineer (SRE)

August 9

🏡 Remote – New York

Business Wire

Global Leader in News Content Distribution

Public Relations • Press Release Distribution • Investor Relations • SEC filing • SEO

501 - 1000 employees

Description

• Design and implement highly automated systems and services that ensure the availability, reliability, and scalability of infrastructure and applications.
• Build and maintain monitoring and alerting to provide timely feedback on the performance and health of systems, networks, and applications.
• Continuously improve infrastructure and application design to ensure 99.99% uptime while removing architectural complexity.
• Work with software development to design and implement systems and applications that are resilient to failure and highly scalable.
• Achieve material application performance improvements based on insights from observability metrics.
• Develop and maintain disaster recovery plans and procedures.
• Participate in on-call rotations to ensure 24/7 application availability.
• Triage incoming Web Support escalation requests.
• Drive incident root cause analysis and service restoration, and serve as incident commander during outage events.

Requirements

• 7+ years of experience as a software engineer, with 5 years as an SRE supporting Infrastructure, Networking, and Application Operations in a high-availability, 24x7 hybrid environment (Colo/Cloud)
• Strong record of automation (e.g., Python, Bash, Ansible, Terraform, CloudFormation)
• Strong experience with AWS cloud infrastructure and container orchestration (Kubernetes, ArgoCD) operating in a GitOps framework
• Strong experience with application monitoring, observability, and alerting systems (e.g., New Relic, Grafana)
• Strong experience with at least one programming language (Python, Java)
• Advanced experience with Linux system administration, Java-based applications, and network architecture
• Ability to participate in architecture reviews
• AWS-related certifications (Architecture, DevSecOps, Cloud Engineer) are a plus

Benefits

• Ability to work remotely
• Excellent health benefits that begin on your first day of employment
• $100 monthly fitness allotment, a tuition reimbursement program, and enhanced mental health resources
• 401(k) plan with generous company match, and annual profit sharing contribution (subject to company performance)
• PTO, Floating Holidays, Wellness Day Off, Birthday Day Off, and more!
