August 1
🏡 Remote – New York
• Design, implement, support, and monitor a production environment that world-class athletes depend on
• Provide support across multi-region deployments
• Evangelize and mature best practices and standards for infrastructure-as-code across multiple product lines
• Use data to increase service resiliency and design to minimize downtime or loss of data
• Solve complex problems where analysis of situations or data requires the application of scientific methods and an in-depth evaluation of multiple factors
• Respond to and resolve incidents to minimize downtime and impact
• Develop and maintain monitoring systems, set up alerts, and analyze performance metrics to ensure high availability
• Plan for and manage the scaling of infrastructure to handle varying loads
• Create and maintain tools and automation to improve operational efficiency and reduce manual interventions
• Define and measure SLOs and work to meet or exceed them

• Expert knowledge of Linux operating systems and administration
• Experience with cloud-based infrastructure, with a heavy emphasis on AWS
• Commanding knowledge of configuration management and infrastructure-as-code tools such as Ansible, Terraform, and Packer
• Familiarity with Docker, containers, and their uses
• Proficiency in programming/scripting languages (Python, Go, Java, etc.)
• Strong knowledge of system design, performance tuning, and troubleshooting
• Experience with monitoring and logging tools (Prometheus, Grafana, Datadog)
• Understanding of incident management and disaster recovery practices

• Comprehensive benefits plan, including medical, dental, vision, disability, life insurance, and a 401(k) match
• Additional educational opportunities via Range for courses, conferences, and other options
• Unlimited paid time off
• Company equity
• 100% remote-optional work setting