Lead Data Engineer

February 20

🏡 Remote – New York


SprintFWD

Fitness-tech product development agency.

11–50 employees

Description

• Lead the design and growth of our products and data warehouses supporting our clients' analytics
• Design and develop scalable data warehousing solutions, building ETL pipelines in Big Data environments (cloud, on-prem, hybrid)
• Manage the transformation of large daily batch data volumes in the cloud using Apache Spark, EMR, and Glue, ensuring streamlined processing and cost savings
• Build and maintain high-throughput streaming data pipelines using technologies like Kinesis, Spark Streaming, and Elasticsearch, while minimizing response lag
• Automate and orchestrate complex data workflows using Python, Apache Airflow, and Step Functions to eliminate bottlenecks in data pipelines
• Mentor and guide a team, providing technical expertise in SQL query execution, data manipulation, data visualization, and performance optimization
• Develop, test, and deploy scalable reverse ETL solutions using API Gateway, Python (Flask), and Lambda, achieving near-zero latency and high scalability
• Help architect data solutions and frameworks, and define data models for the underlying data warehouse and data lakes
• Collaborate with key stakeholders to map, implement, and deliver successful data solutions
• Maintain detailed documentation of your work and changes to support data quality and data governance
• Ensure high operational efficiency and quality in your solutions to meet SLAs and support our commitment to clients
• Be an active participant in and advocate of agile/scrum practices to keep your team's processes healthy and continuously improving

Requirements

• 3–5 years of data engineering experience developing large data pipelines
• Strong SQL skills and the ability to write queries to extract data and build performant datasets
• Hands-on experience with data integration tools (e.g., Apache Spark, Apache Kafka)
• Hands-on experience with cloud-based data services (e.g., AWS Glue, Azure Data Factory, Google Cloud Dataflow)
• Experience with version control systems (e.g., Git) and collaborative development practices
• Strong programming skills in Python
• Experience with at least one major MPP or cloud database technology (Snowflake, Redshift, BigQuery)
• Solid experience with data orchestration toolsets (e.g., Airflow) and with writing and maintaining data pipelines
• Strong in data modeling techniques and data warehousing standard methodologies and practices
• Familiarity with Scrum and Agile methodologies
• A problem solver with strong attention to detail and excellent analytical and communication skills
• Nice to have: experience with AWS cloud technologies (S3, EMR, EC2)

Benefits

• Health, Dental, Vision, and Life Insurance
• 401(k) Plan
• Various Fitness and Wellness Benefits (Group Classes, Free Training, Products, etc.)
• Continued Education Courses
• Conference Attendance
