Data Engineer

March 17

🏢 In-office - Manhattan

Cybersyn

Cybersyn is a DaaS (data-as-a-service) company whose mission is to create a real-time view into the world’s economy.

2 - 10

Description

• Help get data from wherever it is to where we need it (in Snowflake). In practice, this often means writing jobs to extract, download, or transform data as efficiently as possible. You need to worry about compute efficiency and also care about building some context for what the data actually is.
• Take research and statistical models and pipelines and implement them in Snowflake efficiently, meeting time SLA requirements while minimizing costs.
• Tune Snowflake for performance and cost optimization.
• Provide infrastructure guidance on Snowflake capabilities to accommodate business and technical use cases.
• Provide production support for data warehouse issues such as data load problems, transformation translation problems, and query optimization.
• Take end-to-end ownership of your work and enjoy working with different functions across the company.

Requirements

• Experience with Snowflake is required.
• Experience with query optimization is required. You are comfortable in the Snowflake Query Profiler. Snowflake micro-partitions, sortkeys, query acceleration, and search optimization service should all be terms that you are familiar with and ready to discuss.
• Experience in Python and SQL is required.
• Experience working with multiple (external) datasets: cleaning, joining, and munging data. Experience working with public data sources (e.g., US Census, ACS Survey) is a huge plus.
• Experience with dbt and orchestrator systems (Dagster, Prefect, Mage, Kestra, or some equivalent) is highly valued.
• Experience building and operating data pipelines for real customers in production systems.

Benefits

• Ability to shape Cybersyn’s initial technology decisions.
• Access to some of the largest and most interesting economic data in the world, including real-time spending, transaction, and clickstream data from both third-party and first-party sources.
• Much of our data is not available to any other third parties.
• Our system is built with heterogeneous data sources in mind: we are not working on data from a single product or theme, but on data from governments, payment processing systems (think bank records), mobile devices and apps, and SaaS exhaust (think of the data that B2B SaaS collects).
• Fast-moving culture, with lots of responsibility and autonomy from day one.
