Senior Data Engineer

New York, NY
Data Migration
Remote
Company: Spotify USA, Inc.
Company HQ: 4 World Trade Center, 150 Greenwich Street, New York, NY 10007
Location: *Telecommuting permitted: work may be performed anywhere within the Eastern Time Zone region of the U.S.
Job Title: Senior Data Engineer
                     
DUTIES: Design, implement, and maintain batch and real-time data pipelines in Scala, Scio, and Google Cloud Platform that integrate with the broader Spotify software ecosystem, extracting and transforming user streaming data for machine learning applications, specifically to train recommender systems. Design, implement, and maintain Java microservices that integrate with the broader Spotify software ecosystem, aimed at exposing users to multiple experiences for comparative analysis to determine the best one. Write code in Java, Scala, and Python with a focus on rapid experimentation, testability, space efficiency, and order efficiency. Deploy, operate, monitor, and be on-call for large-scale distributed production software services serving traffic to Spotify users worldwide. Produce documentation for internal systems, detailing their design and intended use. Interview candidates for Data Engineering roles, review candidate applications during the hiring process, onboard new hires, and mentor less experienced engineers to foster a healthy and inclusive team culture.
 
JOB REQUIREMENTS: Bachelor’s degree (U.S. or foreign equivalent) in Computer Science, Software Engineering, Systems Engineering, or a related field and six (6) years of experience in the job offered or in a related role. Must have six (6) years of experience with: leading technical initiatives from inception to delivery while managing stakeholder expectations throughout. Must have five (5) years of experience with: developing backend microservices in Java; developing batch and real-time big data pipelines in Scala and Python, and implementing their scheduling and orchestration in Python; and designing, developing, and maintaining data pipelines that create machine learning features. Must have three (3) years of experience with: designing and executing data collection experiments and A/B tests to develop and benchmark machine learning models. Must have two (2) years of experience with: production experience with the Apollo open-source framework for microservices; production experience with open-source frameworks for data pipelines, including Scio, Flyte, Luigi, and Styx; production experience with Google Cloud Platform and its product offerings, including GKE, Dataflow, Pub/Sub, BigQuery, Bigtable, Kubeflow, Cloud SQL, and Cloud Logging; and designing, developing, and maintaining highly available, low-latency feature stores to serve these features for real-time inference.

SALARY: $182,962 to $274,443 per year