- Work directly with Data Analysts and the Platform Engineering team to create reusable experimental and production data pipelines
- Understand, tune, and master the processing engines used day-to-day (e.g., Spark, Hive, and Cascading)
- Keep the data whole, safe, and flowing with expertise in high-volume data ingest and streaming platforms (e.g., Spark Streaming and Kafka)
- Shepherd and shape the data by developing efficient structures and schemas for data in storage and in transit
- Explore new technology options for data processing and storage, and share findings with the team
- Develop tools and contribute to open source wherever possible
- Adopt problem solving as a way of life: always go to root cause!
• Proficient in Java and/or Scala
• Fluidly switch mindsets between Java/Scala and Python
• Implement data platform components such as RESTful APIs, pub/sub systems, and database clients
• AWS experience a plus
• Application of toolsets in the Apache Hadoop ecosystem
• Assembly and deployment of JVM applications
• Database experience essential
• Familiarity with reactive platforms and microservices
• Experience with R, Apache Spark, or Akka a plus
• Degree in Computer Engineering or Computer Science, or 3-5 years of equivalent experience