Sr. Data Engineer
Remote - US / Technology Operations – Engineering
LOCATION: U.S. Eastern Time Zone
Must reside in the US – preferably in the Eastern Time Zone. Remote working permitted. Must be eligible to work in the US without sponsorship now or in the future. This is a full-time position with benefits. Contractors will not be considered for this position.
Who we are:
Enterra provides solutions that leverage sophisticated machine learning, artificial intelligence (ontologies, inference engines and rules) and natural language processing to provide highly actionable insights and recommendations to business users. Today, our solutions impact just about every aspect of the products you buy at your local store – from what is available to how it is priced and even where it is placed on the shelf. Our SolaaS (Solution as a Service) solutions are deployed within private clouds – principally on Azure. We help transform market-leading companies into true data-driven digital enterprises.
What you will do:
The ideal candidate must be collaborative and deadline-driven. Because of the nature of our work and our technology, successful candidates must have a growth mindset and be comfortable with ambiguity, with the ability to take a proactive, structured approach to achieve results. A results orientation and a deadline-driven approach are critical in our fast-paced environment.
The successful candidate will join a diverse team to:
- Build unique high-impact business solutions utilizing advanced technologies for use by world-class clients.
- Create and maintain the underlying data pipeline architecture for the solution offerings from raw client data to final solution output.
- Create, populate, and maintain data structures for machine learning and other analytics.
- Use quantitative and statistical methods to derive insights from data.
- Guide the data technology stack used to build Enterra’s solution offerings.
- Combine machine learning, artificial intelligence (ontologies, inference engines and rules) and natural language processing under a holistic vision to scale and transform businesses — across multiple functions and processes.
- Work with other Enterra personnel to develop and enhance commercial-quality solution offerings.
- Design, create and maintain optimal data pipeline architecture, incorporating data wrangling and Extract-Transform-Load (ETL) flows.
- Assemble large, complex data sets to meet analytical requirements – analytics tables, feature engineering, etc.
- Design and build the infrastructure required for optimal, automated extraction, transformation, and loading of data from a wide variety of data sources using SQL and other ‘big data’ technologies such as Databricks.
- Design and build automated analytics tools that utilize the data pipeline to derive actionable insights.
- Identify, design, and implement internal process improvements: automating manual processes, optimizing data delivery, re-designing infrastructure for greater scalability, etc.
- Design and develop data integrations and a data quality framework.
- Develop appropriate testing strategies and reports for the solution as well as data from external sources.
- Evaluate new technology for use within Enterra.
- Work with other Enterra and client personnel to administer and operate client-specific instances of the Enterra solution offerings
- Configure the data pipelines to accommodate client-specific requirements to onboard new clients.
- Perform regular operations tasks to ingest new and changing data – implement automation where possible.
- Implement processes and tools to monitor data quality - investigate and remedy any data-related issues in daily solution operations.
- May provide guidance and oversight to fellow data engineers
What you will need:
- Bachelor’s degree in Computer Science or a STEM (Science, Technology, Engineering or Math) field required
- Minimum of 7 years of hands-on experience as a data engineer or in a similar position.
- Minimum of 7 years of commercial experience with Python or Scala.
- Minimum of 7 years of experience working with SQL and relational databases (Postgres preferred).
- Experience with at least one of the following – Databricks, Spark, Hadoop or Kafka
- Demonstrable knowledge and experience developing data pipelines to automate data processing workflows
- Demonstrable experience in data modeling
- Demonstrable knowledge of data warehousing, business intelligence, and application data integration solutions
- Demonstrable experience in developing applications and services that run on a cloud infrastructure (Azure preferred)
- Excellent problem-solving and communication skills
- Ability to thrive in a fast-paced, remote environment.
- Comfortable with ambiguity, with the ability to build structure and take a proactive approach to drive results.
- Attention to detail – quality and accuracy in work is essential.
The following additional skills would be beneficial:
- Knowledge of one or more of the following technologies: Data Science, Machine Learning, Natural Language Processing, Business Intelligence, and Data Visualization.
- Knowledge of statistics and experience using statistical or BI packages for analyzing large datasets (Excel, R, Python, Power BI, Tableau, etc.).
- Experience with container management and deployment, e.g., Docker and Kubernetes