Big Data Engineer
About Dathena Science
Dathena is a Swiss and Singaporean company developing data governance software based on machine learning algorithms. Dathena is the most complete and accurate data governance platform, and the only solution that can easily classify and manage data with this level of accuracy while helping companies comply with regulations. Using cutting-edge machine learning technology, Dathena sifts through all unstructured data, allowing companies to regain full control of their information.
- We are looking for a talented Data Engineer who will help us improve data ingestion and processing performance by analyzing current Spark applications and designing and implementing efficient architecture solutions.
- Your primary focus will be to optimize current implementations of ML algorithms and to bring new research features into production, following software engineering best practices.
Responsibilities
- Design and implement data pipelines for a large application
- Perform systems analysis, design, coding, unit testing, and other SDLC activities
- Benchmark different solution approaches and analyze their performance
- Optimize code and resource usage
- Present solution approaches and architecture choices, and ensure data architecture standards are adhered to
- Maintain high performance and data integrity of critical databases (NoSQL and SQL)
- Implement and update ETL processes
- Provide project technical support and expertise
- Support new projects and integrations in collaboration with the R&D team
Skills and qualifications
- Master’s Degree in Computer Science
- 2+ years' experience working with big data ecosystems in a Spark and Hadoop environment
- Start-up experience is a huge plus
- Understanding of database fundamentals, distributed computing, and microservice architecture principles; experience working with large HBase databases
- Functional programming skills in Scala and/or Java are good to have
- Experience with Docker, Unix shell (Bash), and Python; Kubernetes is a plus
- Good understanding of Data Science concepts (e.g. Machine Learning and Natural Language Processing)
- Knowledge of cloud computing infrastructure (e.g. Amazon Web Services EC2, Elastic MapReduce) and of considerations for scalable, distributed systems is a plus
- Gather and analyze requirements, convert functional requirements into concrete technical tasks, and provide reasonable effort estimates
- Work proactively, independently, and with global teams to address project requirements, and articulate issues and challenges with enough lead time to mitigate project delivery risks
- Provide expertise in technical analysis and resolve technical issues during project delivery
- Conduct code reviews and test case reviews, and ensure developed code meets requirements
- Experience working in Agile/Scrum teams using JIRA, Git, Bitbucket
- Location: Singapore R&D Office