Data Engineer

Hyderabad
Engineering – Engineering
Permanent - Regular Full-time
On-site
About Appen

Appen is a leader in AI enablement for critical tasks such as model improvement, supervision, and evaluation. To do this, we leverage our global crowd of over one million skilled contractors, who speak over 180 languages and dialects and represent 130 countries. In addition, we use the industry's most advanced AI-assisted data annotation platform to collect and label various types of data, such as images, text, speech, audio, and video.

Our data is crucial for building and continuously improving the world's most innovative artificial intelligence systems, and Appen is already trusted by the world's largest technology companies. Now, with the explosion of interest in generative AI, Appen is giving leaders in automotive, financial services, retail, healthcare, and government the confidence to deploy world-class AI products.

At Appen, we are purpose-driven. Our fundamental role in AI is to ensure all models are helpful, honest, and harmless, so we firmly believe in unlocking the power of AI to build a better world. We have a learn-it-all culture that values perspective, growth, and innovation. We are customer-obsessed and action-oriented, and we celebrate winning together.

At Appen, we are committed to creating an inclusive and diverse workplace. We are an equal opportunity employer that does not discriminate on the basis of race, religion, color, national origin, gender, sexual orientation, age, marital status, veteran status, or disability status.

Appen is seeking a highly skilled and motivated Data Engineer to join our dynamic team. In this role, you will use your extensive knowledge of software development to build and enhance complex systems and applications, contributing to the evolution of AI and machine learning.

Key Responsibilities:

    • Create and maintain optimal data pipeline architecture.
    • Identify, design, and implement internal process improvements: automating manual processes, optimizing data delivery, re-designing infrastructure for greater scalability.
    • Build analytics tools that utilize the data pipeline to provide actionable insights into customer-delivered data, operational efficiency, and other key business performance metrics.
    • Optimize and improve our existing data products.

Qualifications:

    • Bachelor's degree in Computer Science, Software Engineering, or a related field. A master's degree is a plus.
    • 2-5 years of experience in data engineering.
    • Strong programming skills in Python/Java, excellent SQL skills.
    • Experience with relational SQL and NoSQL databases.
    • Expertise in AWS services: EMR, RDS, Glue, Athena, S3, Data Pipeline, Redshift, Lambda, API Gateway.
    • Experience with big data tools: Spark, Kafka.
    • Experience with data streaming systems such as Spark Streaming and Storm.
    • Experience with data pipeline and workflow management tools such as Airflow.
    • Experience with shell scripting.
    • Ability to quickly understand and appreciate the underlying business context, problems, and objectives of analytical projects.
    • Clear communication skills to run well-defined analyses and produce reports.
    • Excellent time management skills.

Appen is the global leader in data for the AI Lifecycle with more than 25 years’ experience in data sourcing, annotation, and model evaluation. Through our expertise, platform, and global crowd, we enable organizations to launch the world’s most innovative artificial intelligence products with speed and at scale. Appen maintains the industry’s most advanced AI-assisted data annotation platform and boasts a global crowd of more than 1 million contributors worldwide, speaking more than 235 languages. Our products and services make Appen a trusted partner to leaders in technology, automotive, finance, retail, healthcare, and government. Appen has customers and offices globally.