Data Engineer – Azure & Big Data Platforms

This role is for one of our clients

Industry: Technology, Information and Media
Seniority level: Mid-Senior level

Min Experience: 5 years
Location: Remote (India)
Job Type: Full-time
Salary: ₹3,00,000 – ₹10,00,000 per year
We are seeking an experienced and versatile Data Engineer to join our growing data engineering team. In this role, you’ll design and build scalable, high-performance data pipelines and infrastructure using modern Azure and big data technologies. You will collaborate closely with cross-functional teams to deliver clean, accessible, and trusted data that powers advanced analytics, AI/ML models, and key business decisions.
Key Responsibilities
Build Scalable Pipelines: Design and implement robust ETL/ELT pipelines to ingest, transform, and store data from diverse structured and unstructured sources (e.g., APIs, Kafka, MongoDB, cloud services).
Develop Azure Data Solutions: Use Azure Data Factory, Databricks, and related tools to manage orchestration, transformation, and pipeline deployment.
Architect & Model Data: Develop scalable data lake and data warehouse solutions using ADLS Gen2, Delta Lake, and Azure SQL DB. Implement optimized data models for analytics and reporting use cases.
Optimize Performance: Apply best practices for data pipeline performance tuning, resource optimization, and parallel processing using Apache Spark and PySpark.
Enable Automation: Automate data workflows and validation processes, including test cases for ETL and Big Data pipelines, ensuring reliability and efficiency.
Collaborate Across Teams: Partner with analysts, data scientists, engineers, and business stakeholders to understand requirements and deliver impactful data products.
Implement Governance & Security: Ensure adherence to data governance, quality, privacy, and compliance standards. Use tools such as Azure Key Vault and Azure DevOps CI/CD pipelines for secure, automated deployments.
Monitor & Maintain: Establish monitoring, alerting, and logging for data pipeline health and quality, proactively resolving any failures or inconsistencies.

What We’re Looking For
Hands-on Expertise in:
Programming: Python, PySpark, Scala
Azure: Data Factory, Databricks, Key Vault, DevOps CI/CD
Storage: ADLS Gen2, Delta Lake, Azure SQL DB
Big Data Ecosystem: Apache Spark, Hadoop
Experience With:
Data ingestion from real-time and batch sources such as Kafka and MongoDB
Building secure, reusable, and scalable data infrastructure
Agile methodology and DevOps principles
Automation testing frameworks for ETL/Big Data pipelines
Preferred (but not required):
Exposure to AI/ML workflows and data science teams
Understanding of MLOps and deployment of ML models at scale

Core Competencies
Strong grasp of data modeling, warehousing, and distributed processing
Solid problem-solving and debugging skills
Passion for clean, reliable, and well-documented data systems
Excellent communication and teamwork abilities
Self-starter attitude with a sense of ownership and accountability

Education & Experience
Bachelor’s or Master’s degree in Computer Science, Engineering, or a related field
5–8 years of relevant data engineering experience with cloud and big data technology stacks