Data Engineer – Azure & Big Data Platforms

This role is for one of our clients

Industry: Technology, Information and Media
Seniority level: Mid-Senior level

Min Experience: 5 years
Location: Remote (India)
Job Type: Full-time
Salary: ₹3,00,000 – ₹10,00,000 per year
We are seeking an experienced and versatile Data Engineer to join our growing data engineering team. In this role, you’ll design and build scalable, high-performance data pipelines and infrastructure using modern Azure and big data technologies. You will collaborate closely with cross-functional teams to deliver clean, accessible, and trusted data that powers advanced analytics, AI/ML models, and key business decisions.
Key Responsibilities
Build Scalable Pipelines: Design and implement robust ETL/ELT pipelines to ingest, transform, and store data from diverse structured and unstructured sources (e.g., APIs, Kafka, MongoDB, cloud services).
Develop Azure Data Solutions: Use Azure Data Factory, Databricks, and related tools to manage orchestration, transformation, and pipeline deployment.
Architect & Model Data: Develop scalable data lake and data warehouse solutions using ADLS Gen2, Delta Lake, and Azure SQL DB. Implement optimized data models for analytics and reporting use cases.
Optimize Performance: Apply best practices for data pipeline performance tuning, resource optimization, and parallel processing using Apache Spark and PySpark.
Enable Automation: Automate data workflows and validation processes, including test cases for ETL and Big Data pipelines, ensuring reliability and efficiency.
Collaborate Across Teams: Partner with analysts, data scientists, engineers, and business stakeholders to understand requirements and deliver impactful data products.
Implement Governance & Security: Ensure adherence to data governance, quality, privacy, and compliance standards. Use tools such as Azure Key Vault and Azure DevOps CI/CD pipelines for secure, automated deployments.
Monitor & Maintain: Establish monitoring, alerting, and logging for data pipeline health and quality, proactively resolving any failures or inconsistencies.

What We’re Looking For
Hands-on Expertise in:
Programming: Python, PySpark, Scala
Azure: Data Factory, Databricks, Key Vault, DevOps CI/CD
Storage: ADLS Gen2, Delta Lake, Azure SQL DB
Big Data Ecosystem: Apache Spark, Hadoop
Experience With:
Data ingestion from real-time and batch sources such as Kafka and MongoDB
Building secure, reusable, and scalable data infrastructure
Agile methodology and DevOps principles
Automation testing frameworks for ETL/Big Data pipelines
Preferred (but not required):
Exposure to AI/ML workflows and data science teams
Understanding of MLOps and deployment of ML models at scale

Core Competencies
Strong grasp of data modeling, warehousing, and distributed processing
Solid problem-solving and debugging skills
Passion for clean, reliable, and well-documented data systems
Excellent communication and teamwork abilities
Self-starter attitude with a sense of ownership and accountability

Education & Experience
Bachelor’s or Master’s degree in Computer Science, Engineering, or a related field
5–8 years of relevant data engineering experience with cloud and big data technology stacks