Consultant - Data Aggregation

Pune, Maharashtra
Consulting – Consulting / Full-time / Hybrid
Beghou brings over three decades of experience helping life sciences companies optimize their commercialization through strategic insight, advanced analytics, and technology. From developing go-to-market strategies and building foundational data analytics infrastructures to leveraging artificial intelligence to improve customer insights and engagement, Beghou helps life sciences companies maximize performance across their portfolios. Beghou also deploys proprietary and third-party technology solutions to help companies forecast performance, design territories, manage customer data, organize and report on medical and commercial data, and more. Headquartered in Evanston, Illinois, we have 10 global offices.

Our mission is to bring together analytical minds and innovative technology to help life sciences companies navigate the complexity of health care and improve patient outcomes.


The Patient Data Aggregator is responsible for the secure collection, standardization, and aggregation of sensitive patient data across multiple systems to support clinical, operational, and research needs. This role requires a strong understanding of data privacy protocols, including de-identification methodologies such as tokenization and expert determination, and must operate within a HISEC-compliant (High Security) environment. The candidate will collaborate with cross-functional teams to ensure data is managed in a secure, compliant, and usable manner. If you are passionate about utilizing your expertise in U.S. pharmaceutical datasets to drive client success, we invite you to apply for this exciting opportunity! 

We'll trust you to:

    • Collect, clean, and aggregate patient-level data from Electronic Health Records (EHRs), lab systems, and external databases. 
    • Apply tokenization techniques to replace direct identifiers while preserving data integrity for linkage and analysis. 
    • Perform or support expert determination de-identification processes to evaluate and minimize re-identification risk in datasets. 
    • Ensure all activities comply with HIPAA and institutional privacy policies, particularly within HISEC environments.
    • Manage and maintain secure data pipelines and repositories for sensitive patient information. 
    • Assist in the development of standard operating procedures (SOPs) for secure data aggregation and handling. 
    • Support the generation of anonymized datasets for research, analytics, or external data sharing initiatives. 
    • Monitor data quality, flag inconsistencies, and ensure proper documentation of data provenance and transformation. 
    • Provide support for audits, security assessments, and internal data governance reviews. 
    • Design and implement end-to-end data aggregation solutions tailored to business requirements using Python and PySpark.
    • Synthesize findings, develop recommendations, and communicate results to clients and internal teams.
    • Assume project management responsibilities for data aggregation implementation on each project with minimal supervision, including managing onshore/client communication, leading meetings, and drafting agendas.
    • Manage multiple projects simultaneously while meeting deadlines.

You'll need to have:

    • Engineering/Master’s Degree from a Tier-I/Tier-II institution/university in Computer Science or relevant concentration, with evidence of strong academic performance. 
    • 5+ years of relevant consulting-industry experience.
    • Deep understanding of data management best practices, data modeling, and data analytics.
    • Strong understanding of U.S. pharmaceutical datasets and their applications.
    • Experience with tokenization tools/methods and expert determination for de-identification.
    • Logical thinking and problem-solving skills along with an ability to collaborate. 
    • Familiarity with HISEC standards or other high-security frameworks.
    • Strong programming skills in Python and working knowledge of PySpark.
    • Proficiency in Microsoft Office products, data analysis tools (e.g., Snowflake, Redshift), and job orchestration tools (e.g., Airflow, Databricks Workflows).
    • Strong understanding of HIPAA, GDPR, and related data privacy regulations. 
    • Excellent verbal and written communication skills, with the ability to convey complex information clearly to diverse audiences.
    • Strong organizational abilities and effective time management skills.
    • Ability to thrive in an international matrix environment and willingness to support US clients during US working hours with 3-4 hours of overlap. 

At Beghou Consulting, you'll join a highly collaborative, values-driven team where technical excellence, analytical rigor, and personal growth converge. Whether you're passionate about AI innovation, building commercialization strategies, or shaping the next generation of data-first solutions in life sciences, this is a place to make an impact!