Data Engineer

Texas - Austin / Engineering – Engineering / Full-Time / Remote
The Role

Veeva OpenData supports the industry by providing real-time reference data across the complete healthcare ecosystem, to support commercial sales execution, compliance, and business analytics. We drive value to our customers through constant innovation, using cloud-based solutions and state-of-the-art technologies to deliver product excellence and customer success.
 
As a Data Engineer in OpenData, you will take responsibility for the OpenData processing workflows in the US. You will build and maintain data processing tools, pipelines, and reports, ensuring the quality of our reference data. We value end-to-end ownership, which gives you the freedom to determine the correct course of action, do all due diligence, and execute solutions in your own creative way.

What You'll Do

    • Build and maintain data processing pipelines and tools using state-of-the-art technologies
    • Work with Python and SQL on Spark-based data pipelines
    • Develop algorithms to build complex data relationships
    • Build analytical data structures to support reporting
    • Build and maintain Data Quality processes
    • Collaborate with the Product team to adapt our reference data to changing market demands

Requirements

    • 3+ years of experience developing data pipelines using cloud-managed Spark clusters (e.g., AWS EMR, Databricks)
    • Fluent in Python and PySpark (3+ years of experience)
    • Previous experience building tools and libraries to automate and streamline data processing workflows
    • Proficient with SQL / SparkSQL
    • Hands-on experience working with a Data Lakehouse
    • Good verbal and written communication skills and proven experience working and delivering in an Agile environment
    • Applicants must have the unrestricted right to work in the United States. Veeva will not provide sponsorship at this time
    • We are looking for strong mentors with a proven record of making their teams better

Nice to Have

    • Experience running data workflows through DevOps pipelines
    • Experience developing data pipelines with orchestration tools (e.g., Airflow)
    • Experience with AWS data processing services such as EMR and MWAA
    • Previous experience in the Life Sciences sector

Perks & Benefits

    • Medical, dental, vision, and basic life insurance
    • Flexible PTO and company-paid holidays
    • Retirement programs
    • 1% charitable giving program

Compensation

    • Base pay: $75,000 - $130,000
    • The salary range listed here has been provided to comply with local regulations and represents a potential base salary range for this role. Please note that actual salaries may fall within, above, or below this range, depending on experience and location. We look at compensation for each individual and base our offer on your unique qualifications, experience, and expected contributions. This position may also be eligible for other types of compensation in addition to base salary, such as variable bonus and/or stock bonus.