[VO2 Data] - Data Engineer GCP

Paris, France / VO2 Data / CDI (permanent contract) / Hybrid
A tech and digital consulting company, VO2 Group is an independent international enterprise that unleashes the transformative power of technology to propel major brands into an era of endless possibilities.

As the leading independent French group, founded as an equal partnership, we operate from Paris, Lille, Montreal, Toronto, Shanghai, Casablanca, Jerusalem, and New York with over 700 consultants.
As a global player in customer experience, we engage in everything from business and digital strategy, IT, and data to the design of web and mobile products, as well as the creation of tailor-made digital experiences.

We were named Growth Champion among IT services companies (ESNs) by LES ECHOS in 2023.
We have won the ESG WeImpact Happy at Work 2023 award.

VO2 GROUP, The Bright Side Of Tech.

Job description

Data Collection: Extraction of data from various sources, whether it be databases, files, or real-time data streams.

Data Cleaning and Transformation: Cleaning, filtering, enriching, and transforming data to prepare it for analysis. This may include handling missing data, normalization, format conversion, etc.

Data Pipeline Design: Creation of data pipelines to automate data flow, including managing dependencies between different pipeline stages.

Data Storage: Selection of appropriate storage solutions, whether it be Google Cloud Storage, Bigtable, BigQuery, or other GCP services.

Data Integration: Integrating data into data warehouses, columnar data stores, NoSQL databases, or data lakes.

Data Quality Management: Implementation of data quality controls to ensure data integrity and quality.

Data Security: Implementation of security measures to protect sensitive data, including data access, identity and access management, encryption, etc.

Performance Optimization: Monitoring and optimizing the performance of data pipelines to ensure quick response to queries and efficient resource utilization.

Documentation: Documenting data pipelines, data schemas, and processes to facilitate understanding and collaboration.

Automation: Automating ETL (Extract, Transform, Load) processes to minimize manual intervention.

Collaboration: Collaborating with data scientists, analysts, and other team members to understand their needs and ensure data readiness for analysis.

Monitoring: Constant monitoring of data pipelines to detect and resolve potential issues.

Scalability: Designing scalable data pipelines capable of handling growing data volumes.

This list of tasks is not exhaustive and is subject to change.

Profile sought

GCP Mastery: A deep understanding of GCP services and tools is essential for designing and implementing data engineering solutions.

Real-time Data Processing: Ability to design and implement real-time data pipelines using services like Dataflow or Pub/Sub.

Batch Data Processing: Competence in creating batch data processing workflows with tools like Dataprep and BigQuery.

Programming Languages: Proficiency in programming languages such as Python, Java, or Go for scripting and application development.

Databases: Knowledge of both NoSQL databases (Cloud Bigtable, Firestore) and SQL databases (BigQuery, Cloud SQL) for data storage and retrieval.

Data Security: Understanding of data security best practices, including authorization management, encryption, and compliance.

Orchestration Tools: Ability to use orchestration tools such as Cloud Composer or Cloud Dataflow to manage data pipelines.

Problem-solving: Ability to solve complex problems related to data collection, processing, and storage, and to optimize the performance of data pipelines.