Data Engineer

New York City / San Francisco / Remote / Engineering / Full Time
Gauntlet’s mission is to drive adoption and understanding of the financial systems of the future. Gauntlet is the platform for off-chain intelligence that drives on-chain efficiency in Decentralized Finance (DeFi). We work with protocols to manage risk, improve capital efficiency, and optimize incentive spend. We also publish cutting-edge research and aim to take a leading role in defining market risk standards across the industry.

Gauntlet is building infrastructure that allows us to simulate and stress-test blockchain protocols, contracts, and network interactions at scale over a wide range of market conditions. Our models ingest both on-chain and off-chain data and are continuously calibrated to the current crypto market structure so that our recommendations stay up to date. These models and infrastructure power our platform, which currently manages risk and optimizes incentives for over $40B in assets.

You will be working as part of an experienced team that has developed simulation software for many other industries, including high-frequency trading, autonomous vehicles and ride-sharing, and the natural sciences.

The Gauntlet platform ingests and indexes data from many sources to power the models behind the off-chain intelligence that drives on-chain capital efficiency. This includes public and private blockchain data, as well as market data from centralized exchanges. As the team's first Data Engineer, you will build data intelligence systems, ETL pipelines, and internal databases to take our existing infrastructure to the next level.

Responsibilities

    • Build and operate data infrastructure: storage, container orchestration, database management, and real-time data ingestion systems.
    • Optimize data systems for scale, speed, and reliability.
    • Design and build services for end-to-end data security and governance: managing access controls across multiple storage and access layers (including BI tools), tracking data quality, cataloguing datasets and their lineage, detecting duplication, auditing usage, and ensuring correct data semantics.
    • Build and evolve the tools that empower colleagues across the company to access data and build reliable and scalable transformations. This includes UIs and simple frameworks for derived tables and dimensional modeling, APIs and caching layers for high-throughput serving, and SDKs.

Requirements

    • Experience building backend data systems at scale with parallel/distributed compute.
    • Expert in SQL and SQL database design.
    • Experience using data tools and orchestration frameworks such as Kubeflow, Airflow, Spark, Flink, Hadoop, Presto, Hive, or Kafka.
    • Comfortable writing production-quality code in languages such as Python, Go, or Rust.
    • Experience working with public cloud infrastructure providers such as GCP.