Senior Software Engineer – Data Pipeline
Raleigh/Durham | Remote
Engineering
The data pipeline team builds a reliable, scalable, high-performance data ingestion pipeline to support JupiterOne's graph-based cyber asset data platform. The team owns the pipeline architecture and implementation, including APIs used by a growing set of internal and external systems. We maintain an intimate working knowledge of databases including AWS Neptune, DynamoDB, Elasticsearch, Redis, and PostgreSQL, and work closely with the integrations and query processing teams to deliver the data customers need across a number of applications.
We're looking for someone with significant experience building production software, with a strong focus on data engineering on the JVM (Kotlin is our chosen language). The ideal candidate has worked extensively with large volumes of data using a variety of tools and programming languages and understands the tradeoffs between them. You must be sensitive to the impact design and implementation choices have on users and machines alike, whether in the tools and libraries the team provides to other engineers or in the performance impact a seemingly small change can introduce. We need a commitment to operable software systems: experience instrumenting, monitoring, and responding to incidents. A cool head when resolving issues under pressure is essential.
JupiterOne is a fast-growing cybersecurity company trusted by companies like HashiCorp, Databricks, Marqeta, Divvy, Auth0, and more to secure their digital operations and infrastructure. In less than one year, JupiterOne has earned the trust of Fortune 100 customers and raised more than $49M in funding from investors and advisors including Bain Capital Ventures, Sapphire Ventures, Cisco Investments, Splunk Ventures, and more.
Our software helps companies create a contextual knowledge base using graphs and relationships as the single source of truth for an organization's cyber asset operations. We help security, IT, and cloud teams answer the questions that matter.
What you will do:
- Maintain and advance the JupiterOne data ingestion pipeline
- Develop logging, tracing, and metrics capabilities to monitor pipeline systems
- Optimize the pipeline for scale and performance across a number of data stores
- Develop post-query processing pipelines that provide other teams, including data science and data ingestion, with actionable insights
- Leverage AWS services effectively and efficiently to build new solutions
- Articulate architecture and infrastructure improvements through well-written, well-researched proposals for peer review
- Effectively manage and communicate upcoming changes to internal and external users of the pipeline and its APIs
- Participate in the team's on-call rotation for incident response
- Write concise and meaningful unit and integration tests
- Deploy everything using Terraform
Who you are:
- 5+ years coding production software systems using multiple programming languages
- 3+ years of full-time experience building database access and performance solutions
- Experience with multiple data store technologies (relational databases, NoSQL databases, graph databases, full text search databases, distributed caches, etc.)
- Understand the value of well-modeled data, whether structured or schemaless
- Understand the value of well-structured software, both functional and object-oriented
- Experience with cloud-native architectures and infrastructure as code
- Experience with automated testing at all layers of a system
- Experience with building operable software and leveraging telemetry to support and improve complex distributed systems
- Willing to be on-call to support the software you design, build, and maintain
- Interested in and comfortable with working on critical path components of large production systems
- Empathetic to users of your code and services, internal and external
- Desire to join a fast-paced startup and team!
Stories you could tell...
- Joined a new engineering team and carefully considered how to integrate, add value, and eventually level up yourself and the team.
- Assigned a set of problems and worked independently to manage the work and provide visibility to stakeholders.
- Used telemetry to understand the behavior of a system new to you and identified opportunities to incrementally improve performance and operability.
- Pioneered a new architecture based on real-world usage of an existing solution and worked to bring along the business and other engineers.
- Received an alert in the middle of the night, figured out how to get a critical system back online, and provided a meaningful, actionable post-mortem.
Technologies we use:
- Ephemeral infrastructure (Docker, AWS Lambda, ECS, Fargate)
- Distributed systems (Kinesis, SQS)
- Modern databases (Neptune, Neo4j, DynamoDB, Elasticsearch, Redis, PostgreSQL)
- Continuous integration and deployment (Docker, Jenkins, Terraform)