Associate Distributed Systems Engineer

Engineering – Distributed/Streaming Services
Full-time Remote
Arcadia is dedicated to happier, healthier days for all. We transform diverse data into a unified fabric for health. Our platform delivers actionable insights for our customers to advance care and research, drive strategic growth, and achieve financial success. For more information, visit

Why This Role Is Important to Arcadia

Arcadia’s business is anchored by its scalable data processing infrastructure and pluggable analytics and data enrichment platform.   We collect and store healthcare data at petabyte scale and have built the tools to enable rapid development and deployment of novel analytics capabilities and content, bringing together a legacy of Arcadia data and modern computation engines.   To make this vision a reality, Arcadia has simplified the technologies and databases used to integrate with these computation engines and provides a state-of-the-art development ecosystem for its engineers.   
In this role, you will help take this platform to the next level by building out cloudevent microservices and Apache Spark applications that more efficiently move, transform, enrich, or analyze Arcadia’s key data assets. You will help Arcadia provide actionable insights to our healthcare partners and thereby improve care for millions of patients.

What Success Looks Like
In 3 months:
- Solid understanding of the health care data we collect 
- Actively developing Cloudevent Microservices or Spark Apps 
- Understanding of Arcadia’s SDLC 
- Positively participating in and contributing to engineering discussions 
- Basic understanding of Semantic/Linked Data and FHIR 
- Understanding of Horizontally Scalable Software 

In 6 months:
- Solid understanding of Arcadia’s data enrichment and analytic applications 
- Comfortable with Arcadia’s unified Kubernetes ecosystem with Spark and Cloudevent Microservices
- A solid understanding of Arcadia’s CI/CD pipeline, including GitHub Actions, ArgoCD, Argo Workflows, Helm, and Kustomize
- Proficient at developing Cloudevent Microservices or Spark Apps 
- Solid understanding of Semantic/Linked Data and FHIR 
- Proficiency in Horizontally Scalable Software 

In 12 months:
- Highly Proficient in Kafka, Java, Cloudevents, or Spark 
- Highly Proficient with Kubernetes, ArgoCD, Argo Workflows, Helm, and Kustomize 
- Highly Proficient in Arcadia’s SDLC (agile) 
- Proficient with Semantic/Linked Data and FHIR 
- Highly Proficient in Horizontally Scalable Software 

Here are some key responsibilities and tasks of a Distributed Systems Engineer:

    • A Distributed Systems Engineer is a specialized software engineer who designs, develops, and maintains complex computer systems that consist of multiple interconnected components or nodes working together to achieve a common goal. These systems are distributed across different machines, servers, or even geographical locations, and they often handle tasks that require high levels of performance, scalability, reliability, and fault tolerance. The role of a Distributed Systems Engineer is crucial in ensuring that these systems operate smoothly and efficiently. 

    • Architecture Design: Engineers in this role are responsible for designing the overall architecture of the distributed system. This involves deciding how different components will interact, designing communication protocols, choosing appropriate data storage solutions, and determining fault-tolerance mechanisms.

    • Network Communication: Distributed systems involve communication between various nodes. Engineers need to design and implement communication protocols that allow these nodes to exchange data efficiently and securely. This might involve using technologies like Remote Procedure Calls (RPC), message queues, or RESTful APIs. 

    • Concurrency and Parallelism: Handling multiple tasks concurrently is a fundamental aspect of distributed systems. Engineers need to ensure that tasks are efficiently distributed across the system's resources to make the most of available computing power. 

    • Data Storage and Consistency: In many distributed systems, data is distributed across multiple nodes. Engineers need to design strategies for data storage, replication, and synchronization that balance consistency, availability, and partition tolerance. These are the three properties of the CAP theorem, which states that a distributed system cannot fully guarantee all three at once.

    • Scalability: Distributed systems often need to handle varying workloads. Engineers must design systems that can scale horizontally (adding more machines) or vertically (increasing resources of existing machines) to accommodate increased load. 

    • Fault Tolerance and Reliability: Distributed systems are prone to various failures, such as hardware crashes, network issues, and software bugs. Engineers need to implement strategies for detecting failures, recovering from them, and maintaining the system's functionality even when components fail. 

    • Performance Optimization: Engineers must continually analyze the system's performance and identify bottlenecks. They may optimize algorithms, data structures, and communication pathways to ensure optimal performance under various conditions. 

    • Security and Privacy: Ensuring the security and privacy of data transmitted and stored in a distributed system is crucial. Engineers need to implement encryption, authentication, and authorization mechanisms to protect sensitive information. 

    • Monitoring and Debugging: Distributed systems can be complex, and identifying the source of issues can be challenging. Engineers need to set up comprehensive monitoring tools to track system health and diagnose problems quickly. 

    • Testing and Deployment: Engineers need to devise testing strategies that cover various scenarios and failure conditions. They also need to plan deployment strategies that minimize downtime and disruption to the system. 

    • Adopting Emerging Technologies: Distributed systems engineering involves staying up-to-date with the latest technologies and trends in the field, such as containerization (e.g., Docker), orchestration (e.g., Kubernetes), and cloud computing platforms. 

    • In essence, a Distributed Systems Engineer works on the intricate challenges associated with building and maintaining large-scale, interconnected systems that exhibit high performance, availability, and fault tolerance. They bridge the gap between theory and practice, applying principles from computer science to real-world systems that power a wide range of applications, from online services to financial systems to scientific research. 

What You'll Be Doing

    • In this position, you will work with a talented engineering group to design and construct a horizontally scalable big data system that ingests and analyzes over a billion records nightly. Leveraging cloud-native technologies like Kubernetes, Kafka, Spark, Cassandra (Scylla), Cloudevent Microservices, and NiFi, you will implement the next generation of Arcadia’s flagship product on the AWS cloud platform.
    • You will: 
    • Participate in an Agile software development life cycle to produce timely and quality software  
    • Participate in full-stack design and development of software components for our flagship product 
    • Add functionality to our Analytics Platform by building cloudevent microservices for real-time data enrichment and Apache Spark applications for batch processing and analytics
    • Collaborate with engineers, QA analysts, product design specialists, and subject matter experts to help build and deliver quality solutions  
    • Debug and respond to critical production software and/or data issues impacting customer workflows 
    • Mentor new employees and coworkers  
    • Document software designs and development tasks in support of the SDLC  
    • Review teammates' code as part of a standard development process
    • Develop and maintain unit tests for code developed within the engineering group  
    • Leverage version control repositories to maintain development products  
    • Evaluate and adopt Software Development tools to establish and maintain efficient local development environments  
    • Leverage build tools like Gradle or GitHub Actions to define and build deployment artifacts

What You'll Bring

    • The ideal candidate would be passionate about solving complex and unique big data problems and would be eager to learn about distributed or parallel computing, the processing of large data sets, and how to take software from inception to delivery.  You should have a degree in Computer Science or a related field and be ready to roll up your sleeves and start coding.

Would Love For You To Have

    • Coursework or 1-2 years of experience demonstrating the following:
    • Proficiency with Spark 
    • Proficiency with Java 
    • Proficiency with Kubernetes 
    • Knowledge of Semantic or Linked Data 
    • Understanding of Scrum/Agile processes 

What You'll Get

    • At Arcadia, you will have the opportunity to work for an awesome, growing software company, to work with a highly scalable cloud platform, and to help develop a highly disruptive platform that is going to change healthcare analytics.
    • Competitive compensation and amazing benefits, including Flexible Time Off (~22-day company average)

About Arcadia

Arcadia helps innovative providers and payers across the country transform healthcare to reduce cost while improving patient health. We do this by aggregating large amounts of disparate data, applying algorithms to identify opportunities to provide better patient care, and making those opportunities actionable by physicians at the point of care in near-real time. We are passionate about helping our customers drive meaningful outcomes. We are growing fast, have emerged as a market leader in the highly competitive population health management software market, and have been recognized by industry analysts KLAS, IDC, Forrester, and Chilmark for our leadership. For a better sense of our brand and products, please explore our website.

This position is responsible for following all security policies and procedures in order to protect all PHI under Arcadia's custodianship as well as Arcadia's intellectual property. For any security-specific roles, the responsibilities will be further defined by the hiring manager.