Senior Software Engineer, Spark (Remote - US)

Raleigh, NC /
Engineering & Data Science – Backend Engineering /
Full-Time Employee
The Role 

The Platform team at Lucidworks builds the foundation of our cloud-native microservices architecture orchestrated by Kubernetes. The Platform team owns the design and implementation of our API gateway, security, cloud ops, workflows and job scheduling, Apache Spark integration, messaging framework (Apache Pulsar), and ML model ops / serving infrastructure (Seldon Core / Argo).

To be successful in this role, you should be passionate about solving data analytics problems at scale using SQL, Java, and Scala. Some exposure to Kubernetes and cloud platforms is preferred.

You will:

    • Use the Spark Scala API to build data processing and SQL analytics jobs
    • Build reusable libraries and utilities in Scala to support common tasks
    • Support and refactor an existing codebase containing many diverse Spark jobs
    • Design data-intensive workflows that read/write large data sets from/to cloud storage (GCS, S3)
    • Maintain and improve an existing Spark job execution framework on Kubernetes
    • Maintain and improve the Spark-Solr open source project, including porting to Spark 3
    • Provide example Jupyter notebooks for common analytics tasks

You have:

    • 5+ years of experience with large-scale distributed systems and streaming data platforms (such as Flink, Spark, Storm, or Kafka Streams)
    • 3+ years of experience using messaging platforms like RabbitMQ, ActiveMQ, Kafka, or Apache Pulsar
    • Deep knowledge of the fundamentals of Spark, Kubernetes, Helm, and Docker
    • BS in computer science or a similar field, or equivalent experience
    • Mastery of Git, Gradle, Jenkins, Bash, SQL, Scala, and Java
    • Experience with big data analytics highly preferred

Please note that at this time Lucidworks is unable to sponsor US employment authorization (both new and transfer).

About Lucidworks

Lucidworks is leading digital transformation by fusing the power of search and artificial intelligence to create connected experiences for work, shopping, research and support. 

Fusion is our cloud-native, ML-powered search platform that integrates the open-source projects Spark and Solr with our proprietary code for query intent prediction, low-latency search, hyper-personalization, and smart app creation. Our products include applications that run on the Fusion platform, such as Predictive Merchandiser, which helps ecommerce teams harness the power of ML to improve conversion, and Smart Answers, which enhances chatbots and virtual assistants with natural language processing and deep learning. We believe in building a team to deliver products that make searching for insights a uniquely personal experience for a worldwide community of users.

Our roots are in Apache Solr, the global search standard used by 90 percent of U.S. Fortune 500 companies. Our team includes contributors and committers to Solr as well as some of the world's foremost machine learning innovators. We are trusted by the world's largest brands to deliver personalized digital experiences across many industries, including insurance, banking, capital markets, manufacturing, media, oil & gas, retail, software, and telecommunications. Those customers include companies like Aetna, Morgan Stanley, Reddit, Red Hat, Uber, Verizon, and Wells Fargo. We also serve government agencies in the civilian, defense, and intelligence sectors, including the United States Federal Reserve and the U.S. Census Bureau.

Lucidworks believes in the power of diversity and inclusion to help us do our best work. We are an Equal Opportunity employer and welcome talent across a full range of backgrounds, orientations, origins, and identities in an inclusive and non-discriminatory way. Applicants receive consideration based on the relevant talents, skills, and experiences they offer to our company. Thank you for your interest; we look forward to learning more about you.