Data Engineering Intern

Redwood City, CA
At Wealthfront we believe everyone's personal finances can be optimized and automated for a very low fee using high-end technology. To achieve that, we built a software-only approach, which also helped create a new category: robo-advisors. Over the past six years we've paired the expertise of our PhD-clad research team with the exceptional talents of our engineering, product and design teams to deliver sophisticated products and services to our clients that are easy and fun to use. And that plan has worked! We have loyal clients from every state who trust us with over $9 billion in assets... and we're just getting started. So if you're passionate about helping people secure their ambitions while helping to change an industry, keep reading.

We recently launched a new user experience that lays the foundation for a future where Wealthfront is the only financial advisor our clients will ever need. To accomplish this, we’ve redesigned and rebuilt our data platform, which combines offline and online computation to serve personalized advice and will ultimately be the center of our clients’ financial lives.

We’re looking for engineers who are excited to focus on building the data infrastructure to scale compute for our business. This includes making tradeoffs between online, offline, and streaming architectures, as well as knowing the product well enough to understand the impact these decisions will have on clients.

What you'll work on

    • Spark: As data engineers, we want to move fast, and we want our code to move fast as well. We’ve recently transitioned from Hadoop to Spark, and we are continuing to increase the performance of our data pipelines while simultaneously increasing the complexity of the jobs that run on top of them. We use Spark in both streaming and batch mode. You'll be expected to improve the infrastructure while you implement on top of it.
    • Machine learning: We use statistics to solve hard problems. Whether we’re running regression to better understand our business or clustering as part of a client-facing data pipeline, statistical modeling is key to our business.
    • Data quality: A model is only good if it is correct and built on recent data. We put a strong emphasis on data quality. We write unit tests to test the functional correctness of each module and meta-tests to guard against common programming errors. Throughout our data pipelines we run automated sanity checks on live data, alerting if any data is stale or values fall outside of expected ranges.

Qualifications


    • Strong experience with Java, Hadoop, or a similar language or framework
    • Experience with online data stores (preferably MySQL); experience with offline data stores (preferably Hadoop stack) is a plus
    • Experience with at least one scripting language
    • Passion for agile, test-driven development, continuous integration, and automated testing
    • Solid understanding of distributed systems and functional programming paradigms
    • Pursuing a BS, MS, or PhD in computer science, math, physics, or related field
About Wealthfront

For more information, please visit our website. We are an equal opportunity employer and value diversity at our company. We do not discriminate on the basis of race, religion, color, national origin, gender, sexual orientation, age, marital status, veteran status, or disability status.