Software Engineer - Data Retrieval Team

Madrid, Spain / Remote
Engineering
Full-time
Location: Madrid / Fully remote
Seniority: We are open to all levels of seniority, and adjust compensation accordingly (read more about how we set wages here)
Compensation: 49,000-53,000 Euros + 0.095% stock options for senior engineers

At source{d} we are building the technology stack for the next generation of Machine Learning powered developer tools. We are an open-core company built around our Open Source projects. We have raised over eight million USD so far, and we are currently growing our team.

Engineering at source{d}

Engineering consists of five different teams that represent the architecture of our product:

• Applications (Scala, Go, Python, and Frontend tools): Builds CLI/Web applications combining ML research with our stack.
• Machine Learning (C++ and Python): Performs R&D for Machine Learning on Source Code.
• Data Retrieval (Scala and Go): Builds the technology that finds, fetches, stores, and analyzes over 60M Git repositories.
• Language Analysis (Go and 15+ other languages): Focused on Babelfish, the universal code parsing server.
• Infrastructure (Go and Python): Manages a cluster of on-prem bare-metal servers with Kubernetes and CoreOS, and GCP for user-facing applications.

We care about Open Source. Everything we develop is available for anyone to read, modify, and contribute to (under the Apache 2.0 or GPL3 license). Some examples of our projects are:

• bblfsh/bblfshd: Babelfish server, turning code into Universal Abstract Syntax Trees (UASTs).
• src-d/engine: a library for running scalable data retrieval pipelines that process any number of Git repositories for source code analysis.
• src-d/go-git: a highly extensible Git implementation in pure Go.
• src-d/ml: a library to build and apply Machine Learning models on top of Universal Abstract Syntax Trees.

If you are interested in understanding how we do code reviews, please take a look at the PRs on any of these projects. You can also learn more about our engineering methodology here.

Role

The Data Retrieval team develops source{d}'s high-level code analysis applications for running scalable data retrieval pipelines that process and manipulate any number of code repositories. Written mostly in Go, these applications aim to be robust, friendly, and flexible, and to run on large-scale distributed clusters over petabytes of data.

• We at source{d} seek to be at the heart of any project related to source code. This core tool will therefore be used both in-house, to build source{d}'s unique global-scale open dataset of 60M+ code repositories for cutting-edge Machine Learning research, and externally, to empower a wide community of developers, researchers, and companies worldwide doing vanguard research or building the next generation of developer tools and experiences.

• Good knowledge of distributed computing and parallel processing is important.

• You will be expected to have strong backend coding skills in at least two languages and very good algorithmic skills. Scala coding skills and knowledge of Apache Spark aren't required but will be highly appreciated. Go is not a strict requirement either: we strongly believe that it can be learned easily by any skilled developer, and we care a lot more about our team's mindset and prior experience than about any specific skills.

Culture

• source{d} is a company for developers by developers. We firmly believe in always doing what's best for the individual developer in the community. Our team consists of members who are passionate about programming. To understand our culture better, read more about it here.

• At the moment, we are 25+ people from 10 different countries working as a distributed organization. Some of our team members are based in the Madrid or San Francisco office, others work remotely from around the world (Portugal, Estonia, Russia, and others).

• For those wanting to work from one of our offices, we fully support the visa and moving process for you and your family.

• At source{d}, we have a transparent salary policy, which we feel strongly about. Your seniority level will be determined during the last round of on-site interviews.

• At source{d} all of the projects we work on are public on GitHub and the vast majority are open-source under licenses such as Apache 2.0 or GPL3.

• We don't just believe in open-source; we also believe in radical transparency as an organization. Therefore, we publish everything about the company at github.com/src-d/guide.

Perks

• We go to conferences and other developer events!
• Open Source Days: every second Monday, you are encouraged to work on any OSS project you choose.
• Flexible hours: set a schedule that fits you.
• Free books. We will buy any books that help you learn & grow.
• If you choose to work from one of our offices, you will enjoy a comfortable and spacious environment.
• Annual summer and winter (Christmas) parties and a hackathon retreat are held in Madrid, and all team members are flown over for them.
• We also have our own Open Source craft beers.

Other

• We offer visa and relocation support for those wanting to work in the Madrid or San Francisco office.
• The local timezone of developers who want to work remotely should be between San Francisco and Moscow.