Machine Learning Engineer

San Francisco, CA / Engineering Team / Machine Learning Engineering

Deep Discovery is hiring Machine Learning (ML) Engineers with experience in Natural Language Processing (NLP), Natural Language Understanding (NLU) or Graph Neural Networks (GNNs), or in applying ML to knowledge graphs, information extraction, entity (ER) and identity (IR) resolution, geospatial processing, time series analysis or risk scoring. You will help us build a 1.5-billion-node business graph to better understand the world’s legitimate and illegitimate business.

We will use this graph to build better Know Your Customer (KYC) risk scores for financial institutions. Banks use KYC systems for Anti-Money Laundering (AML) to evaluate the risk of doing business with their clients, so they don’t face stiff fines from regulatory agencies. Unlike current approaches to AML and KYC, we look at relationships between companies on the open web and extract structured information from news and other text. From these we build the networks that drive our models, which generate risk scores for banks’ customers that the banks use when conducting background checks.

We are taking a network-centric approach to KYC that evaluates clients in terms of the context in which they do business. This involves several machine learning tasks: extracting knowledge graphs from news and other text; entity and identity resolution of the networks we collect about the economy; representation learning on the resulting graphs and their associated documents; and building a scoring engine that uses our business graph to create an accurate risk score. Users do not believe predictions without explanations, and the cost of errors is high, so the final machine learning component, explainability, is the most critical: the system must be explainable in terms of the graphs from which we draw conclusions, and we use a graph database and network visualizations to explain our risk scores.

Like our team, our business plan is unconventional: raise venture capital to build a team, to build a product that banks buy to risk-score their customers in terms of the business networks they participate in, to build a sustainable business that gives investigative journalists free access to this tool, to help them investigate and nail crooked politicians to the wall. This isn’t marketing or some bullshit story about “changing the world” through better ad targeting or a better word processor. This is our entire mission, and it coincides nicely with an initial public offering for stock in our company.

We’re looking for self-motivated Machine Learning Engineers and Researchers who are passionate about networks, who can learn the domain quickly, and who can mine the literature and customize algorithms and systems to come up with novel solutions to the problems we face in delivering a product. While being published is good, the most important thing we want in a candidate is a track record of shipping products to real customers. We have data engineers but expect you to be fairly self-supporting in carrying out your work, so generalist skills are important. Candidates without advanced degrees are welcome; experience is education.

The ideal candidate will have

    • Early-stage startup experience
    • A track record of shipping data-driven products to market
    • Solid Python 3 skills, including object-oriented analysis and design
    • Working knowledge of neural network architectures
    • A track record of implementing self-supervised learning
    • A working knowledge of semi-, weak and distant supervision: you make your own labels
    • Published papers in CS or other quantitative fields
    • Understanding of traditional Social Network Analysis and other graph algorithms

Plus at least one of the following

    • Track record of using machine learning for entity (ER) and identity (IR) resolution
    • Advanced experience with Natural Language Processing (NLP) and Natural Language Understanding (NLU)
    • Working knowledge of Graph Neural Networks (GNNs) including Graph Transformer Networks (GTNs)
    • Track record of using machine learning to solve geospatial problems