NLP/ML Engineer - Simplata Technologies
Simplata Technologies is a new company formed to protect sensitive information flowing into, across and through the growing set of cloud applications used by modern businesses. Simplata began in the fall of 2019 as a project within Madrona Venture Labs (www.madronavl.com). It spun out of MVL in March 2020 and raised initial “pre-seed” funding.
Sensitive data is a broad term that covers both consumer data (PII, personally identifiable information) and the critical and confidential information within a company. While consumer data like an email address or credit card number is obviously sensitive, it’s just as important for a company to protect internal system passwords, API keys, corporate credit cards, or even the file names of confidential financial documents.
Simplata is the combination of “simple” and “data”. The core idea is that traditional data protection products have been too complicated to configure, to monitor and to maintain. Customers deserve a simple-to-operate product that makes their business data safer. Everything about Simplata, from the dashboards and reports to the installation of our connectors into cloud applications, has to be consistent with that vision.
Simplata was founded by Steve Banfield, CEO, and Bruce Roberts, CTO. Both have extensive experience in building great teams and companies. This position will report directly to the CTO.
Simplata Technologies is seeking an experienced natural language processing (NLP) and machine learning (ML) engineer to work with our CTO and Head of Product Management to implement automated analysis and detection of sensitive unstructured data found in the myriad of cloud applications in use by modern businesses. This position will be responsible for designing and implementing an end-to-end machine learning pipeline that can integrate with the company’s data processing platform. Responsibilities include evaluating and selecting ML/NLP algorithms and technologies, gathering and engineering input data, building tools to manage labeled training data, managing the labeling process, leveraging state-of-the-art (SOTA) pre-trained models as well as training custom models, and tuning and optimizing system performance metrics to meet the needs of the customer.
As the first NLP/ML engineer on the team, you will be laying the foundations of a critical element of Simplata’s technology stack. A successful candidate will have a passion for solving information extraction problems with NLP technologies. Your background will include a solid machine learning educational foundation consisting of applied math, statistics and probability theory. You will also have applied modern NLP and ML technology solutions in multiple professional and commercial product settings.
- 5+ years of experience applying ML and NLP solutions in a commercial product context.
- Solid education and understanding of data science (esp. data dimensionality reduction and visualization techniques), statistics and probability concepts.
- Strong command of the Python programming language.
- Strong command of supervised and unsupervised learning algorithms.
- Solid experience with open source machine learning and NLP toolkits (scikit-learn, TensorFlow/Keras, PyTorch, spaCy, NLTK, etc.).
- Familiarity with Information Retrieval (IR) concepts.
- Solid understanding of and experience with legacy NLP techniques, including text preprocessing (tokenization, stemming, n-grams, etc.), TF-IDF, POS tagging, bag-of-words models for classification, topic modeling, dependency tree parsing, CFGs, Named Entity Recognition (NER), etc.
- Solid understanding of the latest advancements in neural network NLP technologies, including word embeddings (word2vec, GloVe, etc.), RNN/LSTM models, autoencoder and transformer models.
- Experience with the latest generation of contextualized language models (ULMFiT, ELMo, BERT, GPT-2, etc.) and their application to character/word/sentence embeddings, classification, etc.
- Familiarity with transfer learning and fine-tuning techniques.
- Familiarity with semi-supervised labeling methodologies.
- Hands-on experience working in and deploying solutions in public cloud environments.
- Experience developing machine learning production pipelines (MLOps).
- A deep belief in the importance of company culture and teamwork. The right candidate will want to be a key contributor to the company’s culture over the long term.
- A proactive approach to problem-solving. Especially at this time, when everyone at Simplata is working remotely, everyone on our engineering team will need to work together and be aligned toward the delivery of our key goals.
- Strong communication skills combined with the ability to work remotely throughout the current pandemic environment. Standups, sprint demos, and collaboration will happen via chat and video, so demonstrated communication strength is vital.
- Experience with early-stage startup environments.
- Ideally, the candidate will be located in the Seattle area. However, with the need for remote work caused by the coronavirus pandemic, exceptional candidates outside the Seattle area will be considered. Preference will be given to candidates located in the US Pacific time zone to make collaboration by phone, Hangouts, Slack and Zoom more efficient. No international candidates will be considered at this time.
- MS in Computer Science, Data Science or a related field preferred.
Compensation will reflect the nature of the opportunity and, in line with other early-stage companies, will include both equity and cash.
Simplata Technologies is an equal opportunity employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, national origin, disability, or protected veteran status.
Qualified applicants must be authorized to work in the US for any employer without requiring visa sponsorship.