ML Engineer/Data Scientist - Simplata Technologies
About Simplata Technologies
Simplata Technologies is a new company formed to protect sensitive information flowing into, across, and through the growing number of cloud applications used by modern businesses. Simplata began in the fall of 2019 as a project within Madrona Venture Labs (www.madronavl.com). It spun out of MVL in March 2020 and raised initial “pre-seed” funding.
Sensitive data is a broad term that covers both consumer data (PII, personally identifiable information) and the critical and confidential information within a company. While consumer data such as email addresses or credit card numbers is obviously sensitive, it’s just as important for a company to protect internal system passwords, API keys, corporate credit cards, or even the file names of confidential financial documents.
Simplata is the combination of “simple” and “data”. The core idea is that traditional data protection products have been too complicated to configure, monitor, and maintain. Customers deserve a simple-to-operate product that makes their business data safer. Everything about Simplata, from the dashboards and reports to the installation of our connectors into cloud applications, has to be consistent with that vision.
The Simplata Team
Simplata was founded by Steve Banfield, CEO, and Bruce Roberts, CTO. Both have extensive experience in building great teams and companies. This position will report directly to the CTO.
Simplata Technologies is seeking a machine learning (ML) engineer/data scientist. This individual will work with our CTO and Product Manager to implement automated analysis and detection of sensitive unstructured data found in the myriad cloud applications in use by modern businesses. The ideal candidate is familiar with deploying machine learning models, developing and presenting validation statistics, testing, natural language processing, binary classification, and outlier detection.
This position will be responsible for designing and implementing an end-to-end machine learning pipeline that can integrate with the company’s core data processing platform. Responsibilities include evaluating and selecting ML/NLP algorithms and technologies, gathering and engineering input data, building tools to manage labeled training data, managing the labeling process, leveraging state-of-the-art (SOTA) pre-trained models as well as training custom models, and tuning and optimizing system performance metrics to meet the needs of the customer.
- As an early member of our ML engineering team, you will be laying the foundations of a critical element of Simplata’s technology stack. A successful candidate will have a passion for solving information extraction problems. You will have a solid educational background in computer science, machine learning, and/or linguistics. You will also have applied modern ML technology solutions in commercial product settings and be comfortable in an early-stage startup environment. You are a clear communicator and can advocate for best practices to a team that is relying on your expertise.
- 5+ years of experience applying ML and/or NLP-specific solutions in a commercial product context.
- Solid understanding of data science (esp. dimensionality reduction and data visualization techniques), statistics, and probability concepts.
- Strong command of the Python programming language.
- Strong command of supervised learning algorithms.
- Solid experience with open-source machine learning and NLP toolkits (scikit-learn, TensorFlow/Keras, PyTorch, spaCy, NLTK, etc.).
- Familiarity with Information Retrieval (IR) concepts.
- Solid understanding of and experience with legacy NLP techniques, including text preprocessing (tokenization, stemming, n-grams, etc.), TF-IDF, POS tagging, bag-of-words models for classification, topic modeling, dependency tree parsing, CFGs, and Named Entity Recognition (NER).
- Familiarity with transfer learning and fine-tuning techniques.
- Familiarity with semi-supervised labeling methodologies.
- Hands-on experience working in and deploying solutions in public cloud environments.
- Experience developing machine learning production pipelines (MLOps).
- A deep belief in the importance of company culture and teamwork. The right candidate will want to be a key contributor to the company’s culture over the long term.
- A proactive approach to problem solving. Especially now, while everyone at Simplata is working remotely, everyone on our engineering team will need to work together and be aligned toward the delivery of our key goals.
- Strong communication skills, combined with the ability to work independently. Standups, sprint demos, and collaboration will happen via chat and video, so demonstrated communication strength is vital.
- Experience with early-stage startup environments.
- Ideally, the candidate will be located in the Seattle area. Exceptional candidates outside the Seattle area will be considered. Preference will be given to candidates located in the US Pacific time zone to make collaboration by phone, Hangouts, Slack and Zoom more efficient. No international candidates will be considered at this time.
- MS in Computer Science, Data Science, or a related field, or an equivalent combination of education and experience, preferred.
Compensation will reflect the nature of the opportunity and, in line with early-stage companies, will include both equity and cash.
Simplata Technologies is an equal opportunity (EO) employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, national origin, disability, or protected veteran status.
Qualified applicants must be authorized to work in the US for any employer without requiring visa sponsorship.