Data Scientist (NLP)
Remote - United States
Curate – Engineering
Thank you for your interest in this position with Curate, a wholly-owned subsidiary of FiscalNote. Recruitment for this role is managed by the Recruiting team at FiscalNote, Curate's parent company, so you are in the right place! Please continue to review this opportunity and apply if you are interested. We are excited to review your background. Thank you!
About this Position
Curate, part of FiscalNote, is seeking to hire a full-time Data Scientist. As a Data Scientist, you will be at the core of our advanced analytics, identifying, designing, and implementing machine learning-based solutions to analyze and model our ever-growing database of millions of municipal documents. You will work on a wide range of problems, including data processing, training and evaluating our machine learning models, applying state-of-the-art machine learning solutions, and much more. You’ll lead the way in all stages of researching and developing insightful, actionable models and analyses, looking to the future and becoming an internal and external expert. You will be part of the team responsible for the success of machine learning-powered automation at Curate, contributing actionable insights and building automated systems that deliver our business artificial intelligence (AI) solutions to our customers.
About the Curate Team
Curate helps businesses efficiently monitor risk and find opportunities through municipal data - at scale. We scan thousands of local government websites for meeting minutes and agendas that record decisions affecting our business customers. Based in Madison, WI, the 25-member Curate team works collaboratively to ensure we’re always improving the product for our customers.
You will be part of the team that builds statistical, NLP, and ML-enabled services for intelligent data aggregation, manipulation, augmentation, and generation. As part of FiscalNote, you'll get the opportunity to work at an institution pushing the boundaries of open data transparency, while collaborating with some of the industry’s brightest engineers and data scientists to devise, nurture, and implement cutting-edge solutions to continuously evolving engineering challenges.
What To Expect in this Position:
- Research and prototype ML and NLP models, using the latest AI technologies and frameworks, to improve Curate’s data review process. This includes text extraction, tokenization, NER, POS tagging, and classification methods, as well as enhancing existing models.
- Develop and architect end-to-end pipelines around our models, including establishing data standards, evaluation metrics, and KPIs to track overall performance over time.
- Explore new data sources for integration with current data and new types of analysis, and develop algorithms that find patterns in, and extract insights from, very large datasets.
- Work with the management team to understand Curate's business needs in order to recommend, prototype, and productionize new machine learning-based automated solutions.
What Sets You Apart:
- 2-4 years of industry experience (or equivalent) developing machine learning solutions. Advanced degree in Computer Science/Statistics/Applied Math/Economics/Political Science (or related discipline) preferred.
- Practical experience with, and a strong understanding of, techniques in machine learning (feature engineering, supervised and unsupervised learning, optimization algorithms, deep learning) and natural language processing (TF-IDF, text classification, word embeddings, language models), demonstrated by the ability to dig deep into practical problems and choose the right ML method to solve them.
- Solid Python programming skills, both to prototype quickly and to take projects into production.
- Experience with a modern machine learning framework such as scikit-learn, PyTorch, TensorFlow, or spaCy is highly desirable. Experience acquired in an academic setting is acceptable.
- Experience working with large structured or unstructured data sets and running exploratory data analysis. Experience with MongoDB or Databricks is not required but is a plus.
- Good communication skills, including the ability to present and communicate machine learning requirements and issues to non-technical teams.
We are considering US locations only for this position, and time zone requirements may apply in order to address team and business needs.