Senior Data Scientist

Mountain View, CA/Remote
Engineering
Cape Analytics provides instant property intelligence for buildings across the United States. Cape Analytics enables insurers and other property stakeholders to access valuable property attributes at the time of underwriting, with the accuracy and detail that traditionally required an on-site inspection, but with the speed and coverage of property record pre-fill. Founded in 2014, Cape Analytics is backed by leading venture firms and innovative insurers and is composed of computer vision, data science, and risk analysis experts.

As a Senior Data Scientist on CAPE’s Data Science team, you’ll collaborate with Data Scientists, Computer Vision/Machine Learning Engineers, Data Engineers, and members across Software Engineering, Product, and Sales teams to build robust, scalable machine learning models for identification and annotation of the built world. Additionally, you will develop expertise in ground truth generation, model performance analysis, iterative model development, and unsupervised mapping of the feature space to bring scientific rigor, scalability, and robust performance to our core product offerings.

As a senior member of the team, you will also oversee the work of other data scientists and work with Product Managers to plan the team's roadmap.

CAPE’s insurance solutions have been adopted by leading carriers across the U.S., Canada, and Australia...but we are just getting started. Over the past 6 years, we’ve constructed an analytics platform purpose-built for deep learning. On the heels of our recent $44 million Series C financing, we’re growing rapidly. In CAPE’s next phase, we’re setting out to solve a larger share of the problem, leveraging a radically expanded array of input data sources and advanced machine learning technologies.

CAPE leverages all available tools and technologies to build our best-in-class tech stack, which affords us the flexibility of fast deployments along with the stability to support aggressive SLAs for critical-path client APIs and applications. We build our models using PyTorch and TensorFlow, and leverage Python, Spark, and Postgres across our AWS-deployed cloud infrastructure.


    • Develop scientifically rigorous, creative methodologies to continuously improve our machine learning models
    • Incorporate machine learning and data-driven decisioning into the core of our infrastructure
    • Explore and mine new data sources that will help optimize and validate our models
    • Link model capabilities to market needs by customizing models, designing and running validation studies


    • Start to assist in Sprint planning and Quarterly planning with the team
    • Contribute to design and automation of model training, model post-processing and evaluation pipelines at scale
    • Leverage the extensive data generated by Cape in addition to data from external sources to generate structured knowledge about our feature space
    • Implement automated solutions for ensuring data quality and delivery
    • Contribute to peer mentorship, knowledge bases, and skills transfer


    • Be primarily responsible for roadmap planning with Product team along with Sprint planning and Quarterly planning
    • Present your results internally and externally
    • Defend your methodology and incorporate feedback from internal teams as well as customers
    • Improve model performance by identifying failure modes using supervised and unsupervised learning techniques
    • Ideate and implement data-driven methodologies to help scale model performance across geographical, climatic, and temporal dimensions


    • PhD in a STEM field with 3 years of hands-on industry experience, or Master's in a STEM field with 5 years of hands-on industry experience
    • A background in the Finance or Real Estate sector is strongly preferred. This includes familiarity with Real Estate data such as MLS and other public record data, Mortgage Loans, Automated Valuation Models, Asset Valuations, Cash Flow Analysis, Risk Analysis, etc.
    • Solid knowledge of statistical techniques, including hypothesis testing, statistical sampling, significance testing, statistical inference, maximum likelihood estimation, and experimental design, among others
    • Mastery of supervised and unsupervised algorithms and their implementations, and of machine learning concepts including regularization, learning curves, hyperparameter optimization, and cross-validation, among others
    • Advanced knowledge of and significant programming experience in Python or another scripting language, including relevant libraries like NumPy, pandas, SciPy, and Matplotlib
    • Familiarity with the Linux environment, including shell scripting, Git, and tools for reproducibility (e.g. virtual environments, Docker)
    • Demonstrated expertise in building data tools for ETL and data analysis
    • Experience in building meaningful data visualizations using at least one scripting-based visualization tool such as Matplotlib, D3.js, or Bokeh
    • Nice to haves: Experience designing data schemas and extracting data from SQL and NoSQL databases. Experience with GIS systems. Experience with modern data technologies, e.g. Spark, PyTorch, Jupyter Notebook, Docker. Experience with cloud computing on AWS or GCP

You will work with some of the smartest data scientists in the industry. They are passionate about the work they do and have collectively built the industry’s leading AI/Analytics product. Success only comes with great team culture, camaraderie, open communication, and hard work. These are the qualities that you will experience and enjoy at Cape.

We believe:

*Talent is critical, but best when tempered with humility
*Self-motivation leads to the best outcomes
*Open, direct communication is a sign of respect
*Teamwork drives success
*Having fun together is an important part of the job

***Cape Analytics is an E-Verify participant.***