Sr. Data Engineer
Mountain View, CA
Cape Analytics provides instant property intelligence for buildings across the United States. Cape Analytics enables insurers and other property stakeholders to access valuable property attributes at the time of underwriting, with the accuracy and detail that traditionally required an on-site inspection, but with the speed and coverage of property record pre-fill. Founded in 2014, Cape Analytics is backed by leading venture firms and innovative insurers and is composed of computer vision, data science, and risk analysis experts.
We are looking for a Senior Data Engineer to join our growing Platform Engineering team. The ideal candidate has significant experience in building scalable data platforms that enable business intelligence, analytics, data science and data products. They must have strong, hands-on technical expertise in a variety of technologies and the proven ability to fashion robust, scalable solutions. They must be at ease working in an agile environment. The ability to work across teams with product managers, data scientists, engineers, and business stakeholders to translate engineering and business requirements into working code will be critical to success in this role. This person should embody a passion for continuous improvement and data quality.
We’re looking for a strong, thoughtful Data Engineer who can design and implement robust data systems, pipelines, and infrastructure. We have a broad range of systems, from client-facing web apps to our fast, highly scalable Deep Learning pipeline. We leverage AWS, Rails, Python, TensorFlow, Postgres, and much more to develop and deliver our products.
What You’ll Do:
- Integrate data from multiple sources, and design and build cross-platform ETL processes.
- Investigate and solve problems, and implement solutions that ensure data quality and delivery.
- Develop new tools and processes for managing our data workflows and data infrastructure.
- Collaborate with our Platform, Data Science and Machine Learning teams on building, maintaining and monitoring our data infrastructure.
- Collaborate with product managers, data scientists, business users and other engineers to define requirements and design solutions.
Skills and Experience:
- Data ingestion
  - Interest in pulling data from many sources
  - *nix operations, bash scripting, awk, sed
  - RESTful APIs
- ETL and storage
  - At least one NoSQL technology
  - Comfortable choosing technologies that fit the application (e.g. MySQL versus PostgreSQL, Hadoop versus Cassandra)
  - Expertise in reporting, analytics, and databases
- Data manipulation
  - Knowledge, interest, and experience in big data, data mining, and statistical analysis
  - At least one scripting language (Ruby, Python, etc.); experience with data science libraries like scikit-learn is a plus
  - Expertise with SQL
  - Data schema design
  - Experience with Spark, Hadoop, Athena (Presto)
- Deploying algorithms at scale
  - Cloud computing, especially AWS technologies (S3, EC2, etc.)
  - Familiarity with Docker
  - Familiarity with Spark
  - Configuring and tuning on-demand clusters is a major plus!
* Talent is critical, but best when tempered with humility
* Self-motivation leads to the best outcomes
* Open, direct communication is a sign of respect
* Teamwork drives success
* Having fun together is an important part of the job
***Cape Analytics is an E-Verify participant.***