Snowflake Senior Software Engineer - Service Resilience

San Mateo, CA
About Snowflake

Snowflake is growing fast and we’re scaling our team to help enable and accelerate our growth. We’re passionate about our people, our customers, our values and our culture! We’re also looking for people with a growth mindset and the pragmatic insight to solve for today while building for the future. And as a Snowflake employee, you will be accountable for supporting and enabling diversity and belonging.

Snowflake started with a clear vision: make modern data warehousing effective, affordable, and accessible to all data users. Because traditional on-premises and cloud solutions struggle with this, Snowflake developed an innovative product with a new built-for-the-cloud architecture that combines the power of data warehousing, the flexibility of big data platforms, and the elasticity of the cloud at a fraction of the cost of traditional solutions.

In addition, Snowflake’s culture was built on the following values that are even more important to us today:

Put Customers First. We only succeed when our customers succeed
Integrity Always. Be open, honest, and respectful
Think Big. Be ambitious and have big goals
Be Excellent. Quality and excellence count in everything we do
Get It Done. Results matter!
Own It
Make Each Other the Best
Embrace Each Other’s Differences

The Snowflake Data Warehouse serves hundreds of millions of customer jobs a day, spread across many deployments, cloud providers, and geographies. Our company is growing fast, and our customer workload is more than doubling every year.

To meet demand, the Snowflake service sits atop a highly dynamic, elastic cloud service platform.

Throughout each day, our deployments automatically adapt to meet demand: adding and removing service capacity, balancing loads, monitoring service health, and transparently handling failure recovery.

These are some of the functions the Service Runtime team members work on each day!

We’re looking for a solid, hands-on Senior Software Engineer to take our Service Resilience to the next level through risk identification, simulation, measurement, and automation across all our services.

You’ll use your past experience to help us push the boundaries of robustness in our engine by extending our code to enable Chaos Engineering techniques, and by developing automated frameworks to make our system more robust at scale - all while running well-considered chaos experiments across the board (Cloud Services, OS, Application Components, and our interdependencies) and measuring their service impact.

If it can break, it will. We need to know before it impacts our customers.

As a Senior Resilience Engineer you will:

    • Enjoy working on, and gaining a deep understanding of, large scale distributed systems
    • Extend our entire backend to enable Chaos Engineering techniques in the system
    • Develop, deploy and manage tools to systematically run chaos experiments and measure impact
    • Observe running systems, and determine/prioritize innovative ways to disrupt them
    • Work closely with others, increasing your technical knowledge
    • Work in our San Mateo office

Our ideal Senior Resilience Engineer will have:

    • Proven experience in Java programming & diagnostics
    • Exposure to at least one major cloud provider (e.g. AWS, Azure, Google Cloud)
    • A history of working on large scale systems
    • A strong desire to learn, and apply, new ways of thinking
    • Solid written and spoken communication skills
    • Strong familiarity with Linux, and its diagnostics
    • Exposure to one or more of Git, Jenkins, Python scripting, JIRA, SQL, EDWs, Linux Networking & Storage

Snowflake is an equal opportunity employer. All qualified applicants will receive consideration for employment without regard to age, color, gender identity or expression, marital status, national origin, disability, protected veteran status, race, religion, pregnancy, sexual orientation, or any other characteristic protected by applicable laws, regulations and ordinances.