Engineering Manager, Detection

San Francisco, CA or New York, NY
Engineering / Hybrid
Anthropic is developing AI assistants that are helpful, harmless, and honest. As usage of our AI services grows, we need to ensure they are not misused. We're looking for an experienced engineering leader to build out a team focused on detecting abuse, fraud, and harmful content.

About Anthropic
Anthropic is an AI safety and research company that’s working to build reliable, interpretable, and steerable AI systems. We want AI to be safe and beneficial for our customers and for society as a whole. Our interdisciplinary team has experience across ML, physics, policy, business and product.


In this role, you will:

    • Lead a team of engineers building systems to detect and prevent harmful and abusive use of Anthropic's AI services
    • Implement systems to detect fraudulent accounts, spam campaigns, harmful user-generated content, and other malicious usage
    • Analyze usage patterns and develop protections against new methods of attack and evasion
    • Work closely with data scientists to develop algorithms and signals for detecting threats
    • Build self-service tools for customers to monitor and control access to AI services
    • Design our process for responding to detected signals, including communicating threats and remedies across the organization
    • Coach and mentor team members in their career growth

You may be a good fit if you:

    • Have 5+ years in an engineering management role, leading teams building integrity, trust and safety, or anti-fraud/abuse systems
    • Have deep experience with techniques for bot detection, account fraud, misinformation, and/or harmful user-generated content
    • Have the ability to balance speed and precision when responding to attacks and evaluating risk
    • Have excellent communication skills to explain threats and tradeoffs to stakeholders
    • Have people management skills in coaching, recruiting, and developing engineers
    • Have experience designing operational processes around on-call, post-mortems, etc.

Strong candidates may also:

    • Have a background in building systems at scale with a focus on reliability and performance
    • Have experience with AI/ML and an understanding of how models can be manipulated
    • Have knowledge of common internet communities and of adversaries such as spammers and fraud rings, including their evolving techniques
    • Use technical depth to assess and improve system designs
    • Have project management skills to balance priority tradeoffs

Annual Salary:
The expected salary range for this position is $300k - $500k.