Snorkel AI is looking for a staff distributed systems architect and backend engineer. The company’s flagship product is cloud-based enterprise software used by data scientists and ML engineers. Large enterprises use Snorkel products to solve their most impactful problems in today’s data-centric AI world.
You will be part of the backend team building a scalable, reliable distributed system that helps users meet their most pressing needs in a data-centric AI world. Team members come from a variety of technical backgrounds, from machine learning PhDs to full-stack engineers building large-scale production systems. You will become one of these pragmatic, high-output, product-focused engineers.
Main Responsibilities
- Prototype, optimize, and maintain scalable back-end services that will power new ML development workflows
- Design extensible and testable interfaces between internal services, including the underlying storage and data models
- Own the architecture, design, development, and operations of large-scale systems designed for AI/ML tasks, including data management systems, data engineering workflow systems, and distributed compute systems, and connect them to the front-end components
- Work with customers to understand their product use cases, desired capabilities, and scale requirements, and translate them into engineering specifications and code
- Be an engaged team player in a customer-focused cross-functional environment where you will feel excited to take on whatever is most impactful for the company and product
- Work a hybrid schedule: one or two days per week in our Redwood City HQ and the rest remote, with "No Meeting" Tuesdays and Thursdays
Required Qualifications
- Bachelor's degree in Computer Science or related field
- 4+ years of experience delivering distributed systems and services in a production setting for cloud-native applications
- Ability to design and build efficient, scalable data storage and retrieval systems for AI/ML tasks
- Strong communication and coding skills with emphasis on designing for scale and robustness
- Proactive, positive attitude toward leading, learning, troubleshooting, and taking ownership of both multi-quarter feature development and immediate debugging to unblock customers
Preferred Qualifications
- 8+ years of professional software engineering experience
- Experience architecting and developing production web-scale systems (monitoring, telemetry, performance, reliability, triage, and debugging)
- Strong development and debugging skills in Python
- Experience developing enterprise software products for machine learning and/or data science applications
- Experience with distributed compute frameworks and/or deep learning frameworks
- Experience building and maintaining large-scale, distributed, high-performance data pipelines
The salary range for our Tier 1 locations of San Francisco, Seattle, Los Angeles & New York is $191,000.00 - $225,000.00.