stability.ai Stability AI

Stability AI is a community- and mission-driven, open-source artificial intelligence company that cares deeply about real-world implications and applications. Our most considerable advances grow from our diversity in working across multiple teams and disciplines. We are unafraid to go against established norms and explore creativity. We are motivated to generate breakthrough ideas and convert them into tangible solutions. Our vibrant communities consist of experts, leaders and partners across the globe who are developing cutting-edge open AI models for Image, Language, Audio, Video, 3D and Biology.

Audio
remote: USA
added Thu Jun 22, 2023

About the role:

We are looking for Machine Learning Engineers to work on our audio team who are passionate about generative models and creative applications of AI. In particular, we are looking for people who have experience developing model-serving pipelines that operate at scale and who know state-of-the-art techniques for optimisation and feature development. We want highly creative ML engineers who are motivated to push the boundaries of generative audio models. You will have access to state-of-the-art high-performance computing resources, and you will work alongside top researchers and engineers to truly make an impact in the fast-growing world of generative AI.

Responsibilities:

  • Lead efforts to drive the design, development and production of customer-facing ML music, speech and audio generation systems, with specific reference to inference and API environments
  • Work with the Audio, Platform and Inference teams on building pipelines for the next generation of models, where you may assist with areas such as optimization, model tuning and deployment, HPC clusters, and tooling
  • Be a strategic thought partner for leaders across the organization on driving business impact through machine learning
  • Work on the commercial side - productionizing generative models, and building the infrastructure to serve them at scale
  • Produce events and metrics in our data warehouse so that we can analyze critical business metrics like cost, performance, reliability, etc.
  • Be part of the team that brings new Stability audio models and pipelines into existence for API customers
  • Prototype and productionize inference platform improvements and new features

Qualifications:

  • 5+ years working on machine learning projects, including inference and pipeline development
  • Solid knowledge of Python scientific stack, PyTorch and at least one high-performance inference framework (e.g. TensorRT)
  • Experience profiling and optimizing deep neural networks, including knowledge of GPU profiling tools such as NVIDIA Nsight
  • Experience with Python audio processing libraries such as librosa, torchaudio, or similar
  • Experience with cloud orchestration systems such as Kubernetes and cloud providers such as AWS, GCP, and Azure
  • Ability to rapidly prototype solutions and iterate on them with tight product deadlines
  • Experience with training and/or deploying ML models with Amazon AWS (Sagemaker a plus) or Google Cloud
  • Strong communication, collaboration, and documentation skills
  • Experience with Linux and command line tools
  • Evidence of interest in music / audio projects is valued

Equal Employment Opportunity:

We are an equal opportunity employer and do not discriminate on the basis of race, religion, national origin, gender, sexual orientation, age, veteran status, disability or other legally protected statuses.

Audio
remote
added Fri Jun 23, 2023

About the Role:

We are looking for ML Research Engineers who are passionate about generative models for music creation. In particular, we are looking for people who can explore new ideas and architectures for music generation models: highly creative people who straddle research and engineering and who are motivated to push the boundaries of generative music research, not just in state-of-the-art performance, but also in balancing performance and resource usage. You will have access to state-of-the-art, high-performance computing resources, and you will work alongside top researchers and engineers to truly make an impact in the fast-growing world of generative AI.

Responsibilities:

  • Work with the rest of the research team and the open-source community on developing the next generation of generative audio models
  • Prototype and productionize model architecture improvements and new features
  • Maintain and innovate on open-source code repositories for generative AI audio models, including custom model code, training code, and fine-tuning code
  • Work with Product, Engineering and Commercial teams on model deployment and customized training
  • Create interactive demos and interfaces for generative models, demonstrating simple use cases in an intuitive and fun way
  • Optimize model architectures and inference code for performance on consumer devices
  • Publish results at top conferences, in journals, and in blog posts
  • Keep up to date with the latest research advancements in the field and work them into open-source repos, reimplementing as needed to ensure an open license

Qualifications:

  • 3+ years working on machine learning projects, including training, fine-tuning and refining models
  • Publication of papers, projects, and blog posts that had a high impact in generative AI
  • Experience maintaining high-quality, well-documented open-source code repositories for AI models
  • Experience with music generation models, preferably working in the time domain (Jukebox, SampleRNN, RAVE, etc.)
  • Ability to iterate quickly on public code-bases with attention to backwards compatibility, usability, and readability
  • Experience with Python scientific stack, PyTorch, and creating Jupyter/Colab notebooks
  • Ability to communicate machine learning concepts and results effectively through writing and visualization
  • Experience training and/or deploying ML models with Amazon AWS (Sagemaker a plus) or Google Cloud
  • Experience with data engineering, including cleaning and maintaining large heterogeneous datasets
  • Experience building interactive web demos that serve generative ML models
  • Experience with the open-source ML ecosystem (HuggingFace, W&B, etc.)
  • Experience with Linux and command line tools
  • Familiarity with digital signal processing and audio engineering concepts
  • Experience with Python audio processing libraries such as librosa, torchaudio, or similar

Equal Employment Opportunity:

We are an equal opportunity employer and do not discriminate on the basis of race, religion, national origin, gender, sexual orientation, age, veteran status, disability or other legally protected statuses.

Audio
remote: USA
added Sat Oct 14, 2023

About the role:

We are looking for a backend engineer to join the team developing APIs, building application backends, and putting in place scalable backend infrastructure across our audio platform. You will be responsible for designing scalable server-side applications and robust APIs to serve our audio ML models. The ideal candidate will have experience provisioning large compute clusters for machine learning workflows and a strong history of helping teams establish best practices for reliability and scalability.

Responsibilities:

  • Design, develop, and maintain internal & external APIs and microservices
  • Build robust application backends to serve our audio products
  • Define comprehensive API specifications and documentation
  • Deliver customer-facing services, including account management, identity, single-sign-on, subscription billing, and self-service support tools, integrating with existing internal systems where necessary
  • Collaborate with the frontend team and product managers to implement new features
  • Contribute to system architecture design & decisions
  • Manage large compute clusters for ML inference and development
  • Deliver and manage our developer and researcher productivity tools, including CI/CD pipelines for deploying new machine learning models, orchestration, continuous/progressive deployments, test environments, feature flags, and GitHub
  • Own the orchestration, deployments, request middleware and any other microservices that are required to meet the needs of our API customers

Qualifications:

  • 5+ years of experience in backend engineering
  • Experience building ML infrastructure and working with large GPU clusters
  • Distributed system architecture design knowledge or experience with high traffic, high concurrency system development
  • Well-versed in data structures, data modeling, and database management systems as well as object and file storage systems
  • Experience coding in JavaScript and Python

Equal Employment Opportunity:

We are an equal opportunity employer and do not discriminate on the basis of race, religion, national origin, gender, sexual orientation, age, veteran status, disability or other legally protected statuses.

Communications and Marketing
remote: USA
added Wed Sep 06, 2023

The Role:

We are seeking a talented and experienced Public Relations Manager with expertise in corporate and crisis communications to join our team at Stability AI. The ideal candidate will be a strategic thinker, adept at managing both the proactive corporate communications efforts and the reactive crisis communications strategies. As the Public Relations Manager, you will play a crucial role in safeguarding the company's reputation, building strong relationships with stakeholders, and effectively communicating our mission and values.

Responsibilities:

Corporate Communications:

  • Develop and execute a comprehensive corporate communications strategy that aligns with Stability AI's brand identity and business goals.
  • Create compelling messaging and content for press releases, corporate announcements, and internal communications.
  • Cultivate and maintain relationships with media outlets, industry influencers, and key stakeholders.
  • Manage and enhance the company's online presence through social media, website content, and thought leadership articles.
  • Collaborate cross-functionally with marketing, business development, and leadership teams to ensure consistent messaging and brand alignment.

Crisis Communications:

  • Develop and maintain a robust crisis communications plan that outlines protocols and strategies for managing potential crises.
  • Act as the primary point of contact for crisis situations, providing timely and accurate information to internal and external stakeholders.
  • Lead crisis response efforts, including drafting crisis statements, coordinating media responses, and managing spokesperson training.
  • Monitor and analyze media coverage and online conversations to identify potential issues and sentiment trends.
  • Conduct post-crisis evaluations to identify areas for improvement and update crisis communication plans accordingly.

Media Relations:

  • Foster positive relationships with journalists, editors, and media representatives to ensure accurate and fair coverage of Stability AI.
  • Pitch stories, interviews, and thought leadership pieces to relevant media outlets to increase the company's visibility.
  • Prepare executives and spokespersons for media interviews, ensuring consistent messaging and effective communication.

Qualifications:

  • Bachelor's degree in Communications, Public Relations, Journalism, or related field (Master's degree preferred).
  • 7+ years of proven experience in public relations, corporate communications, or crisis management, preferably in a fast-paced and dynamic environment.
  • Strong understanding of PR strategies, crisis communication techniques, and media landscape.
  • Exceptional written and verbal communication skills, with the ability to craft clear and compelling messages.
  • Demonstrated success in managing media relations and securing media coverage.
  • Proficiency in using PR tools, social media platforms, and analytics to measure and improve PR efforts.
  • Ability to thrive under pressure, make quick decisions, and adapt to rapidly changing situations.
  • Excellent interpersonal skills and the ability to collaborate effectively with cross-functional teams.
  • High level of integrity, professionalism, and discretion when handling sensitive information.
  • Strong organizational and project management skills, with the ability to manage multiple priorities simultaneously.
  • Availability to work outside regular business hours during crisis situations.
  • Experience working in AI or tech environments is a plus.

Equal Employment Opportunity:

We are an equal opportunity employer and do not discriminate on the basis of race, religion, national origin, gender, sexual orientation, age, veteran status, disability or other legally protected statuses.

Communications and Marketing
remote: USA
added Wed Sep 06, 2023

The Role:

As the Public Relations Manager for Stability AI, you will play a critical role in shaping the communication strategies that highlight our commitment to ensuring the stability, reliability, and ethical deployment of AI technologies. You will be responsible for creating a cohesive narrative that resonates both internally and externally, fostering understanding, trust, and collaboration among stakeholders.

Responsibilities:

Internal Communications:

  • Collaborate closely with cross-functional teams, including AI researchers, engineers, and policy experts, to deeply understand the nuances of Stability AI initiatives.
  • Develop and execute internal communication plans that keep employees informed, engaged, and aligned with the goals and progress of Stability AI projects.
  • Create engaging content, such as newsletters, intranet updates, and presentations, to convey technical concepts and achievements to non-technical internal audiences.
  • Facilitate knowledge-sharing sessions and workshops to enhance awareness and understanding of Stability AI efforts among different teams.

External Communications:

  • Craft and manage the external communication strategy for Stability AI, aimed at stakeholders including the media, partners, regulators, and the public.
  • Develop and maintain relationships with relevant media outlets, journalists, and influencers to ensure accurate and informed coverage of Stability AI initiatives.
  • Create compelling press releases, articles, blog posts, and social media content that effectively communicate the significance of our work in AI stability and safety.
  • Respond to media inquiries and proactively engage in thought leadership activities to position Stability AI as a leader in AI.

Stakeholder Engagement:

  • Collaborate with cross-functional teams to create consistent messaging and branding that aligns with Stability AI values and goals.
  • Engage with external stakeholders, including regulatory bodies, industry associations, and advocacy groups, to promote transparency and gather insights.
  • Represent Stability AI at conferences, webinars, and panels, sharing expertise on AI stability and fostering meaningful connections within the industry.

Qualifications:

  • Bachelor's degree in Communications, Public Relations, Journalism, or a related field (Master's degree preferred).
  • 7+ years of experience in public relations or communications, with a proven track record of successfully managing campaigns in the technology or AI sector.
  • Strong grasp of AI concepts, ethics, and challenges, along with an ability to translate technical jargon into clear, accessible language.
  • Excellent writing and editing skills, with the ability to tailor content to different audiences and platforms.
  • Proficiency in crisis communication, reputation management, and media relations.
  • Collaborative and adaptable team player with the ability to thrive in a fast-paced, evolving environment.
  • Experience working in AI or tech environments is a plus.

Equal Employment Opportunity:

We are an equal opportunity employer and do not discriminate on the basis of race, religion, national origin, gender, sexual orientation, age, veteran status, disability or other legally protected statuses.

Engineering
remote: London
added Sat Oct 14, 2023

About the role

We have a huge amount of computing resources and offer access to innovative cloud products, the training and creation of new AI models (Image/Audio/LLM), and research into the security of those models. We are looking for the first member of a newly formed team, who’ll work with the director and CISO to create a roadmap that will ensure the future security of Stability AI.

Responsibilities

  • You’ll help us future-proof, build roadmaps, and foresee issues that might arise with scale
  • Author, maintain, and extend Infrastructure as Code and CI/CD pipelines for build, test, and deployment.
  • Respond to AWS monitoring alerts relating to security, performance, and availability
  • Contribute to product deployments.
  • Remediate security vulnerabilities identified by vulnerability management tools.
  • Remediate operational issues in the development, staging and production environments.
  • Author, maintain, extend, and improve automation.
  • Contribute to operational improvements while building an operational excellence mindset.
  • You’ll help with security compliance standards such as ISO 27001:2022 and SOC 2

Qualifications

  • A strong background in DevOps, DevSecOps, Security or Cloud Engineering.
  • Expert-level cybersecurity experience.
  • Expert-level knowledge of software security and development practices and implementations.
  • Expert-level knowledge and experience of Kubernetes/Docker/Terraform and AWS (GCP a bonus!)
  • 5+ years of demonstrable and significant experience implementing Docker and Kubernetes in an enterprise environment.
  • 5+ years of experience delivering CI/CD pipelines to automate everything.
  • Solid experience working with languages like Python/Go for automation

Equal Employment Opportunity:

We are an equal opportunity employer and do not discriminate on the basis of race, religion, national origin, gender, sexual orientation, age, veteran status, disability or other legally protected statuses.

Engineering
remote: USA
added Thu Sep 21, 2023

About the role:

We are looking for a talented Engineer with a focus on High-Performance Computing who will work with a growing multidisciplinary team of research scientists and machine learning engineers to improve and scale the efficiency of our computing capacity. Stability AI operates a very large HPC cluster for training foundational AI models across several modalities. Operating, automating, monitoring and troubleshooting issues with the cluster is strategically important to the long-term success of the business. This HPC Engineer role is critically important to our company, and the ideal candidate will have a passion for making incremental, measurable improvements, as well as for solving unique problems that have yet to be solved in our industry.

Responsibilities:

  • Maintain HPC Cluster Operations: Ensure the smooth operation of HPC clusters, including routine maintenance, software updates, and hardware optimizations.
  • Monitor and Recover Dead Nodes: Continuously monitor cluster nodes, identify dead nodes, and implement recovery procedures to minimize downtime.
  • Documentation: Maintain detailed documentation of dead node incidents, their root causes, and resolutions for future reference and improvement.
  • Shared Volumes Management: Monitor the health and usage of shared volumes, and collaborate with users to enforce cleanup procedures.
  • POSIX Permissions Enforcement: Monitor and contact users who do not adhere to POSIX permissions standards on shared storage to enhance security.
  • HPC Help Center Support: Monitor and respond to user queries and issues submitted to the HPC Help Center, providing timely solutions and assistance.
  • Job Launch Support: Assist users in launching jobs efficiently, reducing the need for constant supervision and ensuring optimal job execution.
  • Optimizing Low-Priority Jobs: Guide users on maximizing the utilization of low-priority jobs through strategies such as preemption robustness and auto-requeueing.
  • S3 Access Permissions: Maintain and troubleshoot S3 access permissions, resolving access issues and ensuring data integrity.
  • Interactive Job Monitoring: Monitor all CPU clusters for users who forget to end interactive jobs and take appropriate actions to maintain cluster availability.
  • Authentication and Authorization: Develop and maintain processes related to authentication, authorization, and accounting for cluster usage, ensuring secure access management.
  • Security Measures: Implement and enhance security protocols for HPC clusters, including tools for rapid access removal in case of security risks.
  • Slurm Scheduling Deployment: Convert and deploy Slurm scheduling for various cloud resources, including Kubernetes (K8s), TPUs, and Trainium.
  • Slurm Support: Issue and resolve Slurm support tickets with external Slurm support providers to address scheduling and cluster management issues.
  • AWS Resource Management: Maintain and manage AWS resources associated with HPC clusters, including login nodes, S3 buckets, FSx volumes, VPCs, subnets, NAT Gateways, S3 VPC Endpoints, and routing tables.

Requirements:

  • Bachelor's degree in computer science, information technology, or a related field. Master's degree preferred.
  • Proven experience in high-performance computing (HPC) administration and maintenance.
  • Proficiency in HPC cluster management tools and technologies, with a strong focus on Slurm scheduling.
  • Knowledge of cloud computing platforms, particularly AWS, and experience with managing associated resources.
  • Strong scripting and programming skills (e.g., Bash, Python) for automation and system optimization.
  • Familiarity with authentication, authorization, and accounting (AAA) processes for cluster usage.
  • Understanding of security best practices and the ability to quickly respond to security threats.
  • Excellent communication skills to effectively collaborate with users, solve issues, and provide guidance.
  • Attention to detail and the ability to document processes and solutions effectively.

Equal Employment Opportunity:

We are an equal opportunity employer and do not discriminate on the basis of race, religion, national origin, gender, sexual orientation, age, veteran status, disability or other legally protected statuses.

Engineering
remote: London
added Sat Oct 14, 2023

About the role

We are looking for an IT Specialist who will be an integral part of our dedicated team, driving the stability, security, and innovation of our technological landscape. Your expertise and experience will be instrumental in maintaining our systems at peak performance, implementing strategic IT initiatives, and ensuring seamless technical support for our employees. You will collaborate closely with cross-functional teams, leveraging your technical prowess to contribute to the growth and success of Stability. This role provides a unique opportunity to work within the Generative AI space, making a significant impact on the digital infrastructure that drives our operations.

Responsibilities

  • You’ll help us future-proof, build roadmaps, and foresee issues that might arise with scale
  • You’ll onboard both technical and non-technical joiners
  • You’ll collaborate with external vendors, consultants, and partners to ensure effective delivery of IT services.
  • Monitor and troubleshoot network performance, resolve connectivity issues, and optimize network architecture for efficiency.
  • You’ll own the network architecture for the London office and consider the implications of scale
  • You’ll work closely with cybersecurity colleagues
  • You’ll train employees on new technologies, software applications, and cybersecurity best practices
  • You’ll get the chance to gain certifications through Google such as ISO 27001:2022 and SOC2

Qualifications

  • Minimum of 3 years of proven work experience as an IT Specialist or in a similar role.
  • Must have automation skills and proficiency in programming languages and tools such as Python, PowerShell, Bash, Go, and Terraform
  • In-depth knowledge of computer systems, networks, security protocols, and IT best practices.
  • Proficiency in managing Windows, Linux, and macOS environments.
  • Excellent problem-solving skills with the ability to diagnose and resolve complex technical issues.
  • Proficiency in Google Workspace from an administration perspective.
  • Solid understanding of cybersecurity practices and data protection regulations.

Equal Employment Opportunity:

We are an equal opportunity employer and do not discriminate on the basis of race, religion, national origin, gender, sexual orientation, age, veteran status, disability or other legally protected statuses.

Engineering
remote: USA
added Wed Sep 06, 2023

About the role:

We are looking for an IT Specialist who will be an integral part of our dedicated team, driving the stability, security, and innovation of our technological landscape. Your expertise and experience will be instrumental in maintaining our systems at peak performance, implementing strategic IT initiatives, and ensuring seamless technical support for our employees. You will collaborate closely with cross-functional teams, leveraging your technical prowess to contribute to the growth and success of Stability.ai. This role provides a unique opportunity to work within the Generative AI space, making a significant impact on the digital infrastructure that drives our operations.

Responsibilities:

  • Manage and maintain the company's IT infrastructure, including servers, networks, hardware, and software applications.
  • Ensure the security and integrity of data, network access, backup systems, and disaster recovery procedures.
  • Monitor and troubleshoot network performance, resolve connectivity issues, and optimize network architecture for efficiency.
  • Provide technical support to employees by diagnosing and resolving hardware, software, and network problems in a timely manner.
  • Install, configure, and upgrade computer hardware and software components as required.
  • Collaborate with the IT team to plan and implement system upgrades, patches, and new software rollouts.
  • Implement and enforce IT policies and procedures to ensure data security and compliance with relevant regulations.
  • Stay current with industry trends and emerging technologies, making recommendations for technology adoption that align with company goals.
  • Assist in the planning and execution of IT projects, including the evaluation of technology solutions, vendor selection, and project implementation.
  • Create and maintain documentation for IT processes, procedures, configurations, and user guides.
  • Train employees on new technologies, software applications, and cybersecurity best practices.
  • Collaborate with external vendors, consultants, and partners to ensure effective delivery of IT services.

Qualifications:

  • Bachelor's degree in Information Technology, Computer Science, or a related field. Master's degree is a plus.
  • Minimum of 3 years of proven work experience as an IT Specialist or in a similar role.
  • Must have automation skills and proficiency in programming languages and tools such as Python, PowerShell, Bash, Go, and Terraform
  • In-depth knowledge of computer systems, networks, security protocols, and IT best practices.
  • Proficiency in managing Windows, Linux, and macOS environments.
  • Strong experience with cloud services such as AWS or GCP.
  • Excellent problem-solving skills with the ability to diagnose and resolve complex technical issues.
  • Proficiency in Google Workspace from an administration perspective.
  • Solid understanding of cybersecurity practices and data protection regulations.
  • Strong communication and interpersonal skills to collaborate effectively with cross-functional teams and communicate technical information to non-technical stakeholders.
  • Proven track record of implementing and managing IT projects effectively.
  • Ability to stay up-to-date with the latest technology trends and advancements.

Equal Employment Opportunity:

We are an equal opportunity employer and do not discriminate on the basis of race, religion, national origin, gender, sexual orientation, age, veteran status, disability or other legally protected statuses.

Legal
remote: USA
added Thu Aug 31, 2023

If you are eager to be part of a team that values innovation, fosters growth, and embraces diversity, we invite you to apply for the Associate General Counsel - Commercial position at Stability AI. Join us in our mission to make a difference in the AI sector and contribute to our ongoing success!

The Role:

Stability AI’s small legal team is growing quickly, and we are seeking a dynamic and experienced Associate General Counsel for Commercial to join it. The candidate must be ready to scale and lead a global commercial legal function upon joining, including hiring and managing direct reports consistent with company growth. You will have the opportunity to shape business and legal strategy in an industry with little legal precedent, brand new legal questions and concerns, and quickly evolving business models. While a technical background is not a requirement for the role, curiosity, the willingness to learn about new technologies, and the ability to translate an engineer’s technical explanations into actionable legal mitigation measures and contractual provisions with the assistance of Product Counsel are a must.

This is a remote role. Though we are a global company with headquarters in London, the preferred location is the Pacific time zone.

Responsibilities:

● Negotiate, prioritize, and otherwise manage commercial transactions with customers and partners with minimal oversight
● Work closely with engineers and Product/Privacy Counsel to understand Stability AI’s products and services in order to address any special legal requirements or risks related to Stability AI’s products and services in Stability AI’s agreements, and to clearly and persuasively communicate with negotiating parties who will often have limited familiarity with Stability AI’s technology
● Analyze, identify, and mitigate potential legal risks in our commercial dealings
● Work with stakeholders from finance, sales, and business development teams to ensure efficiency and cohesion in managing customer and partner transactions
● Create templates and playbooks as needed; work on complex, one-off customer and partner agreements
● Lead the legal team’s efforts to enter into new foreign markets
● Select and manage the implementation of IT services necessary for commercial transactions, such as digital contract management tools, order form creation tools, or online click-to-accept vendors. Train others on how to use these tools as needed.
● Stay current on legal and regulatory developments that impact the technology sector, and adapt our contracts and commercial operations accordingly
● Implement global trade compliance programs to ensure we conduct business in accordance with legal requirements, and oversee training initiatives to educate employees about them
● Contribute to the continuous improvement of legal processes, templates, and best practices to enhance operational efficiency
● As the company grows, hire and manage a team of direct reports, including creating annual budgets, developing hiring plans, and providing performance feedback and individualized growth plans for team members

Qualifications:

● Juris Doctor (JD) degree from an accredited law school and active bar membership in good standing
● 12+ years of experience as an attorney, with a significant portion of that experience focused on commercial transactions, technology transactions, and related areas
● Previous in-house leadership role at a technology company
● Ability to address problems creatively
● Practical approach to commercially reasonable risk mitigation in the face of both legal and business uncertainty
● Experience interviewing and evaluating candidates for legal roles
● Proven track record of independently drafting, negotiating, and reviewing various types of contracts and agreements
● Familiarity with both consumer-facing and enterprise-facing contracts
● Strong understanding of intellectual property law, data privacy regulations (e.g., GDPR, CCPA), and technology licensing
● Familiarity with channel (distributors, resellers) and related practice issues
● Working knowledge of revenue recognition rules and basic tax provisions
● Excellent analytical, problem-solving, and communication skills, with the ability to distill complex legal concepts into clear and actionable guidance for non-legal stakeholders.
● Ability to work both independently and collaboratively in a fast-paced, innovative environment
● Experience working with cross-functional teams and effectively managing multiple priorities
● Proactive mindset to identify and address legal risks
● High level of integrity, professionalism, and attention to detail
● Flexibility to adapt to evolving priorities
● Ability to foster relationships and collaborate with internal clients
● Experience in gathering and maintaining metrics for commercial deal flow
● Familiarity with emerging technologies, including AI and data analytics
● Experience with open source software licensing, data licensing, or source-available licensing

Preferred Qualifications:

● Privacy and data security certifications (CIPP/US, CIPP/E)
● Familiarity with online music licensing (PROs, Publishers, Labels)
● Knowledge of copyright laws
● Experience in drafting global agreements or agreements for foreign markets
● Experience doing deals with universities or other research institutions
● Experience doing deals with federal, state, and local governments in the US and abroad
● Experience with selecting and managing outside counsel
● Experience with export control regulations
● Knowledge of laws related to bribery and corruption in the US and abroad
● Excellent repertoire of English colloquialisms or appreciation for British humor (Stability AI is headquartered in London)

Equal Employment Opportunity:

We are an equal opportunity employer and do not discriminate on the basis of race, religion, national origin, gender, sexual orientation, age, veteran status, disability or other legally protected statuses.

Partnership + Business Development
remote
added Tue Oct 24, 2023
Apply to Stability AI
About The Role:
As a Sales Operations and Enablement Lead at Stability AI, you will play a critical role in driving the success of our sales team. You will support day-to-day business operations and drive reporting and strategy to optimize business functions. You will be responsible for building out the company’s internal operations at scale and for ensuring that those operations work efficiently and effectively towards the company’s goals and vision, leveraging data and analytics to scale the business.
What You Will Accomplish:
  • Sales Data Analysis: Analyzing sales data and customer usage data to identify trends, opportunities, and areas for improvement, and presenting actionable insights to the sales team and management.
  • Sales Forecasting: Developing accurate sales forecasts to aid in budgeting and resource allocation.
  • CRM Management: Administering and optimizing our CRM system to ensure it supports the sales team’s needs effectively.
  • Sales Process Optimization: Continuously improving sales processes to enhance efficiency and effectiveness.
  • Reporting: Creating and delivering regular reports on sales performance and key metrics.
  • Sales Support: Providing crucial support to the sales team through the provision of tools, training, and materials.
  • Sales Metrics: Defining, tracking, and reporting on key performance metrics to assess and improve sales performance.
  • Sales Strategy: Collaborating with the sales and marketing teams to develop and execute sales strategies that drive growth and market penetration.
  • Sales Training: Assisting in the training and development of sales representatives to enhance their skills and knowledge.
  • Technology Integration: Implementing and managing sales-related software and technology to streamline processes.
  • Competitor Analysis: Monitoring and analyzing competitor activities and market trends to help us maintain a competitive edge.
  • Customer Insights: Collecting and analyzing customer feedback and data to improve sales strategies and customer satisfaction.
What You Will Bring:
  • Bachelor’s degree in Business Administration, Finance, Economics, Engineering, Computer Science or related field
  • 5+ years in a Sales Operations, Sales Strategy, Business Development, and/or Finance role
  • Deep experience with analyzing large data sets and providing recommendations
  • Experience utilizing data visualization tools for reporting KPIs
  • Experience implementing tools and using a CRM
  • Experience in internal forecasting and revenue reporting
  • Start-up experience preferred
  • AI experience preferred

Compensation

The salary range for this role is between $126,000 and $234,000. Individual pay within the range is based on factors like job-related skills and experience. Total compensation also includes stock options and benefits.

Equal Employment Opportunity:

We are an equal opportunity employer and do not discriminate on the basis of race, religion, national origin, gender, sexual orientation, age, veteran status, disability or other legally protected statuses.

Product
remote
added Fri Jun 23, 2023
Apply to Stability AI
Can’t find what you’re looking for? We’re always looking for people with unique skills and passion. Submit your CV / Resume here, and we’ll be in touch when we have an opening that matches!

Research
remote
added Fri Jun 23, 2023
Apply to Stability AI

About the role:

We are looking for Engineers and Researchers in the machine learning discipline who are passionate about generative models and creative applications of AI. In particular, we are looking for people who share our mission of open-source research; people who do not believe AI models should be controlled by a centralized gatekeeper behind a closed wall, but rather be truly open and under the control of all. We want highly creative researchers who are motivated to push the boundaries of generative models research, not just in state-of-the-art performance, but in pushing the efficient frontier between performance and resource usage. You will have access to state-of-the-art high-performance computing resources and you will be able to work alongside top researchers and engineers to truly make an impact in the fast-growing world of generative AI.

As an AI Compiler/Performance Engineer, you will design and implement significant parts of the Stability.ai Compiler and Runtime, targeting efficient training and deployment of our models. You will work on performance analysis, design and implement new optimization passes, and develop methods for new backend targets on custom devices. You will be at the forefront of moving Machine Learning Frameworks from hand-written kernels to efficiently generated codegen kernels. You will be responsible for developing the research/engineering agenda, supervising its execution, and guiding the group of engineers who follow it. You will work closely with key stakeholders within Stability.ai as well as external entities (HW/SW providers) to steer the engineering efforts towards more efficient model execution.

Responsibilities

  • Analyze and design effective compiler optimizations

  • Implement and/or enhance code generation targeting machine learning accelerators

  • Develop hardware-aware optimizations for emerging ML algorithms across a spectrum of HW platforms (GPUs, TPUs, CPUs, custom ASICs, edge devices)

  • Contribute to the development of machine-learning libraries and intermediate representations

  • Employ scientific methods to evaluate performance and to debug, diagnose and drive resolution of cross-disciplinary system issues

  • Work with algorithm research teams to map graphs to hardware implementations, model data-flows, create cost-benefit analysis and estimate cluster or silicon power and performance

  • Work with research team to execute research agenda

  • Work with open-source community on model release and tooling

  • Work with engineering / business teams on model deployment and customized training

  • Develop testing plans

  • Analyze trade-offs and risk mitigation strategies, and communicate them to internal and external stakeholders

  • Oversee a team of engineers, provide technical direction and engineering leadership

Qualifications

  • 2+ years of experience with an MS or PhD (preferred) in Computer Science, Electrical Engineering or equivalent field

  • Experience in deep learning algorithms, frameworks and their Intermediate Representations, e.g. PyTorch/Glow, JAX, TensorFlow XLA, LLVM/MLIR, Apache TVM

  • Good understanding of benchmarking/profiling, analyzing performance, building performance models for a given task/device

  • Familiar with concepts such as roofline modeling, FLOP/memory utilization, power consumption, latency

  • Good understanding of language design, compiler optimizers, backend code generators

  • Ability to communicate research/engineering ideas effectively through writing and visualization

Equal Employment Opportunity:

We are an equal opportunity employer and do not discriminate on the basis of race, religion, national origin, gender, sexual orientation, age, veteran status, disability or other legally protected statuses.

Research
remote: Germany
added Thu Sep 21, 2023
Apply to Stability AI

About the role:

We are looking for a talented Data Engineer with a focus on scaling efficient distributed workloads. You will work alongside a growing multidisciplinary team of talented research scientists and machine learning engineers to improve and scale the efficiency within our computing capacity. Stability AI operates a very large HPC cluster for training foundational AI models across several modalities. In this role, you will contribute to groundbreaking projects that redefine visual storytelling through advanced generative modeling techniques, and you will optimize and manage large-scale distributed workloads to drive project efficiency and success.

Responsibilities:

  • Clean, normalize, and preprocess data in a scalable, parallelizable way to prepare it for ingestion into our machine learning model training pipelines while ensuring data quality
  • Build and maintain highly scalable distributed workloads
  • Build data pipelines to ingest and process data (e.g. images and text) for feeding into ML models
  • Manage AWS resources
  • Keep up-to-date with papers / methods regarding how to improve data quality and/or curate data for Image, Video, LLMs etc.

Qualifications:

  • Proven background in large-scale distributed workloads
  • Experience with large-scale data loading for machine learning training runs
  • Experience with cloud storage and file systems. AWS (S3) is strongly preferred, but open to other cloud platforms
  • Experience with Python + PyTorch, Deep learning, Computer Vision
  • Experience with multiprocessing and multithreading Python workloads
  • Experience with parallel dataframe manipulation using PySpark/Ray
  • Proficiency in HPC cluster management tools and technologies
  • Excellent communication skills to effectively collaborate with users, solve issues, and provide guidance.
  • Attention to detail and the ability to document processes and solutions effectively.
  • Nice to have: Experience with the data loading stack (WebDataset, TorchData, fsspec, AIStore)

Equal Employment Opportunity:

We are an equal opportunity employer and do not discriminate on the basis of race, religion, national origin, gender, sexual orientation, age, veteran status, disability or other legally protected statuses.

Research
remote
added Fri Jul 21, 2023
Apply to Stability AI

About the role:

We are looking for experts in Machine Learning and 3D Graphics who are passionate about generative models and creative applications of AI. In particular, we are looking for people who share our mission of open-sourcing machine learning models; people who do not believe AI models should be controlled by a centralized gatekeeper behind a closed wall, but rather be truly open and under the control of all. You will have access to state-of-the-art high-performance computing resources and you will work alongside other creative, hard-working top researchers and engineers to make a lasting, positive impact in the fast-growing world of generative AI.

Responsibilities:

  • Contribute to the creation and improvement of applications and use cases such as text-to-3D, 2D-to-3D, text-to-4D, simple and fast 3D editing… and make systematic progress towards enabling anyone to easily and quickly create complex, animated and interactive 3D characters and environments
  • Take ownership of new and existing ML and 3D graphics features across the whole product lifecycle, from developing prototypes to assisting the teams charged with production and maintenance, with minimal supervision
  • Collaborate within Stability and in the open-source community on developing the next generation of 3D-aware neural models, where you may assist with areas such as optimization of model training, model tuning, dataset engineering, automation of processes, tooling, data translation, integration in game engines, etc.
  • Always look for opportunities to delight our customers and improve their lives through AI
  • If you want: publish new ideas on arXiv and in major conferences

Qualifications:

  • 3+ years working on state-of-the-art ML or 3D graphics projects (ideally both), including training, fine-tuning ML models, neural graphics, raytracing, etc.
  • Experience researching and implementing SOTA algorithms related to ML or 3D, e.g. Neural Radiance Fields, real-time 3D graphics, diffusion models, 3D file format conversion, etc.
  • Experience with Python scientific stack, PyTorch or similar ML frameworks
  • Communicate ML insights and results effectively: verbally, in writing and through visualization
  • Self-motivated, well-organized (e.g. keep track of your own priorities and refine the plan according to new insights), self-reliant (able to debug complex issues with minimal supervision)

Desirable but optional:

  • ML and 3D open-source projects or influential publications
  • Performance optimization, such as profiling shaders and/or ML systems, C++/CUDA, algorithmic improvements, ML quantization and distillation
  • 3D graphics and game development, e.g. OpenGL, DirectX, Vulkan, Unreal Engine, Unity, Godot
  • JAX, TPUs, compiler development

Equal Employment Opportunity:

We are an equal opportunity employer and do not discriminate on the basis of race, religion, national origin, gender, sexual orientation, age, veteran status, disability or other legally protected statuses.

Research
remote
added Fri Jun 23, 2023
Apply to Stability AI

About the role:

We are looking for Machine Learning engineers who are passionate about generative models and creative applications of AI. In particular, we are looking for people who share our mission of open-source machine learning models; people who do not believe AI models should be controlled by a centralized gatekeeper behind a closed wall, but rather be truly open and under the control of all. We want highly creative ML engineers who are motivated to push the boundaries of generative models. You will have access to state-of-the-art high-performance computing resources and you will be able to work alongside top researchers and engineers to truly make an impact in the fast-growing world of generative AI.

Responsibilities:

  • Lead efforts to drive the design, development, and production of ML systems, and present the solutions to customers
  • Work with Research team on developing the next generation of models, where you may assist with areas such as optimization of model training, model tuning, dataset engineering, HPC clusters, tooling, and work on open-source efforts
  • Be a strategic thought partner for leaders across the organization on driving business impact through machine learning
  • Work on the Commercial side - productionizing generative models and building the infrastructure to serve them at scale, or building customized models for commercial applications
  • Prototype and productionize model architecture improvements and new features

Qualifications:

  • 3+ years working on machine learning projects, including training, fine-tuning and refining models
  • Experience with Python scientific stack, PyTorch, creating Jupyter/Colab notebooks
  • Experience with JAX, TPUs, CUDA-level programming, or JavaScript (TensorFlow.js, etc.) is a plus
  • Ability to communicate machine learning concepts and results effectively through writing and visualization
  • Experience with training and/or deploying ML models with Amazon AWS (SageMaker a plus) or Google Cloud
  • Experience with building interactive web demos that serve generative ML models
  • Experience with the open-source ML ecosystem (HuggingFace, W&B, etc.)
  • Experience with Linux and command line tools

Research
remote
added Tue Oct 24, 2023
Apply to Stability AI

About the role:

We are looking for model safety engineers in the text and image modalities who are passionate about generative models and creative applications of AI and who, at the same time, understand the risks they can pose. In particular, we are looking for individuals with experience in red teaming as well as in defending against attacks such as malicious prompts for text-to-image and language models. The ideal candidate will have a deep understanding of natural language processing (NLP) techniques and a track record of developing and implementing effective strategies to mitigate risks associated with text-to-image and language models. They should be proactive in identifying potential vulnerabilities and possess a hacker mindset to anticipate and counteract malicious inputs. Additionally, a strong ethical compass and a commitment to the responsible use of AI technology are essential attributes we seek in the ideal Model Safety Engineer.

Responsibilities:

  • Conduct red teaming exercises to identify vulnerabilities and potential misuse of AI models.
  • Develop and implement defense strategies to protect against harmful inputs and outputs.
  • Evaluate and deploy guardrails to ensure the responsible and ethical use of our models.
  • Collaborate with cross-functional teams to integrate safety measures into model development pipelines.
  • Stay up-to-date with the latest research and advancements in AI safety to continuously improve our safeguards.

Qualifications:

  • 3+ years of working experience in AI safety, ethics, or security, with a focus on text-to-image and language models.
  • Strong programming skills in Python and experience with relevant libraries and frameworks.
  • Knowledge of AI/ML model architectures and the ability to assess their vulnerabilities.
  • Familiarity with red teaming methodologies and ethical hacking practices for AI models.
  • Excellent communication skills to convey complex technical concepts effectively.
  • Experience with cloud platforms like Amazon AWS or Google Cloud is a plus.
  • Commitment to the ethical use of AI and a dedication to ensuring AI technology benefits society.
  • Ability to communicate machine learning concepts and results effectively through writing and visualization.
  • Ability to work in a fast-paced, remote, and collaborative startup environment.

Research
remote
added Fri Jun 23, 2023
Apply to Stability AI

About the role:

We are looking for Research Scientists in the machine learning discipline who are passionate about generative models and creative applications of AI. In particular, we are looking for people who share our mission of open-source research; people who do not believe AI models should be controlled by a centralized gatekeeper behind a closed wall, but rather be truly open and under the control of all. We want highly creative researchers who are motivated to push the boundaries of generative models research, not just in state-of-the-art performance, but in pushing the efficient frontier between performance and resource usage. You will have access to state-of-the-art high-performance computing resources and you will be able to work alongside top researchers and engineers to truly make an impact in the fast-growing world of generative AI.

Responsibilities:

  • Work with research team to execute research agenda
  • Build the next generation of creative generative AI models
  • Publish results at top conferences, in journals, and in blog posts
  • Work with open-source community on model release and tooling
  • Work with engineering / business teams on model deployment and customized training

Qualifications:

  • Papers, projects, or blog posts that have had a high impact in generative AI
  • Experience with dataset curation, rather than relying solely on spoon-fed research datasets
  • Ability to communicate research ideas effectively through writing and visualization
  • Experience with Python scientific stack, PyTorch, creating Jupyter/Colab notebooks
  • Experience with training large models on a compute cluster environment is a plus

Equal Employment Opportunity:

We are an equal opportunity employer and do not discriminate on the basis of race, religion, national origin, gender, sexual orientation, age, veteran status, disability or other legally protected statuses.

Research
remote
added Tue Oct 24, 2023
Apply to Stability AI

About the role:

We are seeking a highly motivated and detail-oriented Safety Researcher to join our team. The Safety Researcher will play a crucial role in ensuring the safety and well-being of our organization, employees, and customers. This role involves conducting in-depth research, analyzing safety data, and developing strategies to mitigate risks and promote a secure work environment.

Responsibilities:

  • Safety Analysis: Conduct comprehensive research and analysis of safety-related data, incidents, and trends. Identify potential hazards, risks, and areas for improvement.
  • Regulatory Compliance: Stay up-to-date with local, state, and federal safety regulations and standards. Ensure the organization’s adherence to all safety-related requirements.
  • Safety Procedures: Develop and implement safety policies, procedures, and protocols. Collaborate with various departments to ensure consistent safety practices.
  • Risk Assessment: Perform risk assessments to evaluate potential safety hazards and recommend corrective actions. Proactively identify and mitigate safety risks.
  • Data Collection: Gather and maintain safety-related data, incident reports, and documentation. Utilize data to drive evidence-based decision-making.
  • Training and Awareness: Organize and facilitate safety training programs for employees. Promote safety awareness and best practices throughout the organization.
  • Emergency Response: Develop and maintain emergency response plans and protocols. Coordinate drills and exercises to ensure preparedness.
  • Reporting: Prepare regular reports on safety performance, incidents, and compliance. Communicate findings to management and make recommendations for improvements.
  • Collaboration: Collaborate with cross-functional teams, safety committees, and external agencies to achieve safety objectives.

Qualifications:

  • Proven experience in safety research, risk assessment, and safety management.
  • Knowledge of safety regulations and standards.
  • Strong analytical and problem-solving skills.
  • Excellent communication and presentation abilities.
  • Detail-oriented with a commitment to accuracy.
  • Ability to work independently and as part of a team.
  • Certification in safety management (e.g., Certified Safety Professional) is a plus.
  • Bachelor’s degree in Safety Management, Occupational Health and Safety, or a related field (Master’s degree preferred).

Personal Attributes:

  • Passion for safety and a commitment to creating a secure work environment.
  • Proactive and results-driven.
  • Strong ethical and professional integrity.
  • Ability to adapt to changing circumstances and priorities.

Equal Employment Opportunity:

We are an equal opportunity employer and do not discriminate on the basis of race, religion, national origin, gender, sexual orientation, age, veteran status, disability or other legally protected statuses.

Security
remote
added Tue Oct 24, 2023
Apply to Stability AI

VP Trust and Safety

About the role:

As the VP Trust and Safety at Stability AI, you will lead the global team responsible for protecting our users, platforms and communities from abuse by developing the policies and technology that keep Stability AI products safe, reliable and resistant to abuse.

Responsibilities:

  • Trust and Safety Strategy: Develop and execute a comprehensive trust and safety strategy that aligns with the company's goals and values.
  • Policy Development: Create and maintain guidelines, rules, and policies to prevent or mitigate fraudulent activities, harassment, and the generation of abusive or harmful content.
  • Red Teaming: Perform in-depth tests, analysis and identification of possible threats and abuse in new products and services.
  • Vendor Relationships: Collaborate with external partners and vendors to enhance trust and safety solutions.
  • Influence research and engineering: Make actionable recommendations to minimize potential harm.
  • Incident Response: Lead incident response efforts, investigating and addressing safety-related incidents promptly and effectively.
  • Data Analysis: Analyze user data to identify trends and patterns related to trust and safety issues.
  • Compliance: Ensure compliance with relevant laws and regulations related to user safety and data protection.
  • User Support: Oversee a team of trust and safety specialists to provide support to users facing safety-related issues.
  • Training and Education: Develop and deliver training programs to educate employees, users and the general public on trust and safety matters.
  • User Engagement: Promote user awareness and engagement in maintaining a safe platform.
  • Reports and Metrics: Prepare and present reports on trust and safety performance and key metrics to senior management.

Qualifications:

  • Bachelor's degree in a relevant field
  • Proven experience in trust and safety leadership, preferably in a tech or AI environment
  • Strong understanding of internet safety, privacy, and online community dynamics
  • Excellent problem-solving and analytical skills
  • Effective communication and conflict resolution abilities
  • Familiarity with legal and regulatory requirements related to safety and AI
  • Demonstrated leadership and team management experience

Equal Employment Opportunity:

We are an equal opportunity employer and do not discriminate on the basis of race, religion, national origin, gender, sexual orientation, age, veteran status, disability or other legally protected statuses.

Stability AI Japan
remote: Tokyo
added Sat Oct 14, 2023
Apply to Stability AI

About the role:

We are looking for a passionate Developer Community Associate (full-time) who can help build our developer and technical user community in Japan, including engineers, creators, designers, product managers, and startup founders who are interested in building upon our generative AI models and APIs. You will assist your team members in developing ideas to engage our developer community. You will perform a wide variety of tasks to grow, support, and improve our community.

Responsibilities:

  • Work with the Community Management Lead, BizDev, R&D, and other team members to brainstorm ideas for events, contests, and other community initiatives
  • Create and curate content (mostly for a technical audience) to educate and engage the community on the use cases of our models and APIs and highlight exceptional work from our community; share via blog posts, tutorials, videos, tweets, etc.
  • Monitor and respond to user feedback, questions, and comments in Discord etc.
  • Help organize and host online/offline community events including possibly hackathons, webinars, or study sessions, sometimes together with partner companies

Qualifications:

  • Good communication skills with fluency in Japanese and at least intermediate English reading/writing/speaking skills
  • Burning passion for generative AI including image generation and language models
  • 2+ years of experience in developer relations or other developer-facing role
  • 2+ years of experience in building / contributing to online communities
  • 3+ years of programming experience including Python
  • Basic familiarity with productivity tools such as spreadsheets, Notion, Trello, or similar tools
  • Experience with building/growing a successful blog or social media account is a big plus
  • Event organization experience is a plus

Equal Employment Opportunity:

We are an equal opportunity employer and do not discriminate on the basis of race, religion, national origin, gender, sexual orientation, age, veteran status, disability or other legally protected statuses.

Stability AI Japan
remote: Tokyo
added Fri Jun 23, 2023
Apply to Stability AI

About the role:

We are looking for a versatile ML Engineer who will train and deploy generative models for Japanese partners, clients, and the broader community. You will adapt quickly as we try various approaches to various industries in a fast-changing environment. We will focus especially on image/video models, large language models, and chatbots.

You will have access to state-of-the-art high-performance computing resources and you will be able to work alongside top researchers and engineers to truly make an impact in the fast-growing world of generative AI.

Responsibilities:

  • Lead efforts to drive the design, development, and productionization of ML models and systems, and present the solutions to partners and clients in Japan
  • Work on the commercial side - productionizing generative models and building the infrastructure to serve them at scale; collaborate with other engineers and researchers to build customized models for commercial applications
  • Be a strategic thought partner both internally and externally on driving business impact through machine learning
  • Conduct experiments on e.g. fine-tuning image generation models or LLMs for the Japan market
  • Prototype and productionize model architecture improvements and new features
  • Provide technical advice to partners/clients on generative models

Qualifications:

  • Good communication skills with fluency in Japanese and business-level English proficiency
  • 5+ years working on machine learning projects, including training, fine-tuning, and refining models
  • Familiarity with recent, important papers and projects in the generative machine learning space
  • Experience with the Python scientific stack, PyTorch, and creating Jupyter/Colab notebooks
  • Experience with the open-source ML ecosystem (HuggingFace, W&B, etc.)
  • Experience with training and/or deploying ML models on AWS (SageMaker a plus) or Google Cloud
  • Experience with Linux and command line tools

Equal Employment Opportunity:

We are an equal opportunity employer and do not discriminate on the basis of race, religion, national origin, gender, sexual orientation, age, veteran status, disability or other legally protected statuses.

Stability AI Japan
remote: Japan & London
added Tue Aug 08, 2023

About the role:

We are looking for a talented MLOps Engineer with a focus on Deep Learning and High-Performance Computing who will work with a growing multidisciplinary team of research scientists and machine learning engineers to improve and scale the efficiency of our computing capacity.

Responsibilities:

Optimizing Deep Learning Workflows:

  • Monitor reports and dashboards to detect low-utilization jobs, projects, and users
  • Partner with researchers to review their workflows when performance is lacking
  • Identify bottlenecks and suggest scripting optimisations
  • For high-scale jobs, introduce AWS's proprietary profiler and libraries to boost performance
  • Scale-up gating process: check script performance and vet requests to scale up
  • Build a knowledge base / best practices documentation for all researchers
  • Implement monitoring of CPU usage levels for our CPU clusters; identify users who need help writing code that maximizes CPU usage
  • Train researchers on best practices for implementing automation strategies that minimize human oversight of jobs

Develop and Test Strategies for Future Workloads:

  • Benchmark the capabilities of new systems and identify strategies to properly utilize them (H100, TRN2, TPUv5, Intel Gaudi)
  • Define the minimum needs for storage speeds and find better data loading strategies to support high processing demands of the new accelerators

High-Performance Computing:

  • Maintain HPC cluster operations
  • Monitor dead nodes and recover them; document dead nodes and their fixes
  • Monitor the health, usage, and clean-up needs of shared volumes; follow up with users to clean up
  • Partner with users who do not adequately use POSIX permissions on shared storage
  • Monitor the HPC Help Center and solve user problems
  • Assist users in properly launching their jobs
  • Maintain the future S3 access permissions, debug problems, etc.
  • Monitor all CPU clusters for users
  • Create and maintain processes around authentication, authorization, and accounting for cluster usage
  • Develop processes around security aspects of the HPC clusters, including tools to use in case security risks are identified (globally, by user, by team, by location, etc.)
  • Convert and deploy SLURM scheduling for all clouds and all resource types; integrate TPUs into our larger enterprise approach when SLURM becomes available
  • If Kubernetes infrastructure is needed for research, maintain SLURM on top of Kubernetes
  • Resolve SLURM support tickets using SchedMD's bug management tools
  • Maintain AWS resources associated with the HPC clusters (login nodes, S3 buckets, FSx volumes, VPCs, subnets, NAT Gateways, S3 VPC Endpoints, routing tables)

Qualifications:

  • 8+ years of relevant experience
  • Applied programming experience in Python, C, and/or C++
  • Experience with libraries and tools like PyTorch and CUDA
  • Experience in building, productizing, and monitoring orchestration pipelines for AI and machine learning
  • Experience with training frameworks such as Megatron, NVIDIA frameworks, or similar
  • Experience in leading more junior engineers
  • Experience with AWS and/or GCP
  • Experience with or exposure to CI and infrastructure tools (e.g. Kubernetes) is nice to have
  • Experience with Linux-based environments and scripting (Shell Scripting, Python, PowerShell)
  • Ability to work well as an individual contributor as well as within a multidisciplinary team environment
  • Strong communicator with excellent interpersonal skills and a can-do attitude, able to work and thrive in a fast-paced team environment

Equal Employment Opportunity:

We are an equal opportunity employer and do not discriminate on the basis of race, religion, national origin, gender, sexual orientation, age, veteran status, disability or other legally protected statuses.

Stability AI Japan
remote: Tokyo
added Fri Jun 23, 2023

About the role:

We are looking for a versatile Software Engineer who will do whatever it takes to design and implement our projects for Japanese partners, clients, and the community. We will focus on projects related to image/video models, large language models, and chatbots.

You will adapt quickly as we try various approaches to various industries in a fast-changing environment. You will have access to state-of-the-art high-performance computing resources, and you will be able to work alongside top researchers and engineers to truly make an impact in the fast-growing world of generative AI.

Responsibilities:

  • Lead efforts to drive the design, development, and productionization of ML systems, and present the solutions to partners and clients in Japan
  • Work on the commercial side: productionizing generative models and building the infrastructure to serve them at scale; collaborate with ML Engineers to deploy customized models for commercial applications
  • Be a strategic thought partner both internally and externally on driving business impact through machine learning
  • Build pipelines to ingest and process data (e.g. images and text) for feeding into ML models
  • Provide technical advice to partners/clients on the integration of generative models into their products

Qualifications:

  • Good communication skills with fluency in Japanese and business-level English proficiency
  • 5+ years of software development experience with high proficiency in 2 or more languages (Python required, Go is a plus) across a variety of projects
  • Experience with MLOps
  • Experience with data engineering (data pipelines for ML projects)
  • Experience with Linux and command line tools
  • Experience with cloud computing and APIs
  • Experience with web scraping/crawling is a plus
  • Experience with TensorFlow/PyTorch is a plus
  • Experience with Kubernetes and containers is a plus

Equal Employment Opportunity:

We are an equal opportunity employer and do not discriminate on the basis of race, religion, national origin, gender, sexual orientation, age, veteran status, disability or other legally protected statuses.

24 available jobs, organized by type.