@ Catchmaker

AI Compiler and Performance Engineer

Research

remote

added Fri Jun 23, 2023

About the role:

We are looking for engineers and researchers in machine learning who are passionate about generative models and creative applications of AI. In particular, we are looking for people who share our mission of open-source research: people who believe AI models should not be controlled by a centralized gatekeeper behind a closed wall, but should be truly open and controlled by all. We want highly creative researchers who are motivated to push the boundaries of generative models research, not just in state-of-the-art performance, but also in advancing the efficient frontier between performance and resource usage. You will have access to state-of-the-art high-performance computing resources and will work alongside top researchers and engineers to make a real impact in the fast-growing world of generative AI.

As an AI Compiler/Performance Engineer, you will design and implement significant parts of the Stability.ai Compiler and Runtime, targeting efficient training and deployment of our models. You will work on performance analysis, design and implement new optimization passes, and develop methods for targeting new backends for custom devices. You will be at the forefront of moving machine learning frameworks from handwritten kernels to efficient, compiler-generated kernels. You will be responsible for developing the research/engineering agenda, supervising its execution, and guiding the group of engineers carrying it out. You will work closely with key stakeholders within Stability.ai as well as external entities (HW/SW providers) to steer engineering efforts towards more efficient model execution.

Responsibilities

Analyze and design effective compiler optimizations
Implement and/or enhance code generation targeting machine learning accelerators
Develop hardware-aware optimizations for emerging ML algorithms across a spectrum of HW platforms (GPUs, TPUs, CPUs, custom ASICs, edge devices)
Contribute to the development of machine-learning libraries and intermediate representations
Employ scientific methods to evaluate performance and to debug, diagnose and drive resolution of cross-disciplinary system issues
Work with algorithm research teams to map graphs to hardware implementations, model data flows, create cost-benefit analyses, and estimate cluster or silicon power and performance
Work with the research team to execute the research agenda
Work with the open-source community on model releases and tooling
Work with engineering and business teams on model deployment and customized training
Develop testing plans
Analyze trade-offs and risk mitigation strategies, and communicate them to internal and external stakeholders
Oversee a team of engineers, providing technical direction and engineering leadership

Qualifications

2+ years of experience with an MS or PhD (preferred) in Computer Science, Electrical Engineering, or an equivalent field
Experience with deep learning algorithms, frameworks, and their intermediate representations, e.g., PyTorch/Glow, JAX, TensorFlow XLA, LLVM/MLIR, Apache TVM
Good understanding of benchmarking and profiling, performance analysis, and building performance models for a given task or device
Familiarity with concepts such as roofline modeling, FLOP and memory utilization, power consumption, and latency
Good understanding of language design, compiler optimizers, and backend code generators
Ability to communicate research/engineering ideas effectively through writing and visualization

Equal Employment Opportunity:

We are an equal opportunity employer and do not discriminate on the basis of race, religion, national origin, gender, sexual orientation, age, veteran status, disability or other legally protected statuses.

Stability AI

Stability AI is a community- and mission-driven, open-source artificial intelligence company that cares deeply about real-world implications and applications. Our most significant advances grow from our diversity in working across multiple teams and disciplines. We are unafraid to go against established norms and explore creativity. We are motivated to generate breakthrough ideas and convert them into tangible solutions. Our vibrant communities consist of experts, leaders, and partners across the globe who are developing cutting-edge open AI models for Image, Language, Audio, Video, 3D, and Biology.

stability.ai