About The Role
As a member of our Compiler team, you will work with leaders from industry and academia to develop entirely new solutions for the toughest problems in AI compute. As deep neural network architectures evolve, they are becoming enormously parallel and distributed, and compilers are needed to optimize the mappings of computation graphs to compute nodes. In this position, you will build the tools that generate distributed memory code from evolving intermediate representations.
Responsibilities
- Design graph semantics, intermediate representations, and abstraction layers between high-level representations (such as MLIR) and low-level distributed code (such as LLVM IR)
- Use state-of-the-art parallelization and partitioning techniques to automate generation of distributed kernels
- Perform low-level optimization for a SIMD/tensor-aware architecture of compute nodes
- Identify, design, and implement novel program analysis and optimization techniques
- Design and implement custom system tools (such as linkers) for architectures with a massive number of compute nodes
Requirements
- Enrolled in the University of Toronto's PEY program with a degree in Computer Science, Computer Engineering, or a related discipline
- High proficiency in programming using Python or C++
- Solid understanding of fundamental concepts related to system design, such as data structures, algorithms, and operating systems
- Experience with, or fundamental knowledge of, compilers and distributed systems
- Familiarity with high-level parallel program analysis and optimization
Preferred
- Familiarity with LLVM compiler internals
- Familiarity with polyhedral models
- Familiarity with HPC kernels and their optimization
Cerebras Systems is committed to creating an equal and diverse environment and is proud to be an equal opportunity employer. We celebrate different backgrounds, perspectives, and skills. We believe inclusive teams build better products and companies. We try every day to build a work environment that empowers people to do their best work through continuous learning, growth and support of those around them.
Cerebras Systems has pioneered a groundbreaking chip and system that revolutionizes deep learning applications. Our system empowers ML researchers to achieve unprecedented speeds in training and inference workloads, propelling AI innovation to new horizons.
The Condor Galaxy 1 (CG-1), unveiled in a recent announcement, stands as a testament to Cerebras' commitment to pushing the boundaries of AI computing. With a staggering 4 ExaFLOP processing power, 54 million cores, and 64-node architecture, the CG-1 is the first of nine powerful supercomputers to be built and operated through an exclusive partnership between Cerebras and G42. This strategic collaboration aims to redefine the possibilities of AI by creating a network of interconnected supercomputers that will collectively deliver a mind-boggling 36 ExaFLOPS of AI compute power upon completion in 2024.
Cerebras is building a team of exceptional people to work together on big problems. Join us!