Director of Software, Kernel
Software
on site: Sunnyvale
added Tue Sep 19, 2023

The Team

The Kernel team is responsible for the design, implementation, and performance tuning of deep learning operations on highly parallel custom processors. We are developing parallel and distributed algorithms to maximize hardware utilization and accelerate the training of deep neural networks to unprecedented speeds.

This involves:

    • Creating high-performance linear-algebra and machine-learning kernels for custom processors.
    • Designing and implementing parallel algorithms on a distributed hardware architecture.
    • Tuning and optimizing low-level assembly code within the tight constraints of highly optimized, high-performance hardware.
    • Understanding the tradeoffs among performance, compute, and memory, and optimizing for all three simultaneously.

The Role

As the Director of the Cerebras Kernel team, you will build, lead, and manage a team of highly talented and motivated software engineers in a fast-paced environment to solve the toughest problems in the rapidly evolving AI space.

Responsibilities

  • Leading the team to develop a flexible and robust library of optimized kernels for primitive operations used by state-of-the-art neural network architectures
  • Providing technical vision and guidance to team members in designing, analyzing, and optimizing algorithmic solutions
  • Working with engineering leadership and product management teams to develop the product roadmap
  • Identifying hiring needs and filling them with top talent from industry and academia
  • Mentoring and coaching team members considering both short-term execution and long-term career growth needs
  • Identifying risks in the product development schedule and taking active measures to mitigate them
  • Actively participating in defining the next-generation system architecture with hardware and systems teams, and providing a software perspective for feature prioritization
  • Defining and enforcing best practices in the software development process, including coding style standards and peer reviews
  • Identifying opportunities for deployment of tools and processes to improve engineering execution efficiency
  • Driving sprint planning meetings

Skills & Qualifications

  • Bachelor’s / Master’s degree or foreign equivalent in Computer Science, Engineering, or related field.
  • 7+ years of related work experience, for example in kernel design, implementation, and optimization, or in high-performance parallel programming
  • 5+ years of experience in building and managing engineering teams
  • Familiarity with parallel algorithms and distributed memory systems
  • Experience with assembly-level programming and optimization and strong knowledge of computer architecture fundamentals
  • Programming fluency and extensive experience in C or C++ and assembly language
  • Project and program management experience
  • Familiarity with Agile development methodology
  • Outstanding verbal and written communication skills
  • Deep learning algorithms experience is a plus
  • Experience working in a fast-paced, startup-like environment strongly preferred

Cerebras Systems is committed to creating an equal and diverse environment and is proud to be an equal opportunity employer. We celebrate different backgrounds, perspectives, and skills. We believe inclusive teams build better products and companies. We try every day to build a work environment that empowers people to do their best work through continuous learning, growth and support of those around them.



Cerebras Systems has pioneered a groundbreaking chip and system that revolutionizes deep learning applications. Our system empowers ML researchers to achieve unprecedented speeds in training and inference workloads, propelling AI innovation to new horizons.

The Condor Galaxy 1 (CG-1), unveiled in a recent announcement, stands as a testament to Cerebras' commitment to pushing the boundaries of AI computing. With 4 exaFLOPS of processing power, 54 million cores, and a 64-node architecture, the CG-1 is the first of nine supercomputers to be built and operated through an exclusive partnership between Cerebras and G42. This strategic collaboration aims to create a network of interconnected supercomputers that will collectively deliver 36 exaFLOPS of AI compute upon completion in 2024.

Cerebras is building a team of exceptional people to work together on big problems. Join us!