MLSys 2026 Competition - NVIDIA Track

FlashInfer AI Kernel Generation Contest

Create high-performance GPU kernels for state-of-the-art LLM architectures on NVIDIA Blackwell GPUs, written by humans, AI agents, or both

Contest Overview

🎯

The Challenge

Create optimized CUDA kernels for cutting-edge LLM operations, either by hand or with AI agents. Receive kernel specifications and produce high-performance code for NVIDIA Blackwell B200 GPUs.

📊

Benchmark

Compete across workloads derived from production models. Kernels are evaluated on correctness, speed, and win rate against FlashInfer baselines.
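The three criteria above can be illustrated with a small sketch. The official scoring formula is not given here, so everything below is an assumption for illustration: the `summarize` helper, the record fields (`correct`, `candidate_ms`, `baseline_ms`), and the choice to count an incorrect kernel as a non-win and to average speedups geometrically.

```python
import math

def summarize(results):
    """Summarize hypothetical per-workload benchmark records.

    Each record is a dict with (assumed) keys:
      correct      -- bool, kernel output matched the reference
      candidate_ms -- latency of the submitted kernel
      baseline_ms  -- latency of the baseline kernel
    """
    if not results:
        raise ValueError("need at least one workload result")

    total = len(results)
    correct = [r for r in results if r["correct"]]

    # A "win" here means: correct AND strictly faster than the baseline.
    wins = sum(1 for r in correct if r["candidate_ms"] < r["baseline_ms"])

    # Speedup > 1.0 means the candidate is faster than the baseline.
    speedups = [r["baseline_ms"] / r["candidate_ms"] for r in correct]
    if speedups:
        geomean = math.exp(sum(math.log(s) for s in speedups) / len(speedups))
    else:
        geomean = 0.0

    return {
        "correct_rate": len(correct) / total,
        "win_rate": wins / total,
        "geomean_speedup": geomean,
    }

if __name__ == "__main__":
    demo = [
        {"correct": True, "candidate_ms": 1.0, "baseline_ms": 2.0},
        {"correct": True, "candidate_ms": 4.0, "baseline_ms": 2.0},
        {"correct": False, "candidate_ms": 1.0, "baseline_ms": 5.0},
    ]
    print(summarize(demo))
```

A geometric mean is used so that one very large speedup on a single workload does not dominate the summary; the actual leaderboard may aggregate differently.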

Platform

Submit and evaluate your kernels on FlashInfer-Bench (bench.flashinfer.ai).

🤖

Two Approaches

We welcome both expert-crafted seed kernels refined through agent-assisted evolution and fully agent-generated solutions. The two approaches are evaluated separately. Fully agent-generated submissions must open-source the scripts needed to reproduce their kernels. No API credits are provided.

Competition Tracks

Three kernel categories targeting the most important operations in modern LLMs

Track A

Fused MoE

Fused Mixture-of-Experts kernels with FP8 support.

Track B

Sparse Attention

DeepSeek Sparse Attention (DSA) from DeepSeek-V3.2.

Track C

Gated Delta Net

Gated Delta Net (GDN), as used in Qwen3-Next.
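For orientation, here is a minimal pure-Python sketch of one step of the gated delta rule that Gated Delta Net is built on, in a commonly cited form: S_t = alpha_t * S_{t-1} (I - beta_t k_t k_t^T) + beta_t v_t k_t^T, with output o_t = S_t q_t. This is an assumption-laden reference, not the contest specification: the function name `gated_delta_step`, the single-head list-of-lists layout, and the absence of key normalization or query scaling are all illustrative choices, and the official kernel definition may differ.

```python
def gated_delta_step(S, q, k, v, alpha, beta):
    """One recurrent step of a gated delta rule (illustrative, single head).

    S     -- state matrix, d_v rows by d_k columns (list of lists)
    q, k  -- query and key vectors of length d_k
    v     -- value vector of length d_v
    alpha -- scalar decay gate in [0, 1]
    beta  -- scalar write strength in [0, 1]
    Returns (S_new, o) where o = S_new @ q.
    """
    d_v, d_k = len(S), len(k)
    S_new = []
    for i in range(d_v):
        # What the memory currently returns for key k (row i of S @ k).
        sk = sum(S[i][j] * k[j] for j in range(d_k))
        row = [
            # Decay the old state after erasing its content along k,
            # then write the new value along k.
            alpha * (S[i][j] - beta * sk * k[j]) + beta * v[i] * k[j]
            for j in range(d_k)
        ]
        S_new.append(row)
    o = [sum(S_new[i][j] * q[j] for j in range(d_k)) for i in range(d_v)]
    return S_new, o

if __name__ == "__main__":
    S = [[0.0, 0.0], [0.0, 0.0]]
    S, o = gated_delta_step(S, [1.0, 0.0], [1.0, 0.0], [2.0, 3.0], 1.0, 1.0)
    S, o = gated_delta_step(S, [1.0, 0.0], [1.0, 0.0], [5.0, 7.0], 1.0, 1.0)
    print(o)  # the second write to the same unit key replaces the first value
```

With a unit-norm key and beta = 1, writing twice to the same key overwrites the stored value rather than accumulating it, which is the behavior that distinguishes the delta rule from plain linear attention. A competitive kernel would process many tokens per step in chunks rather than looping token by token as this sketch does.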

Getting Started

Everything you need to start competing

📦

Starter Kit

Development environment setup and test/benchmark scripts.

View Starter Kit

📋

Submission Format

Use any language (CuTe DSL, CUDA, TileLang, Triton, cuTile, etc.). Host your code in a GitHub repo that follows the starter kit format, then share the repo URL with the organizers. Private repos are welcome; just grant the organizers access.

🎯

Evaluation

Evaluations run biweekly, followed by a final evaluation. Tag your commits on GitHub to participate. Note: scores measured on Modal are for reference only, because the GPU clock frequency cannot be locked there. Official evaluations run on bare-metal machines.

🤖

Agent Baseline

OpenEvolve-based agent baseline for AI-assisted kernel generation.

View Agent Baseline

Timeline

Jan 22, 2026

Public Launch

  • Registration opens
  • Starter kit released

Feb 9, 2026

Baselines Released

  • OpenEvolve-based baselines available

Feb 15, 2026

Registration Deadline

  • Last day to register your team

Apr 24, 2026

Kernel Submission Deadline

  • 11:59 PM AoE

May 1, 2026

Writeup Deadline

  • Technical report due (max 4 pages; see the requirements)
  • 11:59 PM AoE

May 12, 2026

Winners Notified

  • Results announced via email

May 22, 2026 • 11:00 AM – 12:50 PM PDT

MLSys 2026 Award Ceremony

  • Bellevue, WA
  • Winners present their solutions

Prizes & Resources

🏆

GPU Prizes

1st: NVIDIA DGX Spark
2nd: NVIDIA GeForce RTX 5090
3rd: NVIDIA GeForce RTX 5080

🎫

Free Registration

Winners receive complimentary MLSys 2026 conference registration.

💻

GPU Access

Registered teams receive Modal compute credits for NVIDIA B200 GPU development.

Winners

The top three teams in each track, for each of the two approaches.

Track A Fused MoE

Agent-Assisted

1st
Team Wombat

Members: George Karpenkov, Yury Kirpichev, Mikhail Usvyatsev

2nd
KernelEvolve

Members: Qasim Khan, Jai Menon

3rd
LLM-CUDA

Members: Yue Shui, Chenyu Ma, Hangfei Xu, Shengzhao Wen, Yanpeng Wang

Full-Agent

1st
HAN Lab Kernel Mafia

Members: Dongyun Zou, Zhekai Zhang, Shang Yang, Hao Kang

2nd
GEMM People

Members: Jierui Xu, Yanchuan Tang, Kai Luo

3rd
Insider

Members: Mayank Suthar

Track B Sparse Attention (DSA)

Agent-Assisted

1st
Dogacel

Members: Doğaç Eldenk

2nd
Cong

Members: The Cong Luong

3rd
Team Wombat

Members: George Karpenkov, Yury Kirpichev, Mikhail Usvyatsev

Full-Agent

1st
Dogacel

Members: Doğaç Eldenk

2nd
HAN Lab Kernel Mafia

Members: Dongyun Zou, Zhekai Zhang, Shang Yang, Hao Kang

3rd
UW SyFI

Members: Keisuke Kamahori, Steven Gao, Shihang Li, Wei Shen, Yile Gu

Track C Gated Delta Net (GDN)

Agent-Assisted

1st
Kachua

Members: Romit Jain, Pushkar Patel

2nd
UW SyFI

Members: Keisuke Kamahori, Steven Gao, Shihang Li, Wei Shen, Yile Gu

3rd
LLM-CUDA

Members: Yue Shui, Chenyu Ma, Hangfei Xu, Shengzhao Wen, Yanpeng Wang

Full-Agent

1st
UW SyFI

Members: Keisuke Kamahori, Steven Gao, Shihang Li, Wei Shen, Yile Gu

2nd
LLM-CUDA

Members: Yue Shui, Chenyu Ma, Hangfei Xu, Shengzhao Wen, Yanpeng Wang

3rd
HAN Lab Kernel Mafia

Members: Dongyun Zou, Zhekai Zhang, Shang Yang, Hao Kang

Ready to Compete?

Join teams from around the world in pushing the boundaries of AI kernel generation.

Teams of up to 5 members | Registration deadline: February 15, 2026
