1 task
flash-attention-eval
evaluates whether frontier LLMs can implement high-performance GPU kernels from a mathematical description. no hand-holding, no fill-in-the-blank templates, no pytest harness. just: here's the math, here's a GPU, make it work and make it fast.
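for context, the "mathematical description" here is standard scaled dot-product attention (the computation flash attention implements). a minimal NumPy reference sketch, assuming single-head attention with no masking; this is an illustration, not the eval's actual grader:

```python
import numpy as np

def reference_attention(Q, K, V):
    """softmax(Q K^T / sqrt(d)) V -- the math a submitted kernel must match."""
    d = Q.shape[-1]
    S = Q @ K.T / np.sqrt(d)               # scaled attention scores
    S = S - S.max(axis=-1, keepdims=True)  # subtract row max for numerical stability
    P = np.exp(S)
    P = P / P.sum(axis=-1, keepdims=True)  # row-wise softmax
    return P @ V
```

a fast kernel (flash attention, tiled/online-softmax, etc.) is judged against this semantics, not this implementation.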