Announcing our Seed Round
$4M seed led by Fifty Years with Liquid2 and YC; AI performance engineers for GPU inference.
Market data is refreshed once per day from public sources. Information may be incomplete or outdated — verify independently before making decisions. This is not investment advice.
DealFlow OS uses public web data and automated enrichment. Research may be incomplete, outdated, or incorrect. Verify important information before making investment or outreach decisions.
Evidence-bound summary — expand sections for movement, risks, and signals.
Memo snapshot · May 20, 2026, 6:12 PM
Wafer
Ship the fastest inference in the world.
Raised $4M across 1 funding round. Latest: $4M Seed (Jan 2026). Investors: Fifty Years, Liquid2, Y Combinator. (High).
Seed (YC)
Verified facts
Funding
Raised $4M across 1 funding round. Latest: $4M Seed (Jan 2026). Investors: Fifty Years, Liquid2, Y Combinator. (High).
Hiring
1 hiring‑related row(s); role‑spam risk if mostly generic boards (Low).
GitHub
1 GitHub‑linked row(s) (Low).
Product / news
21 product/news‑styled row(s); headline risk without filings (High).
Traffic / social
No traffic/social evidence indexed (Low).
No open roles indexed yet.
The score is an algorithmic estimate based on observed public company-level signals. It may be incomplete, stale, or inaccurate and is not investment, legal, tax, or business advice.
DealFlow score momentum
More runs will build history.
Latest momentum signal per category. Expand a card to inspect raw payloads.
Source types found
Strongest / recent news-style rows
Announcing our Seed Round
Wed, May 20, 06:12 PM · confidence 88% · high quality
https://www.wafer.ai/blog/seed-round
Wafer | Y Combinator
Wed, May 20, 06:12 PM · confidence 85% · high quality
https://www.ycombinator.com/companies/wafer
Newest first · 25 event(s)
$4M seed led by Fifty Years with Liquid2 and YC; AI performance engineers for GPU inference.
Ship the fastest inference in the world.
S25; AI that makes AI fast for chip and cloud customers.
Source: GitHub (linked from site)
GitHub presence linked from official site for Wafer.
Source: Careers
Wafer isn't actively filling any positions. If you're exceptional, email emilio@wafer.ai with the most impressive thing you've ever built.
Source: Blog / news
Give your AI coding assistant direct access to GPUs. No manual SSH setup, no Docker, or infrastructure management.
Source: Blog / news
We found a benchmark blind spot when using cpp_extension.load() in our own NVFP4 suite on KernelArena. JIT compilation triggers CPU core migration, quietly inflating kernel launch times and deflating benchmark scores. Here's the investigation.
Source: Blog / news
Wafer is the GPU development stack that lives inside your editor: profiling (NCU), compiler explorer, and enhanced GPU docs.
Source: Blog / news
Give your AI coding assistant direct access to GPU documentation, trace analysis, and remote kernel evaluation with the wafer-ai CLI.
Source: Blog / news
Get accurate 1:1 kernel mappings across hardware providers. Compare large vLLM traces in seconds with clean prefill vs. decode separation.
Source: Blog / news
We used an AI agent to optimize AMD's topk_sigmoid kernel, achieving a 9x speedup over PyTorch. Here's exactly how our agent did it.
Source: Blog / news
Large language models are driving a surge in inference workloads. While the AI community often gravitates toward more well-known GPUs, AMD's MI300X quietly stands out. Equipped with 192 GB of HBM3 and memory bandwidth of 5.3 TB/s, we explore how targeted optimization and quantization can unlock its potential.
Source: Blog / news
Profile AMD GPUs directly in VS Code and Cursor. View hardware metrics, roofline analysis, and kernel stats — all without leaving your editor.
Source: Blog / news
10 patterns we've tracked where LLMs game GPU kernel benchmarks, manipulating timers, returning garbage, caching results, and more, along with the defenses that catch them.
Source: Blog / news
How a fused kernel claiming 104x speedup passed our correctness checks while reading garbage memory, and the determinism check that catches it.
Source: Blog / news
How adding profiling tools to our CLI helped an agent break through a theory-based optimization plateau, achieving 11.65x speedup on the Kimi Delta Attention kernel.
Source: Blog / news
Open Chrome trace JSON files directly in your IDE with full Perfetto functionality — timeline, flamegraphs, SQL, and metrics.
Source: Blog / news
How a non-kernel-expert achieved 8x speedup on latency-critical CUDA clustering code using profile-guided optimization with Wafer.
Source: Blog / news
LLM-generated kernels are all the rage right now. We used frontier AI models to write HIP kernels for KernelBench and ran them on MI300Xs. Which ones performed the best?
Source: Blog / news
Wafer Pass now serves MiniMax-M2.7 live with a 204,800 token context window, built for long-context coding agents and production engineering workflows.
Source: Blog / news
An open platform for benchmarking AI-generated GPU kernels — with initial results from WaferBench NVFP4 on B200 and KernelBench HIP on MI300X.
Source: Blog / news
The GPU documentation tool that thousands of engineers loved in our IDE extension is now available as a standalone web app.
Source: Blog / news
Cloud CUDA compilation with PTX/SASS output, PyTorch headers, and VS Code integration. No local CUDA install required.
Source: Blog / news
As the AI hardware ecosystem rapidly expands, choosing the right accelerator has become increasingly complex. We're excited to introduce Chip Benchmark, an open-source benchmarking suite purpose-built to evaluate the performance of open-weight LLMs across diverse hardware platforms.
Source: Blog / news
Autonomous AI agents that profile, diagnose, and optimize GPU inference across your entire stack — from kernels to models to production pipelines.
1 row(s)
Ship the fastest inference in the world.
https://www.wafer.ai/
1 row(s)
S25; AI that makes AI fast for chip and cloud customers.
https://www.ycombinator.com/companies/wafer
1 row(s)
$4M seed led by Fifty Years with Liquid2 and YC; AI performance engineers for GPU inference.
https://www.wafer.ai/blog/seed-round
1 row(s)
Source name: Careers
Wafer isn't actively filling any positions. If you're exceptional, email emilio@wafer.ai with the most impressive thing you've ever built.
https://www.wafer.ai/careers
1 row(s)
Source name: GitHub (linked from site)
GitHub presence linked from official site for Wafer.
https://github.com/wafer-ai
20 row(s)
Source name: Blog / news
Give your AI coding assistant direct access to GPUs. No manual SSH setup, no Docker, or infrastructure management.
https://www.wafer.ai/blog/workspaces
Source name: Blog / news
We found a benchmark blind spot when using cpp_extension.load() in our own NVFP4 suite on KernelArena. JIT compilation triggers CPU core migration, quietly inflating kernel launch times and deflating benchmark scores. Here's the investigation.
https://www.wafer.ai/blog/where-did-my-microseconds-go
Source name: Blog / news
Wafer is the GPU development stack that lives inside your editor: profiling (NCU), compiler explorer, and enhanced GPU docs.
https://www.wafer.ai/blog/wafer-extension
Source name: Blog / news
Give your AI coding assistant direct access to GPU documentation, trace analysis, and remote kernel evaluation with the wafer-ai CLI.
https://www.wafer.ai/blog/wafer-cli
Source name: Blog / news
Get accurate 1:1 kernel mappings across hardware providers. Compare large vLLM traces in seconds with clean prefill vs. decode separation.
https://www.wafer.ai/blog/trace-compare
Source name: Blog / news
We used an AI agent to optimize AMD's topk_sigmoid kernel, achieving a 9x speedup over PyTorch. Here's exactly how our agent did it.
https://www.wafer.ai/blog/topk-sigmoid-optimization
Source name: Blog / news
Large language models are driving a surge in inference workloads. While the AI community often gravitates toward more well-known GPUs, AMD's MI300X quietly stands out. Equipped with 192 GB of HBM3 and memory bandwidth of 5.3 TB/s, we explore how targeted optimization and quantization can unlock its potential.
https://www.wafer.ai/blog/the-overlooked-gpu
Source name: Blog / news
Profile AMD GPUs directly in VS Code and Cursor. View hardware metrics, roofline analysis, and kernel stats — all without leaving your editor.
https://www.wafer.ai/blog/rocprofiler-compute
Source name: Blog / news
10 patterns we've tracked where LLMs game GPU kernel benchmarks, manipulating timers, returning garbage, caching results, and more, along with the defenses that catch them.
https://www.wafer.ai/blog/reward-hacks-field-guide
Source name: Blog / news
How a fused kernel claiming 104x speedup passed our correctness checks while reading garbage memory, and the determinism check that catches it.
https://www.wafer.ai/blog/reward-hacking
Source name: Blog / news
How adding profiling tools to our CLI helped an agent break through a theory-based optimization plateau, achieving 11.65x speedup on the Kimi Delta Attention kernel.
https://www.wafer.ai/blog/profile-guided-optimization
Source name: Blog / news
Open Chrome trace JSON files directly in your IDE with full Perfetto functionality — timeline, flamegraphs, SQL, and metrics.
https://www.wafer.ai/blog/perfetto-trace-viewer
Source name: Blog / news
How a non-kernel-expert achieved 8x speedup on latency-critical CUDA clustering code using profile-guided optimization with Wafer.
https://www.wafer.ai/blog/nordlys-case-study
Source name: Blog / news
LLM-generated kernels are all the rage right now. We used frontier AI models to write HIP kernels for KernelBench and ran them on MI300Xs. Which ones performed the best?
https://www.wafer.ai/blog/most-hip
Source name: Blog / news
Wafer Pass now serves MiniMax-M2.7 live with a 204,800 token context window, built for long-context coding agents and production engineering workflows.
https://www.wafer.ai/blog/minimax-m27-pass
Source name: Blog / news
An open platform for benchmarking AI-generated GPU kernels — with initial results from WaferBench NVFP4 on B200 and KernelBench HIP on MI300X.
https://www.wafer.ai/blog/introducing-kernelarena
Source name: Blog / news
The GPU documentation tool that thousands of engineers loved in our IDE extension is now available as a standalone web app.
https://www.wafer.ai/blog/gpu-docs-web
Source name: Blog / news
Cloud CUDA compilation with PTX/SASS output, PyTorch headers, and VS Code integration. No local CUDA install required.
https://www.wafer.ai/blog/cuda-compiler
Source name: Blog / news
As the AI hardware ecosystem rapidly expands, choosing the right accelerator has become increasingly complex. We're excited to introduce Chip Benchmark, an open-source benchmarking suite purpose-built to evaluate the performance of open-weight LLMs across diverse hardware platforms.
https://www.wafer.ai/blog/chip-benchmark
Source name: Blog / news
Autonomous AI agents that profile, diagnose, and optimize GPU inference across your entire stack — from kernels to models to production pipelines.
https://www.wafer.ai/blog