Cactus | Y Combinator
Summer 2025 batch; low-latency on-device AI engine for mobile and wearables.
Market data is refreshed once per day from public sources. Information may be incomplete or outdated — verify independently before making decisions. This is not investment advice.
DealFlow OS uses public web data and automated enrichment. Research may be incomplete, outdated, or incorrect. Verify important information before making investment or outreach decisions.
Evidence-bound summary — expand sections for movement, risks, and signals.
Memo snapshot · May 20, 2026, 6:21 PM
Cactus - On-device AI for Smartphones, Laptops & Edge
One inference engine for on-device AI across smartphones, laptops, and edge hardware
Raised $500K across 1 funding round. Latest: $500K Pre-seed (Jun 2025). Investors: Y Combinator. (High).
Seed (YC)
Verified facts
Funding
Raised $500K across 1 funding round. Latest: $500K Pre-seed (Jun 2025). Investors: Y Combinator. (High).
Hiring
No hiring/careers evidence indexed (Low).
GitHub
No GitHub‑linked evidence indexed (Low).
Product / news
7 product/news-style rows indexed; headline risk without filings (High).
Traffic / social
No traffic/social evidence indexed (Low).
No open roles indexed yet.
The score is an algorithmic estimate based on observed public company-level signals. It may be incomplete, stale, or inaccurate and is not investment, legal, tax, or business advice.
DealFlow score momentum
More runs will build history.
Latest momentum signal per category. Expand a card to inspect raw payloads.
Source types found
Strongest / recent news-style rows
Cactus | Y Combinator
Wed, May 20, 06:21 PM · confidence 88% · high quality
https://www.ycombinator.com/companies/cactus
Newest first · 11 event(s)
Summer 2025 batch; low-latency on-device AI engine for mobile and wearables.
One inference engine for on-device AI across hardware targets.
Source: Blog / news
A simplified offline variant of TurboQuant using Hadamard rotation and per-group Lloyd-Max codebooks — 4× compression of per-layer embeddings in Gemma 4 E2B at +0.06 PPL.
Source: Blog / news
Review of NVIDIA's Parakeet-CTC-1.1B model running locally on Mac with Cactus. Architecture breakdown, benchmarks, and transcription use cases.
Source: Blog / news
Benchmarking Liquid's LFM-2.5-350m across seven devices with Cactus. INT8 quantization, single-core CPU decode, zero-copy loading, and why this configuration makes on-device inference practical.
Source: Blog / news
Review of LiquidAI's LFM2-24B-A2B mixture-of-experts model running locally on Mac with Cactus. Architecture breakdown, benchmarks, and coding agent use cases.
Source: Blog / news
How Cactus combines on-device and cloud inference for real-time speech transcription with sub-150ms latency and automatic cloud handoff for noisy audio.
Source: Blog / news
Gemma 4 runs natively on your device with real-time voice, vision, and audio, and routes hard problems to the cloud when it should.
Source: Blog / news
Deep dives into on-device AI, inference optimization, and running models on smartphones, laptops, and edge hardware.
Source: Homepage
One inference engine for on-device AI across smartphones, laptops, and edge hardware. Run LLMs, transcription, and embeddings locally with automatic cloud fallback.
Open-source low-latency mobile AI engine.
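One of the entries above mentions INT8 quantization as part of what makes on-device inference practical. As a rough illustration of the general idea only, here is a minimal sketch of symmetric per-tensor INT8 quantization in plain Python. It does not use or reproduce Cactus's code or API; the function names `quantize_int8` and `dequantize_int8` and the sample weights are hypothetical, chosen for this example.

```python
from typing import List, Tuple


def quantize_int8(values: List[float]) -> Tuple[List[int], float]:
    """Symmetric per-tensor quantization of floats to integers in [-127, 127].

    Hypothetical helper for illustration; not taken from any Cactus source.
    """
    max_abs = max((abs(v) for v in values), default=0.0)
    # Fall back to a scale of 1.0 for an empty or all-zero tensor
    # so that the division below is always defined.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    # Round to the nearest integer step and clamp to the INT8 range used here.
    quantized = [max(-127, min(127, round(v / scale))) for v in values]
    return quantized, scale


def dequantize_int8(quantized: List[int], scale: float) -> List[float]:
    """Map quantized integers back to approximate float values."""
    return [q * scale for q in quantized]


# Illustrative weights (made up for this example).
weights = [0.5, -1.0, 0.25, 0.0]
q, scale = quantize_int8(weights)
restored = dequantize_int8(q, scale)

# The round-trip error per value is at most half a quantization step.
max_error = max(abs(a - b) for a, b in zip(weights, restored))
print(q, scale, max_error)
```

Each value is stored in one byte instead of four (for float32), at the cost of a bounded rounding error. Production engines typically quantize per channel or per group rather than per tensor; which scheme Cactus uses is not stated in the indexed rows.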
2 row(s)
One inference engine for on-device AI across hardware targets.
https://www.cactuscompute.com
Source name: Homepage
One inference engine for on-device AI across smartphones, laptops, and edge hardware. Run LLMs, transcription, and embeddings locally with automatic cloud fallback.
https://cactuscompute.com/
1 row(s)
Summer 2025 batch; low-latency on-device AI engine for mobile and wearables.
https://www.ycombinator.com/companies/cactus
1 row(s)
Open-source low-latency mobile AI engine.
https://github.com/cactus-compute/cactus
7 row(s)
Source name: Blog / news
A simplified offline variant of TurboQuant using Hadamard rotation and per-group Lloyd-Max codebooks — 4× compression of per-layer embeddings in Gemma 4 E2B at +0.06 PPL.
https://cactuscompute.com/blog/turboquant-h
Source name: Blog / news
Review of NVIDIA's Parakeet-CTC-1.1B model running locally on Mac with Cactus. Architecture breakdown, benchmarks, and transcription use cases.
https://cactuscompute.com/blog/parakeet
Source name: Blog / news
Benchmarking Liquid's LFM-2.5-350m across seven devices with Cactus. INT8 quantization, single-core CPU decode, zero-copy loading, and why this configuration makes on-device inference practical.
https://cactuscompute.com/blog/lfm2-5-350m
Source name: Blog / news
Review of LiquidAI's LFM2-24B-A2B mixture-of-experts model running locally on Mac with Cactus. Architecture breakdown, benchmarks, and coding agent use cases.
https://cactuscompute.com/blog/lfm2-24b-a2b
Source name: Blog / news
How Cactus combines on-device and cloud inference for real-time speech transcription with sub-150ms latency and automatic cloud handoff for noisy audio.
https://cactuscompute.com/blog/hybrid-transcription
Source name: Blog / news
Gemma 4 runs natively on your device with real-time voice, vision, and audio, and routes hard problems to the cloud when it should.
https://cactuscompute.com/blog/gemma4
Source name: Blog / news
Deep dives into on-device AI, inference optimization, and running models on smartphones, laptops, and edge hardware.
https://cactuscompute.com/blog