Liquid AI's smallest model yet LFM2.5-230M beats models 4X its size at data extraction, can run 'anywhere' | VentureBeat

Overview

Liquid AI, founded by former MIT computer scientists, today released its smallest AI language model yet, LFM2.5-230M, and enterprises would do well to consider it for data extraction and for local deployment on smartphones, laptops and robots.

Details

This is a 230-million-parameter foundation model explicitly designed for on-device agentic workflows, and as Liquid states in its release blog post, that small size makes it possible to run nearly "anywhere." According to Liquid, it also outperforms models more than 4X its size on selected benchmarks, specifically doing better at data extraction than the 800-million-parameter Alibaba Qwen3.5-0.8B (Instruct) and the 1-billion-parameter Google Gemma 3 1B.

Liquid AI LFM2.5-230M benchmark comparison chart. Credit: Liquid AI

The model targets developers and engineers building lightweight data extraction pipelines and autonomous edge systems.

Operating under a dual-use commercial license, the model remains free for individuals and companies generating less than $10 million in annual revenue, while requiring a paid enterprise agreement for larger corporations.

This release distinguishes itself from other small AI models by utilizing the LFM2 architecture to achieve high inference speeds without the massive memory overhead typical of parameter-heavy transformers.

While major AI companies such as Anthropic, OpenAI, Google, Microsoft and Meta push parameter counts into the hundreds of billions or trillions to achieve frontier performance, a parallel race focuses entirely on the edge and local deployments.

Liquid AI's launch of LFM2.5-230M signals a pivotal shift toward architectural efficiency over brute-force scaling. By squeezing 19 trillion tokens of pre-training into a 230-million-parameter footprint, the company demonstrates that edge devices do not need massive computational power or persistent cloud connections to execute complex, multi-step agentic workflows.

The LFM2.5-230M model diverges from standard transformer architectures, relying instead on the LFM2 framework. This architecture functions as a hybrid system, interleaving gated short-range convolutions with grouped-query attention to process information efficiently.
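To make the convolution half of that hybrid concrete, here is a minimal pure-Python sketch of a gated short-range causal convolution. It is an illustration under simplifying assumptions, not Liquid AI's implementation: the function names, the single scalar channel, the kernel values, and the gates passed in as plain lists are all hypothetical. In a trained model the gates and kernel would be learned, and the grouped-query attention layers are omitted here.

```python
from typing import List


def causal_short_conv(x: List[float], kernel: List[float]) -> List[float]:
    """Causal 1-D convolution: output[t] uses only x[t], x[t-1], ..."""
    out = []
    for t in range(len(x)):
        acc = 0.0
        for k, weight in enumerate(kernel):
            if t - k >= 0:  # never look ahead or before the sequence start
                acc += weight * x[t - k]
        out.append(acc)
    return out


def gated_short_conv(
    x: List[float],
    in_gate: List[float],
    out_gate: List[float],
    kernel: List[float],
) -> List[float]:
    """Gate the input, mix nearby positions, then gate the output."""
    gated = [g * v for g, v in zip(in_gate, x)]
    mixed = causal_short_conv(gated, kernel)
    return [g * v for g, v in zip(out_gate, mixed)]


if __name__ == "__main__":
    ones = [1.0, 1.0, 1.0, 1.0]
    # Hypothetical two-tap averaging kernel; real kernels are learned.
    print(gated_short_conv([1.0, 2.0, 3.0, 4.0], ones, ones, [0.5, 0.5]))
```

Each output position looks back only over the kernel length, so the per-token cost of this operation stays constant as the sequence grows, whereas an attention layer looks back over the whole context.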

For those tracking the evolution of efficient architectures, Liquid's approach pursues a familiar goal: managing long contexts and sequential data effectively on edge hardware without the quadratic memory costs of pure attention mechanisms. The model supports a 32K context window, allowing it to ingest substantial documents or continuous streams of robotic telemetry.

When analyzing the performance charts provided in the release, the architectural efficiency becomes visually apparent. The model maintains a memory footprint of under 400MB while achieving prefill and decode speeds that outpace comparable models like Gemma 3 1B IT and Granite 4.0-H-350M.

On a Samsung Galaxy S25 Ultra equipped with a Qualcomm Snapdragon Gen 4 CPU, the model reaches a decode speed of 213 tokens per second. Even on a highly constrained Raspberry Pi 5, the model maintains a decode rate of 42 tokens per second. Furthermore, internal benchmarking shows the GPU inference stack delivers lower end-to-end latency than competing small models across all concurrency levels.
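The memory and speed figures above lend themselves to a quick back-of-envelope check. The sketch below estimates weight storage at a few precisions and the time to decode a response at the quoted speeds. The helper names, the candidate precisions, and the 500-token response length are assumptions for illustration. The source does not say which precision the sub-400MB figure corresponds to, and the estimate ignores KV cache, activations, runtime overhead, and prefill time.

```python
PARAMS = 230_000_000  # parameter count quoted in the article


def weight_memory_mb(num_params: int, bits_per_param: int) -> float:
    """Approximate weight storage in megabytes (1 MB = 10**6 bytes)."""
    return num_params * bits_per_param / 8 / 1_000_000


def decode_seconds(num_tokens: int, tokens_per_second: float) -> float:
    """Time to generate num_tokens at a steady decode rate."""
    return num_tokens / tokens_per_second


if __name__ == "__main__":
    for bits in (16, 8, 4):  # hypothetical precisions, not from the source
        print(f"{bits}-bit weights: ~{weight_memory_mb(PARAMS, bits):.0f} MB")
    # Decode rates as quoted in the article; 500 tokens is an assumed length.
    for device, tps in (("Galaxy S25 Ultra", 213), ("Raspberry Pi 5", 42)):
        print(f"{device}: ~{decode_seconds(500, tps):.1f} s for 500 tokens")
```

Under these assumptions, 16-bit weights alone would come to about 460 MB, so a footprint under 400MB suggests a lower-precision format, though the source does not confirm this. A 500-token response would take roughly 2.3 seconds on the phone and about 12 seconds on the Raspberry Pi, before counting prefill.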

To understand why a 230-million-parameter model is necessary, one must look at how enterprises currently manage data.

Organizations have traditionally relied on rigid, rule-based Extract, Transform, Load (ETL) scripts to move and process data. However, these legacy systems are notoriously brittle; a simple change in a document's layout or a schema update can break the entire pipeline.

To solve this, the industry is shifting toward "AI ETL," where machine learning infers mappings, detects schema drift, and adapts to changes automatically. In a modern lightweight data extraction pipeline, an AI model connects to unstructured sources—like PDFs, emails, or web forms—and structures the data into formats like JSON without requiring hardcoded rules.
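A minimal sketch of that extraction step might look like the following. The model is passed in as a plain callable so the example runs without any inference library; `stub_model`, the invoice schema, and its field names are hypothetical stand-ins, not part of Liquid AI's release. The important part is the validation: whatever the model returns is parsed and checked against the expected fields before it enters the pipeline.

```python
import json
from typing import Any, Callable, Dict

# Hypothetical target schema: field name -> expected Python type.
INVOICE_SCHEMA: Dict[str, type] = {
    "invoice_number": str,
    "vendor": str,
    "total": float,
}


def parse_extraction(raw: str, schema: Dict[str, type]) -> Dict[str, Any]:
    """Parse model output as JSON and check it against the schema."""
    data = json.loads(raw)  # raises ValueError on malformed JSON
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    record: Dict[str, Any] = {}
    for field, expected in schema.items():
        if field not in data:
            raise ValueError(f"missing field: {field}")
        value = data[field]
        # Accept JSON integers where a float is expected (but not booleans).
        if expected is float and isinstance(value, int) and not isinstance(value, bool):
            value = float(value)
        if not isinstance(value, expected):
            raise ValueError(f"field {field!r} is not {expected.__name__}")
        record[field] = value
    return record


def extract(document: str, model: Callable[[str], str],
            schema: Dict[str, type]) -> Dict[str, Any]:
    """Build a prompt, call the model, and validate the structured result."""
    prompt = "Extract the fields " + ", ".join(schema) + " as JSON.\n\n" + document
    return parse_extraction(model(prompt), schema)


def stub_model(prompt: str) -> str:
    """Stand-in for a real on-device model call (assumption for this sketch)."""
    return '{"invoice_number": "INV-001", "vendor": "Acme", "total": 120}'


if __name__ == "__main__":
    print(extract("Invoice INV-001 from Acme, total due 120", stub_model, INVOICE_SCHEMA))
```

Rejecting malformed or incomplete output at this boundary is what lets a small model sit inside a pipeline without hardcoded layout rules: the schema, not the document format, is the contract.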

For enterprises, using a massive flagship model like Claude Opus 4.6 (which costs $5.00 per million input tokens) to parse routine invoices, format addresses, or route telemetry data is economically unviable.

This is where models like LFM2.5-230M become critical. Designed explicitly as a lightweight extraction engine, it allows companies to automate repetitive formatting and data parsing at a fraction of the compute cost and latency, running directly on local hardware rather than relying on expensive, continuous cloud API calls.
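To put the per-token price quoted above in perspective, the following sketch computes the input-token bill for a batch of documents. Only the $5.00 per million input tokens comes from the article; the function name, the document count, and the average tokens per document are hypothetical, and output-token charges are ignored.

```python
def input_cost_usd(total_input_tokens: int, usd_per_million: float) -> float:
    """Cost of input tokens at a flat per-million-token price."""
    return total_input_tokens / 1_000_000 * usd_per_million


if __name__ == "__main__":
    docs = 1_000_000        # hypothetical monthly invoice volume
    tokens_per_doc = 1_500  # hypothetical average document length
    cost = input_cost_usd(docs * tokens_per_doc, 5.00)
    print(f"${cost:,.2f}")  # prints $7,500.00
```

With these assumed volumes the cloud bill is $7,500 a month for input tokens alone, which is the kind of recurring cost a locally run model avoids.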

The AI industry in mid-2026 is seeing a renaissance in "small" models, but the definition of "small" varies wildly.

Recently, the open-weight community was stunned by Weibo's VibeThinker-3B, a 3-billion-parameter model built on a Qwen2-style backbone that achieved a massive 94.3 on the AIME 2026 math benchmark, rivaling 600-billion-parameter behemoths through aggressive data curation and reinforcement learning.

Similarly, Google's Gemma 4 family, which recently crossed 200 million downloads, pushes frontier AI to the edge, including the E2B (2 billion parameters) designed specifically for mobile and IoT deployments.

By contrast, Liquid AI's LFM2.5-230M operates in a completely different weight class. At just 230 million parameters, it is roughly one-tenth the size of Google's smallest Gemma 4 model and VibeThinker-3B.

Because of its microscopic footprint, LFM2.5-230M is not designed to compete on reasoning-heavy workloads like advanced math, coding, or creative writing—a constraint Liquid AI explicitly acknowledges.

However, in its intended domains of data extraction and tool calling, the model punches well above its weight class.

Benchmarks released by Liquid AI show LFM2.5-230M scoring 43.26 on the BFCLv3 tool-use benchmark, dominating IBM's Granite 4.0-350M (39.58) and completely outpacing larger 1-billion-parameter models like Google's Gemma 3 1B IT (16.61).

Liquid AI LFM2.5-230M benchmark comparison bar chart. Credit: Liquid AI

On CaseReportBench, a data extraction benchmark, it scores 22.51, beating the larger Qwen3.5-0.8B (Instruct).

LFM2.5-230M proves that while 3-billion-parameter models like VibeThinker are solving advanced calculus, a 230-million-parameter model is the superior, highly optimized choice for executing structured tool calls and keeping agentic pipelines running efficiently on constrained hardware.

Because it excels at tool calling, LFM2.5-230M functions primarily as a skill-selection layer. Liquid AI demonstrated this capability by deploying the model on a Unitree G1 humanoid robot.

Running entirely on-device via the robot's onboard NVIDIA Jetson Orin compute module, the model successfully processes complex environmental commands.

As noted in the company's technical blog, the model takes a free-form instruction like, "Hold still for 2 seconds, then walk forward at 1 meter per second for 3 meters, hold a forward one-leg kneel for 5 seconds, and walk backward at 0.5 meters per second for 3 meters," and automatically translates it into a structured multi-step plan calling on pre-trained low-level skills provided by NVIDIA's SONIC framework.
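One way to picture the structured plan is as a list of tool calls checked against a registry of known skills. The sketch below encodes the instruction quoted above in that form. The skill names, parameter names, and plan schema are hypothetical; the source does not describe the actual SONIC skill interface or the format the model emits.

```python
from typing import Any, Dict, List, Set

# Hypothetical skill registry: skill name -> required parameter names.
SKILLS: Dict[str, Set[str]] = {
    "hold_still": {"duration_s"},
    "walk": {"direction", "speed_mps", "distance_m"},
    "kneel": {"style", "duration_s"},
}

# The quoted instruction, written out as an ordered list of tool calls.
PLAN: List[Dict[str, Any]] = [
    {"skill": "hold_still", "args": {"duration_s": 2.0}},
    {"skill": "walk", "args": {"direction": "forward", "speed_mps": 1.0, "distance_m": 3.0}},
    {"skill": "kneel", "args": {"style": "forward_one_leg", "duration_s": 5.0}},
    {"skill": "walk", "args": {"direction": "backward", "speed_mps": 0.5, "distance_m": 3.0}},
]


def validate_plan(plan: List[Dict[str, Any]]) -> None:
    """Raise ValueError if a step names an unknown skill or wrong parameters."""
    for i, step in enumerate(plan):
        skill = step.get("skill")
        if skill not in SKILLS:
            raise ValueError(f"step {i}: unknown skill {skill!r}")
        args = step.get("args", {})
        if set(args) != SKILLS[skill]:
            raise ValueError(f"step {i}: {skill} expects {sorted(SKILLS[skill])}")


def estimated_duration_s(plan: List[Dict[str, Any]]) -> float:
    """Sum step durations; walking time is distance divided by speed."""
    total = 0.0
    for step in plan:
        args = step["args"]
        if step["skill"] == "walk":
            total += args["distance_m"] / args["speed_mps"]
        else:
            total += args["duration_s"]
    return total


if __name__ == "__main__":
    validate_plan(PLAN)
    print(estimated_duration_s(PLAN))  # prints 16.0
```

Validating each call against a fixed registry before execution keeps the language model in the role of skill selector: it chooses and parameterizes skills, while the low-level controllers carry them out.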

The base and post-trained models are available immediately on Hugging Face, with native day-one support across the inference ecosystem for llama.cpp (GGUF), MLX, vLLM, SGLang, and ONNX.

Liquid AI ships LFM2.5-230M under the LFM Open License v1.0. Despite the word "open" in the title, this is not an Open Source Initiative (OSI) compliant license; it operates as a restricted, dual-use commercial framework.

For independent developers, researchers, and early-stage startups, the license functions identically to open-source software.

Users receive a perpetual, worldwide, royalty-free license to reproduce, modify, and distribute the model, provided they retain original copyright notices and prominently state any modifications.

However, the license includes a strict "Commercial Use Limitation". Any legal entity generating $10 million or more in annual revenue loses the right to use the model commercially under this agreement.

Large enterprises crossing this financial threshold must negotiate a separate, paid commercial agreement with Liquid AI to deploy the model in production.

This strategy protects the company from having its intellectual property absorbed by major technology conglomerates for free, while still seeding the model at the grassroots developer level.
