Liquid AI has officially announced the release of LFM2-24B-A2B, a specialized large language model optimized for local, low-latency tool dispatch, alongside LocalCowork, an open-source desktop agent application. This dual release, documented within the Liquid4All GitHub Cookbook, establishes a comprehensive framework for organizations to deploy sophisticated AI-driven workflows entirely on local hardware. By eliminating the necessity for cloud-based API calls and preventing data egress, the system addresses critical concerns regarding data privacy and operational security in sensitive enterprise environments. The move signals a broader industry shift toward edge computing, where the computational heavy lifting of generative AI is moved from centralized data centers to the user’s immediate hardware environment.
The Evolution of Local-First Artificial Intelligence
The development of LFM2-24B-A2B comes at a time when enterprise interest in "local-first" AI is surging. Traditionally, the deployment of high-performance AI agents has been tethered to cloud infrastructure due to the immense computational requirements of large-scale models. However, this dependency introduces risks, including potential data leaks, latency fluctuations, and recurring subscription costs. Liquid AI, an organization known for its foundational research into Liquid Neural Networks and efficient model architectures, has designed the LFM2-24B-A2B to mitigate these issues by optimizing the model for consumer-grade workstations.
The timeline of this release follows a series of breakthroughs in model compression and sparse architecture. Earlier in the year, the industry saw a surge in the adoption of the Model Context Protocol (MCP), a standardized framework that allows AI models to interact with external tools and data sources seamlessly. By integrating MCP into the LocalCowork application, Liquid AI has enabled a plug-and-play ecosystem where developers can attach various functionalities—ranging from file system management to security scanning—without rewriting the core agent logic.
Technical Architecture and the Sparse Mixture-of-Experts Advantage
To achieve the performance metrics required for interactive desktop use, LFM2-24B-A2B utilizes a Sparse Mixture-of-Experts (MoE) architecture. This design is central to the model’s ability to run on consumer hardware while maintaining the intellectual breadth of a much larger system. While the model possesses a total parameter count of 24 billion, it only activates approximately 2 billion parameters per token during the inference phase. This selective activation allows for a significantly reduced computational footprint, ensuring that the model can generate responses rapidly without exhausting the host system’s resources.
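The compute savings behind this selective activation can be sketched with simple arithmetic, using the published total and active parameter counts (the ~2 billion active figure is approximate):

```python
# Back-of-the-envelope sketch: per-token compute savings from sparse activation.
TOTAL_PARAMS = 24e9   # total parameters in LFM2-24B-A2B
ACTIVE_PARAMS = 2e9   # parameters activated per token (approximate)

# A dense 24B model touches every parameter for every token; the sparse
# MoE model touches only the parameters of the routed experts.
active_fraction = ACTIVE_PARAMS / TOTAL_PARAMS
reduction_factor = TOTAL_PARAMS / ACTIVE_PARAMS

print(f"Active fraction per token: {active_fraction:.1%}")        # 8.3%
print(f"Per-token compute reduction vs. dense: {reduction_factor:.0f}x")  # 12x
```

In other words, each token is processed with roughly one-twelfth of the compute a dense 24B model would require, which is what makes interactive latency feasible on a workstation.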
During internal testing, Liquid AI utilized a high-end consumer configuration to establish performance baselines. The primary testing environment consisted of an Apple M3 Max chipset equipped with 128GB of Unified Memory. The model was served using the vLLM framework, an open-source library designed for high-throughput and low-latency LLM serving. By leveraging the unified memory architecture of the M-series chips, the LFM2-24B-A2B can access its entire parameter set with high bandwidth, further reducing the "time-to-first-token" and overall generation latency.
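Since vLLM exposes an OpenAI-compatible HTTP API (by default at `http://localhost:8000/v1`), a client can dispatch tool-calling requests to the locally served model with nothing but the standard library. The sketch below is illustrative: the model identifier and the `search_files` tool schema are placeholders, not names from the actual release.

```python
import json
import urllib.request

# vLLM's OpenAI-compatible server listens on port 8000 by default.
URL = "http://localhost:8000/v1/chat/completions"

payload = {
    "model": "LFM2-24B-A2B",  # placeholder; use the name the server was launched with
    "messages": [
        {"role": "user", "content": "Find all PDF invoices in my documents folder."}
    ],
    "tools": [{  # OpenAI-style tool schema the model can choose to dispatch
        "type": "function",
        "function": {
            "name": "search_files",  # hypothetical tool name for illustration
            "description": "Search the local filesystem by glob pattern.",
            "parameters": {
                "type": "object",
                "properties": {"pattern": {"type": "string"}},
                "required": ["pattern"],
            },
        },
    }],
    "temperature": 0,
}

request = urllib.request.Request(
    URL,
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
# Uncomment once a vLLM server is running locally:
# with urllib.request.urlopen(request) as resp:
#     print(json.load(resp)["choices"][0]["message"])
```

Because the request never leaves localhost, the prompt, the tool schema, and any file paths it references stay on the machine.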
LocalCowork: A Secure Hub for Enterprise Tooling
LocalCowork serves as the practical, user-facing implementation of the LFM2-24B-A2B model. As a completely offline desktop AI agent, it is designed to function as a digital colleague capable of executing complex tasks within the user’s local environment. The application’s core strength lies in its integration with the Model Context Protocol (MCP), which facilitates a secure bridge between the AI model and the host operating system.
The system ships with a library of 75 tools distributed across 14 MCP servers. These tools cover a wide spectrum of enterprise needs, including:
- Filesystem Operations: Searching, reading, writing, and organizing local directories.
- Optical Character Recognition (OCR): Converting images and scanned documents into machine-readable text.
- Security Scanning: Analyzing local files for vulnerabilities or malicious code.
- Data Parsing and Exporting: Extracting structured information from unstructured sources and formatting it for reports.
For the initial public demonstration and the GitHub release, Liquid AI focused on a curated subset of 20 tools across six servers. This subset was selected based on rigorous reliability testing, ensuring that each tool achieved high success rates in both isolated and chained execution scenarios. A critical feature of LocalCowork is its local audit trail; every action taken by the agent is logged locally, providing a transparent record for compliance and security auditing, which is a prerequisite for many regulated industries.
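Because these tools are exposed over MCP, which is layered on JSON-RPC 2.0, an agent's tool invocation ultimately reduces to a `tools/call` message (with `tools/list` used for discovery). The sketch below constructs one such request; the tool name and arguments are illustrative examples, not identifiers from the actual LocalCowork release.

```python
import json

def make_tool_call(request_id: int, tool_name: str, arguments: dict) -> str:
    """Serialize an MCP tools/call request as a JSON-RPC 2.0 message."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    })

# Hypothetical OCR tool invocation, as the agent might emit after locating a scan.
message = make_tool_call(1, "ocr_image", {"path": "scans/invoice_01.png"})
print(message)
```

A local audit trail like LocalCowork's can be built by logging exactly these request/response pairs, since every agent action passes through this one message shape.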
Performance Benchmarks and Reliability Metrics
Liquid AI conducted an extensive evaluation of the LFM2-24B-A2B model to quantify its efficacy in real-world scenarios. The evaluation workload comprised 100 single-step tool-selection prompts and 50 multi-step chains. The multi-step chains were designed to simulate complex workflows, requiring between three and six discrete tool executions—such as locating a specific folder, performing OCR on its contents, parsing the resulting data, deduplicating entries, and finally exporting the result to a new file.
The latency results were particularly notable for an on-device model. The system averaged approximately 385 milliseconds per tool-selection response. This sub-second dispatch time is vital for maintaining a "human-in-the-loop" workflow, where delays of even a few seconds can disrupt the user’s cognitive flow and reduce the perceived utility of the AI assistant.
In terms of accuracy, the model demonstrated robust performance:
- Single-Step Accuracy: The model correctly identified and executed the appropriate tool in 83% of the test cases.
- Multi-Step Chain Accuracy: In complex scenarios requiring sequential logic and data passing between tools, the model maintained a 72% success rate.
While these figures indicate that the model is not yet infallible, they represent a significant advancement for local models of this size. The 72% multi-step accuracy suggests that while human supervision remains necessary for critical tasks, the agent is capable of handling the majority of routine data processing tasks autonomously.
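In absolute terms, the published percentages translate into the raw counts below over the stated workload; the 4-step chain used to illustrate cumulative latency is an assumption within the reported 3–6 step range.

```python
# Translate the reported accuracy percentages into raw counts over the
# published evaluation workload.
single_step_total, single_step_rate = 100, 0.83
multi_step_total, multi_step_rate = 50, 0.72
avg_latency_ms = 385

single_step_correct = round(single_step_total * single_step_rate)  # 83 prompts
multi_step_correct = round(multi_step_total * multi_step_rate)     # 36 chains

# A 4-step chain at ~385 ms per tool-selection response spends roughly
# 1.5 s in model latency, excluding the tools' own execution time.
chain_dispatch_s = 4 * avg_latency_ms / 1000

print(single_step_correct, multi_step_correct, f"{chain_dispatch_s:.2f} s")
```

So even a mid-length chain keeps the model's total decision latency under two seconds, which is what sustains the human-in-the-loop flow described above.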
Industry Implications and Enterprise Data Sovereignty
The release of LFM2-24B-A2B and LocalCowork has significant implications for the broader technology landscape, particularly regarding data sovereignty. In the current AI ecosystem, most powerful models are proprietary and hosted by a handful of large corporations. This centralization creates a "black box" problem where users have little visibility into how their data is processed or stored.
By providing a high-performance model that runs locally, Liquid AI offers an alternative for organizations that are legally or ethically bound to keep their data within their own infrastructure. This is especially relevant for sectors such as healthcare, legal services, and national defense, where the use of cloud-based AI has been limited by strict regulatory frameworks.
Furthermore, the move toward local execution reduces "data egress" costs. For large enterprises, the cumulative cost of sending massive amounts of data to the cloud for processing can be substantial. Local execution eliminates these variable costs, replacing them with a one-time investment in local hardware.
Comparative Analysis: Local vs. Cloud-Based Agents
When compared to cloud-based counterparts like GPT-4o or Claude 3.5 Sonnet, local models like LFM2-24B-A2B occupy a unique niche. While the largest cloud models may still hold a marginal lead in raw reasoning capabilities and broad knowledge, they suffer from inherent network latency and privacy trade-offs. Liquid AI’s MoE approach narrows this gap by providing a model that is "smart enough" for the vast majority of enterprise tool-calling tasks while excelling in speed and privacy.
The use of the Model Context Protocol (MCP) is also a strategic choice. By adopting an open standard, Liquid AI ensures that LocalCowork is not a siloed product. Developers who build tools for other MCP-compatible platforms can easily port them to the LocalCowork environment, fostering a community-driven expansion of the agent’s capabilities.
Future Outlook and Development Roadmap
Liquid AI has indicated that the release of LFM2-24B-A2B is part of an ongoing commitment to open-source and local-first AI development. Future iterations are expected to focus on further reducing the parameter activation count while increasing the success rate for complex, multi-step chains. There is also a concerted effort to expand the library of supported MCP servers, potentially including more specialized tools for software development, financial modeling, and creative workflows.
The organization’s decision to host the project on the Liquid4All GitHub Cookbook encourages transparency and collaborative improvement. As more developers experiment with the LFM2-24B-A2B architecture, it is likely that community-driven optimizations for different hardware backends (such as NVIDIA RTX GPUs or specialized AI accelerators) will emerge, further democratizing access to high-performance local AI.
In conclusion, the launch of LFM2-24B-A2B and LocalCowork represents a pivotal moment for on-device generative AI. By combining a sophisticated MoE architecture with a standardized tool-calling protocol, Liquid AI has provided a viable path for enterprises to harness the power of AI agents without compromising on speed, cost, or data privacy. As hardware continues to evolve and models become more efficient, the boundary between local and cloud-based AI performance will continue to blur, with Liquid AI positioned at the forefront of this technological transition.
