Artificial Intelligence

Qwable: How a Free Local AI Model Borrowed Fable’s Reasoning—and What Comes Next

By Mag-Info Tech editorial · 2026-06-24

A new AI model called Qwable shows how quickly open-source developers can remix closed models into local alternatives. Built by repurposing Alibaba’s Qwen3.6-27B with Fable 5’s reasoning style, it runs entirely on consumer hardware and costs nothing per query. Within days of its release, derivatives appeared—one version stripped of refusal behaviors—highlighting both the flexibility and risks of community-driven fine-tuning.

From Qwen to Qwable: A reasoning transplant on consumer GPUs

Qwable began as a full fine-tune of Alibaba’s Qwen3.6-27B, a 27-billion parameter base model known for strong multilingual and coding performance. Developer Mia (published under Mia-AiLab on Hugging Face) trained the model on a dataset of Fable 5-style reasoning traces—step-by-step outputs that reflect Anthropic’s latest flagship model’s deliberate thinking process. The result is a model that, when prompted, produces answers structured like Fable 5: breaking down problems, listing intermediate steps, and arriving at conclusions in a controlled, transparent way. Unlike many open models that default to concise replies, Qwable mimics Fable’s “thinking aloud” style, making its reasoning visible to users.

What makes this significant is that it runs locally. The base Qwen3.6-27B is already optimized for inference on mid-range GPUs, and the fine-tuning preserved that efficiency. Users with a graphics card with at least 12–16 GB of VRAM can run Qwable without cloud APIs, incurring no per-query costs and avoiding vendor lock-in. This democratizes access to a reasoning style previously available only through paid services. For developers, researchers, and privacy-conscious users, the appeal is immediate: a high-quality reasoning engine that stays on-device and respects data boundaries.

The rise of “reasoning clones” and the open fine-tuning wave

Qwable is part of a broader trend where open models are being fine-tuned to replicate the behavior of closed, high-performance models. Developers are increasingly using public reasoning datasets—often extracted from model outputs or shared logs—to guide open weights toward specific styles. This approach, known as instruction fine-tuning on trace-style examples, doesn’t require access to the original model’s training data or weights. Instead, it leverages publicly available outputs as supervision signals. In this case, Fable 5’s structured responses became the template, and Qwen3.6-27B became the chassis.

The open nature of the process means anyone can attempt similar projects. Within days, community members began sharing variations—some optimized for speed, others for multilingual tasks. The emergence of multiple derivatives signals a shift: users are no longer limited to the reasoning styles offered by a single provider. If a model excels in a particular domain or style, the community can attempt to transplant that style into an open model. This accelerates innovation but also introduces fragmentation, as different versions may prioritize accuracy, speed, or safety in varying ways.

Turning off safeguards: the emergence of “abliterated” versions

Shortly after the original Qwable was released, a second variant appeared—one that removed the model’s refusal behavior. This version, created by modifying the model’s weights using tools from llama.cpp such as cvector-generator, effectively disables built-in refusal mechanisms. The result is a model that will attempt to answer prompts it was previously trained to reject, including those that might involve sensitive, unethical, or risky topics. The term “abliterated” refers to the surgical removal of these safeguards, leaving the core reasoning intact but stripping away ethical guardrails.

This development raises immediate concerns. Refusal behaviors are not arbitrary; they are the result of training on datasets that include safety annotations, filtering rules, and human feedback. Removing them doesn’t erase risk—it transfers it from the model provider to the end user. For organizations or individuals running such models, the responsibility for content moderation, legal compliance, and ethical use shifts entirely onto their shoulders. The availability of “safeguard-free” versions also increases the risk of misuse, especially when combined with local execution, which makes detection and enforcement more difficult.

Local execution as a double-edged sword

The local nature of Qwable and its derivatives is both a strength and a vulnerability. On the positive side, running AI models locally eliminates data transmission to third-party servers, reducing exposure to data breaches, surveillance, or API-level censorship. It also allows for offline use in environments with limited connectivity. Privacy advocates and enterprises in regulated sectors—such as healthcare or finance—often prefer local models for compliance with data protection laws like GDPR or HIPAA.

However, local execution also removes centralized oversight. There is no built-in mechanism to detect or block harmful outputs once the model is running on a user’s device. While cloud providers can monitor prompts and responses at scale, local models operate in isolation. This makes it harder to prevent misuse, enforce safety policies, or recall problematic versions. The emergence of unfiltered variants underscores the need for user education and responsible deployment practices—especially as these models become more accessible to non-experts.

Legal and ethical implications: who is accountable?

The legal landscape around fine-tuned models is still evolving. When a developer fine-tunes an open model using publicly available reasoning traces, they are not infringing on copyright in the same way as redistributing proprietary weights. However, the outputs of such models may still be subject to liability if used to generate harmful content. The removal of refusal behaviors complicates matters further, as it effectively bypasses the safety mechanisms that providers like Anthropic have implemented and publicly defended.

Trading isn't a casino. Stop gambling.

Real results from MEFAI's AI. Get $50 off the Pro plan.

Claim $50 off Pro →

Sponsored · Past performance is not indicative of future results. Not financial advice.

In early June 2026, a government agency ordered Fable 5’s removal for foreign nationals following a disputed jailbreak finding. This incident highlighted the tension between model safety and accessibility. Within days, Qwable appeared—offering a local alternative that replicates Fable’s reasoning without the safeguards that triggered the ban. This sequence illustrates a broader dynamic: as regulators tighten controls on closed models, open alternatives may fill the gap, not necessarily with the same safety standards.

For organizations, this means the responsibility for risk assessment shifts from the model provider to the deployer. Policies must be updated to account for locally run, fine-tuned models. For individuals, it means exercising caution when using unfiltered versions—especially for sensitive tasks like legal or medical advice.

What to watch next: derivatives, benchmarks, and community response

The Qwable project is likely to spawn multiple variants in the coming weeks. Some will focus on improving speed or multilingual support; others may reintroduce partial safeguards or add custom filters. Benchmarking will become critical: users will need independent evaluations to compare Qwable’s reasoning accuracy against both its base model and Fable 5 itself. Without standardized tests, claims about performance remain anecdotal.

Community response will also shape the model’s trajectory. Hugging Face discussions, GitHub forks, and Discord channels are already active, with developers sharing tips on optimization, prompt engineering, and safety layers. Some may attempt to rebuild refusal behaviors as plugins or post-processing filters, effectively “re-safeguarding” the model without relying on the original provider.

Another area to monitor is hardware support. As models grow, users with older GPUs may struggle to run the full 27B version. Quantized versions (e.g., 4-bit or 8-bit) could emerge to lower the barrier to entry, though this may trade some accuracy for accessibility. Meanwhile, tooling around local inference—such as optimized inference engines and model compressors—will likely improve, making Qwable and its successors easier to deploy.

Practical takeaways for developers and users

For developers interested in experimenting with Qwable or similar models, start with the original release on Hugging Face and test it on your hardware. Use quantization tools like GGUF to reduce memory usage if needed, but validate that reasoning quality isn’t compromised. Consider running initial evaluations on public benchmarks to establish a baseline.

For users concerned about safety, avoid using unfiltered versions for high-stakes applications. If you must use a model without safeguards, implement your own content filters, logging, and human review processes. Treat such models as you would any high-risk tool: with clear usage policies and escalation paths for problematic outputs.

For organizations, assess whether local fine-tuned models fit your risk tolerance. If you adopt them, document usage guidelines, train staff on responsible deployment, and maintain audit trails. Remember that while the model may run locally, your legal and ethical obligations do not disappear.

The bigger picture: open models as fast followers

Qwable exemplifies how open models can rapidly internalize and redistribute features from closed systems. This is not copying in the traditional sense—it’s a form of observational learning enabled by transparency in outputs. The fact that a community developer could replicate Fable 5’s reasoning style in under a week speaks to the maturity of open model ecosystems. It also signals a future where high-quality reasoning is commoditized, not monopolized.

Yet this commoditization comes with caveats. The removal of safeguards in derivative models shows that open fine-tuning can outpace safety considerations. The challenge ahead isn’t technical—it’s governance. The tech community must decide how to balance openness with responsibility, especially as models become easier to deploy and harder to regulate.

For now, Qwable remains a proof of concept: a free, local model that thinks like a flagship AI. But its real impact may lie in what comes next—not just in the code, but in the conversations it sparks about safety, accountability, and the open future of AI.