open source · community built · free forever

Big intelligence. Tiny footprint.

We build small, fast, production-ready models that punch above their weight. No bloat. No nonsense. Runs on your laptop, your phone, your API — not just someone's A100 cluster.


141M · Smallest Classifier

20ms · Avg Inference

17+ · Safety Labels

GPU Cost

CORTYX v2: live · 17-label toxicity classifier · Runs on Colab T4 · Open to all · DeBERTa-v3 backbone · GuardLite: coming soon · TinyModels: small but powerful

// what we do

Machine learning for everyone

Most AI labs race to make models bigger. We race to make them smaller without losing what matters.

🎯

Precision over scale

Every model we ship is purpose-built for a specific task, not a general-purpose blob that does everything passably and nothing well. Specialization wins.

⚡

Free-tier first

Everything is trained and tested on Google Colab T4 GPUs. If it doesn't run there, it doesn't ship. Your hardware shouldn't be a barrier to good AI.

🔓

Truly open

No paywalls. No gated weights. No "available upon request." Every model, dataset, and training notebook is public, forkable, and yours to build on.

🌱

Low carbon, high impact

Smaller models mean less compute, less energy, less waste. Good ML shouldn't require a power plant. We take that seriously.

// models

What we ship

Every model is trained on free-tier hardware. No excuses, no shortcuts: just efficient engineering.

⚔️ JujutsuKaiserver

Jujutsu Kaisen themed model — see model card for details

TinyModels

Live

🌌 NYXIS-1.1B

1.1B parameter model by QuantaSparkLabs

QuantaSparkLabs

Live

More models

Additional releases are coming. Follow the org to stay updated.

Coming

// why tiny

Efficiency over parameters

The AI industry measures progress in billions of parameters. We measure it in milliseconds and zero-dollar bills.

Big models

Cost per inference: $$$ (expensive)
Latency: 500ms – 2s
Deployment: Needs A100 / H100
Accessible to: Well-funded labs
Carbon footprint: Very high

TinyModels ⚡

Cost per inference: Free / near-free
Latency: 20 – 80ms
Deployment: T4, CPU, phone
Accessible to: Everyone
Carbon footprint: Low
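
Latency figures like the ones above are easiest to trust when you can reproduce them. The following is a minimal sketch of a timing harness, not the method used to produce the numbers on this page. `measure_latency_ms` and `fake_model` are hypothetical names introduced for illustration; in practice you would pass your own loaded model or pipeline in place of `fake_model`.

```python
import statistics
import time


def measure_latency_ms(fn, *args, warmup=3, runs=20):
    """Time fn(*args) and return mean/median wall-clock latency in milliseconds."""
    # Warm-up calls are discarded so one-time costs (caches, lazy init)
    # do not skew the measured runs.
    for _ in range(warmup):
        fn(*args)

    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        fn(*args)
        samples.append((time.perf_counter() - start) * 1000.0)

    return {
        "mean_ms": statistics.mean(samples),
        "median_ms": statistics.median(samples),
        "runs": runs,
    }


def fake_model(text):
    # Hypothetical stand-in for a real model call, so the sketch runs anywhere.
    return text.upper()


stats = measure_latency_ms(fake_model, "Your prompt here")
print(f"mean {stats['mean_ms']:.3f} ms, median {stats['median_ms']:.3f} ms over {stats['runs']} runs")
```

The median is reported alongside the mean because a single slow run can distort an average over a small sample.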

// philosophy

Three rules. No exceptions.

Every model we release must clear all three bars. No partial credit.

Fits on a free GPU

If it doesn't run on a Colab T4, it doesn't ship. Deployable by anyone means deployable by everyone — students, indie devs, researchers with no budget.

Beats models twice its size

Small size is no excuse for poor quality. Every TinyModel must match or outperform models with significantly higher parameter counts on its target benchmark. Efficiency is the craft.

Ships with clean docs

A model nobody can use is worthless. Every release comes with a complete model card, working code examples, and honest evaluation numbers — no cherry-picked benchmarks.
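
The first rule can be sanity-checked with simple arithmetic before any training run. The sketch below estimates the memory taken by model weights alone and compares it to a GPU budget. It rests on assumptions: the T4 is treated as having 16 GB of memory, half of it is reserved as headroom for activations and framework overhead, and the helper names (`weight_memory_gb`, `fits_on_gpu`) are hypothetical. Real usage depends on batch size, sequence length, and runtime.

```python
# Bytes needed to store one parameter at common precisions.
BYTES_PER_PARAM = {"fp32": 4, "fp16": 2, "int8": 1}


def weight_memory_gb(num_params, dtype="fp16"):
    """Memory for the weights only, in decimal gigabytes."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


def fits_on_gpu(num_params, dtype="fp16", gpu_memory_gb=16.0, headroom=0.5):
    """True if the weights fit within the `headroom` fraction of GPU memory.

    gpu_memory_gb=16.0 is an assumed figure for a T4; headroom=0.5 is an
    arbitrary safety margin, not a measured requirement.
    """
    return weight_memory_gb(num_params, dtype) <= gpu_memory_gb * headroom


# Parameter counts taken from this page: the 141M classifier and NYXIS-1.1B.
for name, num_params in [("141M classifier", 141_000_000), ("NYXIS-1.1B", 1_100_000_000)]:
    for dtype in ("fp32", "fp16"):
        size = weight_memory_gb(num_params, dtype)
        print(f"{name} @ {dtype}: {size:.3f} GB, fits: {fits_on_gpu(num_params, dtype)}")
```

Under these assumptions both models fit with room to spare, while a 7B model in fp32 (28 GB of weights) would not.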

// a word (or two)

Jokes and a serious note

We take the work seriously. Not always ourselves.

// joke #1

GPT-4 walks into a bar. The bartender says: "We don't serve models with 1.8 trillion parameters here." GPT-4 says: "That's fine, I'll just hallucinate a better bar."

// joke #2

A researcher asks a 70B model and a 141M model the same question. The 70B model takes 4 seconds and gives a three-paragraph answer. The 141M model answers in 22ms and says: "Same thing, shorter."

The democratization of AI is not a marketing phrase — it is an obligation. When the tools to build intelligent systems are locked behind compute budgets that only large organizations can afford, the people who could benefit most from this technology are exactly the ones who cannot access it. A student in a resource-constrained environment with a free Colab account deserves the same quality of inference as a team running enterprise GPU clusters.

TinyModels exists because we believe that making something smaller and faster is not a compromise — it is a discipline. Every parameter saved, every millisecond cut, every megabyte reduced is a deliberate act of engineering that makes this technology more portable, more sustainable, and more inclusive. We will keep building that way.

// quick start

Up and running in 60 seconds

All TinyModels share the same interface. Learn once, use everywhere.

Install

# one-time setup
pip install transformers torch sentencepiece huggingface_hub

JujutsuKaiserver

from transformers import pipeline

# load JujutsuKaiserver from TinyModels
generator = pipeline("text-generation",
                     model="TinyModels/JujutsuKaiserver")

# the pipeline returns a list with one dict per generated sequence
result = generator("Your prompt here")
print(result[0]["generated_text"])

NYXIS-1.1B

from transformers import pipeline

# load NYXIS-1.1B from QuantaSparkLabs
generator = pipeline("text-generation",
                     model="QuantaSparkLabs/NYXIS-1.1B")

# the pipeline returns a list with one dict per generated sequence
result = generator("Your prompt here")
print(result[0]["generated_text"])
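
Because the models share one interface, the calls above can be wrapped in a single helper. This is a minimal sketch, not an official TinyModels API: `MODEL_IDS` and `generate` are hypothetical names, and the short keys are made up for illustration. The `transformers` import is deferred until the first call so the module can be imported without the library loaded.

```python
# Short, made-up keys mapped to the model IDs listed on this page.
MODEL_IDS = {
    "jujutsu": "TinyModels/JujutsuKaiserver",
    "nyxis": "QuantaSparkLabs/NYXIS-1.1B",
}


def generate(model_key, prompt, max_new_tokens=64):
    """Run a text-generation pipeline for the chosen model and return the text."""
    # Imported lazily: downloading and loading weights only happens on call.
    from transformers import pipeline

    generator = pipeline("text-generation", model=MODEL_IDS[model_key])
    outputs = generator(prompt, max_new_tokens=max_new_tokens)
    # The text-generation pipeline returns a list of dicts, one per sequence.
    return outputs[0]["generated_text"]
```

Calling `generate("nyxis", "Your prompt here")` would download the weights on first use. A longer-lived script would likely cache the pipeline object rather than rebuild it on every call.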