Free 100B Model - April 2026

Elephant Alpha AI Guide

Complete guide to the stealth 100B AI model on OpenRouter. Free, fast (~65 tok/s), 256K context. Compare costs, get API setup code, and learn the best use cases.

Model Comparison

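| Model             | Params   | Context | Speed     | Input $/M | Output $/M | Best For               |
|-------------------|----------|---------|-----------|-----------|------------|------------------------|
| Elephant Alpha    | 100B     | 256K    | ~65 tok/s | $0        | $0         | Coding, agents, batch  |
| Claude Opus 4.7   | ~2T      | 200K    | ~30 tok/s | $15       | $75        | Complex reasoning      |
| GPT-5.4           | ~1.8T    | 128K    | ~40 tok/s | $10       | $30        | General purpose        |
| Claude Sonnet 4.5 | ~350B    | 200K    | ~80 tok/s | $3        | $15        | Fast + smart balance   |
| Gemini 3.1 Pro    | ~500B    | 2M      | ~55 tok/s | $3.5      | $10.5      | Long context           |
| Qwen 3.6-35B      | 35B      | 128K    | ~90 tok/s | $0.15     | $0.60      | Local agents           |
| DeepSeek V3       | 671B MoE | 128K    | ~50 tok/s | $0.27     | $1.10      | Cost-efficient         |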
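
To make the savings concrete, here is a minimal sketch that turns the per-million-token prices from the comparison table into a monthly bill. The workload figures (50M input and 10M output tokens per month) are purely hypothetical, and the function names are illustrative rather than part of any SDK.

```python
# Prices in USD per million tokens (input, output), copied from the table above.
PRICES_PER_MILLION = {
    "Elephant Alpha": (0.0, 0.0),
    "Claude Opus 4.7": (15.0, 75.0),
    "GPT-5.4": (10.0, 30.0),
    "Claude Sonnet 4.5": (3.0, 15.0),
    "Gemini 3.1 Pro": (3.5, 10.5),
    "Qwen 3.6-35B": (0.15, 0.60),
    "DeepSeek V3": (0.27, 1.10),
}


def monthly_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of one month of usage for the given model."""
    in_price, out_price = PRICES_PER_MILLION[model]
    return (input_tokens * in_price + output_tokens * out_price) / 1_000_000


def monthly_savings(current_model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the USD saved per month by moving the same workload to Elephant Alpha."""
    current = monthly_cost(current_model, input_tokens, output_tokens)
    return current - monthly_cost("Elephant Alpha", input_tokens, output_tokens)


if __name__ == "__main__":
    # Hypothetical workload: 50M input tokens and 10M output tokens per month.
    saved = monthly_savings("Claude Sonnet 4.5", 50_000_000, 10_000_000)
    print(f"Monthly savings: ${saved:.2f}, yearly: ${saved * 12:.2f}")
```

The calculation only covers token charges; it does not account for any quality difference between models or for rate limits on the free tier.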

Quick Start Guide

1. Get OpenRouter API Key: Sign up at openrouter.ai and create a free API key. No credit card needed for Elephant Alpha.
2. Set Model ID: Use openrouter/elephant-alpha as the model identifier in your API calls.
3. Point to OpenRouter: Set base URL to https://openrouter.ai/api/v1. Compatible with OpenAI SDK format.
4. Start building: Use clear, precise instructions. Elephant Alpha works best with structured prompts: less chat, more action.

API Code Examples

Python (OpenAI SDK)

    from openai import OpenAI

    client = OpenAI(
        base_url="https://openrouter.ai/api/v1",
        api_key="your-openrouter-key"
    )

    response = client.chat.completions.create(
        model="openrouter/elephant-alpha",
        messages=[
            {"role": "system", "content": "You are a fast, efficient coding assistant. Be concise."},
            {"role": "user", "content": "Write a Python function to merge two sorted arrays in O(n) time."}
        ],
        max_tokens=2000
    )

    print(response.choices[0].message.content)

JavaScript (fetch)

    const response = await fetch("https://openrouter.ai/api/v1/chat/completions", {
      method: "POST",
      headers: {
        "Authorization": "Bearer your-openrouter-key",
        "Content-Type": "application/json"
      },
      body: JSON.stringify({
        model: "openrouter/elephant-alpha",
        messages: [
          { role: "user", content: "Refactor this code for better error handling..." }
        ]
      })
    });

    const data = await response.json();
    console.log(data.choices[0].message.content);

cURL

    curl https://openrouter.ai/api/v1/chat/completions \
      -H "Authorization: Bearer your-openrouter-key" \
      -H "Content-Type: application/json" \
      -d '{
        "model": "openrouter/elephant-alpha",
        "messages": [{"role": "user", "content": "Summarize this document..."}],
        "max_tokens": 4000
      }'

Using with Kilo Code / VS Code

    // In Kilo Code settings (or similar OpenAI-compatible tools):
    // Provider: OpenRouter
    // Base URL: https://openrouter.ai/api/v1
    // Model: openrouter/elephant-alpha
    // API Key: your-openrouter-key
    //
    // Elephant Alpha works great as an "execution" model
    // Pair with Claude/GPT for planning, Elephant for implementation

Strengths & Weaknesses

Strengths

  • Completely free ($0/M tokens)
  • ~65 tok/s output speed
  • 256K context window
  • 32K max output tokens
  • Function calling support
  • Structured outputs
  • Terse output (a claimed 50-75% fewer filler tokens than chattier models)
  • Great for agents & batch work
  • OpenAI API compatible
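
Since the list above mentions function calling and OpenAI API compatibility, the sketch below shows what a tool-enabled request body and the handling of a tool call could look like in the OpenAI chat-completions format. It assumes Elephant Alpha accepts the standard `tools` field through OpenRouter; the `get_weather` tool and its canned reply are hypothetical stand-ins, and no network call is made.

```python
import json

# Hypothetical tool definition in the OpenAI "tools" schema.
WEATHER_TOOL = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Return the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}


def get_weather(city: str) -> str:
    """Stand-in implementation; a real tool would query a weather service."""
    return f"Sunny in {city}"


LOCAL_TOOLS = {"get_weather": get_weather}


def build_tool_request(user_prompt: str) -> dict:
    """Build a chat-completions request body that advertises the tool."""
    return {
        "model": "openrouter/elephant-alpha",
        "messages": [{"role": "user", "content": user_prompt}],
        "tools": [WEATHER_TOOL],
    }


def run_tool_calls(assistant_message: dict) -> list:
    """Execute each requested tool and return 'tool' messages to send back."""
    replies = []
    for call in assistant_message.get("tool_calls") or []:
        func = LOCAL_TOOLS[call["function"]["name"]]
        # Arguments arrive as a JSON-encoded string in the OpenAI format.
        args = json.loads(call["function"]["arguments"])
        replies.append(
            {"role": "tool", "tool_call_id": call["id"], "content": func(**args)}
        )
    return replies
```

In a real loop, the messages returned by `run_tool_calls` would be appended to the conversation and sent back so the model can produce its final answer.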

Limitations

  • Anonymous provider (privacy risk)
  • Logs all prompts/completions
  • Weaker on complex reasoning
  • Not ideal for creative/vague prompts
  • May underperform smaller models on some benchmarks
  • No guaranteed uptime/SLA
  • Alpha phase — could change or disappear
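
Because the provider is anonymous and prompts are logged, one cautious habit is to scrub obvious secrets before anything leaves your machine. The following is a rough sketch only: the two patterns (email addresses and `sk-`-style keys) are assumptions about what counts as sensitive, and a regex pass like this will miss plenty of real-world personal data.

```python
import re

# Illustrative patterns only; extend them for your own data.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
SECRET_RE = re.compile(r"\bsk-[A-Za-z0-9_-]{16,}\b")  # assumed key format


def redact(text: str) -> str:
    """Replace emails and key-like strings with placeholders before sending."""
    text = EMAIL_RE.sub("[EMAIL]", text)
    text = SECRET_RE.sub("[SECRET]", text)
    return text


if __name__ == "__main__":
    prompt = "Contact jane.doe@example.com, key sk-abcdefghijklmnop1234"
    print(redact(prompt))
```

For anything confidential or regulated, redaction is not a substitute for using a provider with a known data policy.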

Best Use Cases for Elephant Alpha

1. Agent Execution Layer

Use Claude or GPT for planning and task decomposition, then hand off individual steps to Elephant Alpha for execution. This "planner + executor" pattern cuts API costs by 80-90% while maintaining quality.

2. Code Completion & Debugging

Elephant Alpha handles code generation, refactoring, and bug fixing with minimal token overhead. It stays focused on the task without unnecessary explanations — perfect for IDE integrations and CI/CD pipelines.

3. Document Processing

Summarize contracts, convert meeting transcripts to structured tables, extract data from PDFs. The 256K context window handles long documents, and the efficient output means faster processing at zero cost.
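
Documents that exceed even a 256K window still need to be split. The sketch below packs paragraphs into chunks under a token budget. It relies on a crude assumption of roughly 4 characters per token, which varies by tokenizer and language, so treat the budget as approximate and leave headroom for the prompt and the response.

```python
CHARS_PER_TOKEN = 4  # rough heuristic, not the model's real tokenizer


def split_for_context(text: str, budget_tokens: int) -> list:
    """Greedily pack paragraphs into chunks of at most ~budget_tokens each.

    A single paragraph longer than the budget is kept whole, so callers
    should check for oversized chunks if that can happen in their data.
    """
    budget_chars = budget_tokens * CHARS_PER_TOKEN
    chunks = []
    current = ""
    for paragraph in text.split("\n\n"):
        candidate = paragraph if not current else current + "\n\n" + paragraph
        if len(candidate) <= budget_chars or not current:
            current = candidate
        else:
            chunks.append(current)
            current = paragraph
    if current:
        chunks.append(current)
    return chunks


if __name__ == "__main__":
    # Reserve ~40K tokens of the 262,144-token window for instructions and output.
    document = "First section.\n\nSecond section.\n\nThird section."
    for i, chunk in enumerate(split_for_context(document, 262_144 - 40_000)):
        print(i, len(chunk))
```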

4. Batch Processing & Data Pipelines

Process thousands of items cheaply: classify support tickets, generate product descriptions, tag content, extract entities. At $0/token, batch size is only limited by rate limits.
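
Since rate limits are the practical ceiling on batch size, a batch loop usually needs to back off and retry. Here is a minimal sketch with exponential backoff. `call_model` is a hypothetical callable you would supply (for example, a wrapper around the chat-completions request), and `RateLimited` is a placeholder exception that your wrapper would raise on an HTTP 429 response.

```python
import time


class RateLimited(Exception):
    """Placeholder: raise this from call_model when the API returns HTTP 429."""


def process_batch(items, call_model, max_retries=5, base_delay=1.0, sleep=time.sleep):
    """Run call_model on every item, retrying rate-limited calls with backoff."""
    results = []
    for item in items:
        for attempt in range(max_retries + 1):
            try:
                results.append(call_model(item))
                break
            except RateLimited:
                if attempt == max_retries:
                    raise  # give up after the final retry
                # Wait 1s, 2s, 4s, ... before trying the same item again.
                sleep(base_delay * 2 ** attempt)
    return results
```

The `sleep` parameter is injectable so the loop can be tested without actually waiting; in production the default `time.sleep` applies.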

5. Browser Automation Agents

Elephant Alpha can power tools like OpenClaw for web scraping, form filling, and automated testing. The fast response time keeps agent loops snappy, and the free pricing makes long-running automation affordable.

Optimal Workflow: Planner + Executor Pattern

1. Claude Opus 4.7 plans: Break the complex task into clear, atomic subtasks with specific instructions.
2. Elephant Alpha executes: Run each subtask with precise prompts. Gets the job done fast at $0.
3. Claude reviews: Spot-check outputs for quality. Only re-run failures through Claude.
4. Result: 80-90% cost reduction, 2-3x faster throughput, same quality for execution tasks.
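
The workflow above can be sketched as a small control loop. The four callables (`plan`, `execute`, `review`, `fallback`) are hypothetical stand-ins for your own API wrappers, for instance a planner call to Claude and an executor call to Elephant Alpha. This shows the shape of the pattern, not a tested production pipeline.

```python
def run_pipeline(task, plan, execute, review, fallback):
    """Plan with a strong model, execute cheaply, re-run only the failures.

    plan(task)               -> list of subtask strings (planner model)
    execute(subtask)         -> output string (cheap executor model)
    review(subtask, output)  -> True if the output is acceptable
    fallback(subtask)        -> output string from the stronger model
    """
    results = []
    for subtask in plan(task):
        output = execute(subtask)
        if not review(subtask, output):
            # Only failed subtasks incur the cost of the stronger model.
            output = fallback(subtask)
        results.append(output)
    return results
```

Reviewing every output, as this sketch does, is the conservative choice; spot-checking a sample instead reduces planner-model cost further at the risk of missing some bad outputs.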

Elephant Alpha Key Specifications

Model ID: openrouter/elephant-alpha
Parameters: ~100 billion
Context Window: 262,144 tokens (256K)
Max Output: 32,768 tokens (32K)
Speed: ~65 tokens/second
Pricing: $0 / million tokens (input & output)
Release Date: ~April 13, 2026
Provider: Anonymous (via OpenRouter stealth program)
Features: Function calling, structured outputs, prompt caching
Modality: Text only
Access: OpenRouter API (OpenAI-compatible)
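
A quick back-of-the-envelope check using the figures above can help when sizing requests. The sketch assumes that prompt and output tokens share the same context window, which is typical for chat models but not confirmed for Elephant Alpha, and it treats ~65 tokens/second as a steady rate.

```python
CONTEXT_WINDOW = 262_144   # tokens, from the specifications above
MAX_OUTPUT = 32_768        # tokens
TOKENS_PER_SECOND = 65     # approximate output speed


def generation_seconds(output_tokens: int) -> float:
    """Estimate how long the model needs to stream this many output tokens."""
    return output_tokens / TOKENS_PER_SECOND


def fits(prompt_tokens: int, max_tokens: int) -> bool:
    """Check a request against the output cap and the shared context window."""
    return max_tokens <= MAX_OUTPUT and prompt_tokens + max_tokens <= CONTEXT_WINDOW


if __name__ == "__main__":
    # A full 32K-token response takes roughly 8.4 minutes at ~65 tok/s.
    print(f"{generation_seconds(MAX_OUTPUT) / 60:.1f} minutes")
    print(fits(200_000, 32_768))
```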

Related Stealth Models on OpenRouter

Elephant Alpha is part of OpenRouter's stealth program, where anonymous providers test models through blind evaluation.

Stealth models typically appear during testing phases and may change, improve, or disappear without notice. Take advantage of the free access while it lasts.