About PromptLean

A library of token-efficient prompts, built by developers who care about both quality and cost.

Why this exists

Most prompt libraries optimize for one thing: quality. The result is verbose, ceremony-heavy prompts that would embarrass any engineer who wrote them in code. They're 200 tokens of preamble telling a capable model to "think step by step."

Token cost compounds fast. At 10,000 calls per day, a 150-token prompt difference adds 1.5 million input tokens daily, roughly 45 million a month. That costs real money, and longer prompts add response latency too.
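For a sense of scale, here is a back-of-the-envelope sketch of that arithmetic. The rate of 3.00 USD per million input tokens is a made-up placeholder, not a quote from any provider, and extra_prompt_cost is a hypothetical helper; substitute your own numbers.

```python
def extra_prompt_cost(calls_per_day: int, extra_tokens: int,
                      usd_per_million_tokens: float, days: int = 30):
    """Return (extra input tokens, extra USD) accumulated over `days` days."""
    extra = calls_per_day * extra_tokens * days
    return extra, extra / 1_000_000 * usd_per_million_tokens

# Hypothetical rate: 3.00 USD per million input tokens (assumption, not a real price).
tokens, usd = extra_prompt_cost(10_000, 150, usd_per_million_tokens=3.00)
print(f"{tokens:,} extra tokens, about ${usd:,.2f} per 30 days")
```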

PromptLean organizes prompts into three tiers, so you can pick the right token budget for your situation.

The three tiers

⚡ Lean ~20–40 tokens

Minimal instruction. Works because modern frontier models have strong priors about common tasks. Use when:

You're iterating rapidly and output quality doesn't need to be final
Context window is tight (many docs in the prompt already)
High-volume calls where cost matters
The model clearly understands the task from domain context

⚖ Balanced ~60–90 tokens

Structured output spec without ceremony. Tells the model what sections to produce. Use when:

You need consistent output structure across runs
The task has multiple sub-questions that need separate answers
You're feeding output downstream and need predictable format

★ Max Quality ~150–220 tokens

Role assignment, full rubric, output spec, calibration examples. Use when:

Output will be shared externally (report, blog post, code review)
The task requires expert-level nuance (system design, eval rubric)
You're running once and need the best possible output
You're using an API directly where you control cost explicitly
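The "use when" criteria above can be read as a simple decision rule. The sketch below is one possible encoding, not part of PromptLean itself: choose_tier and its flags are hypothetical names, and the returned strings assume the variant keys lean, balanced, and max_quality used in the JSON file.

```python
def choose_tier(shared_externally: bool = False,
                needs_expert_nuance: bool = False,
                needs_fixed_structure: bool = False,
                fed_downstream: bool = False) -> str:
    """Map the tier criteria to a variant key (hypothetical helper)."""
    if shared_externally or needs_expert_nuance:
        return "max_quality"
    if needs_fixed_structure or fed_downstream:
        return "balanced"
    # Default: rapid iteration, tight context window, or high call volume.
    return "lean"

print(choose_tier(fed_downstream=True))
```

The ordering is a design choice: quality requirements win over structure requirements, and Lean is the fallback when neither applies.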

How token estimates work

Token counts use the GPT-4 tokenizer encoding (cl100k_base) as a rough baseline. Claude and Gemini use their own tokenizers, so their counts may differ by ±10%. Estimates cover the prompt template only; they don't include your content (the [PLACEHOLDER] values).
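Because the estimate covers only the template, the prompt you actually send is longer once the placeholders are filled. A minimal sketch of that substitution follows, assuming placeholders are upper-case names in square brackets; the exact placeholder syntax, the fill_template helper, and the sample template are illustrative assumptions.

```python
import re

def fill_template(template: str, values: dict) -> str:
    """Replace [NAME] slots with supplied values; raise if one is missing."""
    def substitute(match):
        name = match.group(1)
        if name not in values:
            raise KeyError(f"no value supplied for [{name}]")
        return values[name]
    # Assumed placeholder syntax: an upper-case identifier in square brackets.
    return re.sub(r"\[([A-Z][A-Z0-9_]*)\]", substitute, template)

template = "Review this code. Flag bugs.\n\n[CODE]"  # made-up template
filled = fill_template(template, {"CODE": "def add(a, b): return a - b"})
print(filled)
```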

The savings percentages compare Lean and Balanced against Max Quality for the same prompt.
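As an illustration of that comparison, the sketch below computes the percentage saved from the token_estimate fields of one prompt's variants. The rounding rule and the example numbers are assumptions; the site may round differently.

```python
def savings_vs_max(variants: dict) -> dict:
    """Percent of tokens saved by lean and balanced relative to max_quality."""
    max_tokens = variants["max_quality"]["token_estimate"]
    return {
        tier: round(100 * (1 - variants[tier]["token_estimate"] / max_tokens))
        for tier in ("lean", "balanced")
    }

# Illustrative numbers only, picked from within the tier ranges above.
example = {
    "lean": {"token_estimate": 30},
    "balanced": {"token_estimate": 80},
    "max_quality": {"token_estimate": 200},
}
print(savings_vs_max(example))  # {'lean': 85, 'balanced': 60}
```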

Model compatibility

Every prompt page includes model-specific notes and quality benchmarks for 8 frontier models: GPT-5.4, Claude Sonnet 4.6, Claude Opus 4.6, Gemini 3.1 Pro, Grok-4, Llama 4 Scout, Mistral Large 3, and o1.

Lean variants generally work better on models with strong instruction-following (Claude Sonnet 4.6, GPT-5.4). Max Quality variants add the most value on complex, multi-part tasks where even capable models benefit from a clear rubric.
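If you want to act on the per-model benchmarks programmatically, something like the following could work. It assumes each prompt record has a benchmarks map keyed by model name whose best field holds a variant key, as the response shape further down suggests. The pick_variant helper, the model names, and the scores are made up for illustration.

```python
def pick_variant(prompt: dict, model: str, default: str = "balanced") -> str:
    """Return the variant key recorded as best for `model`, else `default`."""
    best = prompt.get("benchmarks", {}).get(model, {}).get("best")
    # Only trust `best` if it names a variant that actually exists.
    return best if best in prompt.get("variants", {}) else default

# Made-up record for illustration; real scores live in prompts.json.
record = {
    "benchmarks": {
        "Model A": {"lean": 4, "balanced": 4, "max_quality": 5, "best": "lean"},
    },
    "variants": {"lean": {}, "balanced": {}, "max_quality": {}},
}
print(pick_variant(record, "Model A"))
print(pick_variant(record, "Model B"))  # no benchmark entry, falls back to default
```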

Use as an API

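All prompts live in a single JSON file on GitHub. You can fetch() it directly in any app with no API key and no auth. GitHub does throttle heavy unauthenticated traffic to raw content, so cache the file rather than fetching it on every request.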

Endpoint

https://raw.githubusercontent.com/kishormorol/promptlean/main/data/prompts.json

JavaScript

const res = await fetch(
  'https://raw.githubusercontent.com/kishormorol/promptlean/main/data/prompts.json'
);
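// Surface HTTP failures instead of trying to parse an error page as JSON
if (!res.ok) throw new Error(`Failed to load prompts: HTTP ${res.status}`);
const { prompts } = await res.json();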

// Get the lean variant of a specific prompt
const codeReview = prompts.find(p => p.id === 'code-review');
const leanPrompt = codeReview.variants.lean.prompt;
console.log(leanPrompt); // "Review this code. Flag bugs..."

Python

import requests

URL = "https://raw.githubusercontent.com/kishormorol/promptlean/main/data/prompts.json"
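resp = requests.get(URL, timeout=10)  # avoid hanging forever on a stalled connection
resp.raise_for_status()               # fail loudly on HTTP errors
prompts = resp.json()["prompts"]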

# Find a prompt and pick the balanced variant
review = next(p for p in prompts if p["id"] == "code-review")
prompt_text = review["variants"]["balanced"]["prompt"]
tokens      = review["variants"]["balanced"]["token_estimate"]
print(f"{tokens} tokens: {prompt_text[:60]}...")

curl

curl -s https://raw.githubusercontent.com/kishormorol/promptlean/main/data/prompts.json \
  | jq '.prompts[] | select(.id == "code-review") | .variants.lean.prompt'

Response shape

{
  "prompts": [
    {
      "id":          string,   // kebab-case slug
      "title":       string,
      "category":    string,
      "tags":        string[],
      "featured":    boolean,
      "description": string,
      "model_notes": { [model]: string },
      "benchmarks":  { [model]: { lean: 1–5, balanced: 1–5, max_quality: 1–5, best: string } },
      "variants": {
        "lean":        { "prompt": string, "token_estimate": number },
        "balanced":    { "prompt": string, "token_estimate": number },
        "max_quality": { "prompt": string, "token_estimate": number }
      },
      "source": {           // present only on adapted prompts
        "name":        string,
        "author":      string,
        "url":         string,
        "license":     string,
        "license_url": string | null,
        "note":        string
      } | undefined
    }
  ],
  "categories": string[],
  "models":     string[]
}
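Since the file is fetched over the network and can change on any push, a light sanity check before use may be worthwhile. The sketch below tests a single prompt record against the shape above. Treating every field except source as required is an assumption, and shape_problems and the sample record are hypothetical.

```python
REQUIRED_FIELDS = ("id", "title", "category", "tags", "featured",
                   "description", "model_notes", "benchmarks", "variants")
TIERS = ("lean", "balanced", "max_quality")

def shape_problems(prompt: dict) -> list:
    """List the ways a prompt record departs from the documented shape."""
    problems = [f"missing field: {f}" for f in REQUIRED_FIELDS if f not in prompt]
    for tier in TIERS:
        variant = prompt.get("variants", {}).get(tier)
        if not isinstance(variant, dict):
            problems.append(f"missing variant: {tier}")
        elif not isinstance(variant.get("prompt"), str):
            problems.append(f"{tier}.prompt is not a string")
        elif not isinstance(variant.get("token_estimate"), (int, float)):
            problems.append(f"{tier}.token_estimate is not a number")
    return problems

# Made-up record for illustration.
sample = {
    "id": "code-review", "title": "Code review", "category": "engineering",
    "tags": [], "featured": False, "description": "", "model_notes": {},
    "benchmarks": {},
    "variants": {t: {"prompt": "...", "token_estimate": 30} for t in TIERS},
}
broken = dict(sample, variants={"lean": {"prompt": "x", "token_estimate": "30"}})

print(shape_problems(sample))   # []
print(shape_problems(broken))
```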

CORS: GitHub raw content is served with access-control-allow-origin: *, so fetch works from any browser origin. The file updates on every push to main, though GitHub's CDN may serve a cached copy for a few minutes. Cache-bust with a query string if you need fresh data.
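One way to apply the cache-busting tip is to append a query parameter that changes once per time window, as sketched below. The parameter name t, the five-minute bucket, and the cache_busted helper are arbitrary choices for illustration. Bucketing keeps repeated calls within one window on the same URL, so you are not requesting a brand-new URL every time.

```python
import time
from typing import Optional
from urllib.parse import urlencode

PROMPTS_URL = "https://raw.githubusercontent.com/kishormorol/promptlean/main/data/prompts.json"

def cache_busted(url: str, bucket_seconds: int = 300,
                 now: Optional[float] = None) -> str:
    """Append a query parameter that changes once per time bucket."""
    current = time.time() if now is None else now
    stamp = int(current // bucket_seconds)
    # "t" is an arbitrary parameter name (assumption).
    return f"{url}?{urlencode({'t': stamp})}"

print(cache_busted(PROMPTS_URL))
```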