EleutherAI: Llemma 7b

This model is no longer available.

Model Card

Llemma 7B is an open-source language model purpose-built for mathematics, developed by EleutherAI. It was created by continuing pretraining of Code Llama 7B on the Proof-Pile-2, a 55-billion-token dataset of scientific papers, math-heavy web content, and mathematical code.

The model excels at chain-of-thought mathematical reasoning and can leverage computational tools like Python interpreters and formal theorem provers (Lean, Isabelle) without additional fine-tuning. On the MATH benchmark, Llemma 7B scores 18.0% pass@1, and on GSM8k it achieves 36.4% — significantly outperforming Llama 2 and Code Llama, and surpassing Google's Minerva on an equal-parameter basis.

Llemma is best suited as a specialized base model for math-heavy applications such as step-by-step problem solving, formal proof generation, and scientific reasoning. Its fully open weights, data, and training code make it a strong foundation for further fine-tuning.

Context Window: 4K tokens
Max Output: 4K tokens
Input Cost: $0.80 per million tokens
Output Cost: $1.20 per million tokens
Release Date: Oct 16, 2023

Code Example

Add AI to your app with the Puter.js AI API — no API keys or setup required.

Using npm:

// npm install @heyputer/puter.js
import { puter } from '@heyputer/puter.js';

// Send a prompt and render the model's reply into the page
puter.ai.chat("Explain quantum computing in simple terms").then(response => {
    document.body.innerHTML = response.message.content;
});

Using a script tag:

<html>
<body>
    <script src="https://js.puter.com/v2/"></script>
    <script>
        // Send a prompt and render the model's reply into the page
        puter.ai.chat("Explain quantum computing in simple terms").then(response => {
            document.body.innerHTML = response.message.content;
        });
    </script>
</body>
</html>

Frequently Asked Questions

How do I use Llemma 7b?

You can access Llemma 7b by EleutherAI through Puter.js AI API. Include the library in your web app or Node.js project and start making calls with just a few lines of JavaScript — no backend and no configuration required. You can also use it with Python or cURL via Puter's OpenAI-compatible API.

Is Llemma 7b free?

Yes, for you as a developer, when you use it through Puter.js. Under the User-Pays Model, your users pay for their own AI usage directly, so adding Llemma 7b to your app costs you nothing.

What is the pricing for Llemma 7b?

Pricing for Llemma 7b is based on the number of input and output tokens used per request.

Price per 1M tokens:
Input: $0.80
Output: $1.20

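
As a rough illustration of how per-token pricing adds up, the sketch below estimates the cost of a single request from the listed rates. The function name `estimateCostUSD` and the constants are hypothetical helpers written for this example, not part of Puter.js, and the token counts are made-up inputs; actual billing may round or meter differently.

```javascript
// Listed rates from this page, in USD per one million tokens.
const INPUT_USD_PER_MILLION = 0.8;
const OUTPUT_USD_PER_MILLION = 1.2;

// Hypothetical helper: estimate the cost of one request in USD.
function estimateCostUSD(inputTokens, outputTokens) {
  const inputCost = (inputTokens / 1e6) * INPUT_USD_PER_MILLION;
  const outputCost = (outputTokens / 1e6) * OUTPUT_USD_PER_MILLION;
  return inputCost + outputCost;
}

// Example: a 2,000-token prompt that produces a 500-token answer.
console.log(estimateCostUSD(2000, 500).toFixed(4)); // prints 0.0022
```

At these rates, a request that fills the entire 4K-token window would still cost well under one cent.
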
Who created Llemma 7b?

Llemma 7b was created by EleutherAI and released on Oct 16, 2023.

What is the context window of Llemma 7b?

Llemma 7b supports a context window of 4K tokens. For reference, that is roughly equivalent to 8 pages of text.

What is the max output length of Llemma 7b?

Llemma 7b can generate up to 4K tokens in a single response.
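
The two limits interact: a long prompt leaves less room for the response. The sketch below shows one way to budget for that. It rests on two assumptions this page does not state: that "4K" means 4,096 tokens, and that the prompt and the generated output share a single context window, as is typical for decoder-only models. `maxCompletionTokens` is a hypothetical helper, not a Puter.js function.

```javascript
// Assumed values: "4K" taken to mean 4096 tokens.
const CONTEXT_WINDOW = 4096;
const MAX_OUTPUT = 4096;

// Hypothetical helper: how many tokens can still be generated
// after a prompt of the given length, assuming a shared window.
function maxCompletionTokens(promptTokens) {
  if (!Number.isInteger(promptTokens) || promptTokens < 0) {
    throw new RangeError("promptTokens must be a non-negative integer");
  }
  const remaining = CONTEXT_WINDOW - promptTokens;
  // Clamp to the output cap, and never return a negative budget.
  return Math.max(0, Math.min(MAX_OUTPUT, remaining));
}

console.log(maxCompletionTokens(1000)); // prints 3096
```
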

Does it work with React / Vue / Vanilla JS / Node / etc.?

Yes — the Llemma 7b API works with any JavaScript framework, Node.js, or plain HTML through Puter.js. Just include the library and start building. See the documentation for more details.
