Free LLM API
This tutorial will show you how to access hundreds of large language models (LLMs) for free using Puter.js. Whether you need GPT, Claude, Gemini, Grok, DeepSeek, or any of the 500+ proprietary or open-source models supported by Puter.js, you can use them all without API keys, backend infrastructure, or limits.
Puter pioneered the "User-Pays" model, in which users cover their own AI usage costs. This approach lets developers add AI capabilities to their applications without paying for usage, managing API keys, handling billing, or maintaining server infrastructure.
Getting Started
Puter.js is completely serverless and requires no API keys or sign-ups. To start using any LLM, simply include this script tag in your HTML file, either in the <head> or <body> section:
<script src="https://js.puter.com/v2/"></script>
That's it! You're now ready to access hundreds of LLMs completely free. No backend, no API keys, no configuration.
Example 1: Using GPT-5 Nano
Let's start with OpenAI's GPT-5 Nano, a fast and efficient model perfect for most tasks. This example shows how to use the puter.ai.chat() function to generate text using GPT-5 Nano:
<html>
<body>
<script src="https://js.puter.com/v2/"></script>
<script>
puter.ai.chat("Explain machine learning in simple terms", {
model: "gpt-5-nano"
}).then(response => {
puter.print(response);
});
</script>
</body>
</html>
Example 2: Using Claude Sonnet 4
By changing the model name to "claude-sonnet-4", you can use Anthropic's Claude Sonnet 4 model to generate text, still with no API keys. Note that this example reads the reply text from response.message.content[0].text, the shape Claude responses use:
<html>
<body>
<script src="https://js.puter.com/v2/"></script>
<script>
puter.ai.chat("Write a creative story about a robot discovering emotions", {
model: "claude-sonnet-4"
}).then(response => {
puter.print(response.message.content[0].text);
});
</script>
</body>
</html>
Example 3: Using DeepSeek R1 for Complex Reasoning
DeepSeek R1 excels at step-by-step problem solving and logical reasoning:
<html>
<body>
<script src="https://js.puter.com/v2/"></script>
<script>
puter.ai.chat(
"Solve this logic puzzle: If all bloops are razzles and all razzles are lazzles, are all bloops definitely lazzles? Explain your reasoning.",
{ model: "deepseek/deepseek-r1" }
).then(response => {
puter.print(response.message.content);
});
</script>
</body>
</html>
Example 4: Streaming Responses for Better UX
For longer responses, streaming provides a better user experience by showing results in real-time:
<html>
<body>
<div id="output"></div>
<script src="https://js.puter.com/v2/"></script>
<script>
async function streamLLMResponse() {
const outputDiv = document.getElementById('output');
outputDiv.innerHTML = '<h2>Streaming Response:</h2>';
const response = await puter.ai.chat(
"Write a detailed explanation of quantum computing and its applications",
{
model: "gpt-5-nano",
stream: true
}
);
for await (const part of response) {
if (part?.text) {
outputDiv.innerHTML += part.text;
}
}
}
streamLLMResponse();
</script>
</body>
</html>
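Under the hood, the streamed response is an async iterable of parts. If you also need the complete text once streaming finishes (for example, to save it), you can collect the parts into a single string. A minimal sketch, where collectStream and mockStream are hypothetical helper names used for illustration:

```javascript
// Collect all text parts of a streamed chat response into one string.
// `stream` is any async iterable yielding parts shaped like { text: "..." },
// matching the parts consumed in the streaming example above.
async function collectStream(stream) {
  let fullText = '';
  for await (const part of stream) {
    if (part?.text) {
      fullText += part.text;
    }
  }
  return fullText;
}

// Demo with a mock stream (no network needed):
async function* mockStream() {
  yield { text: 'Hello, ' };
  yield { text: 'world!' };
  yield {}; // parts without text are skipped
}

collectStream(mockStream()).then(text => console.log(text)); // "Hello, world!"
```

In the streaming example above you would pass the awaited puter.ai.chat() result to collectStream instead of the mock.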
Example 5: Using LLMs for Image Analysis
Many LLMs support multimodal capabilities for analyzing images:
<html>
<body>
<img src="https://assets.puter.site/doge.jpeg" style="max-width: 400px; display: block; margin: 20px 0;">
<script src="https://js.puter.com/v2/"></script>
<script>
puter.ai.chat(
"Describe this image in detail. What do you see?",
"https://assets.puter.site/doge.jpeg",
{ model: "gpt-5-nano" }
).then(response => {
// document.write would clear the page here, since the response arrives after parsing finishes
document.body.insertAdjacentHTML('beforeend', `<h3>Image Analysis:</h3><p>${response}</p>`);
});
</script>
</body>
</html>
Example 6: Function Calling with LLMs
Enable LLMs to call functions in your application for dynamic interactions:
<html>
<body>
<div style="max-width: 600px; margin: 20px auto; font-family: Arial, sans-serif;">
<h1>Weather Assistant</h1>
<input type="text" id="userQuery" placeholder="Ask about the weather..."
style="width: 100%; padding: 10px; margin: 10px 0;">
<button onclick="askWeather()" style="padding: 10px 20px;">Ask</button>
<div id="response" style="margin-top: 20px; padding: 15px; background: #f8f9fa; border-radius: 5px;"></div>
</div>
<script src="https://js.puter.com/v2/"></script>
<script>
// Mock weather function
function getWeather(location) {
const weatherData = {
'Paris': '22°C, Partly Cloudy',
'London': '18°C, Rainy',
'New York': '25°C, Sunny',
'Tokyo': '28°C, Clear'
};
return JSON.stringify(weatherData[location] || '20°C, Unknown');
}
// Define tools available to the LLM
const tools = [{
type: "function",
function: {
name: "get_weather",
description: "Get current weather for a location",
parameters: {
type: "object",
properties: {
location: {
type: "string",
description: "City name (e.g., Paris, London)"
}
},
required: ["location"]
}
}
}];
async function askWeather() {
const userQuery = document.getElementById('userQuery').value;
const responseDiv = document.getElementById('response');
if (!userQuery) return;
responseDiv.innerHTML = 'Processing...';
try {
// First API call
const completion = await puter.ai.chat(userQuery, { tools });
// Check if LLM wants to call a function
if (completion.message.tool_calls?.length > 0) {
const toolCall = completion.message.tool_calls[0];
const args = JSON.parse(toolCall.function.arguments);
const weatherData = getWeather(args.location);
// Second API call with function result
const finalResponse = await puter.ai.chat([
{ role: "user", content: userQuery },
completion.message,
{
role: "tool",
tool_call_id: toolCall.id,
content: weatherData
}
]);
responseDiv.innerHTML = finalResponse;
} else {
responseDiv.innerHTML = completion;
}
} catch (error) {
responseDiv.innerHTML = `Error: ${error.message}`;
}
}
</script>
</body>
</html>
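The plumbing between the model's tool_calls and your local functions can be factored into a small dispatcher that turns each call into a tool-role message for the follow-up request. A sketch under the message shapes used in the example above (runTools is a hypothetical helper, not part of Puter.js):

```javascript
// Map each tool call from the LLM to a local function and build the
// tool-role messages for the second request.
// `registry` maps tool names to implementations.
function runTools(toolCalls, registry) {
  return toolCalls.map(call => {
    const fn = registry[call.function.name];
    const args = JSON.parse(call.function.arguments);
    return {
      role: 'tool',
      tool_call_id: call.id,
      content: fn ? fn(args) : 'Error: unknown tool'
    };
  });
}

// Demo with a get_weather tool like the one in the example above:
const registry = {
  get_weather: ({ location }) =>
    location === 'Paris' ? '22°C, Partly Cloudy' : '20°C, Unknown'
};
const messages = runTools(
  [{ id: 'call_1', function: { name: 'get_weather', arguments: '{"location":"Paris"}' } }],
  registry
);
console.log(messages[0].content); // "22°C, Partly Cloudy"
```

This keeps askWeather() focused on UI concerns and handles multiple tool calls in one pass instead of only the first.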
Available Models
Puter.js provides access to 500+ models from various providers including OpenAI, Anthropic, Google, Grok, and more. For a complete list of all available models, visit: https://puter.com/puterai/chat/models
Advanced Features
Temperature Control
Control the randomness and creativity of LLM responses by setting the temperature parameter. The default temperature is 0.7.
// Low temperature (0.2) - More focused and deterministic
puter.ai.chat("Explain photosynthesis", {
model: "gpt-5-nano",
temperature: 0.2,
max_tokens: 100
});
// High temperature (0.8) - More creative and varied
puter.ai.chat("Write a poem about coding", {
model: "claude-sonnet-4",
temperature: 0.8,
max_tokens: 200
});
Multi-turn Conversations
Maintain context across multiple exchanges by passing the conversation history as an array of messages.
let conversationHistory = [];
async function continueConversation(userMessage) {
conversationHistory.push({
role: "user",
content: userMessage
});
const response = await puter.ai.chat(conversationHistory, {
model: "gpt-5-nano"
});
conversationHistory.push({
role: "assistant",
content: response
});
return response;
}
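A conversation history like this grows with every exchange, and very long histories eventually exceed what is practical to send with each request. One simple policy is to keep only the most recent messages. A sketch (trimHistory is a hypothetical helper; the practical limit depends on the model's context window):

```javascript
// Keep only the last `maxMessages` entries of a conversation history.
function trimHistory(history, maxMessages) {
  return history.length > maxMessages
    ? history.slice(history.length - maxMessages)
    : history;
}

const history = [
  { role: 'user', content: 'Hi' },
  { role: 'assistant', content: 'Hello!' },
  { role: 'user', content: 'What is 2 + 2?' },
  { role: 'assistant', content: '4' }
];
console.log(trimHistory(history, 2)); // keeps only the last two messages
```

You would call this on conversationHistory before passing it to puter.ai.chat(). Note that trimming this way drops early context, such as a system-style first message, so keep any message that must always be present out of the trimmed portion.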
Other AI Features
Puter.js is not limited to LLMs! It also provides access to a wide range of AI features including:
- AI Chat
- AI Image Generation
- AI Text to Video
- AI Text to Speech
- AI Speech to Text
- AI Image to Text
- AI Speech to Speech
That's it! You now have free, unlimited access to hundreds of large language models through Puter.js. Build powerful AI applications without worrying about API keys, rate limits, or infrastructure costs. The future of AI development is serverless, and it's available to you right now.
Related Resources
Free, Serverless AI and Cloud