Free, Unlimited Z.AI GLM API

This tutorial will show you how to use Puter.js to access Z.AI GLM models, including GLM 5 Turbo, GLM 5, GLM 4.7 Flash, GLM 4.7, GLM 4.6V, and other Z.AI models completely free, without any API keys or usage restrictions.

Puter pioneered the "User-Pays" model, which lets developers incorporate AI capabilities into their applications while each user covers their own usage costs. As a result, developers get access to advanced AI capabilities for free, with no API keys and no server-side setup.

Getting Started

To use Puter.js, import our NPM library in your project:

// npm install @heyputer/puter.js
import { puter } from '@heyputer/puter.js';

Alternatively, if you are working directly with HTML, add our script via CDN to the <head> or <body> section of your page:

<script src="https://js.puter.com/v2/"></script>

Nothing else is required to start using Puter.js for free access to Z.AI GLM models and capabilities.

Example 1: Use GLM 5 for conversational AI

To generate text using GLM 5, use the puter.ai.chat() function:

puter.ai.chat("Explain the concept of quantum computing in simple terms", { model: "z-ai/glm-5" })
.then(response => {
    puter.print(response);
});

Full code example:

<html>
<body>
    <script src="https://js.puter.com/v2/"></script>
    <script>
        puter.ai.chat("Explain the concept of quantum computing in simple terms", { model: "z-ai/glm-5" })
            .then(response => {
                puter.print(response);
            });
    </script>
</body>
</html>

Example 2: High-speed inference with GLM 5 Turbo

GLM 5 Turbo is a high-speed variant of GLM 5, optimized for fast inference and agent-driven workflows. It excels at tool invocation, decomposing complex instructions, and executing long-chain tasks:

puter.ai.chat(
    "Break down the steps to build a REST API with authentication and rate limiting",
    { model: "z-ai/glm-5-turbo" }
)
.then(response => {
    puter.print(response);
});

Full code example:

<html>
<body>
    <script src="https://js.puter.com/v2/"></script>
    <script>
        puter.ai.chat(
            "Break down the steps to build a REST API with authentication and rate limiting",
            { model: "z-ai/glm-5-turbo" }
        )
        .then(response => {
            puter.print(response);
        });
    </script>
</body>
</html>

Example 3: Cost-efficient chat with GLM 4.7 Flash

For cost-efficient conversational AI, use GLM 4.7 Flash with the puter.ai.chat() function:

puter.ai.chat(
    "What are the key differences between machine learning and deep learning?",
    { model: "z-ai/glm-4.7-flash" }
)
.then(response => {
    puter.print(response);
});

Full code example:

<html>
<body>
    <script src="https://js.puter.com/v2/"></script>
    <script>
        puter.ai.chat(
            "What are the key differences between machine learning and deep learning?",
            { model: "z-ai/glm-4.7-flash" }
        )
        .then(response => {
            puter.print(response);
        });
    </script>
</body>
</html>

Example 4: Code generation with GLM 5

GLM 5 excels at code generation tasks. Here's how to use it for writing code:

<html>
<body>
    <script src="https://js.puter.com/v2/"></script>
    <script>
        puter.ai.chat(
            "Write a Python function that implements binary search on a sorted array",
            { model: "z-ai/glm-5" }
        )
        .then(response => {
            puter.print(response, {code: true});
        });
    </script>
</body>
</html>

Example 5: Chinese language support with GLM 5

GLM 5 has excellent Chinese language support, making it ideal for multilingual applications:

<html>
<body>
    <script src="https://js.puter.com/v2/"></script>
    <script>
        puter.ai.chat(
            "请解释一下人工智能在医疗领域的应用前景",
            { model: "z-ai/glm-5" }
        )
        .then(response => {
            puter.print(response);
        });
    </script>
</body>
</html>

This example demonstrates GLM 5's ability to understand and respond in Chinese, making it perfect for applications targeting Chinese-speaking users or requiring multilingual support.

Example 6: Stream responses for longer queries

For longer responses, use streaming to get results in real-time:

async function streamResponse() {
    const response = await puter.ai.chat(
        "Explain the history and evolution of artificial intelligence in detail",
        { model: "z-ai/glm-5", stream: true }
    );

    for await (const part of response) {
        if (part?.reasoning) puter.print(part?.reasoning);
        else puter.print(part?.text);
    }
}

streamResponse();

Full code example:

<html>
<body>
    <script src="https://js.puter.com/v2/"></script>
    <script>
        async function streamResponse() {
            const response = await puter.ai.chat(
                "Explain the history and evolution of artificial intelligence in detail",
                { model: "z-ai/glm-5", stream: true }
            );

            for await (const part of response) {
                if (part?.reasoning) puter.print(part?.reasoning);
                else puter.print(part?.text);
            }
        }

        streamResponse();
    </script>
</body>
</html>

Example 7: Image Analysis with GLM 4.6V

To analyze images, simply provide an image URL to puter.ai.chat() using the vision model GLM 4.6V:

<html>
<body>
    <script src="https://js.puter.com/v2/"></script>
    <script>
        puter.ai.chat(
            "What do you see in this image?",
            "https://assets.puter.site/doge.jpeg",
            { model: 'z-ai/glm-4.6v' }
        ).then(response => {
            document.write(response);
        });
    </script>
</body>
</html>

List of supported models

The following Z.AI GLM models are supported by Puter.js:

z-ai/glm-5-turbo
z-ai/glm-5
z-ai/glm-4.7-flash
z-ai/glm-4.6v
z-ai/glm-4.7
z-ai/glm-4.6
z-ai/glm-4.5v
z-ai/glm-4.5
z-ai/glm-4.5-air
z-ai/glm-4.5-air:free
z-ai/glm-4-32b
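
Any of these ids can be passed as the `model` option to `puter.ai.chat()`. If your app lets users choose a model, it can help to validate the choice against this list before sending a request. The sketch below is our own convenience helper, not part of Puter.js; the `pickModel` name and the fallback choice are assumptions:

```javascript
// Supported Z.AI GLM model ids, copied from the list above.
const GLM_MODELS = [
    "z-ai/glm-5-turbo", "z-ai/glm-5",
    "z-ai/glm-4.7-flash", "z-ai/glm-4.7",
    "z-ai/glm-4.6v", "z-ai/glm-4.6",
    "z-ai/glm-4.5v", "z-ai/glm-4.5",
    "z-ai/glm-4.5-air", "z-ai/glm-4.5-air:free",
    "z-ai/glm-4-32b",
];

// Return the requested model id if it is supported, otherwise a default.
function pickModel(requested, fallback = "z-ai/glm-5") {
    return GLM_MODELS.includes(requested) ? requested : fallback;
}

// Usage: puter.ai.chat(prompt, { model: pickModel(userChoice) })
```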

Conclusion

Using Puter.js, you can access Z.AI GLM models without an API key or a backend. And thanks to the User-Pays model, each user covers their own AI usage costs, not you as the developer. This means you can build powerful applications without worrying about AI bills.

You can find all AI features supported by Puter.js in the documentation.
