Free, Unlimited Z.AI GLM API

This tutorial will show you how to use Puter.js to access Z.AI GLM models, including GLM 5 Turbo, GLM 5, GLM 4.7 Flash, GLM 4.7, GLM 4.6V, and other Z.AI models completely free, without any API keys or usage restrictions.

Puter pioneered the "User-Pays" model, which lets developers incorporate AI capabilities into their applications while each user covers their own usage costs. As a result, developers get access to advanced AI capabilities for free, with no API keys and no server-side setup.

Getting Started

To use Puter.js, import our NPM library in your project:

// npm install @heyputer/puter.js
import { puter } from '@heyputer/puter.js';

Alternatively, if you are working directly with HTML, add our script via CDN to the <head> or <body> section of your page:

<script src="https://js.puter.com/v2/"></script>

Nothing else is required to start using Puter.js for free access to Z.AI GLM models and capabilities.

Example 1: Use GLM 5 for conversational AI

To generate text using GLM 5, use the puter.ai.chat() function:

puter.ai.chat("Explain the concept of quantum computing in simple terms", { model: "z-ai/glm-5" })
.then(response => {
    puter.print(response);
});

Full code example:

<html>
<body>
    <script src="https://js.puter.com/v2/"></script>
    <script>
        puter.ai.chat("Explain the concept of quantum computing in simple terms", { model: "z-ai/glm-5" })
            .then(response => {
                puter.print(response);
            });
    </script>
</body>
</html>

Example 2: High-speed inference with GLM 5 Turbo

GLM 5 Turbo is a high-speed variant of GLM 5, optimized for fast inference and agent-driven workflows. It excels at tool invocation, decomposing complex instructions, and executing long-chain tasks:

puter.ai.chat(
    "Break down the steps to build a REST API with authentication and rate limiting",
    { model: "z-ai/glm-5-turbo" }
)
.then(response => {
    puter.print(response);
});

Full code example:

<html>
<body>
    <script src="https://js.puter.com/v2/"></script>
    <script>
        puter.ai.chat(
            "Break down the steps to build a REST API with authentication and rate limiting",
            { model: "z-ai/glm-5-turbo" }
        )
        .then(response => {
            puter.print(response);
        });
    </script>
</body>
</html>

Example 3: Cost-efficient chat with GLM 4.7 Flash

For cost-efficient conversational AI, use GLM 4.7 Flash with the puter.ai.chat() function:

puter.ai.chat(
    "What are the key differences between machine learning and deep learning?",
    { model: "z-ai/glm-4.7-flash" }
)
.then(response => {
    puter.print(response);
});

Full code example:

<html>
<body>
    <script src="https://js.puter.com/v2/"></script>
    <script>
        puter.ai.chat(
            "What are the key differences between machine learning and deep learning?",
            { model: "z-ai/glm-4.7-flash" }
        )
        .then(response => {
            puter.print(response);
        });
    </script>
</body>
</html>

Example 4: Code generation with GLM 5

GLM 5 excels at code generation tasks. Here's how to use it for writing code:

<html>
<body>
    <script src="https://js.puter.com/v2/"></script>
    <script>
        puter.ai.chat(
            "Write a Python function that implements binary search on a sorted array",
            { model: "z-ai/glm-5" }
        )
        .then(response => {
            puter.print(response, {code: true});
        });
    </script>
</body>
</html>

Example 5: Chinese language support with GLM 5

GLM 5 has excellent Chinese language support, making it ideal for multilingual applications:

<html>
<body>
    <script src="https://js.puter.com/v2/"></script>
    <script>
        puter.ai.chat(
            "请解释一下人工智能在医疗领域的应用前景",
            { model: "z-ai/glm-5" }
        )
        .then(response => {
            puter.print(response);
        });
    </script>
</body>
</html>

This example demonstrates GLM 5's ability to understand and respond in Chinese, making it perfect for applications targeting Chinese-speaking users or requiring multilingual support.

Example 6: Stream responses for longer queries

For longer responses, use streaming to get results in real-time:

async function streamResponse() {
    const response = await puter.ai.chat(
        "Explain the history and evolution of artificial intelligence in detail",
        { model: "z-ai/glm-5", stream: true }
    );

    for await (const part of response) {
        if (part?.reasoning) puter.print(part?.reasoning);
        else puter.print(part?.text);
    }
}

streamResponse();

Full code example:

<html>
<body>
    <script src="https://js.puter.com/v2/"></script>
    <script>
        async function streamResponse() {
            const response = await puter.ai.chat(
                "Explain the history and evolution of artificial intelligence in detail",
                { model: "z-ai/glm-5", stream: true }
            );

            for await (const part of response) {
                if (part?.reasoning) puter.print(part?.reasoning);
                else puter.print(part?.text);
            }
        }

        streamResponse();
    </script>
</body>
</html>

Example 7: Image Analysis with GLM 4.6V

To analyze images, simply provide an image URL to puter.ai.chat() using the vision model GLM 4.6V:

<html>
<body>
    <script src="https://js.puter.com/v2/"></script>
    <script>
        puter.ai.chat(
            "What do you see in this image?",
            "https://assets.puter.site/doge.jpeg",
            { model: 'z-ai/glm-4.6v' }
        ).then(response => {
            document.write(response);
        });
    </script>
</body>
</html>

List of supported models

The following Z.AI GLM models are supported by Puter.js:

z-ai/glm-5-turbo
z-ai/glm-5
z-ai/glm-4.7-flash
z-ai/glm-4.6v
z-ai/glm-4.7
z-ai/glm-4.6
z-ai/glm-4.5v
z-ai/glm-4.5
z-ai/glm-4.5-air
z-ai/glm-4.5-air:free
z-ai/glm-4-32b
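
Any of these ids can be passed as the `model` option to `puter.ai.chat()`. If your app lets users choose a model, it can help to validate the choice against this list before sending a request. The sketch below is our own convenience helper, not part of Puter.js; the `pickModel` name and the fallback choice are assumptions:

```javascript
// Supported Z.AI GLM model ids, copied from the list above.
const GLM_MODELS = [
    "z-ai/glm-5-turbo", "z-ai/glm-5",
    "z-ai/glm-4.7-flash", "z-ai/glm-4.7",
    "z-ai/glm-4.6v", "z-ai/glm-4.6",
    "z-ai/glm-4.5v", "z-ai/glm-4.5",
    "z-ai/glm-4.5-air", "z-ai/glm-4.5-air:free",
    "z-ai/glm-4-32b",
];

// Return the requested model id if it is supported, otherwise a default.
function pickModel(requested, fallback = "z-ai/glm-5") {
    return GLM_MODELS.includes(requested) ? requested : fallback;
}

// Usage: puter.ai.chat(prompt, { model: pickModel(userChoice) })
```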

Conclusion

Using Puter.js, you can access Z.AI GLM models without an API key or a backend. And thanks to the User-Pays model, each user covers their own AI usage costs, not you as the developer. This means you can build powerful applications without worrying about AI bills.

You can find all AI features supported by Puter.js in the documentation.
