Free, Unlimited IBM Granite API
This tutorial will show you how to use Puter.js to access IBM's Granite AI models for free. Using Puter.js, you can leverage models like Granite 4.1 8B and Granite 4.0 Micro without any API keys or usage restrictions.
Puter is the pioneer of the "User-Pays" model, which allows developers to add AI capabilities to their applications while users cover their own usage costs. This model enables developers to access advanced AI features for free, without any API keys or server-side setup.
Getting Started
To use Puter.js, import our NPM library in your project:
// npm install @heyputer/puter.js
import { puter } from '@heyputer/puter.js';
Alternatively, if you are working directly with HTML, add our script via CDN to the <head> or <body> section of your page:
<script src="https://js.puter.com/v2/"></script>
You're now ready to use Puter.js to access IBM Granite capabilities. No API keys or sign-ups are required.
Example 1: Basic Chat with Granite 4.1 8B
Here's a simple example showing how to generate text using Granite 4.1 8B, IBM's dense, decoder-only enterprise model with a 131K-token context window:
<html>
<body>
<script src="https://js.puter.com/v2/"></script>
<script>
puter.ai.chat("Explain why dense transformers are easier to fine-tune than MoE models", {
model: 'ibm-granite/granite-4.1-8b'
}).then(response => {
puter.print(response);
});
</script>
</body>
</html>
Using the puter.ai.chat() function, you can generate text using Granite 4.1 8B, which is purpose-built for enterprise workloads like RAG, summarization, and classification.
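Since RAG is one of the workloads this model is built for, a common pattern is to prepend retrieved passages to the user's question before calling `puter.ai.chat()`. The sketch below shows one way to assemble such a prompt; the exact prompt format and the `buildRagPrompt` helper are illustrative, not a Granite requirement:

```javascript
// Assemble a RAG-style prompt by prepending retrieved passages to the
// user's question. The numbered-context format here is one common
// convention, not something Granite mandates.
function buildRagPrompt(passages, question) {
  const context = passages
    .map((p, i) => `[${i + 1}] ${p}`)
    .join('\n');
  return `Answer using only the context below.\n\n` +
         `Context:\n${context}\n\n` +
         `Question: ${question}`;
}

// The resulting string can be passed straight to puter.ai.chat():
// puter.ai.chat(buildRagPrompt(retrievedDocs, "What is our refund policy?"),
//               { model: 'ibm-granite/granite-4.1-8b' });
```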
Example 2: Low-Latency Inference with Granite 4.0 Micro
Granite 4.0 Micro is a compact 3B-parameter model optimized for low-latency, cost-efficient workloads. Despite its small size, it outperforms its predecessor Granite 3.3 8B and is a strong fit for agentic sub-tasks and API orchestration:
<html>
<body>
<script src="https://js.puter.com/v2/"></script>
<script>
puter.ai.chat("Summarize the key differences between supervised and unsupervised learning", {
model: 'ibm-granite/granite-4.0-h-micro'
}).then(response => {
puter.print(response);
});
</script>
</body>
</html>
Example 3: Tool Calling
Granite models implement OpenAI-compatible tool calling, making them well-suited for agentic workflows. Here's how to let the model call an external function:
<html>
<body>
<script src="https://js.puter.com/v2/"></script>
<script>
// Mock weather function
function getWeather(location) {
return location + ': 22°C, Sunny';
}
// Define the tool
const tools = [{
type: "function",
function: {
name: "get_weather",
description: "Get current weather for a location",
parameters: {
type: "object",
properties: {
location: { type: "string", description: "City name" }
},
required: ["location"]
}
}
}];
(async () => {
const question = "What's the weather in San Francisco?";
puter.print("Question: " + question + "<br/>");
// Call AI with tools
const response = await puter.ai.chat(question, { tools, model: "ibm-granite/granite-4.1-8b" });
// Check if AI wants to call a function
if (response.message.tool_calls?.length > 0) {
const toolCall = response.message.tool_calls[0];
const args = JSON.parse(toolCall.function.arguments);
const weatherData = getWeather(args.location);
// Send result back to AI
const finalResponse = await puter.ai.chat([
{ role: "user", content: question },
response.message,
{ role: "tool", tool_call_id: toolCall.id, content: weatherData }
], { model: "ibm-granite/granite-4.1-8b" });
puter.print("Answer: " + finalResponse);
} else {
puter.print("Answer: " + response);
}
})();
</script>
</body>
</html>
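As an application grows beyond a single tool, the dispatch step above is often factored out into a lookup table that maps each tool name to a handler. This is a minimal sketch of that pattern; `runToolCall` and the handler table are illustrative helpers, while the `toolCall` shape follows the OpenAI-compatible `tool_calls` structure shown above:

```javascript
// Generic dispatcher for OpenAI-compatible tool calls. Looks up the
// handler by the function name the model chose, parses the JSON
// arguments string, and invokes the handler with the parsed object.
function runToolCall(toolCall, handlers) {
  const handler = handlers[toolCall.function.name];
  if (!handler) {
    throw new Error('Unknown tool: ' + toolCall.function.name);
  }
  const args = JSON.parse(toolCall.function.arguments);
  return handler(args);
}

// Example usage with the get_weather mock from above:
const handlers = {
  get_weather: (args) => args.location + ': 22°C, Sunny'
};
```

With this in place, the `if (response.message.tool_calls?.length > 0)` branch reduces to a single `runToolCall(toolCall, handlers)` call, and adding a new tool only requires a new entry in the table.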
Example 4: Multilingual Generation
Granite 4.1 8B natively supports 12 languages including English, German, Spanish, French, Japanese, and Chinese. Here's an example summarizing content in German:
<html>
<body>
<script src="https://js.puter.com/v2/"></script>
<script>
puter.ai.chat(
"Fasse die Vorteile von erneuerbaren Energien auf Deutsch in fünf Stichpunkten zusammen.",
{ model: 'ibm-granite/granite-4.1-8b' }
).then(response => {
puter.print(response);
});
</script>
</body>
</html>
Example 5: Streaming Response
For longer responses, use streaming to get results in real-time:
<html>
<body>
<div id="output"></div>
<script src="https://js.puter.com/v2/"></script>
<script>
async function streamResponse() {
const outputDiv = document.getElementById('output');
const response = await puter.ai.chat(
"Draft an internal RFC for migrating a legacy auth service to OAuth 2.1",
{
model: 'ibm-granite/granite-4.1-8b',
stream: true
}
);
for await (const part of response) {
if (part?.text) {
outputDiv.innerHTML += part.text.replaceAll('\n', '<br>');
}
}
}
streamResponse();
</script>
</body>
</html>
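One caveat with the streaming example: the chunks are appended with `innerHTML`, so any markup in the model's output would be interpreted as HTML. Escaping each chunk before inserting it is a safer pattern; the `escapeHtml` helper below is a minimal sketch of that idea:

```javascript
// Escape the characters that are significant in HTML so streamed model
// output is rendered as plain text rather than interpreted as markup.
function escapeHtml(text) {
  return text
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;');
}

// Inside the streaming loop, escape first, then convert newlines:
// outputDiv.innerHTML += escapeHtml(part.text).replaceAll('\n', '<br>');
```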
List of Supported IBM Granite Models
The following IBM Granite models are supported by Puter.js:
ibm-granite/granite-4.0-h-micro
ibm-granite/granite-4.1-8b
Conclusion
Using Puter.js, you can access IBM Granite models without setting up or hosting any AI infrastructure yourself. And thanks to the User-Pays model, your users cover their own AI usage, not you as the developer. This means you can build powerful, enterprise-ready applications with tool calling, RAG, and multilingual generation without worrying about AI usage costs.
You can find all AI features supported by Puter.js in the documentation.