Grok 2 Vision API

Access Grok 2 Vision from xAI using the Puter.js AI API.


Model Card

Grok 2 Vision is a multimodal AI model that combines text and visual understanding capabilities, excelling at object recognition, visual math reasoning (MathVista), and document-based question answering (DocVQA). It supports image analysis with a 32K context window.

Context Window: 8K tokens

Max Output: 8,192 tokens

Input Cost: $2 per million tokens

Output Cost: $10 per million tokens
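As a rough illustration of how the listed prices translate into a per-request cost, the sketch below multiplies token counts by the per-million rates shown above. The helper name `estimateCostUSD` is hypothetical and not part of Puter.js. How image inputs are counted toward tokens is not stated on this page, so the token counts here are assumed to be known already.

```javascript
// Prices as listed on this page, in USD per million tokens.
const INPUT_USD_PER_MILLION = 2;
const OUTPUT_USD_PER_MILLION = 10;

// Hypothetical helper (not a Puter.js function): estimate the cost of one
// request from its input and output token counts.
function estimateCostUSD(inputTokens, outputTokens) {
    const totalMicroUSD =
        inputTokens * INPUT_USD_PER_MILLION +
        outputTokens * OUTPUT_USD_PER_MILLION;
    return totalMicroUSD / 1_000_000;
}

// Example: 6,000 input tokens and 1,000 output tokens.
// (6000 * 2 + 1000 * 10) / 1,000,000 = 0.022 USD
console.log(estimateCostUSD(6000, 1000));
```

Because output tokens cost five times as much as input tokens at these rates, long responses tend to dominate the estimate.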

API Usage Example

Add Grok 2 Vision to your app with just a few lines of code.
No API keys, no backend, no configuration required.

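<html>
<body>
    <!-- Load Puter.js; it exposes the global `puter` object -->
    <script src="https://js.puter.com/v2/"></script>
    <script>
        // Send a text prompt to Grok 2 Vision and show the reply on the page
        puter.ai.chat("Explain quantum computing in simple terms", {
            model: "x-ai/grok-2-vision"
        }).then(response => {
            // textContent renders the reply as plain text instead of parsing it as HTML
            document.body.textContent = response.message.content;
        }).catch(error => {
            console.error("Grok 2 Vision request failed:", error);
        });
    </script>
</body>
</html>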
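Since Grok 2 Vision is described above as a model for image analysis, a sketch of an image prompt may be more representative than a text-only one. It assumes that `puter.ai.chat` accepts an image URL as its second argument, followed by the options object, and that the reply is available at `response.message.content` as in the example above. The helper name `describeImage` and the sample URL are hypothetical.

```javascript
// Hypothetical helper (not part of Puter.js): ask the model a question about an image.
// `client` is expected to be the global `puter` object loaded from https://js.puter.com/v2/.
// Assumption: puter.ai.chat(prompt, imageUrl, options) is a supported call form.
async function describeImage(client, imageUrl, question = "What do you see in this image?") {
    const response = await client.ai.chat(question, imageUrl, {
        model: "x-ai/grok-2-vision"
    });
    return response.message.content;
}

// Runs only in a page where the Puter.js script tag has already been loaded.
if (typeof puter !== "undefined") {
    // Placeholder URL; replace it with a publicly reachable image.
    describeImage(puter, "https://example.com/photo.jpg")
        .then(text => { document.body.textContent = text; })
        .catch(error => console.error("Image request failed:", error));
}
```

Passing the client in as a parameter keeps the helper easy to exercise with a stub object outside the browser.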

