The ollama npm package is the official JavaScript/TypeScript client for integrating with Ollama servers, enabling developers to run large language models locally without cloud dependencies. It provides a straightforward API for chat completions, text generation, and model lifecycle management (pulling, creating, and pushing models). With over 569,000 weekly downloads, it has become the de facto standard for JavaScript developers building AI-powered applications that prioritize data privacy and offline capability.
The library mirrors Ollama's REST API with idiomatic JavaScript patterns, offering async/await support and streaming capabilities through AsyncIterables. It handles everything from simple text completions to advanced features like multi-modal inputs (vision models), structured JSON output with schema validation, and custom model creation via Modelfiles. The package works seamlessly in both Node.js and browser environments when properly configured.
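The structured-output feature mentioned above works by passing a JSON schema as the chat request's `format` field, which the server uses to constrain generation. A minimal sketch of the request shape, assuming an illustrative schema and prompt (the actual call still goes through `ollama.chat` against a running server):

```typescript
// Sketch of a structured-output request. The `format` field accepts a JSON
// schema object; the schema and prompt below are illustrative examples,
// not part of the library itself.
const citySchema = {
  type: 'object',
  properties: {
    name: { type: 'string' },
    population: { type: 'number' }
  },
  required: ['name', 'population']
};

function buildStructuredRequest(prompt: string) {
  return {
    model: 'llama3.1',
    messages: [{ role: 'user', content: prompt }],
    format: citySchema
  };
}

// Usage (requires a running Ollama server):
//   const res = await ollama.chat(buildStructuredRequest('Describe Tokyo as JSON'));
//   const data = JSON.parse(res.message.content);
```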
Developers choose ollama when they need full control over their AI infrastructure—models run on local hardware, no data leaves the system, and there are no per-token API costs. It's particularly valuable for enterprise applications with strict compliance requirements, edge computing scenarios, or development workflows where internet connectivity is unreliable. The library maintains feature parity with Ollama's server capabilities, automatically exposing new functionality like tool-calling and advanced quantization as the ecosystem evolves.
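The tool-calling noted above uses OpenAI-style function schemas: tools are declared on the chat request, and the model replies with `tool_calls` entries your code dispatches on. A hedged sketch of the request shape, where the `getWeather` tool is a made-up example, not part of the library:

```typescript
// Illustrative tool declaration in the OpenAI-style shape that ollama.chat
// accepts via its `tools` option. The getWeather tool is hypothetical.
const tools = [{
  type: 'function',
  function: {
    name: 'getWeather',
    description: 'Get the current weather for a city',
    parameters: {
      type: 'object',
      properties: { city: { type: 'string' } },
      required: ['city']
    }
  }
}];

function buildToolRequest(userText: string) {
  return {
    model: 'llama3.1',
    messages: [{ role: 'user', content: userText }],
    tools
  };
}

// Usage (requires a tool-capable model and a running server):
//   const res = await ollama.chat(buildToolRequest('Weather in Oslo?'));
//   for (const call of res.message.tool_calls ?? []) { /* dispatch on call.function.name */ }
```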
import { Ollama } from 'ollama';
import { readFileSync } from 'node:fs';

// Initialize client (defaults to http://127.0.0.1:11434)
const ollama = new Ollama({ host: 'http://localhost:11434' });

// Non-streaming chat completion
async function simpleChat() {
  const response = await ollama.chat({
    model: 'llama3.1',
    messages: [
      { role: 'system', content: 'You are a helpful coding assistant.' },
      { role: 'user', content: 'Explain JavaScript promises in one sentence.' }
    ]
  });
  console.log(response.message.content);
}

// Streaming response with real-time output
async function streamingChat() {
  const stream = await ollama.chat({
    model: 'llama3.1',
    messages: [{ role: 'user', content: 'Write a haiku about TypeScript' }],
    stream: true
  });
  for await (const chunk of stream) {
    process.stdout.write(chunk.message.content);
  }
  console.log('\n');
}

// Vision model with image input
async function analyzeImage() {
  const base64Image = readFileSync('./screenshot.png').toString('base64');
  const response = await ollama.chat({
    model: 'llava',
    messages: [{
      role: 'user',
      content: 'Describe this image in detail',
      images: [base64Image]
    }]
  });
  console.log(response.message.content);
}

// Pull model with progress tracking
async function downloadModel() {
  const stream = await ollama.pull({ model: 'mistral', stream: true });
  for await (const progress of stream) {
    if (progress.status) {
      console.log(`${progress.status}: ${progress.completed || 0}/${progress.total || 0}`);
    }
  }
}
streamingChat().catch(console.error);

Local chatbot interfaces: Build conversational UIs in Electron or web apps where user data never leaves the device. Stream responses token-by-token for real-time feedback, using models like Llama 3.1 or Mistral running entirely on user hardware.
Document analysis pipelines: Process sensitive documents (legal, medical, financial) through local LLMs for summarization, extraction, or classification without third-party API calls. Combine with RAG patterns to query internal knowledge bases while maintaining HIPAA/GDPR compliance.
Development tooling and IDE extensions: Create VS Code extensions or CLI tools that provide code completion, documentation generation, or test case creation using local models. Developers get AI assistance without sending proprietary code to external services.
Multi-modal content processing: Analyze images with vision models (LLaVA) for tasks like automated alt-text generation, product catalog classification, or visual QA systems. Pass base64-encoded images directly to chat methods alongside text prompts.
Prototyping and experimentation: Quickly test different LLM architectures and fine-tuned models by pulling from Ollama's registry. Create custom models with specific system prompts or parameter tweaks via Modelfiles, ideal for research or proof-of-concept work before cloud deployment.
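The Modelfile workflow in the last use case amounts to composing a Modelfile string and handing it to the client's create API. The `FROM` / `SYSTEM` / `PARAMETER` directives below are standard Modelfile syntax, but note that `create()`'s exact parameters have varied across client versions (older releases accepted a `modelfile` string, newer ones take `from`/`system` fields directly), so treat this as a sketch under those assumptions; the model name and prompt are illustrative:

```typescript
// Build a Modelfile that pins a base model, a system prompt, and a sampling
// parameter. FROM / SYSTEM / PARAMETER are standard Modelfile directives.
function buildModelfile(base: string, system: string, temperature: number): string {
  return [
    `FROM ${base}`,
    `SYSTEM """${system}"""`,
    `PARAMETER temperature ${temperature}`
  ].join('\n');
}

const modelfile = buildModelfile('llama3.1', 'You are a terse code reviewer.', 0.2);

// Usage (assumes an older client that accepts a modelfile string; newer
// versions express the same thing as separate { model, from, system } fields):
//   await ollama.create({ model: 'code-reviewer', modelfile });
```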
npm install ollama
pnpm add ollama
bun add ollama