AI Integration: A Step-by-Step Guide
The online AI Notepad features an AI assistance tool that enables you to generate, refine, and enhance text using a diverse range of AI models, both cloud-based and local.
Whether you're looking to spark new ideas, fine-tune existing content, or distil complex information, these advanced AI capabilities help you work more efficiently and creatively directly in your browser.
How to Use the AI Feature
Although each tool has unique AI use cases, the overall usage and flow are consistent across tools.
Step 1: Locate and click on the settings icon at the bottom left of your browser window. This will open an AI dialogue box where you can select and configure your AI model, API keys, and preferences.
Step 2: Click on the Model dropdown menu to reveal the list of available options.
This list is divided into four categories of AI models:
- Cloud models (OpenAI)
- Local models (Browser)
- Local models (Ollama/LM Studio)
- OpenRouter Models
Step 3: Select your desired model and use it accordingly. You may require additional configurations depending on your model choice; more information on this is provided in a later section.
Essential Considerations: What You Need to Know Upfront
Integrating AI requires awareness of certain constraints and trade-offs. Below are key factors to keep in mind:
Hardware Requirements
- Cloud Models: No local hardware constraints, just stable internet.
- Browser-Based Local Models: These run entirely in your browser’s sandbox. Models under ~7B parameters typically require 16 GB RAM (min), though 32 GB is recommended for smoother switching and multitasking.
- Ollama/LM Studio Local Frameworks: To deploy larger models (e.g., 7B+), a dedicated GPU with at least 8 GB VRAM is recommended. For models like the DeepSeek 14B or 67B, consider using 16 GB–32 GB of VRAM to avoid performance bottlenecks.
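The figures above can be sanity-checked with simple arithmetic: a model's weight footprint is roughly its parameter count times bytes per parameter, plus overhead for the KV cache and activations. A rough sketch (the 20% overhead factor here is a rule of thumb, not a measured value):

```python
def estimate_vram_gb(params_billion: float, bytes_per_param: float = 2.0,
                     overhead: float = 1.2) -> float:
    """Rough VRAM estimate: weights at the given precision plus ~20% overhead
    for the KV cache and activations (the overhead factor is a rule of thumb)."""
    weights_gb = params_billion * bytes_per_param  # 1B params * 1 byte = ~1 GB
    return round(weights_gb * overhead, 1)

# A 7B model in FP16 (2 bytes/param) needs roughly 16.8 GB of VRAM,
# while a 4-bit quantization (~0.5 bytes/param) fits in about 4.2 GB.
print(estimate_vram_gb(7))        # FP16
print(estimate_vram_gb(7, 0.5))   # 4-bit quantized
```

This is why quantized 7B models run comfortably on 8 GB GPUs while full-precision ones do not.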
Internet Connectivity
Cloud-based AI models (e.g., via OpenRouter, or direct API access) require an active internet connection, while local models can function offline once downloaded and configured.
AI Credits and Costs
Cloud-based services often operate on a pay-per-use model or require AI credits. Local models incur no recurring API costs after initial setup, making them cost-effective for extensive use.
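To weigh pay-per-use pricing against a one-time local setup, a quick token-cost estimate helps. A sketch with placeholder prices (check your provider's current rates; the figures below are illustrative, not quoted):

```python
def monthly_api_cost(requests_per_day: int, tokens_per_request: int,
                     usd_per_million_tokens: float) -> float:
    """Estimate monthly spend for a pay-per-use cloud model."""
    tokens_per_month = requests_per_day * tokens_per_request * 30
    return round(tokens_per_month / 1_000_000 * usd_per_million_tokens, 2)

# e.g. 50 requests/day at 2,000 tokens each, at a hypothetical $0.60/1M tokens
print(monthly_api_cost(50, 2000, 0.60))
```

If that monthly figure exceeds what occasional GPU usage costs you, a local model pays for itself quickly.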
Privacy Implications
Local AI processing ensures your data never leaves your device. Cloud services handle data according to their respective privacy policies, which vary by provider.
Language Support
While the UI supports multiple languages, AI features have been primarily tested in English; results in other languages may vary in accuracy and quality.
Selecting Your AI Model: Matching Power to Purpose
With a spectrum of AI models available, the key is understanding their strengths to choose the best co-pilot for your specific writing needs. This section breaks down the types of models and their ideal use cases.
Cloud-Based Models
These models are hosted on OpenAI servers, offering immense computational power and access to the latest advancements without taxing your local hardware.
Ideal For: Complex creative writing, in-depth analysis, comprehensive summarization of large texts, general knowledge queries, when internet access is reliable.
Options & Capabilities:
- GPT-4o: The 'omni' model, highly versatile for creative generation, nuanced understanding, and conversational interactions. Excellent for sophisticated content creation and complex reasoning tasks.
- GPT-4 / GPT-4 Turbo: Powerful, robust, and capable of complex reasoning and long-form content. Ideal for detailed articles, academic drafting, and refined professional documents.
- GPT-3.5 Turbo / GPT-4o mini: Fast and cost-effective, excellent for quick summaries, rapid brainstorming, generating short paragraphs, and light content refinement.
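All of the cloud models above are reached through OpenAI's Chat Completions endpoint once your API key is entered. As a rough illustration of the request involved (the helper names here are ours, not the notepad's):

```python
import json
import os
import urllib.request

API_URL = "https://api.openai.com/v1/chat/completions"

def build_payload(model: str, prompt: str) -> dict:
    """Chat Completions request body: a model name plus a message list."""
    return {"model": model,
            "messages": [{"role": "user", "content": prompt}]}

def complete(model: str, prompt: str) -> str:
    """Send the request; expects OPENAI_API_KEY in the environment."""
    req = urllib.request.Request(
        API_URL,
        data=json.dumps(build_payload(model, prompt)).encode(),
        headers={"Authorization": f"Bearer {os.environ['OPENAI_API_KEY']}",
                 "Content-Type": "application/json"})
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]

# Usage (requires a valid key and network access):
#   print(complete("gpt-4o-mini", "Summarize this note in one sentence."))
```

Switching between GPT-4o, GPT-4 Turbo, and GPT-4o mini is just a change of the `model` string.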
Browser-Based Local Models
Smaller AI models that run directly within your web browser. Once downloaded, they offer immediate, private processing without needing an internet connection.
Ideal For: Sensitive personal notes, quick offline brainstorming, simple text expansion, basic rephrasing, when privacy is paramount or internet is unavailable.
Options & Capabilities:
- Higher-Quality Tier: Phi-3.5 Mini (3.5B) and Mistral 7B (v0.3): These models deliver richer contextual understanding and can reason through prompts more thoroughly, producing output comparable to smaller server-based models.
- Entry-Level Tier: TinyLlama 1.1B, Llama 3.2 (1B/3B), and Gemma 2 (2B): These lighter models are high-speed and highly memory-efficient, making them ideal for basic text editing, short paraphrasing tasks, and rapid prototyping. However, their output may be noticeably less fluent or detailed than that of Phi-3.5 Mini and Mistral 7B.
Setup Note: Typically selected and downloaded directly within the AI-powered Online Notepad's interface. No external software required.
Dedicated Local AI Frameworks (Ollama/LM Studio)
These are desktop applications that allow you to run larger, more powerful open-source AI models directly on your computer. They offer the highest level of privacy and customization.
Ideal For: Running larger Llama, DeepSeek, Qwen, or Mixtral models; highly sensitive documents; extensive offline work; users with powerful local hardware (especially GPUs); and experimenting with specific open-source models. Popular free options include:
- Llama 3.3 70B Instruct (free)
- Llama 4 “Maverick” 17B MoE (free)
- Mistral Small 3 (24B, free)
- DeepSeek V3 0324 (free)
- Qwen3-30B-A3B (free)
Options & Capabilities:
- Higher-Quality Tier: Llama 3.3 70B Instruct, Llama 4 “Maverick” 17B MoE, Llama 3.1 8B, Mistral 7B, and Hermes-2-Pro 13B: These models offer richer context understanding, more coherent long-form outputs, and stronger reasoning capabilities, on par with mid-sized cloud/remote models, provided your GPU or Apple M-series chip has sufficient memory.
- Mid-Range Tier: Llama 3.1 4B, DeepSeek V3 0324, Mistral 3B, and Gemma 3 (smaller variants): These deliver solid performance for complex prompts and multi-turn tasks, but may experience slowdowns in extremely long contexts. They strike a balance between output quality and hardware affordability.
- Entry-Level Tier: Qwen3-30B-A3B, Llama 2 3B, Alpaca-Lite 7B, and Phi-3.5 Mini (3.5B, if deployed locally via Ollama quantization): These smaller models enable fast inference on entry-level GPUs or older Apple Silicon chips. They’re ideal for basic drafting, quick proof-of-concepts, and privacy-sensitive tasks, though output may be noticeably less detailed.
Setup Note: Requires installation of Ollama (Ollama.ai) or LM Studio (recommended) and subsequent model downloads on your computer. Your writing tool connects to their local server.
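Once Ollama is running, any client, including the notepad, can confirm the local server is reachable by listing installed models through Ollama's /api/tags route; a minimal sketch:

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434"

def parse_models(payload: dict) -> list[str]:
    """Extract model names from Ollama's /api/tags response."""
    return [m["name"] for m in payload.get("models", [])]

def installed_models(base_url: str = OLLAMA_URL) -> list[str]:
    """Ask the local Ollama server which models are available."""
    with urllib.request.urlopen(f"{base_url}/api/tags") as resp:
        return parse_models(json.load(resp))

# Usage (requires a running Ollama instance):
#   print(installed_models())  # e.g. ['llama3.1:8b', 'mistral:7b']
```

An empty list (or a connection error) usually means the model hasn't been pulled yet or the server isn't running.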
API Services like OpenRouter
OpenRouter acts as a unified gateway to a vast collection of AI models (both commercial and open-source) from various providers, all accessible through a single API key.
Ideal For: Accessing a wide variety of both commercial and open-source AI models through a single API key. This is the fastest way to experiment with diverse models like OpenAI's GPT and O-series models, Gemini, Claude, and Llama 3 (for free!).
Options & Capabilities
OpenRouter’s catalog includes many of the most powerful models available publicly or commercially, organized below:
Table 1: Top Models on OpenRouter
Model | Provider | Context / Parameters | Strengths
Claude 4 Sonnet | Anthropic | 200K tokens | Hybrid reasoning, chain of thought, agents
Gemini 2.0 Flash | Google | 1M tokens | Low latency; SEO, summarization
Gemini 2.5 Pro | Google | 1M tokens, thinking mode | Deep reasoning, coding, science
GPT-4o-mini | OpenAI | 128K tokens | Vision support, highly cost-effective
DeepSeek V3 0324 | DeepSeek | 685B MoE, 163K tokens | Free, open source, top logic performance
Table 2: Top Free Models on OpenRouter
Model | Provider | Parameters / Context | Use Case
Llama 4 "Maverick" | Meta | 400B MoE (17B active), 128K | Multimodal, vision + text tasks |
Llama 3.3 70B Instruct | Meta | 70B, 131K tokens | Chat, reasoning, multilingual |
DeepSeek V3 0324 | DeepSeek | 685B MoE, 163K tokens | Research, logic, general purpose |
Qwen3-30B-A3B | Alibaba | 30.5B (3.3B active), 131K | Fast and intelligent dialogue + code
Mistral Small 3 | Mistral | 24B, 32K tokens | High-quality open model, low latency |
These rankings are updated weekly or monthly, so balancing task needs, model fit, and cost is key.
Explore all current rankings and models at OpenRouter.ai.
Setup Note: Requires creating an account and generating an API key at OpenRouter.ai. This key is then used to connect within your writing tool.
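Because OpenRouter exposes an OpenAI-compatible API, a request differs from the OpenAI one only in its URL, key, and model slug. A sketch (the `:free` model slug in the usage note is an example; check OpenRouter's catalog for current names):

```python
import json
import urllib.request

OPENROUTER_URL = "https://openrouter.ai/api/v1/chat/completions"

def build_request(model: str, prompt: str, api_key: str) -> urllib.request.Request:
    """OpenRouter mirrors the OpenAI chat format; only URL, key, and slug differ."""
    body = {"model": model,
            "messages": [{"role": "user", "content": prompt}]}
    return urllib.request.Request(
        OPENROUTER_URL,
        data=json.dumps(body).encode(),
        headers={"Authorization": f"Bearer {api_key}",
                 "Content-Type": "application/json"})

# Usage (needs a real key from your OpenRouter dashboard):
#   req = build_request("meta-llama/llama-3.3-70b-instruct:free",
#                       "Rewrite this sentence more concisely.", "sk-or-...")
#   with urllib.request.urlopen(req) as resp:
#       print(json.load(resp)["choices"][0]["message"]["content"])
```

This is why one key unlocks the whole catalog: every model behind OpenRouter accepts the same request shape.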
Hands-On: How to Set Up Your AI Model
Once you've decided on the right AI model for your project, integrating it into your writing environment is straightforward.
Cloud Models (GPT Series)
Step 1: Select your desired cloud model from the list (e.g., 'GPT-4o,' 'GPT-4 Turbo') and enter your OpenAI API key.
Local Models (Browser)
Step 1: Choose a browser-based local model from the available options. The first time you select a browser model, the tool initiates a one-time download of its weights into your browser's local storage. Subsequent uses are instantaneous and require no re-downloading, as long as the model remains cached.
Integrating Ollama/LM Studio
Step 1: Download Ollama or LM Studio (recommended) and ensure that one of them is running on your computer.
Step 2: In your online notepad's AI settings, select the Ollama/LM Studio model option. You'll be prompted to connect to your local server by entering its endpoint URL. If Ollama is running on your system, the default endpoint is:
http://localhost:11434
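With the endpoint above configured, a single completion is a POST to Ollama's /api/generate route. A minimal sketch (the model name in the usage note assumes you've already pulled it):

```python
import json
import urllib.request

def build_body(model: str, prompt: str) -> bytes:
    """Request body for /api/generate; stream=False returns one JSON object."""
    return json.dumps({"model": model, "prompt": prompt, "stream": False}).encode()

def ollama_generate(model: str, prompt: str,
                    base_url: str = "http://localhost:11434") -> str:
    """One-shot completion against a local Ollama server."""
    req = urllib.request.Request(f"{base_url}/api/generate",
                                 data=build_body(model, prompt),
                                 headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["response"]

# Usage (requires Ollama running and the model pulled, e.g. `ollama pull llama3.1`):
#   print(ollama_generate("llama3.1", "Draft a two-line meeting summary."))
```

Everything stays on localhost, which is what makes this path suitable for sensitive documents.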
Tip: Always refer to the official documentation of Ollama or LM Studio for the most accurate and up-to-date instructions on installation and model management.
Connecting via OpenRouter API Service
Step 1: Visit OpenRouter, create an account, and generate an API key from your dashboard.
Step 2: In your notepad's AI settings, select your preferred OpenRouter Model, and enter your API key in the provided text box.
Conclusion: Empowering Your Writing Journey
By integrating AI Assistance into every tool on Tools-Online.app, we are bridging the gap between human creativity and machine intelligence.
Whether you’re drafting text in the Online Notepad, generating code snippets, or creating diagrams using the Mermaid editor, our unified AI interface ensures you can:
- Accelerate Ideation: Generate outlines, brainstorm keywords, or outline workflows in seconds.
- Enhance Quality: Refine language, improve readability, and ensure consistency across documents.
- Save Time: Automate repetitive edits, generate boilerplate code, or summarize lengthy content instantly.
Ready to transform the way you work? Simply pick any tool on Tools-Online.app, click the AI settings icon, select your preferred model, and watch your productivity soar. The future of online utilities is here, powered by AI.
Additional Resources:
- Ollama Hardware Requirements: HOSTKEY: Ollama GPU Requirements
- General GPU/CPU for LLMs: Bizon Tech: Best GPU for LLM Training and Inference
- Meta AI (Llama Models: Llama 3.2, Llama 3.1 8B): Llama Models Information