Local models running through Ollama, LM Studio, or llama.cpp keep prompts and outputs on-device, but they cannot reach live trend data without a tool bridge. Trends MCP connects over HTTP with a Bearer token, so a local agent can query Google, TikTok, Reddit, and 15 other sources while the model weights stay offline.
Privacy-sensitive teams run Llama, Mistral, and Qwen weights through Ollama or LM Studio so customer data never leaves the machine. That isolation breaks the moment someone asks "what is trending on TikTok right now?" A local model has no live web access unless a tool layer supplies it. Trends MCP fills that gap with one HTTP endpoint, stable JSON, and 25 normalized sources behind a single Bearer token.
For cloud-hosted MCP clients like Cursor and Claude Desktop, see the MCP server setup guide. This page covers local inference stacks only.
The split matters for compliance reviews. When Ollama serves the model on localhost:11434, every token of the conversation stays on the host. Trends MCP tool calls are separate HTTP POST requests to https://api.trendsmcp.ai/mcp carrying only structured parameters:
```json
{"source": "google search", "keyword": "running shoes"}
```
The API returns a JSON time series. No chat history, no system prompt, and no document uploads travel with the request. Teams auditing data residency should note that trend query keywords themselves may contain sensitive terms; treat tool arguments like any other outbound API payload.
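One way to act on that advice is to screen tool arguments before they leave the host. The sketch below is illustrative only: `screen_tool_arguments`, the regex patterns, and the `extra_terms` denylist are hypothetical names, not part of Trends MCP, and a real deployment would plug in its own compliance rules.

```python
import re

# ASSUMPTION: illustrative patterns only; replace with your own policy.
BLOCKED_PATTERNS = [
    re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),      # US SSN-like number
    re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),   # email address
]


def screen_tool_arguments(arguments: dict, extra_terms=()) -> list:
    """Return reasons why these tool arguments should not be sent out.

    An empty list means nothing in the payload matched the local policy.
    """
    problems = []
    for key, value in arguments.items():
        text = str(value)
        for pattern in BLOCKED_PATTERNS:
            if pattern.search(text):
                problems.append(f"{key}: matches {pattern.pattern}")
        for term in extra_terms:
            if term.lower() in text.lower():
                problems.append(f"{key}: contains blocked term {term!r}")
    return problems


# The example payload from above passes the check.
print(screen_tool_arguments({"source": "google search", "keyword": "running shoes"}))
```

An agent wrapper could refuse to forward any tool call for which the returned list is non-empty, and log the reasons for the compliance review.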
```text
[User prompt] → [Local LLM via Ollama/LM Studio]
        ↓ tool call decision
[MCP client / mcp-remote bridge]
        ↓ HTTPS + Bearer token
[api.trendsmcp.ai/mcp]
        ↓ JSON response
[Local LLM synthesizes answer]
```
The model decides when to call get_trends, get_growth, or get_top_trends. The MCP bridge translates that into HTTP. The model never scrapes Google Trends directly.
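To make the bridge step concrete, the sketch below builds the kind of message an MCP client sends when the model picks a tool. MCP tool invocations are JSON-RPC 2.0 requests with the method `tools/call`; the helper names `build_tool_call` and `build_headers` are hypothetical. A real client such as mcp-remote also performs the protocol's initialization handshake first, which is omitted here.

```python
import json


def build_tool_call(name: str, arguments: dict, request_id: int = 1) -> dict:
    """Build the JSON-RPC 2.0 body for an MCP tools/call request."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


def build_headers(api_key: str) -> dict:
    """HTTP headers for the POST: Bearer auth plus MCP content negotiation."""
    return {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
        # Streamable HTTP servers may answer with plain JSON or an event stream.
        "Accept": "application/json, text/event-stream",
    }


body = build_tool_call(
    "get_trends", {"source": "google search", "keyword": "running shoes"}
)
print(json.dumps(body, indent=2))
```

Only the tool name and its structured arguments appear in the body, which is the property the compliance discussion above relies on.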
Continue supports MCP servers alongside local Ollama models. This is the fastest path for developers already running open-weight models locally.
```shell
ollama pull llama3.1:8b
ollama serve
```
Confirm the API responds at http://localhost:11434.
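The check can also be scripted, for example as a preflight step before an agent starts. This is a minimal sketch using only the standard library; the function name `ollama_is_up` is hypothetical, and it only verifies that something answers HTTP 200 at the given address, not that a particular model is loaded.

```python
import urllib.error
import urllib.request


def ollama_is_up(base_url: str = "http://localhost:11434", timeout: float = 2.0) -> bool:
    """Return True if an HTTP server answers with status 200 at base_url."""
    try:
        with urllib.request.urlopen(base_url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, DNS failure, or a non-2xx response.
        return False
```

Calling `ollama_is_up()` with no arguments probes the default address used throughout this page.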
Create a free key at trendsmcp.ai/account. The free tier includes 100 requests per month. No credit card required.
In ~/.continue/config.json, add the MCP server block:
```json
{
  "mcpServers": {
    "trends-mcp": {
      "url": "https://api.trendsmcp.ai/mcp",
      "transport": "http",
      "headers": {
        "Authorization": "Bearer YOUR_API_KEY"
      }
    }
  }
}
```
Set the model provider to Ollama in the same config. Continue routes tool calls to Trends MCP while inference stays on localhost:11434.
Prompt: "Using TrendsMCP, what is trending on Google right now?"
The model should invoke get_top_trends with type: "Google Trends". The response returns ranked query strings the model can summarize without hallucinating breakout terms.
For Continue-specific screenshots and troubleshooting, see MCP server for Continue.
LM Studio exposes local models through an OpenAI-compatible API on localhost:1234. It does not host MCP natively as of June 2026. The practical pattern is a thin agent layer that calls both endpoints.
LangChain's MultiServerMCPClient connects to Trends MCP while ChatOpenAI points at LM Studio:
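```python
from langchain_mcp_adapters.client import MultiServerMCPClient
from langchain_openai import ChatOpenAI

# Inference: LM Studio's OpenAI-compatible endpoint on localhost.
llm = ChatOpenAI(
    base_url="http://localhost:1234/v1",
    api_key="lm-studio",   # placeholder value
    model="local-model",   # replace with the model identifier loaded in LM Studio
)

# Tools: Trends MCP over HTTPS, authenticated with the Bearer token.
mcp_client = MultiServerMCPClient({
    "trends-mcp": {
        "url": "https://api.trendsmcp.ai/mcp",
        "transport": "streamable_http",
        "headers": {"Authorization": "Bearer YOUR_API_KEY"},
    }
})
```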
The LLM runs locally. Tool execution fetches live data. See LangChain Trends MCP for the full agent loop.
Some teams run Claude Desktop for MCP tooling while experimenting with local models in parallel sessions. Claude Desktop can reach remote MCP servers such as Trends MCP through the mcp-remote bridge, launched with npx:
```json
"trends-mcp": {
  "command": "npx",
  "args": ["-y", "mcp-remote", "https://api.trendsmcp.ai/mcp", "--header", "Authorization:${AUTH_HEADER}"],
  "env": { "AUTH_HEADER": "Bearer YOUR_API_KEY" }
}
```
This path is not fully local (Claude Desktop inference runs on Anthropic's stack), but the mcp-remote pattern is the same bridge local-only stacks replicate.
Local agents invite tight polling loops. A ReAct agent checking five keywords across three sources each burns 5 × 3 = 15 requests per reasoning cycle. On the 100-request free tier, that allows only six full cycles per month.
Practical limits for local workflows:
| Pattern | Requests per cycle | Free tier cycles per month |
|---|---|---|
| Single keyword, one source | 1 | 100 |
| Five keywords, one source each | 5 | 20 |
| One live feed pull (get_top_trends) | 1 | 100 |
| Five keywords, three sources each | 15 | 6 |
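The figures in the table follow from integer division of the monthly quota by the per-cycle cost. A small sketch of that arithmetic, with hypothetical helper names, makes it easy to budget other patterns:

```python
def requests_per_cycle(keywords: int, sources_per_keyword: int) -> int:
    """One request per (keyword, source) pair."""
    return keywords * sources_per_keyword


def full_cycles(monthly_quota: int, per_cycle: int) -> int:
    """Number of complete cycles that fit in the quota (partial cycles dropped)."""
    if per_cycle <= 0:
        raise ValueError("per_cycle must be positive")
    return monthly_quota // per_cycle


FREE_TIER = 100  # requests per month, as stated above

print(full_cycles(FREE_TIER, requests_per_cycle(1, 1)))  # 100
print(full_cycles(FREE_TIER, requests_per_cycle(5, 1)))  # 20
print(full_cycles(FREE_TIER, requests_per_cycle(5, 3)))  # 6
```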
Cache trend responses in a local SQLite or JSON file when the agent revisits the same keyword within a session. Use get_growth with multiple percent_growth periods in one call instead of separate get_trends pulls for each window.
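A minimal sketch of such a cache, using the standard-library `sqlite3` module, might look like the following. The class name `TrendCache`, the one-hour default TTL, and the `fetch(tool, arguments)` callable are assumptions for illustration; `fetch` stands in for whatever function actually performs the MCP tool call in your agent.

```python
import json
import sqlite3
import time


class TrendCache:
    """Cache tool results keyed by tool name and arguments, with a TTL."""

    def __init__(self, path: str = ":memory:", ttl_seconds: float = 3600):
        self.conn = sqlite3.connect(path)
        self.ttl = ttl_seconds
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS cache "
            "(key TEXT PRIMARY KEY, fetched_at REAL, payload TEXT)"
        )

    @staticmethod
    def _key(tool: str, arguments: dict) -> str:
        # sort_keys makes the key independent of argument order.
        return json.dumps([tool, arguments], sort_keys=True)

    def get_or_fetch(self, tool: str, arguments: dict, fetch):
        """Return a fresh cached result, or call fetch once and store it."""
        key = self._key(tool, arguments)
        row = self.conn.execute(
            "SELECT fetched_at, payload FROM cache WHERE key = ?", (key,)
        ).fetchone()
        if row and time.time() - row[0] < self.ttl:
            return json.loads(row[1])
        result = fetch(tool, arguments)  # the only line that spends a request
        self.conn.execute(
            "INSERT OR REPLACE INTO cache VALUES (?, ?, ?)",
            (key, time.time(), json.dumps(result)),
        )
        self.conn.commit()
        return result
```

Passing a file path instead of `":memory:"` keeps the cache across agent restarts, which matters most on the free tier.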
For read-only agent patterns that minimize write risk, see read-only trend data MCP.
get_top_trends(type="Google Trends", limit=10)
Returns the current Google trending leaderboard. One request.
get_growth(source="google search", keyword="GLP-1", percent_growth=["3M", "12M", "YTD"])
Returns growth percentages for three windows in one request.
Run three separate calls for google search, tiktok, and reddit on the same keyword. Three requests total. The normalized 0-100 scale makes cross-source comparison possible without custom parsers.
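The three-call comparison can be sketched as a short loop. Two things here are assumptions rather than documented behavior: `call_tool(name, arguments)` is a hypothetical callable standing in for your MCP client, and the response is assumed to be a chronologically ordered list of `{"date": ..., "value": ...}` points on the 0-100 scale. Adjust the parsing to the JSON the API actually returns.

```python
SOURCES = ["google search", "tiktok", "reddit"]


def compare_sources(keyword: str, call_tool, sources=SOURCES) -> dict:
    """Fetch one keyword from each source; costs one request per source."""
    latest = {}
    for source in sources:
        series = call_tool("get_trends", {"source": source, "keyword": keyword})
        # ASSUMPTION: series is a list of {"date", "value"} dicts, oldest first.
        latest[source] = series[-1]["value"] if series else None
    return latest


def rank_sources(latest: dict) -> list:
    """Order sources by their most recent normalized value, highest first."""
    scored = {s: v for s, v in latest.items() if v is not None}
    return sorted(scored, key=scored.get, reverse=True)
```

Because every source shares the same normalized scale, the ranking step needs no per-source conversion.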
This stack fits teams with data residency requirements for customer conversations but a legitimate need for external market signals. Financial research pods, healthcare analytics groups, and legal practices often block cloud LLM uploads while still tracking public trend data.
It fits less well when the workload needs sub-minute polling across ten live feeds. The 100-request free tier and monthly paid caps reward batch research, not high-frequency trading-style refresh rates. Upgrade to Starter ($19, 1,000 requests) when local agent prototypes move to daily production use.
For workflow patterns that apply regardless of hosting, see LLM-native trend research workflow.
FAQ