
Run Ollama Models Locally and make them Accessible via Public API

AllTopicsToday
Published: July 28, 2025 · Last updated: July 28, 2025, 5:31 pm

[Blog thumbnail: Publishing local Ollama models with a public API]

Introduction

Running large language models (LLMs) and other open-source models locally offers real benefits for developers, and this is where Ollama shines. Ollama simplifies the process of downloading, setting up, and running these powerful models on your local machine, giving you greater control, improved privacy, and lower costs compared to cloud-based alternatives.

Running your model locally is a big advantage, but integrating it with cloud-based projects or sharing it for wider access is a challenge. This is exactly where Clarifai's Local Runners come in. Local Runners can expose Ollama models running locally via public API endpoints, enabling seamless integration with projects anywhere and effectively bridging the gap between local environments and the cloud.

In this post, I'll show you how to run an open-source model with Ollama and expose it through a public API using Clarifai's Local Runners. This makes the local model globally accessible while it runs entirely on your machine.

Local Runners Explained

Local Runners let you run models on your own machines, whether that's a laptop, a workstation, or an on-prem server, and expose them via secure public API endpoints. There is no need to upload the model to the cloud: the model stays local, but it behaves as if it were hosted on Clarifai.

On startup, the Local Runner opens a secure tunnel to Clarifai's control plane. Requests to the model's Clarifai API endpoint are routed to your machine, processed locally, and the results are returned to the caller. From the outside it works just like any other hosted model; internally, everything runs on your hardware.

Local Runners are especially useful for:

- Fast local development: Build, test, and iterate on models in your own environment without deployment delays. Inspect traffic, test outputs, and debug in real time.
- Using your own hardware: Take advantage of a local GPU or a custom hardware setup. Let your machine handle inference while Clarifai manages routing and API access.
- Private and offline data: Run models that rely on local files, internal databases, or private APIs. Keep everything on-prem while still exposing accessible endpoints.

Local Runners give you the flexibility of local execution combined with the reach of a managed API, without giving up control of your data or environment.

Publishing Local Ollama Models via a Public API

This section walks through the steps to run an Ollama model locally and make it accessible through a public Clarifai endpoint.

Prerequisites

Before you begin, make sure you have:

- Ollama installed and running on your machine
- A Clarifai account and a Personal Access Token (PAT)
- A recent Python installation

Step 1: Install Clarifai and Log In

First, install the Clarifai Python SDK.
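The SDK, which also provides the `clarifai` command-line tool, is available on PyPI:

```shell
# Install (or upgrade) the Clarifai Python SDK and CLI
pip install --upgrade clarifai
```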

Next, log in to Clarifai and configure your context. This links your local environment to your Clarifai account so you can manage and publish models.
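Logging in is done through the CLI, which prompts interactively for your credentials:

```shell
# Authenticate the CLI and set up a default context
clarifai login
```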

Follow the prompts to enter your user ID and your Personal Access Token (PAT). If you need help finding these, refer to this documentation.

Step 2: Set Up a Local Ollama Model for Clarifai

Next, prepare a local Ollama model for use with Clarifai's Local Runners. This step sets up the files and configuration required to expose your model through a public API endpoint on Clarifai's platform.

Initialize the setup with the following command:
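This is the init command named later in this article, using the Ollama toolkit:

```shell
# Scaffold a Clarifai model project backed by a local Ollama model
clarifai model init --toolkit ollama
```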

This generates three key files in the project directory:

- model.py
- config.yaml
- requirements.txt

These files define how Clarifai communicates with the locally running Ollama model.

You can also customize the command with the following options:

- --model-name: The name of the Ollama model you want to serve, pulled from the Ollama model library (defaults to llama3:8b).
- --port: The port on which the Ollama model runs (default: 23333).
- --context-length: Sets the model's context length (default: 8192).

For example, to use the gemma:2b model with a 16K context length on port 8008, run:
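A sketch of that invocation, assuming the long-form flag names corresponding to the options listed above:

```shell
# Initialize a gemma:2b project with a 16K context window on port 8008
clarifai model init --toolkit ollama \
  --model-name gemma:2b \
  --port 8008 \
  --context-length 16384
```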

After this step, the local model is ready to be published using Clarifai's Local Runner.

Step 3: Start the Clarifai Local Runner

Once the local Ollama model is configured, the next step is to start the Clarifai Local Runner. This exposes the local model to the internet through a secure Clarifai endpoint.

Navigate to the model directory and start the runner.
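A minimal sketch, assuming the project directory generated in the previous step:

```shell
cd Ollama-Model-upload   # the directory created by `clarifai model init`
clarifai model local-runner
```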

When the runner starts, you'll receive a public Clarifai URL. This URL is the gateway for accessing your locally running Ollama model from anywhere: requests made to the Clarifai endpoint are securely routed to your local machine, where the Ollama model handles them.

Running Inference on the Exposed Model

Because the Ollama model runs locally but is exposed through a Clarifai Local Runner, you can send inference requests from anywhere using the Clarifai SDK or the OpenAI-compatible endpoint.

Inference Using the OpenAI-Compatible Endpoint

Set your Clarifai PAT as an environment variable.
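For example, in a POSIX shell (replace the placeholder with your actual token):

```shell
export CLARIFAI_PAT="YOUR_PERSONAL_ACCESS_TOKEN"
```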

You can then use the OpenAI client to send requests.
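A minimal sketch using the official `openai` Python package pointed at Clarifai's OpenAI-compatible endpoint. The model identifier (a full Clarifai model URL) is a placeholder you would replace with your own:

```python
import os

from openai import OpenAI

# Clarifai exposes an OpenAI-compatible endpoint; authenticate with your PAT.
client = OpenAI(
    base_url="https://api.clarifai.com/v2/ext/openai/v1",
    api_key=os.environ["CLARIFAI_PAT"],
)

# The model is addressed by its Clarifai URL (placeholder shown here).
response = client.chat.completions.create(
    model="https://clarifai.com/YOUR_USER_ID/YOUR_APP_ID/models/YOUR_MODEL_ID",
    messages=[
        {"role": "user", "content": "Explain what a Local Runner does in one sentence."},
    ],
)
print(response.choices[0].message.content)
```

Because requests are routed through the public endpoint back to your machine, this same snippet works from any environment that can reach the Clarifai API.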

For multimodal inference, image data can be included in the request.
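One common pattern is to inline the image as a base64 data URL inside an OpenAI-style multimodal message. This sketch (helper names are my own, not from the article) builds such a message; the resulting list can be passed as `messages` to any OpenAI-compatible chat completion call:

```python
import base64


def image_to_data_url(image_bytes: bytes, mime: str = "image/jpeg") -> str:
    """Encode raw image bytes as a base64 data URL accepted by OpenAI-style chat APIs."""
    b64 = base64.b64encode(image_bytes).decode("utf-8")
    return f"data:{mime};base64,{b64}"


def build_image_message(path: str, prompt: str) -> dict:
    """Build a multimodal chat message mixing a text prompt and a local image file."""
    with open(path, "rb") as f:
        data_url = image_to_data_url(f.read())
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": prompt},
            {"type": "image_url", "image_url": {"url": data_url}},
        ],
    }
```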

Inference Using the Clarifai SDK

You can also use the Clarifai Python SDK for inference. The model URL can be obtained from your Clarifai account.
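A sketch using the SDK's `Model` client and its `predict_by_bytes` method; the model URL is a placeholder, and the response field access assumes a text-output model:

```python
import os

from clarifai.client.model import Model

# The model URL comes from your Clarifai account (placeholder shown here).
model = Model(
    url="https://clarifai.com/YOUR_USER_ID/YOUR_APP_ID/models/YOUR_MODEL_ID",
    pat=os.environ["CLARIFAI_PAT"],
)

# Send a text prompt to the locally running model via its public endpoint.
response = model.predict_by_bytes(
    b"What is the capital of France?",
    input_type="text",
)
print(response.outputs[0].data.text.raw)
```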

Customizing the Ollama Model Configuration

The clarifai model init --toolkit ollama command generates the following file structure:

Ollama-Model-upload/
├── model.py
├── config.yaml
└── requirements.txt

You can customize the generated files to control how the model behaves.

- model.py – Adjust the model's behavior, implement custom logic, and optimize performance.
- config.yaml – Defines settings such as compute requirements; especially useful when deploying to dedicated compute with Compute Orchestration.
- requirements.txt – Lists the Python packages the model requires.

This setup gives you full control over how the Ollama model is exposed and consumed through the API. For more details, refer to this documentation.

Conclusion

Running an open-source model locally with Ollama gives you full control over privacy, latency, and customization. Clarifai's Local Runners let you expose these models through public APIs without relying on centralized infrastructure. This setup makes it easy to connect your local model to larger workflows or agent systems while keeping full control of your compute and data. If you want to scale beyond a single machine, check out Compute Orchestration to deploy models on dedicated GPU nodes.
