Self-Hosted LLM Gateway

Why self-host

Keep the LLM control plane close to the systems it protects.

A hosted broker can be useful for quick model access. AI Model Gateway is for teams that need the gateway itself to be part of operations: owned keys, explicit routing policy, visible provider health, auditable config changes, and a rollback path.

Local key control

Keep provider credentials and client-facing gateway tokens under your own runtime and deployment controls.

Routing policy

Use one local entry point for OpenAI-compatible clients while managing provider routing and fallback centrally.
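
As a rough client-side illustration of the single entry point, the sketch below builds an OpenAI-style chat completion request aimed at a gateway base URL. The address http://127.0.0.1:8080, the token value, and the model name are placeholders assumed for the example, not values taken from the project; only the /v1/chat/completions path comes from the OpenAI API shape that compatible clients already use.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
	"strings"
)

// chatMessage and chatRequest mirror the OpenAI chat completions request shape.
type chatMessage struct {
	Role    string `json:"role"`
	Content string `json:"content"`
}

type chatRequest struct {
	Model    string        `json:"model"`
	Messages []chatMessage `json:"messages"`
}

// newChatRequest builds a chat completion request aimed at a gateway base URL.
// The client holds only the gateway URL and a gateway token, never a provider key.
func newChatRequest(baseURL, gatewayToken, model, prompt string) (*http.Request, error) {
	body, err := json.Marshal(chatRequest{
		Model:    model,
		Messages: []chatMessage{{Role: "user", Content: prompt}},
	})
	if err != nil {
		return nil, err
	}
	url := strings.TrimRight(baseURL, "/") + "/v1/chat/completions"
	req, err := http.NewRequest(http.MethodPost, url, bytes.NewReader(body))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Content-Type", "application/json")
	req.Header.Set("Authorization", "Bearer "+gatewayToken)
	return req, nil
}

func main() {
	// Hypothetical address, token, and model name; substitute your deployment's values.
	req, err := newChatRequest("http://127.0.0.1:8080", "example-gateway-token", "example-model", "ping")
	if err != nil {
		panic(err)
	}
	fmt.Println(req.Method, req.URL.String())
}
```

Because every client targets the same base URL, a provider change becomes a gateway config change rather than a client redeploy.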

Operational telemetry

Inspect traffic, latency, cost signals, request logs, provider health, diagnostics, and replay from the Admin UI.

Safe change flow

Preview, diff, validate, publish, audit, and roll back config changes instead of editing a live proxy file.
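
To make the validate, publish, audit, and roll back steps concrete, here is a minimal in-memory sketch of a versioned config store. The Config, Store, Publish, and Rollback names are hypothetical and chosen for illustration; the gateway's actual config schema and storage are not described on this page.

```go
package main

import (
	"errors"
	"fmt"
)

// Config is a hypothetical, heavily simplified gateway config.
type Config struct {
	Providers []string // ordered: primary first, then fallbacks
}

// validate rejects configs that could not serve traffic.
func validate(c Config) error {
	if len(c.Providers) == 0 {
		return errors.New("config needs at least one provider")
	}
	return nil
}

// Store keeps every published version, so a rollback moves a pointer
// instead of editing the live config in place.
type Store struct {
	versions []Config
	active   int // index into versions; -1 when nothing is published
	audit    []string
}

func NewStore() *Store { return &Store{active: -1} }

// Publish validates the candidate and, only on success, makes it active.
func (s *Store) Publish(c Config) error {
	if err := validate(c); err != nil {
		s.audit = append(s.audit, "rejected: "+err.Error())
		return err
	}
	s.versions = append(s.versions, c)
	s.active = len(s.versions) - 1
	s.audit = append(s.audit, fmt.Sprintf("published v%d", s.active+1))
	return nil
}

// Rollback re-activates the previous version, if there is one.
func (s *Store) Rollback() error {
	if s.active <= 0 {
		return errors.New("no earlier version to roll back to")
	}
	s.active--
	s.audit = append(s.audit, fmt.Sprintf("rolled back to v%d", s.active+1))
	return nil
}

// Active returns the config currently serving traffic.
func (s *Store) Active() (Config, bool) {
	if s.active < 0 {
		return Config{}, false
	}
	return s.versions[s.active], true
}

func main() {
	s := NewStore()
	_ = s.Publish(Config{Providers: []string{"primary"}})
	_ = s.Publish(Config{Providers: []string{"primary", "fallback"}})
	_ = s.Rollback()
	c, _ := s.Active()
	fmt.Println(c.Providers, s.audit)
}
```

The point of the design is that an invalid candidate never becomes active, and every accepted or rejected change leaves an audit entry.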

Shortest proof

Verify the operational path before wiring real providers.

The 15-minute evaluation path covers four steps: fit checks, local runtime startup, the provider fallback demo, and the key operations docs. The demo forces a primary OpenAI-compatible upstream to return HTTP 429 (Too Many Requests) and verifies that the request is served through a fallback provider.

go test ./examples/provider-fallback -run TestProviderFallbackDemo -v
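
For readers who want to see the idea before running the command, the following standalone sketch reproduces the scenario with Go's net/http/httptest package: one fake upstream always returns 429, a second returns 200, and a small helper falls through to the first usable response. It is not the project's TestProviderFallbackDemo; fakeUpstream and firstHealthy are hypothetical helpers, and treating 429 and 5xx as the fallback triggers is an assumption.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"time"
)

// fakeUpstream starts a local test server that always answers with the given status and body.
func fakeUpstream(status int, body string) *httptest.Server {
	return httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.WriteHeader(status)
		io.WriteString(w, body)
	}))
}

// firstHealthy tries each upstream URL in order and returns the URL and body
// of the first response that is neither 429 nor 5xx.
func firstHealthy(upstreams []string) (string, string, error) {
	client := &http.Client{Timeout: 5 * time.Second}
	for _, u := range upstreams {
		resp, err := client.Get(u)
		if err != nil {
			continue // network failure: try the next upstream
		}
		data, readErr := io.ReadAll(resp.Body)
		resp.Body.Close()
		if readErr != nil || resp.StatusCode == http.StatusTooManyRequests || resp.StatusCode >= 500 {
			continue // rate limited or failing: try the next upstream
		}
		return u, string(data), nil
	}
	return "", "", fmt.Errorf("all %d upstreams failed", len(upstreams))
}

func main() {
	primary := fakeUpstream(http.StatusTooManyRequests, "rate limited")
	defer primary.Close()
	fallback := fakeUpstream(http.StatusOK, "served by fallback")
	defer fallback.Close()

	_, body, err := firstHealthy([]string{primary.URL, fallback.URL})
	if err != nil {
		panic(err)
	}
	fmt.Println(body)
}
```

Using fake upstreams keeps the check free of real provider keys and real provider cost.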

Open the 15-minute evaluation path

[Screenshot: AI Model Gateway overview workspace showing local gateway health and operations status]

Evaluation path

Assess the gateway as infrastructure, not just a proxy.

Fit. Check whether self-hosting, local keys, and local telemetry match your team constraints.
Run. Build the compact Go runtime, start the supervised planes, and open the Admin UI.
Fail over. Run the fallback demo against fake upstreams before trusting real provider policy.
Operate. Review config publishing, provider health, diagnostics, updates, and rollback workflows.

Self-hosted checklist
LLM gateway adoption checklist
OpenAI-compatible gateway page
Provider fallback gateway page
LLM gateway comparison page
Chinese self-hosted page
Config publish and rollback

Fit check

Use it when self-hosting is a control requirement.

Good fit

Provider keys, routing policy, telemetry, and audit records need to stay local.
Multiple OpenAI-compatible clients should use one stable internal gateway URL.
Operators need provider probes, request logs, config publish, and rollback.
The team wants a smaller Go runtime instead of a full gateway platform stack.

Less ideal

You want a hosted model marketplace to own routing, billing, and provider access.
You only need a client SDK wrapper inside one application.
You do not need local operations workflows or a gateway Admin UI.
You do not want to operate any runtime in your own environment.

Review evidence

Check installability, quality, and security before adopting it.

Release archive install

Try the packaged v1.4.4 runtime with checksum verification, local config, runtime directories, and supervised startup commands.

Open release install path

Quality evidence

Review CI gates, local reproduction commands, runtime smoke checks, feature proof points, and current capability boundaries.

Open quality evidence

Security and trust model

Inspect admin auth, same-origin browser writes, provider-key handling, SSRF defenses, telemetry sensitivity, and update trust.
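
As one example of what an SSRF defense can look like, the sketch below refuses upstream hosts that resolve to loopback, private, link-local, or otherwise non-routable addresses. It illustrates the general technique only; the page does not describe the gateway's actual checks, and isDisallowedIP and checkUpstreamHost are hypothetical names. net.IP.IsPrivate requires Go 1.17 or later.

```go
package main

import (
	"fmt"
	"net"
)

// isDisallowedIP reports whether an address is one an upstream URL
// should not reach unless an operator has explicitly allowed it.
func isDisallowedIP(ip net.IP) bool {
	return ip.IsLoopback() ||
		ip.IsPrivate() ||
		ip.IsLinkLocalUnicast() ||
		ip.IsLinkLocalMulticast() ||
		ip.IsMulticast() ||
		ip.IsUnspecified()
}

// checkUpstreamHost resolves host and rejects it if any resolved address is disallowed.
// IP literals are returned as-is by net.LookupIP, so they are checked without a DNS query.
func checkUpstreamHost(host string) error {
	ips, err := net.LookupIP(host)
	if err != nil {
		return fmt.Errorf("resolve %s: %w", host, err)
	}
	for _, ip := range ips {
		if isDisallowedIP(ip) {
			return fmt.Errorf("host %s resolves to disallowed address %s", host, ip)
		}
	}
	return nil
}

func main() {
	for _, host := range []string{"127.0.0.1", "10.0.0.5", "169.254.169.254", "8.8.8.8"} {
		fmt.Println(host, checkUpstreamHost(host))
	}
}
```

A check at configuration time does not cover DNS answers that change later, so a stricter design would repeat it when the connection is dialed. A self-hosted deployment may also legitimately route to local upstreams, which would call for an explicit allowlist rather than a blanket block.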

Open security model

Next step

Run the local evaluation, then decide whether it earns a star.

Start with the checklist and executable fallback demo. If the project matches your self-hosted LLM gateway needs, starring the repository helps more operators discover it.
