Skip to main content
⌘K

TabbyAPI

The official API server for Exllama. OAI compatible, lightweight, and fast.

View on GitHub Official site
Llm Inference AGPL 3.0 Medium setup 1,219 stars

Overview

Plain English

The official API server for Exllama. OAI compatible, lightweight, and fast.

Technical

The official API server for Exllama. OAI compatible, lightweight, and fast.

Technical scorecard

License AGPL 3.0
Commercial use No
OpenAI-compatible API No
REST API No
Fine-tuning support No
Quantization support No
Docker available No
GUI / no-code available No
Telemetry None
Offline after setup Yes

Data & Privacy

Does it send data online?

After setup, this listing is marked as usable offline. Confirm network behavior against the upstream project before regulated deployment.

Does it store history?

Not verified in this directory yet. Review the upstream docs for persistence, logs, and workspace storage.

License checks?

The listed license needs review before commercial use.

Telemetry?

None

Last verified: May 17, 2026. Maintainer verification should be treated as directory guidance, not legal advice.

Setup & Installation

Medium

A developer can usually get this running with standard docs.

Prerequisites

Python, Docker, Bare Metal

# Start with the official project documentation
# https://github.com/theroyallab/tabbyAPI

Hardware Requirements

RAM16 GB minimum / 32 GB recommended
Hardware tagsNVIDIA GPU (CUDA)
Model formatsNot specified
Primary languagePython

Works Well With