feat(inference): add model-bound relay profiles

Metadata

Source

tuist/hive #89

Updated

Jun 25, 2026

Domains

Hive

Details

What changed

Hive can now act as an OpenAI-compatible inference relay. It adds model-bound profiles and tokens, provider management, usage accounting, dashboard views, and self-hosting documentation.

The relay exposes the OpenAI-compatible models and chat completion endpoints, including streamed responses. Tokens are bound to a profile, and each profile routes to a runtime-managed or environment-backed provider. Each completed relay request records input tokens, output tokens, an estimated cost in US dollars, and audit activity.
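The cost estimate recorded per request can be sketched as simple arithmetic over the token counts. This is a minimal illustration, not Hive's implementation: it assumes prices are expressed in US dollars per one million tokens, and the names `UsageRecord`, `estimate_cost_usd`, and `record_usage` are hypothetical.

```python
from dataclasses import dataclass
from decimal import Decimal

# Assumption: prices are quoted in US dollars per one million tokens.
TOKENS_PER_PRICE_UNIT = Decimal(1_000_000)


@dataclass(frozen=True)
class UsageRecord:
    """Hypothetical shape of one usage row; field names are illustrative."""
    profile_model: str
    token_id: str
    input_tokens: int
    output_tokens: int
    estimated_cost_usd: Decimal


def estimate_cost_usd(input_tokens: int, output_tokens: int,
                      input_price: Decimal, output_price: Decimal) -> Decimal:
    """Price input and output tokens separately, then sum."""
    raw = (Decimal(input_tokens) * input_price
           + Decimal(output_tokens) * output_price) / TOKENS_PER_PRICE_UNIT
    # Decimal avoids float rounding drift when many small costs are summed.
    return raw.quantize(Decimal("0.000001"))


def record_usage(profile_model: str, token_id: str,
                 input_tokens: int, output_tokens: int,
                 input_price: Decimal, output_price: Decimal) -> UsageRecord:
    """Build the usage row for one completed relay request."""
    cost = estimate_cost_usd(input_tokens, output_tokens,
                             input_price, output_price)
    return UsageRecord(profile_model, token_id,
                       input_tokens, output_tokens, cost)
```

Keeping the token identifier on each row is what would let one profile report separate usage and cost per repository, workflow, or team.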

The operations dashboard now includes Inference > Profiles and Providers, profile and token detail pages, usage widgets, period-aware charts, token creation, provider creation, provider deletion controls, models.dev model validation, and profile-level client configuration. Seed data creates providers, profiles, tokens, and usage rows so the dashboard has realistic data locally.

Why

Repositories and automation workflows need a stable model address, while operators need the freedom to retarget the upstream provider or model as cost, latency, and quality change. Binding usage to Hive-issued tokens also gives operators a finer-grained attribution boundary, so one profile can serve multiple repositories, workflows, or teams while still showing separate usage and cost breakdowns.

Approach

Profiles store the stable model name plus the upstream provider, models.dev model identifier, enabled state, and editable per-model pricing. Tokens carry the runtime authorization and attribution boundary. Providers can be created in the dashboard or loaded from environment configuration, with credentials encrypted when managed in Hive.
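The relationship between profiles, tokens, and providers can be sketched as a small data model. This is illustrative only: the class and field names, the example values, and the `retarget` helper are assumptions rather than Hive's actual schema. The point it shows is that retargeting changes the upstream side of a profile while the stable model name, and therefore every token bound to it, stays the same.

```python
from dataclasses import dataclass, replace
from decimal import Decimal


@dataclass(frozen=True)
class Provider:
    name: str
    source: str  # assumed values: "dashboard" or "environment"


@dataclass(frozen=True)
class Profile:
    model_name: str        # stable name that clients send
    provider: str          # upstream provider name
    upstream_model: str    # models.dev model identifier
    enabled: bool
    input_price_usd: Decimal
    output_price_usd: Decimal


@dataclass(frozen=True)
class Token:
    token_id: str
    profile_model: str     # the profile this token is bound to
    label: str             # e.g. a repository or team, for attribution


def retarget(profile: Profile, provider: str, upstream_model: str,
             input_price_usd: Decimal, output_price_usd: Decimal) -> Profile:
    """Return a copy pointing at a new upstream; model_name is untouched."""
    return replace(profile, provider=provider, upstream_model=upstream_model,
                   input_price_usd=input_price_usd,
                   output_price_usd=output_price_usd)


# Hypothetical example values.
profile = Profile("team-default", "provider-a", "vendor-a/model-large",
                  True, Decimal("3.00"), Decimal("15.00"))
token = Token("tok_1", profile.model_name, "docs-repository")
moved = retarget(profile, "provider-b", "vendor-b/model-fast",
                 Decimal("0.50"), Decimal("1.50"))
```

After `retarget`, the token still refers to `team-default`, so clients holding it would need no configuration change.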

The relay authenticates the token on each request, validates the requested profile model, rewrites the model to the upstream identifier before forwarding, preserves streamed chunks, and records usage when the upstream provider returns token counts.
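The non-streaming part of that flow can be sketched as below. This is a simplified illustration under several assumptions, not the code in the pull request: tokens arrive as an OpenAI-style `Authorization: Bearer ...` header, tokens are looked up by a SHA-256 hash, the error status codes are placeholders, and every name (`RelayError`, `prepare_upstream_request`, `usage_from_response`) is hypothetical. The `usage.prompt_tokens` and `usage.completion_tokens` fields follow the OpenAI chat completion response format.

```python
import hashlib


class RelayError(Exception):
    def __init__(self, status: int, message: str):
        super().__init__(message)
        self.status = status


def hash_token(raw: str) -> str:
    # Assumption: only a hash of each token is stored for lookup.
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()


def prepare_upstream_request(authorization: str, body: dict,
                             tokens: dict, profiles: dict):
    """Authenticate, validate the profile model, and rewrite the model."""
    # 1. Authenticate the token.
    scheme, _, raw = authorization.partition(" ")
    if scheme.lower() != "bearer" or not raw:
        raise RelayError(401, "missing or malformed authorization header")
    token = tokens.get(hash_token(raw))
    if token is None:
        raise RelayError(401, "unknown token")

    # 2. Validate that the requested model is the token's enabled profile.
    profile = profiles.get(body.get("model"))
    if (profile is None or not profile["enabled"]
            or profile["model_name"] != token["profile_model"]):
        raise RelayError(404, "model not available for this token")

    # 3. Rewrite the model for the upstream provider; copy, do not mutate.
    upstream_body = {**body, "model": profile["upstream_model"]}
    return token, profile, upstream_body


def usage_from_response(response: dict):
    """Return token counts, or None when the upstream omitted them."""
    usage = response.get("usage")
    if not usage:
        return None
    return {"input_tokens": usage["prompt_tokens"],
            "output_tokens": usage["completion_tokens"]}
```

Returning `None` from `usage_from_response` mirrors the rule that usage is recorded only when the upstream provider reports token counts. Streamed responses would need the same extraction applied to whichever chunk carries usage, which this sketch does not cover.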

Impact

Existing agent settings stay unchanged. The new relay is opt-in through the inference routes and operations dashboard. Self-hosted operators should configure at least one provider before creating production profiles, and should treat Hive tokens like repository secrets because Hive shows each token only once.

Validation

mix format --check-formatted
mix compile --warnings-as-errors
mix test
mix credo
git diff --check
mix test test/hive/inference_test.exs test/hive_web/controllers/inference_controller_test.exs test/hive_web/live/ops_live/inference_test.exs
mix test test/hive_web/controllers/inference_controller_test.exs test/hive/inference_test.exs
Headless Chrome smoke checks for profile creation, provider selection, models.dev validation, usage period filters, profile detail, token detail, providers, and profile client configuration.
