Hive
feat(inference): add model-bound relay profiles
GitHub issue · Closed
What changed
Hive can now act as an OpenAI-compatible inference relay. It adds model-bound profiles and tokens, provider management, usage accounting, dashboard views, and self-hosting documentation.
The relay exposes the OpenAI-compatible models and chat completion endpoints, including streamed responses. Tokens are bound to a profile, profiles route to a runtime-managed or environment-backed provider, and each completed relay request records input tokens, output tokens, estimated United States dollar cost, and audit activity.
The operations dashboard now includes Inference > Profiles and Providers, profile and token detail pages, usage widgets, period-aware charts, token creation, provider creation, provider deletion controls, models.dev model validation, and profile-level client configuration. Seed data creates providers, profiles, tokens, and usage rows so the dashboard has realistic data locally.
Why
Repositories and automation workflows need a stable model address while operators need the freedom to retarget the upstream provider or model as cost, latency, and quality change. Binding usage to Hive-issued tokens also gives operators a more granular attribution boundary, so one profile can serve multiple repositories, workflows, or teams while still showing separate usage and cost breakdowns.
Approach
Profiles store the stable model name plus the upstream provider, models.dev model identifier, enabled state, and editable per-model pricing. Tokens carry the runtime authorization and attribution boundary. Providers can be created in the dashboard or loaded from environment configuration, with credentials encrypted when managed in Hive.
The relay authenticates authorization tokens, validates the requested profile model, rewrites the upstream model before forwarding, preserves streamed chunks, and records usage when the upstream provider returns token counts.
Impact
Existing agent settings stay unchanged. The new relay is opt-in through the inference routes and operations dashboard. Self-hosted operators should configure at least one provider before creating production profiles, and should treat Hive tokens like repository secrets because Hive only shows a token once.
Validation
mix format --check-formattedmix compile --warnings-as-errorsmix testmix credogit diff --checkmix test test/hive/inference_test.exs test/hive_web/controllers/inference_controller_test.exs test/hive_web/live/ops_live/inference_test.exsmix test test/hive_web/controllers/inference_controller_test.exs test/hive/inference_test.exs- headless Chrome browser smoke checks for profile creation, provider selection, models.dev validation, usage period filters, profile detail, token detail, providers, and profile client configuration.
No GitHub comments yet.