AI proxy

Call Anthropic and OpenAI via amba so provider keys stay server-side. Prompts are managed in the console.

Amba.ai.* proxies LLM requests through amba so your provider keys never ship to a client. You write the prompt once in the console and reference it by prompt_slug from any SDK; the server resolves the system prompt, substitutes your variables, selects the model, and applies rate limits and usage tracking.

Two providers wired today: Anthropic (messages.create) and OpenAI (chat.completions.create). Both return the upstream response shape verbatim, plus a usage event so you can attribute cost per user.

Quick start

import { Amba } from '@layers/amba-web';
 
const reply = await Amba.ai.anthropic.messages.create({
  prompt_slug: 'support_assistant',
  variables: { user_query: 'How do I cancel?' },
  max_tokens: 1024,
});
 
console.log(reply.content);

Operations

ai.anthropic.messages.create({ prompt_slug, variables, max_tokens })

Sends a prompt to Anthropic via amba.

  • prompt_slug (required): Reference to a prompt defined in the console. The server resolves the system prompt and model.
  • variables (optional): Object whose keys substitute into the prompt's {{variable}} placeholders.
  • max_tokens (optional): Caps output length. The server enforces a project-wide ceiling.
  • temperature (optional): Float; the prompt's console default applies if omitted.
  • enable_prompt_cache (optional): Pass true to opt into Anthropic's prompt caching for that slug.
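
Putting the optional fields together: a call that overrides the console defaults might look like this (slug and values are illustrative):

const reply = await Amba.ai.anthropic.messages.create({
  prompt_slug: 'summarize_review',
  variables: { review_text: 'Great blender, but the lid leaks.' },
  max_tokens: 512,           // still subject to the project-wide ceiling
  temperature: 0.2,          // overrides the prompt's console default
  enable_prompt_cache: true, // opt into Anthropic prompt caching for this slug
});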

The response shape mirrors Anthropic's Message object:

{
  content: [...],        // array of content blocks (text / tool_use)
  usage: {
    input_tokens: 123,
    output_tokens: 456,
    cache_creation_input_tokens: 78,
    cache_read_input_tokens: 90,
  },
  stop_reason: 'end_turn',
  model: 'claude-sonnet-4-5',
}
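
content is an array of blocks rather than a plain string, so pulling the text out takes a small filter; a minimal sketch:

// Concatenate the text blocks, skipping any tool_use blocks.
const text = reply.content
  .filter((block) => block.type === 'text')
  .map((block) => block.text)
  .join('');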

ai.openai.chat.completions.create({ prompt_slug, variables, max_tokens })

Same request shape, but routed to OpenAI. The response mirrors OpenAI's ChatCompletion object.
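
The call mirrors the Anthropic variant; only the namespace changes, and the reply is read through OpenAI's choices array:

const reply = await Amba.ai.openai.chat.completions.create({
  prompt_slug: 'support_assistant',
  variables: { user_query: 'How do I cancel?' },
  max_tokens: 1024,
});

console.log(reply.choices[0].message.content);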

Patterns

Prompt slugs

Prompts live in the console — system prompt, variable schema, default model, default temperature, default max_tokens. The client passes the slug; the server fills the rest. This means you can:

  • Change the model behind a slug without redeploying.
  • A/B test prompt variants by routing a slug through a feature flag (see the sketch after this list).
  • Audit which user sent which prompt via the per-call usage event.
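
A sketch of the feature-flag routing from the list above. isInVariant stands in for your own flag check; it is not part of the amba SDK:

// Route between two prompt variants behind a feature flag.
const slug = isInVariant('support_assistant_v2')
  ? 'support_assistant_v2' // variant under test
  : 'support_assistant';   // control

const reply = await Amba.ai.anthropic.messages.create({
  prompt_slug: slug,
  variables: { user_query: 'How do I cancel?' },
});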

Slug naming: lowercase, underscore-separated, up to 64 characters. Group by feature: support_assistant, summarize_review, onboarding_recommend.

Variable substitution

The prompt template uses {{name}} placeholders. The client's variables object fills them:

// Prompt template (in console):
//   You are a helpful support assistant. The user asks: {{user_query}}
//   Reply in under 200 words.

await Amba.ai.anthropic.messages.create({
  prompt_slug: 'support_assistant',
  variables: { user_query: 'How do I cancel?' },
});

Variables that aren't in the template are silently ignored. Missing required variables return a 400 from the server with the offending key listed.
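
To surface that 400 to callers, wrap the call. The exact error object the SDK throws isn't documented here, so treat the catch as a sketch:

try {
  await Amba.ai.anthropic.messages.create({
    prompt_slug: 'support_assistant',
    variables: {}, // user_query deliberately missing
  });
} catch (err) {
  // The server responds 400 with the offending key listed; the error's
  // exact client-side shape is an assumption, inspect what your SDK throws.
  console.error('missing prompt variable:', err);
}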

Streaming

Streaming responses (Anthropic's stream: true, OpenAI's chunked) are exposed via the SDK as an async iterator on platforms that support it. See client API — ai for the canonical stream shape.
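
A sketch of consuming that iterator. The stream flag and the chunk fields below are assumptions; client API — ai has the canonical shape:

const stream = await Amba.ai.anthropic.messages.create({
  prompt_slug: 'support_assistant',
  variables: { user_query: 'How do I cancel?' },
  stream: true, // assumed flag, mirroring Anthropic's stream option
});

let text = '';
for await (const chunk of stream) {
  text += chunk.delta?.text ?? ''; // assumed text-delta field on each chunk
}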

Cost attribution

Every ai.*.create call emits an ai_usage event automatically — same events namespace as everything else, but with usage.input_tokens and usage.output_tokens attached. Query the event stream per user to attribute cost without instrumenting your own counter.
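
Attribution then reduces to summing tokens over those events. fetchUsageEvents below is a hypothetical stand-in; substitute however you query your event stream:

// fetchUsageEvents is hypothetical; replace with your event-stream query.
const events = await fetchUsageEvents({ name: 'ai_usage', userId: 'user_123' });

const totals = events.reduce(
  (acc, e) => ({
    inputTokens: acc.inputTokens + e.usage.input_tokens,
    outputTokens: acc.outputTokens + e.usage.output_tokens,
  }),
  { inputTokens: 0, outputTokens: 0 },
);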

Limits

  • Prompt slug must exist: the server returns 404 prompt_slug_not_found for unknown slugs. Define the prompt in the console before referencing it.
  • max_tokens ceiling: per-project hard cap (default 4096); the prompt's console default applies if you don't pass one.
  • Rate limits: per-prompt-slug rate limits configured in the console. Defaults are conservative; raise per-slug as your usage grows.
  • No client-side keys: client SDKs cannot pass an api_key. The server's provider key is the only credential in play.
  • No tool-use round-trip from clients: tool calls (Anthropic) and function calling (OpenAI) are accepted in the response, but the client SDK doesn't auto-execute tools. Run tool dispatch in a server function and only return the final text to the client (see the sketch below).
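
A sketch of that server-side tool dispatch. runTool is a placeholder for your own dispatcher, and feeding the tool result back to the model is left out because it depends on your server setup:

// Server function: detect a tool_use block and dispatch it yourself.
const reply = await Amba.ai.anthropic.messages.create({
  prompt_slug: 'support_assistant',
  variables: { user_query: 'When does my plan renew?' },
});

const toolUse = reply.content.find((block) => block.type === 'tool_use');
if (toolUse) {
  // runTool is your own dispatcher, not part of the amba SDK.
  const result = await runTool(toolUse.name, toolUse.input);
  // ...continue the exchange server-side, then return only the final text.
}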
