
Endpoint Designer

The three endpoints every AI MVP should expose

How to Read This Page

Each endpoint below is a worked example of what your team's api/api_design.md should contain. Every endpoint has four parts: purpose, request schema, response schema, and error conditions. On top of those comes the anti-GenAI requirement: a statement of why the endpoint exists.

Your job in Breakout 1: copy this structure for a product of your team's choosing. Don't just rename fields — pick a real capability and justify each endpoint.

POST /v1/predict

Why this endpoint exists: the product capability is "get a prediction from the model given a product-meaningful input." Every client — web, mobile, batch pipelines, other services — calls this same endpoint. The model behind it can change; this contract doesn't.

Request

Headers

Content-Type: application/json
X-Request-Id: optional client-supplied trace ID

Body

{
  "text": "string, required, 1-5000 chars",
  "options": {
    "return_probabilities": false
  }
}

Response

{
  "prediction": "positive",
  "confidence": 0.94,
  "request_id": "req_01HXYZ...",
  "model_version": "1.2.0"
}

confidence is a float 0.0–1.0. model_version lets clients invalidate caches when the model changes without the contract changing.
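
The cache-invalidation idea can be sketched on the client side. This is a minimal illustration, not part of the contract: the PredictionCache class is hypothetical, and the second response (the "negative" prediction, its confidence, and version "1.3.0") is made-up sample data. The only thing it relies on is that every response carries model_version.

```python
from typing import Dict, Optional


class PredictionCache:
    """Client-side cache that is flushed whenever model_version changes."""

    def __init__(self) -> None:
        self._version: Optional[str] = None
        self._entries: Dict[str, dict] = {}

    def get(self, text: str) -> Optional[dict]:
        return self._entries.get(text)

    def store(self, text: str, response: dict) -> None:
        version = response["model_version"]
        if version != self._version:
            # A different model is serving now, so earlier predictions may be stale.
            self._entries.clear()
            self._version = version
        self._entries[text] = response


cache = PredictionCache()
cache.store("I loved this product",
            {"prediction": "positive", "confidence": 0.94, "model_version": "1.2.0"})
# Made-up second response that reports a newer model version.
cache.store("Terrible",
            {"prediction": "negative", "confidence": 0.88, "model_version": "1.3.0"})
print(cache.get("I loved this product"))  # None: flushed by the version change
```

The request and response shapes stay the same throughout; only the value of model_version tells the client to drop what it had cached.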

Errors

Code  When                                  Body
400   Missing required field                {"error": "text is required"}
413   Input exceeds 5000 chars              {"error": "text too long"}
422   Validation failed (wrong type, etc.)  FastAPI validation detail
429   Rate limit exceeded                   {"error": "rate limit", "retry_after": 30}
503   Model unavailable / overloaded        {"error": "model unavailable"}

curl

curl -X POST http://localhost:8000/v1/predict \
  -H "Content-Type: application/json" \
  -d '{"text": "I loved this product"}'
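
One way to check that the request schema and the error conditions agree is to write the validation as a plain function. The sketch below uses only the standard library; validate_predict_body is a hypothetical helper, not FastAPI's API. It also makes two assumptions the page does not state: an empty string is treated as a 422, and the 422 body is a placeholder for FastAPI's real validation detail. The 429 and 503 cases depend on runtime state and are left out.

```python
from typing import Any, Dict, Tuple

MAX_TEXT_CHARS = 5000  # limit taken from the request schema


def validate_predict_body(body: Dict[str, Any]) -> Tuple[int, Dict[str, Any]]:
    """Return (status, error_body); status 200 with an empty dict means valid."""
    if "text" not in body:
        return 400, {"error": "text is required"}

    text = body["text"]
    if not isinstance(text, str) or len(text) < 1:
        # Assumption: wrong type and empty string both map to 422.
        # The detail string stands in for FastAPI's own validation detail.
        return 422, {"detail": "text must be a non-empty string"}

    if len(text) > MAX_TEXT_CHARS:
        return 413, {"error": "text too long"}

    options = body.get("options", {})
    if not isinstance(options, dict) or not isinstance(
        options.get("return_probabilities", False), bool
    ):
        return 422, {"detail": "options.return_probabilities must be a boolean"}

    return 200, {}


print(validate_predict_body({"text": "I loved this product"}))
print(validate_predict_body({}))
print(validate_predict_body({"text": "x" * 5001}))
```

Writing the checks out like this tends to surface gaps in the table, such as which code an empty string should return.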

GET /health

Why this endpoint exists: load balancers, orchestrators, and monitoring systems need a cheap, side-effect-free way to ask "is this service alive?" A failing /health takes an instance out of rotation before users see errors.

Request

No body. No auth. No rate limit.

GET /health

Response

{
  "status": "ok",
  "uptime_seconds": 84231,
  "model_loaded": true
}

Common mistake: don't make /health call your model or database. A health check that hits every downstream service turns one slow dependency into a cascading outage. Use /readiness for that if you need it.

Errors

Code  When
503   Service is starting up or shutting down
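
A cheap health check can be sketched as a read of in-memory flags. The HealthState class and its accepting flag are hypothetical names, and the 503 body is an assumption, since the page only specifies the status code. The point of the sketch is that health() never touches the model or a database.

```python
import time
from typing import Dict, Tuple


class HealthState:
    """In-memory flags only: answering /health never calls a dependency."""

    def __init__(self) -> None:
        self.started_at = time.monotonic()
        self.model_loaded = False  # set by the loader once the model is in memory
        self.accepting = False     # False while starting up or shutting down

    def health(self) -> Tuple[int, Dict[str, object]]:
        if not self.accepting:
            # Assumed body; the design above only fixes the 503 status code.
            return 503, {"status": "unavailable"}
        return 200, {
            "status": "ok",
            "uptime_seconds": int(time.monotonic() - self.started_at),
            "model_loaded": self.model_loaded,
        }


state = HealthState()
print(state.health())   # 503 while the service is still starting
state.model_loaded = True
state.accepting = True
print(state.health())   # 200 with the documented response shape
```

Note that model_loaded is a flag the loader sets once, not a live probe of the model, which keeps the endpoint side-effect-free.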

GET /metadata

Why this endpoint exists: clients and debug tools need to know which version of the contract they're talking to without parsing a URL or reading a changelog. Support engineers read this response first when a user reports a problem.

Request

GET /metadata

Response

{
  "api_version": "1.2.0",
  "model_version": "sentiment-classifier@2026-04-12",
  "supported_languages": ["en", "es"],
  "max_input_length": 5000,
  "rate_limit_per_minute": 60
}

Keep this endpoint cheap. It's read by every client on startup, and sometimes on every request.
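
One way to keep the endpoint cheap is to build the payload once at startup and return the same bytes on every call. This is a sketch under that assumption; metadata_response is a hypothetical handler body, and the values are copied from the example response above.

```python
import json

# Built once at import time; nothing here is recomputed per request.
_METADATA = {
    "api_version": "1.2.0",
    "model_version": "sentiment-classifier@2026-04-12",
    "supported_languages": ["en", "es"],
    "max_input_length": 5000,
    "rate_limit_per_minute": 60,
}
_METADATA_BYTES = json.dumps(_METADATA).encode("utf-8")


def metadata_response() -> bytes:
    """Return the pre-serialized payload: no I/O, no model call, no serialization."""
    return _METADATA_BYTES


print(metadata_response().decode("utf-8"))
```

If the model can be swapped while the service is running, the bytes would need to be rebuilt at that moment rather than on each request.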

Design Review Checklist

Use this in Breakout 2 when reviewing another team's design:

Purpose

Can you explain the endpoint in one sentence without using words like "handle" or "process"?

Request schema

Are field types, required-ness, and constraints explicit? Could you mock this without asking questions?

Response schema

Is the shape the same on every successful response? Are nullable fields marked? Is there a request_id?

Error conditions

Every 4xx and 5xx you can trigger should be listed with the body shape and a human-readable reason.

Leak check

Does the request or response mention model names, token IDs, internal DB fields, or infrastructure? That's a leak. Rename or remove.
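
A first pass at the leak check can be automated with a crude scan of field names. The deny-list below is entirely hypothetical and should be replaced with terms from your own stack, and the sample payload is invented for illustration. The scan looks only at keys, not values, so it supplements a human review rather than replacing it.

```python
from typing import Any, Iterator

# Hypothetical deny-list; substring matching is crude and can produce false positives.
LEAK_TERMS = ("token_id", "logit", "embedding", "table", "pod", "gpu")


def find_leaks(payload: Any, path: str = "") -> Iterator[str]:
    """Yield the dotted path of every key whose name contains a deny-listed term."""
    if isinstance(payload, dict):
        for key, value in payload.items():
            here = f"{path}.{key}" if path else key
            if any(term in key.lower() for term in LEAK_TERMS):
                yield here
            yield from find_leaks(value, here)
    elif isinstance(payload, list):
        for index, item in enumerate(payload):
            yield from find_leaks(item, f"{path}[{index}]")


# Invented example response with two internal fields left in by mistake.
sample = {
    "prediction": "positive",
    "debug": {"token_ids": [101, 2023], "gpu_node": "node-a"},
}
print(list(find_leaks(sample)))
```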

Versioning

Is the URL prefixed with /v1/? If not, how will you introduce a breaking change?