Tutorial
This tutorial walks you through a complete Recotem run: fetch data, train a model, serve it, and call /predict. The dataset is a small public purchase log CSV (the same file used by Recotem's own integration tests) and training takes about a minute on a laptop.
Prerequisites: either Docker with the Compose plugin, or Python 3.12+ with Recotem installed. You also need about 50 MB of free disk space and network access to raw.githubusercontent.com.
Choose your path:
- Path A — Docker Compose (recommended; no Python install needed)
- Path B — pip
The tutorial recipe
The recipe at examples/tutorial-purchase-log/recipe.yaml describes the whole pipeline:
name: purchase_log
source:
  type: csv
  path: https://raw.githubusercontent.com/codelibs/recotem/refs/tags/v1.0.0/frontend/e2e/test_data/purchase_log.csv
  sha256: 945fc769205a5976d38c5783500ae473afbb04608043b703951a699993c8f8be
  dtype:
    user_id: str
    item_id: str
schema:
  user_column: user_id
  item_column: item_id
cleansing:
  drop_null_ids: true
  dedup: keep_last
  min_rows: 100
  min_users: 10
  min_items: 10
training:
  algorithms: [IALS, TopPop]
  metric: ndcg
  cutoff: 10
  n_trials: 10
  split:
    scheme: random
    heldout_ratio: 0.2
    seed: 42
output:
  path: ./artifacts/purchase_log.recotem
  versioning: append_sha
A few things worth noting:
- source.sha256 is required whenever a data file is fetched over HTTP or HTTPS. Recotem verifies that the download matches the expected checksum before touching it. This prevents training on a silently swapped or corrupted file.
- training.algorithms lists two candidates: IALS (implicit-feedback matrix factorization) and TopPop (popularity baseline). Optuna runs trials for each and picks the best-scoring combination.
- output.versioning: append_sha writes each new artifact with a unique suffix and updates a pointer file. The server reads through the pointer, so hot-swapping is atomic.
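The checksum pin described in the first note can be illustrated with a few lines of standard-library Python. This is a minimal sketch of the general technique, not Recotem's actual implementation; the function name verify_sha256 and the use of ValueError are assumptions made for illustration.

```python
import hashlib
import hmac


def verify_sha256(data: bytes, expected_hex: str) -> None:
    """Raise ValueError unless SHA-256(data) equals expected_hex.

    Hypothetical helper: the name and the exception type are illustrative.
    """
    actual_hex = hashlib.sha256(data).hexdigest()
    # compare_digest avoids leaking where the two strings first differ.
    if not hmac.compare_digest(actual_hex, expected_hex.strip().lower()):
        raise ValueError(
            f"sha256 mismatch: expected {expected_hex}, got {actual_hex}"
        )


if __name__ == "__main__":
    payload = b"user_id,item_id\n1,42\n"  # stand-in for downloaded CSV bytes
    pin = hashlib.sha256(payload).hexdigest()  # the value a recipe would pin
    verify_sha256(payload, pin)  # matches, so nothing is raised
    try:
        verify_sha256(payload + b"tampered", pin)
    except ValueError as exc:
        print(exc)  # reports the mismatch
```

The point of checking before parsing is that a changed file is rejected up front instead of silently producing a different model.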
Path A — Docker Compose
Step 1 — Generate keys
docker run --rm ghcr.io/codelibs/recotem:latest keygen --type signing --kid dev
Copy the env_entry= line from the output and set it:
export RECOTEM_SIGNING_KEYS="dev:<plaintext-hex-from-output>"
Then generate an API key:
docker run --rm ghcr.io/codelibs/recotem:latest keygen --type api --kid dev
Copy both the env_entry= line and the plaintext= line:
export RECOTEM_API_KEYS="dev:sha256:<hash-hex-from-output>"
export RECOTEM_API_PLAINTEXT="<plaintext-from-output>" # used in Step 4 (curl)
Step 2 — Train
From the repository root:
docker compose run --rm train
This runs a one-shot training container. It fetches the CSV from GitHub, verifies the sha256, runs the Optuna search, and writes a signed artifact to the artifacts volume shared with the serve container.
The last log line should look like:
{"event":"train_done","name":"purchase_log","exit_code":0,
"artifact":"./artifacts/purchase_log....recotem","best_class":"IALSRecommender"}
Step 3 — Serve
docker compose up -d serve
Check that the server started and loaded the model:
curl http://localhost:8080/health
Expected response:
{"status":"ok","total":1,"loaded":1}
Step 4 — Predict
curl -sX POST http://localhost:8080/predict/purchase_log \
  -H "X-API-Key: $RECOTEM_API_PLAINTEXT" \
  -H "Content-Type: application/json" \
  -d '{"user_id": "1", "cutoff": 5}' | python3 -m json.tool
Expected response shape (exact scores vary by training run):
{
  "items": [
    {"item_id": "42", "score": 0.91},
    {"item_id": "17", "score": 0.87}
  ],
  "model": {
    "recipe": "purchase_log",
    "best_class": "IALSRecommender",
    "kid": "dev"
  },
  "request_id": "..."
}
Step 5 — Tear down
docker compose down -v
Path B — pip
Step 1 — Install and verify
pip install recotem
recotem --help
Step 2 — Generate keys
recotem keygen --type signing --kid dev
recotem keygen --type api --kid dev
Export the values shown in the output:
export RECOTEM_SIGNING_KEYS="dev:<plaintext-hex-from-signing>"
export RECOTEM_API_KEYS="dev:sha256:<hash-hex-from-api>"
export RECOTEM_API_PLAINTEXT="<plaintext-from-api>"
Step 3 — Validate the recipe (optional but recommended)
recotem validate examples/tutorial-purchase-log/recipe.yaml
This parses the recipe and runs a quick connectivity check (an HTTP HEAD request to the CSV URL) without downloading the full file. Useful for catching configuration problems before committing to a full training run.
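A HEAD-based connectivity check like the one described above can be sketched with urllib from the standard library. This is an illustration of the general idea, not what recotem validate actually runs; build_head_request and is_reachable are hypothetical names.

```python
import urllib.request


def build_head_request(url: str) -> urllib.request.Request:
    """Build a HEAD request, which asks for headers only and no body."""
    return urllib.request.Request(url, method="HEAD")


def is_reachable(url: str, timeout: float = 10.0) -> bool:
    """Return True if the server answers the HEAD request with a 2xx status.

    Hypothetical helper for illustration only.
    """
    try:
        with urllib.request.urlopen(build_head_request(url), timeout=timeout) as resp:
            return 200 <= resp.status < 300
    except OSError:  # URLError and HTTPError are both OSError subclasses
        return False
```

Calling is_reachable with the source.path URL from the recipe would confirm the file is reachable without transferring its contents. It does not check the sha256, which needs the full download.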
Step 4 — Train
Run from the repository root so the CWD-relative output.path (./artifacts/...) resolves correctly:
mkdir -p artifacts
recotem train examples/tutorial-purchase-log/recipe.yaml
Step 5 — Serve
recotem serve --recipes examples/tutorial-purchase-log/
Step 6 — Predict
In a separate terminal:
curl -sX POST http://127.0.0.1:8080/predict/purchase_log \
  -H "X-API-Key: $RECOTEM_API_PLAINTEXT" \
  -H "Content-Type: application/json" \
  -d '{"user_id": "1", "cutoff": 5}' | python3 -m json.tool
What just happened
- recotem train parsed the recipe, fetched the CSV over HTTPS, verified the sha256, ran an Optuna hyperparameter search across IALS and TopPop, and wrote a binary artifact signed with your signing key.
- recotem serve watched the artifact directory, found the new file, HMAC-verified it against the same signing key, and registered the /predict/purchase_log endpoint.
- The /predict request was authenticated by the API key allow-list and scored using the trained model.
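The two checks in this summary, HMAC verification of the artifact and the API key allow-list, can be sketched in standard-library Python. Everything here is an assumption for illustration: the function names are hypothetical, HMAC-SHA256 with a hex tag is a guess at the scheme, and storing API keys as SHA-256 hex digests of the UTF-8 plaintext is only suggested by the dev:sha256:<hash-hex> format, not confirmed by this tutorial.

```python
import hashlib
import hmac


def sign_artifact(payload: bytes, signing_key: bytes) -> str:
    """Return a hex HMAC-SHA256 tag for payload (assumed scheme)."""
    return hmac.new(signing_key, payload, hashlib.sha256).hexdigest()


def verify_artifact(payload: bytes, tag_hex: str, signing_key: bytes) -> bool:
    """Recompute the tag and compare it in constant time."""
    expected = sign_artifact(payload, signing_key)
    return hmac.compare_digest(expected, tag_hex)


def api_key_allowed(presented_plaintext: str, allowed_hashes: set[str]) -> bool:
    """Hash the presented key and look it up in an allow-list of hex digests."""
    digest = hashlib.sha256(presented_plaintext.encode("utf-8")).hexdigest()
    return any(hmac.compare_digest(digest, h) for h in allowed_hashes)


if __name__ == "__main__":
    key = bytes.fromhex("00" * 32)  # placeholder key for the demo only
    blob = b"model bytes"
    tag = sign_artifact(blob, key)
    print(verify_artifact(blob, tag, key))         # True
    print(verify_artifact(blob + b"x", tag, key))  # False: payload changed

    plaintext = "example-api-key"
    allow = {hashlib.sha256(plaintext.encode("utf-8")).hexdigest()}
    print(api_key_allowed(plaintext, allow))          # True
    print(api_key_allowed(next(iter(allow)), allow))  # False: the hash was sent
```

Under these assumptions, a server that stores only hashes cannot accept the hash itself as a credential, which is why the request must carry the plaintext value.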
Common issues
| Symptom | Likely cause | Fix |
|---|---|---|
| RecipeError: 'source.path' uses a network scheme … requires a 'sha256' integrity pin | The sha256 field was removed from the recipe | Re-add the sha256: line |
| DataSourceError: sha256 mismatch | The upstream file changed | Re-compute with curl -sL <url> \| shasum -a 256 and update the recipe |
| DataSourceError: HTTP 404 fetching ... | The URL changed | Verify the URL in a browser; check the v1.0.0 tag is still present |
| ArtifactError: RECOTEM_SIGNING_KEYS not set | Step 1 (key generation) was not exported | Re-run the export and try again |
| 401 Unauthorized on /predict | Wrong API key value | Use the plaintext line from keygen --type api, not the hash line |
| 503 recipe_unavailable immediately after training | The watcher has not polled yet | Wait up to RECOTEM_WATCH_INTERVAL seconds (default 5 s; the tutorial compose sets 10 s). Check /health. |
| Path B: artifact written to the wrong directory | The recipe's output.path is relative to the working directory | Run recotem train from the repository root, or change output.path to an absolute path |
| recotem: command not found after pip install | The venv is not activated | Activate the venv, or run python -m recotem ... |
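
For the 503 case in the table, a script can poll /health instead of sleeping for a fixed time. The sketch below assumes that total counts registered recipes and loaded counts those with a model in memory, which is inferred from the sample response in this tutorial and not confirmed. The function names are hypothetical.

```python
import json
import time
import urllib.request
from typing import Callable


def fetch_health(url: str = "http://localhost:8080/health") -> dict:
    """GET the health endpoint and decode its JSON body."""
    with urllib.request.urlopen(url, timeout=5) as resp:
        return json.load(resp)


def wait_until_loaded(
    fetch: Callable[[], dict] = fetch_health,
    timeout_s: float = 30.0,
    interval_s: float = 1.0,
) -> dict:
    """Poll until every registered model is loaded, or raise TimeoutError."""
    deadline = time.monotonic() + timeout_s
    while True:
        try:
            health = fetch()
            # Assumed meaning: loaded == total means all models are ready.
            if health.get("total", 0) > 0 and health.get("loaded") == health.get("total"):
                return health
        except OSError:
            pass  # server is not accepting connections yet
        if time.monotonic() >= deadline:
            raise TimeoutError("model was not loaded before the deadline")
        time.sleep(interval_s)
```

Passing the fetch function as a parameter keeps the loop testable without a running server. A timeout somewhat longer than RECOTEM_WATCH_INTERVAL leaves room for at least one watcher poll.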
Next steps
- Recipe Basics — understand every section of a recipe in detail
- CLI Reference — all flags for train, serve, and the other commands
- Recipe Reference — full field-level documentation for every recipe field
- Batch and Scheduling — run training on a cron schedule
