ci: design nix2container migration + scaffold amd64 publish app (emmett#44)

Archetype: oci-image (buildx -> in-cluster remote buildkit), the HARD case.
DESIGN/PARTIAL, not a finished migration:

- ci/MIGRATION.md: concrete plan to escape buildkit via nix2container/skopeo
  +regctl. The app is pure-stdlib Python, so both arches are buildable on
  emmett (amd64-native + Nix-cross-from-amd64 python3 closure) with no
  buildkit/qemu/docker -> no foreign-arch leg needed; Dockerfile retired on
  cutover. Covers per-arch build, entrypoints, .woodpecker.yaml target,
  escape hatch (unused here), risks, remaining work.
- flake.nix: scaffolds the natively-buildable amd64 leg only
  (stage-amd64, publish-amd64), dry-run by default (PUBLISH=1 to push),
  $REGISTRY_TOKEN -> pass fallback, registry-down/empty-token blockers.
  Mirrors reference impl claude-plugin-registry@9850745.

arm64 leg, publish-index/publish, and YAML cutover are designed but NOT wired.
Verified: nix eval .#apps.x86_64-linux (-> stage-amd64, publish-amd64); no
image build was run (it would download the closure).
Oleks
2026-06-01 23:41:44 +03:00
parent 3a2df3d6e1
commit e8f3e954e7
3 changed files with 566 additions and 0 deletions
@@ -0,0 +1,222 @@
# Migration plan: buildx → nix2container/skopeo+regctl (emmett#44)
<!-- markdownlint-disable MD013 MD040 MD060 -->
<!-- design doc: dense tables, command/output blocks, and long refs -->
**Status: DESIGN + PARTIAL SCAFFOLD.** This document is a concrete plan, not a
completed migration. A `flake.nix` in the repo root scaffolds the
**natively-buildable amd64 leg only** (`nix run .#publish-amd64`, dry-run by
default). The arm64 leg and the `.woodpecker.yaml` cutover are designed here but
**not yet wired** — see "Remaining work".
Archetype: `oci-image` (buildx → in-cluster remote buildkit) — the HARD
archetype in the emmett#44 local-pipeline-parity standard. Tracking: oleks/cluster
milestone #57.
---
## 1. Why this repo is an easy migration (feasibility)
The current pipeline (`.woodpecker.yaml`) does:
```
docker buildx create --driver remote tcp://buildkit-rootless-arm64.infra.svc...
docker buildx build --platform linux/amd64,linux/arm64 --push .
```
i.e. it depends on **in-cluster remote buildkit** for the multi-arch build and on
a node-pinned step (`howard2404`). That is exactly the cluster-coupled,
not-reproducible-on-emmett shape emmett#44 wants gone.
The application (`bridge.py`) is the easiest possible payload:
- **Pure Python standard library.** `json`, `os`, `sys`, `http.server`,
`urllib`. No `pip install`, no requirements file, no C/Rust extension, no
native wheel.
- The Dockerfile is 5 lines: `FROM python:3.12-alpine`, copy one file, run it.
Consequences for nix2container:
- The image's only real runtime dependency is a **CPython interpreter + its Nix
closure** (glibc, openssl, zlib, ncurses, ...). `bridge.py` is a static asset
copied next to it.
- CPython is in nixpkgs and **cross-compiles cleanly with `pkgsCross`**. There
is no source build of the app to cross, only the standard interpreter, whose
aarch64 closure is expected to come from the binary cache (coverage is still
to be verified; see "Remaining work"). So **both arches are buildable on
emmett (amd64-native + Nix-cross-from-amd64)** with **no buildkit, no qemu, no
docker daemon**. This satisfies the emmett-OK definition in the standard
("amd64-native OR Nix-cross-from-amd64, never 'uses skopeo' as the build, and
not a foreign-arch buildkit leg").
This repo is therefore a clean reference for "oci-image buildx leg that fully
escapes buildkit", not one of the genuinely-hard foreign-arch cases.
---
## 2. What the flake builds, per arch
Mirrors the reference impl (oleks/claude-plugin-registry @ 9850745), minus the
Rust/Dioxus build (we have no compiled artifact):
```
inputs: fleet (nixpkgs-projects pin), nix2container, flake-utils
mkApp targetPkgs:
# a tiny derivation that places bridge.py + a python3 symlink under /app
appRoot = runCommand "app-root-<arch>":
mkdir -p $out/app
cp ${./bridge.py} $out/app/bridge.py
ln -s ${targetPkgs.python3}/bin/python3 $out/app/python3 # closure tracked
mkImage arch = nix2container.buildImage {
name = "git.oleks.space/oleks/alertmanager-gotify-bridge";
tag = "${version}-${arch}";
inherit arch; # "amd64" | "arm64"
layers = [ (buildLayer {
copyToRoot = [ (appRoot arch) cacert ];
maxLayers = 25;
reproducible = false; # see Determinism note
}) ];
config = {
Cmd = [ "/app/python3" "/app/bridge.py" ];
WorkingDir = "/app";
ExposedPorts = { "8080/tcp" = {}; };
Env = [ "PORT=8080" "SSL_CERT_FILE=/etc/ssl/certs/ca-bundle.crt" ];
};
};
```
- **amd64**: `targetPkgs = pkgs` (native). Fully buildable + pushable on emmett.
- **arm64**: `targetPkgs = pkgs.pkgsCross.aarch64-multiplatform`. Cross from
amd64; the aarch64 `python3` closure is expected to come from the binary cache
(a cold cache means a slow source build, still without qemu).
> The Alpine base in the current Dockerfile is intentionally dropped: nix2container
> ships only the interpreter closure, so the image is smaller and has no distro
> package layer. `urllib`+TLS works because `cacert` is copied in and
> `SSL_CERT_FILE` points at it.
---
## 3. Entrypoints (uniform front door)
Per the corrected standard, the flake apps ARE the shared code; `.woodpecker.yaml`
and local runs invoke the same `nix run`:
| app | does |
|-----|------|
| `stage-amd64` / `stage-arm64` | realize the image into the local Nix store, **no registry contact** (BUILD/STAGE-parity half) |
| `publish-amd64` / `publish-arm64` | stage + `copyTo` (skopeo) push that arch (PUBLISH half) |
| `publish-index` | build-free: `regctl index create` from pushed per-arch refs, then `:latest` = digest copy of `:TAG` as the LAST mutation |
| `publish` | both arches + `publish-index` |
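The ordering in the `publish-index` row can be sketched as a plan printer. This is a minimal sketch, not the reference implementation: `plan_publish_index` and the `BLOCKER(missing-arch)` label are hypothetical names, the per-arch digests are supplied by the caller, and the printed `regctl index create ... --ref` / `regctl image copy` invocations reflect an assumed reading of the regclient CLI that should be checked against the pinned `regctl`. It only prints the mutations, in keeping with the dry-run default.

```shell
set -eu

# plan_publish_index REPO TAG AMD64_DIGEST ARM64_DIGEST [RELEASE]
# (hypothetical helper) Prints, but does not run, the registry mutations in
# the order the table describes: immutable index first, :latest last and
# only when RELEASE=1.
plan_publish_index() {
  local repo="$1"
  local tag="$2"
  local amd64="$3"
  local arm64="$4"
  local release="${5:-0}"
  local d
  # Fail closed: every required arch must supply a digest from this run.
  for d in "$amd64" "$arm64"; do
    case "$d" in
      sha256:?*) ;;
      *) echo "BLOCKER(missing-arch): need a sha256 digest, got '$d'" >&2
         return 1 ;;
    esac
  done
  # Assumed regctl syntax; the index is built from digest-pinned refs.
  echo "regctl index create $repo:$tag --ref $repo@$amd64 --ref $repo@$arm64"
  if [ "$release" = "1" ]; then
    # :latest is a copy of the index just created, never a re-assembly.
    echo "regctl image copy $repo:$tag $repo:latest"
  fi
}
```

Keeping the plan as plain printed commands makes the ordering reviewable before any registry mutation happens; the real app would execute the same lines only under `PUBLISH=1`.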
Conventions carried verbatim from the reference:
- **Token**: `$REGISTRY_TOKEN` → fallback `pass infra/gitea/personal_access_token_packages_rw`
→ named `BLOCKER(empty-token)`. Never echoed; apps run under `set -euo pipefail`
only (no `set -x`).
- **Dry-run by default**: every push-capable app builds + prints what it would
push; `PUBLISH=1` / `--publish` required to mutate the registry. An accidental
local run cannot push.
- **VERSION/TAG**: `TAG=${VERSION:-<flake version>}`; CI derives `version` from
`CI_COMMIT_TAG` (strip leading `v`). `$VERSION` overrides for local dev.
- **Dev-tag guard**: `:latest` is only moved for a real release tag; dev tags
push per-arch + immutable index only.
- **Preflight**: `BLOCKER(registry-down)` if `https://git.oleks.space/v2/` is
unreachable (names the cluster-shared-fate failure mode).
- **Multi-arch**: the index is assembled from digest-pinned per-arch refs and
fails closed if a required arch is absent from this run. `:latest` is a digest
COPY (not a re-assembly) and is the last mutation. Staging per-arch tags can be
pruned; gitea-oci-cleanup pins index child digests.
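The dev-tag guard and the dry-run default combine into one decision: move `:latest` only when publishing a real release. The sketch below illustrates that decision under a stated assumption, namely that a "real release tag" is a plain `MAJOR.MINOR.PATCH` string and everything else (for example `0.0.0-dev`) is a dev tag. The helper names `is_release_tag` and `should_move_latest` are hypothetical and do not appear in the flake.

```shell
set -eu

# is_release_tag TAG (hypothetical helper): true only for plain X.Y.Z.
# Assumption: anything else (0.0.0-dev, 1.2.3-rc1, ...) counts as a dev tag.
is_release_tag() {
  printf '%s\n' "$1" | grep -Eq '^[0-9]+\.[0-9]+\.[0-9]+$'
}

# should_move_latest TAG PUBLISH (hypothetical helper): prints "yes" only
# when PUBLISH=1 and TAG is a release tag, otherwise "no".
should_move_latest() {
  if [ "$2" = "1" ] && is_release_tag "$1"; then
    echo yes
  else
    echo no
  fi
}
```

For example, `should_move_latest "$TAG" "$PUBLISH"` prints `no` for the flake's default `0.0.0-dev` tag even with `PUBLISH=1`, so a dev push can never touch `:latest`.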
---
## 4. `.woodpecker.yaml` after migration (target)
The buildx/howard-pinned step is replaced by a single nix-ci step that runs the
same app the laptop runs:
```yaml
when: [{ event: tag, ref: "refs/tags/v*" }]
steps:
- name: publish
image: git.oleks.space/oleks/nix-ci:latest
environment:
REGISTRY_TOKEN: { from_secret: registry_token }
commands:
- export VERSION="$(echo "$CI_COMMIT_TAG" | sed 's/^v//')"
- nix run .#publish -- --publish # PUBLISH=1 via flag; CI is the only publisher
```
Gone: `docker login`, `buildx create --driver remote`, the `arm64` remote
builder, the `nodeSelector: howard2404` pin, the multi-arch `--platform` build.
The `skip_clone`+manual-clone dance can also drop to a normal clone with tags
(needed for `git describe` fallback) once the version is derived in shared code.
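Deriving the version in shared code, rather than in the pipeline YAML, could look roughly like this. It is a sketch only: `derive_version` is a hypothetical helper, the precedence (explicit `$VERSION`, then `$CI_COMMIT_TAG` with one leading `v` stripped, then a default) is inferred from the conventions in this document, and the `git describe` fallback is deliberately left out because it depends on the clone having tags.

```shell
set -eu

# derive_version [DEFAULT] (hypothetical helper): prints the tag to use.
# Precedence: $VERSION override, then $CI_COMMIT_TAG without its leading
# "v", then DEFAULT (the flake's 0.0.0-dev placeholder if not given).
derive_version() {
  local default="${1:-0.0.0-dev}"
  if [ -n "${VERSION:-}" ]; then
    printf '%s\n' "$VERSION"
  elif [ -n "${CI_COMMIT_TAG:-}" ]; then
    # ${var#v} strips a single leading "v" and leaves other tags untouched.
    printf '%s\n' "${CI_COMMIT_TAG#v}"
  else
    printf '%s\n' "$default"
  fi
}
```

With this in the app itself, the CI step would no longer need the `sed 's/^v//'` export, and a local run without any environment falls back to the dev tag.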
---
## 5. Foreign-arch escape hatch
Not needed for **this** repo — arm64 is a pure Nix cross of a stock interpreter,
so there is no foreign-arch buildkit leg at all. The general escape hatch (kept
for the upstream-source oci-image repos that genuinely need it) is documented for
the archetype, not used here:
> For images whose payload genuinely cannot be cross-compiled by Nix (an
> upstream binary only published for a foreign arch, or a build that won't
> cross), keep a `Dockerfile` + a buildx step that parameterizes
> `BUILDKIT_ADDR` (default `docker-container://local`, CI overrides to the
> in-cluster `tcp://buildkit-rootless-<arch>.infra.svc`). That leg stays
> cluster-coupled by necessity; its per-arch digest still feeds the same
> `regctl index create`/`publish-index` join point, so the multi-arch assembly
> is uniform across native, cross, and foreign-arch legs.
Because alertmanager-gotify-bridge has neither a compiled artifact nor a
foreign-only upstream binary, the escape hatch is dead code here and is **not**
added — the Dockerfile is retired entirely on cutover.
---
## 6. Risks / caveats
1. **Determinism (nix2container `reproducible = false`).** Same caveat as the
reference: parity holds only when emmett and CI resolve the **identical store
path** for the python3 closure from the shared cache. With the `fleet`
nixpkgs-projects pin + `flake.lock` committed this holds; verify by comparing
the pushed digest after a release. For a pure-stdlib interpreter image the
risk is low (no project-specific compiler in the closure), but it is real.
2. **TLS / cert path.** `urllib` to Gotify over HTTPS needs `cacert` in the image
and `SSL_CERT_FILE` set — handled in `config.Env` above. Must be verified once
against the live Gotify endpoint (the Alpine image got certs from the distro;
we now ship them explicitly).
3. **Image shape change.** Switching from `python:3.12-alpine` to a Nix closure
changes the digest, size, and layer layout. Any consumer pinning a specific
base-layer digest (unlikely here) would need updating. The `CMD` path changes
from `python` (PATH) to `/app/python3` (absolute symlink).
4. **arm64 cache coverage.** The cross build is only fast if the aarch64 python3
closure is in the binary cache; a cold cache makes the first emmett arm64 run
slow (still correct, no qemu). Aligns with the fleet-pins strategy.
5. **`publish-index` requires `regctl`/`skopeo` in `nix-ci`.** The reference
already relies on these being present; confirm the `nix-ci` image (or the
app's `runtimeInputs`) provides `regctl`, `skopeo` (via `copyTo`), `curl`.
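For risk 2, a cheap first check can run inside the image before the live Gotify test. The sketch below assumes a hypothetical helper, `check_ca_bundle`, which is not part of the flake. It only confirms that the file named by `SSL_CERT_FILE` is readable and contains at least one PEM certificate; it does not prove that the Gotify endpoint's chain validates.

```shell
set -eu

# check_ca_bundle (hypothetical helper): succeed only if the bundle that
# SSL_CERT_FILE points at is readable and holds at least one PEM certificate.
check_ca_bundle() {
  local bundle="${SSL_CERT_FILE:-/etc/ssl/certs/ca-bundle.crt}"
  if [ ! -r "$bundle" ]; then
    echo "missing CA bundle: $bundle" >&2
    return 1
  fi
  # "--" stops option parsing so the leading dashes are read as the pattern.
  if ! grep -q -- '-----BEGIN CERTIFICATE-----' "$bundle"; then
    echo "no PEM certificate found in: $bundle" >&2
    return 1
  fi
  echo "CA bundle ok: $bundle"
}
```

A passing check narrows a later TLS failure down to chain or hostname problems rather than a missing bundle.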
---
## 7. Remaining work (to finish the migration)
- [x] `flake.nix` scaffolding the **amd64** leg (`stage-amd64`, `publish-amd64`)
— present, dry-run by default. (this commit)
- [ ] Add the **arm64** cross leg (`stage-arm64`, `publish-arm64`) and `publish` /
`publish-index` apps (copy the reference's index machinery verbatim).
- [ ] Verify the cross build resolves from the binary cache (no source rebuild).
- [ ] Verify TLS to Gotify from the Nix image (cacert + `SSL_CERT_FILE`).
- [ ] Cut over `.woodpecker.yaml` to `nix run .#publish -- --publish`; delete the
buildx/remote-builder/howard-pin steps and the `Dockerfile`.
- [ ] One real release: compare emmett-built vs CI-pushed digest (determinism
check).
- [ ] Add `just publish` / `just stage` front door if/when the repo gains a
`justfile` (uniform across all archetypes).
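The determinism check in the list above reduces to comparing two manifest digests. A minimal sketch, assuming a hypothetical helper `compare_digests` and assuming the caller has already obtained both digests (for instance from the registry after each push; how they are fetched is left open here):

```shell
set -eu

# compare_digests LOCAL CI (hypothetical helper): both arguments are
# sha256:<hex> manifest digests supplied by the caller.
compare_digests() {
  local local_digest="$1"
  local ci_digest="$2"
  if [ "$local_digest" = "$ci_digest" ]; then
    echo "deterministic: $local_digest"
  else
    echo "DIGEST MISMATCH: local=$local_digest ci=$ci_digest" >&2
    return 1
  fi
}
```

A mismatch would point back at risk 1: emmett and CI resolved different store paths for the closure.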
Once more than three oci-image repos have migrated, factor the per-arch image +
publish apps into the shared semver-tagged `ci-archetypes` flake-module
(parameterized) and have repos pin it via `inputs.ci-archetypes?ref=<pin>`
rather than copying the flake.
@@ -0,0 +1,140 @@
{
"nodes": {
"flake-utils": {
"inputs": {
"systems": "systems"
},
"locked": {
"lastModified": 1731533236,
"narHash": "sha256-l0KFg5HjrsfsO/JpG+r7fRrqm12kzFHyUHqHCVpMMbI=",
"owner": "numtide",
"repo": "flake-utils",
"rev": "11707dc2f618dd54ca8739b309ec4fc024de578b",
"type": "github"
},
"original": {
"owner": "numtide",
"repo": "flake-utils",
"type": "github"
}
},
"fleet": {
"inputs": {
"nixpkgs": "nixpkgs",
"nixpkgs-armer": [
"fleet",
"nixpkgs"
],
"nixpkgs-bim": [
"fleet",
"nixpkgs"
],
"nixpkgs-ci": [
"fleet",
"nixpkgs"
],
"nixpkgs-emmett": [
"fleet",
"nixpkgs"
],
"nixpkgs-howard": [
"fleet",
"nixpkgs"
],
"nixpkgs-mermaid": [
"fleet",
"nixpkgs"
],
"nixpkgs-mermaid-gpu": [
"fleet",
"nixpkgs"
],
"nixpkgs-micron": [
"fleet",
"nixpkgs"
],
"nixpkgs-projects": [
"fleet",
"nixpkgs"
]
},
"locked": {
"lastModified": 1779533061,
"narHash": "sha256-orWNYXtYURhEj3X4+xGMAhaEcKRvwXqTtJ8x2jV/M+Q=",
"ref": "refs/heads/main",
"rev": "b818e345ec4470e4b3e335bd2f864183c512116d",
"revCount": 13,
"type": "git",
"url": "https://git.oleks.space/oleks/fleet-pins"
},
"original": {
"type": "git",
"url": "https://git.oleks.space/oleks/fleet-pins"
}
},
"nix2container": {
"inputs": {
"nixpkgs": [
"nixpkgs"
]
},
"locked": {
"lastModified": 1775487831,
"narHash": "sha256-2lguQpLPQaxpQCJjXhmEEAfabwsAhkP29Z7fgLzHARA=",
"owner": "nlewo",
"repo": "nix2container",
"rev": "76be9608a7f4d6c985d28b0e7be903ae2547df3e",
"type": "github"
},
"original": {
"owner": "nlewo",
"repo": "nix2container",
"type": "github"
}
},
"nixpkgs": {
"locked": {
"lastModified": 1777268161,
"narHash": "sha256-bxrdOn8SCOv8tN4JbTF/TXq7kjo9ag4M+C8yzzIRYbE=",
"owner": "NixOS",
"repo": "nixpkgs",
"rev": "1c3fe55ad329cbcb28471bb30f05c9827f724c76",
"type": "github"
},
"original": {
"owner": "NixOS",
"repo": "nixpkgs",
"rev": "1c3fe55ad329cbcb28471bb30f05c9827f724c76",
"type": "github"
}
},
"root": {
"inputs": {
"flake-utils": "flake-utils",
"fleet": "fleet",
"nix2container": "nix2container",
"nixpkgs": [
"fleet",
"nixpkgs-projects"
]
}
},
"systems": {
"locked": {
"lastModified": 1681028828,
"narHash": "sha256-Vy1rq5AaRuLzOxct8nz4T6wlgyUR7zLU309k9mBC768=",
"owner": "nix-systems",
"repo": "default",
"rev": "da67096a3b9bf56a91d16901293e51ba5b49a27e",
"type": "github"
},
"original": {
"owner": "nix-systems",
"repo": "default",
"type": "github"
}
}
},
"root": "root",
"version": 7
}
@@ -0,0 +1,204 @@
{
description = "alertmanager-gotify-bridge: pure-stdlib Python forwarder, containerized with Nix (nix2container). DESIGN/PARTIAL: amd64 leg only; see ci/MIGRATION.md.";
# ──────────────────────────────────────────────────────────────────────────
# SCAFFOLD / PARTIAL MIGRATION (emmett#44, archetype oci-image buildx→nix2container).
# This flake intentionally implements ONLY the natively-buildable amd64 leg
# (`nix run .#publish-amd64`, dry-run by default). The arm64 cross leg, the
# multi-arch `publish-index`/`publish` apps, and the `.woodpecker.yaml` cutover
# are DESIGNED in ci/MIGRATION.md but NOT yet wired. Do not treat this as the
# finished pipeline. Pattern mirrors the reference impl
# (oleks/claude-plugin-registry @ 9850745).
# ──────────────────────────────────────────────────────────────────────────
inputs = {
fleet.url = "git+https://git.oleks.space/oleks/fleet-pins";
nixpkgs.follows = "fleet/nixpkgs-projects";
nix2container.url = "github:nlewo/nix2container";
nix2container.inputs.nixpkgs.follows = "nixpkgs";
flake-utils.url = "github:numtide/flake-utils";
};
outputs =
{
nixpkgs,
nix2container,
flake-utils,
...
}:
flake-utils.lib.eachDefaultSystem (
system:
let
pkgs = import nixpkgs { inherit system; };
inherit (pkgs) lib;
n2c = nix2container.packages.${system}.nix2container;
registry = "git.oleks.space/oleks/alertmanager-gotify-bridge";
# TAG is derived identically for CI + local in shared code: CI exports
# VERSION (= strip-v $CI_COMMIT_TAG); local dev may override $VERSION.
# No version.nix side-channel — the app is a static asset, the flake has
# no version-baked source build, so the tag lives purely in $VERSION.
version = "0.0.0-dev";
# The whole payload: bridge.py + a python3 interpreter symlink under /app.
# The symlink keeps the (arch-correct) python3 Nix closure tracked while
# contributing no extra files. Parameterised over a pkg set so the SAME
# expression builds natively (amd64) and cross (arm64, future leg).
appRoot =
targetPkgs: arch:
pkgs.runCommand "app-root-${arch}" { } ''
mkdir -p $out/app
cp ${./bridge.py} $out/app/bridge.py
ln -s ${targetPkgs.python3}/bin/python3 $out/app/python3
'';
mkImage =
targetPkgs: arch:
n2c.buildImage {
name = registry;
tag = "${version}-${arch}";
inherit arch;
# reproducible = false materializes the layer tar so the image streams
# verbatim from any host (remote-builder + binary-cache safe). For a
# pure-stdlib interpreter closure the determinism risk is low, but the
# standard's caveat still applies: emmett+CI must resolve the identical
# python3 store path from the shared cache (fleet pin + flake.lock).
layers = [
(n2c.buildLayer {
copyToRoot = [
(appRoot targetPkgs arch)
pkgs.cacert
];
maxLayers = 25;
reproducible = false;
})
];
config = {
Cmd = [
"/app/python3"
"/app/bridge.py"
];
WorkingDir = "/app";
ExposedPorts = {
"8080/tcp" = { };
};
Env = [
"PORT=8080"
# urllib → Gotify over HTTPS needs an explicit CA bundle (the old
# Alpine base provided one via the distro; the Nix image ships it).
"SSL_CERT_FILE=/etc/ssl/certs/ca-bundle.crt"
];
};
};
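# NOTE: `pkgs` is the host package set, so this is a genuine amd64 image only
# when the flake is evaluated for x86_64-linux.
imageAmd64 = mkImage pkgs "amd64";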
# ── publish/stage apps (shared by CI + local; cannot drift). ──────────
# SAFETY: dry-run by default. Set PUBLISH=1 / --publish to actually push.
passEntry = "infra/gitea/personal_access_token_packages_rw";
publishGate = ''
PUBLISH="''${PUBLISH:-0}"
for a in "$@"; do
case "$a" in
--publish) PUBLISH=1 ;;
--dry-run) PUBLISH=0 ;;
--help|-h)
echo "usage: [VERSION=x.y.z] [PUBLISH=1] $(basename "$0") [--publish|--dry-run|--help]" >&2
echo " dry-run by default: builds the amd64 image and prints what would be pushed." >&2
echo " --publish / PUBLISH=1 actually mutates the registry." >&2
echo " NOTE: amd64 leg ONLY (scaffold); see ci/MIGRATION.md." >&2
exit 0 ;;
*) echo "error: unknown argument '$a' (try --help)" >&2; exit 2 ;;
esac
done
TAG="''${VERSION:-${version}}"
'';
# Token resolved lazily (only when actually publishing); never echoed,
# never under set -x (writeShellApplication uses -euo pipefail only).
resolveToken = ''
TOKEN="''${REGISTRY_TOKEN:-}"
if [ -z "$TOKEN" ] && command -v pass >/dev/null 2>&1; then
TOKEN="$(pass show ${passEntry} 2>/dev/null || true)"
fi
if [ -z "$TOKEN" ]; then
echo "BLOCKER(empty-token): set REGISTRY_TOKEN env (CI from_secret) or have 'pass ${passEntry}' available; refusing to publish without credentials." >&2
exit 1
fi
'';
registryPreflight = ''
if ! curl -fsS -o /dev/null --max-time 10 "https://git.oleks.space/v2/" 2>/dev/null; then
echo "BLOCKER(registry-down): https://git.oleks.space/v2/ is unreachable; CI shares fate with the cluster (Zot/buildkit on armer/k3s). Re-run when the registry is back; the staged image in the Nix store is unchanged." >&2
exit 1
fi
'';
stageAmd64 = ''
echo " staging ${registry}:$TAG-amd64 (local build, no registry contact)"
OUT="$(nix build --no-link --print-out-paths "$FLAKE#image-amd64")"
echo " staged image derivation: $OUT"
'';
mkStageAmd64 = pkgs.writeShellApplication {
name = "stage-amd64";
runtimeInputs = [ pkgs.nix ];
text = ''
FLAKE="''${FLAKE:-.}"
''
+ publishGate
+ stageAmd64;
};
mkPublishAmd64 = pkgs.writeShellApplication {
name = "publish-amd64";
runtimeInputs = [
pkgs.regctl
pkgs.curl
pkgs.nix
];
text = ''
FLAKE="''${FLAKE:-.}"
''
+ publishGate
+ stageAmd64
+ ''
echo " ${registry}:$TAG-amd64"
if [ "$PUBLISH" != "1" ]; then
echo " [dry-run] would push ${registry}:$TAG-amd64 (set PUBLISH=1 / --publish to push)"
echo " [dry-run] NOTE: amd64 leg only; arm64 + index not yet implemented (ci/MIGRATION.md)"
exit 0
fi
''
+ resolveToken
+ registryPreflight
+ ''
${lib.getExe imageAmd64.copyTo} --dest-creds "oleks:$TOKEN" "docker://${registry}:$TAG-amd64"
'';
};
in
{
packages = {
image-amd64 = imageAmd64;
default = imageAmd64;
};
apps = {
stage-amd64 = {
type = "app";
program = lib.getExe mkStageAmd64;
meta.description = "Build the amd64 nix2container image into the local store (no registry contact). Scaffold: amd64 leg only.";
};
publish-amd64 = {
type = "app";
program = lib.getExe mkPublishAmd64;
meta.description = "Stage + push the amd64 image (dry-run by default; PUBLISH=1 to push). Scaffold: amd64 leg only; see ci/MIGRATION.md.";
};
};
}
);
}