alertmanager-gotify-bridge/ci/MIGRATION.md
Oleks e8f3e954e7 ci: design nix2container migration + scaffold amd64 publish app (emmett#44)
Archetype: oci-image (buildx -> in-cluster remote buildkit), the HARD case.
DESIGN/PARTIAL, not a finished migration:

- ci/MIGRATION.md: concrete plan to escape buildkit via nix2container/skopeo
  +regctl. The app is pure-stdlib Python, so both arches are buildable on
  emmett (amd64-native + Nix-cross-from-amd64 python3 closure) with no
  buildkit/qemu/docker -> no foreign-arch leg needed; Dockerfile retired on
  cutover. Covers per-arch build, entrypoints, .woodpecker.yaml target,
  escape hatch (unused here), risks, remaining work.
- flake.nix: scaffolds the natively-buildable amd64 leg only
  (stage-amd64, publish-amd64), dry-run by default (PUBLISH=1 to push),
  $REGISTRY_TOKEN -> pass fallback, registry-down/empty-token blockers.
  Mirrors reference impl claude-plugin-registry@9850745.

arm64 leg, publish-index/publish, and YAML cutover are designed but NOT wired.
Verified: nix eval .#apps.x86_64-linux (-> stage-amd64, publish-amd64); no
image build run (downloads closure).
2026-06-02 03:39:20 +03:00


# Migration plan: buildx → nix2container/skopeo+regctl (emmett#44)
<!-- markdownlint-disable MD013 MD040 MD060 -->
<!-- design doc: dense tables, command/output blocks, and long refs -->
**Status: DESIGN + PARTIAL SCAFFOLD.** This document is a concrete plan, not a
completed migration. A `flake.nix` in the repo root scaffolds the
**natively-buildable amd64 leg only** (`nix run .#publish-amd64`, dry-run by
default). The arm64 leg and the `.woodpecker.yaml` cutover are designed here but
**not yet wired**; see "Remaining work".

Archetype: `oci-image` (buildx → in-cluster remote buildkit), the HARD
archetype in the emmett#44 local-pipeline-parity standard. Tracking: oleks/cluster
milestone #57.

---

## 1. Why this repo is an easy migration (feasibility)
The current pipeline (`.woodpecker.yaml`) does:
```
docker buildx create --driver remote tcp://buildkit-rootless-arm64.infra.svc...
docker buildx build --platform linux/amd64,linux/arm64 --push .
```
i.e. it depends on **in-cluster remote buildkit** for the multi-arch build and on
a node-pinned step (`howard2404`). That is exactly the cluster-coupled,
not-reproducible-on-emmett shape emmett#44 wants gone.

The application (`bridge.py`) is the easiest possible payload:

- **Pure Python standard library.** `json`, `os`, `sys`, `http.server`,
  `urllib`. No `pip install`, no requirements file, no C/Rust extension, no
  native wheel.
- The Dockerfile is 5 lines: `FROM python:3.12-alpine`, copy one file, run it.

Consequences for nix2container:

- The image's only real runtime dependency is a **CPython interpreter + its Nix
closure** (glibc, openssl, zlib, ncurses, ...). `bridge.py` is a static asset
copied next to it.
- CPython is in nixpkgs and **cross-compiles cleanly with `pkgsCross`**: there
  is no source build of the app to cross, only the standard interpreter, which
  the binary cache is expected to serve for aarch64 (not yet verified; see
  "Remaining work"). So **both arches are buildable on
  emmett (amd64-native + Nix-cross-from-amd64)** with **no buildkit, no qemu, no
  docker daemon**. This satisfies the emmett-OK definition in the standard
  ("amd64-native OR Nix-cross-from-amd64, never 'uses skopeo' as the build, and
  not a foreign-arch buildkit leg").

This repo is therefore a clean reference for "oci-image buildx leg that fully
escapes buildkit", not one of the genuinely-hard foreign-arch cases.

---
## 2. What the flake builds, per arch
Mirrors the reference impl (oleks/claude-plugin-registry @ 9850745), minus the
Rust/Dioxus build (we have no compiled artifact):
```
inputs: fleet (nixpkgs-projects pin), nix2container, flake-utils

mkApp targetPkgs:
  # a tiny derivation that places bridge.py + a python3 symlink under /app
  appRoot = runCommand "app-root-<arch>":
    mkdir -p $out/app
    cp ${./bridge.py} $out/app/bridge.py
    ln -s ${targetPkgs.python3}/bin/python3 $out/app/python3   # closure tracked

mkImage arch = nix2container.buildImage {
  name = "git.oleks.space/oleks/alertmanager-gotify-bridge";
  tag = "${version}-${arch}";
  inherit arch;                     # "amd64" | "arm64"
  layers = [ (buildLayer {
    copyToRoot = [ (appRoot arch) cacert ];
    maxLayers = 25;
    reproducible = false;           # see Determinism note
  }) ];
  config = {
    Cmd = [ "/app/python3" "/app/bridge.py" ];
    WorkingDir = "/app";
    ExposedPorts = { "8080/tcp" = {}; };
    Env = [ "PORT=8080" "SSL_CERT_FILE=/etc/ssl/certs/ca-bundle.crt" ];
  };
};
```
- **amd64**: `targetPkgs = pkgs` (native). Fully buildable + pushable on emmett.
- **arm64**: `targetPkgs = pkgs.pkgsCross.aarch64-multiplatform`. Cross from
amd64; the aarch64 `python3` closure comes from the binary cache. No qemu.
> The Alpine base in the current Dockerfile is intentionally dropped: nix2container
> ships only the interpreter closure, so the image is smaller and has no distro
> package layer. `urllib`+TLS works because `cacert` is copied in and
> `SSL_CERT_FILE` points at it.
---
## 3. Entrypoints (uniform front door)
Per the corrected standard, the flake apps ARE the shared code; `.woodpecker.yaml`
and local runs invoke the same `nix run`:
| app | does |
|-----|------|
| `stage-amd64` / `stage-arm64` | realize the image into the local Nix store, **no registry contact** (BUILD/STAGE-parity half) |
| `publish-amd64` / `publish-arm64` | stage + `copyTo` (skopeo) push that arch (PUBLISH half) |
| `publish-index` | build-free: `regctl index create` from pushed per-arch refs, then `:latest` = digest copy of `:TAG` as the LAST mutation |
| `publish` | both arches + `publish-index` |
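The `publish-index` row can be sketched roughly as follows. This is a minimal illustration, not the reference implementation: the function names, the `BLOCKER(missing-arch)` label, and the "release tag is plain X.Y.Z" rule are assumptions made for the example, and the `regctl` flag spelling should be checked against the installed version. In dry-run mode nothing contacts the registry.

```sh
#!/bin/sh
# Sketch only: function names and BLOCKER(missing-arch) are hypothetical.
set -eu

REPO="git.oleks.space/oleks/alertmanager-gotify-bridge"

# Execute a registry mutation only when PUBLISH=1; otherwise just print it.
run() {
  if [ "${PUBLISH:-0}" = "1" ]; then
    "$@"
  else
    echo "DRY-RUN: $*"
  fi
}

# Assumption: a release tag is plain X.Y.Z; anything else is a dev tag.
is_release_tag() {
  printf '%s\n' "$1" | grep -Eq '^[0-9]+\.[0-9]+\.[0-9]+$'
}

# publish_index TAG AMD64_DIGEST ARM64_DIGEST
publish_index() {
  tag="$1"; amd64="${2:-}"; arm64="${3:-}"
  # Fail closed: no index unless both per-arch digests from this run exist.
  if [ -z "$amd64" ] || [ -z "$arm64" ]; then
    echo "BLOCKER(missing-arch): need amd64 and arm64 digests" >&2
    return 1
  fi
  # Children are referenced by digest, so a moved per-arch tag cannot leak in.
  run regctl index create "$REPO:$tag" \
    --ref "$REPO@$amd64" --ref "$REPO@$arm64"
  # :latest moves only for a release tag, and only as the last mutation.
  # A real run would resolve the index digest first and copy by digest.
  if is_release_tag "$tag"; then
    run regctl image copy "$REPO:$tag" "$REPO:latest"
  fi
}
```

Called as `publish_index 1.2.3 sha256:<amd64> sha256:<arm64>` without `PUBLISH=1`, the sketch prints the two `regctl` commands it would run; with a dev tag it prints only the index creation.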
Conventions carried verbatim from the reference:
- **Token**: `$REGISTRY_TOKEN` → fallback `pass infra/gitea/personal_access_token_packages_rw`
→ named `BLOCKER(empty-token)`. Never echoed; apps run under `set -euo pipefail`
only (no `set -x`).
- **Dry-run by default**: every push-capable app builds + prints what it would
push; `PUBLISH=1` / `--publish` required to mutate the registry. An accidental
local run cannot push.
- **VERSION/TAG**: `TAG=${VERSION:-<flake version>}`; CI derives `version` from
`CI_COMMIT_TAG` (strip leading `v`). `$VERSION` overrides for local dev.
- **Dev-tag guard**: `:latest` is only moved for a real release tag; dev tags
push per-arch + immutable index only.
- **Preflight**: `BLOCKER(registry-down)` if `https://git.oleks.space/v2/` is
unreachable (names the cluster-shared-fate failure mode).
- **Multi-arch**: the index is assembled from digest-pinned per-arch refs and
  fails closed if a required arch is absent from this run. `:latest` is a digest
  COPY (not a re-assembly) and is the last mutation. Staging per-arch tags can be
  pruned; gitea-oci-cleanup pins index child digests.
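The token, dry-run, version, and preflight conventions above might look like this in a shared shell helper. It is a sketch under stated assumptions, not the reference code: every function name is hypothetical, `pass show` is used as the explicit form of the `pass` lookup, and the preflight treats any HTTP response from `/v2/` as "up" because an unauthenticated registry typically answers 401.

```sh
#!/bin/sh
# Sketch only: helper names are hypothetical, not the flake's real ones.
set -eu

PASS_ENTRY="infra/gitea/personal_access_token_packages_rw"

# Token: $REGISTRY_TOKEN, then the pass entry, else a named blocker.
# The value goes to stdout for the caller to capture; it is never logged.
resolve_token() {
  token="${REGISTRY_TOKEN:-}"
  if [ -z "$token" ] && command -v pass >/dev/null 2>&1; then
    token="$(pass show "$PASS_ENTRY" 2>/dev/null | head -n 1 || true)"
  fi
  if [ -z "$token" ]; then
    echo "BLOCKER(empty-token): set REGISTRY_TOKEN or populate pass" >&2
    return 1
  fi
  printf '%s' "$token"
}

# Dry-run by default: only PUBLISH=1 or a --publish argument enables pushes.
publish_enabled() {
  if [ "${PUBLISH:-0}" = "1" ]; then return 0; fi
  for arg in "$@"; do
    if [ "$arg" = "--publish" ]; then return 0; fi
  done
  return 1
}

# TAG=${VERSION:-<flake version>}; $1 stands in for the flake's version.
derive_tag() {
  printf '%s' "${VERSION:-$1}"
}

# CI side: strip one leading "v" from the release tag (v1.2.3 -> 1.2.3).
version_from_ci_tag() {
  printf '%s' "${1#v}"
}

# Preflight: curl runs without --fail so that a 401 still counts as reachable.
preflight() {
  if ! curl -sS -o /dev/null --max-time 10 "https://git.oleks.space/v2/"; then
    echo "BLOCKER(registry-down): https://git.oleks.space/v2/ unreachable" >&2
    return 1
  fi
}
```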
---
## 4. `.woodpecker.yaml` after migration (target)
The buildx/howard-pinned step is replaced by a single nix-ci step that runs the
same app the laptop runs:
```yaml
when: [{ event: tag, ref: "refs/tags/v*" }]
steps:
  - name: publish
    image: git.oleks.space/oleks/nix-ci:latest
    environment:
      REGISTRY_TOKEN: { from_secret: registry_token }
    commands:
      - export VERSION="$(echo "$CI_COMMIT_TAG" | sed 's/^v//')"
      - nix run .#publish -- --publish   # PUBLISH=1 via flag; CI is the only publisher
```
Gone: `docker login`, `buildx create --driver remote`, the `arm64` remote
builder, the `nodeSelector: howard2404` pin, the multi-arch `--platform` build.
The `skip_clone`+manual-clone dance can also drop to a normal clone with tags
(needed for `git describe` fallback) once the version is derived in shared code.

---

## 5. Foreign-arch escape hatch
Not needed for **this** repo — arm64 is a pure Nix cross of a stock interpreter,
so there is no foreign-arch buildkit leg at all. The general escape hatch (kept
for the upstream-source oci-image repos that genuinely need it) is documented for
the archetype, not used here:
> For images whose payload genuinely cannot be cross-compiled by Nix (an
> upstream binary only published for a foreign arch, or a build that won't
> cross), keep a `Dockerfile` + a buildx step that parameterizes
> `BUILDKIT_ADDR` (default `docker-container://local`, CI overrides to the
> in-cluster `tcp://buildkit-rootless-<arch>.infra.svc`). That leg stays
> cluster-coupled by necessity; its per-arch digest still feeds the same
> `regctl index create`/`publish-index` join point, so the multi-arch assembly
> is uniform across native, cross, and foreign-arch legs.

Because alertmanager-gotify-bridge has neither a compiled artifact nor a
foreign-only upstream binary, the escape hatch is dead code here and is **not**
added; the Dockerfile is retired entirely on cutover.

---

## 6. Risks / caveats
1. **Determinism (nix2container `reproducible = false`).** Same caveat as the
reference: parity holds only when emmett and CI resolve the **identical store
path** for the python3 closure from the shared cache. With the `fleet`
nixpkgs-projects pin + `flake.lock` committed this holds; verify by comparing
the pushed digest after a release. For a pure-stdlib interpreter image the
risk is low (no project-specific compiler in the closure), but it is real.
2. **TLS / cert path.** `urllib` to Gotify over HTTPS needs `cacert` in the image
and `SSL_CERT_FILE` set — handled in `config.Env` above. Must be verified once
against the live Gotify endpoint (the Alpine image got certs from the distro;
we now ship them explicitly).
3. **Image shape change.** Switching from `python:3.12-alpine` to a Nix closure
changes the digest, size, and layer layout. Any consumer pinning a specific
base-layer digest (unlikely here) would need updating. The `CMD` path changes
from `python` (PATH) to `/app/python3` (absolute symlink).
4. **arm64 cache coverage.** The cross build is only fast if the aarch64 python3
closure is in the binary cache; a cold cache makes the first emmett arm64 run
slow (still correct, no qemu). Aligns with the fleet-pins strategy.
5. **`publish-index` requires `regctl`/`skopeo` in `nix-ci`.** The reference
already relies on these being present; confirm the `nix-ci` image (or the
app's `runtimeInputs`) provides `regctl`, `skopeo` (via `copyTo`), `curl`.
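The digest comparison in risk 1 could be scripted along these lines. This is only a sketch: the function names are hypothetical, how emmett obtains its own digest (for example by pushing to a scratch tag) is left open, and `skopeo inspect` against a private registry will usually need credentials that are not shown here.

```sh
#!/bin/sh
# Sketch only: function names are hypothetical.
set -eu

# Manifest digest the registry serves for a ref (needs network and skopeo).
remote_digest() {
  skopeo inspect --format '{{.Digest}}' "docker://$1"
}

# Pure comparison, kept separate so it can be exercised without a registry.
check_parity() {
  if [ "$1" = "$2" ]; then
    echo "parity OK: $1"
  else
    echo "parity MISMATCH: emmett=$1 ci=$2" >&2
    return 1
  fi
}

# Example wiring (not executed here; <emmett-ref> and <ci-ref> are placeholders):
#   check_parity "$(remote_digest "<emmett-ref>")" "$(remote_digest "<ci-ref>")"
```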
---
## 7. Remaining work (to finish the migration)
- [x] `flake.nix` scaffolding the **amd64** leg (`stage-amd64`, `publish-amd64`)
— present, dry-run by default. (this commit)
- [ ] Add the **arm64** cross leg (`stage-arm64`, `publish-arm64`) and `publish` /
`publish-index` apps (copy the reference's index machinery verbatim).
- [ ] Verify the cross build resolves from the binary cache (no source rebuild).
- [ ] Verify TLS to Gotify from the Nix image (cacert + `SSL_CERT_FILE`).
- [ ] Cut over `.woodpecker.yaml` to `nix run .#publish -- --publish`; delete the
buildx/remote-builder/howard-pin steps and the `Dockerfile`.
- [ ] One real release: compare emmett-built vs CI-pushed digest (determinism
check).
- [ ] Add `just publish` / `just stage` front door if/when the repo gains a
  `justfile` (uniform across all archetypes).

Once more than three oci-image repos have migrated, factor the per-arch image + publish apps
into the shared semver-tagged `ci-archetypes` flake-module (parameterized) and
have repos pin it via `inputs.ci-archetypes?ref=<pin>` rather than copying the
flake.