Using Écluse: Operator Manual

This is the operator-facing manual for deploying and running Écluse: how to configure it, connect your clients, and, importantly, how to fence its network egress so it stays a safe link in your supply chain. It's the consumer companion to the internal architecture documents, which explain the why behind everything here.

Status: pre-launch. Écluse is under active development. This manual is the configuration and operational contract: the env vars, the config schema, the client setup, and the security responsibilities, documented as they stabilise. Features still landing are marked (planned) with a pointer to the tracking slice; the end-to-end request path is tracked in the delivery plan. Treat this as the deployment contract, not a claim that every capability below is wired today.

What Écluse does

Écluse sits between your build (developer machine or CI) and the npm registry, and enforces a deny-by-default resilience policy before any package reaches a build. It reads through a private upstream first, falls back to the public registry with rules applied, and mirrors approved packages asynchronously. It's a policy gate, not a registry, and it hosts nothing itself. The design is in docs/architecture.md.

Deployment model

Écluse ships as a single, reproducible container image that runs one process: the HTTP front door (a raw-wai application on PROXY_PORT, default 4873) and, alongside it, the mirror worker. Point your package manager at it as a registry (see Connecting your clients).

Before you run a published image, verify its provenance and SBOM attestations: the recipe (keyless Sigstore + Rekor, pinned by digest) is in the README.

Configuration

Configuration has two layers: environment variables for process-level and secret values, and an optional structured config document for the two things too expressive for flat env vars, namely the rule policy and the mount map. The common single-mount npm deployment on the default policy needs no document at all.

The authoritative semantics, validation rules, and rationale live in Configuration & Authentication; this section is the operator reference. Keep the two in sync when either changes.

Environment variables

| Variable | Required | Description |
| --- | --- | --- |
| PROXY_PORT | No (default 4873) | TCP port the proxy listens on. |
| PRIVATE_UPSTREAM_URL | Yes | URL of the private upstream registry (the authority for reads under the default passthrough strategy). |
| PUBLIC_UPSTREAM_URL | No (default https://registry.npmjs.org) | URL of the public upstream, queried anonymously and gated by the rules. |
| PROXY_PUBLIC_URL | Recommended | The proxy's own externally-reachable base URL (e.g. https://registry.example.com), used to rewrite each served dist.tarball to an absolute URL clients fetch back through the proxy. Unset, tarball URLs are path-relative, which the npm CLI cannot install from (it reads a leading-slash dist.tarball as a local file: path), so set this for any deployment that serves real npm installs. |
| MIRROR_TARGET_URL | No (default PRIVATE_UPSTREAM_URL) | Registry that approved packages are mirrored to. Unset ⇒ folds onto the private upstream (one registry, read and written). The write credential does not fold: set MIRROR_TARGET_CREDENTIAL_PROVIDER. |
| MIRROR_TARGET_CREDENTIAL_PROVIDER | No (default static) | Mirror-target write credential: static (MIRROR_TARGET_TOKEN) or codeartifact (mints under the container/task role). gcp-artifact-registry is recognised but not yet built. |
| MIRROR_TARGET_TOKEN | No | Static write token, when MIRROR_TARGET_CREDENTIAL_PROVIDER=static (the default). |
| MIRROR_TARGET_CODEARTIFACT_DOMAIN | codeartifact only | CodeArtifact domain, or parsed from a CodeArtifact MIRROR_TARGET_URL host. |
| MIRROR_TARGET_CODEARTIFACT_DOMAIN_OWNER | codeartifact only | 12-digit owning account id, or parsed from the host (a non-account-id value is rejected at boot). |
| MIRROR_TARGET_CODEARTIFACT_REGION | codeartifact only | Region, resolved in order: this key, else the host (its authoritative region), else AWS_REGION. |
| MIRROR_TARGET_CODEARTIFACT_TOKEN_DURATION_SECONDS | No | Token lifetime in seconds, capped at 43200 (12 h). |
| MIRROR_QUEUE_PROVIDER | No (default sqs) | Mirror-queue backend: sqs (AWS) or memory. memory is a bounded in-process queue: no cloud queue, at the cost of a non-durable, best-effort mirror. It is an explicit choice for a simple/single-node/air-gapped deployment, never an automatic fallback, and selecting it warns loudly at boot. pubsub (GCP) is recognised but not yet built. |
| MIRROR_QUEUE_URL | Cloud backends only (sqs/pubsub) | Queue identifier: an SQS queue URL or a Pub/Sub projects/<p>/topics/<t> resource. Required for the cloud backends (absent ⇒ fail-loud at boot); not needed for memory (no external queue, so it is ignored). |
| MIRROR_QUEUE_MEMORY_MAX_DEPTH | No (default 50000) | memory only. Cap on the in-process queue depth. A cold-cache npm ci enqueues thousands of jobs at once, so the queue is hard-bounded: an enqueue past the cap is dropped (drop-newest) and rate-limit-logged. Dropping is safe, since a dropped job is re-mirrored on the next demand. Positive integer. |
| AWS_REGION | AWS backends only | Region for SQS and CodeArtifact. |
| AWS_ENDPOINT_URL_SQS / AWS_ENDPOINT_URL | No | SQS endpoint override (AWS-SDK-standard). Point at a local emulator (ministack) or VPC endpoint; with one set, requests are signed with AWS_ACCESS_KEY_ID/AWS_SECRET_ACCESS_KEY. Unset ⇒ normal AWS resolution. |
| GOOGLE_CLOUD_PROJECT | GCP backends only | Project for Pub/Sub and Artifact Registry (credentials via ADC). |
| PROXY_AUTH_TOKEN | No | If set, clients must present this token (Bearer / _authToken). Omit for network-secured deployments. |
| PROXY_RESPECT_UPSTREAM_TARBALL_HOST | No (default false) | Secure default. When false, a tarball is fetched only from the same allowlisted upstream that served the packument; set true only for a registry that serves tarballs from a separate CDN/files host (widens the fetch surface to any allowlisted host). See Securing network egress. |
| PROXY_HELP_MESSAGE | No | String appended to every denial message (e.g. a support channel). |
| PROXY_LOG_FORMAT | No (default json) | Log shape: json (one JSON object per line, for log collectors) or console (human-readable). |
| CVE_SYNC_INTERVAL_SECONDS | No (default 3600) | How often the in-memory advisory index refreshes from OSV (with the CVE tier). |
| PROXY_MAX_RESPONSE_BYTES | No (default 16777216, 16 MiB) | Largest upstream metadata body buffered before the fetch aborts fail-closed. Bounds memory against a hostile upstream returning a giant body. Positive integer. |
| PROXY_MAX_VERSION_COUNT | No (default 100000) | Largest version count a packument may carry before it is refused. Bounds per-version rule evaluation against a version flood. Positive integer. |
| PROXY_MAX_NESTING_DEPTH | No (default 64) | Deepest JSON nesting a decoded upstream document may reach before it is refused. Bounds CPU/stack against a pathologically nested payload. Positive integer. |
| PROXY_MIN_PUBLIC_INTEGRITY | No (default sha256) | Minimum integrity algorithm a public version's digest must meet to be served: sha256, sha512, or blake2b. A public version whose strongest digest is weaker (e.g. a legacy SHA-1 shasum only) is refused with a 403. Hard-floored at SHA-256: sha1, md5, or an unknown name is rejected at startup. The trusted private upstream is exempt. |
| PROXY_CONFIG | No | The configuration document as an inline JSON blob, for an env-only deployment with no mounted file. |
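
To make the MIRROR_QUEUE_MEMORY_MAX_DEPTH behaviour concrete, here is a minimal Python sketch of a hard-bounded, drop-newest queue. It is an illustration of the documented semantics only, not Écluse's implementation; the class name, the job type, and the dropped counter (standing in for the rate-limited log line) are all hypothetical.

```python
from collections import deque
from typing import Optional


class BoundedMirrorQueue:
    """Hypothetical in-process queue that drops the newest job once full."""

    def __init__(self, max_depth: int = 50000) -> None:
        if max_depth <= 0:
            raise ValueError("max depth must be a positive integer")
        self.max_depth = max_depth
        self.jobs = deque()
        self.dropped = 0  # assumption: stands in for a rate-limited log line

    def enqueue(self, job: str) -> bool:
        """Return True if the job was queued, False if it was dropped."""
        if len(self.jobs) >= self.max_depth:
            # Drop-newest: the incoming job is discarded, queued jobs are kept.
            self.dropped += 1
            return False
        self.jobs.append(job)
        return True

    def dequeue(self) -> Optional[str]:
        """Return the oldest queued job, or None when the queue is empty."""
        return self.jobs.popleft() if self.jobs else None
```

A dropped job is not an error for the caller: per the table above, the package is simply re-mirrored the next time it is requested.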

Configuration is validated in full at startup, and the process refuses to start on any problem: an unknown rule type, a bad URL, an unresolved policy reference. A misconfiguration is a loud, immediate failure, never a quietly mis-enforced policy.
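
The fail-fast contract can be pictured with a small pre-flight check an operator might run in CI before deploying. This is a sketch under assumptions, not Écluse's own validator: the function name is hypothetical, it covers only a handful of the variables above, it assumes upstream URLs must be http(s), and it treats the not-yet-built pubsub backend as unsupported.

```python
from urllib.parse import urlparse

INTEGRITY_FLOORS = {"sha256", "sha512", "blake2b"}
QUEUE_PROVIDERS = {"sqs", "memory"}  # assumption: pubsub treated as unsupported
CLOUD_QUEUES = {"sqs"}


def validate_env(env: dict) -> list:
    """Return every problem found; an empty list means these checks passed."""
    errors = []

    port = env.get("PROXY_PORT", "4873")
    if not (port.isascii() and port.isdigit() and 1 <= int(port) <= 65535):
        errors.append(f"PROXY_PORT: not a valid TCP port: {port!r}")

    parsed = urlparse(env.get("PRIVATE_UPSTREAM_URL", ""))
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        errors.append("PRIVATE_UPSTREAM_URL: required, must be an http(s) URL")

    provider = env.get("MIRROR_QUEUE_PROVIDER", "sqs")
    if provider not in QUEUE_PROVIDERS:
        errors.append(f"MIRROR_QUEUE_PROVIDER: unsupported backend: {provider!r}")
    elif provider in CLOUD_QUEUES and not env.get("MIRROR_QUEUE_URL"):
        errors.append("MIRROR_QUEUE_URL: required for cloud queue backends")

    floor = env.get("PROXY_MIN_PUBLIC_INTEGRITY", "sha256")
    if floor not in INTEGRITY_FLOORS:
        errors.append(f"PROXY_MIN_PUBLIC_INTEGRITY: below floor or unknown: {floor!r}")

    return errors
```

Collecting all problems before refusing to start, rather than stopping at the first, is a design choice of this sketch; calling `validate_env(dict(os.environ))` and exiting non-zero on a non-empty result mirrors the refuse-to-start behaviour described above.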

The configuration document

The document is supplied either as a JSON file (the reviewable source of truth) or inline via PROXY_CONFIG. It carries the rule policy (see Rule policy) and, for multi-mount deployments, the mount map. Single-mount deployments desugar from the environment variables above and need no document. Schema and examples: Configuration & Authentication.

Secrets

Secrets never live in the configuration document. Client and registry tokens are always environment variables, and cloud-managed registries (CodeArtifact / Artifact Registry) derive short-lived tokens from ambient cloud credentials, keeping long-lived secrets out of config entirely. Écluse always holds a mirror-target write credential; how reads are credentialled is the mount's credential strategy: the default passthrough forwards the client's own token to the private upstream (and strips it before the public one), while service / delegated-cache read with Écluse's own credential. See Outbound Registry Credentials.
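
The read-side credential choice described above can be summarised as a small decision function. This Python sketch is illustrative only: the function name, the "private"/"public" labels, and the Bearer header format are assumptions, and the authoritative behaviour is in Outbound Registry Credentials.

```python
from typing import Optional

STRATEGIES = ("passthrough", "service", "delegated-cache")


def outbound_authorization(
    strategy: str,
    upstream: str,
    client_token: Optional[str],
    service_token: Optional[str],
) -> Optional[str]:
    """Pick the Authorization header value for one outbound read, or None."""
    if strategy not in STRATEGIES:
        raise ValueError(f"unknown credential strategy: {strategy!r}")
    if upstream == "public":
        # The public upstream is queried anonymously; any client token is stripped.
        return None
    if strategy == "passthrough":
        # Forward the client's own token to the private upstream.
        return f"Bearer {client_token}" if client_token else None
    # service / delegated-cache: read with the proxy's own credential.
    return f"Bearer {service_token}" if service_token else None
```

The mirror-target write credential is deliberately absent from this function: it is a separate credential that Écluse always holds, whichever read strategy is selected.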

Connecting your clients

Point your package manager at the proxy as its registry. With PROXY_AUTH_TOKEN set, supply it the standard npm way:

# .npmrc
registry=https://ecluse.example.internal/
//ecluse.example.internal/:_authToken=${ECLUSE_TOKEN}

Edge authentication to the proxy has three modes (and feeds the mount's credential strategy, which decides how the upstreams are then credentialled):

  1. Open: PROXY_AUTH_TOKEN unset; access control is delegated entirely to the network layer (VPC, service mesh). Appropriate only on a closed network.
  2. Static token: PROXY_AUTH_TOKEN set; clients send it as Authorization: Bearer <token> or .npmrc _authToken.
  3. Trusted edge identity: a fronting gateway / IAP / mesh asserts a verified identity Écluse trusts, sound only where Écluse is reachable solely through that edge. Validating cloud IAM at the npm edge directly stays a gateway concern (the npm client can't speak it; let the managed mirror target enforce write IAM).
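
The first two modes reduce to a simple check at the edge, sketched below in Python. This is not Écluse's code: the function name is hypothetical, the constant-time comparison is a choice of this sketch, and the trusted-edge-identity mode is omitted because it depends on what the fronting gateway asserts.

```python
import hmac
from typing import Optional


def edge_request_allowed(configured_token: Optional[str], authorization: Optional[str]) -> bool:
    """Decide whether a request passes the open or static-token edge modes."""
    if not configured_token:
        # Open mode: PROXY_AUTH_TOKEN unset, access control left to the network.
        return True
    if not authorization:
        return False
    scheme, _, presented = authorization.partition(" ")
    if scheme.lower() != "bearer":
        return False
    # Constant-time comparison avoids leaking the token through timing.
    return hmac.compare_digest(presented.strip().encode(), configured_token.encode())
```

The npm CLI sends an .npmrc _authToken as an Authorization: Bearer header, so a single check covers both ways of supplying the token.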

Securing network egress (required)

Écluse makes outbound requests to the registries you point it at (that's its job), and some of the URLs it follows (a version's dist.tarball) are taken from upstream responses. As with any service that fetches on a client's behalf, the sensible posture is least-privilege egress, in two layers. Écluse provides the first in the application itself, with an origin-aware trust model:

Crucially, SSRF access to the instance-metadata endpoint is prevented at the service-behaviour level, not by blocking metadata at the network. Écluse only follows internal-resolving locations on the trusted private origin, never on a client- or upstream-influenced one, so an attacker can't steer it at 169.254.169.254. Écluse itself needs the metadata endpoint to mint its instance-role credentials (AWS.newEnv AWS.discover, over amazonka's own HTTP client, independent of the guarded data-plane path), so do not deny the proxy egress to metadata or to internal ranges: that would break its own credentials.

You provide the second layer at the platform: the standard defence-in-depth for an outbound-fetching service, protecting your data targets (registries, mirror) and catching anything the application layer doesn't:

The dist.tarball host policy. A version's dist.tarball is upstream-chosen data, so by default Écluse fetches a tarball only from the same allowlisted upstream that served the packument: a dist.tarball pointing at a different host is refused even if that host is otherwise on the allowlist. If your registry legitimately serves artifacts from a separate CDN/files host (the PyPI-files-host shape), set PROXY_RESPECT_UPSTREAM_TARBALL_HOST=true to relax this to any allowlisted host; it never escapes the allowlist or the internal-range block, but it does widen the fetch surface, so opt in deliberately and pair it with the platform egress controls above.
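
As a rough model of this host policy, the following Python sketch decides whether a tarball URL may be fetched. The function and parameter names are hypothetical, the allowlist is modelled as a set of lowercase hostnames, and the internal-range block (which needs DNS resolution) is deliberately left out.

```python
from urllib.parse import urlsplit


def tarball_fetch_allowed(
    tarball_url: str,
    packument_host: str,
    allowlist: set,
    respect_upstream_host: bool = False,
) -> bool:
    """Apply the same-upstream rule, optionally relaxed to any allowlisted host."""
    host = (urlsplit(tarball_url).hostname or "").lower()
    if not host or host not in allowlist:
        # Neither setting ever lets a fetch escape the allowlist.
        return False
    if respect_upstream_host:
        # PROXY_RESPECT_UPSTREAM_TARBALL_HOST=true: any allowlisted host is accepted.
        return True
    # Secure default: only the upstream that served the packument.
    return host == packument_host.lower()
```

With the default setting, a dist.tarball on files.example.com is refused for a packument served by registry.npmjs.org even when both hosts are allowlisted; only the opt-in flag admits it.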

The rationale (and why both the application guards and the platform controls are worth having) is in Security: Outbound-Request & Input-Validation Invariants.

The controls above secure Écluse's own outbound path. The next control concerns your consumers' outbound path, and it is the step that turns Écluse from a proxy clients are asked to use into the only registry they can reach.

If you control your CI environment, deny CI runners outbound access to the public registries (registry.npmjs.org, and the equivalents for other ecosystems) and let them reach only Écluse and your own internal services. Point the runners' package managers at Écluse as their registry.

The result is safe-by-default behaviour. A job that's misconfigured (a stray --registry flag, a committed .npmrc pointing at the public registry, a tool that ignores the settings you shipped) doesn't quietly bypass the policy: it simply can't reach the public registry, so it fails instead of pulling an unvetted package. You stop depending on every job being configured correctly, and depend only on the network, which you administer centrally.

This is what makes the deny-by-default policy unbypassable rather than merely default. Per-project package-manager and version-manager setups (npm/pnpm config, nvm, Nix shells, containers) can each override what you ship to a machine, but none of them can route around a network that only reaches Écluse. See MOTIVATION → The bar for why this is the layer that holds.

The same idea can extend to developer workstations (for example, allowing tarball fetches only through Écluse on a managed or zero-trust network while leaving registry browsing and search open), though workstations are usually a softer control than CI.

Rule policy

Écluse evaluates a named map of rules over a built-in default policy, deny-by-default: a package is admitted only if a rule takes an allow position, and at equal precedence deny wins. The shipped default is deliberately small and biased toward resilience rather than blanket bans:

You override values, add rules (e.g. opt into DenyInstallTimeExecution), or suppress a default by name in the configuration document:

{
  "rules": {
    "min-age":      { "ageSeconds": 1209600 },
    "deny-scripts": { "type": "DenyInstallTimeExecution", "precedence": 200 }
  }
}

Full semantics (precedence, the patch/add/suppress merge, and the strict validation) are in Rule policy and Rules Engine.
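
The three stated properties (deny-by-default, admission only on an allow position, deny wins at equal precedence) can be modelled in a few lines of Python. This is a toy model, not the rules engine: it assumes each rule yields an "allow" or "deny" position with a numeric precedence and that a higher number outranks a lower one, which this manual does not specify. Rules Engine is authoritative.

```python
from typing import Iterable, Tuple

# Hypothetical shape: (verdict, precedence), where verdict is "allow" or "deny".
Position = Tuple[str, int]


def admit(positions: Iterable[Position]) -> bool:
    """Return True only if the winning precedence level is allow-only."""
    positions = list(positions)
    if not positions:
        # Deny-by-default: no rule took a position, so nothing is admitted.
        return False
    # Assumption of this sketch: the highest precedence number wins.
    top = max(precedence for _, precedence in positions)
    verdicts = {verdict for verdict, precedence in positions if precedence == top}
    # At equal precedence deny wins, so any deny at the top level refuses.
    return verdicts == {"allow"}
```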

Always-on: a public version must carry a strong integrity digest

Independent of the configurable rules above, Écluse enforces one non-negotiable admission policy on public (untrusted) upstreams: a version is served only if its dist carries at least one integrity digest whose algorithm meets the integrity floor (PROXY_MIN_PUBLIC_INTEGRITY, default SHA-256). A public version whose strongest digest is absent or below the floor — for example only a legacy SHA-1 shasum, with no sha256/sha512 SRI integrity — is inadmissible:

SHA-1 and MD5 have practical collisions, so a match on one can't prove the bytes weren't substituted; this closes a tamper-detection gap, since a divergence between two copies that share only a weak digest could go undetected. The floor may be raised (sha512, blake2b) but never set below SHA-256 — a sub-floor value is rejected at startup. The private (trusted) upstream is exempt: its versions enter unfiltered, so a SHA-1-only private version is still served (trust substitutes for cryptographic strength there).
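
A minimal Python sketch of the floor check follows, assuming the npm packument shape in which dist.shasum is a SHA-1 hex digest and dist.integrity is a space-separated list of SRI entries. The function names and the strength ranking are hypothetical, and blake2b is left out because this manual does not state how it orders against sha512.

```python
# Assumption: a simple strength ranking; blake2b is intentionally omitted.
STRENGTH = {"md5": 0, "sha1": 1, "sha256": 2, "sha512": 3}


def dist_algorithms(dist: dict) -> set:
    """Collect the digest algorithms a version's dist object carries."""
    algos = set()
    if dist.get("shasum"):
        algos.add("sha1")  # the legacy shasum field is SHA-1
    for entry in str(dist.get("integrity") or "").split():
        algo, sep, digest = entry.partition("-")
        if sep and digest:
            algos.add(algo.lower())
    return algos


def public_version_admissible(dist: dict, floor: str = "sha256") -> bool:
    """True if the strongest digest present meets the configured floor."""
    if STRENGTH.get(floor, -1) < STRENGTH["sha256"]:
        # Mirrors the startup rejection of a sub-floor or unknown name.
        raise ValueError("integrity floor must be sha256 or stronger")
    best = max((STRENGTH.get(a, -1) for a in dist_algorithms(dist)), default=-1)
    return best >= STRENGTH[floor]
```

A dist carrying both a shasum and a sha512 integrity entry passes, because only the strongest digest is compared against the floor; a dist with the shasum alone does not.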

Gotcha. If a custom or off-spec public upstream serves versions without a digest meeting the floor (no integrity, or only a legacy shasum), those versions silently disappear from what Écluse serves and a direct fetch 403s. This is deliberate. If you genuinely need to serve such a source, and only if it is trustworthy and you accept the weaker digest, point it at the private (trusted) upstream slot, not the public one. Lowering the floor is not a workaround: PROXY_MIN_PUBLIC_INTEGRITY is the one knob that cannot be set below SHA-256. See Security Policy.

Operating Écluse

Planned controls

Documented here so the configuration surface and its security trade-off are known ahead of implementation. Écluse's posture is secure by default, with overrides under your explicit control: you decide your threat tolerance.

The full deployment runbook ships with the launch (S32).

Learn more

The internal design, for when you need the why: