ecluse
Safe Haskell: None
Language: GHC2021

Ecluse.Server.Route

Description

The shared serve-action vocabulary of the front door, and the agnostic default router.

A Route is one classified request — everything the proxy is willing to serve, named independently of any ecosystem's URL grammar. The actions are common across registries (fetch a packument, stream a tarball, answer a liveness probe, deny a search); only the URL→action mapping is ecosystem-specific. That mapping is a Classifier, injected at the composition root, so this module stays free of any one ecosystem's path conventions while the dispatcher routes through whatever classifier its mount carries.

The model is deny by default, mirroring the rules engine (Ecluse.Rules): the agnostic default denyAll classifies every path as Unsupported (a 404 at the edge), so a deployment that wires no ecosystem router serves nothing rather than guessing. An ecosystem adapter supplies a Classifier that recognises its own paths and falls back to Unsupported for the rest.

Route is a small sum type so the whole routing table is unit-testable with no server: feed a Classifier some segments, assert the Route.
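As a rough illustration of that testing style, the sketch below rebuilds the Route type from the constructor list on this page and checks a toy classifier against a table of expected routes. PackageName and Version are stand-in newtypes (the real ones are defined elsewhere in ecluse), and toyClassifier with its path grammar is hypothetical, not any real registry's. It omits the Tarball case, since how a real classifier reads the Version out of the artifact name is ecosystem-specific.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import Data.Text (Text)

-- Stand-ins for illustration: the real PackageName and Version
-- are defined elsewhere in ecluse and may look different.
newtype PackageName = PackageName Text deriving (Show, Eq)
newtype Version = Version Text deriving (Show, Eq)
newtype Filename = Filename Text deriving (Show, Eq)

-- The constructors as listed on this page.
data Route
  = Packument PackageName
  | Tarball PackageName Version Filename
  | Ping
  | Search
  | Unsupported
  deriving (Show, Eq)

type Classifier = [Text] -> Route

-- Hypothetical toy grammar: recognises three shapes and denies the rest.
toyClassifier :: Classifier
toyClassifier ["ping"] = Ping
toyClassifier ["search"] = Search
toyClassifier ["pkg", name] = Packument (PackageName name)
toyClassifier _ = Unsupported

-- The routing table as plain data: (input segments, expected route).
routingTable :: [([Text], Route)]
routingTable =
  [ (["ping"], Ping)
  , (["search"], Search)
  , (["pkg", "left-pad"], Packument (PackageName "left-pad"))
  , (["pkg"], Unsupported)
  , ([], Unsupported)
  , (["anything", "else"], Unsupported)
  ]

-- True when the classifier agrees with every row of the table.
tableHolds :: Classifier -> [([Text], Route)] -> Bool
tableHolds classify =
  all (\(segments, expected) -> classify segments == expected)
```

Because a Classifier is a pure function and Route derives Eq, a check such as tableHolds toyClassifier routingTable needs no server, socket, or upstream.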

Routes

data Route

A classified request. Everything the front door is willing to serve is one of these; an unrecognised path is Unsupported (deny by default).

The constructors are the proxy's actions, shared across ecosystems — the artifact a Tarball streams and the metadata a Packument merges are the same serve behaviour whether the upstream is npm, PyPI, or another registry. Only the mapping from a request path to one of these (a Classifier) is ecosystem-specific.

Constructors

Packument PackageName

A package-metadata request — the packument.

Tarball PackageName Version Filename

An artifact request, as a parsed coordinate: the package, the Version the classifier read out of the artifact name, and the Filename itself. The Version is the coordinate the rules gate on; the Filename is the artifact's on-the-wire name, preserved verbatim — it, not a name rebuilt from (package, version), is authoritative for fetching the bytes.

Ping

A registry liveness probe, answered locally.

Search

Package search (unsupported).

Unsupported

Anything unrecognised. Renders as a 404 — deny by default at the routing layer.

Instances

Show Route

Defined in Ecluse.Server.Route

Methods

showsPrec :: Int -> Route -> ShowS

show :: Route -> String

showList :: [Route] -> ShowS

Eq Route

Defined in Ecluse.Server.Route

Methods

(==) :: Route -> Route -> Bool

(/=) :: Route -> Route -> Bool

newtype Filename

An artifact's on-the-wire file name, the agnostic artifact-name type a Tarball route carries.

It is held as a distinct type, not a bare Text, because it is authoritative for fetching the bytes: the proxy fetches an artifact at the upstream path built from this exact name, never one reconstructed from (package, version), so that a registry whose artifact naming differs from the proxy's own convention still resolves. The name is preserved verbatim as received; the classifier that produces it has already applied the component-safety gate (isSafeComponent), so the value is safe to interpolate into a downstream URL.

Constructors

Filename Text 

Instances

Show Filename

Defined in Ecluse.Server.Route

Eq Filename

Defined in Ecluse.Server.Route

Classification

type Classifier = [Text] -> Route

The mapping from an ecosystem-native request path to a Route.

A classifier sees the already-mount-stripped, percent-decoded path segments and returns the serve action. Each ecosystem adapter contributes its own — recognising its path grammar and denying everything else — so the agnostic dispatcher stays closed while every mount routes through its ecosystem's template. Dispatch chooses the classifier per matched mount (see Ecluse.Server), so the same shape carries either a single ecosystem or a mount-keyed selection.

denyAll :: Classifier

The agnostic default classifier: every path is Unsupported.

This is the deny-by-default base a deployment runs with until a composition root wires an ecosystem's classifier in, so an unwired server serves nothing rather than guessing a grammar. It deliberately knows no path conventions of its own.
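A minimal sketch of how denyAll can serve as the fallback in a mount-keyed selection follows. The Route type is cut down to two constructors for brevity, and Mounts, classifierFor, pingOnly, and the idea of keying mounts by a Text name are all assumptions made for illustration; the real dispatch lives in Ecluse.Server and may be shaped differently.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import qualified Data.Map.Strict as Map
import Data.Text (Text)

-- Reduced stand-in for Route: just enough constructors for the sketch.
data Route = Ping | Unsupported deriving (Show, Eq)

type Classifier = [Text] -> Route

-- Every path is Unsupported; the classifier knows no grammar of its own.
denyAll :: Classifier
denyAll = const Unsupported

-- Hypothetical ecosystem classifier: one recognised path, the rest denied.
pingOnly :: Classifier
pingOnly ["ping"] = Ping
pingOnly _ = Unsupported

-- Hypothetical mount table: mount name to that mount's classifier.
type Mounts = Map.Map Text Classifier

-- Pick the classifier for a mount, falling back to denyAll so an
-- unwired mount serves nothing rather than guessing a grammar.
classifierFor :: Mounts -> Text -> Classifier
classifierFor mounts mount = Map.findWithDefault denyAll mount mounts
```

With this shape, a deployment with an empty mount table behaves like denyAll for every request, which matches the deny-by-default stance described above.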

Component safety

isSafeComponent :: Text -> Bool

Whether a single decoded path component is safe to interpolate into an upstream URL. This is the deny-by-default gate a classifier applies to every component it accepts (a scope, base name, or tarball filename).

The path is percent-decoded before it reaches us, so a single segment can carry a '/', a '\\', a control character, or be "."/".."; any of these enables path traversal or request smuggling once the name reaches the upstream URL. A component is UNSAFE iff it is empty, is exactly "." or "..", or contains a '/', a '\\', or any isControl character. Everything else is accepted: this is a security boundary, not an ecosystem-policy validator, so ordinary names with interior dots (lodash.merge, is.odd), hyphens, underscores, digits, or uppercase all pass.

It lives in the agnostic layer because the threat — interpolating a hostile segment into an upstream URL — is ecosystem-independent; both an ecosystem's path classifier and the defence-in-depth check in Ecluse.Security share this one rule.

This gate is structural: it stops a component that would change the upstream URL's shape (a traversal, an embedded separator, a control character). It does not stop a component that carries other URL-reserved bytes — a '%', '?', '#', ';', or a space — which an accepted name can still hold (notably a once-decoded segment carrying a literal %2e%2e%2f). Those are neutralised not by widening this denylist but by percent-encoding every accepted component with encodeComponent when the upstream URL is built, so the safety of an interpolated component rests on encode-on-build, not on this gate alone.
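The rule stated above (unsafe iff empty, exactly "." or "..", or containing a '/', a '\\', or an isControl character) can be sketched directly. This is a reconstruction from the prose, not the module's actual source, so the real definition may be written differently.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import Data.Char (isControl)
import Data.Text (Text)
import qualified Data.Text as T

-- A component is safe unless it is empty, is "." or "..", or contains
-- a path separator ('/' or '\\') or any control character.
isSafeComponent :: Text -> Bool
isSafeComponent component =
  not (T.null component)
    && component /= "."
    && component /= ".."
    && not (T.any unsafeChar component)
  where
    unsafeChar c = c == '/' || c == '\\' || isControl c
```

Under this sketch, "lodash.merge" and "is.odd" pass, while "", "..", "a/b", and a name containing a newline are rejected. A once-decoded "%2e%2e%2f" also passes, which is exactly the case left to encodeComponent.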

encodeComponent :: Text -> Text

Percent-encode a single decoded path component for safe interpolation into an upstream URL — the encode-on-build partner of isSafeComponent.

A component is the content between a URL's structural delimiters (a scope, base name, or filename), never the delimiters themselves, so this encodes conservatively: it keeps only the RFC 3986 unreserved set (A-Z, a-z, 0-9, and '-', '.', '_', '~') verbatim and percent-encodes every other byte of the component's UTF-8 encoding as %XX (upper-case hex). A caller composing a path therefore writes the structural '/', scope %2F, '@' sigil, and the like itself, around encoded components — so a '%', '/', '?', '#', ';', space, or control byte inside a component cannot alter the URL's shape, inject a query or fragment, or — the once-decoded %2e%2e%2f case — survive as a live escape a decode-and-normalise upstream could resolve to traversal.

Encoding is per-byte over the UTF-8 form, so a multi-byte character is encoded one %XX per byte ('é' becomes %C3%A9). Encoding is not idempotent over an already-percent-encoded escape: a literal '%' is always re-encoded to %25. That is the point: the component is decoded content, so any '%' in it is a literal to be escaped, not a structural escape to preserve.
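The behaviour described for encodeComponent can be sketched as below: keep the RFC 3986 unreserved set verbatim and emit every other UTF-8 byte as upper-case %XX. This is a reconstruction from the description, assuming nothing about the module's real implementation; encodeByte and isUnreserved are helper names introduced here for illustration.

```haskell
import qualified Data.ByteString as BS
import Data.Char (chr, intToDigit, isAlphaNum, isAscii, toUpper)
import Data.Text (Text)
import qualified Data.Text as T
import qualified Data.Text.Encoding as TE
import Data.Word (Word8)

-- Percent-encode one decoded component, byte by byte over its UTF-8 form.
encodeComponent :: Text -> Text
encodeComponent = T.pack . concatMap encodeByte . BS.unpack . TE.encodeUtf8

-- Hypothetical helper: an unreserved byte stays as-is, any other byte
-- becomes '%' followed by two upper-case hex digits.
encodeByte :: Word8 -> String
encodeByte byte
  | isUnreserved c = [c]
  | otherwise = ['%', hexDigit (n `div` 16), hexDigit (n `mod` 16)]
  where
    n = fromIntegral byte :: Int
    c = chr n
    hexDigit = toUpper . intToDigit

-- Hypothetical helper: the RFC 3986 unreserved set
-- (A-Z, a-z, 0-9, '-', '.', '_', '~').
isUnreserved :: Char -> Bool
isUnreserved c = isAscii c && (isAlphaNum c || c `elem` "-._~")
```

In this sketch "lodash.merge" comes back unchanged, "a b" becomes "a%20b", and the once-decoded "%2e%2e%2f" becomes "%252e%252e%252f", so it can no longer be decoded into a traversal by the upstream.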