ecluse
Safe Haskell: None
Language: GHC2021

Ecluse.Server.Cache

Description

The short-TTL, size-bounded metadata cache shared by the serve paths.

Resolving a package re-fetches its upstream packument, parses it, and evaluates the rules. To avoid repeating the fetch+parse, the result is held here in a short-TTL, size-bounded, STM-backed cache (the cache library backs the TTL store). That result is a CacheEntry: a coherent pair of the parsed packument metadata (PackageInfo) and the raw document it was decoded from (Value). Both serve paths share it: a packument request and the tarball-gating fetch that follows reuse one fetch+parse, and concurrent resolutions of a popular package collapse to one upstream call (single-flight).

Per-source key

A packument is fetched from two distinct upstreams — a private origin and a public origin — whose documents differ for the same package, so one entry cannot represent both. The key is therefore (source, package): the source is the upstream's base URL, which distinguishes any cached origin without naming a credential, so distinct upstreams never cross-contaminate and the key never blurs the trust split.
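The composite key can be pictured with a small sketch. The types below are simplified stand-ins (String rather than Text), the URLs are placeholders, and a plain Map stands in for the real TTL store; none of this is the module's actual code.

```haskell
import qualified Data.Map.Strict as Map
import Data.Map.Strict (Map)

-- Hypothetical stand-ins for the module's Source and PackageName types.
newtype Source = Source String deriving (Show, Eq, Ord)
newtype PackageName = PackageName String deriving (Show, Eq, Ord)

-- The composite key: which upstream first, then which package.
type CacheKey = (Source, PackageName)

-- Two upstreams holding different documents for the same package name.
-- The URLs are placeholders, not real origins.
exampleStore :: Map CacheKey String
exampleStore =
  Map.fromList
    [ ((Source "https://private.example.test", PackageName "left-pad"), "private packument")
    , ((Source "https://public.example.test", PackageName "left-pad"), "public packument")
    ]

-- A lookup always names the source, so one origin's entry cannot answer
-- for the other.
lookupDoc :: Source -> PackageName -> Map CacheKey String -> Maybe String
lookupDoc src pkg = Map.lookup (src, pkg)
```

Because the source is part of the key, the two entries coexist and each lookup returns only its own upstream's document.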

Credential-free; sharing is the caller's policy

This cache is strategy-neutral: its key carries no credential dimension (it is (source, package)) and its value is a canonical document, so it stores nothing derived from a caller's credential. Whether a given origin is handed to it — and so shared across clients — is the serve path's decision, not the cache's.

Under the default passthrough access strategy only the anonymous public origin is cached. The trusted private upstream is the per-client authority (it re-authorises each client's request with that client's own forwarded credential), so the serve path fetches it per request and never hands it to this cache. Were a private entry cached under passthrough, the credential-free key would let one client's entry serve another client's private document within the TTL, bypassing the upstream's authorisation. The public origin is anonymous (no client credential), so one shared entry serves every client without crossing any trust boundary. Other strategies make a shared private entry safe by authorising each serve before it is returned (see docs/architecture/access-model.md → "Caching"); that gate lives on the serve path, never in this credential-free store.

Coherent pair

An entry holds the parsed PackageInfo and the raw Value it was decoded from, so a hit returns a typed view and the exact bytes that produced it — never a mismatched pair. The packument serve path needs both: it decides over the typed view but serves the raw document edited in place, and the two must describe the same fetch.

What is cached is the metadata, not the verdict. The rules are re-evaluated on the cached metadata on each request, so time-sensitive rules (AllowIfPublishedBefore) and the separately-synced advisory tier stay correct: only each upstream's fetch+parse is memoised, never a decision. The TTL is short, so staleness is brief and benign, and it is even aligned with the resilience posture: a brand-new publish need not appear instantly (see docs/architecture/web-layer.md → "Metadata cache").
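To illustrate why the verdict itself is never memoised, here is a minimal sketch with a made-up time-sensitive rule. The rule, the simplified PackageInfo, and the integer timestamps are all assumptions for illustration; this is not the real AllowIfPublishedBefore semantics.

```haskell
-- Hypothetical, simplified stand-ins; times are whole seconds.
type Seconds = Int

newtype PackageInfo = PackageInfo { publishedAt :: Seconds }
  deriving (Show, Eq)

data Verdict = Allow | Deny
  deriving (Show, Eq)

-- | A made-up time-sensitive rule: allow a package only once it has been
-- published for at least 'minAge' seconds.
decide :: Seconds -> Seconds -> PackageInfo -> Verdict
decide minAge now info
  | now - publishedAt info >= minAge = Allow
  | otherwise = Deny

-- | The same cached metadata, decided at two different request times.
verdictsOverTime :: [Verdict]
verdictsOverTime = [decide 100 now cached | now <- [50, 150]]
  where
    cached = PackageInfo { publishedAt = 0 }
```

The cached metadata is identical for both requests, yet the decision changes as time passes. A cached verdict would have frozen the first answer for the rest of the TTL.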

Two properties the cache library does not provide on its own are layered here:

  • Size bound. The cache library expires entries by TTL but never bounds the entry count, so an insert that would exceed cacheMaxEntries first purges expired entries and then evicts surplus ones. This is a safety valve against unbounded growth under a flood of distinct packages, not a precision LRU (eviction order among live entries is unspecified).
  • Single-flight. The cache library's own fetchWithCache is lookup-then-fetch in plain IO, so two concurrent misses would both fetch. resolveMetadata instead installs an in-flight marker atomically, so the first miss fetches while concurrent misses wait on its result, collapsing a thundering herd to one upstream call. The leader inserts the result into the store before removing its in-flight marker, so a caller arriving in the instant the fetch returns still finds either the store entry or the marker (never a gap) and never re-leads a redundant fetch.
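The size bound's purge-then-evict order can be sketched as a pure function. This is an illustration under stated assumptions, not the module's implementation: a plain Map stands in for the TTL store, expiry times are integer seconds, and the surplus is evicted in key order purely for simplicity (the documented order is unspecified).

```haskell
import qualified Data.Map.Strict as Map
import Data.Map.Strict (Map)

type Seconds = Int

-- | Each value carries its absolute expiry time (hypothetical layout).
type Store k v = Map k (Seconds, v)

-- | Insert while holding at most 'maxEntries' entries (assumed >= 1).
-- If the insert would exceed the bound, expired entries are purged first,
-- then enough live entries are dropped to make room for the new one.
boundedInsert :: Ord k => Int -> Seconds -> k -> (Seconds, v) -> Store k v -> Store k v
boundedInsert maxEntries now key entry store
  -- Overwriting an existing key, or inserting below the bound, cannot overflow.
  | Map.member key store || Map.size store < maxEntries = Map.insert key entry store
  | otherwise = Map.insert key entry (evictTo (maxEntries - 1) live)
  where
    -- Step 1: purge everything whose expiry has passed.
    live = Map.filter (\(expiry, _) -> expiry > now) store
    -- Step 2: if still too many, drop the surplus (smallest keys first here,
    -- an arbitrary choice made only for this sketch).
    evictTo n m = Map.drop (Map.size m - n) m
```

When the purge alone frees enough room, no live entry is evicted; live entries are dropped only when the store is full of unexpired data.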

Configuration

data CacheConfig

The metadata cache's tunables, sourced from configuration (see Ecluse.Config): how long a parsed packument stays fresh, and how many distinct (source, package) entries the cache holds before it evicts.

Constructors

CacheConfig 

Fields

  • cacheTtl :: NominalDiffTime

    How long a cached CacheEntry is served before it is re-fetched. Short by design — brief staleness is benign, and conditional-GET revalidates.

  • cacheMaxEntries :: Int

    The maximum number of distinct (source, package) entries held; an insert past this evicts.

Instances

Show CacheConfig

Defined in Ecluse.Server.Cache

Eq CacheConfig

Defined in Ecluse.Server.Cache

defaultCacheConfig :: CacheConfig

The default cache tunables: a 60-second TTL and 1024 entries — short enough that a new publish appears promptly, large enough to absorb a normal install's working set of packages.
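As a usage sketch, a deployment that wants different tunables could start from the defaults and override fields with record-update syntax. The record below is a local mirror of the documented one so the example stands alone, and smallCacheConfig with its values is hypothetical.

```haskell
import Data.Time.Clock (NominalDiffTime)

-- Local mirror of the documented record, so this example is self-contained.
data CacheConfig = CacheConfig
  { cacheTtl :: NominalDiffTime -- seconds
  , cacheMaxEntries :: Int
  }
  deriving (Show, Eq)

-- The documented defaults: a 60-second TTL and 1024 entries.
defaultCacheConfig :: CacheConfig
defaultCacheConfig = CacheConfig { cacheTtl = 60, cacheMaxEntries = 1024 }

-- A hypothetical override for a smaller deployment: shorter TTL, fewer entries.
smallCacheConfig :: CacheConfig
smallCacheConfig = defaultCacheConfig { cacheTtl = 15, cacheMaxEntries = 256 }
```

A numeric literal at type NominalDiffTime is read as seconds, so `cacheTtl = 15` means fifteen seconds.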

The cache handle

data MetadataCache

The metadata-cache handle: the TTL store, the size bound, and the in-flight map that gives single-flight. Opaque: built with newMetadataCache and reached only through the accessors. Lives in Env (one per process), so every request shares the same cache and its single-flight collapsing.

newMetadataCache :: CacheConfig -> IO MetadataCache

Build a metadata cache from its configuration. The TTL is converted to the cache library's monotonic TimeSpec; the in-flight map starts empty.
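The arithmetic behind such a TTL conversion might look like the following. This is a sketch only: ttlToNanoseconds is a hypothetical helper, and the source does not say how the module performs the conversion.

```haskell
import Data.Time.Clock (NominalDiffTime)

-- | Convert a TTL in seconds to whole nanoseconds, truncating any
-- sub-nanosecond remainder. (Hypothetical helper, not the module's code.)
ttlToNanoseconds :: NominalDiffTime -> Integer
ttlToNanoseconds ttl = truncate (ttl * 1000000000)
```

A nanosecond count of this kind is the sort of value that the clock package's fromNanoSecs turns into a TimeSpec; whether newMetadataCache takes that route is an assumption.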

Cache entries

newtype Source

Which upstream a cached packument was fetched from — the dimension that partitions the cache by source so distinct upstreams never share an entry.

The discriminator is the upstream's base URL: an upstream is addressed at a distinct URL, and the URL names a location, never a credential, so keying on it keeps the trust split intact (the cached origin is fetched with its own token, supplied through its fetch action; the source carries none). Under the default passthrough strategy only the anonymous public origin is cached, so in practice the cache holds one source per package; the dimension keeps the key honest about which upstream an entry is, never blurring the split.

Constructors

Source Text 

Instances

Show Source

Defined in Ecluse.Server.Cache

Eq Source

Defined in Ecluse.Server.Cache

Methods

(==) :: Source -> Source -> Bool

(/=) :: Source -> Source -> Bool

Ord Source

Defined in Ecluse.Server.Cache

data CacheEntry

A coherent cache entry: the parsed PackageInfo paired with the raw Value it was decoded from. A hit returns both, so a caller gets a typed view to decide over and the exact bytes that produced it — the packument serve path edits the raw Value in place and must keep its typed decision coherent with those bytes.

Constructors

CacheEntry 

Fields

  • entryInfo :: PackageInfo

    The typed packument view the rules and merge reason over.

  • entryRaw :: Value

    The raw upstream document the served body is built from, edited in place.

Instances

Show CacheEntry

Defined in Ecluse.Server.Cache

Eq CacheEntry

Defined in Ecluse.Server.Cache

Resolution

resolveMetadata :: Metrics -> MetadataCache -> Source -> PackageName -> IO CacheEntry -> IO CacheEntry

Resolve a package's metadata from one upstream Source, reusing the cache and collapsing concurrent misses.

On an unexpired hit the cached CacheEntry is returned and the fetch action is never run. On a miss the action runs exactly once even under concurrent callers: the first installs an in-flight marker and fetches, the others wait on its result. A successful fetch is cached (subject to the TTL and size bound); a failed fetch caches nothing (so a transient upstream error does not poison the cache) and is re-raised to every waiter.

A claimed in-flight slot is always eventually filled and de-registered, even if the leader is hit by an async exception (a request timeout, a killed handler thread) between claiming the slot and completing. The claim and the leader's cleanup-armed run live under one mask, so no interruptible point sits in the gap. Any exception before normal completion fills the marker with that error, which unblocks every waiting follower with it rather than leaving them parked forever, and frees the slot so a later call re-leads. This closes the single-flight orphan window (without it, a cancelled leader would wedge that (source, package) key until restart). A follower's own wait on the marker stays interruptible.

The Source partitions the cache: distinct upstreams of the same package resolve under distinct keys and never cross-contaminate. The fetch action supplies the origin's own credential, so reading through one source never blurs another's trust posture. Under the default passthrough strategy only the anonymous public origin is resolved here — the trusted private origin is the per-client authority and is fetched per request, never cached, so a shared entry can never serve one client another's private document.

The result is always re-decided by the caller's rules on each request — only the fetch+parse is memoised, never the verdict.

Each resolution records the ecluse.metadata_cache.requests hit/miss counter (a coalescing follower counts as a miss, like the leader it waits on), and a leader's insert refreshes the ecluse.metadata_cache.entries occupancy gauge.
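The claim, fill, and de-register sequence described above can be sketched with STM and mask. This is a minimal illustration under assumptions, not the module's implementation: there is no TTL, size bound, or metrics, all names (SingleFlight, resolve, Role) are hypothetical, and the store insert and marker removal happen in a single transaction rather than as two ordered steps.

```haskell
import Control.Concurrent.STM
import Control.Exception (SomeException, mask, throwIO, try)
import qualified Data.Map.Strict as Map

-- | The slot followers wait on: empty until the leader fills it.
type Slot v = TMVar (Either SomeException v)

data SingleFlight k v = SingleFlight
  { sfStore :: TVar (Map.Map k v)           -- completed results (no TTL here)
  , sfInFlight :: TVar (Map.Map k (Slot v)) -- markers for fetches in progress
  }

newSingleFlight :: IO (SingleFlight k v)
newSingleFlight = SingleFlight <$> newTVarIO Map.empty <*> newTVarIO Map.empty

data Role v = Hit v | Follow (Slot v) | Lead (Slot v)

resolve :: Ord k => SingleFlight k v -> k -> IO v -> IO v
resolve sf key fetch = mask $ \restore -> do
  -- Decide the caller's role atomically. This transaction never blocks,
  -- so under mask there is no interruptible point between claiming the
  -- slot and arming the cleanup below.
  role <- atomically $ do
    store <- readTVar (sfStore sf)
    inFlight <- readTVar (sfInFlight sf)
    case (Map.lookup key store, Map.lookup key inFlight) of
      (Just v, _) -> pure (Hit v)
      (_, Just slot) -> pure (Follow slot)
      _ -> do
        slot <- newEmptyTMVar
        modifyTVar' (sfInFlight sf) (Map.insert key slot)
        pure (Lead slot)
  case role of
    Hit v -> pure v
    -- A follower's wait is unmasked, so it stays interruptible.
    Follow slot -> restore (atomically (readTMVar slot)) >>= either throwIO pure
    Lead slot -> do
      -- Only the fetch runs unmasked; any exception, sync or async,
      -- is captured as a Left.
      result <- try (restore fetch)
      atomically $ do
        -- Cache successes only; a failure caches nothing.
        either (const (pure ())) (modifyTVar' (sfStore sf) . Map.insert key) result
        putTMVar slot result                         -- unblock every follower
        modifyTVar' (sfInFlight sf) (Map.delete key) -- free the slot
      either throwIO pure result
```

One simplification to be aware of: a captured async exception is re-raised here with throwIO, which rethrows it synchronously. Production code may want to distinguish the two cases.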

resolveMetadataWith :: IO () -> Metrics -> MetadataCache -> Source -> PackageName -> IO CacheEntry -> IO CacheEntry

As resolveMetadata, but with a hook run on the leading thread at the single-flight claim → fetch-runner handoff: the window between the STM transaction committing the in-flight claim and the leader's exception guard taking ownership of the marker. It exists only so a test can deterministically park a leader in that window and cancel it there, exercising the orphan-window guarantee; production always passes pure () via resolveMetadata.

cachedMetadata :: MetadataCache -> Source -> PackageName -> IO (Maybe CacheEntry)

Look up a package's cached entry for one Source without fetching on a miss — the cache's read-only view, for inspection and tests. A Nothing is a miss or an expired entry; this never triggers a fetch and never collapses (use resolveMetadata for the serve path).

cacheSize :: MetadataCache -> IO Int

The number of entries currently held (including any expired entries not yet purged).