Cache (memoization)

WebPPL provides a cache combinator that returns a new function whose results are remembered and reused. This is commonly called memoization.

At first glance, cache feels very “functional”: for the same input you get the same output. But the mechanism that makes this possible is state.

So cache is a good example of a deliberate, useful tension:

cache is FP "returning" to stateful computation, on purpose, so that you can model and optimize reliably.

What cache does

Given a function f, cache(f) returns a function g such that:

  • On the first call with an argument x, g(x) computes f(x) and stores the result.

  • On later calls with the same argument x, g(x) returns the stored result instead of recomputing.

In other words, cache creates a hidden lookup table:

  • key: the argument(s)

  • value: the computed result

From the outside, the cached function looks like a deterministic mapping. Inside, it maintains mutable state (the memo table).
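The mechanism can be sketched in a few lines of plain JavaScript (not WebPPL; myCache is a hypothetical name for illustration):

```javascript
// A minimal sketch of what a cache combinator does: mutable state
// (a memo table) hidden behind a functional interface (a closure).
function myCache(f) {
  var table = new Map();  // the hidden, mutable lookup table
  return function () {
    // Key on the serialized argument list so multi-argument calls work.
    var key = JSON.stringify(Array.prototype.slice.call(arguments));
    if (!table.has(key)) {
      // First call with these arguments: compute and store.
      table.set(key, f.apply(null, arguments));
    }
    // Later calls with the same arguments: reuse the stored result.
    return table.get(key);
  };
}
```

From the caller's point of view, myCache(f) is just another function; the Map is reachable only through the closure, which is exactly the hidden lookup table described above.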

Why this matters for functional programming

In “pure” functional programming, a function should have no hidden memory: it should not depend on or modify hidden state.

cache violates that purity internally, but it is still widely used in functional settings because:

  • it preserves the interface of a function,

  • it can preserve the observable behavior (for pure functions),

  • it gives large performance wins.

So you can think of cache as:

  • a controlled “imperative” mechanism (state),

  • wrapped in a functional interface (a returned function),

  • to achieve a good goal (speed or stable latent structure).

In probabilistic programming, there is an additional motivation: cache can express a single latent random choice per entity.

When cache is safe: pure/deterministic computations

If f is deterministic (no randomness, no side effects), then caching is behavior-preserving: it only makes things faster.

Example: caching an expensive deterministic computation

var expensive = function(x) {
  // imagine something computationally heavy here
  return x * x + 1;
};

var fastExpensive = cache(expensive);

display(fastExpensive(10)); // computes and stores
display(fastExpensive(10)); // reuses cached result
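The same point can be made observable in plain JavaScript by counting how often the body actually runs (memoize is a hypothetical stand-in for WebPPL's cache):

```javascript
// Deterministic caching changes cost, not results: the body runs once
// per distinct argument. `memoize` stands in for WebPPL's cache.
function memoize(f) {
  var table = {};
  return function (x) {
    if (!(x in table)) { table[x] = f(x); }
    return table[x];
  };
}

var evaluations = 0;
var expensive = function (x) {
  evaluations += 1;             // count how often the body actually runs
  return x * x + 1;
};
var fastExpensive = memoize(expensive);

console.log(fastExpensive(10)); // 101 (computed)
console.log(fastExpensive(10)); // 101 (reused, not recomputed)
console.log(evaluations);       // 1
```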

When cache is a modeling tool: one latent value per key

If the body of f contains randomness, then caching changes the meaning:

  • without caching: each call can create a new random choice

  • with caching: the first random choice is stored and reused for that key

This is often exactly what you want when modeling hidden traits.

Good modeling example: a stable latent trait per individual

Suppose each person has an unobserved “type” (A or B) that should be consistent everywhere in the program.

// A latent trait per person id
var personType = cache(function(id) {
  return flip(0.3) ? "A" : "B";
});

display(personType("anna"));
display(personType("anna")); // same as above: "anna" keeps one latent type
display(personType("bela")); // different id => independent latent type

Here cache is not just an optimization; it expresses a modeling assumption:

  • each id has exactly one latent draw.

This is a clean example of how probabilistic modeling often uses a small amount of state internally to achieve a coherent functional model structure.
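A plain JavaScript sketch makes the per-key stability concrete (Math.random() stands in for flip, and a hypothetical memoize helper stands in for cache):

```javascript
// One latent draw per id: the first lookup samples, later lookups reuse.
// `memoize` stands in for WebPPL's cache; Math.random() stands in for flip.
function memoize(f) {
  var table = {};
  return function (id) {
    if (!(id in table)) { table[id] = f(id); }
    return table[id];
  };
}

var personType = memoize(function (id) {
  return Math.random() < 0.3 ? "A" : "B";  // the single latent draw for this id
});

var anna1 = personType("anna");
var anna2 = personType("anna");  // reuses anna's stored type
console.log(anna1 === anna2);    // true: exactly one latent type per id
console.log(personType("bela")); // a separate draw for a different id
```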

When cache is dangerous: you accidentally remove independence

If you intended independent random draws on each call, caching is the wrong tool. It will “glue together” calls that should be independent.

Bad example: you wanted a fresh coin flip every time

var coin = cache(function() { return flip(0.5); });

display(coin()); // first draw
display(coin()); // WARNING: returns the same result as above (cached)

This is usually a bug, because it turns two intended independent flips into one shared latent flip.
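In plain JavaScript the pitfall is visible immediately: with no arguments there is only one cache key, so every call returns the very first draw (memoize is again a hypothetical stand-in for cache):

```javascript
// Caching a zero-argument random function collapses all calls onto a
// single key, so the first flip is frozen forever.
function memoize(f) {
  var done = false, stored;
  return function () {
    if (!done) { stored = f(); done = true; }  // first call: draw and freeze
    return stored;                             // later calls: same value
  };
}

var coin = memoize(function () { return Math.random() < 0.5; });

var flip1 = coin();
var flip2 = coin();
console.log(flip1 === flip2); // true on every run: independence is gone
```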

A simple rule of thumb

Use cache when you want either:

  1. Optimization for expensive deterministic code, or

  2. One latent value per key (a stable hidden property tied to an id).

Avoid cache when you want:

  • independent random draws each time,

  • time-varying or history-dependent behavior (unless that is explicitly modeled).

See also

  • functions_and_closures (cache returns a function; it is a closure-like pattern)

  • immutability_lists_objects (why FP prefers immutable data, and what cache is doing differently)

  • Distributions: Bernoulli (for the flip examples above)

Executable example

// cache.wppl
// Demonstrates what cache does in WebPPL (memoization) and why it's "state sneaking back"
// into FP for a good purpose.

// 1) Deterministic function: caching is a pure optimization
var expensiveDeterministic = function(x) {
  // Pretend this is expensive; keep it simple & fast.
  return x * x + 1;
};

var fast = cache(expensiveDeterministic);

display("Deterministic caching (same input -> same output, just faster):");
display({first: fast(10), second: fast(10), third: fast(11)});

// 2) Stochastic function: caching changes semantics (creates a stable latent value per key)
var personType = cache(function(id) {
  // One latent draw per id:
  // first time we see an id -> sample; next time -> reuse.
  return flip(0.3) ? "A" : "B";
});

display("\nLatent trait per id (cache turns randomness into stable per-key state):");
display({
  anna1: personType("anna"),
  anna2: personType("anna"),
  bela1: personType("bela"),
  bela2: personType("bela")
});

// 3) Pitfall: accidental loss of independence
var badCoin = cache(function() { return flip(0.5); });

display("\nPitfall: caching a random draw removes independence across calls:");
display({coin1: badCoin(), coin2: badCoin(), note: "coin1 and coin2 are always identical here"});