Sampling and scoring

This page is the “root” reference for WebPPL’s two most important building blocks:

  • sample(dist[, opts]): create a random choice by drawing a value from a distribution object

  • dist.score(value): compute the log probability / log density assigned to a value

If you are ever unsure about syntax, shapes, or what an option does, come back here first.

Quick glossary (read this once)

distribution object

An object representing a probability distribution, such as Bernoulli({p: 0.7}) or Gaussian({mu: 0, sigma: 1}). Distribution objects support at least:

  • sampling: via sample(dist)

  • scoring: via dist.score(value)

random choice (sample site)

A place in your program where sample(...) is called. During inference, WebPPL treats each sample site as a stochastic “choice” whose value can be explored.

log probability / log density

WebPPL uses natural log values (base e). For discrete distributions, score returns log P(X = value). For continuous distributions, it returns a log density (not a probability).

inference

The process of turning a stochastic program (a model) into a distribution over its return values, usually approximately (e.g. via MCMC or SMC).

guide distribution

An auxiliary distribution used by some inference strategies as a proposal / approximation. It does not change the model itself; it changes how inference explores it.

drift kernel

A proposal mechanism for MCMC (MH-based) methods. It proposes new values based on the previous value at a sample site.

Distribution objects in one minute

A distribution object has two principal uses:

  1. Draw samples from it using sample(dist).

  2. Compute the (natural) log probability / density of a value using dist.score(value).

See also: the Distributions overview page.

Sampling: sample(dist[, opts])

Basic form

Use sample(dist) to draw one value from a distribution object.

  • For Bernoulli({p: ...}) the result is a boolean (true/false).

  • For continuous distributions like Gaussian(...) the result is a number.

Working example: one sample + scoring the sample

var d = Bernoulli({p: 0.7});
var x = sample(d);
var out = {
  sample: x,
  logp_of_sample: d.score(x),
  logp_true: d.score(true),
  logp_false: d.score(false)
};

out;
{
  sample: true,
  logp_of_sample: -0.35667494393873245,
  logp_true: -0.35667494393873245,
  logp_false: -1.203972804325936
}

Scoring: dist.score(value)

dist.score(value) returns the natural log of the probability (discrete) or density (continuous) that dist assigns to value.

Two practical notes:

  • Log space is used because probabilities can get extremely small.

  • If you ever need the probability (discrete), you can convert with Math.exp(logp).

For Bernoulli in particular:

  • score(true)  = log(p)

  • score(false) = log(1 - p)

(See also the Bernoulli page.)

From log probability to probability

WebPPL’s score returns values in log space (natural log). To convert a single log probability back to an ordinary probability:

  • p = Math.exp(logp)

Example (Bernoulli)

var d = Bernoulli({p: 0.7});

var logpTrue = d.score(true);   // log(0.7)
var logpFalse = d.score(false); // log(0.3)

var out = {
  logpTrue: logpTrue,
  pTrue: Math.exp(logpTrue),

  logpFalse: logpFalse,
  pFalse: Math.exp(logpFalse),

  checkSum: Math.exp(logpTrue) + Math.exp(logpFalse)
};

out;
{
  logpTrue: -0.35667494393873245,
  pTrue: 0.7,
  logpFalse: -1.203972804325936,
  pFalse: 0.30000000000000004,
  checkSum: 1
}

Normalizing a set of log scores (stable softmax)

When you have several log scores (e.g. for multiple outcomes), converting each with Math.exp(logp) and then normalizing can underflow to zero when the log scores are very negative.

A numerically stable pattern is:

  1. subtract the maximum log score

  2. exponentiate

  3. normalize

// Normalize log-scores stably using the "subtract max" trick (log-sum-exp style).

var normalizeLogProbs = function(logps) {
  var m = reduce(function(a, b) { return a > b ? a : b; }, -Infinity, logps);
  var shifted = map(function(x) { return x - m; }, logps);
  var ws = map(Math.exp, shifted);
  var z = sum(ws);
  return map(function(w) { return w / z; }, ws);
};

// A small example with Categorical-like weights
var logps = [Math.log(1), Math.log(2), Math.log(7)];
var ps = normalizeLogProbs(logps);

var out = {
  logps: logps,
  probs: ps,
  sum: sum(ps)
};

out;
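The same pattern can be sanity-checked outside WebPPL. Below is a plain-JavaScript sketch of the subtract-max trick (the function name normalizeLogProbs is ours, mirroring the example above), applied to log scores so negative that naive exponentiation underflows to zero:

```javascript
// Plain-JavaScript version of the "subtract max" normalization.
// With very negative log scores, naive exponentiation underflows to 0,
// but the shifted version still recovers the correct probabilities.
function normalizeLogProbs(logps) {
  const m = Math.max(...logps);
  const ws = logps.map(function (x) { return Math.exp(x - m); });
  const z = ws.reduce(function (a, b) { return a + b; }, 0);
  return ws.map(function (w) { return w / z; });
}

const logps = [-1000, -1001, -1002];
const naive = logps.map(function (x) { return Math.exp(x); });
const stable = normalizeLogProbs(logps);

console.log(naive);   // [0, 0, 0] -- all information lost to underflow
console.log(stable);  // proportional to [1, e^-1, e^-2]
```

The stable result is proportional to [1, e^-1, e^-2] and sums to 1, even though none of the raw probabilities is representable as a double.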

The optional second argument to sample

sample also accepts an optional second argument opts:

  • sample(dist, {guide: ...})

  • sample(dist, {driftKernel: ...})

These are inference controls: they affect how inference proposes values at a sample site, but they do not change the intended target distribution of the model.

Guide distributions

Definition (what is a guide?)

A guide distribution is an auxiliary distribution that some inference strategies can use instead of sampling directly from the model’s distribution at a sample site.

Syntax

A guide distribution is specified like this:

sample(dist, {guide: function() { return guideDist; }})

Where guideDist is another distribution object (e.g. a Gaussian with different parameters).

When does it matter?

It matters only when the inference method is told to use guides. For example, forward sampling accepts an option guide: true that draws each random choice from its guide instead of from the model's distribution. This is useful for debugging and for seeing a guide's effect directly.

Working example: forward sampling from model vs from guide

var model = function() {
  return sample(Gaussian({mu: 0, sigma: 1}), {
    guide: function() {
      return Gaussian({mu: 2, sigma: 1});
    }
  });
};

// One forward sample (as a distribution with 1 particle), then take that sample.
var oneForward = function(useGuide) {
  return Infer({method: 'forward', samples: 1, guide: useGuide, model: model}).sample();
};

var out = {
  fromModel: repeat(5, function() { return oneForward(false); }),
  fromGuide: repeat(5, function() { return oneForward(true); })
};

out;
{
  fromModel: [
    0.11924522582887023,
    0.38808096549942767,
    0.7184860219660659,
    0.22119172616379196,
    -1.390182678896135
  ],
  fromGuide: [
    1.6537032695322873,
    2.6175423879712083,
    2.022255309098623,
    2.816055576307594,
    2.033318965474878
  ]
}

Drift kernels

Definition (what is a drift kernel?)

A drift kernel is a function that maps the previous value of a random choice to a proposal distribution. It is mainly used by MH-based MCMC methods.

In other words: it tells MCMC how to propose a “nearby” value instead of proposing from the prior.

Syntax

A drift kernel is specified like this:

sample(dist, {driftKernel: function(prevVal) { return proposalDist; }})

Working example: MCMC with and without a drift kernel

var y = 0.0;

// A local random-walk proposal centered on the previous value.
var gaussianKernel = function(prevVal) {
  return Gaussian({mu: prevVal, sigma: 0.25});
};

var modelNoDrift = function() {
  var x = sample(Gaussian({mu: 0, sigma: 1}));
  observe(Gaussian({mu: x, sigma: 0.5}), y);
  return x;
};

var modelWithDrift = function() {
  var x = sample(Gaussian({mu: 0, sigma: 1}), {driftKernel: gaussianKernel});
  observe(Gaussian({mu: x, sigma: 0.5}), y);
  return x;
};

var postNo = Infer({method: 'MCMC', samples: 200, burn: 50, lag: 0, model: modelNoDrift});
var postYes = Infer({method: 'MCMC', samples: 200, burn: 50, lag: 0, model: modelWithDrift});

var out = {
  noDrift: repeat(5, function() { return postNo.sample(); }),
  withDrift: repeat(5, function() { return postYes.sample(); })
};

out;
{
  noDrift: [
    -0.06646820994639271,
    -1.4559684714821435,
    -0.19874645123021997,
    -0.3027843672245508,
    0.391805746701598
  ],
  withDrift: [
    -0.6810996596203992,
    0.8014740290586883,
    -0.2434066210167739,
    -0.40017017186879256,
    -0.33440032947124126
  ]
}
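To see why a local proposal works, it can be instructive to hand-roll the Metropolis-Hastings loop that WebPPL runs internally. The plain-JavaScript sketch below is our own simplified construction (the helper names and the LCG seed are illustrative, not part of WebPPL). It targets the same posterior as the model above: prior Gaussian({mu: 0, sigma: 1}) with observation 0.0 under Gaussian({mu: x, sigma: 0.5}), whose exact posterior is Gaussian with mean 0 and variance 0.2. Because the Gaussian drift proposal is symmetric, the acceptance ratio reduces to the posterior density ratio at the two points:

```javascript
// Hand-rolled random-walk Metropolis-Hastings with a "drift kernel" style
// proposal: propose near the previous value rather than from the prior.

// Small deterministic LCG so the demo is reproducible without a seed flag.
function makeRng(seed) {
  let s = seed >>> 0;
  return function () {
    s = (1664525 * s + 1013904223) >>> 0;
    return s / 4294967296;
  };
}

// Standard normal draw via Box-Muller.
function makeGaussian(rng) {
  return function () {
    const u = Math.max(rng(), 1e-12);
    const v = rng();
    return Math.sqrt(-2 * Math.log(u)) * Math.cos(2 * Math.PI * v);
  };
}

// Unnormalized log posterior: prior N(0, 1) plus likelihood N(x, 0.5) at y = 0.
function logPost(x) {
  return -0.5 * x * x - 0.5 * (x / 0.5) * (x / 0.5);
}

// Random-walk MH: propose N(prev, driftSigma); the symmetric proposal
// cancels in the acceptance ratio, leaving only the posterior ratio.
function mhChain(numSamples, driftSigma, seed) {
  const rng = makeRng(seed);
  const gauss = makeGaussian(rng);
  let x = 0;
  const samples = [];
  for (let i = 0; i < numSamples; i++) {
    const prop = x + driftSigma * gauss();
    const logAccept = logPost(prop) - logPost(x);
    if (Math.log(Math.max(rng(), 1e-12)) < logAccept) x = prop;
    samples.push(x);
  }
  return samples;
}

const samples = mhChain(20000, 0.25, 42);
const mean = samples.reduce(function (a, b) { return a + b; }, 0) / samples.length;
const variance = samples.reduce(function (a, b) {
  return a + (b - mean) * (b - mean);
}, 0) / samples.length;
console.log(mean.toFixed(3), variance.toFixed(3));  // mean near 0, variance near 0.2
```

Up to Monte Carlo error, the chain recovers the closed-form posterior mean 0 and variance 0.2.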

How to run examples locally

From the repository root you can run any example with:

  • npx webppl examples/distributions/<file>.wppl --random-seed 0

(We use --random-seed in the docs so outputs stay reproducible.)