Discrete vs. Categorical

Both Discrete and Categorical represent finite (discrete) distributions parameterized by a vector of non-negative weights.

They differ in what they return:

  • Discrete({ps: ...}) returns an index in {0, 1, ..., ps.length - 1}.

  • Categorical({ps: ..., vs: ...}) returns the corresponding value from vs.

If you remember only one thing, remember this: Discrete returns an index; Categorical returns a value.

Constructors

Discrete

Discrete({ps: ...})

  • ps: list/array of non-negative numbers (weights)

  • return value: an integer index

Categorical

Categorical({ps: ..., vs: ...})

  • vs: list/array of values (any type)

  • ps: list/array of non-negative numbers (weights), same length as vs

Uniform categorical (omit ps)

If you omit ps:

Categorical({vs: vs})

you get a uniform distribution over the values in vs.

Important: unnormalized weights

For both constructors, ps may be unnormalized. That is, ps is treated as weights and then internally normalized:

P(i) ∝ ps[i]

Example: ps = [1, 3, 6] corresponds to probabilities [0.1, 0.3, 0.6].

(Contrast: Multinomial requires a normalized probability vector.)
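To make the normalization concrete, here is a plain-JavaScript sketch (not WebPPL internals; the normalize helper is made up for illustration):

```javascript
// Sketch: how unnormalized weights become probabilities.
// WebPPL does this internally; this helper just makes it visible.
var normalize = function(weights) {
  var z = weights.reduce(function(a, b) { return a + b; }, 0); // normalizing constant
  return weights.map(function(w) { return w / z; });           // P(i) = ps[i] / sum(ps)
};

console.log(normalize([1, 3, 6])); // [ 0.1, 0.3, 0.6 ]
```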

Shape and typing rules (common gotchas)

  • ps must contain non-negative numbers.

  • In Categorical, ps.length must equal vs.length.

  • Discrete returns an index; to map to an actual value, you must index into your own vs list.

  • score expects the same type that sample would return:

      • Discrete({ps}).score(k) expects an integer index k.

      • Categorical({ps, vs}).score(v) expects a value from vs.
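The index-vs-value distinction behind these rules can be sketched in plain JavaScript (an illustrative inverse-CDF sketch, not WebPPL's actual implementation; u stands in for the uniform random draw and is passed in explicitly so the example is deterministic):

```javascript
// Illustrative sketch of the Discrete/Categorical sampling semantics.
// u stands in for a uniform draw in [0, 1).
var discreteSample = function(ps, u) {
  var z = ps.reduce(function(a, b) { return a + b; }, 0);
  var acc = 0;
  for (var i = 0; i < ps.length; i++) {
    acc += ps[i] / z;
    if (u < acc) { return i; }      // Discrete: an index into ps
  }
  return ps.length - 1;
};

var categoricalSample = function(ps, vs, u) {
  return vs[discreteSample(ps, u)]; // Categorical: the corresponding value
};

console.log(discreteSample([1, 3, 6], 0.95));                              // 2
console.log(categoricalSample([1, 3, 6], ["red", "green", "blue"], 0.95)); // blue
```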

Executable example

var vs = ["red", "green", "blue"];
var weights = [1, 3, 6]; // Unnormalized weights are allowed for Discrete/Categorical.

var dIndex = Discrete({ps: weights});
var dValue = Categorical({ps: weights, vs: vs});
var dUnif = Categorical({vs: vs}); // ps omitted -> uniform over vs

var i = sample(dIndex);
var v = sample(dValue);
var u = sample(dUnif);

// Show the implied probabilities explicitly (normalize weights)
var z = sum(weights);
var ps = map(function(w) { return w / z; }, weights);

var out = {
  vs: vs,
  weights: weights,
  normalized_probs: ps,

  discrete_sample_index: i,
  discrete_mapped_value: vs[i],

  categorical_sample_value: v,
  uniform_categorical_sample_value: u,

  // score expects the same type as sample returns:
  discrete_score_of_index_2: dIndex.score(2),
  categorical_score_of_blue: dValue.score("blue")
};

out;
{
  vs: [ 'red', 'green', 'blue' ],
  weights: [ 1, 3, 6 ],
  normalized_probs: [ 0.1, 0.3, 0.6 ],
  discrete_sample_index: 0,
  discrete_mapped_value: 'red',
  categorical_sample_value: 'red',
  uniform_categorical_sample_value: 'green',
  discrete_score_of_index_2: -0.5108256237659907,
  categorical_score_of_blue: -0.5108256237659907
}

Real-life pattern: weighted choice among actions

A common use case is selecting among options with different prior plausibilities (e.g. “try easy fix”, “reboot”, “ask for help”).

Categorical is often more convenient than Discrete here because it returns the option directly.

Tip: if you later need to attach more information, you can store objects in vs (e.g. {name: ..., cost: ...}).
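For example (a plain-JavaScript sketch; the action objects, weights, and weightedPick helper are all made up for illustration — in WebPPL you would pass the same objects directly as vs to Categorical):

```javascript
// Weighted choice among richer action objects (illustrative sketch).
// In WebPPL, these objects could be stored directly in vs.
var actions = [
  {name: "try easy fix", cost: 1},
  {name: "reboot", cost: 3},
  {name: "ask for help", cost: 10}
];
var weights = [6, 3, 1]; // prior plausibility of each action

// Inverse-CDF pick; u stands in for a uniform draw in [0, 1).
var weightedPick = function(ps, vs, u) {
  var z = ps.reduce(function(a, b) { return a + b; }, 0);
  var acc = 0;
  for (var i = 0; i < ps.length; i++) {
    acc += ps[i] / z;
    if (u < acc) { return vs[i]; }
  }
  return vs[vs.length - 1];
};

var chosen = weightedPick(weights, actions, 0.2);
console.log(chosen.name, chosen.cost); // try easy fix 1
```

Because the chosen element carries its whole object, downstream code can read chosen.cost directly instead of re-indexing into a parallel array.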

Measurement scales note

Categorical/Discrete are foundational because they match the first measurement levels in the classic scale taxonomy: nominal (categories with no inherent order) and ordinal (categories with an order). At these levels, observations are not “quantities” in the arithmetic sense but labels (nominal) or orderable labels (ordinal), so the natural probabilistic model is a finite choice among outcomes—exactly what Categorical/Discrete represent. For interval and ratio scales, differences and ratios are meaningful and one often uses continuous (or otherwise quantitative) distributions—though discretization is always possible when appropriate.

Examples: nominal vs ordinal

This example demonstrates how Categorical naturally models the first two measurement levels:

  • Nominal: outcomes are labels with no inherent order (e.g. colors).

  • Ordinal: outcomes are still labels, but we interpret them as ordered (e.g. low < medium < high).

The key point is that Categorical itself does not “know” about order. It simply returns one of the values in vs according to the weights in ps. If you want to perform numeric operations that rely on order (for example, compute an “average level”), you must explicitly encode your ordinal labels as ranks (e.g. low→1, medium→2, high→3).

The code below therefore prints three exact (enumerated) distributions:

  1. a nominal distribution over color labels,

  2. an ordinal distribution over level labels (still just labels),

  3. the same ordinal distribution after mapping labels to numeric ranks, which makes quantities like expected rank well-defined.

// Measurement scales: nominal vs ordinal with Categorical.
// We show the full distribution exactly via enumeration (deterministic output).

var summarize = function(d) {
  var supp = d.support();
  var probs = map(function(v) { return Math.exp(d.score(v)); }, supp);
  return {support: supp, probs: probs, sum: sum(probs)};
};

// NOMINAL: categories have no inherent order (e.g., colors).
var colors = ["red", "green", "blue"];
var colorWeights = [1, 3, 6]; // unnormalized weights OK

var colorDist = Infer({
  method: "enumerate",
  model: function() {
    return sample(Categorical({ps: colorWeights, vs: colors}));
  }
});

// ORDINAL: categories have an order (e.g., low < medium < high),
// but Categorical itself still just returns labels.
var levels = ["low", "medium", "high"];
var levelWeights = [2, 5, 3];

var levelDist = Infer({
  method: "enumerate",
  model: function() {
    return sample(Categorical({ps: levelWeights, vs: levels}));
  }
});

// If you need to do numeric operations (e.g., compute an expected "level"),
// you must explicitly map labels to ranks yourself:
var rank = function(x) {
  return x === "low" ? 1 : (x === "medium" ? 2 : 3);
};

var rankDist = Infer({
  method: "enumerate",
  model: function() {
    return rank(sample(Categorical({ps: levelWeights, vs: levels})));
  }
});

var out = {
  nominal_colors: summarize(colorDist),
  ordinal_levels_as_labels: summarize(levelDist),
  ordinal_levels_as_ranks: summarize(rankDist)
};

out;
{
  nominal_colors: {
    support: [ 'blue', 'green', 'red' ],
    probs: [ 0.6, 0.29999999999999993, 0.10000000000000002 ],
    sum: 1
  },
  ordinal_levels_as_labels: {
    support: [ 'high', 'medium', 'low' ],
    probs: [ 0.29999999999999993, 0.5000000000000001, 0.2 ],
    sum: 1
  },
  ordinal_levels_as_ranks: {
    support: [ 1, 2, 3 ],
    probs: [ 0.2, 0.5000000000000001, 0.29999999999999993 ],
    sum: 1
  }
}
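Once labels are encoded as ranks, quantities like the expected rank become well-defined. A plain-JavaScript check of the "average level" implied by levelWeights = [2, 5, 3] above (integer arithmetic first, a single division at the end, so the result is exact):

```javascript
// Expected rank under the ordinal distribution low < medium < high,
// using the weights from the example above: levelWeights = [2, 5, 3].
var ranks = [1, 2, 3];   // low -> 1, medium -> 2, high -> 3
var weights = [2, 5, 3];

var z = weights.reduce(function(a, b) { return a + b; }, 0);  // 10
var weightedSum = ranks.reduce(function(acc, r, i) {
  return acc + r * weights[i];
}, 0);                                                        // 1*2 + 2*5 + 3*3 = 21

var expectedRank = weightedSum / z;
console.log(expectedRank); // 2.1
```

This matches the rank distribution in the output above: 1*0.2 + 2*0.5 + 3*0.3 = 2.1.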