Discrete vs. Categorical
Both Discrete and Categorical represent finite (discrete) distributions parameterized by
a vector of non-negative weights.
They differ in what they return:
- Discrete({ps: ...}) returns an index in {0, 1, ..., ps.length - 1}.
- Categorical({ps: ..., vs: ...}) returns the corresponding value from vs.
If you remember only one thing, remember this: Discrete returns an index; Categorical returns a value.
Constructors
Discrete
Discrete({ps: ...})
ps: list/array of non-negative numbers (weights)
return value: an integer index
Categorical
Categorical({ps: ..., vs: ...})
vs: list/array of values (any type)
ps: list/array of non-negative numbers (weights), same length as vs
Uniform categorical (omit ps)
If you omit ps:
Categorical({vs: vs})
you get a uniform distribution over the values in vs.
Important: unnormalized weights
For both constructors, ps may be unnormalized.
That is, ps is treated as weights and then internally normalized:
P(i) ∝ ps[i]
Example: ps = [1, 3, 6] corresponds to probabilities [0.1, 0.3, 0.6].
(Contrast: Multinomial requires a normalized probability vector.)
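The normalization P(i) ∝ ps[i] amounts to dividing each weight by the total. As a plain-JavaScript sketch of that arithmetic (illustrative only; not the library's actual implementation):

```javascript
// Normalize a weight vector into probabilities: P(i) = ps[i] / sum(ps).
function normalize(ps) {
  var z = ps.reduce(function(a, b) { return a + b; }, 0);
  return ps.map(function(w) { return w / z; });
}

console.log(normalize([1, 3, 6])); // [0.1, 0.3, 0.6]
```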
Shape and typing rules (common gotchas)
- ps must contain non-negative numbers.
- In Categorical, ps.length must equal vs.length.
- Discrete returns an index; to map it to an actual value, you must index into your own vs list.
- score expects the same type that sample would return:
  - Discrete({ps}).score(k) expects an integer index k.
  - Categorical({ps, vs}).score(v) expects a value from vs.
Executable example
var vs = ["red", "green", "blue"];
var weights = [1, 3, 6]; // Unnormalized weights are allowed for Discrete/Categorical.

var dIndex = Discrete({ps: weights});
var dValue = Categorical({ps: weights, vs: vs});
var dUnif = Categorical({vs: vs}); // ps omitted -> uniform over vs

var i = sample(dIndex);
var v = sample(dValue);
var u = sample(dUnif);

// Show the implied probabilities explicitly (normalize weights)
var z = sum(weights);
var ps = map(function(w) { return w / z; }, weights);

var out = {
  vs: vs,
  weights: weights,
  normalized_probs: ps,

  discrete_sample_index: i,
  discrete_mapped_value: vs[i],

  categorical_sample_value: v,
  uniform_categorical_sample_value: u,

  // score expects the same type as sample returns:
  discrete_score_of_index_2: dIndex.score(2),
  categorical_score_of_blue: dValue.score("blue")
};

out;
{
vs: [ 'red', 'green', 'blue' ],
weights: [ 1, 3, 6 ],
normalized_probs: [ 0.1, 0.3, 0.6 ],
discrete_sample_index: 0,
discrete_mapped_value: 'red',
categorical_sample_value: 'red',
uniform_categorical_sample_value: 'green',
discrete_score_of_index_2: -0.5108256237659907,
categorical_score_of_blue: -0.5108256237659907
}
Real-life pattern: weighted choice among actions
A common use case is selecting among options with different prior plausibilities (e.g. “try easy fix”, “reboot”, “ask for help”).
Categorical is often more convenient than Discrete here because it returns the option directly.
Tip: if you later need to attach more information, you can store objects in vs (e.g. {name: ..., cost: ...}).
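The behavior of sampling a Categorical over action objects can be sketched in plain JavaScript (the `weightedChoice` helper and the action names are illustrative, mimicking what sampling Categorical({ps, vs}) does rather than using the actual library):

```javascript
// Weighted choice: pick one value with probability proportional to its weight.
// This mimics what sampling Categorical({ps, vs}) does.
function weightedChoice(vs, ps) {
  var z = ps.reduce(function(a, b) { return a + b; }, 0);
  var r = Math.random() * z;       // uniform draw in [0, z)
  var acc = 0;
  for (var i = 0; i < vs.length; i++) {
    acc += ps[i];
    if (r < acc) { return vs[i]; } // r falls in the i-th weight interval
  }
  return vs[vs.length - 1];        // guard against floating-point edge cases
}

// Storing objects in vs lets the sample carry extra information:
var actions = [
  {name: "try easy fix", cost: 1},
  {name: "reboot",       cost: 5},
  {name: "ask for help", cost: 10}
];
var action = weightedChoice(actions, [6, 3, 1]);
console.log(action.name); // one of the three names; "try easy fix" most often
```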
Measurement scales note
Categorical/Discrete are foundational because they match the first two levels in the classic measurement-scale taxonomy (Stevens' levels of measurement):
nominal (categories with no inherent order) and ordinal (categories with an order).
At these levels, observations are not “quantities” in the arithmetic sense but labels (nominal) or orderable labels (ordinal),
so the natural probabilistic model is a finite choice among outcomes—exactly what Categorical/Discrete represent.
For interval and ratio scales, differences and ratios are meaningful and one often uses continuous (or otherwise quantitative)
distributions—though discretization is always possible when appropriate.
Examples: nominal vs ordinal
This example demonstrates how Categorical naturally models the first two measurement levels:
Nominal: outcomes are labels with no inherent order (e.g. colors).
Ordinal: outcomes are still labels, but we interpret them as ordered (e.g. low < medium < high).
The key point is that Categorical itself does not “know” about order. It simply returns one of the values in vs
according to the weights in ps. If you want to perform numeric operations that rely on order (for example, compute an
“average level”), you must explicitly encode your ordinal labels as ranks (e.g. low→1, medium→2, high→3).
The code below therefore prints three exact (enumerated) distributions:
a nominal distribution over color labels,
an ordinal distribution over level labels (still just labels),
the same ordinal distribution after mapping labels to numeric ranks, which makes quantities like expected rank well-defined.
// Measurement scales: nominal vs ordinal with Categorical.
// We show the full distribution exactly via enumeration (deterministic output).

var summarize = function(d) {
  var supp = d.support();
  var probs = map(function(v) { return Math.exp(d.score(v)); }, supp);
  return {support: supp, probs: probs, sum: sum(probs)};
};

// NOMINAL: categories have no inherent order (e.g., colors).
var colors = ["red", "green", "blue"];
var colorWeights = [1, 3, 6]; // unnormalized weights OK

var colorDist = Infer({
  method: "enumerate",
  model: function() {
    return sample(Categorical({ps: colorWeights, vs: colors}));
  }
});

// ORDINAL: categories have an order (e.g., low < medium < high),
// but Categorical itself still just returns labels.
var levels = ["low", "medium", "high"];
var levelWeights = [2, 5, 3];

var levelDist = Infer({
  method: "enumerate",
  model: function() {
    return sample(Categorical({ps: levelWeights, vs: levels}));
  }
});

// If you need to do numeric operations (e.g., compute an expected "level"),
// you must explicitly map labels to ranks yourself:
var rank = function(x) {
  return x === "low" ? 1 : (x === "medium" ? 2 : 3);
};

var rankDist = Infer({
  method: "enumerate",
  model: function() {
    return rank(sample(Categorical({ps: levelWeights, vs: levels})));
  }
});

var out = {
  nominal_colors: summarize(colorDist),
  ordinal_levels_as_labels: summarize(levelDist),
  ordinal_levels_as_ranks: summarize(rankDist)
};

out;
{
nominal_colors: {
support: [ 'blue', 'green', 'red' ],
probs: [ 0.6, 0.29999999999999993, 0.10000000000000002 ],
sum: 1
},
ordinal_levels_as_labels: {
support: [ 'high', 'medium', 'low' ],
probs: [ 0.29999999999999993, 0.5000000000000001, 0.2 ],
sum: 1
},
ordinal_levels_as_ranks: {
support: [ 1, 2, 3 ],
probs: [ 0.2, 0.5000000000000001, 0.29999999999999993 ],
sum: 1
}
}
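Once labels are mapped to ranks, quantities like the expected rank become well-defined. As a plain-JavaScript sketch using the rank distribution printed above (the support and probabilities are copied from that output):

```javascript
// Expected rank E[R] = sum over the support of rank * P(rank),
// using the ordinal_levels_as_ranks distribution above.
var support = [1, 2, 3];
var probs = [0.2, 0.5, 0.3]; // = [2, 5, 3] normalized

var expectedRank = support.reduce(function(acc, r, i) {
  return acc + r * probs[i];
}, 0);

console.log(expectedRank); // ≈ 2.1
```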