TraceState:概率抽样¶
Status: Experimental
Table of Contents
- [TraceState:概率抽样](#tracestate概率抽样) - [Overview](#overview) - [Definitions](#definitions) - [Sampling](#sampling) - [Adjusted count](#adjusted-count) - [Sampler](#sampler) - [Parent-based sampler](#parent-based-sampler) - [Probability sampler](#probability-sampler) - [Consistent probability sampler](#consistent-probability-sampler) - [Trace completeness](#trace-completeness) - [Non-probability sampler](#non-probability-sampler) - [Always-on consistent probability sampler](#always-on-consistent-probability-sampler) - [Always-off sampler](#always-off-sampler) - [Consistent Probability sampling](#consistent-probability-sampling) - [Conformance](#conformance) - [Completeness guarantee](#completeness-guarantee) - [Context invariants](#context-invariants) - [Sampled flag](#sampled-flag) - [Requirement: Inconsistent p-values are unset](#requirement-inconsistent-p-values-are-unset) - [P-value](#p-value) - [Requirement: Out-of-range p-values are unset](#requirement-out-of-range-p-values-are-unset) - [R-value](#r-value) - [Requirement: Out-of-range r-values unset both p and r](#requirement-out-of-range-r-values-unset-both-p-and-r) - [Requirement: R-value is generated with the correct probabilities](#requirement-r-value-is-generated-with-the-correct-probabilities) - [Examples: Context invariants](#examples-context-invariants) - [Example: Probability sampled context](#example-probability-sampled-context) - [Example: Probability unsampled](#example-probability-unsampled) - [Samplers](#samplers) - [ParentConsistentProbabilityBased sampler](#parentconsistentprobabilitybased-sampler) - [Requirement: ParentConsistentProbabilityBased API](#requirement-parentconsistentprobabilitybased-api) - [Requirement: ParentConsistentProbabilityBased does not modify valid tracestate](#requirement-parentconsistentprobabilitybased-does-not-modify-valid-tracestate) - [Requirement: ParentConsistentProbabilityBased calls the configured root sampler for root spans](#requirement-parentconsistentprobabilitybased-calls-the-configured-root-sampler-for-root-spans) - [Requirement: ParentConsistentProbabilityBased respects the sampled flag for non-root spans](#requirement-parentconsistentprobabilitybased-respects-the-sampled-flag-for-non-root-spans) - [ConsistentProbabilityBased sampler](#consistentprobabilitybased-sampler) - [Requirement: TraceIdRatioBased API compatibility](#requirement-traceidratiobased-api-compatibility) - [Requirement: ConsistentProbabilityBased sampler sets r for root span](#requirement-consistentprobabilitybased-sampler-sets-r-for-root-span) - [Requirement: ConsistentProbabilityBased sampler unsets p when not sampled](#requirement-consistentprobabilitybased-sampler-unsets-p-when-not-sampled) - [Requirement: ConsistentProbabilityBased sampler sets p when sampled](#requirement-consistentprobabilitybased-sampler-sets-p-when-sampled) - [Requirement: ConsistentProbabilityBased sampler records unbiased adjusted counts](#requirement-consistentprobabilitybased-sampler-records-unbiased-adjusted-counts) - [Requirement: ConsistentProbabilityBased sampler sets r for non-root span](#requirement-consistentprobabilitybased-sampler-sets-r-for-non-root-span) - [Requirement: ConsistentProbabilityBased sampler decides not to sample for probabilities less than 2\*\*-62](#requirement-consistentprobabilitybased-sampler-decides-not-to-sample-for-probabilities-less-than-2-62) - [Examples: Consistent probability samplers](#examples-consistent-probability-samplers) - [Example: Setting R-value for a root span](#example-setting-r-value-for-a-root-span) - [Example: Handling inconsistent P-value](#example-handling-inconsistent-p-value) - [Example: Handling corrupt R-value](#example-handling-corrupt-r-value) - [Composition rules](#composition-rules) - [List of requirements](#list-of-requirements) - [Requirement: Combining multiple sampling decisions using logical `or`](#requirement-combining-multiple-sampling-decisions-using-logical-or) - [Requirement: Combine multiple consistent probability samplers using the minimum p-value](#requirement-combine-multiple-consistent-probability-samplers-using-the-minimum-p-value) - [Requirement: Unset p when multiple consistent probability samplers decide not to sample](#requirement-unset-p-when-multiple-consistent-probability-samplers-decide-not-to-sample) - [Requirement: Use probability sampler p-value when its decision to sample is combined with non-probability samplers](#requirement-use-probability-sampler-p-value-when-its-decision-to-sample-is-combined-with-non-probability-samplers) - [Requirement: Use p-value 63 when a probability sampler decision not to sample is combined with a non-probability sampler decision to sample](#requirement-use-p-value-63-when-a-probability-sampler-decision-not-to-sample-is-combined-with-a-non-probability-sampler-decision-to-sample) - [Examples: Composition](#examples-composition) - [Example: Probability and non-probability sampler in a root context](#example-probability-and-non-probability-sampler-in-a-root-context) - [Example: Two consistent probability samplers](#example-two-consistent-probability-samplers) - [Producer and consumer recommendations](#producer-and-consumer-recommendations) - [Trace producer: completeness](#trace-producer-completeness) - [Recommenendation: use non-descending power-of-two probabilities](#recommenendation-use-non-descending-power-of-two-probabilities) - [Trace producer: correctness](#trace-producer-correctness) - [Recommenendation: sampler delegation](#recommenendation-sampler-delegation) - [Trace producer: interoperability with `ParentBased` sampler](#trace-producer-interoperability-with-parentbased-sampler) - [Trace producer: interoperability with `TraceIDRatioBased` sampler](#trace-producer-interoperability-with-traceidratiobased-sampler) - [Trace consumer](#trace-consumer) - [Recommendation: Recognize inconsistent r-values](#recommendation-recognize-inconsistent-r-values) - [Appendix: Statistical test requirements](#appendix-statistical-test-requirements) - [Test procedure: non-powers of two](#test-procedure-non-powers-of-two) - [Requirement: Pass 12 non-power-of-two statistical tests](#requirement-pass-12-non-power-of-two-statistical-tests) - [Test procedure: exact powers of two](#test-procedure-exact-powers-of-two) - [Requirement: Pass 3 power-of-two statistical tests](#requirement-pass-3-power-of-two-statistical-tests) - [Test implementation](#test-implementation) - [Appendix](#appendix) - [Methods for generating R-values](#methods-for-generating-r-values)Overview¶
Probability sampling allows OpenTelemetry tracing users to lower span collection costs by the use of randomized sampling techniques. The objectives are:
- Compatible with the existing W3C trace context
sampled
flag - Spans can be accurately counted using a Span-to-metrics pipeline
- Traces tend to be complete, even though spans may make independent sampling decisions.
This document specifies an approach based on an "r-value" and a "p-value". At a
very high level, r-value is a source of randomness and p-value encodes the
sampling probability. A context is sampled when p <= r
.
Significantly, by including the r-value and p-value in the OpenTelemetry
tracestate
, these two values automatically propagate through the context and
are recorded on every Span. This allows Trace consumers to correctly count spans
simply by interpreting the p-value on a given span.
For efficiency, the supported sampling probabilities are limited to powers of
two. P-value is derived from sampling probability, which equals 2**-p
, thus
p-value is encoded using an unsigned integer.
For example, a p-value of 3 indicates a sampling probability of ⅛.
Since the W3C trace context does not specify that any of the 128 bits in a TraceID are true uniform-distributed random bits, the r-value is introduced as an additional source of randomness.
The recommended method of generating an "r-value" is to count the number of leading 0s in a string of 62 random bits, however, it is not required to use this approach.
Definitions¶
Sampling¶
Sampling is a family of techniques for collecting and analyzing only a fraction of a complete data set. Individual items that are "sampled" are taken to represent one or more spans when collected and counted. The representivity of each span is used in a Span-to-Metrics pipeline to accurately count spans.
Sampling terminology uses "population" to refer to the complete set of data being sampled from. In OpenTelemetry tracing, "population" refers to all spans.
In probability sampling, the representivity of individual sample items is generally known, whereas OpenTelemetry also recognizes "non-probability" sampling approaches, in which representivity is not explicitly quantified.
Adjusted count¶
Adjusted count is a measure of representivity, the number of spans in the population that are represented by the individually sampled span. Span-to-metrics pipelines can be built by adding the adjusted count of each sample span to a counter of matching spans.
For probability sampling, adjusted count is defined as the reciprocal (i.e., mathematical inverse) of sampling probability.
For non-probability sampling, adjusted count is unknown.
Zero adjusted count is defined in a way that supports composition of probability and non-probability sampling. Zero is assigned as the adjusted count when a probability sampler does not select a span.
Thus, there are three meaningfully distinct categories of adjusted count:
Adjusted count is | Interpretation |
---|---|
Unknown | The adjusted count is not known, possibly as a result of a non-probability sampler. Items in this category should not be counted. |
Zero | The adjusted count is known; the effective count of the item is zero. |
Non-zero | The adjusted count is known; the effective count of the item is greater than zero. |
Sampler¶
A Sampler provides configurable logic, used by the SDK, for selecting which
Spans are "recorded" and/or "sampled" in a tracing client library. To "record" a
span means to build a representation of it in the client's memory, which makes
it eligible for being exported. To "sample" a span implies setting the W3C
sampled
flag, recording the span, and exporting the span when it is finished.
OpenTelemetry supports spans that are "recorded" and not "sampled" for in-process observability of live spans (e.g., z-pages).
The Sampler interface and the built-in Samplers defined by OpenTelemetry decide immediately whether to sample a span, and the child context immediately propagates the decision.
Parent-based sampler¶
A Sampler that makes its decision to sample based on the W3C sampled
flag from
the context is said to use parent-based sampling.
Probability sampler¶
A probability Sampler is a Sampler that knows immediately, for each of its decisions, the probability that the span had of being selected.
Sampling probability is defined as a number less than or equal to 1 and greater
than 0 (i.e., 0 < probability <= 1
). The case of 0 probability is treated as a
special, non-probabilistic case.
Consistent probability sampler¶
A consistent probability sampler is a Sampler that supports independent sampling decisions at each span in a trace while maintaining that traces will be complete with a certain minimum probability across the trace.
Consistent probability sampling requires that for any span in a given trace, if a Sampler with lesser sampling probability selects the span for sampling, then the span would also be selected by a Sampler configured with greater sampling probability.
Trace completeness¶
A trace is said to be complete when all of the spans belonging to the trace are collected. When at least one span is collected but not all spans are collected, the trace is considered incomplete.
Trace incompleteness may happen on purpose (e.g., through sampling configuration), or by accident (e.g., through collection errors). The OpenTelemetry trace data model supports a one-way test for incompleteness: for any non-root span, the trace is definitely incomplete if the span's parent span was not collected.
Incomplete traces that result from sampling configuration (i.e., on purpose) are known as partial traces. An important subset of the partial traces are those which are also complete subtraces. A complete subtrace is defined at span S when every descendent span is collected.
Since the test for an incompleteness is one-way, it is important to know which sampling configurations may lead to incomplete traces. Sampling configurations that lead naturally to complete traces and complete subtraces are discussed below.
Non-probability sampler¶
A non-probability sampler is a Sampler that makes its decisions not based on chance, but instead uses arbitrary logic and internal state. The adjusted count of spans sampled by a non-probability sampler is unknown.
Always-on consistent probability sampler¶
An always-on sampler is another name for a consistent probability sampler with probability equal to one.
Always-off sampler¶
An always-off Sampler has the effect of disabling a span completely, effectively excluding it from the population. This is defined as a non-probability sampler, not a zero-percent probability sampler, because the spans are effectively unrepresented.
Consistent Probability sampling¶
The consistent sampling scheme adopted by OpenTelemetry propagates two values via the context, termed "p-value" and "r-value".
Both fields are propagated via the OpenTelemetry tracestate
under the ot
vendor tag using the rules for tracestate handling.
Both fields are represented as unsigned decimal integers requiring at most 6
bits of information.
This sampling scheme selects items from among a fixed set of 63 distinct probability values. The set of supported probabilities includes the integer powers of two between 1 and 2**-62. Zero probability and probabilities smaller than 2**-62 are treated as a special case of "ConsistentAlwaysOff" sampler, just as unit probability (i.e., 100%) describes a special case of "ConsistentAlwaysOn" sampler.
R-value encodes which among the 63 possibilities will consistently decide to sample for a given trace. Specifically, r-value specifies the smallest probability that will decide to sample a given trace in terms of the corresponding p-value. For example, a trace with r-value 0 will sample spans configured for 100% sampling, while r-value 1 will sample spans configured for 50% or 100% sampling, and so on through r-value 62, for which a consistent probability sampler will decide "yes" at every supported probability (i.e., greater than or equal to 2**-62).
P-value encodes the adjusted count for child contexts (i.e., consumers of
tracestate
) and consumers of sampled spans to record for use in
Span-to-metrics pipelines. A special p-value of 63 is defined to mean zero
adjusted count, which helps define composition rules for non-probability
samplers.
An invariant will be stated that connects the sampled
trace flag found in
traceparent
context to the r-value and p-value found in tracestate
context.
Conformance¶
Consumers of OpenTelemetry tracestate
data are expected to validate the
probability sampling fields before interpreting the data. This applies to the
two samplers specified here as well as consumers of span data, who are expected
to validate tracestate
before interpreting span adjusted counts.
Producers of OpenTelemetry tracestate
containing p-value and r-value fields
are required to meet the behavioral requirements stated for the
ConsistentProbabilityBased
sampler and to ensure statistically valid outcomes.
A test suite is included in this specification so that users and consumers of
OpenTelemetry tracestate
can be assured of accuracy in Span-to-metrics
pipelines.
Completeness guarantee¶
This specification defines consistent sampling for power-of-two sampling probabilities. When a sampler is configured with a non-power-of-two sampling probability, the sampler will probabilistically choose between the nearest powers of two.
When a single consistent probability sampler is used at the root of a trace and all other spans use a parent-based sampler, the resulting traces are always complete (ignoring collection errors). This property holds even for non-power-of-two sampling probabilities.
When multiple consistent probability samplers are used in the same trace, in general, trace completeness is ensured at the smallest power of two greater than or equal to the minimum sampling probability across the trace.
Context invariants¶
The W3C traceparent
(version 0) contains three fields of information: the
TraceId, the SpanId, and the trace flags. The sampled
trace flag has been
defined by W3C to signal an intent to sample the context.
The Sampler API is responsible for setting the sampled
flag
and the tracestate
.
P-value and r-value are set in the OpenTelemetry tracestate
, under the vendor
tag ot
, using the identifiers p
and r
. P-value is an unsigned integer
valid in the inclusive range [0, 63]
(i.e., there are 64 valid values).
R-value is an unsigned integer valid in the inclusive range [0, 62]
(i.e.,
there are 63 valid values). P-value and r-value are independent settings, each
can be meaningfully set without the other present.
Sampled flag¶
Probability sampling uses additional information to enable consistent decision
making and to record the adjusted count of sampled spans. When both values are
defined and in the specified range, the invariant between r-value and p-value
and the sampled
trace flag states that
((p <= r) == sampled) OR (sampled AND (p == 63)) == TRUE
.
The invariant between sampled
, p
, and r
only applies when both p
and r
are present. When the invariant is violated, the sampled
flag takes precedence
and p
is unset from tracestate
in order to signal unknown adjusted count.
Requirement: Inconsistent p-values are unset¶
Samplers SHOULD unset p
when the invariant between the sampled
, p
, and r
values is violated before using the tracestate
to make a sampling decision or
interpret adjusted count.
P-value¶
Zero adjusted count is represented by the special p-value 63, otherwise the p-value is set to the negative base-2 logarithm of sampling probability:
p-value | Parent Probability | Adjusted count |
---|---|---|
0 | 1 | 1 |
1 | ½ | 2 |
2 | ¼ | 4 |
... | ... | ... |
N | 2**-N | 2**N |
... | ... | ... |
61 | 2**-61 | 2**61 |
62 | 2**-62 | 2**62 |
63 | 0 | 0 |
Requirement: Out-of-range p-values are unset¶
Consumers SHOULD unset p
from the tracestate
if the unsigned decimal value
is greater than 63 before using the tracestate
to make a sampling decision or
interpret adjusted count.
R-value¶
R-value is set in the tracestate
by the Sampler at the root of the trace, in
order to support consistent probability sampling. When the value is omitted or
not present, child spans in the trace are not able to participate in consistent
probability sampling.
R-value determines which sampling probabilities will decide to sample or not decide to sample for spans of a given trace, as follows:
r-value | Implied sampling probabilities |
---|---|
0 | 1 |
1 | ½ and above |
2 | ¼ and above |
3 | ⅛ and above |
... | ... |
0 <= r <= 61 | 2**-r and above |
... | ... |
59 | 2**-59 and above |
60 | 2**-60 and above |
61 | 2**-61 and above |
62 | 2**-62 and above |
These probabilities are specified to ensure that conforming Sampler implementations record spans with correct adjusted counts. The recommended method of generating r-values is to count the number of leading 0s in a string of 62 random bits, however it is not required to use this approach.
Requirement: Out-of-range r-values unset both p and r¶
Samplers SHOULD unset both r
and p
from the tracestate
if the unsigned
decimal value of r
is greater than 62 before using the tracestate
to make a
sampling decision.
Requirement: R-value is generated with the correct probabilities¶
Samplers MUST generate r-values using a randomized scheme that produces each value with the probabilities equivalent to those produced by counting the number of leading 0s in a string of 62 random bits.
Examples: Context invariants¶
Example: Probability sampled context¶
Consider a trace context with the following headers:
The traceparent
contents in this example example are repeated from the
W3C specification)
and have the following base64-encoded field values:
The tracestate
header contains OpenTelemetry string r:3;p:2
, containing
decimal-encoded p-value and r-value:
Here, r-value 3 indicates that a consistent probability sampler configured with
probability 12.5% (i.e., 1-in-8) or greater will sample the trace. The p-value 2
indicates that the parent that set the sampled
flag was configured to sample
at 25% (i.e., 1-in-4). This trace context is consistent because p <= r
is true
and the sampled
flag is set.
Example: Probability unsampled¶
This example has an unsampled context where only the r-value is set.
This supports consistent probability sampling in child contexts by virtue of having an r-value. P-value is not set, consistent with an unsampled context.
Samplers¶
ParentConsistentProbabilityBased sampler¶
The ParentConsistentProbabilityBased
sampler is meant as an optional
replacement for the ParentBased
Sampler. It is required
to first validate the tracestate
and then respect the sampled
flag in the
W3C traceparent.
Requirement: ParentConsistentProbabilityBased API¶
The ParentConsistentProbabilityBased
Sampler constructor SHOULD take a single
Sampler argument, which is the Sampler to use in case the
ParentConsistentProbabilityBased
Sampler is called for a root span.
Requirement: ParentConsistentProbabilityBased does not modify valid tracestate¶
The ParentConsistentProbabilityBased
Sampler MUST NOT modify a valid
tracestate
.
Requirement: ParentConsistentProbabilityBased calls the configured root sampler for root spans¶
The ParentConsistentProbabilityBased
Sampler MUST delegate to the configured
root Sampler when there is not a valid parent trace context.
Requirement: ParentConsistentProbabilityBased respects the sampled flag for non-root spans¶
The ParentConsistentProbabilityBased
Sampler MUST decide to sample the span
according to the value of the sampled
flag in the W3C traceparent header.
ConsistentProbabilityBased sampler¶
The ConsistentProbabilityBased
sampler is meant as an optional replacement for
the TraceIdRatioBased
Sampler. In the case where
it is used as a root sampler, the ConsistentProbabilityBased
sampler is
required to produce a valid tracestate
. In the case where it is used in a
non-root context, it is required to validate the incoming tracestate
and to
produce a valid tracestate
for the outgoing context.
The ConsistentProbabilityBased
sampler is required to support probabilities
that are not exact powers of two. To do so, implementations are required to
select between the nearest powers of two probabilistically. For example, 5%
sampling can be achieved by selecting 1/16 sampling 60% of the time and 1/32
sampling 40% of the time.
Requirement: TraceIdRatioBased API compatibility¶
The ConsistentProbabilityBased
Sampler MUST have the same constructor
signature as the built-in TraceIdRatioBased
sampler in each OpenTelemetry SDK.
Requirement: ConsistentProbabilityBased sampler sets r for root span¶
The ConsistentProbabilityBased
Sampler MUST set r
when it makes a root
sampling decision.
Requirement: ConsistentProbabilityBased sampler unsets p when not sampled¶
The ConsistentProbabilityBased
Sampler MUST unset p
from the tracestate
when it decides not to sample.
Requirement: ConsistentProbabilityBased sampler sets p when sampled¶
The ConsistentProbabilityBased
Sampler MUST set p
when it decides to sample
according to its configured sampling probability.
Requirement: ConsistentProbabilityBased sampler records unbiased adjusted counts¶
The ConsistentProbabilityBased
Sampler with non-zero probability MUST set p
so that the adjusted count interpreted from the tracestate
is an unbiased
estimate of the number of representative spans in the population.
Requirement: ConsistentProbabilityBased sampler sets r for non-root span¶
If r
is not set on the input tracecontext
and the Span is not a root span,
ConsistentProbabilityBased
SHOULD set r
as if it were a root span and warn
the user that a potentially inconsistent trace is being produced.
Requirement: ConsistentProbabilityBased sampler decides not to sample for probabilities less than 2**-62¶
If the configured sampling probability is in the interval [0, 2**-62)
, the
Sampler MUST decide not to sample.
Examples: Consistent probability samplers¶
Example: Setting R-value for a root span¶
A new root span is sampled by a consistent probability sampler at 25%. A new r-value should be generated (see the appendix for suitable methods), in this example r-value 5 is used which happens 1.5625% of the time and indicates to sample:
The span would be sampled because p-value 2 is less than or equal to r-value 5.
An example tracestate
where r-value 1 indicates not to sample at 25%:
This span would not be sampled because p-value 2 (corresponding with 25% sampling) is greater than r-value 1.
Example: Handling inconsistent P-value¶
When either the consistent probability sampler or the parent-based consistent probability sampler receives a sampled context but invalid p-value, for example,
the tracestate
will have its p-value stripped. The r-value is kept, and the
sampler should act as if the following had been received:
The consistent probability sampler will make its own (consistent) decision using the r-value that was received.
The parent-based consistent probability sampler will in this case follow the
sampled
flag. If the context is sampled, the resulting span will have an
r-value without a p-value, which indicates unknown adjusted count.
Example: Handling corrupt R-value¶
A non-root span receives:
where the r-value is out of its valid range. The r-value and p-value are
stripped during validation, according to the invariants. In this case, the
sampler will act as though no tracestate
were received.
The parent-based consistent probability sampler will sample or not sample based
on the sampled
flag, in this case. If the context is sampled, the recorded
span will have an r-value without a p-value, which indicates unknown adjusted
count.
The consistent probability sampler will generate a new r-value and make a new sampling decision while warning the user of a corrupt and potentially inconsistent r-value.
Composition rules¶
When more than one Sampler participates in the decision to sample a context, their decisions can be combined using composition rules. In all cases, the combined decision to sample is the logical-OR of the Samplers' decisions (i.e., sample if at least one of the composite Samplers decides to sample).
To combine p-values from two consistent probability Sampler decisions, the
Sampler with the greater probability takes effect. The output p-value becomes
the minimum of the two values for p
.
To combine a consistent probability Sampler decision with a non-probability Sampler decision, p-value 63 is used to signify zero adjusted count. If the probability Sampler decides to sample, its p-value takes effect. If the probability Sampler decides not to sample when the non-probability sample does sample, p-value 63 takes effect signifying zero adjusted count.
List of requirements¶
Requirement: Combining multiple sampling decisions using logical or
¶
When multiple samplers are combined using composition, the sampling decision MUST be to sample if at least one of the combined samplers decides to sample.
Requirement: Combine multiple consistent probability samplers using the minimum p-value¶
When combining Sampler decisions for multiple consistent probability Samplers
and at least one decides to sample, the minimum of the "yes" decision p
values
MUST be set in the tracestate
.
Requirement: Unset p when multiple consistent probability samplers decide not to sample¶
When combining Sampler decisions for multiple consistent probability Samplers
and none decides to sample, p-value MUST be unset in the tracestate
.
Requirement: Use probability sampler p-value when its decision to sample is combined with non-probability samplers¶
When combining Sampler decisions for a consistent probability Sampler and a
non-probability Sampler, and the probability Sampler decides to sample, its
p-value MUST be set in the tracestate
regardless of the non-probability
Sampler decision.
Requirement: Use p-value 63 when a probability sampler decision not to sample is combined with a non-probability sampler decision to sample¶
When combining Sampler decisions for a consistent probability Sampler and a
non-probability Sampler, and the probability Sampler decides not to sample but
the non-probability does sample, p-value 63 MUST be set in the tracestate
.
Examples: Composition¶
Example: Probability and non-probability sampler in a root context¶
In a new root context, a consistent probability sampler decides not to set the
sampled flag, adds r:4
indicating that the trace is consistently sampled at
6.5% (i.e., 1-in-16) and larger probabilities.
The probability sampler decision is composed with a non-probability sampler that
decides to sample the context. Setting sampled
when the probability sampler
has not sampled requires setting p:63
, indicating zero adjusted count.
The resulting context:
Example: Two consistent probability samplers¶
Whether a root or non-root, if multiple consistent probability samplers make a decision to sample a given context, the minimum p-value is output in the tracestate.
If a root context, the first of the samplers generates r:15
and its own
p-value p:10
(i.e., adjusted count 1024). The second of the two probability
samplers outputs a smaller adjusted count p:8
(i.e., adjusted count 256).
The resulting context takes the smaller p-value:
Producer and consumer recommendations¶
Trace producer: completeness¶
As stated in the completeness guarantee, traces will be possibly incomplete when configuring multiple consistent probability samplers in the same trace. One way to avoid producing incomplete traces is to use parent-based samplers except for root spans.
There is a simple test for trace incompleteness, but it is a one-way test and does not detect when child spans are uncollected. One way to avoid producing incomplete traces is to avoid configuring non-power-of-two sampling probabilities for non-root spans, because completeness is not guaranteed for non-power-of-two sampling probabilities.
Recommenendation: use non-descending power-of-two probabilities¶
Complete subtraces will be produced when the sequence of sampling probabilities from the root of a trace to its leaves consists of non-descending powers of two. To ensure complete sub-traces are produced, child samplers SHOULD be configured with a power-of-two probability greater than or equal to the parent span's sampling probability.
Trace producer: correctness¶
The use of tracestate to convey adjusted count information rests upon trust between participants in a trace. Users are advised not to use a Span-to-metrics pipeline when the parent sampling decision's corresponding adjusted count is untrustworthy.
The ConsistentProbabilityBased
and ParentConsistentProbabilityBased
samplers
can be used as delegates of another sampler, for conditioning the choice of
sampler on span and other fixed attributes. However, for adjusted counts to be
trustworthy, the choice of non-root sampler cannot be conditioned on the
parent's sampled trace flag or the OpenTelemetry tracestate r-value and p-value,
as these decisions would lead to incorrect adjusted counts.
For example, the built-in ParentBased
sampler supports
configuring the delegated-to sampler based on whether the parent context is
remote or non-remote, sampled or unsampled. If a ParentBased
sampler delegates
to a ConsistentProbabilityBased
sampler only for unsampled contexts, the
resulting Span-to-metrics pipeline will (probably) overcount spans.
Recommenendation: sampler delegation¶
For non-root spans, composite samplers SHOULD NOT condition the choice of delegated-to sampler based on the parent's sampled flag or OpenTelemetry tracestate.
Trace producer: interoperability with ParentBased
sampler¶
The OpenTelemetry built-in ParentBased
sampler is interoperable with the
ConsistentProbabilityBased
sampler, provided that the delegated-to sampler
does not change the decision that determined its selection. For example, it is
safe to configure an alternate ParentBased
sampler delegate for unsampled
spans, provided the decision does not change to sampled.
Because the ParentBased
sampler honors the sampled trace flag, and
OpenTelemetry SDKs include the tracestate in the Span
data, which means a
system can be upgraded to probability sampling by just replacing
TraceIDRatioBased
samplers with conforming ConsistentProbabilityBased
samplers everywhere in the trace.
Trace producer: interoperability with TraceIDRatioBased
sampler¶
The TraceIDRatioBased
specification includes a
RECOMMENDATION against being used for non-root spans because it does not specify
how to make the sampler decision consistent across the trace. A
TraceIDRatioBased
sampler at the root span is interoperable with a
ConsistentParentProbabilityBased
sampler in terms of completeness, although
the resulting spans will have unknown adjusted count.
When a TraceIDRatioBased
sampler is configured for a non-root span, several
cases arise where an incorrect OpenTelemetry tracestate can be generated.
Consider for example a trace with three spans where the root (R) has a
ConsistentProbabilityBased
sampler, the root's child (P) has a
TraceIDRatioBased
sampler, and the grand-child (C) has a ParentBased
sampler. Because the TraceIDRatioBased
sampler change the intermediate sampled
flag without updating the OpenTelemetry tracestate, we have the following cases:
- If
TraceIDRatioBased
does not change P's decision, the trace is complete and all spans' adjusted counts are correct. - If
TraceIDRatioBased
changes P's decision from no to yes, the consumer will observe a (definitely) incomplete trace containing P and C. Both spans will have invalid OpenTelemetry tracestate, leading to unknown adjusted count in this case. - If
TraceIDRatioBased
changes the sampling decision from yes to no, the consumer will observe singleton trace with correct adjusted count. The consumer cannot determine that R has two unsampled descendents.
As these cases demonstrate, users can expect incompleteness and unknown adjusted
count when using TraceIDRatioBased
samplers for non-root spans, but this goes
against the originally specified warning.
Trace consumer¶
Trace consumers are expected to apply the simple one-way test for incompleteness. When non-root spans are configured with independent sampling probabilities, traces may be complete in a way that cannot be detected. Because of the one-way test, consumers wanting to ensure complete traces are expected to know the minimum sampling probability across the system.
Ignoring accidental data loss, a trace will be complete if all its spans are sampled with consistent probability samplers and the trace's r-value is larger than the corresponding smallest power of two greater than or equal to the minimum sampling probability across the trace.
Due to the ConsistentProbabilityBased
Sampler requirement about setting r
when it is unset for a non-root span, trace consumers are advised to check
traces for r-value consistency. When a single trace contains more than a single
distinct r
value, it means the trace was not correctly sampled at the root for
probability sampling. While the adjusted count of each span is correct in this
scenario, it may be impossible to detect complete traces.
Recommendation: Recognize inconsistent r-values¶
When a single trace contains spans with tracestate
values containing more than
one distinct value for r
, the consumer SHOULD recognize the trace as
inconsistently sampled.
Appendix: Statistical test requirements¶
This section specifies a test that can be implemented to ensure basic conformance with the requirement that sampling decisions are unbiased.
The goal of this test specification is to be simple to implement and not require advanced statistical skills or libraries to be successful.
This test is not meant to evaluate the performance of a random number generator. This test assumes the underlying RNG is of good quality and checks that the sampler produces the expected proportionality with a high degree of statistical confidence.
One of the challenges of this kind of test is that probabilistic tests are expected to occasionally produce exceptional results. To make this a strict test for random behavior, we take the following approach:
- Generate a pre-determined list of 20 random seeds
- Use fixed values for significance level (5%) and trials (20)
- Use a population size of 100,000 spans
- For each trial, simulate the population and compute ChiSquared test statistic
- Locate the first seed value in the ordered list such that the Chi-Squared significance test fails exactly once out of 20 trials
To create this test, perform the above sequence using the seed values from the predetermined list, in order, until a seed value is found with exactly one failure. This is expected to happen fairly often and is required to happen once among the 20 available seeds. After calculating the index of the first seed with exactly one ChiSquared failure, record it in the test. For continuous integration testing, it is only necessary to re-run the test using the predetermined seed index.
As specified, the Chi-Squared test has either one or two degrees of freedom, depending on whether the sampling probability is an exact power of two or not.
Test procedure: non-powers of two¶
In this case there are two degrees of freedom for the Chi-Squared test. The following table summarizes the test parameters.
Test case | Sampling probability | Lower, Upper p-value when sampled | Expectlower | Expectupper | Expectunsampled |
---|---|---|---|---|---|
1 | 0.900000 | 0, 1 | 10000 | 80000 | 10000 |
2 | 0.600000 | 0, 1 | 40000 | 20000 | 40000 |
3 | 0.330000 | 1, 2 | 17000 | 16000 | 67000 |
4 | 0.130000 | 2, 3 | 12000 | 1000 | 87000 |
5 | 0.100000 | 3, 4 | 2500 | 7500 | 90000 |
6 | 0.050000 | 4, 5 | 1250 | 3750 | 95000 |
7 | 0.017000 | 5, 6 | 1425 | 275 | 98300 |
8 | 0.010000 | 6, 7 | 562.5 | 437.5 | 99000 |
9 | 0.005000 | 7, 8 | 281.25 | 218.75 | 99500 |
10 | 0.002900 | 8, 9 | 100.625 | 189.375 | 99710 |
11 | 0.001000 | 9, 10 | 95.3125 | 4.6875 | 99900 |
12 | 0.000500 | 10, 11 | 47.65625 | 2.34375 | 99950 |
The formula for computing Chi-Squared in this case is:
This should be compared with 0.102587, the value of the Chi-Squared distribution for two degrees of freedom with significance level 5%. For each probability in the table above, the test is required to demonstrate a seed that produces exactly one ChiSquared value less than 0.102587.
Requirement: Pass 12 non-power-of-two statistical tests¶
For the test with 20 trials and 100,000 spans each, the test MUST demonstrate a random number generator seed such that the ChiSquared test statistic is below 0.102587 exactly 1 out of 20 times.
Test procedure: exact powers of two¶
In this case there is one degree of freedom for the Chi-Squared test. The following table summarizes the test parameters.
Test case | Sampling probability | P-value when sampled | Expectsampled | Expectunsampled |
---|---|---|---|---|
13 | 0x1p-01 (0.500000) | 1 | 50000 | 50000 |
14 | 0x1p-04 (0.062500) | 4 | 6250 | 93750 |
15 | 0x1p-07 (0.007812) | 7 | 781.25 | 99218.75 |
The formula for computing Chi-Squared in this case is:
This should be compared with 0.003932, the value of the Chi-Squared distribution for one degree of freedom with significance level 5%. For each probability in the table above, the test is required to demonstrate a seed that produces exactly one ChiSquared value less than 0.003932.
Requirement: Pass 3 power-of-two statistical tests¶
For the test with 20 trials and 100,000 spans each, the test MUST demonstrate a random number generator seed such that the ChiSquared test statistic is below 0.003932 exactly 1 out of 20 times.
Test implementation¶
The recommended structure for this test uses a table listing the 15 probability values, the expected p-values, whether the ChiSquared statistic has one or two degrees of freedom, and the index into the predetermined list of seeds.
Note that seed indexes in the example above have what appears to be the correct distribution. The five 0s, two 1s, five 2s, one 3s, and one 4 demonstrate that it is relatively easy to find examples where there is exactly one failure. Probability 0.001, with seed index 6 in this case, is a reminder that outliers exist. Further significance testing of this distribution is not recommended.
Appendix¶
Methods for generating R-values¶
The method used for generating r-values is not specified, in order to leave the implementation freedom to optimize. Typically, when the TraceId is known to contain at a 62-bit substring of random bits, R-values can be derived directly from the 62 random bits of TraceId by:
- Count the leading zeros
- Count the leading ones
- Count the trailing zeros
- Count the trailing ones.
If the TraceId contains unknown or insufficient randomness, another approach is to generate random bits until the first true or false value.
Any scheme that produces r-values shown in the following table is considered conforming.
r-value | Probability of r-value |
---|---|
0 | ½ |
1 | ¼ |
2 | ⅛ |
3 | 1/16 |
... | ... |
0 <= r <= 61 | 2**-(r+1) |
... | ... |
59 | 2**-60 |
60 | 2**-61 |
61 | 2**-62 |
62 | 2**-62 |