Confidence Sequences
glide.confidence_sequences.empirical_bernstein.EmpiricalBernsteinConfidenceSequence
dataclass
Anytime-valid empirical-Bernstein confidence sequence on a running mean.
Holds the per-look running means and the one-sided anytime-valid bound on the side where drift is harmful: a lower bound for a risk metric, an upper bound for a performance metric, in both cases after the monitor has mapped the sequence back to the original metric orientation. The bounds hold simultaneously at all looks, so testing after every batch does not inflate the false-alarm probability.
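As a hedged formalization of "simultaneously at all looks" (the symbols are notation introduced here, not taken from the library): write $\mu$ for the monitored mean, assumed fixed across looks, $L_t$ for a lower bound at look $t$, and $\alpha$ for the error level. A lower confidence sequence then satisfies

$$
\mathbb{P}\left(\exists\, t \ge 1 : L_t > \mu\right) \le \alpha ,
$$

whereas a fixed-sample bound only guarantees $\mathbb{P}(L_t > \mu) \le \alpha$ separately at each $t$, so the chance of at least one false alarm can grow with the number of looks. The upper-bound case is symmetric, with $U_t < \mu$ as the error event.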
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `running_mean_estimates` | `NDArray` | Per-look running mean of the per-batch estimates, in original metric units. | required |
| `confidence_bounds` | `NDArray` | Per-look harmful-side anytime-valid bound, in original metric units. | required |
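To make the shape of `running_mean_estimates` concrete, here is a minimal sketch of a per-look running mean. It assumes every batch carries equal weight, which this page does not confirm; `batch_estimates` and `looks` are illustrative names, not part of glide.

```python
import numpy as np

# Hypothetical per-batch estimates of the monitored metric, one per look.
batch_estimates = np.array([0.4, 0.8, 0.6])

# Look index 1, 2, 3, ... used as the running-mean denominator.
looks = np.arange(1, batch_estimates.size + 1)

# Unweighted running mean (assumption: equal weight per batch).
running_mean_estimates = np.cumsum(batch_estimates) / looks
```

The result has one entry per look, the same length as `confidence_bounds` is expected to have.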
References
Waudby-Smith, Ian, and Aaditya Ramdas. "Estimating means of bounded random variables by betting." Journal of the Royal Statistical Society Series B: Statistical Methodology 86, no. 1 (2024): 1-27.
Howard, Steven R., Aaditya Ramdas, Jon McAuliffe, and Jasjeet Sekhon. "Time-uniform, nonparametric, nonasymptotic confidence sequences." The Annals of Statistics 49, no. 2 (2021): 1055-1080.
Examples:
>>> import numpy as np
>>> from glide.confidence_sequences import EmpiricalBernsteinConfidenceSequence
>>> sequence = EmpiricalBernsteinConfidenceSequence(
...     running_mean_estimates=np.array([0.4, 0.6]),
...     confidence_bounds=np.array([0.1, 0.55]),
... )
>>> sequence.test_null_hypothesis(0.5, alternative="larger")
array([False,  True])
Source code in glide/confidence_sequences/empirical_bernstein.py
test_null_hypothesis
test_null_hypothesis(h0_value, alternative='larger')
Test the running mean against h0_value at every look.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `h0_value` | `float` | The threshold the harmful-side bound is tested against (for a monitor, the user-supplied business threshold). | required |
| `alternative` | `str` | | `'larger'` |
Returns:
| Type | Description |
|---|---|
| `NDArray` | Boolean per-look alarm vector, |
Raises:
| Type | Description |
|---|---|
| `ValueError` | If |
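The per-look comparison can be sketched as follows. This is a hypothetical stand-in, not glide's implementation: `alarm_vector` is an invented helper, the strict inequalities and the `"smaller"` option are assumptions, and the sketch raises `ValueError` on an unrecognized `alternative` purely for illustration, since the actual condition is not shown on this page.

```python
import numpy as np
from numpy.typing import NDArray


def alarm_vector(
    confidence_bounds: NDArray, h0_value: float, alternative: str = "larger"
) -> NDArray:
    """Hypothetical per-look test of a one-sided bound against a threshold."""
    bounds = np.asarray(confidence_bounds, dtype=float)
    if alternative == "larger":
        # Alarm once the anytime-valid lower bound exceeds the threshold.
        return bounds > h0_value
    if alternative == "smaller":
        # Assumed mirror case: alarm once the upper bound falls below it.
        return bounds < h0_value
    raise ValueError(f"unknown alternative: {alternative!r}")


# Same bounds and threshold as the class-level example.
alarms = alarm_vector(np.array([0.1, 0.55]), 0.5, alternative="larger")
print(alarms.tolist())  # [False, True]
```

Under these assumptions the first look raises no alarm because the bound 0.1 is still below 0.5, while the second look does because 0.55 exceeds it.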
Source code in glide/confidence_sequences/empirical_bernstein.py