Confidence Intervals

glide.confidence_intervals.clt.CLTConfidenceInterval `dataclass`

Confidence interval based on the Central Limit Theorem.

Constructs a symmetric interval around the point estimate: [mean - z * std, mean + z * std], where z is the two-sided critical value of the standard normal distribution, z = Φ⁻¹(1 - α/2) with α = 1 - confidence_level (z ≈ 1.96 for a 95% CI).

Parameters:

Name	Type	Description	Default
`mean`	`float`	The point estimate of the population mean.	required
`std`	`float`	The standard error (standard deviation of the estimate).	required
`confidence_level`	`float`	Target coverage probability, strictly between 0 and 1, e.g. 0.95 for a 95% CI.	`0.95`

Examples:

>>> from glide.confidence_intervals import CLTConfidenceInterval
>>> ci = CLTConfidenceInterval(mean=5.0, std=0.2, confidence_level=0.95)
>>> print(f"[{ci.lower_bound:.3f}, {ci.upper_bound:.3f}]")
[4.608, 5.392]
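
The numbers in the example above can be reproduced with the standard library alone. The following is a minimal sketch, not glide's implementation: `clt_interval` is a hypothetical helper, `statistics.NormalDist` stands in for the normal quantile function, and the sample values are made up purely to illustrate how a standard error for `std` might be obtained from raw data.

```python
from math import sqrt
from statistics import NormalDist, fmean, stdev


def clt_interval(mean, std_error, confidence_level=0.95):
    """Hypothetical helper: symmetric normal interval around `mean`."""
    alpha_over_two = (1 - confidence_level) / 2
    # Two-sided critical value of the standard normal distribution.
    z = NormalDist().inv_cdf(1 - alpha_over_two)
    return mean - z * std_error, mean + z * std_error


# Same inputs as the documented example.
lower, upper = clt_interval(5.0, 0.2, confidence_level=0.95)
print(f"[{lower:.3f}, {upper:.3f}]")  # [4.608, 5.392]

# Illustrative data (invented for this sketch): the standard error of a
# sample mean is the sample standard deviation divided by sqrt(n).
sample = [4.8, 5.1, 5.3, 4.9, 5.0, 5.2, 4.7, 5.0]
std_error = stdev(sample) / sqrt(len(sample))
sample_lower, sample_upper = clt_interval(fmean(sample), std_error)
print(f"[{sample_lower:.3f}, {sample_upper:.3f}]")
```

Note that `std` in `CLTConfidenceInterval` is the standard error of the estimate, not the standard deviation of the raw observations, so the division by `sqrt(n)` has to happen before the value is passed in.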

Source code in glide/confidence_intervals/clt.py

@dataclass
class CLTConfidenceInterval:
    """Confidence interval based on the Central Limit Theorem.

    Constructs a symmetric interval around the point estimate using the standard
    normal distribution: [mean - z * std, mean + z * std], where z is the critical
    value from the standard normal distribution corresponding to the target
    confidence level.

    Parameters
    ----------
    mean : float
        The point estimate of the population mean.
    std : float
        The standard error (standard deviation of the estimate).
    confidence_level : float, optional
        Target coverage probability, e.g. 0.95 for a 95% CI. Default is 0.95.

    Examples
    --------
    >>> from glide.confidence_intervals import CLTConfidenceInterval
    >>> ci = CLTConfidenceInterval(mean=5.0, std=0.2, confidence_level=0.95)
    >>> print(f"[{ci.lower_bound:.3f}, {ci.upper_bound:.3f}]")
    [4.608, 5.392]
    """

    mean: float
    std: float
    var: float = field(init=False, repr=False)
    _confidence_level: float = field(init=False, repr=False)
    lower_bound: float = field(init=False, repr=False)
    upper_bound: float = field(init=False, repr=False)
    width: float = field(init=False, repr=False)

    def __init__(self, mean: float, std: float, confidence_level: float = 0.95) -> None:
        self.mean = mean
        self.std = std
        self.var = std**2
        self.confidence_level = confidence_level

    @property
    def confidence_level(self) -> float:
        return self._confidence_level

    @confidence_level.setter
    def confidence_level(self, value: float) -> None:
        _validate_bounds(value, "confidence_level", lower=0, upper=1, left_inclusive=False, right_inclusive=False)
        self._confidence_level = value
        alpha_over_two = (1 - value) / 2
        z_score = norm.ppf(1 - alpha_over_two)
        self.lower_bound = self.mean - self.std * z_score
        self.upper_bound = self.mean + self.std * z_score
        self.width = 2 * self.std * z_score

    def test_null_hypothesis(
        self, h0_value: float, alternative: Literal["larger", "smaller", "two-sided"] = "two-sided"
    ) -> Tuple[float, float, float]:
        """Perform a one-sample z-test against a null hypothesis value.

        Parameters
        ----------
        h0_value : float
            The hypothesized population mean under the null hypothesis (H0: μ = h0_value).
        alternative : str, optional
            The alternative hypothesis. One of:
            - ``'two-sided'`` (default): H1: μ ≠ h0_value
            - ``'larger'``: H1: μ > h0_value
            - ``'smaller'``: H1: μ < h0_value

        Returns
        -------
        Tuple[float, float, float]
            ``(z_stat, p_value, df)`` where ``z_stat`` is the test statistic
            (mean - h0_value) / std, ``p_value`` is the p-value under the standard
            normal distribution, and ``df`` is ``float('inf')``.

        Raises
        ------
        ValueError
            If ``alternative`` is not one of ``'two-sided'``, ``'larger'``, or ``'smaller'``.
        """
        z_stat = (self.mean - h0_value) / self.std
        alternatives = ["two-sided", "larger", "smaller"]
        _validate_literal(alternative, "alternative", alternatives)

        if alternative == alternatives[0]:
            p_value = 2 * norm.sf(abs(z_stat))
        elif alternative == alternatives[1]:
            p_value = norm.sf(z_stat)
        else:
            p_value = norm.cdf(z_stat)

        df = float("inf")
        return z_stat, p_value, df
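
In the listing above, the bounds are computed inside the `confidence_level` setter, so assigning a new level recomputes `lower_bound`, `upper_bound`, and `width`. The sketch below mirrors that property pattern with the standard library only. `NormalInterval` is a hypothetical stand-in written for illustration, and the inline range check is an assumption about what `_validate_bounds` enforces, inferred from its arguments.

```python
from statistics import NormalDist


class NormalInterval:
    """Hypothetical stand-in (not glide's class): bounds track confidence_level."""

    def __init__(self, mean, std, confidence_level=0.95):
        self.mean = mean
        self.std = std
        self.confidence_level = confidence_level  # goes through the setter

    @property
    def confidence_level(self):
        return self._confidence_level

    @confidence_level.setter
    def confidence_level(self, value):
        # Assumed equivalent of the exclusive (0, 1) bounds check.
        if not 0 < value < 1:
            raise ValueError("confidence_level must lie strictly between 0 and 1")
        self._confidence_level = value
        z = NormalDist().inv_cdf(1 - (1 - value) / 2)
        self.lower_bound = self.mean - z * self.std
        self.upper_bound = self.mean + z * self.std
        self.width = 2 * z * self.std


ci = NormalInterval(mean=5.0, std=0.2)
width_95 = ci.width
ci.confidence_level = 0.99  # bounds and width are recomputed here
width_99 = ci.width
print(f"95% width: {width_95:.3f}, 99% width: {width_99:.3f}")
```

One consequence of this design, visible in the listing as well: only `confidence_level` triggers a recomputation, so changing `mean` or `std` on an existing object leaves the stored bounds untouched until the level is assigned again.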

test_null_hypothesis

test_null_hypothesis(h0_value, alternative='two-sided')

Perform a one-sample z-test against a null hypothesis value.

Parameters:

Name	Type	Description	Default
`h0_value`	`float`	The hypothesized population mean under the null hypothesis (H0: μ = h0_value).	required
`alternative`	`str`	The alternative hypothesis: `'two-sided'` (default; H1: μ ≠ h0_value), `'larger'` (H1: μ > h0_value), or `'smaller'` (H1: μ < h0_value).	`'two-sided'`

Returns:

Type	Description
`Tuple[float, float, float]`	`(z_stat, p_value, df)` where `z_stat` is the test statistic (mean - h0_value) / std, `p_value` is the p-value under the standard normal distribution, and `df` is `float('inf')`.

Raises:

Type	Description
`ValueError`	If `alternative` is not one of `'two-sided'`, `'larger'`, or `'smaller'`.
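
The z-test described above can be sketched with the standard library. This is an illustration under stated assumptions rather than glide's code: `z_test` is a hypothetical function, `statistics.NormalDist` replaces the normal distribution object used in the library source, and the constant `df` element of the return tuple is omitted for brevity.

```python
from statistics import NormalDist


def z_test(mean, std_error, h0_value, alternative="two-sided"):
    """Hypothetical one-sample z-test returning (z_stat, p_value)."""
    if alternative not in ("two-sided", "larger", "smaller"):
        raise ValueError(f"unknown alternative: {alternative!r}")
    z_stat = (mean - h0_value) / std_error
    std_normal = NormalDist()
    if alternative == "two-sided":
        # Both tails: 2 * P(Z >= |z|), using symmetry P(Z >= a) = P(Z <= -a).
        p_value = 2 * std_normal.cdf(-abs(z_stat))
    elif alternative == "larger":
        # Upper tail: P(Z >= z).
        p_value = std_normal.cdf(-z_stat)
    else:
        # Lower tail: P(Z <= z).
        p_value = std_normal.cdf(z_stat)
    return z_stat, p_value


z_stat, p_two_sided = z_test(5.0, 0.2, h0_value=4.5)
_, p_larger = z_test(5.0, 0.2, h0_value=4.5, alternative="larger")
_, p_smaller = z_test(5.0, 0.2, h0_value=4.5, alternative="smaller")
print(f"z = {z_stat:.2f}, two-sided p = {p_two_sided:.4f}")
```

With these inputs the null value 4.5 lies outside the 95% interval [4.608, 5.392], and the two-sided p-value is correspondingly below 0.05. For a z-test and a normal interval built from the same standard error, the two criteria always agree.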

Source code in glide/confidence_intervals/clt.py

def test_null_hypothesis(
    self, h0_value: float, alternative: Literal["larger", "smaller", "two-sided"] = "two-sided"
) -> Tuple[float, float, float]:
    """Perform a one-sample z-test against a null hypothesis value.

    Parameters
    ----------
    h0_value : float
        The hypothesized population mean under the null hypothesis (H0: μ = h0_value).
    alternative : str, optional
        The alternative hypothesis. One of:
        - ``'two-sided'`` (default): H1: μ ≠ h0_value
        - ``'larger'``: H1: μ > h0_value
        - ``'smaller'``: H1: μ < h0_value

    Returns
    -------
    Tuple[float, float, float]
        ``(z_stat, p_value, df)`` where ``z_stat`` is the test statistic
        (mean - h0_value) / std, ``p_value`` is the p-value under the standard
        normal distribution, and ``df`` is ``float('inf')``.

    Raises
    ------
    ValueError
        If ``alternative`` is not one of ``'two-sided'``, ``'larger'``, or ``'smaller'``.
    """
    z_stat = (self.mean - h0_value) / self.std
    alternatives = ["two-sided", "larger", "smaller"]
    _validate_literal(alternative, "alternative", alternatives)

    if alternative == alternatives[0]:
        p_value = 2 * norm.sf(abs(z_stat))
    elif alternative == alternatives[1]:
        p_value = norm.sf(z_stat)
    else:
        p_value = norm.cdf(z_stat)

    df = float("inf")
    return z_stat, p_value, df

glide.confidence_intervals.bootstrap.BootstrapConfidenceInterval `dataclass`

Quantile bootstrap confidence interval.

Stores the full distribution of bootstrap point estimates and derives the bounds as the α/2 and 1 - α/2 quantiles of that distribution, where α = 1 - confidence_level.

Parameters:

Name	Type	Description	Default
`bootstrap_estimates`	`NDArray`	Array of shape (B,) containing the B bootstrap point estimates.	required
`confidence_level`	`float`	Target coverage probability, strictly between 0 and 1, e.g. 0.95 for a 95% CI.	`0.95`

Examples:

>>> import numpy as np
>>> from glide.confidence_intervals import BootstrapConfidenceInterval
>>> rng = np.random.default_rng(0)
>>> estimates = rng.normal(loc=5.0, scale=0.3, size=20)
>>> ci = BootstrapConfidenceInterval(bootstrap_estimates=estimates, confidence_level=0.95)
>>> print(f"[{ci.lower_bound:.3f}, {ci.upper_bound:.3f}]")
[4.453, 5.354]
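
The example above draws its "estimates" directly from a normal distribution. In practice they would typically come from resampling a dataset. The sketch below shows one way that might look using only the standard library. The helper names (`linear_quantile`, `bootstrap_means`), the simulated data, and the choice of 2000 resamples are all assumptions made for illustration. `linear_quantile` interpolates linearly between order statistics, which matches the default method of `np.quantile`.

```python
import random
import statistics


def linear_quantile(sorted_values, q):
    """Quantile by linear interpolation between neighbouring order statistics."""
    pos = q * (len(sorted_values) - 1)
    lo = int(pos)  # floor, since pos is non-negative
    hi = min(lo + 1, len(sorted_values) - 1)
    frac = pos - lo
    return sorted_values[lo] + frac * (sorted_values[hi] - sorted_values[lo])


def bootstrap_means(sample, n_resamples, rng):
    """Mean of each resample drawn with replacement from `sample`."""
    n = len(sample)
    return [statistics.fmean(rng.choices(sample, k=n)) for _ in range(n_resamples)]


rng = random.Random(0)
sample = [rng.gauss(5.0, 1.0) for _ in range(50)]  # simulated data
estimates = sorted(bootstrap_means(sample, n_resamples=2000, rng=rng))

confidence_level = 0.95
alpha_over_two = (1 - confidence_level) / 2
lower = linear_quantile(estimates, alpha_over_two)
upper = linear_quantile(estimates, 1 - alpha_over_two)
print(f"[{lower:.3f}, {upper:.3f}]")
```

An array of estimates like `estimates` is what `bootstrap_estimates` expects. With only a handful of resamples the tail quantiles are determined by very few points, so a larger B generally gives more stable bounds.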

Source code in glide/confidence_intervals/bootstrap.py

@dataclass
class BootstrapConfidenceInterval:
    """Quantile bootstrap confidence interval.

    Stores the full distribution of bootstrap point estimates and derives
    bounds as quantiles of that distribution.

    Parameters
    ----------
    bootstrap_estimates : NDArray
        Array of shape (B,) containing the B bootstrap point estimates.
    confidence_level : float, optional
        Target coverage, e.g. 0.95 for a 95 % CI. Default is 0.95.

    Examples
    --------
    >>> import numpy as np
    >>> from glide.confidence_intervals import BootstrapConfidenceInterval
    >>> rng = np.random.default_rng(0)
    >>> estimates = rng.normal(loc=5.0, scale=0.3, size=20)
    >>> ci = BootstrapConfidenceInterval(bootstrap_estimates=estimates, confidence_level=0.95)
    >>> print(f"[{ci.lower_bound:.3f}, {ci.upper_bound:.3f}]")
    [4.453, 5.354]
    """

    mean: float = field(init=False, repr=False)
    var: float = field(init=False, repr=False)
    std: float = field(init=False, repr=False)
    _sorted_estimates: NDArray = field(init=False, repr=False)
    _confidence_level: float = field(init=False, repr=False)
    lower_bound: float = field(init=False, repr=False)
    upper_bound: float = field(init=False, repr=False)
    width: float = field(init=False, repr=False)

    def __init__(self, bootstrap_estimates: NDArray, confidence_level: float = 0.95) -> None:
        self.mean = float(np.mean(bootstrap_estimates))
        self.var = float(np.var(bootstrap_estimates, ddof=1))
        self.std = float(np.sqrt(self.var))
        self._sorted_estimates = np.sort(bootstrap_estimates)
        self.confidence_level = confidence_level

    @property
    def confidence_level(self) -> float:
        return self._confidence_level

    @confidence_level.setter
    def confidence_level(self, value: float) -> None:
        _validate_bounds(value, "confidence_level", lower=0, upper=1, left_inclusive=False, right_inclusive=False)
        self._confidence_level = value
        alpha_over_two = (1 - value) / 2
        self.lower_bound = float(np.quantile(self._sorted_estimates, alpha_over_two))
        self.upper_bound = float(np.quantile(self._sorted_estimates, 1 - alpha_over_two))
        self.width = self.upper_bound - self.lower_bound

    def test_null_hypothesis(
        self,
        h0_value: float,
        alternative: Literal["larger", "smaller", "two-sided"] = "two-sided",
    ) -> Tuple[float, float, float]:
        """Bootstrap hypothesis test against a null value.

        Computes a p-value as the proportion of bootstrap estimates that are
        at least as extreme as `h0_value` under the specified alternative.

        Parameters
        ----------
        h0_value : float
            The hypothesized population mean under the null hypothesis (H0: μ = h0_value).
        alternative : str, optional
            The alternative hypothesis. One of:
            - ``'two-sided'`` (default): H1: μ ≠ h0_value
            - ``'larger'``: H1: μ > h0_value
            - ``'smaller'``: H1: μ < h0_value

        Returns
        -------
        Tuple[float, float, float]
            ``(test_statistic, p_value, df)`` where ``test_statistic`` is the
            point estimate (mean of bootstrap distribution), ``p_value`` is the
            bootstrap p-value, and ``df`` is ``float('inf')``.
        """
        n = len(self._sorted_estimates)
        alternatives = ["two-sided", "larger", "smaller"]
        _validate_literal(alternative, "alternative", alternatives)

        if alternative == alternatives[0]:
            observed_deviation = abs(h0_value - self.mean)
            # Count estimates <= (mean - deviation) or >= (mean + deviation)
            lower_threshold = self.mean - observed_deviation
            upper_threshold = self.mean + observed_deviation
            count_below = np.searchsorted(self._sorted_estimates, lower_threshold, side="right")
            count_above = n - np.searchsorted(self._sorted_estimates, upper_threshold, side="left")
            count_extreme = count_below + count_above
        elif alternative == alternatives[1]:
            # Count estimates <= h0_value (evidence against "larger" alternative)
            count_extreme = np.searchsorted(self._sorted_estimates, h0_value, side="right")
        else:
            # Count estimates >= h0_value (evidence against "smaller" alternative)
            count_extreme = n - np.searchsorted(self._sorted_estimates, h0_value, side="left")

        p_value = float(count_extreme) / n

        return self.mean, p_value, float("inf")

test_null_hypothesis

test_null_hypothesis(h0_value, alternative='two-sided')

Bootstrap hypothesis test against a null value.

Computes a p-value as the proportion of bootstrap estimates that are at least as extreme as `h0_value` under the specified alternative.

Parameters:

Name	Type	Description	Default
`h0_value`	`float`	The hypothesized population mean under the null hypothesis (H0: μ = h0_value).	required
`alternative`	`str`	The alternative hypothesis: `'two-sided'` (default; H1: μ ≠ h0_value), `'larger'` (H1: μ > h0_value), or `'smaller'` (H1: μ < h0_value).	`'two-sided'`

Returns:

Type	Description
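`Tuple[float, float, float]`	`(test_statistic, p_value, df)` where `test_statistic` is the point estimate (mean of bootstrap distribution), `p_value` is the bootstrap p-value, and `df` is `float('inf')`.

Raises:

Type	Description
`ValueError`	If `alternative` is not one of `'two-sided'`, `'larger'`, or `'smaller'`.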
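
The counting rule behind the bootstrap p-value can be sketched with `bisect`, whose `bisect_right` and `bisect_left` behave like `np.searchsorted` with `side="right"` and `side="left"` on a sorted list. `bootstrap_p_value` is a hypothetical function written for illustration. One departure, flagged here as an assumption rather than glide's behavior: the result is capped at 1.0, because when `h0_value` equals the mean, an estimate lying exactly at the mean falls into both tails of the two-sided count.

```python
from bisect import bisect_left, bisect_right
from statistics import fmean


def bootstrap_p_value(sorted_estimates, h0_value, alternative="two-sided"):
    """Hypothetical sketch: share of estimates at least as extreme as h0_value."""
    n = len(sorted_estimates)
    center = fmean(sorted_estimates)
    if alternative == "two-sided":
        deviation = abs(h0_value - center)
        # Estimates <= center - deviation, plus estimates >= center + deviation.
        count_below = bisect_right(sorted_estimates, center - deviation)
        count_above = n - bisect_left(sorted_estimates, center + deviation)
        count_extreme = count_below + count_above
    elif alternative == "larger":
        # Estimates <= h0_value.
        count_extreme = bisect_right(sorted_estimates, h0_value)
    elif alternative == "smaller":
        # Estimates >= h0_value.
        count_extreme = n - bisect_left(sorted_estimates, h0_value)
    else:
        raise ValueError(f"unknown alternative: {alternative!r}")
    # Cap added in this sketch only; see the note above.
    return min(1.0, count_extreme / n)


estimates = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0]  # mean is 5.5
p_two_sided = bootstrap_p_value(estimates, 2.5)
p_larger = bootstrap_p_value(estimates, 2.5, alternative="larger")
p_smaller = bootstrap_p_value(estimates, 2.5, alternative="smaller")
print(p_two_sided, p_larger, p_smaller)  # 0.4 0.2 0.8
```

Here the null value 2.5 sits 3.0 below the mean, so the two-sided count takes the two estimates at or below 2.5 and the two at or above 8.5, giving 4/10. The resolution of such a p-value is limited to multiples of 1/B, which is another reason to use a large number of bootstrap estimates.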

Source code in glide/confidence_intervals/bootstrap.py

def test_null_hypothesis(
    self,
    h0_value: float,
    alternative: Literal["larger", "smaller", "two-sided"] = "two-sided",
) -> Tuple[float, float, float]:
    """Bootstrap hypothesis test against a null value.

    Computes a p-value as the proportion of bootstrap estimates that are
    at least as extreme as `h0_value` under the specified alternative.

    Parameters
    ----------
    h0_value : float
        The hypothesized population mean under the null hypothesis (H0: μ = h0_value).
    alternative : str, optional
        The alternative hypothesis. One of:
        - ``'two-sided'`` (default): H1: μ ≠ h0_value
        - ``'larger'``: H1: μ > h0_value
        - ``'smaller'``: H1: μ < h0_value

    Returns
    -------
    Tuple[float, float, float]
        ``(test_statistic, p_value, df)`` where ``test_statistic`` is the
        point estimate (mean of bootstrap distribution), ``p_value`` is the
        bootstrap p-value, and ``df`` is ``float('inf')``.
    """
    n = len(self._sorted_estimates)
    alternatives = ["two-sided", "larger", "smaller"]
    _validate_literal(alternative, "alternative", alternatives)

    if alternative == alternatives[0]:
        observed_deviation = abs(h0_value - self.mean)
        # Count estimates <= (mean - deviation) or >= (mean + deviation)
        lower_threshold = self.mean - observed_deviation
        upper_threshold = self.mean + observed_deviation
        count_below = np.searchsorted(self._sorted_estimates, lower_threshold, side="right")
        count_above = n - np.searchsorted(self._sorted_estimates, upper_threshold, side="left")
        count_extreme = count_below + count_above
    elif alternative == alternatives[1]:
        # Count estimates <= h0_value (evidence against "larger" alternative)
        count_extreme = np.searchsorted(self._sorted_estimates, h0_value, side="right")
    else:
        # Count estimates >= h0_value (evidence against "smaller" alternative)
        count_extreme = n - np.searchsorted(self._sorted_estimates, h0_value, side="left")

    p_value = float(count_extreme) / n

    return self.mean, p_value, float("inf")
