API Reference

Simulators

| Function | Description |
| --- | --- |
| generate_binary_dataset | Synthetic binary-label dataset |
| generate_stratified_binary_dataset | Stratified binary-label dataset |
| generate_binary_dataset_with_oracle_sampling | Binary dataset with oracle sampling probabilities |
| generate_gaussian_dataset | Synthetic Gaussian dataset |
| generate_clustered_binary_dataset | Synthetic clustered binary-label dataset |
| simulate_annotation | Simulate annotation in the simulation lifecycle |

Samplers

| Class | Description |
| --- | --- |
| UniformSampler | Uniform random sampling |
| ActiveSampler | Uncertainty-based active sampling |
| StratifiedSampler | Stratified budget allocation with Neyman/proportional strategies |
| CostOptimalRandomSampler | Cost-optimal random sampling |
| CostOptimalSampler | Uncertainty-based cost-optimal sampling |
| UniformClusterSampler | Uniform clustered random sampling |
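
To make the two allocation strategies named for StratifiedSampler concrete, here is a minimal sketch of the textbook formulas. It is not the library's implementation: the function names `proportional_allocation` and `neyman_allocation` and their arguments are hypothetical, and the results are fractional budgets that a real sampler would still need to round.

```python
def proportional_allocation(stratum_sizes, budget):
    """Split the budget in proportion to stratum size (hypothetical helper)."""
    total = sum(stratum_sizes)
    if total <= 0:
        raise ValueError("stratum sizes must sum to a positive number")
    return [budget * size / total for size in stratum_sizes]


def neyman_allocation(stratum_sizes, stratum_stds, budget):
    """Split the budget in proportion to size * standard deviation.

    Textbook Neyman allocation: larger or noisier strata get more labels.
    """
    weights = [size * std for size, std in zip(stratum_sizes, stratum_stds)]
    total = sum(weights)
    if total <= 0:
        raise ValueError("at least one stratum needs positive size and std")
    return [budget * weight / total for weight in weights]


# Two equally sized strata; the second is three times as variable.
print(proportional_allocation([100, 100], budget=40))
print(neyman_allocation([100, 100], [1.0, 3.0], budget=40))
```

Proportional allocation ignores within-stratum variability, while Neyman allocation shifts budget toward the strata where each extra label reduces variance the most.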

Estimators

Classical

| Class | Description |
| --- | --- |
| ClassicalMeanEstimator | Classical sample mean without proxy labels |
| StratifiedClassicalMeanEstimator | Classical mean with population-proportional stratification |
| IPWClassicalMeanEstimator | Classical mean with inverse probability weighting |
| ClusterClassicalMeanEstimator | Classical sample mean on clustered data |
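
As an illustration of what inverse probability weighting means for a mean estimate, the sketch below implements the two standard textbook forms. Which form IPWClassicalMeanEstimator uses is not stated in this reference, so treat the function names `horvitz_thompson_mean` and `hajek_mean` and their signatures as hypothetical.

```python
def _check(values, probs):
    # Shared input validation for both estimators.
    if len(values) != len(probs):
        raise ValueError("values and probs must have the same length")
    if any(p <= 0 or p > 1 for p in probs):
        raise ValueError("sampling probabilities must lie in (0, 1]")


def horvitz_thompson_mean(values, probs, population_size):
    """Unbiased form: weighted sum divided by the known population size."""
    _check(values, probs)
    return sum(y / p for y, p in zip(values, probs)) / population_size


def hajek_mean(values, probs):
    """Self-normalized form: weighted sum divided by the sum of weights."""
    _check(values, probs)
    weighted_sum = sum(y / p for y, p in zip(values, probs))
    weight_total = sum(1.0 / p for p in probs)
    return weighted_sum / weight_total


# Three sampled labels, each drawn with probability 0.5 from six items.
labels = [1, 0, 1]
probs = [0.5, 0.5, 0.5]
print(horvitz_thompson_mean(labels, probs, population_size=6))
print(hajek_mean(labels, probs))
```

Both forms weight each sampled label by the inverse of its sampling probability, so items that were unlikely to be drawn count for more.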

Prediction-Powered

| Class | Description |
| --- | --- |
| PPIMeanEstimator | Combines labeled data with proxy predictions |
| StratifiedPPIMeanEstimator | PPI with per-stratum optimal weighting |
| ASIMeanEstimator | Active statistical inference with non-uniform sampling |
| PTDMeanEstimator | Predict-then-debias with bootstrap confidence intervals |
| StratifiedPTDMeanEstimator | PTD with per-stratum optimal weighting |
| IPWPTDMeanEstimator | PTD with inverse probability weighting |
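
The idea of combining labeled data with proxy predictions can be sketched with the standard prediction-powered mean formula. This is a generic illustration under stated assumptions, not PPIMeanEstimator itself: the function `ppi_mean`, its arguments, and the `lam` weighting parameter are hypothetical, and the labeled and unlabeled samples are assumed to be independent uniform draws.

```python
import math
from statistics import fmean, variance


def ppi_mean(labels, proxy_labeled, proxy_unlabeled, lam=1.0):
    """Prediction-powered mean estimate and its standard error.

    labels          : true labels on the labeled sample
    proxy_labeled   : proxy predictions on the same labeled sample
    proxy_unlabeled : proxy predictions on the unlabeled sample
    lam             : weight on the proxy term (lam=0 gives the classical mean)
    """
    if len(labels) != len(proxy_labeled):
        raise ValueError("labels and proxy_labeled must have the same length")
    n = len(labels)
    n_unlabeled = len(proxy_unlabeled)

    # Classical mean, shifted by the proxy's mean difference between samples.
    estimate = fmean(labels) + lam * (fmean(proxy_unlabeled) - fmean(proxy_labeled))

    # Variance under independent samples: residual term plus proxy term.
    residuals = [y - lam * f for y, f in zip(labels, proxy_labeled)]
    var = variance(residuals) / n + lam**2 * variance(proxy_unlabeled) / n_unlabeled
    return estimate, math.sqrt(var)


labels = [1, 0, 1, 1]
proxy_labeled = [0.9, 0.2, 0.8, 0.7]
proxy_unlabeled = [0.5, 0.5, 0.7, 0.9, 0.4, 0.6]
estimate, std_error = ppi_mean(labels, proxy_labeled, proxy_unlabeled)
print(estimate, std_error)
```

When the proxy tracks the true labels well, the residuals have small variance and the estimate is tighter than the classical mean computed from the labeled sample alone.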

Confidence Intervals

| Class | Description |
| --- | --- |
| CLTConfidenceInterval | CLT-based normal approximation confidence intervals |
| BootstrapConfidenceInterval | Quantile-based bootstrap confidence intervals |
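
The two interval types differ in how they turn a sample into bounds. The sketch below shows both for a plain sample mean, using only the Python standard library. The functions `clt_interval` and `bootstrap_interval` are hypothetical names for illustration and do not reflect the constructors or methods of the classes above.

```python
import math
import random
from statistics import NormalDist, fmean, stdev


def clt_interval(values, alpha=0.05):
    """Normal-approximation interval: mean +/- z * s / sqrt(n)."""
    n = len(values)
    center = fmean(values)
    z = NormalDist().inv_cdf(1 - alpha / 2)
    half_width = z * stdev(values) / math.sqrt(n)
    return center - half_width, center + half_width


def bootstrap_interval(values, alpha=0.05, n_resamples=1000, seed=0):
    """Percentile bootstrap interval for the mean."""
    rng = random.Random(seed)
    n = len(values)
    # Resample with replacement and record the mean of each resample.
    means = sorted(fmean(rng.choices(values, k=n)) for _ in range(n_resamples))
    lower_idx = int((alpha / 2) * n_resamples)
    upper_idx = min(n_resamples - 1, int((1 - alpha / 2) * n_resamples))
    return means[lower_idx], means[upper_idx]


data = [0, 1] * 50  # 100 binary labels with mean 0.5
print(clt_interval(data))
print(bootstrap_interval(data))
```

The CLT interval relies on a closed-form standard error, while the bootstrap interval only needs the ability to recompute the estimate on resampled data, which is useful when no simple variance formula is available.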

Inference Results

| Class | Description |
| --- | --- |
| ClassicalMeanInferenceResult | Result object from classical estimators |
| PredictionPoweredMeanInferenceResult | Result object from prediction-powered estimators |

Scientific Validation

| Function | Description |
| --- | --- |
| run_monte_carlo | Monte Carlo driver for coverage and efficiency validation |
| compute_hits | Per-seed hit indicators for coverage computation |
| coverage_with_error_bar | Empirical coverage and confidence interval |
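
Coverage validation usually reduces to two steps: record, for each seed, whether the interval contained the true value, then summarize those indicators with an error bar. The following sketch shows one common way to do this, using a Wald binomial interval. The names `hit_indicators` and `coverage_with_wald_interval` are hypothetical and are not the signatures of compute_hits or coverage_with_error_bar.

```python
import math
from statistics import NormalDist, fmean


def hit_indicators(intervals, true_value):
    """One boolean per simulated run: did the interval contain the truth?"""
    return [lower <= true_value <= upper for lower, upper in intervals]


def coverage_with_wald_interval(hits, alpha=0.05):
    """Empirical coverage with a Wald binomial interval clamped to [0, 1]."""
    n = len(hits)
    coverage = fmean(1.0 if hit else 0.0 for hit in hits)
    z = NormalDist().inv_cdf(1 - alpha / 2)
    half_width = z * math.sqrt(coverage * (1 - coverage) / n)
    return coverage, max(0.0, coverage - half_width), min(1.0, coverage + half_width)


# Four simulated intervals for a true mean of 0.3.
intervals = [(0.0, 1.0), (0.5, 0.6), (0.2, 0.4), (0.1, 0.9)]
hits = hit_indicators(intervals, true_value=0.3)
print(hits)
print(coverage_with_wald_interval(hits))
```

With only a handful of runs the error bar is very wide, which is why coverage studies typically use hundreds or thousands of seeds.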

I/O

| Module | Description |
| --- | --- |
| glide.io | JSON serialization and export helpers |