Contributing to GLIDE
Thank you for considering a contribution to GLIDE! This guide covers everything you need to set up your environment, understand the codebase, and submit a pull request.
Depending on what you want to do, jump to the relevant section:
- Found a bug? → Bug fixes
- Want to add a new estimator or feature? → New features
- Fixing docs or adding an example? → Documentation
- Improving CI, tooling, or the Makefile? → Repository hygiene
- Restructuring code without changing behaviour? → Refactoring
Before writing any code, please open an issue to discuss the scope of your change. This is highly recommended and especially important for new estimators and samplers: sharing the reference paper upfront gives maintainers a chance to read it and frame the ticket to guide your implementation.
When you are ready to submit, fork the repository, create a branch off main, and open a pull request against main. The PR template lists all conditions that must be satisfied before requesting a review.
Setup
GLIDE uses uv to manage the virtual environment and all dependency groups.
1. Install uv (skip if already installed):
curl -LsSf https://astral.sh/uv/install.sh | sh
2. Create the virtual environment and install all dependencies:
make venv
This installs the main package, test dependencies, and documentation dependencies in one step.
3. Verify the setup by running the test suite:
make tests
4. Install the git pre-commit hooks:
GLIDE uses prek (a lightweight pre-commit hook runner) configured in prek.toml. The hooks run automatically on every git commit and enforce formatting, type checking, and notebook output stripping.
Install the hooks once after cloning:
uv run prek install
5. Testing notebooks locally (optional):
The project includes example notebooks in docs/. To test all notebooks locally:
make test-notebooks
Note: Notebook testing also runs in CI for all pull requests, so local testing is optional. The CI workflow ensures notebooks are executed and validated before merge.
Architectural overview
The package is organised around four concerns: estimators, samplers, core building blocks, and I/O.
glide/
├── estimators/ # Public API — mean estimators
│ ├── ppi.py
│ ├── ...
│
├── samplers/ # Public API — sampling strategies
│ ├── active.py
│ ├── ...
│
├── simulators/ # Public API — synthetic data generators for tests
│ ├── binary.py
│ ├── ...
│
├── confidence_intervals/ # Confidence interval types embedded in results
│ ├── base.py
│ ├── ...
│
├── mean_inference_results/ # Result types returned by estimators
│ ├── base.py
│ ├── ...
│
├── utils.py # General-purpose helpers
│
└── io/ # Serialisation helpers (e.g., to_json)
└── export.py
How the pieces fit together. Estimators accept raw NumPy arrays and return a MeanInferenceResult subclass: prediction-powered estimators return a PredictionPoweredMeanInferenceResult, classical ones a ClassicalMeanInferenceResult. Every result embeds a ConfidenceInterval (e.g. CLTConfidenceInterval). Samplers produce the labeled arrays that estimators consume. The io module serialises result objects.
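The flow described above can be sketched with stand-in classes. The snippet does not import GLIDE: the class names echo those mentioned in this guide, but every field name, the classical_mean_estimate function, and the to_json signature are assumptions made for illustration. Plain lists stand in for NumPy arrays to keep the sketch dependency-free.

```python
import json
import math
from dataclasses import asdict, dataclass
from statistics import NormalDist, fmean, stdev


@dataclass(frozen=True)
class ConfidenceInterval:
    """Stand-in for a confidence interval object (field names are assumed)."""

    lower: float
    upper: float
    level: float


@dataclass(frozen=True)
class MeanInferenceResult:
    """Stand-in for a result type: a point estimate plus its interval."""

    mean: float
    confidence_interval: ConfidenceInterval


def classical_mean_estimate(y, level=0.95):
    """Hypothetical estimator: sample mean with a CLT-based interval."""
    mean = fmean(y)
    std_error = stdev(y) / math.sqrt(len(y))
    z = NormalDist().inv_cdf(0.5 + level / 2)  # two-sided normal quantile
    interval = ConfidenceInterval(mean - z * std_error, mean + z * std_error, level)
    return MeanInferenceResult(mean, interval)


def to_json(result):
    """Hypothetical serialiser: nested dataclasses become nested JSON objects."""
    return json.dumps(asdict(result))


labels = [1.0, 0.0, 1.0, 1.0, 0.0, 1.0]  # the kind of array a sampler would hand over
result = classical_mean_estimate(labels)
payload = json.loads(to_json(result))
```

The point of the sketch is the shape of the hand-off: arrays go in, a single result object embedding its interval comes out, and serialisation only needs that result object.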
Possible contributions
Contributions fall into the five categories below.
1. Bug fixes
Reproduce the bug in a failing test first — this confirms the bug exists and guarantees it stays fixed. Then make the minimal code change that makes the test pass.
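As an illustration of this test-first workflow, suppose a hypothetical helper normalise_weights raised ZeroDivisionError when every weight was zero. Neither the helper nor the bug exists in GLIDE; the sketch only shows the shape of a regression test alongside a minimal fix.

```python
def normalise_weights(weights):
    """Hypothetical helper, shown after the fix.

    The buggy version divided by sum(weights) unconditionally and raised
    ZeroDivisionError when every weight was zero.
    """
    total = sum(weights)
    if total == 0:
        # Minimal fix: fall back to uniform weights.
        return [1.0 / len(weights)] * len(weights)
    return [w / total for w in weights]


def test_all_zero_weights_fall_back_to_uniform():
    # Written first, against the buggy code, where it failed with ZeroDivisionError.
    assert normalise_weights([0.0, 0.0, 0.0, 0.0]) == [0.25, 0.25, 0.25, 0.25]


def test_regular_weights_are_unchanged_by_the_fix():
    # Guards the existing behaviour so the fix stays minimal.
    assert normalise_weights([1.0, 3.0]) == [0.25, 0.75]
```

Committing the failing test before the fix makes the bug visible in the PR history and keeps the diff of the fix itself small.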
2. New features
New estimators and samplers should be backed by a scientific publication. Please open an issue first and share the reference paper, so that maintainers can read it and frame the ticket to guide your implementation.
Adding a new estimator — step by step
- Identify the inputs, outputs, and any tunable hyperparameters.
- Implement the estimator class:
  - If your estimator belongs to an existing family, add it to the corresponding file (e.g. PPI-based methods go in glide/estimators/ppi.py). Otherwise, create glide/estimators/<name>.py.
  - estimate(array1, array2, ...) runs the method and returns an inference result object. Reuse one from glide/mean_inference_results (e.g. a MeanInferenceResult subclass) or add a new one there.
  - If your estimator has hyperparameters, these should be optional parameters of estimate() with default values.
- Export the new class from glide/estimators/__init__.py.
- Write unit tests in tests/unit/estimators/test_<name>.py. Cover at minimum:
  - Correct output type and shape.
  - Known analytical results (e.g., the estimator reduces to the classical mean in special cases).
  - Doctests in the class docstring.
- Write functional tests in tests/functional/estimators/test_<name>.py. If applicable, test expected behaviors and properties of your estimator in specific situations. See existing files in tests/functional/estimators for examples.
- Write a numpy-style docstring that includes the reference paper, parameter descriptions, and a small Examples section with a minimalistic runnable doctest. See existing estimators for inspiration.
- Add an example script in docs/examples/plot_<name>.py demonstrating the estimator on some synthetic data.
- Update CHANGELOG.md under the [Next release] section.
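The steps above can be sketched as a skeleton. This is a minimal illustration under stated assumptions, not GLIDE code: DifferenceEstimator, SketchResult, the argument names, and the power hyperparameter are all hypothetical, and a generic difference-style estimator (mean of proxy predictions on unlabeled data, corrected by the average labeled residual) stands in for a published method. Plain lists replace NumPy arrays to keep the snippet dependency-free.

```python
from dataclasses import dataclass
from statistics import fmean


@dataclass(frozen=True)
class SketchResult:
    """Stand-in for a MeanInferenceResult subclass (field name assumed)."""

    mean: float


class DifferenceEstimator:
    """Illustrative difference estimator of a mean (hypothetical class).

    A real contribution would cite its reference paper here.

    Examples
    --------
    >>> est = DifferenceEstimator()
    >>> est.estimate([1.0, 0.0, 1.0, 1.0], [1.0, 0.0, 0.0, 1.0], [1.0, 0.0, 0.0, 0.0]).mean
    0.5
    """

    def estimate(self, y_labeled, proxy_labeled, proxy_unlabeled, power=1.0):
        """Estimate the mean of y.

        Parameters
        ----------
        y_labeled : sequence of float
            True labels on the labeled set.
        proxy_labeled : sequence of float
            Proxy predictions on the labeled set.
        proxy_unlabeled : sequence of float
            Proxy predictions on the unlabeled set.
        power : float, default=1.0
            Optional hyperparameter weighting the proxy; 0 ignores it.
        """
        # Average labeled residual, with the proxy scaled by `power`.
        rectifier = fmean(y - power * p for y, p in zip(y_labeled, proxy_labeled))
        return SketchResult(power * fmean(proxy_unlabeled) + rectifier)


def test_reduces_to_classical_mean_when_power_is_zero():
    # Known analytical result: with power=0 only the labeled mean remains.
    y = [1.0, 0.0, 1.0, 1.0]
    result = DifferenceEstimator().estimate(y, [0.3, 0.9, 0.1, 0.5], [0.2, 0.4], power=0.0)
    assert isinstance(result, SketchResult)  # correct output type
    assert result.mean == fmean(y)
```

In a real contribution the class would live in glide/estimators/<name>.py and the test in tests/unit/estimators/test_<name>.py, following the steps above.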
Adding a new sampler — step by step
- Identify the inputs the sampler requires (e.g. proxy labels, uncertainty scores, stratum labels), the budget parameter, and what values it returns.
- Implement the sampler class:
  - Create glide/samplers/<name>.py.
  - sample(...) runs the sampling procedure and returns the computed values (at least a vector xi of sampling indicators and possibly a vector pi of sampling probabilities).
  - If your sampler has hyperparameters, these should be optional parameters of sample() with default values.
- Export the new class from glide/samplers/__init__.py.
- Write unit tests in tests/unit/samplers/test_<name>.py. Cover at minimum:
  - Correct output type and shape.
  - Known analytical results (e.g., uniform inputs should yield equal probabilities).
  - Edge cases for input parameters (e.g. budget equals dataset size).
  - Doctests in the class docstring.
- Write functional tests in tests/functional/samplers/test_<name>.py. If applicable, test expected behaviors and properties of your sampler. See existing files in tests/functional/samplers for examples.
- Write a numpy-style docstring that includes the reference paper, parameter descriptions, and a small Examples section with a minimalistic runnable doctest. See existing samplers for inspiration.
- Update CHANGELOG.md under the [Next release] section.
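A sampler following these steps might look like the sketch below. ProportionalSampler, its argument names, and the seed parameter are hypothetical and not part of GLIDE. The sketch uses simple independent (Poisson-style) sampling with probabilities proportional to non-negative scores, clipped at 1, as a stand-in for a published strategy, and plain lists in place of NumPy arrays.

```python
import random


class ProportionalSampler:
    """Illustrative Poisson-style sampler (hypothetical, not a GLIDE class).

    Each item is selected independently with probability proportional to
    its score, scaled so the expected sample size matches the budget
    before clipping at 1.
    """

    def sample(self, scores, budget, seed=None):
        """Return (xi, pi): sampling indicators and sampling probabilities."""
        if not 0 < budget <= len(scores):
            raise ValueError("budget must be in (0, len(scores)]")
        total = sum(scores)
        if total <= 0:
            raise ValueError("scores must have a positive sum")
        # Inclusion probabilities, clipped so none exceeds 1.
        pi = [min(1.0, budget * s / total) for s in scores]
        rng = random.Random(seed)
        # One independent Bernoulli draw per item.
        xi = [int(rng.random() < p) for p in pi]
        return xi, pi


def test_uniform_scores_yield_equal_probabilities():
    # Known analytical result: equal scores give pi = budget / n for every item.
    xi, pi = ProportionalSampler().sample([2.0] * 8, budget=2, seed=0)
    assert pi == [0.25] * 8
    assert len(xi) == len(pi) == 8


def test_budget_equal_to_dataset_size_selects_everything():
    # Edge case: with uniform scores and budget = n, every item is selected.
    xi, pi = ProportionalSampler().sample([1.0] * 5, budget=5, seed=0)
    assert pi == [1.0] * 5
    assert xi == [1] * 5
```

Exposing the seed as an optional parameter is one way to keep doctests and unit tests deterministic; whether GLIDE samplers take a seed or a random generator is not stated in this guide, so check existing samplers for the actual convention.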
3. Documentation
Corrections, clarifications, and new examples live in docs/. Build the docs locally with:
make doc
4. Repository hygiene
Improvements to CI, Makefile targets, GitHub Actions workflows, or dependency configuration. These changes should not affect the public API or test behaviour.
5. Refactoring
Restructuring code without changing observable behaviour. Refactoring PRs must pass the full test suite and must not be bundled with functional changes.
A note on LLM-assisted contributions
LLM usage is welcome and must be disclosed in the PR description. Reviewers should be aware that LLM-generated code tends to increase review burden: it is often verbose, introduces unnecessary abstractions, and may silently diverge from the project's conventions. Contributors are expected to thoroughly read, understand, and validate every line before submitting — not just run the tests. Undisclosed or unvalidated LLM output is grounds for requesting a rewrite.