Contributing
This guide covers the workflow for contributing operators, tests, and documentation to DiffBio.
Development Setup
The setup script creates a virtual environment, installs all dependencies
via uv, and configures GPU detection. Always activate with source ./activate.sh
before running any commands.
Dependency Runtime Contract
DiffBio's canonical ecosystem runtime is the active installed environment, not
whatever sibling checkouts happen to exist next to the repo. In practice that
means datarax, artifex, opifex, and calibrax must resolve from the
activated .venv, with the exact Git snapshots pinned in uv.lock.
When you need the latest GitHub-backed ecosystem state in the current environment, refresh both the lockfile and the installed toolchain:
source ./activate.sh
uv lock --upgrade-package datarax --upgrade-package artifex --upgrade-package opifex --upgrade-package calibrax
uv sync --all-extras --all-groups
Verify the runtime contract immediately after syncing:
That command confirms:
- each ecosystem package resolves from installed site-packages
- the live opifex.neural.operators.FourierNeuralOperator constructor still exposes spatial_dims
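The two checks listed above can be approximated in plain Python. The sketch below is only an assumption about how such a check might look, not the project's actual verification command: `resolves_from_site_packages` and `constructor_exposes` are hypothetical helpers, and the demonstration uses a stand-in class so that it runs without opifex installed.

```python
import importlib.util
import inspect
from pathlib import Path


def resolves_from_site_packages(module_name: str) -> bool:
    """Return True if the module's origin lies under a site-packages directory."""
    spec = importlib.util.find_spec(module_name)
    if spec is None or spec.origin is None:
        return False
    return "site-packages" in Path(spec.origin).parts


def constructor_exposes(cls: type, parameter: str) -> bool:
    """Return True if cls.__init__ accepts the named parameter."""
    return parameter in inspect.signature(cls.__init__).parameters


class StandInOperator:
    """Hypothetical stand-in for a class such as FourierNeuralOperator."""

    def __init__(self, spatial_dims: int = 2) -> None:
        self.spatial_dims = spatial_dims


# With the real packages installed, one would pass names such as "datarax"
# and the imported FourierNeuralOperator class instead of the stand-in.
has_spatial_dims = constructor_exposes(StandInOperator, "spatial_dims")
```

A sibling checkout installed in editable mode would typically resolve to a path outside site-packages, which is the situation the first helper is meant to flag.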
Use these targeted smoke tests when checking the current foundation-model and epigenomics baseline after a dependency refresh:
uv run pytest tests/operators/epigenomics/test_fno_peak_calling.py -q
uv run pytest tests/benchmarks/test_singlecell_foundation_suite.py -q
uv run pytest tests/benchmarks/test_genomics_foundation_suite.py -q
Code Quality Tools
DiffBio uses ruff for linting and formatting. Do not use Black, isort, or flake8 — they are not configured for this project.
uv run ruff check src/ --fix # Lint + autofix
uv run ruff format src/ # Format
uv run pyright src/ # Type check
uv run pre-commit run --all-files # Run all hooks
Style Rules
| Rule | Value |
|---|---|
| Line length | 100 characters |
| Formatter | ruff format (not Black) |
| Import sorting | ruff (not isort) |
| Type checker | pyright (not mypy) |
| Docstrings | Google style |
| Type hints | Required on all public functions |
| Type syntax | list, dict, tuple (not List, Dict, Tuple) |
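As an illustration of the rules in the table, the sketch below shows a small function written in that style: built-in generic types, a return annotation, and a Google-style docstring. `scale_to_unit_sum` is a hypothetical helper invented for this example and is not part of DiffBio.

```python
def scale_to_unit_sum(values: list[float]) -> list[float]:
    """Scale values so that they sum to one.

    Args:
        values: Non-negative numbers with a strictly positive total.

    Returns:
        A new list of the same length whose entries sum to one.

    Raises:
        ValueError: If the total of ``values`` is not positive.
    """
    total = sum(values)
    if total <= 0:
        raise ValueError("values must have a positive total")
    return [value / total for value in values]
```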
Pre-commit Hooks
Pre-commit runs automatically on git commit, and the commit is rejected
unless every hook passes. The configured hooks include:
- ruff (lint + format)
- pyright (type checking)
- bandit (security)
- interrogate (docstring coverage)
- nbqa-ruff (notebook linting)
- radon (cyclomatic complexity)
- trailing whitespace, end-of-file, YAML/TOML checks
Install hooks after cloning:
Adding a New Operator
1. Define the Configuration
from dataclasses import dataclass

from datarax.core.config import OperatorConfig


@dataclass(frozen=True)
class MyOperatorConfig(OperatorConfig):
    """Configuration for MyOperator.

    Attributes:
        my_param: Controls the smoothing intensity.
    """

    my_param: float = 1.0
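Because the configuration is declared with frozen=True, instances cannot be mutated after construction; a modified copy is made with `dataclasses.replace` instead. A minimal sketch of that behaviour follows, using a hypothetical stand-alone `ExampleConfig` that does not inherit from OperatorConfig so that it runs without datarax installed.

```python
from dataclasses import FrozenInstanceError, dataclass, replace


@dataclass(frozen=True)
class ExampleConfig:
    """Hypothetical stand-in for an operator configuration."""

    my_param: float = 1.0


base = ExampleConfig()
# replace() builds a new instance rather than mutating the frozen one.
tuned = replace(base, my_param=0.5)

try:
    base.my_param = 2.0  # type: ignore[misc]
except FrozenInstanceError:
    mutation_blocked = True
else:
    mutation_blocked = False
```

Frozen dataclasses with the default equality are also hashable, so two configs with equal fields compare and hash alike.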
2. Implement the Operator
All operators inherit from datarax.core.operator.OperatorModule and
implement the apply() contract:
from datarax.core.operator import OperatorModule
from flax import nnx
import jax.numpy as jnp


class MyOperator(OperatorModule):
    """One-line description of what this operator does.

    Detailed explanation of the algorithm, including what smooth
    approximation technique is used for differentiability.

    Args:
        config: Operator configuration.
        rngs: Flax NNX random number generators.
    """

    def __init__(self, config: MyOperatorConfig, *, rngs: nnx.Rngs) -> None:
        super().__init__(config, rngs=rngs)
        self.param = nnx.Param(jnp.array(config.my_param))

    def apply(
        self, data, state, metadata, random_params=None, stats=None,
    ):
        # _process is a placeholder for this operator's own differentiable computation.
        result = self._process(data)
        # Merge into the input dict so that existing keys are preserved.
        return {**data, "output": result}, state, metadata
Key patterns:
- Always call super().__init__(config, rngs=rngs)
- Use nnx.Param for learnable parameters
- Return {**data, "new_key": value} to preserve input keys
- Use jax.numpy inside the operator, never numpy
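The key-preservation and jax.numpy patterns above can be seen in isolation with a plain function. The sketch below is a hypothetical example, not a DiffBio operator: `soft_threshold_apply`, its `threshold` and `temperature` arguments, and the "labels" key are all invented for illustration, and a sigmoid is assumed as the smooth approximation.

```python
import jax
import jax.numpy as jnp


def soft_threshold_apply(
    data: dict[str, jax.Array], threshold: float = 0.5, temperature: float = 0.1
) -> dict[str, jax.Array]:
    """Add a smooth 0-to-1 mask under "output" while keeping all input keys."""
    # A sigmoid relaxes the hard test `x > threshold`, so gradients are non-zero.
    mask = jax.nn.sigmoid((data["input"] - threshold) / temperature)
    # Spread the input dict first so downstream operators still see every key.
    return {**data, "output": mask}


data = {"input": jnp.array([0.0, 0.5, 1.0]), "labels": jnp.array([0, 1, 1])}
result = soft_threshold_apply(data)
```

A smaller temperature makes the mask closer to a hard step at the cost of steeper, more localized gradients.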
3. Write Tests First
Tests go in tests/operators/<domain>/test_<module>.py:
import jax
import jax.numpy as jnp
import pytest
from flax import nnx

from diffbio.operators.<domain> import MyOperator, MyOperatorConfig


class TestMyOperator:
    @pytest.fixture
    def config(self) -> MyOperatorConfig:
        return MyOperatorConfig(my_param=1.0)

    @pytest.fixture
    def operator(self, config: MyOperatorConfig) -> MyOperator:
        return MyOperator(config, rngs=nnx.Rngs(42))

    def test_output_shape(self, operator: MyOperator) -> None:
        data = {"input": jnp.ones((10, 20))}
        result, _, _ = operator.apply(data, {}, None)
        assert result["output"].shape == (10, 20)

    def test_differentiability(self, operator: MyOperator) -> None:
        def loss_fn(data):
            result, _, _ = operator.apply(data, {}, None)
            return result["output"].sum()

        data = {"input": jnp.ones((10, 20))}
        grad = jax.grad(loss_fn)(data)
        assert jnp.any(grad["input"] != 0)
        assert jnp.all(jnp.isfinite(grad["input"]))

    def test_jit_compatible(self, operator: MyOperator) -> None:
        data = {"input": jnp.ones((10, 20))}
        eager_result, _, _ = operator.apply(data, {}, None)
        jit_result, _, _ = jax.jit(lambda d: operator.apply(d, {}, None))(data)
        assert jnp.allclose(eager_result["output"], jit_result["output"])
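To see what the non-zero gradient assertion in test_differentiability guards against, the sketch below compares a hard comparison with a smooth surrogate. Both functions are hypothetical and chosen only for illustration; the sigmoid relaxation is one possible smoothing, not necessarily the one a given operator uses.

```python
import jax
import jax.numpy as jnp


def hard_count(x: jax.Array) -> jax.Array:
    """Count entries above 0.5 with a hard comparison (piecewise constant)."""
    return jnp.sum(jnp.where(x > 0.5, 1.0, 0.0))


def soft_count(x: jax.Array) -> jax.Array:
    """Smooth surrogate: a steep sigmoid replaces the comparison."""
    return jnp.sum(jax.nn.sigmoid((x - 0.5) / 0.1))


x = jnp.array([0.2, 0.4, 0.6, 0.8])
# The hard version is flat almost everywhere, so its gradient is all zeros.
hard_grad = jax.grad(hard_count)(x)
# The smooth version yields finite, non-zero gradients that training can use.
soft_grad = jax.grad(soft_count)(x)
```

An operator built on the hard version would run without error yet fail the non-zero gradient check, which is why the test asserts both properties.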
Run tests:
4. Export in __init__.py
Add the operator and config to the domain's __init__.py:
from diffbio.operators.<domain>.my_module import MyOperator, MyOperatorConfig

__all__ = [
    # ... existing exports
    "MyOperator",
    "MyOperatorConfig",
]
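A quick way to catch a name that was added to `__all__` but never imported is to compare the list against the module's attributes. The helper below is a hypothetical sketch, not part of DiffBio, and it is exercised on an in-memory module rather than a real domain package.

```python
import types


def missing_exports(module: types.ModuleType) -> list[str]:
    """Return names listed in module.__all__ that the module does not define."""
    exported = getattr(module, "__all__", [])
    return [name for name in exported if not hasattr(module, name)]


# Hypothetical in-memory module standing in for a domain package.
fake_domain = types.ModuleType("fake_domain")
fake_domain.MyOperator = type("MyOperator", (), {})
fake_domain.__all__ = ["MyOperator", "MyOperatorConfig"]

# MyOperatorConfig is listed but was never assigned, so it is reported.
unresolved = missing_exports(fake_domain)
```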
5. Add Documentation
Three artifacts:
- API reference: Add a mkdocstrings directive to docs/api/operators/<domain>.md.
- User guide: Add a section to docs/user-guide/operators/<domain>.md with overview, quick start, config table, and use cases.
- Concept page: If the operator introduces a new biological concept, add it to the relevant docs/user-guide/concepts/ page.
Running Tests
source ./activate.sh
# Run all tests
uv run pytest tests/ -v
# Run a specific domain
uv run pytest tests/operators/singlecell/ -xvs
# Run with coverage
uv run pytest tests/ -v --cov=src/diffbio --cov-report=term-missing
# Run a single test
uv run pytest tests/operators/alignment/test_profile_hmm.py::TestProfileHMMSearch -xvs
See Testing for details on test patterns and fixtures.
Building Documentation
source ./activate.sh
# Serve locally with live reload
uv run mkdocs serve
# Build static site
uv run mkdocs build
See Example Documentation Design for the example authoring workflow.
Pull Request Process
- Create a feature branch from main
- Write tests, then implement
- Run uv run pre-commit run --all-files; all hooks must pass
- Run uv run pytest tests/ -v; all tests must pass
- Update documentation if adding new operators
- Open a pull request with a clear description
Commit Messages
Types: feat, fix, docs, test, refactor, chore
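The listed types match the Conventional Commits vocabulary. Assuming that subjects follow the shape `type(optional-scope): summary` (an assumption, since the exact format is not shown here), a hypothetical checker could look like the sketch below; `is_valid_subject` is not part of the project's tooling.

```python
import re

ALLOWED_TYPES = ("feat", "fix", "docs", "test", "refactor", "chore")

# Assumed shape: "<type>(<optional scope>): <summary>".
_SUBJECT_PATTERN = re.compile(
    r"^(?P<type>" + "|".join(ALLOWED_TYPES) + r")"
    r"(\((?P<scope>[^()\s]+)\))?"
    r": (?P<summary>\S.*)$"
)


def is_valid_subject(subject: str) -> bool:
    """Return True if the subject line matches the assumed commit format."""
    return _SUBJECT_PATTERN.match(subject) is not None
```

For example, "fix: handle empty batch" passes, while "feature: add thing" does not because "feature" is not one of the listed types.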
Reporting Issues
Bug Reports
Include: DiffBio version, Python version, JAX version, minimal reproduction code, expected vs actual behavior, full traceback.
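The version details can be gathered programmatically and pasted into the report. The sketch below is a hypothetical helper; it assumes the installed distributions are named "diffbio" and "jax", and it reports "not installed" for any name it cannot find rather than failing.

```python
import platform
from importlib import metadata


def collect_versions(packages: tuple[str, ...] = ("diffbio", "jax")) -> dict[str, str]:
    """Map each distribution name to its installed version, plus the Python version."""
    versions = {"python": platform.python_version()}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # Distribution names here are assumptions; report instead of raising.
            versions[name] = "not installed"
    return versions


if __name__ == "__main__":
    for name, version in collect_versions().items():
        print(f"{name}: {version}")
```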
Feature Requests
Include: use case description, proposed API, related existing operators.
File issues at github.com/avitai/DiffBio/issues.