Biological Regularization API¤

Regularization losses based on biological constraints and priors.

BiologicalPlausibilityLoss¤

diffbio.losses.biological_regularization.BiologicalPlausibilityLoss ¤

BiologicalPlausibilityLoss(
    config: BiologicalRegularizationConfig,
    *,
    rngs: Rngs | None = None,
)

Bases: Module

Combined biological plausibility regularization.

Combines multiple regularization terms to encourage biologically plausible sequences and alignments during differentiable optimization.

Parameters:

Name    Type                            Description                                               Default
config  BiologicalRegularizationConfig  BiologicalRegularizationConfig with weights and targets.  required
rngs    Rngs | None                     Flax NNX random number generators.                        None

config instance-attribute ¤

config = config

gc_loss instance-attribute ¤

gc_loss = GCContentRegularization(
    target_gc=target_gc_content,
    tolerance=target_gc_tolerance,
    rngs=rngs,
)

complexity_loss instance-attribute ¤

complexity_loss = SequenceComplexityLoss(
    min_entropy=1.0, rngs=rngs
)

BiologicalRegularizationConfig¤

diffbio.losses.biological_regularization.BiologicalRegularizationConfig dataclass ¤

BiologicalRegularizationConfig(
    gc_content_weight: float = 1.0,
    gap_pattern_weight: float = 1.0,
    complexity_weight: float = 1.0,
    target_gc_content: float = 0.5,
    target_gc_tolerance: float = 0.2,
)

Configuration for biological regularization losses.

Attributes:

Name                 Type   Description
gc_content_weight    float  Weight for GC content regularization.
gap_pattern_weight   float  Weight for gap pattern regularization.
complexity_weight    float  Weight for sequence complexity loss.
target_gc_content    float  Target GC content (typically 0.4-0.6).
target_gc_tolerance  float  Tolerance around target GC content.
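
To make the role of the three weight fields concrete, here is a minimal sketch of a weighted sum of penalty terms. It assumes the combined loss is a plain weighted sum, which this page does not confirm. `RegularizationWeights` and `combine_penalties` are hypothetical names used only for illustration; they are not part of diffbio.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RegularizationWeights:
    """Hypothetical stand-in mirroring the three weight fields documented above."""

    gc_content_weight: float = 1.0
    gap_pattern_weight: float = 1.0
    complexity_weight: float = 1.0


def combine_penalties(weights, gc_penalty, gap_penalty, complexity_penalty):
    """Weighted sum of three already-computed scalar penalties (assumed form)."""
    return (
        weights.gc_content_weight * gc_penalty
        + weights.gap_pattern_weight * gap_penalty
        + weights.complexity_weight * complexity_penalty
    )


# Setting a weight to 0.0 switches the corresponding term off.
weights = RegularizationWeights(gap_pattern_weight=0.0)
total = combine_penalties(weights, gc_penalty=0.1, gap_penalty=5.0, complexity_penalty=0.4)
```

Under this assumption, raising one weight relative to the others makes the optimizer prioritize that constraint.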

GCContentRegularization¤

diffbio.losses.biological_regularization.GCContentRegularization ¤

GCContentRegularization(
    target_gc: float = 0.5,
    tolerance: float = 0.2,
    *,
    rngs: Rngs | None = None,
)

Bases: Module

Regularization loss for GC content.

Penalizes sequences with GC content far from biological norms. For most organisms, GC content ranges from 25% to 75%.

Parameters:

Name       Type         Description                                    Default
target_gc  float        Target GC content (default 0.5 for balanced).  0.5
tolerance  float        Tolerance around target before penalizing.     0.2
rngs       Rngs | None  Flax NNX random number generators.             None

target_gc instance-attribute ¤

target_gc = Param(array(target_gc))

tolerance instance-attribute ¤

tolerance = Param(array(tolerance))
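
The idea of a target with a tolerance band can be illustrated with a small dependency-free sketch. It is not the library's implementation: the A, C, G, T channel order, the squared penalty outside the band, and the function names `gc_fraction` and `gc_band_penalty` are all assumptions made for illustration.

```python
def gc_fraction(probs):
    """Expected GC fraction of a soft sequence.

    probs: list of per-position probability rows, assumed to be in A, C, G, T order.
    """
    gc_mass = sum(row[1] + row[2] for row in probs)  # C and G channels
    return gc_mass / len(probs)


def gc_band_penalty(probs, target_gc=0.5, tolerance=0.2):
    """Zero inside [target_gc - tolerance, target_gc + tolerance], squared excess outside."""
    excess = max(0.0, abs(gc_fraction(probs) - target_gc) - tolerance)
    return excess ** 2


# A uniform soft sequence sits at the target; an all-G sequence lies outside the band.
uniform = [[0.25, 0.25, 0.25, 0.25] for _ in range(8)]
all_g = [[0.0, 0.0, 1.0, 0.0] for _ in range(8)]
penalties = (gc_band_penalty(uniform), gc_band_penalty(all_g))
```

With the defaults, any sequence whose expected GC fraction lies between 0.3 and 0.7 incurs no penalty in this sketch.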

GapPatternRegularization¤

diffbio.losses.biological_regularization.GapPatternRegularization ¤

GapPatternRegularization(
    max_gap_length: int = 10, *, rngs: Rngs | None = None
)

Bases: Module

Regularization loss for gap patterns in alignments.

Penalizes unrealistic gap patterns such as:

- Very long consecutive gaps
- Many scattered small gaps

Parameters:

Name            Type         Description                                     Default
max_gap_length  int          Maximum expected gap length before penalizing.  10
rngs            Rngs | None  Flax NNX random number generators.              None

max_gap_length instance-attribute ¤

max_gap_length = max_gap_length
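
The two gap patterns named above (overlong runs and many scattered gaps) can be sketched on a hard 0/1 gap mask. This is an illustration only: the page does not give the library's formula, the hard mask used here is not differentiable (a differentiable version would presumably work on soft gap probabilities), and `gap_runs`, `gap_pattern_penalty`, and `opening_cost` are hypothetical names.

```python
def gap_runs(gap_mask):
    """Return the lengths of consecutive gap runs in a 0/1 gap mask."""
    runs, current = [], 0
    for is_gap in gap_mask:
        if is_gap:
            current += 1
        elif current:
            runs.append(current)
            current = 0
    if current:  # close a run that reaches the end of the mask
        runs.append(current)
    return runs


def gap_pattern_penalty(gap_mask, max_gap_length=10, opening_cost=1.0):
    """Penalize run length beyond max_gap_length plus a fixed cost per gap run."""
    runs = gap_runs(gap_mask)
    overlong = sum(max(0, run - max_gap_length) for run in runs)
    scattered = opening_cost * len(runs)
    return overlong + scattered


# Two short runs cost two openings; one 12-column run costs one opening plus 2 excess columns.
scattered_mask = [0, 1, 1, 0, 1, 0]
long_mask = [1] * 12
```

The per-run cost is what discourages many small scattered gaps in this sketch, while the excess-length term only activates once a run exceeds `max_gap_length`.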

SequenceComplexityLoss¤

diffbio.losses.biological_regularization.SequenceComplexityLoss ¤

SequenceComplexityLoss(
    min_entropy: float = 1.0, *, rngs: Rngs | None = None
)

Bases: Module

Regularization loss for sequence complexity.

Penalizes low-complexity sequences that might arise from adversarial optimization (e.g., all-A sequences, repetitive patterns).

Uses entropy as a measure of complexity.

Parameters:

Name         Type         Description                             Default
min_entropy  float        Minimum expected entropy per position.  1.0
rngs         Rngs | None  Flax NNX random number generators.      None

min_entropy instance-attribute ¤

min_entropy = Param(array(min_entropy))
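
An entropy floor of this kind can be sketched as follows. The sketch assumes Shannon entropy in nats computed per position on soft (probabilistic) sequences, with a penalty equal to the average shortfall below `min_entropy`. The page does not state the units or the exact reduction the library uses, and `position_entropy` and `complexity_penalty` are hypothetical names.

```python
import math


def position_entropy(row, eps=1e-9):
    """Shannon entropy (nats) of one position's distribution over nucleotides."""
    return -sum(p * math.log(p + eps) for p in row)


def complexity_penalty(probs, min_entropy=1.0):
    """Average shortfall of per-position entropy below min_entropy (assumed form)."""
    shortfalls = [max(0.0, min_entropy - position_entropy(row)) for row in probs]
    return sum(shortfalls) / len(shortfalls)


# A uniform distribution has entropy ln(4), about 1.386 nats, which clears the 1.0 floor.
# A collapsed all-A prediction has entropy near 0 and takes the full penalty.
uniform = [[0.25, 0.25, 0.25, 0.25] for _ in range(6)]
all_a = [[1.0, 0.0, 0.0, 0.0] for _ in range(6)]
```

If the library measures entropy in bits instead, the maximum for four nucleotides would be 2.0 rather than about 1.386, so the meaning of the default `min_entropy=1.0` depends on that choice.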

Usage Example¤

from diffbio.losses import (
    GCContentRegularization,
    SequenceComplexityLoss,
)

# predicted_sequences: soft sequence predictions from your model (defined elsewhere)

# GC content regularization: penalize GC content outside target_gc +/- tolerance
gc_reg = GCContentRegularization(target_gc=0.5, tolerance=0.2)
gc_loss = gc_reg(sequences=predicted_sequences)

# Sequence complexity loss (min_entropy defaults to 1.0)
complexity = SequenceComplexityLoss()
comp_loss = complexity(sequences=predicted_sequences)