
fix: handle infinite values in anomaly score calculation #100

Merged: St3451 merged 1 commit into master from fix/decimal-score-error (May 27, 2026)

Conversation

@St3451 (Collaborator) commented May 27, 2026

When a residue carries an extreme density of mutations, the probability of observing it by chance is so small that it is approximated to zero, and the clustering score becomes infinite. When this happens, Oncodrive3D recomputes the score with the decimal package (get_dcm_anomaly_score), which supports 600-digit precision.
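The underflow described above can be illustrated with a minimal sketch. It does not use Oncodrive3D's own code: the probability values and the neg_log helper are hypothetical, chosen only to show a double-precision product collapsing to zero while the decimal module still represents it.

```python
import math
from decimal import Decimal, getcontext

# 600 digits, matching the precision mentioned in the description.
getcontext().prec = 600

# Hypothetical probabilities, for illustration only.
tail_float = 1e-200 * 1e-200                      # below the double range, becomes 0.0
tail_dcm = Decimal("1e-200") * Decimal("1e-200")  # still representable as 1E-400


def neg_log(p: float) -> float:
    """Hypothetical score: -log(p), reported as inf when p has underflowed to 0."""
    return math.inf if p == 0 else -math.log(p)


score_float = neg_log(tail_float)  # infinite: the float path lost the value
score_dcm = -tail_dcm.ln()         # finite Decimal, roughly 400 * ln(10)

print(score_float, float(score_dcm))
```

The decimal path only postpones the limit: a sufficiently extreme signal can exhaust it as well, which is the situation this PR addresses.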

In the reported case, the signal was so strong that even 600-digit precision was not enough and the precision limit was hit again. This case was not anticipated: dcm_binom_logsf returned np.inf (a float) instead of a Decimal, and the next line attempted float / Decimal, which raised a TypeError.
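The failure mode can be reproduced in isolation. In this sketch the variable names and the denominator value are hypothetical; float("inf") stands in for np.inf, which is the same Python float value.

```python
from decimal import Decimal

numerator = float("inf")      # what the high-precision helper fell back to
denominator = Decimal("3.5")  # hypothetical Decimal operand

try:
    numerator / denominator   # Decimal does not support arithmetic with float
    error = None
except TypeError as exc:
    error = exc

print(type(error).__name__)
```

The decimal module deliberately refuses mixed float and Decimal arithmetic, so the division fails regardless of the values involved.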

closes #87

Fix

Check whether the high-precision result is again inf; if so, return inf as the clustering score. The p-value calculation is unchanged.
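A guard of this kind might look like the sketch below. It is not the merged code: safe_score, its parameters, and the ratio it computes are assumptions made for illustration, and the real get_dcm_anomaly_score may be structured differently.

```python
import math
from decimal import Decimal
from typing import Union


def safe_score(numerator: Union[float, Decimal], denominator: Decimal) -> float:
    """Hypothetical helper: return inf early when the numerator is a float inf."""
    # A float inf here means the high-precision path hit its limit as well.
    if isinstance(numerator, float) and math.isinf(numerator):
        return math.inf
    # Otherwise both operands are Decimal and the division is well defined.
    return float(numerator / denominator)


print(safe_score(float("inf"), Decimal(2)))  # guard taken
print(safe_score(Decimal(6), Decimal(2)))    # regular Decimal division
```

Returning early keeps the regular Decimal path untouched, so only the case that previously crashed changes behavior.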

Tests

  • Re-run the failing cohort and confirm that it completes without the TypeError
  • Check that the previously failing gene appears in the gene-level output with Status = Processed and a significant p-value

Copilot AI review requested due to automatic review settings May 27, 2026 10:07

Copilot AI left a comment

Pull request overview

This PR prevents a crash in the high-precision (decimal) anomaly score recomputation path when the computed log survival function underflows and dcm_binom_logsf returns np.inf (a float), which previously led to a float / Decimal TypeError. It aligns with issue #87 by ensuring extreme clustering signals can be represented as an infinite clustering score rather than terminating execution.

Changes:

  • Add a guard in get_dcm_anomaly_score to detect an infinite (float) numerator and return np.inf early.
  • Preserve existing p-value computation behavior; only the clustering score recomputation path is made resilient to precision-limit overflow.

@St3451 St3451 merged commit 867a85f into master May 27, 2026
1 check passed

Development

Successfully merging this pull request may close these issues.

Unsupported division

2 participants