fix: report No_mutability genes correctly and avoid UnboundLocalError #101

Merged: St3451 merged 2 commits into master from fix/no-mutability-status-reporting on May 27, 2026

Conversation

St3451 (Collaborator) commented May 27, 2026

When running with mutabilities, Oncodrive3D filters the sequences dataframe to keep only the genes with reference info available (needed to compute mutabilities). Genes dropped at this step should be reported in the output with status No_mutability. However, there was an order-of-operations bug: the list of dropped genes was built after the filter had already been applied, so it always ended up empty. As a result, the No_mutability status had effectively never been triggered in any run, and dropped genes silently disappeared from the output.

This became visible when a user (running Oncodrive3D within deepCSA) hit a case where all genes were dropped at this step. With nothing left to cluster and nothing reported as filtered out, the code fell into a branch that referenced a variable that was never assigned → UnboundLocalError.
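
The failure mode described above can be reproduced in miniature. This is only a sketch in plain Python, not the Oncodrive3D code: `dropped_genes_buggy`, `summarize`, and the `result` variable are hypothetical names chosen for illustration, and lists stand in for the sequences dataframe.

```python
def dropped_genes_buggy(genes_to_process, genes_with_reference):
    # Hypothetical reconstruction of the bug: the filter runs first ...
    genes_to_process = [g for g in genes_to_process if g in genes_with_reference]
    # ... so the "dropped" list is computed from survivors only and is always empty.
    return [g for g in genes_to_process if g not in genes_with_reference]


def summarize(processed, filtered_out):
    # 'result' is only assigned when at least one input is non-empty.
    if processed:
        result = f"{len(processed)} processed"
    elif filtered_out:
        result = f"{len(filtered_out)} filtered"
    return result  # UnboundLocalError when both inputs are empty


genes = ["TP53", "KRAS"]
reference = set()  # no gene has reference info, so every gene should be dropped

dropped = dropped_genes_buggy(genes, reference)
print(dropped)  # [] even though both genes were dropped

try:
    summarize([], dropped)
except UnboundLocalError as err:
    print(type(err).__name__)  # UnboundLocalError
```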

closes #86

Fix

  • Swap the order so dropped genes are captured before genes_to_process is filtered. They will now be correctly reported with status No_mutability, and the crash goes away as a side effect.
  • Populate Mut_in_gene with the actual mutation count for No_mutability entries (was NaN), matching how the other non-processed status blocks (No_mut, No_ID_mapping, Fragmented) already behave.
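
The two changes can be sketched as follows. This is a simplified illustration under stated assumptions, not the merged diff: plain lists replace the dataframes, the helper names `split_by_reference` and `no_mutability_rows` are hypothetical, and the `Gene` and `Status` keys are assumed column names (only `Mut_in_gene`, `No_mutability`, `genes_to_process`, and `genes_not_mutability` appear in this PR).

```python
from collections import Counter


def split_by_reference(genes_to_process, genes_with_reference):
    reference = set(genes_with_reference)
    # Capture the dropped genes BEFORE genes_to_process is filtered.
    genes_not_mutability = [g for g in genes_to_process if g not in reference]
    genes_to_process = [g for g in genes_to_process if g in reference]
    return genes_to_process, genes_not_mutability


def no_mutability_rows(genes_not_mutability, mutated_genes):
    # One entry in mutated_genes per observed mutation (assumed input shape).
    counts = Counter(mutated_genes)
    return [
        {"Gene": g, "Mut_in_gene": counts[g], "Status": "No_mutability"}
        for g in genes_not_mutability
    ]


kept, dropped = split_by_reference(["TP53", "KRAS", "EGFR"], ["KRAS"])
rows = no_mutability_rows(dropped, ["TP53", "TP53", "EGFR", "KRAS"])
print(kept)  # ['KRAS']
print(rows)
```

With every gene dropped, `kept` is empty but `dropped` is not, so a downstream report still has rows to emit instead of nothing at all.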

Copilot AI review requested due to automatic review settings May 27, 2026 10:58

Copilot AI left a comment

Pull request overview

This PR fixes reporting for genes that are skipped due to missing mutability information during clustering, ensuring they are correctly identified and their mutation counts are populated.

Changes:

  • Fixes the logic that computes genes_not_mutability so it’s derived from the pre-filter genes_to_process list (previously it could be incorrectly empty).
  • Improves membership checks by caching seq_df gene values into a set for faster lookups.
  • Populates Mut_in_gene for No_mutability genes using the actual per-gene mutation counts instead of NaN.
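
The set-caching point can be illustrated with a short sketch. The function name `genes_missing_from` and the `seq_genes` list (a stand-in for the gene column of `seq_df`) are assumptions for illustration only.

```python
def genes_missing_from(seq_genes, genes_to_process):
    # Build the set once; each 'in' check is then an average O(1) hash lookup
    # instead of a scan over the whole gene column.
    seq_gene_set = set(seq_genes)
    return [g for g in genes_to_process if g not in seq_gene_set]


seq_genes = ["KRAS", "KRAS", "BRAF"]  # a gene column may contain duplicates
print(genes_missing_from(seq_genes, ["TP53", "KRAS", "EGFR"]))  # ['TP53', 'EGFR']
```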

St3451 changed the title from "Fix/no mutability status reporting" to "fix: report No_mutability genes correctly and avoid UnboundLocalError" on May 27, 2026
St3451 merged commit 45a2269 into master on May 27, 2026
1 check passed

Development

Successfully merging this pull request may close these issues.

Variable referenced before assignment
