38 commits
d830e16
Add python script for splitting FASTA, chunking if necessary
markquintontulloch Jan 30, 2026
f89d0b2
Add pytest tests for split_fasta.py
markquintontulloch Jan 30, 2026
cfe0b4e
Add nextflow module and tests for running split_fasta.py
markquintontulloch Jan 30, 2026
acfe54f
Add python script for splitting FASTA, chunking if necessary
markquintontulloch Jan 30, 2026
8a6adaa
Add pytest tests for split_fasta.py
markquintontulloch Jan 30, 2026
1dbf7eb
Add Nextflow module and tests for running split_fasta.py
markquintontulloch Jan 30, 2026
2e62385
Remove accidentally committed Python bytecode files
markquintontulloch Jan 30, 2026
af21a08
Docstring updates and minor pytest refactor
markquintontulloch Jan 30, 2026
ac505b8
Header updates
markquintontulloch Feb 2, 2026
1ee480d
Moved python stuff to ensembl-genomio
markquintontulloch Feb 3, 2026
66550dc
Test fixes
markquintontulloch Feb 3, 2026
da555a1
Actually remove python script!
markquintontulloch Feb 3, 2026
ad779fd
Update call to splitting script
markquintontulloch Feb 3, 2026
1934c1f
Add FASTA recombination tests
markquintontulloch Feb 10, 2026
225b68a
Refactor for manifest input to recombine module
markquintontulloch Feb 12, 2026
40ed523
Various fixes
markquintontulloch Feb 13, 2026
410a944
Add repeats/combine_json module
markquintontulloch Feb 19, 2026
7bfe4c6
Handle ncRNA features as well as repeats
markquintontulloch Feb 23, 2026
e5bdeb2
Naming update
markquintontulloch Feb 23, 2026
13be40f
Merge branch 'main' into ENSGENOMIO-18
markquintontulloch Mar 2, 2026
fda3137
Add version.yml to output
markquintontulloch Mar 2, 2026
b114773
Remove outdated files
markquintontulloch Mar 11, 2026
824066c
Remove use of test data
markquintontulloch Mar 11, 2026
1405de9
Move dynamic memory allocation to pipeline
markquintontulloch Mar 16, 2026
b00d0ac
Update output filename
markquintontulloch Mar 20, 2026
78e789d
Add meta files
markquintontulloch May 12, 2026
2883bc0
Linting updates
markquintontulloch May 12, 2026
6c07eb1
Code review update
markquintontulloch May 14, 2026
4e6e53f
Remove blank line
markquintontulloch May 14, 2026
a4d4854
Use single line version cmd for param
markquintontulloch May 14, 2026
59e24c4
Use command directly within eval
markquintontulloch May 14, 2026
3d7c7ef
Merge branch 'main' into ENSGENOMIO-18
markquintontulloch May 14, 2026
30961a3
Update snapshots
markquintontulloch May 14, 2026
b771306
Linting fixes
markquintontulloch May 14, 2026
b50cfba
Use package versions
markquintontulloch May 15, 2026
adb281a
Update meta.yml
markquintontulloch May 15, 2026
a14b6c5
Bump genomio version in snapshots
markquintontulloch May 15, 2026
ef8a3ea
Update versioning for fasta_recombine
markquintontulloch May 19, 2026
3 changes: 3 additions & 0 deletions .gitignore
@@ -1,2 +1,5 @@
.nextflow*
.nf-test*
__pycache__/
*.pyc
.python-version
6 changes: 6 additions & 0 deletions modules/ensembl/fasta/recombine/environment.yml
@@ -0,0 +1,6 @@
---
channels:
  - conda-forge
  - bioconda
dependencies:
  - ensembl-genomio=1.6.1
Collaborator

Suggested change
-  - ensembl-genomio=1.6.1
+  - ensembl-genomio=1.6.2

Author

As above

66 changes: 66 additions & 0 deletions modules/ensembl/fasta/recombine/main.nf
@@ -0,0 +1,66 @@
// See the NOTICE file distributed with this work for additional information
// regarding copyright ownership.
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.

process FASTA_RECOMBINE {

Collaborator

Suggested change
(empty suggestion, i.e. delete the commented blank line)

Author

Removed

    tag "${meta.id}"
    label 'process_medium'

    conda "${moduleDir}/environment.yml"
    container "${workflow.containerEngine in ['singularity', 'apptainer'] && !task.ext.singularity_pull_docker_container
        ? 'https://depot.galaxyproject.org/singularity/ensembl-genomio:1.6.1--pyhdfd78af_0'
        : 'quay.io/biocontainers/ensembl-genomio:1.6.1--pyhdfd78af_0'}"

    input:
    tuple val(meta), path(fasta_manifest), path(agp)

    output:
    tuple val(meta), path("${meta.id}.fa"), emit: recombined_fasta
    tuple val("${task.process}"), val('fasta_recombine'), eval("fasta_recombine --version"), emit: versions_fasta_recombine, topic: versions

    when:
    task.ext.when == null || task.ext.when

    script:
    def args = []

    if (params.chunk_id_regex) {
        def rx = params.chunk_id_regex.replace("'", "'\"'\"'")
        args << "--chunk-id-regex '${rx}'"
    }

    if (params.allow_revcomp) {
        args << "--allow-revcomp"
    }

    def has_agp = agp && agp.baseName != 'NO_FILE'
    if (has_agp) {
        args << "--agp-file ${agp}"
    }

    def out_fasta = "${meta.id}.fa"
    """
    fasta_recombine \\
        --fasta-manifest ${fasta_manifest} \\
        --out-fasta ${out_fasta} \\
        ${args.join(' ')}
    """

    stub:
    """
    out_fa="${meta.id}.fa"
    touch "\$out_fa"
    """
}
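Note on the script block above: the regex is wrapped in single quotes for the shell, and any embedded single quote is rewritten as '"'"' (close the quote, emit a double-quoted single quote, reopen the quote). The AGP flag is only added when the staged file is not the NO_FILE placeholder. The following is a minimal Python sketch of that argument-building logic, not part of the PR. The helper names `sh_single_quote` and `build_command` and the sample regex are hypothetical, and `PurePath.stem` stands in for Nextflow's `baseName`. `shlex.split` is used only to check that a POSIX shell would hand the original regex back to the tool unchanged.

```python
import shlex
from pathlib import PurePath


def sh_single_quote(value: str) -> str:
    """Wrap value in single quotes, rewriting each embedded ' as '"'"'."""
    return "'" + value.replace("'", "'\"'\"'") + "'"


def build_command(manifest, out_fasta, chunk_id_regex=None,
                  allow_revcomp=False, agp=None):
    """Hypothetical Python mirror of the FASTA_RECOMBINE script block."""
    args = []
    if chunk_id_regex:
        args.append("--chunk-id-regex " + sh_single_quote(chunk_id_regex))
    if allow_revcomp:
        args.append("--allow-revcomp")
    # Assumption: PurePath.stem plays the role of Nextflow's baseName here.
    if agp and PurePath(agp).stem != "NO_FILE":
        args.append(f"--agp-file {agp}")
    head = f"fasta_recombine --fasta-manifest {manifest} --out-fasta {out_fasta}"
    return " ".join([head, *args])


# Made-up regex containing a single quote, to exercise the escaping.
regex = r"^(.+)_chunk'(\d+)$"
cmd = build_command("manifest.txt", "test.fa",
                    chunk_id_regex=regex, agp="NO_FILE")

# Tokenise the command the way a POSIX shell would.
tokens = shlex.split(cmd)
print(cmd)
print(tokens[-1] == regex)       # the regex survives quoting intact
print("--agp-file" in tokens)    # the NO_FILE placeholder adds no AGP flag
```

Interpolating the path arguments unquoted, as the module does, relies on Nextflow staging files under names without spaces or shell metacharacters; only the user-supplied regex needs the explicit escaping.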
71 changes: 71 additions & 0 deletions modules/ensembl/fasta/recombine/meta.yml
@@ -0,0 +1,71 @@
name: "fasta_recombine"
description: Recombine split FASTA sequences into a single FASTA file,
  optionally using an AGP file.
keywords:
  - ensembl
  - fasta
  - genomics
  - genomio
  - recombine
tools:
  - "fasta_recombine":
      description: "Recombine split FASTA sequences generated by ensembl-genomio."
      homepage: "https://github.com/Ensembl/ensembl-genomio"
      licence:
        - "Apache License version 2.0"
      identifier: ""
input:
  - - meta:
        type: map
        description: |
          Groovy Map containing meta information
          e.g. `[ id:'accession1' ]`
    - fasta_manifest:
        type: file
        description: Manifest file listing split FASTA files to recombine.
        pattern: "*.txt"
        ontologies: []
    - agp:
        type: file
        description:
          Optional AGP file describing how split sequence chunks should
          be recombined. Use NO_FILE when not required.
        pattern: "*.{agp,NO_FILE}"
        ontologies: []
output:
  recombined_fasta:
    - - meta:
          type: map
          description: |
            Groovy Map containing meta information
            e.g. `[ id:'accession1' ]`
      - ${meta.id}.fa:
          type: file
          description: Recombined FASTA file.
          pattern: "*.fa"
          ontologies: []
  versions_fasta_recombine:
    - - ${task.process}:
          type: string
          description: The name of the process.
      - fasta_recombine:
          type: string
          description: The name of the tool.
      - ? fasta_recombine --version
        : type: eval
          description: The expression to obtain the version of the tool
topics:
  versions:
    - - ${task.process}:
          type: string
          description: The name of the process.
      - fasta_recombine:
          type: string
          description: The name of the tool.
      - ? fasta_recombine --version
        : type: eval
          description: The expression to obtain the version of the tool
authors:
  - "ensembl-dev@ebi.ac.uk"
maintainers:
  - "ensembl-dev@ebi.ac.uk"
86 changes: 86 additions & 0 deletions modules/ensembl/fasta/recombine/tests/main.nf.test
@@ -0,0 +1,86 @@
// See the NOTICE file distributed with this work for additional information
// regarding copyright ownership.
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.

// nf-core modules test fasta/recombine
nextflow_process {

    name "Test Process FASTA_RECOMBINE"
    script "../main.nf"
    process "FASTA_RECOMBINE"

    tag "modules"
    tag "modules_ensembl"
    tag "fasta"
    tag "fasta/recombine"

    test("stub outputs: header mode") {

        when {
            options "-stub"

            process {
                """
                def manifest = file("manifest.txt")
                manifest.text = "x\\n"

                def no_file = file("NO_FILE")
                no_file.text = ""

                input[0] = [
                    [ id: 'test' ],
                    manifest,
                    no_file
                ]
                """
            }
        }

        then {
            assert process.trace.tasks().size() == 1
            assert process.out.recombined_fasta.size() == 1
            assert process.success
            assert snapshot(process.out).match()
        }
    }

    test("stub outputs: AGP mode") {

        when {
            options "-stub"

            process {
                """
                def manifest = file("manifest.txt")
                manifest.text = "x\\n"

                def agp = file("test.agp")
                agp.text = ""
                input[0] = [
                    [ id: 'test' ],
                    manifest,
                    agp
                ]
                """
            }
        }

        then {
            assert process.trace.tasks().size() == 1
            assert process.out.recombined_fasta.size() == 1
            assert process.success
            assert snapshot(process.out).match()
        }
    }
}
84 changes: 84 additions & 0 deletions modules/ensembl/fasta/recombine/tests/main.nf.test.snap
@@ -0,0 +1,84 @@
{
    "stub outputs: AGP mode": {
        "content": [
            {
                "0": [
                    [
                        {
                            "id": "test"
                        },
                        "test.fa:md5,d41d8cd98f00b204e9800998ecf8427e"
                    ]
                ],
                "1": [
                    [
                        "FASTA_RECOMBINE",
                        "fasta_recombine",
                        "1.6.3"
                    ]
                ],
                "recombined_fasta": [
                    [
                        {
                            "id": "test"
                        },
                        "test.fa:md5,d41d8cd98f00b204e9800998ecf8427e"
                    ]
                ],
                "versions_fasta_recombine": [
                    [
                        "FASTA_RECOMBINE",
                        "fasta_recombine",
                        "1.6.3"
                    ]
                ]
            }
        ],
        "timestamp": "2026-05-14T14:39:11.350698",
        "meta": {
            "nf-test": "0.9.4",
            "nextflow": "25.10.3"
        }
    },
    "stub outputs: header mode": {
        "content": [
            {
                "0": [
                    [
                        {
                            "id": "test"
                        },
                        "test.fa:md5,d41d8cd98f00b204e9800998ecf8427e"
                    ]
                ],
                "1": [
                    [
                        "FASTA_RECOMBINE",
                        "fasta_recombine",
                        "1.6.3"
                    ]
                ],
                "recombined_fasta": [
                    [
                        {
                            "id": "test"
                        },
                        "test.fa:md5,d41d8cd98f00b204e9800998ecf8427e"
                    ]
                ],
                "versions_fasta_recombine": [
                    [
                        "FASTA_RECOMBINE",
                        "fasta_recombine",
                        "1.6.3"
                    ]
                ]
            }
        ],
        "timestamp": "2026-05-14T14:39:09.216174",
        "meta": {
            "nf-test": "0.9.4",
            "nextflow": "25.10.3"
        }
    }
}
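Note on the snapshot above: both tests run with -stub, so the process only touches test.fa, and the recorded checksum d41d8cd98f00b204e9800998ecf8427e is the MD5 of zero bytes. The snapshots therefore check the output wiring and the reported version, not any recombined sequence content. A small Python sketch (not part of the PR) reproduces the digest for an empty file; the temporary directory is just an illustrative stand-in for the task work directory.

```python
import hashlib
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    stub_output = Path(tmp) / "test.fa"
    stub_output.touch()  # same effect as the stub's `touch "$out_fa"`
    digest = hashlib.md5(stub_output.read_bytes()).hexdigest()

print(digest)  # d41d8cd98f00b204e9800998ecf8427e
```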
6 changes: 6 additions & 0 deletions modules/ensembl/fasta/split/environment.yml
@@ -0,0 +1,6 @@
---
channels:
  - conda-forge
  - bioconda
dependencies:
  - ensembl-genomio=1.6.1
Collaborator

Suggested change
-  - ensembl-genomio=1.6.1
+  - ensembl-genomio=1.6.2

Author

As above
