Find a way to not load all the tasks infos. #709

@thomasw21

Running from promptsource.seqio_tasks import tasks takes a huge amount of time. One of the main reasons is that it queries the dataset info for every dataset:

dataset_splits = utils.get_dataset_splits(dataset_name, subset_name)

This is problematic for two reasons:

IMO both are unnecessary and should be fixed. Is there a reason why one cannot load seqio tasks dynamically, in the sense of fetching only what is necessary? Something along the lines of:

def add_seqio_task(task_name):
    seqio.TaskRegistry.add(...)
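
To make the idea a bit more concrete, here is a rough sketch of lazy registration. It is only an illustration and not how promptsource is implemented: TASK_REGISTRY is a plain dict standing in for seqio.TaskRegistry, get_dataset_splits is a stub standing in for utils.get_dataset_splits, and TASK_INDEX and the task names in it are made up. The point is that importing the module does no dataset queries, and each task only triggers the query for its own dataset the first time it is requested.

```python
from typing import Dict, List, Optional, Tuple

# Records every (dataset, subset) lookup so the laziness can be observed.
SPLIT_QUERIES: List[Tuple[str, Optional[str]]] = []


def get_dataset_splits(dataset_name: str, subset_name: Optional[str]) -> Dict[str, int]:
    """Stub for utils.get_dataset_splits (assumption: the real call is the slow one)."""
    SPLIT_QUERIES.append((dataset_name, subset_name))
    return {"train": 0, "validation": 0}  # placeholder values, not real split sizes


# Stand-in for seqio.TaskRegistry, reduced to a dict for this sketch.
TASK_REGISTRY: Dict[str, dict] = {}

# Hypothetical cheap index built at import time: task name -> (dataset, subset).
# Building it requires no dataset info queries.
TASK_INDEX: Dict[str, Tuple[str, Optional[str]]] = {
    "super_glue_boolq_example": ("super_glue", "boolq"),
    "ag_news_example": ("ag_news", None),
}


def add_seqio_task(task_name: str) -> dict:
    """Register a single task on first request, fetching only its own dataset info."""
    if task_name in TASK_REGISTRY:
        # Already registered: no second query.
        return TASK_REGISTRY[task_name]
    dataset_name, subset_name = TASK_INDEX[task_name]
    splits = get_dataset_splits(dataset_name, subset_name)
    # With seqio, this is where seqio.TaskRegistry.add(...) would be called.
    TASK_REGISTRY[task_name] = {"name": task_name, "splits": splits}
    return TASK_REGISTRY[task_name]
```

Calling add_seqio_task("ag_news_example") would then query only ag_news, and calling it again would reuse the registered entry. Whether seqio mixtures can work with tasks registered this way is a separate question that this sketch does not address.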
