
How to get train-splits.txt and valid-splits.txt before training tr11-176B-ml #66

@robinfang7

  • Big Science version: latest
  • Python version: 3.8.8
  • Operating System: Ubuntu 20.04.5 LTS

Description

How do I obtain the train-splits.txt and valid-splits.txt files referenced at line 39 of train/tr11-176B-ml/tr11-176B-ml.slurm? Thanks.
TRAIN_DATA_PATH=$MEGATRON_DEEPSPEED_REPO/data/train-splits.txt
VALID_DATA_PATH=$MEGATRON_DEEPSPEED_REPO/data/valid-splits.txt
