Skip to content

Add --ner-label-index option to validate#9

Open
claire-yq wants to merge 3 commits into
mainfrom
ner-label-index
Open

Add --ner-label-index option to validate#9
claire-yq wants to merge 3 commits into
mainfrom
ner-label-index

Conversation

@claire-yq
Copy link
Copy Markdown
Collaborator

Modifies the validate method by adding an optional --ner-label-index flag, which specifies the column for the NER label. The default value is -1.

Copy link
Copy Markdown
Member

@ConstantineLignos ConstantineLignos left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some quick comments to look at before we meet tomorrow. Can you check what is causing the test failures as well?

Comment thread seqscore/conll.py Outdated
label = splits[-1]
other_fields = tuple(splits[1:-1])
label = splits[ner_label_index]
other_fields = tuple(splits[1:ner_label_index] + splits[ner_label_index + 1:])
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is unfortunately going to mean we need a larger change and might need to abandon the current behavior for other_fields. Let's go over it tomorrow.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The logic here is what was causing the test failures (when ner_label_index = -1, ner_label_index + 1 = 0, so all the fields get copied into other_fields). I fixed that issue, but I'm guessing you're talking about the other issue where the order of `other_fields' might be incorrect?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I think the original logic is going to need to be updated since it makes a lot of assumptions.

Comment thread tests/test_validation_click.py Outdated
runner = CliRunner()
result = runner.invoke(
validate,
["--labels", "BIO", "--ner-label-index", 1, os.path.join("tests", "conll_annotation", "labels_not_last_col.bio")],
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add another test with the index as -2 instead so we can test positive and negative indices.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants