Add --ner-label-index option to validate#9
Conversation
ConstantineLignos
left a comment
There was a problem hiding this comment.
Some quick comments to look at before we meet tomorrow. Can you check what is causing the test failures as well?
| label = splits[-1] | ||
| other_fields = tuple(splits[1:-1]) | ||
| label = splits[ner_label_index] | ||
| other_fields = tuple(splits[1:ner_label_index] + splits[ner_label_index + 1:]) |
There was a problem hiding this comment.
This is unfortunately going to mean we need a larger change and might need to abandon the current behavior for other_fields. Let's go over it tomorrow.
There was a problem hiding this comment.
The logic here is what was causing the test failures (when ner_label_index = -1, ner_label_index + 1 = 0, so all the fields get copied into other_fields). I fixed that issue, but I'm guessing you're talking about the other issue where the order of `other_fields' might be incorrect?
There was a problem hiding this comment.
Yes, I think the original logic is going to need to be updated since it makes a lot of assumptions.
| runner = CliRunner() | ||
| result = runner.invoke( | ||
| validate, | ||
| ["--labels", "BIO", "--ner-label-index", 1, os.path.join("tests", "conll_annotation", "labels_not_last_col.bio")], |
There was a problem hiding this comment.
Add another test with the index as -2 instead so we can test positive and negative indices.
Modifies the
validatemethod by adding an optional--ner-label-indexflag, which specifies the column for the NER label. The default value is -1.