parsing of escaped double quotes in double quote delimited fields fails #487

@detlevd

Describe the bug
When a BibTeX field is delimited by double quotes, an embedded double quote has to be written as {"} (see page 20 of https://tug.ctan.org/info/bibtex/tamethebeast/ttb_en.pdf). v2.0.0b7, however, can't cope with such fields.
I can't see an easy fix within the current parsing technique.

Reproducing

Version: 2.0.0b7

Code:

import bibtexparser
# title according to page 20 of https://tug.ctan.org/info/bibtex/tamethebeast/ttb_en.pdf
bibentrytext = '''
@inproceedings{quotingproblem,
        pages = "23--26",
        title = "Comments on {"}Filenames and Fonts{"}",
}
'''
library = bibtexparser.parse_string(bibentrytext)
new_bibtex_str = bibtexparser.write_string(library)
print(new_bibtex_str)

Bibtex:

@inproceedings{quotingproblem,
        pages = "23--26",
        title = "Comments on {"}Filenames and Fonts{"}",
}

Workaround
Find such fields by hand and switch them to {...} delimiters. Since long or multiline fields (abstract, long titles, ...) might be affected, this can't easily be done reliably with a few regular expressions.
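
A minimal sketch of how this workaround could be automated without regular expressions, by tracking brace depth while scanning the raw text. It is a pre-processing step on the string and does not touch bibtexparser itself; the helper names `find_closing_quote` and `requote_fields` are hypothetical. It assumes balanced braces, and it does not handle values concatenated with `#`.

```python
def find_closing_quote(text: str, start: int) -> int:
    """Return the index of the quote closing the value opened at `start`.

    A double quote inside braces (e.g. {"}) is literal and does not end
    the value. Returns -1 if no closing quote is found.
    """
    depth = 0
    for i in range(start + 1, len(text)):
        ch = text[i]
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
        elif ch == '"' and depth == 0:
            return i
    return -1


def requote_fields(bibtex: str) -> str:
    """Rewrite `field = "..."` values as `field = {...}` (hypothetical helper)."""
    out = []
    depth = 0  # brace depth; field values of an entry sit at depth 1
    i = 0
    while i < len(bibtex):
        ch = bibtex[i]
        if ch == '"' and depth == 1:
            # Only treat the quote as an opening delimiter if it follows '='.
            j = i - 1
            while j >= 0 and bibtex[j].isspace():
                j -= 1
            if j >= 0 and bibtex[j] == "=":
                end = find_closing_quote(bibtex, i)
                if end != -1:
                    out.append("{" + bibtex[i + 1:end] + "}")
                    i = end + 1
                    continue
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
        out.append(ch)
        i += 1
    return "".join(out)


entry = '''
@inproceedings{quotingproblem,
        pages = "23--26",
        title = "Comments on {"}Filenames and Fonts{"}",
}
'''
fixed = requote_fields(entry)
print(fixed)
```

The converted text keeps the inner {"} groups as they are and only swaps the outer delimiters, so the title becomes {Comments on {"}Filenames and Fonts{"}}. Whether the parser then accepts the result still needs to be checked against the version in use.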

Remaining Questions (Optional)
Please tick all that apply:

  • I would be willing to contribute a PR to fix this issue.
  • This issue is a blocker, I'd be grateful for an early fix.
