Skip to content

Update timing parsing regex to account for scientific notation.#856

Open
chinandrew wants to merge 1 commit into
stan-dev:developfrom
chinandrew:timing_scientific_notation
Open

Update timing parsing regex to account for scientific notation.#856
chinandrew wants to merge 1 commit into
stan-dev:developfrom
chinandrew:timing_scientific_notation

Conversation

@chinandrew
Copy link
Copy Markdown

Submission Checklist

  • Run unit tests
  • Declare copyright holder and open-source license: see below

Summary

The regex in parse_timing_lines() has a bug where if scientific notation is present, only the exponent is captured:

In [1]: from cmdstanpy.utils.stancsv import parse_timing_lines

In [2]: lines = [
   ...:     b'# \n',
   ...:     b'#  Elapsed Time: 630099 seconds (Warm-up)\n',
   ...:     b'#                435856 seconds (Sampling)\n',
   ...:     b'#                1.06595e+06 seconds (Total)\n',
   ...:     b'# \n'
   ...: ]

In [3]: parse_timing_lines(lines)
Out[3]: {'Warm-up': 630099.0, 'Sampling': 435856.0, 'Total': 6.0}

This change updates the regex for the first capture group to include e, +, and -, so the output becomes

In [3]: parse_timing_lines(lines)
Out[3]: {'Warm-up': 630099.0, 'Sampling': 435856.0, 'Total': 1065950.0}

Not sure if support for - is really needed in practice, but I added it for completeness. In keeping with the existing escaped . in the regex, I also escape + and -, though I believe no escapes are actually needed in the character class, i.e. ([\d.e+-]+) is valid.

Note that this can lead to the case where the Warm-Up + Sampling != Total due to truncation from the scientific notation. In this example, 1.06595e+06 gets parsed to 1065950.0 but Warm-Up + Sampling is 630099.0 + 435856.0 = 1065955.0. Arguably this is not a parsing issue since it's correctly reading the comment, but perhaps some additional logic to set Total = Warm-Up + Sampling might be useful.

Copyright and Licensing

Please list the copyright holder for the work you are submitting (this will be you or your assignee, such as a university or company):

Myself

By submitting this pull request, the copyright holder is agreeing to license the submitted work under the following licenses:

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant