Optimize Time.xmlschema by ~4x #66

Open
byroot wants to merge 1 commit into ruby:master from byroot:opt-xmlschema
@byroot (Member) commented Jun 22, 2026

`Time.xmlschema` is currently fairly slow because it relies on a Regexp match, which isn't very fast and allocates a lot.

Since Ruby 3.2, `Time.new(iso_string)` is supported, and it is over 8x faster than `Time.xmlschema`.
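For readers who haven't used the string form yet, here is a minimal sketch of what `Time.new` does with an ISO 8601 string. It assumes Ruby 3.2 or newer; the timestamp is an illustrative value, not one taken from the benchmark.

```ruby
# Assumes Ruby >= 3.2, where Time.new accepts a single string argument.
# The timestamp below is an illustrative value chosen for this sketch.
iso = "2026-06-22T18:03:00.250000+09:00"

t = Time.new(iso)

# Date and time fields are read straight from the string.
parsed_fields = [t.year, t.month, t.day, t.hour, t.min, t.sec]

# The fractional part and the UTC offset are preserved as well.
parsed_usec   = t.usec
parsed_offset = t.utc_offset # seconds east of UTC
```

Note that `Time.new` does its own parsing and is not guaranteed to accept or reject exactly the same inputs as `Time.xmlschema`, which is why the regexp check is kept in front of it.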

So if we use the regexp with `match?` to avoid the allocations and then delegate to `.new`, we can make `Time.xmlschema` around 4 times faster:

ruby 4.0.5 (2026-05-20 revision 64336ffd0e) +YJIT +PRISM [arm64-darwin25]
Warming up --------------------------------------
           xmlschema    67.645k i/100ms
            Time.new   531.202k i/100ms
                 opt   263.541k i/100ms
Calculating -------------------------------------
           xmlschema    680.426k (± 0.7%) i/s    (1.47 μs/i) -      3.450M in   5.070197s
            Time.new      5.736M (± 0.9%) i/s  (174.34 ns/i) -     28.685M in   5.000832s
                 opt      2.779M (± 1.1%) i/s  (359.78 ns/i) -     13.968M in   5.025252s

Comparison:
xmlschema:   680426.2 i/s
 Time.new:  5736027.1 i/s - 8.43x  faster
      opt:  2779497.0 i/s - 4.08x  faster
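To illustrate where the saved allocations come from, here is a small sketch contrasting `=~` with `match?`. The pattern is a simplified, hypothetical one used only for this illustration, and the helper names are made up.

```ruby
# Hypothetical, simplified pattern used only for this illustration.
def date_pattern
  /\A(\d+)-(\d\d)-(\d\d)\z/
end

# `=~` populates $~ (and $1, $2, ...) with a MatchData object on success.
def captures_via_match_operator(str)
  date_pattern =~ str
  $~ ? $~.captures : nil
end

# `match?` only answers true/false and leaves $~ untouched,
# so no MatchData has to be built for the caller.
def last_match_after_match_p(str)
  matched = date_pattern.match?(str)
  [matched, $~]
end
```

Because `match?` exposes no captures, the fields have to be parsed again by whatever the string is delegated to, which is `Time.new` in this PR.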
<details>
<summary>benchmark</summary>

```ruby
require 'time'
require 'benchmark/ips'

class Time
  class << self
    # Current implementation: capture every field with `=~`, then rebuild the Time.
    def baseline(time)
      if /\A\s*
          (-?\d+)-(\d\d)-(\d\d)
          T
          (\d\d):(\d\d):(\d\d)
          (\.\d+)?
          (Z|[+-]\d\d(?::?\d\d)?)?
          \s*\z/ix =~ time
        year = $1.to_i
        mon = $2.to_i
        day = $3.to_i
        hour = $4.to_i
        min = $5.to_i
        sec = $6.to_i
        usec = 0
        if $7
          usec = Rational($7) * 1000000
        end
        if $8
          zone = $8
          off = zone_offset(zone)
          year, mon, day, hour, min, sec =
            apply_offset(year, mon, day, hour, min, sec, off)
          t = self.utc(year, mon, day, hour, min, sec, usec)
          force_zone!(t, zone, off)
          t
        else
          self.local(year, mon, day, hour, min, sec, usec)
        end
      else
        raise ArgumentError.new("invalid xmlschema format: #{time.inspect}")
      end
    end

    # Proposed implementation: validate with `match?` (no MatchData allocated),
    # then let Time.new do the actual parsing.
    def match_new(time)
      if /\A\s*
          (-?\d+)-(\d\d)-(\d\d)
          T
          (\d\d):(\d\d):(\d\d)
          (\.\d+)?
          (Z|[+-]\d\d(?::?\d\d)?)?
          \s*\z/ix.match?(time)
        new(time)
      else
        raise ArgumentError.new("invalid xmlschema format: #{time.inspect}")
      end
    end
  end
end

now = Time.now
iso = now.iso8601(6).freeze
p now
p Time.match_new(iso)

Benchmark.ips do |x|
  x.report('xmlschema') { Time.baseline(iso) }
  x.report('Time.new') { Time.new(iso) }
  x.report('opt') { Time.match_new(iso) }

  x.compare!(order: :baseline)
end
```

</details>

Of course, if we were to reimplement `.xmlschema` in C, it could likely be even faster than `.new`, but that would mean moving that method from the time gem into Ruby core, etc.
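Whichever implementation ends up being used, the delegating version should agree with the existing one. Below is a minimal sketch of such a check, assuming Ruby 3.2 or newer. `strict_xmlschema` and `same_result?` are hypothetical helper names, and the pattern is deliberately stricter than the one in this PR (no surrounding whitespace, four-digit year, `hh:mm` offsets only), to stay within inputs `Time.new` is assumed to accept.

```ruby
require 'time'

# Hypothetical helper mirroring the validate-then-delegate idea.
# The stricter pattern is an assumption made for this sketch.
def strict_xmlschema(str)
  pattern = /\A\d{4}-\d\d-\d\dT\d\d:\d\d:\d\d(?:\.\d+)?(?:Z|[+-]\d\d:\d\d)?\z/
  raise ArgumentError, "invalid xmlschema format: #{str.inspect}" unless pattern.match?(str)

  Time.new(str)
end

# Returns true when both parsers agree on the instant and on the UTC offset.
def same_result?(str)
  expected = Time.xmlschema(str)
  actual = strict_xmlschema(str)
  expected == actual && expected.utc_offset == actual.utc_offset
end
```

Running `same_result?` over a corpus of real timestamps, including the looser forms the PR's regexp allows (leading or trailing whitespace, `+hhmm` offsets, negative years), would be one way to spot behavioral differences before merging.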

@byroot requested a review from akr on June 22, 2026, 18:03