
fix: Data::SExpression jcpan, regex cache pos, short-circuit list context #742

Merged: fglock merged 2 commits into master from fix/data-sexpression-cpan-regex-pos on May 15, 2026
@fglock (Owner) commented May 15, 2026

Summary

  • Fixes ./jcpan -t Data::SExpression. The second read() on the same parser instance failed because YYData->{INPUT} reuses one RuntimeScalar per key, so assigning a new string must clear pos() unless the assignment is a true self-assign.
  • Tightens m?PAT? handling: the matched flag on the shared cached regex is set only for real match-once patterns, and match-once is detected by a trailing ? on modifiers (not modifiers.contains("?"), which misfires on (?:…) and similar).
  • Materializes short string literal scalars per occurrence (JVM + bytecode LIST path) so that pos() and scalar identity behave more like Perl.
  • Aligns list-context lowering of &&, ||, and // on the JVM with perlop: the RHS uses the operator’s context, while the LHS stays scalar for the test.
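
The first bullet's rule (clear pos() on assignment, except for a true self-assign) can be illustrated with a small model. This is only a sketch: `Scalar`, its fields, and `PosOnAssignSketch` are hypothetical stand-ins invented for illustration, not PerlOnJava's actual RuntimeScalar code.

```java
import java.util.HashMap;
import java.util.Map;

public class PosOnAssignSketch {
    /** Hypothetical stand-in for a Perl scalar; not the real RuntimeScalar. */
    static final class Scalar {
        String value;
        Integer pos; // null models an undefined pos()

        Scalar(String value) {
            this.value = value;
        }

        /** Copy the RHS, then invalidate pos, unless this is a true self-assign. */
        Scalar set(Scalar rhs) {
            if (rhs == this) {
                return this;        // $x = $x keeps pos()
            }
            this.value = rhs.value; // copy the new payload first
            this.pos = null;        // then clear pos()
            return this;
        }
    }

    public static void main(String[] args) {
        // One scalar object per hash key, reused across reads.
        Map<String, Scalar> yyData = new HashMap<>();
        Scalar input = new Scalar("(a b)");
        yyData.put("INPUT", input);

        input.pos = 5;                    // a lexer consumed the first input
        input.set(input);                 // self-assign
        System.out.println(input.pos);    // prints 5

        input.set(new Scalar("(c d)"));   // second read(): new string, same slot
        System.out.println(input.pos);    // prints null
    }
}
```

Without the reset, the second input would be scanned starting from the stale offset left by the first one, which matches the failure described for the second read().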

Test plan

  • make
  • timeout 600 ./jcpan -t Data::SExpression

- Invalidate pos() when assigning from another SV so reused hash slots reset
  (YYData INPUT + lexer \G); preserve pos for self-assign ($x = $x).
- Only arm regex.matched for real m?PAT?; detect match-once via trailing '?' on
  modifiers, not any '?' (avoids (?:...) false positives).
- Materialize short-string literals per occurrence (byte + UTF-8) so scalar
  identity matches Perl SVs and pos state is not shared incorrectly.
- JVM: list-context && / || / // evaluate RHS in LIST context per perlop;
  bytecode path documents same rule (RHS uses caller context).

Generated with [Cursor](https://cursor.com/docs)

Co-Authored-By: Cursor <noreply@cursor.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
@fglock fglock force-pushed the fix/data-sexpression-cpan-regex-pos branch from 35ff3b7 to cd27270 Compare May 15, 2026 14:42
Removing the cache entry orphaned PosLvalueScalar handles that
matchRegexDirect already holds, breaking /g and \G after mid-match
assignments (e.g. (?{ })). Reset the cached position lvalue in place,
refresh valueHash after the new payload is stored, and call
invalidatePos only after set() copies the RHS.
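
Why removing the cache entry orphans a handle, while an in-place reset does not, can be shown with a toy cache. `PosHandle`, `handleFor`, and both invalidate methods are hypothetical names for this sketch; they only mimic the roles of PosLvalueScalar and the position cache described above.

```java
import java.util.IdentityHashMap;
import java.util.Map;

public class PosCacheSketch {
    /** Hypothetical stand-in for a cached position lvalue. */
    static final class PosHandle {
        Integer pos; // null models an undefined pos()
    }

    // Cache keyed by scalar identity (an assumption of this sketch).
    static final Map<Object, PosHandle> cache = new IdentityHashMap<>();

    static PosHandle handleFor(Object scalar) {
        return cache.computeIfAbsent(scalar, k -> new PosHandle());
    }

    /** Old approach: drop the entry, so later lookups create a new handle. */
    static void invalidateByRemoval(Object scalar) {
        cache.remove(scalar);
    }

    /** New approach: keep the same handle object and reset its state. */
    static void invalidateInPlace(Object scalar) {
        PosHandle h = cache.get(scalar);
        if (h != null) {
            h.pos = null;
        }
    }

    public static void main(String[] args) {
        Object a = new Object();
        PosHandle heldA = handleFor(a);  // a running match keeps this handle
        heldA.pos = 3;
        invalidateByRemoval(a);
        System.out.println(handleFor(a) == heldA); // prints false: handle orphaned

        Object b = new Object();
        PosHandle heldB = handleFor(b);
        heldB.pos = 3;
        invalidateInPlace(b);
        System.out.println(handleFor(b) == heldB); // prints true: still shared
        System.out.println(heldB.pos);             // prints null
    }
}
```

In the removal case the matcher keeps writing to a handle that nothing else reads, which is consistent with the /g and \G breakage the commit message reports.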

@fglock fglock merged commit d4996c1 into master May 15, 2026
2 checks passed
@fglock fglock deleted the fix/data-sexpression-cpan-regex-pos branch May 15, 2026 16:24