11 changes: 11 additions & 0 deletions dev/design/string_encoding_context_plan.md
@@ -286,6 +286,17 @@ die "FAIL" if is_utf8($err);

## Notes

- **Investigation update (2026-05-15):** Running `./jcpan -t Sub::HandlesVia` showed an immediate crash in
Mite’s generated `*.mite.pm`: `HAS_BUILDARGS` was polluted with the string `HAS_FOREIGNBUILDARGS`,
falsely enabling the `BUILDARGS` branch. Root cause was **`UNIVERSAL::can()` returning an empty list**
in **list contexts** inside flat list/hash construction (the empty list vanishes instead of occupying
a real `undef` slot). Returning a singleton **`(undef)`** on **every** failure path fixes Mite but
breaks **scalar-context** compile probes (`VERSION` / `use` **import** / attribute installers) that
expect a **not-found** `can()` to yield a `Universal.can` result with **`size()==0`**. **`Universal.canNotFound(ctx)`**
now returns **`(undef)` only for `LIST`-context** failures and **`()` for scalar-like contexts** (where
**`scalar()`** on the empty list still yields `undef`, matching Perl for plain assignments).
A typed-string concat refactor (Phase 1–2 in this doc) was **reverted** after separate `perl5_t`
regressions; redo with targeted tests before merging.
- This fix addresses the root cause rather than applying post-corruption repair
- The eval-time repair in RuntimeRegex can remain as a safety net
- This aligns PerlOnJava with Perl 5's encoding context semantics
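
The pairing corruption from the investigation note can be modelled in a few lines. This is a minimal sketch under stated assumptions, not PerlOnJava code: plain `java.util` collections stand in for `RuntimeList` and the hash constructor, `null` stands in for `undef`, and the two-key layout and the `CanPairingSketch`, `flatten`, `toHash`, and `meta` names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Simplified model only: java.util collections stand in for PerlOnJava's
// runtime types, and null stands in for Perl's undef.
class CanPairingSketch {

    // Flatten inner lists the way a Perl list constructor does:
    // an empty inner list contributes no elements at all.
    static List<String> flatten(List<List<String>> parts) {
        List<String> flat = new ArrayList<>();
        for (List<String> part : parts) {
            flat.addAll(part);
        }
        return flat;
    }

    // Pair a flat list as key, value, key, value, ... like %h = (...).
    static Map<String, String> toHash(List<String> flat) {
        Map<String, String> hash = new LinkedHashMap<>();
        for (int i = 0; i < flat.size(); i += 2) {
            String value = i + 1 < flat.size() ? flat.get(i + 1) : null;
            hash.put(flat.get(i), value);
        }
        return hash;
    }

    // Hypothetical two-key layout:
    // (HAS_BUILDARGS => <can result>, HAS_FOREIGNBUILDARGS => <can result>).
    static Map<String, String> meta(List<String> canResult) {
        return toHash(flatten(List.of(
                List.of("HAS_BUILDARGS"), canResult,
                List.of("HAS_FOREIGNBUILDARGS"), canResult)));
    }

    public static void main(String[] args) {
        // Empty not-found result: the next key slides into the value slot.
        System.out.println("empty list: " + meta(List.of()));
        // Singleton undef: every key keeps its own value slot.
        System.out.println("one undef:  " + meta(Collections.singletonList(null)));
    }
}
```

With the empty result, `HAS_BUILDARGS` ends up holding the string `HAS_FOREIGNBUILDARGS`, which is truthy; with the singleton `undef`, both keys survive with falsy values.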
12 changes: 12 additions & 0 deletions dev/design/utf8_flag_parity.md
@@ -43,6 +43,18 @@ strings) never upgrade the result to UTF-8.
- Previously, if neither was BYTE_STRING (e.g. INTEGER + BYTE_STRING), it fell
through to the default STRING return

### 2b. `UNIVERSAL::can()` failures must return `(undef)` in list context

**File:** `src/main/java/org/perlonjava/runtime/perlmodule/Universal.java`

Perl returns **one** undefined value (`(undef)` in list context). PerlOnJava used an **empty**
`RuntimeList`, which behaves like Perl’s truly empty list: in `%h = (...)`/`{ … }`
constructors it eats the next pairing and corrupts literals. Downstream (**Mite** `__META__` in
`*.mite.pm`; **Sub::HandlesVia::CodeGenerator**) saw `HAS_BUILDARGS` swallow the `'HAS_FOREIGNBUILDARGS'`
key as its bogus string value and incorrectly took the `BUILDARGS` constructor branch.

Failures use **`canNotFound(ctx)`**. In **list** context it returns **`scalarUndef.getList()`** (one `undef`, so outer list splices see a real slot). In **scalar / void / lvalue** contexts it returns an **empty** `RuntimeList`, so compile-time call sites that compare **`size() == 0`** vs **`size() == 1 && getBoolean()`** keep working: a singleton `undef` would have `size()==1` with a falsy non-sub value and would mis-route VERSION / import / attribute probes.
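
The context split and the two probe shapes can be sketched as follows. This is an illustrative model only: the `Context` enum, `notFound`, `probeFindsMethod`, and `probeSeesNoCandidate` are hypothetical stand-ins, not the real `Universal.canNotFound` or its call sites, and `null` stands in for `undef`.

```java
import java.util.Collections;
import java.util.List;

// Simplified model only: these names are hypothetical stand-ins for the
// runtime types and call sites described above.
class CanNotFoundSketch {

    enum Context { LIST, SCALAR, VOID }

    // Not-found result: one undef (null) slot in list context, empty otherwise.
    static List<Object> notFound(Context ctx) {
        return ctx == Context.LIST
                ? Collections.singletonList(null)
                : Collections.emptyList();
    }

    // Probe shape 1: "found" means exactly one element that is truthy.
    static boolean probeFindsMethod(List<Object> result) {
        return result.size() == 1 && result.get(0) != null;
    }

    // Probe shape 2: "no candidate" means the result has no elements.
    static boolean probeSeesNoCandidate(List<Object> result) {
        return result.isEmpty();
    }

    public static void main(String[] args) {
        for (Context ctx : Context.values()) {
            List<Object> result = notFound(ctx);
            System.out.println(ctx + ": size=" + result.size()
                    + " found=" + probeFindsMethod(result)
                    + " noCandidate=" + probeSeesNoCandidate(result));
        }
    }
}
```

In this model a singleton `undef` is neither "found" nor "no candidate", which is the ambiguity the scalar-like branch avoids by staying empty.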

### 3. `sprintf` — SprintfOperator.sprintfInternal()

**File:** `src/main/java/org/perlonjava/runtime/operators/SprintfOperator.java`
1 change: 1 addition & 0 deletions dev/modules/README.md
@@ -21,6 +21,7 @@ This directory contains design documents and guides related to porting CPAN modu
| [storable_binary_format.md](storable_binary_format.md) | Storable native Perl binary format — read + write paths landed; jperl ↔ system-perl files interoperate in both directions |
| [unicode_collate.md](unicode_collate.md) | Unicode::Collate — plan: file-backed DUCET + Java XS surface (default); optional ICU path tradeoffs |
| [ppi.md](ppi.md) | **PPI** — CPAN test status, RC1–RC4, refcount/`DESTROY` follow-ups (`t/04_element.t` `%_PARENT`) |
| [sub_handlesvia_support.md](sub_handlesvia_support.md) | **Sub::HandlesVia** — Mite/can/hash fix landed; **`eval_closure` UTF‑8 (\x{c2})** trace plan |

## Module Status Overview

124 changes: 124 additions & 0 deletions dev/modules/sub_handlesvia_support.md
@@ -0,0 +1,124 @@
# Sub::HandlesVia Support for PerlOnJava

## Overview

[Sub::HandlesVia](https://metacpan.org/pod/Sub::HandlesVia) generates delegation methods (“handles”) for Moo/Moose/Object::Pad/toolkit classes. Runtime work uses **generated Perl** compiled via **`Eval::TypeTiny::eval_closure`**; build-time codegen uses **Mite** (`*.mite.pm`) and **`Sub::HandlesVia::CodeGenerator`**.

PerlOnJava must treat **SvUTF8 / BYTE_STRING parity** consistently in string concatenation *and* return **Perl-correct lists** from core helpers such as **`UNIVERSAL::can`**, otherwise Mite constructors and delegated subs break in non-obvious ways.

---

## Completed (upstream-style fixes landed in core)

These changes address blockers traced while running `./jcpan -t Sub::HandlesVia`:

| Area | Problem | Fix |
|------|---------|-----|
| **`UNIVERSAL::can`** | Missing methods returned an **empty** `RuntimeList`, which behaves like Perl’s **empty list** inside hash literals. That **consumes** the next `=>` pairing and corrupted Mite **`__META__`** (`HAS_BUILDARGS` falsely truthy → bogus **`BUILDARGS`** branch). Pure singleton-`undef` on **all** failure paths confused **scalar-context** compiler probes (`VERSION`/`import`/attributes) that discriminate with **`size() == 1`**. | Failure paths route through **`Universal.canNotFound(ctx)`**: **LIST** ⇒ one `undef` element; **scalar/void/lvalue** ⇒ empty list (still **`scalar()` → undef**). `Universal.java`. |
| **String concat SvUTF8** *(deferred)* | A typed-concat experiment caused **`perl5_t`** regressions (`op/sub.t`, `porting/filenames.t`, `re/pat_advanced.t`). | **Reverted** from the PR trajectory serving Sub::HandlesVia. Redo in smaller, **`perl5_t`-backed** steps ([`dev/design/string_encoding_context_plan.md`](../design/string_encoding_context_plan.md)). |

Design cross-links:

- [`dev/design/utf8_flag_parity.md`](../design/utf8_flag_parity.md) — §2b (`can`).
- [`dev/design/string_encoding_context_plan.md`](../design/string_encoding_context_plan.md) — investigation note (2026-05-15).

---

## Current Status (manual smoke)

After the **`can`** fix:

- **`Sub::HandlesVia::CodeGenerator->__META__`** has four keys; **`HAS_BUILDARGS`** exists with **`undef`** (Perl-correct falsy gate).
- **`t/02moo.t`** progresses further but **still fails** when **`Eval::TypeTiny::eval_closure`** compiles generated source. The error is `Unrecognized character \x{c2}`, reported against **`Eval/TypeTiny.pm`** through a `#line` directive; that location is the synthesized filename/line prefix, so it does not indicate a UTF-8 problem in the host file.

Automated `./jcpan -t Sub::HandlesVia` previously **timed out at 600s** in CI-style runs; rerun with **`timeout 3600`** once the core fixes stabilize.

---

## Next Steps (prioritized)

### 1. [P0] Fix UTF-8 / lead-byte breakage in delegated eval (`\x{c2}`)

**Symptom:**

```text
Failed to compile source because: Unrecognized character \x{c2}; at .../Eval/TypeTiny.pm line 8 ...
at .../Sub/HandlesVia/CodeGenerator.pm line 345 (Eval::TypeTiny::eval_closure)
```

**Goals:**

1. Capture the **exact `%ec_args`** string passed into **`eval_closure`** for a failing case (minimal Moo delegation in `t/02moo.t`), e.g. temporary logging in **`CodeGenerator.pm`** (`generate_coderef_for_handler`) guarded by **`$ENV{SUB_HANDLESVIA_DEBUG_EC}`**.
2. Binary-diff that string (`unpack "H*", $src`) vs system Perl — locate the first stray **`0xc2`** (UTF-8 lead byte) treated as Latin-1.
3. Classify origin:
   - **Runtime string typing** remaining in codegen (`"."`/`join`/quoting/formatters elsewhere), or
   - **PerlOnJava lexer/compiler** rejecting valid UTF-8 in **`eval`** strings (narrow vs wide rules), or
   - **Copy from file** paths reading `.pm` with wrong Perl layer assumptions.
4. Fix at the appropriate layer (prefer **prevent** mis-typing; **`RuntimeRegex.repairLatin1EncodedUtf8IfCorrupted`** is only a fallback per design notes).
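
For the binary-diff step, a small JVM-side helper can make the comparison concrete. The sketch below is a forensic aid under stated assumptions, not part of PerlOnJava: the class and method names are hypothetical, the inputs are assumed to be the raw bytes of the generated source captured on each side, and the `main` method only simulates the suspected failure mode (UTF-8 bytes read back as Latin-1 characters).

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Forensic sketch only: helper names are hypothetical.
class StrayLeadByteSketch {

    // Index of the first differing byte, or -1 when both dumps are identical.
    static int firstDifference(byte[] expected, byte[] actual) {
        return Arrays.mismatch(expected, actual);
    }

    // Index of the first 0xC2 byte, or -1 when none is present.
    static int firstC2(byte[] bytes) {
        for (int i = 0; i < bytes.length; i++) {
            if ((bytes[i] & 0xFF) == 0xC2) {
                return i;
            }
        }
        return -1;
    }

    // Lowercase hex rendering, comparable to Perl's unpack "H*".
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02x", b & 0xFF));
        }
        return sb.toString();
    }

    // Simulate the suspected bug: UTF-8 bytes decoded as Latin-1 characters,
    // so each multi-byte sequence becomes several one-byte characters.
    static String misreadUtf8AsLatin1(String text) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        String source = "my $x = \"a\u00A0b\";"; // contains U+00A0 (NBSP)
        byte[] utf8 = source.getBytes(StandardCharsets.UTF_8);
        String misread = misreadUtf8AsLatin1(source);

        System.out.println("utf8 hex:        " + hex(utf8));
        System.out.println("first 0xc2 at:   " + firstC2(utf8));
        System.out.println("chars before:    " + source.length());
        System.out.println("chars after:     " + misread.length());
        System.out.println("first mismatch:  " + firstDifference(
                utf8, misread.getBytes(StandardCharsets.UTF_8)));
    }
}
```

If the captured PerlOnJava string is longer than the system Perl string and the first mismatch sits at a `c2` byte, that points toward a typing or layer problem upstream of the lexer rather than toward the lexer rejecting valid UTF-8.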

**Success:** `timeout 900 ./jperl .../blib/lib .../Sub-HandlesVia-*/t/02moo.t` completes with TAP **ok**.

### 2. [P1] Full CPAN harness run

```bash
timeout 3600 ./jcpan -t Sub::HandlesVia > /tmp/jcpan_Sub_HandlesVia.txt 2>&1
```

Catalog skips (optional deps **MooX::TypeTiny**, **Mouse**, etc.) vs real failures.

### 3. [P2] Concat / SvUTF8 parity redo (staging)

Retry [`dev/design/string_encoding_context_plan.md`](../design/string_encoding_context_plan.md) **Phase 2** (`StringOperators.stringConcat*`) **only after** guarding with:

```bash
cd perl5_t/t
timeout 300 ../../jperl op/sub.t
timeout 180 ../../jperl porting/filenames.t
timeout 600 ../../jperl re/pat_advanced.t # noisy; grep ^not ok
```

Establish **baseline counts** vs **`origin/master`** on the **same harness** (the `perl_test_runner.pl` shards, if that is what CI runs). A naive `RuntimeScalar(text, BYTE_STRING)` swap for the ISO-8859-1 `byte[]` path surfaced **opaque** regressions in **regex/porting/stack** slices; redo it incrementally in its own small PR once bisected.

### 4. [P3] Regression tests in-repo (coordination needed)

PerlOnJava policy: **never delete or weaken existing tests**; adding **new** unit tests requires maintainer alignment. Candidate areas:

- **`UNIVERSAL::can`** in **hash constructor** contexts: `%h = (... unknown package ...->can(...) ...)` pairing integrity.
- **Concat parity**: **`Encode::is_utf8`** expectations for literals under **`no utf8`** / **`use utf8`** (see the verification section of **`dev/design/string_encoding_context_plan.md`**).

### 5. [P4] Optional XS

Upstream ships **`Sub::HandlesVia::XS`** (skipped when absent). No action unless performance work demands it — pure Perl path is canonical for portability.

---

## Dependencies (mental model)

| Module | Role |
|--------|------|
| **Type::Tiny** / **Exporter::Tiny** | Types and coercion surfaces for handlers |
| **Eval::TypeTiny** | **`eval_closure`** — compiles delegated method bodies |
| **Mite / Sub::HandlesVia::Mite** | Constructor / attribute sugar; **`__META__`** uses **`can('BUILDARGS')`** |
| **Moo** | Primary toolkit exercised in **`t/02moo*.t`** |
| **Moose / Mouse / Corinna** | Separate test dirs; skip if stacks incomplete |

Issues in **Eval::TypeTiny** often surface as **compile errors inside generated strings** rather than `.pm` syntax errors — treat reports as **`$src` forensic** first.

---

## Related docs

| Document | Topic |
|----------|--------|
| [type_tiny.md](type_tiny.md) | Type::Tiny quirks on PerlOnJava |
| [moo_support.md](moo_support.md) | Moo stack status |
| [moose_support.md](moose_support.md) | Moose prerequisites |

---

## Progress log

| Date | Milestone |
|------|-----------|
| 2026-05-15 | **`UNIVERSAL::can`** empty-list/hash corruption fixed; **`__META__`** validated; `\x{c2}` eval blocker documented as next P0 |
| 2026-05-15 | **`UNIVERSAL::can`** split: **LIST** failures → `(undef)`, **scalar**/compile-time failures → empty list (resolves the `perl5_t` regressions while keeping the Mite splice fix) |
23 changes: 20 additions & 3 deletions src/main/java/org/perlonjava/runtime/perlmodule/Universal.java
@@ -89,6 +89,23 @@ public static void initialize() {
}
}

/**
* Missing-method return for UNIVERSAL::can.
*
* <p>Perl exposes this as undef. Compile-time lookups pass {@link RuntimeContextType#SCALAR} and
* discriminate with patterns like {@code size() == 1 && getBoolean()}; treating not-found {@code can}
* as a singleton {@code undef} there makes {@code size() == 1} with {@code getBoolean()==false}, which is
* not what those call sites distinguish from “no candidate”. Returning an {@linkplain RuntimeList#isEmpty() empty}
* list preserves existing logic.
*
* <p>List-context calls flatten this value into enclosing lists; only there must Perl get one {@code undef}
* placeholder (never a vanishing splice), which generated CPAN constructors rely on (e.g. Mite
* {@code __META__} pairings).
*/
private static RuntimeList canNotFound(int ctx) {
return ctx == RuntimeContextType.LIST ? scalarUndef.getList() : new RuntimeList();
}

/**
* Checks if the object can perform a given method.
* Note: This is a Perl method, it expects `this` to be the first argument.
@@ -155,7 +172,7 @@ public static RuntimeList can(RuntimeArray args, int ctx) {
if (method != null && !isAutoloadDispatch(method, actualMethod, perlClassName)) {
return method.getList();
}
- return new RuntimeList();
+ return canNotFound(ctx);
}

// Handle Package::SUPER::method syntax
@@ -168,7 +185,7 @@ public static RuntimeList can(RuntimeArray args, int ctx) {
if (method != null && !isAutoloadDispatch(method, actualMethod, packageName)) {
return method.getList();
}
- return new RuntimeList();
+ return canNotFound(ctx);
}

// Perl's can() must NOT consider AUTOLOAD - it should only find
@@ -219,7 +236,7 @@ public static RuntimeList can(RuntimeArray args, int ctx) {
return method.getList();
}
}
- return new RuntimeList();
+ return canNotFound(ctx);
}

/**