1 change: 1 addition & 0 deletions dev/modules/README.md
@@ -14,6 +14,7 @@ This directory contains design documents and guides related to porting CPAN modu
| [xsloader.md](xsloader.md) | XSLoader architecture |
| [makemaker_perlonjava.md](makemaker_perlonjava.md) | ExtUtils::MakeMaker implementation |
| [cpan_client.md](cpan_client.md) | jcpan - CPAN client for PerlOnJava |
| [cpanplus.md](cpanplus.md) | **CPANPLUS** — `jcpan -t CPANPLUS`: `require` true-value chain, Interpreter/JVM parity, remaining test gaps (**`BUILD_PL`**, SQLite, formatting) |
| [dbix_class.md](dbix_class.md) | DBIx::Class support (in progress) |
| [padwalker.md](padwalker.md) | PadWalker support plan for Reply lexical persistence |
| [dbi_test_parity.md](dbi_test_parity.md) | DBI test-suite parity (~13.5× more passes than master; Phases 1–4 done, incl. a tied-hash method-dispatch fix in the PerlOnJava runtime) |
186 changes: 186 additions & 0 deletions dev/modules/cpanplus.md
@@ -0,0 +1,186 @@
# CPANPLUS — full `./jcpan -t CPANPLUS` parity

This document tracks **PerlOnJava**/`jperl` work so **`./jcpan -t CPANPLUS`** can pass the upstream **`t/`** suite without early aborts (“did not return a true value”, missing imports, …).

Related: **`dev/modules/cpan_client.md`** (jcpan client), **`AGENTS.md`** (always **`timeout`** around **`jperl`** / **`jcpan`**).

---

## Resolved: `require` / trailing true value (2026-05)

### Symptoms

Failures such as:

```text
CPANPLUS/Config.pm did not return a true value at t/… line 4, <GEN…> line 317.
Compilation failed in require
```

Perl requires the loaded compilation unit’s **last statement** to yield a **true** value (a defined but false value such as **`0`** also fails); an **empty `RuntimeList`** at the boundary behaves like **undef**, which is false.
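
The rule can be illustrated with a toy model. This is a minimal sketch, not PerlOnJava code: `RequireTruthSketch`, `isPerlTrue`, and `requireSucceeds` are hypothetical names, and a plain Java `List<Object>` stands in for `RuntimeList` (with `null` standing in for undef).

```java
import java.util.List;

/** Toy model of the check `require` applies to a file's result. Hypothetical, not PerlOnJava code. */
final class RequireTruthSketch {

    /** Perl truthiness for a scalar modelled as a Java Object (null stands in for undef). */
    static boolean isPerlTrue(Object scalar) {
        if (scalar == null) return false;                        // undef is false
        if (scalar instanceof Number n) return n.doubleValue() != 0.0;
        String s = scalar.toString();
        return !s.isEmpty() && !s.equals("0");                   // "" and "0" are false
    }

    /** `require` inspects the last value; an empty list is treated like undef. */
    static boolean requireSucceeds(List<Object> fileResult) {
        Object last = fileResult.isEmpty() ? null : fileResult.get(fileResult.size() - 1);
        return isPerlTrue(last);
    }

    public static void main(String[] args) {
        System.out.println(requireSucceeds(List.of(1)));   // file ending in `1;`
        System.out.println(requireSucceeds(List.of()));    // empty list at the boundary
        System.out.println(requireSucceeds(List.of(0)));   // file ending in `0;`
    }
}
```

The empty-list case is the one the causes below produce: the file's trailing `1;` is present in the source, but the emitted code hands back nothing.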

### Causes addressed in-tree

1. **`CompilerOptions` leakage (JVM)**
Nested subroutine compilations (`EmitSubroutine`, lazy subs in `SubroutineParser`) reused `compilationUnitFromRequireOrDo` via a shared `CompilerOptions` reference. `EmitBlock` could treat an inner sub’s body like the outer `require` file tail and mis-emit the last statement (empty list).
- **Fix:** **`clone()`** parent options for nested JVM subs / named lazy subs and clear **`compilationUnitFromRequireOrDo`** / **`compilationUnitCallerContext`**.

2. **`eval`** units inheriting **`require`** flags
**`EmitEval`** and **`RuntimeCode`** eval clones now clear **`compilationUnitFromRequireOrDo`** after cloning so eval strings are not codegen’d as the outer **`require`** body.

3. **Interpreter parity after JVM `ctx.contextType = RUNTIME`**
Fallback compilation can leave **`currentCallContext == RUNTIME`**, so the interpreter emitted the **file’s last statement** in **RUNTIME** context and the trailing **`1;`** reached **`require`** as an empty list.
- **Fix:** **`BytecodeCompiler`** — for the outermost block of a **`compilationUnitFromRequireOrDo`** unit, treat the **last** statement like **`EmitBlock`** (use **`compilationUnitCallerContext`** or **scalar**) when **`currentCallContext == RUNTIME`**.

4. **`LargeBlockRefactorer` vs `require`/do file body**
Whole-block `sub { ... }->()` refactor must not run on the `require`/do outermost body (it would discard the compilation unit’s return value). Mitigated via `compilationUnitFromRequireOrDo` + outer-block detection (`EmitBlockJvmDepth` / `isFileLevelBlock` skips).

5. **Circularity guard (Configure ⇄ Config ⇄ Backend)**
`CPANPLUS::Configure` loads `CPANPLUS::Config`; `Config` pulls `CPANPLUS` → `Backend` → `Configure` again. Even after codegen fixes, `apply()` could still propagate an empty `RuntimeList` with `$@` unchanged.
- **Fix (belt-and-suspenders):** `ModuleOperators.doFile`: for `require` of a `compilationUnitFromRequireOrDo` unit, if `result.isEmpty()` and `$@` is blank (snapshot before `$@` is cleared on success), coerce success to `scalarTrue`. The **`module_true`** feature flag still overrides as before.

Supporting plumbing: **`JavaClassInfo`** fields **`emitJvmApplyBodyFromRequireOrDo`** / **`emitBlockJvmDepth`**, **`PerlLanguageProvider`** marking **`compilationUnitCallerContext`**, **`CompilerOptions`** **javadoc**.
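
The clone-and-clear fix from items 1 and 2 can be sketched as follows. This is an illustration under assumptions, not the in-tree code: `OptionsSketch` and `forNestedSub` are hypothetical names, and only the two fields named in this document are modelled.

```java
/** Hypothetical stand-in for CompilerOptions; only the two flags discussed above. */
final class OptionsSketch implements Cloneable {
    boolean compilationUnitFromRequireOrDo = false;
    int compilationUnitCallerContext = -1;   // -1 means unset

    @Override
    public OptionsSketch clone() {
        try {
            return (OptionsSketch) super.clone();
        } catch (CloneNotSupportedException e) {
            throw new AssertionError(e);     // unreachable: the class implements Cloneable
        }
    }

    /** Options for a nested sub: copy the parent, then drop the require-only state. */
    static OptionsSketch forNestedSub(OptionsSketch parent) {
        OptionsSketch child = parent.clone();
        child.compilationUnitFromRequireOrDo = false;
        child.compilationUnitCallerContext = -1;
        return child;
    }

    public static void main(String[] args) {
        OptionsSketch outer = new OptionsSketch();
        outer.compilationUnitFromRequireOrDo = true;   // outer unit is a `require` body
        outer.compilationUnitCallerContext = 0;        // arbitrary context id for the demo

        OptionsSketch inner = forNestedSub(outer);
        System.out.println(outer.compilationUnitFromRequireOrDo);
        System.out.println(inner.compilationUnitFromRequireOrDo);
    }
}
```

Sharing one reference would let a nested compilation observe (or mutate) the outer unit's flags; the copy keeps the `require` tail handling confined to the file-level block.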

### Verification checklist (regression-sensitive)

```bash
make

# CPANPLUS build dir after jcpan fetched sources (adjust path/version)
PERL5LIB="/path/to/CPANPLUS-*/blib/lib:/path/to/CPANPLUS-*/blib/arch:$PERL5LIB"
cd …/CPANPLUS-*/t && timeout 120 /path/to/jperl -e 'require "./inc/conf.pl"; print "ok\n"'
```

Always wrap **`./jcpan -t CPANPLUS`**:

```bash
timeout 3600 ./jcpan -t CPANPLUS # captures full TAP; see jcpan/build logs
```

---

## Resolved (2026-05-16): **`BUILD_PL` / `MAKEFILE`** “strict bareword” (**`-e`** / **`stat`** + **`->`**)

Perl treats **`BUILD_PL`**, **`MAKEFILE`**, … as **exported constant subs**, not ALLCAPS filehandle slots.

PerlOnJava’s **file-test** operator path and **`stat`/`lstat`** mistakenly consumed any **`^[A-Z_][A-Z0-9_]*$`** bareword as a glob handle **before** list/expression parsing, so **`CONSTANT->($path)`** and **`stat CONSTANT->(...)`** left a bare **`IdentifierNode`** behind and tripped **`strict subs`** at emit time.

**Fix:** **`FileHandle.shouldTreatAllCapsIdentifierAsBareFileHandleSlot`** — only use the legacy handle heuristic when **`NAME`** is **not** followed by **`->`** (skipping whitespace) **and** **`GlobalVariable.isGlobalCodeRefDefined(CurrentPackage::NAME)`** is false. Wired from **`ParsePrimary.parseFileTestOperator`** and **`OperatorParser.parseStat`**.
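
The decision described above can be sketched in isolation. This is a simplified model, not the real helper: `BarewordHandleSketch.treatAsFileHandle` and its parameters are hypothetical, the parser's token stream is replaced by a plain string holding the text after the bareword, and a `Set` of fully qualified names stands in for the global code-ref table.

```java
import java.util.Set;
import java.util.regex.Pattern;

/** Simplified model of the ALLCAPS bareword heuristic. Hypothetical, not PerlOnJava code. */
final class BarewordHandleSketch {
    private static final Pattern ALL_CAPS = Pattern.compile("^[A-Z_][A-Z0-9_]*$");

    /**
     * @param name        the bareword seen after a file test operator or stat
     * @param rest        source text immediately following the bareword
     * @param pkg         current package name
     * @param definedSubs fully qualified names of subs already defined
     */
    static boolean treatAsFileHandle(String name, String rest, String pkg, Set<String> definedSubs) {
        if (!ALL_CAPS.matcher(name).matches()) return false;     // not a handle-shaped name
        if (rest.stripLeading().startsWith("->")) return false;  // NAME->(...) is a call
        return !definedSubs.contains(pkg + "::" + name);         // a defined sub wins over a handle
    }

    public static void main(String[] args) {
        Set<String> subs = Set.of("main::BUILD_PL");
        System.out.println(treatAsFileHandle("STDIN", "", "main", subs));
        System.out.println(treatAsFileHandle("BUILD_PL", " ->($extract)", "main", Set.of()));
        System.out.println(treatAsFileHandle("BUILD_PL", "", "main", subs));
    }
}
```

Both conditions are needed: the `->` check covers constants that are not yet visible in the code-ref table, and the table check covers a bare `-e CONSTANT` with no arrow.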

**Check:** **`timeout 120 ./jperl -e '… -e BUILD_PL->($extract) …'`** and **`timeout 300 ./jperl ./04_CPANPLUS-Module.t`** (exit **0**).

---

## Resolved (2026-05-16): **`t/00`** **`_version_to_number`** (**`version` module** parity)

Upstream **`Utils::_version_to_number`** strips non-numeric tails (e.g. **`1.5-a` → `version->parse("1.5")`**), then **`numify`**. Failures (**`v1.5`**, **`1.5`**) were **not** fixable by repurposing **`VersionHelper.normalizeVersion`**: that helper is also used for **`use VERSION` / feature-bundle parsing** and must stay coarse; changing it broke **`use v5.36`** signatures and **`IO::Handle`** ( **`use 5.38.0`** + built-in **`say`** ).

**Fixes:** **`VersionHelper.normalizeVersionForPerlModule`** (tuple + single-dot decimal mantissa chunking + multi-dot **`5.x.y`** tuples) used only from **`Version.java`**; **`normalizeVersion`** unchanged for **`StatementParser`**. **`Version.java`**: removed bogus internal **`v`** prepend on short **`1.x`** decimals; **`numify`** uses **`max(parts − 1, 1)`** fractional **`%03d`** groups (Perl **`version.pm`**). **`StatementParser.parseOptionalPerlBareUseVersion`**: splice lexer-split **`use 5.38.0`** into a tuple string (not **`5.382`** float).
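
The **`numify`** grouping rule quoted above can be written out directly. This is a sketch of the stated formula only (`max(parts − 1, 1)` groups of `%03d`), with a hypothetical `NumifySketch.numify` that takes an already-parsed tuple; it does not reproduce the parsing or normalization steps in **`Version.java`** and is not a claim of full **`version.pm`** parity.

```java
import java.util.Locale;

/** Sketch of the fractional-group rule: max(parts - 1, 1) groups of three digits. Hypothetical. */
final class NumifySketch {

    /** @param parts non-empty version tuple, e.g. {5, 36, 0} */
    static String numify(int[] parts) {
        int groups = Math.max(parts.length - 1, 1);
        StringBuilder sb = new StringBuilder().append(parts[0]).append('.');
        for (int i = 1; i <= groups; i++) {
            int value = i < parts.length ? parts[i] : 0;   // a lone major version gets "000"
            sb.append(String.format(Locale.ROOT, "%03d", value));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(numify(new int[] {1, 2, 3}));
        System.out.println(numify(new int[] {5, 36, 0}));
        System.out.println(numify(new int[] {1}));
    }
}
```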

**Check:** **`./jperl src/test/resources/unit/version_pm_numify_parity.t`**; **`timeout 900 ./jcpan -t CPANPLUS`** ( **`t/00`** + full suite ).

---

## Resolved (2026-05-16): **`$^E`** + **`$!`** uninitialized warnings (**File::Copy** TAP noise)

`$^E` is created by the **`$^A`–`$^Z`** startup loop as a plain global (**undef**). Perl defines **`$^E`** as the extended OS error; on **POSIX** it **always matches `$!`** (perlvar). Numeric context **`$^E + 0`** must not warn.

**Fix:** **`GlobalContext.initializeGlobals`**: install **`ErrnoVariable`** for **`main::!`**. Re-point **`$^E`** to the **same `ErrnoVariable`** on non‑Windows hosts; on **Windows** use a **second `ErrnoVariable`** so **`($!, $^E) = (...)`** in **`File::Copy`** can restore errno vs Win32 error independently. Bundled **`File/Copy.pm`** stays stock **`($! + 0, $^E + 0)`**.
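
The aliasing choice can be pictured with a small model. This is a sketch under assumptions: `ErrnoAliasSketch` and `ErrnoCell` are hypothetical, and an `int` field stands in for the real **`ErrnoVariable`**.

```java
/** Model of the $! / $^E wiring: one shared cell off Windows, two cells on Windows. Hypothetical. */
final class ErrnoAliasSketch {
    /** Stand-in for ErrnoVariable; the int defaults to 0, so numeric reads always have a value. */
    static final class ErrnoCell {
        int code;
    }

    final ErrnoCell bang = new ErrnoCell();   // models $!
    final ErrnoCell caretE;                   // models $^E

    ErrnoAliasSketch(boolean windows) {
        // Off Windows, $^E is the same object as $!; on Windows it is independent.
        this.caretE = windows ? new ErrnoCell() : bang;
    }

    public static void main(String[] args) {
        ErrnoAliasSketch posix = new ErrnoAliasSketch(false);
        posix.bang.code = 2;
        System.out.println(posix.caretE.code);   // follows $!

        ErrnoAliasSketch win = new ErrnoAliasSketch(true);
        win.bang.code = 2;
        System.out.println(win.caretE.code);     // unaffected by $!
    }
}
```

In either configuration `$^E + 0` reads a defined number, which is what removes the warning.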

**Check:** **`./jperl src/test/resources/unit/errno_caret_e_defined.t`**; **`timeout 900 ./jcpan -t CPANPLUS`** — no **`File/Copy`** **`uninitialized`** line.

---

## Resolved (2026-05): **Strict + string `eval` + import / `no` — pr694 (**`has … =>`** DSL)**

### Symptoms

Failures such as **`Undefined subroutine &Some::Pkg::has`** inside **`eval q{ … use ExporterThing; has foo => (...); no ExporterThing; … }`** even though **`perl`** runs the **`has`** call with the imported CV after the stash entry was deleted (**CPANPLUS**-adjacent **`use`/`no`/DSL** ordering).

Separate regression **`unit/eval_after_stash_delete.t`** must keep **Perl** semantics: compilations that start **after** **`delete $stash{sub}`** must **not** resurrect a pinned CV.

### Cause

For **eval string**, **`Parser.parse()`** runs **`use` / `no`** immediately (BEGIN-like), then **`BytecodeCompiler.compile(ast)`** runs. By emit time the visible **`globalCodeRefs`** entry for **`&Pkg::name`** is often already gone, so **`getGlobalCodeRefForFreshLookup`** constant-pooled an **undef placeholder** for named **`&sub(...)`** sites. **Perl** still calls the **compile-time-pinned** CV.

A **GlobalVariable-only** “always prefer pinned when stash-deleted” fix broke **`eval_after_stash_delete`** (new compile after delete must see an empty slot).

### Fix

- **`SubroutineParser`**: when parsing a **direct** call **`&name(...)`** and **`GlobalVariable`** already shows a **real callable** (**not** a pure **`sub name;`** forward stub with only attributes), **`setAnnotation("parseTimeCodeRef", …)`** on the **`OperatorNode("&", …)`**.
- **`BytecodeCompiler`** embeds **`parseTimeCodeRef`** into the bytecode constant pool (**interpreter** / eval-string parity with compile-time **`&`** pinning).
- **JVM** **`EmitSubroutine.handleApplyOperator`**: **`&(bareword)`** must load the callee via **`EmitVariable` → `getGlobalCodeRef`** (runtime glob). Embedding **`parseTimeCodeRef`** with **`GlobalVariable.registerCompiledCodeRef`** was wrong for **`local *Pkg::…`** overrides (**CPANPLUS::Dist::MM** **`format_available`** / **t/20**) because the ID pins a **`RuntimeScalar`** that **`replacePinnedCodeRef`** does not update.
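
The difference between the two lookup strategies can be shown with a toy stash. This is an illustrative sketch, not the runtime's data model: `CodeRefLookupSketch`, `pinnedCallSite`, and `runtimeCallSite` are hypothetical, and a `Map` of `Supplier<String>` stands in for `globalCodeRefs`.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

/** Toy stash contrasting a parse-time-pinned call site with a runtime lookup. Hypothetical. */
final class CodeRefLookupSketch {
    final Map<String, Supplier<String>> globalCodeRefs = new HashMap<>();

    /** Captures the callee when the call site is created (interpreter / eval-string path). */
    Supplier<String> pinnedCallSite(String name) {
        Supplier<String> pinned = globalCodeRefs.get(name);
        if (pinned == null) return runtimeCallSite(name);   // nothing to pin yet
        return pinned;
    }

    /** Resolves the callee on every call (JVM path, so glob overrides are seen). */
    Supplier<String> runtimeCallSite(String name) {
        return () -> {
            Supplier<String> cv = globalCodeRefs.get(name);
            if (cv == null) throw new IllegalStateException("Undefined subroutine &" + name);
            return cv.get();
        };
    }

    public static void main(String[] args) {
        CodeRefLookupSketch stash = new CodeRefLookupSketch();
        stash.globalCodeRefs.put("Pkg::has", () -> "original");
        Supplier<String> pinned = stash.pinnedCallSite("Pkg::has");
        Supplier<String> dynamic = stash.runtimeCallSite("Pkg::has");

        stash.globalCodeRefs.put("Pkg::has", () -> "override");   // models `local *Pkg::has = sub {...}`
        System.out.println(pinned.get());    // still the CV captured at creation
        System.out.println(dynamic.get());   // sees the override

        stash.globalCodeRefs.remove("Pkg::has");                  // models `no ExporterThing`
        System.out.println(pinned.get());    // pinned CV survives the stash delete
    }
}
```

Pinning is what the `use` / `no` DSL case needs, while the runtime lookup is what `local *glob` overrides need, which is why the two backends are described above as taking different paths.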

### Regression tests

```bash
./gradlew shadowJar # ./jperl uses target/perlonjava-*.jar — rebuild after Java changes
timeout 120 ./jperl src/test/resources/unit/pr694_core_regressions.t
timeout 120 ./jperl src/test/resources/unit/eval_after_stash_delete.t
```

---

## Roadmap: `./jcpan -t CPANPLUS` — **detailed next steps**

**Latest harness (2026-05-16):** **`timeout 900 ./jcpan -t CPANPLUS`** → **PASS** (**20** files, **1576** subtests, CPANPLUS **0.9916**); clean TAP re **`File/Copy`** after **`File/Copy.pm`** line **303** guard + fresh **`shadowJar`**.

### 0. Routine verification (every CPANPLUS-related push)

1. **`make`** (Gradle **`shadowJar`** + unit shards — required before PR updates per **`AGENTS.md`**).
2. **Interpreter / eval regressions:**
`timeout 120 ./jperl src/test/resources/unit/pr694_core_regressions.t`
`timeout 120 ./jperl src/test/resources/unit/eval_after_stash_delete.t`
3. **Smoke require** from a CPANPLUS tree (adjust paths):
`PERL5LIB="…/CPANPLUS-*/blib/lib:…/CPANPLUS-*/blib/arch:$PERL5LIB" timeout 120 ./jperl -e 'require CPANPLUS::Config'`
4. **Harness:** **`timeout 3600 ./jcpan -t CPANPLUS`** — capture full log under **`jcpan/`** / build output; note first failing **`t/`** program and TAP line.

### 1. ~~**`BUILD_PL` / `MAKEFILE`** barewords~~ — **Done**

See “Resolved … **`BUILD_PL` / `MAKEFILE`**” above.

### 2. ~~**`t/00`** `_version_to_number` / **`version`**~~ — **Done**

See “Resolved … **`_version_to_number`**” above.

### 3. ~~**`t/031`** SQLite source + **`DBIx::Simple`**~~ — **Done (2026-05-16, `master`)**

**`t/031_CPANPLUS-Internals-Source-SQLite.t`** and **`032_…via-sqlite`** pass under **`jcpan -t CPANPLUS`** after upstream **`DBIx::Simple`/JDBC chain** landed on **`master`**. Regression watch: **`dbh`** lifetime under heavy **`SQLite`** use.

### 4. ~~**`t/20`** **`CPANPLUS::Dist::MM`** / **`can_load`**~~ — **Done (2026-05-16, JVM)**

- **Symptom:** **`local *CPANPLUS::Dist::MM::can_load = sub { … }`** should change **`can_load(...)`** inside **`format_available`**; **`jperl`** was still calling the pre-local CV.
- **Fix:** **`EmitSubroutine.handleApplyOperator`** no longer embeds parser **`parseTimeCodeRef`** via **`registerCompiledCodeRef`**; **`&(bareword)`** always goes through **`getGlobalCodeRef`** so **`local *glob`** (**`RuntimeGlob.dynamicSaveState`** / **`replacePinnedCodeRef`**) wins. Interpreter path still uses **`parseTimeCodeRef`** (**pr694** / stash-delete pinning).
- **Check:** **`src/test/resources/unit/cpanplus_dist_mm_can_load_local.t`** (also rebuild **`shadowJar`** before spot-checking **`./jperl -e`**; a stale jar made **`local`** look broken).

### 5. ~~**`File::Copy`** **`$!`** / **`$^E`** warnings~~ — **Done**

See “Resolved … **`$^E`**” above.

### 6. Documentation + incident hygiene

- After each **`jcpan -t CPANPLUS`** run, update **this file** with: date, subtest totals, dubious-program count, and the **short list** of remaining failing **`t/`** scripts.
- Keep **`dev/modules/cpan_client.md`** in sync only when **`jcpan`** behavior or **`PERL5LIB`** layout changes.

---

## Progress tracking

| Area | Status | Notes |
|------|--------|--------|
| Empty list / **`require`** false negative | **Done** | Cloned **`CompilerOptions`**, **`eval`** clone fixes, bytecode last-stmt **`require`** parity, **`doFile`** empty-list coercion when **`$@`** is clean |
| **`make`** (unit shards) | **Done** | Run before pushing |
| **`jcpan -t CPANPLUS`** bootstrap | **Unblocked** | **`conf.pl`** + **`Selfupdate`** / **`Report`** no longer abort on **`Config`** |
| **`BUILD_PL` / `MAKEFILE` filetest/`stat`** | **Done** | ALLCAPS bareword handle heuristic vs **`->`** / defined package sub (**`FileHandle`** helper) |
| **`t/00`** version / **`numify`** | **Done** | **`normalizeVersionForPerlModule`** + **`Version.java`**; bare **`use 5.x.y`** splice (**`StatementParser`**) |
| **Strict + string `eval` + import/**`no`** (pr694)** | **Done** | **`SubroutineParser`** **`parseTimeCodeRef`** → **`BytecodeCompiler`**; **`pr694_core_regressions.t`**, **`eval_after_stash_delete.t`** |
| **`File::Copy`** **`$! + 0`** / **`$^E + 0`** warnings | **Done** | **`GlobalContext`**: **`$^E` → `ErrnoVariable`** (alias **`$!`** on POSIX) |
| **`t/031` SQLite Source** | **Done** | Covered by **`jcpan -t CPANPLUS`** PASS (2026-05-16); upstream **`DBIx::Simple`/JDBC** |
| **`t/20` Dist::MM / `can_load`** | **Done (JVM)** | **`EmitSubroutine`**: no **`registerCompiledCodeRef`** for **`&`** calls; **`cpanplus_dist_mm_can_load_local.t`** |

---

## Open questions

- Should the **`doFile`** empty-list coercion be tightened (e.g. only **`.pm`** paths, file size cap, circular-depth probe) vs keeping the current conservative **`compilationUnitFromRequireOrDo`** guard?
- **ASM `ArrayIndexOutOfBoundsException`** in frame compute during heavy BEGIN stacks: already falls back to interpreter — track reduction of fallback frequency?
14 changes: 14 additions & 0 deletions src/main/java/org/perlonjava/app/cli/CompilerOptions.java
@@ -85,6 +85,20 @@ public class CompilerOptions implements Cloneable {
* Perl 5 semantics.
*/
public String initialPackage = null;
/**
* True only when this compilation unit is the body loaded by {@code require} / {@code do}
* ({@link org.perlonjava.runtime.operators.ModuleOperators#doFile}). Used so codegen can treat
* the AST root as a file-level compilation unit regardless of earlier transforms.
*/
public boolean compilationUnitFromRequireOrDo = false;
/**
* Context from require/do/eval caller for this compilation unit (SCALAR/LIST/VOID).
* JVM codegen forces {@link org.perlonjava.runtime.runtimetypes.RuntimeContextType#RUNTIME}
* on {@link org.perlonjava.backend.jvm.EmitterContext}; {@link org.perlonjava.backend.jvm.EmitBlock}
* uses this for the final statement of file-level blocks so {@code require} sees the trailing
* {@code 1;} value. {@code -1} means unset (EmitBlock defaults to SCALAR).
*/
public int compilationUnitCallerContext = -1;
public boolean unicodeStdout = false; // -CO
public boolean unicodeStderr = false; // -CE
public boolean unicodeInput = false; // -CI (same as stdin)
@@ -11,6 +11,7 @@
import org.perlonjava.backend.jvm.InterpreterFallbackException;
import org.perlonjava.backend.jvm.JavaClassInfo;
import org.perlonjava.frontend.analysis.ConstantFoldingVisitor;
import org.perlonjava.frontend.astnode.AbstractNode;
import org.perlonjava.frontend.astnode.Node;
import org.perlonjava.frontend.lexer.Lexer;
import org.perlonjava.frontend.lexer.LexerToken;
@@ -128,6 +129,8 @@ public static RuntimeList executePerlCode(CompilerOptions compilerOptions,
int contextType = callerContext >= 0 ? callerContext :
(isTopLevelScript ? RuntimeContextType.VOID : RuntimeContextType.SCALAR);

compilerOptions.compilationUnitCallerContext = contextType;

// Create the compiler context
EmitterContext ctx = new EmitterContext(
new JavaClassInfo(), // internal java class name
@@ -208,6 +211,11 @@ public static RuntimeList executePerlCode(CompilerOptions compilerOptions,
// bare constant identifiers (e.g., PI from `use constant PI => 3.14`).
ast = ConstantFoldingVisitor.foldConstants(ast, ctx.symbolTable.getCurrentPackage());

if (compilerOptions.compilationUnitFromRequireOrDo && ast instanceof AbstractNode rootAst
&& !rootAst.getBooleanAnnotation("blockIsSubroutine")) {
rootAst.setAnnotation("isFileLevelBlock", true);
}

if (ctx.compilerOptions.parseOnly) {
// Printing the ast
System.out.println(ast);
@@ -306,6 +314,8 @@ public static RuntimeList executePerlAST(Node ast,
}
}

compilerOptions.compilationUnitCallerContext = contextType;

EmitterContext ctx = new EmitterContext(
new JavaClassInfo(),
globalSymbolTable.snapShot(),
@@ -601,10 +611,16 @@ private static RuntimeCode compileToExecutable(Node ast, EmitterContext ctx) thr
if (CompilerOptions.DEBUG_ENABLED) ctx.logDebug("Falling back to bytecode interpreter due to method size");
// Reset strict/feature/warning flags before fallback compilation.
// The JVM compiler already processed BEGIN blocks (use strict, etc.)
// which set these flags on ctx.symbolTable. But the interpreter will
// which set those flags on ctx.symbolTable. But the interpreter will
// re-process those pragmas during execution, so inheriting them causes
// false strict violations (e.g. bareword filehandles rejected).
if (ctx.symbolTable != null) {
//
// Skip this reset for require/do compilation units: clearing strict hints
// before recompiling large modules (e.g. CPANPLUS::Config) has been observed
// to interact badly with unit initialization; those files rely on compile-time
// hints accumulated during the failed JVM compile pass matching execution.
if (ctx.symbolTable != null
&& !(ctx.compilerOptions != null && ctx.compilerOptions.compilationUnitFromRequireOrDo)) {
ctx.symbolTable.strictOptionsStack.pop();
ctx.symbolTable.strictOptionsStack.push(0);
}
@@ -672,6 +688,8 @@ public static Object compilePerlCode(CompilerOptions compilerOptions) throws Exc
globalSymbolTable.enableStrictOption(Strict.HINT_UTF8);
}

compilerOptions.compilationUnitCallerContext = RuntimeContextType.SCALAR;

EmitterContext ctx = new EmitterContext(
new JavaClassInfo(),
globalSymbolTable.snapShot(),