alx-0014 - Refactor syntax of preprocessing directives #8270

Open
alejandro-colomar wants to merge 7 commits into cplusplus:main from alejandro-colomar:alx-0014

@alejandro-colomar

@alejandro-colomar alejandro-colomar commented Oct 1, 2025

Closes: #8249

Cc: @jwakely , @jensmaurer


Revisions:

v1b
  • Rebase after CWG Motion 3b (p3920).
$ git range-diff main..origin/alx-0014 d8bd209c..alx-0014 
1:  ed2683a4 = 1:  19c52494 (alx-0014r6-A) Refactor syntax of include directives
2:  c6ce3de1 = 2:  f99709f8 (alx-0014r6-B) Refactor syntax of embed directives
3:  600fa4c2 = 3:  47b7a580 (alx-0014r6-C) Refactor syntax of macros
4:  22509106 = 4:  1d405f7f (alx-0014r6-D) Refactor syntax of conditional directives
5:  61fd02e7 = 5:  823c6a4d (alx-0014r6-E) Refactor syntax of line directives
6:  fb1efad7 = 6:  c117fa30 (alx-0014r6-F) Refactor syntax of diagnostic directives
7:  2f2dee42 = 7:  4425d456 (alx-0014r6-G) Refactor syntax of pragma directives
8:  9b4bf160 = 8:  5f136b8b (alx-0014r6-H) Refactor syntax of null directives
v1c
  • Rebase after CWG Motion 4 (p3868r3). Motion 4 did apply the change from alx-0014r6-E, so that patch has been dropped.
$ git range-diff 19c52494^..origin/alx-0014 6cdd68b3..alx-0014 
1:  19c52494 = 1:  f32fe47f (alx-0014r6-A) Refactor syntax of include directives
2:  f99709f8 = 2:  4900dba2 (alx-0014r6-B) Refactor syntax of embed directives
3:  47b7a580 ! 3:  0e7d29b1 (alx-0014r6-C) Refactor syntax of macros
    @@ source/preprocessor.tex
     -    \terminal{\# undef \ } identifier new-line\br
     +    define-directive\br
     +    undef-directive\br
    -     \terminal{\# line \ \ } pp-tokens new-line\br
    +     line-directive\br
          \terminal{\# error \ } \opt{pp-tokens} new-line\br
          \terminal{\# warning} \opt{pp-tokens} new-line\br
     @@
4:  1d405f7f ! 4:  88c8cdaa (alx-0014r6-D) Refactor syntax of conditional directives
    @@ Commit message
     
      ## source/preprocessor.tex ##
     @@
    -     \terminal{\# }new-line
    +     line-directive \opt{line-directives}
      \end{bnf}
      
     -\begin{bnf}
    @@ source/preprocessor.tex
     -    \terminal{\# endif \ } new-line
     -\end{bnf}
     -
    - \begin{bnf}
    +-\begin{bnf}
      \nontermdef{text-line}\br
          \opt{pp-tokens} new-line
    + \end{bnf}
     @@ source/preprocessor.tex: has been replaced.
      \indextext{preprocessing directive!conditional inclusion}%
      \indextext{inclusion!conditional|see{preprocessing directive, conditional inclusion}}
5:  823c6a4d < -:  -------- (alx-0014r6-E) Refactor syntax of line directives
6:  c117fa30 = 5:  7ac593dc (alx-0014r6-F) Refactor syntax of diagnostic directives
7:  4425d456 = 6:  a2dc93b6 (alx-0014r6-G) Refactor syntax of pragma directives
8:  5f136b8b = 7:  e2ebd3bf (alx-0014r6-H) Refactor syntax of null directives
v1d
  • Rebase.
$ git range-diff f32fe47f^..origin/alx-0014 wg21/main..alx-0014 
1:  f32fe47f = 1:  8a7862f5 (alx-0014r6-A) Refactor syntax of include directives
2:  4900dba2 = 2:  90e5fd46 (alx-0014r6-B) Refactor syntax of embed directives
3:  0e7d29b1 = 3:  cf3a68ab (alx-0014r6-C) Refactor syntax of macros
4:  88c8cdaa = 4:  9b795749 (alx-0014r6-D) Refactor syntax of conditional directives
5:  7ac593dc = 5:  59c1c171 (alx-0014r6-F) Refactor syntax of diagnostic directives
6:  a2dc93b6 = 6:  1b3c1d67 (alx-0014r6-G) Refactor syntax of pragma directives
7:  e2ebd3bf = 7:  609580fe (alx-0014r6-H) Refactor syntax of null directives
v1e
  • Fix rebasing accident from v1c.
$ git range-diff 8a7862f5^ origin/alx-0014 alx-0014 
1:  8a7862f5 = 1:  8a7862f5 (alx-0014r6-A) Refactor syntax of include directives
2:  90e5fd46 = 2:  90e5fd46 (alx-0014r6-B) Refactor syntax of embed directives
3:  cf3a68ab = 3:  cf3a68ab (alx-0014r6-C) Refactor syntax of macros
4:  9b795749 ! 4:  acbd537d (alx-0014r6-D) Refactor syntax of conditional directives
    @@ source/preprocessor.tex
     -    \terminal{\# endif \ } new-line
     -\end{bnf}
     -
    --\begin{bnf}
    + \begin{bnf}
      \nontermdef{text-line}\br
          \opt{pp-tokens} new-line
    - \end{bnf}
     @@ source/preprocessor.tex: has been replaced.
      \indextext{preprocessing directive!conditional inclusion}%
      \indextext{inclusion!conditional|see{preprocessing directive, conditional inclusion}}
5:  59c1c171 = 5:  ebc741a1 (alx-0014r6-F) Refactor syntax of diagnostic directives
6:  1b3c1d67 = 6:  8441d23b (alx-0014r6-G) Refactor syntax of pragma directives
7:  609580fe = 7:  3a31df2b (alx-0014r6-H) Refactor syntax of null directives

@alejandro-colomar
Author

alejandro-colomar commented Oct 1, 2025

I wasn't able to build the draft PDF locally, so this is untested.

@alejandro-colomar
Author

alejandro-colomar commented Oct 5, 2025

@jwakely , @jensmaurer

I recommend reviewing these commits with git-show(1)'s --color-moved, which shows that all of this is moving text around, with no real changes (other than the additions of *-directive syntax entries, of course).

Is it okay like this? Or do you prefer it squashed in one commit?

(I wish github had this --color-moved in the WebUI. Actually, I wish I wasn't using github. :) )

@alejandro-colomar
Author

Ping. :)

@AlisdairM
Contributor

As we are in the ballot resolution period for the C++26 standard, I would not be waiting too eagerly for a change of this scale to be reviewed in the next six months or so. It might happen, as we want to produce the highest quality standard that we can, but we also have significant work to resolve NB concerns and publish the completed standard, which clearly has our undivided attention until then.

@alejandro-colomar
Author

> As we are in the ballot resolution period for the C++26 standard, I would not be waiting too eagerly for a change of this scale to be reviewed in the next six months or so. It might happen, as we want to produce the highest quality standard that we can, but we also have significant work to resolve NB concerns and publish the completed standard, which clearly has our undivided attention until then.

Thanks! That's useful info to me. :)

Cheers,
Alex

@tkoeppe
Contributor

tkoeppe commented Oct 31, 2025

Also, even though grammar productions are ultimately not observable, I'd hesitate to call such a change editorial, since the wording groups made a deliberate choice in the manner of presentation. I would prefer if such a proposed change were delivered as an (editorial) paper addressed at the relevant wording group (CWG in this case).

@tkoeppe added the needs-paper label ("The proposed change should be written up and published as a paper.") Oct 31, 2025
@alejandro-colomar
Author

@tkoeppe

> Also, even though grammar productions are ultimately not observable, I'd hesitate to call such a change editorial, since the wording groups made a deliberate choice in the manner of presentation.

Why do you claim "the wording groups" made a deliberate choice in the manner of presentation?

This wording has been kept essentially intact since ANSI C89.

The only choice made since then came with the addition of modules, which is the only significant change to the directives since ANSI C89.

Guess what? They added modules in the same manner I'm proposing now. If you notice, the only thing I haven't moved is the syntax for modules, precisely because it's organized exactly as I'm proposing for the rest of the directives.

So, the only deliberate choice that has actually been made about this in the last 35 years went in the same direction as my proposal.

See the current syntax for pp-import, FWIW.

> I would prefer if such a proposed change were delivered as an (editorial) paper addressed at the relevant wording group (CWG in this case).

You have my proposal for wg14, which is written in a paper. With minor modifications, you can get it to apply to C++. In fact, you can just skip the "proposed wording", as you have here the actual diff, and read only the rationale, which applies entirely.

I'll paste the rationale here (I'll paste the one from the updated paper I'll present soon):

Rationale
        Editorial change.  This moves text around for better
        organization and readability.  No semantic changes.  This makes
        it so that any future proposals to the preprocessor will be
        easier to apply.

        The proposal is split in subsections, separated by '---', for
        better readability.

        This proposal applies as is to the C++ latest draft, N5014, only
        changing section numbers and their titles.

        This change has prior art in C++, as they've done the same exact
        thing with their pp-import directive.  I guess they haven't done
        it with the other directives for compatibility with us, so let's
        help them and do it everywhere.

    Interaction with other proposals
        This blocks alx-0003 ("Add directives #def and #enddef")
        This blocks alx-0013 ("Prohibit non-directives (other than ID directives)").

Here's the wg14 paper:
https://www.open-std.org/jtc1/sc22/wg14/www/docs/n3632.txt

@jensmaurer
Member

jensmaurer commented Nov 2, 2025

> Why do you claim "the wording groups" made a deliberate choice in the manner of presentation?

A choice made four decades ago can still qualify as a deliberate choice, even if we prefer to phrase things differently now.

@alejandro-colomar
Author

alejandro-colomar commented Nov 2, 2025

> > Why do you claim "the wording groups" made a deliberate choice in the manner of presentation?
>
> A choice made four decades ago can still qualify as a deliberate choice, even if we prefer to phrase things differently now.

But WG21 has already decided to not follow that anymore, when modules were added. That makes it an editorial change to align the rest of the directives with modules. I don't think this needs to waste much time from subgroups.

15.1 (Preprocessing directives :: Preamble) refers to pp-import

control-line :
  # include pp-tokens new-line
  pp-import
  # define identifier replacement-list new-line
  # define identifier lparen identifier-list_opt ) replacement-list new-line
  # define identifier lparen ... ) replacement-list new-line
  # define identifier lparen identifier-list , ... ) replacement-list new-line
  # undef identifier new-line
  # line pp-tokens new-line
  # error pp-tokens_opt new-line
  # warning pp-tokens_opt new-line
  # pragma pp-tokens_opt new-line
  # new-line

https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2023/n4950.pdf#section.15.1

which is defined in 15.5 (Header unit importation)

pp-import :
  export_opt import header-name pp-tokens_opt ; new-line
  export_opt import header-name-tokens pp-tokens_opt ; new-line
  export_opt import pp-tokens ; new-line

https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2023/n4950.pdf#section.15.5

That's how the syntax for all directives should be defined.
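
As a rough sketch of that pattern applied to one directive: the range-diff for v1c shows `control-line` listing `undef-directive` in place of the old `# undef` alternative, so the moved production could look like the block below. The placement (the subclause that describes `#undef`) and the exact spacing inside `\terminal{}` are assumptions modelled on the existing `line-directive` definition, not a copy of the actual patch.

```latex
% Sketch only: mirrors how line-directive is defined next to its prose.
% Assumed location: the subclause that specifies #undef.
% The production body is the one that control-line used to spell out inline.
\begin{bnf}
\nontermdef{undef-directive}\br
    \terminal{\# undef} identifier new-line
\end{bnf}
```

With this in place, `control-line` only names the nonterminal (`undef-directive\br`), exactly as it already does for `pp-import`.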

@tkoeppe
Contributor

tkoeppe commented Nov 2, 2025

All the same, CWG is used to the current grammar, and I would not want to change it without their awareness. It seems that Jens has already said as much in #8249 (comment).

But perhaps we can just get CWG to take a look at this pull request to get us started. @jensmaurer Could we slot this in somewhere?

@alejandro-colomar
Author

> But perhaps we can just get CWG to take a look at this pull request to get us started. @jensmaurer Could we slot this in somewhere?

@jensmaurer , @tkoeppe Were you able to have a look at this?

@alejandro-colomar
Author

Approved in WG14.

@wg21bot added the needs rebase label ("The pull request needs a git rebase to resolve merge conflicts.") Apr 20, 2026
@alejandro-colomar
Author

It seems it needs another rebase.

@tkoeppe will you do that, or should I?

@tkoeppe
Contributor

tkoeppe commented May 7, 2026

Please do, probably easiest on your end.

@alejandro-colomar
Author

alejandro-colomar commented May 7, 2026

> Please do, probably easiest on your end.

Ahh, sorry, I saw a message in this PR saying that you had force-pushed, but now I see it refers to the main branch, not this PR. I thought you had rebased this PR.

Yes, I'll rebase this PR myself.

@alejandro-colomar force-pushed the alx-0014 branch 2 times, most recently from 5f136b8 to e2ebd3b, May 12, 2026 11:36
Signed-off-by: Alejandro Colomar <alx@kernel.org>
Signed-off-by: Alejandro Colomar <alx@kernel.org>
Signed-off-by: Alejandro Colomar <alx@kernel.org>
@alejandro-colomar
Author

alejandro-colomar commented May 12, 2026

Done; I've rebased. An interesting thing is that the conflict was because p3868r3 had already applied some of the changes I was proposing, for line directives.

Signed-off-by: Alejandro Colomar <alx@kernel.org>
Signed-off-by: Alejandro Colomar <alx@kernel.org>
Signed-off-by: Alejandro Colomar <alx@kernel.org>
Signed-off-by: Alejandro Colomar <alx@kernel.org>
@tkoeppe
Contributor

tkoeppe commented May 12, 2026

@jensmaurer: should CWG have a look here, so at least they're all on the same page?

@jensmaurer added the cwg label ("Issue must be reviewed by CWG.") and removed the needs rebase label ("The pull request needs a git rebase to resolve merge conflicts.") May 12, 2026
@jensmaurer
Member

@tkoeppe Yes, this touches grammar, and preprocessor grammar is sometimes easy to anger, so this definitely needs CWG processing.

@alejandro-colomar
Author

alejandro-colomar commented May 12, 2026

> @tkoeppe Yes, this touches grammar, and preprocessor grammar is sometimes easy to anger, so this definitely needs CWG processing.

While this "touches" grammar, it only moves grammar specification from one subsection to others, without any text changes (you can check that with git show --color-moved).

Comment thread source/preprocessor.tex
\indextext{preprocessing directive!embed a resource}
\indextext{\idxcode{\#embed}}%

\begin{bnf}
Member


This gets us a "hanging paragraph" in ISO parlance (text directly in a subsection that also has further sub-subsections).

Author


I based the diff on related existing code. There is, for example, a recent commit that I believe did the same thing:

commit 2c473cc8744fc5b2077349add2283826061ce881
Author: Jens Maurer <Jens.Maurer@gmx.net>
Date:   2025-11-09 20:46:24 +0100

    P3868R1 Allow #line before module declarations
    
    Fixes NB US 55-102 (C++26 CD).
    
    Editorial note:
    * Retained position of #line alternative within control-line.

diff --git a/source/preprocessor.tex b/source/preprocessor.tex
index 487c0360..e449e830 100644
--- a/source/preprocessor.tex
+++ b/source/preprocessor.tex
@@ -16,11 +16,11 @@
     module-file
 \end{bnf}
 
 \begin{bnf}
 \nontermdef{module-file}\br
-    \opt{pp-global-module-fragment} pp-module \opt{group} \opt{pp-private-module-fragment}
+    \opt{line-directives} \opt{pp-global-module-fragment} pp-module \opt{group} \opt{pp-private-module-fragment}
 \end{bnf}
 
 \begin{bnf}
 \nontermdef{pp-global-module-fragment}\br
     \keyword{module} \terminal{;} new-line \opt{group}
@@ -53,17 +53,22 @@
     \terminal{\# define } identifier replacement-list new-line\br
     \terminal{\# define } identifier lparen \opt{identifier-list} \terminal{)} replacement-list new-line\br
     \terminal{\# define } identifier lparen \terminal{... )} replacement-list new-line\br
     \terminal{\# define } identifier lparen identifier-list \terminal{, ... )} replacement-list new-line\br
     \terminal{\# undef \ } identifier new-line\br
-    \terminal{\# line \ \ } pp-tokens new-line\br
+    line-directive\br
     \terminal{\# error \ } \opt{pp-tokens} new-line\br
     \terminal{\# warning} \opt{pp-tokens} new-line\br
     \terminal{\# pragma } \opt{pp-tokens} new-line\br
     \terminal{\# }new-line
 \end{bnf}
 
+\begin{bnf}
+\nontermdef{line-directives}\br
+    line-directive \opt{line-directives}
+\end{bnf}
+
 \begin{bnf}
 \nontermdef{if-section}\br
     if-group \opt{elif-groups} \opt{else-group} endif-line
 \end{bnf}
 
@@ -2060,10 +2065,15 @@ a macro name.
 
 \rSec1[cpp.line]{Line control}%
 \indextext{preprocessing directive!line control}%
 \indextext{\idxcode{\#line}|see{preprocessing directive, line control}}
 
+\begin{bnf}
+\nontermdef{line-directive}\br
+    \terminal{\# line} pp-tokens new-line
+\end{bnf}
+
 \pnum
 The \grammarterm{string-literal} of a
 \tcode{\#line}
 directive, if present,
 shall be a character string literal.

Am I missing something?

Author

@alejandro-colomar alejandro-colomar May 13, 2026


Oh, now I see. Embed has subsections, while the others don't. What do you suggest? I don't know what should be done here. Is there any part of the standard that I could imitate for this?

Maybe put it in General (cpp.embed.gen)?
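
A minimal sketch of that option, to make the idea concrete. It assumes the nonterminal is called `embed-directive`, that `[cpp.embed]` is an `\rSec1` titled "Resource inclusion" with a first sub-subsection `[cpp.embed.gen]` titled "General", and that the production body is `# embed pp-tokens new-line`; none of these details are taken from the patch itself, so they would need checking against the draft sources.

```latex
% Sketch only; section titles, nonterminal name and production body are assumptions.
\rSec1[cpp.embed]{Resource inclusion}%
\indextext{preprocessing directive!embed a resource}
\indextext{\idxcode{\#embed}}%

% No text or grammar directly under \rSec1: that would be the hanging paragraph.
\rSec2[cpp.embed.gen]{General}

% The grammar moves to the start of the General sub-subsection instead.
\begin{bnf}
\nontermdef{embed-directive}\br
    \terminal{\# embed} pp-tokens new-line
\end{bnf}
```

The other directives would not need this treatment, since their subsections have no sub-subsections and so cannot produce a hanging paragraph.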


Labels

cwg: Issue must be reviewed by CWG.
needs-paper: The proposed change should be written up and published as a paper.

Development

Successfully merging this pull request may close these issues.

alx-0014 - Refactor syntax of preprocessing directives

5 participants