genksyms: fix last 3 shift/reduce conflicts

The genksyms parser has ambiguities in its grammar, which are currently
suppressed by a workaround in scripts/genksyms/Makefile.

Building genksyms with W=1 generates the following warnings:

    YACC    scripts/genksyms/parse.tab.[ch]
  scripts/genksyms/parse.y: warning: 3 shift/reduce conflicts [-Wconflicts-sr]
  scripts/genksyms/parse.y: note: rerun with option '-Wcounterexamples' to generate conflict counterexamples

The ambiguity arises when decl_specifier_seq is followed by '(' because
the following two interpretations are possible:

  - decl_specifier_seq direct_abstract_declarator '(' parameter_declaration_clause ')'
  - decl_specifier_seq '(' abstract_declarator ')'

This issue occurs because the current parser allows an empty string to
be reduced to direct_abstract_declarator, which is incorrect.

K&R [1] explains the correct grammar:

    <parameter-declaration> ::= {<declaration-specifier>}+ <declarator>
                              | {<declaration-specifier>}+ <abstract-declarator>
                              | {<declaration-specifier>}+

    <abstract-declarator> ::= <pointer>
                            | <pointer> <direct-abstract-declarator>
                            | <direct-abstract-declarator>

    <direct-abstract-declarator> ::=  ( <abstract-declarator> )
                                   | {<direct-abstract-declarator>}? [ {<constant-expression>}? ]
                                   | {<direct-abstract-declarator>}? ( {<parameter-type-list>}? )

This commit resolves all remaining conflicts.

We need to consider the difference between the following two examples:

[Example 1] ( <abstract-declarator> ) can become <direct-abstract-declarator>

        void my_func(int (foo));

    ... is equivalent to:

        void my_func(int foo);

[Example 2] ( <parameter-type-list> ) can become <direct-abstract-declarator>

        typedef int foo;
        void my_func(int (foo));

    ... is equivalent to:

        void my_func(int (*callback)(int));

Please note that the function declaration is identical in both examples,
but the preceding typedef creates the distinction. I introduced a new
term, open_paren, to enable the type lookup immediately after the '('
token. Without this, we cannot distinguish between [Example 1] and
[Example 2].

[1]: https://cs.wmich.edu/~gupta/teaching/cs4850/sumII06/The%20syntax%20of%20C%20in%20Backus-Naur%20form.htm

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Acked-by: Nicolas Schier <n.schier@avm.de>
This commit is contained in:
Masahiro Yamada 2025-01-14 00:00:43 +09:00
parent 3ccda63a3a
commit 668de2b9d4

View File

@ -363,35 +363,47 @@ parameter_declaration_list:
;
parameter_declaration:
decl_specifier_seq abstract_declarator
decl_specifier_seq abstract_declarator_opt
{ $$ = $2 ? $2 : $1; }
;
abstract_declarator_opt:
/* empty */ { $$ = NULL; }
| abstract_declarator
;
abstract_declarator:
ptr_operator abstract_declarator
ptr_operator
| ptr_operator abstract_declarator
{ $$ = $2 ? $2 : $1; }
| direct_abstract_declarator
{ $$ = $1; dont_want_type_specifier = false; }
;
direct_abstract_declarator:
/* empty */ { $$ = NULL; }
| IDENT
IDENT
{ /* For version 2 checksums, we don't want to remember
private parameter names. */
remove_node($1);
$$ = $1;
}
| direct_abstract_declarator '(' parameter_declaration_clause ')'
| direct_abstract_declarator open_paren parameter_declaration_clause ')'
{ $$ = $4; }
| direct_abstract_declarator '(' error ')'
| direct_abstract_declarator open_paren error ')'
{ $$ = $4; }
| direct_abstract_declarator BRACKET_PHRASE
{ $$ = $2; }
| '(' abstract_declarator ')'
| open_paren parameter_declaration_clause ')'
{ $$ = $3; }
| '(' error ')'
| open_paren abstract_declarator ')'
{ $$ = $3; }
| open_paren error ')'
{ $$ = $3; }
| BRACKET_PHRASE
;
open_paren:
'(' { $$ = $1; dont_want_type_specifier = false; }
;
function_definition: