Is reusing rules always slower than redefining them with tokens in Antlr4?

I am profiling my Antlr4-generated parser in JavaScript. I have a few rules that match ID | STRING.

Lexer

ID
 : [a-zA-Z_] [a-zA-Z_0-9]*
 ;

STRING
 : '"' (~["\r\n] | '""')* ('"'|[\r\n])?
 ;

Parser

name: ID | STRING ;

rule1: some other rules;
rule2: different rules;

some: ID | STRING ;
different: ID | STRING ;

If I change some to some: name; and different to different: name; performance drops by about 30%: the time to parse a given input 100 times goes up from 1.5s to about 2s.
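For reference, a timing comparison like the one above could be taken with a small harness along these lines. This is only a sketch, not my actual profiling code: `timeParses`, `slowdown`, and the `parseOnce` callback are hypothetical names, and `parseOnce` is assumed to wrap whatever builds the generated lexer and parser and invokes the start rule.

```javascript
// Use a high-resolution clock when available, otherwise fall back to Date.now().
const now = (typeof performance !== 'undefined' && typeof performance.now === 'function')
  ? () => performance.now()
  : () => Date.now();

// Hypothetical helper: calls parseOnce(input) `runs` times and returns the
// elapsed time in milliseconds. parseOnce is assumed to run one full parse.
function timeParses(parseOnce, input, runs = 100, warmup = 10) {
  // Untimed warm-up runs so the JavaScript engine can optimize the parser first.
  for (let i = 0; i < warmup; i++) parseOnce(input);
  const start = now();
  for (let i = 0; i < runs; i++) parseOnce(input);
  return now() - start;
}

// Relative slowdown of candidateMs against baselineMs, as a fraction.
function slowdown(baselineMs, candidateMs) {
  return (candidateMs - baselineMs) / baselineMs;
}
```

With the numbers above, `slowdown(1500, 2000)` comes out to roughly 0.33, which is the "about 30%" I am referring to.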

In this case, name only ever matches a single token, so I would not expect the rule itself to add much overhead. We have 8 other places using ID | STRING, and the 30% figure was measured after I replaced all of them with name.

The testing code is:

x = B."method. {a, b} 1"(1,2)

In the above code, the following will be matched by "ID | STRING":

  1. x
  2. B
  3. "method. {a, b} 1"
  4. 1
  5. 2

Is my assumption stated in the title correct?
