Stories
Slash Boxes
Comments

Dev.SN ♥ developers

posted by martyb on Saturday February 13 2021, @05:54PM   Printer-friendly

Title: Parsing C++ Is Literally Undecidable (2013)

--- --- --- --- Entire Story Below - Must Be Edited --- --- --- --- --- --- ---

Arthur T Knackerbracket has found the following story:

Many programmers are aware that C++ templates are Turing-complete, and this was proved in the 2003 paper C++ Templates are Turing Complete.

However, there is an even stronger result that many people are not aware of. The C++ FQA has a section showing that parsing C++ is undecidable, but many people have misinterpreted the full implications of this (understandable, since the FQA is discussing several issues over the course of its questions and does not make explicit the undecidability proof).

Some people misinterpret this statement to simply mean that fully compiling a C++ program is undecidable, or that showing the program valid is undecidable. This line of thinking presumes that constructing a parse tree is decidable, but only further stages of the compiler such as template instantiation are undecidable.

For example, see this (incorrect, but top-voted) Stack Overflow answer to the question What do people mean when they say C++ has “undecidable grammar”? This answer errs when it says: “Note this has nothing to do with the ambiguity of the C++ grammar.”

In fact, simply producing a parse tree for a C++ program is undecidable, because producing a parse tree can require arbitrary template instantiation. I will demonstrate this with a short program, which is a simplification/adaptation of what is in the FQA link above.

The parse tree for this program depends on whether TuringMachine::output is SomeType or not. If it is SomeType then ::name is an integer and the parse tree for the program is multiplying two integers and throwing away the result. If it is not SomeType, then ::name is a typedef for int and the parse tree is declaring a pointer-to-int named x. These two are completely different parse trees, and the difference between them cannot be delayed to further stages of the compiler.

The parse tree itself depends on arbitrary template instantiation, and is therefore the parsing step is undecidable.

In practice, compilers limit template instantiation depth, so this is more of a theoretical problem than a practical one. But it is still a deep and significant result if you are ever planning on writing a C++ parser.

Parsing, performance, and low-level programming.


Original Submission

 

Post Comment

Edit Comment You are not logged in. You can log in now using the convenient form below, or Create an Account, or post as Anonymous Coward.

Public Terminal

Anonymous Coward [ Create an Account ]

Use the Preview Button! Check those URLs!


Score: 0 (Logged-in users start at Score: 1). Create an Account!

Allowed HTML
<b i p br a ol ul li dl dt dd em strong tt blockquote div ecode quote sup sub abbr sarc sarcasm user spoiler del s strike>

URLs
will auto-link a URL

Important Stuff

  • Please try to keep posts on topic.
  • Try to reply to other people's comments instead of starting new threads.
  • Read other people's messages before posting your own to avoid simply duplicating what has already been said.
  • Use a clear subject that describes what your message is about.
  • Offtopic, Inflammatory, Inappropriate, Illegal, or Offensive comments might be moderated. (You can read everything, even moderated posts, by adjusting your threshold on the User Preferences Page)
  • If you want replies to your comments sent to you, consider logging in or creating an account.

If you are having a problem with accounts or comment posting, please yell for help.