Talk:C preprocessor

WikiProject C/C++ (Rated Start-class, Top-importance)

do-while-0 and if-1-else

The standard trick of writing macro statements in a do loop is especially important when writing "wrapper" macros that include their arguments as statements:

#define WITH_FOO(stmts) \
do {                    \
  int __ofoo=foo;       \
  foo=1;                \
  stmts;                \
  foo=__ofoo;           \
} while (0)

The stmts; means that the provided code can end with or without a semicolon (although commas are still problematic!), and the overall structure is always one statement (for safety in putting calls to it in control structures) and expects a trailing semicolon (so that it looks like a function call; automatic indentation will like it better). The braces also mean that an if inside the argument cannot attach to an else following the macro call. But there's another option:

#define WITH_FOO(stmts) \
if(1) {                 \
  int __ofoo=foo;       \
  foo=1;                \
  stmts;                \
  foo=__ofoo;           \
} else

This is very similar, but has the improvement of being transparent to break and continue. Disadvantages are that the else is capable of binding to any following statement, and that some compilers may issue warnings about the empty else or about the use of if-if-else-else without braces

There is NO compiler that will complain about an embedded if-else construct without braces, which has been entirely standard since the earliest incarnations of C and remains so in all modern C and C++ compilers. Such warnings would be really harmful if they were enabled by default (though some projects do set stricter, lint-like checks). Such a warning would be extremely pedantic, and used only for debugging the source, in a specific precompilation rule used to find the location of missing braces, where a lot of warnings will be expected by the programmer... verdy_p (talk) 20:09, 28 November 2009 (UTC)

in constructs like these:

  if(need_foo) WITH_FOO(foofunc());
  else barfunc();

I suppose that there is a healthy debate on the subject. When you're writing a multiline but non-wrapper macro, you know if there are loop control statements in the loop, and you should prefer do-while-0 in their absence. Is there a consensus otherwise? Should we include both in the article? Are there well-known sources for both styles? --Tardis (talk) 00:52, 13 March 2009 (UTC)
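
For anyone who wants to check the difference in break handling, here is a minimal sketch. It is not taken from the article: WITH_FOO_DO, WITH_FOO_IF, saved_foo and the two counting functions are names made up for illustration, saved_foo stands in for __ofoo only because identifiers containing a double underscore are reserved, and the if(1) form is closed with (void)0 so that its else cannot bind to a following statement.

```c
#include <assert.h>

static int foo = 0;

/* Wrapper in do-while(0) form: a break inside stmts only leaves the
   hidden one-pass loop, not the caller's loop. */
#define WITH_FOO_DO(stmts)   \
    do {                     \
        int saved_foo = foo; \
        foo = 1;             \
        stmts;               \
        foo = saved_foo;     \
    } while (0)

/* Wrapper in if(1)-else form: break and continue inside stmts act on
   the caller's loop. */
#define WITH_FOO_IF(stmts)   \
    if (1) {                 \
        int saved_foo = foo; \
        foo = 1;             \
        stmts;               \
        foo = saved_foo;     \
    } else                   \
        (void)0

/* Counts the iterations that get past the wrapper (do-while form). */
int count_with_do(void)
{
    int passes = 0;
    for (int i = 0; i < 3; i++) {
        WITH_FOO_DO(break); /* swallowed by the hidden loop */
        passes++;
    }
    foo = 0; /* the break skipped the restore, so reset by hand */
    return passes;
}

/* Same loop using the if(1)-else form. */
int count_with_if(void)
{
    int passes = 0;
    for (int i = 0; i < 3; i++) {
        WITH_FOO_IF(break); /* leaves the for loop at once */
        passes++;
    }
    foo = 0; /* the restore was skipped here as well */
    return passes;
}

/* Hypothetical self-check stating the expected behaviour. */
void check_break_handling(void)
{
    assert(count_with_do() == 3);
    assert(count_with_if() == 0);
}
```

In both forms the break jumps over the restoring assignment, so a wrapper that must always restore its state should not be handed loop control statements at all.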

There is the possibility to insert a do-nothing statement in the else clause (such as "if(1){ ... } else ((void)0)" to discard the value explicitly, but not "if(1){ ... } else 0", which almost always generates a warning), possibly surrounded by a pragma to avoid the warning. Unfortunately the simple type name "void" is not valid as an expression alone (or it could still absorb a following statement in case of a missing semicolon, notably in C++ where it may be interpreted as an initial typecast to "void").
I also don't like the dangling else clause, because it can silently absorb the following statement if there's ever a semicolon missing after the macro invocation.
But if you use GCC, just surround the braced block with parentheses (a GNU statement expression), without if/else or do/while. In MS compilers, there's a do-nothing intrinsic void function which can work as well, without surrounding the block by a control statement. In all other cases, the do/while(0) is the best approach: the only risk is a compiler warning, not a compile error or an undetected error (like a missing semicolon). Warnings about do-nothing statements (or always-false conditions when using do/while) can also be controlled in the source file using the macro, because these warnings are non-standard (lint-like, possibly wrong) and must not be forced without a compiler option to disable them.
If you want to be transparent to break and continue, you can replace them by an explicit goto to a labelled statement after end of loop or switch (for break) or before end of loop (for continue) (and for code clarity, the label should include "break_" or "continue_" in its name, the goto being also always downward in a switch or do/while or for-loop, and always upward in a while-loop). verdy_p (talk) 13:18, 28 November 2009 (UTC)
Anyway, your proposed syntax will not work with most statements that are not single statements or that contain commas. I suggest this instead:
#define BEGIN_BLOCK  if (1) { 
#define END_BLOCK    ; } else

#define BEGIN_FOO  BEGIN_BLOCK int __ofoo = foo; foo = 1;
#define END_FOO    ; foo = __ofoo; END_BLOCK
or still with the alternative (which is still safer):
#define BEGIN_BLOCK  do {
#define END_BLOCK    ; } while(0)
that can be used as follows (with or without a semicolon terminating the middle statements, but with a required semicolon after END_FOO; its absence will cause a syntax error at "else else" or "while(0) else"):
  if (need_foo) BEGIN_FOO foofunc() END_FOO;
  else barfunc();
The implicit termination of if constructs with an optional additional else clause is a well-known syntax caveat of C/C++/Java/C#/J# (one that must also be solved in complex ways to avoid shift/reduce ambiguities in context-free language parser generators that take their decisions using a single look-ahead symbol without backtracking); other languages (including the C preprocessor) avoid it in a safer way by requiring an explicit "endif" construct in all cases (including in Pascal/Modula where the semicolon is still required to terminate the "if" statement even after the "end" keyword terminating a multi-statement block used as one of its "statement" clauses).
verdy_p (talk) 20:45, 28 November 2009 (UTC)
There's the
if (0); else {block}
construct to get rid of the "else" problem. 70.239.12.234 (talk) 19:47, 18 July 2011 (UTC)
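
A small sketch of what the inverted form buys, assuming the same foo-saving wrapper as earlier in this thread. WITH_FOO_DANGLING, WITH_FOO_CLOSED, observe and the other names are hypothetical and exist only for this illustration. The semicolon after each macro call is left out deliberately to show what happens to the next statement.

```c
#include <assert.h>

static int foo = 0;
static int seen_foo = 0; /* value of foo observed inside the wrapper */

static void observe(void) { seen_foo = foo; }

/* Dangling-else form: the trailing else binds to the next statement. */
#define WITH_FOO_DANGLING(stmts) \
    if (1) {                     \
        int saved_foo = foo;     \
        foo = 1;                 \
        stmts;                   \
        foo = saved_foo;         \
    } else

/* Inverted form: the block is the else branch, so nothing trails. */
#define WITH_FOO_CLOSED(stmts)   \
    if (0)                       \
        ;                        \
    else {                       \
        int saved_foo = foo;     \
        foo = 1;                 \
        stmts;                   \
        foo = saved_foo;         \
    }

int after_dangling(void)
{
    int after = 0;
    WITH_FOO_DANGLING(observe()) /* semicolon forgotten on purpose */
    after = 1;                   /* becomes the else branch: never runs */
    return after;
}

int after_closed(void)
{
    int after = 0;
    WITH_FOO_CLOSED(observe())   /* semicolon forgotten on purpose */
    after = 1;                   /* separate statement: always runs */
    return after;
}

/* Hypothetical self-check stating the expected behaviour. */
void check_else_binding(void)
{
    assert(after_dangling() == 0);
    assert(after_closed() == 1);
    assert(seen_foo == 1 && foo == 0);
}
```

The price is that the closed form is already a complete statement: written as "if (c) WITH_FOO_CLOSED(x); else y;" the extra semicolon ends the outer if, and the else that follows is a syntax error.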

#warning

Checked and added that the C-compilers by Intel and IBM also support the #warning directive. Is this sufficient to remove the weasel word warning? —Preceding unsigned comment added by 141.84.9.25 (talk) 15:06, 11 November 2009 (UTC)

Refer to the Preprocessor section —Preceding unsigned comment added by 203.91.193.5 (talk) 11:11, 12 January 2010 (UTC)

Indentation

I'm currently googling around trying to confirm / deny that the # symbol should be in the first column and that indentation may appear between it and the directive. This article surely should state the indentation rule, or state that it is a myth? Sweavo (talk) 13:53, 19 August 2010 (UTC)

There's no such rule; you may indent preprocessor directives as you wish. From the C99 standard (6.10):
A preprocessing directive consists of a sequence of preprocessing tokens that begins with
a # preprocessing token that (at the start of translation phase 4) is either the first character
in the source file (optionally after white space containing no new-line characters) or that
follows white space containing at least one new-line character, and is ended by the next
new-line character.
And the # and the word following it are separate tokens, so you can put spaces between them too. This is basically the same in C89 and C++. Rwessel (talk) 04:41, 5 November 2011 (UTC)
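
To illustrate the rule quoted above, here is a short sketch with white space both before the # and between the # and the directive name. FEATURE_LEVEL, INDENTED_VALUE and the function names are invented for the example.

```c
#include <assert.h>

#define FEATURE_LEVEL 2

    #if FEATURE_LEVEL > 1          /* white space before the # */
    #    define INDENTED_VALUE 42  /* white space between # and define */
    #else
    #    define INDENTED_VALUE 0
    #endif

int indented_value(void)
{
    return INDENTED_VALUE;
}

/* Hypothetical self-check: the indented directives were honoured. */
void check_indentation(void)
{
    assert(indented_value() == 42);
}
```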

I'm going to start moving sections over to wikibooks

The article is too detailed. Wikipedia is not an instruction manual. - Richfife (talk) 19:26, 28 February 2011 (UTC)

I came here looking for that detailed info (since there is a stackoverflow page referring to it here), a link to the wikibooks might be useful (at the least on this talk page): http://en.wikibooks.org/wiki/C_Programming/Preprocessor — Preceding unsigned comment added by 94.208.248.165 (talk) 09:13, 10 June 2012 (UTC)

You people are killing wiki. I used to come here looking for info, now I rarely visit wiki as I never find what I need on it anymore, just a bunch of bureaucrats trying to exercise influence over articles to get mod status. — Preceding unsigned comment added by 76.113.141.126 (talk) 00:52, 11 April 2013 (UTC)

Every time I go to the pharmacy to buy shoes, their selection totally sucks. Just a couple of cheap flip-flops. The pharmacy totally sucks. It's completely useless. I'm not going there anymore. - Richfife (talk) 17:17, 12 April 2013 (UTC)

Read above again. zzzz — Preceding unsigned comment added by 76.113.141.126 (talk) 01:41, 18 April 2013 (UTC)

If Wikipedia isn't supposed to be an instruction manual, then the link to the Wikibooks "instruction manual" should be moved to the top of the page. People often come to Wikipedia looking for what you describe as instructions, making Wikipedia vs Wikibooks into a form of ambiguity. This should be solved in approximately the same way as disambiguating similar pages: by putting the Wikibooks link at the TOP of the page where it will be noticed, instead of the BOTTOM where you will only notice it if you ALREADY know that the link exists (I initially missed it, and was going to complain about the lack of such a link until I double-checked the page). This is a basic usability issue (source? Myself, looking for a reference on XMacros, a matter of minutes ago. This really does come up, and IS an issue). — 104.63.66.180 (talk) 17:44, 21 December 2014 (UTC)

"OlderSmall" example should be much simpler

Encyclopedant (talk) 21:44, 24 September 2011 (UTC)

Syntax highlighting

I'd like to change the syntax highlighting to CPP from C (source lang="cpp"). The CPP highlighting is more attractive, and is what C (programming language) uses. Comments? Rwessel (talk) 01:04, 15 October 2011 (UTC)

Including files section

A perhaps pedantic point is that the description of including stdio.h as a text image is not strictly correct. The C standard headers do not actually have to be text files in any meaningful sense of the word, although most (all?) implementations do have text files for those. The implementation is allowed to handle the standard headers in a special way, so while "#include <stdio.h>" must have the defined result, it does *not* have to happen by including any sort of text file. In short, the explanation is at least potentially incorrect when applied (as it is) to one of the standard headers.

On the flip side, it's a pretty pedantic point, and I'm not actually aware of any implementations that don't treat the system headers as text files.

So there are three options:

(1) leave it alone, and accept that the text is not quite correct
(2) add additional text clarifying the (potential) special nature of the system headers
(3) change the example to include a non-system header instead (which must be a text file)

Frankly, I mostly lean towards option (1).

Comments? Rwessel (talk) 09:51, 22 March 2012 (UTC)

Pedantically, I would tend to your second option above, if only to keep editors from having this discussion again. A single sentence would be enough I think. I don't have the C language standard, but this question on Stack Overflow already quotes the relevant sections of the C++ standard.
Like you, I have never seen an implementation of C or C++ where the standard headers were not represented as text files. I have asked for examples of such implementations at the reference desk. —Tobias Bergemann (talk) 10:42, 22 March 2012 (UTC)

Phases

The Phases section states:

"The first four (of eight) phases of translation specified in the C Standard are:"


Which begs the question, What are the other four? Rojomoke (talk) 10:08, 22 October 2012 (UTC)

There are links to several versions of the C standard at the bottom of C (programming language). To quote:
5. Each escape sequence in character constants and string literals is
   converted to a member of the execution character set.
6. Adjacent character string literal tokens are concatenated and
   adjacent wide string literal tokens are concatenated.
7. White-space characters separating tokens are no longer
   significant.  Preprocessing tokens are converted into tokens.  The
   resulting tokens are syntactically and semantically analyzed and
   translated.
8. All external object and function references are resolved.  Library
   components are linked to satisfy external references to functions and
   objects not defined in the current translation.  All such translator
   output is collected into a program image which contains information
   needed for execution in its execution environment.
Basically the first four phases define what's normally thought of as preprocessing, although that's not made clear in the article (which I will fix in a minute). Phases five and six finish cleaning up the source after preprocessing, seven is what most people think of as the compilation process itself, and eight is linking. But the eight phases are *conceptual* phases which define the context in which the definition of the language within the standard is made. Almost no implementations follow them strictly in the way they operate, but they work as if they had, and the results are (hopefully!) indistinguishable from those of an implementation that had actually implemented the eight phases as separate steps. Rwessel (talk) 16:59, 22 October 2012 (UTC)
Ah, thanks. I'd misread it to mean there were eight phases in the preprocessing activity. Rojomoke (talk) 09:19, 24 October 2012 (UTC)
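
As a rough illustration of how phases 4 to 6 interact, the following sketch has a macro expanded in phase 4, an escape sequence converted in phase 5, and adjacent string literals joined in phase 6. VERSION, banner and the function names are made up for the example.

```c
#include <assert.h>
#include <string.h>

#define VERSION "1.2"

/* Phase 4 replaces VERSION with "1.2"; phase 6 then joins the three
   adjacent string literals into one. */
static const char banner[] = "v" VERSION "\n";

/* Phase 5 has already turned the \n escape into a single character,
   so the joined string has five characters before the terminator. */
size_t banner_length(void)
{
    return strlen(banner);
}

/* Hypothetical self-check stating the expected result. */
void check_phases(void)
{
    assert(strcmp(banner, "v1.2\n") == 0);
    assert(banner_length() == 5);
}
```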

Turing completeness?

Can we get a reference for the claim that the C preprocessor is Turing Complete? I have heard mention of it before, but only as a hack that required the use of external tools or something that required non-standard behavior, I forget which and where I heard it. It is a pretty big claim however that is not referenced. If it is true, is it true for the original C preprocessor, the C89, C99 or C11 preprocessor? — Preceding unsigned comment added by 212.77.163.111 (talk) 09:30, 18 February 2015 (UTC)

The article says it's not Turing Complete. It's far from it. I think you may have misread. - Richfife (talk) 15:30, 18 February 2015 (UTC)
It's not Turing complete because it lacks recursion or iteration. M4, which supports recursion, is. Rp (talk) 15:02, 19 February 2015 (UTC)
There exists an interpreter for brainfuck written entirely in the C preprocessor, so it is (surprisingly enough) Turing complete, although you still can't do much with it --Pasqui23(talk-please understand,my native language is not english) 00:38, 20 February 2015 (UTC)
Here's a quote from that exact page: "There has been much speculation on the turing completeness of the C Preprocessor. Does this act as a demonstrative proof that it is? The short answer - not really." - Richfife (talk) 00:51, 20 February 2015 (UTC)
I have to agree, this is not really what I'd consider a demonstration that the PP is Turing complete. Rwessel (talk) 05:24, 20 February 2015 (UTC)
The explanation is a little clearer. However it still presents the difference between the ability to express bounded iteration and the ability to express unbounded iteration as a "subtlety", while it is actually one of the fundamental cornerstones of reasoning about programming. With only fixed-bounded iteration, you can make do with a fixed (and precomputable) amount of memory, which makes implementing and reasoning about programs a lot easier. Some C compilers for embedded systems actually remove unbounded iteration and recursion from the language. Rp (talk) 17:39, 20 February 2015 (UTC)
This gets into what I call pure vs. applied engineering. Can you make a case that the preprocessor is sort of, kind of, Turing complete? Yes. Would Turing slap you upside the head if he was still around for doing it? Also yes. Presenting the preprocessor as Turing Complete isn't doing anybody any favors. - Richfife (talk) 18:40, 20 February 2015 (UTC)
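
To make the bounded-iteration point concrete, here is a minimal sketch of the kind of repetition the preprocessor can express: the count is fixed by how many macro levels are written out, and a macro's own name is not expanded again inside its expansion. REPEAT2, REPEAT4, REPEAT8, depth and the function names are illustrative only.

```c
#include <assert.h>

/* Each macro has a fixed expansion; the repetition count is baked in
   by the number of levels written out by hand. */
#define REPEAT2(stmt) stmt stmt
#define REPEAT4(stmt) REPEAT2(REPEAT2(stmt))
#define REPEAT8(stmt) REPEAT2(REPEAT4(stmt))

int count_eight(void)
{
    int n = 0;
    REPEAT8(n++;)
    return n;
}

/* A macro's own name is not replaced again during rescanning, so this
   is not recursion: depth becomes (depth + 1) exactly once, and the
   remaining depth refers to the variable. */
static int depth = 5;
#define depth (depth + 1)

int depth_once(void)
{
    return depth;
}

/* Hypothetical self-check stating the expected results. */
void check_bounded_expansion(void)
{
    assert(count_eight() == 8);
    assert(depth_once() == 6);
}
```

Getting sixteen repetitions means writing another macro level; nothing in the sketch can loop until a condition computed during expansion becomes false.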

#line example wrong?

The article includes this example about #line:

#line 314 "pi.c"
puts("line=" #__LINE__ " file=" __FILE__);

and claims it generates this:

puts("line=314 file=pi.c");

I've tried it and it gave me an error. I think the syntax of using # outside a macro definition is incorrect. This worked for me:

#define XSTRINGIFY(x) #x
#define STRINGIFY(x) XSTRINGIFY(x)
#line 314 "pi.c"
puts("line=" STRINGIFY(__LINE__) " file=" __FILE__);

As did this:

#line 314 "pi.c"
printf("line=%d file=%s\n", __LINE__, __FILE__);

Does that need to be corrected, or did I miss something? --pgimeno (talk) 22:30, 7 March 2015 (UTC)

Nope, you're right. Fixed. Rwessel (talk) 23:43, 7 March 2015 (UTC)
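
For reference, a compilable sketch of the two-level stringification discussed above, together with the one-level case that does not expand __LINE__. The array and function names are made up for the example; the macros are the ones from pgimeno's comment.

```c
#include <assert.h>
#include <string.h>

#define XSTRINGIFY(x) #x
#define STRINGIFY(x) XSTRINGIFY(x)

/* One level only: # is applied to the unexpanded argument. */
static const char one_level[] = XSTRINGIFY(__LINE__);

/* Two levels: __LINE__ is expanded before it reaches the # operator.
   The #line directive makes the line after it number 314 of "pi.c". */
#line 314 "pi.c"
static const char two_levels[] = "line=" STRINGIFY(__LINE__) " file=" __FILE__;

/* Hypothetical self-check stating the expected strings. */
void check_line_strings(void)
{
    assert(strcmp(one_level, "__LINE__") == 0);
    assert(strcmp(two_levels, "line=314 file=pi.c") == 0);
}
```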

Comments

The article does not explain how comments can be done for .h files. I came to wikipedia first in order to find out whether /* */ is valid or whether I have to use // instead. Now I have to google for another site. :-) 2A02:8388:1600:3280:BE5F:F4FF:FECD:7CB2 (talk) 17:10, 4 April 2016 (UTC)
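
On the question itself: both comment styles are accepted in headers, because comments are replaced by a single space in translation phase 3, before any directive is processed. A small sketch follows, assuming a C99 or later compiler for the // form; BUFFER_SIZE, RETRY_LIMIT and the function names are invented for the example.

```c
#include <assert.h>

/* Block comments are valid in any version of C, headers included. */
#define BUFFER_SIZE 64 /* they may also trail a directive */

// Line comments are valid from C99 onwards (and in C++).
#define RETRY_LIMIT 3 // this comment is not part of the macro body

int header_total(void)
{
    return BUFFER_SIZE + RETRY_LIMIT;
}

/* Hypothetical self-check: neither comment leaked into a macro. */
void check_header_comments(void)
{
    assert(header_total() == 67);
}
```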

Preprocessor Comments

Both GCC and MSVC have a concept of a preprocessor comment. If a line comment is preceded by a #, the comment is consumed by the preprocessor and not included in the preprocessor's output. This effect is only ever noticed if the compiler is run as a preprocessor only. So a line that starts with "#//" is replaced with an empty line. --192.107.155.6 (talk) 13:42, 20 July 2018 (UTC)

Phases text is overly specific

"The preprocessor simultaneously expands macros and, in the 1999 version of the C standard,[clarification needed] handles _Pragma operators."

It seems odd to focus on this specific operator. I would prefer

"The preprocessor simultaneously expands macros and handles operators."