LanternD's Castle

PhD Student in ECE @ MSU

C Coding Standard - Formatted

2018-08-26

A formatted "C coding standard" by Eno Thereska.

Preface

It was originally posted by Eno Thereska on his CMU website (https://users.ece.cmu.edu/~eno/coding/CCodingStandard.html).

I am just reposting it with better formatting for my own reference. I hope it helps you too.

He also has a C++ coding standard available at https://users.ece.cmu.edu/~eno/coding/CppCodingStandard.html.

C Coding Standard

Adapted from http://www.possibility.com/Cpp/CppCodingStandard.html and NetBSD’s style guidelines.

Names

Make Names Fit

Names are the heart of programming. In the past people believed knowing someone’s true name gave them magical power over that person. If you can think up the true name for something, you give yourself and the people coming after power over the code. Don’t laugh!

A name is the result of a long deep thought process about the ecology it lives in. Only a programmer who understands the system as a whole can create a name that “fits” with the system. If the name is appropriate everything fits together naturally, relationships are clear, meaning is derivable, and reasoning from common human expectations works as expected.

If you find all your names could be Thing and DoIt then you should probably revisit your design.

Function Names

Include Units in Names

If a variable represents time, weight, or some other unit then include the unit in the name so developers can more easily spot problems. For example:

uint32_t timeout_msecs;
uint32_t my_weight_lbs;
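
As a minimal sketch of why the unit suffix helps, the hypothetical helper below (secs_to_msecs and MSECS_PER_SEC are names invented for this illustration, not part of the original standard) converts between two units. With the units in the names, passing a value in the wrong unit is easy to spot at the call site.

```c
#include <stdint.h>

/* Hypothetical constant for this sketch: milliseconds in one second. */
#define MSECS_PER_SEC 1000u

/*
 * Converts a duration in whole seconds to milliseconds.
 * The _secs and _msecs suffixes make a unit mismatch visible to a reviewer.
 */
uint32_t secs_to_msecs(uint32_t duration_secs) {
  uint32_t duration_msecs = duration_secs * MSECS_PER_SEC;
  return duration_msecs;
}
```

A call such as timeout_msecs = secs_to_msecs(retry_interval_secs); reads correctly, while timeout_msecs = retry_interval_secs; looks suspicious at a glance.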

Structure Names

Example

struct foo {
  struct foo *next;	/* List of active foo */
  struct mumble amumble;	/* Comment for mumble */
  int bar;
  unsigned int baz:1,	/* Bitfield; line up entries if desired */
         fuz:5,
         zap:2;
  uint8_t flag;
};
struct foo *foohead;		/* Head of global foo list */

Variable Names on the Stack

Justification

Example

int handle_error(int error_number) {
  int            error= OsErr();
  Time           time_of_error;
  ErrorProcessor error_processor;
}

Pointer Variables

Place the * close to the variable name, not the pointer type.

Example

char *name= NULL;
char *name, address;  /* only name is a pointer; address is a plain char */

Global Variables

Justification

Example

Logger  g_log;
Logger *g_plog;

Global Constants

Global constants should be all caps with ‘_’ separators.

Justification

It’s tradition for global constants to be named this way. You must be careful not to conflict with other global #defines and enum labels.

Example

const int A_GLOBAL_CONSTANT= 5;

#define and Macro Names

Put #defines and macros in all upper case using ‘_’ separators. Macros are capitalized, parenthesized, and should avoid side effects. Spacing before and after the macro name may be any whitespace, though use of TABs should be consistent throughout a file. If a macro is an inline expansion of a function, the function is defined all in lowercase and the macro has the same name all in uppercase. If the macro is an expression, wrap the expression in parentheses. If the macro is more than a single statement, use do {...} while (0), so that a trailing semicolon works. Right-justify the backslashes; it makes the macro easier to read.

Justification

This makes it very clear that the value is not alterable and in the case of macros, makes it clear that you are using a construct that requires care.

Some subtle errors can occur when macro names and enum labels use the same name.

Example

#define MAX(a,b) blah
#define IS_ERR(err) blah
#define MACRO(v, w, x, y)    \
do {                         \
  v = (x) + (y);             \
  w = (y) + 2;               \
} while (0)
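
The reason for the do {...} while (0) wrapper can be sketched with a hypothetical SWAP_INT macro (the macro and the order_pair function are invented for this illustration). Because the expansion is a single statement, the trailing semicolon works even in an unbraced if/else, which is used here only to demonstrate the point.

```c
/* Hypothetical multi-statement macro: swaps two int lvalues. */
#define SWAP_INT(a, b)       \
do {                         \
  int swap_tmp = (a);        \
  (a) = (b);                 \
  (b) = swap_tmp;            \
} while (0)

/* Orders the pair so that *low <= *high afterwards. */
void order_pair(int *low, int *high) {
  if (*low > *high)
    SWAP_INT(*low, *high);  /* expands to one statement, so the ';' is fine */
  else
    ;  /* already ordered: intentionally empty */
}
```

Had SWAP_INT been written as a bare { ... } block, the semicolon after the macro call would terminate the if statement and the else would no longer compile.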

Enum Names

Labels All Upper Case with ‘_’ Word Separators

This is the standard rule for enum labels. No comma on the last element.

Example

enum PinStateType {
  PIN_OFF,
  PIN_ON
};

Make a Label for an Error State

It’s often useful to be able to say an enum is not in any of its valid states. Make a label for an uninitialized or error state. Make it the first label if possible.

Example

enum {STATE_ERR,  STATE_OPEN, STATE_RUNNING, STATE_DYING};
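
One benefit of making the error label first can be sketched as follows: the first label has the value 0, so a structure that has been zeroed starts out in the error state rather than in a valid one. The struct connection type and its helper functions are hypothetical names for this illustration.

```c
#include <string.h>

enum ConnectionState { STATE_ERR, STATE_OPEN, STATE_RUNNING, STATE_DYING };

/* Hypothetical structure carrying a state field. */
struct connection {
  enum ConnectionState state;
  int retry_count;
};

/* Zeroes the structure; state becomes STATE_ERR because its value is 0. */
void connection_clear(struct connection *conn) {
  memset(conn, 0, sizeof(*conn));
}

/* Returns nonzero when the connection is in one of its valid states. */
int connection_is_valid(const struct connection *conn) {
  return STATE_ERR != conn->state;
}
```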

Formatting

Brace Placement

Of the three major brace placement strategies one is recommended:

if (condition) {    
  ...
}

while (condition) {
  ...
}

Editor’s note: I prefer placing the left brace on a new line.

When Braces are Needed

All if, while and do statements must either have braces or be on a single line.

if (1 == somevalue) {
  somevalue = 2;
}

Justification

It ensures that when someone adds a line of code later there are already braces and they don’t forget. It provides a more consistent look. This doesn’t affect execution speed. It’s easy to do.

if (1 == somevalue) somevalue = 2;

Editor’s note: writing an if or while statement without braces is strongly discouraged.

Justification

It provides safety when adding new lines while maintaining a compact readable form.

Add Comments to Closing Braces

Adding a comment to closing braces can help when you are reading code because you don’t have to find the begin brace to know what is going on.

while (1) {
  if (valid) {

  } /* if valid */
  else {
  } /* not valid */

} /* end forever */

Consider Screen Size Limits

Some people like blocks to fit within a common screen size so scrolling is not necessary when reading code.

Parens () with Key Words and Functions Policy

Justification

Keywords are not functions. By putting parens next to keywords, keywords and function names are made to look alike.

Example

if (condition) {  /* add space after keywords */
}

while (condition) {  /* add space after keywords */
}

strcpy(s, s1);  /* do not add space after function names */

return 1;

A Line Should Not Exceed 78 Characters

Lines should not exceed 78 characters.

Justification

“If Then Else” Formatting

Layout

It’s up to the programmer. Different bracing styles will yield slightly different looks. One common approach is:

if (condition) {
} else if (condition) {
} else {
}

If you have else if statements then it is usually a good idea to always have an else block for finding unhandled cases. Maybe put a log message in the else even if there is no corrective action taken.

Condition Format

Always put the constant on the left hand side of an equality/inequality comparison. For example:

if (6 == errorNum) ...

One reason is that if you leave out one of the = signs, the compiler will find the error for you. A second reason is that it puts the value you are looking for right up front where you can find it instead of buried at the end of your expression. It takes a little time to get used to this format, but then it really gets useful.
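
A minimal sketch of the first reason, using a hypothetical error code ERR_DISK_FULL and a hypothetical function is_disk_full (neither is from the original standard): with the constant on the left, dropping one of the = signs produces an assignment to a constant, which the compiler rejects.

```c
/* Hypothetical error code for this sketch. */
#define ERR_DISK_FULL 6

/* Returns 1 when error_num is the disk-full error, 0 otherwise. */
int is_disk_full(int error_num) {
  /*
   * Writing "if (ERR_DISK_FULL = error_num)" by mistake does not compile,
   * because a constant cannot be assigned to. The reversed form
   * "if (error_num = ERR_DISK_FULL)" compiles and is always true.
   */
  if (ERR_DISK_FULL == error_num) {
    return 1;
  }
  return 0;
}
```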

switch Formatting

Example

switch (...)
{
  case 1:
      ...
  /* comments */

  case 2:
  {
      int v;
      ...
  }
  break;

  default:
      ...
      break;
}

Use of goto, continue, break, and ?:

Goto

Goto statements should be used sparingly, as in any well-structured code. The goto debates are boring so we won’t go into them here. The main place where they can be usefully employed is to break out of several levels of switch, for, and while nesting, although the need to do such a thing may indicate that the inner constructs should be broken out into a separate function, with a success/failure return code.

for (...) {
  while (...) {
    ...
    if (disaster) {
      goto error;
    }
  }
}
...
error:
  /* clean up the mess */
  ...

When a goto is necessary the accompanying label should be alone on a line and to the left of the code that follows. The goto should be commented (possibly in the block header) as to its utility and purpose.
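
A compilable sketch of this pattern, under the assumption of a small fixed-size grid (sum_grid, GRID_ROWS and GRID_COLS are hypothetical names for this illustration): a negative cell plays the role of the disaster, and the goto leaves both loops at once.

```c
#include <stddef.h>

#define GRID_ROWS 3
#define GRID_COLS 3

/*
 * Sums all cells of the grid into *sum_out.
 * Returns 0 on success, or -1 if any cell is negative.
 * The goto is used only to break out of both loops in one step.
 */
int sum_grid(const int grid[GRID_ROWS][GRID_COLS], int *sum_out) {
  int sum = 0;
  size_t row = 0;
  size_t col = 0;

  for (row = 0; row < GRID_ROWS; row++) {
    for (col = 0; col < GRID_COLS; col++) {
      if (grid[row][col] < 0) {
        goto error;
      }
      sum += grid[row][col];
    }
  }
  *sum_out = sum;
  return 0;

error:
  *sum_out = 0;  /* clean up: leave the output in a known state */
  return -1;
}
```

As the text notes, the same effect could be had by moving the inner loops into a separate function with a success/failure return code.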

Continue and Break

Continue and break are really disguised gotos, so they are covered here.

Continue and break, like goto, should be used sparingly as they are magic in code. With a simple spell the reader is beamed to god knows where for some usually undocumented reason.

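The two main problems with continue are that it may bypass the loop’s test condition and that it may bypass the increment/decrement expression.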

Consider the following example where both problems occur:

while (TRUE) {
   ...
   /* A lot of code */
   ...
   if (/* some condition */) {
      continue;
   }
   ...
   /* A lot of code */
   ...
   if ( i++ > STOP_VALUE) break;
}

Note: “A lot of code” is necessary in order that the problem cannot be caught easily by the programmer. From the above example, a further rule may be given: Mixing continue with break in the same loop is a sure way to disaster.

?:

The trouble is people usually try and stuff too much code in between the ? and :. Here are a couple of clarity rules to follow:

Example

(condition) ? func1() : func2();

or

(condition)
  ? long statement
  : another long statement;

One Statement Per Line

There should be only one statement per line unless the statements are very closely related.

The reason is that the code is easier to read. Use some white space too. Nothing is harder to read than code that runs one line after another with no white space or comments.

One Variable Per Line

Related to this: always define one variable per line.

Not:

char **a, *x;

Do:

char **a = 0;  /* add doc */
char  *x = 0;  /* add doc */

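The reasons are that each variable gets room for its own documentation comment, it is clear that each variable is initialized, and the declarations are unambiguous, which reduces the chance of declaring a plain char when you meant a pointer.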

To Use Enums or Not to Use Enums

C allows constant variables, which should deprecate the use of enums as constants. Unfortunately, in most compilers constants take space. Some compilers will remove constants, but not all. Constants taking space precludes them from being used in tight memory environments like embedded systems. Workstation users should use constants and ignore the rest of this discussion.

In general enums are preferred to #define as enums are understood by the debugger.

Be aware enums are not of a guaranteed size. So if you have a type that can take a known range of values and it is transported in a message you can’t use an enum as the type. Use the correct integer size and use constants or #define. Casting between integers and enums is very error prone as you could cast a value not in the enum.
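
The advice about messages can be sketched as follows, assuming a hypothetical two-byte message layout (struct pin_message and pin_state_from_wire are invented for this illustration). The wire field uses a fixed-size integer, and the value is validated before it is treated as an enum.

```c
#include <stdint.h>

enum PinStateType { PIN_OFF, PIN_ON };

/* Hypothetical wire message: fixed-size integers, never the enum itself. */
struct pin_message {
  uint8_t pin_number;
  uint8_t pin_state;  /* carries PIN_OFF or PIN_ON */
};

/*
 * Converts a wire value to the enum without a blind cast.
 * Returns 0 on success, or -1 if the value is not a valid label.
 */
int pin_state_from_wire(uint8_t wire_value, enum PinStateType *state_out) {
  switch (wire_value) {
    case PIN_OFF:
      *state_out = PIN_OFF;
      return 0;
    case PIN_ON:
      *state_out = PIN_ON;
      return 0;
    default:
      return -1;  /* value not in the enum: reject it */
  }
}
```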

Use Header File Guards

Include files should protect against multiple inclusion through the use of macros that “guard” the files. Note that for C++ compatibility and interoperability reasons, do not use underscores ‘_’ as the first or last character of a header guard (see below).

#ifndef sys_socket_h
#define sys_socket_h  /* NOT _sys_socket_h_ */
#endif 
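
To illustrate what the guard buys you, the sketch below pastes the same guarded header body twice into one file, which is roughly what the preprocessor sees when a header is included twice. The struct socket_options type is a hypothetical placeholder for the header's real contents.

```c
/* First inclusion: sys_socket_h is not defined yet, so the body is compiled. */
#ifndef sys_socket_h
#define sys_socket_h

struct socket_options {
  int timeout_msecs;
};

#endif /* sys_socket_h */

/* Second inclusion: the guard is already defined, so the body is skipped. */
#ifndef sys_socket_h
#define sys_socket_h

struct socket_options {  /* without the guard: a redefinition error */
  int timeout_msecs;
};

#endif /* sys_socket_h */
```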

Macros

Don’t Turn C into Pascal

Don’t change syntax via macro substitution. It makes the program unintelligible to all but the perpetrator.

Replace Macros with Inline Functions

In C macros are not needed for code efficiency. Use inlines. However, macros for small functions are ok.

Example

#define  MAX(x,y)	(((x) > (y)) ? (x) : (y))	/* Get the maximum */

The macro above can be replaced for integers with the following inline function with no loss of efficiency:

static inline int
max(int x, int y) {
  return (x > y ? x : y);
}

Be Careful of Side Effects

Macros should be used with caution because of the potential for error when invoked with an expression that has side effects.

Example

MAX(f(x),z++);
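
The hazard can be made concrete with a small sketch. The helper names below (max_int, z_after_macro, z_after_inline) are hypothetical; the point is that the macro expands its argument text twice, so z++ runs twice when z holds the larger value, while the inline function evaluates each argument exactly once.

```c
#define MAX(x, y) (((x) > (y)) ? (x) : (y))

/* Inline replacement: each argument is evaluated exactly once. */
static inline int max_int(int x, int y) {
  return (x > y) ? x : y;
}

/* Returns the final value of z after using the macro. */
int z_after_macro(void) {
  int z = 5;
  int largest = MAX(1, z++);  /* z++ appears twice in the expansion */
  (void)largest;
  return z;
}

/* Returns the final value of z after using the inline function. */
int z_after_inline(void) {
  int z = 5;
  int largest = max_int(1, z++);  /* z++ is evaluated once */
  (void)largest;
  return z;
}
```

Here z_after_macro() returns 7 while z_after_inline() returns 6.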

Always Wrap the Expression in Parenthesis

When putting expressions in macros always wrap the expression in parentheses to avoid potential operator precedence ambiguity.

Example

#define ADD(x,y) x + y

must be written as

#define ADD(x,y) ((x) + (y))
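
A short sketch of what goes wrong without the parentheses. The unparenthesized version is renamed ADD_UNSAFE here (a hypothetical name) so that both forms can sit side by side.

```c
#define ADD_UNSAFE(x, y) x + y          /* unparenthesized: precedence leaks */
#define ADD(x, y)        ((x) + (y))    /* parenthesized: behaves like a value */

/* Expands to 1 + 2 * 3, so the multiplication binds first. */
int unsafe_result(void) {
  return ADD_UNSAFE(1, 2) * 3;
}

/* Expands to ((1) + (2)) * 3, which is what the caller meant. */
int safe_result(void) {
  return ADD(1, 2) * 3;
}
```

unsafe_result() yields 7, while safe_result() yields the intended 9.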

Make Macro Names Unique

Like global variables macros can conflict with macros from other packages.

Initialize all Variables

You shall always initialize variables. Always. Every time. gcc with the flag -W may catch operations on uninitialized variables, but it may not.

Justification

More problems than you can believe are eventually traced back to a pointer or variable left uninitialized.

Short Functions

Functions should limit themselves to a single page of code.

Justification

Document Null Statements

Always document a null body for a for or while statement so that it is clear that the null body is intentional and not missing code.

while (*dest++ = *src++)
{
  ;  /* intentionally empty: the copy happens in the loop condition */
}

Do Not Default If Test to Non-Zero

Do not default the test for non-zero, i.e.

if (FAIL != f()) 

is better than

if (f()) 

even though FAIL may have the value 0 which C considers to be false. An explicit test will help you out later when somebody decides that a failure return should be -1 instead of 0. Explicit comparison should be used even if the comparison value will never change; e.g., if (!(bufsize % sizeof(int))) (wrong) should be written instead as if ((bufsize % sizeof(int))==0) (correct) to reflect the numeric (not boolean) nature of the test. A frequent trouble spot is using strcmp to test for string equality, where the result should never ever be defaulted. The preferred approach is to define a macro STREQ.

#define STREQ(a, b) (strcmp((a), (b)) == 0) 

Or better yet use an inline method:

#include <stdbool.h>
#include <string.h>

static inline bool
string_equal(const char *a, const char *b)
{
  return (strcmp(a, b) == 0) ? true : false;
  /* Or more compactly:
  return (strcmp(a, b) == 0); */
}

Note, this is just an example, you should really use the standard library string type for doing the comparison.

The non-zero test is often defaulted for predicates and other functions or expressions which meet the following restrictions:

Usually Avoid Embedded Assignments

There is a time and a place for embedded assignment statements. In some constructs there is no better way to accomplish the results without making the code bulkier and less readable.

while (EOF != (c = getchar())) {
  /* process the character */
}

The ++ and -- operators count as assignment statements. So, for many purposes, do functions with side effects. Using embedded assignment statements to improve run-time performance is also possible. However, one should consider the tradeoff between increased speed and decreased maintainability that results when embedded assignments are used in artificial places. For example,

a = b + c;
d = a + r; 

should not be replaced by

d = (a = b + c) + r; 

even though the latter may save one cycle. In the long run the time difference between the two will decrease as the optimizer gains maturity, while the difference in ease of maintenance will increase as the human memory of what’s going on in the latter piece of code begins to fade.

Documentation

Comments Should Tell a Story

Consider your comments a story describing the system. Expect your comments to be extracted by a robot and formed into a man page. Class comments are one part of the story, method signature comments are another part of the story, method arguments another part, and method implementation yet another part. All these parts should weave together and inform someone else at another point of time just exactly what you did and why.

Document Decisions

Comments should document decisions. At every point where you had a choice of what to do place a comment describing which choice you made and why. Archeologists will find this the most useful information.

Use Headers

Use a document extraction system like Doxygen.

These headers are structured in such a way as they can be parsed and extracted. They are not useless like normal headers. So take time to fill them out. If you do it right once no more documentation may be necessary.

Comment Layout

Each part of the project has a specific comment layout. Doxygen has the recommended format for the comment layouts.

Make Gotchas Explicit

Explicitly comment variables changed out of the normal control flow or other code likely to break during maintenance. Embedded keywords are used to point out issues and potential problems. Consider that a robot will parse your comments looking for keywords, stripping them out, and making a report so people can make a special effort where needed.

Gotcha Keywords

Gotcha Formatting

Commenting function declarations

Function headers should be in the file where the functions are declared. This means that most likely the functions will have a header in the .h file. However, functions like main() with no explicit prototype declaration in the .h file should have a header in the .c file.

Include Statement Documentation

Include statements should be documented, telling the user why a particular file was included.

/*
 * Kernel include files come first.
 */
/* Non-local includes in brackets. */
/*
 * If it's a network program, put the network include files next.
 * Group the include files by subdirectory.
 */
/*
 * Then there's a blank line, followed by the /usr include files.
 * The /usr include files should be sorted!
 */
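
The comments above describe an ordering but show no actual include lines. As a sketch only, assuming a POSIX system, the ordering might look like the following. The specific headers are chosen for illustration and are not taken from the original standard, and sockaddr_in_size is a hypothetical function that exists just to use them.

```c
/* Kernel include files come first. */
#include <sys/types.h>   /* Non-local includes in brackets. */
#include <sys/socket.h>

/* Network include files next, grouped by subdirectory. */
#include <netinet/in.h>

/* Then a blank line, followed by the sorted /usr include files. */
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Needs <netinet/in.h> for struct sockaddr_in and <stddef.h> for size_t. */
size_t sockaddr_in_size(void) {
  return sizeof(struct sockaddr_in);
}
```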

Layering

Layering is the primary technique for reducing complexity in a system. A system should be divided into layers. Layers should communicate between adjacent layers using well defined interfaces. When a layer uses a non-adjacent layer then a layering violation has occurred.

A layering violation simply means we have dependency between layers that is not controlled by a well defined interface. When one of the layers changes code could break. We don’t want code to break so we want layers to work only with other adjacent layers.

Sometimes we need to jump layers for performance reasons. This is fine, but we should know we are doing it and document appropriately.

Miscellaneous

General advice

This section contains some miscellaneous do’s and don’ts.

if (abool = bbool) { ... }

Does the programmer really mean assignment here? Often yes, but usually no. The solution is to just not do it, an inverse Nike philosophy. Instead use explicit tests and avoid assignment with an implicit test. The recommended form is to do the assignment before doing the test:

abool= bbool;
if (abool) { ... }

Be Const Correct

C provides the const keyword to mark parameters that a function must not modify, which tells the caller that the object it passes in will not change. Using const in all the right places is called “const correctness.” It’s hard at first, but using const really tightens up your coding style. Const correctness grows on you.
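
As a minimal sketch, the hypothetical function count_char below takes its string through a pointer to const. The signature alone tells the caller that the text will not be changed, and the compiler rejects any accidental write through the pointer inside the function.

```c
#include <stddef.h>

/*
 * Counts how many times wanted occurs in text.
 * text is a pointer to const: this function promises not to modify it.
 */
size_t count_char(const char *text, char wanted) {
  size_t count = 0;
  const char *cursor = NULL;

  for (cursor = text; '\0' != *cursor; cursor++) {
    if (wanted == *cursor) {
      count++;
    }
    /* *cursor = ' ';  would not compile: assignment to a const object */
  }
  return count;
}
```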

Use #if Not #ifdef

Use #if MACRO not #ifdef MACRO. Someone might write code like:

#ifdef DEBUG
  temporary_debugger_break();
#endif

Someone else might compile the code with debug info turned off, like:

cc -c lurker.cc -DDEBUG=0

Always use #if if you have to use the preprocessor. This works fine, and does the right thing, even if DEBUG is not defined at all (!)

#if DEBUG
  temporary_debugger_break();
#endif
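
The difference can be checked with a small sketch. The #define below stands in for passing -DDEBUG=0 on the command line, and the two function names are hypothetical: #ifdef only asks whether the name exists, while #if looks at its value.

```c
#define DEBUG 0  /* stands in for -DDEBUG=0 on the compiler command line */

/* Returns 1 if the #ifdef branch was compiled in, 0 otherwise. */
int ifdef_sees_debug(void) {
#ifdef DEBUG
  return 1;  /* taken: DEBUG is defined, even though its value is 0 */
#else
  return 0;
#endif
}

/* Returns 1 if the #if branch was compiled in, 0 otherwise. */
int if_sees_debug(void) {
#if DEBUG
  return 1;
#else
  return 0;  /* taken: DEBUG evaluates to 0 */
#endif
}
```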

If you really need to test whether a symbol is defined or not, test it with the defined() construct, which allows you to add more things later to the conditional without editing text that’s already in the program:

#if !defined(USER_NAME)
#define USER_NAME "john smith"
#endif

Commenting Out Large Code Blocks

Sometimes large blocks of code need to be commented out for testing.

void
example(void)
{
  /* great looking code */

  #if 0
  /* lots of code */
  #endif

  /* more code */
}

You can’t use /**/ style comments because comments can’t contain comments and surely a large block of your code will contain a comment, won’t it?

Don’t use #ifdef, as someone can unknowingly trigger ifdefs from the compiler command line. The problem with #if 0 is that even a day later neither you nor anyone else has any idea why this code is commented out. Is it because a feature has been dropped? Is it because it was buggy? It didn’t compile? Can it be added back? It’s a mystery. Use a descriptive macro name instead of 0:

#if NOT_YET_IMPLEMENTED  

#if OBSOLETE

#if TEMP_DISABLED 

File Extensions

In short: Use the .h extension for header files and .c for source files.

No Data Definitions in Header Files

Do not put data definitions in header files. For example:

/* 
 * aheader.h 
 */
int x = 0;
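
One common alternative, sketched here in a single block for brevity, is to put only an extern declaration in the header and the one definition in exactly one source file. The file name afile.c and the function next_x are hypothetical.

```c
/*
 * aheader.h (sketch): declaration only, safe to include from many files.
 */
extern int x;

/*
 * afile.c (sketch): the single definition lives in one source file.
 */
int x = 0;

/* Any file that includes aheader.h can use x without defining it again. */
int next_x(void) {
  x = x + 1;
  return x;
}
```

With the definition in the header instead, every source file that includes it would define its own x, which typically fails at link time with a duplicate symbol error.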

Mixing C and C++

In order to be backward compatible with dumb linkers, C++’s link time type safety is implemented by encoding type information in link symbols, a process called name mangling. This creates a problem when linking to C code, as C function names are not mangled. When calling a C function from C++ the function name will be mangled unless you turn it off. Name mangling is turned off with the extern "C" syntax. If you want to create a C function in C++ you must wrap it with the above syntax. If you want to call a C function in a C library from C++ you must wrap it in the above syntax. Here are some examples:

Calling C Functions from C++

/* Way 1 */
extern "C" char *strncpy(...);
extern "C" int my_great_function();
/* Way 2 */
extern "C"
{
   char *strncpy(...);
   int my_great_function();
}

Creating a C Function in C++

extern "C" void
a_c_function_in_cplusplus(int a)
{
}

__cplusplus Preprocessor Macro

If you have code that must compile in both a C and a C++ environment then you must test the predefined __cplusplus macro. For example:

#ifdef __cplusplus

extern "C" int some_function(void);

#else

extern int some_function(void);

#endif
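
A widely used variant of the same idea, shown here as a sketch, wraps the whole set of declarations in an extern "C" block that only a C++ compiler sees. Each prototype is then written once. The int return type and the trivial body of some_function are assumptions made so that the sketch compiles.

```c
#ifdef __cplusplus
extern "C" {  /* seen only by a C++ compiler */
#endif

/* Written once; valid for both C and C++ callers. */
int some_function(void);

#ifdef __cplusplus
}
#endif

/* Placeholder definition (assumed for this sketch). */
int some_function(void) {
  return 0;
}
```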

No Magic Numbers

A magic number is a bare naked number used in source code. It’s magic because no one has a clue what it means, including the author within 3 months. For example:

if      (22 == foo) { start_thermo_nuclear_war(); }
else if (19 == foo) { refund_lotso_money(); }
else if (16 == foo) { infinite_loop(); }
else                { cry_cause_im_lost(); }

In the above example what do 22 and 19 mean? If there was a number change or the numbers were just plain wrong how would you know? Instead of magic numbers use a real name that means something. You can use #define or constants or enums as names. Which one is a design choice. For example:

#define   PRESIDENT_WENT_CRAZY  (22)
const int WE_GOOFED = 19;
enum  {
   THEY_DIDNT_PAY = 16
};

if      (PRESIDENT_WENT_CRAZY == foo) { start_thermo_nuclear_war(); }
else if (WE_GOOFED            == foo) { refund_lotso_money(); }
else if (THEY_DIDNT_PAY       == foo) { infinite_loop(); }
else                                  { happy_days_i_know_why_im_here(); }

Now isn’t that better? The const and enum options are preferable because when debugging the debugger has enough information to display both the value and the label. The #define option just shows up as a number in the debugger which is very inconvenient. The const option has the downside of allocating memory. Only you know if this matters for your application.

Error Return Check Policy


