Because the coding standard has been kept deliberately brief, there are some items missing that would be included in a more comprehensive standard. For more on commonsense C programming, consult the Indian Hill C coding standard or the comp.lang.c FAQ.
Every header should be protected against multiple inclusion
using the following idiom:
#ifndef MODULE_H
#define MODULE_H
/* body of module.h */
#endif /* not MODULE_H */
Note: memory allocation for C code that must interface
with Mercury code or the Mercury runtime should be
done using the routines defined and documented in
mercury/runtime/mercury_memory.h and/or mercury/runtime/mercury_heap.h,
according to the documentation in those files,
in mercury/trace/README, and in the Mercury Language Reference Manual.
Use the GNU convention of comments that indicate whether the variable
is true in the #if and #else parts of an #ifdef or #ifndef
(see the examples in section 2.3.2).
Adhere to POSIX-supported operating system calls whenever possible
since they are widely supported, even by Windows and VMS.
When POSIX doesn't provide the required functionality, ensure that
the operating system specific calls are localised.
Don't rely on features whose behaviour is undefined according to
the ANSI C standard. For that matter, don't rely on C arcana
even if they are defined. For instance,
setjmp/longjmp and ANSI signals often have subtle differences
in behaviour between platforms.
If you write threaded code, make sure any non-reentrant code is
appropriately protected via mutual exclusion. The biggest cause
of non-reentrant (non-threadsafe) code is function-static data.
Note that some C library functions may be non-reentrant. This may
or may not be documented in the man pages.
Bear this in mind when tempted to add YetAnotherTool™.
In order to modify and maintain the source code of the Mercury compiler,
you need the above and also:
2. Comments
2.1. What should be commented
2.1.1. Functions
Each function should have a one-line description of what it does.
Additionally, both the inputs and outputs (including pass-by-pointer)
should be described. Any side-effects not passing through the explicit
inputs and outputs should be described. If any memory is allocated,
you should describe who is responsible for deallocation.
If memory can change upon successive invocations (such as function-static
data), mention it. If memory should not be deallocated by anyone
(such as constant string literals), mention this.
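As an illustration of the points above, here is a minimal sketch of a function comment. The function EX_copy_string and the EX_ prefix are hypothetical and exist only for this example. It allocates with plain malloc() on the assumption that the code does not interface with Mercury; code that does should use the Mercury runtime's allocation routines instead.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
** EX_copy_string(src, len_ptr):
**	Return a newly allocated copy of the NUL-terminated string src.
**
** Inputs: src must be non-NULL; it is not modified.
** Outputs: if len_ptr is non-NULL and the copy succeeds, *len_ptr is
**	set to the length of the copy, not counting the terminating NUL.
**	Returns NULL if allocation fails; *len_ptr is then left untouched.
** Side-effects: none apart from the allocation.
** Memory: the result is allocated with malloc(); the caller is
**	responsible for passing it to free().
*/
char *
EX_copy_string(const char *src, size_t *len_ptr) {
	size_t	len;
	char	*copy;

	assert(src != NULL);
	len = strlen(src);
	copy = malloc(len + 1);
	if (copy == NULL) {
		return NULL;
	}
	/* Copy the characters and the terminating NUL in one go. */
	memcpy(copy, src, len + 1);
	if (len_ptr != NULL) {
		*len_ptr = len;
	}
	return copy;
}
```

A reader of this comment can tell who frees the result without reading the function body.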
2.1.2. Macros
Each non-trivial macro should be documented just as for functions (see above).
It is also a good idea to document the types of macro arguments and
return values, e.g. by including a function declaration in a comment.
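One possible way to document a macro in this style is sketched below. The names EX_max and EX_clamp are hypothetical, and the choice of int as the documented type is an assumption made for the example.

```c
#include <assert.h>

/*
** EX_max(a, b):
**	Evaluate to the larger of a and b.
**	Acts like a function declared as
**		int EX_max(int a, int b);
**	except that each argument may be evaluated more than once,
**	so the arguments must not have side-effects.
*/
#define EX_max(a, b)	((a) > (b) ? (a) : (b))

/*
** EX_clamp(value, low, high):
**	Return value limited to the range low..high inclusive.
**	Requires low <= high. No side-effects; allocates nothing.
*/
int
EX_clamp(int value, int low, int high) {
	assert(low <= high);
	return EX_max(low, (value < high) ? value : high);
}
```

The function declaration inside the macro's comment tells the reader which argument and result types are intended, which the preprocessor itself cannot express.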
2.1.3. Headers
Such function comments should be present in header files for each function
exported from a source file. Ideally, a client of the module should
not have to look at the implementation, only the interface.
In C terminology, the header should suffice for
working out how an exported function works.
2.1.4. Source files
Every source file should have a prologue comment which includes:
2.1.5. Global variables
Any global variable should be excruciatingly documented. This is
especially true when globals are exported from a module.
In general, there are very few circumstances that justify use of
a global.
2.2. Comment style
Use comments of this form:
/*
** Here is a comment.
** And here's some more comment.
*/
For annotations to a single line of code:
i += 3; /* Here's a comment about this line of code. */
2.3. Guidelines for comments
2.3.1. Revisits
Any code that needs to be revisited because it is a temporary hack
(or some other expediency) must have a comment of the form:
/*
** XXX: <reason for revisit>
*/
The <reason for revisit> should explain the problem in a way
that can be understood by developers other than the author of the
comment.
2.3.2. Comments on preprocessor statements
The #ifdef constructs should
be commented like so if they extend for more than a few lines of code:
#ifdef SOME_VAR
	/*...*/
#else /* not SOME_VAR */
	/*...*/
#endif /* not SOME_VAR */
Similarly for #ifndef. Further examples:
#ifdef SOME_VAR
#endif /* SOME_VAR */
#ifdef SOME_VAR
	/*...*/
#else /* not SOME_VAR */
	/*...*/
#endif /* not SOME_VAR */
#ifndef SOME_VAR
	/*...*/
#else /* SOME_VAR */
	/*...*/
#endif /* SOME_VAR */
3. Declarations
3.1. Pointer declarations
Attach the pointer qualifier (the *) to the variable name, not to the type,
so that a declaration of several pointers on one line reads correctly:
char *str1, *str2;
3.2. Static and extern declarations
Limit module exports to the absolute essentials. Make as much static
(that is, local) as possible since this keeps interfaces to modules simpler.
3.3. Typedefs
Use typedefs to make code self-documenting. They are especially
useful on structs, unions, and enums.
4. Naming conventions
4.1. Functions, function-like macros, and variables
Use all lowercase with underscores to separate words; the package
prefix (see 4.5) is the only uppercase part.
For instance, MR_soul_machine.
4.2. Enumeration constants, #define constants, and non-function-like macros
Use all uppercase with underscores to separate words.
For instance, ML_MAX_HEADROOM.
4.3. Typedefs
Use first letter uppercase for each word, other letters lowercase and
underscores to separate words.
For instance, MR_Directory_Entry.
4.4. Structs and unions
If something is both a struct and a typedef, the
name for the struct should be formed by appending `_Struct'
to the typedef name:
typedef struct MR_Directory_Entry_Struct {
	...
} MR_Directory_Entry;
For unions, append `_Union' to the typedef name.
4.5. Mercury specifics
Every symbol that is externally visible (i.e. declared in a header
file) should be prefixed with a prefix that is specific to the
package that it comes from.
For anything exported from mercury/runtime, prefix it with MR_.
For anything exported from mercury/library, prefix it with ML_.
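The naming rules in this section can be seen together in the following sketch. The package prefix EX_ and every identifier below are hypothetical, chosen only to show one name of each kind; real Mercury code would use MR_ or ML_.

```c
#include <assert.h>
#include <stddef.h>

/* #define constant: all uppercase after the prefix. */
#define EX_MAX_NAME_LENGTH	32

/* Enumeration constants: all uppercase. Typedef: capitalised words. */
typedef enum {
	EX_ENTRY_FILE,
	EX_ENTRY_DIRECTORY
} EX_Entry_Kind;

/* The struct tag is the typedef name with `_Struct' appended. */
typedef struct EX_Dir_Entry_Struct {
	EX_Entry_Kind	kind;
	char		name[EX_MAX_NAME_LENGTH + 1];
} EX_Dir_Entry;

/*
** EX_entry_is_directory(entry):
**	Return non-zero if entry describes a directory.
**	entry must be non-NULL; it is not modified.
*/
int
EX_entry_is_directory(const EX_Dir_Entry *entry) {
	assert(entry != NULL);
	return entry->kind == EX_ENTRY_DIRECTORY;
}
```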
5. Syntax and layout
5.1. Minutiae
Use 8 spaces to a tab. No line should be longer than 79 characters.
If a statement is too long, continue it on the next line indented
two levels deeper. If the statement extends over more than two
lines, then make sure the subsequent lines are indented to the
same depth as the second line. For example:
	here = is_a_really_long_statement_that_does_not_fit +
			on_one_line + in_fact_it_doesnt_even_fit +
			on_two_lines;
	if (this_is_a_somewhat_long_conditional_test(
			in_the_condition_of_an +
			if_then))
	{
		/*...*/
	}
5.2. Statements
Use one statement per line.
Here are example layout styles for the various syntactic constructs:
5.2.1. If statement
Use the "/* end if */" comment if the if statement is larger than a page.
/*
** Curlies are placed in a K&R-ish manner.
** And comments look like this.
*/
if (blah) {
	/* Always use curlies, even when there's only
	** one statement in the block.
	*/
} else {
	/* ... */
} /* end if */
/*
** if the condition is so long that the open curly doesn't
** fit on the same line as the `if', put it on a line of
** its own
*/
if (a_very_long_condition() &&
		another_long_condition_that_forces_a_line_wrap())
{
	/* ... */
}
5.2.2. Functions
Function names are flush against the left margin. This makes it
easier to grep for function definitions (as opposed to their invocations).
In argument lists, put a space after commas. Use the /* end func() */
comment when the function is longer than a page.
int
rhododendron(int a, float b, double c) {
	/* ... */
} /* end rhododendron() */
5.2.3. Variables
Variable declarations shouldn't be flush left, however.
int x = 0, y = 3, z;
int a[] = {
	1,2,3,4,5
};
5.2.4. Switches
switch (blah) {
	case BLAH1:
		/*...*/
		break;
	case BLAH2: {
		int i;
		/*...*/
		break;
	}
	default:
		/*...*/
		break;
} /* switch */
5.2.5. Structs, unions, and enums
struct Point {
	int tag;
	union cool {
		int ival;
		double dval;
	} cool;
};
enum Stuff {
	STUFF_A, STUFF_B /*...*/
};
5.2.6. Loops
while (stuff) {
	/*...*/
}
do {
	/*...*/
} while (stuff);
for (this; that; those) {
	/* Always use curlies, even if no body. */
}
/*
** If no body, do this...
*/
while (stuff)
	{}
for (this; that; those)
	{}
5.3. Preprocessing
5.3.1. Nesting
Nested #ifdefs, #ifndefs and #ifs should be indented by two spaces for
each level of nesting. For example:
#ifdef GUAVA
  #ifndef PAPAYA
  #else /* PAPAYA */
  #endif /* PAPAYA */
#else /* not GUAVA */
#endif /* not GUAVA */
6. Portability
6.1. Architecture specifics
Avoid relying on properties of a specific machine architecture unless
necessary, and if necessary localise such dependencies. One solution is
to have architecture-specific macros to hide access to
machine-dependent code.
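A minimal sketch of localising an architecture dependency is shown below. It assumes, for illustration only, that a machine word is represented by unsigned long. EX_Word, EX_BYTES_PER_WORD, EX_BITS_PER_WORD and EX_bytes_to_words are hypothetical names; the rest of the code would use them instead of hard-coding a word size.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/*
** The only place that knows which C type represents a machine word.
** Assumption for this sketch: a word is an unsigned long.
*/
typedef unsigned long	EX_Word;

#define EX_BYTES_PER_WORD	(sizeof(EX_Word))
#define EX_BITS_PER_WORD	(EX_BYTES_PER_WORD * CHAR_BIT)

/*
** EX_bytes_to_words(num_bytes):
**	Return the number of whole words needed to hold num_bytes bytes.
**	num_bytes must be small enough that rounding up cannot overflow.
**	No side-effects; allocates nothing.
*/
size_t
EX_bytes_to_words(size_t num_bytes) {
	assert(num_bytes <= (size_t) -1 - (EX_BYTES_PER_WORD - 1));
	return (num_bytes + EX_BYTES_PER_WORD - 1) / EX_BYTES_PER_WORD;
}
```

Porting to a machine with a different word type would then mean changing the typedef and nothing else.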
Some machine-specific properties are:
6.2. Operating system specifics
Operating system APIs differ from platform to platform. Although
most support standard POSIX calls such as `read', `write'
and `unlink', you cannot rely on the presence of, for instance,
System V shared memory, or BSD sockets.
6.3. Compiler and C library specifics
ANSI C compilers are now widespread and hence we needn't pander to
old K&R compilers. However, compilers (in particular the GNU C compiler)
often provide non-ANSI extensions. Ensure that any use of compiler
extensions is localised and protected by #ifdefs.
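As a sketch of what localising an extension might look like, the macro below wraps GNU C's __builtin_expect behind a single #ifdef, with a plain ANSI C fallback for other compilers. EX_likely and EX_safe_divide are hypothetical names used only for this example.

```c
#include <assert.h>
#include <limits.h>

/*
** EX_likely(cond):
**	Evaluate to 1 if cond is non-zero and to 0 otherwise.
**	With GNU C, also hint that cond is expected to be true.
*/
#ifdef __GNUC__
  #define EX_likely(cond)	__builtin_expect(!!(cond), 1)
#else /* not __GNUC__ */
  #define EX_likely(cond)	(!!(cond))
#endif /* not __GNUC__ */

/*
** EX_safe_divide(numerator, denominator, fallback):
**	Return numerator / denominator, or fallback if denominator is 0.
**	The caller must not pass INT_MIN and -1, whose quotient
**	overflows and is therefore undefined.
*/
int
EX_safe_divide(int numerator, int denominator, int fallback) {
	assert(!(numerator == INT_MIN && denominator == -1));
	if (EX_likely(denominator != 0)) {
		return numerator / denominator;
	} else {
		return fallback;
	}
}
```

Callers see only EX_likely, so the code still compiles unchanged with a compiler that lacks the extension.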
6.4. Environment specifics
This is one of the most important sections in the coding standard.
Here we mention what other tools Mercury depends on.
Mercury must depend on some tools; however, every tool that
is needed to use Mercury reduces the potential user base.
6.4.1. Tools required for Mercury
In order to run Mercury (given that you have the binary installation), you need:
In order to build the Mercury compiler, you need the above and also:
awk basename cat cp dirname echo egrep expr false fgrep grep head
ln mkdir mv rmdir rm sed sort tail
test true uniq xargs
6.4.2. Documenting the tools
If further tools are required, you should add them to the above list.
And similarly, if you eliminate dependence on a tool, remove
it from the above list.
7. Coding specifics
#define STREQ(s1,s2) (strcmp((s1),(s2)) == 0)
Comments? Mail mercury@cs.mu.oz.au,
or see our contact page.
Last update was $Date: 2006/03/13 06:36:05 $ by $Author: juliensf $@cs.mu.oz.au.
Note: This coding standard is an amalgam of suggestions from the
entire Mercury team, not necessarily the opinion of any single author.