C programming language
The C programming language is a standardized imperative computer programming language developed in the early 1970s by Dennis Ritchie, building on Ken Thompson's earlier language B, for use on the UNIX operating system. It has since spread to many other operating systems, and is one of the most widely used programming languages. C is prized for its efficiency, and is the most popular programming language for writing system software, though it is also used for writing applications. It is also commonly used in computer science education, despite not being designed for novices.
Problems
A popular saying, repeated by such notable language designers as Bjarne Stroustrup, is that "C makes it easy to shoot yourself in the foot." http://www.research.att.com/~bs/bs_faq.html#really-say-that In other words, C permits many operations that are generally not desirable, and thus many simple errors made by a programmer are not detected by the compiler or even when they occur at runtime. This leads to programs with unpredictable behavior and security holes. The safe C dialect Cyclone addresses some of these problems.
Related Topics:
Bjarne Stroustrup - Cyclone
~ ~ ~ ~ ~ ~ ~ ~ ~ ~
Part of the reason for this is to avoid compile- and runtime checks that were too expensive when C was originally designed. Another reason is the desire to keep C as efficient and flexible as possible; the more powerful a language, the more difficult it is to prove things about programs written in it. Some checks were also relegated to external tools, such as those discussed in Compiler-external static-checking tools below.
~ ~ ~ ~ ~ ~ ~ ~ ~ ~
Memory allocation
One problem with C is that automatically allocated objects, and dynamically allocated objects obtained from malloc(), are not initialized; they initially have whatever value is present in the memory space they are assigned. This value is highly unpredictable, and can vary between two machines, two program runs, or even two calls to the same function. If the program attempts to use such an uninitialized value, the results are usually unpredictable. Most modern compilers can detect and warn about this problem in some cases, but both false positives and false negatives occur.
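As a minimal sketch of how a programmer can sidestep indeterminate values, the fragment below gives every object a known value before it is read. The helper names make_zeroed_counts and sum_counts are hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper: returns n ints that are all zero, or NULL on failure.
   calloc zero-fills the block; malloc would leave its contents indeterminate. */
int *make_zeroed_counts(size_t n)
{
    assert(n > 0);
    return calloc(n, sizeof(int));
}

/* Hypothetical helper: adds up the first n elements of values. */
int sum_counts(const int *values, size_t n)
{
    assert(values != NULL);
    int total = 0;  /* explicit initializer; a bare "int total;" would be indeterminate */
    for (size_t i = 0; i < n; i++)
        total += values[i];
    return total;
}
```

A caller would typically check the result of make_zeroed_counts against NULL, use it, and pass it to free() when done.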
~ ~ ~ ~ ~ ~ ~ ~ ~ ~
Another common problem is that heap memory cannot be reused until it is explicitly released by the programmer with free(). The result is that if the programmer accidentally forgets to free memory, but continues to allocate it, more and more memory will be consumed over time. This is called a memory leak. Conversely, it is possible to release memory too soon, and then continue to use it. Because the allocation system can reuse the memory at any time for unrelated reasons, this results in insidiously unpredictable behavior. These issues in particular are ameliorated in languages with automatic garbage collection.
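One common defensive idiom, sketched here under the assumption of a single owning pointer, is to pair every allocation with exactly one release and to clear the pointer afterwards. The names copy_text and release_text are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a heap copy of text, or NULL on failure.
   The caller owns the result and must release it exactly once. */
char *copy_text(const char *text)
{
    assert(text != NULL);
    size_t size = strlen(text) + 1;  /* +1 for the terminating '\0' */
    char *copy = malloc(size);
    if (copy != NULL)
        memcpy(copy, text, size);
    return copy;
}

/* Hypothetical helper: frees *slot and clears it, so that a later accidental
   use goes through a null pointer instead of a dangling one. */
void release_text(char **slot)
{
    assert(slot != NULL);
    free(*slot);   /* free(NULL) does nothing, so a second release is harmless */
    *slot = NULL;
}
```

This only protects the one pointer passed to release_text; any other copies of the same address still dangle, which is part of why the problem is hard to eliminate by convention alone.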
Related Topics:
Memory leak - Automatic garbage collection
~ ~ ~ ~ ~ ~ ~ ~ ~ ~
Pointers
Pointers are one primary source of danger; because they are unchecked, a pointer can be made to point to any object of any type, including code, and then written to, causing unpredictable effects. Although most pointers point to safe places, they can be moved to unsafe places using pointer arithmetic, the memory they point to may be deallocated and reused (dangling pointers), they may be uninitialized (wild pointers), or they may be directly assigned any value using a cast or through another corrupt pointer. Another problem with pointers is that C freely allows conversion between any two pointer types. Other languages attempt to address these problems by using more restrictive reference types.
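The sketch below illustrates pointer arithmetic that stays within the safe range: a pointer may be advanced up to one position past the last element of an array, but that final position must never be dereferenced. The helper name find_value is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: returns a pointer to the first element equal to wanted,
   or NULL if there is none. */
const int *find_value(const int *items, size_t count, int wanted)
{
    assert(items != NULL);
    /* One past the last element: valid to compute and compare, not to dereference. */
    const int *end = items + count;
    for (const int *p = items; p != end; p++) {
        if (*p == wanted)
            return p;
    }
    return NULL;
}
```

Nothing in the language enforces this discipline; writing p <= end in the loop condition would compile just as readily and read past the array.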
Related Topics:
Dangling pointer - Wild pointer - Reference
~ ~ ~ ~ ~ ~ ~ ~ ~ ~
Arrays
Although C has native support for static arrays, it does not verify that array indexes are valid (bounds checking). For example, one can write to the sixth element of an array with five elements, yielding generally undesirable results. This is called a buffer overflow. This has been notorious as the source of a number of security problems in C-based programs. On the other hand, since bounds checking elimination technology was largely nonexistent when C was defined, bounds checking came with a severe performance penalty, particularly in numerical computation. It was also believed to be inconsistent with C's minimalist approach.
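Because the language performs no check itself, a program that wants bounds checking has to carry the array length alongside the array and test it by hand. A minimal sketch, using a hypothetical helper named checked_store:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: stores value at items[index] only if index is in range.
   Returns true on success and false if the write would overflow the array. */
bool checked_store(int *items, size_t count, size_t index, int value)
{
    assert(items != NULL);
    if (index >= count)
        return false;  /* e.g. index 5 in a five-element array */
    items[index] = value;
    return true;
}
```

The extra comparison on every access is exactly the cost that the original design avoided.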
Related Topics:
Bounds checking - Buffer overflow - Bounds checking elimination
~ ~ ~ ~ ~ ~ ~ ~ ~ ~
Multidimensional arrays are needed in numerical algorithms (mainly from applied linear algebra) to store matrices. The C array is well suited to this task, provided one is prepared to count one's indices from 0 instead of 1. This issue is discussed in the book Numerical Recipes in C, Chap. 1.2, page 20 ff. That book also presents a solution based on negative addressing, which introduces other dangers.
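As an illustration of zero-based matrix indexing, the sketch below multiplies a small matrix by a vector. The dimensions ROWS and COLS and the function name mat_vec are assumptions chosen for the example; the mathematical entry in row 1, column 1 is stored at m[0][0].

```c
#include <assert.h>
#include <stddef.h>

enum { ROWS = 2, COLS = 3 };

/* Hypothetical helper: computes out = m * v for a ROWS x COLS matrix.
   Row i and column j run from 0 to ROWS-1 and COLS-1 respectively. */
void mat_vec(double m[ROWS][COLS], const double v[COLS], double out[ROWS])
{
    assert(m != NULL && v != NULL && out != NULL);
    for (int i = 0; i < ROWS; i++) {
        out[i] = 0.0;
        for (int j = 0; j < COLS; j++)
            out[i] += m[i][j] * v[j];
    }
}
```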
~ ~ ~ ~ ~ ~ ~ ~ ~ ~
Variadic functions
Yet another common source of problems is variadic functions, which take a variable number of arguments. Unlike other prototyped C functions, checking the arguments of variadic functions at compile-time is not mandated by the standard, and is impossible in general without additional information. If the wrong type of data is passed, the effect is unpredictable, and often fatal. Variadic functions also handle null pointer constants in an unexpected way. For example, the printf family of functions supplied by the standard library, used to generate formatted text output, is notorious for its error-prone variadic interface, which relies on a format string to specify the number and type of trailing arguments.
Related Topics:
Variadic function - Printf
~ ~ ~ ~ ~ ~ ~ ~ ~ ~
Type-checking of calls to variadic functions from the standard library is a quality-of-implementation issue, and many modern compilers do type-check printf calls in particular, producing warnings if the argument list is inconsistent with the format string. However, not all printf calls can be checked statically, since the format string can be built at runtime, and other variadic functions typically remain unchecked.
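A small user-defined variadic function makes the hazard concrete. In this sketch the hypothetical function sum_ints relies on its first argument to say how many int values follow, much as printf relies on its format string.

```c
#include <assert.h>
#include <stdarg.h>

/* Hypothetical helper: adds up count trailing arguments, each of which
   must be an int. Neither the count nor the types are verified. */
int sum_ints(int count, ...)
{
    assert(count >= 0);
    va_list args;
    va_start(args, count);
    int total = 0;
    for (int i = 0; i < count; i++)
        total += va_arg(args, int);  /* the type is taken on trust */
    va_end(args);
    return total;
}
```

A call such as sum_ints(3, 1, 2, 3) is correct, whereas sum_ints(2, 1.5, 2) or sum_ints(3, 1, 2) is also accepted by the compiler but has unpredictable effects at runtime.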
~ ~ ~ ~ ~ ~ ~ ~ ~ ~
Syntax
Although mimicked by many languages because of its widespread familiarity, C's syntax has often been criticized as one of its weakest points. For example, Kernighan and Ritchie say in the second edition of The C Programming Language, "C, like any other language, has its blemishes. Some of the operators have the wrong precedence; some parts of the syntax could be better." Bjarne Stroustrup has also criticized C++'s syntax, which is very similar to that of C: "Within C++, there is a much smaller and cleaner language struggling to get out. [...] the C++ semantics is much cleaner than its syntax." http://www.research.att.com/~bs/bs_faq.html Some specific problems worth noting are:
~ ~ ~ ~ ~ ~ ~ ~ ~ ~
- A function declaration with an empty parameter list, such as int f(), does not say that the function takes no arguments; it implicitly allows any set of arguments. This syntax problem was introduced for backward compatibility with K&R C, which lacked prototypes.
- Some questionable choices of operator precedence, as mentioned by Kernighan and Ritchie above, such as == binding more tightly than & and | in expressions like x & 1 == 0.
- The use of the = operator, used in mathematics for equality, to indicate assignment, leading to unintended assignments in comparisons and a false impression that assignment is transitive. Having = denote assignment and == equality was a deliberate decision by Ritchie, who noted that assignment occurs much more often than comparisons.
- A lack of infix operators for complex objects, particularly for string operations, making programs which rely heavily on these operations difficult to read.
- Heavy reliance on punctuation-based symbols even where this is arguably less clear, such as "&&" and "||" instead of "and" and "or".
- The unintuitive declaration syntax, particularly for function pointers. In the words of language researcher Damian Conway, speaking about the very similar C++ declaration syntax:
::Specifying a type in C++ is made difficult by the fact that some of the components of a declaration (such as the pointer specifier) are prefix operators while others (such as the array specifier) are postfix. These declaration operators are also of varying precedence, necessitating careful bracketing to achieve the desired declaration. Furthermore, if the type ID is to apply to an identifier, this identifier ends up at somewhere between these operators, and is therefore obscured in even moderately complicated examples (see Appendix A for instance). The result is that the clarity of such declarations is greatly diminished.
~ ~ ~ ~ ~ ~ ~ ~ ~ ~
::Ben Werther & Damian Conway. A Modest Proposal: C++ Resyntaxed. Section 3.1.1. 1996.
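Two of the points in the list above, operator precedence and function pointer declarations, can be sketched in a few lines. All function and type names here are hypothetical and chosen only for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* The intended parity test: the parentheses are required. */
bool is_even(int x)
{
    return (x & 1) == 0;
}

/* What "x & 1 == 0" means to the compiler: == binds first, so the
   expression is x & (1 == 0), that is x & 0, which is always 0. */
int parsed_without_parens(int x)
{
    return x & (1 == 0);
}

int twice(int n)
{
    return 2 * n;
}

/* Raw declaration syntax: fn is a pointer to a function taking int and
   returning int. The name fn sits in the middle of its own type. */
int apply(int (*fn)(int), int arg)
{
    assert(fn != NULL);
    return fn(arg);
}

/* A typedef is the usual way to make such declarations readable. */
typedef int (*int_transform)(int);

int apply_named(int_transform fn, int arg)
{
    assert(fn != NULL);
    return fn(arg);
}
```

Many compilers warn about the unparenthesized form of the parity test, but the language itself accepts it silently.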
~ ~ ~ ~ ~ ~ ~ ~ ~ ~
Maintenance problems
There are other problems in C that don't directly result in bugs or errors, but do inhibit the ability of a programmer to build a robust, maintainable, large-scale system. Examples of these include:
~ ~ ~ ~ ~ ~ ~ ~ ~ ~
- A fragile system for importing definitions (#include) that relies on literal text inclusion and redundantly keeping prototypes and function definitions in sync, and drastically increases build times.
- A cumbersome compilation model that forces manual dependency tracking and inhibits compiler optimizations between modules (except by link-time optimization).
- A weak type system that lets many clearly erroneous programs compile without errors.
- The difficulty of creating opaque structures, which results in programs that tend to violate information hiding.
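As one possible illustration of the weak typing mentioned in the list above, the sketch below shows two implicit conversions that compile without any cast and silently change the value. Whether these are the cases the list has in mind is an assumption, and the function names are hypothetical.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical helper: the fractional part is silently discarded on return.
   The assertion keeps the value well inside the range of int. */
int truncate_to_int(double value)
{
    assert(value > -1.0e9 && value < 1.0e9);
    return value;  /* implicit double-to-int conversion, no cast needed */
}

/* Hypothetical helper: a negative int silently wraps to a large unsigned value. */
unsigned int as_unsigned(int value)
{
    return value;  /* implicit int-to-unsigned conversion, no cast needed */
}
```

Here truncate_to_int(3.9) yields 3 and as_unsigned(-1) yields UINT_MAX. Compilers accept both functions by default, although stricter warning options can flag them.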
Compiler-external static-checking tools
Tools have been created to help C programmers avoid many of these errors. Automated source code checking and auditing is fruitful in any language, and many such tools exist for C, such as Lint. A common practice is to run Lint over a program when it is first written in order to detect questionable code; once the program passes Lint, it is compiled with the C compiler. There are also libraries for performing array bounds checking and a limited form of automatic garbage collection, but they are not a standard part of C.
Related Topics:
Lint - Automatic garbage collection
~ ~ ~ ~ ~ ~ ~ ~ ~ ~
These tools are not a panacea. Because of C's flexibility, many types of errors, such as misuse of variadic functions, out-of-bounds array indexing, and incorrect memory management, cannot typically be detected. However, some common cases can be recognized.
~ ~ ~ ~ ~ ~ ~ ~ ~ ~
and are licensed under the GNU Free Documentation License.