C programming language


 

The C programming language is a standardized imperative computer programming language developed in the early 1970s by Dennis Ritchie at Bell Labs, building on Ken Thompson's earlier language B, for use on the UNIX operating system. It has since spread to many other operating systems and is one of the most widely used programming languages. C is prized for its efficiency and is the most popular language for writing system software, though it is also used for writing applications. It is also commonly used in computer science education, despite not having been designed for novices.

Features

Overview

C is a relatively minimalist programming language that operates close to the hardware, and is more similar to assembly language than to most high-level languages. Indeed, C is sometimes referred to as "portable assembly", reflecting its important difference from low-level languages such as assembly languages: compilers for C exist for almost every kind of computer, so the same C code can be compiled to run on a very wide range of machines, while any given assembly language runs on at most a few very specific models of computers. For these reasons C has been called a medium-level language.

Related Topics:
Programming language - Assembly language - High-level languages - Portable - Low-level languages - Medium-level language

~ ~ ~ ~ ~ ~ ~ ~ ~ ~

C was created with one important goal in mind: to make it easier to write large programs with fewer errors in the procedural programming paradigm, without burdening the writer of a C compiler with complex language features.

Related Topics:
Procedural programming - Compiler

~ ~ ~ ~ ~ ~ ~ ~ ~ ~

To this end, C has the following important features:

~ ~ ~ ~ ~ ~ ~ ~ ~ ~

"hello, world" example

The following simple application appeared in the first edition of K&R, and has become a standard introductory program in most programming textbooks, regardless of language. The program prints out "hello, world" to standard output, which is usually a terminal or screen display. However, it might be a file or some other hardware device, including the bit bucket, depending on how standard output is mapped at the time the program is executed.

Related Topics:
K&R - Hello, world - Standard output - Bit bucket

~ ~ ~ ~ ~ ~ ~ ~ ~ ~

main()
{
    printf("hello, world\n");
}

~ ~ ~ ~ ~ ~ ~ ~ ~ ~

The above program will compile correctly on most modern compilers that are not in compliance mode. However, it produces several warning messages when compiled with a compiler that conforms to the ANSI C standard. Additionally, the code will not compile if the compiler strictly conforms to the C99 standard, as a return value of type int will no longer be assumed if the source code has not specified otherwise. These messages can be eliminated with a few minor modifications to the original program:

Related Topics:
ANSI C - C99

~ ~ ~ ~ ~ ~ ~ ~ ~ ~

#include <stdio.h>

int main(void)
{
    printf("hello, world\n");
    return 0;
}

~ ~ ~ ~ ~ ~ ~ ~ ~ ~

What follows is a line-by-line analysis of the above program:

~ ~ ~ ~ ~ ~ ~ ~ ~ ~

#include <stdio.h>

~ ~ ~ ~ ~ ~ ~ ~ ~ ~

This first line of the program is a preprocessing directive, #include. It causes the preprocessor (the first tool to examine source code when it is compiled) to substitute for that line the entire text of the file or other entity to which it refers. In this case, the header stdio.h, which contains the declarations of the standard input and output functions, will replace that line. The angle brackets surrounding stdio.h indicate that stdio.h can be found using an implementation-defined search strategy. Double quotes may also be used for headers, thus allowing the implementation to supply (up to) two strategies. Typically, angle brackets are used for headers supplied by the implementation, and double quotes for "in-house" headers.

~ ~ ~ ~ ~ ~ ~ ~ ~ ~

int main(void)

~ ~ ~ ~ ~ ~ ~ ~ ~ ~

This next line indicates that a function named main is being defined. The main function serves a special purpose in C programs: when a program is executed, main() is the first function called. The portion of the code that reads int indicates that the return value (the value to which a call to main evaluates) is an integer. The portion that reads (void) indicates that the main function takes no arguments.

Related Topics:
Main - Void

~ ~ ~ ~ ~ ~ ~ ~ ~ ~

{

~ ~ ~ ~ ~ ~ ~ ~ ~ ~

This opening curly brace indicates the beginning of the definition of the main function.

~ ~ ~ ~ ~ ~ ~ ~ ~ ~

printf("hello, world\n");

~ ~ ~ ~ ~ ~ ~ ~ ~ ~

This line calls (looks up and then executes the code for) a function named printf, which is declared in the included header stdio.h. In this call, printf is passed (provided with) a single argument, the address of the first character in the string literal "hello, world\n". The sequence \n is an escape sequence that is translated to the newline, or end-of-line, character, which is intended to move the output device's current position indicator to the beginning of the next line. The return value of printf is of type int, but since no use is made of it, it is quietly discarded.

~ ~ ~ ~ ~ ~ ~ ~ ~ ~

return 0;

~ ~ ~ ~ ~ ~ ~ ~ ~ ~

This line terminates the execution of the main function and causes it to return the integral value 0.

~ ~ ~ ~ ~ ~ ~ ~ ~ ~

}

~ ~ ~ ~ ~ ~ ~ ~ ~ ~

This closing curly brace indicates the end of the code for the main function.

~ ~ ~ ~ ~ ~ ~ ~ ~ ~

If the above code were compiled and run, it would do the following:

~ ~ ~ ~ ~ ~ ~ ~ ~ ~

  • Print the string "hello, world" onto the standard output device (typically but by no means always a terminal),
  • Move the current position indicator to the beginning of the next line,
  • Then return the integer zero to the application's executor.

Types

C has a type system similar to that of other ALGOL descendants such as Pascal, although different in a number of ways. There are types for integers of various sizes, both signed and unsigned, floating-point numbers, characters, enumerated types (enum), records (struct), and untagged unions (union).

Related Topics:
ALGOL - Pascal - Floating-point number - Records - Union

~ ~ ~ ~ ~ ~ ~ ~ ~ ~

C makes extensive use of pointers, a very simple type of reference that stores the address of a memory location. Pointers can be dereferenced to retrieve the data stored at that address. The address can be manipulated with regular assignment and pointer arithmetic. At runtime, a pointer represents a memory address. At compile-time, it is a complex type that represents both the address and the type of the data. This allows expressions including pointers to be type-checked. Pointers are used for many different purposes in C. Text strings are commonly represented with a pointer to an array of characters. Dynamic memory allocation, which is described below, is performed using pointers.

Related Topics:
Pointer - Reference - Pointer arithmetic - Dynamic memory allocation

~ ~ ~ ~ ~ ~ ~ ~ ~ ~

A null pointer has a reserved value indicating that it points to no valid location. Null pointers are useful for indicating special cases such as the next pointer in the final node of a linked list. Dereferencing a null pointer results in undefined behavior, which on many systems crashes the program. Pointers to type void also exist, and point to objects of unknown type. These are particularly useful for generic programming. Since the size and type of the objects they point to are not known, they cannot be dereferenced, but they can be converted to other types of pointers.

Related Topics:
Null pointer - Linked list - Generic programming

~ ~ ~ ~ ~ ~ ~ ~ ~ ~

Array types in C are of a fixed, static size known at compile-time; this is rarely a hindrance in practice, since one can allocate blocks of memory at runtime using the standard library and treat them like arrays. Unlike many other languages, C typically represents arrays just as it does pointers: as a memory address with an associated data type. Index values are translated into memory addresses by computing an offset from the base address of the array. The array index is not checked against the array bounds, which can result in illegal memory accesses. These may reveal confidential data, corrupt data, or cause run-time errors or exceptions, depending on the situation and the details of the run-time environment.

~ ~ ~ ~ ~ ~ ~ ~ ~ ~

C also supplies multi-dimensional arrays. Their elements are laid out in row-major order. Semantically these arrays function like arrays of arrays, but physically they are stored as a single one-dimensional array with computed offsets.

~ ~ ~ ~ ~ ~ ~ ~ ~ ~

C is often used in low-level systems programming, where it may be necessary to treat an integer as a memory address, a double-precision value as an integer, or one type of pointer as another. For such cases C provides casting, which forces the explicit conversion of a value from one type to another. The use of casts sacrifices some of the safety normally provided by the type system.

~ ~ ~ ~ ~ ~ ~ ~ ~ ~

Data storage

One of the most important functions of a programming language is to provide facilities for managing memory and the objects that are stored in memory. C provides three distinct ways to allocate memory for objects:

~ ~ ~ ~ ~ ~ ~ ~ ~ ~

  • Static memory allocation: space for the object is provided in the binary at compile-time; these objects have an extent (or lifetime) as long as the binary which contains them exists
  • Automatic memory allocation: temporary objects can be stored on the stack, and this space is automatically freed and reusable after the block they are declared in is left
  • Dynamic memory allocation: blocks of memory of any desired size can be requested at run-time using the library function malloc() from a region of memory called the heap; these blocks are reused after the library function free() is called on them
These three approaches are appropriate in different situations and have various tradeoffs. For example, static memory allocation has no allocation overhead, automatic allocation has a small amount of overhead during initialization, and dynamic memory allocation can potentially have a great deal of overhead for both allocation and deallocation. On the other hand, stack space is typically much more limited than either static memory or heap space, and only dynamic memory allocation allows allocation of objects whose size is only known at run-time. Most C programs make extensive use of all three.

~ ~ ~ ~ ~ ~ ~ ~ ~ ~

Where possible, automatic or static allocation is usually preferred because the storage is managed by the compiler, freeing the programmer of the error-prone hassle of manually allocating and releasing storage. Unfortunately, many data structures can grow in size at runtime; since automatic and static allocations must have a fixed size at compile-time, there are many situations in which dynamic allocation must be used. Variable-sized arrays are a common example of this (see "malloc" for an example of dynamically allocated arrays).

~ ~ ~ ~ ~ ~ ~ ~ ~ ~

Syntax

Main article: C syntax

~ ~ ~ ~ ~ ~ ~ ~ ~ ~

Unlike languages such as Fortran 77, C is free-form, allowing programmers to use arbitrary whitespace (rather than rigid lines) in laying out their code. Comments can be included either between the delimiters /* and */, or (in C99) following // until the end of the line.

~ ~ ~ ~ ~ ~ ~ ~ ~ ~

Each source file contains declarations and function definitions. Function definitions, in turn, contain declarations and statements. Declarations either define new types using keywords such as struct, union, and enum, or assign types to and reserve storage for new variables, usually by writing the type followed by the variable name. Keywords such as char and int, as well as the pointer-to symbol *, specify built-in types. Sections of code are enclosed in braces ({ and }) to indicate the extent to which declarations and control structures apply.

~ ~ ~ ~ ~ ~ ~ ~ ~ ~

As an imperative language, C depends on statements to do most of the work. Most statements are expression statements, which simply cause an expression to be evaluated and, in the process, cause variables to receive new values or values to be printed. Control-flow statements are also available for conditional or iterative execution, constructed with reserved keywords such as if, else, switch, do, while, and for. Arbitrary jumps are possible with goto. A variety of built-in operators perform primitive arithmetic, logical, comparative, bitwise, and array indexing operations and assignment. Expressions can also call functions, including a large number of standard library functions, for performing many common tasks.

~ ~ ~ ~ ~ ~ ~ ~ ~ ~