C Language
C (programming language)
"C programming language" redirects here. For the book, see The C Programming
Language.
Not to be confused with C++ or C#.
Paradigm: Multi-paradigm: imperative (procedural), structured
Typing discipline: Static, weak, manifest, nominal
OS: Cross-platform
Filename extensions: .c, .h
Website: www.iso.org/standard/74528.html
         www.open-std.org/jtc1/sc22/wg14/
Influenced by: B (BCPL, CPL), ALGOL 68,[4] assembly, PL/I, FORTRAN
Influenced: Numerous: AMPL, AWK, csh, C++, C--, C#, Objective-C, D, Go, Java,
JavaScript, JS++, Julia, Limbo, LPC, Perl, PHP, Pike, Processing, Python, Rust,
Seed7, Vala, Verilog (HDL),[5] Nim, Zig
Overview
• The language has a small, fixed number of keywords, including a full set
  of control flow primitives: if/else, for, do/while, while, and switch.
  User-defined names are not distinguished from keywords by any kind of sigil.
• It has a large number of arithmetic, bitwise, and logic operators: +, +=, ++, &, ||,
  etc.
• More than one assignment may be performed in a single statement.
• Functions:
  o Function return values can be ignored when not needed.
  o Function and data pointers permit ad hoc run-time polymorphism.
  o Functions may not be defined within the lexical scope of other functions.
  o Variables may be defined within a function, with scope limited to the
    enclosing block.
  o A function may call itself, so recursion is supported.
• Data typing is static, but weakly enforced; all data has a type, but implicit
  conversions are possible.
• User-defined (typedef) and compound types are possible.
  o Heterogeneous aggregate data types (struct) allow related data elements to
    be accessed and assigned as a unit.
  o A union is a structure with overlapping members; only the last member stored
    is valid.
  o Array indexing is a secondary notation, defined in terms of pointer arithmetic.
    Unlike structs, arrays are not first-class objects: they cannot be assigned or
    compared using single built-in operators. There is no "array" keyword in use
    or definition; instead, square brackets indicate arrays syntactically, for
    example month[11].
  o Enumerated types are possible with the enum keyword. They are freely
    interconvertible with integers.
  o Strings are not a distinct data type, but are conventionally implemented as
    null-terminated character arrays.
• Low-level access to computer memory is possible by converting machine
  addresses to pointers.
• Procedures (subroutines not returning values) are a special case of function, with
  an untyped return type void.
• Memory can be allocated to a program with calls to library routines.
• A preprocessor performs macro definition, source code file inclusion,
  and conditional compilation.
• There is a basic form of modularity: files can be compiled separately
  and linked together, with control over which functions and data objects are visible
  to other files via static and extern attributes.
• Complex functionality such as I/O, string manipulation, and mathematical
  functions is consistently delegated to library routines.
• The code generated by compilation makes relatively modest demands on the
  underlying platform, which makes it suitable for creating operating systems and
  for use in embedded systems.

While C does not include certain features found in other languages (such as object
orientation and garbage collection), these can be implemented or emulated, often
through the use of external libraries (e.g., the GLib Object System or the Boehm
garbage collector).
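A few of the features listed above can be seen together in a short sketch. The names `point`, `add`, `mul`, `apply`, and `overview_demo` are invented for illustration; the assertions state what the language rules are expected to guarantee.

```c
#include <assert.h>

/* A heterogeneous aggregate: related values grouped as one unit. */
struct point {
    int x;
    int y;
};

/* Two functions with the same signature, selectable at run time. */
int add(int a, int b) { return a + b; }
int mul(int a, int b) { return a * b; }

/* Calls whichever function the pointer designates
   (ad hoc run-time polymorphism). */
int apply(int (*op)(int, int), int a, int b)
{
    return op(a, b);
}

/* Exercises struct assignment, chained assignment, and function pointers. */
void overview_demo(void)
{
    struct point p = { 1, 2 };
    struct point q;
    int a, b;

    q = p;          /* a struct can be assigned as a unit */
    a = b = q.y;    /* more than one assignment in a single statement */

    assert(q.x == 1 && a == 2 && b == 2);
    assert(apply(add, 3, 4) == 7);
    assert(apply(mul, 3, 4) == 12);
}
```

Passing `add` or `mul` to `apply` selects the behavior at run time without any language-level notion of classes or interfaces.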
Relations to other languages
Many later languages have borrowed directly or indirectly from C, including C++, C#,
Unix's C shell, D, Go, Java, JavaScript (including transpilers), Julia, Limbo, LPC,
Objective-C,
Perl, PHP, Python, Ruby, Rust, Swift, Verilog and SystemVerilog (hardware
description languages).[5] These languages have drawn many of their control
structures and other basic features from C. Most of them (Python being a dramatic
exception) also express highly similar syntax to C, and they tend to combine the
recognizable expression and statement syntax of C with underlying type systems,
data models, and semantics that can be radically different.
History
Early developments
Timeline of C language
Year        Name             Standard
1972        Birth
1978        K&R C
1989/1990   ANSI C, ISO C    ISO/IEC 9899:1990
2023*       C23, C2x
Even after the publication of the 1989 ANSI standard, for many years K&R C was
still considered the "lowest common denominator" to which C programmers
restricted themselves when maximum portability was desired, since many older
compilers were still in use, and because carefully written K&R C code can be legal
Standard C as well.
In early versions of C, only functions that returned types other than int had to be
declared if used before the function definition; functions used without prior
declaration were presumed to return type int .
For example:
The int type specifiers which are commented out could be omitted in K&R C, but
are required in later standards.
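A sketch of the idea, written so that it still compiles under current standards: each `int` marked with a comment is one that K&R C would have allowed the programmer to leave out. The function names and return values are illustrative, and the `(void)` parameter lists are a modern addition that K&R C did not have.

```c
/* Modern, prototype-style rewrite; all names are illustrative only. */
long some_function(void);
int /* omissible in K&R C */ other_function(void);

int /* omissible in K&R C */ calling_function(void)
{
    long test1;
    register int /* omissible in K&R C */ test2;

    test1 = some_function();
    if (test1 > 1)
        test2 = 0;
    else
        test2 = other_function();
    return test2;
}

/* Placeholder definitions so the sketch links on its own. */
long some_function(void) { return 1L; }
int other_function(void) { return 42; }
```

Only `some_function` needed a declaration ahead of its use in K&R C, because it returns `long` rather than `int`.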
Since K&R function declarations did not include any information about function
arguments, function parameter type checks were not performed, although some
compilers would issue a warning message if a local function was called with the
wrong number of arguments, or if multiple calls to an external function used different
numbers or types of arguments. Separate tools such as Unix's lint utility were
developed that (among other things) could check for consistency of function use
across multiple source files.
In the years following the publication of K&R C, several features were added to the
language, supported by compilers from AT&T (in particular PCC[20]) and some other
vendors. These included:
C2x
Main article: C2x
C2x is an informal name for the next (after C17) major C language standard revision.
It is expected to be voted on in 2023 and would therefore be called C23.[24]
Embedded C
Main article: Embedded C
Historically, embedded C programming has required nonstandard extensions to the C
language in order to support exotic features such as fixed-point arithmetic, multiple
distinct memory banks, and basic I/O operations.
In 2008, the C Standards Committee published a technical report extending the C
language[25] to address these issues by providing a common standard for all
implementations to adhere to. It includes a number of features not available in
normal C, such as fixed-point arithmetic, named address spaces, and basic I/O
hardware addressing.
Syntax
Main article: C syntax
C has a formal grammar specified by the C standard.[26] Line endings are generally
not significant in C; however, line boundaries do have significance during the
preprocessing phase. Comments may appear either between the
delimiters /* and */ , or (since C99) following // until the end of the line.
Comments delimited by /* and */ do not nest, and these sequences of characters
are not interpreted as comment delimiters if they appear inside string or character
literals.[27]
C source files contain declarations and function definitions. Function definitions, in
turn, contain declarations and statements. Declarations either define new types
using keywords such as struct , union , and enum , or assign types to and perhaps
reserve storage for new variables, usually by writing the type followed by the variable
name. Keywords such as char and int specify built-in types. Sections of code are
enclosed in braces ( { and } , sometimes called "curly brackets") to limit the scope of
declarations and to act as a single statement for control structures.
As an imperative language, C uses statements to specify actions. The most common
statement is an expression statement, consisting of an expression to be evaluated,
followed by a semicolon; as a side effect of the evaluation, functions may
be called and variables may be assigned new values. To modify the normal
sequential execution of statements, C provides several control-flow statements
identified by reserved keywords. Structured programming is supported by if ...
[ else ] conditional execution and by do ... while , while , and for iterative execution
(looping). The for statement has separate initialization, testing, and reinitialization
expressions, any or all of which can be omitted. break and continue can be used to
leave the innermost enclosing loop statement or skip to its reinitialization. There is
also a non-structured goto statement which branches directly to the
designated label within the function. switch selects a case to be executed based on
the value of an integer expression. Unlike in many other languages, control flow
falls through to the next case unless terminated by a break .
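The fall-through rule can be illustrated with a small sketch. Both function names are invented for this example; the first uses fall-through deliberately to share one result among several labels, and the second counts how many case bodies run when `break` is left out.

```c
/* Number of days in a month (1-12); returns 0 for an invalid month.
   Adjacent case labels share the statement that follows them. */
int days_in_month(int month, int leap_year)
{
    switch (month) {
    case 4: case 6: case 9: case 11:
        return 30;
    case 2:
        return leap_year ? 29 : 28;
    case 1: case 3: case 5: case 7: case 8: case 10: case 12:
        return 31;
    default:
        return 0;
    }
}

/* Counts how many case bodies execute when entering at the given label. */
int bodies_run_from(int start)
{
    int count = 0;

    switch (start) {
    case 1:
        count++;    /* no break: control continues into case 2 */
    case 2:
        count++;    /* no break: control continues into case 3 */
    case 3:
        count++;
        break;      /* leaves the switch */
    default:
        count = -1;
    }
    return count;
}
```

Entering at `case 1` runs all three bodies, while entering at `case 3` runs only one.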
Expressions can use a variety of built-in operators and may contain function calls.
The order in which arguments to functions and operands to most operators are
evaluated is unspecified. The evaluations may even be interleaved. However, all
side effects (including storage to variables) will occur before the next "sequence
point"; sequence points include the end of each expression statement, and the entry
to and return from each function call. Sequence points also occur during evaluation
of expressions containing certain operators ( && , || , ?: and the comma operator).
This permits a high degree of object code optimization by the compiler, but requires
C programmers to take more care to obtain reliable results than is needed for other
programming languages.
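As a rough illustration of why sequence points matter, the sketch below relies on two of the operators named above. The function names are hypothetical, and the last comment describes an expression that should be avoided rather than one that is executed.

```c
#include <stddef.h>

/* && introduces a sequence point and short-circuits, so the pointer is
   tested before it is dereferenced. */
int is_positive(const int *p)
{
    return p != NULL && *p > 0;
}

/* The comma operator also sequences its operands: the increment is
   complete before the right-hand operand reads i. */
int comma_demo(void)
{
    int i = 1;
    int r = (i++, i * 10);

    return r;   /* i is 2 when the right operand is evaluated, so r is 20 */
}

/* By contrast, an expression such as  i = i++ + 1;  modifies i twice with
   no intervening sequence point, so its behavior is undefined. */
```

Code that depends on a particular evaluation order can usually be made reliable by splitting the expression into separate statements.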
Kernighan and Ritchie say in the Introduction of The C Programming Language: "C,
like any other language, has its blemishes. Some of the operators have the wrong
precedence; some parts of the syntax could be better."[28] The C standard did not
attempt to correct many of these blemishes, because of the impact of such changes
on already existing software.
Character set
The basic C source character set includes the following characters:
Reserved words
C89 (32): auto, break, case, char, const, continue, default, do, double, else, enum,
extern, float, for, goto, if, int, long, register, return, short, signed, sizeof,
static, struct, switch, typedef, union, unsigned, void, volatile, while
C99 (5 more): _Bool, _Complex, _Imaginary, inline, restrict
C11 (7 more): _Alignas, _Alignof, _Atomic, _Generic, _Noreturn, _Static_assert,
_Thread_local
Most of the recently reserved words begin with an underscore followed by a capital
letter, because identifiers of that form were previously reserved by the C standard for
use only by implementations. Since existing program source code should not have
been using these identifiers, it would not be affected when C implementations started
supporting these extensions to the programming language. Some standard headers
do define more convenient synonyms for underscored identifiers. The language
previously included a reserved word called entry , but this was seldom implemented,
and has since been removed as a reserved word.[30]
Operators
Main article: Operators in C and C++
C supports a rich set of operators, which are symbols used within an expression to
specify the manipulations to be performed while evaluating that expression. C has
operators for:
arithmetic: + , - , * , / , %
assignment: =
augmented assignment: += , -= , *= , /= , %= , &= , |= , ^= , <<= , >>=
bitwise logic: ~ , & , | , ^
bitwise shifts: << , >>
boolean logic: ! , && , ||
conditional evaluation: ? :
equality testing: == , !=
calling functions: ( )
increment and decrement: ++ , --
member selection: . , ->
object size: sizeof
order relations: < , <= , > , >=
reference and dereference: & , * , [ ]
sequencing: ,
subexpression grouping: ( )
type conversion: (typename)
C uses the operator = (used in mathematics to express equality) to indicate
assignment, following the precedent of Fortran and PL/I, but unlike ALGOL and its
derivatives. C uses the operator == to test for equality. The similarity between these
two operators (assignment and equality) may result in the accidental use of one in
place of the other, and in many cases, the mistake does not produce an error
message (although some compilers produce warnings). For example, the conditional
expression if (a == b + 1) might mistakenly be written as if (a = b + 1) , which
will be evaluated as true if a is not zero after the assignment.[31]
The C operator precedence is not always intuitive. For example, the
operator == binds more tightly than (is executed prior to) the operators & (bitwise
AND) and | (bitwise OR) in expressions such as x & 1 == 0 , which must be written
as (x & 1) == 0 if that is the coder's intent.[32]
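Both pitfalls can be made concrete with a short sketch. The function names are invented for illustration; `is_even_wrong` spells out the grouping that the compiler applies to the unparenthesized form, and `assign_in_condition` shows an assignment standing where a comparison was intended.

```c
/* Returns 1 when x is even. The parentheses are required because ==
   binds more tightly than &. */
int is_even(unsigned x)
{
    return (x & 1) == 0;
}

/* What  x & 1 == 0  actually means: x & (1 == 0), that is x & 0,
   which is 0 for every x. */
int is_even_wrong(unsigned x)
{
    return x & (1 == 0);
}

/* Assignment where a comparison was meant: a receives b + 1, and the
   condition is true whenever that value is nonzero. */
int assign_in_condition(int a, int b)
{
    if ((a = b + 1))    /* doubled parentheses mark the assignment as deliberate here */
        return 1;
    return 0;
}
```

With `a` equal to 5 and `b` equal to 0, the intended test `a == b + 1` would be false, yet `assign_in_condition(5, 0)` reports true.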
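A "hello, world" program in the style of K&R C, relying on an implicit int return
type and an undeclared printf:

    main()
    {
        printf("hello, world\n");
    }

A standard-conforming version of the same program:

    #include <stdio.h>

    int main(void)
    {
        printf("hello, world\n");
    }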
Data types
Main article: C variable types and declarations
The type system in C is static and weak, which makes it similar to the type
system of ALGOL descendants such as Pascal.[35] There are built-in types for integers
of various sizes, both signed and unsigned, floating-point numbers, and enumerated
types ( enum ). Integer type char is often used for single-byte characters. C99 added
a boolean datatype. There are also derived types
including arrays, pointers, records ( struct ), and unions ( union ).
C is often used in low-level systems programming where escapes from the type
system may be necessary. The compiler attempts to ensure type correctness of
most expressions, but the programmer can override the checks in various ways,
either by using a type cast to explicitly convert a value from one type to another, or
by using pointers or unions to reinterpret the underlying bits of a data object in some
other way.
Some find C's declaration syntax unintuitive, particularly for function pointers.
(Ritchie's idea was to declare identifiers in contexts resembling their use:
"declaration reflects use".)[36]
C's usual arithmetic conversions allow for efficient code to be generated, but can
sometimes produce unexpected results. For example, a comparison of signed and
unsigned integers of equal width requires a conversion of the signed value to
unsigned. This can generate unexpected results if the signed value is negative.
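A minimal sketch of the surprise described above, assuming `int` and `unsigned` have the same width (as the paragraph does). The function names are hypothetical.

```c
/* Compares a signed and an unsigned value the way the language does:
   the signed operand is converted to unsigned first. */
int less_than_mixed(int s, unsigned u)
{
    return s < u;   /* a negative s becomes a very large unsigned value */
}

/* A comparison that first rules out negative values. */
int less_than_safe(int s, unsigned u)
{
    return s < 0 || (unsigned)s < u;
}
```

`less_than_mixed(-1, 1u)` yields 0 because -1 converts to the largest `unsigned` value, whereas `less_than_safe(-1, 1u)` yields 1. Many compilers can warn about such mixed comparisons when asked.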
Pointers
C supports the use of pointers, a type of reference that records the address or
location of an object or function in memory. Pointers can be dereferenced to access
data stored at the address pointed to, or to invoke a pointed-to function. Pointers can
be manipulated using assignment or pointer arithmetic. The run-time representation
of a pointer value is typically a raw memory address (perhaps augmented by an
offset-within-word field), but since a pointer's type includes the type of the thing
pointed to, expressions including pointers can be type-checked at compile time.
Pointer arithmetic is automatically scaled by the size of the pointed-to data type.
Pointers are used for many purposes in C. Text strings are commonly manipulated
using pointers into arrays of characters. Dynamic memory allocation is performed
using pointers; the result of a malloc is usually cast to the data type of the data to be
stored. Many data types, such as trees, are commonly implemented as dynamically
allocated struct objects linked together using pointers. Pointers to other pointers
are often used in multi-dimensional arrays and arrays of struct objects. Pointers to
functions (function pointers) are useful for passing functions as arguments to higher-
order functions (such as qsort or bsearch), in dispatch tables, or
as callbacks to event handlers.[34]
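As one illustration of a function pointer passed to a higher-order function, the sketch below hands a comparison callback to the standard library's `qsort`. The names `compare_ints` and `sort_ints` are invented for this example.

```c
#include <stdlib.h>

/* Comparison callback in the form qsort expects: negative, zero, or
   positive according to the ordering of the two elements. */
int compare_ints(const void *a, const void *b)
{
    int x = *(const int *)a;
    int y = *(const int *)b;

    return (x > y) - (x < y);   /* avoids the overflow risk of x - y */
}

/* Sorts n ints in ascending order by passing compare_ints to qsort. */
void sort_ints(int *values, size_t n)
{
    qsort(values, n, sizeof values[0], compare_ints);
}
```

Supplying a different callback, for example one that reverses the sign of the result, changes the ordering without touching `qsort` itself.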
A null pointer value explicitly points to no valid location. Dereferencing a null pointer
value is undefined, often resulting in a segmentation fault. Null pointer values are
useful for indicating special cases such as no "next" pointer in the final node of
a linked list, or as an error indication from functions returning pointers. In appropriate
contexts in source code, such as for assigning to a pointer variable, a null pointer
constant can be written as 0 , with or without explicit casting to a pointer type, or as
the NULL macro defined by several standard headers. In conditional contexts, null
pointer values evaluate to false, while all other pointer values evaluate to true.
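The uses of null pointers mentioned above can be sketched with a small linked list. The `node` type and the `find` function are hypothetical names chosen for illustration.

```c
#include <stddef.h>

struct node {
    int value;
    struct node *next;   /* NULL in the final node of the list */
};

/* Returns the first node holding value, or NULL if there is none. */
struct node *find(struct node *head, int value)
{
    struct node *n;

    for (n = head; n; n = n->next)   /* a null pointer tests as false */
        if (n->value == value)
            return n;
    return NULL;                     /* "not found" indication */
}
```

Callers are expected to test the result against `NULL` before dereferencing it.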
Void pointers ( void * ) point to objects of unspecified type, and can therefore be
used as "generic" data pointers. Since the size and type of the pointed-to object is
not known, void pointers cannot be dereferenced, nor is pointer arithmetic on them
allowed, although they can easily be (and in many contexts implicitly are) converted
to and from any other object pointer type.[34]
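A hedged sketch of a "generic" routine built on `void *`: since a void pointer cannot be dereferenced, the function converts it to `unsigned char *` and works byte by byte. The name `swap_bytes` is an assumption made for this example.

```c
#include <stddef.h>

/* Swaps two non-overlapping objects of any type, given their common
   size in bytes. Any object pointer converts to void * without a cast. */
void swap_bytes(void *a, void *b, size_t size)
{
    unsigned char *pa = a;   /* implicit conversion from void * */
    unsigned char *pb = b;
    size_t i;

    for (i = 0; i < size; i++) {
        unsigned char tmp = pa[i];
        pa[i] = pb[i];
        pb[i] = tmp;
    }
}
```

A call such as `swap_bytes(&x, &y, sizeof x)` works for `int`, `double`, or struct operands alike, at the cost of the compiler no longer checking that both arguments have the same type.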
Careless use of pointers is potentially dangerous. Because they are typically
unchecked, a pointer variable can be made to point to any arbitrary location, which
can cause undesirable effects. Although properly used pointers point to safe places,
they can be made to point to unsafe places by using invalid pointer arithmetic; the
objects they point to may continue to be used after deallocation (dangling pointers);
they may be used without having been initialized (wild pointers); or they may be
directly assigned an unsafe value using a cast, union, or through another corrupt
pointer. In general, C is permissive in allowing manipulation of and conversion
between pointer types, although compilers typically provide options for various levels
of checking. Some other programming languages address these problems by using
more restrictive reference types.
Arrays
See also: C string
Array types in C are traditionally of a fixed, static size specified at compile time. The
more recent C99 standard also allows a form of variable-length arrays. However, it is
also possible to allocate a block of memory (of arbitrary size) at run-time, using the
standard library's malloc function, and treat it as an array.
Since arrays are always accessed (in effect) via pointers, array accesses are
typically not checked against the underlying array size, although some compilers
may provide bounds checking as an option.[37][38] Array bounds violations are therefore
possible and can lead to various repercussions, including illegal memory accesses,
corruption of data, buffer overruns, and run-time exceptions.
C does not have a special provision for declaring multi-dimensional arrays, but rather
relies on recursion within the type system to declare arrays of arrays, which
effectively accomplishes the same thing. The index values of the resulting "multi-
dimensional array" can be thought of as increasing in row-major order. Multi-
dimensional arrays are commonly used in numerical algorithms (mainly from
applied linear algebra) to store matrices. The structure of the C array is well suited to
this particular task. However, in early versions of C the bounds of the array had to
be known fixed values or else explicitly passed to any subroutine that required them,
and dynamically sized arrays of arrays could not be accessed using double indexing.
(A workaround for this was to allocate the array with an additional "row vector" of
pointers to the columns.) C99 introduced "variable-length arrays" which address this
issue.
The following example using modern C (C99 or later) shows allocation of a two-
dimensional array on the heap and the use of multi-dimensional array indexing for
accesses (which can use bounds-checking on many C compilers):
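A minimal sketch along those lines, assuming a compiler that supports variable-length array types (mandatory in C99, optional in C11). The function name `fill_and_sum` and the choice to return the element sum are illustrative, not taken from the text.

```c
#include <stdlib.h>

/* Allocates an n-by-m array of int on the heap, stores i + j in each
   element, and returns the sum of all elements, or -1 if allocation
   fails. Both n and m must be positive. */
long fill_and_sum(int n, int m)
{
    /* p points to a single array of n rows, each holding m ints. */
    int (*p)[n][m] = malloc(sizeof(int[n][m]));
    long sum = 0;

    if (!p)
        return -1;

    for (int i = 0; i < n; i++)
        for (int j = 0; j < m; j++)
            (*p)[i][j] = i + j;     /* ordinary double indexing */

    for (int i = 0; i < n; i++)
        for (int j = 0; j < m; j++)
            sum += (*p)[i][j];

    free(p);
    return sum;
}
```

Because the array type carries both dimensions, no separate "row vector" of pointers is needed, and a single `free` releases the whole block.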
Array–pointer interchangeability
The subscript notation x[i] (where x designates a pointer) is syntactic
sugar for *(x+i) .[39] Taking advantage of the compiler's knowledge of the pointer
type, the address that x + i points to is not the base address (pointed to by x )
incremented by i bytes, but rather is defined to be the base address incremented
by i multiplied by the size of an element that x points to. Thus, x[i] designates
the i+1 th element of the array.
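The equivalence can be checked directly. In the sketch below, `nth` and `subscript_demo` are hypothetical names, and each assertion restates one of the claims in the paragraph above.

```c
#include <assert.h>
#include <stddef.h>

/* Returns element i of the array that x points into, written without
   subscript notation. */
int nth(const int *x, size_t i)
{
    return *(x + i);
}

/* Every assertion here is expected to hold on a conforming implementation. */
void subscript_demo(void)
{
    int x[4] = { 10, 20, 30, 40 };
    int *p = x;

    assert(p[2] == *(p + 2));           /* subscripting is *(p + i) */
    assert(2[p] == p[2]);               /* addition commutes, so this is legal too */
    assert((char *)(p + 1) - (char *)p == (ptrdiff_t)sizeof(int));
                                        /* p + 1 advances one element, not one byte */
    assert(p[0] == 10 && p[3] == 40);   /* p[i] is the (i+1)th element */
}
```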
Furthermore, in most expression contexts (a notable exception is as operand
of sizeof ), an expression of array type is automatically converted to a pointer to the
array's first element. This implies that an array is never copied as a whole when
named as an argument to a function, but rather only the address of its first element is
passed. Therefore, although function calls in C use pass-by-value semantics, arrays
are in effect passed by reference.
The total size of an array A can be determined by applying sizeof to an expression
of array type, as in sizeof A . The size of an element can be determined by applying
the operator sizeof to any element of the array, as in n = sizeof A[0] .
Thus, the number of elements in a declared array A can be determined as sizeof A
/ sizeof A[0] . Note that if only a pointer to the first element is available, as is
often the case in C code because of the automatic conversion described above, the
information about the full type of the array and its length is lost.
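These rules can be sketched as follows. The macro name `ARRAY_LEN` and the function names are assumptions made for this example; the macro is only meaningful where its argument still has array type.

```c
#include <stddef.h>

/* Element count of a declared array; not valid for a plain pointer. */
#define ARRAY_LEN(A) (sizeof (A) / sizeof (A)[0])

/* Inside the function the parameter is only a pointer, so the length
   has to be passed separately. */
int sum(const int *values, size_t n)
{
    int total = 0;

    for (size_t i = 0; i < n; i++)
        total += values[i];
    return total;
}

/* The caller still sees the array type and can compute the length. */
int sum_of_table(void)
{
    int table[5] = { 1, 2, 3, 4, 5 };

    return sum(table, ARRAY_LEN(table));   /* table converts to &table[0] */
}

/* Writes through the pointer: the callee modifies the caller's array,
   which is why arrays behave as if passed by reference. */
void zero_first(int *values)
{
    values[0] = 0;
}
```

Applying `sizeof` to the `values` parameter inside `sum` would give the size of a pointer, not of the caller's array.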
Memory management
One of the most important functions of a programming language is to provide
facilities for managing memory and the objects that are stored in memory. C
provides three princ