to the target system. A
special translator is provided to convert the executable file to a
downloadable format.
Lines
Each C text file contains a set of lines. A line contains characters and is
finished by a line terminator (line feed, carriage return). The C compiler
allows several physical lines to be concatenated in a single logical line,
whose length should not exceed 511 characters in order to strictly com-
ply with the ANSI standard. The COSMIC compiler accepts up to 4095
characters in a logical line. Two physical lines are concatenated into a
single logical line if the first line ends with a backslash character '\'
just before the line terminator. This feature is important as the C
language implements special directives, known as preprocessing
directives, whose operands have to be located on the same logical line.
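As a brief illustration of line concatenation, the sketch below spreads a
single preprocessing directive over several physical lines. The names SUM3
and sum3_demo are hypothetical and are used only for this example.

```c
/* One logical line written as four physical lines: each backslash
   must be the last character before the line terminator. */
#define SUM3(a, b, c) \
    ((a) +            \
     (b) +            \
     (c))

/* Returns 1 + 2 + 3 using the multi-line macro above. */
int sum3_demo(void)
{
    return SUM3(1, 2, 3);
}
```

Without the backslashes, the directive would end at the first line
terminator and the remaining lines would be parsed as ordinary C text.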
Comments
Comments are parts of the text that are not meaningful to the compiler,
but are very important for program readability and understanding.
They are removed from the original text and replaced by a single
whitespace character. A comment starts with the sequence /* and ends
with the sequence */. A comment may span several lines, but nesting
comments is not allowed. As an extension to the ANSI standard, the
compiler also accepts C++ style comments, starting with the sequence //
and ending at the end of the same logical line.
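The two comment styles can be sketched as follows. The function name
comment_demo is hypothetical. The // form is assumed to be accepted by the
compiler in use, either as an extension as described above or because the
compiler follows a later revision of the C standard.

```c
/* A standard comment may span
   several physical lines. */
int comment_demo(void)
{
    int total = 10;         /* standard ANSI comment */

    total = total/**/+ 5;   /* the empty comment is replaced by one space */
    // C++ style comment: runs to the end of the logical line
    return total;
}
```

Because each comment is replaced by a single whitespace character, the
assignment above is seen by the compiler as total = total + 5.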
Trigraphs
The C language uses almost all the ASCII character set to build the
language components. Some terminals or workstations cannot display the
full ASCII set, and they need a specific mechanism to have access to all
the needed characters. These special characters are encoded using special
sequences called trigraphs. A trigraph is a sequence of three
characters beginning with two question marks ?? and followed by a
common character. These three characters are equivalent to a single one
from the following table:
??( [
??/ \
??) ]
??' ^
??< {
??! |
??> }
??- ~
??= #
All other sequences beginning with two question marks are left
unchanged.
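The replacement rule can be sketched as a small routine that applies the
table above to a character string. This is only an illustration of the
rule, not the compiler's actual implementation, and the names
trigraph_char and replace_trigraphs are hypothetical. The code deliberately
contains no trigraph sequence itself, since many current compilers handle
trigraphs only when a specific option is given.

```c
#include <stddef.h>

/* Map the third character of a trigraph to its replacement,
   or return 0 when the character does not complete a trigraph. */
static char trigraph_char(char third)
{
    switch (third) {
    case '(':  return '[';
    case '/':  return '\\';
    case ')':  return ']';
    case '\'': return '^';
    case '<':  return '{';
    case '!':  return '|';
    case '>':  return '}';
    case '-':  return '~';
    case '=':  return '#';
    default:   return 0;
    }
}

/* Copy src to dst, replacing each trigraph by its single character.
   dst must have room for at least as many bytes as src, including the
   terminating null character. Returns the length of the result. */
size_t replace_trigraphs(const char *src, char *dst)
{
    size_t n = 0;

    while (*src != '\0') {
        char repl = 0;

        /* Two question marks followed by a character from the table. */
        if (src[0] == '?' && src[1] == '?')
            repl = trigraph_char(src[2]);
        if (repl != 0) {
            dst[n++] = repl;
            src += 3;
        } else {
            dst[n++] = *src++;   /* other sequences are left unchanged */
        }
    }
    dst[n] = '\0';
    return n;
}
```

When calling this routine from C source, the question marks of a test
string are best written with the \? escape so that the compiler does not
translate them first.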
C Language Overview © Copyright 2003 by COSMIC Software
Lexical Tokens
Characters on a logical line are grouped together to form lexical tokens.
These tokens are the basic entities of the language and consist of:
identifiers
keywords
constants
operators
punctuation
Identifiers
An identifier is used to give a name to an object. It begins with a letter
or the underscore character _, and is followed by any number of letters,
digits or underscore characters. Uppercase and lowercase letters are not
equivalent in C, so the two identifiers VAR1 and var1 do not describe the
same object. An identifier may have up to 255 characters. All the
characters are significant for name comparisons.
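A minimal sketch of these rules follows. The function name identifier_demo
and the variable _tmp3 are hypothetical, chosen only to show
case sensitivity and the characters allowed in a name.

```c
/* VAR1 and var1 are two different objects because case is significant. */
int identifier_demo(void)
{
    int VAR1 = 1;
    int var1 = 2;
    int _tmp3 = 3;   /* a leading underscore and digits are allowed */

    return VAR1 * 100 + var1 * 10 + _tmp3;
}
```

Each of the three names designates its own object, so the function
combines three independent values.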
Keywords
A keyword is a reserved identifier used by the language to describe a
special feature. It is used in declarations to describe the basic type of an
object, or in a function body to describe the statements executed.
A keyword name cannot be used as an object name. All C keywords are
lowercase names. Because lowercase and uppercase letters are different,
the uppercase version of a keyword is available as an identifier,
although this is probably not good programming practice.
The C keywords are:
auto double int struct
break else long switch
case enum register typedef
char extern return union
const float short unsigned
continue for signed void
default goto sizeof volatile
do if static while
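The short sketch below shows keywords used in a declaration and in a
statement, together with the uppercase form of a keyword used as an
ordinary identifier. The names keyword_demo and WHILE are hypothetical.

```c
/* "static", "int" and "return" are keywords; "WHILE" is only an
   identifier because keywords are lowercase. */
int keyword_demo(void)
{
    static int WHILE = 4;   /* legal, but confusing to read */

    return WHILE + 1;
}
```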
Constants
A constant is used to describe a numerical value, or a character string.
Numerical constants may be expressed as real constants, integer con-
stants or character constants. Integer constants may be expressed in
decimal, octal or hexadecimal base. The syntax for constants is
explained in the Expressions chapter.
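As a quick preview of the forms a constant can take, the sketch below
writes the same integer value in the three bases and adds one constant of
each other kind. It relies only on standard C notation, and the function
name constants_demo is hypothetical.

```c
/* Counts how many of the comparisons below hold. */
int constants_demo(void)
{
    int dec = 255;                /* decimal */
    int oct = 0377;               /* octal: leading 0 */
    int hex = 0xFF;               /* hexadecimal: leading 0x */
    char chr = 'A';               /* character constant */
    double real = 2.5e1;          /* real constant, equal to 25.0 */
    const char *str = "COSMIC";   /* character string */

    return (dec == oct) + (oct == hex) + (real == 25.0)
         + (chr == 'A') + (str[0] == 'C');
}
```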
Operators and Punctuators
An operator is used to describe an operation applied to one or several
objects. It is mainly meaningful in expressions, but also in declarations.
It is generally a short sequence using non-alphanumeric characters. A
punctuator is used to separate or terminate a list of elements.
C operators and punctuators are:
... && -= >= ~ + ; ]
<<= &= -> >> % , < ^
>>= *= /= ^= & - = {
!= ++ << |= ( . > |
%= += <= || ) / ? }
## -- == ! * : [ #
Note that some sequences are used both as operators and as punctuators,
such as *, =, :, # and ,.
Several punctuators have to be used in pairs, such as ( ), [ ], { }.
When parsing the input text, the compiler tries to build the longest
possible sequence for a token, so when parsing:
a+++++b
the compiler will recognize:
a ++ ++ + b    which is not a valid construct
and not:
a ++ + ++ b    which may be valid.
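The longest-sequence rule can be observed with a shorter expression that
remains valid. In the sketch below, the function name longest_token_demo is
hypothetical. The invalid form is kept inside a comment so that the example
still compiles.

```c
int longest_token_demo(void)
{
    int a = 1;
    int b = 2;
    int r1 = a+++b;       /* tokens: a ++ + b, so r1 is 3 and a becomes 2 */
    int r2 = a++ + ++b;   /* spaces force the split: 2 + 3 gives 5 */

    /* int r3 = a+++++b;     rejected: the tokens are a ++ ++ + b */
    return r1 * 10 + r2;
}
```

Inserting whitespace between tokens is the simplest way to obtain the
intended grouping when the longest-sequence rule would otherwise choose a
different one.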
Declarations
A C program is a set of tokens defining objects or variables, and
functions to operate on these variables. The C language uses a declaration
to associate a type with a name. A type may be a simple type or a complex
type.
A simple type is numerical, and may be integer or real.
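A minimal sketch of declarations using simple types follows. The names
declaration_demo, count and ratio are hypothetical.

```c
/* Each declaration associates a type with a name. */
int declaration_demo(void)
{
    int count = 3;        /* simple type, integer */
    double ratio = 0.5;   /* simple type, real */

    return (int)(count * ratio * 2.0);   /* 3 * 0.5 * 2.0 is 3.0 */
}
```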
Integer Types
An integer type is one of:
char1