C Program Compilation
A simple C program
For this lecture, we will learn how to write, compile, and run a very basic C program, and we
will discuss the steps that are involved in creating the executable. The following C program
prints out the text “Hello world!” to the screen:
/*
* File: hello.c
* -------------
* This simple C program prints out the text "Hello world!".
*/
#include<stdio.h>
int main(void) {
printf("Hello world!\n");
}
To compile this program, we will use the gcc compiler in Linux. The name gcc stands for “GNU
Compiler Collection”, and the compiler is used as follows:
$ gcc hello.c
This creates an executable called a.out. To run the executable, you would type
$ ./a.out
Hello world!
If we wanted to create an executable that is named something other than a.out, we would
use the -o option in the form
$ gcc -o hello hello.c
and we could run the hello executable with
$ ./hello
Hello world!
In what follows we will go over the details of what actually happens when you invoke the C
compiler gcc.
The C compilation model
When you invoked the gcc compiler, a series of steps were performed in order to generate
an executable from the text you entered as your C program. These steps are as follows
Source Code (hello.c)
↓
Preprocessor
↓
Compiler
↓
Assembly Code
↓
Assembler
↓
Object Code (hello.o) + Libraries
↓
Linker
↓
Executable (a.out or hello)
The preprocessor
The main function of the C preprocessor is to remove comments from the source code and
interpret preprocessor directives, which are the lines that begin with #. In our
simple hello.c code, the preprocessor strips the comments contained
within the /*...*/ and includes the file called stdio.h, which declares the standard
input/output functions that are commonly called within a C program. The #include directive
can be written either as
#include <file.h>
or as
#include "file.h"
The first form tells the preprocessor to look for the file in the standard include directories,
which on Linux include /usr/include. The second form, which uses the quotes, tells the
preprocessor to look first in the directory of the current source file and then in the standard
include directories. We will go into more detail
on different preprocessor directives as we look at different facets of the C programming
language.
Compiling and Assembling
Once the C preprocessor has stripped the source code of the comments and expanded the
preprocessor directives given by the lines that begin with #, the compiler translates the C
code into assembly language, which is a human-readable form of the machine-level
instructions that manipulate the memory and processor directly.
Usually, you do not need to see the assembly code, but you can create it
with
$ gcc -S hello.c
This will create a file called hello.s, which looks like
.file "hello.c"
.section .rodata
.LC0:
.string "Hello world!\n"
.text
.align 2
.globl main
.type main,@function
main:
pushl %ebp
movl %esp, %ebp
subl $8, %esp
andl $-16, %esp
movl $0, %eax
subl %eax, %esp
subl $12, %esp
pushl $.LC0
call printf
addl $16, %esp
leave
ret
.Lfe1:
.size main,.Lfe1-main
.ident "GCC: (GNU) 3.2 20020903 (Red Hat Linux 8.0 3.2-7)"
This file contains machine-level instructions like pushl and movl, which we will not go over
in much detail. What is important to understand is that the compiler is directly responsible
for converting C syntax into this machine-level code.
You do not usually see this level of compilation. Instead, you see what is known as
the object code. The compiler creates the assembly code, and the assembler then converts the
machine-level instructions into binary code. You can create object code from a C source with
$ gcc -c hello.c
This creates a binary file called hello.o that cannot be viewed with a text viewer.
Linking
The object file hello.o contains a binary version of the machine language that was created
from your source code hello.c. In order to create the executable hello or a.out, you need
to use the linker to combine your object file with the startup code that calls your main
function and with the libraries that contain the functions that your program
uses. In this very simple example, we used the printf function. The printf function is
a standard function provided by the C standard library, and your object file contains only a
reference to it, not its code. In order to use this function, we need the linker to link our
program with the precompiled libraries installed on the system. The linker combines
the object files and libraries and creates the executable hello. When
you type
you type
$ gcc -o hello hello.c
the gcc compiler creates an object file and does the linking for you. However, when you use
the -c option, you create an object file called hello.o. In order to link this object file and
create the executable, you can do the linking yourself by again using the gcc compiler, but
this time you provide the object files as the command line arguments rather than the source
codes. Then you would type
$ gcc hello.o -o hello
When gcc sees object files, it invokes the linker automatically and links the necessary files
to create the hello executable.
Libraries and linking in Linux
The beauty of the C programming language is that C itself is a relatively simple and small
language. The power of C becomes obvious when you use the wealth of precompiled libraries
that have already been written and that you can use in your codes. We will illustrate the use of
one of the standard C libraries in an example.
Let’s say we would like to use the math library in order to take the square root of the
number 7 in our C code. All of the standard libraries in Linux are located in the /usr/lib
directory. With every standard library, there is a header file that contains information about
how to execute each of the functions in that library. In order to determine which header
file contains the function we are looking for, we use the man 3 command at the command
prompt. Let’s say we want to know which header file contains the sqrt function and how
to use this function. To do so, we type
$ man 3 sqrt
This gives us information about the function and the appropriate header file to use:
NAME
sqrt - square root function
SYNOPSIS
#include <math.h>
double sqrt(double x);
From this we know that in order to use the sqrt function, we can call it as sqrt(7), since
the prototype in math.h causes the integer 7 to be converted to a double precision number. This also tells us that the
declaration of the sqrt function can be found in /usr/include/math.h. Our simple C program that
computes the square root of 7 would then be given by the following (we leave out the lengthy comments to
save space, but you should never leave out comments! In this case we use shorter
comments with the //):
// Program math.c
int main(void) {
sqrt(7);
}
If we try and compile our program, we will see that we get an error:
$ gcc math.c -o math
/tmp/cciMjg1a.o: In function ‘main’:
/tmp/cciMjg1a.o(.text+0x1b): undefined reference to ‘sqrt’
collect2: ld returned 1 exit status
This is an error given to us by the linker. In this case the linker is complaining of an
“undefined reference to sqrt”. When you get this error, it means that you did not provide
enough information to the linker, because you need to tell the linker which library contains
the sqrt function. You can tell the linker which libraries must be linked with your
program with the -l flag. In our case, the standard math library is linked with our program
using the -lm flag, as in
$ gcc math.c -lm -o math
Compilation occurs without errors because the linker was able to find the sqrt function in
the standard math library. Note that in our first simple program hello.c, we did not need
to name any library because functions in the standard C library like printf are automatically
linked by gcc. We could, however, have requested this linking explicitly with
$ gcc hello.c -lc -o hello
and this would have linked our program with the standard C library. But the -lc flag is
redundant here because gcc does this linking automatically.
When you need header files
Now let’s assume that in our program we would also like to use a constant that is defined in the
standard math header. Constants like this are commonly defined in C using the #define preprocessor
directive. We will get into this in more detail later, but for now let’s say we need to use the
M_PI constant that is defined in the math header file /usr/include/math.h. We
can find the line with the grep command with
$ grep M_PI /usr/include/math.h
# define M_PI 3.14159265358979323846 /* pi */
which shows us that it is defined in /usr/include/math.h as the value of π to 20 decimal
places. If we tried to use the value of M_PI in our simple C program, as in
// Program math.c
int main(void) {
sqrt(7*M_PI);
}
we get the following compiler error:
$ gcc -c math.c
math.c: In function ‘main’:
math.c:2: ‘M_PI’ undeclared (first use in this function)
It is important to understand that this error is produced by the compiler and not by the
linker. The compiler is attempting to create the object file math.o but can’t in this case
because there is no definition of the name M_PI. In order for the compiler to know what
the value of M_PI is, we need to include the header file math.h. If we do so, and our code
looks like
// Program math.c
#include <math.h>
int main(void) {
sqrt(7*M_PI);
}
then we will get no errors when we try to create the object file with
$ gcc -c math.c
But remember, even by including the header file we still need to tell the linker about the
library that contains the sqrt function if we want to create an executable.
The basics of C programming
Basic structure of a C program
As we saw in the hello.c example in the last lecture, every C program must contain a main
function, since the main function is the first function called when you run your program at
the command line. In its simplest form, a C program is given by
int main(void) {
printf("Hello world!\n");
}
The int stands for the “return type”, which says that the main function returns an integer.
We will get into more detail with functions and return types, but you
can return a number to the command line with the return statement in C, as in
int main(void) {
printf("Hello world!\n");
return 2;
}
When you compile and run your program, it will return a 2 to the command line. You can
access this value with the $? shell variable, which stores the value returned by the last command
executed, as in
$ ./a.out
Hello world!
$ echo $?
2
Just as in shell scripts, you can specify exit codes in C as well with the exit function, which is
declared in stdlib.h and, when called from main, has the same effect as the return statement in the previous example:
int main(void) {
printf("Hello world!\n");
exit(2);
}
Compiling and running this example yields the same result as the previous example. The
void keyword in main(void) says that the main function takes no arguments from
the command line. We will get into this in more detail when we look at
functions.
Note that all statements in C programs end with the semicolon ; except for the preprocessor
directives that begin with #.
Variables
Standard variable types
All variables in C must be declared before they are used. Variables in C are declared with a
type, which determines how much space they use in memory. The more memory a variable
requires, the larger the range of values it can store, or the more precision you can obtain.
For example, to define an integer i, a floating point number x, and a character c in our
simple example, we would use
int main(void) {
int i;
float x;
char c;
printf("Hello world!\n");
exit(2);
}
The int type stores 4-byte integers on most current systems. Since 1 byte contains 8 bits, each such integer can
store 32 bits of information. If the first bit stores the sign, then the last 31 can store a binary
number. The largest binary number that 31 bits can store is 2^31 − 1. We can store a negative
number that is larger in absolute value, −2^31, because in the two's complement representation
used on essentially all modern machines the pattern with the sign bit set and the rest 0 represents
−2^31 rather than a second, negative zero. In C, we can specify the unsigned type, which
roughly doubles the largest value that can be stored because we do not need to store a bit for the sign of the
number. We can also specify the short type, which is just a smaller version of the type int.
The following table lists the available types in Unix and their ranges [1]:
Type                 Space (bytes)   Range (min, max)
char                 1               -, -
unsigned char        1               0, 2^8 − 1
short int            2               −2^15, 2^15 − 1
unsigned short int   2               0, 2^16 − 1
int                  4               −2^31, 2^31 − 1
float                4               ±3.4 × 10^±38
double               8               ±1.7 × 10^±308
You can define variables sequentially in a list or separately, line by line, as in
int i, j, k;
float x, y, z;
You can initialize variables with values at the same time you declare them, such as
int i=0,j=1,k=1000;
float x=0,y=2,z=3;
Global variables
Global variables are defined at the top of your C codes before the main statement. They are
recognized everywhere in the code and not only in the function in which they are defined.
The following C code defines the variables globx and globy as global variables before the
main statement:
float globx, globy;
int main(void) {
...
}
It is normally not good practice to declare global variables in C programs. When the values
never change, an alternative method is to employ preprocessor directives to define constants that you will need
throughout your programs. For example, to define constants that take the place of the above global variables,
you would use the #define directive, as in
#define GLOBX 10
#define GLOBY 20
int main(void) {
...
}
Note the standard practice of writing the names of these constants in caps. Remember, the two
methods are very different. The first method defines and stores the variables in memory
and they can be changed in your program. The second method uses the C preprocessor to
replace all instances of GLOBX with 10 and GLOBY with 20 before any compilation takes place.
Defining your own types
C allows you to define your own types, and this can make your codes much more legible.
For example, you can define a type called price that you can use to define variables that
you use to represent the price of certain objects. To do so, you would use
typedef float price;
int main(void) {
price x, y;
}
The typedef is used in the following way:
typedef existing_type new_type;
The price type example is identical to using
float x, y;
but it is useful to use typedef when you would like the ability to change the types of certain
variables throughout your codes by only changing one line. That is, if you wanted to change
all of your prices to type double, then you would only need to change the typedef line to
typedef double price;
Constant types
You can declare variables that you do not want to be altered in your programs with the const
declaration. For example,
const int constant_integer=2;
declares the value of constant_integer to be 2 and this value cannot be changed in any
part of the code. It is effectively “read-only” and any attempt to assign to it will result in a
compiler diagnostic such as
assignment of read-only variable ‘constant_integer’
which is reported as an error by current versions of gcc and as a warning by some older ones.
The value can still be changed indirectly, for example through a pointer cast, but doing so is
undefined behavior. You use the const declaration mostly to let users of your functions
know that the variables will not be altered.
Formatted output
To print output to the screen, you use the printf function. To print out a string to the
screen, you would use
printf("This is a string.\n");
Any time the \ character is encountered in a string, it begins an escape sequence that stands for a special character.
In this case, the \n sequence prints a newline to the screen. Another commonly used
escape sequence is \t, which prints a tab to the screen, as in
printf("This string is separated\tfrom this one by a tab.\n");
Note that only the first character after the \ is used to determine the escape sequence, and the
next character is printed normally, so you do not need a space after the \t in this example. To print
the actual \ character, you would use two backslashes, as in
printf("This is a backslash: \\\n");
To print out the values of variables, you use the % character. As an example,
float x=5.2543;
printf("The value is %f\n",x);
will print out The value is 5.254300. The default of %f is to print out floating point
numbers with six digits after the decimal point. You can specify how many digits you would like to print after the decimal point with
printf("The value is %.2f\n",x);
which will print out two digits, as in The value is 5.25. If you print out several numbers
that have a different number of digits before the decimal point, then the numbers
may be misaligned, as in
double x=5.2543, y=6558.2391;
printf("The value of x is %.2f\n",x);
printf("The value of y is %.2f\n",y);
which will print out
The value of x is 5.25
The value of y is 6558.24
You can align the numbers by specifying a minimum field width in the format, which
counts the digits before the decimal point, the decimal point itself, and the digits after it. Since the number 6558.24 occupies 7 characters,
we can format the printout so that it prints at least 7 characters, as in
double x=5.2543, y=6558.2391;
printf("The value of x is %7.2f\n",x);
printf("The value of y is %7.2f\n",y);
which will align the decimals when the numbers are printed, as in
The value of x is    5.25
The value of y is 6558.24
Note that the format statement will always print out all of the digits before the decimal
point as well as the number you specify after it. So if you specify %4.2f to print out y in the
above example, you will still get 6558.24. This is because the field width you specify
is just a minimum. Other format specifiers include %d, which is used to print decimal
integers, such as
int i=5, j=6558;
printf("The value of i is %4d\n",i);
printf("The value of j is %4d\n",j);
which will print out
The value of i is    5
The value of j is 6558
You can also print characters with %c , strings with %s, octal numbers with %o, and hexadecimal
numbers with %x.
Formatted input
Formatted input is performed with the scanf function. For example, to read two floating
point numbers from the standard input, you would use
scanf("%f %f",&x,&y);
This would convert input of the form 1 2 appropriately, even though this input is not
written with decimal points. To read in two integers, you would use
scanf("%d %d",&i,&j);
This would work fine with input that does not have decimal points, but if you try to read in
floating point numbers as integers, you will not achieve the desired result. The & sign stands
for the address of the variables. We will go over this when we discuss pointers.
File I/O
Input and output to and from files is identical to that at the terminal, except the
fprintf and fscanf functions are used and they require another argument. This additional
argument is called a file pointer. In order to write two floating point numbers to a file, you
first need to declare the file pointer with the FILE type, and you need to open it, as in
float x=1, y=2;
FILE *file;
file = fopen("file.txt","w");
fprintf(file,"%f %f\n",x,y);
fclose(file);
The function fprintf is identical to the printf function, except now we see it has another
argument file, which is a pointer to the file. Before you use the file variable, you need to
open the file with
file = fopen("file.txt","w");
This opens up the file "file.txt". The second argument, "w", is the mode, which indicates how
the file will be used. The following three modes are allowed:
Mode               String
Open for reading   "r"
Open for writing   "w"
Open and append    "a"
When you are done with the file, you close it with
fclose(file);
The "r" mode is used when you would like to open a file for reading. To read two floating
point numbers from a file, you would use the fscanf function, which is identical to scanf,
except that it takes the file pointer as its first argument, as in
float x, y;
FILE *file;
file = fopen("file.txt","r");
fscanf(file,"%f %f\n",&x,&y);
fclose(file);
You can check to make sure your files are opened correctly (that is, that they exist), by
checking to make sure file is not the predefined NULL pointer. We’ll discuss this pointer in
more detail later, but for now, to ensure that your file was opened correctly, you use
if(!file) printf("File did not open correctly!\n");
Note that in order to use the FILE type, we need to include the standard C header file with
#include<stdio.h>
The fscanf and fprintf functions can be used to print to and read from the terminal
using the stdin, stdout, and stderr file pointers defined in stdio.h. To write to the
standard output of the terminal using fprintf, then, you would use
fprintf(stdout,"%f %f\n",x,y);
which is identical to using
printf("%f %f\n",x,y);
To write to the standard error, you would use
fprintf(stderr,"%f %f\n",x,y);
To read from the terminal’s standard input with fscanf, you would use
fscanf(stdin,"%f %f",&x,&y);
which is identical to using
scanf("%f %f",&x,&y);
Again, in order for the compiler to know what stdin, stdout, and stderr are, you need to
include stdio.h.
Operators
Arithmetic Operators
The arithmetic operators are
*,/,%,+,-
The modulus operator %, when used as
z = x % y;
assigns z the value of the remainder when x is divided by y. The modulus operator works
only with integer types such as int; applying it to a float or double operand is a compile error. The + and - operators have
the same precedence, which is lower than that of *, /, and %, and all arithmetic operators of
equal precedence are evaluated from left to right. Therefore, the expression
2*z-x+y*z%2
is evaluated as
((2*z)-x)+((y*z)%2)
Relational operators
The relational operators are given by
< > <= >=
and the equality operators are
== !=
The relational operators exceed the equality operators in precedence, and the arithmetic
operators exceed the relational operators.
Logical operators
The logical operators are && and ||, which are not to be confused with the bitwise operators
& and |. Logical operators are evaluated from left to right, until the evaluation is known to
be either true or false, while the precedence of the && operator is greater than that of the ||
operator. That is,
a==b && c==d || x==y && w==z
is evaluated as
(a==b && c==d) || (x==y && w==z)
As another example, the logical relation
4==2*2 && 3==5-1 && 2==1
will return false as soon as the 3==5-1 is encountered, since this means the entire statement
is false, without having to evaluate the last false conditional 2==1.
Increment, decrement, and assignment operators
In C, the increment and decrement operators are ++ and --, so that in order to increment a
variable i and decrement it you would use i++ or i--. These operators are provided merely
for source code compactness, since they are identical to i=i+1 and i=i-1. A peculiar aspect
to the increment or decrement operators is that they can be used as prefix operators (++i)
as well as postfix operators (i++). The difference between the two is that i++ returns the
value of i before it is incremented, while ++i increments i and then returns it. For example,
in the following example,
i=2;
j=i++;
the value of j will be 2 while the value of i will be 3. If we use the prefix operator, as in
i=2;
j=++i;
Handout 7 13/03/03 9
the value of j will be 3 while the value of i will also be 3.
Assignment operators provide compact notation to assign an expression using a binary
operator to a left-hand side argument. For example, i=i+2 can be expressed using an
assignment operator as i+=2. This works with the following operators
+ - * / % << >> & ^ |
As another example,
x = x*(2+z)
can be expressed more compactly as
x*=2+z
Note that the expression on the right hand side of the assignment operator is always evaluated
first, in that x*=2+z is not the same as x=x*2+z but is given above by x=x*(2+z).
Precedence and order of operations
So far we have seen that certain operators take precedence over others and that this determines
the order in which the operators are performed. It is important to understand
that in C, every operator performs an operation on its arguments and then returns a result,
and the calculation proceeds from there. If you think about things this way it makes it easier
to understand the result from an operation which contains confusing precedence issues. For
example, in the statement i=i+1, the only way we know what the result will be is by knowing
that the + operator takes precedence over the = operator. So we write the above statement
as i=(i+1). We know that it is not (i=i)+1 only because the + operator takes precedence
over the = operator.
Another important aspect of operators is their associativity. That is, are the operations
evaluated from right to left or from left to right? Associativity determines how things are
evaluated by the way you would place parentheses around the operators. For example, since
+ is performed from left to right, we would evaluate a sum of several operands as
i+j+k+l+m=((((i+j)+k)+l)+m)
This may seem obvious only because of the associative property of addition, in that addition
is identical if it is evaluated from left to right or from right to left, since
((((i+j)+k)+l)+m)=(i+(j+(k+(l+m)))).
However, this does not hold true for division, since
((((i/j)/k)/l)/m) ≠ (i/(j/(k/(l/m)))).
We know that C performs the divisions as they are on the left because associativity of division
is from left to right. The following table shows the associativity of the operators in order of
precedence, for which the operators at the top have the highest precedence [2].
Operator                                Associativity
() [] -> .                              left to right
! ~ ++ -- + - * & (type) sizeof         right to left
* / %                                   left to right
+ -                                     left to right
<< >>                                   left to right
< <= > >=                               left to right
== !=                                   left to right
&                                       left to right
^                                       left to right
|                                       left to right
&&                                      left to right
||                                      left to right
?:                                      right to left
= += -= *= /= %= &= ^= |= <<= >>=       right to left
,                                       left to right
We will deal with the -> and . operators when we discuss structs and pointers.
Conditionals
If and ?
The syntax of the if statement in C is given by
if ( expression )
statement
else
statement
and for multiple if statements, the syntax is given by
if ( expression )
statement
else if ( expression )
statement
else if ( expression )
statement
else
statement
where the final else branch is optional. As an example, we might
use the if statement as
if ( U == 0 )
x = y;
else
x = z;
The if statement first evaluates the expression. A relational or equality expression in C
yields a value that is either 1 (true) or 0 (false), and the if statement treats any nonzero value as true.
Therefore, the expression U == 0 yields a 1 when U is 0. Rather than writing out this comparison, we can more compactly use
if ( !U )
x = y;
else
x = z;
since !U is 1 exactly when U is 0. Similarly, if ( U ) is equivalent to
if ( U != 0 ), and its branch executes when U is nonzero. Note that multiple statements within a branch must be placed
within braces ({}) to form a block. For example,
if ( U ) {
x = y;
z = 1;
} else {
x = z;
y = 1;
}
A more compact form of the if statement involves using the ? operator. For example,
the following if statement
if ( y > z )
x = y;
else
x = z;
can be represented more compactly with
x = (y > z) ? y : z ;
The syntax is given by
expression1 ? expression2 : expression3
This says to evaluate expression1 first. If it is true, then expression2 is evaluated and returned;
otherwise expression3 is evaluated and returned.
Switch
Often you may have a particular variable whose value may take on several possibilities.
Rather than using an if statement, it is more convenient to use the switch statement, for
which the syntax is given by
switch ( expression ) {
case const-expr:
statement
case const-expr:
statement
default:
statement
}
It is not necessary to employ the default statement. If it is present, its statements execute when
none of the cases are satisfied; if it is absent, no statements in the switch execute in that situation. As an example, consider the following switch statement
int i=25;
switch ( i ) {
case 25:
printf("i=25\n");
case 15:
printf("i=15\n");
}
This switch statement tests the value of i. If the first case is true, or when i==25, then
the first printf statement will be evaluated. But now that the first case has been satisfied,
all subsequent statements will be evaluated as well. That is, after one case is satisfied,
evaluation falls into the next case unless a break statement is issued. In the example given
above, since i==25, this code will print out both i=25 and i=15. To prevent this, you need
to use the break statement, as in
int i=25;
switch ( i ) {
case 25:
printf("i=25\n");
break;
case 15:
printf("i=15\n");
break;
}
This will cause execution to break out of the switch statement after one of the cases is
satisfied.
For and while loops
The syntax of a while loop is given by
while ( expression )
statement
This will repeatedly evaluate statement as long as expression is true. For loops are just
special while loops. The syntax of a for loop is given by
for ( expression1 ; expression2 ; expression3 )
statement
The equivalent while loop is given by
expression1;
while ( expression2 ) {
statement
expression3;
}
As an example, consider the for loop which loops through the values of i and prints the
values
for(i=0;i<10;i++) {
printf("i=%d\n",i);
}
The equivalent while loop is
i=0;
while ( i < 10 ) {
printf("i=%d\n",i);
i++;
}
Use of for or while for looping is purely a matter of preference, although some situations
are better suited to using for loops. The above example was clearly suited to using a for
loop since it looped through a specified set of values. At any time during the execution of
the for or while loops, you can stop the loop with the break statement.
Arrays
Arrays are declared to store a certain number of variables of a specified type either on
the stack, which is the memory that is set aside automatically for the local variables of your functions as your program
executes, or in the heap, which is memory that can be dynamically allocated and deallocated
as your program runs. To declare an array called values that contains 5 floating point
elements, you would use
float values[5];
This declaration reserves a fixed total of 20 bytes of memory
(5 floats × 4 bytes/float = 20 bytes) for the array. You can also initialize arrays to contain
given values, such as
float values[] = {1,2,3,4,5};
which will allocate enough space to store the 5 numbers shown as floating point numbers.
You can access these values and print them, for example, in a loop, as follows
for(i=0;i<5;i++)
printf("%f\n",values[i]);
To declare multidimensional arrays on the stack, you would use
float values[5][3];
and you can access these values in a similar manner with
for(i=0;i<5;i++)
for(j=0;j<3;j++)
printf("%f\n",values[i][j]);
Strings
Strings are character arrays that contain the null character '\0' as the final character. They
are NOT simply character arrays. It is important to understand the difference between a
simple array of characters and a string. An array of 5 characters is declared with
char string[5];
This is NOT a string until it contains a terminating null character, for example after
string[4]='\0';
Handout 8 17/03/03 2
If you attempted to print an array of characters that is not terminated with the null character,
then you get nonsense. Strings can be declared in a similar manner to arrays, as in
char string[] = "Hello!";
This declares a character array that is initialized from a string constant. The string “Hello!” has a total of 6
characters, but the actual storage includes 7 characters, because the compiler adds on the
null character '\0'. This is equivalent to the commands
char string[7];
string[0]='H';
string[1]='e';
string[2]='l';
string[3]='l';
string[4]='o';
string[5]='!';
string[6]='\0';
Note that since each element of a string is a character, if you wanted to determine whether
or not a particular string contained a given character, you could use a loop such as
i=0;
while(string[i]!='\0' && string[i]!='l') i++;
if(string[i]=='l') printf("Found l at location %d\n",i);
Functions for use with strings
The standard library header string.h declares a host of functions that are very useful when
dealing with strings. To obtain the length of a string (not counting the null character), you use
strlen(string);
This is useful when performing an operation on each individual character in a string, such
as
for(i=0;i<strlen(string);i++)
  printf("%c\n",string[i]);
To copy the contents of one string into another, equating them does not work! For
example, the following code will not compile, because arrays cannot be assigned to one another:
char string1[10], string2[10];
string2 = string1;
Instead, each character from string1 must be copied into string2 with
for(i=0;i<strlen(string1)+1;i++)
  string2[i]=string1[i];
Note that the for loop ends with the condition i<strlen(string1)+1 rather than with
i<strlen(string1) so that we can be sure to copy the null character. The standard header
string.h provides a function that does this for us:
strcpy(string2,string1);
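which copies string1 into string2, including the null character. Note that strcpy does not
check that string2 is large enough to hold the copy.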
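Because strcpy writes as many characters as the source contains, it is worth checking the sizes first. The following is one possible sketch of such a check; copy_string is a hypothetical helper, not a function from string.h.

```c
#include <stddef.h>
#include <string.h>

/* Copy src into dest only if it fits, including the null character.
   dest_size is the total number of bytes available in dest.
   Returns 1 on success and 0 (leaving dest untouched) if src is too long. */
int copy_string(char *dest, size_t dest_size, const char *src) {
    if (strlen(src) + 1 > dest_size)
        return 0;
    strcpy(dest, src);
    return 1;
}
```

With char string2[10], the call copy_string(string2, sizeof(string2), "Hello!") succeeds because 7 bytes are needed, while a 20-character source would be rejected.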
Another useful function (among many others) is strcmp, which compares two strings. It
returns a negative, zero, or positive value, and is used as
if(strcmp(string1,string2)>0)
  printf("string1 > string2\n");
else if(strcmp(string1,string2)==0)
  printf("string1 == string2\n");
else if(strcmp(string1,string2)<0)
  printf("string1 < string2\n");
If strcmp returns 0, then string1 and string2 are identical. If string1 is less than string2,
then string1 comes before string2 in dictionary order, and vice-versa when string1 is greater
than string2. The comparison uses character codes, so in ASCII every uppercase letter sorts
before every lowercase letter.
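Only the sign of the value returned by strcmp is meaningful, not its magnitude. The sketch below reduces the result to -1, 0 or 1 so the three cases are easy to see; compare_sign is an illustrative name, and the uppercase example assumes an ASCII character set.

```c
#include <string.h>

/* Return -1 if a sorts before b, 0 if they are identical,
   and 1 if a sorts after b. */
int compare_sign(const char *a, const char *b) {
    int r = strcmp(a, b);
    return (r > 0) - (r < 0);
}
```

For example, compare_sign("apple", "banana") is -1 and compare_sign("cat", "car") is 1. In ASCII, compare_sign("Zebra", "apple") is also -1, because 'Z' has a smaller character code than 'a'.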
Functions
Functions in C must be declared before they are used, which is usually done at the beginning
of your source code. The format of a declaration is given by
returntype FunctionName(argtype arg1, argtype arg2,...,etc...);
The actual function definition is written in an identical manner, with the semicolon replaced
by braces that enclose the body of the function, as in
returntype FunctionName(argtype arg1, argtype arg2,...,etc...) {
  returntype returnvariable;
  code for function...
  return returnvariable;
}
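As a concrete instance of this pattern, the sketch below declares a function, uses it, and only then defines it. The names average and average_of_example are made up for this illustration.

```c
/* Declaration: tells the compiler the name, return type and
   argument types before the function is used. */
float average(const float values[], int n);

/* This function can call average because it was declared above,
   even though its definition comes later in the file. */
float average_of_example(void) {
    float values[] = {1, 2, 3, 4, 5};
    return average(values, 5);
}

/* Definition: same first line as the declaration, followed by a body. */
float average(const float values[], int n) {
    float sum = 0.0f;
    int i;
    if (n <= 0)          /* avoid dividing by zero */
        return 0.0f;
    for (i = 0; i < n; i++)
        sum += values[i];
    return sum / n;
}
```

Without the declaration at the top, the compiler would reach the call inside average_of_example before knowing what average returns or what arguments it takes.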
As an example, let’s say we would like to read in a string from the standard input and print
out the first location of a specified character in that string. The header file
/home/cos315/include/cos315.h
declares some useful functions and string types that we can use for these purposes. When
we discussed the declaration of strings before, the memory allocated for these strings was
all allocated on the stack. The functions in the cos315.h header file can be used to allocate
space for strings dynamically in the heap. To declare a string using this header file and
allocate space for it in the heap, we would use
string line = NewString();
This uses the new type called string defined in the header file. We will discuss this
declaration in more detail when we cover pointers in the next lecture. We need to be sure,
however, that every time we allocate a string in dynamic memory, we later free the
memory associated with that string with
FreeString(line);
Once we allocate space for a string, we can read in a line of text from the standard input with
GetLine(line,stdin);
which places the next line read in from stdin into the string line. The following code uses
these functions to create the find function, which finds the location of the first occurrence
of a particular character in a given string.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include "cos315.h"
#define TRUE 1
/*
* Function declarations.
*
*/
int find(string str, char c);
/*
* Main function.
*
*/
int main(void) {
  int location;
  char c = 'l';
  string line = NewString();
  printf("Type quit to exit.\n");
  while(TRUE) {
    printf("String: ");
    GetLine(line,stdin);
    if(!strcmp(line,"quit\n"))
      break; /* leave the loop so that line is freed below */
    location = find(line,c);
    if(location == -1)
      printf("Character %c not found in string %s\n",c,line);
    else
      printf("The character %c is located at index %d in string %s\n",
             c,location,line);
  }
  FreeString(line);
  return 0;
}
/*
* Function: find
* Usage: loc=find(string,'c');
* ----------------------------
* This function returns the first index at which the specified
* character exists in the given string. If it is not found
* then a -1 is returned.
*
*/
int find(string str, char c) {
  int i;
  for(i=0;i<strlen(str);i++)
    if(str[i]==c)
      return i;
  return -1;
}
This code is available on the class server as the file
/home/cos315/assignments/assignment2/find.c
To compile this code, you need to link it with the precompiled object file on the class server
and you also need to tell gcc where to find the header file with the -I option, as in
$ gcc find.c /home/cos315/include/cos315.o -I/home/cos315/include
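For readers without access to the class server, the same idea can be sketched with the standard library alone. This is not the cos315.h interface: it swaps in fgets and a fixed-size buffer supplied by the caller instead of a heap-allocated string, and the names MAX_LINE, read_line and find_char are hypothetical.

```c
#include <stdio.h>
#include <string.h>

#define MAX_LINE 256   /* illustrative buffer size for one line of input */

/* Remove the trailing newline that fgets leaves in the buffer, if any. */
void strip_newline(char *line) {
    size_t n = strlen(line);
    if (n > 0 && line[n - 1] == '\n')
        line[n - 1] = '\0';
}

/* Read one line from in into line (at most size-1 characters).
   Returns 1 on success and 0 at end of input or on a read error. */
int read_line(char *line, int size, FILE *in) {
    if (fgets(line, size, in) == NULL)
        return 0;
    strip_newline(line);
    return 1;
}

/* Return the first index at which c occurs in str, or -1 if absent. */
int find_char(const char *str, char c) {
    int i;
    for (i = 0; str[i] != '\0'; i++)
        if (str[i] == c)
            return i;
    return -1;
}
```

A main loop built on this would declare char line[MAX_LINE], call read_line(line, MAX_LINE, stdin) repeatedly, stop when it returns 0 or when strcmp(line, "quit") is 0, and otherwise print the result of find_char(line, 'l'). Because the buffer lives on the stack, nothing needs to be freed.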