2 C Features - Pointers and Arrays

2.1 Pointers

The unique, core feature of the C programming language is its ability to use memory addresses for data operations. C uses pointers to represent and store addresses and to perform address operations. A pointer is a special type of variable that stores the address of a data object (i.e., a variable instance).

2.1.1 Variable name, value, and memory location

Every variable in a C source program has a name (identifier). At compile time, the compiler reserves a memory block for the variable, and at runtime that block is mapped to an actual memory location in the computer. When a variable is assigned a value in the source program, the compiler generates instructions that write the value to the variable's memory location at runtime. In other words, when the compiler encounters a variable declaration with an initial value, it allocates a memory block for the variable at a relative address and generates instructions to write the value to that block. The relative address is an offset from a base position. The size of the allocated block is the size of the variable's data type. The compiler also records the variable name, the relative address, and the block size in a table so that later uses of the variable can be resolved.

For example, when the compiler processes the statement int x = 1890259661;, it sets aside a 4-byte memory block (assuming a 4-byte int) at a relative address and generates instructions to write the value 1890259661 to that block. At runtime, when these instructions are executed, the memory block is instantiated at an absolute address, i.e., it has an exact memory location, and the value is written there.

To make it convenient to access data objects by their addresses, C provides the unary operator & (the address-of operator, also called the reference operator) to get the address of a variable instance. For example, &x represents the address of variable x. At runtime, the value of &x is the address of the first memory cell of the memory block of the instance of x. The notation &x is called the reference of x.

C also provides the unary operator * to get the value stored at a given address. For example, the expression *(&x) represents the value stored at the memory address &x. We have seen these notations in pass-by-reference function calls. Listing 1 shows an example of using & and * in programming.

Compiling and running the program produces output like the following. Note that the address values may differ from computer to computer.

Value of x is 1890259661
Runtime memory address of x in Hex is 0065FE9C
Runtime memory address of x in decimal is 6684316
Value stored at address 6684316 is 1890259661

Figure 1 illustrates the memory address and value 1890259661 in the memory block.

Figure 1: Variable name, memory, and value.

From Listing 1 and its output we see that x is an int variable assigned the value 1890259661. &x represents the address of x, which is 6684316 in this run. *(&x) represents the value stored at address 6684316, namely 1890259661; that is, *(&x) is the value of variable x.

2.1.2 Concepts of pointers

The memory address of a variable is an integer number: the lowest address of the variable's memory block at runtime. To make it convenient to access data values by their addresses, C introduces a kind of variable, called a pointer, that stores the address of another variable. This makes it possible to access and operate on data objects through their memory addresses.

A pointer is a variable to hold the memory address of another variable.

The syntax of declaring a pointer is data_type *ptr_name;

Here, * tells the compiler that ptr_name is a pointer variable, and data_type specifies that it will store the address of a variable of type data_type.

The syntax to assign an address to a pointer is data_type x; ptr_name = &x;

We say that ptr_name references x, or that ptr_name points to x.

The notation *ptr_name is called dereferencing ptr_name. Here, the unary operator * is called the dereference operator. Dereferencing a pointer is an operation that gets the value stored at the memory location pointed to by the pointer.

The expression *ptr_name tells compiler to generate instructions to:

  1. get the address value stored in ptr_name, then
  2. get the value stored at the memory location at the address given in ptr_name.

This process is called dereferencing a pointer, or getting value pointed by a pointer. In programming, *ptr_name represents the value or data pointed by ptr_name.

Notation *ptr_name can also be used to assign/set/store a value to the memory location pointed by ptr_name as *ptr_name = value.

Now we see, pointers provide an alternative way to get/set (i.e., access) data through the addresses of data objects.

Example:

int x = 10;  // this declares x as an int variable and initializes it with value 10 
int *p;      // this declares p as an int type pointer 
p = &x;      // this assigns the address of x to p
int b = *p;  // this dereferences p, and assigns the value to variable b, b will have value 10
b = b + 10;  // this increases b's value by 10
*p = b;      // this saves the value of b to the memory location pointed to by p, i.e. x, now x will have value 20

Note:

  1. A pointer must point to a valid memory location before it is dereferenced or assigned through; otherwise the behavior is undefined and typically shows up as a runtime error. For example, the following code compiles but causes a problem at runtime.
int *p;   // declare p as an int type pointer 
*p = 20;  // wrong: p does not hold a valid address yet
  2. Dereferencing a pointer provides an alternative method to get and set values at memory locations. It can be less efficient than using the variable name directly, because dereferencing first needs to read the address from the pointer, whereas the variable name gives the location directly. Optimizing compilers often remove this difference.
  3. On most platforms, all data pointer types have the same size. The type of a pointer ensures that the pointer stores the address of a data object of that type, so that dereferencing reads the value correctly. For example, the statements float x = 10.0; int *p = &x; are rejected by the compiler or, at minimum, produce an incompatible-pointer-type diagnostic.
  4. Values of pointers are not application data. They are intermediate or auxiliary runtime data of a program; they are meaningful only while the program is running, and they may differ between runs and between computers.

Listing 2 illustrates using pointer for getting and setting values. Compile and run this program, check the output to see the value of pointer ptr and the value of dereferencing *ptr.

The output is like the following. Note that the address values may be different when you run it.

Value of x is 1890259661
Runtime memory address of x is 6684316
Value of pointer ptr is 6684316
Size of pointer ptr is 4
Runtime memory address of pointer ptr is 6684312
Value of *ptr is 1890259661
Value of *ptr is 10
Value of x is 10

Figure 2 shows the variable x and pointer ptr in memory. Note that the binary number stored at address 6684312 is 00000000 01100101 11111110 10011100, which represents the decimal number 6684316, i.e., the runtime address of x. It also shows that the pointer variable ptr occupies 4 bytes: this example was run on a 32-bit system, where an address is 32 bits (4 bytes). On a 64-bit system, the size of a pointer is 8 bytes.

Figure 2: Pointer and memory

2.1.3 Pointer operations

Pointers represent and store data object addresses. The address values are integers. Pointers support the following operations.

1. Dereferencing operation

The dereferencing operation gets the value pointed to by a pointer. The assignment operation assigns the value of one pointer, or the address of a variable, to another pointer of the same type. The following code shows examples of both operations.

int num1=2, num2= 3, sum=0, mul=0;
int *ptr1, *ptr2, *ptr3;
ptr1 = &num1;        // address assignment, ptr1 is pointing to num1 
ptr2 = &num2;        // address assignment, ptr2 is pointing to num2 
ptr3 = ptr1;         // assign ptr1 to ptr3, ptr3 is pointing to num1 
sum = *ptr1 + *ptr2; // dereferencing ptr1 and ptr2, the value of sum will be 5 at runtime
mul = sum * *ptr3;   // dereferencing ptr3, the value of mul will be 10 at runtime

2. Assignment operation

The dereferencing operation can also be used to set the value at the memory location pointed to by a pointer.

int num;
int *ptr;
ptr = &num;     // address assignment, ptr is pointing to num
*ptr = 2;       // num will have value 2 at runtime

3. Increase and decrease operations

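Pointers support increase and decrease operations by adding or subtracting an integer value k (or an integer expression). Such an operation changes the address value by k times the size of the data type of the pointer. For example, with ptr1 from the code fragment above and a 4-byte int, ptr1-1 is an address 4 bytes below that of num1, and ptr1-2 is 8 (=2*4) bytes below it. Whether these addresses hold num2 or some other variable depends on how the compiler lays out variables in memory, which C does not specify. The C standard defines pointer arithmetic only within an array (or one position past its end), so in practice these operations are used to move between array elements.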

Unary increment (++) and decrement (--) operators are also supported. Expressions ptr1++, ++ptr1, ptr1--, --ptr1 are all valid in C. For example, ptr2++; is equivalent to ptr2 = ptr2+1;. Note that the amount of increment/decrement is also determined by the size of the pointer's data type. For example, given char c = 'a'; char *p = &c;, p++ will increase the value of p by 1. Given float c = 3.14; float *p = &c;, p++ will increase the value of p by 4 (assuming a 4-byte float).

The postfix increment (++) and decrement (--) operators have higher precedence than the dereference operator (*). Therefore, the expression *ptr++ is equivalent to *(ptr++).

4. Comparison operations

Pointers can be compared with relational operators. For example, ptr1 > ptr2, ptr1 == ptr2 and ptr1 != ptr2 are all valid in C. Ordering comparisons such as ptr1 > ptr2 are meaningful only when both pointers point into the same array.


2.1.4 Special pointers

1. Null pointers

A null pointer is a special pointer value that does not point anywhere, i.e., it does not point to any valid memory address. C uses the predefined macro constant NULL to represent the null pointer; it compares equal to 0. For example, NULL can be used to initialize a pointer: int *ptr = NULL; or equivalently int *ptr = 0;

It is good programming practice to set a pointer to NULL when there is no target for it to point to. We can then check whether the pointer is equal to NULL to decide whether it points to something.

if (ptr == NULL)
{
  statement block;
}

2. Generic pointers

A generic pointer is a pointer that has void as its type. For example, void *ptr; declares ptr as a generic pointer.

A generic pointer can point to a variable of any type, but it needs to be cast to a specific pointer type before it is dereferenced or used in pointer arithmetic.

Example:

int a = 10;
void *ptr = &a; 
printf("%d", *(int*)ptr);  // will print 10, (int*)ptr casts ptr to int type pointer

float f = 3.14;
ptr = &f;   
printf("%f", *(float*)ptr); // will print 3.140000 (%f shows six decimal places)

3. Pointers to pointers

C allows a pointer to point to another pointer. To declare a pointer to a pointer, add one asterisk (*) for each level of reference. For example,

int x=10;
int *px;    // pointer pointing to an int variable
int **ppx;  // pointer pointing to an int pointer
px=&x;      // px pointing to x
ppx=&px;    // ppx pointing to px
printf("%d\n", **ppx); // or *(*ppx) will print 10

Listing 3 shows the usage of pointers to pointers.

Output is like the following.

Value of x is 1890259661
Runtime address of x in decimal is 6684316
Value of pointer ptr in decimal is 6684316
Runtime address of pointer ptr is 6684312
The value pointed by ptr is 1890259661
Runtime address of pointer pptr is 6684308
The value  pointed by pptr, i.e., value of *pptr is 6684316
The value pointed by *pptr, i.e. value of *(*ptr) or **pptr, is 1890259661

Stop and think

Does it make sense to have pointers of three asterisks?

2.1.5 Pointers of functions

Pointers can also point to functions. This gives the flexibility to use one function pointer to represent different functions and, furthermore, makes it possible to pass a function to another function by reference.

Function type pointers

Example:

int max(int a, int b);   // prototype of the function to be pointed to
int (*p)(int, int);      // this declares a function type pointer p
p = max;          // this lets p point to function max, that is  
                  // p holds the starting address of function block of max
c = (*p)(a, b);   // this uses function pointer p to call the function

Example of using function type pointers.

Functions as parameters

The following programming example shows how to use function type pointers as function parameters.

2.1.6 Memory allocations in C

Knowing the concepts of pointers, we can look into the dynamic memory allocation for data storage.

Allocating memory to store data is a fundamental task in programming. Simply put, a memory allocation assigns a memory block to store a data value of a certain type. C supports three memory allocation methods: static memory allocation, automatic memory allocation, and dynamic memory allocation.

Static memory allocation refers to the memory allocation done by declarations of global or static variables. At compile time, a global or static variable is allocated a memory block at an address relative to the data region. At runtime, the statically allocated memory blocks are instantiated at absolute addresses in the data region. The size of the data region is determined at compile time by the sum of the sizes of the statically allocated memory blocks, and it does not change at runtime.

Automatic memory allocation refers to the memory allocation done for function arguments and local variables. At compile time, a local or argument variable is assigned a memory block at an address relative to the function scope. At runtime, the memory block of an automatic allocation is placed in the stack region when the function is called, and it is automatically released after the function call finishes. By release we mean that the space of the memory block can be reused by subsequent function calls. Automatic release means that the release is managed by the program execution mechanism, not by program statements.

Dynamic memory allocation refers to memory allocation done by the stdlib library function malloc() and its relatives. The memory block of a dynamic allocation is located in the heap region, and it is not released automatically when the calling function finishes. Dynamically allocated memory blocks can therefore be shared by different functions. When a dynamically allocated memory block is not needed anymore, it can be released with the free() function. Releasing returns the memory block to the pool of free memory space in the heap region.

For example, statement int *p = (int*) malloc (sizeof(int)); allocates a memory block of sizeof(int) bytes in the heap region with address stored in pointer p. Pointer p is used to access the memory block for setting and getting data. Statement free(p); will release the memory block, so that the memory block can be reused by other malloc() calls.

Example:

int *p = (int*) malloc (sizeof(int));
*p = 3;
printf("%d", *p);

The following is a list of C stdlib functions for the dynamic memory operations.

malloc()

Syntax: data_type *ptr = (data_type*) malloc(n*sizeof(data_type));

It allocates memory block of n*sizeof(data_type) bytes and returns a pointer to the first byte of the memory block. It returns NULL if the allocation fails at runtime. The failure may be caused by there being no memory block of the requested size left in the heap region.

calloc()

Syntax: data_type *ptr = (data_type*) calloc(n, sizeof(data_type));

It allocates memory block of n*sizeof(data_type) bytes, initializes its memory cells to 0, and returns a pointer to the first byte of the memory block. It returns NULL if the allocation fails.

realloc()

Syntax: data_type *ptr = (data_type*) realloc(ptr, n);

It changes the size of a previously allocated memory block to a new size of n bytes. The alteration can expand the block by adding more bytes or shrink it by removing bytes at the high end of the block. If the alteration succeeds, it returns a pointer to the resized block; this may be the same address or, when the block had to be moved, a new address with the old contents copied over. If the alteration fails, it returns NULL and leaves the original block unchanged. Failure can happen when the heap has no room for a block of the requested size.

free()

Syntax: free(ptr);

It releases the previously allocated memory block pointed to by ptr back to the heap, so that the memory space of the block can be reused.

Memory leaking

It is important to keep the address of a dynamically allocated memory block in a pointer. If the address is lost, the memory block cannot be accessed, released, or reused. This situation is called a memory leak.

Example:

int *p = (int*) malloc (sizeof(int));
*p = 3;
printf("%d", *p);
p = NULL; // the block's address is lost before free(): this causes a memory leak

The following code does not cause a memory leak.

int *p = (int*) malloc (sizeof(int));
*p = 3;
printf("%d", *p);
free(p);  // release the memory block
p = NULL; // this won't cause a memory leak

2.1.7 Summary

You learned that pointers are variables that store the addresses of other variables. Pointers provide an alternative method to access data objects. Pointer values are intermediate or auxiliary runtime data of a running program; they are meaningful only while the program is running.

Pointers make it possible to perform address operations for data access and manipulation. Such address-based operations bring a great deal of flexibility and power to C programming.

Stop and think:

Make a list of advantages and disadvantages of pointers.


2.1.8 Exercises

Self-quiz

Take a moment to review what you have read above. When you feel ready, take this self-quiz which does not count towards your grade but will help you to gauge your learning. Answer the questions posed below.
