CP264B Lesson 2.2

2.2 Arrays

Arrays are fundamental data structures to store a collection of data items (values, elements, records) of the same type in contiguous memory locations. Arrays are used to

store application data records.
store strings (text data).
implement other data structures like queues, stacks, heaps, hash tales, and heaps.
implement mathematical vectors, matrices, and rectangular tables.

In this section, we introduce the concepts of arrays, array operations, and array operations by pointers.

2.2.1 Data structures

Let’s first clarify the term data structure. Simply say, a data structure is a collection of data items of certain types together with a set of operations that can be performed on the data items. In other words, a data structure is a collection of data items (also called elements, or components) connected in certain structures and accessed by defined logics. Formally say, a data structure defines how data items are represented, organized, and stored in memory, and how they can be accessed and operated. Data structures are used in algorithms and programs to represent and operate data.

For example, a list of variables is the simplest form of data structures that we often use in programming. We use variables to represent data values, access variables by their names. The list of local variables of a function are held in contiguous memory locations in the stack region. The list of global variables are held in contiguous memory locations in the data region.

Listing 1: Example of global and local variables.

#include<stdio.h>
int a = 1;   // a global variable
int b = 2;   // a global variable 
int main()
{
  int x = 3; // a local variable
  int y = 4; // a local variable
  printf("a: %d, address: %lu\n", a, &a);
  printf("b: %d, address: %lu\n", b, &b);
  printf("x: %d, address: %lu\n", x, &x);
  printf("y: %d, address: %lu\n", y, &y);
  return 0;
}

In Listing { 1}, a and b are global variables, x and y are local variables. Compile and run the program. The output is like the following.

a: 1, address: 4206600
b: 2, address: 4206604
x: 3, address: 6684316
y: 4, address: 6684312

Check the above output, you will notice:

Addresses of global variables are much smaller that that of local variables. Why?
The address of global variable b is 4 more than that of global variable a. Why?
The address of local variable y is 4 less than that of local variable x. Why?

Stop and think:

How a list of global or local variables is arranged in memory?

2.2.2 Concepts of arrays

Using a list of variables works well when the number of data items to be represented is fixed and small, but does not work when the number of data items is large. Array is a solution to solve this problem for the case when the data items are of the same data type.

By definition, an array is an ordered list of variables (called elements) of the same data type. Array elements are represented and accessed by their index positions in the linear ordering. The elements of an array are stored in contiguous memory locations.

Array syntax

Arrays are declared using syntax: data_type array_name[length];.

Here, data_type is the data type of array elements, array_name is a name identifier, and length is a positive integer representing the number of elements in the array (referred to as the length of the array). The declaration tells the compiler to allocate a memory block of length*sizeof(data_type) bytes for the array, so it can store length number of data items of type data_type.

Expression sizeof(array_name) returns the size (number of bytes) of the array, and sizeof(array_name)/sizeof(data_type) gives the length of the array.

The elements of the array are accessed by syntax array_name[index], where the subscript index is an integer between 0 and length - 1 representing a position in the array. For example, array_name[0] represents the first element, which has the lowest address in the array memory block. array_name[length-1] represents the last element of the array.

Since an array element can be accessed directly using index with a constant number of instructions, array element accessing time is a constant. Accessing any data item in a data structure is often referred to as random access. So arrays are extremely efficient for random access.

The notation array_name[index] is used as a variable identifier to get/set values, and &array_name[index] represents the address of array_name[index]. The following code fragment shows the usage of the bracket notation.

float a[5];         // declare an array of length 5 of float type
a[0] = 10.1;        // set value 10.1 to the first element
float b = a[0];     // get the first array element and assign it to another variable
float *ptr = &a[1]; // create pointer ptr pointing to a[1]

Array initialization

Arrays can be initialized at declaration.

Example:

int a[5] = { 4, 1, 7, 2, 1 };  
// this declares an int type array of length 5, 
// and sets element values 4, 1, 7, 2, 1, respectively  
// e.g. a[0] holds value 4, a[4] holds value 1

int a[5] = { 4, 1, 0 };  
// this declares an int array of length 5, 
// and sets the values 4, 1, 0 for the first three elements, and 0 for the rest elements.
 
int a[5] = { 0 };
// this declares an int array of length 5, and sets 0 to all elements. 

int a[] = { 2, 6, 4 };
// this declares an int array of length 3, and sets element values in order of 2, 6, 4.

Array traversal

Array traversal is to access or visit every element of an array. When traversing an array, we can process array elements by doing something such as printing values, getting addresses, and changing values.

The simple array traversal algorithm uses for loop with subscript index changing from 0 to length-1 for forward direction traversal, or from length-1 to 0 for backward direction traversal.

Example:

int a[5] = { 4, 1, 7, 2, 1 }; 
int i;

for (i=0; i<5; i++)  printf("%d ", a[i]);  
// this prints the array data in forward order: 4 1 7 2 1
  
for (i=4; i>=0; i--) printf("%d ", a[i]); 
// this prints the array data in backward order: 1 2 7 1 4

Listing 2: Example of array traversal.

#include <stdio.h>
#define SIZE 5
int a[SIZE]; // global array, will be stored in data region
int main()
{
  int i;
  printf("global array\n");
  printf("Size of array a: %d\n", sizeof(a));
  printf("Length of array a: %d\n", sizeof(a)/sizeof(int));
  for(i = 0; i < SIZE; i++) {
    a[i] = i + 1;
    printf("a[%d] : %d, address : %lu\n", i, a[i], &a[i]);
  }
  
  char b[] = {'a', 'b', 'c', 'd'};  // local array, will be stored in stack region
  printf("local array\n");
  int size =  sizeof(b);
  int length =  size / sizeof(char);
  printf("Size of array b: %d\n", size);
  printf("Length of array b: %d\n", length);
  for(i = 0; i < 4; i++) {
    printf("b[%d] :  %c, address : %lu\n", i, b[i], &b[i]);
  }
  return 0;
}

The output of the above program is like the following.

global array
Size of array a: 20
Length of array a: 5
a[0] : 1, address : 4219876
a[1] : 2, address : 4219880
a[2] : 3, address : 4219884
a[3] : 4, address : 4219888
a[4] : 5, address : 4219892
local array
Size of array b: 4
Length of array b: 4
b[0] :  a, address : 6684304
b[1] :  b, address : 6684305
b[2] :  c, address : 6684306
b[3] :  d, address : 6684307

List 2 shows the array traversal with initialization, setting values and getting addresses. Check the runtime addresses of array elements in the output. You can see how array elements are arranged in memory.

2.2.3 Creating arrays by dynamic method

The above method of creating an array by declaration is a static method, which allocates the array memory block in data region or stack region. Such arrays (i.e., arrays created by the static method) are sometime called static arrays.

Arrays can also be created by using the dynamic method, namely using malloc() function to allocate memory space for an array. Arrays created by the dynamic method are called **dynamic arrays*, and have memory blocks in the heap region at runtime.

Example:

Listing 3: Example of dynamic array.

#include<stdio.h>
#include<stdlib.h>     
int main() {
  int length = 5;
  int *a = (int*) malloc(sizeof(int)*length);
  int i;
  for (i=0; i<length; i++)  {
    a[i] = i; 
    printf("a[%d]:%d, address:%lu\n", i, a[i], &a[i]); 
  }
  return 0; 
}

The output of the above program is like the following.

a[0]:0, address:12326352
a[1]:1, address:12326356
a[2]:2, address:12326360
a[3]:3, address:12326364
a[4]:4, address:12326368

Note that address values are changing with different runs due to the nature of dynamic memory allocations.

2.2.4 Array operations by pointers

Pointers provide an alternative and efficient method to operate arrays.

Accessing array elements by pointers

Pointers can be used to access array elements. For example, int a[10], *ptr = &a[0]; Then ptr points to a[0], namely ptr holds the address of a[0]. *ptr gives the value of a[0].

An array name actually represents the reference of the array, namely, the address of the first element. Statement ptr = a; is equivalent to ptr = &a[0];.

Accessing array elements by pointers is more efficient than using subscript index. For example, given int a[10], i=4, *ptr = &a[i];, then both *ptr and a[i] give the value of a[i]. What is the difference? a[i] tells the compiler to generate instructions to compute the location of a[i], namely, compute array a's starting address + i*4. While *ptr gets the location of a[i] directly from pointer ptr, so it is more efficient if the pointer is available.

Using pointer arithmetic operations makes array accessing even more efficient. For example, if ptr points to a[i], then after ptr++, ptr will point to a[i+1], or ptr will point to a[i-1] after ptr--.

List 4 is an example of array traversals in both directions using pointers.

Listing 4: Example of array traversal by pointers.

#include <stdio.h>
int main()
{
  int len = 5, a[len], i=0;
  int *min_ptr = &a[0], *max_ptr=&a[len-1], *ptr;
  
  for (ptr = min_ptr ; ptr <= max_ptr; ptr++) {
    *ptr = i;    
    i++;
  }
  
  for (ptr = max_ptr ; ptr >= min_ptr; ptr--)
    printf("value:%d, address:%lu\n", *ptr, ptr);

  return 0;
}

Output:

value:4, address:6684252
value:3, address:6684248
value:2, address:6684244
value:1, address:6684240
value:0, address:6684236

Array pointer

An array can be viewed as a derived data type. An array pointer is a pointer pointing to an array.

An array pointer can be declared by syntax: data_type (*ptr_name)[k];.

Then ptr_name is a pointer to an array of type data_type of length k.

For example, int (*ptr)[10]; declares array pointer ptr, which is used to point to an int array of length 10. The notation ptr++ will increase the ptr value by 10*sizeof(int). The following code fragment shows the usages of array pointers.

int a[10], *p;
int (*ptr)[10];
p = a;      // equivalent to p = &a[0]; p points to a[0], p++ will increase by sizeof(int)
ptr = &a;   // ptr and p has the same value, but ptr++ will increase by sizeof(int)*10. 

*ptr points to a[0]
*ptr + i  points to a[i]
*(*ptr + i) gives a[i]

We will see the application of array pointers for 2D array operations in the next section.

Pointer array

A pointer array (also called array of pointers) is an array of pointer type elements.

A pointer array is declared by syntax: data_type *ptr_array_name[k].

Here, ptr_array_name is an array of pointers. That is, array element ptr_array_name[index] is a pointer pointing to a data_type variable. *ptr_array_name[index] get the value it points to.

Example:

int *ptr[3];
int a=1, b=2, c=3;
ptr[0]=&a;
ptr[1]=&b;
ptr[2]=&c;
printf("%d", *ptr[2]);  // output: 3

Stop and think

How do you apply pointer arrays in applications?

2.2.5 Passing arrays to functions

When input data are stored in an array and a data processing algorithm is implemented in a function, we need to pass the array to the function. For example, we need to pass an array to a sorting function. There are three methods to pass array to a function.

Passing individual array elements by values

Passing one (or more individual) array element to a function is like passing by value. The value of an array element is copied into the function. The array elements will not be changed by the function.

Example:

Listing 5: Example of passing an array element value to a function.

#include <stdio.h>
void f(int num);
void main()
{
  int a[5] = {1, 2, 3, 4, 5};
  f(a[3]);
}
void f(int num)   
{
  printf("%d", num);
}
/* output:
4
*/

Passing individual array elements by references

Passing one (or more) individual array element by reference is the same as passing a variable reference. The address of the array element is copied into the function. Through the address, the array element can be accessed and changed.

Example:

Listing 6: Example of passing an array element reference to a function.

#include <stdio.h>
void f(int*);
void main()
{
  int a[5] = {1, 2, 3, 4, 5};
  f(&a[3]);
  printf(" %d", a[3]);
}
void f(int *num)
{
  printf("%d", *num);
  *num = 10;
}

/* output:
4 10
*/

Passing array by name

Passing array name to a function is actually passing the address of the array to the function. The array address is the address of its first element, from there all array elements can be accessed.

Example:

Listing 7: Example of passing array name to a function.

#include <stdio.h>
void f(int n, int arr[]);
void main()
{
  int a[5] = {1, 2, 3, 4, 5};
  f(5, a);  //  or f(5, &a[0]);
  printf("\n");
  int i;
  for(i=0; i<5; i++) 
    printf("%d ", a[i]);
 
}
void f(int n, int arr[])
{
  int i;
  for(i=0; i<n; i++) 
    printf("%d  ", arr[i]);
    
  for(i=0;i<n;i++) 
    arr[i] = arr[i]*arr[i];     
}

/* output:
1 2 3 4 5
1 4 9 16 25
*/

A function with array argument can also be implemented by pointer argument. For example, the function in the above example can be implemented as in the following program.

Listing 8: Example of passing array for pointer arguments.

#include <stdio.h>
void f(int, int *);
void main()
{
  int a[5] = {1, 2, 3, 4, 5};
  f(5, a);  //  or f(5, &a[0]);
  printf("\n");
  int i;
  for(i=0; i<5; i++) 
     printf("%d ", a[i]);
 
}
void f(int n, int *p)
{
   int i;
   for(i=0; i<n; i++) 
     printf("%d  ", *(p+i));
    
   for(i=0;i<n;i++) 
     *(p+i) = *(p+i)* *(p+i);     
}

/* output:
1 2 3 4 5
1 4 9 16 25
*/

Note:

When a function needs to access all elements of an input array, it needs to know the array length. So it is necessary to pass the array length to the function.
C does not directly support passing whole array data values to a function due to the cost of copying the whole data into the call stack. So passing by array reference is an efficient approach. However, passing array reference to a function can change the external array data by the function.

2.2.6 Summary

You learned that arrays are fundamental data structures to store a collection of data objects of the same type. Array elements stored in contiguous memory space. Array elements can be accessed efficiently using position index. Array can be operated efficiently by pointers.

Stop and think:

How do you create an array and how do you access array elements?

2.2.7 Exercises

Self-quiz

Take a moment to review what you have read above. When you feel ready, take this self-quiz which does not count towards your grade but will help you to gauge your learning. Answer the questions posed below.

Go back

2 C Features - Pointers and Arrays

2.2 Arrays

2.2.1 Data structures

2.2.2 Concepts of arrays

2.2.3 Creating arrays by dynamic method

2.2.4 Array operations by pointers

2.2.5 Passing arrays to functions

2.2.6 Summary

2.2.7 Exercises