2.2 Arrays
Arrays are fundamental data structures to store a collection of data items (values, elements, records) of the same type in contiguous memory locations. Arrays are used to
- store application data records.
- store strings (text data).
- implement other data structures like queues, stacks, heaps, hash tales, and heaps.
- implement mathematical vectors, matrices, and rectangular tables.
In this section, we introduce the concepts of arrays, array operations, and array operations by pointers.
2.2.1 Data structures
Let’s first clarify the term data structure. Simply say, a data structure is a collection of data items of certain types together with a set of operations that can be performed on the data items. In other words, a data structure is a collection of data items (also called elements, or components) connected in certain structures and accessed by defined logics. Formally say, a data structure defines how data items are represented, organized, and stored in memory, and how they can be accessed and operated. Data structures are used in algorithms and programs to represent and operate data.
For example, a list of variables is the simplest form of data structures that we often use in programming. We use variables to represent data values, access variables by their names. The list of local variables of a function are held in contiguous memory locations in the stack region. The list of global variables are held in contiguous memory locations in the data region.
Listing 1: Example of global and local variables.
#include<stdio.h>
int a = 1; // a global variable
int b = 2; // a global variable
int main()
{
int x = 3; // a local variable
int y = 4; // a local variable
printf("a: %d, address: %lu\n", a, &a);
printf("b: %d, address: %lu\n", b, &b);
printf("x: %d, address: %lu\n", x, &x);
printf("y: %d, address: %lu\n", y, &y);
return 0;
}
In Listing { 1}, a and b are global variables, x and y are local variables. Compile and run the program. The output is like the following.
a: 1, address: 4206600 b: 2, address: 4206604 x: 3, address: 6684316 y: 4, address: 6684312
Check the above output, you will notice:
- Addresses of global variables are much smaller that that of local variables. Why?
- The address of global variable b is 4 more than that of global variable a. Why?
- The address of local variable y is 4 less than that of local variable x. Why?
Stop and think:
How a list of global or local variables is arranged in memory?
- Global variables are ordered linearly in contiguous memory locations with increasing addresses in data region by the order of their appearances in program.
- Local variables are ordered linearly in contiguous memory locations with decreasing addresses in stack region by the order of their appearances in a function.
2.2.2 Concepts of arrays
Using a list of variables works well when the number of data items to be represented is fixed and small, but does not work when the number of data items is large. Array is a solution to solve this problem for the case when the data items are of the same data type.
By definition, an array is an ordered list of variables (called elements) of the same data type. Array elements are represented and accessed by their index positions in the linear ordering. The elements of an array are stored in contiguous memory locations.
Array syntax
Arrays are declared using syntax: data_type array_name[length];
.
Here, data_type
is the data type of array elements, array_name
is a name identifier, and length
is a positive integer representing the number of elements in the array (referred to as the length of the array). The declaration tells the compiler to allocate a memory block of length*sizeof(data_type)
bytes for the array, so it can store length
number of data items of type data_type
.
Expression sizeof(array_name)
returns the size (number of bytes) of the array, and sizeof(array_name)/sizeof(data_type)
gives the length of the array.
The elements of the array are accessed by syntax array_name[index]
, where the subscript index
is an integer between 0 and length - 1
representing a position in the array. For example, array_name[0]
represents the first element, which has the lowest address in the array memory block. array_name[length-1]
represents the last element of the array.
Since an array element can be accessed directly using index with a constant number of instructions, array element accessing time is a constant. Accessing any data item in a data structure is often referred to as random access. So arrays are extremely efficient for random access.
The notation array_name[index]
is used as a variable identifier to get/set values, and &array_name[index]
represents the address of array_name[index]
. The following code fragment shows the usage of the bracket notation.
float a[5]; // declare an array of length 5 of float type
a[0] = 10.1; // set value 10.1 to the first element
float b = a[0]; // get the first array element and assign it to another variable
float *ptr = &a[1]; // create pointer ptr pointing to a[1]
Array initialization
Arrays can be initialized at declaration.
Example:
int a[5] = { 4, 1, 7, 2, 1 };
// this declares an int type array of length 5,
// and sets element values 4, 1, 7, 2, 1, respectively
// e.g. a[0] holds value 4, a[4] holds value 1
int a[5] = { 4, 1, 0 };
// this declares an int array of length 5,
// and sets the values 4, 1, 0 for the first three elements, and 0 for the rest elements.
int a[5] = { 0 };
// this declares an int array of length 5, and sets 0 to all elements.
int a[] = { 2, 6, 4 };
// this declares an int array of length 3, and sets element values in order of 2, 6, 4.
Array traversal
Array traversal is to access or visit every element of an array. When traversing an array, we can process array elements by doing something such as printing values, getting addresses, and changing values.
The simple array traversal algorithm uses for loop with subscript index changing from 0 to length-1 for forward direction traversal, or from length-1 to 0 for backward direction traversal.
Example:
int a[5] = { 4, 1, 7, 2, 1 };
int i;
for (i=0; i<5; i++) printf("%d ", a[i]);
// this prints the array data in forward order: 4 1 7 2 1
for (i=4; i>=0; i--) printf("%d ", a[i]);
// this prints the array data in backward order: 1 2 7 1 4
Listing 2: Example of array traversal.
#include <stdio.h>
#define SIZE 5
int a[SIZE]; // global array, will be stored in data region
int main()
{
int i;
printf("global array\n");
printf("Size of array a: %d\n", sizeof(a));
printf("Length of array a: %d\n", sizeof(a)/sizeof(int));
for(i = 0; i < SIZE; i++) {
a[i] = i + 1;
printf("a[%d] : %d, address : %lu\n", i, a[i], &a[i]);
}
char b[] = {'a', 'b', 'c', 'd'}; // local array, will be stored in stack region
printf("local array\n");
int size = sizeof(b);
int length = size / sizeof(char);
printf("Size of array b: %d\n", size);
printf("Length of array b: %d\n", length);
for(i = 0; i < 4; i++) {
printf("b[%d] : %c, address : %lu\n", i, b[i], &b[i]);
}
return 0;
}
The output of the above program is like the following.
global array Size of array a: 20 Length of array a: 5 a[0] : 1, address : 4219876 a[1] : 2, address : 4219880 a[2] : 3, address : 4219884 a[3] : 4, address : 4219888 a[4] : 5, address : 4219892 local array Size of array b: 4 Length of array b: 4 b[0] : a, address : 6684304 b[1] : b, address : 6684305 b[2] : c, address : 6684306 b[3] : d, address : 6684307
List 2 shows the array traversal with initialization, setting values and getting addresses. Check the runtime addresses of array elements in the output. You can see how array elements are arranged in memory.
2.2.3 Creating arrays by dynamic method
The above method of creating an array by declaration is a static method, which allocates the array memory block in data region or stack region. Such arrays (i.e., arrays created by the static method) are sometime called static arrays.
Arrays can also be created by using the dynamic method, namely using malloc() function to allocate memory space for an array. Arrays created by the dynamic method are called **dynamic arrays*, and have memory blocks in the heap region at runtime.
Example:
Listing 3: Example of dynamic array.
The output of the above program is like the following.
a[0]:0, address:12326352 a[1]:1, address:12326356 a[2]:2, address:12326360 a[3]:3, address:12326364 a[4]:4, address:12326368
Note that address values are changing with different runs due to the nature of dynamic memory allocations.
2.2.4 Array operations by pointers
Pointers provide an alternative and efficient method to operate arrays.
Accessing array elements by pointers
Pointers can be used to access array elements. For example, int a[10], *ptr = &a[0];
Then ptr
points to a[0]
, namely ptr
holds the address of a[0]
. *ptr
gives the value of a[0]
.
An array name actually represents the reference of the array, namely, the address of the first element. Statement ptr = a;
is equivalent to ptr = &a[0];
.
Accessing array elements by pointers is more efficient than using subscript index. For example, given int a[10], i=4, *ptr = &a[i];
, then both *ptr
and a[i]
give the value of a[i]
. What is the difference? a[i]
tells the compiler to generate instructions to compute the location of a[i]
, namely, compute array a's starting address + i*4
. While *ptr
gets the location of a[i]
directly from pointer ptr
, so it is more efficient if the pointer is available.
Using pointer arithmetic operations makes array accessing even more efficient. For example, if ptr
points to a[i]
, then after ptr++
, ptr
will point to a[i+1]
, or ptr
will point to a[i-1]
after ptr--
.
List 4 is an example of array traversals in both directions using pointers.
Listing 4: Example of array traversal by pointers.
Output:
value:4, address:6684252 value:3, address:6684248 value:2, address:6684244 value:1, address:6684240 value:0, address:6684236
Array pointer
An array can be viewed as a derived data type. An array pointer is a pointer pointing to an array.
An array pointer can be declared by syntax: data_type (*ptr_name)[k];
.
Then ptr_name
is a pointer to an array of type data_type of length k
.
For example, int (*ptr)[10];
declares array pointer ptr
, which is used to point to an int array of length 10. The notation ptr++
will increase the ptr
value by 10*sizeof(int)
. The following code fragment shows the usages of array pointers.
int a[10], *p;
int (*ptr)[10];
p = a; // equivalent to p = &a[0]; p points to a[0], p++ will increase by sizeof(int)
ptr = &a; // ptr and p has the same value, but ptr++ will increase by sizeof(int)*10.
*ptr points to a[0]
*ptr + i points to a[i]
*(*ptr + i) gives a[i]
We will see the application of array pointers for 2D array operations in the next section.
Pointer array
A pointer array (also called array of pointers) is an array of pointer type elements.
A pointer array is declared by syntax: data_type *ptr_array_name[k]
.
Here, ptr_array_name is an array of pointers. That is, array element ptr_array_name[index]
is a pointer pointing to a data_type
variable. *ptr_array_name[index]
get the value it points to.
Example:
int *ptr[3];
int a=1, b=2, c=3;
ptr[0]=&a;
ptr[1]=&b;
ptr[2]=&c;
printf("%d", *ptr[2]); // output: 3
Stop and think
How do you apply pointer arrays in applications?
An auxiliary pointer array can be used to represent the same type of data objects stored at different locations. By working on the pointer array, we avoid having to work directly on the data objects. For example, when we want to get a sorting result of given data objects, but do not want to change the data objects in their locations because swapping actual data objects will have a high cost in time. We can create a pointer array, such that each element of the pointer array points to a data object. Then do sorting on the pointer array using their pointed data objects for comparisons. Finally, using the sorted pointer array we can get the sorted information of the original data objects.
Using auxiliary data structures to hold pointers to application data objects is a fundamental approach for data accessing in practical programming.
2.2.5 Passing arrays to functions
When input data are stored in an array and a data processing algorithm is implemented in a function, we need to pass the array to the function. For example, we need to pass an array to a sorting function. There are three methods to pass array to a function.
- Passing individual array elements by values
Passing one (or more individual) array element to a function is like passing by value. The value of an array element is copied into the function. The array elements will not be changed by the function.
Example:
Listing 5: Example of passing an array element value to a function.
- Passing individual array elements by references
Passing one (or more) individual array element by reference is the same as passing a variable reference. The address of the array element is copied into the function. Through the address, the array element can be accessed and changed.
Example:
Listing 6: Example of passing an array element reference to a function.
- Passing array by name
Passing array name to a function is actually passing the address of the array to the function. The array address is the address of its first element, from there all array elements can be accessed.
Example:
Listing 7: Example of passing array name to a function.
#include <stdio.h>
void f(int n, int arr[]);
void main()
{
int a[5] = {1, 2, 3, 4, 5};
f(5, a); // or f(5, &a[0]);
printf("\n");
int i;
for(i=0; i<5; i++)
printf("%d ", a[i]);
}
void f(int n, int arr[])
{
int i;
for(i=0; i<n; i++)
printf("%d ", arr[i]);
for(i=0;i<n;i++)
arr[i] = arr[i]*arr[i];
}
/* output:
1 2 3 4 5
1 4 9 16 25
*/
A function with array argument can also be implemented by pointer argument. For example, the function in the above example can be implemented as in the following program.
Listing 8: Example of passing array for pointer arguments.
#include <stdio.h>
void f(int, int *);
void main()
{
int a[5] = {1, 2, 3, 4, 5};
f(5, a); // or f(5, &a[0]);
printf("\n");
int i;
for(i=0; i<5; i++)
printf("%d ", a[i]);
}
void f(int n, int *p)
{
int i;
for(i=0; i<n; i++)
printf("%d ", *(p+i));
for(i=0;i<n;i++)
*(p+i) = *(p+i)* *(p+i);
}
/* output:
1 2 3 4 5
1 4 9 16 25
*/
Note:
- When a function needs to access all elements of an input array, it needs to know the array length. So it is necessary to pass the array length to the function.
- C does not directly support passing whole array data values to a function due to the cost of copying the whole data into the call stack. So passing by array reference is an efficient approach. However, passing array reference to a function can change the external array data by the function.
2.2.6 Summary
You learned that arrays are fundamental data structures to store a collection of data objects of the same type. Array elements stored in contiguous memory space. Array elements can be accessed efficiently using position index. Array can be operated efficiently by pointers.
Stop and think:
How do you create an array and how do you access array elements?
- Create a static array.
- Create a dynamic array.
- Random access array element by index.
- Random access array element by pointer.
2.2.7 Exercises
Self-quiz
Take a moment to review what you have read above. When you feel ready, take this self-quiz which does not count towards your grade but will help you to gauge your learning. Answer the questions posed below.