1 Introduction to C

1.1 Introduction

We are not teaching C programming language in this course. We use C to teach data structures. Students taking this course may already learned C. If C is new to you, you are encouraged to take a tutorial on C. But don’t worry if you never touched C. We will introduce features of C from the perspective of studying data structures. As time goes, you will learn C, and eventually be familiar with C programming after completing the course.

First of all, why we use C programming language in this Data Structure II course? Well, C is the de facto standard of programming language. It was used to implement operating systems such as Unix, Linux, Windows OS, and macOS. It is the primary programming language for computer science and computer engineering. C is simple but powerful to do low level operations. Using C for data structures makes it possible to:

  1. Understand fundamentally how data structures work at depth of memory level.
  2. Design, implement, and use data structures for efficient data representation, storage, accessing and processing.
  3. Design application specific data structures to improve the performance of algorithms.

Learning and doing data structures using C will bridge the knowledge gap between computer hardware (processors, memories, etc.) and software by knowing how software works on hardware.

Learning and doing data structures using C will set a solid foundation for follow-up computer science major courses including

  • Algorithm Design and Analysis, in which data structures are used in algorithms to represent and access data.
  • System Programming, in which C programing language and many data structures are used.
  • Operating Systems, in which C programming language and most basic data structures are used.
  • Computer Graphics, in which C++, OpenGL library, and customized high performance data structures are used.
  • Introduction to Compiling, many auxiliary data structures are used in compiler programs.

1.1.1 Background of C

Origin

C is a general-purpose programming language initially developed by Dennis Ritchie (supplementary link) between 1969 and 1973 at AT&T Bell Labs. Dennis Ritchie created the C programming language to rewrite and export the UNIX operating system to different platforms.

UNIX was originally designed and written by Ken Thompson (supplementary link) in assembly language for DEC PDP-7 computers. Since assembly programs are hard to debug and enhance, Thompson designed a small programming language named “B” for rewriting UNIX. The B programming language was based on BCPL (supplementary link), a systems programming language developed in the mid 1960s.

Dennis Ritchie joined the UNIX project and developed an extended version of B with new features to meet the need of UNIX development. He renamed it “C” as the successor of “B”. UNIX was rewritten in C, and was exported to other computer platforms by using platform dependent C compilers.

Further more, application programs written in C could be compiled by the compilers and run on different computers. This means that C programs are portable. The portability makes C a general-purpose programming language for applications.

Standardization

The first book on C, The C Programming Language reference 2, was written by Brian Kernighan (supplementary link) and Dennis Ritchie, published in 1978. It served as a standard reference to learn and use C. It was a must-read book when learning C, and termed as K&R by C programmers.

In 1980s, C continued to evolve toward a general-purpose programming language with new features added and old features removed. The need for a standard description for up-to-date features became critical for the healthy growth of C. Without a standard, different C compilers would have different sets of features, and as a result, C programs compilable by one compiler may not be compilable by another compiler. Thus C programs would be less portable.

The development of C standard began in 1983. The first C standard was completed in 1988, and approved by American National Standards Institute (ANSI) in 1989, and further approved by International Organization for Standardization (ISO) in 1990. This first C standard is known as ANSI C, C89, or C90. While the original C is referred to as K&R C. Further evolution of C and its standard in 1990s leaded to a revised C standard, which was approved by ISO in 1999, known as C99. The most recent standard was C18, approved by ISO in 2018. Refer to the ISO site for more information about C standards (supplementary link).

In this course, we will use the C89 and GNU C compiler GCC (supplementary link) for programming in code examples, labs, and assignments. For details of C89, refer to the C89 draft.

1.1.2 Characteristics of C

Several characteristics make C the de facto standard of programming language.

1. C is a high level programming language

C is a high level compiled programming language. C source code programs are written by programmers using any text editor. But C source code programs are not executable, they need to be converted (compiled) to executable (machine code) programs and save as files, which can then be loaded into memory to run. C compilers are the programs to convert C source code programs to executable programs.

2. C is a low level language

C is also said to be a low level language by means of supporting low level operations such as memory address operations and bitwise operations. This is a unique feature of the C programming language. As a result, C programs can be fast on data accessing and operations. The bitwise operations are close to the machine instructions and work on every bit of data efficiently. The low level operations provide flexibility and power for C programming, but could also make C programs hard to understand, debug, modify, and maintain.

3. C is a portable (or cross-platform) language

A program written in C is compiled by a platform dependent compiler, and then run on the computers or operating systems of the computing platform (supplementary link). It is not like Java programs, which are compiled once and run everywhere.

4. C is a small language

C has a small set of features comparing with many other languages. For example, C has only 32 keywords, a small set of primitive data types, and a minimal set of operations, many of them are close to processor instructions.

5. C is extensible

C is extensible using libraries of functions to extend features and functionalities.

6. C is a stable

C is a stable programming language, not changing much over time due to the standardization.

1.1.3 Applications of C

The standardization and characteristics of C make it the primary computer programming language in computer software development. C compilers are available for almost all computer architectures and operating systems. C are used in many applications and tools, the following is a list of some example applications.

1. C is used for operating systems and high performance computing.

C was used to develop the operating systems including UNIX, Linux, Windows, macOS, iOS, and Android. Particularly, C is the language for software development of embedded systems. Application programs written in C can be integrated with the OS without much overhead, so C is the language to write device drivers, utility programs (such as compilers), system programs (such as shells/command consoles/terminals), and native applications running on top of OS.

2. C is used for interpreters of other programming languages.

C was used to develop interpreters of some other programming languages such as Perl, PHP, and Python. C is also used to develop critical application software when memory or processing power is limited or high performance is desired.

3. C has influenced other programming languages.

C has influenced the development of other programming languages. The syntax and principles of C have been used in the development of some major programming languages. The C++ programming language, created by Bjarne Stroustrup (supplementary link) in 1979, is an extension of C with features of Object-Oriented Programming (OOP). The Java programming language, created by James Gosling in 1984, is based on C’s syntax and OOP of C++. The scripting languages such as Perl and PHP are based on the syntax of C, with interpreters written in C.

1.1.4 C for data structures

The characteristics of C make it a primary language to implement data structures. The features of memory level operations enable to design and implement data structures at low level, and access and operate data components of data structures efficiently.

C is the mostly used language tool to teach data structures. Learning data structures with C helps the fundamental understanding how data structures really work. Knowing how to operate data structures at memory level enables us to design and implement efficient data structures for algorithms and applications.

We use C in Data Structure II to have a deep understanding how data structures work, and to learn how to use data structures for high performance algorithm designs and implementations. The knowledge, skill and experience with C and data structures will set a solid foundation for upper level computer science courses and software developer career.

There are many books and online resources on C programming language. For a fundamental understanding of C programming language, reference 2 is the book to start. If you want to learn C programming in depth, reference 3 is a good book to read.

Notation convention

We mostly use the standard notations in content description. However, there are exceptions. For example, we use * to represent multiplication operator, instead of using the standard mathematical operator \(\cdot\) or \(\times\). We use ≤ to represent less or equal in theoretical proofs, and use <= to represent less or equal in algorithm descriptions. We use non-standard notation A <=> B to present equivalence of statements A and B. We use HTML to present simple mathematical expressions, and use LaTeX to represent complex mathematical expressions.

1.1.5 Exercises

Self-quiz

Take a moment to review what you have read above. When you feel ready, take this self-quiz which does not count towards your grade but will help you to gauge your learning. Answer the questions posed below.

Go back