Process – a program in execution; process execution must progress in a sequential fashion
Multiple parts
The program code also called the text section
Current activity including program counter, processor registers
Stack containing temporary data
Function parameters, return addresses, local variables
Data section containing global variables
Heap containing memory dynamically allocated during run time
Process
The sizes of the text and data sections are fixed, as their sizes do not change during program run time.
Stack and heap sections can shrink and grow dynamically during program execution.
Each time a function is called, an activation record containing the function's parameters, local variables, and return address is pushed onto the stack; it is popped when the function returns.
A program is a passive entity stored on disk (an executable file); a process is active.
A program becomes a process when its executable file is loaded into memory.
One program can run as several processes, for example when several users execute the same program.
Each of these is a separate process; and although the text sections are equivalent, the data, heap, and stack sections vary.
Process in Memory
The global data section is divided into different sections for (a) initialized data and (b) uninitialized data.
A separate section is provided for the argc and argv parameters passed to the main() function.
The data field refers to initialized data, and bss refers to uninitialized data. (bss is a historical term referring to block started by symbol).
The GNU size command can be used to determine the size (in bytes) of some of these sections.
Process State
new: The process is being created
running: Instructions are being executed
waiting: The process is waiting for some event to occur
ready: The process is waiting to be assigned to a processor
terminated: The process has finished execution
Process Control Block (PCB)
Information associated with each process (also called task control block )
Process state – running, waiting, etc
Program counter – location of the next instruction to execute
CPU registers – contents of all process-centric registers
CPU scheduling information- priorities, scheduling queue pointers
Memory-management information – memory allocated to the process
Accounting information – CPU used, clock time elapsed since start, time limits
I/O status information – I/O devices allocated to process, list of open files
Finer-grained control is discussed in the next chapter.
Process Representation in Linux
Represented by the C structure task_struct:
pid_t pid;                    /* process identifier */
long state;                   /* state of the process */
unsigned int time_slice;      /* scheduling information */
struct task_struct *parent;   /* this process's parent */
struct list_head children;    /* this process's children */
struct files_struct *files;   /* list of open files */
struct mm_struct *mm;         /* address space of this process */
Process Scheduling
Maximize CPU use, and quickly switch processes onto CPU for time sharing.
The number of processes currently in memory is known as the degree of multiprogramming.
The process scheduler selects among available processes for next execution on the CPU.
Generally, most processes can be described as either I/O bound or CPU bound.
A new process is initially put in the ready queue.
It waits there until it is selected for execution, or dispatched.
Once the process is allocated a CPU core and is executing, one of several events could occur:
The process could issue an I/O request and then be placed in an I/O wait queue.
The process could create a new child process and then be placed in a wait queue while it awaits the child’s termination.
The process could be removed forcibly from the core, as a result of an interrupt or having its time slice expire, and be put back in the ready queue.
In the first two cases, the process eventually switches from the waiting state to the ready state and is then put back in the ready queue.
A process continues this cycle until it terminates, at which time it is removed from all queues and has its PCB and resources deallocated.
Context Switch
When the CPU switches to another process, the system must save the state of the old process and load the saved state for the new process via a context switch.
The context of a process is represented in the Process Control Block (PCB).
The complexity of the operating system and the PCB can affect the duration of the context switch.
The time required for a context switch is dependent on hardware support.
Some hardware provides multiple sets of registers per CPU, allowing multiple contexts to be loaded at once.
Switching the CPU core from one process to another is known as a context switch.
During a context switch, the current process’s state is saved, and the state of a different process is restored.
Operations on Processes
The system must provide mechanisms for process creation, process termination, and so on.
A parent process creates child processes, which in turn create other processes, forming a tree of processes.
Generally, a process is identified and managed via a process identifier (pid).
Resource sharing options:
Parent and children share all resources
Children share a subset of the parent's resources
Parent and child share no resources
Execution options:
Parent and children execute concurrently
Parent waits until children terminate
A Tree of Processes in Linux
Process Creation (Cont.)
Address space
Child duplicate of parent
Child has a program loaded into it
System Calls:
fork() system call creates new process
exec() system call used after a fork() to replace the process’ memory space with a new program
System Calls
System calls are the services provided by the Linux kernel.
C programs usually reach them through functions defined in libc, which provides wrappers for many system calls. Manual page section 2 provides more information about system calls.
To get an overview, use “man 2 intro” in a command shell. You can find the list of system calls in the file /usr/include/sys/syscall.h.
It is also possible to invoke a system call directly through the syscall() function, which is declared in <unistd.h>. Each system call has a number, available as a SYS_ constant through <sys/syscall.h>.
fork() System Call
The fork() system call creates a new process that is a copy of the parent process that calls fork().
pid_t fork(void);
Traditionally, this is the only way to create a new process in UNIX:
pid_t pid; pid = fork();
It returns:
● pid == 0 in the child process
● pid > 0 (the process ID of the child) in the parent process
● pid < 0 if there is an error, in which case no child is created
The memory in the child process is a copy of the parent process’s memory.
This copy is optimized by using virtual-memory copy-on-write: the memory of the parent is shared with the child, keeping only one copy of each page in physical memory.
Only when a page is modified by either the parent or the child does the OS make a copy of that page.
This “lazy” copy improves the performance of fork(), since most of the time only a few pages are modified.
During fork(), the open file table (the table of file descriptors) is copied into the child.
However, the open file objects of the parent are shared with the child; only the reference counters of the open file objects are increased. Because both processes refer to the same objects, this allows communication between the parent and its children.
exec System Call Family
The exec system call family replaces the program that the current process is running with a new program.
The process identifier remains the same, but the internal contents of the process, such as its stack, data, and instructions, are replaced by those of the new executable.
This function family can run binary executables and shell scripts.
Several functions of this family are declared in the unistd.h header.
They do not create a new process; they load another executable into the calling process. The exec family includes the following functions.
execl
execlp
execle
execv
execvp
execve
execl( )
The first parameter is the path of the binary executable. The second parameter becomes argv[0] of the new program and is, by convention, the name of the executable. The remaining parameters are the arguments you want to pass, such as optional parameters or flags required by the program, followed by a NULL pointer.
This function is declared in the unistd.h header. Its return type is an integer. If the call fails, it returns -1; on success it does not return, because the calling program has been replaced. The following shows the syntax.
int execl(const char *path, const char *arg, ..., (char *) NULL);
path is the path of the binary executable.
arg is the first argument (argv[0]), conventionally the name of the executable.
... stands for a variable number of arguments, which means you can pass any number of arguments.
NULL terminates the argument list and must be the last parameter of execl().
execlp( )
This function works like execl(), but it can locate the executable for you.
If the file name contains no slash, execlp() searches the directories listed in the PATH environment variable, so standard system executables need no path. A custom executable outside PATH still requires its path.
The return type of this function is an integer. It returns -1 if an error occurs and does not return on success. The following shows the syntax.
int execlp(const char *file, const char *arg, ..., (char *) NULL);
file is the name of the executable, or its path if it contains a slash.
arg is the first argument (argv[0]), conventionally the name of the executable.
... stands for a variable number of arguments, which means you can pass any number of arguments.
NULL terminates the argument list and must be the last parameter of execlp().
execle( )
This function works similarly to execl().
The major difference is that you can pass your own environment variables as an array. The new program receives exactly the environment given in the envp array.
The return type of this function is an integer. It returns -1 on an error and does not return on success. The following shows the syntax.
int execle(const char *path, const char *arg, ..., (char *) NULL, char *const envp[]);
path is the path of the binary executable.
arg is the first argument (argv[0]), conventionally the name of the executable.
... stands for a variable number of arguments, which means you can pass any number of arguments.
NULL terminates the argument list; it comes immediately before envp.
envp is an array of strings of the form NAME=value that becomes the environment of the new program. The last element of the array is a NULL value.
execv( )
The execv() function differs slightly from the three functions above.
Instead of listing the arguments one by one, you pass them as an argv array.
The last element of this array is a NULL value.
The return type of this function is an integer. It returns -1 on an error and does not return on success. The following shows the syntax.
int execv(const char *path, char *const argv[]);
The path argument points to the path of the executable to run.
argv is the second argument. It is a NULL-terminated array of character pointers that becomes the argument vector of the new program.
execvp( )
This function works the same as execv().
The major difference is that, as with execlp(), you don't need to pass the path for system executables.
If the file name contains no slash, execvp() searches the directories listed in PATH. For example, given the program name ls, execvp() finds its path in the system and runs it. The following shows the syntax.
int execvp(const char *file, char *const argv[]);
file is the name of the executable, or its path if it contains a slash.
argv is a NULL-terminated array of character pointers that holds the arguments of the executable.
execve( )
This function relates to execv() as execle() relates to execl(): it takes an argv array plus an environment.
You can pass environment variables in envp, and the new program can access them.
int execve(const char *path, char *const argv[], char *const envp[]);
Example
The following program lists the current directory by running ls -la . through execvp().
#include <stdio.h>
#include <unistd.h>

int main(void)
{
    char *program_name = "ls";
    /* A NULL-terminated array of character pointers */
    char *args[] = {program_name, "-la", ".", NULL};
    /* execvp() searches PATH, so no path is needed for system executables */
    execvp(program_name, args);
    /* Reached only if execvp() failed */
    perror("execvp");
    return 1;
}
wait System Call
A process may need to wait for resources or for other processes to complete execution.
A common situation that occurs after the creation of a child process is that the parent process needs to wait, or suspend, until the child process has completed.
After the child process completes, the parent process resumes execution. The job of the wait system call is to suspend the parent process until one of its child processes terminates.
The wait system call is declared in the sys/wait.h header file.
The return type of the wait system call is pid_t. When a child process terminates, wait returns that child's process ID to the parent process.
If the process doesn't have any child processes, the wait call does not suspend it: the call fails immediately and returns -1.
If the parent process has multiple child processes, each wait() call returns as soon as any one of them has terminated, so the parent calls wait() once per child to collect them all.
The following shows the syntax.
pid_t wait(int *status);
wait(NULL);
This system call takes a pointer to an integer in which the child's termination status is stored, and returns the process ID of the terminated child. If you are not interested in the status, you can pass NULL.
Process Termination - exit System Call
It is declared in the stdlib.h header.
The return type of this function is void, and it never returns to its caller.
It terminates the program normally at the point where exit() is called.
exit() flushes all open streams that hold unwritten buffered data.
It then closes all open streams and removes all files created by the tmpfile() function.
The program's behaviour is undefined if it calls the exit function more than once, or calls both exit and quick_exit.
The following shows the syntax: void exit(int status)
Two calls are common by convention: exit(0) and exit(1). The status is the value returned to the parent process. It can also be written as EXIT_SUCCESS for successful termination (0) or EXIT_FAILURE for abnormal termination (conventionally 1).
C Program Forking Separate Process
Process Termination
Process executes last statement and then asks the operating system to delete it using the exit() system call.
Returns status data from child to parent (via wait() )
Process' resources are deallocated by the operating system.
Parent may terminate the execution of child processes using the abort() system call.
Some reasons for doing so:
Child has exceeded allocated resources
Task assigned to child is no longer required
The parent is exiting and the operating system does not allow a child to continue if its parent terminates
Some operating systems do not allow a child to exist if its parent has terminated. If a process terminates, then all its children must also be terminated.
Cascading termination - All children, grandchildren, etc. are terminated. The termination is initiated by the operating system.
The parent process may wait for termination of a child process by using the wait() system call.
The call returns status information and the pid of the terminated process:
pid = wait(&status);
Zombie Process
When a process terminates, its resources are deallocated by the operating system. However, its entry in the process table must remain there until the parent calls wait(), because the process table contains the process’s exit status.
A zombie process is a process that has finished executing but still has an entry in the process table because its parent has not yet invoked wait().
Generally, all processes pass through this state only briefly. Once the parent calls wait(), the process identifier of the zombie process and its entry in the process table are released.
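#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

int main(void)
{
    pid_t child_pid = fork();
    if (child_pid > 0) {
        /* Parent process: sleeps without calling wait(), so the
           finished child stays a zombie for about 10 seconds. */
        printf("In Parent Process.!\n");
        sleep(10);
    } else if (child_pid == 0) {
        /* Child process: exits immediately. */
        printf("In Child process.!\n");
        exit(0);
    } else {
        perror("fork");
        return 1;
    }
    return 0;
}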
Orphan Process
An orphan process is a child process that continues to run even after its parent process has terminated.
A child process becomes an orphan when either of the following occurs.
The parent process finishes its task and terminates without waiting for or terminating the child process.
The parent process terminates abnormally.
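#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

int main(void)
{
    pid_t child_pid = fork();
    if (child_pid > 0) {
        /* Parent process: returns at once, leaving the child running. */
        printf("In Parent Process.!\n");
    } else if (child_pid == 0) {
        /* Child process: still asleep when the parent terminates,
           so it becomes an orphan. */
        printf("In Child process.!\n");
        sleep(10);
        printf("After Sleep Time\n");
    } else {
        perror("fork");
        return 1;
    }
    return 0;
}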
Interprocess Communication
Cooperating processes may need to communicate with each other. Two models of interprocess communication are shared memory and message passing.
Shared Memory: In shared memory, processes access a shared memory region; once the region is established, accesses need no system calls. It is faster than message passing but requires synchronization.
Message Passing: Message passing involves using system calls to send and receive messages between processes. It is easier to implement but slower than shared memory.
Some applications, like the Google Chrome browser, use a multiprocess architecture where different processes handle different tasks, such as user interface, rendering, and plugins.
Communications Models
(a) Message passing. (b) shared memory.
Interprocess Communication – Shared Memory
An area of memory shared among the processes that wish to communicate.
The processes need to establish a shared region of memory.
Typically, a shared-memory region resides in the address space of the process creating the shared-memory segment.
Other processes that wish to communicate using this shared-memory segment must attach it to their address space.
Normally, the operating system tries to prevent one process from accessing another process's memory. Shared memory requires that two or more processes agree to remove this restriction.
The communication is under the control of the user processes, not the operating system.
A major issue is to provide a mechanism that will allow the user processes to synchronize their actions when they access shared memory. Synchronization is discussed in great detail in Chapter 5.
Classical Producer-Consumer Problem
For cooperating processes, a producer process produces information that is consumed by a consumer process.
Unbounded-buffer places no practical limit on the size of the buffer. The consumer may have to wait for new items, but the producer can always produce them.
Bounded-buffer assumes that there is a fixed buffer size. The consumer may have to wait if the buffer is empty, and the producer may have to wait if it is full.
Bounded-Buffer
Shared data
#define BUFFER_SIZE 10
typedef struct {
    . . .
} item;
item buffer[BUFFER_SIZE];
int in = 0;
int out = 0;
The variable in points to the next free position in the buffer; out points to the first full position in the buffer. The buffer is empty when in == out; the buffer is full when ((in + 1) % BUFFER_SIZE) == out.
Solution is correct, but can only use BUFFER_SIZE-1 elements.
Bounded-Buffer - Producer
item next_produced;
while (true) {
    /* produce an item in next_produced */
    while (((in + 1) % BUFFER_SIZE) == out)
        ; /* do nothing */
    buffer[in] = next_produced;
    in = (in + 1) % BUFFER_SIZE;
}
Bounded-Buffer - Consumer
item next_consumed;
while (true) {
    while (in == out)
        ; /* do nothing */
    next_consumed = buffer[out];
    out = (out + 1) % BUFFER_SIZE;
    /* consume the item in next_consumed */
}
Shared Memory
Shared memory is memory that may be simultaneously accessed by multiple programs, with the intent of providing communication among them or avoiding redundant copies.
If you use the following functions in your program, you should link your program with -lrt.
If a process wants to access shared memory through the POSIX API, it should:
1. Create, or gain access to, a shared memory object.
int shm_open(const char *name, int oflag, mode_t mode);
2. Map a shared memory object into its address space.
void *mmap(void *addr, size_t length, int prot, int flags, int fd, off_t offset);
3. Do operations on shared memory (read, write, update).
4. Delete mappings of the shared memory object.
int munmap(void *addr, size_t length);
Finally, destroy the shared memory object when no references to it remain open.
int shm_unlink(const char *name);
On Linux, a shared memory object appears as a file under the /dev/shm folder.
Interprocess Communication
The mechanism for processes to communicate and synchronize their actions
Message system – processes communicate with each other without resorting to shared variables
IPC facility provides two operations:
send(message)
receive(message)
The message size is either fixed or variable. Implementing variable-size messages is more complex for the system but makes the programmer's life easier.
Direct Communication
Indirect Communication
POSIX Shared Memory
Process first creates shared memory segment
shm_fd = shm_open(name, O_CREAT | O_RDWR, 0666);
Also used to open an existing segment to share it
Set the size of the object
ftruncate(shm_fd, 4096);
Then, ptr = mmap(0, SIZE, PROT_READ | PROT_WRITE, MAP_SHARED, shm_fd, 0) establishes a memory-mapped file containing the shared-memory object.
It returns the pointer to the memory-mapped file used to access the shared memory object.
Now the process could write to the shared memory
sprintf(ptr, "Writing to shared memory");
IPC POSIX Producer
IPC POSIX Consumer
Standard File Descriptors
In Linux, there are 3 standard file descriptors. They are:
stdin: This is the standard input file descriptor. It is used to take input from the terminal by default. Functions such as scanf() and getc() use the stdin file descriptor to read user input. The stdin file descriptor is also represented by the number 0.
stdout: This is the standard output file descriptor. It is used to print something to the console/terminal by default. The widely used printf() function uses stdout to print your desired output to the console/terminal. The stdout file descriptor is also represented by the number 1.
stderr:
This is the standard error file descriptor. It behaves like the stdout file descriptor and also writes to the console/terminal by default.
The stderr file descriptor is used to print error messages on the console/terminal.
The only difference is that if you print error messages to stderr and normal output to stdout, you can later separate them.
For example, you can redirect the error messages to a file and normal outputs to the console or another file. The stderr file descriptor is also represented by the number 2.
Pipes
Acts as a conduit allowing two processes to communicate.
Ordinary pipes – cannot be accessed from outside the process that created it.
Typically, a parent process creates a pipe and uses it to communicate with a child process that it created.
Once the processes have finished communicating and have terminated, the pipe ceases to exist.
Named pipes can be accessed without a parent-child relationship.
A common visual analogy for a pipe is a real-world water pipe; water that is poured into one end of the pipe comes out the other end.
Ordinary Pipes
Ordinary Pipes allow communication in standard producer-consumer style
Producer writes to one end (the write-end of the pipe)
Consumer reads from the other end (the read-end of the pipe)
Ordinary pipes are therefore unidirectional. If two-way communication is required, two pipes must be used, with each pipe sending data in a different direction.
Require parent-child relationship between communicating processes
The dup() System Call
The dup system call is used to create a different file descriptor to an existing file object.
fd2=dup(fd1)
This dup(fd1) will return a new file descriptor that will point to the same file object that fd1 is pointing to.
The reference counter of the open file object that fd1 refers to is increased.
This will be useful to “save” the stdin, stdout, stderr, so the shell process can restore it after doing the redirection.
The dup2() function copies a file descriptor onto another file descriptor number.
int dup2(int fd1, int fd2);
On success, the dup2() function returns the new file descriptor. If an error occurs, dup2() returns -1.
The dup2() function is defined in the header file unistd.h.
After calling dup2(fd1, fd2), fd2 will refer to the same open file object that fd1 refers to. The open file object that fd2 referred to before is closed. The reference counter of the open file object that fd1 refers to is increased.