Fork, Exec and Process control

fork():

The fork() system call spawns a new child process that is a copy of the parent, except that it has a new system process ID. The process is copied in memory from the parent and a new process structure is assigned by the kernel. The return value of the function discriminates the two threads of execution: fork() returns zero in the child process, the child's process ID in the parent, and -1 in the parent if the fork failed.

The environment, resource limits, umask, controlling terminal, current working directory, root directory, signal masks and other process resources are also duplicated from the parent in the forked child process.

Example:

#include <iostream>
#include <string>

// Required by fork routine
#include <sys/types.h>
#include <unistd.h>

#include <stdlib.h>   // Declaration for exit()

using namespace std;

int globalVariable = 2;

int main()
{
   string sIdentifier;
   int    iStackVariable = 20;

   pid_t pID = fork();
   if (pID == 0)                // child
   {
      // Code only executed by child process

      sIdentifier = "Child Process: ";
      globalVariable++;
      iStackVariable++;
    }
    else if (pID < 0)            // failed to fork
    {
        cerr << "Failed to fork" << endl;
        exit(1);
        // Throw exception
    }
    else                                   // parent
    {
      // Code only executed by parent process

      sIdentifier = "Parent Process:";
    }

    // Code executed by both parent and child.
  
    cout << sIdentifier;
    cout << " Global variable: " << globalVariable;
    cout << " Stack variable: "  << iStackVariable << endl;
}

Compile: g++ -o ForkTest ForkTest.cpp
Run: ./ForkTest

Parent Process: Global variable: 2 Stack variable: 20
Child Process:  Global variable: 3 Stack variable: 21

[Potential Pitfall]: Some resources duplicated by a forked process, such as open file pointers, will cause intermixed output from both processes. Use the wait() function so that the processes do not access the file at the same time, or open unique file descriptors. Some streams, like stdout or stderr, will be shared unless synchronized using wait() or some other mechanism. The file close on exit is another gotcha: a terminating process will flush and close its copies of the files before exiting. Record locks set by the parent process with fcntl() are not inherited by the child process.
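
One concrete form of this pitfall is data still sitting in a stdio buffer when fork() is called: the buffer is duplicated along with the rest of memory, so both processes may eventually flush the same bytes. The following is a minimal sketch, not a definitive test. The helper name copiesWrittenAcrossFork() is hypothetical, and it assumes the stream created by fdopen() on a pipe is fully buffered, which is the usual default for non-terminal streams.

```cpp
#include <stdio.h>
#include <stdlib.h>
#include <string>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper: write one line through a buffered FILE*, fork, and
// return how many copies of that line arrive in the pipe (-1 on error).
int copiesWrittenAcrossFork(bool flushBeforeFork)
{
   int fds[2];
   if (pipe(fds) == -1) return -1;

   FILE *out = fdopen(fds[1], "w");    // buffered stream over the write end
   if (!out) { close(fds[0]); close(fds[1]); return -1; }

   fputs("line\n", out);               // stays in the user-space buffer
   if (flushBeforeFork) fflush(out);   // hand the data to the kernel first

   pid_t pID = fork();
   if (pID < 0) { fclose(out); close(fds[0]); return -1; }

   if (pID == 0)                       // child
   {
      close(fds[0]);
      exit(0);                         // exit() flushes the child's copy of the buffer
   }

   int status = 0;
   waitpid(pID, &status, 0);
   fclose(out);                        // flushes the parent's copy, closes the write end

   // Read everything that reached the pipe.
   std::string data;
   char buf[64];
   ssize_t n;
   while ((n = read(fds[0], buf, sizeof buf)) > 0) data.append(buf, n);
   close(fds[0]);

   return (int)(data.size() / 5);      // "line\n" is 5 bytes long
}
```

Calling fflush() on open output streams before fork(), or terminating the child with _exit() instead of exit(), are two common ways to avoid the duplicate.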

[Potential Pitfall]: Race conditions can be created due to the unpredictability of when the kernel scheduler runs portions or time slices of each process. One can use wait() to impose an order. The use of sleep() does not guarantee reliable ordering on a heavily loaded system, as the scheduler behavior is not predictable by the application.

Note on exit() vs _exit(): The C library function exit() calls the kernel system call _exit() internally. The kernel system call _exit() will cause the kernel to close descriptors, free memory, and perform the kernel terminating process clean-up. The C library function exit() will flush I/O buffers and perform additional clean-up (such as running functions registered with atexit()) before calling _exit() internally. Calling exit(status) has the same effect as returning "status" from main(). When a child process calls exit(status), the parent process can examine the child's termination status with wait() or waitpid(). Only the low-order 8 bits of the status are made available to the parent. In C++, reaching the end of main() without a return statement is equivalent to returning 0.

#include <stdlib.h>

void exit(int status);

#include <unistd.h>

void _exit(int status);
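
As a rough illustration of how the status passed to _exit() reaches the parent, the sketch below forks a child that terminates at once and returns the value the parent observes. The helper name childExitCode() is hypothetical and error handling is kept to a minimum.

```cpp
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper: fork a child that terminates with the given status
// and return the status seen by the parent (-1 on error).
int childExitCode(int status)
{
   pid_t pID = fork();
   if (pID < 0) return -1;             // failed to fork

   if (pID == 0)                       // child
   {
      _exit(status);                   // terminate without flushing stdio buffers
   }

   // parent
   int childStatus = 0;
   if (waitpid(pID, &childStatus, 0) != pID) return -1;
   if (!WIFEXITED(childStatus)) return -1;

   return WEXITSTATUS(childStatus);    // low-order 8 bits of the child's status
}
```

Because only the low-order 8 bits are reported, a child that calls _exit(259) is seen by the parent as having exited with status 3.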

Man Pages:

fork - create a child process

vfork():

The vfork() function is the same as fork() except that it does not make a copy of the address space. The memory is shared, reducing the overhead of spawning a new process with a unique copy of all the memory. This is typically used when the child exists only to call exec() or to terminate immediately. The vfork() function also suspends the parent process until the child calls an exec() function or terminates with _exit().

#include <iostream>
#include <string>

// Required by vfork routine
#include <sys/types.h>
#include <unistd.h>

using namespace std;


int globalVariable = 2;

int main()
{
   string sIdentifier;
   int    iStackVariable = 20;

   pid_t pID = vfork();
   if (pID == 0)                // child
   {
      // Code only executed by child process

      sIdentifier = "Child Process: ";
      globalVariable++;
      iStackVariable++;
      cout << sIdentifier;
      cout << " Global variable: " << globalVariable;
      cout << " Stack variable: "  << iStackVariable << endl;
      _exit(0);
    }
    else if (pID < 0)            // failed to fork
    {
        cerr << "Failed to fork" << endl;
        exit(1);
        // Throw exception
    }
    else                                   // parent
    {
      // Code only executed by parent process

      sIdentifier = "Parent Process:";
    }

    // executed only by parent

    cout << sIdentifier;
    cout << " Global variable: " << globalVariable;
    cout << " Stack variable: "  << iStackVariable << endl;
    exit(0);
}

Compile: g++ -o VForkTest VForkTest.cpp
Run: ./VForkTest

Child Process:  Global variable: 3 Stack variable: 21
Parent Process: Global variable: 3 Stack variable: 21

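Note: The child process executed first and updated the variables, which are shared between the processes and NOT unique; the parent process then executes using the values the child has updated. This example is for illustration only: in a vfork() child, modifying data or calling any function other than _exit() or an exec() function is undefined behavior according to the standards that specified vfork(), so do not rely on it in production code.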

[Potential Pitfall]: A deadlock condition may occur if the child process neither terminates nor calls exec(): the parent process will not proceed.
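
The conventional, safe use of vfork() is for a child that does nothing but exec. A minimal sketch of that pattern follows. The helper name runShellWithVfork() is hypothetical, and /bin/sh is assumed to exist as it does on typical Linux systems.

```cpp
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper: run "command" via /bin/sh -c in a vfork()ed child
// and return its exit status (-1 on error or abnormal termination).
int runShellWithVfork(const char *command)
{
   pid_t pID = vfork();
   if (pID < 0) return -1;             // failed to vfork

   if (pID == 0)                       // child: shares the parent's memory
   {
      // Modify nothing; replace the process image right away.
      execl("/bin/sh", "sh", "-c", command, (char *) 0);
      _exit(127);                      // only reached if execl() failed
   }

   // parent: resumes once the child has called exec or _exit
   int status = 0;
   if (waitpid(pID, &status, 0) != pID) return -1;

   return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

The child uses _exit() rather than exit() on failure so that it does not flush stdio buffers or run clean-up handlers that belong to the parent.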

Man Pages:

vfork - create a child process and block parent
_exit - terminate the current process

clone():

The function clone() creates a new child process which, depending on the flags passed, can share memory, file descriptors and signal handlers with the parent. It is used to implement threads and thus launches a function as a child. When used this way to create a thread, the child terminates when the process terminates.
See the YoLinux POSIX threads tutorial

Man Pages:

clone - create a child process

wait():

The parent process will often want to wait until all child processes have been completed. This can be implemented with the wait() function call.

wait(): Blocks the calling process until a child process terminates. If a child process has already terminated, the wait() call returns immediately. If the calling process has multiple child processes, the function returns when any one of them terminates.

waitpid(): Provides options to wait for a particular child process rather than the first one to terminate, and to poll without blocking.

Code snippet:

#include <sys/wait.h>

...

      pid_t pID = fork();   // In the parent: the child's process ID. The first argument to waitpid() may instead be:
                  // If set <-1, wait for any child process whose process group ID = abs(pID)
                  // If = -1, wait  for  any child process. Same as wait().
                  // If =  0, wait for any child process whose process group ID is same as calling process.
                  // If >  0, wait for the child whose process ID = pID.

... 
      int childExitStatus;

      // Poll: with WNOHANG, waitpid() returns 0 at once if no child has changed state.
      pid_t ws = waitpid( pID, &childExitStatus, WNOHANG);

      if( ws > 0 && WIFEXITED(childExitStatus) )
      {
         // Child process exited thus exec failed.
         // LOG failure of exec in child process.
         cout << "Result of waitpid: Child process exited thus exec failed." << endl;
      }

      ... 
      int childExitStatus;

      // Block until the child changes state.
      pid_t ws = waitpid( pID, &childExitStatus, 0);

      if( ws == -1 )
      {
         cerr << "waitpid() failed" << endl;
      }
      else if( WIFEXITED(childExitStatus) )
      {
         // The exit status is only meaningful when WIFEXITED() is true.
         cerr << "Child exited: Status= " 
              << WEXITSTATUS(childExitStatus)
              << endl;
      }
      else if( WIFSIGNALED(childExitStatus) )
      {
         cerr << "Child terminated due to a signal: " 
              << WTERMSIG(childExitStatus)
              << endl;
      }

Notes:

See man page for options: WNOHANG, WUNTRACED.
See man page for return macros: WIFEXITED(), WEXITSTATUS(), WIFSIGNALED(), WTERMSIG(), WIFSTOPPED(), WSTOPSIG().
See man page for errors: ECHILD, EINVAL, EINTR. (Also see sample of error processing below.)

Man Pages:

wait / waitpid - wait for process termination
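
The status macros mentioned in the notes above can be combined as in the following sketch, which distinguishes a normal exit from termination by a signal. The helper names describeStatus() and runAndDescribe() are hypothetical. The child either exits normally or sends itself SIGKILL so that both cases can be observed.

```cpp
#include <signal.h>
#include <sstream>
#include <string>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper: turn a waitpid() status into a short description.
std::string describeStatus(int status)
{
   std::ostringstream text;
   if (WIFEXITED(status))
      text << "exited " << WEXITSTATUS(status);   // normal termination
   else if (WIFSIGNALED(status))
      text << "signal " << WTERMSIG(status);      // terminated by a signal
   else
      text << "other";                            // e.g. stopped or continued
   return text.str();
}

// Hypothetical helper: fork a child that either exits with exitCode or
// kills itself with SIGKILL, then describe how it ended.
std::string runAndDescribe(bool killSelf, int exitCode)
{
   pid_t pID = fork();
   if (pID < 0) return "fork failed";

   if (pID == 0)                                  // child
   {
      if (killSelf) kill(getpid(), SIGKILL);      // cannot be caught or ignored
      _exit(exitCode);
   }

   int status = 0;
   if (waitpid(pID, &status, 0) != pID) return "waitpid failed";
   return describeStatus(status);
}
```

Note that WEXITSTATUS() is only consulted after WIFEXITED() is true, and WTERMSIG() only after WIFSIGNALED() is true.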

Set system group ID and process ID:

Placing the child in its own process group avoids leaving stray processes behind when the parent terminates. If the parent dies first, its children are orphaned and adopted by init (parent process ID = 1). Instead, create a new process group for the child. Later, the process group is terminated to stop all spawned processes. Thus all subsequent processes should be members of this group if they are to be terminated by the process group ID. A process group leader has the same process ID and process group ID. If not changed, the process group is that of the parent. To create the new group, set the process group ID to that of the child process.

#include <sys/types.h>
#include <unistd.h>
#include <errno.h>

      errno = 0;
      int pgidReturn = 0;   // Declared outside the #ifdef so the check below always compiles
#ifdef __gnu_linux__
      pgidReturn = setpgid(child_pID, child_pID);   // Returns 0 on success, -1 on error
#endif

      ...
      ...

      if( pgidReturn < 0)
      {
        cout << "Failed to set process group ID" << endl;
  
        if(errno == EACCES)
           printf("setpgid: Attempted to change child process GID. Child process already performed an execve\n");
        else if(errno == EINVAL)
           printf("setpgid: pgid is less than 0\n");
        else if(errno == EPERM)
           printf("setpgid: Attempt to move process into process group failed. Can not move process to group in different session.\n");
        else if(errno == ESRCH)
           printf("setpgid: pid does not match any process\n");

        _exit(0);      // If setpgid fails then exit this process.
      }

Use the setgid() call to set the (user) group ID of the current process, which is distinct from the process group ID. Changing it to an arbitrary group requires root access.

The macro testing for __gnu_linux__ is for cross platform support, as many other OSs use a different system call.

Man Pages:

setpgid/getpgid setpgrp/getpgrp - set process group
setsid - creates a session and sets the process group ID
getuid/geteuid - get user identity
setgid - set group identity
getgid/getegid - get group (real/effective) identity
setreuid/setregid - set real user or group identity
errno - number of last error
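
The following sketch shows one way a forked child can be made the leader of its own process group. The helper name childBecomesGroupLeader() is hypothetical. Both the parent and the child call setpgid(), a common idiom that makes the result independent of which process the scheduler runs first.

```cpp
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper: returns true if a forked child ends up as the
// leader of its own process group (process ID == process group ID).
bool childBecomesGroupLeader()
{
   pid_t pID = fork();
   if (pID < 0) return false;          // failed to fork

   if (pID == 0)                       // child
   {
      // pid 0 means "this process", pgid 0 means "use my own process ID".
      if (setpgid(0, 0) == -1) _exit(2);
      _exit(getpgrp() == getpid() ? 0 : 1);
   }

   // parent: make the same request. A failure is ignored here because
   // the child may already have set the group or even have exited.
   setpgid(pID, pID);

   int status = 0;
   if (waitpid(pID, &status, 0) != pID) return false;
   return WIFEXITED(status) && WEXITSTATUS(status) == 0;
}
```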

Kill all processes in a process group:

This is the real reason to set up a process group. One may kill all the processes in the process group without having to keep track of how many processes have been forked and all of their process id's.

See /usr/include/bits/signum.h for list of signals.

#include <errno.h>
#include <signal.h>
   ...
   ...

   errno = 0;
   int  killReturn = killpg( pID, SIGKILL);  // Kill child process group

   if(killReturn == -1)
   {
       if( errno == ESRCH)      // process group does not exist
       {
          cout << "Group does not exist!" << endl;
       }
       else if( errno == EPERM) // No permission to send signal
       {
          cout << "No permission to send signal!" << endl;
       }
   }
   else
   {
       cout << "Signal sent. All Ok!" << endl;
   }

   ...

Man Pages:

killpg - send signal to a process group
kill - send signal to a process
signal (2) - ANSI C signal handling
signal (7) - List of available signals
sigaction - POSIX signal handling functions.
pause (2) - wait for signal
raise (3) - send a signal to the current process
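
To illustrate terminating several processes with one call, the sketch below forks two children, places both in a process group led by the first, and then signals the group. The helper name killGroupOfTwo() is hypothetical, and the children simply block in pause() until the signal arrives.

```cpp
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper: returns the number of children that were terminated
// by SIGKILL (2 when everything works).
int killGroupOfTwo()
{
   pid_t children[2] = { -1, -1 };
   pid_t pgid = 0;

   for (int i = 0; i < 2; i++)
   {
      pid_t pID = fork();
      if (pID == 0)                     // child
      {
         for (;;) pause();              // block until a signal arrives
      }
      if (pID < 0) break;               // failed to fork; clean up below

      children[i] = pID;
      if (i == 0) pgid = pID;           // first child becomes the group leader
      if (setpgid(pID, pgid) == -1)
         kill(pID, SIGKILL);            // could not join the group: stop it directly
   }

   // One call signals every member of the group. Guard against pgid 0,
   // which would address the caller's own process group.
   if (pgid > 0) killpg(pgid, SIGKILL);

   int killedBySignal = 0;
   for (int i = 0; i < 2; i++)
   {
      int status = 0;
      if (children[i] > 0 && waitpid(children[i], &status, 0) == children[i]
          && WIFSIGNALED(status) && WTERMSIG(status) == SIGKILL)
         killedBySignal++;
   }
   return killedBySignal;
}
```

The parent still reaps each child with waitpid() after the signal is sent, so that no terminated child is left behind unreaped.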

system() and popen():

The system() call will execute an OS shell command as described by a character command string. This function is implemented using fork(), exec() and waitpid(). The command string is executed by calling /bin/sh -c command-string. During execution of the command, SIGCHLD will be blocked, and SIGINT and SIGQUIT will be ignored. The call "blocks" and waits for the task to be performed before continuing.

#include <stdio.h>
#include <stdlib.h>
int main()
{
    system("ls -l");
    printf("Command done!\n");
    return 0;
}

The statement "Command done!" will not print until the "ls -l" command has completed.
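
The value returned by system() is a wait status of the same form used by waitpid(), so the same macros apply. A minimal sketch, using the hypothetical helper name shellExitCode():

```cpp
#include <stdlib.h>
#include <sys/wait.h>

// Hypothetical helper: run a shell command with system() and return the
// shell's exit status, or -1 if it could not be run or did not exit normally.
int shellExitCode(const char *command)
{
   int status = system(command);
   if (status == -1) return -1;        // child could not be created or waited for

   if (WIFEXITED(status))
      return WEXITSTATUS(status);      // exit status of "/bin/sh -c command"

   return -1;                          // the shell was terminated by a signal
}
```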

The popen() call opens a process by creating a pipe, forking, and invoking the shell (Bourne shell on Linux). The advantage of using popen() is that it allows one to read the output of the command issued, or to write to its input.

This example opens a pipe which executes the shell command "ls -l". The results are read and printed out.

#include <stdio.h>
#include <stdlib.h>

int main()
{
   FILE *fpipe;
   char *command = (char *)"ls -l";
   char line[256];

   if ( !(fpipe = (FILE*)popen(command,"r")) )
   {  // If fpipe is NULL
      perror("Problems with pipe");
      exit(1);
   }

   while ( fgets( line, sizeof line, fpipe))
   {
     printf("%s", line);
   }
   pclose(fpipe);
}

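The second argument to popen:

r: Read from the pipe: the caller reads the command's stdout (command results)
w: Write to the pipe: the caller writes to the command's stdin
For stderr: redirect it in the command string, e.g. popen("ls -l 2>&1","r");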
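
A small sketch of the read mode follows, collecting everything a command prints into a string. The helper name captureCommand() is hypothetical. Standard error is captured only if the command string redirects it, as in the second usage line in the note after the code.

```cpp
#include <stdio.h>
#include <string>

// Hypothetical helper: run a shell command through popen() in read mode
// and return everything it wrote to its standard output.
std::string captureCommand(const std::string &command)
{
   std::string output;

   FILE *fpipe = popen(command.c_str(), "r");
   if (!fpipe) return output;          // popen failed: return an empty string

   char line[256];
   while (fgets(line, sizeof line, fpipe))
      output += line;                  // fgets keeps the trailing newline

   pclose(fpipe);                      // waits for the command to finish
   return output;
}
```

For example, captureCommand("echo hello") yields "hello" followed by a newline, and captureCommand("(echo oops 1>&2) 2>&1") captures text that the command wrote to stderr.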

Man Pages:

system - execute a shell command
popen - process I/O

exec() functions and execve():

The exec() family of functions will initiate a program from within a program, replacing the image of the calling process. They are various front-end functions to execve().

The functions do not return on success. On failure they return -1 and set errno.

execl() and execlp():

The function call "execl()" initiates a new program in the same environment in which it is operating. An executable (with fully qualified path, e.g. /bin/ls) and arguments are passed to the function. Note that "arg0" is the command/file name to execute.

int execl(const char *path, const char *arg0, const char *arg1, const char *arg2, ... const char *argn, (char *) 0);

#include <unistd.h>
int main()
{
   execl("/bin/ls", "/bin/ls", "-r", "-t", "-l", (char *) 0);
   return 1;   // Only reached if execl() failed
}

Where all function arguments are null terminated strings. The list of arguments is terminated by a null pointer, (char *) 0.

The routine execlp() serves the same purpose except that it uses the environment variable PATH to locate the executable. Thus a fully qualified path name does not have to be used. The first argument to the function could instead be "ls". The function execlp() can also take the fully qualified name: if the file name contains a slash, PATH is not searched and the name is used as given.
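
The PATH lookup can be tried out with the sketch below, which forks, runs a program by bare name with execlp(), and reports its exit status. The helper name runViaPath() is hypothetical, and 127 is used here, by convention only, as the status reported when the exec fails.

```cpp
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper: run a program found via PATH and return its exit
// status, 127 if it could not be executed, or -1 on other errors.
int runViaPath(const char *file)
{
   pid_t pID = fork();
   if (pID < 0) return -1;             // failed to fork

   if (pID == 0)                       // child
   {
      execlp(file, file, (char *) 0);  // searches PATH when file has no slash
      _exit(127);                      // only reached if execlp() failed
   }

   int status = 0;
   if (waitpid(pID, &status, 0) != pID) return -1;
   return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```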

Man Pages:

execl / execlp - execute

execv() and execvp():

This is the same as execl() except that the arguments are passed as a null terminated array of pointers to char. The first element "argv[0]" is the command name.

int execv(const char *path, char *const argv[]);

#include <unistd.h>
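int main()
{
   // Casts avoid the string literal to char* conversion that C++ rejects or warns about.
   char *args[] = {(char *)"/bin/ls", (char *)"-r", (char *)"-t", (char *)"-l", (char *) 0 };

   execv("/bin/ls", args);
   return 1;   // Only reached if execv() failed
}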

The routine execvp() serves the same purpose except that it uses the environment variable PATH to locate the executable. Thus a fully qualified path name does not have to be used. The first argument to the function could instead be "ls". The function execvp() can also take the fully qualified name: if the file name contains a slash, PATH is not searched and the name is used as given.
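
When the argument list is only known at run time, the pointer array can be built from a container. The sketch below is one possible approach, using the hypothetical helper name runArgv(). The array is assembled before fork() so the child has nothing to do but call execvp().

```cpp
#include <string>
#include <vector>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper: run args[0] with the given arguments via execvp()
// and return its exit status (127 if exec failed, -1 on other errors).
int runArgv(const std::vector<std::string> &args)
{
   if (args.empty()) return -1;

   // Build the null terminated array of pointers expected by execvp().
   std::vector<char *> argv;
   for (size_t i = 0; i < args.size(); i++)
      argv.push_back(const_cast<char *>(args[i].c_str()));
   argv.push_back((char *) 0);

   pid_t pID = fork();
   if (pID < 0) return -1;             // failed to fork

   if (pID == 0)                       // child
   {
      execvp(argv[0], &argv[0]);
      _exit(127);                      // only reached if execvp() failed
   }

   int status = 0;
   if (waitpid(pID, &status, 0) != pID) return -1;
   return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

The pointers refer to the strings owned by args, so the vector must stay alive until after the fork.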

Man Pages:

execv / execvp - execute

execve():

The function call "execve()" executes a process in an environment which it assigns.

Set the environment variables:

Assignment:


   char *env[] = { (char *)"USER=user1", (char *)"PATH=/usr/bin:/bin:/opt/bin", (char *) 0 };

Read from file:

#include <iostream>
#include <fstream>
#include <string>
#include <vector>

// Required by fork and execve routines
#include <sys/types.h>
#include <unistd.h>
#include <string.h>
#include <errno.h>

using namespace std;

//   Class definition:

class CReadEnvironmentVariablesFile
{
 public:
    // Initialize m_envp too, so the destructor is safe if ReadFile() never allocated it.
    CReadEnvironmentVariablesFile() { m_NumberOfEnvironmentVariables=0; m_envp=0; };
   ~CReadEnvironmentVariablesFile();
    char **ReadFile(string& envFile);
 private:
    int m_NumberOfEnvironmentVariables;
    char **m_envp;
};

//   Read environment variables:

char **
CReadEnvironmentVariablesFile::ReadFile(string& envFile)
{
   int ii;
   string tmpStr;
   vector<string> vEnvironmentVariables;

   if( envFile.empty() ) return 0;

   ifstream inputFile( envFile.c_str(), ios::in);
   if( !inputFile )
   {
       cerr << "Could not open config file: " << envFile << endl;
       return 0;
   }

   // Loop on getline() itself so a failed read is never processed.
   while( getline(inputFile, tmpStr) )
   {
       if( !tmpStr.empty() ) vEnvironmentVariables.push_back(tmpStr);
   }

   inputFile.close();

   m_NumberOfEnvironmentVariables = vEnvironmentVariables.size();

   // ---------------------------------------
   // Generate envp environment variable list
   // ---------------------------------------

   // Allocate pointers to strings.
   // +1 for array terminating NULL string

   m_envp = new char * [m_NumberOfEnvironmentVariables + 1];

   // Allocate arrays of character strings.
   
   for(ii=0; ii < m_NumberOfEnvironmentVariables; ii++)
   {
      // Character string terminated with a NULL character.
      m_envp[ii] = new char [vEnvironmentVariables[ii].size()+1];
      strcpy( m_envp[ii], vEnvironmentVariables[ii].c_str());
   }

   // must terminate array with null string
   m_envp[ii] = (char*) 0;

   return m_envp;
}

//   Free memory:

CReadEnvironmentVariablesFile::~CReadEnvironmentVariablesFile()
{
   int ii;

   // Free array's of characters
   for(ii=0; ii < m_NumberOfEnvironmentVariables; ii++)
   {
      delete [] m_envp[ii];
   }

   // Free array of pointers.
   delete [] m_envp;
}

Call execve:

#include <errno.h>
std::string getErrMsg(int errnum);

int main()
{
   string envFile("environment_variables.conf");
   CReadEnvironmentVariablesFile readEnvFile;
   char **Env_envp = readEnvFile.ReadFile(envFile);

   // Command to execute
   char *Env_argv[] = { (char *)"/bin/ls", (char *)"-l", (char *)"-a", (char *) 0 };

   pid_t pID = fork();
   if (pID == 0)                // child
   { 
      // This version of exec accepts environment variables.
      // Function call does not return on success.

      errno = 0;
      int execReturn = execve (Env_argv[0], Env_argv, Env_envp);
      if(execReturn == -1)
      {
         cout << "Failure! execve error code=" << errno << endl;
         cout << getErrMsg(errno) << endl;
      }

      _exit(1); // If exec fails then exit forked process with a failure status.
   }
   else if (pID < 0)             // failed to fork
   {
      cerr << "Failed to fork" << endl;
   }
   else                             // parent
   {
      cout << "Parent Process" << endl;
   }
}

Handle errors:

std::string getErrMsg(int errnum)
{

    switch ( errnum ) {

#ifdef EACCES
        case EACCES :
        {
            return "EACCES Permission denied";
        }
#endif

#ifdef EPERM
        case EPERM :
        {
            return "EPERM Not super-user";
        }
#endif

#ifdef E2BIG
        case E2BIG :
        {
            return "E2BIG Arg list too long";
        }
#endif

#ifdef ENOEXEC
        case ENOEXEC :
        {
            return "ENOEXEC Exec format error";
        }
#endif

#ifdef EFAULT
        case EFAULT :
        {
            return "EFAULT Bad address";
        }
#endif

#ifdef ENAMETOOLONG
        case ENAMETOOLONG :
        {
            return "ENAMETOOLONG Path name is too long";
        }
#endif

#ifdef ENOENT
        case ENOENT :
        {
            return "ENOENT No such file or directory";
        }
#endif

#ifdef ENOMEM
        case ENOMEM :
        {
            return "ENOMEM Not enough core";
        }
#endif

#ifdef ENOTDIR
        case ENOTDIR :
        {
            return "ENOTDIR Not a directory";
        }
#endif

#ifdef ELOOP
        case ELOOP :
        {
            return "ELOOP Too many symbolic links";
        }
#endif

#ifdef ETXTBSY
        case ETXTBSY :
        {
            return "ETXTBSY Text file busy";
        }
#endif

#ifdef EIO
        case EIO :
        {
            return "EIO I/O error";
        }
#endif

#ifdef ENFILE
        case ENFILE :
        {
            return "ENFILE Too many open files in system";
        }
#endif

#ifdef EINVAL
        case EINVAL :
        {
            return "EINVAL Invalid argument";
        }
#endif

#ifdef EISDIR
        case EISDIR :
        {
            return "EISDIR Is a directory";
        }
#endif

#ifdef ELIBBAD
        case ELIBBAD :
        {
            return "ELIBBAD Accessing a corrupted shared lib";
        }
#endif
        
        default :
        {
            // Always return a value; strerror(0) yields a "no error" message.
            return std::string(strerror(errnum));
        }
     }
}

Man Pages:

strerror / strerror_r - return string describing error code
errno - number of last error
perror - print a system error message

Data File: environment_variables.conf

PATH=/usr/bin:/bin
MANPATH=/opt/man
LANG=C
DISPLAY=:0.0

Man Pages:

execve - execute with given environment
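
To confirm that the child really sees only the environment passed to execve(), the sketch below runs a one-line shell test against a hand-built environment array. The helper name runShWithEnv() and the variable GREETING are hypothetical, and /bin/sh is assumed to exist.

```cpp
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper: run "/bin/sh -c script" with exactly the environment
// given in envp and return the shell's exit status (-1 on error).
int runShWithEnv(const char *script, char *const envp[])
{
   char *argv[] = { (char *)"/bin/sh", (char *)"-c", (char *)script, (char *) 0 };

   pid_t pID = fork();
   if (pID < 0) return -1;             // failed to fork

   if (pID == 0)                       // child
   {
      execve(argv[0], argv, envp);     // does not return on success
      _exit(127);                      // only reached if execve() failed
   }

   int status = 0;
   if (waitpid(pID, &status, 0) != pID) return -1;
   return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

With an environment containing only GREETING=hello, the script test "$GREETING" = hello exits with status 0, while a comparison against any other value exits with status 1.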

Note: Don't mix malloc() and new. Choose one form of memory allocation and stick with it.

Malloc:

..
...

   int ii;

   // +1 for the array terminating null pointer. Each element is a char *.
   m_envp = (char **) calloc((m_NumberOfEnvironmentVariables+1), sizeof(char *));

...
   // Allocate arrays of character strings.
   for(ii=0; ii < m_NumberOfEnvironmentVariables; ii++)
   {
      // NULL terminated
      m_envp[ii] = (char *) malloc(vEnvironmentVariables[ii].size()+1);
      strcpy( m_envp[ii], vEnvironmentVariables[ii].c_str());
   }

   // must terminate with null
   m_envp[ii] = (char*) 0;

   return m_envp;
}
...

Free:

   ...

   // Free array's of characters

   for(ii=0; ii < m_NumberOfEnvironmentVariables; ii++)
   {
      free(m_envp[ii]);
   }

   // Free array of pointers.

   free(m_envp);

   ...

Man Pages:

malloc - Dynamically allocate memory
free - Free allocated memory
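
The fragments above can be assembled into a self-contained pair of functions. This is a sketch only. The names buildEnvp() and freeEnvp() are hypothetical and are not part of the class shown earlier.

```cpp
#include <stdlib.h>
#include <string.h>
#include <string>
#include <vector>

// Hypothetical helper: free an array created by buildEnvp().
void freeEnvp(char **envp)
{
   if (!envp) return;
   for (char **p = envp; *p; p++)
      free(*p);                        // free each character string
   free(envp);                         // free the array of pointers
}

// Hypothetical helper: copy the strings into a malloc()ed, null terminated
// array suitable for execve(). Returns 0 if memory could not be allocated.
char **buildEnvp(const std::vector<std::string> &vars)
{
   // +1 for the array terminating null pointer
   char **envp = (char **) calloc(vars.size() + 1, sizeof(char *));
   if (!envp) return 0;

   size_t ii;
   for (ii = 0; ii < vars.size(); ii++)
   {
      envp[ii] = (char *) malloc(vars[ii].size() + 1);   // +1 for the null character
      if (!envp[ii]) break;
      strcpy(envp[ii], vars[ii].c_str());
   }

   envp[ii] = (char *) 0;              // terminate, also after a partial failure

   if (ii < vars.size())               // an allocation failed: release what exists
   {
      freeEnvp(envp);
      return 0;
   }
   return envp;
}
```

Everything allocated here comes from malloc() or calloc(), so it is released with free() and never with delete.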

Books:

	"UNIX Network Programming, Volume 1: Sockets Networking API" Third Edition by W. Richard Stevens, Bill Fenner, Andrew M. Rudoff, Richard W. Stevens ISBN # 0131411551, Addison-Wesley Pub Co; 3 edition (October 22, 2003) This book covers POSIX, IPv6, network APIs, sockets (elementary, advanced, routed, and raw), multicast, UDP, TCP, Threads, Streams, ioctl. In depth coverage of topics.
	"UNIX Network Programming, Volume 1: Networking APIs - Sockets and XTI" Second Edition by W. Richard Stevens ISBN # 013490012X, Prentice Hall PTR This book covers network APIs, sockets + XTI, multicast, UDP, TCP, ICMP, raw sockets, SNMP, MBONE. In depth coverage of topics.
	"UNIX Network Programming Volume 2: Interprocess Communications" by W. Richard Stevens ISBN # 0130810819, Prentice Hall PTR This book covers semaphores, threads, record locking, memory mapped I/O, message queues, RPC's, etc.
	"Advanced Linux Programming" by Mark Mitchell, Jeffrey Oldham, Alex Samuel, Jeffery Oldham ISBN # 0735710430, New Riders Good book for programmers who already know how to program and just need to know the Linux specifics. Covers a variety of Linux tools, libraries, API's and techniques. If you don't know how to program, start with a book on C.
	"Advanced UNIX Programming" Second Edition by Marc J. Rochkind ISBN # 0131411543, Addison-Wesley Professional Computing Series
	"Advanced Programming in the UNIX Environment" First Edition by W. Richard Stevens ISBN # 0201563177, Addison-Wesley Professional Computing Series It is the C programmers guide to programming on the UNIX platform. This book is a must for any serious UNIX/Linux programmer. It covers all of the essential UNIX/Linux API's and techniques. This book starts where the basic C programming book leaves off. Great example code. This book travels with me to every job I go to.
	"Advanced Unix Programming" by Warren W. Gay ISBN # 067231990X, Sams White Book Series This book covers all topics in general: files, directories, date/time, libraries, pipes, IPC, semaphores, shared memory, forked processes and I/O scheduling. The coverage is not as in depth as the previous two books (Stevens Vol 1 and 2)
	"Linux Programming Bible" by John Goerzen ISBN # 0764546570, Hungry Minds, Inc This covers the next step after "C" programming 101.

YoLinux Tutorial: Fork, Exec and Process control
