Why "a fork is often followed by an exec"? Can't you just create a new process in UNIX?

  • I am citing Rago from his APUE book. Commented Aug 17, 2022 at 15:54
  • 18
    Because exec's don't like to eat with their hands?
    – Davidw
    Commented Aug 18, 2022 at 0:03
  • 1
    Actually I believe that in Linux the real core function to spawn new processes is clone, which is used both to create new processes and new threads. You can specify various flags to clone to tell it to copy/share certain kinds of memory/file descriptors etc, so you can end up either creating a whole new process same as a plain fork, or a new thread that shares the same memory space.
    – GACy20
    Commented Aug 18, 2022 at 9:42
  • 6
    Other time-sharing OSes contemporary to Unix development (e.g., VMS, TSO) had spawn() system calls that needed many options and thus were extremely complex. Splitting spawn into fork() and exec() replaced that complexity with two relatively simple calls with some program code between them.
    – mpez0
    Commented Aug 18, 2022 at 13:00

5 Answers


A slightly different perspective from most of the "fork-shaming" existing answers... ;)

Originally it was probably, as @davidbak mentioned, just so temptingly easy to do it this way. But having worked with fork/exec a lot (and also often with only fork, for multiprocessing), there definitely are reasons why this way of working is still alive and kicking, and not relegated to the fog of history:

  • It still is extremely simple from the view of a programmer in literally any programming language at all. It does not matter where I am coding - any language can trust the extremely simple implications of the fork semantics and offer it as part of the language. Hence, every language has a relatively trivial (compared to in-process multithreading) method of offering at least multi-processing to its users. (N.B.: one exception to this is if you are using multithreading in your program - after a fork, only one thread is running; this can lead to obvious issues in all but the most trivial multithreading applications.)
  • As a user (programmer), I can write my multiprocessing in a few lines of code, with no worry about mutexes, semaphores, illegally overwriting state of any of my program variables, and so on and so forth. At the same time, the "initial communication" between parent and child is also trivially handled for me - the child starts with its own copy of every variable and all the RAM the parent had, and can continue working with it. In practice this means, if I, say, have a need to perform some short I/O or networking process in parallel to my main program, I can do that with a few lines of code; all is in one place, easily visible. I can collect the child afterwards and be on my merry way. There are no "worker threads", I do not need to take care to use only thread-safe methods or data structures.
  • Again, since the memory content starts identical but is in effect separate, there is zero risk of overwriting anything between parent/child processes. Yes, I do have to find other ways to do IPC between parent and child, but those methods are not that hard either; often, languages offer a standard function like "open3" or something like that which automatically provides bi-directional pipe-based file handles for communication to avoid deadlocks and such.
  • Specifically, when switching between programming languages, once one has understood the fork semantics, one never needs to learn anything more about the new environment - it's always as simple as in any other language.
  • It is nice to have exec anyways. It allows us to replace the current process image (i.e. the executable that's being executed) with something different. This makes it clean to, say, have some script or program that prepares some kind of environment and then execs something else, while disappearing from the scene, itself. Not only does it free resources (RAM, but also space in the process table, and so on), but makes it very clear to anybody involved or looking at it that the erstwhile parent is not going to play any role whatsoever anymore in the future. You often find this in well-written bash scripts, which free up the resources of the bash interpreter when starting their "payload".
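The fork-only pattern from the list above (do some work in a child, collect it afterwards) might look roughly like this in Python. This is a minimal sketch, not anybody's canonical implementation: `run_in_child`, `bump_and_report` and `counter` are hypothetical names made up for the illustration, and the result is assumed to be small enough to pass through a pipe as text.

```python
import os

counter = [0]  # ordinary parent state; the child gets its own copy at fork time


def bump_and_report(n):
    """Hypothetical work function: changes state, but only in the child's copy."""
    counter[0] += n
    return counter[0]


def run_in_child(func, *args):
    """Run func(*args) in a forked child and return str(result) to the parent."""
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: a copy of the parent, continuing from the same point.
        status = 1
        try:
            os.close(read_fd)
            with os.fdopen(write_fd, "wb") as out:
                out.write(str(func(*args)).encode())
            status = 0
        finally:
            os._exit(status)  # never fall through into the parent's code
    # Parent: read until the child closes its end of the pipe, then collect it.
    os.close(write_fd)
    with os.fdopen(read_fd, "rb") as inp:
        data = inp.read()
    _, wait_status = os.waitpid(pid, 0)
    if not (os.WIFEXITED(wait_status) and os.WEXITSTATUS(wait_status) == 0):
        raise RuntimeError("child failed")
    return data.decode()


if __name__ == "__main__":
    print(run_in_child(bump_and_report, 5))  # value computed in the child
    print(counter[0])                        # parent's copy is untouched
```

No locks are involved: the child mutates `counter` freely because it is working on its own copy, and the pipe is the only channel back to the parent.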

In addition:

  • It perfectly fits the Unix philosophy of having many small tools which can interact with each other, instead of fat black boxes that are either very limited, or need a big set of parameters or an API to really use.
  • As shown above, it is powerful in some scenarios where only having a single function would be limiting; but it is also still easy enough to have fork+exec following each other. It's not like you have to do a lot of stuff in between (or any at all, really) unless you need it.
  • As per the man pages, in some modern Unixes (namely, Linux), fork itself is only a wrapper around the more modern and more powerful clone call, which generalizes fork (flags select what parent and child share, so it can also create threads) but does not load a new executable; exec is still a separate step. Note that here we see complexity raise its ugly head already; Linux also has a clone3 function which supersedes clone and makes the interface a little easier or more convenient (using structs instead of so many flags).
  • 1
    “Any language can trust the extremely simple implications of the fork semantics and offer it as part of the language” - Any language…? Then where is it in Java? C#? Erlang? Ada? BASIC? Cobol? Haskell?
    – Dai
    Commented Aug 20, 2022 at 13:03
  • @Dai All of them can call it through their native call systems, and it will break none of their runtimes if you were to do so. Commented Aug 20, 2022 at 16:02
  • 1
    @user1937198 a very bold statement indeed
    – OrangeDog
    Commented Aug 20, 2022 at 22:16
  • @Dai, yes, those languages could offer fork-based multiprocessing, if they wanted. Emphasis on "can". I'm sure for all of those the reason they don't isn't technical problems but a design decision (possibly to stay platform-independent).
    – AnoE
    Commented Aug 22, 2022 at 7:59
  • @user1937198 this isn't true. It will break at least .NET. Probably Java too. Commented Jun 2, 2023 at 21:31

fork() creates a new process, which is a copy of the parent process. So if you did only fork(), you would have two identical processes running. Therefore, in order to make the forked process run a different program, you need to call exec(), which replaces the program running in the current process with the specified executable file.

The Linux kernel is just organized that way. You don't have a single system call that creates a new process and loads a new executable at the same time. You have to do it in two steps - first create a new process, then load a new executable into this new process. (Although you may have a library function in your programming language that combines these two - for example, C libraries offer posix_spawn().)

Sometimes exec() is not needed, if just creating another copy of the current process is all that you want. Many daemons for example do this.
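The two-step sequence described above can be sketched in Python, whose `os` module wraps the underlying calls fairly directly. The helper name `spawn_and_wait` is made up for this illustration, and the exit code 127 for "could not exec" is only borrowed from shell convention.

```python
import os


def spawn_and_wait(argv):
    """Step 1: fork a copy of this process. Step 2: exec argv in the copy."""
    pid = os.fork()
    if pid == 0:
        # Child: still running this Python program until exec succeeds.
        try:
            os.execvp(argv[0], argv)  # on success this call never returns
        except OSError:
            os._exit(127)             # exec failed, e.g. command not found
    # Parent: wait for the child and translate its status into an exit code.
    _, status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(status) if os.WIFEXITED(status) else -1


if __name__ == "__main__":
    print(spawn_and_wait(["sh", "-c", "exit 3"]))
```

In everyday code a library wrapper (Python's `subprocess` module, for instance) would normally hide these steps; spelling them out just makes the fork/exec split visible.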

  • 6
    The answer is not really Linux-specific. I think BSDs do the same.
    – fraxinus
    Commented Aug 18, 2022 at 8:20
  • 19
    @fraxinus I'm pretty sure this comes from original UNIX systems, so this has been the same for like 50+ years now.
    – GACy20
    Commented Aug 18, 2022 at 9:39
  • 4
    "Linux kernel is just organized that way": that's absolutely not true. It's the Unix way. Linux has many other ways to spawn new processes
    – phuclv
    Commented Aug 18, 2022 at 12:07
  • 1
    A single multiprocessing programme could fork() several times, then distribute calculations based on the process ID?
    – gerrit
    Commented Aug 18, 2022 at 16:04
  • 2
    @gerrit Probably yes. I'm not too much into numerical programming; but for example network daemons like web servers or SSH servers do it all the time. They have a single master process that forks when a network request is coming, and the forked process serves that request, then if there are no more requests exits after some time.
    – raj
    Commented Aug 18, 2022 at 16:22

This is because of historical reasons: at the beginning of time there was only fork and exec, because they were easy to implement. According to DMR, fork took only 27 lines of PDP-7 assembly code! See e.g. A fork() in the road (Baumann, Appavoo, Krieger, Roscoe, 2019), a secondary source, though it references a primary source, The Evolution of the Unix Time-sharing System (Ritchie, 1979). Anyway, true ab initio direct process creation came much, much later. (And is not in POSIX, possibly?)

The fact that a true direct process creation API came much later affects Unix programming to this day: hundreds of books, manuals, tutorials, slide decks, and courses have explained fork and exec, and they have been taught to students and programmers for decades as the way to do process creation and control in Unix. That extensive heritage persists in the way code is written today.

Oh, here is The Evolution of the Unix Time-sharing System (Ritchie, 1979). Scroll down to page 6 to see: "Process control in its modern form was designed and implemented within a couple of days. ... In fact, the PDP-7's fork call required precisely 27 lines of assembly code."

  • 1
    Nice bit of history - thanks for doing the research. Scary to think how much of this stuff has happened during my lifetime!
    – kbro
    Commented Aug 18, 2022 at 3:46
  • 2
    Does anyone know what the ab initio direct process creation API is called in Unix? I'd like to read about it, if it exists. Commented Aug 18, 2022 at 5:16
  • 2
    @JeremyFriesner isn't it called "boot"? The kernel starts the first process (e.g. /bin/init). Init starts everything else by fork and exec.
    – fraxinus
    Commented Aug 18, 2022 at 8:24
  • 9
    @JeremyFriesner posix_spawn. On many POSIX implementations, it's a library function on top of fork and exec, since those have to exist anyway and can't be implemented on top of posix_spawn. Commented Aug 18, 2022 at 8:34
  • 1
    It should be noted that posix_spawn() is not intended or positioned as somehow more fundamental than fork(), as can be seen by the fact that the former can be implemented on top of the latter, but not vice versa. The primary goal of posix_spawn() is to provide for process creation on systems with capabilities too limited to support fork(), such as those without MMUs. Commented Aug 19, 2022 at 15:55

Because exec doesn't create a process, and Linux doesn't have a single syscall that both creates a process and loads an executable. Such a call only covers the trivial case of starting a new executable with no preexisting resources. If you want to do anything more than the trivial case, the complexity rapidly ramps up, and it becomes easier to have separate 'create process' and 'start executable' steps with the ability to manipulate the process in between. See https://lwn.net/Articles/360556/ for a discussion on this.

Unix, going back to the earliest versions, has solved this by using fork to create a duplicate of the parent process that is dedicated to setting up the environment and, once that is done, loads the new executable. In the meantime the child is in a temporary state: it is a separate process, but it still runs the parent's code and has access to all of the parent's resources. This approach has a few advantages:

  1. You can use the existing in-process manipulation APIs to set up the child process. This means you don't then need a whole family of API calls for manipulating the child process to set up resources.
  2. You can use exec on its own if the parent process will no longer need to exist after creating the new one.
  3. You can fork without using exec if you want a second process of the same executable.
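Advantage 1 can be sketched as follows in Python: between fork and exec, the child uses the ordinary in-process calls (`dup2`, `chdir`, a modified environment) on itself, so no special "configure the child" API is needed. The helper `run_with_setup` and the variable `GREETING` are hypothetical names chosen for this example, and the output is assumed to be small enough to read in one go.

```python
import os


def run_with_setup(argv, cwd, extra_env):
    """Fork, adjust the child from the inside, then exec argv; return its stdout."""
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        try:
            # Child: plain in-process calls, applied to the child itself.
            os.close(read_fd)
            os.dup2(write_fd, 1)   # stdout now goes into the pipe
            os.close(write_fd)
            os.chdir(cwd)          # working directory for the new program
            env = dict(os.environ, **extra_env)
            os.execvpe(argv[0], argv, env)  # does not return on success
        finally:
            os._exit(127)          # only reached if setup or exec failed
    # Parent: its own stdout, cwd and environment are unchanged.
    os.close(write_fd)
    with os.fdopen(read_fd, "rb") as inp:
        output = inp.read()
    os.waitpid(pid, 0)
    return output.decode()


if __name__ == "__main__":
    print(run_with_setup(["sh", "-c", "pwd -P; echo $GREETING"],
                         "/", {"GREETING": "hello"}), end="")
```

A single combined "spawn" call would need a parameter for each of these adjustments; here every additional setup step is just another line before the exec.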
  • 3
    See also: clone() and posix_spawn()
    – BowlOfRed
    Commented Aug 18, 2022 at 2:52
  • 1
    This is more like a rationalization than a proper reason. IMO. Other systems besides Unix have been able to create processes from scratch.
    – davidbak
    Commented Aug 18, 2022 at 3:17
  • 11
    It's more than a rationalization, it's a design philosophy. Unics [sic] simplicity was inspired as a counter to Multics complexity, so if fork was a good way to duplicate a process when you wanted a duplicate, and exec was a good way to replace the executable image when you wanted to replace it, why waste time and resources implementing another way to do the same thing? multicians.org/unix.html
    – kbro
    Commented Aug 18, 2022 at 3:57
  • 3
    @BowlOfRed posix_spawn is not a Linux syscall. It is emulated. clone is basically fork with the ability to handle some extremely common configuration cases more efficiently, along with better safety from multi-threading. It does not have the ability to do fork + exec in a single step. Commented Aug 18, 2022 at 10:08
  • 3
    @user253751 I don't see anybody saying "this is the ideal way that it would be designed in any OS and should never be changed", they are saying "this was the sound reasoning that led to this decision back then". Saying "well, here's a third way it could work which I think is better" in no way invalidates the fact that the design chosen had clear advantages over how other OSes implemented it.
    – IMSoP
    Commented Aug 18, 2022 at 15:38

No, you cannot create a new process in UNIX, you can only duplicate your current process (using fork). If you want the new process to do something other than exactly what the current process is doing, you then replace it (using exec).

You do not have to fork before calling exec. There's a common usage in scripts that start up a login session (.xinitrc and the like) where you set up environment variables and start background tasks (such as ssh-agent) and then run the session manager. There's nothing more for the startup script to do after launching the session manager so you exec it to free up the resources assigned to running your startup script. The parent of the startup script is unaware of this replacement - the PID remains the same - so they continue to wait for this child to die before they perform their tidy-up actions.
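The "PID remains the same" point can be checked with a small Python sketch. The fork here exists only so that the experiment has an observer: the child plays the role of the startup script, sets an environment variable, reports its PID, and then execs `sh`, which reports its own PID via `$$`. The function name and the `SESSION_KIND` variable are invented for the illustration.

```python
import os


def pid_before_and_after_exec():
    """Return (child_pid, [pid reported before exec, pid reported after exec])."""
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        try:
            # Child stands in for the startup script.
            os.close(read_fd)
            os.dup2(write_fd, 1)
            os.environ["SESSION_KIND"] = "demo"       # some environment setup
            os.write(1, f"{os.getpid()}\n".encode())  # PID while still Python
            os.execvp("sh", ["sh", "-c", "echo $$"])  # same process, new program
        finally:
            os._exit(127)  # only reached if exec failed
    os.close(write_fd)
    with os.fdopen(read_fd, "rb") as inp:
        reported = [int(line) for line in inp.read().split()]
    os.waitpid(pid, 0)  # the parent waits on the same PID throughout
    return pid, reported


if __name__ == "__main__":
    child_pid, reported = pid_before_and_after_exec()
    print(child_pid, reported)
```

Both reported values should equal the PID that fork returned to the parent, which is why a waiting parent never notices the replacement.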
