20

After reading up on this answer and "Linux Kernel Development" by Robert Love and, subsequently, on the clone() system call, I discovered that processes and threads in Linux are (almost) indistinguishable to the kernel. There are a few tweaks between them (discussed as being "more sharing" or "less sharing" in the quoted SO question), but I do still have some questions yet to be answered.

I recently worked on a program involving a couple of POSIX threads and decided to experiment on this premise. In a process that creates two threads, each thread of course gets a unique value from pthread_self(), but not from getpid().

A sample program I created follows:

#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>
#include <unistd.h>
#include <pthread.h>

void* threadMethod(void* arg)
{
    int intArg = *((int*) arg);

    int32_t pid = getpid();
    uint64_t pti = pthread_self();

    printf("[Thread %d] getpid() = %d\n", intArg, pid);
    printf("[Thread %d] pthread_self() = %lu\n", intArg, pti);

    /* A thread start routine must return a void* value. */
    return NULL;
}

int main()
{
    pthread_t threads[2];

    int thread1 = 1;

    if ((pthread_create(&threads[0], NULL, threadMethod, (void*) &thread1))
         != 0)
    {
        fprintf(stderr, "pthread_create: error\n");
        exit(EXIT_FAILURE);
    }

    int thread2 = 2;

    if ((pthread_create(&threads[1], NULL, threadMethod, (void*) &thread2))
         != 0)
    {
        fprintf(stderr, "pthread_create: error\n");
        exit(EXIT_FAILURE);
    }

    int32_t pid = getpid();
    uint64_t pti = pthread_self();

    printf("[Process] getpid() = %d\n", pid);
    printf("[Process] pthread_self() = %lu\n", pti);

    if ((pthread_join(threads[0], NULL)) != 0)
    {
        fprintf(stderr, "Could not join thread 1\n");
        exit(EXIT_FAILURE);
    }

    if ((pthread_join(threads[1], NULL)) != 0)
    {
        fprintf(stderr, "Could not join thread 2\n");
        exit(EXIT_FAILURE);
    }

    return 0;
}

(This was compiled [gcc -pthread -o thread_test thread_test.c] on 64-bit Fedora; due to the 64-bit types used for pthread_t sourced from <bits/pthreadtypes.h>, the code will require minor changes to compile on 32-bit editions.)

The output I get is as follows:

[bean@fedora ~]$ ./thread_test 
[Process] getpid() = 28549
[Process] pthread_self() = 140050170017568
[Thread 2] getpid() = 28549
[Thread 2] pthread_self() = 140050161620736
[Thread 1] getpid() = 28549
[Thread 1] pthread_self() = 140050170013440
[bean@fedora ~]$ 

By using scheduler locking in gdb, I can keep the program and its threads alive long enough to capture what top reports. Showing only processes, it says:

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
28602 bean      20   0 15272 1112  820 R  0.4  0.0   0:00.63 top
 2036 bean      20   0  108m 1868 1412 S  0.0  0.0   0:00.11 bash
28547 bean      20   0  231m  16m 7676 S  0.0  0.4   0:01.56 gdb
28549 bean      20   0 22688  340  248 t  0.0  0.0   0:00.26 thread_test
28561 bean      20   0  107m 1712 1356 S  0.0  0.0   0:00.07 bash

And when showing threads, it says:

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
28617 bean      20   0 15272 1116  820 R 47.2  0.0   0:00.08 top
 2036 bean      20   0  108m 1868 1412 S  0.0  0.0   0:00.11 bash
28547 bean      20   0  231m  16m 7676 S  0.0  0.4   0:01.56 gdb
28549 bean      20   0 22688  340  248 t  0.0  0.0   0:00.26 thread_test
28552 bean      20   0 22688  340  248 t  0.0  0.0   0:00.00 thread_test
28553 bean      20   0 22688  340  248 t  0.0  0.0   0:00.00 thread_test
28561 bean      20   0  107m 1860 1432 S  0.0  0.0   0:00.08 bash

It seems to be quite clear that programs, or perhaps the kernel, have a distinct way of defining threads in contrast to processes. Each thread has its own PID according to top - why?

3
  • 1
    clone() is just how Linux implements both threads and fork(). All that matters is that talking to a PID will pass the signal on to everyone who needs to know. If the kernel assigns additional IDs to the threads, that's none of your business and it doesn't affect how you talk to your processes.
    – Kerrek SB
    Commented Feb 6, 2012 at 1:33
  • Good link to go through. Commented Mar 16, 2014 at 10:42
  • "processes and threads in Linux are (almost) indistinguishable to the kernel" Umm, not really true. There is almost nothing you can say about how the Linux kernel works that is true of both processes and threads. Owns a view of vm? Only processes. Can be scheduled? Only threads. Has a file descriptor table? Only processes. Has a priority? Only threads. And so on down the line. Commented Mar 9, 2017 at 6:02

3 Answers

33

These confusions all stem from the fact that the kernel developers originally held an irrational and wrong view that threads could be implemented almost entirely in userspace using kernel processes as the primitive, as long as the kernel offered a way to make them share memory and file descriptors. This led to the notoriously bad LinuxThreads implementation of POSIX threads, which was rather a misnomer because it did not give anything remotely resembling POSIX thread semantics. Eventually LinuxThreads was replaced (by NPTL), but a lot of the confusing terminology and misunderstandings persist.

The first and most important thing to realize is that "PID" means different things in kernel space and user space. What the kernel calls PIDs are actually kernel-level thread ids (often called TIDs), not to be confused with pthread_t which is a separate identifier. Each thread on the system, whether in the same process or a different one, has a unique TID (or "PID" in the kernel's terminology).

What's considered a PID in the POSIX sense of "process", on the other hand, is called a "thread group ID" or "TGID" in the kernel. Each process consists of one or more threads (kernel processes) each with their own TID (kernel PID), but all sharing the same TGID, which is equal to the TID (kernel PID) of the initial thread in which main runs.
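
To see the TID/TGID split from inside a program, something along these lines should work on Linux. It is only a sketch: the struct and function names (thread_ids, record_ids, compare_ids) and the DEMO_MAIN switch are made up for illustration, and the TID is fetched with the raw gettid system call because older glibc versions have no wrapper for it.

```c
#define _GNU_SOURCE
#include <pthread.h>
#include <stdio.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* The two identifiers one thread observes. */
struct thread_ids {
    pid_t pid;   /* getpid(): the process ID, i.e. the kernel's TGID */
    pid_t tid;   /* gettid:   the thread ID, i.e. the kernel's "PID" */
};

/* Thread entry point: store the calling thread's IDs in *arg. */
void *record_ids(void *arg)
{
    struct thread_ids *ids = arg;

    ids->pid = getpid();
    /* Raw syscall, because older glibc versions ship no gettid() wrapper. */
    ids->tid = (pid_t) syscall(SYS_gettid);
    return NULL;
}

/*
 * Records the caller's IDs in *caller and those of a freshly created
 * thread in *worker. Returns 0 on success, -1 on a pthread error.
 */
int compare_ids(struct thread_ids *caller, struct thread_ids *worker)
{
    pthread_t thread;

    record_ids(caller);
    if (pthread_create(&thread, NULL, record_ids, worker) != 0)
        return -1;
    if (pthread_join(thread, NULL) != 0)
        return -1;
    return 0;
}

#ifdef DEMO_MAIN   /* hypothetical switch: build with -DDEMO_MAIN to run */
int main(void)
{
    struct thread_ids caller, worker;

    if (compare_ids(&caller, &worker) != 0) {
        fprintf(stderr, "could not run worker thread\n");
        return 1;
    }
    printf("caller: getpid() = %d, gettid = %d\n",
           (int) caller.pid, (int) caller.tid);
    printf("worker: getpid() = %d, gettid = %d\n",
           (int) worker.pid, (int) worker.tid);
    return 0;
}
#endif
```

Built with gcc -pthread -DDEMO_MAIN, both lines should report the same getpid() value but different gettid values, and when the caller is the initial thread its gettid value should match getpid().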

When top shows you threads, it's showing TIDs (kernel PIDs), not PIDs (kernel TGIDs), and this is why each thread has a separate one.
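
The same per-thread IDs are visible without top: on Linux, /proc/self/task holds one directory per thread of the calling process, named after its TID. The sketch below assumes /proc is mounted; list_tids, tids_with_workers, MAX_WORKERS and the DEMO_MAIN switch are hypothetical names used only for this illustration.

```c
#include <dirent.h>
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>

#define MAX_WORKERS 16

/*
 * Stores up to max TIDs of the calling process in out[] by reading
 * /proc/self/task. Returns the number of threads found, or -1 on error.
 */
int list_tids(long *out, int max)
{
    DIR *dir = opendir("/proc/self/task");
    struct dirent *entry;
    int count = 0;

    if (dir == NULL)
        return -1;
    while ((entry = readdir(dir)) != NULL) {
        if (entry->d_name[0] == '.')      /* skip "." and ".." */
            continue;
        if (count < max)
            out[count] = strtol(entry->d_name, NULL, 10);
        count++;
    }
    closedir(dir);
    return count;
}

static pthread_mutex_t gate = PTHREAD_MUTEX_INITIALIZER;

/* Worker: block on the gate so the thread stays alive while we look. */
static void *wait_at_gate(void *arg)
{
    (void) arg;
    pthread_mutex_lock(&gate);
    pthread_mutex_unlock(&gate);
    return NULL;
}

/*
 * Starts `workers` blocked threads, lists all TIDs while they are alive,
 * then releases and joins them. Returns the thread count, or -1 on error.
 */
int tids_with_workers(int workers, long *out, int max)
{
    pthread_t threads[MAX_WORKERS];
    int started = 0, count;

    if (workers < 0 || workers > MAX_WORKERS)
        return -1;
    pthread_mutex_lock(&gate);
    while (started < workers &&
           pthread_create(&threads[started], NULL, wait_at_gate, NULL) == 0)
        started++;
    count = list_tids(out, max);
    pthread_mutex_unlock(&gate);
    for (int i = 0; i < started; i++)
        pthread_join(threads[i], NULL);
    return started == workers ? count : -1;
}

#ifdef DEMO_MAIN   /* hypothetical switch: build with -DDEMO_MAIN to run */
int main(void)
{
    long tids[MAX_WORKERS + 1];
    int count = tids_with_workers(2, tids, MAX_WORKERS + 1);

    for (int i = 0; i < count && i < MAX_WORKERS + 1; i++)
        printf("TID %ld\n", tids[i]);
    return count < 0;
}
#endif
```

With two workers this should list three TIDs for an otherwise single-threaded program, one of them equal to the value getpid() returns.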

With the advent of NPTL, most system calls that take a PID argument or act on the calling process were changed to treat the PID as a TGID and act on the whole "thread group" (POSIX process).

4
  • Thank you for your answer, and I will be researching what you say for a while, no doubt. Perhaps the biggest question out of my post today (as far as I can tell) is, 'why haven't others asked about this?' (At least, through an easily available resource.) Surely, this must be a big topic for those who, like me, are involved in multithreaded applications?
    – Doddy
    Commented Feb 6, 2012 at 2:15
  • 1
    Well nowadays (now that we're past the LinuxThreads fiasco) application programmers can really just get to business using POSIX threads as specified by POSIX, without worrying too much about what goes on under the hood, because everything mostly works correctly. I suspect that's the reason why the implementation details don't get much attention anymore. BTW man 7 pthreads has some basic explanation of how it works. Commented Feb 6, 2012 at 3:43
  • @bean - the question has been asked many different ways from many different angles here. R has given a particularly good answer in that it touches on several levels of confusion from both the linux historical and technical perspectives.
    – Duck
    Commented Feb 6, 2012 at 3:44
  • Thanks. "When top shows you threads, it's showing TIDs (kernel PIDs), not PIDs (kernel TGIDs), and this is why each thread has a separate one." Is it now still the case? I type top -H, but can't figure that out.
    – Tim
    Commented Jan 1, 2019 at 13:43
1

Imagine some sort of "meta-entity". If the entity shares none of the resources (address space, file descriptors, etc) of its parent then it's a process, and if the entity shares all of the resources of its parent then it's a thread. You could even have something half-way between process and thread (e.g. some resources shared and some not shared). Take a look at the "clone()" system call (e.g. http://linux.die.net/man/2/clone ) and you'll see this is how Linux does things internally.
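
As a rough illustration of the "more sharing" versus "less sharing" idea, the sketch below calls the glibc clone() wrapper twice: once with CLONE_VM (the child shares the parent's address space) and once without (the child gets its own copy, as with fork()). The helper names, the 64 KiB stack size and the DEMO_MAIN switch are assumptions made for this example; real thread libraries pass many more flags than this.

```c
#define _GNU_SOURCE
#include <sched.h>
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>

#define CHILD_STACK_SIZE (64 * 1024)   /* arbitrary size for this sketch */

static volatile int shared_counter;

/* Runs in the child: the write reaches the parent only if memory is shared. */
static int child_fn(void *arg)
{
    (void) arg;
    shared_counter = 42;
    return 0;
}

/*
 * Runs child_fn in a clone()d child created with extra_flags, waits for it,
 * and returns the value of shared_counter the parent then sees (-1 on error).
 */
int counter_after_clone(int extra_flags)
{
    char *stack = malloc(CHILD_STACK_SIZE);
    pid_t child;
    int result = -1;

    if (stack == NULL)
        return -1;
    shared_counter = 0;
    /* The stack grows downward on common architectures, so pass its top.
       SIGCHLD as the exit signal lets a plain waitpid() reap the child. */
    child = clone(child_fn, stack + CHILD_STACK_SIZE,
                  extra_flags | SIGCHLD, NULL);
    if (child != -1 && waitpid(child, NULL, 0) == child)
        result = shared_counter;
    free(stack);
    return result;
}

#ifdef DEMO_MAIN   /* hypothetical switch: build with -DDEMO_MAIN to run */
int main(void)
{
    printf("with CLONE_VM:    counter = %d\n", counter_after_clone(CLONE_VM));
    printf("without CLONE_VM: counter = %d\n", counter_after_clone(0));
    return 0;
}
#endif
```

With CLONE_VM the parent should observe the child's write; without it the parent's copy stays untouched. Other CLONE_* flags (for file descriptors, signal handlers and so on) select what else is shared in the same way.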

Now hide that behind some sort of abstraction that makes everything look like either a process or a thread. If the abstraction is flawless you'd never know the difference between "entities" and "processes and threads". The abstraction isn't quite flawless though - the PID you're seeing is actually an "entity ID".

1
  • Sorry, your answer doesn't shed any light on my question at all. I've already looked at the man page of clone(). The fact that there was abstraction between processes and threads was what made me ask the question in the first place. It's very easy to say that because I've chosen to call clone() with less than 50% resources shared, I should be given a thread as opposed to a process, and likewise.
    – Doddy
    Commented Feb 6, 2012 at 2:22
0

On Linux, every thread gets a thread ID. The thread ID of the main thread serves double duty as the process ID (and is rather well-known in the user interface). The thread ID is an implementation detail of Linux, and unrelated to the POSIX thread ID (the pthread_t value returned by pthread_self()). For more details, refer to the gettid system call (not available from pure Python since it's system-specific).

1
  • I am not sure if The thread ID of the main thread serves double duty as the process ID is true. When I ran the code above, the thread ID of the main thread wasn't same as the PID. Commented Mar 9, 2017 at 6:23
