Clone
In the manual page for the clone()/clone3() system calls I find:
CLONE_THREAD (since Linux 2.4.0).
If CLONE_THREAD is set, the child is placed in the same thread group as the calling process. To make the remainder of the discussion of CLONE_THREAD more readable, the term "thread" is used to refer to the processes within a thread group.
So, as I understand it, from clone's perspective a thread is something that was created with the CLONE_THREAD flag set (and thus ended up in the same thread group as the caller).
Futex
But looking at, for instance, the manual page for futex(), I find:
FUTEX_PRIVATE_FLAG (since Linux 2.6.22)
This option bit can be employed with all futex operations. It tells the kernel that the futex is process-private and not shared with another process (i.e., it is being used for synchronization only between threads of the same process). This allows the kernel to make some additional performance optimizations.
This seems related to the definition in clone, but not quite. When a thread (as in clone) is created (that is, the CLONE_THREAD flag is specified), the created tasks also have to share their VM. However, it is possible to create two tasks which are not threads (as in clone) but which still share VM (by just specifying CLONE_VM). But, judging by the "FUTEX : new PRIVATE futexes" article/patch, the optimization used for _PRIVATE futexes is that the virtual address of the futex word is used rather than the physical one, so one could probably use private futexes in tasks created with CLONE_VM... but the man page for futex() forbids that.
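To make the futex case concrete, here is a minimal sketch, assuming Linux with glibc's clone() wrapper and a stack that grows downwards. It creates a child with CLONE_VM but without CLONE_THREAD and checks whether a FUTEX_WAKE_PRIVATE issued by the child reaches a FUTEX_WAIT_PRIVATE in the parent. The names private_futex_across_clone_vm, waker, futex_word, woken and give_up are made up for illustration. The child only makes raw system calls because, without CLONE_SETTLS, it shares the parent's thread-local storage. A positive result only shows what the running kernel does; it is not a documented guarantee.

```c
#define _GNU_SOURCE
#include <linux/futex.h>
#include <sched.h>
#include <signal.h>
#include <stdatomic.h>
#include <stdint.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

#define STACK_SIZE (64 * 1024)

static uint32_t futex_word;   /* stays 0; only its address matters here */
static atomic_int woken;      /* number of waiters the child managed to wake */
static atomic_int give_up;    /* set by the parent once its wait has returned */

/* Thin wrapper: glibc does not export a futex() function. */
static long futex(uint32_t *uaddr, int op, uint32_t val,
                  const struct timespec *timeout)
{
    return syscall(SYS_futex, uaddr, op, val, timeout, NULL, 0);
}

/* Child: retry the private wake until it wakes a waiter or is told to stop. */
static int waker(void *arg)
{
    (void)arg;
    while (!atomic_load(&give_up)) {
        long n = futex(&futex_word, FUTEX_WAKE_PRIVATE, 1, NULL);
        if (n > 0) {
            atomic_store(&woken, (int)n);
            break;
        }
        syscall(SYS_sched_yield);
    }
    return 0;
}

/* Returns 1 if the private wake crossed from child to parent,
 * 0 if the parent's wait timed out, -1 if setup failed. */
int private_futex_across_clone_vm(void)
{
    atomic_store(&woken, 0);
    atomic_store(&give_up, 0);

    char *stack = mmap(NULL, STACK_SIZE, PROT_READ | PROT_WRITE,
                       MAP_PRIVATE | MAP_ANONYMOUS | MAP_STACK, -1, 0);
    if (stack == MAP_FAILED)
        return -1;

    /* Shares the address space, but the child is NOT in our thread group. */
    pid_t pid = clone(waker, stack + STACK_SIZE, CLONE_VM | SIGCHLD, NULL);
    if (pid == -1) {
        munmap(stack, STACK_SIZE);
        return -1;
    }

    /* Block until woken; the timeout keeps the sketch from hanging. */
    struct timespec timeout = { .tv_sec = 5, .tv_nsec = 0 };
    long rc = futex(&futex_word, FUTEX_WAIT_PRIVATE, 0, &timeout);

    atomic_store(&give_up, 1);
    waitpid(pid, NULL, 0);
    munmap(stack, STACK_SIZE);
    return rc == 0 && atomic_load(&woken) == 1;
}
```

Calling private_futex_across_clone_vm() from main() and printing the result is enough to try it. The futex word is deliberately never modified, so the parent can only leave the wait through a wake-up (or the timeout), which is what makes the return value meaningful.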
That is not a critical issue, though: the manual imposes a (seemingly unnecessary) restriction but it doesn't break anything. So, here goes a more thrilling example.
Close
From the manual for the close() system call:
Furthermore, consider the following scenario where two threads are performing operations on the same file descriptor:
(1) One thread is blocked in an I/O system call on the file descriptor. For example, it is trying to write(2) to a pipe that is already full, or trying to read(2) from a stream socket which currently has no available data.
(2) Another thread closes the file descriptor.
The behavior in this situation varies across systems. <...>
Clearly, it is assumed here that the two tasks share a file descriptor table, but being threads (as in clone) is neither necessary nor sufficient to claim they do!
If a task is created with CLONE_THREAD|...|CLONE_FILES, it's all good, but if it's just CLONE_THREAD|... (which is allowed), the two threads (as in clone) do not share file descriptors, and the tasks cloned with ...|CLONE_FILES are not threads but do share file descriptors!
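The CLONE_FILES half of this can be checked with a small sketch, again assuming Linux with glibc's clone() wrapper and a downward-growing stack. The helper fd_survives_child_close and its extra_flags parameter are hypothetical names of mine. The child is never created with CLONE_THREAD, so it is not a thread (as in clone); whether its close() affects the parent depends only on CLONE_FILES. The reverse case (CLONE_THREAD without CLONE_FILES) is not covered here.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <sched.h>
#include <signal.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <sys/wait.h>
#include <unistd.h>

#define STACK_SIZE (64 * 1024)

/* Child: close the descriptor whose number the parent passed in.
 * A raw system call is used because the child shares the parent's TLS. */
static int close_fd(void *arg)
{
    int fd = *(int *)arg;   /* readable because of CLONE_VM */
    syscall(SYS_close, fd);
    return 0;
}

/* Creates a non-thread child with CLONE_VM | extra_flags, lets it close
 * the read end of a pipe, then reports what the parent sees.
 * Returns 1 if the parent's descriptor is still open, 0 if it was closed
 * by the child, -1 if setup failed. */
int fd_survives_child_close(int extra_flags)
{
    int fds[2];
    if (pipe(fds) == -1)
        return -1;

    char *stack = mmap(NULL, STACK_SIZE, PROT_READ | PROT_WRITE,
                       MAP_PRIVATE | MAP_ANONYMOUS | MAP_STACK, -1, 0);
    if (stack == MAP_FAILED) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    pid_t pid = clone(close_fd, stack + STACK_SIZE,
                      CLONE_VM | SIGCHLD | extra_flags, &fds[0]);
    if (pid == -1) {
        munmap(stack, STACK_SIZE);
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    waitpid(pid, NULL, 0);

    /* F_GETFD fails with EBADF if the descriptor is no longer open here. */
    int still_open = fcntl(fds[0], F_GETFD) != -1;

    if (still_open)
        close(fds[0]);
    close(fds[1]);
    munmap(stack, STACK_SIZE);
    return still_open;
}
```

With extra_flags set to 0 the child gets its own copy of the descriptor table, so the parent's descriptor should survive; with CLONE_FILES the table is shared and the child's close() should be visible in the parent, even though the two tasks are not in the same thread group.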
Question
First of all, is this (at least the close() example) a bug in the manual? Is it because it was written before the clone() system call was designed? Or am I missing something?
In general: when using a particular system call that has the term "thread" used in its manual, how do I tell what's implied?
In particular (assuming I want to write code that will work on future kernel versions): would it be OK to use private futexes in tasks that share VM but are not threads (as in clone)? Would it be thread-safe to call close() in threads (as in clone) that do not share file descriptor tables, as common sense suggests?