The process management permissions were defined in flask/access_vectors and interpreted for each component policy in the appropriate files in policy. It is through these policy files that the need to allow every process to signal the init process was addressed. Permission checks using the AVC interface were added at various places throughout the kernel as needed.
A convenient location to place the execve access checks was in the prepare_binprm kernel routine used in the implementation of the execve call. This routine was the natural choice because it is used for loading the executable requested in the system call arguments and also any other executables indicated by the binary image header. The specified SID for the new process image was made accessible from the linux_binprm structure built for the call. The other SID values necessary for permission checking were already accessible from within this routine. In general, placing the access checks in this routine made it unnecessary to place additional checks in all of the individual binary handlers. However, it was necessary to add process_execute and file_execute checks in the ELF binary handler since it was possible that it could call other interpreters that otherwise would have gone unchecked. These checks were added to do_load_elf_binary in fs/binfmt_elf.c.
Since shared libraries are loaded using mmap, a process execute check was needed in the old_mmap routine defined in arch/i386/kernel/sys_i386.c. Section 6.1.3 describes the complete control requirements of mmap. This check, however, is not sufficient on the x86 architecture, since a file may be mapped read-only and still be executed. This is an instance of the general problem that execution cannot be controlled for anything a process can read. The security impact of this particular problem and the best way to minimize it are still being investigated.
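The limitation described above can be illustrated with a small user-space sketch. It is not the kernel code: the toy_ names, the SID values, the one-line policy, and the choice of -EACCES as the failure code are all assumptions made for illustration. The point is only that a check keyed on the requested protection never sees a read-only mapping.

```c
#include <errno.h>
#include <stdint.h>

typedef uint32_t toy_sid;

/* Toy protection flags, standing in for PROT_READ and PROT_EXEC. */
#define TOY_PROT_READ 0x1
#define TOY_PROT_EXEC 0x4

/* Hypothetical policy: only files labeled TOY_SID_LIB may be executed. */
#define TOY_SID_LIB 10u

int toy_may_execute(toy_sid process_sid, toy_sid file_sid)
{
    (void)process_sid;               /* the toy policy ignores the caller */
    return file_sid == TOY_SID_LIB;
}

/* Returns 0 if the mapping is allowed, -EACCES otherwise (assumed code). */
int toy_mmap_check(toy_sid process_sid, toy_sid file_sid, int prot)
{
    /* The execute check runs only when executable access is requested. */
    if ((prot & TOY_PROT_EXEC) && !toy_may_execute(process_sid, file_sid))
        return -EACCES;

    /* A read-only request skips the execute check entirely, which is the
     * gap on hardware that lets readable pages be executed. */
    return 0;
}
```

In this sketch a request for TOY_PROT_READ alone on an unapproved file succeeds, mirroring the case the text says is still under investigation.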
Whenever a call to execve is going to result in a change in security context, additional action must be taken to ensure that the policy cannot be violated. The call is aborted with the global variable errno set to EPERM when there is inappropriate sharing that resulted from a previous call to clone. Similarly, the call is aborted with EPERM if the process is being traced and the parent process lacks ptrace permission to the new SID for the process. Open descriptors must be revalidated with the inherit permission and closed when necessary. This is done in a new function, revalidate_fds, which is modeled after the flush_old_files routine used to check the close_on_exec flag, called from the flush_old_exec routine. Finally, the compute_creds routine was modified to update the sid and osid fields of the task structure and to call wake_up_interruptible on the parent to force permission checking if the parent was waiting on the transformed process. If the parent is not waiting, this action is harmless. A small side effect of this approach is that a parent process can notice that its child has undergone a SID transition that prevents the parent from waiting on it.
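The descriptor revalidation step can be sketched in user space as a walk over the open-file table that closes anything the new SID may not inherit. This is an illustration under stated assumptions, not the revalidate_fds implementation: the toy_ names, the table layout, the SID values, and the inherit rule are all hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

typedef uint32_t toy_sid;

struct toy_file {
    toy_sid sid;    /* label of the open file */
    int in_use;     /* nonzero while the descriptor is open */
};

/* Hypothetical labels used by the toy policy below. */
#define TOY_SID_TRUSTED 1u
#define TOY_SID_SECRET  50u

/* Hypothetical inherit rule: secret files pass only to the trusted SID. */
int toy_may_inherit(toy_sid new_process_sid, toy_sid file_sid)
{
    return file_sid != TOY_SID_SECRET || new_process_sid == TOY_SID_TRUSTED;
}

/* Close every open descriptor the new SID may not inherit.
 * Returns the number of descriptors closed. */
size_t toy_revalidate_fds(toy_sid new_process_sid,
                          struct toy_file *files, size_t nfiles)
{
    size_t closed = 0;

    for (size_t fd = 0; fd < nfiles; fd++) {
        if (!files[fd].in_use)
            continue;                       /* slot already empty */
        if (!toy_may_inherit(new_process_sid, files[fd].sid)) {
            files[fd].in_use = 0;           /* stand-in for closing it */
            closed++;
        }
    }
    return closed;
}
```

The loop has the same shape as a close-on-exec pass, with the per-descriptor flag test replaced by a permission query against the new SID.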
Linux currently checks whether signals may be delivered in the send_sig_info routine defined in kernel/signal.c, which is the central control point for the signal mechanism. The appropriate signal permission checks were placed immediately after the existing checks. When the checks fail, the global variable errno is set to EACCES. Linux's signal checking for signals resulting from asynchronous I/O is done in the send_sigio routine defined in fs/fcntl.c. Here too, the permission checking is done following the existing checks. On failure, no signal is sent.
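The ordering described above, existing checks first and the new permission check second, can be sketched as follows. Everything here is a user-space stand-in: the toy_ names, the uid-based existing check, the SID values, and the policy (a confined SID that may signal only init or itself) are assumptions for illustration, as is the -EPERM result for the existing check.

```c
#include <errno.h>
#include <stdint.h>

typedef uint32_t toy_sid;

struct toy_task {
    unsigned uid;
    toy_sid sid;
    int pending;    /* last signal queued, 0 if none */
};

/* Hypothetical labels for the toy policy. */
#define TOY_SID_INIT     1u
#define TOY_SID_CONFINED 30u

/* Stand-in for the pre-existing check: uid 0 or matching uid. */
int toy_existing_check(const struct toy_task *s, const struct toy_task *t)
{
    return s->uid == 0 || s->uid == t->uid;
}

/* Hypothetical policy: the confined SID may signal only init or itself. */
int toy_may_signal(toy_sid ssid, toy_sid tsid)
{
    if (ssid == TOY_SID_CONFINED)
        return tsid == TOY_SID_INIT || tsid == TOY_SID_CONFINED;
    return 1;
}

int toy_send_sig(int sig, const struct toy_task *sender,
                 struct toy_task *target)
{
    if (!toy_existing_check(sender, target))
        return -EPERM;                      /* existing check fails first */
    if (!toy_may_signal(sender->sid, target->sid))
        return -EACCES;                     /* added permission check */
    target->pending = sig;                  /* deliver */
    return 0;
}
```

Placing the new check after the existing one means a request that was already refused keeps its original error code, and the policy is consulted only for signals that would otherwise have been delivered.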
Signal checking is also used to control a process's ability to wait on another. Checks to determine if a child's exit_signal can be delivered to the parent were added to the sys_wait4 routine in kernel/exit.c. Whenever a process is awakened, Linux checks to see if the wait call should return or if the process should be put back to sleep. At this time, the permission checks are repeated to ensure that the waiting process can continue to wait. If not, the wait call returns with the global variable errno set to ECHILD. This ensures that the waiting process will not be blocked indefinitely. In this case, when the child eventually exits, it will remain a zombie process until it can be reaped by the init process.
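The repeated check on wake-up can be sketched as one pass of a wait loop. This is a simplified user-space model, not sys_wait4: the toy_ names, the SID values, and the policy (a child relabeled to an isolated SID may no longer signal its parent) are hypothetical.

```c
#include <errno.h>
#include <stdint.h>

typedef uint32_t toy_sid;

struct toy_child {
    toy_sid sid;    /* may change if the child transitions on execve */
    int exited;     /* nonzero once the child has terminated */
};

/* Hypothetical label whose exit signal may not reach the parent. */
#define TOY_SID_ISOLATED 60u

/* Hypothetical policy: may the child's exit signal go to the parent? */
int toy_may_signal_parent(toy_sid child_sid, toy_sid parent_sid)
{
    (void)parent_sid;                /* the toy policy ignores the parent */
    return child_sid != TOY_SID_ISOLATED;
}

enum { TOY_WAIT_SLEEP = 0, TOY_WAIT_REAPED = 1 };

/* One pass of the wait loop, run each time the waiter is awakened.
 * Returns TOY_WAIT_REAPED, TOY_WAIT_SLEEP, or -ECHILD. */
int toy_wait_once(toy_sid parent_sid, const struct toy_child *child)
{
    /* Repeat the permission check before deciding to sleep again. */
    if (!toy_may_signal_parent(child->sid, parent_sid))
        return -ECHILD;
    return child->exited ? TOY_WAIT_REAPED : TOY_WAIT_SLEEP;
}
```

Because the check is re-evaluated on every pass, a child that changes SID while the parent sleeps causes the next wake-up to return -ECHILD instead of leaving the parent blocked.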
The execute permission check for a shared library specified in uselib was placed in sys_uselib. Failure aborts the call with the global variable errno set to EACCES.
No special changes to fork or clone were necessary to handle the initialization of the new fields of the task structure. Provided the init process is properly initialized during system startup, those fields are inherited from the parent process automatically during process creation, without modification to the existing code. For permission checking, since fork is implemented as a special case of clone, only the clone call actually needed modification. The permission checking was added to the do_fork routine defined in kernel/fork.c. The SID of the current process was used twice in the call to the AVC, as both the source and the target SID. Failure aborts the call with the global variable errno set to EACCES.
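A minimal sketch of the fork check follows, assuming a toy AVC query that takes a source SID, a target SID, and a requested permission. The toy_ names, the permission bit, and the single denied SID are hypothetical; the sketch only shows the current SID being passed in both positions and the security fields being inherited by a plain structure copy.

```c
#include <errno.h>
#include <stdint.h>

typedef uint32_t toy_sid;

#define TOY_PERM_FORK  0x1u
#define TOY_SID_NOFORK 42u    /* hypothetical SID denied fork */

/* Toy AVC query: 0 if granted, -EACCES if denied. */
int toy_avc_has_perm(toy_sid ssid, toy_sid tsid, uint32_t requested)
{
    if (requested == TOY_PERM_FORK &&
        ssid == TOY_SID_NOFORK && tsid == TOY_SID_NOFORK)
        return -EACCES;
    return 0;
}

struct toy_task {
    toy_sid sid;     /* current SID */
    toy_sid osid;    /* SID before the last transition */
};

/* Returns 0 and fills *child on success, -EACCES on denial. */
int toy_do_fork(const struct toy_task *parent, struct toy_task *child)
{
    /* The parent's SID is both source and target of the check. */
    int rc = toy_avc_has_perm(parent->sid, parent->sid, TOY_PERM_FORK);
    if (rc != 0)
        return rc;

    *child = *parent;    /* sid and osid are inherited by the copy */
    return 0;
}
```

No field-by-field initialization appears in toy_do_fork, which matches the observation that the new task fields needed no special handling at process creation.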
The ptrace permission check was added to the ptrace system call (arch/i386/kernel/ptrace.c:sys_ptrace), the execve call (fs/exec.c:must_not_trace_exec_flask), and the access routines for the mem file in procfs (fs/proc/mem.c:get_task). The scheduling, session, and process group permission checks were added to the corresponding system calls in kernel/sched.c and kernel/sys.c. The setcap and getcap permission checks were added to the corresponding system calls in kernel/capability.c.
The checking for all of the capability permissions was centralized in a single location, the capable function defined in include/linux/sched.h. Because the capability checks are done here, not all of the context information that might make more interesting security policies possible is available. This limits the check to only the current process SID and makes it impossible to limit the use of a capability on a per-object basis, as was possible in the DTOS system. The AVC reference in the task structure was used for these permission checks. An important note is that the capability permissions correspond to the capability definitions in linux/include/linux/capability.h. The implementation of the checking mechanism is dependent on the correct ordering of the permission definitions with respect to the capability definitions.
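One way such an ordering dependency can arise is when the permission bit is computed directly from the capability number. The sketch below assumes that scheme; the source does not say how the mapping is implemented, and the toy_ names, the capability numbers, and the per-task allowed mask (standing in for a policy lookup keyed on the current SID) are all hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical capability numbers in a fixed order. */
enum toy_cap {
    TOY_CAP_CHOWN = 0,
    TOY_CAP_KILL = 5,
    TOY_CAP_NET_ADMIN = 12
};

typedef uint32_t toy_access_vector;

/* Permission bit i stands for capability number i. This is correct only
 * while the permission definitions are listed in the same order as the
 * capability definitions. */
toy_access_vector toy_cap_to_perm(int cap)
{
    assert(cap >= 0 && cap < 32);
    return (toy_access_vector)1u << cap;
}

struct toy_task {
    uint32_t sid;                     /* the only context available */
    toy_access_vector allowed_caps;   /* stand-in for a policy lookup */
};

/* Toy capable(): no object is passed in, so the decision can depend only
 * on the current task. */
int toy_capable(const struct toy_task *current_task, int cap)
{
    return (current_task->allowed_caps & toy_cap_to_perm(cap)) != 0;
}
```

Under this assumed scheme, reordering either list without the other would silently check the wrong permission, which is why the two sets of definitions must be kept aligned. The signature of toy_capable also shows the per-object limitation: there is no parameter through which a target object could be named.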