Operating System Interfaces
An operating system manages and abstracts low-level hardware, shares physical resources among multiple programs, and provides controlled ways for programs to interact.
- Kernel: A special privileged program that provides core services to running programs.
- Process: A running program consisting of memory (instructions, data, and a stack) and private state managed by the kernel.
- System Call: A defined entry point in the operating system’s interface that transitions execution from user space to kernel space to perform privileged operations.
- Hardware Protection: The kernel utilizes CPU mechanisms to ensure processes access only their own memory and execute without hardware privileges.
To provide these services, the kernel manages isolated execution environments, forming the basis of processes.
Processes and Memory
The operating system time-shares hardware by transparently switching available CPUs among waiting processes, saving and restoring CPU registers during transitions.
- Process Identifier (PID): A unique integer the kernel associates with each process.
- Process Creation:
- `fork()` creates a new child process by exactly duplicating the parent’s memory contents.
- `fork()` returns 0 in the child process and the child’s PID in the parent process.
- The parent and child execute independently with different memory spaces and registers; changes in one do not affect the other.
- Process Execution:
- `exec(file, argv)` replaces the calling process’s memory with a new memory image loaded from a file (structured in the ELF format).
- `exec()` takes an executable filename and an array of string arguments. On success, execution starts at the binary’s declared entry point and `exec()` does not return to the calling program; it returns only if an error occurs.
- Process Termination and Synchronization:
- `exit(status)` stops the calling process and releases resources like memory and open files. A status of 0 conventionally indicates success, while 1 indicates failure.
- `wait(*status)` pauses the calling process until a child exits, returning the child’s PID and copying its exit status into the provided address.
- Memory Management:
- Most user-space memory is allocated implicitly during `fork()` and `exec()`.
- `sbrk(n)` grows a process’s data memory by `n` bytes dynamically at run-time and returns the location of the new memory.
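The fork, exec, and wait steps above can be sketched in C as follows. This is a minimal sketch, not a definitive implementation: it assumes a POSIX system and therefore uses `execvp()` and `waitpid()` in place of the `exec(file, argv)` and `wait(*status)` calls named in these notes, and `run_and_wait` is a hypothetical helper name introduced only for illustration.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper (not part of any library): run the program named by
 * argv[0] in a child process and return its exit status, or -1 on error. */
int run_and_wait(char *const argv[])
{
    pid_t pid = fork();
    if (pid < 0) {
        perror("fork");
        return -1;
    }
    if (pid == 0) {
        /* Child: fork() returned 0. Replace this memory image with the
         * new program. POSIX execvp() is assumed here. */
        execvp(argv[0], argv);
        /* Reached only if execvp() failed. */
        perror("execvp");
        _exit(127);
    }
    /* Parent: fork() returned the child's PID. Block until it exits. */
    int status;
    if (waitpid(pid, &status, 0) < 0) {
        perror("waitpid");
        return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Called with the argument vector `{"sh", "-c", "exit 3", NULL}`, the helper should return 3, since the parent reads back the status the child passed to exit.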
Processes require controlled mechanisms to interact with the outside world, which the operating system abstracts through a unified file descriptor interface.
I/O and File Descriptors
A file descriptor is a small integer acting as an index into a per-process table, representing a kernel-managed object such as a file, directory, device, or pipe.
- Standard Conventions: By default, processes read from file descriptor 0 (standard input), write to file descriptor 1 (standard output), and write errors to file descriptor 2 (standard error).
- Core I/O Operations:
- `read(fd, buf, n)` reads up to `n` bytes from `fd` into `buf`, advancing the file offset by the number of bytes read. It returns 0 to indicate the end of the file.
- `write(fd, buf, n)` writes `n` bytes from `buf` to `fd`, advancing the file offset sequentially.
- `close(fd)` releases a file descriptor for future reuse. Newly allocated file descriptors always use the lowest-numbered unused integer for the current process.
- I/O Redirection:
- `fork()` copies the parent’s file descriptor table to the child, granting the child the exact same open files.
- `exec()` replaces the process memory but completely preserves the file table.
- A shell redirects I/O by forking a child, closing standard file descriptors, opening specific files to claim those low-numbered descriptors, and then calling `exec()` to run the new program.
- Offset Sharing:
- Underlying file offsets are shared between file descriptors only if they were derived from the same original descriptor via `fork()` or `dup()`.
- `dup(fd)` duplicates an existing descriptor, returning a new one that refers to the same underlying I/O object and shares its offset.
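The redirection and offset-sharing behavior described above could look roughly like this in C. It is a sketch under stated assumptions: POSIX calls (`execvp()`, `waitpid()`, `lseek()`, and the `O_CREAT` flag spelling) are assumed, descriptor 0 is assumed to be open when the child runs, and both helper names are hypothetical.

```c
#include <fcntl.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run argv with standard output redirected to path.
 * Returns the child's exit status, or -1 on error. */
int run_with_stdout_to(const char *path, char *const argv[])
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        /* Child: free descriptor 1, then let open() claim the lowest
         * unused descriptor. Assumes descriptor 0 is still open. */
        close(STDOUT_FILENO);
        int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
        if (fd != STDOUT_FILENO)
            _exit(126);
        execvp(argv[0], argv);  /* the descriptor table survives exec */
        _exit(127);             /* reached only if execvp() failed */
    }
    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}

/* Hypothetical helper: write 6 bytes through a descriptor and 6 more
 * through its dup(), then report the offset seen through the original.
 * Returns that offset, or -1 on error. */
long shared_offset_after_dup(const char *path)
{
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return -1;
    int copy = dup(fd);
    if (copy < 0) {
        close(fd);
        return -1;
    }
    long result = -1;
    if (write(fd, "hello ", 6) == 6 && write(copy, "world\n", 6) == 6)
        result = (long)lseek(fd, 0, SEEK_CUR);
    close(copy);
    close(fd);
    return result;
}
```

Because the two descriptors in the second helper share one offset, the second write lands after the first rather than overwriting it, so the reported offset is 12. Two separate `open()` calls on the same path would not share an offset in this way.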
While file descriptors abstract access to static files and external devices, they also serve as the foundation for direct inter-process communication via pipes.
Pipes
A pipe is a small kernel buffer exposed to processes as a pair of file descriptors: one for reading and one for writing.
- Creation:
- `pipe(p)` creates the buffer and records the read descriptor in `p[0]` and the write descriptor in `p[1]`.
- Communication Flow:
- Writing data to the write end makes it available for reading at the read end.
- If no data is available, a read operation blocks until data is written or until all file descriptors referring to the write end are closed.
- If all write ends are closed, `read()` returns 0, signaling end-of-file. This requires processes to rigorously close unused write descriptors to prevent readers from waiting indefinitely.
- Advantages Over Temporary Files:
- Pipes automatically clean themselves up, whereas temporary files require explicit deletion.
- Pipes can pass arbitrarily long streams of data without being constrained by disk space.
- Pipes allow parallel execution of pipeline stages, unlike files which require the first program to finish before the second starts.
- Blocking reads and writes in pipes are significantly more efficient than non-blocking file semantics for inter-process communication.
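The communication flow above, including the need to close unused write ends, can be sketched as follows. This is an illustrative sketch assuming a POSIX system; `read_from_child` is a hypothetical helper name, not part of any library.

```c
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: a child writes msg into a pipe and the parent reads
 * until end-of-file into buf (capacity cap). Returns the number of bytes
 * read, or -1 if the pipe or the child could not be created. */
long read_from_child(const char *msg, char *buf, size_t cap)
{
    int p[2];
    if (pipe(p) < 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0) {
        close(p[0]);
        close(p[1]);
        return -1;
    }
    if (pid == 0) {
        close(p[0]);                      /* the child only writes */
        ssize_t w = write(p[1], msg, strlen(msg));
        close(p[1]);
        _exit(w == (ssize_t)strlen(msg) ? 0 : 1);
    }

    /* Parent: close its own copy of the write end. If it stayed open,
     * read() below would block forever instead of returning 0. */
    close(p[1]);

    size_t total = 0;
    ssize_t n;
    while (total < cap && (n = read(p[0], buf + total, cap - total)) > 0)
        total += (size_t)n;

    close(p[0]);
    waitpid(pid, NULL, 0);
    return (long)total;
}
```

The loop ends when `read()` returns 0, which happens only after the child has exited (closing its write end) and the parent has closed its own copy.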
Pipes and process I/O rely on the existence of a structured namespace and persistent storage, provided by the file system.
File System
The file system provides data files (uninterpreted byte arrays) and directories (named references to files and other directories), structured as a tree originating from a root directory.
- Path Resolution:
- Paths beginning with `/` are evaluated from the root directory.
- Paths not beginning with `/` are evaluated relative to the calling process’s current directory, which can be modified using `chdir(dir)`.
- Inodes and Links:
- Inode: The underlying physical file object that holds file metadata, including type (file, directory, or device), length, disk location, and the number of links.
- Link: An entry in a directory containing a filename and a reference to an inode.
- A single inode can have multiple links (names) pointing to it.
- File System Operations:
- `mkdir(dir)` creates a new directory.
- `open(file, O_CREATE)` creates a new data file.
- `mknod(file, major, minor)` creates a special device file that diverts I/O system calls directly to a kernel device implementation identified by major and minor numbers.
- `link(file1, file2)` creates a new name (`file2`) referring to the exact same inode as an existing file (`file1`).
- `unlink(file)` removes a name from the file system. The underlying inode and disk space are only freed when the file’s link count drops to 0 and no active file descriptors refer to it.
- `fstat(fd, *st)` and `stat(file, *st)` retrieve inode information into a `struct stat` object.
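The relationship between names and inodes can be observed by watching the link count change. The sketch below assumes a POSIX system, where the creation flag is spelled `O_CREAT` rather than the `O_CREATE` used in these notes and `mkdir()` takes a mode argument. The helper name `link_count_demo` and the file names `a` and `b` are hypothetical.

```c
#include <fcntl.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: create dir/a, link it as dir/b, then unlink dir/a,
 * recording the inode's link count after each step:
 *   counts[0] after creation, counts[1] after link(),
 *   counts[2] after unlink() of the first name.
 * Returns 0 on success, -1 on error. dir must not exist beforehand. */
int link_count_demo(const char *dir, long counts[3])
{
    char a[512], b[512];
    struct stat st;
    int ok = -1;

    if (mkdir(dir, 0755) < 0)
        return -1;
    snprintf(a, sizeof a, "%s/a", dir);
    snprintf(b, sizeof b, "%s/b", dir);

    int fd = open(a, O_CREAT | O_WRONLY, 0644);
    if (fd < 0) {
        rmdir(dir);
        return -1;
    }
    if (fstat(fd, &st) == 0) {
        counts[0] = (long)st.st_nlink;          /* one name: a */
        if (link(a, b) == 0 && fstat(fd, &st) == 0) {
            counts[1] = (long)st.st_nlink;      /* two names: a and b */
            if (unlink(a) == 0 && fstat(fd, &st) == 0) {
                counts[2] = (long)st.st_nlink;  /* one name left: b */
                ok = 0;
            }
        }
    }
    /* Clean up: remove any remaining names, then the directory. */
    unlink(a);
    unlink(b);
    close(fd);
    rmdir(dir);
    return ok;
}
```

On a file system that supports hard links, the recorded counts should be 1, 2, and 1: both names refer to the same inode, and removing one name leaves the file reachable through the other.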
The combination of these standardized abstractions—processes, file descriptors, and a unified file system—forms a general-purpose architecture that heavily influences modern operating systems.
Real World Context
The integration of standard file descriptors, pipes, and shell syntax originated in Unix and sparked a culture of general-purpose, reusable software tools.
- The Unix system call interface is standardized through the Portable Operating System Interface (POSIX) and persists in modern systems like BSD, Linux, and macOS.
- Modern operating systems extend these foundations with networking, user-level threads, and windowing systems, but still rely on the core concepts of multiplexing hardware, isolating processes, and enabling controlled communication.
- While Unix unifies access to files, directories, and devices through file descriptors, alternative models exist, such as Multics (which abstracted file storage to look like memory) and Plan 9 (which extended the file abstraction to networks and graphics).