class: center, middle

# File Systems

## Alan Cox and Scott Rixner

---
layout: true

---

# File Systems

- An abstraction layer between applications and storage
- A hierarchical way to store data on storage
- *Persistent* storage of data

---

# Persistence

- Data is stored on a non-volatile medium
- Data is not lost when the program exits
- Data is not lost when the system is powered off, rebooted, or crashes

---

# Operating System Buffer Cache

- Disks are slow compared to memory
- The OS *buffer cache* is an in-memory cache of file system data
- The buffer cache reduces disk accesses for file system operations
- Writes to the file system do *not* immediately go to the disk
- The buffer cache is shared among all processes

--

File writes are not immediately persistent!

---

# Network File Systems

- File systems that are accessed over a network
- The OS buffer cache is used to reduce network traffic
- Multiple systems can access the same file system
  - They may temporarily have inconsistent views of the file system
  - They need to be synchronized to ensure consistency

--

File writes are not immediately persistent!

---

# Consistency

- Ensuring that the data is in a valid state at all times
- The file system does not know the *meaning* of the data
- Applications must ensure consistency with respect to application state and multiple files

---

# Ordering

- Why does the data file need to be written before the map file in the checkpointing assignment?
- What would happen if it wasn't?
- While ordering isn't necessary for persistence, it may be necessary for consistency

---

# Synchronization

```
int fsync(int fd);
void sync(void);
int msync(void *addr, size_t length, int flags);
```

- `fsync(2)` forces the file system to write the buffered data of one open file (`fd`) to disk
- `sync(2)` forces the file system to write all buffered data to disk
- `msync(2)` forces the file system to write data to disk for a specific memory-mapped region

---

# Exercises

- Add synchronization calls to producer/consumer programs
- Use memory-mapped I/O, Unix I/O, and standard I/O
- Compare the results of different combinations
  - Using different types of I/O
  - With and without synchronization
  - On the same machine and on different machines
- Things to consider:
  - When and why is synchronization necessary?
  - When and why do you need to close and reopen files?

---
class: center, middle

# Get Started!
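---

# Appendix: Ordering Writes with `fsync(2)`

A minimal sketch of the ordering idea from the Ordering and Synchronization slides, using Unix I/O. The helper names `write_and_sync` and `checkpoint`, the file layout, and the permission bits are assumptions made for illustration; they are not taken from the checkpointing assignment.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: write `len` bytes from `buf` to the file `path`,
 * then call fsync(2) so the buffered data is pushed out of the OS buffer
 * cache before returning.  Returns 0 on success and -1 on error.
 */
int
write_and_sync(const char *path, const void *buf, size_t len)
{
	const char *p = buf;
	int fd;

	assert(path != NULL && buf != NULL);
	fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
	if (fd < 0)
		return (-1);
	/* write(2) may write fewer bytes than requested, so loop. */
	while (len > 0) {
		ssize_t n = write(fd, p, len);

		if (n < 0) {
			close(fd);
			return (-1);
		}
		p += n;
		len -= (size_t)n;
	}
	/* Without this call, the data may still be only in memory. */
	if (fsync(fd) < 0) {
		close(fd);
		return (-1);
	}
	return (close(fd));
}

/*
 * Hypothetical checkpoint routine: the data file is written and synced
 * first, and the map file is written only after that succeeds.  If a
 * crash happens between the two steps, the map file on disk has not yet
 * been replaced by one that describes unwritten data.
 */
int
checkpoint(const char *data_path, const void *data, size_t data_len,
    const char *map_path, const void *map, size_t map_len)
{
	if (write_and_sync(data_path, data, data_len) < 0)
		return (-1);
	return (write_and_sync(map_path, map, map_len));
}
```

- A caller might use `checkpoint("data.bin", d, dlen, "map.bin", m, mlen)` and treat a return value of `-1` as a failed checkpoint
- Removing the `fsync(2)` call keeps the program order of the writes, but no longer constrains when each file reaches the disk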