Lab 10 - Memory Mapped I/O

Lab goals:

Understand the typical use cases and advantages of mmap().
Gain experience with different mmap() uses.

The `mmap()` System Call

mmap() is a system call that can be used by a user process to ask the operating system kernel to map either files or devices into the memory (i.e., address space) of that process. The mmap() system call can also be used to allocate memory (an anonymous mapping). A key point here is that the mapped pages are not actually brought into physical memory until they are referenced; thus mmap() can be used to implement lazy loading of pages into memory (demand paging).

The following figure illustrates the use of the mmap() system call:

Here is the function prototype for mmap():

void *mmap(void *addr, size_t length, int prot, int flags, int fd, off_t offset);

There are six arguments to the mmap() system call:

addr - A hint to the operating system kernel as to the address at which the virtual mapping should start in the virtual memory (i.e., the virtual address space) of the process. The value can be specified as NULL to indicate that the kernel can place the virtual mapping anywhere it sees fit. If not NULL, then addr should be a multiple of the page size.
length - The length (number of bytes) for the mapping.
prot - The protection for the mapped memory. The value of prot is the bitwise OR of one or more of the following single-bit values:
- PROT_READ - Enable the contents of the mapped memory to be readable by the process.
- PROT_WRITE - Enable the contents of the mapped memory to be writable by the process.
- PROT_EXEC - Enable the contents of the mapped memory to be executable by the process as CPU machine instructions.
flags - Various options controlling the mapping. Some of the more common flags values are described below:
- MAP_ANONYMOUS (or MAP_ANON) - Allocate anonymous memory; the pages are not backed by any file.
- MAP_FILE - The default setting; it need not be specified. The mapped region is backed by a regular file.
- MAP_FIXED - Don't interpret addr as a hint: place the mapping at exactly that address, which must be a multiple of the page size.
- MAP_PRIVATE - Modifications to the mapped memory region are not visible to other processes mapping the same file.
- MAP_SHARED - Modifications to the mapped memory region are visible to other processes mapping the same file and are eventually reflected in the file.
fd - The open file descriptor for the file from which to populate the memory region. If MAP_ANONYMOUS is specified, then fd should be given as -1.
offset - If this is not an anonymous mapping, the memory-mapped region will be populated with data starting at position offset bytes from the beginning of the file open as file descriptor fd. The offset must be a multiple of the page size.
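
The difference between MAP_SHARED and MAP_PRIVATE can be seen with a small experiment. The following is only a sketch, not part of the lab's starter code: the function name file_byte_after_store() and the scratch file name are hypothetical, and it assumes a Linux system with a writable /tmp directory.

```c
#define _DEFAULT_SOURCE         // Expose mkstemp() and pread() in strict modes.
#include <sys/mman.h>

#include <stdlib.h>
#include <unistd.h>

/*
 * Requires:
 *   "flags" is either MAP_SHARED or MAP_PRIVATE.
 *
 * Effects:
 *   Creates a scratch file holding the single byte 'a', maps it using
 *   "flags", stores 'b' through the mapping, and returns the byte that the
 *   file itself then contains.  Returns -1 on any error.
 */
int
file_byte_after_store(int flags)
{
        char path[] = "/tmp/mmap_flags_XXXXXX"; // Hypothetical scratch name.
        char byte;
        char *p;
        int fd, result = -1;

        fd = mkstemp(path);
        if (fd < 0)
                return (-1);
        (void)unlink(path);     // The file persists until fd is closed.
        if (write(fd, "a", 1) != 1)
                goto done;

        p = mmap(NULL, 1, PROT_READ | PROT_WRITE, flags, fd, 0);
        if (p == MAP_FAILED)
                goto done;
        p[0] = 'b';
        (void)msync(p, 1, MS_SYNC);     // Flush a shared mapping to the file.
        (void)munmap(p, 1);

        // Read the file through the descriptor, bypassing the mapping.
        if (pread(fd, &byte, 1, 0) == 1)
                result = byte;
done:
        (void)close(fd);
        return (result);
}
```

Called with MAP_SHARED, the function should return 'b', because the store reaches the file. Called with MAP_PRIVATE, it should return 'a', because the store modifies only a private copy of the page.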

On success, mmap() returns a pointer to the mapped area. On error, the value MAP_FAILED (that is, (void *)(-1)) is returned and errno is set to indicate the reason. Full details on the mmap() system call are available using the man mmap command.
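
As an illustration of the return value convention, here is a minimal sketch of an anonymous mapping used to allocate memory. The helper name alloc_pages() is hypothetical. Note that the result is compared against MAP_FAILED, not NULL.

```c
#define _DEFAULT_SOURCE         // Expose MAP_ANONYMOUS in strict modes.
#include <sys/mman.h>

#include <assert.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Requires:
 *   npages > 0.
 *
 * Effects:
 *   Maps "npages" pages of zero-filled anonymous memory that is readable
 *   and writable.  Returns a pointer to the memory, or NULL (after printing
 *   the reason to stderr) if mmap() fails.
 */
void *
alloc_pages(size_t npages)
{
        size_t length = npages * (size_t)sysconf(_SC_PAGESIZE);
        void *p;

        assert(npages > 0);
        p = mmap(NULL, length, PROT_READ | PROT_WRITE,
            MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if (p == MAP_FAILED) {
                perror("mmap");         // Reports the reason from errno.
                return (NULL);
        }
        return (p);
}
```

A caller might write char *p = alloc_pages(2); and later release the memory with munmap(), passing the same address and two times the page size.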

The `munmap()` System Call

munmap() is a system call used to unmap memory previously mapped with mmap(). The call removes the mapping for the memory from the address space of the calling process:

int munmap(void *addr, size_t length);

There are two arguments to the munmap() system call:

addr - The starting address of the memory to unmap from the calling process's address space. Must be a multiple of the page size.
length - The length (number of bytes) of the memory to unmap. It need not be a multiple of the page size; every page that contains any part of the range is unmapped.

On success, munmap() returns 0. On failure, it returns -1 and sets errno to indicate the reason. After a successful call, any further access to the unmapped memory area results in a segmentation fault (SIGSEGV).
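
The error reporting of munmap() can be observed by deliberately passing an address that is not page aligned. The sketch below uses a hypothetical function name, misaligned_unmap_errno(), and assumes Linux, where this mistake is reported as EINVAL.

```c
#define _DEFAULT_SOURCE         // Expose MAP_ANONYMOUS in strict modes.
#include <sys/mman.h>

#include <errno.h>
#include <unistd.h>

/*
 * Effects:
 *   Maps one anonymous page and calls munmap() on an address one byte past
 *   the start of that page.  Returns the errno value set by that call, or 0
 *   if the call unexpectedly succeeded.  Returns -1 if the page could not
 *   be mapped.
 */
int
misaligned_unmap_errno(void)
{
        size_t pagesize = (size_t)sysconf(_SC_PAGESIZE);
        char *p;
        int error = 0;

        p = mmap(NULL, pagesize, PROT_READ | PROT_WRITE,
            MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if (p == MAP_FAILED)
                return (-1);

        // The address p + 1 is not a multiple of the page size.
        if (munmap(p + 1, pagesize) < 0)
                error = errno;

        // Unmap correctly, using the page-aligned address.
        (void)munmap(p, pagesize);
        return (error);
}
```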

Advantages of Using `mmap()`

There are many advantages to using mmap() to gain access to the contents of some file on disk.

One advantage of using mmap() is lazy loading. If no memory within a certain page is ever referenced, then that page is never loaded into physical memory. This can be crucial in certain applications in terms of saving both memory and time.
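
Lazy loading can be sketched with an anonymous mapping that is much larger than the memory actually used. In the hypothetical function touch_sparsely() below, only the pages that are written should ever need physical memory; the rest of the mapping exists only as address space. Tools such as top can be used to check the resident size of a process, but that check is not part of this sketch.

```c
#define _DEFAULT_SOURCE         // Expose MAP_ANONYMOUS in strict modes.
#include <sys/mman.h>

#include <assert.h>
#include <unistd.h>

/*
 * Requires:
 *   "stride" is at least the page size, so that every store lands on a
 *   different page.
 *
 * Effects:
 *   Maps "length" bytes of anonymous memory, stores one byte every "stride"
 *   bytes, and unmaps the memory.  Returns the number of pages that were
 *   touched, or -1 if mmap() fails.
 */
long
touch_sparsely(size_t length, size_t stride)
{
        char *p;
        long touched = 0;

        assert(stride >= (size_t)sysconf(_SC_PAGESIZE));
        p = mmap(NULL, length, PROT_READ | PROT_WRITE,
            MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if (p == MAP_FAILED)
                return (-1);

        // Each store faults in one page; untouched pages are never loaded.
        for (size_t off = 0; off < length; off += stride) {
                p[off] = 1;
                touched++;
        }

        (void)munmap(p, length);
        return (touched);
}
```

For example, touch_sparsely((size_t)64 << 20, (size_t)16 << 20) maps 64 MiB but touches only 4 pages.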

Another advantage of mmap() is speed. Traditional I/O involves many system calls (e.g., calls to read()) to load data into memory. Each call has a fixed cost, such as the switch between user mode and kernel mode and the error checking within the call itself. The data also passes through several layers of software abstraction, being copied between buffers in the operating system before finally being placed in the process's memory, which slows down the program. With mmap(), the process instead accesses the kernel's cached pages of the file directly, avoiding both the repeated system calls and the extra copying.
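
To make the contrast concrete, the sketch below counts the newline characters in a file in two ways. The function names count_newlines_read() and count_newlines_mmap() are hypothetical. Both should return the same count; they differ only in how the bytes reach the process. No timing is measured here, and any actual speed difference depends on the system and the file.

```c
#include <sys/mman.h>
#include <sys/stat.h>

#include <fcntl.h>
#include <unistd.h>

/*
 * Effects:
 *   Counts the newlines in the named file using repeated read() calls,
 *   each of which copies data into "buf".  Returns -1 on error.
 */
long
count_newlines_read(const char *path)
{
        char buf[4096];
        ssize_t n;
        long count = 0;
        int fd;

        fd = open(path, O_RDONLY);
        if (fd < 0)
                return (-1);
        while ((n = read(fd, buf, sizeof(buf))) > 0) {
                for (ssize_t i = 0; i < n; i++) {
                        if (buf[i] == '\n')
                                count++;
                }
        }
        (void)close(fd);
        return (n < 0 ? -1 : count);
}

/*
 * Effects:
 *   Counts the newlines in the named file using a single mmap() call and
 *   ordinary memory accesses.  Returns -1 on error.
 */
long
count_newlines_mmap(const char *path)
{
        struct stat sb;
        const char *p;
        long count = 0;
        int fd;

        fd = open(path, O_RDONLY);
        if (fd < 0)
                return (-1);
        if (fstat(fd, &sb) < 0) {
                (void)close(fd);
                return (-1);
        }
        if (sb.st_size == 0) {
                // A mapping of length zero is an error, so stop here.
                (void)close(fd);
                return (0);
        }
        p = mmap(NULL, (size_t)sb.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
        (void)close(fd);        // The mapping remains valid after close().
        if (p == MAP_FAILED)
                return (-1);
        for (off_t i = 0; i < sb.st_size; i++) {
                if (p[i] == '\n')
                        count++;
        }
        (void)munmap((void *)p, (size_t)sb.st_size);
        return (count);
}
```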

Finally, mmap() has the advantage of being able to support versatile dynamic memory allocation. In particular, malloc() cannot safely be called in a signal handler, whereas mmap() can, using anonymous mappings. Additionally, malloc() itself can be made more versatile by using mmap() to allocate memory rather than simply raising the process's break by using sbrk(). With sbrk(), memory may be freed only at the top of the heap (i.e., by lowering the break). There is no such restriction if memory is allocated using mmap(). As such, memory in the middle of the heap may be freed at will (using munmap()).
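
The last point can be sketched as follows. The hypothetical function free_middle_page() maps three pages, returns only the middle one to the operating system, and checks that the two outer pages are still usable.

```c
#define _DEFAULT_SOURCE         // Expose MAP_ANONYMOUS in strict modes.
#include <sys/mman.h>

#include <unistd.h>

/*
 * Effects:
 *   Maps three anonymous pages, unmaps only the middle page, and verifies
 *   that the first and third pages still hold the values stored in them.
 *   Returns 0 if so, and -1 otherwise or on any error.
 */
int
free_middle_page(void)
{
        size_t pagesize = (size_t)sysconf(_SC_PAGESIZE);
        char *p;
        int ok;

        p = mmap(NULL, 3 * pagesize, PROT_READ | PROT_WRITE,
            MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if (p == MAP_FAILED)
                return (-1);
        p[0] = 'A';
        p[2 * pagesize] = 'C';

        // Free the middle page; this is not possible with sbrk().
        if (munmap(p + pagesize, pagesize) < 0) {
                (void)munmap(p, 3 * pagesize);
                return (-1);
        }

        // The outer pages are unaffected by the hole between them.
        ok = (p[0] == 'A' && p[2 * pagesize] == 'C');

        (void)munmap(p, pagesize);
        (void)munmap(p + 2 * pagesize, pagesize);
        return (ok ? 0 : -1);
}
```

After the middle page is unmapped, any access to it would result in a segmentation fault, so the sketch deliberately never touches it again.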

An `mmap()` Example

The following simple, but buggy, example program appears in your lab repo as exercise1.c. This program demonstrates a typical use case of mmap() for reading data from a file (the #include lines in this example have been omitted here for simplicity of presentation).

/*
 * Requires:
 *   This program's source code must reside in the current working directory
 *   in a readable file named "exercise1.c".
 *
 * Effects:
 *   Reads this program's source code, via mmap(), and prints that source
 *   code to stdout.
 */
int
main(void)
{
        struct stat stat;
        int fd, size;
        char *buf;

        // Open the file and get its size.
        fd = open("exercise1.c", O_RDONLY);
        if (fd < 0 || fstat(fd, &stat) < 0) {
                fprintf(stderr, "open() or fstat() failed.\n");
                return (1);
        }
        size = stat.st_size;

        // Map the file to a new virtual memory area.
        buf = mmap(NULL, size, PROT_NONE, MAP_PRIVATE, fd, 0);
        if (buf == MAP_FAILED) {
                fprintf(stderr, "mmap() failed.\n");
                return (1);
        }

        // Print out the contents of the file to stdout.
        for (int i = 0; i < size; i++) {
                printf("%c", buf[i]);
        }

        // Clean up.  Ignore the return values because we exit anyway.
        (void)munmap(buf, size);
        (void)close(fd);

        return (0);
}

This program calls the open() system call to open its own source code file (the file exercise1.c) and then gets the size (in bytes) of this file by calling the fstat() system call. The fstat() system call returns a number of different pieces of information about the status of the file open on the given file descriptor. (The stat() system call is identical to fstat(), except that stat() takes the name of the file as its first argument rather than a file descriptor open to that file, as in fstat().) The program then uses mmap() to map that file into the process's memory, and then prints out the contents of the file from the memory into which the file has been mapped.
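
The relationship between stat() and fstat() can be sketched with two hypothetical helper functions, size_by_name() and size_by_open_file(), which should report the same size for the same file.

```c
#include <sys/stat.h>

#include <fcntl.h>
#include <unistd.h>

/*
 * Effects:
 *   Returns the size in bytes of the named file, using stat(), or -1 on
 *   error.
 */
off_t
size_by_name(const char *path)
{
        struct stat sb;

        if (stat(path, &sb) < 0)
                return (-1);
        return (sb.st_size);
}

/*
 * Effects:
 *   Returns the size in bytes of the named file, by opening it and using
 *   fstat() on the resulting file descriptor, or -1 on error.
 */
off_t
size_by_open_file(const char *path)
{
        struct stat sb;
        off_t size = -1;
        int fd;

        fd = open(path, O_RDONLY);
        if (fd < 0)
                return (-1);
        if (fstat(fd, &sb) == 0)
                size = sb.st_size;
        (void)close(fd);
        return (size);
}
```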

GitHub Repository for This Lab

To obtain your private repo for this lab, please point your browser to this link for the starter code for the lab. Follow the same steps as for previous labs and assignments to create your repository on GitHub and then to clone it onto CLEAR. The directory for your repository for this lab will be

lab-10-memory-mapped-i-o-name

where name is your GitHub userid.

Submission

Be sure to git push the appropriate C source files for this lab before 11:55 PM tonight, to get credit for this lab.

COMP 321: Introduction to Computer Systems
