Lab 10 - Memory Mapped I/O

Lab goals:


The mmap() System Call

mmap() is a system call that can be used by a user process to ask the operating system kernel to map either files or devices into the memory (i.e., address space) of that process. The mmap() system call can also be used to allocate memory (an anonymous mapping). A key point here is that the mapped pages are not actually brought into physical memory until they are referenced; thus mmap() can be used to implement lazy loading of pages into memory (demand paging).

The following figure illustrates the use of the mmap() system call:

[Figure: mmap]

Here is the function prototype for mmap():

void *mmap(void *addr, size_t length, int prot, int flags, int fd, off_t offset);

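There are six arguments to the mmap() system call:

addr: a hint for the starting address of the new mapping; if addr is NULL, the kernel chooses the address.
length: the length of the mapping, in bytes (must be greater than 0).
prot: the memory protection for the mapping, either PROT_NONE or the bitwise OR of one or more of PROT_READ, PROT_WRITE, and PROT_EXEC.
flags: whether changes to the mapping are shared with other processes (MAP_SHARED) or private to this process (MAP_PRIVATE), optionally ORed with other flags such as MAP_ANONYMOUS.
fd: a file descriptor open on the file to be mapped (for an anonymous mapping, conventionally -1).
offset: the offset within the file at which the mapping starts; it must be a multiple of the page size.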

On success, mmap() returns a pointer to the mapped area. On error, the value MAP_FAILED (that is, (void *)(-1)) is returned and errno is set to indicate the reason. Full details on the mmap() system call are available using the man mmap command.
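As a rough illustration of how the six arguments fit together, the following minimal sketch creates an anonymous mapping (memory that is not backed by any file). The helper name map_anonymous() is hypothetical and is not part of the lab's starter code; the sketch also assumes a system that provides MAP_ANONYMOUS (e.g., Linux).

```c
#define _DEFAULT_SOURCE         // Exposes MAP_ANONYMOUS on glibc.
#include <stddef.h>
#include <sys/mman.h>

/*
 * Hypothetical helper, for illustration only.
 *
 * Effects:
 *   Maps "length" bytes of zero-filled anonymous memory that is readable
 *   and writable.  Returns a pointer to the memory, or NULL on failure
 *   (in which case mmap() has set errno).
 */
char *
map_anonymous(size_t length)
{
        void *p;

        // addr = NULL lets the kernel choose the address.  fd = -1 and
        // offset = 0 because no file backs an anonymous mapping.
        p = mmap(NULL, length, PROT_READ | PROT_WRITE,
            MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

        // An error is signaled by MAP_FAILED, not by NULL.
        if (p == MAP_FAILED)
                return (NULL);
        return (p);
}
```

A caller might request one page with map_anonymous(4096), use the memory like any other array of bytes, and later release it with munmap().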

The munmap() System Call

munmap() is a system call used to unmap memory previously mapped with mmap(). The call removes the mapping for the memory from the address space of the calling process:

int munmap(void *addr, size_t length);

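There are two arguments to the munmap() system call:

addr: the starting address of the region to unmap; it must be a multiple of the page size.
length: the length, in bytes, of the region to unmap.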

On success, munmap() returns 0; on failure, it returns -1 and sets errno to indicate the reason. After a successful call, any further access to the unmapped memory area results in a segmentation fault (SIGSEGV).

Advantages of Using mmap()

There are many advantages to using mmap() to gain access to the contents of some file on disk.

One advantage of using mmap() is lazy loading. If no memory within a certain page is ever referenced, then that page is never loaded into physical memory. This can be crucial in certain applications in terms of saving both memory and time.
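Lazy loading can be observed directly on systems that provide the mincore() call, which reports whether a page is currently resident in physical memory. The following sketch assumes Linux, where the third argument of mincore() is an unsigned char array; the helper name page_is_resident() is hypothetical.

```c
#define _DEFAULT_SOURCE         // Exposes mincore() on glibc.
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper, for illustration only (assumes Linux).
 *
 * Effects:
 *   Returns 1 if the page containing "addr" is resident in physical
 *   memory, 0 if it is not, and -1 on error.
 */
int
page_is_resident(void *addr)
{
        long pagesize = sysconf(_SC_PAGESIZE);
        unsigned char vec;
        void *page;

        if (pagesize <= 0)
                return (-1);

        // mincore() requires a page-aligned starting address, so round
        // the address down to the start of its page.
        page = (void *)((uintptr_t)addr & ~((uintptr_t)pagesize - 1));
        if (mincore(page, (size_t)pagesize, &vec) == -1)
                return (-1);

        // The least significant bit is set if the page is resident.
        return (vec & 1);
}
```

Calling page_is_resident() on a page of a fresh mapping before and after the first access to that page typically shows the page becoming resident only once it has been touched.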

Another advantage of mmap() is speed. Traditional I/O involves many system calls (e.g., calls to read()) to load data into memory. Each call has a cost: the process must trap into the kernel and return, and the kernel must check the arguments on every call. The data also passes through several layers of software, typically being copied from the kernel's buffers into the user's buffer before the program can use it, which slows the program down. With mmap(), the process accesses the mapped pages directly, so it avoids both the per-call overhead and the extra copy.
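For contrast, here is a minimal sketch of the traditional approach: a read() loop that counts the newline characters in a file. Every pass through the loop is one system call that copies data into buf. The helper name count_newlines_read() and the 4096-byte buffer size are assumptions made for this illustration.

```c
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper, for illustration only.
 *
 * Effects:
 *   Counts the newline characters in the file named "path" using a read()
 *   loop.  Stores the number of read() calls made in "*ncalls".  Returns
 *   the count, or -1 on error.
 */
long
count_newlines_read(const char *path, long *ncalls)
{
        char buf[4096];
        long count = 0;
        ssize_t n;
        int fd;

        *ncalls = 0;
        fd = open(path, O_RDONLY);
        if (fd < 0)
                return (-1);
        for (;;) {
                // One system call, and one copy into buf, per iteration.
                n = read(fd, buf, sizeof(buf));
                (*ncalls)++;
                if (n <= 0)
                        break;
                for (ssize_t i = 0; i < n; i++) {
                        if (buf[i] == '\n')
                                count++;
                }
        }
        (void)close(fd);
        return (n < 0 ? -1 : count);
}
```

The number of read() calls grows with the size of the file, whereas a mapped file is set up with a single mmap() call and then accessed with ordinary loads.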

Finally, mmap() has the advantage of being able to support versatile dynamic memory allocation. In particular, malloc() cannot safely be called in a signal handler, whereas mmap(), used to create an anonymous mapping, is a direct system call and in practice can be (although POSIX does not formally list mmap() as async-signal-safe). Additionally, malloc() itself can be made more versatile by using mmap() to allocate memory rather than simply raising the process's break by using sbrk(). With sbrk(), memory may be freed only at the top of the heap (i.e., by lowering the break). However, there is no such restriction if memory is allocated using mmap(). As such, memory in the middle of the heap may be freed at will (using munmap()).
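The point about freeing memory in the middle of a region can be sketched as follows: map three pages, then unmap only the middle one. The helper name map_three_free_middle() is hypothetical, and the sketch assumes a system that provides MAP_ANONYMOUS (e.g., Linux).

```c
#define _DEFAULT_SOURCE         // Exposes MAP_ANONYMOUS on glibc.
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper, for illustration only.
 *
 * Effects:
 *   Maps three contiguous pages of anonymous memory and then unmaps only
 *   the middle page.  Stores the page size in "*pagesize_out" and returns
 *   the address of the first page, or returns NULL on failure.
 */
char *
map_three_free_middle(size_t *pagesize_out)
{
        long ps = sysconf(_SC_PAGESIZE);
        char *base;

        if (ps <= 0)
                return (NULL);
        base = mmap(NULL, 3 * (size_t)ps, PROT_READ | PROT_WRITE,
            MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if (base == MAP_FAILED)
                return (NULL);

        // Release only the middle page.  The first and third pages
        // remain mapped and usable.
        if (munmap(base + ps, (size_t)ps) == -1) {
                (void)munmap(base, 3 * (size_t)ps);
                return (NULL);
        }
        *pagesize_out = (size_t)ps;
        return (base);
}
```

After the call, base[0] and base[2 * pagesize] can still be read and written, while any access to the middle page faults. The same operation is not possible with sbrk(), which can only move the top of the heap.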

An mmap() Example

The following simple, but buggy, example program appears in your lab repo as exercise1.c. This program demonstrates a typical use case of mmap() for reading data from a file (the #include lines in this example have been omitted here for simplicity of presentation).

/*
 * Requires:
 *   This program's source code must reside in the current working directory
 *   in a readable file named "exercise1.c".
 *
 * Effects:
 *   Reads this program's source code, via mmap(), and prints that source
 *   code to stdout.
 */
int
main(void)
{
        struct stat stat;
        int fd, size;
        char *buf;

        // Open the file and get its size.
        fd = open("exercise1.c", O_RDONLY);
        if (fd < 0 || fstat(fd, &stat) < 0) {
                fprintf(stderr, "open() or fstat() failed.\n");
                return (1);
        }
        size = stat.st_size;

        // Map the file to a new virtual memory area.
        buf = mmap(NULL, size, PROT_NONE, MAP_PRIVATE, fd, 0);
        if (buf == MAP_FAILED) {
                fprintf(stderr, "mmap() failed.\n");
                return (1);
        }

        // Print out the contents of the file to stdout.
        for (int i = 0; i < size; i++) {
                printf("%c", buf[i]);
        }

        // Clean up.  Ignore the return values because we exit anyway.
        (void)munmap(buf, size);
        (void)close(fd);

        return (0);
}

This program calls the open() system call to open its own source code file (the file exercise1.c) and then gets the size (in bytes) of this file by calling the fstat() system call. The fstat() system call returns a number of different pieces of information about the status of the file open on the given file descriptor. (The stat() system call is identical to fstat(), except that stat() takes the name of the file as its first argument rather than a file descriptor open to that file, as in fstat().) The program then uses mmap() to map that file into the process's memory, and then prints out the contents of the file from the memory into which the file has been mapped.
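The relationship between fstat() and stat() can be sketched with two small helpers that return the same information by the two routes. The names size_by_open() and size_by_name() are hypothetical; note that the st_size field has type off_t.

```c
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper, for illustration only.
 *
 * Effects:
 *   Returns the size in bytes of the file named "path", using stat() on
 *   the file's name.  Returns -1 on error.
 */
off_t
size_by_name(const char *path)
{
        struct stat sb;

        if (stat(path, &sb) < 0)
                return (-1);
        return (sb.st_size);
}

/*
 * Hypothetical helper, for illustration only.
 *
 * Effects:
 *   Returns the size in bytes of the file named "path", by opening the
 *   file and using fstat() on the open file descriptor.  Returns -1 on
 *   error.
 */
off_t
size_by_open(const char *path)
{
        struct stat sb;
        off_t size;
        int fd;

        fd = open(path, O_RDONLY);
        if (fd < 0)
                return (-1);
        size = (fstat(fd, &sb) < 0) ? -1 : sb.st_size;
        (void)close(fd);
        return (size);
}
```

Both helpers report the same size for the same file. A program that needs the file open anyway, as the example above does before calling mmap(), would normally use fstat() on the descriptor it already has.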


GitHub Repository for This Lab

To obtain your private repo for this lab, please point your browser to this link for the starter code for the lab. Follow the same steps as for previous labs and assignments to create your repository on GitHub and then to clone it onto CLEAR. The directory for your repository for this lab will be

lab-10-memory-mapped-i-o-name

where name is your GitHub userid.


Submission

Be sure to git push the appropriate C source files for this lab before 11:55 PM tonight, to get credit for this lab.