Lab 10 - Memory Mapped I/O
Lab goals:
- Understand the typical use cases and advantages of mmap().
- Gain experience with different mmap() uses.
The mmap() System Call
mmap() is a system call that a user process can use to ask the operating system kernel to map either files or devices into the memory (i.e., address space) of that process. The mmap() system call can also be used to allocate memory (an anonymous mapping). A key point here is that the mapped pages are not actually brought into physical memory until they are referenced; thus mmap() can be used to implement lazy loading of pages into memory (demand paging).
The following figure illustrates the use of the mmap() system call:
Here is the function prototype for mmap():
void *mmap(void *addr, size_t length, int prot, int flags, int fd, off_t offset);
There are six arguments to the mmap() system call:
- addr - A hint to the operating system kernel as to the address at which the virtual mapping should start in the virtual memory (i.e., the virtual address space) of the process. The value can be specified as NULL to indicate that the kernel can place the virtual mapping anywhere it sees fit. If not NULL, then addr should be a multiple of the page size.
- length - The length (number of bytes) for the mapping.
- prot - The protection for the mapped memory. The value of prot is the bitwise OR of one or more of the following single-bit values:
  - PROT_READ - Enable the contents of the mapped memory to be readable by the process.
  - PROT_WRITE - Enable the contents of the mapped memory to be writable by the process.
  - PROT_EXEC - Enable the contents of the mapped memory to be executable by the process as CPU machine instructions.
- flags - Various options controlling the mapping. Some of the more common flags values are described below:
  - MAP_ANONYMOUS (or MAP_ANON) - Allocate anonymous memory; the pages are not backed by any file.
  - MAP_FILE - The default setting; it need not be specified. The mapped region is backed by a regular file.
  - MAP_FIXED - Don't interpret addr as a hint: place the mapping at exactly that address, which must be a multiple of the page size.
  - MAP_PRIVATE - Modifications to the mapped memory region are not visible to other processes mapping the same file and are not carried through to the file.
  - MAP_SHARED - Modifications to the mapped memory region are visible to other processes mapping the same file and are eventually reflected in the file.
- fd - The open file descriptor for the file from which to populate the memory region. If MAP_ANONYMOUS is specified, then fd should be given as -1.
- offset - If this is not an anonymous mapping, the memory mapped region will be populated with data starting at position offset bytes from the beginning of the file open as file descriptor fd. Should be a multiple of the page size.
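The difference between MAP_SHARED and MAP_PRIVATE can be seen with a small experiment. The following is only a sketch under some assumptions: the helper name value_seen_by_parent() is made up for illustration, and it uses an anonymous mapping inherited across fork() instead of two processes mapping the same file, so that it needs no file on disk.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE /* Assumption: glibc; exposes MAP_ANONYMOUS under strict -std= modes. */
#endif
#include <assert.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of the lab code): maps one int of anonymous
 * memory with the given sharing flag (MAP_SHARED or MAP_PRIVATE), forks,
 * lets the child store 42 in the mapping, and returns the value that the
 * parent observes afterwards.  Returns -1 if mmap(), fork(), or waitpid()
 * fails.
 */
static int
value_seen_by_parent(int sharing_flag)
{
    int *p, seen;
    pid_t pid;

    assert(sharing_flag == MAP_SHARED || sharing_flag == MAP_PRIVATE);
    p = mmap(NULL, sizeof(int), PROT_READ | PROT_WRITE,
        sharing_flag | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return (-1);
    *p = 0;
    pid = fork();
    if (pid == 0) {
        *p = 42;    /* Child: modify the mapped memory, then exit. */
        _exit(0);
    }
    if (pid < 0 || waitpid(pid, NULL, 0) < 0)
        seen = -1;
    else
        seen = *p;  /* Parent: read the value after the child is done. */
    (void)munmap(p, sizeof(int));
    return (seen);
}
```

Called with MAP_SHARED, the helper should return 42, because parent and child share the same pages. Called with MAP_PRIVATE, it should return 0, because the child's store goes to its own private copy of the page.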
On success, mmap() returns a pointer to the mapped area. On error, the value MAP_FAILED (that is, (void *)(-1)) is returned and errno is set to indicate the reason. Full details on the mmap() system call are available using the man mmap command.
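As an illustration of the calling convention just described, here is a minimal sketch of an anonymous mapping with error checking. The helper name alloc_anonymous() is an assumption made for this example; it is not part of the lab code. Note that the return value is compared against MAP_FAILED, not against NULL.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE /* Assumption: glibc; exposes MAP_ANONYMOUS under strict -std= modes. */
#endif
#include <assert.h>
#include <stdio.h>
#include <sys/mman.h>

/*
 * Hypothetical helper (not part of the lab code): allocates "length" bytes
 * of zero-filled, readable and writable memory using an anonymous mapping.
 * Returns NULL on failure, after reporting the reason via perror().
 */
static void *
alloc_anonymous(size_t length)
{
    void *p;

    assert(length > 0);
    /* No file backs the mapping, so fd is -1 and offset is 0. */
    p = mmap(NULL, length, PROT_READ | PROT_WRITE,
        MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED) {
        perror("mmap");    /* errno says why the call failed. */
        return (NULL);
    }
    return (p);
}
```

Memory obtained this way would later be released with munmap(), passing the same address and length.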
The munmap() System Call
munmap() is a system call used to unmap memory previously mapped with mmap(). The call removes the mapping for the memory from the address space of the calling process:
int munmap(void *addr, size_t length);
There are two arguments to the munmap() system call:
- addr - The address of the memory to unmap from the calling process's virtual mapping. Must be a multiple of the page size.
- length - The length of the memory (number of bytes) to unmap from the calling process's virtual mapping. On Linux, length need not be a multiple of the page size; every page containing any part of the given range is unmapped.
On success, munmap() returns 0; on failure, it returns -1 and sets errno to indicate the reason. After a successful call, further accesses to the unmapped memory area will result in a segmentation fault (SIGSEGV).
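The following sketch illustrates the return-value convention of munmap(). It assumes Linux behavior, where an addr that is not page-aligned is rejected with EINVAL; the function name unmap_demo() is made up for this example.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE /* Assumption: glibc; exposes MAP_ANONYMOUS under strict -std= modes. */
#endif
#include <assert.h>
#include <errno.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical demonstration (not part of the lab code): maps two pages,
 * tries to unmap them starting at a misaligned address, and then unmaps
 * them properly.  Returns the errno value produced by the misaligned call,
 * or -1 if any step behaves unexpectedly.
 */
static int
unmap_demo(void)
{
    long ps = sysconf(_SC_PAGESIZE);
    size_t pagesize;
    char *p;
    int misaligned_errno;

    assert(ps > 0);
    pagesize = (size_t)ps;
    p = mmap(NULL, 2 * pagesize, PROT_READ | PROT_WRITE,
        MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return (-1);
    /* p + 1 is not a multiple of the page size, so this call should fail. */
    errno = 0;
    if (munmap(p + 1, pagesize) == 0)
        return (-1);
    misaligned_errno = errno;
    /* The failed call left the mapping intact, so it is still usable. */
    p[0] = 'a';
    if (munmap(p, 2 * pagesize) != 0)
        return (-1);
    return (misaligned_errno);
}
```

On Linux, unmap_demo() is expected to return EINVAL.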
Advantages of Using mmap()
There are many advantages to using mmap() to gain access to the contents of some file on disk. One advantage of using mmap() is lazy loading. If no memory within a certain page is ever referenced, then that page is never loaded into physical memory. This can be crucial in certain applications in terms of saving both memory and time.
Another advantage of mmap() is speed. Traditional I/O involves many system calls (e.g., calls to read()) to load data into memory. Each of these calls has a cost, such as the error checking done within the functions themselves. The data also has to pass through several layers of software abstraction and is copied between buffers in the operating system before finally being placed in the program's memory, which slows the program down. With mmap(), the program accesses the mapped pages directly, which avoids both the per-call overhead and the extra copying.
Finally, mmap() has the advantage of supporting versatile dynamic memory allocation. In particular, malloc() cannot safely be called in a signal handler, whereas mmap() can, using anonymous mappings. Additionally, malloc() itself can be made more versatile by using mmap() to allocate memory rather than simply raising the process's break by using sbrk(). With sbrk(), memory may be freed only at the top of the heap (i.e., by lowering the break). However, there is no such restriction if memory is allocated using mmap(). As such, memory in the middle of the heap may be freed at will (using munmap()).
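The point about freeing memory in the middle of an allocated region can be sketched as follows. This is only an illustration with a made-up function name, free_middle_page(); it does not show how any particular malloc() implementation works.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE /* Assumption: glibc; exposes MAP_ANONYMOUS under strict -std= modes. */
#endif
#include <assert.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical demonstration (not part of the lab code): maps three pages
 * of anonymous memory, returns only the middle page to the kernel, and
 * checks that the first and third pages remain usable.  Returns 0 on
 * success and -1 on failure.
 */
static int
free_middle_page(void)
{
    long ps = sysconf(_SC_PAGESIZE);
    size_t pagesize;
    char *base;

    assert(ps > 0);
    pagesize = (size_t)ps;
    base = mmap(NULL, 3 * pagesize, PROT_READ | PROT_WRITE,
        MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (base == MAP_FAILED)
        return (-1);
    memset(base, 'A', 3 * pagesize);
    /* Release only the middle page; sbrk() could not do this. */
    if (munmap(base + pagesize, pagesize) != 0)
        return (-1);
    /* The first and third pages are still mapped and keep their contents. */
    if (base[0] != 'A' || base[2 * pagesize] != 'A')
        return (-1);
    /* Release the two remaining pages. */
    if (munmap(base, pagesize) != 0 ||
        munmap(base + 2 * pagesize, pagesize) != 0)
        return (-1);
    return (0);
}
```

After the first munmap() call, touching any byte of the middle page would raise SIGSEGV, while the pages on either side behave as before.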
An mmap() Example
The following simple, but buggy, example program appears in your lab repo as exercise1.c. This program demonstrates a typical use case of mmap() for reading data from a file (the #include lines in this example have been omitted here for simplicity of presentation).
/*
 * Requires:
 *   This program's source code must reside in the current working directory
 *   in a readable file named "exercise1.c".
 *
 * Effects:
 *   Reads this program's source code, via mmap(), and prints that source
 *   code to stdout.
 */
int
main(void)
{
    struct stat stat;
    int fd, size;
    char *buf;

    // Open the file and get its size.
    fd = open("exercise1.c", O_RDONLY);
    if (fd < 0 || fstat(fd, &stat) < 0) {
        fprintf(stderr, "open() or fstat() failed.\n");
        return (1);
    }
    size = stat.st_size;

    // Map the file to a new virtual memory area.
    buf = mmap(NULL, size, PROT_NONE, MAP_PRIVATE, fd, 0);
    if (buf == MAP_FAILED) {
        fprintf(stderr, "mmap() failed.\n");
        return (1);
    }

    // Print out the contents of the file to stdout.
    for (int i = 0; i < size; i++) {
        printf("%c", buf[i]);
    }

    // Clean up. Ignore the return values because we exit anyway.
    (void)munmap(buf, size);
    (void)close(fd);
    return (0);
}
This program calls the open() system call to open its own source code file (the file exercise1.c) and then gets the size (in bytes) of this file by calling the fstat() system call. The fstat() system call returns a number of different pieces of information about the status of the file open on the given file descriptor. (The stat() system call is identical to fstat(), except that stat() takes the name of the file as its first argument rather than a file descriptor open to that file, as in fstat().) The program then uses mmap() to map that file into the process's memory, and then prints out the contents of the file from the memory into which the file has been mapped.
GitHub Repository for This Lab
To obtain your private repo for this lab, please point your browser to this link for the starter code for the lab. Follow the same steps as for previous labs and assignments to create your repository on GitHub and then to clone it onto CLEAR. The directory for your repository for this lab will be
lab-10-memory-mapped-i-o-name
where name is your GitHub userid.
Submission
Be sure to git push the appropriate C source files for this lab before 11:55 PM tonight, to get credit for this lab.