Lab 10 - Linking in C

Lab goals:


Pre-lab


Background: Linking Concepts

In this lab, we will explore the second half of the compilation pipeline: linking. When you compile C code, each source file is translated into an object file. These object files may contain undefined symbols that must be resolved by the linker before the final executable can be produced. The linker combines all object files, resolves symbols, relocates addresses, and outputs a final binary.

This pre-lab introduces key ideas you will explore in more depth during the in-lab portion: linking object files, resolving symbol dependencies, inspecting symbol tables, and observing linker failures.

Exercise A: Inspect Object Files

Create two files, main.c and sum.c:

// main.c
int sum(int *a, int n);
int array[2] = {1, 2};

int main(void) {
    int val = sum(array, 2);
    return val;
}

// sum.c
int sum(int *a, int n) {
    int i, s = 0;
    for (i = 0; i < n; i++) {
        s += a[i];
    }
    return s;
}

Compile each file into an object file:

gcc -c -g -o main.o main.c
gcc -c -g -o sum.o sum.c

Inspect the symbol tables of each object file. The nm manual page (man nm) explains the symbol type letters, such as T, D, and U.

nm main.o
nm sum.o

Or you can use the following commands:

objdump -t main.o
objdump -t sum.o
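To make the nm letters easier to read, here is a small sketch of a file that produces several common symbol types. The file name symbols.c and every identifier in it are hypothetical and not part of the lab code. The letters in the comments are what GNU nm typically prints for an object file built with gcc -c and no optimization on Linux; your toolchain may differ.

```c
/* symbols.c (hypothetical): compile with gcc -c symbols.c, then run nm symbols.o */

int counter = 5;            /* initialized global data: typically "D" */
int scratch[16];            /* uninitialized global: typically "B" ("C" on older GCC) */
static int hidden = 3;      /* file-local data: typically lowercase "d" */
const int limit = 10;       /* read-only data: typically "R" */

/* File-local function: lowercase "t", invisible to other object files. */
static int helper(int x) {
    return x + hidden;
}

/* Global function: uppercase "T", visible to the linker. */
int api(int x) {
    return helper(x) + counter + limit;
}
```

Uppercase letters mark global symbols that other object files can reference; lowercase letters mark symbols local to this file. A "U" entry, like sum in main.o, appears only when the file actually references a symbol it does not define.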

Questions to answer before lab:

Exercise B: Link the Program Manually

Link the object files into an executable:

gcc -g -o linked main.o sum.o

Then inspect the linked binary with:

nm linked
objdump -t linked

Note the addresses for main and sum. Are any undefined symbols remaining?

Exercise C: Induce and Interpret a Linking Error

Delete sum.o and attempt to link again:

gcc -g -o broken main.o

You should see an error: undefined reference to 'sum'. What caused this error? Why didn't it happen during compilation?

This error illustrates that the compiler works on one translation unit at a time: it checks syntax and types, and the declaration of sum in main.c is all it needs to compile the call. Finding a definition for sum across object files is the linker's job.
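The split between declaration and definition can be sketched in a single file. The caller total_of_pair below is a hypothetical helper, not part of the lab code. The compiler emits the call to sum after seeing only the declaration; if the definition at the bottom were moved to another file, this file would still compile, and only the link step would need to find it.

```c
/* Declaration only: enough for the compiler to type-check and emit a call. */
int sum(int *a, int n);

/* Hypothetical caller: compiled before any definition of sum is visible. */
int total_of_pair(int x, int y) {
    int pair[2] = {x, y};
    return sum(pair, 2);
}

/* Definition: this is what the linker looks for when resolving "sum". */
int sum(int *a, int n) {
    int s = 0;
    for (int i = 0; i < n; i++) {
        s += a[i];
    }
    return s;
}
```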

Review

Before coming to lab, make sure you understand:


In-lab


This lab is about making the linker happy and deliberately breaking it to understand symbol resolution, static vs. dynamic linking, and multi-file modular programming. You will write code across multiple translation units, observe linking errors, explore symbol tables, and link with external libraries. Let’s dive in. Make sure to create a new directory within the repo for each exercise as needed.

Get the repo here: https://classroom.github.com/a/NURXT2x4

Exercise 1: Linking Three Object Files

Create a small project with three C files. Start with:

// main.c
#include <stdio.h>

int sum(int*, int);
int double_sum(int*, int);

int main() {
    int x[] = {1, 2, 3};
    printf("%d\n", double_sum(x, 3));
    return 0;
}

// sum.c
int sum(int* arr, int len) {
    int total = 0;
    for (int i = 0; i < len; i++) total += arr[i];
    return total;
}

// double_sum.c
int sum(int*, int);

int double_sum(int* a, int n) {
    return 2 * sum(a, n);
}

Compile and link with:

gcc -c main.c sum.c double_sum.c
gcc -o final main.o sum.o double_sum.o
./final

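Use nm and objdump -t on each object file. Where is sum defined and where is it only referenced? Try reordering the link line; does gcc -o final double_sum.o sum.o main.o still work? (Note: object files named directly on the command line are all included in the link, so their order does not change the result. Order becomes crucial with static libraries, because the linker scans the command line left to right and pulls in a library member only if it defines a symbol that is still undefined at that point.)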

Exercise 2: External Function Without Declaration

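Delete the line int sum(int*, int); from double_sum.c and recompile it with gcc -c double_sum.c. What happens? If the compiler accepts the file, why does it allow the call? Inspect nm double_sum.o. What kind of symbol is sum?

If double_sum.o was produced, relink and re-run the full program. Does it still work? The takeaway: under C89 rules, an undeclared function is assumed to return int and its arguments are not checked, which is dangerous and type-unsafe. C99 removed this rule; older GCC versions only warn about it (-Wimplicit-function-declaration), while GCC 14 and later report an error by default. Always declare externally used functions properly.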
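A short sketch of why the declaration matters, using a hypothetical function half that is not part of the lab code. With the prototype in scope, the compiler converts the int argument 3 to double before the call. Without any declaration, a compiler that still accepts the call would assume an int return value and pass the argument unconverted, which is undefined behavior.

```c
/* Prototype: tells the compiler the parameter and return types. */
double half(double x);

/* Because the prototype is visible, the int literal 3 is converted to 3.0. */
double half_of_three(void) {
    return half(3);
}

/* Definition; in a real project this would live in a separate .c file. */
double half(double x) {
    return x / 2.0;
}
```

In a multi-file project the prototype belongs in a header that both the caller and the defining file include, so the compiler can check that they agree.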

Exercise 3: Name Collisions Across Files

Create a file alt_sum.c with a conflicting definition:

// alt_sum.c
int sum(int* arr, int len) {
    return 42;
}

Now try:

gcc -c alt_sum.c
gcc -o weird main.o alt_sum.o double_sum.o

Which sum function is used? Use nm to verify which object file defines it. What happens if both sum.o and alt_sum.o are linked?

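This experiment illustrates that linking two external definitions of the same symbol causes a "multiple definition" error. Declaring a definition static (or static inline for small helpers) gives it internal linkage, so it is visible only inside its own translation unit and no longer conflicts.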
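As a minimal sketch of the static case, the variant of alt_sum.c below keeps its own private sum and exposes it only through a wrapper. The wrapper name alt_total is hypothetical. Because this sum has internal linkage, nm should list it with a lowercase t, and linking this file together with sum.o should not produce a "multiple definition" error.

```c
/* alt_sum.c variant: "static" makes this sum private to this file. */
static int sum(int *arr, int len) {
    (void)arr;   /* unused in this deliberately fake implementation */
    (void)len;
    return 42;
}

/* Hypothetical wrapper: the only symbol this file exports. */
int alt_total(int *arr, int len) {
    return sum(arr, len);
}
```

Calls to sum from other object files, such as double_sum.o, still resolve to the external definition in sum.o, because the static version is invisible to the linker's symbol resolution.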

Exercise 4: Static and Dynamic Libraries

Create a static library from sum.o:

ar rcs libsum.a sum.o

Now link against it:

gcc -o prog main.o double_sum.o -L. -lsum

Does it work? Use nm prog and ldd prog. Is libsum.a statically linked or dynamically linked? Try renaming libsum.a and re-running prog. What happens?

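Repeat with a shared library (when libsum.so and libsum.a are both in the same directory, -lsum picks the shared library by default):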

gcc -fPIC -c -o sum_pic.o sum.c
gcc -shared -o libsum.so sum_pic.o
gcc -o prog_dyn main.o double_sum.o -L. -lsum

Now run (setenv is csh/tcsh syntax; in bash or zsh, use export LD_LIBRARY_PATH=. instead):

setenv LD_LIBRARY_PATH .
./prog_dyn

Compare the outputs of nm prog and nm prog_dyn. Try deleting libsum.so and running the program again. What does this reveal about dynamic linkage?

Exercise 5: Setting Up a Multi-File Project

Create the following files in your lab directory:

main.c

#include <stdio.h>
#include "math_utils.h"
#include "array_ops.h"

int main() {
    int a = 12, b = 8;
    printf("GCD of %d and %d is %d\n", a, b, gcd(a, b));

    int data[] = {1, 2, 3, 4, 5};
    printf("Sum of array: %d\n", array_sum(data, 5));
    return 0;
}

math_utils.c

#include "math_utils.h"

int gcd(int a, int b) {
    while (b != 0) {
        int temp = b;
        b = a % b;
        a = temp;
    }
    return a;
}

math_utils.h

#ifndef MATH_UTILS_H
#define MATH_UTILS_H

int gcd(int a, int b);

#endif

array_ops.c

#include "array_ops.h"

int array_sum(int* arr, int len) {
    int sum = 0;
    for (int i = 0; i < len; i++) {
        sum += arr[i];
    }
    return sum;
}

array_ops.h

#ifndef ARRAY_OPS_H
#define ARRAY_OPS_H

int array_sum(int* arr, int len);

#endif

Create the Makefile

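Read through the Makefile below, then create it in your project directory to automate compiling and linking. Each recipe line (the commands under a target) must begin with a single tab character, not spaces. Copy/paste often converts tabs to spaces, so fix the indentation if make reports a "missing separator" error.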

CC = gcc
CFLAGS = -Wall -Werror -Wextra -g

OBJS = main.o math_utils.o array_ops.o
EXEC = link_test

.PHONY: all clean

all: $(EXEC)

$(EXEC): $(OBJS)
	$(CC) $(CFLAGS) -o $@ $^

%.o: %.c
	$(CC) $(CFLAGS) -c $<

clean:
	rm -f $(EXEC) $(OBJS)

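Try running make. This should build all object files and link them into the final executable link_test. On later runs, make rebuilds only the object files whose .c file changed and then relinks. Note that this Makefile does not list the .h files as prerequisites, so editing only a header will not trigger a rebuild; run make clean first in that case.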

Inspect Symbols and Linking Behavior

Now use nm and objdump -t, as in the earlier exercises, to understand what’s happening under the hood.

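What happens if you delete math_utils.o and run make again? Next, rename the gcd function in math_utils.c but not in math_utils.h, then rebuild. Which step fails, compiling or linking, and what is the error message? This demonstrates a linker-level symbol mismatch: every file compiles, but no object file defines the symbol that main.o references.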

Exercise 6: Interposing Functions with LD_PRELOAD

In this exercise, you'll override standard library functions at runtime using symbol interposition. This technique allows you to alter the behavior of dynamically linked functions without modifying the original binary. It is commonly used in profiling, logging, sandboxing, and debugging tools.

Step 1: Original Program Using printf

// main.c
#include <stdio.h>

int main() {
    printf("Hello, world! Keep it %d.\n", 100);
    return 0;
}

Compile the program:

gcc -g -o simple_prog main.c
./simple_prog

Step 2: Write Interposing Implementation

// fake_printf.c
#include <stdio.h>
#include <stdarg.h>

int printf(const char *format, ...) {
    // Add a prefix to all printf output
    fputs("[interposed] ", stdout);
    
    // Use vprintf to handle variadic arguments
    va_list args;
    va_start(args, format);
    int result = vprintf(format, args);
    va_end(args);
    
    return result;
}

Compile it into a shared library:

gcc -shared -fPIC -o libfakeprintf.so fake_printf.c

Step 3: Use LD_PRELOAD

setenv LD_PRELOAD ./libfakeprintf.so
./simple_prog

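Observe the change in behavior. The dynamic linker loads libfakeprintf.so before libc, so calls to printf resolve to the interposed version and its output is prefixed with [interposed]. This demonstrates how dynamic linking allows functions to be overridden at runtime without relinking simple_prog. The setenv command is csh/tcsh syntax; in bash or zsh, run LD_PRELOAD=./libfakeprintf.so ./simple_prog instead. When you are done, clear the variable (unsetenv LD_PRELOAD in csh/tcsh) so that other commands in the same shell are not affected.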
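One caveat worth knowing: GCC often rewrites a printf call whose format string has no conversions and ends in a newline, such as printf("Hello\n"), into a call to puts. Such a program would bypass the printf interposer entirely. The program in Step 1 uses %d, so it should not be affected. As a sketch, assuming a hypothetical file named fake_puts.c, an interposer for puts could look like this:

```c
/* fake_puts.c (hypothetical): interposes puts the same way fake_printf.c
 * interposes printf. */
#include <stdio.h>

int puts(const char *s) {
    /* Write the prefix, the string, and the newline that puts appends.
     * fputs and fputc are used so this function never calls itself. */
    if (fputs("[interposed] ", stdout) == EOF) return EOF;
    if (fputs(s, stdout) == EOF) return EOF;
    if (fputc('\n', stdout) == EOF) return EOF;
    return 1;   /* puts returns a nonnegative value on success */
}
```

It can be built and preloaded the same way as libfakeprintf.so, for example with gcc -shared -fPIC -o libfakeputs.so fake_puts.c, where the library name is again only an illustration.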


Post-lab


Ensure you have pushed all of the source (.c) and header (.h) files from this lab.

Submission Instructions

Once you’ve completed the exercises, use git add to stage your modified source and header files, then commit and push to your GitHub repository. All submissions are due by 11:55 PM on Sunday, 11/9.