Lab 3 - Debugging

Lab goals:


GitHub Repository for This Lab

To obtain your private repo for this lab, please point your browser to this starting link. Follow the protocol from previous labs and assignments to get your repository link. The repository for this lab will be
  lab-3-debugging-[YOUR GITHUB ID]

Overview

Debugging is something you'd rather not need to do. It means you made mistakes when first coding that now need to be fixed. You'll save yourself time and stress if you don't make the mistakes in the first place.

Debugging includes the following steps:

  1. Detecting an error, e.g., through testing.
  2. Reproducing the error. Errors that can't be reproduced consistently are almost impossible to diagnose. Common causes of irreproducibility include improper initialization, improper dependencies on memory layout, use of randomization, and incorrect synchronization of multiple threads.
  3. Finding and identifying the mistake causing the error. This is usually the hardest part. This lab concentrates on this step.
  4. Fixing the mistake and testing the fix.
  5. Looking for similar and related mistakes.
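
As a small illustration of step 2, a program that uses randomization becomes reproducible once the seed is fixed. The sketch below assumes a hypothetical helper named fill_random; it is not part of the lab code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/*
 * Hypothetical helper (not from the lab code): fills a[0..n-1] with
 * pseudo-random values in the range 0..99 that depend only on "seed".
 */
void
fill_random(int *a, size_t n, unsigned int seed)
{
        size_t i;

        assert(a != NULL);

        /* Seeding with a fixed value makes every run produce the same data. */
        srand(seed);
        for (i = 0; i < n; i++)
                a[i] = rand() % 100;
}
```

Calling fill_random twice with the same seed yields the same values, so a failure seen with that seed can be triggered again. Seeding from the clock instead, e.g., srand(time(NULL)), would give different data on each run.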

Debugging is, in general, very difficult. It requires practice to become proficient. The scheduled lab session should emphasize the hands-on sections.

Avoid making mistakes

"Debugging" actually begins when you are writing your code, rather than after you finish a first draft. You should always proofread your code before compiling it. The earlier you find and fix a bug, the better. It takes less time to identify and fix the problem when the code and its purpose are fresh in your mind. And in a real-world business environment, identifying and fixing problems later delays other aspects of development (such as quality assurance and manual writing).

Look for common mistakes

From experience, you should realize that some mistakes are more common than others. Some have been discussed in class. When coding, testing, and debugging, you should keep these in mind. These include the following examples (see also Expert C Programming: Deep C Secrets by Peter Van Der Linden, Prentice Hall, 1994):

Be sure to use the pickiest compiler warning level possible! As we've previously indicated for this course, that means using clang's flags -Wall -Wextra -Werror.

Test your code during development

We could spend weeks talking about testing. You've already been required to be somewhat systematic about testing, especially in the Word Count assignment. An important idea is to get small parts of the program working correctly first. As you implement each function in your program, you can write small test functions (sometimes referred to as unit tests) to test that function alone. This way, when you implement the next function and your program still passes all tests, you can be confident that you have not broken any of the previously written functions (this is also called regression testing). Ideally, each function would be tested against all valid inputs. Since that is rarely feasible, a simple approach is to divide your inputs into different categories and sample each category, making sure that you cover all corner cases. You've probably also done whole-program testing on an ad hoc basis, whereas our grading uses an automated whole-program tester.

Tests are worthwhile only if you use them. Test frequently, and debug immediately. To this end, automate your testing. That's what you've done with unit testing in this and previous courses. That's what we do with whole-program testing when grading.
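
As a sketch of what sampling each input category can look like, consider a unit test for a leap-year function. Both is_leap_year and test_is_leap_year are hypothetical names chosen for illustration; they do not come from any course assignment.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical function under test: Gregorian leap-year rule. */
bool
is_leap_year(int year)
{
        return ((year % 4 == 0 && year % 100 != 0) || year % 400 == 0);
}

/* Unit test: one sample from each category of input. */
void
test_is_leap_year(void)
{
        assert(!is_leap_year(2023));    /* Not divisible by 4. */
        assert(is_leap_year(2024));     /* Divisible by 4 but not by 100. */
        assert(!is_leap_year(1900));    /* Divisible by 100 but not by 400. */
        assert(is_leap_year(2000));     /* Divisible by 400. */
}
```

Rerunning test_is_leap_year after every change to the program is a simple form of regression testing: if a later edit breaks the function, the failing assertion points to the broken category.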


Trace Debugging

The idea of trace debugging is simple: print out messages and values during program execution, so that you can track some of its behavior. Then you can decide if this is the expected behavior.

The primary strength of trace debugging is to narrow the potential scope of the problem -- e.g., from "there's a bug somewhere" to something like "function foo() isn't calculating the right result".

"printf"-style trace debugging

Where should you add print statements for debugging?

These print statements generate a trace of the program execution. With even small programs, traces can have too much information to be immediately understood. To collect all this information to examine more easily, you can redirect this output to a file. To reduce the amount of information, you can make this output conditional on one or more debugging flags and/or an integer representing levels of debugging verbosity.

    int debug_level = 2;

    if (debug_level > 1) {
            /* Print basic debug messages. */
            ....
    }

    if (debug_level > 2) {
            /* Print more detailed debug messages. */
            ....
    }
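
One possible way to package this pattern is a small macro, sketched below. The TRACE macro, the sum_to function, and the choice of stderr as the destination are assumptions made for illustration. Note that, unlike the fragment above, this sketch prints a message when debug_level is greater than or equal to the message's level.

```c
#include <stdio.h>

/* Hypothetical verbosity level: 0 = silent, 1 = basic, 2 = detailed. */
static int debug_level = 1;

/* Print a trace message to stderr if debug_level is at least "level". */
#define TRACE(level, ...)                                       \
        do {                                                    \
                if (debug_level >= (level))                     \
                        fprintf(stderr, __VA_ARGS__);           \
        } while (0)

/* Hypothetical example function: returns 1 + 2 + ... + n. */
unsigned int
sum_to(unsigned int n)
{
        unsigned int i, total = 0;

        TRACE(1, "sum_to: called with n = %u\n", n);
        for (i = 1; i <= n; i++) {
                total += i;
                TRACE(2, "sum_to: i = %u, total = %u\n", i, total);
        }
        TRACE(1, "sum_to: returning %u\n", total);
        return (total);
}
```

Because the trace goes to stderr, it stays separate from the program's normal output, and you can collect it in a file with a shell redirection such as 2> trace.txt.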

Assertions

A variant on this theme is to print something out and exit the program if some unexpected condition occurs. This is usually called an assertion, as you assert that the condition should hold. Assertions are usually distinct from the normal error-checking in your program -- error-checking looks for "expected" mistakes such as bad user input, while assertions look for "unexpected" mistakes such as a programmer calling a function with improper arguments.

Assertions are a common programming technique in any language. They can, and should, be used in conjunction with the other debugging and testing techniques. They are placed within your code, not separated like unit tests.

The standard C library provides a convenient mechanism for this:

     #include <assert.h>

     assert(condition);

If the condition is true, as desired, nothing happens. However, if it is false, assert prints an error message containing the condition, file name, and line number to standard error, and then terminates the program by calling abort().
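
To make the distinction between error-checking and assertions concrete, here is a minimal sketch. The functions average and parse_count are hypothetical and are not part of any course code.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Requires: "values" is not NULL and "n" is greater than 0.
 * A violation is a programmer mistake, so it is checked with assert.
 */
double
average(const int *values, size_t n)
{
        double sum = 0.0;
        size_t i;

        assert(values != NULL);
        assert(n > 0);

        for (i = 0; i < n; i++)
                sum += values[i];
        return (sum / (double)n);
}

/*
 * Bad user input is an "expected" mistake, so it is reported with
 * ordinary error-checking.  Returns 0 on success and -1 on bad input.
 * (Overflow detection is omitted to keep the sketch short.)
 */
int
parse_count(const char *text, long *count)
{
        char *end;
        long value = strtol(text, &end, 10);

        if (end == text || *end != '\0' || value <= 0) {
                fprintf(stderr, "error: \"%s\" is not a positive integer\n",
                    text);
                return (-1);
        }
        *count = value;
        return (0);
}
```

A caller would typically use parse_count to validate what the user typed and only then call average, so that the assertions in average never fire unless the program itself contains a mistake.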

For example in the first project stub code, there was the following:

     /*
      * Requires:  
      *   The input "n" must be greater than 1.
      *
      * Effects: 
      *   Returns the number of factors of the input "n".
      */
     unsigned int
     count_factors(unsigned int n)
     {
             /* Put your local declarations here. */
         
             assert(n > 1);

             /* Put your code here. */
     }

While the specification indicates that the function may assume n>1, for debugging purposes it is best to really make sure that assumption holds.

By default, the compiler generates code for this check, but you can tell it not to by defining the macro NDEBUG ("no debugging"):

     clang -DNDEBUG foo.c

In other words, the assertion code is compiled only if NDEBUG is not defined (double negation!).
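
One consequence is worth a short sketch: the expression inside assert is not evaluated at all when NDEBUG is defined, so it should never contain work that the program depends on. The helper functions pop and pop_or_die below are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: removes the last of the *n elements of "stack"
 * and stores it in *out.  Returns 1 on success, or 0 if the stack is empty.
 */
int
pop(const int *stack, size_t *n, int *out)
{
        if (*n == 0)
                return (0);
        *n -= 1;
        *out = stack[*n];
        return (1);
}

/* Requires: the stack is not empty. */
int
pop_or_die(const int *stack, size_t *n)
{
        int value = 0;
        int ok;

        /*
         * Wrong:  assert(pop(stack, n, &value));
         * With -DNDEBUG the whole call would disappear, and nothing
         * would be popped.  Instead, do the work first, then assert
         * on its result.
         */
        ok = pop(stack, n, &value);
        assert(ok);
        (void)ok;       /* Avoids an "unused variable" warning under NDEBUG. */
        return (value);
}
```

The function behaves the same with and without -DNDEBUG as long as its precondition holds; only the check itself is removed.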


Debuggers

With a debugger, you can see what's happening in your program by stepping through its instructions and looking at the changes in your variables. Additionally, you usually want to quickly skip through large chunks of your program without always single-stepping. As such, the four basic operations of any debugger are to

Debuggers usually have many more options, but most are simply to increase the usability of these basic four.

The primary strength of a debugger is to find the error from an already narrow scope -- e.g., from "function foo() isn't calculating the right result" to "line 14 of foo() is using unsigned arithmetic instead of signed", or from "variable bar somehow gets the wrong value" to "variable bar gets the value 42 on line 78".
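
As a sketch of the kind of mistake mentioned above, consider the two hypothetical functions below (the names gap_unsigned and gap_signed are made up for this example). The fix assumes that long long is wider than unsigned int, which holds on typical systems.

```c
/* Mistake: unsigned subtraction wraps around when a < b. */
unsigned int
gap_unsigned(unsigned int a, unsigned int b)
{
        return (a - b);
}

/* Fix: convert to a wider signed type before subtracting. */
long long
gap_signed(unsigned int a, unsigned int b)
{
        return ((long long)a - (long long)b);
}
```

On a system with 32-bit unsigned int, gap_unsigned(3, 5) returns 4294967294 rather than -2. Stopping inside the function in a debugger and printing a, b, and a - b makes the wraparound visible immediately.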

The most useful and most common debuggers are also known as symbolic or source-level debuggers, since their operations are relative to the original source code, and not its compiled machine code.

gdb

The most common Unix debugger is gdb. See man gdb for details, but this lab covers the highlights.

In order to use any Unix debugger, you must compile extra information into your code, using the -g flag, and then run the debugger. On the great majority of Unix systems, the following two lines will accomplish the task.

     clang -g -Wall -Wextra -Werror foo.c -o foo
     gdb foo

The most common commands are

gdb Exercise 1

We'll walk you through the basic commands:

  1. Compile debug1.c with the -g option to add debugging information to the executable produced:

                 clang -g -o debug1 debug1.c
                 
  2. Start gdb with the executable loaded:

                  gdb debug1
                 
  3. Set a breakpoint at print_sum:

                 (gdb) break print_sum
                 
  4. Start the execution of the program:

                 (gdb) run
                 
  5. Once the program stops at your breakpoint, you might see a message such as:

                 Missing separate debuginfos, use: debuginfo-install glibc-2.17-157.el7_3.4.x86_64

    The message is harmless; ignore it.

    While at the print_sum breakpoint, inspect the current stack trace:

                 (gdb) bt
                 

    Observe that the line numbers and parameter values are printed for each function.

  6. Print out some information about the variables in the current stack frame:

                 (gdb) print i
                 (gdb) print j
                 (gdb) print i+j
                 (gdb) print 2*i
                 

    Notice that gdb can evaluate expressions using the variables in the current stack frame.

  7. Execute the printf (the next line in the test program) and then stop again:

                 (gdb) n
                 

    Notice that if we had used step, we would have stepped into the printf code - usually not desirable at all.

  8. Allow the program to continue executing:

                 (gdb) c
                 
  9. Once the program breaks (stops at the breakpoint) again, delete the breakpoint so the program will run to completion:

                 (gdb) delete
                 
  10. Resume execution from the current line of code, and allow the program to run to completion (since there are no more breakpoints for it to stop at):

                 (gdb) c
                 
  11. Exit gdb:

                 (gdb) quit
                 
gdb Exercise 2

In this exercise, you are given a program (debug2.c) that does not work as expected. Your task is to debug the code using gdb and fix it.

The code calculates the "n"th number in the Fibonacci sequence for a given "n". It uses both recursion and a technique called memoization. Memoization is an optimization technique that involves storing intermediate results so that they are not calculated again in the future. This can significantly speed up programs, especially if the program otherwise calculates the same results over and over again.
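
To illustrate memoization on a different problem, so as not to give away the exercise, here is a minimal sketch that computes binomial coefficients with Pascal's rule. The names choose, memo, and MAX_N are assumptions made for this example and do not appear in debug2.c.

```c
#include <assert.h>

#define MAX_N 64

/*
 * memo[n][k] holds C(n, k) once it has been computed.  A value of 0
 * means "not computed yet", which is safe because every C(n, k) with
 * 0 <= k <= n is at least 1.
 */
static unsigned long long memo[MAX_N + 1][MAX_N + 1];

/*
 * Requires: k <= n <= MAX_N.
 * Effects:  Returns the binomial coefficient C(n, k).
 */
unsigned long long
choose(unsigned int n, unsigned int k)
{
        assert(n <= MAX_N && k <= n);

        if (k == 0 || k == n)
                return (1);
        /* Compute each intermediate result only once, then reuse it. */
        if (memo[n][k] == 0)
                memo[n][k] = choose(n - 1, k - 1) + choose(n - 1, k);
        return (memo[n][k]);
}
```

Without the memo table, the two recursive calls would recompute the same coefficients many times over; with it, each table entry is filled in at most once.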

Hints:

  • Look at the source code and get an idea of how the program works.
  • Compile and run the program. Find out why the result is not as expected.
  • Now run the program in gdb. Step through the program and analyze the intermediate values, especially the variable result. Set breakpoints within the function fib and check the values returned.

Core Files

    The shell can be configured so that what is called a "core file" is created when your program crashes. CLEAR is configured to "dump" a core file by default. Such core files can be used by gdb to analyze the state that the program had right at the time of the crash (at the time that the "core" file was produced). For this, you simply need to execute a command such as:

         gdb program core
    

    where program is the name of the executable program and core is the name of the core file that was produced when it crashed. Different systems use different naming conventions for the dumped core files, but they will have "core" in their name somewhere (normally at the start of the file name). On CLEAR, one way to find the name of your most recent core file is to use the shell command

         ls -lt core* | head
    
    (This command prints the names of the first few files whose names start with core, sorted with the most recent one first.)

    Once you start gdb in this way, you can then use all of the usual gdb commands to examine the program's state. Note, however, that you will not be able to step forward in the program, as the program is not actually running, and if it were, it would be about to crash!

Conclusion

    Between unit tests, assertions, and printf-style debugging, you often write as much or more code for testing/debugging purposes as to actually perform the desired task. Students are often dismayed by this, as it is truly overkill for many of the examples you will see in introductory courses. Hopefully, you'll trust us that these techniques are worth the effort in large projects.

    Also, we have only touched upon these ideas, providing rather minimal support for them. Commercial development tools attempt (with varying degrees of success) to minimize the amount of programmer effort needed to use these techniques.


Submission

    To turn in your lab, please push your (fixed) debug2.c file to your personal repository before 11:55PM on Saturday at the end of this week.

    NOTE: Please push only the debug2.c file. The simplest possible way to ensure you do not push too much is to use git add only on the debug2.c file. Then your commits and pushes will propagate only the source files.

    WARNING: If you push core files or executables, that will be considered an erroneous lab. You will not receive credit for it.