Posted: 15 May 2024

A reminder and exploration of low-level software basics: building and debugging (in "easy mode", i.e., with debugging symbols).


1. Preparation

Complete the following steps before coming to the lab. For this week, you do not need to submit your pre-lab work, but you are very welcome to ask questions if anything doesn’t make sense to you.

  1. [optional] Identify a lab partner[1].

  2. Ensure that you are able to log into a LabNet Linux image.

  3. Explore some of the available text editors on our LabNet machines. You’re probably quite used to a full-featured IDE such as Visual Studio Code, but it’s good to also be familiar with command-line tools that tend to be available in more restricted environments than their graphical counterparts.

    Vim / gVim

    This is the classic modal editor that I use in class (actually, I use Neovim, a modern take on the classic, which is probably a good place to start if you’re new to the Vim ecosystem, either via the command line nvim or a GUI like Neovide).

    If you haven’t used Vim before, you might like to learn a bit about it by running vimtutor at the command line (a great place to start!), via the OpenVim interactive tutorial, or perhaps via a less interactive tutorial.

    Vim is very configurable via commands and startup scripts. Once you get into it, you’ll be able to customize things very much to your liking. For example, here’s my Vim config: my .vimrc file (for all Vim instances) and my .gvimrc file (for graphical instances).

    Emacs

    This is another classic, popular editor among programmers. You can run emacs at the command line or use a graphical version. Options include:

    Visual Studio Code

    A cross-platform editor from Microsoft. It should be installed and ready for use with at least Python, as we use it in ENGI 1020. You may need to install additional extensions for C, LLDB, etc.

    gedit

    This is a general-purpose text editor for open-source platforms. It’s included with the GNOME desktop environment (the default graphical environment for Ubuntu) and runs on FreeBSD, Linux and other Unix-like platforms.

    Others

    There may be other editors installed on our LabNet image, too… perhaps Kate, Atom or others.

2. Procedure

Complete the following procedure, recording all commands that you execute and their outputs.

Save all output files for next week’s lab.
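One way to record commands and their outputs as you go is to pipe each command through tee, which prints the output and also appends it to a file. This is only a sketch of one approach: the log file name is made up, and echo hello stands in for whatever command you are actually running.

```shell
# Hypothetical log location; a scratch directory keeps this example tidy.
workdir=$(mktemp -d)
log="$workdir/lab1-log.txt"

# Record the command line itself, then its output (stdout and stderr).
echo '$ echo hello' >> "$log"
echo hello 2>&1 | tee -a "$log"

# Review what has been recorded so far.
cat "$log"
```

If it is installed on your machine, the script utility can instead record a whole terminal session (see man script).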

2.1. Compilation

  1. Download product.cpp and compile it using the following command:

    $ c++ -g -c product.cpp

    What file is generated?

  2. Inspect this file by passing it to the nm program. Find the symbol for the product function within the nm output (hint: you may find the grep utility helpful; run man grep for more information about it). What is this symbol called?

  3. Read the DESCRIPTION section of the manual page for the c++filt program (run man c++filt). Use c++filt to demangle this C++ name into a function signature. What is this signature?

  4. What is the symbol name for the main function? Even within a C++ file, main is named according to C symbol name rules. How does this differ from the C++ symbol for the product function?

  5. Download Makefile to the same directory, ensuring that its name is Makefile after it is saved (not Makefile.txt, just plain Makefile). Run make product to build an executable binary called product from your product.o object file. What commands does make execute for you?

  6. Use the command objdump -S to disassemble the object file product.o. Pipe the result through c++filt and save it to a file product.o.dump (hint: the > character can be used to redirect output from a command into a file). Within this file, find the disassembly of the product function. What is its numeric offset within the object file? What else do you observe about the function?

  7. Re-generate the dump file above, this time using the option --disassemble=SYMBOL, where SYMBOL is the C++-mangled symbol name for the product function. How does this dump file differ from the previous dump file?

  8. Use objdump to disassemble the product function within the product binary and save the (C++-demangled) output to a file product.dump. What is different about the dumps of the object file and the full binary?


    There are a few ways that you can compare these two dump files:

    1. Visual inspection in side-by-side text editors (sounds like a lot of work!)

    2. The diff command, i.e., diff product.o.dump product.dump. Personally, I like to use the "unified diff" format (-u option), but YMMV.

    3. Fancy word-granular coloured diffs at the command line:

      $ wdiff -n product.o.dump product.dump | colordiff

    4. A GUI tool like Meld (which is available in our LabNet Ubuntu image)

  9. Run the product program with no command-line arguments. What do you observe?
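The hints in the steps above lean on a few shell features: redirecting output with >, filtering with grep through a pipe, comparing files with diff -u, and demangling with c++filt. Here is a minimal sketch of those mechanics on stand-in data. The file names (a.dump, b.dump, matches.txt, changes.diff), their contents and the symbol _Z3addii are all made up for illustration; none of this is real nm or objdump output from product.o.

```shell
# Work in a scratch directory so nothing in your lab folder is overwritten.
cd "$(mktemp -d)"

# '>' sends a command's output to a file, replacing any previous contents.
printf 'alpha\nbeta\ngamma\n' > a.dump
printf 'alpha\nBETA\ngamma\n' > b.dump

# '|' feeds one command's output to the next; grep keeps only matching lines.
cat a.dump | grep 'bet' > matches.txt
cat matches.txt

# diff -u prints a unified diff and exits with status 1 when the files differ.
status=0
diff -u a.dump b.dump > changes.diff || status=$?
cat changes.diff

# c++filt reads mangled names on standard input. _Z3addii is a made-up symbol
# for a hypothetical add(int, int), mangled as GCC and Clang do on Linux.
if command -v c++filt > /dev/null; then
    echo '_Z3addii' | c++filt
fi
```

The same pattern applies to the real tools: put nm or objdump on the left of the pipe instead of cat or echo.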

2.2. Debugging with symbols

  1. Use the product program to compute the product of the numbers 1, 2 and 3. What do you observe?

  2. Run product under the LLDB debugger by executing lldb ./product, then executing the run command within LLDB. What do you observe? Exit using exit.

  3. Run product under the debugger with command-line arguments:

    $ lldb -- ./product 1 2 3

    What do you observe when you run the program?

  4. Run help break set to explore the syntax of setting a breakpoint. How can you set a breakpoint at the beginning of the product function?

  5. Set a breakpoint at the beginning of the product function and re-run the program from the start (with run). Use the bt command to get a backtrace of the current call stack. What is the next instruction to execute? Verify this using the register read command.

  6. Use the frame variable command to print all local variables. Where does the numbers pointer point to?

  7. Use the command expr &numbers to get the address of the numbers pointer (i.e., the address of the variable containing the pointer to the array). What is this address (of the numbers pointer)?

  8. Use the memory read command to output the pointer value held in the numbers variable, and then the contents of the array pointed at by numbers. For example, if you wanted to output 64 B of memory starting at address 0x00007ff7bfeff000, you would run:

    (lldb) mem read 0x00007ff7bfeff000 0x00007ff7bfeff000+64

  9. Using these addresses and memory contents, draw a diagram showing how numbers points to a location in memory, and how that memory contains the integer values derived from the program’s command-line arguments. Include all relevant memory addresses in your diagram.

  10. Run the continue command three times to pause the program at the fourth invocation of product. What is the address of each instruction in the call stack? Find these instructions in product.dump and include them in your report.

  11. What address does numbers point to now? What is the address of the numbers pointer?

  12. Use the memory read command to print the contents of 384 B (0x180 B) of stack memory, starting at the address of numbers.

    1. What value is contained at the address of numbers?

    2. Identify all parameters and return addresses within the portion of stack memory that you printed.
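When working out ranges for memory read, it can help to check the hexadecimal arithmetic at the shell before typing it into LLDB. A small sketch follows, using the example address from step 8 rather than any address from your own session. The variable names are made up, and the 8-byte word size is an assumption about a 64-bit target.

```shell
# 0x180 in decimal: should match the 384 B quoted in step 12.
printf '%d\n' 0x180

# Number of 8-byte words in that range (assuming a 64-bit target).
words=$((0x180 / 8))
echo "$words"

# End address of a 64-byte read starting at the example address from step 8.
start=0x00007ff7bfeff000
end=$(printf '0x%016x' $(($start + 64)))
echo "$end"
```

LLDB will evaluate an expression such as start+64 for you, as in step 8, so this is only a way to double-check your diagram.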


[1] You are free to work with a different lab partner every week, or the same partner, whichever you prefer.