Share to learn: How do debuggers work?

OK, as mentioned in the previous entry, let me continue with another problem a debugger faces: threads, or more exactly, multithreading.

When you want to stop a debuggee that's churning like mad, you need to get a breakpoint jammed into the CPU instruction stream so that you can stop in the debugger. The question is, what do you need to do to get the instruction in there? If a thread is running, the only thing you can do to get it to a known point is to suspend it by using the SuspendThread API function. Once the thread is suspended, you can look at it with the GetThreadContext API function and determine the current instruction pointer. Once you have the instruction pointer, you're back to setting simple breakpoints. After you set the breakpoint, you need to call the ResumeThread API function so that you can let the thread continue execution and have it hit your breakpoint.
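The sequence above (suspend, read the instruction pointer, set the breakpoint, resume) can be sketched as a small model. This is only an illustration under stated assumptions: the debuggee's memory is a plain byte array, ModelThread and Breakpoint are hypothetical types, and the breakpoint is assumed to be the x86 one-byte INT3 opcode (0xCC). The comments name the real Win32 calls a debugger would use at each step.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define INT3_OPCODE 0xCC /* x86 one-byte breakpoint instruction */

/* Hypothetical stand-in for a debuggee thread. */
typedef struct {
    size_t ip;            /* instruction pointer (Eip/Rip in a real CONTEXT) */
    int    suspend_count; /* greater than 0 means the thread is not running */
} ModelThread;

/* Remembers what the breakpoint overwrote so it can be undone later. */
typedef struct {
    size_t  address;
    uint8_t original;
} Breakpoint;

/* Suspend, read the instruction pointer, patch in INT3, resume. */
Breakpoint break_at_current_ip(ModelThread *thread, uint8_t *memory, size_t size)
{
    Breakpoint bp;

    thread->suspend_count++;          /* real debugger: SuspendThread */
    bp.address = thread->ip;          /* real debugger: GetThreadContext */
    assert(bp.address < size);
    bp.original = memory[bp.address]; /* real debugger: ReadProcessMemory */
    memory[bp.address] = INT3_OPCODE; /* real debugger: WriteProcessMemory */
    thread->suspend_count--;          /* real debugger: ResumeThread */
    return bp;
}

/* Restore the original byte once the breakpoint has been hit. */
void remove_breakpoint(const Breakpoint *bp, uint8_t *memory, size_t size)
{
    assert(bp->address < size);
    memory[bp->address] = bp->original;
}
```

Saving the original byte is what lets the debugger put the real instruction back after the break, so the debuggee can carry on as if nothing happened.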

Although breaking into the debugger is fairly simple, you still need to think about a couple of issues. The first issue is that your breakpoint might not trigger. If the debuggee is processing a message or doing some other work, it will break. If the debuggee is sitting there waiting for a message to arrive, however, the breakpoint won't trigger until the debuggee receives a message. Although you could require the user to move the mouse over the debuggee to generate a WM_MOUSEMOVE message, the user might not be too happy about this requirement.

To ensure that the debuggee reaches your breakpoint, you need to send a message to the debuggee. If all you have is a thread handle given to you by the Debugging API, how do you turn the handle into the appropriate HWND? Unfortunately, you can't. However, the Debugging API also reports the thread ID with every debug event, and with the thread ID you can always call PostThreadMessage, which posts a message to the thread message queue. Because the HWND message processing layers on top of the thread message queue, calling PostThreadMessage does exactly what you need it to do.

The only question then becomes, What message do I post? You don't want to post a message that could cause the debuggee to do any real processing, thus allowing the debugger to change the behavior of the debuggee. For example, posting a WM_CREATE message probably wouldn't be a good idea. Fortunately, the WM_NULL message is supposed to be a benign message and is what you're supposed to use in hooks if you change a message. And if the thread doesn't have a message queue, such as in a console application, the PostThreadMessage call simply fails without doing any damage. Because console-based applications will always be processing, even if waiting for a keystroke, setting the breakpoint at the current executing instruction will cause the break.

Another issue involves multithreading. If you're going to suspend only a single thread and the application is multithreaded, how do you know which thread to suspend? If you suspend and set the breakpoint in the wrong thread, say one that is blocked waiting on an event that is signaled only when background printing occurs, your breakpoint might never go off unless the user decides to print something. If you want to break on a multithreaded application, the only safe course is to suspend all the threads and set a breakpoint in each one.

Suspending all the threads and setting a breakpoint in each one works just great on an application that has only two threads. If you want to break on an application that has many threads, however, you could leave yourself open to a problem. As you're walking through and suspending each of the debuggee's threads, you're changing the state of the application such that it's possible for you to cause the application to deadlock. To get all the threads suspended, the breakpoints set, and the threads resumed without causing problems, the debugger needs to boost its own thread priority. By boosting the priority to THREAD_BASE_PRIORITY_LOWRT, the debugger can have its thread stay scheduled so that the debuggee's threads don't execute as the debugger manipulates them.
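Here is a minimal sketch of the "suspend everything, set a breakpoint in each thread, resume everything" loop, using a byte array as a model of debuggee memory. ModelThread, Breakpoint, and break_all_threads are hypothetical names, and the INT3 opcode (0xCC) is assumed as the breakpoint. One detail the model makes visible: two threads can be stopped at the same instruction, so the patch must be applied only once or the "original" byte saved the second time would be the debugger's own INT3.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define INT3_OPCODE 0xCC /* x86 one-byte breakpoint instruction */

typedef struct { size_t ip; int suspend_count; } ModelThread;
typedef struct { size_t address; uint8_t original; } Breakpoint;

/* Sets one breakpoint per distinct instruction pointer.
   `out` must have room for `count` entries.
   Returns the number of breakpoints actually placed. */
size_t break_all_threads(ModelThread *threads, size_t count,
                         uint8_t *memory, size_t size, Breakpoint *out)
{
    size_t i, j, placed = 0;

    /* A real debugger would first boost its own priority, for example
       SetThreadPriority(GetCurrentThread(), THREAD_BASE_PRIORITY_LOWRT). */

    for (i = 0; i < count; i++)
        threads[i].suspend_count++;        /* SuspendThread on each thread */

    for (i = 0; i < count; i++) {
        size_t ip = threads[i].ip;         /* GetThreadContext */
        assert(ip < size);

        /* Skip addresses that already carry one of our breakpoints. */
        for (j = 0; j < placed; j++)
            if (out[j].address == ip)
                break;
        if (j == placed) {
            out[placed].address = ip;
            out[placed].original = memory[ip];
            memory[ip] = INT3_OPCODE;
            placed++;
        }
    }

    for (i = 0; i < count; i++)
        threads[i].suspend_count--;        /* ResumeThread on each thread */

    /* A real debugger would restore its normal priority here. */
    return placed;
}
```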

So far, my algorithm for breaking in a multithreaded application sounds reasonable. However, the debugger still needs to deal with one last issue to make the Debug Break option work completely. If you have all the breakpoints set in all the threads and you resume the threads, you still face one situation in which the break won't happen. By setting the breakpoints, you're relying on at least one of the threads to execute in order to trigger the breakpoint exception. What do you think happens if the process is in a deadlock situation? Nothing happens—no threads execute and your carefully positioned breakpoints never trigger the exception.

This is where the Debug Break business gets interesting. When you're breaking in a deadlock, you need to set up a timer to mark when you added the break. After your period of time elapses (the Visual C++ debugger uses 3 seconds), you need to take some drastic action. When the Debug Break option times out, you'll need to set one thread's instruction pointer to another address, set a breakpoint at that new address, and restart the thread. When that special breakpoint fires, you need to set the thread's instruction pointer back to its original location.
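The timeout fallback can be sketched with the same kind of model: a byte array for memory and a hypothetical ModelThread that holds only an instruction pointer. The names ForcedBreak, break_timed_out, force_break, and finish_forced_break are made up for illustration, and the address the thread is parked at (stub_address) is simply passed in by the caller; how a real debugger picks that address is not covered here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define INT3_OPCODE 0xCC /* x86 one-byte breakpoint instruction */

typedef struct { size_t ip; } ModelThread;

typedef struct {
    size_t  original_ip;   /* where the thread was really stopped */
    size_t  stub_address;  /* where the debugger parked it */
    uint8_t original_byte; /* byte overwritten at stub_address */
} ForcedBreak;

/* Nonzero once the normal breakpoints have had their chance to fire. */
int break_timed_out(unsigned elapsed_ms, unsigned limit_ms)
{
    return elapsed_ms >= limit_ms;
}

/* Move the thread to stub_address and plant a breakpoint there. */
ForcedBreak force_break(ModelThread *thread, uint8_t *memory, size_t size,
                        size_t stub_address)
{
    ForcedBreak fb;

    assert(stub_address < size);
    fb.original_ip = thread->ip;             /* GetThreadContext */
    fb.stub_address = stub_address;
    fb.original_byte = memory[stub_address];
    memory[stub_address] = INT3_OPCODE;      /* WriteProcessMemory */
    thread->ip = stub_address;               /* SetThreadContext */
    return fb;
}

/* Called when the special breakpoint fires: undo both changes. */
void finish_forced_break(ModelThread *thread, uint8_t *memory, size_t size,
                         const ForcedBreak *fb)
{
    assert(fb->stub_address < size);
    memory[fb->stub_address] = fb->original_byte;
    thread->ip = fb->original_ip;
}
```

With the 3-second limit mentioned above, the debugger would call force_break only after break_timed_out(elapsed, 3000) becomes true.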

Symbol Information

Have you ever wondered what the f**k the difference between a debug file and a release one is? Or have you, a developer who has played with Visual Studio for many years, ever cared about the sh*t it creates? A lot of it seems unnecessary to you, right? One of those things is symbol information, which is needed for debugging. It is so necessary that John Robbins, in his article, says it is as important as source code.

There are several symbol formats, such as SYM, COFF, C7 (also known as CodeView), and PDB (Program Database), which is the most common one used today (yeah, it's one of the sh*t created when you press F7 or F6 in Visual Studio). Unfortunately, the PDB file format is undocumented by Microsoft, but its information can be extracted by using the DIA (Debug Interface Access) interface.

Let me turn to what's in a PDB and how the debugger finds them. The actual file format of a PDB file is a closely guarded secret but Microsoft provides APIs to return the data for debuggers. A native C++ PDB file contains quite a bit of information:

Public, private, and static function addresses
Global variable names and addresses
Parameter and local variable names and offsets where to find them on the stack
Type data consisting of class, structure, and data definitions
Frame Pointer Omission (FPO) data, which is the key to native stack walking on x86
Source file names and their lines

When you load a module into the process address space, the debugger uses two pieces of information to find the matching PDB file. The first is obviously the name of the file. If you load ZZZ.DLL, the debugger looks for ZZZ.PDB. The extremely important part is how the debugger knows this is the exact matching PDB file for this binary. That's done through a GUID that's embedded in both the PDB file and the binary. If the GUID does not match, you certainly won't debug the module at the source code level.

The .NET compiler, and for native the linker, puts this GUID into the binary and PDB. Since the act of compiling creates this GUID, stop and think about this for a moment. If you have yesterday's build and did not save the PDB file will you ever be able to debug the binary again? No! This is why it is so critical to save your PDB files for every build. Because I know you're thinking it, I'll go ahead and answer the question already forming in your mind: no, there's no way to change the GUID.
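To make the GUID matching concrete, here is a minimal sketch that parses the CodeView "RSDS" record a binary's debug directory points at and compares its GUID with one taken from a PDB. The layout used (4-byte "RSDS" signature, 16-byte GUID, 4-byte little-endian age, NUL-terminated PDB path) is the commonly seen one. PdbReference, parse_rsds, and pdb_matches are hypothetical names, and how the GUID is read out of the PDB itself is left out. Real debuggers also check the age field, which this sketch parses but does not compare.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* What the binary says about its PDB. */
typedef struct {
    uint8_t     guid[16]; /* GUID written at build time */
    uint32_t    age;      /* parsed here, not compared in this sketch */
    const char *pdb_path; /* points into the input buffer */
} PdbReference;

/* Parse an RSDS record. Returns 1 on success, 0 if the buffer is not
   a complete, valid record. */
int parse_rsds(const uint8_t *data, size_t size, PdbReference *out)
{
    assert(data != NULL && out != NULL);

    /* 4-byte signature + 16-byte GUID + 4-byte age + at least a NUL */
    if (size < 25 || memcmp(data, "RSDS", 4) != 0)
        return 0;
    /* The path must be terminated inside the buffer. */
    if (memchr(data + 24, '\0', size - 24) == NULL)
        return 0;

    memcpy(out->guid, data + 4, 16);
    out->age = (uint32_t)data[20]
             | (uint32_t)data[21] << 8
             | (uint32_t)data[22] << 16
             | (uint32_t)data[23] << 24;
    out->pdb_path = (const char *)(data + 24);
    return 1;
}

/* Nonzero when the PDB's GUID is the one embedded in the binary. */
int pdb_matches(const PdbReference *binary, const uint8_t pdb_guid[16])
{
    return memcmp(binary->guid, pdb_guid, 16) == 0;
}
```

A single differing byte is enough for pdb_matches to return 0, which is why a rebuilt PDB never matches yesterday's binary.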

To view the GUID, we can run the dumpbin utility with the /headers option (for example, dumpbin /headers ZZZ.DLL) and look at the Debug Directories section of the output.

Many thanks to:

John Robbins' Blog, http://www.wintellect.com/CS/blogs/jrobbins/archive/2009/05/11/pdb-files-what-every-developer-must-know.aspx
John Robbins, Debugging Applications, 2000
http://en.wikipedia.org/wiki/Debug_symbol
http://en.wikipedia.org/wiki/Program_database

How do debuggers work? - P.3