OK, as mentioned in the previous entry, let me continue discussing a problem every debugger faces: threads or, more exactly, multithreading.
When you want to stop a debuggee that's churning like mad, you need to get a
breakpoint jammed into the CPU instruction stream so that you can stop in the
debugger. The question is, what do you need to do to get the instruction in
there? If a thread is running, the only thing you can do to get it to a known
point is to suspend it by using the SuspendThread API function. Once the
thread is suspended, you can look at it with the GetThreadContext API
function and determine the current instruction pointer. Once you have the
instruction pointer, you're back to setting simple breakpoints. After you set
the breakpoint, you need to call the ResumeThread API function so that
you can let the thread continue execution and have it hit your breakpoint.
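The suspend, inspect, patch, resume sequence can be modelled without touching a real process. What follows is only a toy sketch in Python, not Win32 code: `ToyThread`, `memory`, and `break_at_current_instruction` are hypothetical stand-ins for a thread handle, the debuggee's address space, and the debugger routine. It assumes an x86 target, where the one-byte breakpoint instruction is INT 3 (opcode 0xCC).

```python
from dataclasses import dataclass

INT3 = 0xCC  # x86 one-byte breakpoint opcode


@dataclass
class ToyThread:
    """Hypothetical stand-in for a debuggee thread."""
    instruction_pointer: int
    suspend_count: int = 0

    def suspend(self):
        # Models SuspendThread: each call raises the suspend count.
        self.suspend_count += 1

    def resume(self):
        # Models ResumeThread: the thread runs again when the count hits 0.
        if self.suspend_count > 0:
            self.suspend_count -= 1


def break_at_current_instruction(thread, memory, saved_bytes):
    """Suspend the thread, patch INT 3 at its instruction pointer, resume.

    memory is a bytearray standing in for the debuggee's code;
    saved_bytes maps address -> original byte so the patch can be undone.
    Returns the patched address.
    """
    thread.suspend()
    try:
        address = thread.instruction_pointer  # models GetThreadContext
        # setdefault keeps the first saved byte if the address was
        # already patched, so the original opcode is never lost.
        saved_bytes.setdefault(address, memory[address])
        memory[address] = INT3
    finally:
        thread.resume()
    return address
```

With `memory = bytearray([0x90, 0x90, 0x55, 0x8B, 0xEC])` and a `ToyThread` whose instruction pointer is 2, the call replaces the byte at offset 2 with 0xCC and records the original 0x55 in `saved_bytes`.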
Although breaking into the debugger is fairly simple, you still need to think
about a couple of issues. The first issue is that your breakpoint might not
trigger. If the debuggee is processing a message or doing some other work, it
will break. If the debuggee is sitting there waiting for a message to arrive,
however, the breakpoint won't trigger until the debuggee receives a message.
Although you could require the user to move the mouse over the debuggee to
generate a WM_MOUSEMOVE message, the user might not be too happy about
this requirement.
To ensure that the debuggee reaches your breakpoint, you need to send a
message to the debuggee. If all you have is a thread handle given to you by the
Debugging API, how do you turn the handle into the appropriate HWND?
Unfortunately, you can't. However, the Debugging API also reports the thread ID
with each debug event, so you can always call PostThreadMessage, which takes a
thread ID and posts a message to that thread's message queue. Because the HWND
message processing layers on top of the thread message queue, calling
PostThreadMessage does exactly what you need it to do.
The only question then becomes, What message do I post? You don't want to
post a message that could cause the debuggee to do any real processing, thus
allowing the debugger to change the behavior of the debuggee. For example,
posting a WM_CREATE message probably wouldn't be a good idea.
Fortunately, the WM_NULL message is supposed to be a benign message and
is what you're supposed to use in hooks if you change a message. If the thread
doesn't have a message queue, such as in a console application, the
PostThreadMessage call simply fails and doesn't do any damage. Because
console-based applications will always be processing, even if waiting for a
keystroke, setting the breakpoint at the currently executing instruction will
cause the break.
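Posting the benign message could look like the following minimal sketch, assuming Python with ctypes on Windows. `nudge_thread` and its `post_thread_message` parameter are hypothetical names; the parameter exists only so the function can be exercised without a real debuggee. By default it resolves to PostThreadMessageW in user32.

```python
import ctypes

WM_NULL = 0x0000  # benign message value from the Windows SDK


def nudge_thread(thread_id, post_thread_message=None):
    """Post WM_NULL to a thread's message queue.

    Returns True if the message was posted, False if the call failed
    (for example because the thread has no message queue).
    """
    if post_thread_message is None:
        # Resolved lazily: ctypes.windll exists only on Windows.
        post_thread_message = ctypes.windll.user32.PostThreadMessageW
    # PostThreadMessage(idThread, Msg, wParam, lParam) -> nonzero on success
    return bool(post_thread_message(thread_id, WM_NULL, 0, 0))
```

A False return is not an error here: it only means there was no queue to wake up.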
Another issue involves multithreading. If you're going to suspend only a
single thread and the application is multithreaded, how do you know which thread
to suspend? If you suspend and set the breakpoint in the wrong thread, say one
that is blocked waiting on an event that is signaled only when background
printing occurs, your breakpoint might never go off unless the user decides to
print something. If you want to break on a multithreaded application, the only
safe course is to suspend all the threads and set a breakpoint in each one.
Suspending all the threads and setting a breakpoint in each one works just
great on an application that has only two threads. If you want to break on an
application that has many threads, however, you could leave yourself open to a
problem. As you're walking through and suspending each of the debuggee's
threads, you're changing the state of the application such that it's possible
for you to cause the application to deadlock. To get all the threads suspended,
the breakpoints set, and the threads resumed without causing problems, the
debugger needs to boost its own thread priority. By boosting the priority to
THREAD_BASE_PRIORITY_LOWRT, the debugger can have its thread stay
scheduled so that the debuggee's threads don't execute as the debugger
manipulates them.
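As a rough sketch of that sequence, assuming Python with ctypes on Windows and thread handles already collected from the debug events: `break_all_threads`, `load_kernel32`, the `set_breakpoint` callback, and the `kernel32` parameter are hypothetical names introduced for illustration. The kernel32 functions themselves (GetCurrentThread, GetThreadPriority, SetThreadPriority, SuspendThread, ResumeThread) are the real API calls.

```python
import ctypes

THREAD_BASE_PRIORITY_LOWRT = 15  # same value as THREAD_PRIORITY_TIME_CRITICAL
SUSPEND_FAILED = 0xFFFFFFFF      # (DWORD)-1, returned by SuspendThread on failure


def load_kernel32():
    """Bind the kernel32 calls used below (Windows only)."""
    k32 = ctypes.windll.kernel32
    handle, dword = ctypes.c_void_p, ctypes.c_uint32
    k32.GetCurrentThread.restype = handle
    k32.GetCurrentThread.argtypes = []
    k32.GetThreadPriority.restype = ctypes.c_int
    k32.GetThreadPriority.argtypes = [handle]
    k32.SetThreadPriority.restype = ctypes.c_int
    k32.SetThreadPriority.argtypes = [handle, ctypes.c_int]
    k32.SuspendThread.restype = dword
    k32.SuspendThread.argtypes = [handle]
    k32.ResumeThread.restype = dword
    k32.ResumeThread.argtypes = [handle]
    return k32


def break_all_threads(thread_handles, set_breakpoint, kernel32=None):
    """Suspend every thread, set a breakpoint in each, then resume them all.

    set_breakpoint is called with each suspended thread handle.
    Returns the list of handles that were actually suspended.
    """
    k32 = kernel32 if kernel32 is not None else load_kernel32()
    me = k32.GetCurrentThread()
    old_priority = k32.GetThreadPriority(me)
    # Boost the debugger thread so the debuggee's threads don't get
    # scheduled while they are being manipulated.
    k32.SetThreadPriority(me, THREAD_BASE_PRIORITY_LOWRT)
    suspended = []
    try:
        for handle in thread_handles:
            if k32.SuspendThread(handle) != SUSPEND_FAILED:
                suspended.append(handle)
        for handle in suspended:
            set_breakpoint(handle)
    finally:
        # Always resume and drop the priority, even if patching failed.
        for handle in suspended:
            k32.ResumeThread(handle)
        k32.SetThreadPriority(me, old_priority)
    return suspended
```

The try/finally is a design choice of this sketch: a debugger that dies halfway through must not leave the debuggee frozen or itself pinned at a real-time priority.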
So far, my algorithm for breaking in a multithreaded application sounds
reasonable. However, the debugger still needs to deal with one last issue to
make the Debug Break option work completely. If you have all the breakpoints set
in all the threads and you resume the threads, you still face one situation in
which the break won't happen. By setting the breakpoints, you're relying on at
least one of the threads to execute in order to trigger the breakpoint
exception. What do you think happens if the process is in a deadlock situation?
Nothing happens—no threads execute and your carefully positioned breakpoints
never trigger the exception.
This is where the Debug Break business gets interesting. When you're breaking in
a deadlock, you need to set up a timer to mark when you added the breakpoints.
After your period of time elapses (the Visual C++ debugger uses 3 seconds), you
need to take some drastic action. When the Debug Break option times out, you'll
need to set one thread's instruction pointer to another address, set a
breakpoint at that new address, and restart the thread. When that special
breakpoint fires, you need to set the thread's instruction pointer back to its
original location.
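The timeout fallback can be sketched as a toy model in Python. Everything here is hypothetical scaffolding: `context` is a dict standing in for the thread context, `memory` is a bytearray standing in for debuggee memory, and `wait_for_breakpoint` is any callable that returns True if a breakpoint fired within the given number of seconds. A real debugger would wait on the Debugging API with a timeout instead.

```python
from dataclasses import dataclass

INT3 = 0xCC                  # x86 one-byte breakpoint opcode
BREAK_TIMEOUT_SECONDS = 3.0  # the period the Visual C++ debugger is said to use


@dataclass
class Redirect:
    """What must be remembered to undo a forced break."""
    original_ip: int
    stub_address: int
    original_byte: int


def force_break(context, memory, stub_address):
    """Point a stuck thread at stub_address and plant a breakpoint there."""
    saved = Redirect(context["ip"], stub_address, memory[stub_address])
    memory[stub_address] = INT3
    context["ip"] = stub_address
    return saved


def undo_force_break(context, memory, saved):
    """Run when the special breakpoint fires: restore byte and IP."""
    memory[saved.stub_address] = saved.original_byte
    context["ip"] = saved.original_ip


def debug_break(wait_for_breakpoint, context, memory, stub_address,
                timeout=BREAK_TIMEOUT_SECONDS):
    """Return None if a normal breakpoint fired, else a Redirect to undo."""
    if wait_for_breakpoint(timeout):
        return None
    # Nothing ran for the whole period: assume a deadlock and force it.
    return force_break(context, memory, stub_address)
```

When `debug_break` returns a `Redirect`, the caller resumes the thread, waits for the special breakpoint, and then calls `undo_force_break` so the thread appears stopped at its original location.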
- Symbol Information
There are several symbol formats, such as SYM, COFF, C7 (CodeView), and PDB (Program Database), which is the most common one used today (yeah, it's one of those files that show up when you hit F7 or F6 in Visual Studio). Unfortunately, Microsoft doesn't document the PDB file format, but its contents can be read through the DIA (Debug Interface Access) interface.
Let me turn to what's in a PDB and how the debugger finds it. A native C++ PDB file contains quite a bit of information:
- Public, private, and static function addresses
- Global variable names and addresses
- Parameter and local variable names and offsets where to find them on the stack
- Type data consisting of class, structure, and data definitions
- Frame Pointer Omission (FPO) data, which is the key to native stack walking on x86
- Source file names and their lines
The debugger knows that a PDB matches a binary through a GUID embedded in both. The .NET compiler, and for native code the linker, puts this GUID into the binary and the PDB. Since the act of compiling creates this GUID, stop and think about this for a moment. If you have yesterday's build and did not save the PDB file, will you ever be able to debug the binary again? No! This is why it is so critical to save your PDB files for every build. Because I know you're thinking it, I'll go ahead and answer the question already forming in your mind: no, there's no way to change the GUID.
To view the GUID, we can use the dumpbin utility with the /headers option and look at the Debug Directories section.
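For the curious, here is a minimal Python sketch of reading that GUID yourself. It assumes you have already located the CodeView record that the debug directory points to (finding it inside the PE file is left out), and that the record uses the PDB 7.0 layout: the ASCII signature RSDS, a 16-byte GUID, a 4-byte age, and a null-terminated PDB path. `parse_rsds` is a hypothetical helper name.

```python
import struct
import uuid


def parse_rsds(record):
    """Return (guid, age, pdb_path) from a CodeView 'RSDS' debug record."""
    if len(record) < 24 or record[:4] != b"RSDS":
        raise ValueError("not an RSDS CodeView record")
    # A Windows GUID stores its first three fields little-endian.
    guid = uuid.UUID(bytes_le=bytes(record[4:20]))
    (age,) = struct.unpack_from("<I", record, 20)
    # The PDB path follows as a null-terminated string.
    path = bytes(record[24:]).split(b"\x00", 1)[0].decode("utf-8", errors="replace")
    return guid, age, path
```

The debugger compares this GUID (and the age) with the values stored in the PDB, which is why a PDB from a different build of the same source won't load.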
Greatz thanks:
- John Robbins' Blog, http://www.wintellect.com/CS/blogs/jrobbins/archive/2009/05/11/pdb-files-what-every-developer-must-know.aspx
- John Robbins, Debugging Applications, 2000
- http://en.wikipedia.org/wiki/Debug_symbol
- http://en.wikipedia.org/wiki/Program_database