Windbg can do my job for me ;-)

Windbg is a debugger for Windows. It can step through programs while they run, or be used to examine the dumps produced when Windows crashes or hangs (including when you get the infamous blue-screen-of-death).

And with a couple of simple commands, it can get you a long way towards figuring out the cause of the problem.

I started using it a couple of months ago after being introduced to it on a Windows Internals course by the brilliant Mark Russinovich. It’s very powerful – letting you view the contents of memory, the values in registers, environment info and details of processes, right down to the individual threads within each process, what they were doing, their stack history, and so much more. But it’s also more than a little confusing to the newbie!

The biggest problem is knowing what to look for. Windbg can show you pretty much anything you might want to know – like what was the value in memory at address X at the time of a crash. But how do you know to check that? And how do you know what it means once you have?

It’s a bit of a dark art, and one that I’ve started to get my head round. But in the meantime, I’ve found that windbg can actually be more helpful than I realised!

!analyze -v
Run this after opening a crash dump, and it tells you which process crashed, what it was doing when it crashed, and an explanation of the likely cause.
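
If you'd rather not type the command in by hand each time, something like the following might work as a rough sketch. It assumes the console debugger cdb (which ships alongside windbg in the Debugging Tools for Windows) is on your PATH, and uses its -z option to open a dump and -c to run a command at startup. The function names and the dump path are made up for illustration.

```python
import subprocess
from typing import List


def build_analyze_command(dump_path: str, hang: bool = False,
                          debugger: str = "cdb") -> List[str]:
    """Build a command line that opens a dump and runs !analyze on it.

    Assumption: `debugger` is the cdb console debugger and is on PATH.
    """
    analyze = "!analyze -v -hang" if hang else "!analyze -v"
    # -z names the dump file to open, -c gives the commands to run at
    # startup, and the trailing q quits the debugger afterwards.
    return [debugger, "-z", dump_path, "-c", f"{analyze}; q"]


def run_analyze(dump_path: str, hang: bool = False) -> str:
    """Run the debugger and return whatever it printed (Windows only)."""
    result = subprocess.run(build_analyze_command(dump_path, hang),
                            capture_output=True, text=True, check=False)
    return result.stdout


if __name__ == "__main__":
    # Hypothetical dump path, shown only to illustrate the command line.
    print(" ".join(build_analyze_command(r"C:\dumps\example.dmp")))
```

Passing hang=True swaps in the -hang variant of the command, which is the one for dumps taken from a hung process.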

!analyze -v -hang
This is the one which I hadn’t come across before. Run this after opening a dump from a hung process, and it will determine which thread was responsible for the hang and what it was waiting for. I knew windbg could let you examine handles, show all locks open in each thread, and let you examine them all in detail. But I didn’t realise that with !analyze, windbg can do the hard work for me – examining locks held open by all threads, determining which threads are waiting for which others, and building up an order to work out which was initially responsible for the hang.

In this case, windbg identified that a thread was waiting for a specific event. This thread was holding a couple of locks, which other threads ended up waiting for. Those threads themselves held locks which others depended on. Everything ended up grinding to a halt all because of that one thread that was sat patiently waiting for an event that never happened.
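
To make the idea of following the chain back to the culprit a bit more concrete, here is a toy model in Python. It is only a sketch of the general technique (walk from a blocked thread to whoever owns the thing it is waiting for, and repeat), not a description of how windbg implements it. The thread names, lock names and the two dictionaries are all invented for the example.

```python
from collections import Counter
from typing import Dict, List


def wait_chain(start: str, waiting_for: Dict[str, str],
               owned_by: Dict[str, str]) -> List[str]:
    """Follow thread -> awaited resource -> owning thread from `start`.

    The walk stops at a thread that is not blocked, or that is waiting on
    something no thread owns (an event, say), or when it loops back on
    itself (a deadlock cycle).
    """
    chain = [start]
    seen = {start}
    current = start
    while True:
        resource = waiting_for.get(current)
        if resource is None:
            return chain  # current thread is not waiting on anything
        holder = owned_by.get(resource)
        if holder is None:
            return chain  # nobody owns it, so the chain ends here
        chain.append(holder)
        if holder in seen:
            return chain  # cycle found: classic deadlock
        seen.add(holder)
        current = holder


# Invented scenario: T1 waits on an event that never fires while holding
# locks A and B; the other threads pile up behind it.
waiting_for = {"T1": "event E", "T2": "lock A", "T3": "lock C", "T4": "lock B"}
owned_by = {"lock A": "T1", "lock B": "T1", "lock C": "T2"}

# Count which thread sits at the end of every chain.
roots = Counter(wait_chain(t, waiting_for, owned_by)[-1] for t in waiting_for)

if __name__ == "__main__":
    print(wait_chain("T3", waiting_for, owned_by))
    print(roots.most_common(1))
```

In this made-up scenario every chain ends at T1, which is the thread sat waiting for the event.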

And windbg sorted that all out for me. I could still look into it further – looking at the thread in question, examining the handle in more detail, and so on. But !analyze helped get me started.

I like. 🙂
