Hardcore Win32 developers are probably familiar with the OutputDebugString() API function that lets your program talk with a debugger. It's handier than having to create a logfile, and all the "real" debuggers can use it. The mechanism by which an application talks to the debugger is straightforward, and this Tech Tip documents how the whole thing works.

Table of Contents
  1. Application program usage
  2. The Protocol
  3. The permission problem
  4. Detailed implementation
  5. Random thoughts
  6. tool: dbmutex

This Tech Tip was prompted first by our observation that OutputDebugString() didn't always work reliably when Admin and non-Admin users tried to work and play together (on Win2000, at least). We suspected permissions issues on some of the kernel objects involved, and in the process ran across enough information that we had to write it up.

We'll note that though we're using the term "debugger", it's not used in the Debugging API sense: there is no "single stepping" or "breakpoints" or "attach to process" going on like one might find in MS Visual C or some real interactive development environment. Any program that implements the protocol is a "debugger" in this sense. This could be a very simple command-line tool, or one more advanced such as DebugView from the very smart guys at SysInternals.

Application program usage

The <windows.h> file declares two versions of the OutputDebugString() function - one for ASCII, one for Unicode - and unlike most of the Win32 API, which is natively Unicode, the native version here is ASCII.

Simply calling OutputDebugString() with a NUL-terminated string buffer causes the message to appear on the debugger, if there is one. Common usage builds a message and sends it

sprintf(msgbuf, "Cannot open file %s [err=%ld]\n", fname, GetLastError());
OutputDebugString(msgbuf);

but in practice many of us create a front-end function that allows us to use printf-style formatting. The odprintf() function formats the string, ensures that there is a proper CR/LF at the end (removing any previous line terminations), and sends the message to the debugger.

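#include <windows.h>
#include <stdio.h>
#include <stdarg.h>
#include <ctype.h>

void __cdecl odprintf(const char *format, ...)
{
char    buf[4096], *p = buf;
va_list args;
int     n;

        va_start(args, format);
        n = _vsnprintf(p, sizeof buf - 3, format, args); // buf-3 is room for CR/LF/NUL
        va_end(args);

        // a negative return means the output was truncated to fill the buffer
        p += (n < 0) ? sizeof buf - 3 : n;

        // strip trailing whitespace, including any line terminators
        while ( p > buf  &&  isspace((unsigned char)p[-1]) )
                *--p = '\0';

        *p++ = '\r';
        *p++ = '\n';
        *p   = '\0';

        OutputDebugStringA(buf);
}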

Then using it in code is easy:

        odprintf("Cannot open file %s [err=%ld]", fname, GetLastError());

We've been using this for years.
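
The trimming logic in odprintf() does not depend on Win32, so it can be sketched portably. The following is a minimal sketch rather than the code we actually ship: the name format_debug_line() is hypothetical, and it assumes C99 vsnprintf(), which reports truncation by returning a length at or beyond the buffer size instead of the negative value that _vsnprintf() returns.

```c
#include <ctype.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: format into out[0..outsize), strip trailing
 * whitespace, and finish with exactly one CR/LF pair.
 * Returns the length of the resulting string, or 0 if outsize < 3. */
size_t format_debug_line(char *out, size_t outsize, const char *format, ...)
{
    va_list args;
    size_t  room, len;
    int     n;

    if (outsize < 3) {              /* need room for CR, LF and NUL */
        if (outsize > 0)
            out[0] = '\0';
        return 0;
    }
    room = outsize - 2;             /* text plus its NUL; CR/LF go after */

    va_start(args, format);
    n = vsnprintf(out, room, format, args);
    va_end(args);

    if (n < 0)
        len = 0;                    /* encoding error: send an empty line */
    else if ((size_t)n >= room)
        len = room - 1;             /* truncated: buffer holds room-1 chars */
    else
        len = (size_t)n;

    while (len > 0 && isspace((unsigned char)out[len - 1]))
        len--;                      /* drop any previous line terminators */

    out[len++] = '\r';
    out[len++] = '\n';
    out[len]   = '\0';
    return len;
}
```

The resulting buffer could then be handed to OutputDebugStringA() on Windows, or to any other sink when testing elsewhere.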

The protocol

Passing of data between the application and the debugger is done via a 4kbyte chunk of shared memory, with a Mutex and two Event objects protecting access to it. These are the four kernel objects involved:

object name             object type
DBWinMutex              Mutex
DBWIN_BUFFER            Section (shared memory)
DBWIN_BUFFER_READY      Event
DBWIN_DATA_READY        Event

The mutex generally remains on the system all the time, but the other three are only present if a debugger is around to accept the messages. Indeed - if a debugger finds the last three objects already exist, it will refuse to run.

The DBWIN_BUFFER, when present, is organized like this structure. The process ID shows where the message came from, and string data fills out the remainder of the 4k. By convention, a NUL byte is always included at the end of the message.

struct dbwin_buffer {
        DWORD   dwProcessId;
        char    data[4096-sizeof(DWORD)];
};

When OutputDebugString() is called by an application, it takes these steps. Note that a failure at any point abandons the whole thing and treats the debugging request as a no-op (the string isn't sent anywhere).

  1. Open DBWinMutex and wait until we have exclusive access to it.
  2. Map the DBWIN_BUFFER segment into memory: if it's not found, there is no debugger running so the entire request is ignored.
  3. Open the DBWIN_BUFFER_READY and DBWIN_DATA_READY events. As with the shared memory segment, missing objects mean that no debugger is available.
  4. Wait for the DBWIN_BUFFER_READY event to be signaled: this says that the memory buffer is no longer in use. Most of the time, this event will be signaled immediately when it's examined, but it won't wait longer than 10 seconds for the buffer to become ready (a timeout abandons the request).
  5. Copy up to about 4kbytes of data to the memory buffer, and store the current process ID there as well. Always put a NUL byte at the end of the string.
  6. Tell the debugger that the buffer is ready by setting the DBWIN_DATA_READY event. The debugger takes it from there.
  7. Release the mutex.
  8. Close the Event and Section objects, though we keep the handle to the mutex around for later.
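
Step 5 is the only part of this sequence that touches the buffer contents, and it can be sketched without any Win32 calls. This is an illustration under stated assumptions, not the KERNEL32 code: the helper name dbwin_fill() is hypothetical, and uint32_t stands in for DWORD so the sketch compiles anywhere.

```c
#include <stdint.h>
#include <string.h>

#define DBWIN_BUFFER_SIZE 4096

/* Portable stand-in for the shared-memory layout: uint32_t replaces DWORD. */
struct dbwin_buffer {
    uint32_t process_id;
    char     data[DBWIN_BUFFER_SIZE - sizeof(uint32_t)];
};

/* Hypothetical helper for step 5: store the sender's process ID and as much
 * of the message as fits, always leaving room for the terminating NUL.
 * Returns the number of message bytes copied (excluding the NUL). */
size_t dbwin_fill(struct dbwin_buffer *buf, uint32_t pid, const char *msg)
{
    size_t len = strlen(msg);
    size_t max = sizeof buf->data - 1;

    if (len > max)
        len = max;                  /* silently truncate long messages */

    buf->process_id = pid;
    memcpy(buf->data, msg, len);
    buf->data[len] = '\0';
    return len;
}
```

In the real function the destination would be the mapped view of DBWIN_BUFFER, and the copy would happen only after the mutex and the DBWIN_BUFFER_READY wait had both succeeded.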

On the debugger front, it's a bit simpler. The mutex is not used at all, and if the events and/or shared memory objects already exist, we presume that some other debugger is already running. Only one debugger can be in the system at a time.

  1. Create the shared memory segment and the two events. If we can't, exit.
  2. Set the DBWIN_BUFFER_READY event so the applications know that the buffer is available.
  3. Wait for the DBWIN_DATA_READY event to be signaled.
  4. Extract the process ID and the NUL-terminated string from the memory buffer.
  5. Go to step #2
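
Step 4 of the debugger loop can be sketched in portable C. The helper name dbwin_extract() and its parameters are hypothetical. The sketch treats the terminating NUL as a convention rather than a guarantee, so it bounds the scan instead of trusting the sender.

```c
#include <stdint.h>
#include <string.h>

#define DBWIN_BUFFER_SIZE 4096

/* Hypothetical helper for debugger step 4. `shared` points at the 4096-byte
 * buffer; the process ID is copied out with memcpy to avoid alignment
 * assumptions, and the text scan is bounded so that a sender which omitted
 * the NUL cannot make us read past the end of the buffer.
 * Returns the length of the text placed in `text`. */
size_t dbwin_extract(const unsigned char *shared, uint32_t *pid,
                     char *text, size_t textsize)
{
    const char *data    = (const char *)shared + sizeof(uint32_t);
    size_t      datamax = DBWIN_BUFFER_SIZE - sizeof(uint32_t);
    size_t      len     = 0;

    memcpy(pid, shared, sizeof *pid);

    if (textsize == 0)
        return 0;

    while (len < datamax && len < textsize - 1 && data[len] != '\0')
        len++;

    memcpy(text, data, len);
    text[len] = '\0';
    return len;
}
```

A real debugger would call something like this each time DBWIN_DATA_READY is signaled, then set DBWIN_BUFFER_READY again so the next sender can proceed.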

This doesn't strike us as being a low-cost way of sending messages, and the application is at the mercy of the debugger for the speed at which it runs.

The Permissions Problem

We have seen problems for years with OutputDebugString() being unreliable at times, and we're not quite sure why Microsoft has such a hard time getting this right. Curiously, the problem has always revolved around the DBWinMutex object, and it requires that we visit the permissions system to find out why this is so troublesome.

The mutex object is alive and allocated until the last program using it closes its handle, so it can remain long after the original application which created it has exited. Since this object is so widely shared, it must be given explicit permissions that allow anybody to use it. Indeed, the "default" permissions are almost never suitable, and this mistake accounted for the first bug we observed in NT 3.51 and NT 4.0.

The fix - at the time - was to create this mutex with a wide-open DACL that allowed anybody to access it, but it seems that in Win2000 these permissions have been tightened up. Superficially they look correct, as we see in this table:

Administrators MUTEX_ALL_ACCESS

An application wishing to send debugging messages needs only the ability to wait for and acquire the mutex, and this is represented by the SYNCHRONIZE right. The permissions above are entirely correct to allow all users to participate this way.

The surprise occurs when one looks at the behavior of CreateMutex() when the object already exists. In that case, Win32 behaves as if we were calling:

OpenMutex(MUTEX_ALL_ACCESS, FALSE, "DBWinMutex");

Even though we only really need SYNCHRONIZE access, it presumes the caller wishes to do everything (MUTEX_ALL_ACCESS). Because non-admins do not have these rights - only the few listed above - the mutex cannot be opened or acquired, so OutputDebugString() quietly returns without doing anything.
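
The mismatch can be shown with a few lines of bit arithmetic. This is a deliberately simplified model of an access check (real checks also involve deny entries and other details), and the rights granted to the non-admin caller in the usage note are an assumption made for illustration. The constant values are the ones defined in the Windows SDK headers, repeated here so the sketch compiles without <windows.h>.

```c
#include <stdint.h>

/* Access-right values as defined in the Windows SDK headers, repeated
 * here so the sketch compiles without <windows.h>. */
#define READ_CONTROL        0x00020000u
#define SYNCHRONIZE         0x00100000u
#define MUTEX_MODIFY_STATE  0x00000001u
#define MUTEX_ALL_ACCESS    0x001F0001u

/* Simplified model of an access check: the open succeeds only if every
 * requested right is among the rights granted to the caller. */
int access_allowed(uint32_t granted, uint32_t requested)
{
    return (requested & ~granted) == 0;
}
```

With a caller assumed to hold SYNCHRONIZE | READ_CONTROL | MUTEX_MODIFY_STATE, access_allowed() accepts a request for SYNCHRONIZE alone but rejects MUTEX_ALL_ACCESS, because that mask also asks for rights such as WRITE_DAC and DELETE that the caller does not hold.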

Even deciding to perform all software development as an administrator is not a complete fix: if there are other users (services, for instance) that run as non-admins, their debugging information will be lost if the permissions are not right.

Our feeling is that the real fix requires that Microsoft add a parameter to CreateMutex() - the access mask to use for the implied OpenMutex() if the object already exists. Perhaps someday we'll see a CreateMutexEx(), but in the medium term we have to take another approach. Instead, we'll just hard-change the permissions on the object as it lives in memory.

This revolves around the SetKernelObjectSecurity() call, and this fragment shows how a program can open the mutex and install a new DACL. This DACL remains even after this program exits, as long as any other programs maintain HANDLEs to it.

// open the mutex that we're going to adjust
HANDLE hMutex = OpenMutex(MUTEX_ALL_ACCESS, FALSE, "DBWinMutex");

// create SECURITY_DESCRIPTOR with an explicit NULL DACL
// that allows full access to everybody
SECURITY_DESCRIPTOR sd;

InitializeSecurityDescriptor(&sd, SECURITY_DESCRIPTOR_REVISION);
SetSecurityDescriptorDacl(
        &sd,            // addr of SD
        TRUE,           // TRUE=DACL present
        NULL,           // ... but it's NULL (wide open)
        FALSE);         // DACL explicitly set, not defaulted

// plug in the new DACL
SetKernelObjectSecurity(hMutex, DACL_SECURITY_INFORMATION, &sd);

This approach is clearly going down the right road, but we still must find a place to put this logic. It would be possible to put this in a small program that could be run on demand, but this seems like it would be disruptive. Our approach has been to write a Win32 service that takes care of this.

Our dbmutex tool performs just this job: it launches at system boot time, opens or creates the mutex, and then sets the object's security to allow wide access. It then sleeps until shutdown, holding the mutex open in the process. It consumes no CPU time.

Detailed implementation

We've spent a bunch of time with IDA Pro digging into the Windows 2000 KERNEL32.DLL implementation, and we think we have a good handle on how it's working on a more precise basis. Here we present pseudocode (i.e., we've not compiled it) for the OutputDebugString() function, plus the function that creates the mutex.

We are purposely skipping most of the error checking: if things go wrong, it frees up allocated resources and exits as if no debugger were available. The goal here is to show the general behavior, not a complete reverse engineering of the code.

The "setup" function - whose name we have manufactured - opens the mutex, or creates it if it's not already there. They go to some pains to set the security on the mutex object so that anybody can use it, though in practice we'll see that they haven't quite gotten it right.


Random thoughts

It might strike some that this is a security matter, but it's really not. Non-admin users do have all the rights necessary to properly use OutputDebugString(), but due to the common mistake of "asking for more rights than required", a legitimate request is denied for a question posed in the wrong form.

But unlike most problems of this type, this one is less deliberate. Usually the developer explicitly asks for too much (e.g., "MUTEX_ALL_ACCESS"), but here the mask is implied by the behavior of CreateMutex(). This makes it harder to avoid without a change in the Win32 API.


While picking apart OutputDebugStringA() in KERNEL32.DLL, it became apparent how a non-admin could likely cripple a system. Once the mutex has been acquired, an application wishing to send a debug message waits up to ten seconds for the DBWIN_BUFFER_READY event to become ready, giving up if it times out. This seems like a prudent precaution to avoid starvation if the debugging system is busy.

But the earlier step, waiting for the mutex, has no such timeout. Any process on the system - including a non-privileged one - can open this mutex asking for SYNCHRONIZE rights, acquire it, and just sit on it. Any other process attempting to acquire this mutex will be stopped dead in its tracks with no time limit.

Our investigation shows that all kinds of programs send random bits of debugging information (for instance, the MusicMatch Jukebox has a keyboard hook that's very chatty), and these threads are all halted by a few lines of code. It won't necessarily stop the whole program - there could be other threads - but in practice, developers don't plan on OutputDebugString() being a denial-of-service avenue.


Oddly enough, we found that OutputDebugString() is not a native Unicode function. Most of the Win32 API has the "real" function use Unicode (the "W" version), and it automatically converts from ASCII to Unicode if the "A" version of the function is called.

But since OutputDebugString ultimately passes data to the debugger in the memory buffer strictly as ASCII, they have inverted the usual A/W pairing. This suggests that for sending a quick message to a debugger even in a Unicode program, it can be done by calling the "A" version directly:

OutputDebugStringA("Got here to place X");

Terminal Services considerations

We've discovered that in a Terminal Services environment, debug messages are generally relative to the current session and capturing global debugging messages is problematic. This appears to be because the mutex is now session relative rather than global.