Unfortunately, the Visual Studio® 6.0 IDE doesn't have an easy way for you to specify DelayLoading for DLLs. In Visual Studio 6.0, you'll have to add the /DELAYLOAD:XXX command-line fragment manually to the Project Settings | Link | Project Options edit field.
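As a rough illustration, the text added to the Project Options edit field might look like the fragment below. SHELL32.DLL is only a placeholder for whichever DLL you want to delay-load. The delay-load helper routine lives in DELAYIMP.LIB, so that library also needs to be on the link line.

```text
/DELAYLOAD:SHELL32.DLL delayimp.lib
```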
When to Use DelayLoad

When you have a small project, it's easy to come up with a list of DLLs that are good DelayLoad candidates. However, because projects may grow and can involve many developers, it's just as easy to lose track of who uses which DLL. In the past, I've relied on gut instinct and Depends.EXE from the Platform SDK. A DLL from which only a few functions are imported is a good place to start. However, I wanted a way to automate and simplify the process. Thus was born the DelayLoadProfile program.

DelayLoadProfile is a tool that runs your EXE and monitors the DLLs and functions that your EXE calls. After your program terminates, DelayLoadProfile spits out a summary of which DLLs were used and how many calls were made to each DLL. A DLL that's imported, but which had no calls made to it, is a good candidate for DelayLoad importing.

Let me emphasize one point before continuing: DelayLoadProfile works only against your EXE. While it could be extended to recurse into all of your imported DLLs and their dependencies, that would significantly complicate its code. As I'll explain later, DelayLoadProfile just gives you hints about which DLLs you might consider using /DELAYLOAD on. You still have to use that neuron-based processing unit between your ears to make sure it makes sense to do so.
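The selection rule described above (imported, but never called) can be sketched in a few lines of C++. This is only an illustration, not DelayLoadProfile's actual reporting code: the DllUsage record and the DelayLoadCandidates function are hypothetical names invented for this sketch.

```cpp
#include <string>
#include <vector>

// Hypothetical record: one imported DLL and the calls observed into it.
struct DllUsage {
    std::string name;
    unsigned long calls;
};

// Returns the DLLs that were imported but never called during the run.
std::vector<std::string> DelayLoadCandidates(const std::vector<DllUsage>& usage) {
    std::vector<std::string> candidates;
    for (const DllUsage& dll : usage) {
        if (dll.calls == 0) candidates.push_back(dll.name);
    }
    return candidates;
}
```

A zero count only means the DLL wasn't called during that particular run, so each candidate still deserves a second look before you add /DELAYLOAD for it.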
DelayLoadProfile: The Big Picture

The concept behind DelayLoadProfile is simple. Redirecting the function pointers in the EXE's IAT to point to a stub is all that's needed. The stub simply notes that the imported function has been called, then jumps to the address that the Win32 loader originally stored in the IAT. However, the devil is in the details.

First, you must decide where the code will run that locates and modifies the EXE's IAT entries to point to the stubs. Doing the work out-of-process in some sort of control program is one option. This avoids the work involved in getting your code into the target EXE's process. The downside is that it's more work to traverse all the data structures necessary to locate and patch the IAT entries, as well as gather the results later. I'd be swimming in ReadProcessMemory calls.

The other approach is to do the hard work in the same process space as the target EXE. This makes it almost trivial to march through the data structures, build stubs, redirect the IAT entries, and summarize the results at the end. However, doing the work in-process requires that some of the DelayLoadProfile code be loaded into the target EXE's process as it runs. This is the path I took.

Having committed to running my code in-process with the target, the next problem was figuring out how to get my code into the target process. One choice would have been to ask the user to link with the DelayLoadProfile code. Knowing it would require some effort by the target audience, I discarded this option. If a DelayLoadProfile user needed to modify their source, project, or makefile, many would pass. I needed to make DelayLoadProfile a complete no-brainer.

At this point, I had boxed myself into some sort of loader program that would run the target EXE and inject my DelayLoadProfile DLL into it. One technique for DLL injection is to use CreateRemoteThread to start a thread in the target process that calls LoadLibrary on your DLL.
I discarded this approach because CreateRemoteThread isn't available on Windows 9x, which I wanted to support. Longtime MSJ readers may remember a program I wrote more than five years ago called APISPY32. It loads a process and injects a DLL into it for the purposes of logging API calls. That sounds similar to what I needed DelayLoadProfile to do. Alas, when I ran APISPY32 on Windows 2000, it failed to load the DLL. A little digging revealed the source of the problem, and I decided it was time to revamp this code for a whole new generation of programmers.
Into the Trenches

To review quickly, DelayLoadProfile is a two-part system. A loader process runs your program. Early on in your program, the loader process injects a DLL into your program's address space. This DLL scans through your EXE's IAT and redirects the imported functions to point to stubs that the DLL creates. When your program shuts down, the injected DLL scans through the stubs it has created and summarizes how many calls were made to each imported DLL. If you've ever used the APIMON utility from the Platform SDK, you'll recognize the similarities.

The DLL that does all the work of monitoring a program's use of imports is called DelayLoadProfileDLL (see Figure 1). DelayLoadProfileDLL uses the DLL_PROCESS_ATTACH and DLL_PROCESS_DETACH notifications sent to its DllMain procedure to initiate the two primary phases of the DLL's work.

When its DllMain gets the DLL_PROCESS_ATTACH notification, DelayLoadProfileDLL calls PrepareToProfile. Inside PrepareToProfile, the code locates the target EXE's IAT. For each imported DLL it finds, the code determines if it's a DLL that's safe for IAT redirection. It does this by calling the IsModuleOKToHook function. Most of the time, it's OK to redirect the IAT, so PrepareToProfile invokes the RedirectIAT function.

RedirectIAT is where things get dirty, and it really helps if you understand the import-related data structures in WINNT.H. First, the function locates the IAT and the associated Import Names Table. The code then counts how many IAT entries there are by scanning through the IAT, looking for a NULL pointer. With this count, an array of DLPD_IAT_STUB stubs is created, with one stub for each IAT entry.

Finally, it's time for meatball surgery. The code makes yet another pass through the IAT. This time it grabs the address in each IAT entry, stuffs it into a JMP instruction in the stub, and redirects the IAT entry to point to the stub.
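The two passes just described (count entries up to the NULL terminator, then save each original address and repoint the entry) can be modeled in portable C++. What follows is a simplified sketch and not the code from DelayLoadProfileDLL. It uses a plain array of function pointers in place of a real IAT, it swaps the run-time generated JMP stubs for template-generated forwarding functions, and every name in it (ImportFn, StubRecord, Trampoline, RedirectTable, and the three stand-in imports) is an assumption made for illustration.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <utility>

// Hypothetical signature shared by every "import" in this model.
using ImportFn = int (*)(int);

// Per-entry bookkeeping, loosely standing in for a DLPD_IAT_STUB.
struct StubRecord {
    ImportFn original = nullptr;  // address originally stored in the table
    unsigned long calls = 0;      // number of calls routed through the stub
};

constexpr std::size_t kMaxStubs = 8;  // arbitrary cap for this sketch
static std::array<StubRecord, kMaxStubs> g_stubs;

// One distinct forwarding function per table slot. The real tool builds
// machine-code stubs at run time; templates are a portable substitute.
template <std::size_t I>
int Trampoline(int arg) {
    ++g_stubs[I].calls;               // note that the import was called
    return g_stubs[I].original(arg);  // then forward to the original target
}

template <std::size_t... I>
std::array<ImportFn, sizeof...(I)> MakeTrampolines(std::index_sequence<I...>) {
    return {{&Trampoline<I>...}};
}

static const std::array<ImportFn, kMaxStubs> g_trampolines =
    MakeTrampolines(std::make_index_sequence<kMaxStubs>{});

// Pass 1: count entries by scanning for the terminating null pointer.
std::size_t CountEntries(const ImportFn* table) {
    assert(table != nullptr);
    std::size_t count = 0;
    while (table[count] != nullptr) ++count;
    return count;
}

// Pass 2: save each original address in its stub, then repoint the entry.
// Returns the number of entries redirected (0 if the table is too large
// for this sketch's fixed stub pool, in which case nothing is modified).
std::size_t RedirectTable(ImportFn* table) {
    const std::size_t count = CountEntries(table);
    if (count > kMaxStubs) return 0;
    for (std::size_t i = 0; i < count; ++i) {
        g_stubs[i].original = table[i];
        g_stubs[i].calls = 0;
        table[i] = g_trampolines[i];
    }
    return count;
}

// Stand-ins for imported functions.
int Twice(int x) { return 2 * x; }
int AddOne(int x) { return x + 1; }
int NeverCalled(int x) { return x; }
```

To try it, build a null-terminated table such as `ImportFn table[] = {&Twice, &AddOne, &NeverCalled, nullptr};`, call `RedirectTable(table)`, and then call through `table[i]` as the program normally would. The counters in `g_stubs` then show which entries were exercised, and an entry whose count stays at zero corresponds to an import that was never called.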
As the code advances through each subsequent IAT entry, it also advances to the next DLPD_IAT_STUB stub in the allocated array. I'll explain DLPD_IAT_STUB stubs a little later in this column.

Two aspects of redirecting the IAT entries to the allocated stubs are worth mentioning. First, the IAT is often placed in a read-only section of the EXE. Ordinarily, an attempt to modify such an IAT pointer would result in an access violation. Luckily, the VirtualProtect API comes to the rescue and enables you to modify the protection attributes of a target address range, in this case, the IAT. The attribute you want to set is read-write. When it's finished, the code restores the original memory protection attributes.

The other tricky part of redirecting the IAT occurs when you encounter a data import. Although programmers don't frequently do so, it's relatively easy to import data in addition to code. The Visual C++ runtime library DLL (MSVCRT.DLL) has data exports. Redirecting an IAT entry that refers to data in an imported DLL is almost certainly a recipe for problems.

So how do you determine whether an import is a normal code import or a data import? A commercial product could implement a sophisticated algorithm to determine the import type of an IAT entry. However, I took a shortcut and used IsBadWritePtr. If the IAT entry points to memory that's writeable, it's probably pointing to data. Likewise, if it points to read-only memory, odds are that it's pointing to code. Is this a perfect test? No, but it's good enough for DelayLoadProfile's needs.

Now let's take a look at the stubs. The DLPD_IAT_STUB structure in DelayLoadProfileDLL.H contains the layout, which is a mixture of code and data. Simplifying this structure, a DLPD_IAT_STUB stub looks like this: