Since crash dumps can only be troubleshot with the original binaries and debug symbols generated from the original source code, any suggestions that ask to change the source code, or to recompile, do not help with the present situation (of crash dumps that OP has already received). These suggestions could only improve the prognosis of troubleshooting the crash dumps of future releases of the software.
(Rant.) Should I mention that the typical cost to technical support of investigating a non-obvious crash dump, one involving some kind of data trashing and reconstruction of program and execution state, is about USD 1000? If you can spend one week of work to prevent one such crash dump in the future, that seems worth it. I see an outsourcing opportunity here (to lower-cost countries), but letting a third party investigate a core dump means giving away the entirety of the company's tangible software secrets.
In general, the ability to read and understand the assembly code shown in the Visual Studio step debugger is a requirement for one to work with crash dumps.
The choices are:
- Acquire the ability to handle this task yourself.
  - Learn to read assembly code, and learn the techniques for reconstructing the program state and the path of execution prior to the crash when given a crash dump.
- Let your supervisor know so that the task can be reassigned.
  - Delegate this task to someone else.
Techniques that I use when faced with the same situation:
The first rule before working on a crash dump is to try to reproduce the same crash yourself. If you can recreate the conditions that lead to a crash, it enables you to troubleshoot a live instance of the crashing program (i.e. you can step-execute it), as opposed to a frozen instance (i.e. a crash dump which does not allow you to step-execute).
If a function is known to have been inlined, then I try to locate all of the callers (technically "call sites"), and set breakpoints there. If even the callers are inlined, I will chase their parent callers until I find call sites that aren't inlined.
Note that, for the purpose of locating all possible call sites, you can work on a rebuild of slightly modified source code. The list of "all possible call sites" is knowledge you can carry over from one version of the source code, and most of it still applies to the crash dump you are working on.
To track down a struct, it is necessary to know its address. A typical way in which the address of a struct is revealed in disassembly code is when that address was passed into a function call.
Are there problems with the "rule of zero"?
I agree that it is often misapplied (and it would be better if this issue were fixed in the source code for future releases), and I will share my own experience.
Personally, I do follow the "rule of three" or "rule of five" for most C++ classes if at least one of the constructors or the destructor is not trivial.
I do this for two reasons.
The first reason is similar to yours. When a constructor is not trivial, there is some "code" that the compiler must generate for you; this generated code may need to call other code, such as the constructors of some members of the class. If that other code was written by you, you will want to be able to debug it. Providing a user-defined constructor makes this task easier.
The second reason is that, when a constructor is implicitly declared, in some situations it can cause issues when I try to store that type in an STL vector. The superficial reason was related to the problem of incomplete types, but since I'm not certain about the underlying reason, and it's not related to this question, I will stop here.
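As an illustration of the first reason, here is a minimal sketch of a class that follows the "rule of five" with every special member defined out of line, so that each one has a named body in which a breakpoint can be set. The class name `Buffer` and its members are hypothetical and exist only for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <utility>

// Hypothetical resource-owning type used only for illustration.
class Buffer {
public:
    explicit Buffer(std::size_t size);
    ~Buffer();
    Buffer(const Buffer& other);
    Buffer& operator=(const Buffer& other);
    Buffer(Buffer&& other) noexcept;
    Buffer& operator=(Buffer&& other) noexcept;

    std::size_t size() const { return size_; }
    char* data() { return data_; }

private:
    std::size_t size_;
    char* data_;
};

// Each special member has its own user-written body.
Buffer::Buffer(std::size_t size) : size_(size), data_(new char[size]()) {}

Buffer::~Buffer() { delete[] data_; }

Buffer::Buffer(const Buffer& other)
    : size_(other.size_), data_(new char[other.size_]) {
    std::memcpy(data_, other.data_, size_);
}

Buffer& Buffer::operator=(const Buffer& other) {
    if (this != &other) {
        Buffer tmp(other);             // copy first, then swap into place
        std::swap(size_, tmp.size_);
        std::swap(data_, tmp.data_);
    }
    return *this;
}

Buffer::Buffer(Buffer&& other) noexcept
    : size_(other.size_), data_(other.data_) {
    other.size_ = 0;                   // leave the source empty but valid
    other.data_ = nullptr;
}

Buffer& Buffer::operator=(Buffer&& other) noexcept {
    if (this != &other) {
        delete[] data_;
        size_ = other.size_;
        data_ = other.data_;
        other.size_ = 0;
        other.data_ = nullptr;
    }
    return *this;
}
```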
However, providing a user-defined constructor does not always turn off inlining for these methods; the compiler may inline some of them (e.g. the constructor) anyway. To truly prevent inlining, you would use `__declspec(noinline)` with MSVC.
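A minimal sketch of how this might look follows. The macro name `NOINLINE` and the type `Widget` are assumptions made for this example; the non-MSVC branch uses the GCC/Clang attribute only so that the snippet also compiles elsewhere.

```cpp
#include <cassert>

// NOINLINE is a hypothetical project macro, not part of any library.
#if defined(_MSC_VER)
#  define NOINLINE __declspec(noinline)
#else
#  define NOINLINE __attribute__((noinline))  // GCC/Clang counterpart
#endif

// Hypothetical type: its constructor and destructor are requested to stay
// real function calls, so their addresses can be found in the disassembly.
struct Widget {
    NOINLINE Widget();
    NOINLINE ~Widget();
    int value;
};

Widget::Widget() : value(42) {}
Widget::~Widget() { value = 0; }
```

This costs a function call at every construction and destruction, so it is probably best reserved for the types you actually expect to chase in a crash dump.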
The ability to use disassembly debugging on optimized builds (builds made with "Release" configuration) depends on both inlining and the version of MSVC. Visual C++ 2017 has better handling for maintaining the map between disassembly address and line-of-code location for inlined functions.
Sometimes, the disassembly debugger skips over an empty constructor because it is truly empty. A C-compatible "POD" type is typically believed to be initialized with `memset(p, 0, sizeof(*p))` or a sequence of `XOR RAX, RAX; MOV [...], RAX`, which are effectively zero-filling instructions. In some cases they aren't zero-filled at all, i.e. there aren't any instructions. The disassembly debugger cannot stop on non-existent disassembly addresses.
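The difference can be seen at the source level. In the sketch below (the struct `Pod` and the helper names are hypothetical), value-initialization and an explicit `memset` both produce zero-filling instructions, whereas a plain `Pod p;` declaration is default-initialization, which for such a type performs no initialization and so may emit no instructions at all.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical C-compatible type.
struct Pod {
    int id;
    double weight;
    char tag[8];
};

// Value-initialization: every member is zeroed.
Pod make_value_initialized() {
    Pod p{};
    return p;
}

// Explicit zero fill, as a C programmer would write it.
Pod make_memset() {
    Pod p;
    std::memset(&p, 0, sizeof(p));
    return p;
}

// By contrast, a bare "Pod p;" leaves the members indeterminate and gives
// the debugger nothing to stop on; reading them would be undefined behavior,
// so it is deliberately not demonstrated here.
```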
Just as an offhand remark, I find that:
The destructor of a type appears to generate more "stuff that is visible in disassembly". This is most noticeable when the destructor is called as part of a deletion on dynamic memory (`delete`). In the disassembly, one will see an actual function call to an address associated with `some_type::scalar deleting destructor`. Once a module (EXE or DLL) has been loaded into the process space, one can set a breakpoint at its disassembly address.
I tend to follow two-phase initialization in my project, due to reasons which aren't pertinent to this question. But because of this project guideline, I find that errors tend to be thrown not from the constructor, but instead from the second function which does the heavy-lifting.
I also use exceptions with string messages. If your project doesn't use exceptions, you may consider error logging as an alternative for capturing details about errors.
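The two remarks above can be sketched together. In this hypothetical example (the class `Connection`, its `open` method, and the message text are all made up for illustration), the constructor does nothing that can fail, and the second-phase function does the real work and throws an exception carrying a descriptive string.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical type illustrating two-phase initialization.
class Connection {
public:
    // Phase 1: cheap, cannot fail.
    Connection() noexcept : port_(0), open_(false) {}

    // Phase 2: does the heavy lifting and reports errors with a message.
    void open(int port) {
        if (port <= 0 || port > 65535) {
            throw std::runtime_error(
                "Connection::open: invalid port " + std::to_string(port));
        }
        port_ = port;
        open_ = true;
    }

    bool is_open() const { return open_; }
    int port() const { return port_; }

private:
    int port_;
    bool open_;
};
```

Because the failure originates in a named, non-trivial function rather than in a constructor, the throwing call site tends to be easier to locate in a call stack.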
The key to debugging problems with a struct is to know (record) its address with certainty. There are many techniques. However, a discussion of these techniques may make the discussion too broad.
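As one hedged example of such a technique (not necessarily the one best suited to your code base), the constructor can record `this` both in a log line and in a global pointer. The names `Session` and `g_last_session` are hypothetical, and whether a given crash dump actually captures global data depends on how the dump was written.

```cpp
#include <cassert>
#include <cstdio>

struct Session;

// Hypothetical global that keeps the most recently constructed address
// somewhere with a fixed, symbol-named location.
Session* volatile g_last_session = nullptr;

struct Session {
    Session() {
        g_last_session = this;
        // Also log the address so it can be matched against the dump later.
        std::fprintf(stderr, "Session constructed at %p\n",
                     static_cast<void*>(this));
    }

    ~Session() {
        if (g_last_session == this) {
            g_last_session = nullptr;
        }
    }

    int state = 0;
};
```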