Debugging
This section assumes you are using gdb from the console. There are also specific notes on Windows Debugging, on using a graphical tool on Mac OS X (see MacOSX_Debug_OpenOffice.org_using_XCode), and hints on Build Problems Debugging.
Building with debugging symbols
OO.o includes a way to add debugging code per module, by running
build debug=true
in each module.
This also adds lots of runtime assertions, warnings etc. in addition to debug symbols, which can be useful. To do just a plain build with debug symbols, though, use
build debug=true dbg_build_only=true
or, in later versions, use
build debug=true dbglevel=2
for maximum output, and dbglevel=1 or 0 for less output.
You can also configure OO.o with --enable-symbols to build with debugging symbols.
gdb invocation
If you debug with gdb, you may find that execution stops due to signals at inappropriate locations, especially if you are running against libgcj and need to ignore its garbage-collection signals while debugging. The best invocation is:
gdb ./soffice.bin
(gdb) handle SIGPWR nostop noprint
(gdb) handle SIGXCPU nostop noprint
(gdb) handle SIG33 nostop noprint
(gdb) run -norestore -writer
Replace -writer with -draw/-impress/-calc/... as appropriate. The -norestore option prevents display of the crash reporter (as one frequently kills the office during debugging).
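If you get tired of typing the handle lines, the same session can be packed into one command line. The following is only a sketch in Python, not part of the OO.o tooling: the function name gdb_command is made up for illustration, and it assumes a gdb that supports the -ex and --args options.

```python
import shlex

# Signals silenced in the interactive session above.
IGNORED_SIGNALS = ["SIGPWR", "SIGXCPU", "SIG33"]


def gdb_command(binary="./soffice.bin", app_flag="-writer"):
    """Return an argv list that starts gdb with the signals silenced.

    Hypothetical helper: assumes gdb understands -ex and --args.
    """
    argv = ["gdb"]
    for sig in IGNORED_SIGNALS:
        # Same as typing "handle <signal> nostop noprint" at the (gdb) prompt.
        argv += ["-ex", "handle %s nostop noprint" % sig]
    # Everything after --args is the program plus its own arguments.
    argv += ["--args", binary, "-norestore", app_flag]
    return argv


if __name__ == "__main__":
    # Print a shell-quoted command line that can be pasted into a terminal.
    print(" ".join(shlex.quote(arg) for arg in gdb_command()))
```

With --args the program arguments are already set, so a plain run at the (gdb) prompt starts the office.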
Starting at the beginning
We start in 'main' with a sal wrapper that calls SVMain (vcl/source/app/svmain.cxx). It invokes Main on pSVData->mpApp, but pSVData is an in-line local. To debug this, use the pImplSVData global variable, e.g.:
p pImplSVData->maAppData
This 'Main' method is typically: desktop/source/app/app.cxx (Main).
Examining strings
We have already seen that OO.o has its own set of string classes, none of which gdb understands. When debugging C++ code, you need to use:
(gdb) print dbg_dump(sWhatEver)
to print the contents of a UniString/ByteString/rtl::OUString/rtl::OString, regardless of the type. See Caolan's write-up for details.
Getting the build order right
The build dependencies of the modules are clearly crucial to getting a clean build. When you type 'build' in a module, build first examines prj/build.lst, e.g. neon/prj/build.lst:
xh neon : soltools external expat NULL
This specifies that 'soltools', 'external' and 'expat' have to be satisfactorily built and delivered before neon can be built. Occasionally these rules get broken, and people don't notice for a while.
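To make the reading of that line concrete, here is a rough Python sketch that pulls the module name and its prerequisites out of a dependency line. It is not the real build tool: the function names are invented, and it assumes the layout shown above (a prefix, the module name, a colon, then a NULL-terminated list of modules).

```python
def parse_dependency_line(line):
    """Split 'xh neon : soltools external expat NULL' into (module, deps).

    Assumption: the module name is the last word before the first ':' and
    the dependency list is terminated by the literal word NULL.
    """
    head, sep, tail = line.partition(":")
    head_fields = head.split()
    if not sep or not head_fields:
        raise ValueError("not a dependency line: %r" % line)
    module = head_fields[-1]
    deps = [word for word in tail.split() if word != "NULL"]
    return module, deps


def missing_prerequisites(line, delivered):
    """List the prerequisites from 'line' that are not in 'delivered'."""
    _module, deps = parse_dependency_line(line)
    return [dep for dep in deps if dep not in delivered]
```

For example, calling missing_prerequisites with the neon line and a set containing only soltools and expat reports that external still has to be built and delivered.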
It crashes, but only in gdb
What fun: you symlinked desktop/unxlngi4.pro/bin/soffice to soffice.bin in your install tree, didn't you? That works fine if you just run it, but it seems gdb resolves the symlink and passes a fully qualified path as argv[0], which defeats the hunting for the binary in the path. So it assigns the program base path as /opt/OpenOffice/OOO_STABLE_1/desktop/unxlngi4.pro/bin and starts looking for files (e.g. applicat.rdb) in there. Of course, when it fails to find any setup information, it silently crashes somewhere else, yards away from the original problem.
It crashes, but doesn't crash
For various reasons OO.o installs its own signal handlers, so crashes get trapped and life can get rather confusing; thus it's best for builders to apply something like this patch:
--- sal/osl/unx/signal.c
+++ sal/osl/unx/signal.c
@@ -188,6 +188,8 @@ static sal_Bool InitSignal()
         bSetILLHandler = sal_True;
     }
 
+    bSetSEGVHandler = bSetWINCHHandler = bSetILLHandler = bDoHardKill = sal_False;
+
     SignalListMutex = osl_createMutex();
 
     act.sa_handler = SignalHandlerFunction;
I can't find the code from the trace
Some methods are declared with a special linkage so that they can be used as callbacks; these typically have the prefix 'LinkStub', so search for the latter part of the identifier in a free-text search, e.g.
IMPL_LINK( Window, ImplHandlePaintHdl, void*, EMPTYARG )
builds the 'LinkStubImplHandlePaintHdl' method.
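Stripping the prefix by hand is easy enough, but if you are processing a whole trace, a small Python sketch like the following can do it. The helper name handler_name is invented here, and the only assumption is the 'LinkStub' prefix described above.

```python
import re

# 'LinkStub' followed by the identifier given to IMPL_LINK.
_LINKSTUB = re.compile(r"LinkStub(\w+)")


def handler_name(symbol):
    """Return the handler name hidden in a LinkStub symbol, or None.

    Hypothetical helper: 'symbol' is a function name as printed in a trace.
    """
    match = _LINKSTUB.search(symbol)
    return match.group(1) if match else None
```

The name it returns (ImplHandlePaintHdl for the example above) is the string to search the sources for.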
How can I re-build just the files I see in the trace
Often when you run gdb on a build without debugging symbols, you get an unhelpful trace, yet you can't afford the time/space to recompile all of OO.o with debugging symbols. Thus we have created a small perl helper, ootouch, which will hunt for and touch files containing the symbols from your trace. This subset can then be rebuilt with debugging enabled for a better trace next time around:
gdb ./soffice.bin
...
bt
#0 0x40b4e0a1 in kill () from /lib/libc.so.6
#1 0x409acfe6 in raise () from /lib/libpthread.so.0
#2 0x447bcdbd in SfxMedium::DownLoad(Link const&) () from ./libsfx641li.so
#3 0x447be151 in SfxMedium::SfxMedium(String const&, unsigned short, unsigned char, SfxFilter const*, SfxItemSet*) () from ./libsfx641li.so
#4 0x448339d3 in getCppuType(com::sun::star::uno::Reference<com::sun::star::document::XImporter> const*) () from ./libsfx641li.so
...
quit
cd base/OOO_STABLE_1/sfx2
ootouch SfxMedium
build debug=true
Thus, all files referencing / implementing anything with SfxMedium will be touched, and hence rebuilt with debugging symbols.
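For the curious, the idea behind such a helper can be sketched in a few lines of Python. This is not the real ootouch (which is a perl script); it is a guess at the general approach, assuming a plain-text search through .cxx and .hxx files, and the function name is invented.

```python
import os


def touch_files_mentioning(root, symbol, suffixes=(".cxx", ".hxx")):
    """Update the mtime of every source file under 'root' that mentions 'symbol'.

    Assumption: a plain substring search is good enough to find the files
    that reference or implement the symbol. Returns the touched paths.
    """
    touched = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if not name.endswith(suffixes):
                continue
            path = os.path.join(dirpath, name)
            # Undecodable bytes are replaced so odd encodings do not abort the scan.
            with open(path, "r", errors="replace") as handle:
                if symbol in handle.read():
                    os.utime(path, None)  # set access/modification time to now
                    touched.append(path)
    return sorted(touched)
```

Since the build decides what to recompile from timestamps, touching the files is enough to have them rebuilt by the next build debug=true.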
How can I re-build all the files in one source directory
If you want to recompile the code in just your current directory, you can use the killobj dmake target to remove the object files:
dmake killobj
dmake
It always crashes in sal_XErrorHdl
You are a victim of asynchronous X error reporting. Setting
export SAL_SYNCHRONIZE=1
will make all the X traffic synchronous, so the error is reported in the method that caused it. It will also make OO.o far slower, and change the timing.
It silently fails to load my word file
Caolan suggests: put breakpoints at the top and tail of SwWW8ImplReader::LoadDoc in ww8par.cxx, and confirm that the document gets as far as the import filter.
A handy, human-readable place to put a breakpoint is SwWW8ImplReader::ReadPlainChars, where you can see chunks of text as they are read in. Alternatively, use SwWW8ImplReader::AppendTxtNode, which is hit as each paragraph is inserted.
How do I use the debug console ?
OO.o contains some hefty debugging infrastructure.
Enabling it is pretty easy - what you need is a so-called Non-Product Build.
By default, an OpenOffice.org build is a Product Build, i.e. ready for release after completion. If you specify the --enable-dbgutil switch during configure, then your environment will be prepared for a Non-Product Build, with lots of additional diagnostic tools.
Note that libraries from product and non-product builds are usually incompatible, so don't mix them in the same installation.
For the tools available in non-product builds, have a look at the various DBG_foo macros in tools/debug.hxx or, if you are already knowledgeable about this, let others benefit by writing your knowledge down here.
To actually fire up the debug settings dialog, press <ctrl><alt><shift>-D.
Excel Interop debugging
This is fairly easy: edit sc/source/filter/inc/biffdump.hxx, define EXC_INCL_DUMPER to 1, and re-build 'sc'. Also, copy sc/source/filter/excel/biffrecdumper.ini to your home directory (~). Then run
soffice.bin foo.xls
and you should get a foo.txt with the debug data in it.
The trace shows a crash in 'poll'
OO.o is a fairly heavily threaded program, so you're probably just looking at the wrong thread: there are not likely to be bugs in poll. Use
thread apply all backtrace
to get a backtrace of all threads. This will most likely fail; when it does, do:
thread 1
bt
since most crashes occur in the 'main' thread.
What does this trace mean ?
There are several typical stack-traces that come up again and again, one would be:
#15 0x4164a501 in raise () from /lib/tls/libc.so.6
#16 0x4164bcd9 in abort () from /lib/tls/libc.so.6
#17 0x415fb5a5 in std::set_unexpected () from /home/mnagashree/m72install/program/libstdc++.so.5
#18 0x415fb5e2 in std::terminate () from /home/mnagashree/m72install/program/libstdc++.so.5
#19 0x415fb69c in __cxa_rethrow ()
This section of trace means (essentially) that an exception was thrown - but there was no-one trying to catch it. Often this means there was a missing 'try {} catch()' clause in one of the calling frames.
A great way to debug exceptions is to make gdb stop whenever one is thrown or caught. Do this in gdb with
catch throw
or
catch catch
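If you triage many traces, the abort-below-terminate pattern from the start of this section can be spotted mechanically. Here is a minimal Python sketch, assuming the usual gdb frame layout ('#N 0xADDR in function ...'); the function names are invented for illustration.

```python
import re

# One gdb frame: "#16 0x4164bcd9 in abort () from ..." (address is optional).
_FRAME = re.compile(r"#\d+\s+(?:0x[0-9a-f]+ in )?([^\s(]+)")


def frame_functions(trace):
    """Return the function names of all frames, innermost first."""
    return [match.group(1) for match in _FRAME.finditer(trace)]


def looks_like_uncaught_exception(trace):
    """True if abort() was reached from std::terminate().

    In a gdb trace the callee is listed before its caller, so abort must
    appear earlier in the list than std::terminate.
    """
    names = frame_functions(trace)
    if "std::terminate" not in names:
        return False
    return "abort" in names[:names.index("std::terminate")]
```

A trace that matches is a good candidate for a re-run with catch throw set, so that gdb stops where the exception is actually raised.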
STLport and checking iterators
The STL is a powerful tool, but it also makes it easy, in the grand old C/C++ tradition, to shoot oneself in the foot, as we all know. STL containers and algorithms are now pervasive in OO.o, so there is a need to validate the use of STL constructs in OO.o to find hidden problems.
Fortunately the STLport library, the default STL implementation for OO.o, has a powerful debug mode, and it's easy to use. Since SRC680 m128 it is possible to use the environment variable USE_STLP_DEBUG to switch on the STLport debug mode; since SRC680 m150 it works for Windows, too.
The most useful part of the STLport debug mode is iterator checking. Running the OO.o smoke test plus a little additional random testing, we have already found a number of questionable STL constructs.
Only code paths which are exercised will be tested by the STLport debug mode, though. If STLport finds a questionable STL usage it will throw an assertion and terminate. It is usually quite easy to extract a precise stack trace.
Some notes:
- STLport debug mode iterators are not pointers! We've cleaned up all occurrences of the lazy (and wrong) usage of iterators as pointers in SRC680 m128/m150, but maybe something new has already crept in. This cleanup also helps with other STL implementations, like the one which comes with gcc-4.x.
- A complete recompile is necessary: the debug mode renders all objects that use STL constructs binary incompatible.
- The STLport debug mode breaks the complexity guarantees of the STL. Theoretically some operations should be much slower in debug mode than in product mode. In practice I didn't notice a real slowdown of OO.o.