Debugging

From Apache OpenOffice Wiki
Revision as of 02:56, 29 December 2008 by DavidRoss (Talk | contribs)


This section assumes you are using gdb from the console. There are also specific notes on Windows Debugging, on using a graphical tool on Mac OS X (see MacOSX_Debug_OpenOffice.org_using_XCode), and hints on Build Problems Debugging.

Building with debugging symbols

OO.o can add debugging code per module, via the build debug=true command run in each module. In addition to debug symbols, this also adds lots of runtime assertions, churning warnings etc., which can be useful. To do just a plain build with debug symbols, use build debug=true dbg_build_only=true; in later versions, use build debug=true dbglevel=2 for maximum output, and dbglevel=1 or 0 for less output.

You can also configure OO.o with --enable-symbols to build with debugging symbols.

gdb invocation

If you debug with gdb, you may find that execution stops on signals at inappropriate locations, especially when running against libgcj, whose garbage collection you need to ignore while debugging. The best invocation is:

gdb ./soffice.bin
(gdb) handle SIGPWR nostop noprint
(gdb) handle SIGXCPU nostop noprint
(gdb) handle SIG33 nostop noprint
(gdb) run -norestore -writer

Replace -writer with -draw, -impress, -calc etc. as appropriate. The -norestore option prevents display of the crash reporter (as one frequently kills the office during debugging).
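The same session can be started in one go from a script. The following is a minimal sketch, not part of the OO.o tooling: it assembles a gdb command line using gdb's standard -ex option (run a command at startup) and --args option (pass the remaining words to the program). The function names gdb_command and as_shell_line are made up for illustration.

```python
import shlex

# Signals this page recommends silencing before running the office.
QUIET_SIGNALS = ["SIGPWR", "SIGXCPU", "SIG33"]


def gdb_command(app="writer", binary="./soffice.bin"):
    """Return the argv list for a gdb session with the handle commands preloaded."""
    argv = ["gdb"]
    for sig in QUIET_SIGNALS:
        # -ex executes one gdb command at startup, before the program runs.
        argv += ["-ex", f"handle {sig} nostop noprint"]
    # --args: everything after it is the program and its own arguments.
    argv += ["--args", binary, "-norestore", f"-{app}"]
    return argv


def as_shell_line(argv):
    """Quote the argv list so it can be pasted into a shell."""
    return " ".join(shlex.quote(a) for a in argv)
```

With --args the program arguments are already set, so at the (gdb) prompt a plain run is enough.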

The recommended .gdbinit file

You can add the handle commands from above to your ~/.gdbinit to save some typing. You can also define macros there to print the contents of strings, even in cases where dbg_dump() does not work. The following is a recommended ~/.gdbinit file:

set history filename ~/.gdbhistory
set history save on

handle SIGPWR nostop noprint
handle SIGXCPU nostop noprint
handle SIG33 nostop noprint

tabset 4

# define "pu" command to display sal_Unicode *
def pu
  set $uni = $arg0
  set $len = $arg1
  set $i = 0
  printf "\""
  while (*$uni && $i++<$len && $i<255)
    if (*$uni < 0x80)
      printf "%c", *(char*)$uni++
    else
      printf "\\x%x", *(unsigned short*)$uni++
    end
  end
  printf "\"\n"
end

# define "pus" command to display rtl_uString
def pus
  set $ns = $arg0
  if ($ns.buffer)
    pu $ns.buffer $ns.length
  else
    print "Invalid/non-initialized rtl_uString."
  end
end

# define "pou" command to display rtl::OUString
def pou
  set $ns = $arg0
  if ($ns.pData)
    pus $ns.pData
  else
    print "Invalid/non-initialized OUString."
  end
end

# define "ptu" command to display tools (Uni)String
def ptu
  set $ns = $arg0
  if ($ns.mpData)
    pu $ns.mpData->maStr $ns.mpData->mnLen
  else
    print "Invalid/non-initialized tools String."
  end
end

With this, you can use pou the_OUString in gdb to print an OUString named the_OUString, and similarly pus the_rtl_uString for an rtl_uString. For a sal_Unicode *, use pu the_p_sal_Unicode the_length; the macro takes the maximum number of characters to print as its second argument. For the tools UniString (or just String, for historical reasons) you can use ptu the_String.
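To see what output to expect from pu, here is a small Python model of its loop. It is only a sketch that mirrors the macro's logic as written above (stop at a NUL unit, at the given length, or just short of 255 characters; print ASCII as-is and everything else as a \x escape). The helper utf16_units is a hypothetical convenience for producing test input.

```python
def utf16_units(text):
    """Split a Python string into 16-bit code units, like a sal_Unicode buffer."""
    raw = text.encode("utf-16-le")
    return [int.from_bytes(raw[i:i + 2], "little") for i in range(0, len(raw), 2)]


def pu(units, length, limit=255):
    """Render code units the way the gdb `pu` macro prints them."""
    out = []
    for i, unit in enumerate(units):
        # Same three stop conditions as the macro's while() test.
        if unit == 0 or i >= length or i + 1 >= limit:
            break
        if unit < 0x80:
            out.append(chr(unit))          # plain ASCII is printed directly
        else:
            out.append("\\x%x" % unit)     # anything else becomes a hex escape
    return '"' + "".join(out) + '"'
```

For example, pu(utf16_units("café"), 4) gives "caf\xe9", which is what the macro would show for the same buffer.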

Starting at the beginning

We start in 'main' with a sal wrapper that calls SVMain in vcl/source/app/svmain.cxx. It invokes Main on pSVData->mpApp, but pSVData is an inline local, so to debug this use the pImplSVData global variable instead, e.g.:

     p pImplSVData->maAppData

This 'Main' method is typically: desktop/source/app/app.cxx (Main).

Examining strings

As we have already seen, OO.o has its own set of string classes, none of which gdb understands. When debugging C++ code, use (gdb) print dbg_dump(sWhatEver) to print the contents of a UniString, ByteString, rtl::OUString or rtl::OString, regardless of the type. See Caolan's write-up for details.

The dbg_dump() functions may not be available in gdb if they were not linked in. In that case, just copy them into your current source file from '/sal/rtl/source/debug_print' and add the associated includes, #include <rtl/strbuf.hxx> and #include <rtl/ustring.hxx>. gdb should recognize them now.

Another way is to use the macros from the recommended .gdbinit file. Unfortunately, so far these work only for OUString, rtl_uString, sal_Unicode * and the tools UniString; there are no macros for the 8-bit string types.

Getting the build order right

The build dependencies of the modules are clearly crucial to getting a clean build. When you type 'build' in a module, build first examines prj/build.lst, e.g. neon/prj/build.lst:

       xh      neon  :  soltools external expat NULL

This specifies that 'soltools', 'external' and 'expat' have to be satisfactorily built and delivered before neon can be built. Occasionally these rules get broken, and people don't notice for a while.
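To check a dependency line by hand, the following sketch parses the one line shape shown above and derives a build order. It assumes only what that example shows (module name before the colon, prerequisites after it, NULL as terminator); real build.lst files contain other kinds of lines that this does not handle, and both function names are invented for illustration.

```python
def parse_build_lst_line(line):
    """Split a dependency line of prj/build.lst into (module, prerequisites)."""
    head, _, tail = line.partition(":")
    module = head.split()[-1]            # last token before the colon, e.g. "neon"
    deps = []
    for token in tail.split():
        if token == "NULL":              # NULL terminates the prerequisite list
            break
        deps.append(token)
    return module, deps


def build_order(deps_by_module):
    """Return modules ordered so that every prerequisite comes first."""
    order, done, visiting = [], set(), set()

    def visit(module):
        if module in done:
            return
        if module in visiting:
            raise ValueError(f"dependency cycle through {module}")
        visiting.add(module)
        for dep in deps_by_module.get(module, []):
            visit(dep)                   # build prerequisites before the module
        visiting.discard(module)
        done.add(module)
        order.append(module)

    for module in sorted(deps_by_module):
        visit(module)
    return order
```

A broken rule of the kind mentioned above shows up here either as a module missing from the order or as a reported cycle.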

It crashes, but only in gdb

What fun: you symlinked desktop/unxlngi4.pro/bin/soffice to soffice.bin in your install tree, didn't you? That works fine if you just run it, but it seems gdb resolves the symlink and passes a fully qualified path as argv[0]. This defeats the hunt for the binary in the path, so the program base path is set to /opt/OpenOffice/OOO_STABLE_1/desktop/unxlngi4.pro/bin and OO.o starts looking for its files (e.g. applicat.rdb) in there. Of course, when it fails to find any setup information, it silently crashes somewhere else, yards away from the original problem.
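A quick way to check whether you are in this situation is to compare the directory of the path you launch with the directory of the file it resolves to. The sketch below does just that; it is a simplified model of the base-path lookup described above, not OO.o's actual code, and launch_report is a made-up name.

```python
import os


def launch_report(path):
    """Compare the directory of `path` with the directory of its resolved target."""
    real = os.path.realpath(path)        # follows symlinks, as gdb appears to do
    return {
        "is_symlink": os.path.islink(path),
        "base_if_run_directly": os.path.dirname(os.path.abspath(path)),
        "base_under_gdb": os.path.dirname(real),
    }
```

If the two base directories differ, the office will look for its setup files in the wrong tree when started from gdb.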

It crashes, but doesn't crash

For various reasons OO.o installs its own signal handlers, which trap crashes, and life can get rather confusing; thus it's best for builders to apply something like this:

--- sal/osl/unx/signal.c
+++ sal/osl/unx/signal.c
@@ -188,6 +188,8 @@ static sal_Bool InitSignal()
             bSetILLHandler = sal_True;
        }
 
+       bSetSEGVHandler = bSetWINCHHandler = bSetILLHandler = bDoHardKill = sal_False;
+
        SignalListMutex = osl_createMutex();
 
        act.sa_handler = SignalHandlerFunction;

I can't find the code from the trace

Some methods are declared with a special linkage so that they can be used in callbacks; these typically have the prefix 'LinkStub', so search for the latter part of the identifier in a free-text search, e.g.

      IMPL_LINK( Window, ImplHandlePaintHdl, void*, EMPTYARG )

builds the 'LinkStubImplHandlePaintHdl' method.
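When working through a long trace, the search term can be derived mechanically. This is a small sketch under the assumption that a frame shows the name as Class::LinkStubName(args); the function name search_term is hypothetical.

```python
LINK_STUB_PREFIX = "LinkStub"


def search_term(symbol):
    """Turn a frame's function name into the identifier to search the source for."""
    name = symbol.split("(", 1)[0]       # drop the argument list first
    name = name.rsplit("::", 1)[-1]      # then drop any class qualifier
    if name.startswith(LINK_STUB_PREFIX):
        name = name[len(LINK_STUB_PREFIX):]   # the IMPL_LINK argument remains
    return name
```

The result is the second argument of the IMPL_LINK macro, which is what appears in the source file.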

How can I re-build just the files I see in the trace

Often when you run gdb on a build without debugging symbols you get an unhelpful trace, yet you can't afford the time/space to recompile all of OO.o with debugging symbols. Thus we have created a small perl helper, which hunts for and touches files containing the symbols from your trace. This subset can then be rebuilt with debugging enabled for a better trace next time around:

    gdb ./soffice.bin
    ...
    bt
#0  0x40b4e0a1 in kill () from /lib/libc.so.6
#1  0x409acfe6 in raise () from /lib/libpthread.so.0
#2  0x447bcdbd in SfxMedium::DownLoad(Link const&) () from ./libsfx641li.so
#3  0x447be151 in SfxMedium::SfxMedium(String const&, unsigned short, unsigned char, SfxFilter const*, SfxItemSet*) ()
   from ./libsfx641li.so
#4  0x448339d3 in getCppuType(com::sun::star::uno::Reference<com::sun::star::document::XImporter> const*) () from ./libsfx641li.so
...
    quit
    cd base/OOO_STABLE_1/sfx2
    ootouch SfxMedium
    build debug=true
    

Thus, all files referencing or implementing anything with SfxMedium will be touched, and hence rebuilt with debugging symbols.

ootouch is not available upstream: it is available through ooo-build.
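If you don't have ooo-build at hand, the idea is easy to approximate. The following is a rough stand-in, not the real ootouch: it walks a module directory and updates the modification time of every source file that mentions the symbol. The list of file extensions and the name touch_matching are assumptions.

```python
import os

# Assumed set of source file extensions worth touching.
SOURCE_SUFFIXES = (".cxx", ".hxx", ".c", ".h")


def touch_matching(root, symbol):
    """Touch every source file under `root` that mentions `symbol`."""
    touched = []
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            if not name.endswith(SOURCE_SUFFIXES):
                continue
            path = os.path.join(dirpath, name)
            with open(path, encoding="utf-8", errors="replace") as handle:
                if symbol not in handle.read():
                    continue
            os.utime(path, None)         # same effect as `touch path`
            touched.append(path)
    return sorted(touched)
```

Calling touch_matching(".", "SfxMedium") inside sfx2 and then running build debug=true would rebuild just those files.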

How can I re-build all the files in one source directory

If you want to recompile the code in just your current directory, you can use the killobj dmake target to remove the object files:

    dmake killobj
    dmake
    

It always crashes in sal_XErrorHdl

You are a victim of asynchronous X error reporting. Running export SAL_SYNCHRONIZE=1 makes all the X traffic synchronous, so the error is reported in the method that caused it. It also makes OO.o far slower and changes the timing.

It silently fails to load my Word file

Caolan suggests: put breakpoints in ww8par.cxx top and tail of SwWW8ImplReader::LoadDoc, and confirm that the document gets as far as the import filter.

A handy, human-readable place to put a breakpoint is SwWW8ImplReader::ReadPlainChars, where you can see chunks of text as they are read in. Alternatively, use SwWW8ImplReader::AppendTxtNode, which is hit as each paragraph is inserted.

How do I use the debug console ?

OO.o contains some hefty debugging infrastructure.

Enabling it is pretty easy - what you need is a so-called Non-Product Build.

By default, an OpenOffice.org build is a Product Build, i.e. ready for release after completion. If you specify the --enable-dbgutil switch during configure, then your environment will be prepared for a Non-Product Build - with lots of additional diagnostic tools.

Note that libraries from product and non-product builds are usually incompatible, so don't mix them in the same installation.

For available tools in non-product builds, have a look at the various DBG_foo macros in tools/debug.hxx, or, if you already are knowledgeable about this, let others participate by writing your knowledge down here.

To actually fire up the debug settings dialog, press <ctrl><alt><shift>-D.

Draw/Impress text edit debugging

When running a Non-Product Build, live edit mode in Draw/Impress text boxes has an extra debug hotkey: pressing <ctrl><alt>-F2 writes information about the currently edited text into a debug.log file (currently Windows-only).

Excel Interop debugging

This is fairly easy; define an environment variable XLSDUMPER pointing to the file sc/source/filter/excel/xldumper.dat (full path needed). Then run soffice.bin foo.xls and you should get a foo.xls.txt in the same directory with the debug data in it.

Note: this requires a debug build of the sc module. To easily get such a build, execute the following within your sc directory: build -- killobj ; build debug=true
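The setup can be scripted so the full path is never forgotten. This is a sketch only: it prepares the environment and works out where the dump should appear, based solely on the description above. The function name xls_dump_setup and its parameters are invented.

```python
import os


def xls_dump_setup(source_root, document, env=None):
    """Return (environment, expected dump path) for an XLSDUMPER run."""
    env = dict(os.environ if env is None else env)
    dat = os.path.join(source_root, "sc", "source", "filter", "excel", "xldumper.dat")
    env["XLSDUMPER"] = os.path.abspath(dat)   # a full path is needed
    return env, document + ".txt"             # foo.xls -> foo.xls.txt, same directory
```

The returned environment can be passed to subprocess.run(["./soffice.bin", document], env=env), after which the second return value is the file to look for.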

The trace shows a crash in 'poll'

OO.o is a fairly threaded program, so you're probably just looking at the wrong thread; there are not likely to be bugs in poll. Use thread apply all backtrace to get a backtrace of all threads. This will most likely fail; when it does, do thread 1 and then bt, as most crashers occur in the 'main' thread.
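When the all-threads backtrace does work, its output is long, and most threads are simply idling in poll. The sketch below filters a saved copy of that output down to the threads worth reading. It assumes each thread's section starts with a header line of the form "Thread N (...):" followed by "#N" frame lines; that format may differ between gdb versions, and the function names are hypothetical.

```python
import re

# Assumed shape of gdb's per-thread header line.
THREAD_HEADER = re.compile(r"^Thread (\d+) \(")


def split_threads(text):
    """Map thread number -> list of frame lines from a multi-thread backtrace."""
    threads, current = {}, None
    for line in text.splitlines():
        match = THREAD_HEADER.match(line)
        if match:
            current = int(match.group(1))
            threads[current] = []
        elif current is not None and line.startswith("#"):
            threads[current].append(line)
    return threads


def threads_outside(text, function="poll"):
    """Thread numbers whose innermost frame (#0) is not in `function`."""
    return sorted(
        number
        for number, frames in split_threads(text).items()
        if frames and f" {function} (" not in frames[0]
    )
```

The numbers returned can be fed to thread N followed by bt in the live session.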

What does this trace mean ?

There are several typical stack-traces that come up again and again, one would be:

#15 0x4164a501 in raise () from /lib/tls/libc.so.6
#16 0x4164bcd9 in abort () from /lib/tls/libc.so.6
#17 0x415fb5a5 in std::set_unexpected ()
   from /home/mnagashree/m72install/program/libstdc++.so.5
#18 0x415fb5e2 in std::terminate ()
   from /home/mnagashree/m72install/program/libstdc++.so.5
#19 0x415fb69c in __cxa_rethrow ()
    

This section of trace means (essentially) that an exception was thrown, but there was no one trying to catch it. Often this means a 'try {} catch()' clause is missing in one of the calling frames.

A great way to debug exceptions is to break where they are thrown or caught: use catch throw or catch catch in gdb.
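The pattern is regular enough to spot automatically when triaging many traces. This sketch extracts the function name from each "#N" frame line and reports whether abort was reached by way of std::terminate, as in the trace above. The regular expression is an assumption based on the frame lines shown on this page, and the function names are made up.

```python
import re

# "#16 0x4164bcd9 in abort () from ..." -> frame number, then function name.
FRAME = re.compile(r"^#(\d+)\s+(?:0x[0-9a-fA-F]+ in )?([^\s(]+)")


def frame_functions(trace):
    """Return the function name of each frame line, innermost first."""
    names = []
    for line in trace.splitlines():
        match = FRAME.match(line.strip())
        if match:                          # continuation lines ("from ...") are skipped
            names.append(match.group(2))
    return names


def looks_like_uncaught_exception(trace):
    """True when abort() appears below std::terminate() in the trace."""
    names = frame_functions(trace)
    return (
        "abort" in names
        and "std::terminate" in names
        and names.index("abort") < names.index("std::terminate")
    )
```

A trace that matches is a good candidate for a rerun with catch throw set.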

Useful places to put breakpoints

If you have compiled with debugging enabled (build debug=true), you may well hit one of those churning debug / assertion failures and want a pleasant, detailed stack trace. To get one, do break osl_assertFailedLine.

STLport and checking iterators

The STL is a powerful tool, but as we all know it also makes it easy, in the grand old C/C++ tradition, to shoot oneself in the foot. STL containers and algorithms are now pervasive in OO.o, so there is a need to validate the use of STL constructs to find hidden problems.

Fortunately the STLport library, the default STL implementation for OO.o, has a powerful debug mode, and it's easy to use. Since SRC680 m128 it is possible to use the environment variable USE_STLP_DEBUG to switch on the STLport debug mode; since SRC680 m150 it works on Windows, too.

The most useful part of the STLport debug mode is iterator checking. Doing the OO.o smoke test and a little additional random testing, we have already found a number of questionable STL constructs.

Only code paths which are exercised will be tested by the STLport debug mode, though. If STLport finds a questionable STL usage it will throw an assertion and terminate. It is usually quite easy to extract a precise stack trace.

Some notes:

  • STLport debug mode iterators are not pointers! We've cleaned up all occurrences of the lazy (and wrong) use of iterators as pointers in SRC680 m128/m150, but maybe something new has already crept in. This cleanup also helps with other STL implementations, like the one which comes with gcc-4.x.
  • A complete recompile is necessary; the debug mode renders all objects with STL constructs binary incompatible.
  • The STLport debug mode breaks the complexity assertions of the STL. Theoretically some operations should be much slower in debug mode than in product mode. In practice I didn't notice a real slowdown of OO.o.