Performance/WriterLoadSave

From Apache OpenOffice Wiki
Revision as of 14:52, 12 March 2009 by Os

I started investigating load/save issues in Writer using VTune. The first test document is the ODF 1.1 specification document (http://docs.oasis-open.org/office/v1.1/OS/OpenDocument-v1.1.odt).

  • Already done:

  • Already identified issues:

    • To decide which properties have to be saved, xmloff uses the interface methods css::beans::XPropertyState::getPropertyStates() and css::beans::XMultiPropertySet::getPropertyValues(). It could also use css::beans::XTolerantMultiPropertySet::getDirectPropertyValuesTolerant(), but that interface is not implemented for Writer's UNO objects.

    • Saving Writer's text content is done by iterating over the paragraphs and, within each paragraph, over the so-called text portions. Text portions are the parts of a paragraph that share a single attribute set, as well as text fields, redline portions, inline anchored frames etc. It might make sense to detect their properties at construction time and preset their css::beans::XTolerantMultiPropertySet interface.
      At the moment, creating a text portion adds a bookmark to remember its position. The impact on real documents has not yet been checked.

A test implementation of the XTolerantMultiPropertySet in Writer's text portion objects didn't result in increased save speed.

In the spreadsheet Media:odfsave.ods you can find a list of the top consumers from some selected libraries (sw, xo, svl, svt, sal3, sfx2) when saving the ODF specification document. I will concentrate on two issues here:

  • Conversion of hyperlinks takes a lot of time. To bring hyperlinks into the correct form and to make them relative to the target URL, the methods from svt's URIHelper are used. This is done for _all_ URLs regardless of the protocol. In the given document the URLs are mostly http while the document is stored to a file. One approach to solve this issue is to convert only if the protocols are the same. Another approach is to manage the hyperlinks within the document and keep them in the correct form all the time, so that they only have to be converted when the target URL changes (saveAs/storeToURL). On the given system this would save about 4 s of a total save time of 12 s (stopwatch estimate).

  • Another rather big part of the processing time is spent accessing the members of the implementations of css::beans::XPropertySet, XPropertyState and XPropertySetInfo. To find the requested element by its name, the methods of SfxItemPropertySet, SfxItemPropertySetInfo etc. iterate over an array of structs that each define a property (SfxItemPropertyMap). This can be seen in the numbers for SfxItemPropertyMap::getByName, rtl::OUString::equalsAsciiL and SfxItemPropertySetInfo::hasPropertyByName in the svl library.
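
The first approach from the hyperlink item above (convert only if the protocols are the same) could be sketched roughly as follows. This is a minimal, self-contained illustration using std::string; urlScheme() and worthMakingRelative() are hypothetical names and are not part of svt's URIHelper, and links without a scheme are simply left alone here.

```cpp
#include <cassert>
#include <cctype>
#include <string>

// Hypothetical helper (not part of svt's URIHelper): returns the lower-cased
// scheme of an absolute URL, or an empty string if the text has no scheme.
std::string urlScheme(const std::string& url)
{
    const std::string::size_type colon = url.find(':');
    if (colon == std::string::npos || colon == 0)
        return std::string();

    // RFC 3986: a scheme starts with a letter ...
    if (!std::isalpha(static_cast<unsigned char>(url[0])))
        return std::string();

    std::string scheme;
    for (std::string::size_type i = 0; i < colon; ++i)
    {
        const unsigned char c = static_cast<unsigned char>(url[i]);
        // ... followed by letters, digits, '+', '-' or '.'.
        if (!std::isalnum(c) && c != '+' && c != '-' && c != '.')
            return std::string();
        scheme += static_cast<char>(std::tolower(c));
    }
    return scheme;
}

// Hypothetical pre-check: a link can only become relative to the target
// if both use the same scheme, so everything else can skip the conversion.
bool worthMakingRelative(const std::string& link, const std::string& target)
{
    const std::string linkScheme = urlScheme(link);
    return !linkScheme.empty() && linkScheme == urlScheme(target);
}

// Small usage check modelled on the case in the text:
// http links inside a document that is stored to a file URL.
void checkHyperlinkFilter()
{
    assert(!worthMakingRelative(
        "http://docs.oasis-open.org/office/v1.1/OS/OpenDocument-v1.1.odt",
        "file:///tmp/out.odt"));
    assert(worthMakingRelative("file:///tmp/images/a.png", "file:///tmp/out.odt"));
}
```

An equal scheme is only a necessary condition; links that pass this check would still go through the existing conversion. The point of the sketch is that the http links in the test document would be rejected after a short prefix comparison.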

The replacement for the SfxItemPropertyMap, which uses a std::hash_map, is ready. After changing a lot of code in the applications as well as in svtools, sfx2, svx and others, I started to compare the load/save times.
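
To make the two lookup strategies concrete, here is a small sketch that contrasts a linear scan over a null-terminated array of structs with a hash-based index built from the same array. It is an illustration only: PropertyEntry, findLinear(), buildIndex() and findHashed() are hypothetical names, the struct is a simplified stand-in for a real SfxItemPropertyMap entry, and the standard std::unordered_map is used in place of the non-standard std::hash_map mentioned above.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

// Hypothetical, simplified stand-in for one property map entry.
struct PropertyEntry
{
    const char* name;     // property name; nullptr terminates the array
    int         whichId;  // illustrative payload only
};

// Illustrative sample table (names chosen for the example).
const PropertyEntry kSampleProperties[] = {
    { "CharColor",  1 },
    { "CharHeight", 2 },
    { "ParaAdjust", 3 },
    { nullptr,      0 }
};

// Linear lookup: one string comparison per entry until a match is found.
const PropertyEntry* findLinear(const PropertyEntry* map, const std::string& name)
{
    for (; map->name != nullptr; ++map)
        if (name == map->name)
            return map;
    return nullptr;
}

using PropertyIndex = std::unordered_map<std::string, const PropertyEntry*>;

// Builds the hash index once; the entries themselves stay in the array.
PropertyIndex buildIndex(const PropertyEntry* map)
{
    PropertyIndex index;
    for (; map->name != nullptr; ++map)
    {
        const bool inserted = index.emplace(map->name, map).second;
        assert(inserted && "duplicate property name");
        (void)inserted; // avoids an unused-variable warning when NDEBUG is set
    }
    return index;
}

// Hashed lookup: hashes the whole name, then compares within one bucket.
const PropertyEntry* findHashed(const PropertyIndex& index, const std::string& name)
{
    const auto it = index.find(name);
    return it == index.end() ? nullptr : it->second;
}
```

Note that the hashed lookup has to hash the complete name on every call before it compares anything, whereas a linear scan over a short table can reject most entries after the first few characters.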

The result is not as expected. In [Media:Odfsave_withhash.ods] you can see that SfxItemPropertyMap::getByName() takes longer than before. The new function takes about 5.3 s in total, which is about 1.7 s more than its predecessor SfxItemPropertyMap::GetByName() required. Most of the time is spent in the _M_Find<::rtl::OUString> method of the hash_map implementation.

One probable reason is that the sorted access to properties in the old implementation eliminated a lot of string comparisons.
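
One possible reading of this remark is sketched below. It is an assumption for illustration, not a description of the actual old code: if the property table is sorted and callers ask for names in the same order, a lookup that resumes at the previous hit needs only a few comparisons per request. ResumingLookup and kNotFound are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

const std::size_t kNotFound = static_cast<std::size_t>(-1);

// Hypothetical lookup over a sorted name table that remembers the last hit.
class ResumingLookup
{
public:
    explicit ResumingLookup(std::vector<std::string> sortedNames)
        : names_(std::move(sortedNames)), cursor_(0), comparisons_(0)
    {
        assert(std::is_sorted(names_.begin(), names_.end()));
    }

    // Scans forward from the previous hit and wraps around once, so
    // requests that arrive in table order are found almost immediately.
    std::size_t find(const std::string& name)
    {
        const std::size_t n = names_.size();
        for (std::size_t step = 0; step < n; ++step)
        {
            const std::size_t i = (cursor_ + step) % n;
            ++comparisons_;
            if (names_[i] == name)
            {
                cursor_ = i;
                return i;
            }
        }
        return kNotFound;
    }

    // Number of string comparisons performed so far.
    std::size_t comparisons() const { return comparisons_; }

private:
    std::vector<std::string> names_;
    std::size_t cursor_;
    std::size_t comparisons_;
};
```

With four names requested in table order, this sketch performs 7 string comparisons, where a scan that restarts from the top each time would need 10; the gap grows with the table size. A hash map cannot exploit the request order in this way, because every lookup pays for hashing the full name.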
