Calc/Performance/Specific Bottlenecks

From Apache OpenOffice Wiki
< Calc‎ | Performance
Revision as of 22:37, 11 February 2009 by ErAck (Talk | contribs)

Jump to: navigation, search

Specific bottlenecks to be worked on, identified using tools such as valgrind --tool=callgrind.


The Zaske case

Content relocated to Calc/Performance/Zaske, section preserved for external references linking here.

The Ou case

Loading a large plain data file takes very long.

References:

Findings:

  • source/filter/xml/xmlsubti.cxx
    • 38% of time spent in ScMyTables::NewColumn() because of replicated use of aTableVec[nTableCount - 1] (vector::operator[])
      Note: percentage may be off due to compilation without optimization to obtain exact line numbers that may result in STLport's vector methods being differently compiled.
      • proposed fix: should obtain the pointer once instead.
    • Similar for other places where aTableVec[xxx] is used.
  • TODO: Check all ScMyTables::.*() and ScMyTableData::.*()
    • Especially for 63342857 calls to AddColumn() and NewColumn() that result in 1168654944 calls to operator[] ...
    • 63081776 calls to AddColumn() originate from ScXMLTableRowCellContext::EndElement()
    • Those are highly suspicious and seem to indicate that too many temporary elements are created for empty columns/cells (needs verification).


Sorting values within functions

Content relocated to Calc/Performance/sorting_values_within_functions, section preserved for external references linking here.

Querying data within functions

An internal customer's document (sorry, can't publish) doing lookup queries that don't fit into the current caching strategy.

Findings:

  • 8% in 51613353 calls to com::sun::star::i18n::casefolding::getNextChar() via
    • 39696595 calls to utl::TransliterationWrapper::isEqual() via
      • ScTable::ValidQuery() via
        • 8888 calls to ScQueryCellIterator::GetThis() via
          • lcl_LookupQuery()
  • 5% in ScTableValidQuery() most in String() and ~String() of aCellStr
  • 200873636 calls to com::sun::star::i18n::casefolding::getNextChar() via
    • 33173401 calls to com::sun::star::i18n::Transliteration_caseignore::compare()
  • 5% in com::sun::star::i18n::oneToOneMappingWithFlag::find()
    • Replicated mpIndex[high] access, might be better using temporary pointer.
  • 5% in com::sun::star::i18n::casefolding::getValue()
  • 58% overall in ScTable::ValidQuery() and below
    • TODO: Cache results of ValidQuery()? Similar to ScLookupCache?
  • 11% overall in 27341713 calls to ScBroadcastAreaSlot::StartListeningArea() and below, of which 10% are in ::std::set::insert() and below.
    • TODO: refactor implementation of broadcast slots.


OOX import issue 96758

A document in xlsx format found somewhere on the internet (issue 96758).

Findings:

  • Takes more than 20 minutes to load in a debug session
  • about 95% of total load time in 1160 calls to ::oox::xls::WorksheetData::convertRowFormat() (more than 1 second per call)
    • ~100% in ::oox::xls::StylesBuffer::writeCellXfToPropertySet()
      • ~100% in ::oox::xls::Xf::writeToPropertySet()
        • root cause: multiple XPropertySet accesses while writing font properties

2008-12-10: First step. Reduce run time of ::oox::xls::Xf::writeToPropertySet().

  • Consolidated property set usage to one API call per XF (cell format object). Load time reduced from >20 minutes to 3:45 minutes. Woohoo.
    • Still spends 68% of total load time (2:33 minutes) in 1160 calls to ::oox::xls::WorksheetData::convertRowFormat() (132 ms per call)

2008-12-12: Second step. Optimize overall property set usage.

  • Changed interfaces of ::oox::PropertyMap and ::oox::PropertySet from property name (string) to property identifiers (integer). Identifiers will be generated on compile time from a text file with all used property names. A process singleton (created on demand) will contain a big vector of property name strings. Saves a few seconds of the total load time.

2008-12-18: Third step. Reduce call count of ::oox::xls::WorksheetData::convertRowFormat() by caching row formats and applying them at row ranges if formatting is equal across the rows.

  • Turns out that the 1160 formatted rows could be merged into 13 ranges. This means that the 1160 API calls have been reduced to 13! Load time reduced from 3:40 minutes to 1:07 minutes. Needed time to format the 1160 rows reduced from 2:33 minutes to 2 seconds!
Personal tools