Calc/Performance/Specific Bottlenecks

From Apache OpenOffice Wiki
< Calc‎ | Performance
Revision as of 18:54, 16 December 2008 by Dr (Talk | contribs)

Jump to: navigation, search

Specific bottlenecks to be worked on, identified using tools such as valgrind --tool=callgrind.


The Zaske case

Comparison with Excel 2003/2007 that need 1.2s where Calc needs 24s after changing a cell's value.

References:

Findings: lots of formulas directly or indirectly referring the input cell, with many listening to identical ranges.

Fix: Introduce the now existing bulk broadcaster that was already used for mass changes also for single cell changes to prevent repetitive broadcasts of identical ranges.

Fixed as i95967 in CWS calcperf03. Now Calc does it in 1.2s too..


The Ou case

Loading a large plain data file takes very long.

References:

Findings:

  • source/filter/xml/xmlsubti.cxx
    • 38% of time spent in ScMyTables::NewColumn() because of replicated use of aTableVec[nTableCount - 1] (vector::operator[])
      Note: percentage may be off due to compilation without optimization to obtain exact line numbers that may result in STLport's vector methods being differently compiled.
      • proposed fix: should obtain the pointer once instead.
    • Similar for other places where aTableVec[xxx] is used.
  • TODO: Check all ScMyTables::.*() and ScMyTableData::.*()
    • Especially for 63342857 calls to AddColumn() and NewColumn() that result in 1168654944 calls to operator[] ...
    • 63081776 calls to AddColumn() originate from ScXMLTableRowCellContext::EndElement()
    • Those are highly suspicious and seem to indicate that too many temporary elements are created for empty columns/cells (needs verification).


Sorting values within functions

i89976 has a document attached:

test-huge_calculations-Median-detailed.ods

NOTE: The assumptions made by the submitter as documented in the test case are plain wrong.

Findings when testing with filling C4:C3003

  • 52% overall in interpr3.cxx lcl_QuickSort() and below, of which
    • 32% in vector<double>::operator[] and below,
      • 25% originating from the loops
       while (ni <= nHi && rSortArray[ni]  < rSortArray[nLo]) ni++;
       while (nj >= nLo && rSortArray[nLo] < rSortArray[nj])  nj--;

where rSortArray[nLo] should be a temporary variable instead.

Or all that be realized using simple double[].

  • 21% overall in ScValueIterator::GetThis() and below.


Querying data within functions

An internal customer's document (sorry, can't publish) doing lookup queries that don't fit into the current caching strategy.

Findings:

  • 8% in 51613353 calls to com::sun::star::i18n::casefolding::getNextChar() via
    • 39696595 calls to utl::TransliterationWrapper::isEqual() via
      • ScTable::ValidQuery() via
        • 8888 calls to ScQueryCellIterator::GetThis() via
          • lcl_LookupQuery()
  • 5% in ScTableValidQuery() most in String() and ~String() of aCellStr
  • 200873636 calls to com::sun::star::i18n::casefolding::getNextChar() via
    • 33173401 calls to com::sun::star::i18n::Transliteration_caseignore::compare()
  • 5% in com::sun::star::i18n::oneToOneMappingWithFlag::find()
    • Replicated mpIndex[high] access, might be better using temporary pointer.
  • 5% in com::sun::star::i18n::casefolding::getValue()
  • 58% overall in ScTable::ValidQuery() and below
    • TODO: Cache results of ValidQuery()? Similar to ScLookupCache?
  • 11% overall in 27341713 calls to ScBroadcastAreaSlot::StartListeningArea() and below, of which 10% are in ::std::set::insert() and below.
    • TODO: refactor implementation of broadcast slots.

OOX import issue 96758

A document in xlsx format found somewhere on the internet (issue 96758).

Findings:

  • Takes more than 20 minutes to load in a debug session
  • about 95% of total load time in ::oox::xls::WorksheetData::convertRowFormat()
    • ~100% in ::oox::xls::StylesBuffer::writeCellXfToPropertySet()
      • ~100% in ::oox::xls::Xf::writeToPropertySet()
        • multiple XPropertySet accesses while writing font properties

2008-12-10: First step. Consolidate property set usage to one API call per XF (cell format object). Load time reduced from 20 minutes to 4 minutes. Woohoo.

2008-12-12: Second step. Change interface of ::oox::PropertyMap and ::oox::PropertySet from property name strings to integer property identifiers. Identifiers will be generated on compile time from a text file with all used property names. A process singleton (created on demand) will contain a big vector of property name strings. Saves a few seconds of the 4 minutes.

Personal tools