Calc/Performance/Specific Bottlenecks
Specific bottlenecks to be worked on, identified using tools such as valgrind --tool=callgrind.
The Zaske case
Content relocated to Calc/Performance/Zaske, section preserved for external references linking here.
The Ou case
Loading a large plain data file takes very long.
Findings:
- source/filter/xml/xmlsubti.cxx
- 38% of time spent in ScMyTables::NewColumn() because of repeated use of aTableVec[nTableCount - 1] (vector::operator[])
Note: the percentage may be off because the code was compiled without optimization to obtain exact line numbers, which may result in STLport's vector methods being compiled differently.
- Proposed fix: obtain the pointer once instead.
- Similar for other places where aTableVec[xxx] is used.
- TODO: Check all ScMyTables::.*() and ScMyTableData::.*()
- Especially for 63342857 calls to AddColumn() and NewColumn() that result in 1168654944 calls to operator[] ...
- 63081776 calls to AddColumn() originate from ScXMLTableRowCellContext::EndElement()
- Those are highly suspicious and seem to indicate that too many temporary elements are created for empty columns/cells (needs verification).
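The proposed fix of obtaining the element once can be sketched as follows. This is only an illustration: TableData, newColumnIndexed() and newColumnCached() are hypothetical stand-ins, not the actual ScMyTableData or ScMyTables::NewColumn() code, and the fields are assumptions.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for ScMyTableData; only the fields needed here.
struct TableData {
    int column = 0;
    int columnCount = 0;
};

// Before: every access re-evaluates tables[count - 1] (vector::operator[]).
inline void newColumnIndexed(std::vector<TableData>& tables, std::size_t count) {
    assert(count > 0 && count <= tables.size());
    tables[count - 1].column += 1;
    if (tables[count - 1].column > tables[count - 1].columnCount)
        tables[count - 1].columnCount = tables[count - 1].column;
}

// After: look the element up once and reuse the reference.
inline void newColumnCached(std::vector<TableData>& tables, std::size_t count) {
    assert(count > 0 && count <= tables.size());
    TableData& current = tables[count - 1];
    current.column += 1;
    if (current.column > current.columnCount)
        current.columnCount = current.column;
}
```

Both functions have the same effect. The second one performs a single operator[] call, which matters most in unoptimized builds where the call is not inlined.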
Sorting values within functions
Content relocated to Calc/Performance/sorting_values_within_functions, section preserved for external references linking here.
Querying data within functions
An internal customer's document (sorry, can't publish) performs lookup queries that don't fit into the current caching strategy.
Findings:
- 8% in 51613353 calls to com::sun::star::i18n::casefolding::getNextChar() via
- 39696595 calls to utl::TransliterationWrapper::isEqual() via
- ScTable::ValidQuery() via
- 8888 calls to ScQueryCellIterator::GetThis() via
- lcl_LookupQuery()
- 5% in ScTable::ValidQuery(), mostly in String() and ~String() of aCellStr
- 200873636 calls to com::sun::star::i18n::casefolding::getNextChar() via
- 33173401 calls to com::sun::star::i18n::Transliteration_caseignore::compare()
- 5% in com::sun::star::i18n::oneToOneMappingWithFlag::find()
- Repeated mpIndex[high] access; using a temporary pointer might be better.
- 5% in com::sun::star::i18n::casefolding::getValue()
- 58% overall in ScTable::ValidQuery() and below
- TODO: Cache results of ValidQuery()? Similar to ScLookupCache?
- 11% overall in 27341713 calls to ScBroadcastAreaSlot::StartListeningArea() and below, of which 10% are in ::std::set::insert() and below.
- TODO: refactor implementation of broadcast slots.
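One possible shape for the ValidQuery() caching TODO is a small memoizing wrapper, sketched below. QueryResultCache and its Evaluator callback are hypothetical names for illustration; this is not the ScLookupCache interface, and it assumes one cache instance per query and that the owner calls invalidate() whenever a queried cell changes.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <map>
#include <utility>

// Hypothetical memoizing wrapper around an expensive per-row query check.
class QueryResultCache {
public:
    using Evaluator = std::function<bool(std::int32_t row)>;

    explicit QueryResultCache(Evaluator evaluate) : evaluate_(std::move(evaluate)) {
        // An empty evaluator would throw on first use, so reject it early.
        assert(static_cast<bool>(evaluate_));
    }

    // Returns the cached verdict for a row, evaluating it only on first use.
    bool isValid(std::int32_t row) {
        auto it = results_.find(row);
        if (it != results_.end())
            return it->second;
        bool valid = evaluate_(row);
        results_.emplace(row, valid);
        return valid;
    }

    // Must be called when any cell the query reads has changed.
    void invalidate() { results_.clear(); }

private:
    Evaluator evaluate_;
    std::map<std::int32_t, bool> results_;
};
```

Whether this pays off depends on how often the same row is checked against the same query between changes, which the profile above does not show.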
OOX import issue 96758
A document in xlsx format found somewhere on the internet (issue 96758).
Findings:
- Takes more than 20 minutes to load in a debug session
- about 95% of total load time in 1160 calls to ::oox::xls::WorksheetData::convertRowFormat() (more than 1 second per call)
- ~100% in ::oox::xls::StylesBuffer::writeCellXfToPropertySet()
- ~100% in ::oox::xls::Xf::writeToPropertySet()
- root cause: multiple XPropertySet accesses while writing font properties
2008-12-10: First step. Reduce run time of ::oox::xls::Xf::writeToPropertySet().
- Consolidated property set usage to one API call per XF (cell format object). Load time reduced from >20 minutes to 3:45 minutes. Woohoo.
- Still spends 68% of total load time (2:33 minutes) in 1160 calls to ::oox::xls::WorksheetData::convertRowFormat() (132 ms per call)
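The idea behind the first step, collecting all properties of one cell format and writing them with a single call, can be sketched like this. FakePropertySet and PropertyBatch are hypothetical classes that only count calls; they do not reproduce the real ::oox::PropertySet or UNO interfaces.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a property set; each method models one API call.
class FakePropertySet {
public:
    void setProperty(const std::string& name, const std::string& value) {
        ++apiCalls;
        values[name] = value;
    }
    void setProperties(const std::vector<std::pair<std::string, std::string>>& batch) {
        ++apiCalls;
        for (const auto& entry : batch)
            values[entry.first] = entry.second;
    }
    int apiCalls = 0;
    std::map<std::string, std::string> values;
};

// Collects the properties of one cell format and writes them in a single call.
class PropertyBatch {
public:
    void add(std::string name, std::string value) {
        assert(!name.empty());
        entries_.emplace_back(std::move(name), std::move(value));
    }
    void writeTo(FakePropertySet& target) const {
        if (!entries_.empty())
            target.setProperties(entries_);
    }

private:
    std::vector<std::pair<std::string, std::string>> entries_;
};
```

With three font properties, writing them one by one costs three calls, while the batch costs one and leaves the same values in the target.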
2008-12-12: Second step. Optimize overall property set usage.
- Changed interfaces of ::oox::PropertyMap and ::oox::PropertySet from property names (strings) to property identifiers (integers). Identifiers are generated at compile time from a text file with all used property names. A process singleton (created on demand) contains a big vector of property name strings. Saves a few seconds of the total load time.
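A minimal sketch of the identifier scheme, assuming a hand-written enum in place of the generated one. The names PropertyId, PROP_*, propertyNames() and propertyName() are illustrative only and not taken from the oox sources.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical identifiers; in the described change they are generated
// from a text file at compile time.
enum PropertyId : std::size_t {
    PROP_CharFontName,
    PROP_CharHeight,
    PROP_CharWeight,
    PROP_COUNT
};

// Process-wide name table, created on first use (function-local static).
inline const std::vector<std::string>& propertyNames() {
    static const std::vector<std::string> names = {
        "CharFontName",
        "CharHeight",
        "CharWeight",
    };
    assert(names.size() == PROP_COUNT);
    return names;
}

// Resolves an identifier to its name only when a call needs the string.
inline const std::string& propertyName(PropertyId id) {
    assert(id < PROP_COUNT);
    return propertyNames()[id];
}
```

Maps keyed by such identifiers compare integers instead of strings, and the name strings are constructed once per process rather than per property access.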
2008-12-18: Third step. Reduce call count of ::oox::xls::WorksheetData::convertRowFormat() by caching row formats and applying them at row ranges if formatting is equal across the rows.
- Turns out that the 1160 formatted rows could be merged into 13 ranges. This means that the 1160 API calls have been reduced to 13! Load time reduced from 3:40 minutes to 1:07 minutes. The time needed to format the 1160 rows was reduced from 2:33 minutes to 2 seconds!
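The range merging of the third step can be illustrated with the sketch below. RowFormat, RowRange and mergeRowFormats() are hypothetical, and the sketch assumes that a row's formatting is fully described by one format index and that rows arrive in increasing order.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical types: a row index paired with the index of its cell format (XF).
struct RowFormat {
    std::int32_t row;
    std::int32_t xfId;
};

struct RowRange {
    std::int32_t firstRow;
    std::int32_t lastRow;
    std::int32_t xfId;
};

// Merges consecutive rows that share the same format into ranges.
// Expects the input sorted by strictly increasing row index.
inline std::vector<RowRange> mergeRowFormats(const std::vector<RowFormat>& rows) {
    std::vector<RowRange> ranges;
    for (const RowFormat& entry : rows) {
        assert(ranges.empty() || entry.row > ranges.back().lastRow);
        if (!ranges.empty() && ranges.back().xfId == entry.xfId &&
            ranges.back().lastRow + 1 == entry.row) {
            ranges.back().lastRow = entry.row;  // extend the current range
        } else {
            ranges.push_back({entry.row, entry.row, entry.xfId});
        }
    }
    return ranges;
}
```

Each resulting range then needs only one formatting call, which is how 1160 per-row calls can collapse into a handful of per-range calls.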