Performance/WriterLoadSave
Revision as of 16:06, 6 April 2009
Contents
- 1 Investigation and Profiling of Writer Load/Save Performance
- 2 Optimizations
- 2.1 Implemented Optimizations
- 2.2 Identified Hotspots
- 2.2.1 Conversion of Hyperlinks
- 2.2.2 String Indexed Access of PropertySets
- 2.2.3 Using XMultiPropertySet where XTolerantMultiPropertySet might suffice and be more performant
- 2.2.4 Font Fallback
- 2.2.5 Compressed files do not need to be compressed again in Storage
- 2.2.6 Iteration over Frame Collections
- 3 Test Documents
Investigation and Profiling of Writer Load/Save Performance
We started a systematic profiling of the load/save performance on a current milestone (DEV300_m45). I am using Intel vTune on Windows and cachegrind (with cache simulation) on Linux. On each platform, each of the four documents (see Test Documents below) is profiled for one load and one save procedure. Each measurement is done twice. In total, this results in:
2 (platforms/profilers) * 4 (documents) * 2 (load/save) * 2 (measurements) = 32 measurements in total
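The count above can be checked by enumerating the measurement matrix. This is only an illustrative sketch: the labels are taken from this page, while the variable names are made up for the example.

```python
from itertools import product

# Labels follow the text above; the variable names are illustrative only.
platforms = ["Windows/vTune", "Linux/cachegrind"]
documents = ["MailMerge", "Manual", "ScienceThesis", "Odf Spec"]
operations = ["load", "save"]
runs = [1, 2]

# One tuple per planned measurement: (platform, document, operation, run).
measurements = list(product(platforms, documents, operations, runs))
print(len(measurements))  # 2 * 4 * 2 * 2 combinations
```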
The analysis of the data should:
- show whether the profiler output is stable and reproducible
- show whether there are any differences between platforms and profilers (and for cachegrind: does cache simulation result in meaningful additional accuracy?)
- identify the hotspots of the current implementation
- be a basis for evaluating the progress made by optimizations
Bird's-Eye View Callgrind (Contributions To StoreToUrl DEV300_m45)
Document | Lib | Method | Instructions fetched | Cycle Cost Est. | Instructions fetched (% of StoreAsUrl) | Cycle Cost Est. (% of StoreAsUrl) |
ScienceThesis 1 | libsfx | SfxBaseModel::StoreAsUrl | 11592161864 | 12700226484 | 100.00% | 100.00% |
libsfx | SfxObjectShell::SaveAsOwnFormat | 4567386009 | 5193895809 | 39.40% | 40.90% | |
libsfx | SfxObjectShell::GenerateAndStoreThumbnail | 2336567459 | 2525183559 | 20.16% | 19.88% | |
libsfx | SfxMedium::Commit | 4535296593 | 4794222283 | 39.12% | 37.75% | |
other | 152911803 | 186924833 | 1.32% | 1.47% | ||
ScienceThesis 2 | libsfx | SfxBaseModel::StoreAsUrl | 11144535020 | 12250598830 | 100.00% | 100.00% |
libsfx | SfxObjectShell::SaveAsOwnFormat | 4575357778 | 5189916868 | 41.05% | 42.36% | |
libsfx | SfxObjectShell::GenerateAndStoreThumbnail | 1925216412 | 2116790032 | 17.27% | 17.28% | |
libsfx | SfxMedium::Commit | 4535327879 | 4792981289 | 40.70% | 39.12% | |
other | 108632951 | 150910641 | 0.97% | 1.23% | ||
Manual 1 | libsfx | SfxBaseModel::StoreAsUrl | 2019415631 | 2238108511 | 100.00% | 100.00% |
libsfx | SfxObjectShell::SaveAsOwnFormat | 1239343887 | 1399947937 | 61.37% | 62.55% | |
libsfx | SfxObjectShell::GenerateAndStoreThumbnail | 607055773 | 642691573 | 30.06% | 28.72% | |
libsfx | SfxMedium::Commit | 140565338 | 157789035 | 6.96% | 7.05% | |
other | 32450633 | 37679966 | 1.61% | 1.68% | ||
Manual 2 | libsfx | SfxBaseModel::StoreAsUrl | 2043487795 | 2268710495 | 100.00% | 100.00% |
libsfx | SfxObjectShell::SaveAsOwnFormat | 1262263271 | 1428106081 | 61.77% | 62.95% | |
libsfx | SfxObjectShell::GenerateAndStoreThumbnail | 608153207 | 644497207 | 29.76% | 28.41% | |
libsfx | SfxMedium::Commit | 141902391 | 154627511 | 6.94% | 6.82% | |
other | 31168926 | 41479696 | 1.53% | 1.83% | ||
Spec 1 | libsfx | SfxBaseModel::StoreAsUrl | 17995716426 | 20443306916 | 100.00% | 100.00% |
libsfx | SfxObjectShell::SaveAsOwnFormat | 16713230798 | 19057592468 | 92.87% | 93.22% | |
libsfx | SfxObjectShell::GenerateAndStoreThumbnail | 619569536 | 652497286 | 3.44% | 3.19% | |
libsfx | SfxMedium::Commit | 605712688 | 663294858 | 3.37% | 3.24% | |
other | 121121223 | 133131403 | 0.67% | 0.65% | ||
Spec 2 | libsfx | SfxBaseModel::StoreAsUrl | 18062701503 | 20537474953 | 100.00% | 100.00% |
libsfx | SfxObjectShell::SaveAsOwnFormat | 16681477861 | 19055676231 | 92.35% | 92.78% | |
libsfx | SfxObjectShell::GenerateAndStoreThumbnail | 654539677 | 685122457 | 3.62% | 3.34% | |
libsfx | SfxMedium::Commit | 605562742 | 663544862 | 3.35% | 3.23% | |
other | 121121223 | 133131403 | 0.67% | 0.65% | ||
MailMerge 1 | libsfx | SfxBaseModel::StoreAsUrl | 40777455765 | 53626856665 | 100.00% | 100.00% |
libsfx | SfxObjectShell::SaveAsOwnFormat | 23118516732 | 34395921752 | 56.69% | 64.14% | |
libsfx | SfxObjectShell::GenerateAndStoreThumbnail | 17475120469 | 19021058129 | 42.85% | 35.47% | |
libsfx | SfxMedium::Commit | 161846924 | 179849124 | 0.40% | 0.34% | |
other | 21971640 | 30027660 | 0.05% | 0.06% | ||
MailMerge 2 | libsfx | SfxBaseModel::StoreAsUrl | 34409955066 | 45317878876 | 100.00% | 100.00% |
libsfx | SfxObjectShell::SaveAsOwnFormat | 22931786501 | 33059699851 | 66.64% | 72.95% | |
libsfx | SfxObjectShell::GenerateAndStoreThumbnail | 11294919676 | 12054180246 | 32.82% | 26.60% | |
libsfx | SfxMedium::Commit | 161097067 | 173977107 | 0.47% | 0.38% | |
other | 22151822 | 30021672 | 0.06% | 0.07% | ||
std. deviation | libsfx | SfxBaseModel::StoreAsUrl | 9.31% | 9.21% | ||
libsfx | SfxObjectShell::SaveAsOwnFormat | 1.17% | 2.53% | |||
libsfx | SfxObjectShell::GenerateAndStoreThumbnail | 23.04% | 23.30% | |||
libsfx | SfxMedium::Commit | 0.61% | 2.21% | |||
other | 16.88% | 12.56% | ||||
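As a worked example of how the percentage columns relate to the absolute counts, the following sketch recomputes the instruction shares for the "ScienceThesis 1" run from the numbers in the table. The helper name `share` is made up for this illustration, and this is not necessarily how the original table was produced.

```python
def share(part: int, total: int) -> float:
    """Return part as a percentage of total, rounded to two decimals."""
    return round(100.0 * part / total, 2)

# Instructions fetched for "ScienceThesis 1", copied from the table above.
total_instr = 11592161864  # SfxBaseModel::StoreAsUrl
children = {
    "SfxObjectShell::SaveAsOwnFormat": 4567386009,
    "SfxObjectShell::GenerateAndStoreThumbnail": 2336567459,
    "SfxMedium::Commit": 4535296593,
    "other": 152911803,
}

for method, instr in children.items():
    print(f"{method}: {share(instr, total_instr):.2f}%")
```

The four child counts add up exactly to the StoreAsUrl total, so the shares in each block of the table should sum to roughly 100%.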
Callgrind Save XML-Generation (Contributions To SaveAsOwnFormat DEV300_m45)
The following profiling files have been generated with:
valgrind --tool=callgrind "--toggle-collect=*SaveAsOwnFormat*" soffice.bin
Rerunning the save procedure shows the results to be highly reproducible (~1% deviation for SaveAsOwnFormat as a whole).
- media:Callgrind_DEV300_m45_Save1.tgz MailMerge, Manual
- media:Callgrind_DEV300_m45_Save2.tgz ScienceThesis
- media:Callgrind_DEV300_m45_Save3.tgz Odf Spec
To analyse these files open them with kcachegrind or callgrind_annotate.
Callgrind Load XML-Parsing (Contributions To LoadOwnFormat DEV300_m45)
The following profiling files have been generated with:
valgrind --tool=callgrind "--toggle-collect=*LoadOwnFormat*" --callgrind-out-file=OdfLoad1.cg soffice.bin
- media:Callgrind_DEV300_m45_Load1.tgz MailMerge, ScienceThesis
- media:Callgrind_DEV300_m45_Load2.tgz Manual, Odf Spec
To analyse these files open them with kcachegrind or callgrind_annotate.
Benchmarking with vTune
In the spreadsheet Media:odfsave.ods you can find a list of the top consumers from some selected libraries (sw, xo, svl, svt, sal3, sfx2) when saving the ODF specification document.
Optimizations
Implemented Optimizations
Save time index entries
Issue 57008 ( http://qa.openoffice.org/issues/show_bug.cgi?id=57008 ) relates to the time spent on index entries when saving. The optimization saves an estimated 10% of the save time.
Identified Hotspots
Conversion of Hyperlinks
- Conversion of hyperlinks takes a lot of time. To bring hyperlinks into the correct form and to make them relative to the target URL, the methods from svt's URIHelper are used. This is done for all URLs, independent of the protocol. In the given document, the URLs are mostly http while the document is stored to a file. One approach to solve this issue is to convert only if the protocols are the same. Another approach is to manage the hyperlinks within the document and keep them in the correct form all the time, so that they only have to be converted when the target URL changes (saveAs/storeToURL). On the given system this would save about 4 s of a total save time of 12 s (stopwatch estimate).
Even worse: issue 50983
- Saving a file with a lot of fragment URLs ("#bookmarkname") to a network share takes a lot of time. In comparison: DEV300 m41 takes about 1:55 min, while os128 takes only 28 s to save the document to a network share.
To make sure the file URLs are correctly normalized the dialog code to insert all kinds of links has to call the normalization. This applies to Insert/Hyperlink, Insert/Picture from File and others.
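The first approach (convert only if the protocols are the same) can be sketched as follows. This is a minimal Python illustration using the standard library, not the URIHelper code; the function name `maybe_make_relative` and the sample URLs are assumptions made for the example.

```python
import posixpath
from urllib.parse import urlsplit

def maybe_make_relative(link: str, target: str) -> str:
    """Return link relative to target, but only if both share scheme and host.

    Hypothetical helper for illustration; it is not part of svt's URIHelper.
    """
    l, t = urlsplit(link), urlsplit(target)
    # Cheap early exit: an http link can never be relative to a file target,
    # and a pure fragment link such as "#bookmarkname" has no scheme at all.
    if l.scheme != t.scheme or l.netloc != t.netloc or not l.path:
        return link
    base_dir = posixpath.dirname(t.path)
    rel = posixpath.relpath(l.path, base_dir)
    if l.query:
        rel += "?" + l.query
    if l.fragment:
        rel += "#" + l.fragment
    return rel

target = "file:///home/u/docs/doc.odt"
print(maybe_make_relative("http://example.org/a.html", target))
print(maybe_make_relative("file:///home/u/img/pic.png", target))
print(maybe_make_relative("#bookmarkname", target))
```

With this check, only links that can actually become relative to the target go through the more expensive path computation.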
String Indexed Access of PropertySets
- Another rather big part of the processing time is spent accessing the members of the implementations of css::beans::XPropertySet, XPropertyState and XPropertySetInfo. To find the requested element by its name, the methods from SfxItemPropertySet, SfxItemPropertySetInfo etc. iterate over an array of structs that define a property (SfxItemPropertyMap). This can be seen in the numbers for SfxItemPropertyMap::getByName, rtl::OUString::equalsAsciiL and SfxItemPropertySetInfo::hasPropertyByName in the svl library.
- The replacement for the SfxItemPropertyMap that uses a std::hash_map is ready. After changing a lot of code in the applications as well as in svtools, sfx2, svx and others, I started to compare the load/save times.
- The result is not as expected. In Media:Odfsave_withhash.ods you can see that SfxItemPropertyMap::getByName() takes longer than before. The new function takes about 5.3 s in total. This is about 1.7 s more than its predecessor SfxItemPropertyMap::GetByName() required. The time is spent mostly in the _M_Find<::rtl::OUString> method of the hash_map implementation.
- One of the probable reasons is that the sorted access to properties eliminated a lot of string comparisons.
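The lookup strategies discussed above can be contrasted in a small sketch. It is a Python illustration only, not the SfxItemPropertyMap code: the property names and ids are sample values, and the three function names are made up for the example.

```python
from bisect import bisect_left

# Illustrative property table, sorted by name; the ids are made-up values.
PROPERTIES = [
    ("CharColor", 1),
    ("CharHeight", 2),
    ("CharWeight", 3),
    ("ParaAdjust", 4),
    ("ParaStyleName", 5),
]
NAMES = [name for name, _ in PROPERTIES]
BY_NAME = dict(PROPERTIES)

def find_linear(name):
    """Scan the array entry by entry: one string comparison per entry."""
    for entry_name, wid in PROPERTIES:
        if entry_name == name:
            return wid
    return None

def find_sorted(name):
    """Binary search in the sorted array: O(log n) string comparisons."""
    i = bisect_left(NAMES, name)
    if i < len(NAMES) and NAMES[i] == name:
        return PROPERTIES[i][1]
    return None

def find_hashed(name):
    """Hash lookup: hash the name, then compare within one bucket."""
    return BY_NAME.get(name)

print(find_linear("ParaAdjust"), find_sorted("ParaAdjust"), find_hashed("ParaAdjust"))
```

All three return the same result; they differ only in how many string operations a lookup costs, which is the quantity the measurements above are sensitive to.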
Using XMultiPropertySet where XTolerantMultiPropertySet might suffice and be more performant
- To decide which properties have to be saved xmloff uses the interface methods css::beans::XPropertyState::getPropertyStates() and css::beans::XMultiPropertySet::getPropertyValues(). It could also use the interface css::beans::XTolerantMultiPropertySet::getDirectPropertyValuesTolerant() which is not implemented for Writer's UNO objects.
- Saving Writer's text content is done by iterating over the paragraphs and then over the so-called text portions within each paragraph. Text portions are parts of the paragraph that have a single attribute set, text fields, redline portions, inline anchored frames etc. It might make sense to detect their properties at construction time and preset their css::beans::XTolerantMultiPropertySet interface. At the moment, creating a text portion adds a bookmark to remember its position. The impact on real documents is not yet checked.
- A test implementation of the XTolerantMultiPropertySet in Writer's text portion objects didn't result in increased save speed.
Font Fallback
The huge contribution of GenerateAndStoreThumbnail in the callgrind measurements for some of the documents is attributed to substitution matching for missing fonts. This might be an issue to investigate.
Compressed files do not need to be compressed again in Storage
- The large contribution of SfxMedium::Commit to StoreAsUrl for the documents "ScienceThesis" and "Manual" in the callgrind analysis is attributed to the pictures in the documents. We are investigating whether it might help to store image files that are already compressed (JPEG, for example) directly to the storage without trying in vain to compress them again.
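The idea can be sketched with Python's zipfile module, since an ODF package is a ZIP archive. This is an illustration of the technique only, not the storage implementation: the helper `add_stream` and the list of extensions treated as already compressed are assumptions.

```python
import io
import os
import zipfile

# Assumption: extensions whose content is treated as already compressed.
ALREADY_COMPRESSED = {".jpg", ".jpeg", ".png", ".gif"}

def add_stream(pkg: zipfile.ZipFile, name: str, data: bytes) -> None:
    """Add a stream to the package, skipping deflate for compressed formats."""
    ext = os.path.splitext(name)[1].lower()
    if ext in ALREADY_COMPRESSED:
        method = zipfile.ZIP_STORED      # copy the bytes as they are
    else:
        method = zipfile.ZIP_DEFLATED    # compress XML and other streams
    pkg.writestr(name, data, compress_type=method)

buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as pkg:
    add_stream(pkg, "content.xml", b"<text/>" * 100)
    add_stream(pkg, "Pictures/photo.jpg", b"\xff\xd8" + bytes(64))

with zipfile.ZipFile(buf) as pkg:
    for info in pkg.infolist():
        print(info.filename, info.compress_type)
```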
Iteration over Frame Collections
The methods SwDoc::GetFlyCount and SwDoc::GetFlyNum contribute more than 13 % of the instructions to SaveAsOwnFormat for the MailMerge document. The iteration over the frames array is O(n^2). Suggested solution:
- add a method SwXFrames::getAllFrames returning a Sequence<XFrame>
- get rid of the GetByIndex accesses in xmloff
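Why indexed access is quadratic here, and how a single bulk call avoids it, can be sketched as follows. This is a simplified Python model under stated assumptions, not the SwDoc or SwXFrames code: the class `FrameCollection`, its methods and the `visited` counter are made up for the illustration.

```python
class FrameCollection:
    """Toy model: frames live in a mixed list and have no direct index."""

    def __init__(self, objects):
        self._objects = objects  # list of (name, is_frame) tuples
        self.visited = 0         # counts how many list entries were touched

    def count(self):
        return sum(1 for _, is_frame in self._objects if is_frame)

    def get_by_index(self, index):
        """Find the index-th frame by scanning from the start each time."""
        seen = 0
        for obj in self._objects:
            self.visited += 1
            if obj[1]:
                if seen == index:
                    return obj
                seen += 1
        raise IndexError(index)

    def get_all(self):
        """Return all frames in one pass over the list."""
        result = []
        for obj in self._objects:
            self.visited += 1
            if obj[1]:
                result.append(obj)
        return result

frames = FrameCollection([(f"Frame{i}", True) for i in range(100)])
by_index = [frames.get_by_index(i) for i in range(frames.count())]
indexed_cost = frames.visited
frames.visited = 0
all_at_once = frames.get_all()
print(indexed_cost, frames.visited)
```

In this model the indexed loop touches on the order of n^2 / 2 entries, while the bulk call touches each entry once.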
Test Documents
- http://docs.oasis-open.org/office/v1.1/OS/OpenDocument-v1.1.odt (long specification document)
- http://www.fonaso.com/translations/FonasoVersatil-German-V1.8.doc (short document with images)
- https://eldorado.uni-dortmund.de/bitstream/2003/21520/2/Jackler.doc (~150 pages scientific paper with tables and images)
- MailMerge Document (~600 pages, to be uploaded)
Microsoft Word documents were loaded and saved as odt before profiling.