Performance/Configuration
Status Quo
The OOo configuration (aka registry; module configmgr) has various problems:
- Reading/stat'ing its many data files upon start up has negative impact on start up performance (see Performance/Startup#strace). A secondary performance problem might be the amount of configmgr code executed (see Performance/Startup#callgrind). See Issue 101955 “improve configmgr performance”.
- There are various open issues relating to the configuration, for example:
- Issue 100548 “config data deployed via extension is still available after deinstallation and office restart ”
- Issue 88162 “Removing blank lines from a .xcu file to reduce its size”
- Issue 101422 “set nodes can't be reset correctly via configuration API”
- Issue 94456 “unnecessary Writer-javamail.xcu module”
- Issue 77102 “removeByName() for set nodes should remove them realy in XCU file”
- Issue 65160 “live deployment of configuration files does not work right”
- Issue 77200 “Assertion in config manager when installing extension”
- Issue 80296 “Reinstallation of an addon does not activate the new version completely.”
- Issue 60812 “refresh in configmgr leaves invalid items in cache”
- Issue 77174 “xsi namespace is missing in user registry files”
- Issue 53769 “Assertion: 45 times Error O:\SRC680\configmgr\source\treemgr\noderef.cxx Line 1371: (null)”
- Issue 46185 “API: cfgmgr2.OInnerTreeSetUpdateAccess::com::sun::star::container::XNameContainer”
- Issue 44715 “Set Update part of Config example needs to be rewritten”
- Issue 65845 “configuration should support IndexContainer/Access”
- Issue 52328 “Performance: configmanager”
- Issue 60022 “soffice with broken Common.xcu fails to start”
- Issue 60021 “soffice with broken Common.xcu does not terminate immediately upon user request”
- Issue 14471 “Configuration: cannot reinsert removed set elements without intervening commit”
- Issue 69360 “Support for localized values in extendable groups”
- Issue 56687 “User-Config corrupted if targetdir modified”
- I keep getting crash reports from configmgr code, some clearly from multi-threaded scenarios, some not. (See this mail.)
Improvement
All of the above, combined with the fact that the current configmgr code is hard to maintain (for me at least), made me consider a complete re-implementation of configmgr. The performance implications of reading/stat'ing the many data files during start up were clearly the most important factor here. This work is done on CWS sb111.
The intended steps are as follows:
- First, create a new implementation that has the same UNO API as the old one (it might leave out some obscure interfaces that are practically unused, anyway). It will continue to read the existing .xcs/.xcu files, but will (hopefully) be simpler and easier to maintain than the old implementation. I am currently working on this step.
- Second, improve (start up) performance by reorganizing the configuration's data files. What the best organization will look like is still open to experimentation. See Configmgr_Refactoring for related ideas.
- A further step might be to redesign the configmgr UNO API (and the relevant client code) to improve performance further, should measurements indicate that this is necessary.
Work in Progress
I do the improvements on CWS sb111. It is using Mercurial (instead of Subversion), and the Mercurial repository is available at http://hg.services.openoffice.org/hg/cws/sb111/ (which I sync with my local working repository more or less regularly).
Some notes, in no particular order:
- There are no overview documents yet. (The rationale is that design is still in big flux, so it is more convenient for me to work with notes on sheets of paper on my desk for now.)
- My simple approach for now is to read in all the .xcs/.xcu files during start up (many of them combined into larger .xcd files, which are just concatenations of multiple .xcs/.xcu files, to decrease file I/O times), and write all modifications to a single file $UserInstallation/user/registrymodifications.xcu (using a custom XML format based on the .xcu format), and to optimize later. For now, a corrupted registrymodifications.xcu file (due to errors in the new code, for example) can prevent OOo from starting up; just remove the file in such a situation.
- For OOo extensions containing configuration data, the extension manager now communicates with the configuration manager via a simple (non–UNO-based) C++ interface, through which the extension manager can register configuration data files and remove them again. Which extension configuration data files are registered (in the shared resp. per-user store) is now recorded in registry/com.sun.star.comp.deployment.configuration.PackageRegistryBackend/configmgrrc ini-files within the extension manager's cache directories (and those configuration data files are no longer merged into registry/com.sun.star.comp.deployment.configuration.PackageRegistryBackend/registry trees, although for now those trees are still read out to cater for already installed extensions). At runtime, newly registered extension configuration data files are live-merged into the configuration manager's runtime data, while the effects of removing data files only become active on the next OOo start (rationale: if a configuration item is removed from a layer, the configuration manager at runtime has no information whether or not it hid an underlying item that now should become visible).
- For now, I concentrate on availability of basic functionality, like executing the smoke test document (smoketestoo_native/data/smoketestdoc.sxw). Some code paths are not implemented yet (especially UNO API methods that appear to be rarely, if ever, used) and cause OOo to crash (to conveniently give me a stack trace when running OOo from within a debugger).
- I mainly work on Solaris Intel, so the code may occasionally have problems with other compilers.
- As for a time line, the new code will definitely not go into OOo 3.2. More realistic targets are OOo 3.3 or, if file format incompatibilities require it, even OOo 4.
Some numbers, taken on the same Linux box where I did earlier measurements, show that the new implementation is currently about as fast as the old one. Currently, the new implementation uses libxml2's xmlreader API to read XML files (using libxml2's DOM-based API turned out to burn too many processing cycles, esp. in memory allocation, and libxml2's C-based SAX API is not very useful in a C++ scenario where the callbacks want to throw exceptions, and handling of .xcd <dependency> elements requires a mechanism to stop parsing prematurely). The callgrind numbers show that still too many processing cycles are spent reading the XML files in libxml2. With a bit of work, this could be replaced with a custom XML reader that would hopefully need fewer cycles and would hopefully speed up start up times. Comparing a plain unxlngi6.pro DEV300m54 OOo with (DEV300m54-based) sb111, /usr/bin/time -v opt/openoffice.org3/program/soffice -writer (and hitting Ctrl-Q when the text cursor starts to blink) gives the following numbers (one cold and two subsequent warm starts each):
| | user | system | percentage | wall-clock | major page faults |
|---|---|---|---|---|---|
| plain | 1.26s | 0.23s | 22% | 6.75s | 340 |
| | 1.32s | 0.16s | 51% | 2.88s | 1 |
| | 1.30s | 0.16s | 53% | 2.76s | 1 |
| sb111 | 1.61s | 0.30s | 28% | 6.66s | 334 |
| | 1.53s | 0.17s | 65% | 2.60s | 1 |
| | 1.42s | 0.16s | 63% | 2.48s | 1 |
callgrind numbers, sorted per ELF-object:
| plain | | | sb111 | | |
|---|---|---|---|---|---|
| ∑ | 2,620,343,316 | 100.00% | ∑ | 2,705,655,238 | 100.00% |
| sal | 686,659,333 | 26.20% | sal | 519,735,216 | 19.21% |
| ld | 330,253,794 | 12.60% | xml2 | 497,961,694 | 18.40% |
| configmgr2 | 314,864,471 | 12.02% | ld | 303,513,949 | 11.22% |
| c | 215,660,518 | 8.23% | configmgr | 229,593,616 | 8.49% |
| fontconfig | 205,835,897 | 7.86% | c | 218,615,140 | 8.08% |
| pthread | 117,283,877 | 4.48% | fontconfig | 205,908,326 | 7.61% |
| cppu | 116,507,422 | 4.45% | pthread | 96,248,977 | 3.56% |
| vcl | 92,381,398 | 3.53% | vcl | 92,373,178 | 3.41% |
| … | … | … | … | … | … |
- After replacing libxml2 with a hand-crafted parser to read the XML files (that mmaps the files into memory), start up times on the above Linux box unfortunately increased (to slightly over 7 sec. for cold start; warm start showed no significant difference). While the number of processing cycles indeed decreased, it appears that mmaping files and then reading them causes too many page faults that require I/O, on Linux.
- On Mac OS X, however, (on some recent 13" MacBook), start up times did decrease with the hand-crafted XML parser: from 13/3 sec. (cold/warm) for a plain DEV300m54 to 11/2 sec. with the new parser, to 9.5/2 sec. with an additional WILLNEED hint (see next item).
- SUSv3 posix_madvise(..., POSIX_MADV_WILLNEED) (or equivalent on the various platforms) gives a hint to the operating system that the process is going to read the mmaped data, so it should start paging data in. And at least on Mac OS X (10.5) this appears to have the intended effect (see previous item). However, on Linux it actually degraded overall performance, as the madvise call apparently only returned once all the data had actually been paged in (so that the process could not start to parse the start of the XML file in parallel with the operating system paging in the rest).
- Corresponding new osl_File_MapFlag_WillNeed (changeset c9e71e0e1283) uses “@since UDK 3.2.12,” which will potentially have to be adjusted (and a changes mail will eventually have to be written). Also, support on other platforms (especially Windows) would be desirable.
- That, on Linux, cold start up got slower going from (probably chunk-at-a-time read based) libxml2 parsing to the new (mmap based) parser indicates that it might be worthwhile to change the new parser to also use chunk-at-a-time read, and see whether start up gets faster. However, that would be some work (and would apparently be wasted on other platforms like Mac OS X).
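The mmap-plus-hint pattern discussed in the items above can be sketched roughly as follows. This is a minimal illustration using plain POSIX calls, not the code from CWS sb111; the MappedFile type and the mapForReading and unmapFile helpers are hypothetical names introduced here, and the sketch bypasses the osl file API entirely.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

// Hypothetical helper type (not OOo code): a read-only mapping of a whole file.
struct MappedFile {
    const char* data;
    std::size_t size;
    bool ok;
};

// Maps `path` read-only; if `willNeed` is set, additionally hints to the
// operating system that the whole mapping is about to be read.
MappedFile mapForReading(const std::string& path, bool willNeed) {
    MappedFile result = { nullptr, 0, false };
    int fd = open(path.c_str(), O_RDONLY);
    if (fd == -1) {
        return result;
    }
    struct stat st;
    if (fstat(fd, &st) == 0 && st.st_size > 0) {
        std::size_t size = static_cast<std::size_t>(st.st_size);
        void* p = mmap(nullptr, size, PROT_READ, MAP_PRIVATE, fd, 0);
        if (p != MAP_FAILED) {
            if (willNeed) {
                // Purely advisory: a failure here is ignored on purpose.
                (void) posix_madvise(p, size, POSIX_MADV_WILLNEED);
            }
            result.data = static_cast<const char*>(p);
            result.size = size;
            result.ok = true;
        }
    }
    close(fd); // the mapping stays valid after the descriptor is closed
    return result;
}

// Releases a mapping obtained from mapForReading.
void unmapFile(MappedFile& file) {
    assert(file.ok && file.data != nullptr);
    munmap(const_cast<char*>(file.data), file.size);
    file.data = nullptr;
    file.size = 0;
    file.ok = false;
}
```

Whether the hint helps is platform dependent, as the measurements above show, so a real implementation would probably want the willNeed flag to be decided per platform rather than hard-coded.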
Platform Backends
The platform backends (like the locale backend supplying data for org.openoffice.System.L10N.Locale etc, all implemented in shell/source/backends and extensions/source/config/ldap) used to work as follows: They all support the com.sun.star.configuration.backend.PlatformBackend service and implement com.sun.star.configuration.backend.XSingleLayerStratum (some also implement com.sun.star.configuration.backend.XBackendChangesNotifier, but it is unclear whether that is really fully implemented and used). configmgr would iterate all PlatformBackend services and obtain the XLayer information. It appears that, from all the complexity available through XLayer, the only thing used here would be to set specific values for specific (non-localized) properties.
I could have re-implemented this in configmgr2 (not touching the platform backends), but it appeared too complex to me. Hence, I re-organized and stripped down as follows:
- In oor:component-data (i.e., .xcu files), the value of a non-localized prop can have an oor:external attribute (mutually exclusive with oor:nil="true"; and the value element must not have content then). The value of the attribute (after full attribute value normalization) should consist of a UNO service name (which must not contain a space), a space, and an identifier.
- When a platform backend used to potentially provide a specific property, a corresponding oor:external attribute is now added to the corresponding officecfg/registry/data .xcu file.
  - There is one problem with set member properties /org.openoffice.Setup/Office/Factories/com.sun.star.presentation.PresentationDocument/ooSetupFactoryDefaultFilter, /org.openoffice.Setup/Office/Factories/com.sun.star.sheet.SpreadsheetDocument/ooSetupFactoryDefaultFilter, and /org.openoffice.Setup/Office/Factories/com.sun.star.text.TextDocument/ooSetupFactoryDefaultFilter for the GConf backend in the ENABLE_LOCKDOWN case, as the .xcu entries would be conditional on gconflockdown as well as, respectively, impress, calc, or writer.
- The platform backends are modified. They no longer support the PlatformBackend service (but, for simplicity, still support their specific old service names, even though their interfaces have changed). Instead of XSingleLayerStratum they now implement com.sun.star.beans.XPropertySet:
  - Each backend assigns identifiers (unique within that backend) to the properties it potentially supports.
  - getPropertySetInfo may return null; it will never be called by configmgr.
  - setPropertyValue is not called by configmgr for now.
  - getPropertyValue shall behave as follows:
    - If the given identifier is potentially supported and indeed a value can be supplied for it, that value shall be returned wrapped in a com::sun::star::beans::Optional<any> value.
    - If the given identifier is potentially supported but no value can be supplied for it, an empty com::sun::star::beans::Optional<any> shall be returned (to be able to distinguish a nil configuration value from an identifier for which no value can be supplied).
    - If the given identifier is not supported at all, a com::sun::star::beans::UnknownPropertyException shall be raised.
  - Property change listeners are not used by configmgr for now. They can be used in the future as a mechanism to notify of changes in the backend.
  - Vetoable change listeners will never be used by configmgr.
- XcuParser handles the oor:external attribute by passing it to PropertyNode, where it is evaluated on demand. If the given service is not installed this is treated as if the service potentially supported the given id but could not supply a value for it. A syntactically invalid attribute value, failure to instantiate the given service's XPropertySet, or exceptions from XPropertySet.getPropertyValue, however, result in failures of PropertyNode::getValue.
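To make the contract above more concrete, here is a small stand-alone sketch of how an oor:external value could be split and how a backend could answer getPropertyValue. It deliberately avoids UNO: OptionalValue stands in for com::sun::star::beans::Optional<any>, std::out_of_range stands in for UnknownPropertyException, and ExternalRef, parseExternal, and ToyBackend are hypothetical names that do not exist in configmgr.

```cpp
#include <cassert>
#include <map>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for com::sun::star::beans::Optional<any>, using
// strings instead of UNO any values.
struct OptionalValue {
    bool isPresent;
    std::string value;
};

// The two parts of an oor:external attribute value.
struct ExternalRef {
    std::string service;    // UNO service name (contains no space)
    std::string identifier; // backend-specific identifier
};

// Splits an already normalized attribute value at its first space.
// Throws std::invalid_argument for syntactically invalid values.
ExternalRef parseExternal(const std::string& attribute) {
    std::string::size_type space = attribute.find(' ');
    if (space == std::string::npos || space == 0
        || space + 1 == attribute.size()) {
        throw std::invalid_argument("invalid oor:external value: " + attribute);
    }
    ExternalRef ref = { attribute.substr(0, space), attribute.substr(space + 1) };
    assert(ref.service.find(' ') == std::string::npos);
    return ref;
}

// Hypothetical toy backend that follows the getPropertyValue rules above.
class ToyBackend {
public:
    // Declares an identifier as potentially supported, without a value.
    void declare(const std::string& id) {
        OptionalValue none = { false, std::string() };
        values_[id] = none;
    }
    // Declares an identifier and supplies a value for it.
    void supply(const std::string& id, const std::string& value) {
        OptionalValue some = { true, value };
        values_[id] = some;
    }
    // Present value, empty value, or exception, matching the three cases.
    // std::out_of_range stands in for UnknownPropertyException.
    OptionalValue getPropertyValue(const std::string& id) const {
        std::map<std::string, OptionalValue>::const_iterator it = values_.find(id);
        if (it == values_.end()) {
            throw std::out_of_range("unknown property: " + id);
        }
        return it->second;
    }
private:
    std::map<std::string, OptionalValue> values_;
};
```

In this model, a caller that gets an OptionalValue with isPresent set to false would fall back to whatever the lower configuration layers provide, while either exception corresponds to the failure cases of PropertyNode::getValue described above.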
LDAP Backend
With the old configmgr, the configuration of the LDAP backend was two-level:
- basis/share/registry/data/org/openoffice/LDAP.xcu.sample (edited and with the “.sample” dropped) defines the access to an LDAP server and a mapping.
- The corresponding mapping basis/share/registry/ldap/mapping-attr.map defines what properties of the OOo configuration can be filled from which LDAP attributes of the accessed LDAP entry (it is assumed that the attribute's values are UTF-8 strings). For each OOo configuration property, multiple LDAP attributes can be listed, comma separated; the first attribute for which a value can be obtained, if any, is used.
- There were predefined mapping files oo-ldap-attr.map (suitable for a Sun Java System Directory Server) and oo-ad-ldap-attr.map (suitable for a Windows Active Directory Server).
This has been simplified:
- basis/share/registry/mapping.xcd.sample (edited and with the “.sample” dropped) defines the access to an LDAP server (as before) and also contains all the affected OOo configuration properties, using oor:external attributes to specify the corresponding LDAP attributes.
- The value of such an oor:external attribute must be the service name com.sun.star.configuration.backend.LdapUserProfileBe followed by a space and then followed by one or more LDAP attribute names (which may not contain commas) separated by commas (and no interspersed spaces).
- There are two predefined examples, oo-ldap.xcd.sample and oo-ad-ldap.xcd.sample.
- The old files LDAP.xcu.sample, oo-ldap-attr.map, oo-ad-ldap-attr.map and corresponding directories are removed. The OOo configuration property /org.openoffice/LDAP/UserDirectory/Mapping is no longer used and has been marked as obsolete.
This implies that maintainers of pre-OOo 3.3 installations that deployed the LDAP backend need to migrate old basis/share/registry/data/org/openoffice/LDAP.xcu (and basis/share/registry/ldap/mapping-attr.map) to new basis/share/registry/mapping.xcd upon upgrade. Otherwise, OOo would no longer obtain the relevant user data from LDAP.
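The identifier part of such an LDAP oor:external value (the comma-separated attribute names after the service name) could be processed along the following lines. This is only a sketch: it assumes that the "first attribute for which a value can be obtained" rule of the old mapping files carries over to the new format, it models the LDAP entry as a plain map, and splitAttributes and firstAvailable are hypothetical names, not functions of the LDAP backend.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Splits the identifier part of an oor:external value ("attr1,attr2,...")
// into LDAP attribute names. Names contain neither commas nor spaces.
std::vector<std::string> splitAttributes(const std::string& identifier) {
    std::vector<std::string> names;
    std::string::size_type start = 0;
    for (;;) {
        std::string::size_type comma = identifier.find(',', start);
        if (comma == std::string::npos) {
            names.push_back(identifier.substr(start));
            break;
        }
        names.push_back(identifier.substr(start, comma - start));
        start = comma + 1;
    }
    return names;
}

// Hypothetical lookup: `entry` models the LDAP entry as attribute -> value.
// Returns true and sets `value` from the first listed attribute that has one.
bool firstAvailable(const std::vector<std::string>& names,
                    const std::map<std::string, std::string>& entry,
                    std::string& value) {
    assert(!names.empty());
    for (std::size_t i = 0; i != names.size(); ++i) {
        std::map<std::string, std::string>::const_iterator it =
            entry.find(names[i]);
        if (it != entry.end()) {
            value = it->second;
            return true;
        }
    }
    return false;
}
```

A false return from firstAvailable would correspond to the "potentially supported but no value can be supplied" case of the platform backend contract, so the configuration would keep its default value for that property.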
Localized Data
In general, the non-localized .xcu (and resulting .xcd) files contain the en-US localized values (apparently so that fallback data for missing localization is available). Hence, the en-US localized .xcu/.xcd files are, in general, effectively empty, and consequently registry_en-US.xcd (containing most of the localized values) is not included in installation sets.
However, fcfg_langpack_en-US.zip (from module filter; and the resulting fcfg_langpack_en-US.xcd) does contain en-US localized values. The handling of .xcu files in filter is a mess (as of DEV300m61):
- “xml:lang="x-default"” is used for localized values that shall not be localized through the OOo translation mechanism (i.e., filter UI names that are names of software products and thus the same in all languages).
- filter/source/config/fragments/filters contains .xcu files that contribute members to the /org.openoffice.TypeDetection.Filter/Filters set, whose member template Filter has a localized UIName property. There are .xcu files that contain x-default localized values and .xcu files (all ending in _ui.xcu) that contain en-US localized values. The x-default localized values from the non-_ui.xcu files end up in the non-localized fcfg_*_filters.xcu files delivered from filters. The en-US localized values from the _ui.xcu files end up in the localized fcfg_localized_en-US.zip delivered from filters, and the other localizations of those values, provided through the OOo translation mechanism, analogously end up in the corresponding fcfg_localized_*.zip files.
- filter/source/config/fragments/internalgraphicfilters contains .xcu files that contribute members to the /org.openoffice.TypeDetection.GraphicFilter/Filters set, whose member template Filter has a localized UIName property. There are .xcu files that contain en-US localized values. Those en-US localized values end up in the non-localized fcfg_internalgraphics_filters.xcu delivered from filters. It is assumed that the OOo translation mechanism is not used at all to extract the en-US localized values from the source .xcu files to allow for them to be translated to other languages. Whether this behavior is intended is unclear.
- filter/source/config/fragments/types contains .xcu files that contribute members to the /org.openoffice.TypeDetection.Types/Types set, whose member template Type has a localized UIName property. There are .xcu files that contain x-default and/or en-US localized values. Those (x-default or en-US) localized values end up in the non-localized fcfg_*_types.xcu files delivered from filters. The OOo translation mechanism appears to be used to at least extract the en-US localized values from the source .xcu files to allow for them to be translated to other languages (see filters/source/config/fragments/types/makefile.mk); however, it appears that any translated localized values would not flow back into the non-localized fcfg_*_types.xcu files. According to Andreas Schlüns, those localized UIName properties are not used by OOo at the moment, anyway.