Calc/Proposal DataPilot byIBM
Specification Status | |
Author | Wang Xu Ming |
Last Change | See wiki history |
Background
DataPilot is a critical function for spreadsheet users. For the 1.2 and 1.3 releases, the IBM Lotus Symphony Spreadsheet team developed several new DataPilot features based on the OpenOffice 1.1 code base and merged in the DataPilot-related code from OpenOffice 2.4.
During development, the test team found a serious performance problem when a user creates or updates a DataPilot table.
The development team then improved the performance of the algorithm that creates and outputs a DataPilot table, and will continue this work in version 2.0, which uses the OpenOffice 3.1 code base.
Problem Description
Low performance when updating a DataPilot table
The test team timed several operations on a sample DataPilot table whose data source has 5000 rows.
Test environment: Hardware: IBM T30; CPU: 2.4 GHz; Memory: 1.0 GB; Operating system: Windows XP SP2
The table below shows the test results for OpenOffice 3.0.0:
Test Scenario | OpenOffice 3.0.0 |
Page: 1, Column: 2, Row: 1, Data: 1; Action: Add Product to Row | 3.15 s |
Page: 1, Column: 2, Row: 1, Data: 1; Action: Product ID to Data | 3.06 s |
Page: 1, Column: 3, Row: 3, Data: 1; Action: Add Product to Row | 25.28 s |
Page: 1, Column: 3, Row: 3, Data: 1; Action: Add Product ID to Data | 44.21 s |
Page: 1, Column: 2, Row: 2, Data: 3; Action: Add SalesRep to Data | 6.28 s |
Page: 1, Column: 2, Row: 1, Data: 1; Action: Change the function of Revenue from Sum to Max | 6.03 s |
Crash
Inserting two fields into the row area (each field has about 1000 members) causes a freeze and then a crash.
Analysis results
First, create a DataPilot table with a complex layout and use Rational Quantify to generate a performance report for it.
In that report, the three functions with the highest "F time (% of Focus)" are:
- new
- OutputDevice::DrawLine
- SfxItemSet::==
Scanning and debugging the code then revealed three causes:
Allocating a lot of redundant data
For a simple DataPilot table:
Member A1 in field L1 creates an array for all members {B1, B2, B3}, although only B1 is visible and valid.
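To make the cost of this allocation pattern concrete, the following is a minimal C++ sketch, not the actual DataPilot code. It compares the number of child entries created when every member of the first field gets a slot for every member of the second field against the number created when only the combinations present in the source are kept. The names `SourceRow`, `countDenseChildren`, and `countVisibleChildren` are hypothetical, and members are assumed to be represented by integer ids.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <set>
#include <utility>
#include <vector>

// Assumption: one source row reduced to member ids (id in field L1, id in field L2).
using SourceRow = std::pair<int, int>;

// Dense layout: every L1 member gets a child slot for every L2 member.
std::size_t countDenseChildren(std::size_t l1Members, std::size_t l2Members)
{
    return l1Members * l2Members;
}

// Sparse layout: a child exists only for (L1, L2) pairs that occur in the source.
std::size_t countVisibleChildren(const std::vector<SourceRow>& rows)
{
    std::map<int, std::set<int>> children; // L1 member id -> visible L2 member ids
    for (const SourceRow& row : rows)
        children[row.first].insert(row.second);

    std::size_t total = 0;
    for (const auto& entry : children)
        total += entry.second.size();
    return total;
}
```

For two fields of three members each, the dense layout creates 3 × 3 = 9 child slots. If, as an assumption, each first-field member occurs with only one second-field member in the source, the sparse layout creates 3. With two fields of about 1000 members each, the dense count reaches 1,000,000 entries.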
Allocating too much memory
Every member's data is stored in a large structure.
Setting border styles for the output area too many times
Some borders are set twice or more.
Solution
Data Source buffer
The document stores an array of source buffers. Every DataPilot table has a buffer id, and tables that have the same data source can use the same id.
In the buffer, each member of a field is identified by an id (its sorted index).
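The buffer idea can be sketched in C++ as follows. This is an illustration under stated assumptions, not the Symphony or OpenOffice implementation: the class names `SourceBufferPool` and `FieldCache`, the use of a string to describe a data source, and the string-valued members are all hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical document-level pool: tables with the same source share one buffer id.
class SourceBufferPool
{
public:
    // Returns the existing buffer id for this source, or registers a new one.
    int bufferIdFor(const std::string& sourceRange)
    {
        for (std::size_t i = 0; i < sources_.size(); ++i)
            if (sources_[i] == sourceRange)
                return static_cast<int>(i);
        sources_.push_back(sourceRange);
        return static_cast<int>(sources_.size() - 1);
    }

    std::size_t bufferCount() const { return sources_.size(); }

private:
    std::vector<std::string> sources_;
};

// Hypothetical per-field cache: each distinct member is stored once, in sorted order.
class FieldCache
{
public:
    explicit FieldCache(std::vector<std::string> values)
        : members_(std::move(values))
    {
        std::sort(members_.begin(), members_.end());
        members_.erase(std::unique(members_.begin(), members_.end()), members_.end());
    }

    // The member id is the index in the sorted list; -1 means "not a member".
    int idOf(const std::string& value) const
    {
        auto it = std::lower_bound(members_.begin(), members_.end(), value);
        if (it == members_.end() || *it != value)
            return -1;
        return static_cast<int>(it - members_.begin());
    }

    // Looks the member value up again from its id (throws on an invalid id).
    const std::string& memberOf(int id) const
    {
        return members_.at(static_cast<std::size_t>(id));
    }

    std::size_t size() const { return members_.size(); }

private:
    std::vector<std::string> members_;
};
```

In this sketch, code that needs a member stores only a small integer and looks the value up in the shared cache when required, so the member data itself exists once per source rather than once per table entry.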
The output table's algorithm then uses this id in place of the ScDPItemData structure.
Only allocate visible members
Enhance the algorithm for setting border styles
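One way to avoid setting the same border more than once is to remember which borders have already been applied. The C++ sketch below illustrates that idea only; the source does not describe the actual enhancement, and `Side`, `BorderRequest`, `BorderSink`, and `applyBordersOnce` are hypothetical names. `BorderSink` is a stand-in that merely counts calls instead of touching a real sheet.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <tuple>
#include <vector>

enum class Side { Top, Bottom, Left, Right };

// Assumption: a border is requested per cell (column, row) and per side.
struct BorderRequest
{
    int col;
    int row;
    Side side;
};

// Stand-in for the real, expensive border-setting call; it only counts invocations.
struct BorderSink
{
    std::size_t calls = 0;
    void setBorder(const BorderRequest&) { ++calls; }
};

// Forwards each distinct (col, row, side) request exactly once and skips repeats.
void applyBordersOnce(const std::vector<BorderRequest>& requests, BorderSink& sink)
{
    std::set<std::tuple<int, int, Side>> seen;
    for (const BorderRequest& r : requests)
        if (seen.insert(std::make_tuple(r.col, r.row, r.side)).second)
            sink.setBorder(r);
}
```

This sketch treats only identical requests as duplicates. It does not recognise that the right edge of one cell and the left edge of its neighbour are the same line, which a fuller version might also merge.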