Difference between revisions of "Calc/Proposal DataPilot byIBM"
Revision as of 10:27, 23 June 2009
Specification Status | |
Author | Wang Xu Ming |
Last Change | See wiki history |
Background
DataPilot is a critical function for spreadsheet users. In the 1.2 and 1.3 releases, the IBM Lotus Symphony Spreadsheet team developed several new DataPilot features based on the OpenOffice 1.1 code base and merged the DataPilot-related code in OpenOffice 2.4.
During development, the test team found a serious performance problem when a user creates or updates a DataPilot table.
The development team then improved the performance of the algorithm that creates and outputs a DataPilot table, and will continue working on it in version 2.0, which uses the OpenOffice 3.1 code base.
Problem Description
Low performance when updating a DataPilot table
The test team timed several operations on a sample DataPilot table with a 5000-row data source.
Test environment: Hardware: IBM T30; CPU: 2.4 GHz; Memory: 1.0 GB; Operating system: Windows XP SP2
The table below shows the test results for OpenOffice 3.0.0:
Test Scenario | OpenOffice 3.0.0 |
Page: 1, Column: 2, Row: 1, Data: 1; Action: Add Product to Row | 3.15 s |
Page: 1, Column: 2, Row: 1, Data: 1; Action: Product ID to Data | 3.06 s |
Page: 1, Column: 3, Row: 3, Data: 1; Action: Add Product to Row | 25.28 s |
Page: 1, Column: 3, Row: 3, Data: 1; Action: Add Product ID to Data | 44.21 s |
Page: 1, Column: 2, Row: 2, Data: 3; Action: Add SalesRep to Data | 6.28 s |
Page: 1, Column: 2, Row: 1, Data: 1; Action: Change the function of Revenue from Sum to Max | 6.03 s |
Crash
Inserting two fields into the row area (each field has about 1000 members) causes the application to freeze and crash.
Analysis Results
Allocating a lot of redundant data
For a simple DataPilot table:
[Image: simple dptable.jpg]
Member A1 in field L1 creates an array for all members {B1, B2, B3}, but only B1 is visible and valid.
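To make the cost concrete, the following C++ sketch compares how many child slots the two strategies would allocate. It is only an illustration under stated assumptions: the names `Row`, `eagerSlots`, and `visibleSlots` are hypothetical and not taken from the OpenOffice code, and the source is assumed to have just two row fields, represented as pairs of member names.

```cpp
#include <cstddef>
#include <map>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical source row: (outer field member, inner field member).
using Row = std::pair<std::string, std::string>;

// Eager layout: every outer member reserves a slot for every inner member,
// whether or not that combination occurs in the source.
std::size_t eagerSlots(const std::vector<Row>& rows) {
    std::set<std::string> outer, inner;
    for (const Row& r : rows) {
        outer.insert(r.first);
        inner.insert(r.second);
    }
    return outer.size() * inner.size();
}

// Visible-only layout: a slot exists only for combinations that
// actually occur in the source.
std::size_t visibleSlots(const std::vector<Row>& rows) {
    std::map<std::string, std::set<std::string>> children;
    for (const Row& r : rows) {
        children[r.first].insert(r.second);
    }
    std::size_t total = 0;
    for (const auto& entry : children) {
        total += entry.second.size();
    }
    return total;
}
```

For the rows (A1, B1), (A2, B2), (A3, B3), the eager layout needs 9 slots and the visible-only layout needs 3. With two fields of about 1000 members each, the eager layout could need up to 1,000,000 slots, while the visible-only layout never needs more slots than there are source rows.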
Allocating too much memory
Every member's data is stored in a large structure.
Setting border styles too many times for the output area
Some borders are set two or more times.
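One possible way to avoid repeated border calls is to collect the requested edges first and apply each one only once. The C++ sketch below assumes this approach; the proposal does not describe the actual algorithm, and `Side`, `BorderRequest`, and `uniqueBorders` are hypothetical names introduced for illustration.

```cpp
#include <set>
#include <tuple>
#include <vector>

// Hypothetical description of one border edge of one output cell.
enum class Side { Top, Bottom, Left, Right };

struct BorderRequest {
    int col;
    int row;
    Side side;

    // Ordering so that requests can be stored in a std::set.
    bool operator<(const BorderRequest& other) const {
        return std::tie(col, row, side) <
               std::tie(other.col, other.row, other.side);
    }
};

// Collapses repeated requests so that each cell edge is styled at most once.
std::vector<BorderRequest> uniqueBorders(const std::vector<BorderRequest>& requests) {
    std::set<BorderRequest> seen(requests.begin(), requests.end());
    return std::vector<BorderRequest>(seen.begin(), seen.end());
}
```

The caller would then iterate over the returned list and set each border style once, instead of setting it every time the output code visits the same edge.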
Solution
Data source buffer
A document stores an array of source buffers. Every table has a buffer id, and DataPilot tables that have the same data source can use the same id.
In the buffer, each member of a field is identified by an id (its sorted index).
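The buffer idea can be sketched in C++ as follows. This is a minimal illustration, not the Symphony implementation: `FieldBuffer` and `SourceBufferRegistry` are hypothetical names, members are assumed to be plain strings, and a data source is assumed to be identified by a string key.

```cpp
#include <algorithm>
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical cache for one field of a data source.
// Each distinct member is stored once; its id is its index in sorted order.
class FieldBuffer {
public:
    explicit FieldBuffer(std::vector<std::string> values)
        : members_(std::move(values)) {
        std::sort(members_.begin(), members_.end());
        members_.erase(std::unique(members_.begin(), members_.end()),
                       members_.end());
    }

    // Returns the id (sorted index) of a member, or -1 if it is unknown.
    long idOf(const std::string& member) const {
        auto it = std::lower_bound(members_.begin(), members_.end(), member);
        if (it == members_.end() || *it != member) {
            return -1;
        }
        return static_cast<long>(it - members_.begin());
    }

    // Returns the member stored under the given id (throws if out of range).
    const std::string& memberAt(std::size_t id) const { return members_.at(id); }

    std::size_t size() const { return members_.size(); }

private:
    std::vector<std::string> members_;
};

// Hypothetical document-level registry: tables that share a data source
// (identified here by a string key) receive the same buffer id.
class SourceBufferRegistry {
public:
    std::size_t bufferIdFor(const std::string& sourceKey) {
        auto it = ids_.find(sourceKey);
        if (it != ids_.end()) {
            return it->second;
        }
        std::size_t id = ids_.size();
        ids_.emplace(sourceKey, id);
        return id;
    }

private:
    std::map<std::string, std::size_t> ids_;
};
```

Because ids follow the sorted order of the members, comparing two ids of the same field gives the same result as comparing the members themselves, so code that works on ids does not need to carry the full member data.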
Then, in the output table's algorithm, the ScDPItemData structure is replaced by an id.
Only allocate visible members
Enhance the algorithm for setting border styles