Difference between revisions of "Calc/Performance/Cell size"
Revision as of 11:06, 10 March 2008
Basic problem: even the simplest cell consumes about 50 bytes all told, and more complex cells consume far more memory. There are a number of simple and obvious things to re-factor here.
It's trivial to calculate the average cost: create a sheet with several thousand cells in it, measure the change in heap allocation on load with memprof or some other tool, then divide by the number of cells. Wins are easy to measure the same way.
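The arithmetic behind that measurement can be sketched as follows. The helper name averageBytesPerCell and its parameters are illustrative assumptions, not part of Calc; the two heap readings would come from whatever profiler is in use.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: average heap cost per cell, given two heap-size
// readings (in bytes) taken before and after loading the test sheet.
double averageBytesPerCell(std::size_t heapBefore, std::size_t heapAfter,
                           std::size_t cellCount)
{
    assert(cellCount > 0 && heapAfter >= heapBefore);
    return static_cast<double>(heapAfter - heapBefore)
         / static_cast<double>(cellCount);
}
```

Running the same calculation before and after a re-factoring, on the same sheet, gives the per-cell saving directly.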
ScBaseCell
sc/inc/cell.hxx; code in sc/source/core/data/cell.cxx & cell2.cxx
class ScBaseCell
{
protected:
    ScPostIt*       pNote;
    SvtBroadcaster* pBroadcaster;
    USHORT          nTextWidth;
    BYTE            eCellType;      // enum CellType - BYTE saves memory
    BYTE            nScriptType;
    // ...
};
Every cell carries this overhead; note that a chunk of it is not necessary for many cells:
- ScPostIt pointer - very, very infrequently used; hardly any cells carry a post-it note.
- SvtBroadcaster - only used by cells that are referenced from another cell by a single-cell (i.e. non-range) reference; again, a subset of all cells.
Solution: a little re-factoring is required, but we can drop both pointers and steal a bit from eCellType to denote a 'special' cell:
class ScBaseCell
{
protected:
    USHORT nTextWidth;
    BYTE   eCellType : 7;   // enum CellType - BYTE saves memory
    bool   bSpecial  : 1;   // other information to be looked up elsewhere
    BYTE   nScriptType;
    // ...
};
The 'bSpecial' flag could be used to denote that there is a 'note' for this cell (held in a separate hash), or that this cell has a single-cell dependent. So we can save two thirds of the base size with fairly little effort: assuming 4-byte pointers, the data members shrink from 12 bytes to 4.
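A minimal sketch of the side-table idea, under stated assumptions: FatCell, SlimCell and NoteTable are stand-in names, the note is a plain string rather than a ScPostIt, and the cell address is assumed to be packed into a 64-bit key. None of this is existing Calc code. Both bit-fields use the same underlying type here so that common compilers pack them into one byte.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <unordered_map>
#include <utility>

// Current layout: two pointers plus four bytes of data.
struct FatCell
{
    void*         pNote;         // stands in for ScPostIt*
    void*         pBroadcaster;  // stands in for SvtBroadcaster*
    std::uint16_t nTextWidth;
    std::uint8_t  eCellType;
    std::uint8_t  nScriptType;
};

// Proposed layout: pointers dropped, one bit borrowed from the type byte.
struct SlimCell
{
    std::uint16_t nTextWidth;
    std::uint8_t  eCellType : 7;
    std::uint8_t  bSpecial  : 1;  // extra data lives in a side table
    std::uint8_t  nScriptType;
};

// Hypothetical side table: notes are stored here, keyed by cell address.
class NoteTable
{
public:
    void setNote(SlimCell& cell, std::uint64_t address, std::string text)
    {
        assert(!text.empty());
        notes_[address] = std::move(text);
        cell.bSpecial = 1;
    }

    // Ordinary cells return nullptr without touching the hash at all.
    const std::string* findNote(const SlimCell& cell,
                                std::uint64_t address) const
    {
        if (!cell.bSpecial)
            return nullptr;
        auto it = notes_.find(address);
        return it == notes_.end() ? nullptr : &it->second;
    }

private:
    std::unordered_map<std::uint64_t, std::string> notes_;
};
```

The lookup still checks the hash result, because a set flag may mean "has a single-cell dependent" rather than "has a note". The cost of the hash is only paid by the few cells that are flagged.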
ScFormulaCell
There are a number of problems here:
- ScFormulaCell inherits from the listener class in svt/inc/listener.h, which has a virtual destructor; hence we carry a vtable pointer per instance too (most likely unnecessary), as well as the listener list.
- Document pointer - as in the ScEditCell structure, we lug around a document pointer that we should 'know' as implicit context.
- Shared formulae - Excel will 'share' formulae, i.e. very little state is duplicated if you fill column 'D' with =((A1+B1)/C1)*SQRT(A1) or whatever. Calc, by contrast, will duplicate this formula innumerable times. We need to extract immutable, position-independent formula objects, reference count and share these, and of course elide duplicates on import. This would give a massive memory saving for large sheets, since shared formulae are very common.
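The shared-formula point could be sketched like this. SharedFormula and FormulaPool are hypothetical names, and storing the formula as relative R1C1-style text is an assumption made to keep the example short; a real implementation would share the compiled token array instead. Reference counting is delegated to std::shared_ptr.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>

// Hypothetical immutable formula body. References are relative to the owning
// cell (e.g. "RC[-3]" means "same row, three columns to the left"), so one
// object is valid for every cell in a filled column.
struct SharedFormula
{
    std::string relativeText;
};

// Hypothetical pool that hands out one shared object per distinct formula.
class FormulaPool
{
public:
    std::shared_ptr<const SharedFormula> intern(const std::string& relativeText)
    {
        assert(!relativeText.empty());
        std::weak_ptr<const SharedFormula>& slot = pool_[relativeText];
        if (auto existing = slot.lock())
            return existing;  // duplicate elided, reference count bumped
        std::shared_ptr<const SharedFormula> fresh =
            std::make_shared<SharedFormula>(SharedFormula{relativeText});
        slot = fresh;
        return fresh;
    }

private:
    // Weak references: the pool itself never keeps a formula alive.
    // Expired entries are not purged in this sketch.
    std::map<std::string, std::weak_ptr<const SharedFormula>> pool_;
};
```

With this scheme, filling column 'D' with the formula above would store "((RC[-3]+RC[-2])/RC[-1])*SQRT(RC[-3])" once, and each cell would hold only a pointer to it. An importer could call intern() for every formula it reads to elide duplicates.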