12 April 2014: The OpenOffice Wiki is not, and never was, affected by the heartbleed bug. Users' passwords are safe and wiki users do not need take any actions.

Calc/Implementation/Formula cell and cells dependence

From Apache OpenOffice Wiki
Jump to: navigation, search


Calculation is the most important feature for Spreadsheet. There are many formulas to help user to complete some work. User can use formula to calculate literal data. For an example, "=SUM(1;2)". But in most cases, users use formula to calculate data stored in other cells. For an example, "=SUM(A1;D3)". When the data is changed in one cell, the formula cell which references this cell need be notified to recalculate to reflect the new value. How this happens? Broadcast and Listener.

1. Reference single cell

It is a typical broadcaster and listener relationship when a formula cell reference a single cell. For an example, B2 is a formula cell(=A1+B1), which reference cell A1 and B1 directly.

FormulaCell 1.png

ScBaseCell is the basic class for any cell. A cell can be a value cell, string cell, formula cell, etc. Only formula cell need listen to other cells. A ScFormulaCell is a listener. A ScBaseCell has a broadcaster. It takes ownership of the broadcaster.

FormulaCell 2.png

1.1 establish the broadcaster and listener relationship

A cell does not need have a broadcaster if it is not referenced by any other formula cell. After a formula cell is created, the formula cell tries to listen the cells which is referenced by it directly when it is inserting in the column. In ScColumn::Insert() method, it calls ScBaseCell::StartListeningTo() method, in which it analyze the reference address of a formula. For single reference address, ScFormulaCell does not try to get the referenced cell from address directly and listen to it. The reason is that the referenced cell may not existed, such as B1. So ScFormulaCell calls ScDocument::StartListeningCell(). At last, ScColumn::StartListening() is called, which can create a empty cell(ScNoteCell) if the referenced cell is not existed, and establish broadcaster and listener relationship.

FormulaCell 3.png

So there are two kinds of ScNoteCell. One is an empty cell with comments. The other is an empty cell which is referenced by other formula cell directly. This is also the reason why use ScNoteCell to replace a normal cell when deleting a cell from a column if this cell has a broadcaster. Please refer ScColumn::Delete(SCROW) for detail.

1.2 Break the broadcaster and listener relationship

When a formula cell is deleted or modified, ScBaseCell::EndListeningTo() is called to to end listening to its directly referenced cell. In ScBaseCell::EndListeningTo() method, it need analyze the reference address of a formula. For single reference address, it calls ScDocument::EndListeningCell(). At last, ScColumn::EndListening() is called, which removes the listener from broadcaster. In addition, it will delete the broadcaster from the source cell if the broadcaster does not have any listener, even delete the source cell if it is a empty cell with no listener.

2. Reference cell range

2.1 ScBroadcastArea and ScBroadcastAreaSlot

If a formula cell reference a cell range, any change in the referenced cell range need notify the formula cell. Spreadsheet use ScBroadcastArea to represent the broadcaster. ScBroadcastArea has a broadcaster and record related cell range.

FormulaCell 4.png

A ScBroadcastArea object is created if a formula reference a cell range. For a simple method, the area broadcaster can be put into an unique container. The container is a set. So different formula cells referencing same cell range will only create one broadcast area. When any cell is changed, Spreadsheet need search the container to find related area broadcaster. Then it use area broadcaster to broadcast the change.

FormulaCell 5.png

There may be lots of area broadcasters in a complex Spreadsheet document with many formulas. It will have performance problem to search in a very large container to response to any change in any cell. So Spreadsheet divide the unique container into several small containers by row and column. Each small container has a predefined range. It only stores the area broadcaster whose cell range is in its scope. If one referenced cell range is crossed the scope of several containers, every related container need store the area broadcaster. ScBroadcastAreaSlot is such container.

FormulaCell 6.png

2.2 Partition of the table

A table has 1,048,576 rows and 1,024 columns. How to divide the table into slots to make every slot does not contain a very large cell range and number of slots is not a large number either. Considering the upper sheet part usually is more populated and referneced, a logarithmic distribution method is used. From row 1 to row 32k, it divides the rows by 128. From row 32k to row 64k, it divides the rows by 256. From row 64k to row 128k, it divides the row by 512. And so on. Totally it divides rows into 896 slices. It divides 1024 columns by 16, that is 64 slices. So there are 57,344 slots in one table.

2.3 Establish and break the broadcaster and listener relationship

After a formula cell is created, the formula cell tries to listen the cell range which is referenced by formula when it is inserting in the column. In ScColumn::Insert() method, it calls ScBaseCell::StartListeningTo() method, in which it analyze the reference address of a formula. For double reference address, it call ScDocument::StartListeningArea() method. ScDocument has a ScBroadcastAreaSlotMachine which contains ScBroadcastAreaSlot by table. According to the cell range, the slot machine can find related slots. If the area is not existed in the slot, slot will create a ScBroadcastArea. Then make formula listen to the broadcast area. In ScBroadcastAreaSlotMachine::StarListeningArea() method, the area will be added to several slots if the area cross the scope of several slots. In ScBroadcastAreaSlot::StarListeningArea(), it guarantee that the broadcast area with same cell range will not be created twice.

FormulaCell 7.png

When a formula cell is deleted or modified, ScBaseCell::EndListeningTo() is called to to end listening to its directly referenced cell. In ScBaseCell::EndListeningTo() method, it need analyze the reference address of a formula. For double reference address, it calls ScDocument::EndListeningArea() which will call ScBroadcastAreaSlotMachine::EndListeningArea() . According to the cell range, the slot machine can find related slots. At last, ScBroadcastAreaSlot::EndListeningArea() is called, which removes the listener from broadcaster. In addition, it will remove the broadcaster area from the slot if the broadcaster area does not have any listener, even delete the broadcaster area if it does not has reference, which means it is not stored in other slot because of it cross several slot area.


3. Response to the change

FormulaCell 8.png

As before mentioned broadcast-listener mechanism, cell A1 listen to B1 and B2, cell B1 and B2 listen to cell C1. When C1 changed, B1 and B2 will be notified. How to spread the change to A1 in an efficient way. When a formula is notified that its dependency is changed, it set itself dirty and put itself into a FormulaTrackList in ScFormulaCell::Notify() method. FormulaTrackList is a doubly linked list in ScDocument. Every time, when some cell changes, besides broadcast the change information, ScDocument::TrackFormulas() is called. In ScDocument::TrackFormulas() method, for every formula cell in the list, it broadcasts the event that it is changed. The influenced formula cell which is not in the list will be added to the end of the list. When cell B1 broadcast, Cell A1 will be added to the FormulaTrackList in ScFormulaCell::Notify() method. Then A1 will broadcast it is changed. So all formula cells which has direct or indirect dependency for the original changed cell will be notified.


4 .Recalculate the formula

If one cell is modified, it will broadcast the event. The listening formula cells will be notified and set itself as dirty. When there is some request to get the cell value, such as Paint(get the value to paint), Saving(get the value to save in xml), the formula need be calculated. This is a efficient and high performance method that AOO does not need recalculate all related cells when a cell is modified. It only set the flag. For a cell is modified, ScDocShell::SetDocumentModified() is called, which cause ScTabViewShell be notified(ScTabViewShell::Notify()). Then ScTabView::UpdateFormulas() method is called to update all formula cells in visible area.

Personal tools