Writer/Core And Layout

From Apache OpenOffice Wiki

Caution

This page is still under construction, so its content is still very tentative.

About Frames

The layout is the visual representation of a Writer document. It consists of frames. Basically, a frame is a rectangular area that is linked with other frames:

Writer layout.png

The base class of the frame hierarchy is SwFrm, which is derived from SwClient and thereby inherits the ability to be notified of changes.

SwFrm : SwClient
pRegisteredIn : SwModify*
pUpper : SwFrm*
pNext : SwFrm*
pPrev : SwFrm*
aFrm : SwRect
aPrt : SwRect
bValidPos : bool
bValidSize : bool
bValidPrtArea : bool

A layout frame is a frame that contains other frames; it therefore has an additional member, pLower. Examples of layout frames are pages, tables, ...

SwLayoutFrm : SwFrm
pLower : SwFrm*
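
The links between frames can be illustrated with a small sketch. It is not the Writer implementation: Frame, LayoutFrame, appendLower and countLowers are hypothetical names, and only the pointer members listed above are modelled.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical, simplified stand-in for SwFrm.
struct Frame {
    Frame* upper = nullptr;  // like pUpper: the containing frame
    Frame* next  = nullptr;  // like pNext: following sibling
    Frame* prev  = nullptr;  // like pPrev: preceding sibling
    virtual ~Frame() = default;
};

// Hypothetical, simplified stand-in for SwLayoutFrm.
struct LayoutFrame : Frame {
    Frame* lower = nullptr;  // like pLower: first contained frame
};

// Append child as the last lower frame of parent and fix up all links.
inline void appendLower(LayoutFrame& parent, Frame& child) {
    assert(child.upper == nullptr && "child must not be linked yet");
    child.upper = &parent;
    if (parent.lower == nullptr) {
        parent.lower = &child;
        return;
    }
    Frame* last = parent.lower;
    while (last->next != nullptr) last = last->next;
    last->next = &child;
    child.prev = last;
}

// Count the direct lower frames by walking the sibling chain.
inline std::size_t countLowers(const LayoutFrame& parent) {
    std::size_t n = 0;
    for (const Frame* f = parent.lower; f != nullptr; f = f->next) ++n;
    return n;
}
```

In this sketch only the first lower frame is stored in the layout frame; all further children are reached through the sibling links.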

Some frames are derived not only from SwFrm but also from SwFlowFrm. These are the frames that may be split at a page break and continue on the next page, e.g., paragraphs, tables, ...

SwFlowFrm
rThis : SwFrm&
pFollow : SwFlowFrm*

The most important frame is the SwTxtFrm, which is the layout counterpart of a SwTxtNode object. The nOfst member refers to a position in the aText string member of the associated SwTxtNode object. A SwTxtFrm object is registered in a SwTxtNode object in order to be notified when the SwTxtNode object changes.

SwTxtFrm : SwFrm
nOfst : xub_StrLen
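
How an offset and a follow pointer could work together for a paragraph that is split across pages is sketched below. This is a simplification under stated assumptions: TextFrame and coveredText are hypothetical names, both members are folded into one struct, and it is assumed that a frame shows the text from its own offset up to the offset of its follow.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical stand-in for a text frame that is part of a follow chain.
struct TextFrame {
    std::size_t offset = 0;        // like nOfst: first character shown by this frame
    TextFrame*  follow = nullptr;  // like pFollow: frame continuing the paragraph
};

// Return the part of the paragraph text that the given frame covers:
// from its own offset up to the offset of its follow (or the end of the text).
inline std::string coveredText(const std::string& text, const TextFrame& frame) {
    assert(frame.offset <= text.size());
    const std::size_t end = frame.follow ? frame.follow->offset : text.size();
    assert(frame.offset <= end && end <= text.size());
    return text.substr(frame.offset, end - frame.offset);
}
```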

A page frame additionally has a couple of boolean members to indicate if any of the page contents is invalid. These flags are used during formatting of the document.

SwPageFrm
bInvalidLayout : bool
bInvalidContent : bool
bInvalidFly : bool

Some Notes about Nodes

A SwDoc object, which denotes the model of a Writer document, has a member of type SwNodes, which stores the document content. The SwNodes object of an empty Writer document looks like this:

SwStartNode	(special start-end-section, not used)
SwEndNode
SwStartNode	(special start-end-section used for footnotes)
SwEndNode
SwStartNode	(special start-end-section used for frames, headers, footers)
SwEndNode
SwStartNode	(special start-end-section used for 'delete' redlines if they are not shown)
SwEndNode
SwStartNode (special start-end-section for 'regular' document content)
	SwTxtNode	(there is always at least one empty paragraph in the document)
SwEndNode
SwTxtNode : SwCntntNode
aText : string
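
The flat start/end structure of the nodes array can be explored with a short sketch. It is only an illustration: NodeKind, findMatchingEnd and makeEmptyDocument are hypothetical names, and a plain vector of enum values stands in for the real array of SwNode objects.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical node kinds; the real array stores SwNode objects.
enum class NodeKind { Start, End, Text };

// Find the End node that closes the Start node at startPos by tracking
// the nesting depth while scanning forward through the flat array.
// Returns nodes.size() if the array is not well-formed.
inline std::size_t findMatchingEnd(const std::vector<NodeKind>& nodes,
                                   std::size_t startPos) {
    assert(startPos < nodes.size() && nodes[startPos] == NodeKind::Start);
    std::size_t depth = 0;
    for (std::size_t i = startPos; i < nodes.size(); ++i) {
        if (nodes[i] == NodeKind::Start) ++depth;
        else if (nodes[i] == NodeKind::End && --depth == 0) return i;
    }
    return nodes.size();
}

// Build the array of an empty document: four empty special sections
// followed by the body section holding one empty paragraph.
inline std::vector<NodeKind> makeEmptyDocument() {
    std::vector<NodeKind> nodes;
    for (int i = 0; i < 4; ++i) {
        nodes.push_back(NodeKind::Start);
        nodes.push_back(NodeKind::End);
    }
    nodes.push_back(NodeKind::Start);
    nodes.push_back(NodeKind::Text);
    nodes.push_back(NodeKind::End);
    return nodes;
}
```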

Relationship between nodes and frames

An important base class of the Writer model is SwNode; an important base class of the Writer view is SwFrm.

What's the relationship between the classes derived from these?

SwTxtNode <-> SwTxtFrm
Every SwTxtFrm belongs to a SwTxtNode. A SwTxtFrm is a SwClient and is registered in a SwModify (the SwTxtNode!).
Some SwTxtNodes do not have related SwTxtFrms, e.g. if they are in a hidden section or a header of an unused page style.
Some SwTxtNodes have more than one corresponding SwTxtFrm, e.g. if they are in a header of a used page style or if the content did not fit into one page.

SwTableNode <-> SwTabFrm
The relationship between SwTableNode and SwTabFrm is nearly like SwTxtNode/SwTxtFrm. One difference: the SwTabFrm is not registered in SwTableNode, it is registered in the table format (SwTableFmt).
A SwTableNode may have no SwTabFrm if it is hidden or part of an unused page style. A SwTableNode normally has one corresponding SwTabFrm, but could have more if it does not fit into one page or if it is part of a repeated page header/footer.

SwRowFrm
A SwRowFrm represents a row of a table. There is no corresponding SwNode object. A SwTabFrm contains at least one SwRowFrm (pLower). A SwRowFrm contains at least one SwCellFrm.

SwTableBoxStartNode <-> SwCellFrm
The cells of a table are represented in the nodes array by a SwStartNode (type SwTableBoxStartNode); the content of the cell is represented by the nodes between this SwStartNode and its SwEndNode. The corresponding view object is the SwCellFrm. The relationship is again like the SwTxtNode/SwTxtFrm relationship.

SwRootFrm
If a Writer document has a view (layout), there is one SwRootFrm (member of SwDoc), which represents the complete document, i.e. the complete SwNodes array. This SwRootFrm contains a doubly linked list of SwPageFrms, the pages of the document.

SwPageFrm
A SwPageFrm is a view object without a corresponding model object (SwNode). The pLower of the SwRootFrm is a SwPageFrm. If the document contains more than one page, the pages are doubly linked (pNext, pPrev). The pUpper of a SwPageFrm is always the SwRootFrm. A SwPageFrm contains a doubly linked list of SwLayoutFrms: at least a SwBodyFrm for the floating text content, plus an optional SwHeaderFrm and SwFooterFrm. The SwHeaderFrm is the first frame in the list and the SwFooterFrm the last one (if they exist).

SwColumnFrm
A SwColumnFrm is a view object without a corresponding model object. It divides its upper layout frame into several columns. This upper frame can be a SwBodyFrm, a SwFlyFrm or a SwSectionFrm. The pLower of a SwColumnFrm is a SwBodyFrm.

SwSectionNode <-> SwSectionFrm
A section in Writer is represented in the model by a pair SwSectionNode/SwEndNode. The content of the section (paragraphs and tables) is enclosed by this pair. If a SwTxtFrm corresponds to a SwTxtNode which is part of a section, then normally the SwTxtFrm is a lower of the corresponding SwSectionFrm (see the examples later on). If such a paragraph is broken by the page border into two pieces, the corresponding SwTxtFrms are lowers of two different SwSectionFrms.
The relationship between SwSectionNode and SwSectionFrm becomes complicated for nested sections. Even though the SwSectionNodes are nested, the corresponding SwSectionFrms are not! They are not lower/upper of each other; they are in the same list, linked by pPrev/pNext.
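
The flattening of nested sections can be illustrated with a toy algorithm. The sketch below is inferred from the rule described above and is not taken from the Writer sources; ModelNode, SectionFrameSketch and buildSectionFrames are hypothetical names, and only paragraphs inside sections are handled. It assigns every paragraph to a frame of its innermost section and starts a new frame whenever a section starts or ends.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical flat model: section start/end markers and paragraphs.
struct ModelNode {
    enum Kind { SectionStart, SectionEnd, Paragraph } kind;
    std::string name;  // section or paragraph label, empty for SectionEnd
};

// Hypothetical view object: one section frame with its paragraphs.
struct SectionFrameSketch {
    std::string section;
    std::vector<std::string> paragraphs;
};

// Put each paragraph into a frame of its innermost section. A new frame
// is opened whenever a section boundary was passed, so nested sections
// end up as siblings and an outer section may be split into several frames.
inline std::vector<SectionFrameSketch> buildSectionFrames(
        const std::vector<ModelNode>& nodes) {
    std::vector<std::string> open;  // stack of currently open sections
    std::vector<SectionFrameSketch> frames;
    bool reuseLast = false;         // may the last frame take more paragraphs?
    for (const ModelNode& node : nodes) {
        switch (node.kind) {
        case ModelNode::SectionStart:
            open.push_back(node.name);
            reuseLast = false;
            break;
        case ModelNode::SectionEnd:
            assert(!open.empty());
            open.pop_back();
            reuseLast = false;
            break;
        case ModelNode::Paragraph:
            assert(!open.empty() && "sketch only handles paragraphs in sections");
            if (!reuseLast) {
                frames.push_back({open.back(), {}});
                reuseLast = true;
            }
            frames.back().paragraphs.push_back(node.name);
            break;
        }
    }
    return frames;
}
```

For a section S1 that contains paragraph A, a section S2 holding only a section S3 with paragraphs B and C, and then paragraph D, the sketch produces three sibling frames: S1 (A), S3 (B, C) and S1 (D). S2 gets no frame because it has no paragraphs of its own.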

Creation of frames

Some SwNode classes have methods called MakeFrms(..). These methods are able to create new SwFrms and insert them into an already existing view (layout). If you, e.g., press "ENTER" at the end of a paragraph, a new SwTxtNode is inserted into the nodes array. This new SwTxtNode then needs a view, so MakeFrms(..) is called to create the SwTxtFrm(s) and to insert them at the right layout position.
The MakeFrms(..) methods need a SwNode with an existing view. The helper class SwNode2Layout is used to find all relevant SwFrms of this SwNode.

Time to have a look at some

Examples

1. The simplest document: one paragraph
Model:

 <SwStartNode>
 <SwEndNode>
 <SwStartNode>
 <SwEndNode>
 <SwStartNode>
 <SwEndNode>
 <SwStartNode>
 <SwEndNode>
 <SwStartNode>
   <SwTxtNode A>
 <SwEndNode>

View:

 <SwRootFrm>
   <SwPageFrm>
     <SwBodyFrm>
       <SwTxtFrm A/>
     </SwBodyFrm>
   </SwPageFrm>
 </SwRootFrm>

2. A document with a page header (containing two paragraphs) and four paragraphs in the floating text content. The third paragraph has been split at the page margin.
Model:

 <SwStartNode>
 <SwEndNode>
 <SwStartNode>
 <SwEndNode>
 <SwStartNode>
   <SwStartNode>
     <SwTxtNode X>
     <SwTxtNode Y>
   <SwEndNode>
 <SwEndNode>
 <SwStartNode>
 <SwEndNode>
 <SwStartNode>
   <SwTxtNode A>
   <SwTxtNode B>
   <SwTxtNode C>
   <SwTxtNode D>
 <SwEndNode>

View:

 <SwRootFrm>
   <SwPageFrm>
     <SwHeaderFrm>
       <SwTxtFrm X/>
       <SwTxtFrm Y/>
     </SwHeaderFrm>
     <SwBodyFrm>
       <SwTxtFrm A/>
       <SwTxtFrm B/>
       <SwTxtFrm C (part 1)/>
     </SwBodyFrm>
   </SwPageFrm>
   <SwPageFrm>
     <SwHeaderFrm>
       <SwTxtFrm X/>
       <SwTxtFrm Y/>
     </SwHeaderFrm>
     <SwBodyFrm>
       <SwTxtFrm C (part 2)/>
       <SwTxtFrm D/>
     </SwBodyFrm>
   </SwPageFrm>
 </SwRootFrm>

3. A document with one 2x2 table, different numbers of paragraphs in the table boxes and one paragraph after the table.
Model:

 <SwStartNode>
 <SwEndNode>
 <SwStartNode>
 <SwEndNode>
 <SwStartNode>
 <SwEndNode>
 <SwStartNode>
 <SwEndNode>
 <SwStartNode>
   <SwTableNode>
     <SwStartNode>
       <SwTxtNode X>
     <SwEndNode>
     <SwStartNode>
       <SwTxtNode Y>
       <SwTxtNode Z>
     <SwEndNode>
     <SwStartNode>
       <SwTxtNode A>
       <SwTxtNode B>
       <SwTxtNode C>
     <SwEndNode>
     <SwStartNode>
       <SwTxtNode D>
     <SwEndNode>
   <SwEndNode>
   <SwTxtNode Z>
 <SwEndNode>

View:

 <SwRootFrm>
   <SwPageFrm>
     <SwBodyFrm>
       <SwTabFrm>
         <SwRowFrm>
           <SwCellFrm>
             <SwTxtFrm X/>
           </SwCellFrm>
           <SwCellFrm>
             <SwTxtFrm Y/>
             <SwTxtFrm Z/>
           </SwCellFrm>
         </SwRowFrm>
         <SwRowFrm>
           <SwCellFrm>
             <SwTxtFrm A/>
             <SwTxtFrm B/>
             <SwTxtFrm C/>
           </SwCellFrm>
           <SwCellFrm>
             <SwTxtFrm D/>
           </SwCellFrm>
         </SwRowFrm>
       </SwTabFrm>
       <SwTxtFrm Z/>
     </SwBodyFrm>
   </SwPageFrm>
 </SwRootFrm>

4. A document with two paragraphs inside a section and one paragraph after it
Model:

 <SwStartNode>
 <SwEndNode>
 <SwStartNode>
 <SwEndNode>
 <SwStartNode>
 <SwEndNode>
 <SwStartNode>
 <SwEndNode>
 <SwStartNode>
   <SwSectionNode>
     <SwTxtNode A>
     <SwTxtNode B>
   <SwEndNode>
   <SwTxtNode C>
 <SwEndNode>

View:

 <SwRootFrm>
   <SwPageFrm>
     <SwBodyFrm>
       <SwSectionFrm>
         <SwTxtFrm A/>
         <SwTxtFrm B/>
       </SwSectionFrm>
       <SwTxtFrm C/>
     </SwBodyFrm>
   </SwPageFrm>
 </SwRootFrm>

5. A document with three nested sections
Model:

 <SwStartNode>
 <SwEndNode>
 <SwStartNode>
 <SwEndNode>
 <SwStartNode>
 <SwEndNode>
 <SwStartNode>
 <SwEndNode>
 <SwStartNode>
   <SwSectionNode S1>
     <SwTxtNode A>
     <SwSectionNode S2>
       <SwSectionNode S3>
         <SwTxtNode B>
         <SwTxtNode C>
       <SwEndNode>
     <SwEndNode>
     <SwTxtNode D>
   <SwEndNode>
 <SwEndNode>

View:

 <SwRootFrm>
   <SwPageFrm>
     <SwBodyFrm>
       <SwSectionFrm S1>
         <SwTxtFrm A/>
       </SwSectionFrm S1>
       <SwSectionFrm S3>
         <SwTxtFrm B/>
         <SwTxtFrm C/>
       </SwSectionFrm S3>
       <SwSectionFrm S1>
         <SwTxtFrm D/>
       </SwSectionFrm S1>
     </SwBodyFrm>
   </SwPageFrm>
 </SwRootFrm>

6. Nested sections and a 1x1 table
Model:

 <SwStartNode>
 <SwEndNode>
 <SwStartNode>
 <SwEndNode>
 <SwStartNode>
 <SwEndNode>
 <SwStartNode>
 <SwEndNode>
 <SwStartNode>
   <SwSectionNode S1>
     <SwTxtNode A>
     <SwSectionNode S2>
       <SwTxtNode B>
     <SwEndNode>
     <SwTableNode>
       <SwStartNode>
         <SwTxtNode C>
       <SwEndNode>
     <SwEndNode>
   <SwEndNode>
   <SwTxtNode D>
 <SwEndNode>

View:

 <SwRootFrm>
   <SwPageFrm>
     <SwBodyFrm>
       <SwSectionFrm S1>
         <SwTxtFrm A/>
       </SwSectionFrm S1>
       <SwSectionFrm S2>
         <SwTxtFrm B/>
       </SwSectionFrm S2>
       <SwSectionFrm S1>
         <SwTabFrm>
           <SwRowFrm>
             <SwCellFrm>
               <SwTxtFrm C/>
             </SwCellFrm>
           </SwRowFrm>
         </SwTabFrm>
       </SwSectionFrm S1>
       <SwTxtFrm D/>
     </SwBodyFrm>
   </SwPageFrm>
 </SwRootFrm>

Some UML Diagrams

Nodes

This diagram is about the nodes array, the various node types, and document positions. A document position (SwPosition) is given by a node index (SwNodeIndex), which usually represents the paragraph the position is in, and an index (SwIndex), which represents the position inside this paragraph. Note that a position does not necessarily have to be inside a paragraph; it may be the position of a graphic in the document. The positions are registered in an SwIndexReg object (which is the base class of SwCntntNode). This way the positions can be notified if, e.g., characters are added to or deleted from a paragraph.

Writer nodes.png

Character Styles and Automatic Character Styles

This diagram shows the dependencies of character style attributes (SwFmtCharFmt) and automatic character style attributes (SwFmtAutoFmt).

Writer attributes.png

Cursors

This diagram shows the cursor hierarchy. A cursor basically consists of two SwPositions: Point and Mark. Such a pair is called a PaM. SwPaM is derived from SwRing. The ring links the individual regions of a multi-selection.

Writer cursors.png
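
The idea of a cursor as a pair of positions can be sketched as follows. This is an illustration only: Pos and Selection are hypothetical names, plain numbers replace SwNodeIndex and SwIndex, and it is assumed that the point is the end where the cursor is while the mark is the other end of the selection.

```cpp
#include <cassert>
#include <cstddef>
#include <tuple>

// Hypothetical stand-in for SwPosition: node index plus character index.
struct Pos {
    std::size_t node;
    std::size_t content;
};

// Positions are ordered by node first, then by character index.
inline bool operator<(const Pos& a, const Pos& b) {
    return std::tie(a.node, a.content) < std::tie(b.node, b.content);
}
inline bool operator==(const Pos& a, const Pos& b) {
    return a.node == b.node && a.content == b.content;
}

// Hypothetical stand-in for SwPaM (assumption: point is the cursor end,
// mark is the other end of the selection).
struct Selection {
    Pos point;
    Pos mark;
    bool hasSelection() const { return !(point == mark); }
    const Pos& start() const { return mark < point ? mark : point; }
    const Pos& end()   const { return mark < point ? point : mark; }
};
```

Because point and mark can be in either order, code that needs the document range asks for start and end instead of using the two members directly.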

Styles

This diagram shows the classes representing styles. The base class for all styles is SwFmt. An SwFmt object contains an SwAttrSet, which is derived from SfxItemSet, a class designed to store attributes (SfxPoolItems). There are different kinds of styles: frame styles, paragraph styles, character styles, etc. A paragraph style is represented by a SwTxtFmtColl object; a character style is represented by a SwCharFmt object. Styles are often also referred to as "formats".

Writer formats.png

Tables

Basically, there is the model representation (class SwTable) and the view representation (SwTabFrm) of Writer tables.

The Table Model

The table model is defined in an SwTable object. The SwTable object consists of an array of SwTableLine objects, which in turn consist of arrays of SwTableBox objects. All these objects are registered in classes derived from SwFrmFmt (SwTableFmt, SwTableLineFmt, SwTableBoxFmt). The attributes of the table, the table lines and the table boxes are set at these SwFrmFmt objects. This way, changes to the attributes are propagated via the Modify() function to the table model objects. An SwTable pointer is a member of a SwTableNode; this is how the table is linked from the nodes array.

The Table Frames

The table layout objects are SwTabFrm, SwRowFrm, and SwCellFrm. Each of them is associated with a corresponding table model object: SwTable <-> SwTabFrm, SwTableLine <-> SwRowFrm, SwTableBox <-> SwCellFrm. Like the table model objects, the table layout objects are also registered as clients of the respective SwFrmFmt objects.

Writer tables.png

Indexes and Positions

class SwNodes (array of SwNode objects): contains a list of SwNodeIndexes. Its member pRoot points to the first element.
If SwNode objects are deleted from the array, every SwNodeIndex that points to a removed object is adjusted to the next object that is not removed.

class SwNodeIndex: contains a pointer to a SwNode object. It is part of a list in the SwNodes array of the SwNode object. As long as the SwNode object is only moved around in the SwNodes array, the SwNodeIndex does not need to be changed. If the SwNode object is removed from the SwNodes array, the SwNodeIndex is adjusted to the next SwNode in the SwNodes array.
The operators ++() and --() allow an SwNodeIndex to iterate through its SwNodes array.

class SwNodeRange: it's simply a (start, end) pair of SwNodeIndexes.

SwNodes is derived from BigPtrArray, which is an array of BigPtrEntry.
A BigPtrEntry knows its position in the array (GetPos()) and therefore has to be adjusted if elements are inserted or removed.

SwNode is derived from BigPtrEntry.

A paragraph in Writer is represented as a SwTxtNode in the Writer model. SwTxtNode is derived from SwCntntNode, which is derived from SwIndexReg.

class SwIndexReg: contains a sorted list of SwIndexes, pFirst points to the first element, pLast to the last element of this list.

class SwIndex: represents a position (xub_StrLen nIndex) in an array (SwIndexReg* pArray). It is registered at this array. If the array is manipulated, it updates all positions accordingly (method Update(..)).

class SwPosition: it's a pair of SwNodeIndex and SwIndex and represents a position in the document. If the SwNodeIndex points to a paragraph (SwTxtNode), the SwIndex is registered at this SwNode and its value denotes a character position inside the paragraph. If the SwNodeIndex points to another node type (e.g. SwTableNode, SwSectionNode, SwStartNode), the SwIndex is registered at a dummy SwIndexReg.
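
The way registered indexes follow text changes can be sketched like this. It is a minimal sketch, not the Writer code: TextRegistry and TextIndex are hypothetical names, a vector replaces the sorted list, and it is an assumption of this sketch that an index located exactly at the insertion point is shifted as well.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

struct TextRegistry;  // hypothetical stand-in for SwIndexReg plus the text

// Hypothetical stand-in for SwIndex: a character position that registers
// itself at the text it points into.
struct TextIndex {
    TextRegistry* owner;
    std::size_t   pos;
    TextIndex(TextRegistry& reg, std::size_t position);
    ~TextIndex();
    TextIndex(const TextIndex&) = delete;
    TextIndex& operator=(const TextIndex&) = delete;
};

struct TextRegistry {
    std::string text;
    std::vector<TextIndex*> indexes;  // all positions registered here

    // Insert characters and shift every registered index at or behind
    // the insertion point, so it keeps pointing at the same character.
    void insert(std::size_t at, const std::string& s) {
        assert(at <= text.size());
        text.insert(at, s);
        for (TextIndex* idx : indexes)
            if (idx->pos >= at) idx->pos += s.size();
    }
};

// Register the new index at its text.
inline TextIndex::TextIndex(TextRegistry& reg, std::size_t position)
    : owner(&reg), pos(position) {
    owner->indexes.push_back(this);
}

// Unregister the index when it goes away.
inline TextIndex::~TextIndex() {
    auto& v = owner->indexes;
    v.erase(std::remove(v.begin(), v.end(), this), v.end());
}
```

The registry must outlive its indexes in this sketch, since each index unregisters itself in its destructor.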

Change Tracking

Change tracking is often called "redlining" in our code, even though "change tracking" seems to fit better.

SwDoc contains an array of recorded changes (if change tracking has been active during the editing of the document).
This array of type SwRedlineTbl contains pointers to SwRedline objects. A SwRedline is mainly a SwPaM, i.e. a section of the document, represented by two positions. This section has been inserted or deleted, or its formatting has been changed, by a user. The number of this user, the time and date when the change happened, and the type (insertion, deletion, formatting) are stored in the member pRedlineData of SwRedline. If the changed section is currently not visible (e.g. a deleted area when showing changes is disabled), the SwPaM is empty, but the pCntntSect member of the SwRedline points to a SwNodeIndex, which is normally a position in the SwDoc's undo array, where the non-visible redline content is stored.

Most of the relevant source code is located in sw/source/core/doc/docredln.cxx. The method SwDoc::AppendRedline(..) is the key function for everything related to redlines.
This method is responsible for the insertion of new redlines; it takes care of the sorting inside the SwRedlineTbl and of overlapping redlines.

Undo

The base class SwUndo is the most important class of the Writer undo concept. Objects derived from this class represent a user action (i.e. a change of the Writer model).
There are a lot of derived classes (for nearly every user action). Often these objects take a snapshot of a part of the Writer model in their constructor. The method SwUndo::Undo(..) can be called to reverse the user action, and SwUndo::Redo(..) will perform the action again.
The Writer document class SwDoc contains an array of SwUndo objects. Every user action creates a new SwUndo object and puts it into this array. If the user invokes the undo functionality, the Undo(..) method of the last SwUndo object is called.
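
The undo concept can be sketched in a few lines. This is a simplified illustration, not the SwUndo implementation: DocSketch, UndoAction, AppendTextUndo and UndoStack are hypothetical names, the document is reduced to a single string, and it is an assumption of this sketch that an action is first executed through its own redo function.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical document model: just one string.
struct DocSketch { std::string text; };

// Hypothetical stand-in for SwUndo: one reversible user action.
struct UndoAction {
    virtual ~UndoAction() = default;
    virtual void undo(DocSketch& doc) = 0;
    virtual void redo(DocSketch& doc) = 0;
};

// Appending text; the constructor takes a snapshot of what it needs
// (the old length) to be able to reverse the action later.
struct AppendTextUndo : UndoAction {
    std::size_t oldLength;
    std::string appended;
    AppendTextUndo(const DocSketch& doc, std::string s)
        : oldLength(doc.text.size()), appended(std::move(s)) {}
    void undo(DocSketch& doc) override { doc.text.resize(oldLength); }
    void redo(DocSketch& doc) override { doc.text += appended; }
};

// Hypothetical stand-in for the undo array kept by the document.
struct UndoStack {
    std::vector<std::unique_ptr<UndoAction>> actions;
    std::size_t applied = 0;  // actions[0..applied) are currently in effect

    // Execute a new action and record it for later undo.
    void perform(DocSketch& doc, std::unique_ptr<UndoAction> action) {
        actions.resize(applied);  // a new action discards the redo tail
        action->redo(doc);
        actions.push_back(std::move(action));
        ++applied;
    }
    // Reverse the most recent action that is still in effect.
    bool undo(DocSketch& doc) {
        if (applied == 0) return false;
        actions[--applied]->undo(doc);
        return true;
    }
    // Re-apply the most recently undone action.
    bool redo(DocSketch& doc) {
        if (applied == actions.size()) return false;
        actions[applied++]->redo(doc);
        return true;
    }
};
```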

... to be continued ...

Notifying/Broadcasting mechanism

The classes SwClient and SwModify are used to connect objects which need to be notified if an object changes its status.

The SwModify object contains a list of SwClients. If a property of the SwModify object changes, its function Modify(..) can be called with the old and the new property as parameters. The SwModify broadcasts this information to all registered SwClients, i.e. it calls the Modify function of all SwClients in the list.

A SwClient offers a function Modify(..). It is able to register itself in a SwModify object and thereby becomes part of a list. The SwClient contains a member pRegisteredIn, a pointer to its SwModify, and two members, pRight and pLeft, which point to the neighbouring SwClients (previous and next) in the list.

It's a 1:n relationship between SwModify and SwClient. A SwClient can be registered in only one SwModify, but a SwModify can of course have many SwClients in its list.

An example:
A SwTxtFrm represents the visualization of a paragraph. If the paragraph is edited, the core object of the paragraph (SwTxtNode) changes. The SwTxtFrm is a SwClient and is registered in the SwTxtNode, which is a SwModify. If a paragraph is inside the header or footer of a page, there may be several SwTxtFrms which need to be notified by the SwTxtNode.

A SwClientIter is a little helper class that allows access to the SwClients of a SwModify.
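
The mechanism described in this section can be sketched as a small observer with an intrusive list. It is not the real SwClient/SwModify code: Broadcaster, Listener and RecordingListener are hypothetical names, strings stand in for the old and new property, and it is an assumption of this sketch that the left link is the previous client and the right link the next one.

```cpp
#include <cassert>
#include <string>
#include <vector>

struct Broadcaster;  // hypothetical stand-in for SwModify

// Hypothetical stand-in for SwClient, linked into its broadcaster's list.
struct Listener {
    Broadcaster* registeredIn = nullptr;  // like pRegisteredIn
    Listener* left  = nullptr;            // like pLeft (assumed: previous client)
    Listener* right = nullptr;            // like pRight (assumed: next client)
    virtual ~Listener() = default;
    virtual void modify(const std::string& oldValue,
                        const std::string& newValue) = 0;
};

struct Broadcaster {
    Listener* first = nullptr;  // head of the client list

    // Register a client; a client can be registered in one broadcaster only.
    void add(Listener& client) {
        assert(client.registeredIn == nullptr);
        client.registeredIn = this;
        client.right = first;
        if (first != nullptr) first->left = &client;
        first = &client;
    }
    // Unlink a client from the list again.
    void remove(Listener& client) {
        assert(client.registeredIn == this);
        if (client.left != nullptr) client.left->right = client.right;
        else first = client.right;
        if (client.right != nullptr) client.right->left = client.left;
        client.registeredIn = nullptr;
        client.left = client.right = nullptr;
    }
    // Forward a change to every registered client.
    void modify(const std::string& oldValue, const std::string& newValue) {
        for (Listener* c = first; c != nullptr; c = c->right)
            c->modify(oldValue, newValue);
    }
};

// Example client: remembers the last value it was told about.
struct RecordingListener : Listener {
    std::string lastSeen;
    int notifications = 0;
    void modify(const std::string&, const std::string& newValue) override {
        lastSeen = newValue;
        ++notifications;
    }
};
```

In terms of the example above, the broadcaster plays the role of the paragraph node and each listener the role of one text frame showing that paragraph.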

Further Links

Documentation about the Writer text formatting engine:

http://sw.openoffice.org/drafts/text_formatting.html
