Difference between revisions of "Architecture/To-Dos"

Latest revision as of 16:24, 20 October 2010

This page collects various architectural deficiencies (aka the pet peeves of various people) of OpenOffice.org, and lists the areas where work is in progress to improve the architecture.

Depending on the specific counting algorithm, OOo consists of approximately 7E6 lines of code (the overwhelming majority being C++, with every other language (Java, Perl, Basic, Python) an order of magnitude smaller). This sheer size is a problem in and of itself: the code base is notorious for crashing various software engineering tools, or slowing them down to a crawl, from debuggers to dependency analysis to reverse design extraction.

The code itself varies greatly in quality, style, and age (the latter invariably leading to the former, if you recall the history and evolution of C++), with some parts virtually unmodified for 10+ years, and others only recently written from scratch.

Taken together, this leads to a lot of complexity and redundancy, which is very hard to remove.

Facing this amount of code, the big rules must be:

simplify
- remove internal redundancy
- remove external redundancy (use external projects wherever possible)
- remove unused or dead code
- remove legacy functionality that no longer provides noticeable value (e.g. binfilter)

refactor for orthogonality
- make subsystems implement independent functionality
- allow those subsystems to be freely combined
- carry that to the UI level (no artificial restrictions on what one can do with UI objects - e.g. shapes can be rotated, and clearly text frames should, too)

Architectural To-Dos

Infrastructure Improvements

Speeding up the build system, and maybe even making it consider global dependencies (currently, OOo has the notion of modules, which approximately map to top-level directories in the build tree. Automatic build-time dependency calculation is currently only available at the intra-module level).
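
What considering global dependencies could mean can be illustrated with a small sketch. The C++ below is not part of the OOo build system; it assumes a hypothetical map from module names to the modules they need, and derives one valid cross-module build order with a plain topological sort (Kahn's algorithm). The names DependencyGraph and buildOrder are made up for illustration.

```cpp
#include <cassert>
#include <map>
#include <optional>
#include <queue>
#include <string>
#include <vector>

// Hypothetical input: maps a module name to the modules it depends on.
using DependencyGraph = std::map<std::string, std::vector<std::string>>;

// Returns an order in which every module comes after its dependencies,
// or std::nullopt if the dependencies contain a cycle (Kahn's algorithm).
std::optional<std::vector<std::string>> buildOrder(const DependencyGraph& deps)
{
    std::map<std::string, int> pending;     // unmet dependencies per module
    DependencyGraph dependents;             // reverse edges: who needs this module

    for (const auto& [module, needs] : deps) {
        pending.try_emplace(module, 0);
        for (const auto& dep : needs) {
            pending.try_emplace(dep, 0);    // modules that are only mentioned as a dependency
            ++pending[module];
            dependents[dep].push_back(module);
        }
    }

    std::queue<std::string> ready;          // modules whose dependencies are all built
    for (const auto& [module, count] : pending)
        if (count == 0)
            ready.push(module);

    std::vector<std::string> order;
    while (!ready.empty()) {
        const std::string module = ready.front();
        ready.pop();
        order.push_back(module);
        for (const auto& user : dependents[module]) {
            assert(pending[user] > 0);      // every reverse edge was counted above
            if (--pending[user] == 0)
                ready.push(user);
        }
    }

    if (order.size() != pending.size())
        return std::nullopt;                // some modules never became ready: cycle
    return order;
}
```

With such an order available globally, a build tool could also work out which modules are independent of each other and build them in parallel.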

Making the actual design more accessible, improving upon existing solutions like LXR or Bonsai. Ultimately, this should make refactorings of the source code both much easier and much safer than today, by providing information on where and how specific functionality is used. A prerequisite for that would be a parser that really knows about C++; gccxml might be a starting point.

Runtime System Improvements

This is about making the implementation languages safer and easier to use. What follows could also be subsumed under "transparency on the implementation level". When something can be used transparently, or appears transparent to a user, it is an implementation aspect she need not care about. Being able to program in an environment that is transparent with regard to many such aspects lets the developer focus on the problem at hand, instead of littering her code with mundane tasks such as memory management or locking.

Make threading transparent. Currently, fulfilling the contract of a UNO component regarding thread-safeness is

tedious work, because normally each involved object has to acquire and release a mutex on method entry and exit, respectively
almost impossible to get right, let alone to verify (no races, no deadlocks), because of the sheer mass of involved objects and mutexes (the number of distinct states that would have to be checked for a proper verification is intractable for anything but the most trivial examples). The upcoming extended Binary Uno threading-model makes thread-safeness transparent, by automatically locking and unlocking when entering or exiting components on a much coarser level than single methods.
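
The difference between the two locking styles can be sketched in plain standard C++. This is not the UNO threading framework or its API; the classes LockedTitle and PlainTitle and the wrapper Apartment are hypothetical names, used only to contrast locking in every method with a single lock taken when a call enters the component.

```cpp
#include <mutex>
#include <string>
#include <utility>

// Style 1: every method acquires the mutex on entry and releases it on exit.
class LockedTitle {
public:
    void set(std::string title) {
        std::lock_guard<std::mutex> guard(mutex_);
        title_ = std::move(title);
    }
    std::string get() const {
        std::lock_guard<std::mutex> guard(mutex_);
        return title_;
    }
private:
    mutable std::mutex mutex_;
    std::string title_;
};

// Style 2: the component itself contains no locking code at all ...
class PlainTitle {
public:
    void set(std::string title) { title_ = std::move(title); }
    const std::string& get() const { return title_; }
private:
    std::string title_;
};

// ... and a hypothetical wrapper serializes every call that enters it.
template <typename Component>
class Apartment {
public:
    // Runs `call` on the wrapped component while holding the one coarse lock.
    // The result is returned by value, so it is copied before the lock is released.
    template <typename Call>
    auto enter(Call call) {
        std::lock_guard<std::mutex> guard(mutex_);
        return call(component_);
    }
private:
    std::mutex mutex_;
    Component component_;
};
```

A caller would write something like doc.enter([](PlainTitle& t) { t.set("Untitled 1"); }). Several method calls inside one enter() then run under the same lock, which is the coarser granularity the paragraph above describes.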

Make other mundane stuff transparent, such as memory management (via garbage collection, or refcounting via smart pointers or UNO references), or transactionality (making changes take place either completely, or not at all). Having a component behave in a non-transactional way in the face of an error makes recovery rather hard. There's more to transactionality than exception safety: imagine two users collaborating on the same document.

General Refactoring Improvements

For many reasons the OpenOffice.org codebase is difficult to understand and navigate. One of the reasons is a lack of cleanup in the code. There is a never-ending list of things that ought to be done; add some of your own.

Actually remove deprecated things. Things like String and UniString need to go. svtools and tools have loads of stuff that is duplicated elsewhere or is deprecated. Getting rid of these sorts of things will make maintaining application code much easier.

Document things. Some of the code has comments that at one time were correct. Some code has German comments. While most of the OpenOffice.org programmers are German-speaking, there is an unofficial understanding that German comments mean "don't touch."

Code Improvements

Remove unused code

Binary Loading/Saving stuff in ItemSets, depend on EditEngine Loading/Saving (only used for Clipboard) - MT

has been removed now (along with the version mapping stuff in SfxItemPool) in CWS tl77

Remove duplicate code

Consolidate code that was copied and slightly modified.

BigPointerArray vs. SvPointerArray
RTL Strings with Tools Strings

Consolidate Text Engines

Text Engine
Writer Engine
Edit Engine

Replace code with 3rd party

Replace self-made containers with STL containers.

Tools Container
SvPointerArray (Issue 112395)
BigPointerArray
"GetPos" is mostly used incorrectly -> remove it (algorithmic complexity too high: O(n*n)) - list of the GetPos usage

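The O(n*n) problem with GetPos can be shown with a short sketch. It does not use the real Tools or Sv containers; it assumes that GetPos is a linear search for an element's position, and models it with a free function getPos over a std::vector. Node, numberNodesSlow and numberNodesFast are hypothetical names.

```cpp
#include <cstddef>
#include <vector>

struct Node {
    std::size_t index = 0;   // position of the node within its container
};

// Stand-in for a legacy GetPos(): a linear search, O(n) per call.
// Returns nodes.size() if the node is not found.
std::size_t getPos(const std::vector<Node*>& nodes, const Node* wanted)
{
    for (std::size_t i = 0; i < nodes.size(); ++i)
        if (nodes[i] == wanted)
            return i;
    return nodes.size();
}

// O(n*n): one linear search per element, although the loop
// already visits the elements in order.
void numberNodesSlow(const std::vector<Node*>& nodes)
{
    for (Node* node : nodes)
        node->index = getPos(nodes, node);
}

// O(n): iterate by index, so the position never has to be searched for.
void numberNodesFast(const std::vector<Node*>& nodes)
{
    for (std::size_t i = 0; i < nodes.size(); ++i)
        nodes[i]->index = i;
}
```

Both functions produce the same numbering. With an STL container the index or iterator is already at hand inside the loop, which is why such a lookup can usually be removed rather than reimplemented.
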
Improve modularity

?Clear "Mission Statements" for modules?

VCL

get rid of internal event queue.

Framework Improvements

The framework module already has a very modular architecture based solely on UNO components. Among other things, it manages menus and toolbars, but it does not manage docked windows such as the Navigator or the Impress task pane. These windows are still managed in sfx2 and their implementations are based on sfx2 classes. No module that is not based on sfx2 can use them.

Here's the roadmap:

add a LayoutManager for DockingWindows to the framework module that works in the same way as the LayoutManager for toolbars
implement docking handlers for the windows managed by this LayoutManager in the same way as for the toolbars
provide factories for the sfx2 based implementations of Docking Windows
simplify the frame classes in sfx2; without the sfx internal layout manager for DockingWindows only one "frame" class is necessary
put the current container windows inside of task windows managed by a Task service (later they might become tabs in a tab bar)
move the LayoutManager from the frame to the task (or add some "super manager") so that frames can share their tools

Among other things, this will allow more than one view of one document ("split view"), or several documents, in one system task window.

Status Quo

The new LayoutManager and the docking handlers are already being worked on and are nearly finished. The next step will be the factories mentioned above. There is also a prototype for a task window and a task service that can be adjusted to our needs. The biggest challenge currently is the "super manager".

Application-specific Improvements

One of the lingering problems on the application level is the fact that, in spite of modularized lower-level functionality, application functionality cannot be shared between OOo's applications (except via embedding of a whole application (OLE)). This is because neither Calc nor Writer has a reusable application engine, such as a text engine providing text editing and layout functionality, or a table engine providing formula and calculation support. Draw/Impress already uses a shared engine, dubbed 'Drawing Layer', but there's still considerable functionality hidden in the application code that is worth extracting. The missing Writer engine in particular manifests itself in duplicated text editing functionality in EditEngine and TextEngine (used by Impress and Calc for their corresponding text functionality).

Example 1

Writer uses a different implementation for displaying pictures than Draw/Impress. Thus, if you insert a picture into a Writer document, you get a different feature set than in Draw/Impress or Calc. For example, the Writer graphic object is able to render a variety of border styles, but is unable to rotate the picture. To have rotated pictures in Writer, one has to insert the picture into a Draw document, copy the resulting drawing layer object, and paste that into Writer.

Example 2

The Draw/Impress applications use a different implementation for displaying text than Writer. Thus, if you insert a text shape in Impress, you can't give it a two-column layout, nor does it provide change tracking ('redlining'). On the other hand, Writer's fly frames, which superficially perform the same task as the Impress text shape, cannot be rotated.

Another area of improvement is rendering. Currently, all applications' graphical output is based on the OutputDevice class, which provides only very basic rendering facilities (in fact, apart from greatly extended text output functionality (to handle OOo's i18n requirements), this interface has remained basically unchanged for a long time). Specifically, things like performant alpha compositing or anti-aliased geometry rendering are extremely hard to achieve with the current design. Therefore, starting with OOo 2.0, the XCanvas interface is slated to gradually replace OutputDevice in all applications.
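
For readers unfamiliar with the term, alpha compositing boils down to blending a partly transparent source pixel onto a destination pixel. The sketch below shows the standard Porter-Duff "source over" operator on a single pixel with premultiplied alpha. It is a generic illustration only, not the XCanvas or OutputDevice API; Rgba and over are hypothetical names.

```cpp
// One pixel with premultiplied alpha: r, g and b are already scaled by a,
// and all components lie in [0, 1].
struct Rgba {
    float r, g, b, a;
};

// Porter-Duff "source over": draws src on top of dst.
// Whatever src does not cover (1 - src.a) shows the destination through.
Rgba over(const Rgba& src, const Rgba& dst)
{
    const float keep = 1.0f - src.a;
    return { src.r + dst.r * keep,
             src.g + dst.g * keep,
             src.b + dst.b * keep,
             src.a + dst.a * keep };
}
```

For example, half-transparent red {0.5, 0, 0, 0.5} over opaque blue {0, 0, 1, 1} gives {0.5, 0, 0.5, 1}. Doing this efficiently for every pixel of every object requires a rendering interface that carries an alpha channel in the first place.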

Writer

break up the monolith
make the import filters more modular
port rendering to XCanvas

Calc

See Calc/To-Dos

Draw/Impress

break up the monolith
become more decoupled from sfx2
redesign API (performance)

port Drawing Layer to XCanvas (see DrawingPrimitives for one of the preconditions)

Allow slides to inherit animations from the master slide

@@ Line 1: / Line 1: @@
+<noinclude>[[Category:Architecture]] [[Category:To-Do]]</noinclude>
 This page intends to collect various architectural deficiencies (aka the pet peeves of various people) of OpenOffice.org, and lists the areas where's work in progress to improve on the architecture.
@@ Line 5: / Line 7: @@
 The code itself varies greatly in quality, style, and age (the latter invariably leading to the former, if you recall the history and evolvement of c++), with parts being there virtually unmodified for 10+ years, and others just recently written from scratch.
-Taken together, this leads to a lot of complexity and [http://artax.karlin.mff.cuni.cz/~kendy/ooo/cut-n-paste/src680-m154.txt.gz redundancy], which is very hard to remove. What follows are some concrete instantiations of the aforementioned symptoms.
+Taken together, this leads to a lot of complexity and [http://artax.karlin.mff.cuni.cz/~kendy/ooo/cut-n-paste/src680-m154.txt.gz redundancy], which is very hard to remove.
+Facing this amount of code, the big rules must be:
+*simplify
+**remove internal redundancy
+**remove external redundancy (use external projects, whereever possible)
+**remove unused or dead code
+**remove legacy functionality, which does no longer provide noticeable value (e.g. [[Framework/Modules/binfilter|binfilter]])
+*refactor for orthogonality
+**make subsystems implement independent functionality
+**enable combinations of those subsystems to be freely combinable
+**carry that to the UI level (no artificial restrictions on what one can do with UI objects - e.g. shapes can be rotated, and clearly text frames should, too)
+==Architectural To-Dos==
+<DPL>category=To-Do
+category=Architecture</DPL>
 ==Infrastructure Improvements==
@@ Line 21: / Line 39: @@
 *Make threading transparent. Currently, fulfilling the contract of a UNO component regarding thread-safeness is
 #tedious work, because normally each involved object has to acquire and release a mutex on method entry and exit, respectively
-#almost impossible to get right, let alone verified to work correctly (no races, no deadlocks), because of the sheer mass of involved objects and mutices (the number of distinct states that would have to be checked for a proper verification is intractable for anything but the most trivial examples). The upcoming [[Uno/Effort/Creating_the_Uno_Threading_Framework | UNO Threading framework]] makes thread-safeness transparent, by automatically locking and unlocking when entering or exiting components on a much coarser level than single methods.
+#almost impossible to get right, let alone verified to work correctly (no races, no deadlocks), because of the sheer mass of involved objects and mutices (the number of distinct states that would have to be checked for a proper verification is intractable for anything but the most trivial examples). The upcoming [[Uno/Effort/Binary/Extend Threading-Model|extended Binary Uno threading-model]] makes thread-safeness transparent, by automatically locking and unlocking when entering or exiting components on a much coarser level than single methods.
 *Make other mundane stuff transparent. Like memory management (via garbage collection, or refcounting via [http://www.boost.org/libs/smart_ptr/index.html smart ptrs], [http://api.openoffice.org/docs/DevelopersGuide/ProfUNO/ProfUNO.htm#1+4+2+7+3+Mapping+of+Interface UNO reference]), or [http://en.wikipedia.org/wiki/Transaction_processing transactionality] (the mode of making changes take place either completely, or not at all. Having a component behave in a non-transactional way in the face of an error makes recovery rather hard. There's more to transactionality than exception-safeness. Imagine two users collaborating on the same document).
+==General Refactoring Improvements==
+For many reasons the OpenOffice.org codebase is difficult to understand and navigate.  On of the reasons is a lack of cleanup in the code.  There is a never ending list of things that ought to be done-- add some of your own.
+*Actually remove deprecated things.  Things like String and [http://go-oo.org/lxr/ident?i=UniString UniString] need to go.  svtools and tools have loads of stuff that is duplicated elsewhere or is deprecated.  Getting rid of these sorts of things will make maintaining application code much easier.
+*Document things.  Some of the code has comments that at one time were correct.  Some code has German comments.  While most of the OpenOffice.org programmers sind Deutschschprachig, there is an unofficial understanding that German comments mean "don't touch."
+==Code Improvements==
+===Remove unused code===
+Binary Loading/Saving stuff in ItemSets, depend on EditEngine Loading/Saving (only used for Clipboard) - MT
+* has been removed now (along with the version mapping stuff in SfxItemPool) in CWS tl77
+===Remove duplicate code===
+Consolidate slightly copied and modified code.
+* BigPointerArray vs. SvPointerArray
+* RTL Strings with Tools Strings
+Consolidate Text Engines
+* Text Engine
+* Writer Engine
+* Edit Engine
+===Replace code with 3rd party===
+Replace self made containers with STL containers.
+* Tools Container,
+* SvPointerArray ([http://www.openoffice.org/issues/show_bug.cgi?id=112395 Issue 112395]),
+* BigPointerArray,
+* "GetPos" is mostly used wrong -> remove it (algorithmic complexity to hight O(n*n)) - [http://svn.services.openoffice.org/opengrok/search?q=GetPos&project=/DEV300_m87 list of the GetPos usage]
+===Improve modularity===
+?Clear "Mission Statements" for modules?
+===[[VCL]]===
+get rid of internal event queue.
+==Framework Improvements==
+The framework module already has a very modular architecture solely based on UNO components. Amongst others, it manages menus and toolbars, but it does not manage docked windows like e.g. navigator or impress task pane. These windows are still managed in sfx2 and their implementations are based on sfx2 classes. No non-sfx2 based module can use them.
+Here's the roadmap:
+* add a LayoutManager for DockingWindows to the framework module that works in the same way as the LayoutManager for toolbars
+* implement docking handlers for the windows managed by this LayoutManager in the same way as for the toolbars
+* provide factories for the sfx2 based implementations of Docking Windows
+* simplify the frame classes in sfx2; without the sfx internal layout manager for DockingWindows only one "frame" class is necessary
+* put the current container windows inside of task windows managed by a Task service (later they might become tabs in a tab bar)
+* move the LayoutManager from the frame to the task (or add some "super manager") so that frames can share their tools
+Amongst others this will allow to have more than one view to one ("split view") or several documents in one system task window.
+===Status Quo===
+The new LayoutManager and the docking handlers are already worked on and nearly finished. Next step will be the mentioned factories. There also is a prototype for a task window and a task service that can be adjusted to our needs. The biggest challenge currently is the "super manager".
+==Application-specific Improvements==
+One of the lingering problems on the application level is the fact that, in spite of modularized lower-level functionality, application functionality cannot be shared ''between'' OOo's applications (except via embedding of a whole application (OLE)). This is because for neither Calc nor Writer, there are reusable application engines, like a text engine providing text editing and layouting functionality, or a table engine providing formula and calculation support. Draw/Impress already uses a shared engine, dubbed 'Drawing Layer'. But there's still considerable functionality hidden in the application code, which is worth extracting. Especially the missing Writer engine manifests itself in duplicated text editing functionality in EditEngine and TextEngine (used by Impress and Calc for their corresponding text functionality).
+<div name="Example" class="boilerplate metadata" id="example" style="background-color: #fee; margin: 0 1em; padding: 0 10px; border: 1px solid #aaa;">
+'''Example 1'''
+The Writer uses a different implementation when displaying pictures, opposed to Draw/Impress. Thus, if you insert a picture into a writer document, you have a different feature set available as in Draw/Impress or Calc. For example, the Writer graphic object is able to render a variety of border styles, but at the same time, is unable to rotate the picture. To have rotated pictures in Writer, one has to insert that picture in a Draw document, copy the resulting drawing layer object, and paste that into Writer.
+</div>
+<p>
+<div name="Example" class="boilerplate metadata" id="example" style="background-color: #fee; margin: 0 1em; padding: 0 10px; border: 1px solid #aaa;">
+'''Example 2'''
+The Draw/Impress applications use a different implementation for displaying text, as opposed to the Writer. Thus, if you insert a text shape in Impress, you can't have two-column layout with it, nor does it provide change tracking ('redlining'). On the other hand, the Writer's fly frames, which superficially perform the same task as the Impress text shape, cannot be rotated.
+</div>
+Another area of improvement is rendering. Currently, all application's graphical output is based on the OutputDevice class, which provides only very basic rendering facilities (in fact, besides largely extended text output functionality (to handle OOo's i18n requirements), this interface has basically remained unchanged for a long time). Specifically, things like performant alpha compositing or anti-aliased geometry rendering are extremely hard to achieve with the current design. Therefore, starting with OOo 2.0, the [http://api.openoffice.org/docs/common/ref/com/sun/star/rendering/XCanvas.html XCanvas] interface is slated to gradually replace OutputDevice in all applications.
+===Writer===
+*break up the monolith
+*make the import filters more modular
+*port rendering to XCanvas
+===Calc===
+See [[Calc/To-Dos]]
+===Draw/Impress===
+*break up the monolith
+*become more decoupled from sfx2
+*redesign API ([[Impress_Performance | performance]])
+*port Drawing Layer to XCanvas (see [[DrawingPrimitives]] for one of the preconditions)
+*Allow slides to inherit animations from the master slide