To-Dos

From Apache OpenOffice Wiki
Revision as of 14:16, 5 January 2007 by Ufi (Talk | contribs)

OpenOffice.org has many things which one could work on ... just see below :-)

Architecture

This page collects various architectural deficiencies (aka the pet peeves of various people) of OpenOffice.org, and lists the areas where work is in progress to improve the architecture.

Depending on the counting algorithm, OOo consists of approximately 7E6 lines of code (the overwhelming majority being C++, with each of the other languages (Java, Perl, Basic, Python) an order of magnitude smaller). This sheer size is a problem in and of itself: the code base is notorious for crashing various software engineering tools, or slowing them to a crawl, from debuggers to dependency analysis to reverse design extraction.

The code itself varies greatly in quality, style, and age (differences in age invariably leading to differences in quality and style, if you recall the history and evolution of C++), with some parts virtually unmodified for 10+ years, and others just recently written from scratch.

Taken together, this leads to a lot of complexity and redundancy, which is very hard to remove.

Facing this amount of code, the big rules must be:

  • simplify
    • remove internal redundancy
    • remove external redundancy (use external projects, wherever possible)
    • remove unused or dead code
    • remove legacy functionality that no longer provides noticeable value (e.g. binfilter)
  • refactor for orthogonality
    • make subsystems implement independent functionality
    • make those subsystems freely combinable
    • carry that to the UI level (no artificial restrictions on what one can do with UI objects - e.g. shapes can be rotated, and clearly text frames should be rotatable, too)


Architectural To-Dos

Infrastructure Improvements

  • Speeding up the build system, and maybe even making it consider global dependencies (currently, OOo has the notion of modules, which approximately map to top-level directories in the build tree. Automatic build-time dependency calculation is currently only available on the intra-module level).
  • Making the actual design more accessible, improving upon existing solutions like LXR or Bonsai. Ultimately, this should make refactorings of the source code both much easier and much safer than today, by providing information on where and how specific functionality is used. A prerequisite for that would be a parser that really knows about C++ - gccxml might be a starting point.
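
To illustrate what dependency calculation across modules could involve, here is a minimal C++ sketch that orders modules so that each one is built after everything it depends on. It is not taken from the OOo build system: `DepGraph`, `buildOrder`, and the module names in the usage note are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <queue>
#include <string>
#include <vector>

// Hypothetical module graph: each module maps to the modules it depends on.
using DepGraph = std::map<std::string, std::vector<std::string>>;

// Returns an order in which every module appears after its dependencies
// (Kahn's algorithm). Returns an empty vector if the graph contains a cycle.
std::vector<std::string> buildOrder(const DepGraph& deps)
{
    std::map<std::string, std::size_t> pending;  // unmet dependencies per module
    std::map<std::string, std::vector<std::string>> dependents;
    for (const auto& entry : deps) {
        pending.emplace(entry.first, 0);
        for (const auto& dep : entry.second) {
            pending.emplace(dep, 0);  // modules that are only mentioned as a dependency
            ++pending[entry.first];
            dependents[dep].push_back(entry.first);
        }
    }

    std::queue<std::string> ready;  // modules whose dependencies are all built
    for (const auto& entry : pending)
        if (entry.second == 0) ready.push(entry.first);

    std::vector<std::string> order;
    while (!ready.empty()) {
        const std::string module = ready.front();
        ready.pop();
        order.push_back(module);
        for (const auto& dependent : dependents[module])
            if (--pending[dependent] == 0) ready.push(dependent);
    }

    assert(order.size() <= pending.size());
    if (order.size() != pending.size()) order.clear();  // a cycle blocked some modules
    return order;
}
```

For a graph in which `app` depends on `toolkit` and `base`, and `toolkit` depends on `base`, the function yields `base`, `toolkit`, `app`.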


Runtime System Improvements

This is about making the implementation languages safer, and easier to use. What follows could also be subsumed under "transparency on the implementation level". When something can be used transparently, or appears transparent to a user, it is an implementation aspect she need not care about. Being able to program in an environment which is transparent with regard to lots of aspects, empowers the developer to focus on the problem at hand, not having to litter her code with mundane tasks such as memory management or locking.

  • Make threading transparent. Currently, fulfilling the contract of a UNO component regarding thread safety is
    1. tedious work, because normally each involved object has to acquire a mutex on method entry and release it on method exit
    2. almost impossible to get right, let alone to verify (no races, no deadlocks), because of the sheer mass of involved objects and mutexes (the number of distinct states that would have to be checked for a proper verification is intractable for anything but the most trivial examples).
    The upcoming extended Binary Uno threading model makes thread safety transparent by automatically locking and unlocking when entering or exiting components, on a much coarser level than single methods.
  • Make other mundane stuff transparent, such as memory management (via garbage collection, or reference counting via smart pointers, UNO references) or transactionality (changes take place either completely or not at all). A component that behaves non-transactionally in the face of an error makes recovery rather hard. There is more to transactionality than exception safety: imagine two users collaborating on the same document.
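
The difference between per-method locking and locking at the component boundary can be sketched in plain C++. This is only an illustration of the idea, not the Binary Uno threading model itself; `FineGrainedCounter`, `PlainCounter`, `Guarded`, and `enter` are hypothetical names.

```cpp
#include <cassert>
#include <mutex>
#include <utility>

// Style 1: every method of every object acquires and releases a mutex itself.
class FineGrainedCounter {
public:
    void increment() { std::lock_guard<std::mutex> guard(mutex_); ++value_; }
    int value() const { std::lock_guard<std::mutex> guard(mutex_); return value_; }
private:
    mutable std::mutex mutex_;
    int value_ = 0;
};

// Style 2: the implementation is written without any locking code ...
class PlainCounter {
public:
    void increment() { ++value_; }
    int value() const { return value_; }
private:
    int value_ = 0;
};

// ... and a hypothetical boundary wrapper takes one lock whenever a caller
// enters the component, however many methods are called inside.
template <typename Component>
class Guarded {
public:
    template <typename Fn>
    auto enter(Fn fn) -> decltype(fn(std::declval<Component&>()))
    {
        std::lock_guard<std::mutex> guard(mutex_);
        return fn(component_);
    }
private:
    std::mutex mutex_;
    Component component_;
};
```

A caller would write `component.enter([](PlainCounter& c) { c.increment(); c.increment(); });`, so both increments happen under a single lock and `PlainCounter` stays free of threading concerns.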

General Refactoring Improvements

For many reasons the OpenOffice.org codebase is difficult to understand and navigate. One of the reasons is a lack of cleanup in the code. There is a never-ending list of things that ought to be done. Add some of your own.

  • Actually remove deprecated things. Things like String and UniString need to go. svtools and tools have loads of stuff that is duplicated elsewhere or is deprecated. Getting rid of these sorts of things will make maintaining application code much easier.
  • Document things. Some of the code has comments that at one time were correct. Some code has German comments. While most of the OpenOffice.org programmers are German-speaking, there is an unofficial understanding that German comments mean "don't touch."

Code Improvements

Remove unused code

Binary Loading/Saving stuff in ItemSets, depend on EditEngine Loading/Saving (only used for Clipboard) - MT

  • has been removed now (along with the version mapping stuff in SfxItemPool) in CWS tl77

Remove duplicate code

Consolidate code that was copied and slightly modified.

  • BigPointerArray vs. SvPointerArray
  • RTL Strings with Tools Strings

Consolidate Text Engines

  • Text Engine
  • Writer Engine
  • Edit Engine

Replace code with 3rd party

Replace self made containers with STL containers.
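
As a rough sketch of what such a replacement might look like, a hand-written owning pointer array can often be expressed as a `std::vector` of `std::unique_ptr`, with sorting and lookup delegated to standard algorithms. `Item`, `ItemList`, `insertSorted`, and `findByName` are hypothetical names and do not correspond to any existing OOo class.

```cpp
#include <algorithm>
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical element type standing in for whatever a self-made array holds.
struct Item {
    explicit Item(std::string n) : name(std::move(n)) {}
    std::string name;
};

// Owning container: elements are destroyed automatically with the vector.
using ItemList = std::vector<std::unique_ptr<Item>>;

// Inserts an item so that the list stays sorted by name.
void insertSorted(ItemList& list, std::unique_ptr<Item> item)
{
    assert(item != nullptr);
    auto pos = std::lower_bound(
        list.begin(), list.end(), item,
        [](const std::unique_ptr<Item>& a, const std::unique_ptr<Item>& b) {
            return a->name < b->name;
        });
    list.insert(pos, std::move(item));
}

// Returns the item with the given name, or nullptr if there is none.
const Item* findByName(const ItemList& list, const std::string& name)
{
    auto it = std::find_if(
        list.begin(), list.end(),
        [&name](const std::unique_ptr<Item>& item) { return item->name == name; });
    return it == list.end() ? nullptr : it->get();
}
```

Memory management, growth, and iteration then come from the standard library instead of being reimplemented per container.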

Improve modularity

?Clear "Mission Statements" for modules?

VCL

Get rid of the internal event queue.

Framework Improvements

The framework module already has a very modular architecture solely based on UNO components. Among other things, it manages menus and toolbars, but it does not manage docked windows such as the navigator or the Impress task pane. These windows are still managed in sfx2 and their implementations are based on sfx2 classes. No module that is not based on sfx2 can use them.

Here's the roadmap:

  • add a LayoutManager for DockingWindows to the framework module that works in the same way as the LayoutManager for toolbars
  • implement docking handlers for the windows managed by this LayoutManager in the same way as for the toolbars
  • provide factories for the sfx2 based implementations of Docking Windows
  • simplify the frame classes in sfx2; without the sfx internal layout manager for DockingWindows only one "frame" class is necessary
  • put the current container windows inside of task windows managed by a Task service (later they might become tabs in a tab bar)
  • move the LayoutManager from the frame to the task (or add some "super manager") so that frames can share their tools

Among other things, this will make it possible to have more than one view of one document ("split view"), or several documents, in one system task window.

Status Quo

The new LayoutManager and the docking handlers are already being worked on and are nearly finished. The next step will be the factories mentioned above. There is also a prototype for a task window and a task service that can be adjusted to our needs. The biggest challenge currently is the "super manager".

Application-specific Improvements

One of the lingering problems on the application level is that, in spite of modularized lower-level functionality, application functionality cannot be shared between OOo's applications (except by embedding a whole application via OLE). This is because neither Calc nor Writer has a reusable application engine, such as a text engine providing text editing and layout functionality, or a table engine providing formula and calculation support. Draw/Impress already uses a shared engine, dubbed 'Drawing Layer', but there is still considerable functionality hidden in the application code that is worth extracting. The missing Writer engine in particular manifests itself in duplicated text editing functionality in EditEngine and TextEngine (used by Impress and Calc for their corresponding text functionality).

Another area for improvement is rendering. Currently, all applications' graphical output is based on the OutputDevice class, which provides only very basic rendering facilities (in fact, apart from greatly extended text output functionality to handle OOo's i18n requirements, this interface has remained basically unchanged for a long time). Specifically, things like efficient alpha compositing or anti-aliased geometry rendering are extremely hard to achieve with the current design. Therefore, starting with OOo 2.0, the XCanvas interface is slated to gradually replace OutputDevice in all applications.

Writer

  • break up the monolith
  • make the import filters more modular
  • port rendering to XCanvas

Calc

See Calc/To-Dos

Draw/Impress

  • break up the monolith
  • become more decoupled from sfx2
  • redesign API (performance)
  • Allow slides to inherit animations from the master slide

Uno

To-Dos and potential To-Dos.

General

Clear Separation of C and C++ Uno

There are various obstacles in the way of cleanly separating C Uno (AKA Binary Uno) from C++ Uno. Some of these are

  • the C Uno runtime is implemented in C++,
  • a C++ Uno runtime would be stacked on top of C Uno,
  • there is no living C language binding,
  • the C++ Uno runtime offers various functions for bootstrapping Uno, which are not yet available for Binary Uno.
  • headers of upper-level modules may not be used until they are delivered, even if they are self-contained.

Some of the obvious tasks are:

Bugs

Naming / Clean up

  • Rename module udk/cppu to reflect that it is implementing the Binary Uno runtime.
  • Rename module udk/cppuhelper to reflect that it is implementing the C++ Uno runtime.
  • Rename the Binary Uno to JNI (Java Native Interface) bridge: java_uno -> jni_uno. Because that is what the bridge is about.
  • Rename the Binary Uno to Remote Uno bridge: urp_uno -> remote_uno. Because that is what the bridge is about (actually, there is no URP object to program against, at least not in Binary Uno).
  • Remove the "lib" prefixes under UNIX from the Binary Uno bridges.
  • Rename udk/cpputools to something like "unotools" (unfortunately this name is already in use).
  • Mark SAL_IMPLEMENT_MAIN_WITH_ARGS as deprecated. The right way to deal with args is the RTL command line arg functions, see porting/sal/inc/rtl/process.h

Simplification and Performance

  • Remove the Binary Uno Object Binary Interface (OBI) (struct uno_Interface) and friends, replace it with one of the platform C++ OBIs.
  • Support direct access of Uno types in Uno IDL, without includes.
  • Let the *makers retrieve type information from the type providers and not from rdb files.
  • Harmonize initial object access for Remote Uno and components -> it is actually the same. E.g.
    "uno:library;[gcc3];<implementation name>"
    may be used to access an instance factory, or any other object of interest.
  • Leverage Purpose Bridges for global variables, e.g. the "ServiceManager" or the "ComponentContext". Use this for bootstrapping as well. E.g.
    Reference<XComponentContext> cppu::getComponentContext();
    always returns the current component context. It is usable in components, libraries or applications and may even bootstrap Uno, if no context is available yet.
  • Remove all exception specifications.
  • Consolidate the Binary Uno structs "uno_Environment" and "uno_ExtEnvironment".
  • Remove #ifndef EXCEPTIONS_OFF macros; C++ Uno is not usable without exceptions anyway.
  • Is SAL_CALL really necessary for "inline" stuff? If not, remove it.
  • Unify command line interface for all Uno tools.
  • Convert the ProxyFactory service into a library and deprecate it.

Features

Tests

Framework

Below is a list of tasks that we would like to implement but have not yet found the time for. These tasks can be implemented by experienced C++ developers who want to help us. You would definitely get support from the regular framework developers. If you are interested in working on one of these tasks, please contact us on our "dev" mailing list or via e-mail to the framework project lead.

Needed skills: C++, Windows API
Difficulty: Medium
Contact: cd at openoffice dot org

  • Toolbar and popup menu controllers which are more powerful and easier to use than the current ones.

Needed skills: C++, GUI experience
Difficulty: Medium
Contact: cd at openoffice dot org

  • Improve code handling configuration settings
    • Cleanup the code of configuration items (make them write-through instead of write-back with their own cache) to support immediate updates on configuration changes.
    • Update the 'Tools - Options' dialogs to support the read-only OpenOffice.org configuration item state.

Needed skills: C++, configuration background
Difficulty: Medium/Hard
Contact: cd at openoffice dot org
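
The write-through idea from the configuration task above can be sketched as follows: an item keeps no private cache, so every read and write goes straight to the shared store, and other parts of the application see changes immediately. This is a simplified illustration under assumed names (`ConfigStore`, `WriteThroughItem`, and the key `ui/iconSize` are all hypothetical), not the actual OOo configuration API.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the shared configuration backend.
class ConfigStore {
public:
    using Listener = std::function<void(const std::string& key, const std::string& value)>;

    void addListener(Listener listener) { listeners_.push_back(std::move(listener)); }

    // Stores the value and notifies all listeners right away.
    void set(const std::string& key, const std::string& value)
    {
        values_[key] = value;
        for (const auto& listener : listeners_) listener(key, value);
    }

    // Returns the stored value, or an empty string if the key is unknown.
    std::string get(const std::string& key) const
    {
        auto it = values_.find(key);
        return it == values_.end() ? std::string() : it->second;
    }

private:
    std::map<std::string, std::string> values_;
    std::vector<Listener> listeners_;
};

// Write-through item: no cache of its own, so it can never hold stale data.
class WriteThroughItem {
public:
    WriteThroughItem(ConfigStore& store, std::string key)
        : store_(store), key_(std::move(key))
    {
        assert(!key_.empty());
    }
    void set(const std::string& value) { store_.set(key_, value); }
    std::string get() const { return store_.get(key_); }

private:
    ConfigStore& store_;
    std::string key_;
};
```

With two `WriteThroughItem` objects bound to the same key, a `set` through one is visible through the other on the next `get`, which is the behavior a write-back item with its own cache cannot guarantee.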

  • User interface to associate templates to existing documents

Needed skills: C++, GUI experience
Difficulty: Medium
Contact: cd at openoffice dot org

Needed skills: C++, GUI experience
Difficulty: Medium/Hard
Contact: cd at openoffice dot org

  • Implement start center for Mac without using a main window

Needed skills: C++, GUI and Mac OS X experience
Difficulty: Medium
Contact: cd at openoffice dot org

  • Implement a fancy user interface for the user interface migration feature. OpenOffice.org can currently migrate user changes between versions. It would be nice for experienced users to select changes with a fancy user interface.

Needed skills: C++, GUI experience, User interface design
Difficulty: Medium
Contact: cd at openoffice dot org

  • Tabbed-Window interface: Enhance the current implementation to support a tabbed window user interface. There is a simple extension available which can be used as a starting point.

Needed skills: C++, GUI experience, experience with MVC concepts
Difficulty: Medium/Hard
Contact: cd at openoffice dot org

Writer

Intro: Development Opportunities

Here is a list of things that we would like to implement but have not yet found the time for. In our opinion all of these tasks can be done by experienced C++ developers who are willing to enter the interesting world of OOo Writer. They are great opportunities for interested developers, as the list mostly contains features that have been requested by the OpenOffice.org community, so providing them would surely be appreciated by the users. Of course, interested developers are also invited to present their own ideas for the Writer project. They can count on the support of the regular Writer developers. If you are interested in working on one of these tasks, please get in touch with us on our "dev" mailing list or via private mail to the Writer project lead. We can talk about existing specifications, ideas, stuff to read or hack etc.


Additional ToDo Pages


Features with a high number of votes

Statusbar control for line and column number in Writer

Issue 18004

A statusbar control that shows the line and column number of the current cursor position. While this sounds easy for simple documents, it can become quite "interesting" for documents with tables, text frames etc.
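
For the simple case, the calculation might look like the following sketch. It assumes the layout can already report the character offset at which each line of the current text starts, which is exactly the part that becomes difficult with tables and text frames; `Position`, `locate`, and `lineStarts` are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// 1-based line and column, as a status bar would display them.
struct Position {
    std::size_t line;
    std::size_t column;
};

// lineStarts holds the character offset at which each layout line begins,
// in ascending order and starting with 0. cursor is a character offset.
Position locate(const std::vector<std::size_t>& lineStarts, std::size_t cursor)
{
    assert(!lineStarts.empty() && lineStarts.front() == 0);
    // First line that starts after the cursor; the cursor is on the line before it.
    auto next = std::upper_bound(lineStarts.begin(), lineStarts.end(), cursor);
    const std::size_t lineIndex = static_cast<std::size_t>(next - lineStarts.begin()) - 1;
    return Position{lineIndex + 1, cursor - lineStarts[lineIndex] + 1};
}
```

With lines starting at offsets 0, 10 and 25, a cursor at offset 12 is reported as line 2, column 3.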

Reveal Formatting Codes

Issue 3395

Especially former WordPerfect users very often ask for a special view or tools window where the formatting at a particular cursor position can be made visible by showing some "tokens" representing the applied formats.


Lotus WordPro filter

Issue 11215

If you know something about the format of Lotus WordPro - here's the perfect task for you! Anything can be useful as a start, even a simple filter that just extracts the plain text. A first patch is available, but it does not yet work well enough.


MathML Export

Exporting to HTML should turn Formulae into MathML (not GIFs). Issue 24256

Math already supports MathML export. The HTML export filter needs to be extended to use it instead of treating Math as an OLE object that is exported as a pixel graphic.


Issue Tracker queries

Here are some issue lists of things we would like to get implemented:

Some other ideas

More text import and export filters

Perhaps you have experience with other file formats? Here's something for you!

Shrinking text below a certain size

A component to shrink a document to a defined page count, so the user can produce documents without a half-full final page. The component may then apply different steps (for example: shrink all font sizes, or shrink the paragraph distance, or shrink ...) to arrive at a sensible result.
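
One possible control loop for such a component is sketched below: it applies each shrink step repeatedly until the document fits or the step is exhausted, then moves on to the next step. Everything here is an assumption made for illustration, including the `Layout` fields, the `PageCounter` callback that stands in for a real re-layout, and the `shrinkToFit` name.

```cpp
#include <cassert>
#include <functional>
#include <vector>

// Hypothetical knobs that the shrink steps are allowed to turn.
struct Layout {
    int fontPercent = 100;
    int spacingPercent = 100;
};

// Stand-in for re-running the layout and counting the resulting pages.
using PageCounter = std::function<int(const Layout&)>;
// A step shrinks the layout a little; it returns false once it cannot shrink further.
using ShrinkStep = std::function<bool(Layout&)>;

// Applies the steps in order until the document fits into targetPages.
// Returns true on success, false if all steps were exhausted first.
bool shrinkToFit(Layout& layout, int targetPages,
                 const PageCounter& countPages,
                 const std::vector<ShrinkStep>& steps)
{
    assert(targetPages > 0);
    for (const auto& step : steps) {
        while (countPages(layout) > targetPages) {
            if (!step(layout)) break;  // this step is exhausted, try the next one
        }
        if (countPages(layout) <= targetPages) return true;
    }
    return countPages(layout) <= targetPages;
}
```

Putting the least intrusive step first (for example paragraph distance before font size) keeps the visible change small, since later steps are only reached when earlier ones are not enough.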

"Beautifier" for Writer

Such a tool can modify existing documents in a way that the result looks the same but is optimized for whatever criteria the developer wants to fulfill: replace hard formatting (autostyles) by styles, detect multiple white spaces, detect superfluous hard formatting, optimize fonts, embed and resize pictures etc.

Integration of Mac Grammar Checker

Since OOo 3.0.1 the Grammar Checking API of OOo is final. Some extensions based on this API already exist and it would be great to have an extension integrating the Mac Grammar Checker.


Integration of spell checking into the proof reading code

OOo Writer calls Grammar Checkers from a new, multi-threaded implementation that iterates over paragraphs and sentences. Spell checkers, on the other hand, are called from the main thread and just iterate over the single words of the text. For several reasons it would be desirable to merge both iterations into a common one, in which all spell checkers can still be called with single words as they are today:

  • spell checkers that can be called with larger blocks of text can be served better by providing a specialized component (Mac spell checker!)
  • some grammar checkers might also do spell checking, so calling them from different places in the code for both tasks is inefficient
  • spell checking can happen in a parallel thread of execution
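
A common iteration of this kind could be structured roughly as in the sketch below: one pass over the sentences hands each sentence to the grammar checker and each of its words to the spell checker. The sketch is single-threaded and works on sentences that are already split; the checker signatures, `ProofreadingReport`, and `proofread` are hypothetical and unrelated to the real linguistic API.

```cpp
#include <cassert>
#include <functional>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical checker callbacks; each returns true if the input is acceptable.
using GrammarChecker = std::function<bool(const std::string& sentence)>;
using SpellChecker = std::function<bool(const std::string& word)>;

struct ProofreadingReport {
    std::vector<std::string> badSentences;
    std::vector<std::string> misspelledWords;
};

// One iteration serves both checkers: sentences go to the grammar checker,
// and the whitespace-separated words of each sentence go to the spell checker.
ProofreadingReport proofread(const std::vector<std::string>& sentences,
                             const GrammarChecker& grammarOk,
                             const SpellChecker& spellingOk)
{
    assert(grammarOk && spellingOk);
    ProofreadingReport report;
    for (const auto& sentence : sentences) {
        if (!grammarOk(sentence)) report.badSentences.push_back(sentence);
        std::istringstream words(sentence);
        std::string word;
        while (words >> word)
            if (!spellingOk(word)) report.misspelledWords.push_back(word);
    }
    return report;
}
```

Because the spell checker is still called with single words, existing spell checkers would not need to change, while a checker that prefers larger blocks could be given the whole sentence at the same place in the loop.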

Integration of Grammar Checking into Calc, Draw and Impress

Grammar Checkers can be used from Writer only, the other applications lack the necessary text iteration loops. Also an implementation of the text iteration API that is used to communicate with Grammar Checkers is not provided by the text engine that these applications use ("EditEngine"). For efficiency reasons this task should not be started before the task "integration of spell checking into the proof reading code" has been finished.

Integration of language guessing and selection into Calc, Draw and Impress

Issue 66798

The language status bar control together with its ability to propose a language for text and offer an easy selection of the correct language currently only works for flow text in Writer.

Visualization of RDF meta data

Issue 109598

OOo Writer supports RDF meta data according to the ODF 1.2 specification for a number of objects. We need a user interface component that shows them, similar to our current comment (notes) sidebar. This would require refactoring the components of this sidebar so that they use an API that can support comments as well as meta data and change tracking comments.

Application Help

Application_Help_Development_To_Do
