Difference between revisions of "Calc/Implementation/Spreadsheet Functions"

From Apache OpenOffice Wiki
Jump to: navigation, search
(sc/source/core/src/compiler.src)
m (Advanced)
(21 intermediate revisions by 3 users not shown)
Line 23: Line 23:
  
  
=== sc/inc/compiler.hrc (&le; DEV300_m38)<br />formula/inc/formula/compiler.hrc (&ge; DEV300_m39) ===
+
=== formula/inc/formula/compiler.hrc ===
 
+
'''''sc/inc/compiler.hrc''' up to and including version DEV300_m38'' 
 +
 
 
These are the defines used by the resources for function names and the Function
 
These are the defines used by the resources for function names and the Function
 
Wizard, and the numerical values of OpCode for the formula compiler and
 
Wizard, and the numerical values of OpCode for the formula compiler and
Line 52: Line 53:
 
</source>
 
</source>
  
=== sc/inc/opcode.hxx (&le; DEV300_m38)<br />formula/inc/formula/opcode.hxx (&ge; DEV300_m39) ===
+
=== formula/inc/formula/opcode.hxx ===
 +
'''''sc/inc/opcode.hxx''' up to and including version DEV300_m38''
  
 
Here the ''OpCodeEnum'' values are defined. Note that in a non-product build
 
Here the ''OpCodeEnum'' values are defined. Note that in a non-product build
Line 72: Line 74:
 
</source>
 
</source>
  
=== sc/source/core/src/compiler.src (&le; DEV300_m38)<br />formula/source/core/resource/core_resource.src (&ge; DEV300_m39) ===
+
=== formula/source/core/resource/core_resource.src ===
 +
'''''sc/source/core/src/compiler.src''' up to and including version DEV300_m38''
  
 
These are the resources for function names. There are 3 resource bundles:
 
These are the resources for function names. There are 3 resource bundles:
Line 122: Line 125:
 
=== The compiler knows the function ===
 
=== The compiler knows the function ===
  
After having added the necessary entries to <tt>sc/inc/compiler.hrc</tt>,
+
After having added the necessary entries to <tt>formula/inc/formula/compiler.hrc</tt>, <tt>formula/inc/formula/opcode.hxx</tt>, and <tt>formula/source/core/resource/core_resource.src</tt> (<tt>sc/inc/compiler.hrc</tt>,
<tt>sc/inc/opcode.hxx</tt> and <tt>sc/source/core/src/compiler.src</tt>, the
+
<tt>sc/inc/opcode.hxx</tt>, and <tt>sc/source/core/src/compiler.src</tt> respectively), the
 
formula compiler now knows the new function name and is able to compile an
 
formula compiler now knows the new function name and is able to compile an
 
expression where it is used, and it can be stored in and loaded from
 
expression where it is used, and it can be stored in and loaded from
Line 135: Line 138:
  
 
=== sc/inc/scfuncs.hrc ===
 
=== sc/inc/scfuncs.hrc ===
 +
Till DEV300m90
  
 
Function groups (categories) and HelpIDs for functions are defined here. Lookup
 
Function groups (categories) and HelpIDs for functions are defined here. Lookup
Line 145: Line 149:
 
</source>
 
</source>
  
 +
=== sc/inc/helpids.h ===
 +
Since DEV300m91.
 +
 +
With {{bug|111874}} (not issue 111784 as in comment in CWS changehid) the helpid system has been changed from numerical to byte string. The help ids are no longer defined in file <tt>scfuncs.hrc</tt> but in file <tt>helpids.h</tt>
 +
 +
For our function that could be
 +
 +
<source lang="cpp">
 +
#define HID_FUNC_MYFUNC            "SC_HID_FUNC_MYFUNC"
 +
</source>
 +
 +
Notice the prefix <tt>SC_</tt>. Use a position that is similar to the position of the new function in the other files like <tt>compiler.hrc</tt> or <tt>core_resource.src</tt>.
  
 
=== sc/source/ui/src/scfuncs.src ===
 
=== sc/source/ui/src/scfuncs.src ===
Line 225: Line 241:
 
Now the Function Wizard can display the function and its parameters, and
 
Now the Function Wizard can display the function and its parameters, and
 
online-help may be authored.
 
online-help may be authored.
 
  
 
== Let the interpreter handle the function ==
 
== Let the interpreter handle the function ==
Line 302: Line 317:
 
=== The not so easy case of non-scalar arguments ===
 
=== The not so easy case of non-scalar arguments ===
  
'''TODO:''' section needs elaboration.
+
 
 +
==== <tt>StackVar</tt> argument types ====
  
 
If the new function would handle parameters that are not scalar values, for
 
If the new function would handle parameters that are not scalar values, for
Line 311: Line 327:
 
doubt.
 
doubt.
  
In this case the parameter types have to be added to file
+
The most common <tt>StackVar</tt> types obtainable with <code>GetStackType()</code>are:
 +
 
 +
; svDouble : A double value.
 +
; svString : A literal string text.
 +
; svSingleRef : A single cell reference.
 +
; svDoubleRef : A cell range reference.
 +
; svRefList : A list of cell range references.
 +
; svMatrix : A matrix/array.
 +
 
 +
There are a few other types available with <code>GetRawStackType()</code> for
 +
specific situations, these are converted to ''svDouble'' in
 +
<code>GetStackType()</code> and returned as 0.0 by <code>PopDouble()</code> and
 +
<code>GetDouble()</code>:
 +
 
 +
; svMissing : An empty parameter without value, if that should not be converted to 0.0 use <code>GetDoubleWithDefault(double)</code>.
 +
; svEmptyCell : A previous expression returned an empty cell, it may depend on context whether that is to be interpreted as double or string.
 +
 
 +
The general template to act on each parameter in sequence from right to left is:
 +
 
 +
<source lang="cpp">
 +
    // Note: use short instead of BYTE when doing the (nParamCount-- > 0) loop below.
 +
    short nParamCount = GetByte();
 +
    if (MustHaveParamCountMin( nParamCount, 1))    // In case your function needs at least one parameter.
 +
    {
 +
        size_t nRefInList = 0;
 +
        while(nParamCount-- > 0)
 +
        {
 +
            if (nGlobalError)
 +
                Pop();
 +
            else
 +
            {
 +
                switch (GetStackType())
 +
                {
 +
                    case svDouble :
 +
                        {
 +
                            double fVal = PopDouble();
 +
                            if (!nGlobalError)
 +
                            {
 +
                                ...
 +
                            }
 +
                        }
 +
                    break;
 +
                    case svString :
 +
                        {
 +
                            String aStr( PopString());
 +
                            if (!nGlobalError)
 +
                            {
 +
                                ...
 +
                            }
 +
                        }
 +
                    break;
 +
                    case svSingleRef :
 +
                        {
 +
                            ScAddress aAdr;
 +
                            PopSingleRef( aAdr);
 +
                            if (!nGlobalError)
 +
                            {
 +
                                ScBaseCell* pCell = GetCell( aAdr);
 +
                                if (HasCellValueData( pCell))
 +
                                {
 +
                                    double fVal = GetCellValue( aAdr, pCell);
 +
                                    if (!nGlobalError)
 +
                                    {
 +
                                        ...
 +
                                    }
 +
                                }
 +
                                else
 +
                                {
 +
                                    String aStr;
 +
                                    GetCellString( aStr, pCell);
 +
                                    if (!nGlobalError)
 +
                                    {
 +
                                        ...
 +
                                    }
 +
                                }
 +
                            }
 +
                        }
 +
                    break;
 +
                    case svDoubleRef:
 +
                    case svRefList:
 +
                        {
 +
                            ScRange aRange;
 +
                            PopDoubleRef( aRange, nParamCount, nRefInList);
 +
                            if (!nGlobalError)
 +
                            {
 +
                                double fVal;
 +
                                USHORT nErr = 0;
 +
                                ScValueIterator aValIter( pDok, aRange );
 +
                                if (aValIter.GetFirst( fVal, nErr))
 +
                                {
 +
                                    do
 +
                                    {
 +
                                        if (nErr)
 +
                                            break;
 +
                                        ...
 +
                                    } while (aValIter.GetNext( fVal, nErr));
 +
                                }
 +
                                SetError( nErr);
 +
                            }
 +
                        }
 +
                    break;
 +
                    case svMatrix:
 +
                        {
 +
                            ScMatrixRef pMat = GetMatrix();
 +
                            if (pMat)
 +
                            {
 +
                                ...
 +
                            }
 +
                            // else: GetMatrix did set errIllegalParameter
 +
                        }
 +
                    break;
 +
                    default:
 +
                        PopError();
 +
                        SetError( errIllegalParameter);
 +
                }
 +
            }
 +
        }
 +
        Push...(...);  // Don't forget to push a result ;-)
 +
    }
 +
</source>
 +
 
 +
Of course your function may require specific treatment of parameters depending
 +
on parameter types of other parameters, or bail out if a parameter type does
 +
not match an expected type, this template is meant as guidance only.
 +
 
  
 
==== sc/source/core/tool/parclass.cxx ====
 
==== sc/source/core/tool/parclass.cxx ====
  
that describes how parameters are to be treated in detail. If the function does
+
Describes how parameters are to be treated if the function is used in an array
not accept other than scalar parameters, nothing has to be done here. Please
+
context expression, AKA array formula, where cell range references may be
read the comments there and in <tt>sc/source/core/inc/parclass.hxx</tt>. Ask on
+
treated differently from non-array context. If the function does not accept
the dev@sc mailing list if in doubt.
+
other than scalar arguments, nothing has to be done here and the function does
 +
not need an entry. Else the type of each parameter has to be set to one of:
  
 +
; Value : Function expects a single scalar value.
 +
; Reference : A cell range reference is passed as reference and not converted to array.
 +
; Array : In array context a cell range reference is converted to array.
 +
; ForceArray : A cell range reference is always converted to an array and the ForceArray context is propagated to all functions and operators in this parameter's expression.
 +
; ReferenceOrForceArray : A cell range reference is not converted to an array, but a ForceArray context is propagated to all functions and operators in this parameter's expression. Only LOOKUP() uses this, other functions should not need it.
  
== Excel import/export ==
+
For a more detailed description of these types please read the comments for
 +
<tt>ScParameterClassification::Type</tt> in
 +
<tt>sc/source/core/inc/parclass.hxx</tt>. Ask on the dev@sc mailing list if in
 +
doubt.
  
 +
Set the <tt>ScParameterClassification::CommonData.bRepeatLast</tt> member
 +
variable to <code>true</code> if the function accepts multiple parameters that
 +
are of the same type as the last one specified, e.g. for SUM() that accepts
 +
multiple ''Reference'' parameters. If the function does not accept multiple
 +
parameters of the same type, set the value to <code>false</code>.
  
=== sc/source/filter/excel/xlformula.cxx ===
 
  
If the new function is also supported by Microsoft Excel&reg;, for
+
== Microsoft Excel<sup>&reg;</sup> import and export ==
import/export it has to be added to the binary filter as well. Depending on
+
from which version Excel knows the function, it has to be added to a
+
corresponding <code>XclFunctionInfo</code> table in <tt>xlformula.cxx</tt>. Ask
+
our Excel expert Daniel on the dev@sc mailing list for details. For details of
+
the tables' content see the declaration of <code>struct XclFunctionInfo</code>
+
in <tt>sc/source/filter/inc/xlformula.hxx</tt>.
+
  
 +
If the new function is also supported by Microsoft Excel<sup>&reg;</sup>, for import/export it has to be added to the filter code as well. OpenOffice.org supports various versions of the file format used by Excel.
  
=== oox/source/xls/formulabase.cxx ===
+
* Currently, import and export of the old binary formats (file extension ".xls", internal names "BIFF2" for Excel 2.x, "BIFF3" for Excel 3.0, "BIFF4" for Excel 4.0, "BIFF5" for Excel 5.0 and Excel 95, and "BIFF8" for Excel 97-2003) are implemented in the <tt>sc</tt> code module in the directory <tt>sc/source/filter/excel</tt>. While the import filters support all BIFF versions, there are export filters for BIFF5 and BIFF8 only.
  
As if that wasn't enough, nearly the same has to be added to the new binary
+
* Import of the new Office Open XML formats (file extensions ".xlsx" and ".xlsm" used by Excel 2007 and later, internal name "OOXML") and the new binary format (file extension ".xlsb" used by Excel 2007 and later, internal name "BIFF12") are implemented in the <tt>oox</tt> code module in the directory <tt>oox/source/xls</tt>.
filter for Microsoft Excel&reg; that shares some mechanisms with the new
+
 
[[WikiPedia:Office_Open_XML | Microsoft Office Open XML (MOOXML)]] filter. The
+
In the future, it is planned to move the old BIFF2-BIFF8 filters to the <tt>oox</tt> code module too, to be able to share more source code for all filters. The import of functions and formulas will take benefit from that too.
implementation in <tt>sc/source/filter/{excel,xcl97}/</tt> will be deprecated
+
 
later. For details of the new tables see declaration of <code>struct FunctionInfo</code>
+
=== Old binary file formats (BIFF2-BIFF8) ===
in <tt>oox/inc/oox/xls/formulabase.hxx</tt>.
+
 
 +
==== Overview ====
 +
 
 +
Import and export of BIFF formulas is spread over a number of files in the <tt>sc</tt> code module.
 +
 
 +
* Source code used by the import and export filters is located in the files <tt>sc/source/filter/inc/xlformula.hxx</tt> and <tt>sc/source/filter/excel/xlformula.cxx</tt>. The cxx file contains the most important detail for functions: the ''function tables'', which will be described in detail below.
 +
 
 +
* The source code to read formulas from BIFF is located in <tt>sc/source/filter/inc/excform.hxx</tt>, <tt>sc/source/filter/excel/excform.cxx</tt> (for BIFF2-BIFF5), and <tt>sc/source/filter/excel/excform8.cxx</tt> (additional code for BIFF8). Note that these source files have not been touched seriously for more than a decade, so the code quality might not be very high.
 +
 
 +
* The source code to write formulas to BIFF is located in <tt>sc/source/filter/inc/xeformula.hxx</tt> and <tt>sc/source/filter/excel/xeformula.cxx</tt>.
 +
 
 +
It is important to know the BIFF version the function has been added to. Some functions are not supported by very old versions of Excel, and some functions even change the number of supported parameters between BIFF versions.
 +
 
 +
==== Basics - function tables ====
 +
 
 +
The file <tt>sc/source/filter/excel/xlformula.cxx</tt> contains a function table for each BIFF version, and an additional table for functions unknown to Excel. Every line in these tables is basically a structure of type <tt>XclFunctionInfo</tt> as defined in <tt>sc/source/filter/inc/xlformula.hxx</tt> and describes a single spreadsheet function. If a function description is added in a table for a specific BIFF version, it will be used for all newer BIFF versions too, there is no need to repeat the function description in the following tables (unless the function changes somehow in a newer BIFF version, this will be discussed in the next chapter).
 +
 
 +
We will continue to take a look at our example function '''MYFUNC'''. We will assume that this function has been introduced in BIFF3. Furthermore, we will assume that the function is built-in in Excel and has the '''BIFF function index''' 200. The BIFF function index is the counterpart of Calc's function op-code used to identify the function in the binary file format. Later on this page, handling of add-in functions (which do not have a specific function index) will be described too.
 +
 
 +
First, we have to add a new line in the function table for BIFF3, called <tt>saFuncTable_3</tt>. By convention, the tables are sorted by BIFF function index (second column in the tables), here we should use the appropriate position for our index 200.
 +
 
 +
; Op-code
 +
The first entry in the function description is Calc's op-code of the function, which is <tt>ocMyFunc</tt> here:
 +
<source lang="cpp">
 +
    { ocMyFunc,
 +
</source>
 +
 
 +
; BIFF function index
 +
The second entry in the function description is the BIFF function index:
 +
<source lang="cpp">
 +
    { ocMyFunc, 200,
 +
</source>
 +
 
 +
; Minimum parameter count
 +
The third entry in the function description is the minimum parameter count of the function. '''MYFUNC''' expects at least one parameter:
 +
<source lang="cpp">
 +
    { ocMyFunc, 200, 1,
 +
</source>
 +
 
 +
; Maximum parameter count
 +
The fourth entry in the function description is the maximum parameter count of the function. '''MYFUNC''' expects at most two parameters:
 +
<source lang="cpp">
 +
    { ocMyFunc, 200, 1, 2,
 +
</source>
 +
 
 +
; Return type
 +
The fifth entry in the function description is the type of the value the function returns. Usually, a function returns a single value (scalar), which might be a number or a string. But a few functions return an array of values (e.g. the function '''MTRANS''' that transposes a matrix), or a cell range address (e.g. the function '''INDIRECT''' that converts a string to a range address). Allowed values for this entry are <tt>V</tt> for scalar values, <tt>A</tt> for arrays, and <tt>R</tt> for range addresses (references). '''MYFUNC''' returns a scalar value, thus we have to add the <tt>V</tt> type:
 +
<source lang="cpp">
 +
    { ocMyFunc, 200, 1, 2, V,
 +
</source>
 +
 
 +
; Parameter types
 +
The sixth entry in the function description is a ''C array'' describing the type of all function parameters. This is the most complicated part of the function description, as sometimes the parameter type is not obvious. Simple functions taking scalar values usually will have parameters of type <tt>VR</tt> which is short for "value/repeated". See the inline documentation of the structure <tt>XclFuncParamInfo</tt> in the file <tt>sc/source/filter/inc/xlformula.hxx</tt> for more details. Our function takes two scalar value parameter:
 +
<source lang="cpp">
 +
    { ocMyFunc, 200, 1, 2, V, { VR, VR },
 +
</source>
 +
 
 +
; Additional flags
 +
The seventh entry in the function description contains additional flags that control the behaviour of the formula filters. Later on, these flags will be described. Our function does not need any special behaviour:
 +
<source lang="cpp">
 +
    { ocMyFunc, 200, 1, 2, V, { VR, VR }, 0,
 +
</source>
 +
 
 +
; Alternative name
 +
The eighth entry in the function description is a ''C string'' specifying an alternative function name to be used in BIFF for add-in functions. If not used, it can be set to null:
 +
<source lang="cpp">
 +
    { ocMyFunc, 200, 1, 2, V, { VR, VR }, 0, 0 },
 +
</source>
 +
 
 +
That's it. The function will now be imported correctly from BIFF3-BIFF8 files and exported to BIFF5 and BIFF8 files. As an optimization or simplification, it is allowed to leave out the parameter type of trailing parameters, if they are equal to the type of their predecessor. In our case, we can write the parameter list as <tt>{ VR }</tt>:
 +
<source lang="cpp">
 +
    { ocMyFunc, 200, 1, 2, V, { VR }, 0, 0 },
 +
</source>
 +
 
 +
This is especially useful for functions that take more arguments, e.g. the function '''REPLACE''' that takes 4 parameters of type <tt>VR</tt>:
 +
<source lang="cpp">
 +
    { ocReplace, 119, 4, 4, V, { VR }, 0, 0 },
 +
</source>
 +
 
 +
==== Advanced - function tables and filter source code extensions ====
 +
 
 +
There are a number of special cases to be aware of. The following section describes them in detail and gives examples of related existing functions. If they do not apply to your new function, just go ahead to the next chapter ''New file formats (OOXML and BIFF12)''.
 +
 
 +
; Maximum parameter count
 +
A few functions allow to pass the maximum number of supported parameters, which is 30 in the BIFF2-BIFF8 file formats. There is a placeholder <tt>MX</tt> that can be used for the ''maximum parameter count'' entry, e.g. for the '''SUM''' or '''CHOOSE''' function. Note the short parameter type lists. The entry <tt>{ RX }</tt> means that all 30 parameters are of type <tt>RX</tt>, the entry <tt>{ VO, RO }</tt> means that the first parameter is of type <tt>VO</tt>, and that ''all'' following parameters starting from second are of type <tt>RO</tt>. The <tt>MX</tt> placeholder becomes more important in the OOXML function tables (described below) where the maximum parameter count is not fixed to 30 anymore, but changes to 255 in OOXML and BIFF12 files.
 +
<source lang="cpp">
 +
    { ocSum,    4, 0, MX, V, { RX },    0, 0 },
 +
    { ocChose, 100, 2, MX, R, { VO, RO }, 0, 0 },
 +
</source>
 +
 
 +
; Functions without parameters
 +
A few functions do not expect any parameters, e.g. the functions '''TRUE''', '''FALSE''', and '''PI'''. In this case, the parameter type list can be left empty.
 +
<source lang="cpp">
 +
    { ocPi, 19, 0, 0, V, {}, 0, 0 },
 +
</source>
 +
 
 +
; Volatile functions
 +
If the result of a function may change on every evaluation of the function, the function has to be marked as ''volatile''. For example, the functions '''NOW''', '''TODAY''', and '''RANDOM''' are volatile. To mark a volatile function, the flag <tt>EXC_FUNCFLAG_VOLATILE</tt> has to be passed in the seventh entry.
 +
<source lang="cpp">
 +
    { ocRandom, 63, 0, 0, V, {}, EXC_FUNCFLAG_VOLATILE, 0 },
 +
</source>
 +
 
 +
; Restrict to import
 +
Sometimes it is needed to restrict a function description to the import filter (hide it from the export filter). For that, the flag <tt>EXC_FUNCFLAG_IMPORTONLY</tt> can be used. For example, in Excel there exist the functions '''DOLLAR''' (BIFF index 13) and '''USDOLLAR''' (BIFF index 204). These functions behave equally (as far as we know) and are both mapped to the ODF function '''DOLLAR''' with the op-code <tt>ocCurrency</tt>. To prevent confusion in the export filter, the entry for the function '''DOLLAR''' is not marked (visible in import and export filters), but the entry for the function '''USDOLLAR''' is marked as import-only. This way, the export filter will always write the op-code <tt>ocCurrency</tt> as BIFF function '''DOLLAR''' with function index 13.
 +
<source lang="cpp">
 +
    { ocCurrency,  13, 1, 2, V, { VR }, 0, 0 },
 +
    { ocCurrency, 204, 1, 2, V, { VR }, EXC_FUNCFLAG_IMPORTONLY, 0 },
 +
</source>
 +
 
 +
; Restrict to export
 +
It is also possible to restrict a function description to the export filter (hide it from the import filter) by using the flag <tt>EXC_FUNCFLAG_EXPORTONLY</tt>. Examples will be discussed below.
 +
<source lang="cpp">
 +
    { ocExternal, 255, 1, MX, R, { RO_E, RO }, EXC_FUNCFLAG_EXPORTONLY, 0 },
 +
</source>
 +
 
 +
; Calc-only function parameters
 +
A few functions accept more parameters in Calc than in Excel. For example, the functions '''FLOOR''' and '''CEILING''' (introduced in BIFF4) are restricted to two parameters in Excel, but Calc supports a third parameters. These parameters have to be marked with the parameter type <tt>C</tt>. The import filter may react on this parameter type and add a default value for the parameter (e.g. for '''FLOOR''' and '''CEILING''' it is needed to add the value 1 to get the same behaviour as in Excel, see <tt>ExcelToSc::DoMulArgs()</tt> in <tt>sc/source/filter/excel/excform.cxx</tt>). The export filter will skip the parameter instead of producing an error for the entire function (e.g. the formula <tt>=FLOOR(1;1;1)</tt> will be exported as <tt>=FLOOR(1;1)</tt>).
 +
<source lang="cpp">
 +
    { ocFloor, 285, 2, 2, V, { VR, VR, C }, 0, 0 },
 +
    { ocCeil,  288, 2, 2, V, { VR, VR, C }, 0, 0 },
 +
</source>
 +
 
 +
; Changed parameter count in later BIFF version
 +
If a BIFF version extends a function of a former BIFF version with new parameters, the respective function table needs to repeat the function description with the changed settings. For example, the function '''WEEKDAY''' (op-code <tt>ocGetDayOfWeek</tt>) has been introduced in BIFF2 with only one parameter. In BIFF5, a second optional parameter has been added, which is supported by Calc too. In consequence, the BIFF2 function table contains a function description for this function with a maximum parameter count of 1, which will be used in the following BIFF versions too. The second parameter is marked as ''Calc-only''. The BIFF5 function table will contain a new function description with the new maximum parameter count of 2.
 +
<source lang="cpp">
 +
    // BIFF2 function table:
 +
    { ocGetDayOfWeek, 70, 1, 1, V, { VR, C }, 0, 0 },
 +
 
 +
    // BIFF5 function table, second optional parameter with type 'VR' added
 +
    { ocGetDayOfWeek, 70, 1, 2, V, { VR }, 0, 0 },
 +
</source>
 +
 
 +
; Excel-only function parameters
 +
A few functions accept more parameters in Excel than in Calc. For example, the function '''PERCENTRANK''' accepts an optional third parameter in Excel, and the function '''EXTERN.CALL''' used internally by Excel for add-in function calls needs a hidden leading parameter containing the actual function name (more details below). These parameters have to be marked by appending <tt>_E</tt> to the actual parameter type, e.g. <tt>VR_E</tt>. On import, these parameters will be ignored, or (in case of the function '''EXTERN.CALL''') evaluated internally. On export, the filter has to insert an appropriate value for the parameter if required, see <tt>XclExpFmlaCompImpl::AppendDefaultParam()</tt> in <tt>sc/source/filter/excel/xeformula.cxx</tt>.
 +
<source lang="cpp">
 +
    { ocPercentrank, 329, 2,  3, V, { RX, VR, VR_E }, 0, 0 },
 +
    { ocExternal,    255, 1, MX, R, { RO_E, RO }, EXC_FUNCFLAG_EXPORTONLY, 0 },
 +
</source>
 +
 
 +
; Simulate Calc-only functions
 +
A few functions are not supported by Excel, but the export filter may simulate them easily by using a similar functions. For example, the cotangent functions '''COT''' (cotangent), '''ACOT''' (arcos cotangent), '''COTH''' (cotangent hyperbolicus), and '''ACOTH''' (arcus cotangent hyperbolicus) do not exist in Excel, but <tt>COT(x)</tt> can be written as <tt>1/TAN(x)</tt>, <tt>ACOT(x)</tt> can be written as <tt>PI/2-ATAN(x)</tt>, <tt>COTH(x)</tt> can be written as <tt>1/TANH(x)</tt>, and <tt>ACOTH(x)</tt> can be written as <tt>ATANH(1/x)</tt>. To do that, some more work has to be done in the export filter in addition to the function description. In the following, handling of the functions '''COT''' and '''ACOTH''' will be discussed. First, we add function descriptions to the function table, containing the BIFF funtion index of the function used to simulate our new function ('''TAN''' with BIFF index 17 for '''COT''', and '''ATANH''' with BIFF index 234 for '''ACOTH'''). The function will be marked as ''export-only'', so that the import filter will ignore these function descriptions:
 +
<source lang="cpp">
 +
    { ocTan,        17, 1, 1, V, { VR }, 0, 0 },
 +
    { ocCot,        17, 1, 1, V, { VR }, EXC_FUNCFLAG_EXPORTONLY, 0 },
 +
    { ocArcTanHyp, 234, 1, 1,  V, { VR }, 0, 0 },
 +
    { ocArcCotHyp, 234, 1, 1,  V, { VR }, EXC_FUNCFLAG_EXPORTONLY, 0 },
 +
</source>
 +
Next, the export filter has to be extended to add the new tokens needed to simulate our function. To be sure to not change the meaning of the entire formula, parentheses have to be inserted appropriately.
 +
* The term <tt>COT(x)</tt> has to be replaced with <tt>(1/TAN(x))</tt>, otherwise e.g. the formula <tt>=2^COT(x)</tt> will not be evaluated correctly as <tt>=2^(1/TAN(x))</tt>, but as <tt>=(2^1)/TAN(x)</tt>.
 +
* The term <tt>ACOTH(x)</tt> has to be replaced with <tt>ATANH(1/(x))</tt>, otherwise e.g. the formula <tt>=ACOTH(1+1)</tt> will not be evaluated correctly as <tt>=ATANH(1/(1+1))</tt>, but as <tt>=ATANH((1/1)+1)</tt>.
 +
 
 +
Formulas are stored in ''reverse polish notation'' (RPN). The term <tt>(1/TAN(x))</tt> will be stored as token sequence "<tt>1,x,TAN,DIV,()</tt>" where <tt>x</tt> can be an arbitrarily complex expression. The term <tt>ATANH(1/(x))</tt> will be stored as "<tt>1,x,(),DIV,ATANH</tt>" respectively. Currently, the export filter will already find the new function descriptions but would just write the specified BIFF function index (e.g. <tt>TAN(x)</tt> for <tt>COT(x)</tt>). The new tokens have to be added manually in the export filter code. All this is done in the file <tt>sc/source/filter/excel/xeformula.cxx</tt>.
 +
* '''COT''': In the RPN array "<tt>1,x,TAN,DIV,()</tt>" (representing the function <tt>COT(x)</tt>), the token <tt>1</tt> has to be inserted in front of the entire function (consisting of the tokens "<tt>x,TAN</tt>"). To do this, a <tt>case</tt> will be inserted in the method <tt>XclExpFmlaCompImpl::PrepareFunction()</tt>:
 +
<source lang="cpp">
 +
void XclExpFmlaCompImpl::PrepareFunction( XclExpFuncData& rFuncData )
 +
{
 +
    switch( rFuncData.GetOpCode() )
 +
    {
 +
// ...
 +
        case ocCot:
 +
            AppendIntToken( 1 );
 +
        break;
 +
// ...
 +
    }
 +
}
 +
</source>
 +
* The division operator and the parentheses have to be appended to the entire function. This is done in the method <tt>XclExpFmlaCompImpl::FinishFunction()</tt>. The boolean value <tt>true</tt> in the call of <tt>AppendBinaryOperatorToken()</tt> specifies that the operator works on scalar values (in difference to e.g. the range operator ":").
 +
<source lang="cpp">
 +
void XclExpFmlaCompImpl::FinishFunction( XclExpFuncData& rFuncData, sal_uInt8 nCloseSpaces )
 +
{
 +
// ...
 +
        switch( rFuncData.GetOpCode() )
 +
        {
 +
// ...
 +
            case ocCot:
 +
                AppendBinaryOperatorToken( EXC_TOKID_DIV, true );
 +
                AppendParenToken();
 +
            break;
 +
// ...
 +
        }
 +
// ...
 +
}
 +
</source>
 +
* '''ACOTH''': In the token array "<tt>1,x,(),DIV,ATANH</tt>" (representing the function <tt>ACOTH(x)</tt>), the token <tt>1</tt> has to be inserted in front of the function parameter <tt>x</tt> (but not in front of the entire function). This is done in the method <tt>XclExpFmlaCompImpl::PrepareParam()</tt>. The variable <tt>nParamIdx</tt> contains the zero-based index of the current parameter:
 +
<source lang="cpp">
 +
void XclExpFmlaCompImpl::PrepareParam( XclExpFuncData& rFuncData )
 +
{
 +
    sal_uInt8 nParamIdx = rFuncData.GetParamCount();
 +
    switch( rFuncData.GetOpCode() )
 +
    {
 +
// ...
 +
        case ocArcCotHyp:
 +
            if( nParamIdx == 0 )
 +
                AppendIntToken( 1 );
 +
        break;
 +
// ...
 +
    }
 +
}
 +
</source>
 +
* The parentheses and the division operator have to be appended to the function parameter <tt>x</tt>. This is done in the method <tt>XclExpFmlaCompImpl::FinishParam()</tt>. Again, the variable <tt>nParamIdx</tt> contains the zero-based index of the current parameter:
 +
<source lang="cpp">
 +
void XclExpFmlaCompImpl::FinishParam( XclExpFuncData& rFuncData )
 +
{
 +
//...
 +
    sal_uInt8 nParamIdx = rFuncData.GetParamCount() - 1;
 +
    switch( rFuncData.GetOpCode() )
 +
    {
 +
// ...
 +
        case ocArcCotHyp:
 +
            if( nParamIdx == 0 )
 +
            {
 +
                AppendParenToken();
 +
                AppendBinaryOperatorToken( EXC_TOKID_DIV, true );
 +
            }
 +
        break;
 +
// ...
 +
    }
 +
}
 +
</source>
 +
 
 +
; Functions stored as call to an internal defined name
 +
Functions appearing as built-in function in Excel might in fact be stored as a call to an ''internal defined name''. For example, the function '''BAHTTEXT''' is not stored with a BIFF function index, but internally as <tt>EXTERN.CALL("_xlfn.BAHTTEXT",parameter)</tt> where <tt>"_xlfn.BAHTTEXT"</tt> is a hidden parameter and contains a reference to the internal ''defined name'' (also known as ''named range'') named "_xlfn.BAHTTEXT". This defined name is flagged to be a function call instead of a regular named range. If a formula calls a VBA macro, the function call is stored the same way using a dedicated defined name containing the name of the VBA macro. To add support for the '''BAHTTEXT''' function, the following function description can be used. It has to refer to the function '''EXTERN.CALL''' with the BIFF index 255, and has to specify the parameter type for the first hidden parameter explicitly (type <tt>RO_E</tt>). Note that ''all'' parameters of the function '''EXTERN.CALL''' have to be of type <tt>RO</tt>. The preprocessor macro <tt>EXC_FUNCNAME</tt> adds the "_xlfn." prefix to the function name:
 +
<source lang="cpp">
 +
    { ocBahtText, 255, 2, 2, V, { RO_E, RO }, 0, EXC_FUNCNAME( "BAHTTEXT" ) },
 +
</source>
 +
Actually, the function '''BAHTTEXT''' can also be stored as built-in function with the BIFF function index 368. To be able to handle this case too, the function descriptions can be changed to:
 +
<source lang="cpp">
 +
    { ocBahtText, 368, 1, 1, V, { VR },      EXC_FUNCFLAG_IMPORTONLY, EXC_FUNCNAME( "BAHTTEXT" ) },
 +
    { ocBahtText, 255, 2, 2, V, { RO_E, RO }, EXC_FUNCFLAG_EXPORTONLY, EXC_FUNCNAME( "BAHTTEXT" ) },
 +
</source>
 +
The description for the export filter still has to refer to the function '''EXTERN.CALL'''. When loading a file with the '''BAHTTEXT''' function, internally the import filter will encounter the function '''EXTERN.CALL''' and will find the (already existing) function description
 +
<source lang="cpp">
 +
    { ocExternal, 255, 1, MX, R, { RO_E, RO }, EXC_FUNCFLAG_IMPORTONLY, 0 }
 +
</source>
 +
After reading the first hidden parameter of the function, the filter will find the function name "_xlfn.BAHTTEXT", will search the function table for a function description containing this name, will find our new description mentioned above, and will finally use the op-code <tt>ocBahttext</tt>.
 +
 
 +
; Functions from external add-ins
 +
In Excel, it is possible to call functions that are part of an external add-in. These add-ins are located in a special directory of the Excel installation, called the ''library directory''. If such a function is built-in in Calc, some conversion has to be done. Currently, the only existing example is the function '''EUROCONVERT''' added to Excel with the add-in '''EUROTOOL''' (add-in library file <tt>eurotool.xla</tt>). All add-in functions (also the functions from the ''Analysis'' add-in) are stored using the internal function '''EXTERN.CALL''' (see previous section) which will contain a description of the add-in name and function name in its first hidden parameter. Thus, the formula <tt>=EUROCONVERT(1;"DEM";"EUR")</tt> will appear as <tt>=EXTERN.CALL("EUROTOOL.XLA!EUROCONVERT";1;"DEM";"EUR")</tt> in the Excel file. First, we need a function description in the BIFF8 function table, using the function '''EXTERN.CALL''' again. Again, all parameters of this function have to be of type <tt>RO</tt>:
 +
<source lang="cpp">
 +
    { ocEuroConvert, 255, 4, 6, V, { RO_E, RO }, EXC_FUNCFLAG_EXPORTONLY, "EUROCONVERT" }
 +
</source>
 +
The first hidden parameter is a reference to an ''external name'' being part of the description of the external workbook <tt>EUROTOOL.XLA</tt>. A detailed description would go beyond the scope here. To get an idea, grep the files <tt>sc/source/filter/excel/xilink.cxx</tt>, <tt>sc/source/filter/excel/excform8.cxx</tt>, and <tt>sc/source/filter/excel/xelink.cxx</tt> for <tt>xlExtEuroConvert</tt> and <tt>EXC_SBTYPE_EUROTOOL</tt>.
 +
 
 +
; Functions not supported by Excel
 +
The export filter can write functions to the Excel file format even if Excel does not know them. The import filter will restore these functions when loading the file. The function table <tt>saFuncTable_Odf</tt> contains all functions that will be handled this way. For example, the function '''BASE''' will be inserted as following:
 +
<source lang="cpp">
 +
    EXC_FUNCENTRY_ODF( ocBase, 2, 3, 0, "BASE" ),
 +
</source>
 +
The preprocessor macro <tt>EXC_FUNCENTRY_ODF</tt> converts this entry to the appropriate function descriptions (see the definition of the macro above the table). The first entry is the Calc op-code, the second entry is the minimum parameter count, the third entry is the maximum parameter count, the fourth entry will be used for additional flags, and the last entry is the ODF function name.
 +
 
 +
=== New file formats (OOXML and BIFF12) ===
 +
 
 +
==== Overview ====
 +
 
 +
As if that wasn't enough, nearly the same changes have to be done in the <tt>oox</tt> code module implementing filters for the [[WikiPedia:Office_Open_XML|Microsoft Office Open XML (OOXML)]] file format and the binary BIFF12 format. As already mentioned, in the future this filter will handle import and export of all Excel file formats including the old BIFF2-BIFF8 formats. Therefore, the function tables have been prepared to contain all information needed to import and export BIFF2-BIFF12 and OOXML files. The following files contain the source code for formula handling:
 +
* Source code used by the import and export filters is located in the files <tt>oox/inc/oox/xls/formulabase.hxx</tt> and <tt>oox/source/xls/formulabase.cxx</tt>. The cxx file contains all function tables.
 +
* The source code for formula import is located in the files <tt>oox/inc/oox/xls/formulaparser.hxx</tt> and <tt>oox/source/xls/formulaparser.cxx</tt>.
 +
* Currently, there is no source code for formula export.
 +
 
 +
==== Function tables ====
 +
 
 +
The structure of the function tables is similar to the structure of the old function tables in module <tt>sc</tt> described above. The main difference is, that the filter implementation is based completely on the OpenOffice.org API and therefore does not have access to Calc internals such as the function op-codes. Following a short overview of the entries in a function description. The function descriptions are structures of type <tt>::oox::xls::FunctionData</tt> defined locally above the tables. But first, an example description for the function '''SUM''':
 +
<source lang="cpp">
 +
    { "SUM", "SUM", 4, 4, 0, MX, V, { RX }, 0 },
 +
</source>
 +
 
 +
; ODF function name
 +
The first entry is the function name as defined in OpenFormula. May be null, if not available (functions available in Excel only).
 +
 
 +
; OOXML function name
 +
The second entry is the function name as it appears in the OOXML file format. May be null, if not available (functions available in Calc only).
 +
 
 +
; BIFF12 function index
 +
The third entry is the function index used in BIFF12 files (file extension ".xlsb").
 +
 
 +
; BIFF2-BIFF8 function index
 +
The fourth entry is the function index used in BIFF2-BIFF8 files (file extension ".xls").
 +
 
 +
; Minimum parameter count
 +
The fifth entry is the minimum number of parameters required by Excel.
 +
 
 +
; Maximum parameter count
 +
The sixth entry is the maximum number of parameters allowed in Excel. The constant <tt>MX</tt> can be used to specify the maximum number allowed by the current file format, which is 30 in BIFF2-BIFF8 and 255 in BIFF12 and OOXML. The filters will find and use the correct maximum value internally.
 +
 
 +
; Return type
 +
The seventh entry is the type of the return value. As before, can be <tt>V</tt> for scalar values, <tt>A</tt> for arrays of values, or <tt>R</tt> for cell range addresses (references).
 +
 
 +
; Parameter types
 +
The eighth entry is a ''C array'' describing the type of all function parameters. This entry is equal to the arrays from the old BIFF filters described above.
 +
 
 +
; Additional flags
 +
The ninth and last entry contains additional flags controlling the behaviour of the filters. The three flags known from the old BIFF filters are available as well as some new flags:
 +
* <tt>FUNCFLAG_VOLATILE</tt>: The function result is volatile.
 +
* <tt>FUNCFLAG_IMPORTONLY</tt>: The function description is available for the import filters only.
 +
* <tt>FUNCFLAG_EXPORTONLY</tt>: The function description is available for the export filters only.
 +
* <tt>FUNCFLAG_MACROCALL</tt>: The BIFF filters will handle the function as call to a ''defined name'' (see previous chapter for details; the OOXML name will be prefixed with "_xlfn."). Has no effect on the OOXML filter.
 +
* <tt>FUNCFLAG_MACROCALLODF</tt>: Used to mark functions not available in Excel but in Calc only, to be able to preserve them in a roundtrip scenario.
 +
* <tt>FUNCFLAG_EXTERNAL</tt>: Used to mark functions that are stored externally in Calc (with op-code <tt>ocExternal</tt>). Currently, all functions from the Analysis add-in are marked with this flag.
 +
* <tt>FUNCFLAG_MACROFUNC</tt>: The function is a macrosheet function. This kind of functions is not supported in Calc, and is available in Excel macrosheets only (not in regular worksheets).
 +
* <tt>FUNCFLAG_MACROCMD</tt>: The function is a macrosheet command. This kind of functions is not supported in Calc, and is available in Excel macrosheets only (not in regular worksheets).
 +
* <tt>FUNCFLAG_ALWAYSVAR</tt>: The function is always handled as function with variable number of parameters, even if minimum and maximum number of parameters are equal.
 +
* <tt>FUNCFLAG_PARAMPAIRS</tt>: The function repeats the last two parameter types instead of only the last type for additional parameters. This is used e.g. for the function '''COUNTIFS''' (added in OOXML/BIFF12) that supports up to 255 parameters and always expects pairs of parameter with types <tt>RO, VR</tt>.
 +
 
 +
==== Advanced ====
 +
 
 +
; Calc-only function parameters
 +
Parameters marked with the <tt>C</tt> parameter type are only available in Calc (e.g. the third parameter of the functions '''FLOOR''' and '''CEILING'''). Everytime the import filter encounters such a parameter, it calls the method <tt>FormulaFinalizer::appendCalcOnlyParameter()</tt> in <tt>oox/source/xls/formulaparser.cxx</tt> allowing to set a default value for the parameter.
 +
 
 +
; Add values for optional parameters
 +
After a function has been imported completely, the method <tt>FormulaFinalizer::appendRequiredParameters()</tt> will be called allowing to extend the parameter list. This is useful for functions with parameters optional in Excel but required in Calc, e.g. the second parameter of the function '''WEEKNUM'''.
 +
 
 +
; Add values for empty parameters
 +
If the import filter encounters an empty parameter, it calls the method <tt>FormulaFinalizer::appendEmptyParameter()</tt>. The implementation may add a value in case Calc does not support an empty parameter here. Currently, this is used for the function '''IF'''. For example, the import filter replaces <tt>=IF(cond;)</tt> with <tt>=IF(cond;0)</tt> and <tt>=IF(cond;;)</tt> with <tt>=IF(cond;0;0)</tt>.
 +
 
 +
; Functions stored as call to an internal defined name
 +
The function flag <tt>FUNCFLAG_MACROCALL</tt> is used to mark functions that are stored as call to an internal ''defined name'' (see previous chapter for details). It is still possible to specify an alternative BIFF function index.
 +
<source lang="cpp">
 +
    { "COM.MICROSOFT.BAHTTEXT", "BAHTTEXT", 368, 368, 1, 1, V, { VR }, FUNCFLAG_MACROCALL },
 +
</source>
 +
 
 +
; Functions from the Analysis add-in
 +
In the old BIFF2-BIFF8 filters, these functions are handled internally and do not need descriptions in the function tables at all. In BIFF12 and OOXML, these functions are stored as internal functions (there is no Analysis add-in anymore in Excel 2007 and later), and therefore get their own BIFF12 function index. Following an example for the description of the Analysis add-in function '''COMPLEX'''. Note the constant <tt>NOID</tt> used for the BIFF2-BIFF8 function index to specify that these formats do not support a BIFF index here. The import filter will find the syntax using the function '''EXTERN.CALL''', and will resolve the function by the specified OOXML function name.
 +
<source lang="cpp">
 +
    { "COMPLEX", "COMPLEX", 411, NOID, 2, 3, V, { RR }, FUNCFLAG_EXTERNAL },
 +
</source>
 +
 
 +
; Functions from external add-ins
 +
It is possible to call functions that are part of an external add-in. These add-ins are located in a special directory of the Excel installation, called the ''library directory''. The following function description is for the function '''EUROCONVERT''' from the external add-in '''EUROTOOL''' which is located either in the file <tt>eurotool.xla</tt> (Excel 2003 and earlier) or in the file <tt>eurotool.xlam</tt> (Excel 2007 and later). Every add-in has a corresponding value in the enumeration <tt>FunctionLibraryType</tt> defined in <tt>oox/inc/oox/xls/formulabase.hxx</tt>. The macro <tt>FUNCLIB_TO_FUNCFLAGS</tt> converts this value to the appropriate function flags:
 +
<source lang="cpp">
 +
    { "EUROCONVERT", "EUROCONVERT", NOID, NOID, 3, 5, V, { VR }, FUNCLIB_TO_FUNCFLAGS( FUNCLIB_EUROTOOL ) },
 +
</source>
 +
The function library flag triggers internal code that resolves a reference to the add-in file name. Adding a new add-in is quite easy compared to the old BIFF filters. Assuming the new add-in is called '''MYADDIN''' (file name <tt>myaddin.xla</tt> or <tt>myaddin.xlam</tt>) and contains the function '''MYADDINFUNC'''.
 +
* In <tt>oox/inc/oox/xls/formulabase.hxx</tt>, add a new value to the enumeration <tt>FunctionLibraryType</tt>:
 +
<source lang="cpp">
 +
enum FunctionLibraryType
 +
{
 +
    FUNCLIB_UNKNOWN,
 +
    FUNCLIB_EUROTOOL,
 +
    FUNCLIB_MYADDIN          // <== new enum value
 +
};
 +
</source>
 +
* In <tt>oox/source/xls/formulabase.cxx</tt>, add an entry in the method <tt>FunctionProvider::getFuncLibTypeFromLibraryName()</tt> that converts the passed library file name to a value of the enumeration <tt>FunctionLibraryType</tt>:
 +
<source lang="cpp">
 +
FunctionLibraryType FunctionProvider::getFuncLibTypeFromLibraryName( const OUString& rLibraryName ) const
 +
{
 +
// ...
 +
    if( OOX_XLS_IS_LIBNAME( rLibraryName, "EUROTOOL" ) )
 +
        return FUNCLIB_EUROTOOL;
 +
    if( OOX_XLS_IS_LIBNAME( rLibraryName, "MYADDIN" ) )    // <== new entry
 +
        return FUNCLIB_MYADDIN;                            // <==
 +
// ...
 +
}
 +
</source>
 +
* Add a new function description for each function in the add-in. We have only one function:
 +
<source lang="cpp">
 +
    { "MYADDINFUNC", "MYADDINFUNC", NOID, NOID, 2, 4, V, { VR }, FUNCLIB_TO_FUNCFLAGS( FUNCLIB_MYADDIN ) },
 +
</source>
 +
That's all.
 +
 
 +
; Functions not supported by Excel
 +
The function table <tt>saFuncTableOdf</tt> contains all functions that are not supported by Excel. For example, the function '''BASE''' will be inserted as following:
 +
<source lang=cpp>
 +
    { "BASE", 0, NOID, NOID, 2, 3, V, { VR }, FUNCFLAG_MACROCALLODF },
 +
</source>

Revision as of 12:34, 28 December 2010


Procedure to add a new Calc spreadsheet function.


Introduction

Let's assume you want to implement a new spreadsheet function and the function was defined by the OASIS OpenDocument Format Formula subcommittee, see latest revision of the specification (draft). Let's further assume the function's name is MYFUNC and will take 2 parameters, of which the second parameter is optional and defaulted to 0, and returns a number, as following:

Syntax: MYFUNC( Number Param1 [ ; Number Param2 = 0 ] )
Returns: Number


Make the formula compiler know the function

formula/inc/formula/compiler.hrc

sc/inc/compiler.hrc up to and including version DEV300_m38

These are the defines used by the resources for function names and the Function Wizard, and the numerical values of OpCode for the formula compiler and interpreter. Note that once defined the names must not be changed because they are used by the localization tools as identifiers.

Add a new define, in this case for 2 parameters append it to the section for functions with more than 1 parameter near the end of the file. Name the define SC_OPCODE_MYFUNC and insert it right before the define of SC_OPCODE_STOP_2_PAR, give it the value SC_OPCODE_STOP_2_PAR had, and increment the values of SC_OPCODE_STOP_2_PAR and SC_OPCODE_LAST_OPCODE_ID. If before the section looked like

#define SC_OPCODE_STOP_2_PAR        393
 
#define SC_OPCODE_LAST_OPCODE_ID    392      /* last OpCode */

it should then be

#define SC_OPCODE_MYFUNC            393
#define SC_OPCODE_STOP_2_PAR        394
 
#define SC_OPCODE_LAST_OPCODE_ID    393      /* last OpCode */

formula/inc/formula/opcode.hxx

sc/inc/opcode.hxx up to and including version DEV300_m38

Here the OpCodeEnum values are defined. Note that in a non-product build (--enable-dbgutil during configure) there is a typedef OpCodeEnum OpCode; to show enum names in the debugger, while in a product build it is typedef OpCodeEnum USHORT; to save some memory, since compilers tend to produce an int for an enum.

Find a "right" place for the new enum. Although the way the definitions are setup the placement doesn't matter, there are sections with different topics, such as String functions and Statistical functions. Maybe the correct place for MYFUNC would be under miscellaneous. Best practice is to add a new OpCode to the end of such section. Name the OpCode ocMyFunc and add the line

        ocMyFunc            = SC_OPCODE_MYFUNC,

formula/source/core/resource/core_resource.src

sc/source/core/src/compiler.src up to and including version DEV300_m38

These are the resources for function names. There are 3 resource bundles:

RID_SC_FUNCTION_NAMES 
English UI display names. These get localized for the UI of other languages.
RID_SC_FUNCTION_NAMES_ENGLISH 
These English names are used internally to store/load ODF v1.0/v1.1 and for API XFunctionAccess. Usually the name is identical to that in RID_SC_FUNCTION_NAMES. Once defined and "in the wild", the name must not be changed.
RID_SC_FUNCTION_NAMES_ENGLISH_ODFF 
These English names are used internally to store/load ODFF aka OpenFormula as of ODF v1.2. Once defined, the name must not be changed.

The new function name must be defined for all 3 resource bundles.

To the end of
Resource RID_SC_FUNCTION_NAMES
add

    String SC_OPCODE_MYFUNC
    {
        Text [ en-US ] = "MYFUNC" ;
    };

The [ en-US ] field tells the localization tools that the name may be localized.


To the end of
Resource RID_SC_FUNCTION_NAMES_ENGLISH
add

    String SC_OPCODE_MYFUNC { Text = "MYFUNC" ; };

The absence of the [ en-US ] field tells the localization tools that the name must not be localized.


To the end of
Resource RID_SC_FUNCTION_NAMES_ENGLISH_ODFF
add

    String SC_OPCODE_MYFUNC { Text = "MYFUNC" ; };

Again, the absence of the [ en-US ] field tells the localization tools that the name must not be localized.

The compiler knows the function

After having added the necessary entries to formula/inc/formula/compiler.hrc, formula/inc/formula/opcode.hxx, and formula/source/core/resource/core_resource.src (sc/inc/compiler.hrc, sc/inc/opcode.hxx, and sc/source/core/src/compiler.src respectively), the formula compiler now knows the new function name and is able to compile an expression where it is used, and it can be stored in and loaded from a document. Of course nothing else works, the interpreter doesn't know how to handle it and will generate an error if encountered. The function and its parameters will not appear in the Function Wizard.


Publish the function to the Function Wizard

sc/inc/scfuncs.hrc

Till DEV300m90

Function groups (categories) and HelpIDs for functions are defined here. Lookup the group where functions of ID_FUNCTION_GRP_... matching the new function's category are defined and append an entry, incrementing the offset of the last entry by one. For our function that could be

#define HID_FUNC_MYFUNC             (HID_SC_FUNC_DUMMY+(ID_FUNCTION_GRP_MATH*ID_FUNCTION_OFFSET)+56)

sc/inc/helpids.h

Since DEV300m91.

With Issue 111874 (not issue 111784 as in comment in CWS changehid) the helpid system has been changed from numerical to byte string. The help ids are no longer defined in file scfuncs.hrc but in file helpids.h

For our function that could be

#define HID_FUNC_MYFUNC             "SC_HID_FUNC_MYFUNC"

Notice the prefix SC_. Use a position that is similar to the position of the new function in the other files like compiler.hrc or core_resource.src.

sc/source/ui/src/scfuncs.src

This large resource file contains all elements necessary to display functions in the Function Wizard. It defines the function's short description, the number of parameters, whether they are optional, and the description of each parameter. For a detailed description of fields see the comment on top of the file.

Add the new function to the end of one of the two resource blocks RID_SC_FUNCTION_DESCRIPTIONS1 or RID_SC_FUNCTION_DESCRIPTIONS2. Which one doesn't really matter, but functions should be more or less equally distributed over the two blocks. There are two blocks because one resource block couldn't have more than 64k data, just another legacy from Win16 times.. looked up again right now (2008-08-23) in the meantime this restriction seems to have been obsoleted, which would have to be verified by a test build using one resource block only though.

    Resource SC_OPCODE_MYFUNC
    {
        String 1 // Description
        {
            Text [ en-US ] = "Calculates foo, optionally using bar." ;
        };
        ExtraData =
        {
            0;
            ID_FUNCTION_GRP_MATH;
            U2S( HID_FUNC_MYFUNC );
            2;  0;  1;
            0;
        };
        String 2 // Name of Parameter 1
        {
            Text [ en-US ] = "Number" ;
        };
        String 3 // Description of Parameter 1
        {
            Text [ en-US ] = "A value for which foo is to be calculated." ;
        };
        String 4 // Name of Parameter 2
        {
            Text [ en-US ] = "Number" ;
        };
        String 5 // Description of Parameter 2
        {
            Text [ en-US ] = "The bar value." ;
        };
    };

Some comments on the ExtraData block, for more details see scfuncs.src:

ExtraData =
{
    0;                      // The function is not suppressed and available in UI.
    ID_FUNCTION_GRP_MATH;   // The category in which the function is displayed.
    U2S( HID_FUNC_MYFUNC ); // The HelpID of this function.
    2;  0;  1;              // 2 parameters, of which the 2nd is optional.
    0;                      // None of the parameters are suppressed in the UI.
};


sc/util/hidother.src

Here the HelpIDs to be used within the Function Wizard are propagated to the help system. Go to the end of the section containing HID_FUNC_... and append an entry,

hidspecial HID_FUNC_MYFUNC          { HelpID = HID_FUNC_MYFUNC; };


The Function Wizard knows the function

Now the Function Wizard can display the function and its parameters, and online-help may be authored.

Let the interpreter handle the function

sc/source/core/inc/interpre.hxx

Add a member method to class ScInterpreter that will handle the new function. Here this would be

void ScMyFunc();

Take care that the new member function doesn't resemble the name of some already existing class, for example the method for the ADDRESS() function is named ScAddressFunc() because ScAddress(), if called without this-> prefix, would be the ctor of class ScAddress instead. If in doubt, add ...Func() to the name.


sc/source/core/tool/interpr4.cxx

In method ScInterpreter::Interpret() add to switch ( eOp ) the call to the member function for the OpCode:

                case ocMyFunc           : ScMyFunc();                   break;


sc/source/core/tool/interpr?.cxx

Pick one of the interpr?.cxx source files where the new method may fit. There is no general advice which file exactly that might be, be sensible. As a guide line

interpr1.cxx 
Basic math functions and informational functions.
interpr2.cxx 
Date functions and financial functions.
interpr3.cxx 
Statistical functions.
interpr4.cxx 
Interpreter managing and stack related functions, you shouldn't need to add anything there.
interpr5.cxx 
Matrix functions.
interpr6.cxx 
Some number crunching that in the past had to be compiled without optimizations, though in the mean time code changed and this probably is not necessary anymore.

For our function, since that expects two, one optional, numerical scalar arguments, this would be:

void ScInterpreter::ScMyFunc()
{
    BYTE nParamCount = GetByte();
    // The MustHaveParamCount...() functions check the number of parameters and
    // if they do not fit push an error on the stack, if the method fails
    // (returns false) we return immediately.
    if ( !MustHaveParamCount( nParamCount, 1, 2 ) )
        return;
 
    // Arguments are popped from the stack from right to left.
    double fParam2;
    if (nParamCount == 2)
        fParam2 = GetDouble();
    else
        fParam2 = 0.0;
 
    double fVal = GetDouble();
    if ( /* does fVal meet all constraints */ )
    {
        double fResult = /* calculate foo */ ;
        PushDouble( fResult );
    }
    else
        PushIllegalArgument();
}


The not so easy case of non-scalar arguments

StackVar argument types

If the new function would handle parameters that are not scalar values, for example a NumberSequence or matrix/array, they would have to be treated explicitly, checking and reacting on the type of each argument. Lookout for functions that use the GetType() call and handle StackVar svDoubleRef or similar. Ask on the dev@sc mailing list if in doubt.

The most common StackVar types obtainable with GetStackType()are:

svDouble 
A double value.
svString 
A literal string text.
svSingleRef 
A single cell reference.
svDoubleRef 
A cell range reference.
svRefList 
A list of cell range references.
svMatrix 
A matrix/array.

There are a few other types available with GetRawStackType() for specific situations, these are converted to svDouble in GetStackType() and returned as 0.0 by PopDouble() and GetDouble():

svMissing 
An empty parameter without value, if that should not be converted to 0.0 use GetDoubleWithDefault(double).
svEmptyCell 
A previous expression returned an empty cell, it may depend on context whether that is to be interpreted as double or string.

The general template to act on each parameter in sequence from right to left is:

    // Note: use short instead of BYTE when doing the (nParamCount-- > 0) loop below.
    short nParamCount = GetByte();
    if (MustHaveParamCountMin( nParamCount, 1))     // In case your function needs at least one parameter.
    {
        size_t nRefInList = 0;
        while(nParamCount-- > 0)
        {
            if (nGlobalError)
                Pop();
            else
            {
                switch (GetStackType())
                {
                    case svDouble :
                        {
                            double fVal = PopDouble();
                            if (!nGlobalError)
                            {
                                ...
                            }
                        }
                    break;
                    case svString :
                        {
                            String aStr( PopString());
                            if (!nGlobalError)
                            {
                                ...
                            }
                        }
                    break;
                    case svSingleRef :
                        {
                            ScAddress aAdr;
                            PopSingleRef( aAdr);
                            if (!nGlobalError)
                            {
                                ScBaseCell* pCell = GetCell( aAdr);
                                if (HasCellValueData( pCell))
                                {
                                    double fVal = GetCellValue( aAdr, pCell);
                                    if (!nGlobalError)
                                    {
                                        ...
                                    }
                                }
                                else
                                {
                                    String aStr;
                                    GetCellString( aStr, pCell);
                                    if (!nGlobalError)
                                    {
                                        ...
                                    }
                                }
                            }
                        }
                    break;
                    case svDoubleRef:
                    case svRefList:
                        {
                            ScRange aRange;
                            PopDoubleRef( aRange, nParamCount, nRefInList);
                            if (!nGlobalError)
                            {
                                double fVal;
                                USHORT nErr = 0;
                                ScValueIterator aValIter( pDok, aRange );
                                if (aValIter.GetFirst( fVal, nErr))
                                {
                                    do
                                    {
                                        if (nErr)
                                            break;
                                        ...
                                    } while (aValIter.GetNext( fVal, nErr));
                                }
                                SetError( nErr);
                            }
                        }
                    break;
                    case svMatrix:
                        {
                            ScMatrixRef pMat = GetMatrix();
                            if (pMat)
                            {
                                ...
                            }
                            // else: GetMatrix did set errIllegalParameter
                        }
                    break;
                    default:
                        PopError();
                        SetError( errIllegalParameter);
                }
            }
        }
        Push...(...);  // Don't forget to push a result ;-)
    }

Of course your function may require specific treatment of parameters depending on parameter types of other parameters, or bail out if a parameter type does not match an expected type, this template is meant as guidance only.


sc/source/core/tool/parclass.cxx

Describes how parameters are to be treated if the function is used in an array context expression, AKA array formula, where cell range references may be treated differently from non-array context. If the function does not accept other than scalar arguments, nothing has to be done here and the function does not need an entry. Else the type of each parameter has to be set to one of:

Value 
Function expects a single scalar value.
Reference 
A cell range reference is passed as reference and not converted to array.
Array 
In array context a cell range reference is converted to array.
ForceArray 
A cell range reference is always converted to an array and the ForceArray context is propagated to all functions and operators in this parameter's expression.
ReferenceOrForceArray 
A cell range reference is not converted to an array, but a ForceArray context is propagated to all functions and operators in this parameter's expression. Only LOOKUP() uses this, other functions should not need it.

For a more detailed description of these types please read the comments for ScParameterClassification::Type in sc/source/core/inc/parclass.hxx. Ask on the dev@sc mailing list if in doubt.

Set the ScParameterClassification::CommonData.bRepeatLast member variable to true if the function accepts multiple parameters that are of the same type as the last one specified, e.g. for SUM() that accepts multiple Reference parameters. If the function does not accept multiple parameters of the same type, set the value to false.


Microsoft Excel® import and export

If the new function is also supported by Microsoft Excel®, for import/export it has to be added to the filter code as well. OpenOffice.org supports various versions of the file format used by Excel.

  • Currently, import and export of the old binary formats (file extension ".xls", internal names "BIFF2" for Excel 2.x, "BIFF3" for Excel 3.0, "BIFF4" for Excel 4.0, "BIFF5" for Excel 5.0 and Excel 95, and "BIFF8" for Excel 97-2003) are implemented in the sc code module in the directory sc/source/filter/excel. While the import filters support all BIFF versions, there are export filters for BIFF5 and BIFF8 only.
  • Import of the new Office Open XML formats (file extensions ".xlsx" and ".xlsm" used by Excel 2007 and later, internal name "OOXML") and the new binary format (file extension ".xlsb" used by Excel 2007 and later, internal name "BIFF12") are implemented in the oox code module in the directory oox/source/xls.

In the future, it is planned to move the old BIFF2-BIFF8 filters to the oox code module too, to be able to share more source code for all filters. The import of functions and formulas will take benefit from that too.

Old binary file formats (BIFF2-BIFF8)

Overview

Import and export of BIFF formulas is spread over a number of files in the sc code module.

  • Source code used by the import and export filters is located in the files sc/source/filter/inc/xlformula.hxx and sc/source/filter/excel/xlformula.cxx. The cxx file contains the most important detail for functions: the function tables, which will be described in detail below.
  • The source code to read formulas from BIFF is located in sc/source/filter/inc/excform.hxx, sc/source/filter/excel/excform.cxx (for BIFF2-BIFF5), and sc/source/filter/excel/excform8.cxx (additional code for BIFF8). Note that these source files have not been touched seriously for more than a decade, so the code quality might not be very high.
  • The source code to write formulas to BIFF is located in sc/source/filter/inc/xeformula.hxx and sc/source/filter/excel/xeformula.cxx.

It is important to know the BIFF version the function has been added to. Some functions are not supported by very old versions of Excel, and some functions even change the number of supported parameters between BIFF versions.

Basics - function tables

The file sc/source/filter/excel/xlformula.cxx contains a function table for each BIFF version, and an additional table for functions unknown to Excel. Every line in these tables is basically a structure of type XclFunctionInfo as defined in sc/source/filter/inc/xlformula.hxx and describes a single spreadsheet function. If a function description is added in a table for a specific BIFF version, it will be used for all newer BIFF versions too, there is no need to repeat the function description in the following tables (unless the function changes somehow in a newer BIFF version, this will be discussed in the next chapter).

We will continue to take a look at our example function MYFUNC. We will assume that this function has been introduced in BIFF3. Furthermore, we will assume that the function is built-in in Excel and has the BIFF function index 200. The BIFF function index is the counterpart of Calc's function op-code used to identify the function in the binary file format. Later on this page, handling of add-in functions (which do not have a specific function index) will be described too.

First, we have to add a new line in the function table for BIFF3, called saFuncTable_3. By convention, the tables are sorted by BIFF function index (second column in the tables), here we should use the appropriate position for our index 200.

Op-code

The first entry in the function description is Calc's op-code of the function, which is ocMyFunc here:

    { ocMyFunc,
BIFF function index

The second entry in the function description is the BIFF function index:

    { ocMyFunc, 200,
Minimum parameter count

The third entry in the function description is the minimum parameter count of the function. MYFUNC expects at least one parameter:

    { ocMyFunc, 200, 1,
Maximum parameter count

The fourth entry in the function description is the maximum parameter count of the function. MYFUNC expects at most two parameters:

    { ocMyFunc, 200, 1, 2,
Return type

The fifth entry in the function description is the type of the value the function returns. Usually, a function returns a single value (scalar), which might be a number or a string. But a few functions return an array of values (e.g. the function MTRANS that transposes a matrix), or a cell range address (e.g. the function INDIRECT that converts a string to a range address). Allowed values for this entry are V for scalar values, A for arrays, and R for range addresses (references). MYFUNC returns a scalar value, thus we have to add the V type:

    { ocMyFunc, 200, 1, 2, V,
Parameter types

The sixth entry in the function description is a C array describing the type of all function parameters. This is the most complicated part of the function description, as sometimes the parameter type is not obvious. Simple functions taking scalar values usually will have parameters of type VR which is short for "value/repeated". See the inline documentation of the structure XclFuncParamInfo in the file sc/source/filter/inc/xlformula.hxx for more details. Our function takes two scalar value parameter:

    { ocMyFunc, 200, 1, 2, V, { VR, VR },
Additional flags

The seventh entry in the function description contains additional flags that control the behaviour of the formula filters. Later on, these flags will be described. Our function does not need any special behaviour:

    { ocMyFunc, 200, 1, 2, V, { VR, VR }, 0,
Alternative name

The eighth entry in the function description is a C string specifying an alternative function name to be used in BIFF for add-in functions. If not used, it can be set to null:

    { ocMyFunc, 200, 1, 2, V, { VR, VR }, 0, 0 },

That's it. The function will now be imported correctly from BIFF3-BIFF8 files and exported to BIFF5 and BIFF8 files. As an optimization or simplification, it is allowed to leave out the parameter type of trailing parameters, if they are equal to the type of their predecessor. In our case, we can write the parameter list as { VR }:

    { ocMyFunc, 200, 1, 2, V, { VR }, 0, 0 },

This is especially useful for functions that take more arguments, e.g. the function REPLACE that takes 4 parameters of type VR:

    { ocReplace, 119, 4, 4, V, { VR }, 0, 0 },

Advanced - function tables and filter source code extensions

There are a number of special cases to be aware of. The following section describes them in detail and gives examples of related existing functions. If they do not apply to your new function, just go ahead to the next chapter New file formats (OOXML and BIFF12).

Maximum parameter count

A few functions allow to pass the maximum number of supported parameters, which is 30 in the BIFF2-BIFF8 file formats. There is a placeholder MX that can be used for the maximum parameter count entry, e.g. for the SUM or CHOOSE function. Note the short parameter type lists. The entry { RX } means that all 30 parameters are of type RX, the entry { VO, RO } means that the first parameter is of type VO, and that all following parameters starting from second are of type RO. The MX placeholder becomes more important in the OOXML function tables (described below) where the maximum parameter count is not fixed to 30 anymore, but changes to 255 in OOXML and BIFF12 files.

    { ocSum,     4, 0, MX, V, { RX },     0, 0 },
    { ocChose, 100, 2, MX, R, { VO, RO }, 0, 0 },
Functions without parameters

A few functions do not expect any parameters, e.g. the functions TRUE, FALSE, and PI. In this case, the parameter type list can be left empty.

    { ocPi, 19, 0, 0, V, {}, 0, 0 },
Volatile functions

If the result of a function may change on every evaluation of the function, the function has to be marked as volatile. For example, the functions NOW, TODAY, and RANDOM are volatile. To mark a volatile function, the flag EXC_FUNCFLAG_VOLATILE has to be passed in the seventh entry.

    { ocRandom, 63, 0, 0, V, {}, EXC_FUNCFLAG_VOLATILE, 0 },
Restrict to import

Sometimes it is needed to restrict a function description to the import filter (hide it from the export filter). For that, the flag EXC_FUNCFLAG_IMPORTONLY can be used. For example, in Excel there exist the functions DOLLAR (BIFF index 13) and USDOLLAR (BIFF index 204). These functions behave equally (as far as we know) and are both mapped to the ODF function DOLLAR with the op-code ocCurrency. To prevent confusion in the export filter, the entry for the function DOLLAR is not marked (visible in import and export filters), but the entry for the function USDOLLAR is marked as import-only. This way, the export filter will always write the op-code ocCurrency as BIFF function DOLLAR with function index 13.

    { ocCurrency,  13, 1, 2, V, { VR }, 0, 0 },
    { ocCurrency, 204, 1, 2, V, { VR }, EXC_FUNCFLAG_IMPORTONLY, 0 },
Restrict to export

It is also possible to restrict a function description to the export filter (hide it from the import filter) by using the flag EXC_FUNCFLAG_EXPORTONLY. Examples will be discussed below.

    { ocExternal, 255, 1, MX, R, { RO_E, RO }, EXC_FUNCFLAG_EXPORTONLY, 0 },
Calc-only function parameters

A few functions accept more parameters in Calc than in Excel. For example, the functions FLOOR and CEILING (introduced in BIFF4) are restricted to two parameters in Excel, but Calc supports a third parameters. These parameters have to be marked with the parameter type C. The import filter may react on this parameter type and add a default value for the parameter (e.g. for FLOOR and CEILING it is needed to add the value 1 to get the same behaviour as in Excel, see ExcelToSc::DoMulArgs() in sc/source/filter/excel/excform.cxx). The export filter will skip the parameter instead of producing an error for the entire function (e.g. the formula =FLOOR(1;1;1) will be exported as =FLOOR(1;1)).

    { ocFloor, 285, 2, 2, V, { VR, VR, C }, 0, 0 },
    { ocCeil,  288, 2, 2, V, { VR, VR, C }, 0, 0 },
Changed parameter count in later BIFF version

If a BIFF version extends a function of a former BIFF version with new parameters, the respective function table needs to repeat the function description with the changed settings. For example, the function WEEKDAY (op-code ocGetDayOfWeek) has been introduced in BIFF2 with only one parameter. In BIFF5, a second optional parameter has been added, which is supported by Calc too. In consequence, the BIFF2 function table contains a function description for this function with a maximum parameter count of 1, which will be used in the following BIFF versions too. The second parameter is marked as Calc-only. The BIFF5 function table will contain a new function description with the new maximum parameter count of 2.

    // BIFF2 function table:
    { ocGetDayOfWeek, 70, 1, 1, V, { VR, C }, 0, 0 },
 
    // BIFF5 function table, second optional parameter with type 'VR' added
    { ocGetDayOfWeek, 70, 1, 2, V, { VR }, 0, 0 },
Excel-only function parameters

A few functions accept more parameters in Excel than in Calc. For example, the function PERCENTRANK accepts an optional third parameter in Excel, and the function EXTERN.CALL used internally by Excel for add-in function calls needs a hidden leading parameter containing the actual function name (more details below). These parameters have to be marked by appending _E to the actual parameter type, e.g. VR_E. On import, these parameters will be ignored, or (in case of the function EXTERN.CALL) evaluated internally. On export, the filter has to insert an appropriate value for the parameter if required, see XclExpFmlaCompImpl::AppendDefaultParam() in sc/source/filter/excel/xeformula.cxx.

    { ocPercentrank, 329, 2,  3, V, { RX, VR, VR_E }, 0, 0 },
    { ocExternal,    255, 1, MX, R, { RO_E, RO }, EXC_FUNCFLAG_EXPORTONLY, 0 },
Simulate Calc-only functions

A few functions are not supported by Excel, but the export filter may simulate them easily by using a similar functions. For example, the cotangent functions COT (cotangent), ACOT (arcos cotangent), COTH (cotangent hyperbolicus), and ACOTH (arcus cotangent hyperbolicus) do not exist in Excel, but COT(x) can be written as 1/TAN(x), ACOT(x) can be written as PI/2-ATAN(x), COTH(x) can be written as 1/TANH(x), and ACOTH(x) can be written as ATANH(1/x). To do that, some more work has to be done in the export filter in addition to the function description. In the following, handling of the functions COT and ACOTH will be discussed. First, we add function descriptions to the function table, containing the BIFF funtion index of the function used to simulate our new function (TAN with BIFF index 17 for COT, and ATANH with BIFF index 234 for ACOTH). The function will be marked as export-only, so that the import filter will ignore these function descriptions:

    { ocTan,        17, 1, 1, V, { VR }, 0, 0 },
    { ocCot,        17, 1, 1, V, { VR }, EXC_FUNCFLAG_EXPORTONLY, 0 },
    { ocArcTanHyp, 234, 1, 1,  V, { VR }, 0, 0 },
    { ocArcCotHyp, 234, 1, 1,  V, { VR }, EXC_FUNCFLAG_EXPORTONLY, 0 },

Next, the export filter has to be extended to add the new tokens needed to simulate our function. To be sure to not change the meaning of the entire formula, parentheses have to be inserted appropriately.

  • The term COT(x) has to be replaced with (1/TAN(x)), otherwise e.g. the formula =2^COT(x) will not be evaluated correctly as =2^(1/TAN(x)), but as =(2^1)/TAN(x).
  • The term ACOTH(x) has to be replaced with ATANH(1/(x)), otherwise e.g. the formula =ACOTH(1+1) will not be evaluated correctly as =ATANH(1/(1+1)), but as =ATANH((1/1)+1).

Formulas are stored in reverse polish notation (RPN). The term (1/TAN(x)) will be stored as token sequence "1,x,TAN,DIV,()" where x can be an arbitrarily complex expression. The term ATANH(1/(x)) will be stored as "1,x,(),DIV,ATANH" respectively. Currently, the export filter will already find the new function descriptions but would just write the specified BIFF function index (e.g. TAN(x) for COT(x)). The new tokens have to be added manually in the export filter code. All this is done in the file sc/source/filter/excel/xeformula.cxx.

  • COT: In the RPN array "1,x,TAN,DIV,()" (representing the function COT(x)), the token 1 has to be inserted in front of the entire function (consisting of the tokens "x,TAN"). To do this, a case will be inserted in the method XclExpFmlaCompImpl::PrepareFunction():
void XclExpFmlaCompImpl::PrepareFunction( XclExpFuncData& rFuncData )
{
    switch( rFuncData.GetOpCode() )
    {
// ...
        case ocCot:
            AppendIntToken( 1 );
        break;
// ...
    }
}
  • The division operator and the parentheses have to be appended to the entire function. This is done in the method XclExpFmlaCompImpl::FinishFunction(). The boolean value true in the call of AppendBinaryOperatorToken() specifies that the operator works on scalar values (in difference to e.g. the range operator ":").
void XclExpFmlaCompImpl::FinishFunction( XclExpFuncData& rFuncData, sal_uInt8 nCloseSpaces )
{
// ...
        switch( rFuncData.GetOpCode() )
        {
// ...
            case ocCot:
                AppendBinaryOperatorToken( EXC_TOKID_DIV, true );
                AppendParenToken();
            break;
// ...
        }
// ...
}
  • ACOTH: In the token array "1,x,(),DIV,ATANH" (representing the function ACOTH(x)), the token 1 has to be inserted in front of the function parameter x (but not in front of the entire function). This is done in the method XclExpFmlaCompImpl::PrepareParam(). The variable nParamIdx contains the zero-based index of the current parameter:
void XclExpFmlaCompImpl::PrepareParam( XclExpFuncData& rFuncData )
{
    sal_uInt8 nParamIdx = rFuncData.GetParamCount();
    switch( rFuncData.GetOpCode() )
    {
// ...
        case ocArcCotHyp:
            if( nParamIdx == 0 )
                AppendIntToken( 1 );
        break;
// ...
    }
}
  • The parentheses and the division operator have to be appended to the function parameter x. This is done in the method XclExpFmlaCompImpl::FinishParam(). Again, the variable nParamIdx contains the zero-based index of the current parameter:
void XclExpFmlaCompImpl::FinishParam( XclExpFuncData& rFuncData )
{
//...
    sal_uInt8 nParamIdx = rFuncData.GetParamCount() - 1;
    switch( rFuncData.GetOpCode() )
    {
// ...
        case ocArcCotHyp:
            if( nParamIdx == 0 )
            {
                AppendParenToken();
                AppendBinaryOperatorToken( EXC_TOKID_DIV, true );
            }
        break;
// ...
    }
}
Functions stored as call to an internal defined name

Functions appearing as built-in function in Excel might in fact be stored as a call to an internal defined name. For example, the function BAHTTEXT is not stored with a BIFF function index, but internally as EXTERN.CALL("_xlfn.BAHTTEXT",parameter) where "_xlfn.BAHTTEXT" is a hidden parameter and contains a reference to the internal defined name (also known as named range) named "_xlfn.BAHTTEXT". This defined name is flagged to be a function call instead of a regular named range. If a formula calls a VBA macro, the function call is stored the same way using a dedicated defined name containing the name of the VBA macro. To add support for the BAHTTEXT function, the following function description can be used. It has to refer to the function EXTERN.CALL with the BIFF index 255, and has to specify the parameter type for the first hidden parameter explicitly (type RO_E). Note that all parameters of the function EXTERN.CALL have to be of type RO. The preprocessor macro EXC_FUNCNAME adds the "_xlfn." prefix to the function name:

    { ocBahtText, 255, 2, 2, V, { RO_E, RO }, 0, EXC_FUNCNAME( "BAHTTEXT" ) },

Actually, the function BAHTTEXT can also be stored as built-in function with the BIFF function index 368. To be able to handle this case too, the function descriptions can be changed to:

    { ocBahtText, 368, 1, 1, V, { VR },       EXC_FUNCFLAG_IMPORTONLY, EXC_FUNCNAME( "BAHTTEXT" ) },
    { ocBahtText, 255, 2, 2, V, { RO_E, RO }, EXC_FUNCFLAG_EXPORTONLY, EXC_FUNCNAME( "BAHTTEXT" ) },

The description for the export filter still has to refer to the function EXTERN.CALL. When loading a file with the BAHTTEXT function, internally the import filter will encounter the function EXTERN.CALL and will find the (already existing) function description

    { ocExternal, 255, 1, MX, R, { RO_E, RO }, EXC_FUNCFLAG_IMPORTONLY, 0 }

After reading the first hidden parameter of the function, the filter will find the function name "_xlfn.BAHTTEXT", will search the function table for a function description containing this name, will find our new description mentioned above, and will finally use the op-code ocBahttext.

Functions from external add-ins

In Excel, it is possible to call functions that are part of an external add-in. These add-ins are located in a special directory of the Excel installation, called the library directory. If such a function is built-in in Calc, some conversion has to be done. Currently, the only existing example is the function EUROCONVERT added to Excel with the add-in EUROTOOL (add-in library file eurotool.xla). All add-in functions (also the functions from the Analysis add-in) are stored using the internal function EXTERN.CALL (see previous section) which will contain a description of the add-in name and function name in its first hidden parameter. Thus, the formula =EUROCONVERT(1;"DEM";"EUR") will appear as =EXTERN.CALL("EUROTOOL.XLA!EUROCONVERT";1;"DEM";"EUR") in the Excel file. First, we need a function description in the BIFF8 function table, using the function EXTERN.CALL again. Again, all parameters of this function have to be of type RO:

    { ocEuroConvert, 255, 4, 6, V, { RO_E, RO }, EXC_FUNCFLAG_EXPORTONLY, "EUROCONVERT" }

The first hidden parameter is a reference to an external name being part of the description of the external workbook EUROTOOL.XLA. A detailed description would go beyond the scope here. To get an idea, grep the files sc/source/filter/excel/xilink.cxx, sc/source/filter/excel/excform8.cxx, and sc/source/filter/excel/xelink.cxx for xlExtEuroConvert and EXC_SBTYPE_EUROTOOL.

Functions not supported by Excel

The export filter can write functions to the Excel file format even if Excel does not know them. The import filter will restore these functions when loading the file. The function table saFuncTable_Odf contains all functions that will be handled this way. For example, the function BASE will be inserted as following:

    EXC_FUNCENTRY_ODF( ocBase, 2, 3, 0, "BASE" ),

The preprocessor macro EXC_FUNCENTRY_ODF converts this entry to the appropriate function descriptions (see the definition of the macro above the table). The first entry is the Calc op-code, the second entry is the minimum parameter count, the third entry is the maximum parameter count, the fourth entry will be used for additional flags, and the last entry is the ODF function name.

New file formats (OOXML and BIFF12)

Overview

As if that wasn't enough, nearly the same changes have to be done in the oox code module implementing filters for the Microsoft Office Open XML (OOXML) file format and the binary BIFF12 format. As already mentioned, in the future this filter will handle import and export of all Excel file formats including the old BIFF2-BIFF8 formats. Therefore, the function tables have been prepared to contain all information needed to import and export BIFF2-BIFF12 and OOXML files. The following files contain the source code for formula handling:

  • Source code used by the import and export filters is located in the files oox/inc/oox/xls/formulabase.hxx and oox/source/xls/formulabase.cxx. The cxx file contains all function tables.
  • The source code for formula import is located in the files oox/inc/oox/xls/formulaparser.hxx and oox/source/xls/formulaparser.cxx.
  • Currently, there is no source code for formula export.

Function tables

The structure of the function tables is similar to the structure of the old function tables in module sc described above. The main difference is, that the filter implementation is based completely on the OpenOffice.org API and therefore does not have access to Calc internals such as the function op-codes. Following a short overview of the entries in a function description. The function descriptions are structures of type ::oox::xls::FunctionData defined locally above the tables. But first, an example description for the function SUM:

    { "SUM", "SUM", 4, 4, 0, MX, V, { RX }, 0 },
ODF function name

The first entry is the function name as defined in OpenFormula. May be null, if not available (functions available in Excel only).

OOXML function name

The second entry is the function name as it appears in the OOXML file format. May be null, if not available (functions available in Calc only).

BIFF12 function index

The third entry is the function index used in BIFF12 files (file extension ".xlsb").

BIFF2-BIFF8 function index

The fourth entry is the function index used in BIFF2-BIFF8 files (file extension ".xls").

Minimum parameter count

The fifth entry is the minimum number of parameters required by Excel.

Maximum parameter count

The sixth entry is the maximum number of parameters allowed in Excel. The constant MX can be used to specify the maximum number allowed by the current file format, which is 30 in BIFF2-BIFF8 and 255 in BIFF12 and OOXML. The filters will find and use the correct maximum value internally.

Return type

The seventh entry is the type of the return value. As before, can be V for scalar values, A for arrays of values, or R for cell range addresses (references).

Parameter types

The eighth entry is a C array describing the type of all function parameters. This entry is equal to the arrays from the old BIFF filters described above.

Additional flags

The ninth and last entry contains additional flags controlling the behaviour of the filters. The three flags known from the old BIFF filters are available as well as some new flags:

  • FUNCFLAG_VOLATILE: The function result is volatile.
  • FUNCFLAG_IMPORTONLY: The function description is available for the import filters only.
  • FUNCFLAG_EXPORTONLY: The function description is available for the export filters only.
  • FUNCFLAG_MACROCALL: The BIFF filters will handle the function as call to a defined name (see previous chapter for details; the OOXML name will be prefixed with "_xlfn."). Has no effect on the OOXML filter.
  • FUNCFLAG_MACROCALLODF: Used to mark functions not available in Excel but in Calc only, to be able to preserve them in a roundtrip scenario.
  • FUNCFLAG_EXTERNAL: Used to mark functions that are stored externally in Calc (with op-code ocExternal). Currently, all functions from the Analysis add-in are marked with this flag.
  • FUNCFLAG_MACROFUNC: The function is a macrosheet function. This kind of functions is not supported in Calc, and is available in Excel macrosheets only (not in regular worksheets).
  • FUNCFLAG_MACROCMD: The function is a macrosheet command. This kind of functions is not supported in Calc, and is available in Excel macrosheets only (not in regular worksheets).
  • FUNCFLAG_ALWAYSVAR: The function is always handled as function with variable number of parameters, even if minimum and maximum number of parameters are equal.
  • FUNCFLAG_PARAMPAIRS: The function repeats the last two parameter types instead of only the last type for additional parameters. This is used e.g. for the function COUNTIFS (added in OOXML/BIFF12) that supports up to 255 parameters and always expects pairs of parameter with types RO, VR.

Advanced

Calc-only function parameters

Parameters marked with the C parameter type are only available in Calc (e.g. the third parameter of the functions FLOOR and CEILING). Everytime the import filter encounters such a parameter, it calls the method FormulaFinalizer::appendCalcOnlyParameter() in oox/source/xls/formulaparser.cxx allowing to set a default value for the parameter.

Add values for optional parameters

After a function has been imported completely, the method FormulaFinalizer::appendRequiredParameters() will be called allowing to extend the parameter list. This is useful for functions with parameters optional in Excel but required in Calc, e.g. the second parameter of the function WEEKNUM.

Add values for empty parameters

If the import filter encounters an empty parameter, it calls the method FormulaFinalizer::appendEmptyParameter(). The implementation may add a value in case Calc does not support an empty parameter here. Currently, this is used for the function IF. For example, the import filter replaces =IF(cond;) with =IF(cond;0) and =IF(cond;;) with =IF(cond;0;0).

Functions stored as call to an internal defined name

The function flag FUNCFLAG_MACROCALL is used to mark functions that are stored as call to an internal defined name (see previous chapter for details). It is still possible to specify an alternative BIFF function index.

    { "COM.MICROSOFT.BAHTTEXT", "BAHTTEXT", 368, 368, 1, 1, V, { VR }, FUNCFLAG_MACROCALL },
Functions from the Analysis add-in

In the old BIFF2-BIFF8 filters, these functions are handled internally and do not need descriptions in the function tables at all. In BIFF12 and OOXML, these functions are stored as internal functions (there is no Analysis add-in anymore in Excel 2007 and later), and therefore get their own BIFF12 function index. Following an example for the description of the Analysis add-in function COMPLEX. Note the constant NOID used for the BIFF2-BIFF8 function index to specify that these formats do not support a BIFF index here. The import filter will find the syntax using the function EXTERN.CALL, and will resolve the function by the specified OOXML function name.

    { "COMPLEX", "COMPLEX", 411, NOID, 2, 3, V, { RR }, FUNCFLAG_EXTERNAL },
Functions from external add-ins

It is possible to call functions that are part of an external add-in. These add-ins are located in a special directory of the Excel installation, called the library directory. The following function description is for the function EUROCONVERT from the external add-in EUROTOOL which is located either in the file eurotool.xla (Excel 2003 and earlier) or in the file eurotool.xlam (Excel 2007 and later). Every add-in has a corresponding value in the enumeration FunctionLibraryType defined in oox/inc/oox/xls/formulabase.hxx. The macro FUNCLIB_TO_FUNCFLAGS converts this value to the appropriate function flags:

    { "EUROCONVERT", "EUROCONVERT", NOID, NOID, 3, 5, V, { VR }, FUNCLIB_TO_FUNCFLAGS( FUNCLIB_EUROTOOL ) },

The function library flag triggers internal code that resolves a reference to the add-in file name. Adding a new add-in is quite easy compared to the old BIFF filters. Assuming the new add-in is called MYADDIN (file name myaddin.xla or myaddin.xlam) and contains the function MYADDINFUNC.

  • In oox/inc/oox/xls/formulabase.hxx, add a new value to the enumeration FunctionLibraryType:
enum FunctionLibraryType
{
    FUNCLIB_UNKNOWN,
    FUNCLIB_EUROTOOL,
    FUNCLIB_MYADDIN          // <== new enum value
};
  • In oox/source/xls/formulabase.cxx, add an entry in the method FunctionProvider::getFuncLibTypeFromLibraryName() that converts the passed library file name to a value of the enumeration FunctionLibraryType:
FunctionLibraryType FunctionProvider::getFuncLibTypeFromLibraryName( const OUString& rLibraryName ) const
{
// ...
    if( OOX_XLS_IS_LIBNAME( rLibraryName, "EUROTOOL" ) )
        return FUNCLIB_EUROTOOL;
    if( OOX_XLS_IS_LIBNAME( rLibraryName, "MYADDIN" ) )    // <== new entry
        return FUNCLIB_MYADDIN;                            // <==
// ...
}
  • Add a new function description for each function in the add-in. We have only one function:
    { "MYADDINFUNC", "MYADDINFUNC", NOID, NOID, 2, 4, V, { VR }, FUNCLIB_TO_FUNCFLAGS( FUNCLIB_MYADDIN ) },

That's all.

Functions not supported by Excel

The function table saFuncTableOdf contains all functions that are not supported by Excel. For example, the function BASE will be inserted as following:

    { "BASE", 0, NOID, NOID, 2, 3, V, { VR }, FUNCFLAG_MACROCALLODF },
Personal tools