Difference between revisions of "Documentation/How Tos/Using Arrays"

From Apache OpenOffice Wiki

Revision as of 09:09, 15 March 2008

This is a work in progress - unfinished

Introduction

Firstly, be aware that you can almost certainly manage without arrays. They do make life more complicated - in some ways, they extend what a spreadsheet can do beyond what spreadsheets were intended for. Arrays can however reduce the number of cells used.

An array is simply a rectangular block of information that Calc can manipulate in a formula - that is, it is information organised in rows and columns.

There are 2 ways to specify an array in a formula:

  • as a range - for example A2:C3.
  • as an "inline array", for example {1; 5; 3 | 6; 2; 4} (this needs OpenOffice 2.4 - see Array Issues)
Calc array4.png
You type curly braces { } around an inline array. Entries on a row are separated by a semicolon ‘;’, and rows are separated by the pipe character ‘|’. Each row must have the same number of elements (it is wrong to write {1; 2; 3 | 4; 5} because there are 3 elements in the top row and only 2 in the next row).
Inline arrays may have mixed contents, for example {4; 2; "dog" | -22; "cat"; 0} is valid. However an inline array may not contain references (eg A4) or formulae (eg INT(7/3) ).
Calc array5.png
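
The syntax rules above can be sketched in Python. This is only an illustrative model, not how Calc itself parses arrays: the function name parse_inline_array is hypothetical, and the sketch assumes that text entries do not themselves contain ‘;’ or ‘|’.

```python
def parse_inline_array(text):
    """Parse Calc inline-array notation such as '{1; 5; 3 | 6; 2; 4}' into a list of rows."""
    body = text.strip()
    if not (body.startswith("{") and body.endswith("}")):
        raise ValueError("an inline array must be wrapped in curly braces")
    rows = []
    for row_text in body[1:-1].split("|"):        # '|' separates rows
        row = []
        for item in row_text.split(";"):          # ';' separates entries on a row
            item = item.strip()
            if len(item) >= 2 and item.startswith('"') and item.endswith('"'):
                row.append(item[1:-1])            # quoted text entry
            else:
                row.append(float(item))           # numeric entry; a reference such as A4 fails here
        rows.append(row)
    if len({len(row) for row in rows}) != 1:
        raise ValueError("every row must have the same number of elements")
    return rows
```

For example, parse_inline_array('{1; 5; 3 | 6; 2; 4}') gives two rows of three numbers, while '{1; 2; 3 | 4; 5}' is rejected because the rows differ in length.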

You can give a name to a range of cells: select the range and choose Insert - Names - Define.

You can give a name to an inline array: choose Insert - Names - Define; type the array (eg {1; 3; 2} including the curly braces) in the Assigned to box.


Functions which understand array parameters

Some functions, such as SUM(), AVERAGE(), MATCH(), LOOKUP(), accept one or more of their parameters as arrays.

For example:

SUM( A2:C3 ) returns the sum of the numbers in the range A2:C3.
SUM( {3; 2; 4} ) returns 9, the sum of the numbers in the inline array {3; 2; 4}.
SUM( B5; 7 ) returns the sum of B5 and 7. SUM() expects/understands single (‘scalar’) values too.

Functions not expecting array parameters

Some functions, such as ABS(), SQRT(), COS(), LEN() expect their parameters to be ‘scalar’ - that is, a single value such as 354 or "dog" or the contents of a cell eg B5.

Example:

SQRT(4) returns 2.
LEN("dog") returns 3.

However, you may still use an array where a single value is expected - for example SQRT( {9; 4} ). If you enter the formula ‘normally’ by pressing Enter, Calc will then evaluate the formula using a single value from the array as follows:

If it is an inline array:

Calc will use the first value (the ‘top left’).
Example:
=SQRT( {9; 4 | 25; 16} ) then press Enter returns 3, the square root of the first element in the array (9).

If it is a range:

1. Calc will return an error unless the array is a single row or a single column.
2. For a single row or a single column range, Calc will use the value where the formula cell’s column/row intersects with the array (or return an error if there is no intersection).
Examples:
Calc array1.png
The formula =ABS(B2:B5) is entered 'normally' in cell D3, that is on row 3. Row 3 intersects B2:B5 at cell B3, thus the formula evaluated is =ABS(B3).
Calc array2.png
The formula =LEN(B5:D5) is entered 'normally' in cell B1, that is in column B. Column B intersects B5:D5 at cell B5, thus the formula evaluated is =LEN(B5).
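
The intersection rule for ranges can be modelled with a short Python sketch. The function name and the tuple layout are assumptions made for illustration; cells are given as 1-based (row, column) pairs, so B3 is (3, 2), and the range is assumed to cover more than one cell.

```python
def implicit_intersection(rng, formula_cell):
    """Return the (row, col) of the cell Calc would use, or None where Calc reports an error.

    rng is (first_row, first_col, last_row, last_col); formula_cell is (row, col).
    """
    first_row, first_col, last_row, last_col = rng
    row, col = formula_cell
    single_column = first_col == last_col
    single_row = first_row == last_row
    if single_column and first_row <= row <= last_row:
        return (row, first_col)       # formula's row crosses the column range
    if single_row and first_col <= col <= last_col:
        return (first_row, col)       # formula's column crosses the row range
    return None                       # two-dimensional range, or no intersection
```

With =ABS(B2:B5) in D3, implicit_intersection((2, 2, 5, 2), (3, 4)) gives (3, 2), that is B3. With =LEN(B5:D5) in B1, implicit_intersection((5, 2, 5, 4), (1, 2)) gives (5, 2), that is B5.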

Array expressions

The real power of arrays comes when you enter a formula in a special way, as an ‘array expression’. You do this by pressing Ctrl-Shift-Enter instead of the Enter key.

If in cell B1 you enter ={3; 4} ‘normally’ by pressing Enter, the first value 3 is displayed in the cell.

If in cell B2 you type ={3; 4} but press Ctrl-Shift-Enter instead of Enter, the cell becomes an ‘array expression’. The formula now returns the entire array {3; 4}. Cell B2 displays 3 and cell C2 displays 4.

Note that if you entered the formula using the Enter key, simply selecting the cell and pressing Ctrl-Shift-Enter will not convert the cell to an array expression - you must make an actual edit (such as adding then deleting a character).

If you now try to edit cell B2, you are told that "you cannot change only part of an array". To edit an array you must select the entire array, either with the mouse or by typing Ctrl-/ (hold the Ctrl key and press the slash key ‘/’).

Calc array6.png

The formula bar indicates that this is an array expression by putting curly braces {} around the formula. You do not need to type these - they will disappear while you edit the formula, and Calc will show them again when you have finished editing.

You can enter an array formula with the Function Wizard, by ticking the Array checkbox. If you inadvertently enter a formula normally, you can click the Function Wizard again, then tick the Array checkbox to change it to an array formula.

Array formula calculations

When Calc evaluates an array formula, it treats ‘unexpected arrays’ as a series of values (rather than using a single value), calculating a result for each of the array elements, and returning an array of results.

Example:

Calc array3.png
=SQRT( {16; 4; 25} ), when entered by pressing Ctrl-Shift-Enter instead of Enter, returns an array of results with 1 row and 3 columns - {4; 2; 5}. If the formula is in cell B2, Calc places the results in cells B2:D2: 4 in B2, 2 in C2 and 5 in D2.


The process in effect works thus:

  1. All ‘unexpected arrays’ in the same array calculation should be the same size (see this issue: Issue 46681).
  2. The result will be returned in an array of that size.
  3. Any single values are in effect expanded into an array of that size, where each element has that value.
  4. The calculation is done for each element in turn, with the result returned in the corresponding element of the output array.

Example:

With the array formula =SQRT( {16; 4; 25} ):

  1. There is only one array, with 1 row and 3 columns.
  2. The result will be returned in an array with 1 row and 3 columns.
  3. There are no single values.
  4. The calculation is done for 16 first, then for 4, then for 25, giving the array result {4; 2; 5}.

Example:

With the array formula =SQRT( {8 | 18} * 2 ) in cell A5:

  1. There is only one array, with 2 rows and 1 column.
  2. The result will be returned in an array with 2 rows and 1 column.
  3. The single value 2 is expanded, so we have SQRT( {8 | 18} * {2 | 2} ).
  4. The calculation is: first element SQRT(8*2) = 4; second element SQRT(18*2) = 6; the array result is thus {4 | 6} - that is, 4 in cell A5 and 6 in cell A6.
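
The four-step process can be sketched in Python as follows. This is a simplified model under stated assumptions, not Calc's actual implementation: arrays are represented as lists of rows, and the helper names shape and array_apply are hypothetical.

```python
import math

def shape(value):
    """Return (rows, cols) for an array (a list of rows), or None for a single value."""
    if isinstance(value, list):
        return (len(value), len(value[0]))
    return None

def array_apply(func, *args):
    """Apply func element by element, expanding single values to the array size."""
    shapes = {shape(a) for a in args if shape(a) is not None}
    if len(shapes) > 1:                            # step 1: arrays should be the same size
        raise ValueError("all arrays in one calculation should be the same size")
    if not shapes:                                 # no arrays at all: ordinary calculation
        return func(*args)
    rows, cols = shapes.pop()                      # step 2: the result has this size
    expanded = [a if isinstance(a, list) else [[a] * cols for _ in range(rows)]
                for a in args]                     # step 3: expand single values
    return [[func(*(e[r][c] for e in expanded)) for c in range(cols)]
            for r in range(rows)]                  # step 4: calculate each element in turn

# =SQRT( {16; 4; 25} ): one row, three columns
row_result = array_apply(math.sqrt, [[16, 4, 25]])

# =SQRT( {8 | 18} * 2 ): the single value 2 is expanded to {2 | 2}
product = array_apply(lambda a, b: a * b, [[8], [18]], 2)
column_result = array_apply(math.sqrt, product)
```

Here row_result is [[4.0, 2.0, 5.0]] and column_result is [[4.0], [6.0]], matching the two worked examples above.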

The result of an array expression is an array, which can be used within the formula.

Example:

=SUM(SQRT( {16; 4; 25} )) as an array formula. The calculation of SQRT( {16; 4; 25} ), as before, yields a result of {4; 2; 5}, thus giving SUM( {4; 2; 5} ), returning a final result in the cell of 4+2+5 = 11.

Example:

With the array formula =SUM(IF(A1:A4>0; B1:B4; 0))

  1. The two arrays are the same size.
  2. The result of the IF() array calculation will be an array of that size, which SUM() will add up.
  3. Expanding the single values gives =SUM(IF(A1:A4>{0;0;0;0}; B1:B4; {0;0;0;0}))
  4. If A1>0 the first element is B1; else 0. If A2>0 the second element is B2; else 0 .... The array presented to SUM() has the values in B1:B4 where the adjacent value in A1:A4 is >0. The final output is the sum of the values in B1:B4 for which the adjacent value in A1:A4 is >0.
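
The same calculation can be written out in Python. The function name and the sample values are made up for illustration; the two lists stand in for A1:A4 and B1:B4.

```python
def sum_if_positive(a_values, b_values):
    """Model of =SUM(IF(A1:A4>0; B1:B4; 0)) entered as an array formula."""
    if len(a_values) != len(b_values):
        raise ValueError("the two arrays should be the same size")
    # IF() produces one element per row: the B value where A > 0, otherwise 0
    selected = [b if a > 0 else 0 for a, b in zip(a_values, b_values)]
    return sum(selected)                           # SUM() adds up the resulting array

# Hypothetical contents of A1:A4 and B1:B4
total = sum_if_positive([3, -1, 0, 5], [10, 20, 30, 40])
```

With these sample values only the first and last rows have A > 0, so total is 10 + 40 = 50.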

Array Functions

Some functions are intended to manipulate or return arrays. They are listed in the Array function category, and are:

FREQUENCY, GROWTH, LINEST, LOGEST, MDETERM, MINVERSE, MMULT, MUNIT, SUMPRODUCT, SUMX2MY2, SUMX2PY2, SUMXMY2, TRANSPOSE, TREND

Some should be entered with Enter ('normal' mode) and some with Ctrl-Shift-Enter (as an array formula).

For example, SUMPRODUCT (in normal mode) multiplies corresponding elements of the arrays given as parameters, then returns the sum.

For example, MUNIT(2) (as an array formula) returns the 2 x 2 unit (identity) matrix as the array {1; 0 | 0; 1}.
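
As a rough Python model of these two functions (the lower-case names sumproduct and munit are stand-ins written for this sketch, and the arrays are simplified to flat lists for sumproduct):

```python
def sumproduct(*arrays):
    """Multiply corresponding elements of the arrays, then return the sum."""
    if len({len(a) for a in arrays}) != 1:
        raise ValueError("all arrays should be the same size")
    total = 0
    for items in zip(*arrays):
        product = 1
        for item in items:
            product *= item
        total += product
    return total

def munit(n):
    """Return the n x n unit (identity) matrix as a list of rows."""
    return [[1 if r == c else 0 for c in range(n)] for r in range(n)]
```

For example, sumproduct([1, 2, 3], [4, 5, 6]) is 1*4 + 2*5 + 3*6 = 32, and munit(2) is [[1, 0], [0, 1]], corresponding to {1; 0 | 0; 1}.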

Tips and Tricks

All these examples are to be entered as an array expression, by pressing Ctrl-Shift-Enter.

Sum of entries matching multiple conditions

SUM( (A1:A6="red")*(B1:B6="big")*C1:C6 ) returns the sum of entries in C1:C6 whose A column entries are "red" AND whose B column entries are "big". The comparisons A1:A6="red" and B1:B6="big" each produce a 6 element array of TRUE or FALSE - which in number calculations are 1 or 0. Thus if A2 contains "red" and B2 contains "big" the second element of the array is 1 * 1 * C2 = C2. If A2 contains "blue" instead, the second element of the array is 0 * 1 * C2 = 0.

SUM( ((A1:A6="red")+(B1:B6="big")>0)*C1:C6 ) returns the sum of entries in C1:C6 whose A column entries are "red" OR whose B column entries are "big".

SUM( MOD((A1:A6="red")+(B1:B6="big");2)*C1:C6 ) returns the sum of entries in C1:C6 either whose A column entries are "red" OR whose B column entries are "big" but not both (exclusive OR)

SUM( NOT((A1:A6="red")+(B1:B6="big"))*C1:C6 ) returns the sum of entries in C1:C6 where neither the A column entry is "red" NOR the B column entry is "big".

Count of entries matching multiple conditions

SUM( (A1:A6="red")*(B1:B6="big") ) returns the number of rows whose A column entries are "red" AND whose B column entries are "big".
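
The condition arithmetic used in these sums and counts can be checked with a Python sketch. The function name and the sample data are invented for illustration; the three lists stand in for A1:A6, B1:B6 and C1:C6.

```python
def conditional_sums(colours, sizes, values):
    """Model the AND / OR / exclusive OR / NOR sums and the AND count as array arithmetic."""
    is_red = [int(c == "red") for c in colours]    # TRUE/FALSE become 1/0
    is_big = [int(s == "big") for s in sizes]
    rows = list(zip(is_red, is_big, values))
    return {
        "and": sum(r * b * v for r, b, v in rows),
        "or": sum(int(r + b > 0) * v for r, b, v in rows),
        "xor": sum(((r + b) % 2) * v for r, b, v in rows),
        "nor": sum(int(not (r + b)) * v for r, b, v in rows),
        "count_and": sum(r * b for r, b, _ in rows),
    }

# Hypothetical contents of A1:A6, B1:B6 and C1:C6
result = conditional_sums(
    ["red", "blue", "red", "green", "red", "blue"],
    ["big", "big", "small", "big", "big", "small"],
    [10, 20, 30, 40, 50, 60],
)
```

With this data the AND sum is 60 (rows 1 and 5), the OR sum is 150, the exclusive OR sum is 90, the NOR sum is 60 (row 6 only) and the AND count is 2.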

Maximum in a particular month

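MAX(IF(MONTH(B1:B9)=5; C1:C9; 0)) returns the maximum value in C1:C9 where the corresponding B1:B9 date is in May (month 5). Rows outside May contribute 0, so the result is 0 if every May value is negative. For the same reason, simply substituting MIN() would return 0 whenever the May values are all positive; to find the minimum in the month, the fallback 0 should be replaced by a non-numeric value such as "", in the same way as the AVERAGE example.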

Average of entries meeting a condition

AVERAGE(IF(A1:A9="red"; B1:B9; "")) returns the average of entries in B1:B9 whose A column entries are "red". Rows that do not match yield the empty text "", which the AVERAGE function ignores.

The MEDIAN function can be used similarly.
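
A Python sketch of the same idea, with text results skipped in the way the article describes for AVERAGE. The function name and sample data are hypothetical.

```python
def average_where(keys, values, wanted):
    """Model of AVERAGE(IF(keys=wanted; values; "")) as an array formula."""
    # IF() yields the value for matching rows and the empty text "" otherwise
    selected = [v if k == wanted else "" for k, v in zip(keys, values)]
    # AVERAGE ignores the text entries, so only the numbers are averaged
    numbers = [v for v in selected if not isinstance(v, str)]
    return sum(numbers) / len(numbers)             # fails if nothing matches

mean_red = average_where(["red", "blue", "red"], [10, 99, 20], "red")
```

Here mean_red is (10 + 20) / 2 = 15.0; the 99 in the "blue" row does not take part. If no row matches, the sketch raises ZeroDivisionError, which plays the role of a division-by-zero error in the spreadsheet.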

Dynamic sorting of a column

=LARGE(B3:B9;ROW(B3:B9)+1-ROW(B3)) entered in cell C3 returns an array in C3:C9 which is B3:B9 in descending order. ROW(B3:B9)+1-ROW(B3) produces the ranks 1 to 7; since +1-ROW(B3) is the constant -2, you could write =LARGE(B3:B9;ROW(B3:B9)-2) instead.

=SMALL(B3:B9;ROW(B3:B9)-2) returns B3:B9 in ascending order.
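
The way the row numbers turn into ranks can be seen in a Python sketch. The helper names and the sample values are assumptions for illustration; the list stands in for B3:B9.

```python
def large(values, k):
    """Return the k-th largest value (k = 1 is the largest)."""
    return sorted(values, reverse=True)[k - 1]

def dynamic_sort_descending(values, first_row):
    """Model of =LARGE(range; ROW(range)+1-ROW(first cell)) as an array formula."""
    # ROW(range) gives first_row, first_row+1, ...; adding 1-first_row turns that into 1, 2, 3, ...
    ranks = [row + 1 - first_row for row in range(first_row, first_row + len(values))]
    return [large(values, k) for k in ranks]

# Hypothetical contents of B3:B9, which starts on row 3
ordered = dynamic_sort_descending([5, 3, 9, 1, 7, 2, 8], first_row=3)
```

With this data ordered is [9, 8, 7, 5, 3, 2, 1]. Sorting in ascending order works the same way with the k-th smallest value in place of the k-th largest.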

Sum ignoring errors

SUM( IF(ISERROR(A1:A9); 0; A1:A9) ). Normally SUM will propagate any error found.
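
In Python terms the formula replaces each error by 0 before adding. The CellError class below is a hypothetical stand-in for a Calc error value such as a division by zero; it is not part of any real library.

```python
class CellError:
    """Hypothetical stand-in for a Calc error value held in a cell."""
    def __init__(self, name):
        self.name = name

def sum_ignoring_errors(cells):
    """Model of SUM( IF(ISERROR(A1:A9); 0; A1:A9) ) as an array formula."""
    # IF(ISERROR(...); 0; value): errors become 0, everything else is kept
    return sum(0 if isinstance(cell, CellError) else cell for cell in cells)

total = sum_ignoring_errors([1, CellError("#DIV/0!"), 2.5])
```

Here total is 3.5, whereas a plain sum over the same cells would fail on the error entry.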

Average, ignoring zero entries

AVERAGE(IF(B1:B9<>0; B1:B9; ""))

Test a cell for one of a set of values

OR(B2={2; 5; 6}) as an array expression, or OR(B2=2; B2=5; B2=6) as a ‘normal’ formula.

Sum of the smallest 4 numbers

SUM(SMALL(B3:B9; {1;2;3;4})) or SUM(SMALL(B3:B9;ROW(A1:A4))). ROW(A1:A4) produces the column array {1|2|3|4}, which SMALL() accepts in the same way as the row array {1;2;3;4}.
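
A minimal Python sketch of this calculation, with hypothetical helper names and sample data standing in for B3:B9:

```python
def small(values, k):
    """Return the k-th smallest value (k = 1 is the smallest)."""
    return sorted(values)[k - 1]

def sum_smallest(values, count):
    """Model of SUM(SMALL(range; {1;2;...;count})) as an array formula."""
    # SMALL() is evaluated once per rank, and SUM() adds the resulting array
    return sum(small(values, k) for k in range(1, count + 1))

total = sum_smallest([5, 3, 9, 1, 7, 2, 8], 4)
```

The four smallest values here are 1, 2, 3 and 5, so total is 11.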

Last used cell in a column

MAX(ROW(B1:B9)*(B1:B9<>"")) returns the row number of the last cell used.
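
The multiplication of row numbers by a TRUE/FALSE array can be modelled in Python. In this sketch an empty cell is represented by the empty string "", and the function name and sample data are assumptions for illustration.

```python
def last_used_row(cells):
    """Model of MAX(ROW(range)*(range<>"")) for a column starting on row 1."""
    # Each used cell contributes its row number, each empty cell contributes 0
    return max(row * int(value != "") for row, value in enumerate(cells, start=1))

# Hypothetical contents of B1:B9, with rows 1, 3 and 5 in use
last = last_used_row(["a", "", 3, "", "x", "", "", "", ""])
```

Here last is 5. If every cell is empty, every product is 0 and the result is 0.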

Issues

  • Inline arrays work in OOo2.3, but fail if there are any spaces, or negative numbers. This is fixed for OOo2.4 (issue 82644).
  • It will be possible to include different size arrays in a formula, as there is a defined calculation process in the forthcoming international standard ODFF. Calc should comply in OOo3.0. See Issue 46681.
  • Various functions cannot be used in array expressions: COUNTIF, SUMIF (issue 65866); MATCH (issue 8947)
  • Array expressions have to do a lot of calculating - which may slow down your computer with large arrays.