# Difference between revisions of "Documentation/How Tos/Using Arrays"

(→Issues: different dimension arrays - example; re-ordered) |
Jim DeLaHunt (Talk | contribs) (→Array formula calculations: details about size constraints on unexpected arrays, and add example.) |
||

Line 90: | Line 90: | ||

The process in effect works thus: | The process in effect works thus: | ||

− | #All ‘unexpected arrays’ in the same array calculation | + | #All ‘unexpected arrays’ in the same array calculation must either |

+ | #*a) have the same number of rows and columns, or | ||

+ | #*b) have the same number of rows and some arrays may have only one column, or | ||

+ | #*c) have the same number of columns and some arrays may have only one row. | ||

+ | #Where some of the ‘unexpected arrays’ have just one row or column, that row or column is replicated enough times to give that array the same number of rows and columns as the other arrays. (This was tracked in [https://issues.apache.org/ooo/show_bug.cgi?id=46681 Issue 46681] and fixed in OpenOffice.org v3.1). | ||

#The result will be returned in an array of that size. | #The result will be returned in an array of that size. | ||

#The calculation is done for each element in turn, with the result returned in the corresponding element of the output array. | #The calculation is done for each element in turn, with the result returned in the corresponding element of the output array. | ||

Line 123: | Line 127: | ||

#The result of the <tt>'''IF()'''</tt> array calculation will be an array of that size, which <tt>'''SUM()'''</tt> will add up. | #The result of the <tt>'''IF()'''</tt> array calculation will be an array of that size, which <tt>'''SUM()'''</tt> will add up. | ||

#If A1>0 the first element is <tt>'''B1'''</tt>; else <tt>'''0'''</tt>. If A2>0 the second element is <tt>'''B2'''</tt>; else <tt>'''0'''</tt> .... The array presented to <tt>'''SUM()'''</tt> has the values in B1:B4 where the adjacent value in A1:A4 is >0. The final output is the sum of the values in B1:B4 for which the adjacent value in A1:A4 is >0. | #If A1>0 the first element is <tt>'''B1'''</tt>; else <tt>'''0'''</tt>. If A2>0 the second element is <tt>'''B2'''</tt>; else <tt>'''0'''</tt> .... The array presented to <tt>'''SUM()'''</tt> has the values in B1:B4 where the adjacent value in A1:A4 is >0. The final output is the sum of the values in B1:B4 for which the adjacent value in A1:A4 is >0. | ||

+ | |||

+ | ''Example:'' | ||

+ | |||

+ | With the array formula <tt>'''=((({6|8})>({1;3;5|7;9;10}))*({1;3;5|7;9;10}))'''</tt> | ||

+ | |||

+ | # The first array has two rows and one column. The other two arrays have two rows and three columns. | ||

+ | # The single column of the first array is replicated three times, to make an array with two rows and three columns. | ||

+ | # The result is <tt>'''{1;3;5|7;0;0}'''</tt>. | ||

+ | # If the first array had more than two rows, or a different number of columns than one or three, it would not be sized correctly for this formula. | ||

== Array functions == | == Array functions == |

## Revision as of 01:27, 12 October 2014

## Contents

- 1 Introduction
- 2 Functions which understand array parameters
- 3 Functions not expecting array parameters
- 4 Array formulas
- 5 Array formula calculations
- 6 Array functions
- 7 Tips and Tricks
- 7.1 Sum of entries matching multiple conditions
- 7.2 Count of entries matching multiple conditions
- 7.3 Maximum in a particular month
- 7.4 Average of entries meeting a condition
- 7.5 Dynamic sorting of a column
- 7.6 Sum ignoring errors
- 7.7 Average, ignoring zero entries
- 7.8 Test a cell for one of a set of values
- 7.9 Sum of the smallest 4 numbers
- 7.10 Last used cell in a column

- 8 Issues

## Introduction

Arrays may be helpful in reducing the number of cells you use, but they can be a little complex to understand. In some ways, they extend what a spreadsheet can do beyond what spreadsheets were intended for. Almost all tasks are possible without them.

An array is simply a rectangular block of information that Calc can manipulate in a formula - that is, it is information organised in rows and columns. An array may be cells on a spreadsheet, or may be held internally by Calc.

There are 2 ways to specify an array in a formula:

- as a range - for example **A2:C3**.
- as an "inline array", for example **{1; 5; 3 | 6; 2; 4}** (these are fully functional from OOo2.4, but do exist in earlier versions - see Array Issues). You type curly braces ‘**{ }**’ around an inline array. Entries on a row are separated by a semicolon ‘**;**’, and rows are separated by the pipe character ‘**|**’. Each row must have the same number of elements (it is wrong to write **{1; 2; 3 | 4; 5}** because there are 3 elements in the top row and only 2 in the next row). Inline arrays may have mixed contents, for example **{4; 2; "dog" | -22; "cat"; 0}**. However an inline array may __not__ contain references (eg **A4**), or formulae (eg **PI()**, **2*3**), or percentages (eg **5%**).

You can give a name to a range of cells: select the range and choose **Insert - Names - Define**.

You can give a name to an inline array: choose **Insert - Names - Define**; type the array (eg **{1; 3; 2}** including the curly braces) in the **Assigned to** box.
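
The inline array syntax described above can be illustrated with a small Python sketch. This is not how Calc parses formulas; `parse_cell` and `parse_inline_array` are hypothetical helper names, and the sketch assumes that text entries never contain a semicolon or a pipe character.

```python
def parse_cell(token):
    """Convert one inline-array entry to a Python int, float or str."""
    token = token.strip()
    if len(token) >= 2 and token.startswith('"') and token.endswith('"'):
        return token[1:-1]  # quoted text entry
    try:
        return int(token)
    except ValueError:
        # float() raises ValueError for references, formulae and percentages
        return float(token)


def parse_inline_array(text):
    """Parse '{1; 5; 3 | 6; 2; 4}' into a list of rows."""
    text = text.strip()
    if not (text.startswith("{") and text.endswith("}")):
        raise ValueError("an inline array must be wrapped in curly braces")
    # '|' separates rows, ';' separates entries on a row
    rows = [[parse_cell(tok) for tok in row.split(";")]
            for row in text[1:-1].split("|")]
    if len({len(row) for row in rows}) != 1:
        raise ValueError("every row must have the same number of elements")
    return rows
```

In this sketch a ragged array such as `{1; 2; 3 | 4; 5}`, or an entry such as `A4` or `5%`, raises `ValueError`, mirroring the restrictions listed above.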

## Functions which understand array parameters

Some functions, such as **SUM()**, **AVERAGE()**, **MATCH()**, **LOOKUP()**, accept one or more of their parameters as arrays.

*For example:*

**SUM( A2:C3 )** returns the sum of the numbers in the range **A2:C3**.

**SUM( {3; 2; 4} )** returns **9**, the sum of the numbers in the inline array **{3; 2; 4}**.

**SUM** expects/understands single (‘scalar’) values too - **SUM( B5; 7 )** returns the sum of **B5** and **7**.

## Functions not expecting array parameters

Some functions, such as **ABS()**, **SQRT()**, **COS()**, **LEN()** expect their parameters to be ‘scalar’ - that is, a single value such as **354** or **"dog"** or the contents of a cell eg **B5**.

*Example:*

**SQRT(4)** returns **2**.

**LEN("dog")** returns **3**.

However, you __may still use an array__ where a single value is expected - *for example* **SQRT( {9; 4} )**. If you enter the formula ‘normally’ by pressing **Enter**, Calc will then evaluate the formula using a single value from the array as follows:

If it is an **inline array**:

- Calc will use the first value (the ‘top left’).

*Example:*

**=SQRT( {9; 4 | 25; 16} )** then press **Enter** returns **3**, the square root of the first element in the array (**9**).

If it is a **range**:

- 1. Calc will return an error unless the array is a single row or a single column.

- 2. For a single row or a single column range, Calc will use the value where the formula cell’s column/row intersects with the array (or return an error if there is no intersection).

*Examples:*

- The formula **=ABS(B2:B5)** is entered 'normally' in cell D3, that is on row 3. Row 3 intersects **B2:B5** at cell B3, thus the formula evaluated is **=ABS(B3)**.

- The formula **=LEN(B5:D5)** is entered 'normally' in cell B1, that is in column B. Column B intersects **B5:D5** at cell B5, thus the formula evaluated is **=LEN(B5)**.
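
The intersection rule for ranges can be sketched in Python as follows. The function name `intersect` and the (row, column) tuple representation are assumptions made for illustration only (column B is 2, column D is 4); the sketch raises `ValueError` where Calc would return an error.

```python
def intersect(range_cells, formula_cell):
    """Pick the cell of a single-row or single-column range used in 'normal' mode.

    range_cells  = ((top_row, left_col), (bottom_row, right_col)), 1-based
    formula_cell = (row, col) of the cell holding the formula
    """
    (top, left), (bottom, right) = range_cells
    row, col = formula_cell
    if left == right and top <= row <= bottom:
        return (row, left)   # single column: use the formula cell's row
    if top == bottom and left <= col <= right:
        return (top, col)    # single row: use the formula cell's column
    raise ValueError("range is not a single row/column, or there is no intersection")


# =ABS(B2:B5) entered in D3 evaluates =ABS(B3)
b3 = intersect(((2, 2), (5, 2)), (3, 4))
# =LEN(B5:D5) entered in B1 evaluates =LEN(B5)
b5 = intersect(((5, 2), (5, 4)), (1, 2))
```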

## Array formulas

The real power of arrays comes when you enter a formula in a special way, as an ‘array formula’. You do this by pressing **Ctrl-Shift-Enter** instead of the **Enter** key (or by ticking the *Array* checkbox if you are using the Formula Wizard).

If in cell B1 you enter **={3; 4}** ‘normally’ by pressing **Enter**, the first value **3** is displayed in the cell.

If in cell B2 you type **={3; 4}** but press **Ctrl-Shift-Enter** instead of **Enter**, the cell becomes an 'array formula'. The formula now returns the entire array **{3; 4}**. Cell B2 displays **3** and cell C2 displays **4**.

*Note that if you entered the formula using the **Enter** key, simply selecting the cell and pressing **Ctrl-Shift-Enter** will not convert the cell to an array expression - you must make an actual edit (such as adding then deleting a character), or click the Function Wizard and tick the Array checkbox.*

If you now try to edit cell B2, you are told that *"you cannot change only part of an array"*. To edit an array you must select the entire array, either with the mouse or by typing Ctrl-/ (hold the Ctrl key and press the slash key ‘/’).

The formula bar indicates that this is an array formula by enclosing it in curly braces {}. You do not need to type these - they will disappear while you edit the formula, and Calc will show them again when you have finished editing.

## Array formula calculations

When Calc evaluates an array formula, it treats ‘unexpected arrays’ as a series of values (rather than using a single value), calculating a result for each of the array elements, and returning an __array of results__.

*Example:*

**=SQRT( {16; 4; 25} )** when entered by pressing **Ctrl-Shift-Enter** instead of **Enter** returns an array of results, with 1 row and 3 columns - **{4; 2; 5}**. If the formula is in cell B2, Calc places the results in cells B2:D2. **4** is placed in B2, **2** in C2 and **5** in D2.

The process in effect works thus:

- All ‘unexpected arrays’ in the same array calculation must either
- a) have the same number of rows and columns, or
- b) have the same number of rows and some arrays may have only one column, or
- c) have the same number of columns and some arrays may have only one row.

- Where some of the ‘unexpected arrays’ have just one row or column, that row or column is replicated enough times to give that array the same number of rows and columns as the other arrays. (This was tracked in Issue 46681 and fixed in OpenOffice.org v3.1).
- The result will be returned in an array of that size.
- The calculation is done for each element in turn, with the result returned in the corresponding element of the output array.

*Example:*

With the array formula ` =SQRT( {16; 4; 25} )`:

- There is only one array, with 1 row and 3 columns.
- The result will be returned in an array with 1 row and 3 columns.
- The calculation is done for **16** first, then for **4**, then for **25**, giving the array result **{4; 2; 5}**.

*Example:*

With the array formula ` =SQRT( {8 | 18} * 2 )` in cell A5:

- There is only one array, with 2 rows and 1 column.
- The result will be returned in an array with 2 rows and 1 column.
- The calculation is: first element **SQRT(8*2) = 4**; second element **SQRT(18*2) = 6**; the array result is thus **{4 | 6}** - that is, **4** in cell A5 and **6** in cell A6.

The result of an array expression is an array, which can be used within the formula.

*Example:*

**=SUM(SQRT( {16; 4; 25} ))** as an array formula. The calculation of **SQRT( {16; 4; 25} )**, as before, yields a result of **{4; 2; 5}**, thus giving **SUM( {4; 2; 5} )**, returning a final result in the cell of 4+2+5 = **11**.

*Example:*

With the array formula **=SUM(IF(A1:A4>0; B1:B4; 0))**

- The two arrays **A1:A4** and **B1:B4** both have 4 rows and 1 column.
- The result of the **IF()** array calculation will be an array of that size, which **SUM()** will add up.
- If A1>0 the first element is **B1**; else **0**. If A2>0 the second element is **B2**; else **0** .... The array presented to **SUM()** has the values in B1:B4 where the adjacent value in A1:A4 is >0. The final output is the sum of the values in B1:B4 for which the adjacent value in A1:A4 is >0.
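
The element-by-element evaluation of **=SUM(IF(A1:A4>0; B1:B4; 0))** can be mimicked in Python. The cell values below are invented purely for illustration, and `sum_if_positive` is a hypothetical name.

```python
def sum_if_positive(conditions, values):
    """Mimic SUM(IF(conditions>0; values; 0)) evaluated as an array formula."""
    # IF() yields one element per row: the value where the condition holds, else 0
    picked = [v if c > 0 else 0 for c, v in zip(conditions, values)]
    # SUM() then adds up the intermediate array
    return picked, sum(picked)


a1_a4 = [3, -1, 0, 7]      # hypothetical contents of A1:A4
b1_b4 = [10, 20, 30, 40]   # hypothetical contents of B1:B4
intermediate, total = sum_if_positive(a1_a4, b1_b4)
```

With these sample values only the first and last rows satisfy the condition, so the intermediate array keeps 10 and 40 and replaces the other two elements with 0.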

*Example:*

With the array formula **=((({6|8})>({1;3;5|7;9;10}))*({1;3;5|7;9;10}))**

- The first array has two rows and one column. The other two arrays have two rows and three columns.
- The single column of the first array is replicated three times, to make an array with two rows and three columns.
- The result is **{1;3;5|7;0;0}**.
- If the first array had more than two rows, or a different number of columns than one or three, it would not be sized correctly for this formula.
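
The replication of a single row or column can be sketched in Python, with arrays held as lists of rows. The names `common_size`, `expand` and `array_calc` are hypothetical, and the sketch is slightly more permissive than the rules above (it would also accept a single row combined with a single column); that is a simplification of the sketch, not a statement about Calc.

```python
def common_size(arrays):
    """Return (rows, cols) of the result, or raise if an array does not fit."""
    rows = max(len(a) for a in arrays)
    cols = max(len(a[0]) for a in arrays)
    for a in arrays:
        if len(a) not in (1, rows) or len(a[0]) not in (1, cols):
            raise ValueError("arrays are not sized correctly for this formula")
    return rows, cols


def expand(a, rows, cols):
    """Replicate a single row or column until the array has rows x cols elements."""
    return [[a[i % len(a)][j % len(a[0])] for j in range(cols)]
            for i in range(rows)]


def array_calc(func, *arrays):
    """Apply func element by element, returning an array of results."""
    rows, cols = common_size(arrays)
    expanded = [expand(a, rows, cols) for a in arrays]
    return [[func(*(e[i][j] for e in expanded)) for j in range(cols)]
            for i in range(rows)]


m = [[1, 3, 5], [7, 9, 10]]
# =((({6|8})>({1;3;5|7;9;10}))*({1;3;5|7;9;10}))
result = array_calc(lambda a, b, c: (a > b) * c, [[6], [8]], m, m)
```

Here `result` is `[[1, 3, 5], [7, 0, 0]]`, matching the array **{1;3;5|7;0;0}** in the example above.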

## Array functions

Some functions calculate their result as an array. If they are to __return__ that array, they must be entered as an array formula, by pressing **Ctrl-Shift-Enter** (or by ticking the *Array* checkbox if you are using the Formula Wizard). If, instead, they are entered 'normally' with the **Enter** key, just the (single) top left element of the calculated array will be returned. These functions are listed in the **Array function category**, and are:

*For example*: =MUNIT(2) entered as an array formula in cell B1 returns the 2 x 2 unit (identity) matrix as the array {1; 0 | 0; 1} - that is, cells B1, B2, C1, C2 show 1, 0, 0, 1. If this formula is entered 'normally', just the 'top left' value (1) is returned in cell B1.
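
The difference between the two ways of entering **=MUNIT(2)** can be illustrated with a short Python sketch. `munit` and `entered_normally` are hypothetical helpers that only model the behaviour described above.

```python
def munit(n):
    """Build the n x n unit (identity) matrix as a list of rows."""
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]


def entered_normally(array_result):
    """Model 'normal' entry: only the top left element is returned."""
    return array_result[0][0]


as_array_formula = munit(2)                     # whole array is returned
as_normal_formula = entered_normally(munit(2))  # just the top left value
```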

Some functions (including some listed above) take parameters that are forced to evaluate as an array formula, even when the formula is entered 'normally':

- MDETERM, MINVERSE, MMULT, SUMPRODUCT, SUMX2MY2, SUMX2PY2, SUMXMY2, CORREL, COVAR, FORECAST, FTEST, INTERCEPT, MODE, PEARSON, PROB, RSQ, SLOPE, STEYX, TTEST

*For example*: MODE(ABS(A1:A3)) (in normal mode) forces ABS(A1:A3) to be evaluated as an array formula, returning an array of the absolute values of A1:A3 from which MODE selects the most common value to be returned.

*For example*: PROB(A1:A5; B1:B5/100; 3) (in normal mode) forces B1:B5/100 to be evaluated as an array formula, returning an array where each value is one hundredth of its value in B1:B5 (perhaps because the probabilities were expressed in percent).

A few of these functions have some (array) parameters that are forced to evaluate as an array formula, and other (single value) parameters that evaluate normally.

*For example*: PROB(A1:A5; B1:B5; {3; 4}) (in normal mode) does not evaluate {3; 4} as an array, because this would mean PROB returning an array. It evaluates PROB(A1:A5; B1:B5; 3) - that is, it interprets {3; 4} in scalar mode, taking the 'top left' value 3. However, if entered as an array formula it will of course return an array.

## Tips and Tricks

__All these examples are to be entered as an array expression, by pressing Ctrl-Shift-Enter.__

#### Sum of entries matching multiple conditions

**SUM( (A1:A6="red")*(B1:B6="big")*C1:C6 )** returns the sum of entries in C1:C6 whose A column entries are **"red"** AND whose B column entries are **"big"**. The comparisons on A1:A6 and B1:B6 each produce a 6 element array of **TRUE** or **FALSE** - which in number calculations are **1** or **0**. Thus if A2 contains **"red"** and B2 contains **"big"** the second element of the array is **1 * 1 * C2 = C2**. If A2 contains **"blue"** instead, the second element of the array is **0 * 1 * C2 = 0**.

**SUM( ((A1:A6="red")+(B1:B6="big")>0)*C1:C6 )** returns the sum of entries in C1:C6 whose A column entries are **"red"** OR whose B column entries are **"big"**.

**SUM( MOD((A1:A6="red")+(B1:B6="big");2)*C1:C6 )** returns the sum of entries in C1:C6 either whose A column entries are **"red"** OR whose B column entries are **"big"** but not both (exclusive OR).

**SUM( NOT((A1:A6="red")+(B1:B6="big"))*C1:C6 )** returns the sum of entries in C1:C6 where neither the A column entry is **"red"** NOR the B column entry is **"big"**.

#### Count of entries matching multiple conditions

**SUM( (A1:A6="red")*(B1:B6="big") )** returns the number of rows whose A column entries are **"red"** AND whose B column entries are **"big"**.

#### Maximum in a particular month

**MAX(IF(MONTH(B1:B9)=5; C1:C9; 0))** returns the maximum value in C1:C9 where the corresponding B1:B9 date is in May (month **5**). The **MIN()** function would provide the minimum in the month.

#### Average of entries meeting a condition

**AVERAGE(IF(A1:A9="red"; B1:B9; ""))** returns the average of entries in B1:B9 whose A column entries are **"red"**. The **AVERAGE** function ignores any blank entries.

The **MEDIAN** function can be used similarly.
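
The boolean arithmetic behind the conditional sum, count and average formulas in this section can be tried out in Python, where `True` and `False` also behave as 1 and 0 in calculations. The sample rows are invented for illustration, and each line only mirrors the shape of the corresponding Calc formula.

```python
# Hypothetical (colour, size, value) rows standing in for columns A, B and C
rows = [
    ("red", "big", 5),
    ("blue", "big", 3),
    ("red", "small", 2),
    ("red", "big", 5),
    ("blue", "small", 7),
]

# AND: product of the two comparisons
sum_and = sum((a == "red") * (b == "big") * c for a, b, c in rows)
# OR: at least one comparison is true
sum_or = sum(((a == "red") + (b == "big") > 0) * c for a, b, c in rows)
# Exclusive OR: exactly one comparison is true
sum_xor = sum(((a == "red") + (b == "big")) % 2 * c for a, b, c in rows)
# NOR: neither comparison is true
sum_nor = sum((not ((a == "red") + (b == "big"))) * c for a, b, c in rows)
# Count of rows matching both conditions
count_and = sum((a == "red") * (b == "big") for a, b, c in rows)
# Average of the values whose colour is "red" (other rows are left out)
red_values = [c for a, b, c in rows if a == "red"]
average_red = sum(red_values) / len(red_values)
```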

#### Dynamic sorting of a column

**=LARGE(B3:B9;ROW(B3:B9)+1-ROW(B3))** entered in cell C3 returns an array in C3:C9 which is B3:B9 in descending order. **+1-ROW(B3)** is a constant - you could write **=LARGE(B3:B9;ROW(B3:B9)-2)** instead.

**=SMALL(B3:B9;ROW(B3:B9)-2)** returns B3:B9 in ascending order.
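
How the row numbers become ranks for **LARGE** and **SMALL** can be sketched in Python. The values in `b3_b9` are invented, and `large` and `small` are simplified stand-ins for the Calc functions.

```python
def large(values, k):
    """k-th largest value (k = 1 is the largest)."""
    return sorted(values, reverse=True)[k - 1]


def small(values, k):
    """k-th smallest value (k = 1 is the smallest)."""
    return sorted(values)[k - 1]


b3_b9 = [12, 45, 7, 30, 45, 1, 19]        # hypothetical contents of B3:B9
row_numbers = range(3, 10)                # ROW(B3:B9) gives 3, 4, ..., 9
ranks = [r + 1 - 3 for r in row_numbers]  # +1-ROW(B3) turns them into 1..7

descending = [large(b3_b9, k) for k in ranks]
ascending = [small(b3_b9, k) for k in ranks]
```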

#### Sum ignoring errors

**SUM( IF(ISERROR(A1:A9); 0; A1:A9) )**. Normally **SUM** will propagate any error found.

#### Average, ignoring zero entries

**AVERAGE(IF(B1:B9<>0; B1:B9; ""))**
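
The two filtering formulas above (sum ignoring errors, average ignoring zero entries) can be sketched in Python. `ERR` is a hypothetical sentinel standing in for a Calc error value, and the function names are invented for illustration.

```python
ERR = object()  # hypothetical stand-in for an error value in a cell


def sum_ignoring_errors(values):
    """Mimic SUM(IF(ISERROR(values); 0; values)): errors count as 0."""
    return sum(0 if v is ERR else v for v in values)


def average_ignoring_zeros(values):
    """Mimic AVERAGE(IF(values<>0; values; "")): zero entries are left out."""
    kept = [v for v in values if v != 0]
    return sum(kept) / len(kept)  # fails if every entry is zero
```

For example, `sum_ignoring_errors([1, ERR, 4])` adds only the two valid numbers, and `average_ignoring_zeros([4, 0, 6, 0])` averages only 4 and 6.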

#### Test a cell for one of a set of values

**OR(B2={2; 5; 6})** as an array expression, or **OR(B2=2; B2=5; B2=6)** as a ‘normal’ formula.

#### Sum of the smallest 4 numbers

**SUM(SMALL(B3:B9; {1;2;3;4}))** or **SUM(SMALL(B3:B9;ROW(A1:A4)))**. **ROW(A1:A4)** produces **{1;2;3;4}**.

#### Last used cell in a column

` MAX(ROW(B1:B9)*(B1:B9<>""))` returns the row number of the last cell used.
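
The last two tips can be mimicked in Python with invented sample data; empty cells are represented here by empty strings, which is an assumption of the sketch.

```python
# Hypothetical contents of B1:B9, with "" standing for an empty cell
b1_b9 = [4, "", 9, 2, "", 7, 3, "", ""]

# MAX(ROW(B1:B9)*(B1:B9<>"")): row number times 1 (used) or 0 (empty)
last_used_row = max(row * (value != "")
                    for row, value in enumerate(b1_b9, start=1))

# SUM(SMALL(B3:B9; {1;2;3;4})): add the 1st to 4th smallest values
b3_b9 = [12, 45, 7, 30, 45, 1, 19]   # hypothetical contents of B3:B9
sum_smallest_4 = sum(sorted(b3_b9)[k - 1] for k in (1, 2, 3, 4))
```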

## Issues

- Some functions cannot yet be used in array formulas: **COUNTIF**, **SUMIF** (issue 65866), and **ISLOGICAL** (issue 87219). **MATCH** (issue 8947) works from OOo3.0.
- **OFFSET** has limited behaviour within an array formula.
- Names defined by **Insert - Names - Define** can be used within an array formula, but labels (either **Insert - Names - Labels** or automatically recognised at the head of a column) should not be used (issue 87268).
- In Excel, not all functions can be used in array formulas. For example **CONCATENATE** works in Calc but not in Excel. Be aware of Excel's limitations if you plan to use a Calc spreadsheet in Excel.
- Volatile functions such as **RAND**, **RANDBETWEEN**, **NOW** may not be recalculated in array formulas; for example in **{=A1:A3+RAND()}** just one random number is generated (issue 67135).
- Before OOo2.4 spaces and negative numbers in inline arrays failed (issue 82644).
- Before OOo3.1 it was not possible to include different dimension arrays in a formula (issue 46681). Now **{ {1|2} + {10;20|30;40} }** internally expands **{1|2}** to **{1;1|2;2}**, to correctly return the result **{11;21|32;42}**.
- Array formulas may slow down your computer if you have large arrays.

Content on this page is licensed under the Public Documentation License (PDL).