Difference between revisions of "Documentation/How Tos/Using Arrays"

From Apache OpenOffice Wiki
Jump to: navigation, search
m (Functions which understand array paramters)
m (Issues)
 
(44 intermediate revisions by 5 users not shown)
Line 1: Line 1:
'''This is a work in progress - unfinished'''
 
 
 
== Introduction ==
 
== Introduction ==
  
Firstly, be aware that you can almost certainly manage without arrays. They do make life more complicated - in some ways, they extend what a spreadsheet can do beyond what spreadsheets were intended for. Arrays can however reduce the number of cells used.
+
Arrays may be helpful in reducing the number of cells you use, but they can be a little complex to understand. In some ways, they extend what a spreadsheet can do beyond what spreadsheets were intended for. Almost all tasks are possible without  them.
  
An array is simply a rectangular block of information that Calc can manipulate in a formula - that is, it is information organised in rows and columns.
+
An array is simply a rectangular block of information that Calc can manipulate in a formula - that is, it is information organised in rows and columns. An array may be cells on a spreadsheet, or may be held internally by Calc.
  
 
There are 2 ways to specify an array in a formula:
 
There are 2 ways to specify an array in a formula:
 
*as a range - for example <tt>'''A2:C3'''</tt>.
 
*as a range - for example <tt>'''A2:C3'''</tt>.
*as an "inline array", for example <tt>'''{1; 5; 3 | 6; 2; 4}'''</tt>  (this needs OpenOffice 2.4 - see [[Documentation/How_Tos/Using Arrays#Issues|Array Issues]])
+
[[Image:Calc_array4.png|right]]
 
+
*as an "inline array", for example <tt>'''{1; 5; 3 | 6; 2; 4}'''</tt>  (these are fully functional from OOo2.4, but do exist in earlier versions - see [[Documentation/How_Tos/Using Arrays#Issues|Array Issues]]). You type curly braces <tt>'''{ }'''</tt> around an inline array. Entries on a row are separated by a semicolon ‘<tt>''';'''</tt>’, and rows are separated by the pipe character ‘<tt>'''|'''</tt>’. Each row must have the same number of elements (it is wrong to write <tt>'''{1; 2; 3 | 4; 5}'''</tt> because there are 3 elements in the top row and only 2 in the next row). Inline arrays may have mixed contents, for example <tt>'''{4; 2; "dog" | -22; "cat"; 0}'''</tt>. However an inline array may <u>not</u> contain references (eg <tt>'''A4'''</tt>), or formulae (eg <tt>'''PI()'''</tt>, <tt>'''2*3'''</tt> ), or percentages (eg  <tt>'''5%'''</tt>).
<center>[[Image:Calc_array4.png]]</center>
+
<br style="clear:both;" />
 
+
:You type curly braces <tt>'''{ }'''</tt> around an inline array. Entries on a row are separated by a semicolon ‘<tt>''';'''</tt>’, and rows are separated by the pipe character ‘<tt>'''|'''</tt>’. Each row must have the same number of elements (it is wrong to write <tt>'''{1; 2; 3 | 4; 5}'''</tt> because there are 3 elements in the top row and only 2 in the next row).
+
  
:Inline arrays may have mixed contents, for example <tt>'''{4; 2; "dog" | -22; "cat"; 0}'''</tt> is valid. However an inline array may <u>not</u> contain references (eg <tt>'''A4'''</tt>) or formulae (eg <tt>'''INT(7/3)'''</tt> ).
 
 
[[Image:Calc_array5.png|right]]
 
[[Image:Calc_array5.png|right]]
 
You can give a name to a range of cells: select the range and choose '''Insert - Names - Define'''.
 
You can give a name to a range of cells: select the range and choose '''Insert - Names - Define'''.
Line 33: Line 28:
 
:<tt>'''SUM( {3; 2; 4} )'''</tt> returns <tt>'''9'''</tt>, the sum of the numbers in the inline array <tt>'''{3; 2; 4}'''</tt>.
 
:<tt>'''SUM( {3; 2; 4} )'''</tt> returns <tt>'''9'''</tt>, the sum of the numbers in the inline array <tt>'''{3; 2; 4}'''</tt>.
  
:<tt>'''SUM( B5; 7 )'''</tt> returns the sum of <tt>'''B5'''</tt> and <tt>'''7'''</tt>. <tt>'''SUM()'''</tt> expects/understands single (‘scalar’) values too.
+
:<tt>'''SUM'''</tt> expects/understands single (‘scalar’) values too - <tt>'''SUM( B5; 7 )'''</tt> returns the sum of <tt>'''B5'''</tt> and <tt>'''7'''</tt>.
  
 
== Functions not expecting array parameters ==
 
== Functions not expecting array parameters ==
Line 69: Line 64:
 
:: The formula <tt>'''=LEN(B5:D5)'''</tt> is entered 'normally' in cell B1, that is in column B. Column B intersects <tt>'''B5:D5'''</tt> at cell B5, thus the formula evaluated is <tt>'''=LEN(B5)'''</tt>.
 
:: The formula <tt>'''=LEN(B5:D5)'''</tt> is entered 'normally' in cell B1, that is in column B. Column B intersects <tt>'''B5:D5'''</tt> at cell B5, thus the formula evaluated is <tt>'''=LEN(B5)'''</tt>.
  
== Array expressions ==
+
== Array formulas ==
The real power of arrays comes when you enter a formula in a special way, as an ‘array expression’. You do this by pressing '''Cntrl-Shift-Enter''' instead of the '''Enter''' button.
+
The real power of arrays comes when you enter a formula in a special way, as an ‘array formula’. You do this by pressing '''Cntrl-Shift-Enter''' instead of the '''Enter''' button (or by ticking the ''Array'' checkbox if you are using the Formula Wizard).
  
 
If in cell B1 you enter <tt>'''={3; 4}'''</tt> ‘normally’ by pressing '''Enter''', the first value <tt>'''3'''</tt> is displayed in the cell.
 
If in cell B1 you enter <tt>'''={3; 4}'''</tt> ‘normally’ by pressing '''Enter''', the first value <tt>'''3'''</tt> is displayed in the cell.
  
If in cell B2 you type <tt>'''={3; 4}'''</tt> but press '''Cntrl-Shift-Enter''' instead of '''Enter''', the cell becomes an “array expression”. The formula now returns the entire array <tt>'''{3; 4}'''</tt>. Cell B2 displays <tt>'''3'''</tt> and cell C2 displays <tt>'''4'''</tt>.
+
If in cell B2 you type <tt>'''={3; 4}'''</tt> but press '''Cntrl-Shift-Enter''' instead of '''Enter''', the cell becomes an 'array formula'. The formula now returns the entire array <tt>'''{3; 4}'''</tt>. Cell B2 displays <tt>'''3'''</tt> and cell C2 displays <tt>'''4'''</tt>.
  
:''Note that if you entered the formula using the '''Enter''' key, simply selecting the cell and pressing '''Cntrl-Shift-Enter''' will not convert the cell to an array expression - you must make an actual edit (such as adding then deleting a character).''
+
:''Note that if you entered the formula using the '''Enter''' key, simply selecting the cell and pressing '''Cntrl-Shift-Enter''' will not convert the cell to an array expression - you must make an actual edit (such as adding then deleting a character), or click the Function Wizard and tick the ''Array'' checkbox.''
  
 
If you now try to edit cell B2, you are told that ''"you cannot change only part of an array"''.  To edit an array you must select the entire array, either with the mouse or by typing Cntrl-/ (hold the Cntrl key and press the slash key ‘/’).
 
If you now try to edit cell B2, you are told that ''"you cannot change only part of an array"''.  To edit an array you must select the entire array, either with the mouse or by typing Cntrl-/ (hold the Cntrl key and press the slash key ‘/’).
 +
[[Image:Calc_array6.png|right]]
  
The formula bar indicates that this is an array expression by putting curly braces {} around the formula. You do not need to type these - they will disappear while you edit the formula, and Calc will show them again when you have finished editing.
 
  
- pic here -
+
The formula bar indicates that this is an array formula by enclosing it in curly braces {}. You do not need to type these - they will disappear while you edit the formula, and Calc will show them again when you have finished editing.
 +
<br style="clear:both;" />
  
- using the function wizard to be described here -
+
== Array formula calculations ==
 
+
When Calc evaluates an array formula, it treats ‘unexpected arrays’ as a series of values (rather than using a single value), calculating a result for each of the array elements, and returning an <u>array of results</u>.
== Array expression calculations ==
+
In an array expression, when Calc evaluates the formula, it treats ‘unexpected arrays’ as arrays (rather than using a single value), calculating a value for each of the array elements, and returning an array of results.
+
  
 
''Example:''
 
''Example:''
 
+
[[Image:Calc_array3.png|right]]
:<tt>'''=SQRT( {16; 4; 25} )'''</tt> when entered by pressing Cntrl-Shift-Enter instead of Enter returns an array of results, with 1 row and 3 columns - <tt>'''{4; 2; 5}'''</tt>. If the formula is in cell B2, Calc places the results in cells B2:D2. <tt>'''4'''</tt> is placed in B2, <tt>'''2'''</tt> in C2 and <tt>'''5'''</tt> in D2.
+
:<tt>'''=SQRT( {16; 4; 25} )'''</tt> when entered by pressing '''Cntrl-Shift-Enter''' instead of '''Enter''' returns an array of results, with 1 row and 3 columns - <tt>'''{4; 2; 5}'''</tt>. If the formula is in cell B2, Calc places the results in cells B2:D2. <tt>'''4'''</tt> is placed in B2, <tt>'''2'''</tt> in C2 and <tt>'''5'''</tt> in D2.
 
+
<br style="clear:both;" />
-pic here-
+
  
 
The process in effect works thus:
 
The process in effect works thus:
  
#All ‘unexpected arrays’ in the same array calculation should be the same size (see this issue: Issue 46681).
+
#All ‘unexpected arrays’ in the same array calculation must either
 +
#*a) have the same number of rows and columns, or
 +
#*b) have the same number of rows and some arrays may have only one column, or
 +
#*c) have the same number of rows and some arrays may have only one column.
 +
#Where some of the ‘unexpected arrays’ have just one row or column, that row or column is replicated enough times to give that array the same number of rows and columns as the other arrays. (This was tracked in [https://issues.apache.org/ooo/show_bug.cgi?id=46681 Issue 46681] and fixed in OpenOffice.org v3.1).
 
#The result will be returned in an array of that size.
 
#The result will be returned in an array of that size.
#Any single values are in effect expanded into an array of that size, where each element has that value.
 
 
#The calculation is done for each element in turn, with the result returned in the corresponding element of the output array.
 
#The calculation is done for each element in turn, with the result returned in the corresponding element of the output array.
  
 
''Example:''
 
''Example:''
  
With the array expression <tt>'''=SQRT( {16; 4; 25} )'''</tt>:
+
With the array formula <tt>'''=SQRT( {16; 4; 25} )'''</tt>:
  
 
#There is only one array, with 1 row and 3 columns.
 
#There is only one array, with 1 row and 3 columns.
 
#The result will be returned in an array with 1 row and 3 columns.
 
#The result will be returned in an array with 1 row and 3 columns.
#There are no single values.
 
 
#The calculation is done for <tt>'''16'''</tt> first, then for <tt>'''4'''</tt>, then for <tt>'''25'''</tt>, giving the array result <tt>'''{4; 2; 5}'''</tt>.
 
#The calculation is done for <tt>'''16'''</tt> first, then for <tt>'''4'''</tt>, then for <tt>'''25'''</tt>, giving the array result <tt>'''{4; 2; 5}'''</tt>.
  
 
''Example:''
 
''Example:''
  
With the array expression <tt>'''=SQRT( {8 | 18} * 2 )'''</tt> in cell A5:
+
With the array formula <tt>'''=SQRT( {8 | 18} * 2 )'''</tt> in cell A5:
  
 
#There is only one array, with 2 rows and 1 column.
 
#There is only one array, with 2 rows and 1 column.
 
#The result will be returned in an array with 2 rows and 1 column.
 
#The result will be returned in an array with 2 rows and 1 column.
#The single value <tt>'''2'''</tt> is expanded, so we have <tt>'''SQRT( {8 | 18} * {2 | 2} )'''</tt>.
 
 
#The calculation is: first element <tt>'''SQRT(8*2) = 4'''</tt>; second element <tt>'''SQRT(18*2) = 6'''</tt>; the array result is thus <tt>'''{4 | 6}'''</tt> - that is, <tt>'''4'''</tt> in cell A5 and <tt>'''6'''</tt> in cell A6.
 
#The calculation is: first element <tt>'''SQRT(8*2) = 4'''</tt>; second element <tt>'''SQRT(18*2) = 6'''</tt>; the array result is thus <tt>'''{4 | 6}'''</tt> - that is, <tt>'''4'''</tt> in cell A5 and <tt>'''6'''</tt> in cell A6.
  
Line 124: Line 118:
 
''Example:''
 
''Example:''
  
:<tt>'''=SUM(SQRT( {16; 4; 25} ))'''</tt> as an array expression. The calculation of <tt>'''SQRT( {16; 4; 25} )'''</tt>, as before, yields a result of <tt>'''{4; 2; 5}'''</tt>, thus giving <tt>'''SUM( {4; 2; 5} )'''</tt>, returning a final result in the cell of 4+2+5 = <tt>'''11'''</tt>.
+
:<tt>'''=SUM(SQRT( {16; 4; 25} ))'''</tt> as an array formula. The calculation of <tt>'''SQRT( {16; 4; 25} )'''</tt>, as before, yields a result of <tt>'''{4; 2; 5}'''</tt>, thus giving <tt>'''SUM( {4; 2; 5} )'''</tt>, returning a final result in the cell of 4+2+5 = <tt>'''11'''</tt>.
  
 
''Example:''
 
''Example:''
  
With the array expression <tt>'''=SUM(IF(A1:A4>0; B1:B4; 0))'''</tt>
+
With the array formula <tt>'''=SUM(IF(A1:A4>0; B1:B4; 0))'''</tt>
  
#The two arrays are the same size.
+
#The two arrays <tt>'''A1:A4'''</tt> and <tt>'''B1:B4'''</tt> both have 4 rows and 1 column.
 
#The result of the <tt>'''IF()'''</tt> array calculation will be an array of that size, which <tt>'''SUM()'''</tt> will add up.
 
#The result of the <tt>'''IF()'''</tt> array calculation will be an array of that size, which <tt>'''SUM()'''</tt> will add up.
#Expanding the single values gives <tt>'''=SUM(IF(A1:A4>{0;0;0;0}; B1:B4; {0;0;0;0}))'''</tt>
 
 
#If A1>0 the first element is <tt>'''B1'''</tt>; else <tt>'''0'''</tt>. If A2>0 the second element is <tt>'''B2'''</tt>; else <tt>'''0'''</tt> .... The array presented to <tt>'''SUM()'''</tt> has the values in B1:B4 where the adjacent value in A1:A4 is >0. The final output is the sum of the values in B1:B4 for which the adjacent value in A1:A4 is >0.
 
#If A1>0 the first element is <tt>'''B1'''</tt>; else <tt>'''0'''</tt>. If A2>0 the second element is <tt>'''B2'''</tt>; else <tt>'''0'''</tt> .... The array presented to <tt>'''SUM()'''</tt> has the values in B1:B4 where the adjacent value in A1:A4 is >0. The final output is the sum of the values in B1:B4 for which the adjacent value in A1:A4 is >0.
  
== Functions which are always array expressions ==
+
''Example:''
Some functions are intended to be evaluated in 'array' mode. They are listed in the '''[[Documentation/How_Tos/Calc: Array functions|Array function category]]''', and are:
+
 
:[[Documentation/How_Tos/Calc: FREQUENCY function|FREQUENCY]], [[Documentation/How_Tos/Calc: GROWTH function|GROWTH]], [[Documentation/How_Tos/Calc: LINEST function|LINEST]], [[Documentation/How_Tos/Calc: LOGEST function|LOGEST]], [[Documentation/How_Tos/Calc: MDETERM function|MDETERM]], [[Documentation/How_Tos/Calc: MINVERSE function|MINVERSE]], [[Documentation/How_Tos/Calc: MMULT function|MMULT]], [[Documentation/How_Tos/Calc: MUNIT function|MUNIT]], [[Documentation/How_Tos/Calc: SUMPRODUCT function|SUMPRODUCT]], [[Documentation/How_Tos/Calc: SUMX2MY2 function|SUMX2MY2]], [[Documentation/How_Tos/Calc: SUMX2PY2 function|SUMX2PY2]], [[Documentation/How_Tos/Calc: SUMXMY2 function|SUMXMY2]], [[Documentation/How_Tos/Calc: TRANSPOSE function|TRANSPOSE]], [[Documentation/How_Tos/Calc: TREND function|TREND]]
+
With the array formula <tt>'''=((({6|8})>({1;3;5|7;9;10}))*({1;3;5|7;9;10}))'''</tt>
 +
 
 +
# The first array has two rows and one column. The other two arrays have two rows and three columns.
 +
# The single column of the first array is replicated three times, to make an array with two rows and three columns.
 +
# The result is <tt>'''{1;3;5|7;0;0}'''</tt>.
 +
# If the first array had more than two rows, or a different number of columns than one or three, it would not be sized correctly for this formula.
 +
 
 +
== Array functions ==
 +
Some functions calculate their result as an array. If they are to <u>return</u> that array, they must be entered as an array formula, by pressing '''Cntrl-Shift-Enter''' (or by ticking the ''Array'' checkbox if you are using the Formula Wizard). If, instead, they are entered 'normally' with the '''Enter''' button, just the (single) top left element of the calculated array will be returned. These funtions are listed in the '''[[Documentation/How_Tos/Calc: Array functions|Array function category]]''', and are:
 +
:[[Documentation/How_Tos/Calc: MINVERSE function|MINVERSE]], [[Documentation/How_Tos/Calc: MMULT function|MMULT]], [[Documentation/How_Tos/Calc: MUNIT function|MUNIT]], [[Documentation/How_Tos/Calc: TRANSPOSE function|TRANSPOSE]], [[Documentation/How_Tos/Calc: FREQUENCY function|FREQUENCY]], [[Documentation/How_Tos/Calc: GROWTH function|GROWTH]], [[Documentation/How_Tos/Calc: LINEST function|LINEST]], [[Documentation/How_Tos/Calc: LOGEST function|LOGEST]], [[Documentation/How_Tos/Calc: TREND function|TREND]]
 +
 
 +
 
 +
''For example'': =[[Documentation/How_Tos/Calc: MUNIT function|MUNIT(2)]] entered as an array formula in cell B1 returns the 2 x 2 unit (identity) matrix as the array {1; 0 | 0; 1} - that is, cells B1, B2, C1, C2 show 1, 0, 0, 1. If this formula is entered 'normally', just the 'top left' value (1) is returned in cell B1.
 +
 
 +
 
 +
Some functions (including some listed above) take parameters that are forced to evaluate as an array formula, even when the formula is entered 'normally':
 +
 
 +
:[[Documentation/How_Tos/Calc: MDETERM function|MDETERM]], [[Documentation/How_Tos/Calc: MINVERSE function|MINVERSE]], [[Documentation/How_Tos/Calc: MMULT function|MMULT]], [[Documentation/How_Tos/Calc: SUMPRODUCT function|SUMPRODUCT]], [[Documentation/How_Tos/Calc: SUMX2MY2 function|SUMX2MY2]], [[Documentation/How_Tos/Calc: SUMX2PY2 function|SUMX2PY2]], [[Documentation/How_Tos/Calc: SUMXMY2 function|SUMXMY2]], [[Documentation/How_Tos/Calc: CORREL function|CORREL]], [[Documentation/How_Tos/Calc: COVAR function|COVAR]], [[Documentation/How_Tos/Calc: FORECAST function|FORECAST]], [[Documentation/How_Tos/Calc: FTEST function|FTEST]], [[Documentation/How_Tos/Calc: INTERCEPT function|INTERCEPT]], [[Documentation/How_Tos/Calc: MODE function|MODE]], [[Documentation/How_Tos/Calc: PEARSON function|PEARSON]], [[Documentation/How_Tos/Calc: PROB function|PROB]], [[Documentation/How_Tos/Calc: RSQ function|RSQ]], [[Documentation/How_Tos/Calc: SLOPE function|SLOPE]], [[Documentation/How_Tos/Calc: STEYX function|STEYX]], [[Documentation/How_Tos/Calc: TTEST function|TTEST]]
 +
 
 +
 
 +
''For example'': [[Documentation/How_Tos/Calc: MODE function|MODE(ABS(A1:A3))]] (in normal mode) forces ABS(A1:A3) to be evaluated as an array formula, returning an array of the absolute values of A1:A3 from which MODE selects the most common value to be returned.
 +
 
 +
''For example'': [[Documentation/How_Tos/Calc: PROB function|PROB(A1:A5; B1:B5/100; 3)]] (in normal mode) forces B1:B5/100 to be evaluated as an array formula, returning an array where each value is one hundredth of its value in B1:B5 (perhaps because the probabilities were expressed in percent).
 +
 
  
For example, [[Documentation/How_Tos/Calc: SUMPRODUCT function|SUMPRODUCT]] multiplies corresponding elements of the arrays given as parameters, then returns the sum.
+
A few of these functions have some (array) parameters that are forced to evaluate as an array formula, and other (single value) parameters that evaluate normally.
  
For example, [[Documentation/How_Tos/Calc: MUNIT function|MUNIT]](2) returns the unit array {1; 0 | 0; 1}.
+
''For example'': [[Documentation/How_Tos/Calc: PROB function|PROB(A1:A5; B1:B5; {3; 4})]] (in normal mode) does not evaulate {3; 4} as an array, because this would mean PROB returning an array. It evaluates PROB(A1:A5; B1:B5; 3) - that is, it interprets {3; 4} in scalar mode, taking the 'top left' value 3. However, if entered as an array formula it will of course return an array.
  
 
== Tips and Tricks ==
 
== Tips and Tricks ==
<u>All these examples are to be entered as an array expression, by pressing '''Cntrl-Shift-Enter'''.</u>
+
{{Note|<u>All these examples are to be entered as an array expression, by pressing '''Ctrl-Shift-Enter'''.</u>}}
  
 
====Sum of entries matching multiple conditions====
 
====Sum of entries matching multiple conditions====
Line 188: Line 204:
  
 
== Issues ==
 
== Issues ==
* Inline arrays work in OOo2.3, but fail if there are any spaces, or negative numbers. This is fixed for OOo2.4 (issue 82644).
+
* Some functions cannot yet be used in array formulas: <tt>'''COUNTIF'''</tt>, <tt>'''SUMIF'''</tt> ([https://bz.apache.org/ooo/show_bug.cgi?id=65866 issue 65866]), and  <tt>'''ISLOGICAL'''</tt> ([https://bz.apache.org/ooo/show_bug.cgi?id=87219 issue 87219]). <tt>'''MATCH'''</tt> ([https://bz.apache.org/ooo/show_bug.cgi?id=8947 issue 8947]) works from OOo3.0.
* It will be possible to include different size arrays in a formula, as there is a defined calculation process in the forthcoming international standard ODFF. Calc should comply in OOo3.0. See Issue 46681.
+
* <tt>'''OFFSET'''</tt> has limited behaviour within an array formula.
* Various functions cannot be used in array expressions: <tt>'''COUNTIF'''</tt>, <tt>'''SUMIF'''</tt> (issue 65866); <tt>'''MATCH'''</tt> (issue 8947)
+
* Names defined by '''Insert - Names - Define''' can be used within an array formula, but labels (either '''Insert - Names - Labels''' or automatically recognised at the head of a column) should not be used ([https://bz.apache.org/ooo/show_bug.cgi?id=87268 issue 87268]).
* Array expressions have to do a lot of calculating - which may slow down your computer with large arrays.
+
* In Excel, not all functions can be used in array formulas. For example <tt>'''CONCATENATE'''</tt> works in Calc but not in Excel. Be aware of Excel's limitations if you plan to use a Calc spreadsheet in Excel.
 +
* Volatile functions such as <tt>'''RAND'''</tt>, <tt>'''RANDBETWEEN'''</tt>, <tt>'''NOW'''</tt> may not be recalculated in array formulas; for example in <tt>'''{=A1:A3+RAND()}'''</tt> just one random number is generated ([https://bz.apache.org/ooo/show_bug.cgi?id=67135 issue 67135]).
 +
* Before OOo2.4 spaces and negative numbers in inline arrays failed ([https://bz.apache.org/ooo/show_bug.cgi?id=82644 issue 82644]).
 +
* Before OOo3.1 it was not possible to include different dimension arrays in a formula ([https://bz.apache.org/ooo/show_bug.cgi?id=46681 issue 46681]). Now <tt>'''{ {1|2} + {10;20|30;40} }'''</tt> internally expands <tt>'''{1|2}'''</tt> to <tt>'''{1;1|2;2}'''</tt>, to correctly return the result <tt>'''{11;21|32;42}'''</tt>.
 +
* Array formulas may slow down your computer if you have large arrays.
 +
* The [[Documentation/How_Tos/Calc:_INDEX_function#INDEX_function_in_array_formula_context|INDEX function]] may return ''#VALUE!'' errors instead of the values in the datatable, if you both use an array parameter, and ask for a complete row to be returned. This behaviour is seen in OpenOffice.org Calc version 4.0.1, and is the subject of [https://bz.apache.org/ooo/show_bug.cgi?id=125743 issue 125743]. It might be correct, and might be an error. <tt>'''{=INDEX({1;3;5|7;9;10};{2|1};0;1)}'''</tt> returns array <tt>'''{7; #VALUE!; #VALUE! | 1; #VALUE!; #VALUE!}'''</tt> instead of the expected <tt>'''{7; 9; 10 | 1; 3; 5}'''</tt>.
 +
 
 +
{{PDL1}}
 +
 
 +
[[Category:Documentation/How Tos/Calc]]
 +
[[Category:Calc]]

Latest revision as of 14:48, 18 May 2022

Introduction

Arrays may be helpful in reducing the number of cells you use, but they can be a little complex to understand. In some ways, they extend what a spreadsheet can do beyond what spreadsheets were intended for. Almost all tasks are possible without them.

An array is simply a rectangular block of information that Calc can manipulate in a formula - that is, it is information organised in rows and columns. An array may be cells on a spreadsheet, or may be held internally by Calc.

There are 2 ways to specify an array in a formula:

  • as a range - for example A2:C3.
Calc array4.png
  • as an "inline array", for example {1; 5; 3 | 6; 2; 4} (these are fully functional from OOo2.4, but do exist in earlier versions - see Array Issues). You type curly braces { } around an inline array. Entries on a row are separated by a semicolon ‘;’, and rows are separated by the pipe character ‘|’. Each row must have the same number of elements (it is wrong to write {1; 2; 3 | 4; 5} because there are 3 elements in the top row and only 2 in the next row). Inline arrays may have mixed contents, for example {4; 2; "dog" | -22; "cat"; 0}. However an inline array may not contain references (eg A4), or formulae (eg PI(), 2*3 ), or percentages (eg 5%).


Calc array5.png

You can give a name to a range of cells: select the range and choose Insert - Names - Define.

You can give a name to an inline array: choose Insert - Names - Define; type the array (eg {1; 3; 2} including the curly braces) in the Assigned to box.


Functions which understand array parameters

Some functions, such as SUM(), AVERAGE(), MATCH(), LOOKUP(), accept one or more of their parameters as arrays.

For example:

SUM( A2:C3 ) returns the sum of the numbers in the range A2:C3.
SUM( {3; 2; 4} ) returns 9, the sum of the numbers in the inline array {3; 2; 4}.
SUM expects/understands single (‘scalar’) values too - SUM( B5; 7 ) returns the sum of B5 and 7.

Functions not expecting array parameters

Some functions, such as ABS(), SQRT(), COS(), LEN() expect their parameters to be ‘scalar’ - that is, a single value such as 354 or "dog" or the contents of a cell eg B5.

Example:

SQRT(4) returns 2.
LEN("dog") returns 3.

However, you may still use an array where a single value is expected - for example SQRT( {9; 4} ). If you enter the formula ‘normally’ by pressing Enter, Calc will then evaluate the formula using a single value from the array as follows:

If it is an inline array:

Calc will use the first value (the ‘top left’).
Example:
=SQRT( {9; 4 | 25; 16} ) then press Enter returns 3, the square root of the first element in the array (9).

If it is a range:

1. Calc will return an error unless the array is a single row or a single column.
2. For a single row or a single column range, Calc will use the value where the formula cell’s column/row intersects with the array (or return an error if there is no intersection).
Examples:
Calc array1.png
The formula =ABS(B2:B5) is entered 'normally' in cell D3, that is on row 3. Row 3 intersects B2:B5 at cell B3, thus the formula evaluated is =ABS(B3).
Calc array2.png
The formula =LEN(B5:D5) is entered 'normally' in cell B1, that is in column B. Column B intersects B5:D5 at cell B5, thus the formula evaluated is =LEN(B5).

Array formulas

The real power of arrays comes when you enter a formula in a special way, as an ‘array formula’. You do this by pressing Cntrl-Shift-Enter instead of the Enter button (or by ticking the Array checkbox if you are using the Formula Wizard).

If in cell B1 you enter ={3; 4} ‘normally’ by pressing Enter, the first value 3 is displayed in the cell.

If in cell B2 you type ={3; 4} but press Cntrl-Shift-Enter instead of Enter, the cell becomes an 'array formula'. The formula now returns the entire array {3; 4}. Cell B2 displays 3 and cell C2 displays 4.

Note that if you entered the formula using the Enter key, simply selecting the cell and pressing Cntrl-Shift-Enter will not convert the cell to an array expression - you must make an actual edit (such as adding then deleting a character), or click the Function Wizard and tick the Array checkbox.

If you now try to edit cell B2, you are told that "you cannot change only part of an array". To edit an array you must select the entire array, either with the mouse or by typing Cntrl-/ (hold the Cntrl key and press the slash key ‘/’).

Calc array6.png


The formula bar indicates that this is an array formula by enclosing it in curly braces {}. You do not need to type these - they will disappear while you edit the formula, and Calc will show them again when you have finished editing.

Array formula calculations

When Calc evaluates an array formula, it treats ‘unexpected arrays’ as a series of values (rather than using a single value), calculating a result for each of the array elements, and returning an array of results.

Example:

Calc array3.png
=SQRT( {16; 4; 25} ) when entered by pressing Cntrl-Shift-Enter instead of Enter returns an array of results, with 1 row and 3 columns - {4; 2; 5}. If the formula is in cell B2, Calc places the results in cells B2:D2. 4 is placed in B2, 2 in C2 and 5 in D2.


The process in effect works thus:

  1. All ‘unexpected arrays’ in the same array calculation must either
    • a) have the same number of rows and columns, or
    • b) have the same number of rows and some arrays may have only one column, or
    • c) have the same number of rows and some arrays may have only one column.
  2. Where some of the ‘unexpected arrays’ have just one row or column, that row or column is replicated enough times to give that array the same number of rows and columns as the other arrays. (This was tracked in Issue 46681 and fixed in OpenOffice.org v3.1).
  3. The result will be returned in an array of that size.
  4. The calculation is done for each element in turn, with the result returned in the corresponding element of the output array.

Example:

With the array formula =SQRT( {16; 4; 25} ):

  1. There is only one array, with 1 row and 3 columns.
  2. The result will be returned in an array with 1 row and 3 columns.
  3. The calculation is done for 16 first, then for 4, then for 25, giving the array result {4; 2; 5}.

Example:

With the array formula =SQRT( {8 | 18} * 2 ) in cell A5:

  1. There is only one array, with 2 rows and 1 column.
  2. The result will be returned in an array with 2 rows and 1 column.
  3. The calculation is: first element SQRT(8*2) = 4; second element SQRT(18*2) = 6; the array result is thus {4 | 6} - that is, 4 in cell A5 and 6 in cell A6.

The result of an array expression is an array, which can be used within the formula.

Example:

=SUM(SQRT( {16; 4; 25} )) as an array formula. The calculation of SQRT( {16; 4; 25} ), as before, yields a result of {4; 2; 5}, thus giving SUM( {4; 2; 5} ), returning a final result in the cell of 4+2+5 = 11.

Example:

With the array formula =SUM(IF(A1:A4>0; B1:B4; 0))

  1. The two arrays A1:A4 and B1:B4 both have 4 rows and 1 column.
  2. The result of the IF() array calculation will be an array of that size, which SUM() will add up.
  3. If A1>0 the first element is B1; else 0. If A2>0 the second element is B2; else 0 .... The array presented to SUM() has the values in B1:B4 where the adjacent value in A1:A4 is >0. The final output is the sum of the values in B1:B4 for which the adjacent value in A1:A4 is >0.

Example:

With the array formula =((({6|8})>({1;3;5|7;9;10}))*({1;3;5|7;9;10}))

  1. The first array has two rows and one column. The other two arrays have two rows and three columns.
  2. The single column of the first array is replicated three times, to make an array with two rows and three columns.
  3. The result is {1;3;5|7;0;0}.
  4. If the first array had more than two rows, or a different number of columns than one or three, it would not be sized correctly for this formula.

Array functions

Some functions calculate their result as an array. If they are to return that array, they must be entered as an array formula, by pressing Cntrl-Shift-Enter (or by ticking the Array checkbox if you are using the Formula Wizard). If, instead, they are entered 'normally' with the Enter button, just the (single) top left element of the calculated array will be returned. These funtions are listed in the Array function category, and are:

MINVERSE, MMULT, MUNIT, TRANSPOSE, FREQUENCY, GROWTH, LINEST, LOGEST, TREND


For example: =MUNIT(2) entered as an array formula in cell B1 returns the 2 x 2 unit (identity) matrix as the array {1; 0 | 0; 1} - that is, cells B1, B2, C1, C2 show 1, 0, 0, 1. If this formula is entered 'normally', just the 'top left' value (1) is returned in cell B1.


Some functions (including some listed above) take parameters that are forced to evaluate as an array formula, even when the formula is entered 'normally':

MDETERM, MINVERSE, MMULT, SUMPRODUCT, SUMX2MY2, SUMX2PY2, SUMXMY2, CORREL, COVAR, FORECAST, FTEST, INTERCEPT, MODE, PEARSON, PROB, RSQ, SLOPE, STEYX, TTEST


For example: MODE(ABS(A1:A3)) (in normal mode) forces ABS(A1:A3) to be evaluated as an array formula, returning an array of the absolute values of A1:A3 from which MODE selects the most common value to be returned.

For example: PROB(A1:A5; B1:B5/100; 3) (in normal mode) forces B1:B5/100 to be evaluated as an array formula, returning an array where each value is one hundredth of its value in B1:B5 (perhaps because the probabilities were expressed in percent).


A few of these functions have some (array) parameters that are forced to evaluate as an array formula, and other (single value) parameters that evaluate normally.

For example: PROB(A1:A5; B1:B5; {3; 4}) (in normal mode) does not evaulate {3; 4} as an array, because this would mean PROB returning an array. It evaluates PROB(A1:A5; B1:B5; 3) - that is, it interprets {3; 4} in scalar mode, taking the 'top left' value 3. However, if entered as an array formula it will of course return an array.

Tips and Tricks

Documentation note.png All these examples are to be entered as an array expression, by pressing Ctrl-Shift-Enter.

Sum of entries matching multiple conditions

SUM( (A1:A6="red")*(B1:B6="big")*C1:C6 ) returns the sum of entries in C1:C6 whose A column entries are "red" AND whose B column entries are "big". A1:A6 and B1:B6 each produce a 6 element array of TRUE or FALSE - which in number calculations are 1 or 0. Thus if A2 contains "red" and B2 contains "big" the second element of the array is 1 * 1 * C2 = C2. If A2 contains "blue" instead, the second element of the array is 0 * 1 * C2 = 0.

SUM( ((A1:A6="red")+(B1:B6="big")>0)*C1:C6 ) returns the sum of entries in C1:C6 whose A column entries are "red" OR whose B column entries are "big".

SUM( MOD((A1:A6="red")+(B1:B6="big");2)*C1:C6 ) returns the sum of entries in C1:C6 either whose A column entries are "red" OR whose B column entries are "big" but not both (exclusive OR)

SUM( NOT((A1:A6="red")+(B1:B6="big"))*C1:C6 ) returns the sum of entries in C1:C6 where neither the A column entry is "red" NOR the B column entry is "big".

Count of entries matching multiple conditions

SUM( (A1:A6="red")*(B1:B6="big") ) returns the number of rows whose A column entries are "red" AND whose B column entries are "big".

Maximum in a particular month

MAX(IF(MONTH(B1:B9)=5; C1:C9; 0)) returns the maximum value in C1:C9 where the corresponding B1:B9 date is in May (month 5). The MIN() function would provide the minimum in the month.

Average of entries meeting a condition

AVERAGE(IF(A1:A9="red"; B1:B9; "")) returns the average of entries in B1:B9 whose A column entries are “red”. The AVERAGE function ignores any blank entries.

The MEDIAN function can be used similarly.

Dynamic sorting of a column

=LARGE(B3:B9;ROW(B3:B9)+1-ROW(B3)) entered in cell C3 returns an array in C3:C9 which is B3:B9 in descending order. +1-ROW(3) is a constant - you could write =LARGE(B3:B9;ROW(B3:B9)-2) instead.

=SMALL(B3:B9;ROW(B3:B9)-2) returns B3:B9 in ascending order.

Sum ignoring errors

SUM( IF(ISERROR(A1:A9); 0; A1:A9) ). Normally SUM will propagate any error found.

Average, ignoring zero entries

AVERAGE(IF(B1:B9<>0; B1:B9; ""))

Test a cell for one of a set of values

OR(B2={2; 5; 6}) as an array expression, or OR(B2=2; B2=5; B2=6) as a ‘normal’ formula.

Sum of the smallest 4 numbers

SUM(SMALL(B3:B9; {1;2;3;4})) or SUM(SMALL(B3:B9;ROW(A1:A4))). ROW(A1:A4) produces {1;2;3;4}.

Last used cell in a column

MAX(ROW(B1:B9)*(B1:B9<>"")) returns the row number of the last cell used.

Issues

  • Some functions cannot yet be used in array formulas: COUNTIF, SUMIF (issue 65866), and ISLOGICAL (issue 87219). MATCH (issue 8947) works from OOo3.0.
  • OFFSET has limited behaviour within an array formula.
  • Names defined by Insert - Names - Define can be used within an array formula, but labels (either Insert - Names - Labels or automatically recognised at the head of a column) should not be used (issue 87268).
  • In Excel, not all functions can be used in array formulas. For example CONCATENATE works in Calc but not in Excel. Be aware of Excel's limitations if you plan to use a Calc spreadsheet in Excel.
  • Volatile functions such as RAND, RANDBETWEEN, NOW may not be recalculated in array formulas; for example in {=A1:A3+RAND()} just one random number is generated (issue 67135).
  • Before OOo2.4 spaces and negative numbers in inline arrays failed (issue 82644).
  • Before OOo3.1 it was not possible to include different dimension arrays in a formula (issue 46681). Now { {1|2} + {10;20|30;40} } internally expands {1|2} to {1;1|2;2}, to correctly return the result {11;21|32;42}.
  • Array formulas may slow down your computer if you have large arrays.
  • The INDEX function may return #VALUE! errors instead of the values in the datatable, if you both use an array parameter, and ask for a complete row to be returned. This behaviour is seen in OpenOffice.org Calc version 4.0.1, and is the subject of issue 125743. It might be correct, and might be an error. {=INDEX({1;3;5|7;9;10};{2|1};0;1)} returns array {7; #VALUE!; #VALUE! | 1; #VALUE!; #VALUE!} instead of the expected {7; 9; 10 | 1; 3; 5}.
Content on this page is licensed under the Public Documentation License (PDL).
Personal tools