Documentation/How Tos/Regular Expressions in Calc

From Apache OpenOffice Wiki
Revision as of 08:39, 28 October 2007 by Drking (Talk | contribs)


Introduction

In simple terms, regular expressions are a clever way to find text.


For instance, to locate all cells containing 'man' or 'woman' in your spreadsheet, you could search using a single regular expression rather than running two separate searches.
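As a rough illustration, the idea can be sketched in Python. This uses Python's re module as a stand-in for Calc's search, and the two regular expression dialects differ in places, so treat the pattern as an example rather than as Calc syntax. The cell contents and the pattern '\b(wo)?man\b' are assumptions made up for this sketch.

```python
import re

# Hypothetical cell contents, keyed by cell address.
cells = {"A1": "the man", "A2": "a woman", "A3": "manual", "A4": "child"}

# One pattern covers both words: an optional 'wo' followed by 'man',
# with \b on each side so that only whole words count.
pattern = re.compile(r"\b(wo)?man\b")

# Collect the addresses of cells whose text contains a match.
matching = [addr for addr, text in cells.items() if pattern.search(text)]
print(matching)  # ['A1', 'A2']
```

Note that 'manual' is not reported, because the word boundary after 'man' is missing there.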


Regular expressions in Calc and Writer

Regular expressions are available in Calc as follows:

  • Edit - Find & Replace dialog
  • Data - Filter - Standard filter
  • Functions, such as SUMIF, LOOKUP

The best way to learn about regular expressions in Calc is to start by understanding how to use Find & Replace. This is covered by the 'HowTo for Regular Expressions in Writer', which you should read.


In Calc, regular expressions are applied separately to each cell. (In Writer, they are applied separately to each paragraph.) So a search for 'r.d' will match 'red' in cell A1, but will not match 'r' in cell A2 followed by 'd' (or 'ed') in cell A3. (The regular expression 'r.d' means: match 'r', followed by any single character, followed by 'd'.)
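The cell-by-cell behaviour can be sketched in Python as follows. This is only a model of the behaviour described above, using Python's re module rather than Calc itself, and the small sheet is a hypothetical example.

```python
import re

# Hypothetical sheet: 'red' sits in one cell; 'r' and 'ed' are split across two.
cells = {"A1": "red", "A2": "r", "A3": "ed"}

pattern = re.compile(r"r.d")  # 'r', then any one character, then 'd'

# Each cell is searched on its own, so text is never joined across cells.
hits = {addr: bool(pattern.search(text)) for addr, text in cells.items()}
print(hits)  # {'A1': True, 'A2': False, 'A3': False}
```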


Regular expressions in Calc functions

A number of Calc functions accept regular expressions, for example 'SUMIF', 'COUNTIF', 'MATCH', 'LOOKUP', 'HLOOKUP', 'VLOOKUP', and the database functions 'DCOUNT', 'DSUM', etc.


Whether these functions treat their search criteria as regular expressions is set in the Tools - Options - OpenOffice.org Calc - Calculate dialog:


[Screenshot: choosing to use regular expressions in Calc functions]
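To make the effect of that setting concrete, here is a minimal Python sketch of a COUNTIF-like function. It is not Calc's implementation: the function name countif and the flags use_regex and whole_cell are hypothetical, and the whole_cell flag rests on the assumption that a criterion may be required to match the entire cell content rather than part of it.

```python
import re

def countif(values, criterion, use_regex=True, whole_cell=True):
    """Count values matching criterion, loosely modelled on COUNTIF."""
    count = 0
    for value in values:
        text = str(value)
        if use_regex:
            # fullmatch requires the whole cell to match; search accepts a match anywhere.
            matcher = re.fullmatch if whole_cell else re.search
            matched = matcher(criterion, text) is not None
        else:
            # With regular expressions switched off, the criterion is literal text.
            matched = (text == criterion) if whole_cell else (criterion in text)
        count += matched
    return count

column = ["red", "rod", "redo", "r.d"]
print(countif(column, "r.d"))                    # 3: 'red', 'rod', 'r.d'
print(countif(column, "r.d", use_regex=False))   # 1: only the literal 'r.d'
print(countif(column, "r.d", whole_cell=False))  # 4: 'redo' now counts too
```

The same criterion text therefore gives different counts depending on whether it is read as a pattern or as literal text.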


Regular expressions with Calc Find & Replace

Find & Replace in Calc is very similar to Find & Replace in Writer, as described in the 'HowTo for Regular Expressions in Writer'. The following points are of particular interest to Calc users:

  • If a cell contains a hard line break (entered with Ctrl-Enter), it can be found with '\n'. For example, if a cell contains 'red hard_line_break clay', then searching for 'd\nc' and replacing with nothing leaves the cell containing 'relay'.
  • A hard line break also marks "end of text" as understood by the regular expression special character '$', in addition to the end of the text in the cell. For example, if a cell contains 'red hard_line_break clay', then searching for 'd$' and replacing with 'al' leaves the cell containing 'real hard_line_break clay'. Note that with this syntax the hard line break is not replaced; it simply marks the end of text.
  • The Find & Replace dialog has an option to search 'Formulas', 'Values', or 'Notes'. This applies to any search, not just one using regular expressions. Searching with the 'Formulas' option would find 'SUM' in a cell containing the formula '=SUM(A1:A6)'.
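The two line break examples above can be reproduced in Python as a rough analogy. Python's re module is not Calc's search engine, and here the behaviour of '$' at a line break has to be switched on with re.MULTILINE, which is an assumption made to mirror the behaviour described for Calc. The variable names are illustrative only.

```python
import re

cell = "red\nclay"  # 'red', a hard line break, then 'clay'

# '\n' matches the line break itself, so the break is removed with its neighbours.
joined = re.sub(r"d\nc", "", cell)
print(repr(joined))  # 'relay'

# With re.MULTILINE, '$' also matches just before a line break; the break stays.
ended = re.sub(r"d$", "al", cell, flags=re.MULTILINE)
print(repr(ended))  # 'real\nclay'
```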