Difference between revisions of "Documentation/How Tos/Removing Duplicate Rows"

From Apache OpenOffice Wiki
Jump to: navigation, search
m (Removed the "Additionally you can use <code>=IF(A1=A2;1;IF(A2=A1;1;0))</code> To indicate all duplicates", because it is just a more complicated formula with the same result)
m (Removing Duplicate Rows in Calc)
Line 1: Line 1:
 
== Removing Duplicate Rows in Calc ==
 
== Removing Duplicate Rows in Calc ==
  
Q: ''How do I select all unique values from a column in an OOo Calc spreadsheet? I want to remove all redundant rows from a report with thousands of records.''
+
Q: ''How do I select all unique values from a column in an {{AOo}} Calc spreadsheet? I want to remove all redundant rows from a report with thousands of records.''
  
 
A: There is no automatic function to remove duplicate rows. Follow these steps to delete all rows that have duplicate values in column A:
 
A: There is no automatic function to remove duplicate rows. Follow these steps to delete all rows that have duplicate values in column A:
Line 8: Line 8:
 
#:On most systems, you can click any cell inside the data range, then press Ctrl+Multiplication key on the numeric keypad.
 
#:On most systems, you can click any cell inside the data range, then press Ctrl+Multiplication key on the numeric keypad.
 
#Sort the data range by column A.
 
#Sort the data range by column A.
#:Choose Data - Sort.
+
#:Choose {{menu|Data|Sort}}.
 
#Click an empty cell in the first row. Let's assume it is cell C1. Enter the formula:
 
#Click an empty cell in the first row. Let's assume it is cell C1. Enter the formula:
 
#:  <code>=IF(A1=A2;1;0)</code><!-- this code only works for the first duplicate and always misses the last duplicate if there is more than 2 records -->
 
#:  <code>=IF(A1=A2;1;0)</code><!-- this code only works for the first duplicate and always misses the last duplicate if there is more than 2 records -->
Line 15: Line 15:
 
#:Drag the lower right edge of the cell C1 down to the last row.
 
#:Drag the lower right edge of the cell C1 down to the last row.
 
#Now the formulas must be replaced by their values to freeze the contents.
 
#Now the formulas must be replaced by their values to freeze the contents.
#:While the column C is still highlighted, press Ctrl+C to copy all selected cells to the clipboard.
+
#:While the column C is still highlighted, press {{key|Ctrl|C}} to copy all selected cells to the clipboard.
#Press Shift+Ctrl+V to open the Paste Special dialog box.
+
#Press {{key|Shift|Ctrl|V}} to open the Paste Special dialog box.
#:In the Selection area, enable only the Numbers command; disable the other Selection commands. Click OK.
+
#:In the Selection area, enable only the Numbers command; disable the other Selection commands. Click {{button|OK}}.
 
#Select the whole data range including the new column C and sort the range by column C.
 
#Select the whole data range including the new column C and sort the range by column C.
#:Choose Data - Sort.
+
#:Choose {{menu|Data|Sort}}.
#Select all rows which have a value 1 in column C, then press Del key.
+
#Select all rows which have a value 1 in column C, then press {{key|Del}} key.
 
#Optional steps: Delete column C. Select the remaining rows and sort them again by column A.
 
#Optional steps: Delete column C. Select the remaining rows and sort them again by column A.
  

Revision as of 14:54, 18 May 2022

Removing Duplicate Rows in Calc

Q: How do I select all unique values from a column in an Apache OpenOffice Calc spreadsheet? I want to remove all redundant rows from a report with thousands of records.

A: There is no automatic function to remove duplicate rows. Follow these steps to delete all rows that have duplicate values in column A:

  1. Select all cells of the current data range.
    On most systems, you can click any cell inside the data range, then press Ctrl+Multiplication key on the numeric keypad.
  2. Sort the data range by column A.
    Choose Data → Sort.
  3. Click an empty cell in the first row. Let's assume it is cell C1. Enter the formula:
    =IF(A1=A2;1;0)
    This will display 1 if the current row has the same value in column A as the next row. It will display 0 if the values are different.
  4. Copy the formula down for all rows of the data range.
    Drag the lower right edge of the cell C1 down to the last row.
  5. Now the formulas must be replaced by their values to freeze the contents.
    While the column C is still highlighted, press  Ctrl  +  C  to copy all selected cells to the clipboard.
  6. Press  ⇧ Shift  +  Ctrl  +  V  to open the Paste Special dialog box.
    In the Selection area, enable only the Numbers command; disable the other Selection commands. Click  OK .
  7. Select the whole data range including the new column C and sort the range by column C.
    Choose Data → Sort.
  8. Select all rows which have a value 1 in column C, then press  Del  key.
  9. Optional steps: Delete column C. Select the remaining rows and sort them again by column A.

Tip: This method can also be used to remove duplicate text lines from a Writer document. Copy the text to a Calc spreadsheet. Remove the duplicates, then copy and paste back as "unformatted text" to Writer.

Content on this page is licensed under the Public Documentation License (PDL).
Personal tools