Internship 2010: Statistical Data Analysis Tool

From Apache OpenOffice Wiki

Overview

Statistical models are widely used to analyze many types of data. Statistical analysis plays a major role in decision making under uncertainty and is widely used in surveys, research, business, and science. It is therefore very useful to have a statistical data analysis tool in a data manipulation application such as Calc. The main goal of the statistical data analysis tool project is to let users apply various statistical functions to data in OpenOffice Calc in a user-friendly manner.

Project plan

  • Developing a basic Calc extension for data analysis -- Done
  • Determining the different analysis methods to be included in the tool -- Done
  • Getting familiar with each analysis method and collecting information about the following -- In progress
    • How the data input should be given to the method
    • Other user-supplied parameters required
    • How the output should be displayed
  • Developing each analysis method in the following steps -- In progress
    • User interface design considering the input and other parameters required
    • Developing the functionality of the method
    • Displaying the output of the analysis
    • Integrating the developed method with the analysis tool (extension)

Enhancements

  • Addition of more useful analysis methods
  • Input data from different sheets
  • User interface enhancements
  • Translating the extension into other languages

Documentation

  • Statistical data analysis tool documentation (updating the wiki etc.)
  • Code documentation

Tasks Completed

  • The basic Calc extension has been developed, together with its UI structure and development structure
  • The extension has been reviewed by my mentor, who suggested necessary improvements
  • Correlation analysis
  • Covariance analysis
  • Rank and percentile analysis
  • T test
    • Paired t test for means
    • t test assuming equal variances
    • t test assuming unequal variances
  • ANOVA test
    • One-way ANOVA
    • Two-way ANOVA with replications
    • Two-way ANOVA without replications
  • F-test two-sample for variances
  • Z-test: two-sample for means
  • Data validation and further testing of the methods already implemented
  • First steps to the analysis methods with charting
  • Moving average analysis with charts
  • Histogram analysis with charts
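
To illustrate what some of the simpler methods in the list above compute, here is a minimal Python sketch of covariance, correlation, and a simple moving average. It is not the extension's code: the function names are hypothetical, and the choice of the sample (n - 1) denominator for covariance is an assumption, since this page does not state which convention the tool uses.

```python
from math import sqrt


def mean(values):
    """Arithmetic mean of a non-empty sequence of numbers."""
    return sum(values) / len(values)


def sample_covariance(xs, ys):
    """Sample covariance of two equal-length columns (n - 1 denominator).

    Assumption: the tool may instead use the population (n) denominator.
    """
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two columns of equal length, at least 2 rows")
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)


def pearson_correlation(xs, ys):
    """Pearson correlation coefficient: covariance scaled by both std devs."""
    sx = sqrt(sample_covariance(xs, xs))
    sy = sqrt(sample_covariance(ys, ys))
    if sx == 0 or sy == 0:
        raise ValueError("correlation is undefined for a constant column")
    return sample_covariance(xs, ys) / (sx * sy)


def moving_average(values, window):
    """Simple moving average: mean of each run of `window` consecutive values."""
    if window < 1 or window > len(values):
        raise ValueError("window must be between 1 and len(values)")
    return [
        mean(values[i - window + 1 : i + 1])
        for i in range(window - 1, len(values))
    ]
```

For example, `moving_average([1, 2, 3, 4, 5], 3)` averages the runs (1, 2, 3), (2, 3, 4), and (3, 4, 5), which is the series a moving average chart would plot.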

How the statistical data analysis tool works

The figure shows the user interface structure of the statistical data analysis tool project. The user interfaces are subject to change as new requirements and enhancements arise.

Example: How to use One way ANOVA analysis method
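
As a rough guide to what a one-way ANOVA produces from the selected input columns, the following Python sketch computes the usual sums of squares, degrees of freedom, mean squares, and F statistic. It is an illustration only, not the extension's implementation: the function name `one_way_anova` and the dictionary keys are hypothetical, each inner list is assumed to stand for one data column (group), and the p-value is left out because it requires the F distribution.

```python
def one_way_anova(groups):
    """Return the one-way ANOVA table entries for a list of groups.

    Each group is a list of numbers (assumed here to be one input column).
    """
    k = len(groups)
    if k < 2 or any(len(g) == 0 for g in groups):
        raise ValueError("need at least two non-empty groups")
    n_total = sum(len(g) for g in groups)
    if n_total <= k:
        raise ValueError("need more observations than groups")

    grand_mean = sum(sum(g) for g in groups) / n_total
    group_means = [sum(g) / len(g) for g in groups]

    # Variation of the group means around the grand mean.
    ss_between = sum(
        len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means)
    )
    # Variation of the observations around their own group mean.
    ss_within = sum(
        (x - m) ** 2 for g, m in zip(groups, group_means) for x in g
    )

    df_between = k - 1
    df_within = n_total - k
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    if ms_within == 0:
        raise ValueError("F is undefined when there is no within-group variation")

    return {
        "ss_between": ss_between,
        "ss_within": ss_within,
        "df_between": df_between,
        "df_within": df_within,
        "ms_between": ms_between,
        "ms_within": ms_within,
        "f": ms_between / ms_within,
    }
```

A call such as `one_way_anova([[1, 2, 3], [2, 3, 4], [5, 6, 7]])` treats the three lists as three groups; a large F value relative to the critical value for the given degrees of freedom suggests that the group means differ.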

Current tasks

Work is currently under way on adding statistical functions to the extension. The set of statistical functions to implement has been decided, and familiarization with each method proceeds alongside its development.

  • Enhancement of the developed methods

Project status

  • The project has been accepted for the OpenOffice summer internship program 2010
  • The project is proceeding with adding new analysis methods