For any set of data to become meaningful, it must be analyzed, explained, interpreted and presented. The mathematical discipline that deals with this is known as statistics, and the broad set of techniques applied in this field is collectively known as statistical analysis. It is usually impractical to analyze the total set of data, which is called the population. A population such as "every grain of sand on a stretch of beach" is simply too large. A subset of the population, called a sample, is therefore needed, and it is the sample that becomes the object of analysis. Conclusions derived from the sample can be extended to the population as long as the sample is representative of the population it is drawn from. Statistical analysis serves two general purposes: one is to describe the data, and the other is to make inferences about the population from which the data were drawn.
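As a rough illustration of these two purposes, the Python sketch below draws a sample from a simulated population, describes it with a mean and standard deviation, and then infers an approximate 95% confidence interval for the population mean. The simulated population (grain diameters in millimetres), the sample size of 200, and the normal-approximation multiplier of 1.96 are assumptions chosen for the example.

```python
import math
import random
import statistics


def describe(sample):
    """Descriptive step: summarize the sample with its mean and standard deviation."""
    return statistics.mean(sample), statistics.stdev(sample)


def mean_confidence_interval(sample, z=1.96):
    """Inferential step: approximate 95% interval for the population mean.

    Uses the normal approximation, which is an assumption that is
    reasonable only for fairly large, representative samples.
    """
    mean, sd = describe(sample)
    margin = z * sd / math.sqrt(len(sample))
    return mean - margin, mean + margin


# Hypothetical population: diameters (mm) of 100,000 grains of sand.
random.seed(42)
population = [random.gauss(0.5, 0.1) for _ in range(100_000)]

# A simple random sample stands in for the whole population.
sample = random.sample(population, 200)

sample_mean, sample_sd = describe(sample)
low, high = mean_confidence_interval(sample)

print(f"sample mean = {sample_mean:.3f} mm, sample sd = {sample_sd:.3f} mm")
print(f"approximate 95% interval for the population mean: {low:.3f} to {high:.3f} mm")
```

The interval is only as trustworthy as the sampling: if the 200 grains were all taken from one corner of the beach, the same arithmetic would run but the conclusion would not extend to the whole population.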
Statistics is a tool used in areas as varied as engineering and psychology. Various methods of statistical analysis have been developed, and not all of them were formulated by pure mathematicians or statisticians. Some disciplines find certain methods more relevant than others and use them more consistently. Three of the more commonly used methods are regression analysis, factor analysis and multivariate analysis.
Regression analysis refers to any of several techniques used to understand the relationship between a dependent variable and one or more independent variables. The aim is to determine how the value of the dependent variable changes when one independent variable is varied while all other independent variables are held constant. This makes the method especially useful for prediction and forecasting.
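The simplest case, one dependent variable and one independent variable, can be sketched in Python with an ordinary least-squares line fit. The function names and the advertising and sales figures below are invented for illustration, and the sketch covers only a single independent variable.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and of cross-deviations of x and y.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(slope, intercept, x):
    """Forecast the dependent variable for a new value of x."""
    return intercept + slope * x


# Hypothetical data: advertising spend (independent) and sales (dependent).
advertising = [10.0, 20.0, 30.0, 40.0, 50.0]
sales = [25.0, 41.0, 62.0, 79.0, 103.0]

slope, intercept = fit_line(advertising, sales)
forecast = predict(slope, intercept, 60.0)

print(f"each extra unit of advertising is associated with {slope:.2f} more sales")
print(f"forecast sales at advertising = 60: {forecast:.1f}")
```

The fitted slope is the estimated change in the dependent variable per unit change in the independent variable, which is what makes the fitted line usable for forecasting.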
Factor analysis is a method used to uncover and explain hidden variables underlying observable variables. It can reveal interdependencies between two or more seemingly unrelated variables. These hidden or underlying variables are called factors, and bringing them to light reduces the number of variables and simplifies the data set being studied.
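A full factor analysis is normally carried out with dedicated statistical software, but the core idea can be sketched in a few lines of Python. The example below is only a principal-component style approximation: it extracts a single factor from a correlation matrix using power iteration. The three test-score variables and their correlation values are invented for illustration.

```python
import math


def dominant_eigen(matrix, iterations=200):
    """Power iteration: largest eigenvalue and its unit eigenvector.

    Assumes a symmetric matrix whose largest eigenvalue is positive,
    which holds for a correlation matrix.
    """
    n = len(matrix)
    vector = [1.0 / math.sqrt(n)] * n
    eigenvalue = 0.0
    for _ in range(iterations):
        product = [sum(matrix[i][j] * vector[j] for j in range(n)) for i in range(n)]
        eigenvalue = math.sqrt(sum(x * x for x in product))
        vector = [x / eigenvalue for x in product]
    return eigenvalue, vector


# Hypothetical correlations among three observed test scores.
NAMES = ["vocabulary", "reading", "spelling"]
CORRELATIONS = [
    [1.00, 0.72, 0.63],
    [0.72, 1.00, 0.56],
    [0.63, 0.56, 1.00],
]

eigenvalue, vector = dominant_eigen(CORRELATIONS)

# Loadings: how strongly each observed variable relates to the single factor.
loadings = [math.sqrt(eigenvalue) * v for v in vector]
# Share of the total variance captured by that one factor.
share = eigenvalue / len(CORRELATIONS)

for name, loading in zip(NAMES, loadings):
    print(f"{name}: loading {loading:.2f}")
print(f"one factor accounts for {share:.0%} of the total variance")
```

If one factor captures most of the variance, the three observed variables can reasonably be summarized by that single underlying variable, which is the simplification factor analysis aims for.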
Multivariate analysis refers to techniques used to study data involving more than one variable at a time. Realistic problems rarely involve a single variable. In order to get a clear picture of a situation, you have to take all relevant factors into consideration. Multivariate analysis often deals with large sets of data and thus often employs databases to organize them. From there, analysis can lead to more informed and intelligent decisions.
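One common first step when several variables are studied together is a correlation matrix, which shows how every pair of variables moves together. The Python sketch below is a minimal illustration. The variable names and daily figures are invented, and a real study would likely load the data from a database instead.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def correlation_matrix(columns):
    """Correlation of every variable with every other, as nested dicts."""
    names = list(columns)
    return {a: {b: pearson(columns[a], columns[b]) for b in names} for a in names}


# Hypothetical daily observations of three variables.
data = {
    "temperature": [18.0, 21.0, 25.0, 28.0, 31.0, 24.0],
    "ice_cream_sales": [110.0, 135.0, 180.0, 210.0, 250.0, 170.0],
    "umbrella_sales": [40.0, 32.0, 20.0, 15.0, 9.0, 25.0],
}

matrix = correlation_matrix(data)

for row_name, row in matrix.items():
    cells = "  ".join(f"{value:+.2f}" for value in row.values())
    print(f"{row_name:>16}: {cells}")
```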
As computers have become one of the primary tools in many types of research, from the physical sciences to the social sciences, software systems have been developed to aid in statistical analysis. Some applications are designed specifically for statistical work, while others were designed mainly for database or mathematical operations but are also capable of performing certain types of statistical analysis. One prominent example of statistical software is the SAS system. SAS is actually a package of various applications, each with its own focus. SAS/STAT is the main product for statistical analysis and can be used in conjunction with SAS/GRAPH, another product within the package, which is designed for graphical descriptions of data. MATLAB is an example of mathematical software that is also capable of performing statistical analysis, although its set of functions extends well beyond statistics. Microsoft Excel is, strictly speaking, a spreadsheet application, but it incorporates a set of data analysis tools called the Analysis ToolPak, through which users can perform several methods of statistical analysis.