What is Descriptive Analysis?
A descriptive analysis is an important first step for conducting statistical analyses. It gives you an idea of the distribution of your data, helps you detect outliers and typos, and enables you to identify associations among variables, thus making you ready to conduct further statistical analyses.
However, with the availability of so many types of graphical and summary approaches, investigators get confused about which approach to use for analysis of their data.
They either end up conducting a whole range of analyses, wasting their time, or skip this crucial step entirely, increasing their chances of making erroneous decisions.
However, descriptive analyses are neither difficult nor time-consuming, if done systematically. It is easier to think about descriptive analyses if you divide them into two types:
- Descriptive analysis for each individual variable
- Descriptive analysis for combinations of variables
The best approach is to first determine the type of each variable and then choose the descriptive method that suits that type.
Broadly, variables can be classified into qualitative and quantitative.
Quantitative variables represent quantities or numerical values (e.g. age, weight, phone bill, volume, etc.) while qualitative variables describe the quality or characteristics of individuals (e.g. color, ethnicity, gender, etc.).
Both variable types have further sub-classifications but the broad classification is sufficient for deciding approaches for descriptive analysis.
Descriptive analysis for each individual variable
For quantitative variables, it is a good idea to first create a histogram and a box-and-whisker plot to get an idea of the shape of the distribution.
If the shape is symmetric, then calculate and present mean and standard deviation whereas if the shape is skewed, calculate and present median and quartiles.
You could also calculate and present min and max values. These descriptive analyses would also help you identify outlying and improbable values so that you can double-check data entry errors.
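As a rough illustration of the rule above, here is a minimal Python sketch that reports mean and standard deviation for a roughly symmetric variable, and median and quartiles for a skewed one. It is only a sketch under stated assumptions: the function name `summarize_quantitative`, the sample values, and the skewness cutoff of 1.0 are made up for this example, and the moment-based skewness coefficient is a numerical stand-in for looking at the histogram and box-and-whisker plot, not a replacement for it.

```python
import statistics


def summarize_quantitative(values, skew_threshold=1.0):
    """Summarize one quantitative variable.

    Assumption: |skewness| below skew_threshold is treated as "symmetric".
    The cutoff is an illustrative choice, not a standard.
    """
    mean = statistics.mean(values)
    spread = statistics.pstdev(values)
    # Moment-based skewness; it is 0 for a perfectly symmetric sample.
    if spread == 0:
        skew = 0.0
    else:
        skew = sum((x - mean) ** 3 for x in values) / (len(values) * spread ** 3)

    summary = {
        "n": len(values),
        "min": min(values),
        "max": max(values),
        "skewness": skew,
    }
    if abs(skew) < skew_threshold:
        # Symmetric shape: present mean and standard deviation.
        summary["mean"] = mean
        summary["sd"] = statistics.stdev(values)
    else:
        # Skewed shape: present median and quartiles.
        q1, _, q3 = statistics.quantiles(values, n=4)
        summary["median"] = statistics.median(values)
        summary["q1"] = q1
        summary["q3"] = q3
    return summary


# Hypothetical ages; the second list contains a likely data entry error (350).
ages_clean = [23, 25, 27, 29, 31, 33, 35]
ages_with_typo = [23, 25, 27, 29, 31, 33, 350]

print(summarize_quantitative(ages_clean))
print(summarize_quantitative(ages_with_typo))
```

In the second list the maximum stands out immediately, which is the kind of improbable value that the min and max are meant to expose.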
For qualitative (categorical) variables, create frequency tables and present them as bar charts, pie charts or doughnut charts. These approaches are sufficient to show how each variable is distributed and to reveal typos and other data entry errors.
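A frequency table is simple to build. The sketch below uses Python's standard `collections.Counter`; the function name `frequency_table` and the sample colors are hypothetical and serve only to show the idea.

```python
from collections import Counter


def frequency_table(values):
    """Return (category, count, percent) rows, most frequent category first."""
    counts = Counter(values)
    total = len(values)
    return [
        (category, count, round(100 * count / total, 1))
        for category, count in counts.most_common()
    ]


# Hypothetical data; "bleu" is a deliberate typo for "blue".
colors = ["red", "blue", "red", "green", "blue", "red", "bleu"]

for category, count, percent in frequency_table(colors):
    print(f"{category:<6} {count:>3} {percent:>6}%")
```

A misspelled category such as "bleu" shows up as its own row with a very low count, which is how a frequency table helps you catch data entry errors.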
Descriptive analysis for a combination of two variables
Since each of the two variables can be either qualitative or quantitative, there are four possible combinations (A to D).
In fact, the approaches for descriptive analyses for a combination of qualitative and quantitative variables (B and C) are the same, so essentially there are only three combinations of variables for descriptive analysis: (A) both variables are quantitative; (B and C) one variable is quantitative and the other is qualitative; (D) both variables are qualitative.
So all you need to understand is three types of descriptive analysis!
- Both variables quantitative (A): Prepare a scatter plot
- One variable qualitative and the other quantitative (B and C): Calculate summary statistics of the quantitative variable classified by the qualitative variable and prepare box-and-whisker plots of the quantitative variable by the qualitative variable
- Both variables qualitative (D): Prepare a contingency table
And that is it!
Obviously, there are a number of other graphical approaches but the above would give you sufficient information about the association between two variables so that you can conduct further statistical analyses.
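The two tabular cases above (B and C, and D) can be sketched in a few lines of standard-library Python. The record layout, the field names (`gender`, `smoker`, `weight`) and both helper functions are assumptions made up for this illustration. The scatter plot for case A needs a plotting library and is not shown.

```python
import statistics
from collections import Counter, defaultdict


def summarize_by_group(records, group_key, value_key):
    """Cases B and C: summarize a quantitative variable within each category."""
    groups = defaultdict(list)
    for record in records:
        groups[record[group_key]].append(record[value_key])
    return {
        group: {
            "n": len(values),
            "median": statistics.median(values),
            "min": min(values),
            "max": max(values),
        }
        for group, values in groups.items()
    }


def contingency_table(records, row_key, col_key):
    """Case D: cross-tabulate two qualitative variables as nested dicts."""
    counts = Counter((r[row_key], r[col_key]) for r in records)
    rows = sorted({row for row, _ in counts})
    cols = sorted({col for _, col in counts})
    # A Counter returns 0 for combinations that never occur.
    return {row: {col: counts[(row, col)] for col in cols} for row in rows}


# Hypothetical records for illustration only.
records = [
    {"gender": "F", "smoker": "no", "weight": 61},
    {"gender": "F", "smoker": "yes", "weight": 58},
    {"gender": "M", "smoker": "no", "weight": 80},
    {"gender": "M", "smoker": "no", "weight": 75},
    {"gender": "F", "smoker": "no", "weight": 65},
]

print(summarize_by_group(records, "gender", "weight"))
print(contingency_table(records, "gender", "smoker"))
```

The per-group minimum, median and maximum are among the values a box-and-whisker plot draws, so the first function gives a numeric preview of that plot.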
Both the univariate and bivariate descriptive analyses can be very easily conducted using our Descriptive Analysis tool.