Summarising Data Graphically
Data has an unfortunate habit of arriving in an inconvenient layout, or with items missing. Often you can use a tool such as Excel to do some basic summarising and tabulation, but if all you need is to group data into categories, GraPL has some simple tools to help.
A typical example of a problem datasheet might be:
Here we have a typical ‘database’ format where there are regions and months, with zero or more entries for each region/month combination. We would like to make a barchart of the total sales for each month, by region, so we could use Excel here to make a CrossTab, or we could use the grouping capabilities of GraPL to do the same job.
A first attempt would be to group the sales by month, and categorise by region – to add the necessary properties to a simple barchart you need to bring up the Properties tab and drag over the Group by and Categorise by properties.
Here is what we get:
If you compare the chart with the data, you can see what GraPL has done with the categories and groups. Because there was no extra information on the months, GraPL has simply taken the unique values of the Mon column from the data and used these as the collection buckets for the sales totals. This approach might be OK if you were simply investigating data looking for patterns, but in this case it has two problems:
- we really want the months in the ‘right’ order, regardless of the ordering in the data
- we should leave gaps for months where there is no data in the table
To achieve this, we must add a ‘Group into’ property to pre-define the bins into which the data will be summed. The easy way to do this is to add another data table, which simply has the month codes in the right order:
... then we can use the month ids from here to force the totalling to do what we want:
In this final version, I have added an X-style of ‘Between,Labmid’ which improves the readability of the bars, and also included my GraPL user name and the date in a footnote. This assumes I was happy with the categories – if we really had three regions (say East, South, North, in this order) I would need to add a ‘Categorise into’ item in just the same way so that East would be added to the key and the bars would group in threes, even though the East made no sales in this period!
Other forms of summary
As you can see, the default action when summarising data is simply to total the items. However there are some other possibilities here – add a ‘Group function’ property and check its ‘...’ button to see the possibilities:
For example you can count the number of entries in each category, which is often what you need when analysing census or questionnaire data.
If you need to make several charts from a summary of a large body of data, you might consider using the Summarise and CrossTab calculations to perform the same operations, but storing the aggregated numbers in a new datasheet to save working them out each time.
|