Standard Shape
This page describes how to build and configure a standard shape data cube. For more information about the data shapes supported by csvcubed, see Shaping your data. The instructions below assume a basic understanding of writing a qube-config.json file.
The standard shape extends the common structure by requiring that each row has a measures column and a units column; these columns define the measure and unit (of measure) for each row.
The standard shape is most appropriate where you have a sparse data cube, i.e. there are a large number of possible combinations of dimension values, but very few of them have observed values recorded. If your data cube is dense, then consider using the pivoted shape. See Converting to standard shape for instructions on how to convert the shape of your data in Python and R.
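As a rough way to judge how sparse a cube is, you can compare the number of observed dimension combinations with the number of possible ones. The sketch below does this for the four observations in the single-measure example on this page. The variable names are illustrative only, and csvcubed does not define a threshold at which a cube counts as sparse, so treat the result as a guide rather than a rule.

```python
from itertools import product

# (Year, Location) pairs that have an observed value in the example data.
observed = {
    (2022, "London"),
    (2021, "Cardiff"),
    (2020, "Edinburgh"),
    (2021, "Belfast"),
}

# Distinct values seen for each dimension.
years = {year for year, _ in observed}
locations = {location for _, location in observed}

# Every combination of dimension values the cube could contain.
possible = set(product(years, locations))

density = len(observed) / len(possible)
print(f"{len(observed)} of {len(possible)} combinations observed ({density:.0%})")
```

Here only 4 of the 12 possible year and location combinations carry a value, which is the kind of data set the standard shape suits.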
Single Measure
In our example, the single measure observed is Number of Arthur's Bakes and the corresponding unit is Count.
Year | Location | Value | Status | Measure | Unit |
---|---|---|---|---|---|
2022 | London | 35 | Provisional | Number of Arthur's Bakes | Count |
2021 | Cardiff | 26 | Final | Number of Arthur's Bakes | Count |
2020 | Edinburgh | 90 | Final | Number of Arthur's Bakes | Count |
2021 | Belfast | 0 | Final | Number of Arthur's Bakes | Count |
The simplest qube-config.json we can define for this data set is:
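A minimal sketch of such a configuration is given here as a short Python script that prints the JSON document, so that it can be redirected into qube-config.json. The schema URL and the column type names (`dimension`, `observations`, `attribute`, `measures`, `units`) are assumptions based on csvcubed's v1 configuration format and should be checked against the current documentation. The title is a placeholder that does not come from this page.

```python
import json

# Assumed schema URL and column type names; verify against the csvcubed docs.
config = {
    "$schema": "https://purl.org/csv-cubed/qube-config/v1",
    "title": "Arthur's Bakes",  # placeholder title, not taken from the source
    "columns": {
        "Year": {"type": "dimension"},
        "Location": {"type": "dimension"},
        "Value": {"type": "observations"},
        "Status": {"type": "attribute"},
        # The two columns that make this a standard shape cube:
        "Measure": {"type": "measures"},
        "Unit": {"type": "units"},
    },
}

# Print the JSON; redirect the output to qube-config.json to save it.
print(json.dumps(config, indent=2))
```

Each key under `columns` matches a column header in the table above, and only the Measure and Unit entries are specific to the standard shape.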
It is possible to use the configuration by convention approach to generate a valid standard shape cube without defining a qube-config.json at all. Just ensure that your columns use the conventional column names appropriate to their type.
Multiple Measures
One of the benefits of the standard shape is that adding new measures and units is straightforward: you only need to add rows to your data set with the appropriate measure and unit in each row.
We can extend our example data set so that it now includes revenue values for the given year by adding rows to the table:
Year | Location | Value | Status | Measure | Unit |
---|---|---|---|---|---|
2022 | London | 35 | Provisional | Number of Arthur's Bakes | Count |
2022 | London | 25 | Provisional | Revenue | GBP Sterling, Millions |
2021 | Cardiff | 26 | Final | Number of Arthur's Bakes | Count |
2021 | Cardiff | 18 | Final | Revenue | GBP Sterling, Millions |
Note that extending the data set to include multiple measures does not require any changes to the qube-config.json column definitions.
The same data could be represented in the equivalent multi-measure pivoted shape as follows:
Year | Location | Number of Arthur's Bakes | Number of Arthur's Bakes Status | Revenue | Revenue Units | Revenue Status |
---|---|---|---|---|---|---|
2022 | London | 35 | Provisional | 25 | GBP Sterling, Millions | Provisional |
2021 | Cardiff | 26 | Final | 18 | GBP Sterling, Millions | Final |
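The relationship between the two shapes can be sketched in plain Python: rows that share the same dimension values are folded into one wide row, with one group of columns per measure. The `pivot` helper and the `<measure> Units` / `<measure> Status` column naming are hypothetical and modelled on the table headers above; they are not part of csvcubed. Unlike the table above, this sketch emits a units column for every measure, including Number of Arthur's Bakes.

```python
import csv
import io

# The multi-measure standard shape example, as CSV text.
STANDARD_CSV = """Year,Location,Value,Status,Measure,Unit
2022,London,35,Provisional,Number of Arthur's Bakes,Count
2022,London,25,Provisional,Revenue,"GBP Sterling, Millions"
2021,Cardiff,26,Final,Number of Arthur's Bakes,Count
2021,Cardiff,18,Final,Revenue,"GBP Sterling, Millions"
"""


def pivot(rows, dimensions=("Year", "Location")):
    """Fold standard shape rows into one wide row per dimension combination."""
    pivoted = {}
    for row in rows:
        key = tuple(row[d] for d in dimensions)
        # Start a new wide row holding the dimension values if none exists yet.
        out = pivoted.setdefault(key, dict(zip(dimensions, key)))
        measure = row["Measure"]
        out[measure] = row["Value"]
        out[f"{measure} Units"] = row["Unit"]
        out[f"{measure} Status"] = row["Status"]
    return list(pivoted.values())


rows = list(csv.DictReader(io.StringIO(STANDARD_CSV)))
for record in pivot(rows):
    print(record)
```

Every new measure adds columns to the pivoted shape, whereas the standard shape only gains rows; this is why the standard shape tends to suit sparse cubes.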