An aggregate is formed when multiple numbers are gathered for statistical purposes and are expressed as one number. This could be in the form of a total or an average.
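As a minimal sketch of the idea, the snippet below collapses several numbers into a total and an average. The figures and variable names are hypothetical and chosen only for illustration.

```python
from statistics import mean

# Hypothetical daily commute distances (km) reported by five respondents.
distances_km = [4.0, 12.5, 7.5, 20.0, 6.0]

total_km = sum(distances_km)     # aggregate expressed as a total
average_km = mean(distances_km)  # aggregate expressed as an average

print(f"total: {total_km} km, average: {average_km} km")
```

Either result stands in for the five underlying numbers as a single value.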
An attribute component allows us to qualify and interpret an observed value. Attributes enable specification of the units of measure and of metadata such as the status of the observation (e.g. estimated, provisional).
A predefined set of codified concepts which represent the distinct values that a dimension can hold.
Comma Separated Values on the Web – a standardised format to express metadata describing the contents of CSV files. For more information, see the W3C project page on this topic.
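To give a feel for what such metadata looks like, here is a rough sketch that builds a very small CSV-W metadata document as JSON. The helper `build_csvw_metadata`, the file name `observations.csv` and the column names are assumptions made for this example; real metadata, including what csvcubed generates, is considerably richer.

```python
import json


def build_csvw_metadata(csv_url, columns):
    """Return a minimal CSV-W metadata document describing one CSV file.

    `columns` is a list of (name, title, datatype) tuples.
    This helper is hypothetical and only covers a tiny part of CSV-W.
    """
    return {
        "@context": "http://www.w3.org/ns/csvw",
        "url": csv_url,
        "tableSchema": {
            "columns": [
                {"name": name, "titles": title, "datatype": datatype}
                for name, title, datatype in columns
            ]
        },
    }


# Hypothetical CSV file with three columns.
metadata = build_csvw_metadata(
    "observations.csv",
    [
        ("period", "Period", "gYear"),
        ("region", "Region", "string"),
        ("value", "Value", "decimal"),
    ],
)

print(json.dumps(metadata, indent=2))
```

The metadata sits alongside the CSV file and tells a consumer how each column should be interpreted.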
A dense data set is one where the majority of dimension combinations have observed values recorded. Compare to sparse data sets.
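One informal way to judge density is to compare the number of dimension combinations that have an observed value against the number of combinations that are possible. The sketch below does this for a small hypothetical data set; the `density` function is not part of csvcubed, and treating "more than half" as dense is simply one reading of "the majority".

```python
from itertools import product


def density(dimension_values, observations):
    """Fraction of all possible dimension combinations that have an observed value."""
    all_combinations = set(product(*dimension_values.values()))
    observed = set(observations) & all_combinations
    return len(observed) / len(all_combinations)


# Hypothetical dimensions: 2 years x 3 regions = 6 possible combinations.
dimension_values = {
    "year": [2020, 2021],
    "region": ["North", "South", "East"],
}

# Observed values keyed by (year, region); one combination is missing.
observations = {
    (2020, "North"): 1.2,
    (2020, "South"): 0.9,
    (2020, "East"): 1.1,
    (2021, "North"): 1.3,
    (2021, "South"): 1.0,
}

ratio = density(dimension_values, observations)
print("dense" if ratio > 0.5 else "sparse", round(ratio, 2))
```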
Dimension components identify the subset of a population which has been observed. A set of values for all the dimension components is sufficient to identify a single observation. Examples of dimensions include the time to which the observation applies, or the geographic region that the observation covers.
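The claim that a full set of dimension values identifies a single observation can be illustrated by using those values as a lookup key. This is a sketch under assumed column names (`Year`, `Region`, `Value`); the `index_observations` helper is hypothetical.

```python
def index_observations(rows, dimensions, value_column="Value"):
    """Index observed values by the tuple of their dimension values.

    Raises ValueError if two rows share the same dimension values, since
    the dimensions would then no longer identify a single observation.
    """
    index = {}
    for row in rows:
        key = tuple(row[d] for d in dimensions)
        if key in index:
            raise ValueError(f"Duplicate observation for dimension values {key}")
        index[key] = row[value_column]
    return index


# Hypothetical observations with a time dimension and a geography dimension.
rows = [
    {"Year": 2020, "Region": "North", "Value": 31.5},
    {"Year": 2020, "Region": "South", "Value": 28.0},
    {"Year": 2021, "Region": "North", "Value": 32.1},
]

index = index_observations(rows, dimensions=["Year", "Region"])
print(index[(2021, "North")])
```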
Basic values such as strings, numbers, dates and booleans, which can only be used in the object position of an RDF triple. See the CSV-W built-in data types.
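The restriction to the object position can be sketched with a simplified triple model. The `Uri` and `Triple` classes below are hypothetical and ignore parts of RDF such as blank nodes, language tags and datatype annotations; the example.org URIs are placeholders.

```python
from dataclasses import dataclass
from datetime import date
from typing import Union


@dataclass(frozen=True)
class Uri:
    """A resource identifier (simplified)."""
    value: str


# Plain Python values stand in for RDF literals in this sketch.
LiteralValue = Union[str, int, float, bool, date]


@dataclass(frozen=True)
class Triple:
    subject: Uri
    predicate: Uri
    obj: Union[Uri, LiteralValue]

    def __post_init__(self):
        # Only the object position may hold a literal.
        if not isinstance(self.subject, Uri) or not isinstance(self.predicate, Uri):
            raise TypeError("Literals may only appear in the object position")


# A literal (3.5) in the object position is accepted.
ok = Triple(
    Uri("http://example.org/observation/1"),
    Uri("http://example.org/property/value"),
    3.5,
)
print(ok.obj)
```

Constructing a `Triple` with a plain string as its subject raises `TypeError` in this model.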
A measure defines the population characteristic or phenomenon which has been observed and recorded.
Examples of measures include GDP Per Capita and Distance of Daily Commute.
A value which has been observed. It is to be interpreted with its corresponding unit and measure values as well as its dimension values. The dimension values define the subset of the population that the observed value applies to.
One of two data shapes that are accepted by csvcubed. The pivoted shape permits multiple observation value columns to be defined, which minimises redundancy in your data set. Compare to the standard shape.
A property of a population which can be measured or observed. For example, height in a population of people or income in a population of households.
Resources are objects or concepts that exist in the real world and that can be referred to in a machine-readable way by identifying them with URIs. Resources form the basis of the Resource Description Framework (RDF), and are fundamental to the concept of linked data.
An extension to the World Wide Web in which information is given structured meaning using vocabularies such as the Simple Knowledge Organization System (SKOS) and the RDF Data Cube vocabulary. The csvcubed tools help you build statistics which fit into the semantic web of linked data.
A sparse data set is one where there are a large number of possible combinations of dimension values, but very few of them have observed values recorded. Compare to dense data sets.
One of two data shapes that are accepted by csvcubed. The standard shape requires that observation values are stored in a single column and are further identified using the measure and unit columns. Compare to the pivoted shape.
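The relationship between the two shapes can be sketched by converting a small pivoted table into the standard layout. The column names (`Measure`, `Unit`, `Value`), the sample figures and the `pivoted_to_standard` helper are assumptions for illustration only; csvcubed's actual configuration and column naming may differ.

```python
def pivoted_to_standard(rows, dimension_columns, value_columns):
    """Turn pivoted rows into standard-shape rows.

    `value_columns` maps each observation value column to its (measure, unit).
    Each pivoted row yields one standard row per observation value column.
    """
    standard_rows = []
    for row in rows:
        for column, (measure, unit) in value_columns.items():
            standard_row = {d: row[d] for d in dimension_columns}
            standard_row.update(
                {"Measure": measure, "Unit": unit, "Value": row[column]}
            )
            standard_rows.append(standard_row)
    return standard_rows


# Hypothetical pivoted data: two observation value columns per row.
pivoted = [
    {"Year": 2020, "Region": "North", "Median Income": 30000, "Commute Distance": 12.5},
    {"Year": 2020, "Region": "South", "Median Income": 34000, "Commute Distance": 9.0},
]

standard = pivoted_to_standard(
    pivoted,
    dimension_columns=["Year", "Region"],
    value_columns={
        "Median Income": ("Median Income", "GBP"),
        "Commute Distance": ("Distance of Daily Commute", "km"),
    },
)

for row in standard:
    print(row)
```

In the standard rows, the dimension values are repeated once per measure. This repetition is the redundancy that the pivoted shape avoids.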
A standard data shape/layout designed to ensure interoperability between data tools. A tidy data set is arranged such that each dimension is a single column and each observation a single row. For more information, see Hadley Wickham's paper on this topic.
A quantity or increment by which something is counted or described, such as kg, mm, °C, °F, monetary units such as Euro or US dollar, simple number counts or index numbers.
Uniform Resource Identifiers (URIs) are identifiers which distinguish resources from one another. Note that a URL (Uniform Resource Locator) is a type of URI. Examples:
For more information regarding the use of URIs on the semantic web, see this W3C resource.
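As a rough illustration of the URL and URI relationship, the sketch below parses two made-up identifiers with Python's standard library. Both identifiers are hypothetical, and classifying something as a URL purely by its scheme is a simplifying assumption rather than a formal rule.

```python
from urllib.parse import urlparse

# Schemes treated as "locators" in this sketch (an assumption, not a standard list).
LOCATOR_SCHEMES = {"http", "https", "ftp"}


def describe_uri(uri):
    """Return the scheme of a URI and whether this sketch treats it as a URL."""
    scheme = urlparse(uri).scheme
    return {"scheme": scheme, "is_url": scheme in LOCATOR_SCHEMES}


# Hypothetical identifiers: both are URIs, but only the first is also a URL here.
web_uri = describe_uri("https://example.org/id/region/north")
urn_uri = describe_uri("urn:example:region:north")

print(web_uri)
print(urn_uri)
```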