# Descriptive Statistics

## Introduction

The human mind tries to connect events in Nature through cause-and-effect relationships and to generalize from the specific case to the general case by logical reasoning. This process is called inductive learning and results in human knowledge about Nature. It is in fact the only way we can learn things about Nature. But there is an important limitation to this method of learning: it is by definition based upon a limited sample of observations in space and time, which reasoning then generalizes to the total population of past, current and future events. Consequently, knowledge acquired in this way can never be proved true, only proved false. We accept this kind of knowledge for practical convenience as long as it has not been proven false, it is supported by repeatable controlled experiments with sufficiently accurate observations, and the conclusions are based on sound logical reasoning. This method of learning is then called the scientific method.

Whenever the relationships found in Nature are expressed quantitatively, we call the result a mathematical model. A mathematical model defines a functional relation between the cause (independent variables) and the effect (dependent variables). The functional relation is characterized by its parameters (constant coefficients). Most of the relationships found in Nature are not deterministic. Because of errors in the measurement of input data and because of model errors (a model is by definition a simplification of Nature, for example through the omission of independent variables), there is always a difference between the outcome of the model and the true outcome in Nature. This difference is called noise. Because of this phenomenon the input and output variables are called random variables and the model is called a statistical model.
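
In symbols, the statistical model described above can be sketched as follows. The notation is an assumption introduced here for illustration and does not appear in the original text: $y$ is the dependent variable, $x$ the independent variables, $\theta$ the parameters, $f$ the functional relation and $\varepsilon$ the noise.

$$
y = f(x; \theta) + \varepsilon
$$

A deterministic model corresponds to the special case $\varepsilon = 0$.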

It is the purpose of statistics to define the characteristics of the random variables and to find the parameters of the model that minimize the noise.
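
As a minimal sketch of what "finding the parameters that minimize the noise" can look like, the example below fits a straight line by ordinary least squares. Several choices here are assumptions made for illustration only: measuring the noise by the sum of squared differences, the straight-line form of the model, the function names `fit_line` and `sum_squared_noise`, and the simulated data.

```python
import random


def fit_line(xs, ys):
    """Return (intercept, slope) of the line minimizing the sum of squared residuals."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def sum_squared_noise(xs, ys, intercept, slope):
    """Sum of squared differences between observed outcomes and model outcomes."""
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))


# Simulated observations: a hypothetical "true" relation y = 2 + 0.5 x plus noise.
rng = random.Random(0)
xs = [float(i) for i in range(20)]
ys = [2.0 + 0.5 * x + rng.gauss(0.0, 1.0) for x in xs]

intercept, slope = fit_line(xs, ys)
noise = sum_squared_noise(xs, ys, intercept, slope)
print(f"intercept={intercept:.3f} slope={slope:.3f} squared noise={noise:.3f}")
```

Under this criterion the fitted parameters never give a larger sum of squared differences than any other line on the same data, including the line used to simulate the data.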

## Observation

To perceive an event is to observe something changing in contrast with something else. The thing we look at is called an element of the universe of discourse. We can observe these elements through one or more of their characteristics. To measure these characteristics we define variables. The measurement outcome of such a variable is data. Measurement is of one of the following three types:

• Counting: The outcome of a variable is a number indicating the number of units on some standard measurement scale for the characteristic. This type of measurement is called scalar measurement, since each number refers to a point on some scale.
• Ordering: The outcome of a variable is a number indicating the ranking of the characteristic within the total set of outcomes.
• Sorting: The outcome of a variable is a number indicating the category to which the characteristic belongs.
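
The three types of measurement listed above can be illustrated with a small sketch. The data values and variable names are hypothetical and chosen only for illustration.

```python
# Counting: number of units (centimetres) on a standard measurement scale.
heights_cm = [172, 181, 165, 190]

# Ordering: rank of each outcome within the total set of outcomes (1 = smallest).
sorted_heights = sorted(heights_cm)
ranks = [sorted_heights.index(h) + 1 for h in heights_cm]

# Sorting: a number indicating the category to which the characteristic belongs.
eye_colours = ["brown", "blue", "brown", "green"]
category_codes = {label: code for code, label in enumerate(sorted(set(eye_colours)))}
codes = [category_codes[c] for c in eye_colours]

print(ranks)  # [2, 3, 1, 4]
print(codes)  # [1, 0, 1, 2]
```

The category codes are arbitrary labels: their numeric order carries no meaning, unlike the ranks.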

There are different types of scales with regard to the amount of order they define:

• Categorical scale: A scale that assigns only a category label as outcome, without any ordering, e.g. color and sex.
• Ordinal scale: A scale on which data can be ranked with the operations <, > and =, but on which arithmetic transformations are not meaningful, e.g. military rank.
• Interval scale: An ordinal scale with an equal interval between successive units of the scale. This scale permits taking two or more measurements and performing addition and subtraction on them, e.g. temperature in degrees Celsius.
• Ratio scale: An interval scale with an absolute zero of the quantity being measured. This scale permits taking ratios of measurements, e.g. length.

Variables relating to a categorical scale are called categorical variables. Variables relating only to an ordinal scale are called qualitative variables. Variables relating to an interval scale or a ratio scale are called quantitative variables.
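
The nesting of the four scales can be sketched in code: each scale permits the operations of the one before it plus its own. The names `Scale` and `meaningful_operations` are hypothetical, and treating equality comparison as meaningful on a categorical scale is an assumption added here.

```python
from enum import IntEnum


class Scale(IntEnum):
    """Scales ordered by the amount of order they define."""
    CATEGORICAL = 1
    ORDINAL = 2
    INTERVAL = 3
    RATIO = 4


def meaningful_operations(scale):
    """Return the operations that are meaningful on the given scale."""
    ops = ["=="]                 # labels can be compared for equality (assumption)
    if scale >= Scale.ORDINAL:
        ops += ["<", ">"]        # ranking
    if scale >= Scale.INTERVAL:
        ops += ["+", "-"]        # addition and subtraction
    if scale >= Scale.RATIO:
        ops += ["/"]             # ratios of measurements
    return ops


for scale in Scale:
    print(scale.name, meaningful_operations(scale))
```

For instance, a difference of two temperatures in degrees Celsius is meaningful, but their ratio is not, because the zero point of that scale is not absolute.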

There are different types of scales with regard to the reference for measurement:

• Absolute scale: A scale defined by the set of natural numbers $\mathbb{N} = \{ 0,1,2,3,...\}$.
• Relative scale: A scale defined by a standard unit, the quantity is measured relative to this standard unit.

There are different types of scales with regard to the granularity of the outcomes:

• Continuous scale: A scale where the outcomes can take any value within the limits of the interval of possible outcomes.
• Discrete scale: A scale where the outcomes can take only a finite or countable set of separate values.

The distinction between discrete and continuous variables is somewhat vague since in practice there is always a limit to the precision with which we can measure any variable. The limit depends on the instrument we use to make the measurement, how much time we take to make the measurement, and so on.
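
A minimal sketch of this effect, assuming a hypothetical instrument that reports the nearest multiple of its resolution: distinct values on a continuous scale collapse onto a discrete set of readings. The function name `measure`, the lengths and the resolution are illustrative assumptions.

```python
def measure(true_value, resolution):
    """Return the reading of an instrument that rounds to the nearest multiple of `resolution`."""
    return round(true_value / resolution) * resolution


# Hypothetical true lengths in cm, read with a ruler marked every 0.5 cm.
true_lengths = [17.23, 17.26, 17.31, 17.74]
readings = [measure(v, 0.5) for v in true_lengths]

print(readings)  # [17.0, 17.5, 17.5, 17.5]
```

Four different true values yield only two distinct readings, so the recorded variable behaves as a discrete one even though length itself is continuous.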

The set of elements that are part of the observation is called the population, and the variables of interest are called population variables. Instead of analyzing all elements of the population, often only a sample is analyzed.
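
The relation between a population and a sample can be sketched as follows. The population values are simulated, and simple random sampling without replacement is an assumed sampling scheme; the text does not prescribe one.

```python
import random

rng = random.Random(42)  # fixed seed so the sketch is repeatable

# Hypothetical population: one value of the population variable per element.
population = [rng.gauss(170.0, 10.0) for _ in range(1000)]

# Simple random sample without replacement (an assumed sampling scheme).
sample = rng.sample(population, k=50)

population_mean = sum(population) / len(population)
sample_mean = sum(sample) / len(sample)
print(f"population mean: {population_mean:.1f}")
print(f"sample mean:     {sample_mean:.1f}")
```

The sample mean is computed from 50 elements only, so it generally differs somewhat from the population mean.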

A measurement should be reliable in the sense that its outcomes are consistent: under the same conditions the same value is measured. A measurement should also be valid in the sense that it effectively measures the characteristic of interest.