
The human mind tries to connect events in Nature through *cause-and-effect* relationships and to generalize from the specific case to the general case by logical reasoning. This process is called *inductive learning*, and it results in human knowledge about Nature. It is in fact the only way we can learn about Nature, but it has an important limitation: it is by definition based on a limited *sample* of observations in space and time, which reasoning then generalizes to the total *population* of past, current and future events. Consequently, knowledge acquired in this way can never be proved true; it can only be proved false. We accept this kind of knowledge for practical convenience as long as it has not been proven false, it is supported by controlled, repeatable experiments with sufficiently accurate observations, and its conclusions are based on sound logical reasoning. This method of learning is then called the *scientific method*.

Whenever the relationships found in Nature are expressed quantitatively, we call the result a *mathematical model*. A mathematical model defines a functional relation between the cause (*independent variables*) and the effect (*dependent variables*). The functional relation is characterized by its *parameters*, the (constant) coefficients. Most of the relationships found in Nature are not *deterministic*. Because of errors in the measurement of input data and model errors (a model is by definition a simplification of Nature, for example through the omission of independent variables), there is always a difference between the outcome of the model and the true outcome in Nature. This difference is called *noise*. Because of this phenomenon the input and output variables are called *random variables* and the model is called a *statistical model*.
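In symbols, such a model could be written as follows. The notation is an assumption chosen for illustration: $y$ stands for the dependent variable, $x_1, \dots, x_k$ for the independent variables, $\theta$ for the parameters, $f$ for the functional relation and $\varepsilon$ for the noise.

$$
y = f(x_1, \dots, x_k;\, \theta) + \varepsilon
$$

A deterministic model would be the special case in which $\varepsilon$ is always zero.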

It is the purpose of statistics to define the characteristics of the random variables and to find the parameters of the model that minimize the noise.
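As a minimal sketch of what finding such parameters can look like, the example below fits a straight line to noisy data by ordinary least squares, one common way to make "minimize the noise" concrete. The straight-line model, the function names `fit_line` and `sum_squared_noise`, and the simulated data are all assumptions made for illustration; they are not taken from the text.

```python
import random


def fit_line(xs, ys):
    """Least-squares estimates (intercept, slope) for the model y = a + b*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def sum_squared_noise(xs, ys, intercept, slope):
    """Sum of squared differences between observed and modelled outcomes."""
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))


# Simulated observations: an assumed "true" relation y = 2 + 0.5*x plus random noise.
rng = random.Random(0)
xs = [float(i) for i in range(20)]
ys = [2.0 + 0.5 * x + rng.gauss(0.0, 0.3) for x in xs]

a_hat, b_hat = fit_line(xs, ys)
print(f"estimated intercept = {a_hat:.3f}, estimated slope = {b_hat:.3f}")
```

The estimated parameters will be close to, but not exactly equal to, the values used to simulate the data, because the noise cannot be removed, only minimized.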

To perceive an event is to *observe* something changing in contrast with something else. The thing we look at is called an *element* of the universe of discourse. We can observe these elements through one or more of their *characteristics*. To measure these characteristics we define *variables*. The measurement outcome of such a variable is *data*. Measurement is one of the following three types:

*Counting*: The outcome of a variable is a number indicating the number of *units* on some standard measurement scale for the characteristic. This type of measurement is called *scalar* measurement, since each number refers to a point on some scale.

*Ordering*: The outcome of a variable is a number indicating the *ranking* of the characteristic within the total set of outcomes.

*Sorting*: The outcome of a variable is a number indicating the *category* to which the characteristic belongs.

*Categorical scale*: A scale that assigns only a category label as outcome, without any ordering, e.g. color or sex.

*Ordinal scale*: A scale on which data can be ranked with the operations <, >, =, but on which arithmetic transformations are not meaningful, e.g. military rank.

*Interval scale*: An ordinal scale with an equal interval between successive units of the scale. This scale permits taking two or more measurements and performing addition and subtraction on them, e.g. temperature.

*Ratio scale*: An interval scale with an absolute zero of the quantity being measured. This scale permits taking ratios of measurements, e.g. length.

Variables relating to a categorical scale are called *categorical variables*. Variables relating to an ordinal scale only are called *qualitative variables*. Variables relating to an equal-interval scale or a ratio scale are called *quantitative variables*.
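The hierarchy of categorical, ordinal, interval and ratio scales can be sketched in code as a list of operations that become meaningful at each level. The names `Scale`, `NEW_OPERATION` and `permitted_operations` are hypothetical, and treating "same category?" as the operation available on a categorical scale is an assumption derived from the idea of sorting into categories.

```python
from enum import IntEnum


class Scale(IntEnum):
    """Scale types ordered from least to most structure (illustrative names)."""
    CATEGORICAL = 1
    ORDINAL = 2
    INTERVAL = 3
    RATIO = 4


# Operation that becomes meaningful at each level; higher levels inherit lower ones.
NEW_OPERATION = {
    Scale.CATEGORICAL: "same category?",  # assumption: implied by sorting into categories
    Scale.ORDINAL: "<, >",
    Scale.INTERVAL: "+, -",
    Scale.RATIO: "/",
}


def permitted_operations(scale):
    """Return every operation that is meaningful on the given scale."""
    return [NEW_OPERATION[level] for level in Scale if level <= scale]


# Temperature in degrees Celsius is an interval scale: differences are
# meaningful, but ratios depend on where the zero point happens to lie.
celsius_a, celsius_b = 10.0, 20.0
difference = celsius_b - celsius_a            # 10 degrees warmer
ratio_celsius = celsius_b / celsius_a         # 2.0, but not "twice as hot"
ratio_kelvin = (celsius_b + 273.15) / (celsius_a + 273.15)  # about 1.035

print(permitted_operations(Scale.INTERVAL))
print(difference, ratio_celsius, round(ratio_kelvin, 3))
```

The two ratios differ because the Celsius zero is a convention, whereas the kelvin scale has an absolute zero and therefore behaves as a ratio scale.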

There are different types of scales with regard to the *reference* for measurement:

*Absolute scale*: A scale defined by the set of natural numbers $\mathbb{N} = \{ 0,1,2,3,...\}$.

*Relative scale*: A scale defined by a standard unit; the quantity is measured relative to this standard unit.

There are different types of scales with regard to the granularity of the outcomes:

*Continuous scale*: A scale where the outcomes can take any value within the limits of the interval of possible outcomes.

*Discrete scale*: A scale where the outcomes can take only a finite or countably infinite set of possible values.

The distinction between discrete and continuous variables is somewhat vague since in practice there is always a limit to the precision with which we can measure any variable. The limit depends on the instrument we use to make the measurement, how much time we take to make the measurement, and so on.
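A small sketch can make this concrete: an instrument with a given resolution effectively reports a whole number of resolution units, so a continuous quantity ends up as discrete data. The function name `reading_in_units` and the example values are hypothetical.

```python
def reading_in_units(true_value, resolution):
    """Number of whole resolution units an instrument would report."""
    return round(true_value / resolution)


# Two slightly different true values, e.g. lengths in metres (made-up numbers).
first, second = 1.2312, 1.2349

# With a resolution of 0.01 the instrument cannot tell them apart.
coarse = (reading_in_units(first, 0.01), reading_in_units(second, 0.01))

# With a resolution of 0.001 the readings differ.
fine = (reading_in_units(first, 0.001), reading_in_units(second, 0.001))

print(coarse, fine)
```

Whether the variable is treated as continuous or discrete then depends on whether the resolution is fine enough for the purpose at hand.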

The set of elements which are part of the observation is called the *population*, and the variables of interest are called *population variables*. Instead of analyzing all elements of the population, often only a *sample* is analyzed.
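The relation between a population and a sample can be illustrated with simulated data. Everything in this sketch is an assumption made for illustration: the population is a list of synthetic values (think of body heights in centimetres), and the sample is drawn at random without replacement.

```python
import random
import statistics

rng = random.Random(42)

# Synthetic population of 10,000 elements; the numbers 170 and 10 are made up.
population = [rng.gauss(170.0, 10.0) for _ in range(10_000)]

# Analyze only a random sample of 100 elements instead of the whole population.
sample = rng.sample(population, 100)

population_mean = statistics.mean(population)
sample_mean = statistics.mean(sample)

print(f"population mean = {population_mean:.2f}")
print(f"sample mean     = {sample_mean:.2f}")
```

The sample mean is usually close to the population mean but rarely identical to it, which is why conclusions drawn from a sample carry uncertainty.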

A measurement should be reliable in the sense that its outcomes are *accurate*, so that under the same conditions the same value is measured. A measurement should also be *valid* in the sense that the characteristic of interest is being measured effectively.


Copyright ©2012 Jacq Krol. All rights reserved.