Descriptive Statistics

Introduction

The human mind tries to connect events in Nature through cause-and-effect relationships and to generalize from the specific case to the general case by logical reasoning. This process is called inductive learning and results in human knowledge about Nature. It is in fact the only way we can learn things about Nature. But this method of learning has an important limitation. It is by definition based upon a limited sample of observations in space and time, which reasoning then generalizes to the total population of past, current and future events. Consequently, knowledge acquired in this way can never be proved true; it can only be proved false. We accept this kind of knowledge for practical convenience as long as it has not been proven false, it is supported by controlled, repeatable experiments with sufficiently accurate observations, and its conclusions are based on sound logical reasoning. This method of learning is then called the scientific method.

Whenever the relationships found in Nature are expressed quantitatively, we call the result a mathematical model. A mathematical model defines a functional relation between the cause (independent variables) and the effect (dependent variables). The functional relation is characterized by its parameters (constant coefficients). Most of the relationships found in Nature are not deterministic. Because of errors in the measurement of input data and model errors (a model is by definition a simplification of Nature, for example through the omission of independent variables), there is always a difference between the outcome of the model and the true outcome in Nature. This difference is called noise. Because of this phenomenon, the input and output variables are called random variables and the model is called a statistical model.
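As a compact sketch of the paragraph above, a statistical model can be written as a functional relation plus a noise term. The symbols are assumptions chosen for illustration and do not appear in the original text: x stands for the independent variables, y for the dependent variable, theta for the parameters, f for the functional relation and epsilon for the noise.

```latex
% Assumed notation: x = independent variables, y = dependent variable,
% \theta = parameters (constant coefficients), \varepsilon = noise.
y = f(x;\theta) + \varepsilon
```

In a deterministic relationship the noise term would be zero for every observation; in a statistical model it is treated as a random variable.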

It is the purpose of statistics to define the characteristics of the random variables and to find the parameters of the model that minimize the noise.
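One common way to make "minimize the noise" concrete is the least squares criterion: choose the parameters that minimize the sum of squared differences between the observed and the modelled outcomes. The following is a minimal sketch for a straight-line model, not a definitive implementation. The function name fit_line, the simulated data and the "true" parameter values are all hypothetical and serve only as illustration.

```python
import random


def fit_line(xs, ys):
    """Return (intercept, slope) minimizing the sum of squared residuals."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Closed-form least squares solution for a straight line.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def sum_squared_errors(xs, ys, intercept, slope):
    """Sum of squared differences between observations and the model."""
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))


# Hypothetical "true" relationship in Nature (assumed for illustration).
TRUE_INTERCEPT, TRUE_SLOPE = 2.0, 0.5

rng = random.Random(0)  # fixed seed so the sketch is repeatable
xs = [float(i) for i in range(50)]
# Observed outcome = model outcome + noise.
ys = [TRUE_INTERCEPT + TRUE_SLOPE * x + rng.gauss(0.0, 1.0) for x in xs]

intercept, slope = fit_line(xs, ys)
sse_fitted = sum_squared_errors(xs, ys, intercept, slope)
sse_true = sum_squared_errors(xs, ys, TRUE_INTERCEPT, TRUE_SLOPE)

print("fitted intercept and slope:", intercept, slope)
print("sum of squared errors (fitted, true):", sse_fitted, sse_true)
```

The fitted parameters are estimates: they come close to the assumed true values but do not equal them, because they are computed from a limited, noisy set of observations.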

General Statistical Model

Least Squares Model
Parameter Fitting of the Least Squares Model

Observation

To perceive an event is to observe something changing in contrast with something else. The thing we look at is called an element of the universe of discourse. We can observe these elements by one or more of their characteristics. To measure these characteristics we define variables. The measurement outcome of such a variable is data. Measurement consists of one of the following three types:

Scales differ in the amount of order they define.

Variables relating to a categoric scale are called categoric variables. Variables relating to an ordinal scale only are called qualitative variables. Variables relating to an equal interval scale or ratio scale are called quantitative variables.

Scales also differ in the reference used for measurement.

Finally, scales differ in the granularity of their outcomes, which may be discrete or continuous.

The distinction between discrete and continuous variables is somewhat vague since in practice there is always a limit to the precision with which we can measure any variable. The limit depends on the instrument we use to make the measurement, how much time we take to make the measurement, and so on.
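The effect of limited precision can be sketched with a short simulation. This is a minimal illustration under stated assumptions: the quantity being measured, the instrument resolution of 0.1 and the helper name measure are hypothetical and not taken from the original text.

```python
import random


def measure(true_value, resolution):
    """Return the reading as a whole number of resolution steps.

    An instrument with a finite resolution can only report multiples of
    that resolution, so the reading is effectively an integer count.
    """
    return round(true_value / resolution)


rng = random.Random(1)  # fixed seed so the sketch is repeatable

# Hypothetical continuous quantity: 1000 lengths between 10 and 11 units.
true_lengths = [rng.uniform(10.0, 11.0) for _ in range(1000)]

# Readings from a hypothetical instrument with a resolution of 0.1 units.
readings = [measure(value, 0.1) for value in true_lengths]

distinct_true = len(set(true_lengths))
distinct_readings = len(set(readings))

print("distinct underlying values:", distinct_true)
print("distinct measured outcomes:", distinct_readings)
```

Although the underlying quantity is continuous, the measured outcomes fall into a small number of distinct steps, which is why the distinction becomes blurred in practice.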

The set of elements which are part of the observation is called the population, and the variables of interest are called population variables. Often only a sample is analyzed instead of all elements of the population.
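The relation between a population and a sample can be sketched as follows. This is an illustration only: the population of 10,000 simulated heights, the sample size of 100 and all variable names are assumptions, not data from the original text.

```python
import random
import statistics

rng = random.Random(42)  # fixed seed so the sketch is repeatable

# Hypothetical population: one variable of interest (height in cm)
# recorded for each of 10,000 elements.
population = [rng.gauss(170.0, 10.0) for _ in range(10_000)]

# Analyze only a sample: 100 elements drawn at random without replacement.
sample = rng.sample(population, 100)

population_mean = statistics.fmean(population)
sample_mean = statistics.fmean(sample)

print("population mean:", population_mean)
print("sample mean:", sample_mean)
```

The sample mean approximates the population mean but generally does not equal it, and a different random sample would give a slightly different value.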

A measurement should be reliable in the sense that its outcomes are accurate, so that the same value is measured under the same conditions. A measurement should also be valid in the sense that it effectively measures the characteristic of interest.

Copyright 2012 Jacq Krol. All rights reserved.