Date: 28-3-2021
Often we want to collect data about all members of some group—all salary earners, all cattle in a herd, all light bulbs produced in a certain factory. We use the word population to refer to all members of a group, all things with a certain description.
Sometimes we are most interested to know the typical value of the data. For example, say we are looking at household income in our town. One indicator would be the figure such that half the population earns more and half earns less. This is called the median income. Another, found by adding all the incomes and dividing by the number of households, is called the mean. These are called measures of central tendency. We shall look further at these measures, and others, later in this chapter.
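The two measures can be compared on a small example. The sketch below uses Python's standard `statistics` module; the income figures are hypothetical values chosen only for illustration and do not come from the text.

```python
from statistics import mean, median

# Hypothetical annual household incomes (illustrative values only).
incomes = [32_000, 41_000, 45_000, 52_000, 58_000, 75_000, 250_000]

# Median: the middle value once the incomes are sorted, so half the
# households earn more and half earn less.
median_income = median(incomes)

# Mean: the sum of all incomes divided by the number of households.
mean_income = mean(incomes)

print("median:", median_income)
print("mean:  ", mean_income)
```

In this made-up data the single very large income pulls the mean well above the median, which is one reason both measures are worth reporting.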
Describing Data
Most data has some sort of numerical value, which we shall call a variable. The values of the variable are often called readings. In some cases, it is useful to assign a numerical value. For example, if you are polling prospective voters in an election with two candidates, Ms Smith and Mr Jones, the answers will be either “Smith” or “Jones,” but it may be convenient to allocate purely artificial numerical values; we could say “Smith equals 1” and “Jones equals 0.” The mean of these values is then the proportion of 1s, so if the mean is 0.45, the poll shows that 45% of the people surveyed said they would vote for Smith.
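The coding trick can be checked directly. In this minimal sketch the 20 poll responses are hypothetical (9 for Smith, 11 for Jones); only the coding Smith = 1, Jones = 0 is taken from the text.

```python
from statistics import mean

# Hypothetical poll: 20 responses, 9 for Smith and 11 for Jones.
responses = ["Smith"] * 9 + ["Jones"] * 11

# Artificial numerical coding from the text: Smith -> 1, Jones -> 0.
codes = {"Smith": 1, "Jones": 0}
values = [codes[name] for name in responses]

# The mean of the 0/1 values equals the share of respondents coded 1.
smith_share = mean(values)
print(f"{smith_share:.0%} of those surveyed chose Smith")
```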
The numerical values that describe a distribution are referred to as parameters of the distribution.
Suppose we have a collection of data. (Such a collection is often referred to as a population, and the individual items are called observations or scores.) For example, consider the scores students received for a quiz. Here our variable is the score. The number of times a certain score occurs is called the frequency of the score, and the set of all the scores, with their frequencies, is called the distribution of scores in the quiz.
For example, suppose the highest possible score on the quiz was 10, and the 20 students scored as follows:
9,2,7,7,4,7,7,10,6,6,10,3,10,9,4,7,6,6,7,6.
For convenience, we rearrange the scores in ascending order. First, we tally how often each score occurs; the resulting diagram, shown in Fig. 1.1, is often called a frequency table.
Fig. 1.1 Frequencies of the quiz scores
Then the list, in ascending order, is
2,3,4,4,6,6,6,6,6,7,7,7,7,7,7,9,9,10,10,10.
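Both the ordered list and the frequency table can be reproduced from the raw quiz scores. The following is a minimal sketch using `collections.Counter` from the Python standard library; the variable names are illustrative.

```python
from collections import Counter

# The 20 quiz scores, in the order given in the text.
scores = [9, 2, 7, 7, 4, 7, 7, 10, 6, 6, 10, 3, 10, 9, 4, 7, 6, 6, 7, 6]

# Ascending order, as in the rearranged list.
ordered = sorted(scores)

# Frequency of each score: how many times it occurs.
frequency = Counter(scores)

print(ordered)
print("score  frequency")
for score in sorted(frequency):
    print(f"{score:>5}  {frequency[score]:>9}")
```

The frequencies add up to 20, the number of students, which is a quick check that no observation was lost in the tally.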
It is convenient to represent observations in a diagram. The two most common are the dotplot and the histogram.
In a dotplot, the possible scores are arranged on a horizontal axis and a dot is placed above the score for every observation with that value: in the example, the dotplot is
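A rough text version of the dotplot for the quiz scores can be sketched in a few lines of Python. The helper `dotplot_lines` is a hypothetical name introduced here, and the layout (an asterisk for each dot, three characters per score) is only one possible choice.

```python
from collections import Counter

scores = [9, 2, 7, 7, 4, 7, 7, 10, 6, 6, 10, 3, 10, 9, 4, 7, 6, 6, 7, 6]

def dotplot_lines(data):
    """Return the rows of a text dotplot, top row first, axis row last."""
    freq = Counter(data)
    axis_values = range(min(data), max(data) + 1)
    rows = []
    # Work downward from the tallest stack of dots to the bottom row.
    for level in range(max(freq.values()), 0, -1):
        row = "".join(" * " if freq.get(s, 0) >= level else "   "
                      for s in axis_values)
        rows.append(row.rstrip())
    # Horizontal axis: every possible score between the minimum and maximum.
    rows.append("".join(f"{s:^3}" for s in axis_values).rstrip())
    return rows

lines = dotplot_lines(scores)
print("\n".join(lines))
```

Scores that nobody obtained (5 and 8 here) still get a place on the axis, with no dots above them.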
A histogram is similar to a dotplot, but the dots are replaced by vertical columns.
A scale on the left shows the frequencies. A histogram for the quiz scores is shown in Fig. 1.2.
Fig. 1.2 Histogram of the quiz scores
A histogram is often drawn with groups of observations, called classes, represented in the same column. This is usually done when the number of values is large, or when there is a special meaning assigned to observations in a group. For example, if you were representing the scores in a test with a maximum score of 100, you might use only about 20 columns, for scores 0–4, 5–9, 10–14, and so on. In our example, there are not so many possible scores, but you might still wish to group them: for example, 8–10 is Good, 5–7 is Passing, and 2–4 is Failing. The number of possible scores in a class is called the length of the class; usually, all groups are the same length (in our example, the length is 3).
Sometimes the columns are disjoint, as in the histogram shown in Fig. 1.3a. In other cases, the columns touch; in that case, a column whose left side is at score a and whose right side is at score b represents the number of scores x where a ≤ x < b.
We have shown an example of this in Fig. 1.3b.
Fig. 1.3 Histograms of the quiz scores with scores in classes
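The rule a ≤ x < b translates directly into code. This minimal sketch counts the quiz scores in the three classes named in the text (Failing, Passing, Good), written as half-open intervals; the dictionary layout is an illustrative choice.

```python
scores = [9, 2, 7, 7, 4, 7, 7, 10, 6, 6, 10, 3, 10, 9, 4, 7, 6, 6, 7, 6]

# Classes of length 3, each given as (a, b) and meaning a <= x < b.
# For whole-number scores, (2, 5) covers 2-4, (5, 8) covers 5-7,
# and (8, 11) covers 8-10.
classes = {"Failing": (2, 5), "Passing": (5, 8), "Good": (8, 11)}

class_counts = {
    label: sum(1 for x in scores if a <= x < b)
    for label, (a, b) in classes.items()
}

for label, count in class_counts.items():
    print(f"{label:<8} {count}")
```

Because each interval includes its left end and excludes its right end, a score on a boundary is counted in exactly one class.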
Sample Problem 1.1 A class took a test. The highest possible score was 20, and students scored as follows:
19,18,8,17,7,16,10,3,15,15,13,14,13,13,15,12,16,10,9,9,18,17,7,7,15.
(i) How many students took the test?
(ii) Draw a dotplot of the scores.
(iii) Represent the scores in a histogram with classes of length 2, and in a histogram with classes of length 3.
Solution.
(i) 25 students.
(ii) The scores, in order, are
3, 7, 7, 7, 8, 9, 9, 10, 10, 12, 13, 13, 13, 14,
15, 15, 15, 15, 16, 16, 17, 17, 18, 18, 19.
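The counts needed for parts (i) and (iii) can be tallied as follows. This is a sketch under one assumption not stated in the problem: the classes start at score 0, giving 0–1, 2–3, ... for length 2 and 0–2, 3–5, ... for length 3. A different starting point would give different columns. The function name `class_counts` is illustrative.

```python
from collections import Counter

scores = [19, 18, 8, 17, 7, 16, 10, 3, 15, 15, 13, 14, 13,
          13, 15, 12, 16, 10, 9, 9, 18, 17, 7, 7, 15]

def class_counts(data, length, start=0):
    """Count observations per class; keys are (lowest, highest) score in the class.

    Assumption: the first class begins at `start` (0 by default).
    """
    tally = Counter((x - start) // length for x in data)
    top = (max(data) - start) // length
    return {
        (start + k * length, start + (k + 1) * length - 1): tally.get(k, 0)
        for k in range(top + 1)
    }

print("students:", len(scores))
by_two = class_counts(scores, 2)
by_three = class_counts(scores, 3)
for name, table in (("length 2", by_two), ("length 3", by_three)):
    print(name)
    for (low, high), count in table.items():
        print(f"  {low:>2}-{high:<2} {'#' * count}")
```

Each row of `#` marks corresponds to the height of one histogram column.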
In some cases it is necessary to define the value of the variable. For example, suppose we have the set of ages of all the people in a certain town. For most purposes it is sufficient to know the age to within a year, so the variable would be the number of years in the age, rounded down to a whole number.
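Rounding down to a whole number of years is the floor operation. In this small sketch the exact ages are hypothetical values used only to show the definition at work.

```python
import math

# Hypothetical exact ages in years (illustrative values only).
exact_ages = [34.9, 7.25, 61.5, 18.0]

# The variable as defined in the text: the number of completed years.
age_in_years = [math.floor(age) for age in exact_ages]

print(age_in_years)
```

Note that 34.9 is recorded as 34, not 35: the variable counts completed years rather than the nearest whole number.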