Mathematics: Levels E, M, D, and A Study Guide for the TABE Test


Statistics and Probability: Part 1

Percentage of Test Level Specifically Assessing Statistics and Probability (— = Assumed)

Level        L            E            M     D     A
Percentage   not tested   not tested   5%    22%   16%

Reading charts and graphs may seem daunting initially, but the ability to do so can be developed by familiarizing yourself with the basics. Most of the questions in this category do not even require computation because the answers are already presented in the graph, table, or chart; if not, they can be easily inferred.

Statistical Questions

If you have a question that you expect to get multiple and varied answers for, it is a statistical question. If a question has a single answer, it’s not a statistical question. If you ask Marie how she did on a math test, that’s not a statistical question, but if you ask how the whole class did on the test, that is statistical. You would expect to get many results that you could somehow combine into one number or grade to represent the whole class.

Reading Charts and Graphs Thoroughly

Charts or tables are normally presented with a title—this is your first clue. A title that says “Car sales of the different branches from 2001–2005” will definitely show these pieces of information: the different branches and the sales in these branches for the years stated. Inspect the headings, columns, rows, and legends. Acquaint yourself with how data is presented.

For example, the heading above the columns indicates the sales performance, each column represents the year, and each row represents the performance for each branch. Tackle the questions next. You will probably be asked about the sales of a particular branch in a particular year, or the branch with the highest sales in 2004, or the branch that showed the biggest difference in sales from 2001 to 2005.

[Figure: Reading Charts and Graphs]
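The three kinds of questions described above can be sketched in a few lines of Python. The branch names and sales figures below are invented purely for illustration; they are not taken from the guide's chart.

```python
# Hypothetical car sales per branch and year (all numbers are made up).
sales = {
    "North": {2001: 120, 2002: 135, 2003: 150, 2004: 160, 2005: 155},
    "South": {2001: 200, 2002: 190, 2003: 210, 2004: 180, 2005: 230},
    "East":  {2001: 90,  2002: 110, 2003: 100, 2004: 175, 2005: 140},
}

# Question type 1: sales of a particular branch in a particular year.
south_2003 = sales["South"][2003]

# Question type 2: the branch with the highest sales in 2004.
top_2004 = max(sales, key=lambda branch: sales[branch][2004])

# Question type 3: the branch with the biggest difference from 2001 to 2005.
def change(branch):
    return abs(sales[branch][2005] - sales[branch][2001])

biggest_change = max(sales, key=change)

print(south_2003, top_2004, biggest_change)
```

Each answer comes from finding the right row and column, which is the same thing you do by eye when reading the table.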

Types of Data Display

There are quite a few ways to display numerical data in a graph. Different types of data are often best shown with different types of graphs.

Dot Plot

Dot plots look like this:

[Figure: Dot Plot]

Each dot represents one watermelon. The graph tells you that two watermelons weighed \(8\frac{1}{2}\) pounds, two watermelons weighed \(9\) pounds, three watermelons weighed \(8\) pounds, and so on. The graph’s purpose is to give a quick picture of what most melons weighed. In this case, watermelons have weights that roughly cluster around \(9\) pounds.

Histogram

Histograms look like this.

[Figure: Histogram]

Retrieved from: https://openstax.org/books/statistics/pages/2-2-histograms-frequency-polygons-and-time-series-graphs. Figure 2.6

Histograms are a special kind of bar graph that shows how often values fall into different ranges. This example shows how many students in a room have read a given number of books. Frequency means the number of students. The tallest bar covers \(2.5\) to \(3.5\) books and goes up to \(16\), showing that \(16\) students have read between \(2.5\) and \(3.5\) books, which in whole books means \(3\). More students have read that number of books than any other number. On the other hand, only \(2\) students have read from \(5.5\) to \(6.5\) books.
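To see where the bar heights come from, here is a minimal sketch that tallies frequencies and prints a rough text version of a histogram. The list of answers is a small hypothetical survey, not the data behind the figure.

```python
from collections import Counter

# Hypothetical survey: number of books read by each of ten students.
books_read = [1, 3, 2, 3, 3, 4, 1, 2, 3, 6]

# Frequency = how many students gave each answer.
frequency = Counter(books_read)

# One row per value; the row of '#' characters plays the role of a bar.
for books in range(min(books_read), max(books_read) + 1):
    print(f"{books} books: {'#' * frequency[books]} ({frequency[books]})")

# The tallest bar belongs to the value with the highest frequency.
tallest_value, tallest_count = frequency.most_common(1)[0]
print("Tallest bar:", tallest_value, "books, read by", tallest_count, "students")
```

A value that nobody reported (5 books in this made-up list) simply gets an empty bar.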

Box Plot

Box plots look like this:

[Figure: Box Plot]

Retrieved from: https://openstax.org/books/statistics/pages/2-4-box-plots. Figure 2.13

Box plots are also called box-and-whisker plots, the box being in the middle and the whiskers on either end. The box plot here is made up of the shoe sizes of the customers in a shoe store one day, which were \(1, 1, 2, 2, 4, 6, 6.5, 7, 8, 8.5, 9, 10, 10, 11.5\).

The plot divides the data into four parts, each holding about one quarter of the values.

The whisker on the left represents one quarter of all the sizes and shows that one quarter of all sizes that day were from \(1\) to \(2\).

The box is divided by a dotted line. The left section shows that one quarter of the sizes that day were from \(2\) to about \(7\).

The right section shows that one quarter of all sizes that day were from about \(7\) to \(9\).

The right whisker shows that one quarter of all sizes that day were from \(9\) to \(11.5\).

The point of this kind of graph is that it lets you see whether the numbers show a pattern. In this example, the plot is shifted to the left, and the manager should probably order more shoes in the smaller sizes.
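The five values that define a box plot (minimum, first quartile, median, third quartile, maximum) can be computed from the shoe sizes listed above. This is a sketch that assumes one common convention, in which each quartile is the median of one half of the sorted data; other conventions can give slightly different quartiles. With the sizes as listed, the median works out to \(6.75\), close to the \(7\) where the box is divided.

```python
from statistics import median

# Shoe sizes from the example above.
sizes = [1, 1, 2, 2, 4, 6, 6.5, 7, 8, 8.5, 9, 10, 10, 11.5]

def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) using the median-of-halves rule."""
    data = sorted(values)
    half = len(data) // 2
    lower = data[:half]
    # For an odd count, the middle value is left out of both halves.
    upper = data[half:] if len(data) % 2 == 0 else data[half + 1:]
    return min(data), median(lower), median(data), median(upper), max(data)

summary = five_number_summary(sizes)
print(summary)
```

The first and last values are the ends of the whiskers, the second and fourth are the edges of the box, and the middle value is the line inside the box.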

Scatter Plot

Below is an example of a scatter plot. Each dot represents one person who took the third exam and the final exam. For example, one person scored \(75\) on the third exam and \(200\) on the final. The graph gives you a rough idea of the relationship, if any, between the scores. A question it might answer is, “Does the third exam score predict the final exam score?” The fact that the dots go somewhat uphill tells us that, in general, as we see the third exam scores get higher, so do the final exam scores. Not in every case, but that is the trend.

[Figure: Scatter Plot]

Retrieved from: https://openstax.org/books/statistics/pages/12-2-the-regression-equation. Figure 12.5

The graph below shows a scatter plot with a line added. The line is called a best-fit line because it has been drawn to come as close to each point as possible. The line represents the trend of the points a bit more clearly than just the points themselves.

[Figure: Line of Best Fit]

Retrieved from: https://openstax.org/books/statistics/pages/12-2-the-regression-equation. Figure 12.7

Is the line a good fit? That’s a bit of a judgment call, but it actually hits some of the points, and is close to the others. That would make it a pretty good fit. Rarely, if ever, with actual measured data, will a best-fit line hit all the points. A decent best-fit line can often be drawn by hand with a straightedge.

Here, we’ve been talking only about data points that approximate a straight line.
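A best-fit line does not have to be drawn by eye. One common way to make "as close as possible" precise is the least-squares method, sketched below. The guide does not say which method produced the line in its figure, and the data points here are hypothetical.

```python
# Hypothetical (x, y) data points, invented for illustration.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

def best_fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # How x and y vary together, and how x varies on its own.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

slope, intercept = best_fit_line(xs, ys)
print(f"y = {slope:.2f}x + {intercept:.2f}")
```

A positive slope matches dots that go "uphill" from left to right; a negative slope would mean the dots go downhill.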

Two-Way Table

Suppose a study was done to see who gets the flu most often. The study included men, women, boys, and girls, with one hundred people in each group. The table below shows the results. It looks like boys are more likely to get the flu than men, women, or girls.

Group    Flu
Men      17
Women    15
Boys     23
Girls    19

The next study was the same, except the researchers studied who gets the common cold. The results were added to the original table. This is now called a two-way table because it sorts the counts two ways at once: by group (the rows) and by illness (the columns). Such tables are sometimes called two-way frequency tables. Remember, frequency means how often a particular event happened.

Group    Flu    Cold
Men      17     20
Women    15     19
Boys     23     34
Girls    19     27

Two-way tables are good for showing patterns between two categories. For example, this table would be good for answering the question, “Are people likely to get colds more often than the flu?” It looks like the answer is yes. Every group had more cases of colds than of the flu.
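The same comparison can be checked mechanically. The sketch below stores the two-way table from this section as a nested dictionary (one possible layout, chosen here only for illustration) and answers the question row by row.

```python
# Counts from the two-way table above (100 people in each group).
table = {
    "Men":   {"Flu": 17, "Cold": 20},
    "Women": {"Flu": 15, "Cold": 19},
    "Boys":  {"Flu": 23, "Cold": 34},
    "Girls": {"Flu": 19, "Cold": 27},
}

# Did every group have more colds than flu cases?
colds_higher = all(row["Cold"] > row["Flu"] for row in table.values())

# Column totals across all 400 people.
flu_total = sum(row["Flu"] for row in table.values())
cold_total = sum(row["Cold"] for row in table.values())

# Which group had the most flu cases?
most_flu = max(table, key=lambda group: table[group]["Flu"])

print(colds_higher, flu_total, cold_total, most_flu)
```

Reading across a row compares illnesses within one group; reading down a column compares groups for one illness.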

Samples

When you are collecting data for a study of a certain group, you often have to decide how many members of the group to include. If you wanted the average height of high school seniors in Ohio, you likely wouldn’t be able to measure them all, so you would measure a smaller group, called the sample. Maybe it would be 500 seniors. The sample would be that group, and the sample size would be 500. In general, bigger samples give more reliable results.

Random Sample

We were just looking at data about people who get the flu or a cold. It’s important when picking the people to put in your study to make sure that you randomly pick people. For example, you wouldn’t want to pick your sample from a clinic waiting room, because those people are more likely to be sick than someone out in the general public. That would make your number of people with the flu higher than in the average population. Trying to make a sample truly random can be hard to do, but it’s always something to shoot for.
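In practice, a random pick is often made by giving every member of the population an ID and letting a computer choose. The sketch below assumes a hypothetical population of 2,000 ID numbers and uses Python's standard `random.sample`, which picks without repeats and gives every ID the same chance of being chosen.

```python
import random

# Hypothetical population: ID numbers for 2,000 people.
population = list(range(1, 2001))

# A fixed seed is used here only so the sketch gives the same picks each run.
random.seed(42)

# Choose 100 different people, each equally likely to be picked.
sample = random.sample(population, k=100)

print(len(sample), "people chosen; first five IDs:", sample[:5])
```

Nothing about a person (such as sitting in a clinic waiting room) affects whether their ID is chosen, which is what keeps the sample from leaning toward one kind of person.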

Multiple or Simulated Samples

To increase the reliability of the statistics, it’s always good to measure several samples.

The table below shows the result of counting vowels in words chosen at random. Each sample is distinct and consists of 30 words.

            a     e      i     o    u
Sample 1    7     14     3     6    4
Sample 2    10    12     8     3    3
Sample 3    4     12     5     6    7
Average     7     12.7   5.3   5    4.7

It’s pretty clear from the averages that the most used vowel was e, with a coming in second. The table also tells us about variability, meaning how much the numbers for each vowel stray from the average. For example, look at the e column data and you will see the numbers \(14, 12,\) and \(12\). All three are quite close to the average of \(12.7\). In other words, there’s not much variability among them.

On the other hand, look at the a column data and see \(7, 10,\) and \(4\), with an average of \(7\). The \(7\) is right on the average, but the \(4\) and \(10\) aren’t that close. In other words, the data in that column have more variability than the e column data.

The point is this: Low variability is good. If all the data points are very close to each other, you can be pretty confident about your statistics. High variability is bad. If the numbers are all over the place, the data likely isn’t very reliable.
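The averages in the table, and a simple measure of variability, can be reproduced with a short sketch. The guide does not name a specific measure, so the range (largest value minus smallest value) is used here as an assumption; a small range means low variability.

```python
# Vowel counts from the three samples in the table above.
samples = {
    "a": [7, 10, 4],
    "e": [14, 12, 12],
    "i": [3, 8, 5],
    "o": [6, 3, 6],
    "u": [4, 3, 7],
}

# Average of each column, rounded to one decimal place as in the table.
averages = {vowel: round(sum(counts) / len(counts), 1)
            for vowel, counts in samples.items()}

# Range of each column: largest count minus smallest count.
ranges = {vowel: max(counts) - min(counts)
          for vowel, counts in samples.items()}

for vowel in samples:
    print(f"{vowel}: average {averages[vowel]}, range {ranges[vowel]}")
```

The e column has the smallest range and the a column the largest, which matches the comparison made in the text.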

Slope and Intercept

We’ve seen a number of linear graphs in this review. They all have an equation of the form \(y=mx +b\), where \(m\) is the slope of the line and \(b\) is the \(y\)-intercept, the point where the line crosses the \(y\)-axis. The graph below doesn’t show any \(x\) or \(y\) values. Instead of \(y\) on the vertical axis, we have distance, and we have time on the horizontal axis. We can use \(d\) for distance and \(t\) for time and write an equation for the line: \(d = mt+b\). It has exactly the same form as \(y=mx+b\), but we are using \(d\) and \(t\) where we had \(y\) and \(x\) before.

When we were doing graphs using just \(x\) and \(y\), none of the numbers we used had any units, but we’ve seen linear graphs that use actual measurements, and their units give us information. You should be able to look at a graph like the one below and explain the physical meaning of its slope and \(y\)-intercept.

A car is traveling along when someone starts a timer and begins recording the distance the car travels.

[Figure: Slope and Intercept]

Slope = \(\frac{\text{rise}}{\text{run}} = \frac{20}{5} = 4\).

What does a slope of \(4\) in this graph mean? Let’s put the units in and see what that tells us.

The rise is a distance measured in feet, and the run is a time measured in seconds. Putting those units into the slope equation, we get \(m=\frac{20\text{ ft}}{5\text{ s}} = 4\,\frac{\text{ft}}{\text{s}}\). We can read this as \(4\) feet per second, which is the speed of the car. We can push this a little further and state that the slope of any distance-time graph is speed.

What about \(b\), the \(y\)-intercept? We can see that the line crosses the vertical axis at \((0, 10)\), so \(b = 10\) feet. What does that mean? At zero seconds, the car was already \(10\) feet along. How could you explain that? It just means that the car had already traveled \(10\) feet before the timer started.

The point of this section is that by looking at the units, you should be able to interpret linear graphs.
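As a quick check of the numbers above, the sketch below rebuilds the line \(d = mt + b\) from two points. The points \((0, 10)\) and \((5, 30)\) are an assumption: the first is the intercept described above, and the second is chosen to match a rise of \(20\) feet over a run of \(5\) seconds.

```python
# Two points assumed to lie on the line: (time in seconds, distance in feet).
t1, d1 = 0, 10
t2, d2 = 5, 30

# Slope = rise / run, in feet per second (the car's speed).
slope = (d2 - d1) / (t2 - t1)

# Intercept = distance at t = 0, in feet (the car's head start).
intercept = d1 - slope * t1

def distance(t):
    """Distance in feet after t seconds, using d = m*t + b."""
    return slope * t + intercept

print(f"speed: {slope} ft/s, head start: {intercept} ft")
print(f"distance after 8 s: {distance(8)} ft")
```

Keeping the units in the comments makes the meaning of each number clear: the slope is a speed and the intercept is a starting distance.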
