The Statistics & Probability category makes up around 10% of the ACT Math section. Although it may seem daunting at first, practice with recognizing patterns in these types of questions will help you answer them with ease! Let's dive in!!
There are four main categories included in ACT Statistics and Probability topics.
Data Collection Methods
This section focuses on understanding the ways that data is collected, and identifying inaccurate ways of collecting data.
Center and Spread
This section focuses on correctly calculating and interpreting median, mode, mean, and range, as well as choosing which best summarizes and captures the data.
Bivariate Data
This section focuses on being able to understand visual representations of bivariate data, as well as understanding correlation between two variables.
Simple Probabilities and Calculations
This section is about calculating simple probabilities from scenarios and interpreting them to fit the situation.
There are two main types of data collected:
Qualitative data is data that describes qualities or categories rather than numerical amounts. For example, collecting data on the most popular ice cream flavors among teenagers. Since the responses are categories (flavors) rather than measurements, the data is considered qualitative.
Quantitative data is data that is collected in numerical quantities. For example, collecting data on the heights of students in a high school.
There are also two different types of quantitative data that can be collected.
Discrete data: this is data that can only take specific, separate values. Ex. collecting data on shoe sizes, where 7.333 or 8.0009 are not options. There are only a few specific numbers that are used to describe shoe sizes.
Continuous data: this is data where every possible number in an interval can be considered a data point. Ex. collecting data on weight, where 200 lbs or 127.752 lbs are both possible responses.
There are several ways that data can be collected. Some of the most common methods covered are:
Surveys and censuses: these are studies where people are asked specific questions, usually to gain insight about a population. A survey questions a sample of the population, while a census is a population-wide survey.
Observational studies: these are studies that observe a sample without assigning any treatment. They can establish a correlation, but not cause and effect.
Experiments: these are studies that assign a treatment and look at its effects to establish a cause and effect relationship.
A good sample in a study should...
A hypothesis is a prediction made about the outcome of a study.
There are two main types of hypotheses
The null hypothesis (H0): this hypothesis claims no significant relationship between variables.
The alternative hypothesis (Ha): this hypothesis predicts some significant relationship between variables.
Let's look at some examples
Null hypothesis: There is no relationship between height and age amongst children.
Alternative hypothesis: There is a relationship between height and age amongst children.
Null hypothesis: There is no relationship between temperature and number of ice cream cones sold daily.
Alternative hypothesis: The higher the temperature, the more ice cream cones sold in a day.
What do we do with a hypothesis?
Hypotheses are usually established before a study is conducted... so what's the next step?
Significance level
After writing your hypothesis, α, or the significance level, must be established.
The significance level represents the chance of claiming there is a relationship between variables when there really is not.
The significance level is a measure of how confident you have to be in your results before deciding to reject the null hypothesis.
The lower the significance level, the more confident you want to be that your conclusion is accurate.
Ex. a significance level of 0.07 means you accept a 7% chance of claiming there is a relationship between variables when there really is not.
Throughout this study, the goal would be to get results that have a 7% chance or less of making this false claim.
P-Value
A p-value is a value you calculate after the data is collected.
The p-value measures the chance that you would have gotten data at least as extreme as yours if there really were no relationship between variables.
The lower the p-value, the stronger the evidence that there is a relationship between variables.
A p-value falls between 0 and 1, where values close to 1 mean the data is very consistent with there being no relationship between variables.
A very common cutoff to compare the p-value against is a significance level of 0.05.
P-Value and Significance Level
If the p-value calculated is equal to or less than the significance level, the null hypothesis can be rejected. This means the data supports a relationship between variables.
If the calculated p-value is greater than the significance level, the null hypothesis cannot be rejected.
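Let's look at some examples
α = 0.05 and p = 0.07: the p-value is greater than α, so the null hypothesis cannot be rejected.
α = 0.01 and p = 0.01: the p-value is equal to α, so the null hypothesis can be rejected.
α = 0.07 and p = 0.06: the p-value is less than α, so the null hypothesis can be rejected.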
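The decision rule above can be sketched in a few lines of Python. This is only an illustration: the function name `decide` is made up for this example, and it simply compares the two numbers as described in the text.

```python
def decide(p_value: float, alpha: float) -> str:
    """Apply the rule from the text: reject H0 when p <= alpha."""
    if p_value <= alpha:
        return "reject the null hypothesis"
    return "fail to reject the null hypothesis"


# The three (alpha, p) pairs from the examples above.
for alpha, p in [(0.05, 0.07), (0.01, 0.01), (0.07, 0.06)]:
    print(f"alpha = {alpha}, p = {p}: {decide(p, alpha)}")
```

Note that the tie case (p exactly equal to α) counts as a rejection here, matching the "equal to or less than" wording of the rule.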
This section covers some of the issues that can affect the accuracy of results in a study.
Bias
Bias is any factor that could influence the data so that it does not accurately represent the reality of its population.
Some common types of bias include:
Nonresponse bias: this is bias that occurs when certain people choose not to respond to questions on a survey. For example, if a survey is being conducted on grade-point averages in a high school, people with lower GPAs would be less likely to respond. This would make the average GPA seem higher based on the survey than it really is.
Response bias: this is bias resulting from the way a question is written, which can influence the way people respond. It can also be bias that results from people intentionally answering a question dishonestly.
Under-coverage bias: this is bias that happens when a sample does not accurately cover the population. For example, if you are conducting a study on how Americans do their grocery shopping, and only sample people who live in Kentucky, the results will not be accurate in representing all of the United States.
Solutions for Bias
Bias can often be minimized by...
sampling a larger group of people
including a more diverse group of people in the sample
adjusting questions to be more objective and neutrally worded
increasing anonymity of a survey
False Positives and Negatives
False Positives
This is when research states that something is true when it is actually false.
False positives can also be referred to as type I errors.
False Negatives
This is when research states that something is false when it is actually true.
False negatives can also be referred to as type II errors.
The rate of occurrence of type II errors is also referred to as β, or beta.
Inaccurate results can have dangerous impacts, especially in science and healthcare.
Power
Power is how likely a certain test is to not produce a false negative result.
The equation to determine power is 1 - β
A test with a higher power is better at detecting a real effect than one with a low power.
Power can be increased by...
Always read through the full question, especially if there's a scenario given.
Underline key numbers, phrases, and vocabulary as you go.
Don't panic if there's a word or phrase you're not familiar with! Instead, identify those you are familiar with, and use context clues to understand the rest of the question.
If you're unsure... take a guess!! There is no point penalty for answering a question incorrectly on the ACT. If you are completely lost on a question, always try to put something down anyway.
Image courtesy of ACT Form 15AA51.
This question asks us to use past data to predict the expected outcomes for a group of 1000 people. We can assume, since the question states that it is a random group of applicants, that the same percentages of success can be applied to this sample. Therefore, we can start by identifying how many people can be expected to pass the written test. This should be 80% of 1000, or 1000(0.8), which equals 800 people. Next, we know that of these 800 people, only 60% can also be expected to pass the driving test. So, we should find 60% of 800, or 800(0.6), which equals 480.
The answer is B) 480
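The two-step percentage calculation can be checked with a short sketch. The variable names are made up for illustration, and whole-number percentages are used so the arithmetic stays exact.

```python
applicants = 1000
written_pass_percent = 80   # 80% pass the written test
driving_pass_percent = 60   # 60% of those also pass the driving test

# Step 1: expected number passing the written test.
expected_written = applicants * written_pass_percent // 100

# Step 2: expected number of those who also pass the driving test.
expected_both = expected_written * driving_pass_percent // 100

print(expected_written, expected_both)
```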
There are three main measures that can be used to identify the center of a data set
Mean: this is the same as taking the average of a data set. Add up all the data values and divide by the number of values added together.
mean = (sum of values) / (number of values)
Ex. Finding the mean of this data set: 24, 9, 0, 0, 56, 9, 12, -7
First, I will sum the values in the set: (24+9+0+0+56+9+12+(-7)) = 103
Next, I will count the total number of terms in the set. Remember to include repeated numbers and zero values!! I count 8 total terms.
Lastly, I will divide the sum from step one by 8: 103/8 = 12.875. Therefore the mean of this data set is 12.875
Mean can also be represented with the Greek letter μ (mu)
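As a quick check of the three steps (sum, count, divide), here is a minimal sketch using the same data set. `statistics.mean` from Python's standard library is shown alongside the manual calculation.

```python
from statistics import mean

data = [24, 9, 0, 0, 56, 9, 12, -7]

total = sum(data)        # step 1: add up all the values
count = len(data)        # step 2: count every term, including repeats and zeros
average = total / count  # step 3: divide the sum by the count

print(total, count, average)
print(mean(data))        # the standard library gives the same result
```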
Median: this is the literal middle value in a data set. To find it:
Arrange data points from smallest to largest by value.
Cross off the smallest and largest values, then the second smallest and second largest, etc., until you reach one or two values in the middle.
If there is only one value, this is the median! If there are two values, average these two values together. Ta-da!
Ex. Finding the median of this data set: 30, 7, 89, 18, 4
Ordering values: 4, 7, 18, 30, 89
Crossing off values:
Cross off 4 and 89, leaving 7, 18, 30
Cross off 7 and 30, leaving 18
The median value of the data set is 18.
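The crossing-off process can be written as a short loop. This is a sketch that assumes a non-empty list; the function name `median_by_crossing_off` is made up for this example.

```python
def median_by_crossing_off(values):
    """Find the median by repeatedly removing the smallest and largest values."""
    remaining = sorted(values)          # arrange from smallest to largest
    while len(remaining) > 2:
        remaining = remaining[1:-1]     # cross off one value from each end
    # One value left: that is the median. Two left: average them.
    return sum(remaining) / len(remaining)


print(median_by_crossing_off([30, 7, 89, 18, 4]))
print(median_by_crossing_off([1, 2, 3, 4]))
```

In practice, `statistics.median` in Python's standard library does the same job.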
Mode: this is the data value that appears most often in a set of data.
Not all data sets will have a mode, and some will have more than one mode.
Count the number of times each value appears in a set of data.
Whichever value appears most frequently is the mode!
Ex. Finding the mode of this data set: 9, 4, 7, 9, 10, 15, 7, 8, 4, 9, 9, 18
The value 9 appears four times, more than any other value, so the mode is 9.
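Counting how often each value appears is easy to sketch with `collections.Counter`. The helper `modes` below is a made-up name for illustration. It assumes a non-empty list and returns an empty list when every value appears only once, which is one common way of saying "no mode".

```python
from collections import Counter


def modes(values):
    """Return every value that appears most often (there may be more than one)."""
    counts = Counter(values)
    highest = max(counts.values())
    if highest == 1:
        return []  # every value appears once, so there is no mode
    return sorted(v for v, c in counts.items() if c == highest)


data = [9, 4, 7, 9, 10, 15, 7, 8, 4, 9, 9, 18]
print(modes(data))
print(modes([1, 1, 2, 2, 3]))   # two modes
print(modes([1, 2, 3]))         # no mode
```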
Mean
The mean uses every value in the data set, but it is pulled toward outliers and strong skews.
Median
The median is a consistent way to identify the center of data even with strong skews and outliers.
One downside is that the median is more difficult to find and can hide the impact of outlier data points.
Example
-56, -9, 0, 1, 1, 2, 3, 3, 3, 4, 7, 7, 45, 67, 89, 100, 2578, 99785
The median of this data set is 3.5. However, this median does not show us the reality of how wide-ranging the data is.
When a data set is perfectly symmetrical and normal, the mean and median will have the same value.
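To see how differently the two measures react to outliers, here is a small sketch that computes both for the example data set, using Python's standard `statistics` module.

```python
from statistics import mean, median

data = [-56, -9, 0, 1, 1, 2, 3, 3, 3, 4, 7, 7, 45, 67, 89, 100, 2578, 99785]

center_by_median = median(data)  # average of the two middle values
center_by_mean = mean(data)      # pulled upward by 2578 and 99785

print(center_by_median)
print(center_by_mean)
```

The median stays near the bulk of the values, while the mean lands in the thousands because of the two very large values.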
The spread of a data set is a measure of the data's variability, or how varied the values in the data set are.
Examples
A data set with high variability: -957, -350, -312, -177, -94, -84, -73, -20, 0, 1, 35, 52, 77, 100, 644, 981
A data set with low variability: -1, -0.5, -0.25, -0.1, 0, 0.2, 0.333, 0.6, 0.8, 1, 2, 2.5
The most common measures of variability and spread are
Range: this is found by subtracting the smallest number in a data set from the largest number.
Interquartile Range (IQR): this is the range of the middle half of a data set, calculated between the medians of its lower and upper halves.
First, find the median of the data set.
Next, divide the data set in two with the median as the center. Find the median of the upper and lower halves of the data.
Subtract the median of the lower half from the median of the upper.
Example
IQR of this data set: 19, 23, 24, 26, 29, 33, 34
First, the median of this set is 26
Next, we can find the median of the upper and lower halves. These are called quartile 1 (lower median) and quartile 3 (upper median)
Upper Half: 26, 29, 33, 34 (median = (29 + 33)/2 = 31)
Lower Half: 19, 23, 24, 26 (median = (23 + 24)/2 = 23.5)
The interquartile range is 31 - 23.5 = 7.5
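The IQR steps can be sketched as follows. This follows the approach used in the example, where the overall median is included in both halves when the data set has an odd number of values. Other textbooks and calculators leave the median out of both halves, which can give a different answer. The function name `iqr_inclusive` is made up for this illustration.

```python
from statistics import median


def iqr_inclusive(values):
    """Return (Q1, Q3, IQR), keeping the middle value in both halves."""
    data = sorted(values)
    n = len(data)
    half = (n + 1) // 2        # for an odd count, the halves share the median
    lower = data[:half]
    upper = data[n - half:]
    q1 = median(lower)
    q3 = median(upper)
    return q1, q3, q3 - q1


print(iqr_inclusive([19, 23, 24, 26, 29, 33, 34]))
```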
Interquartile range is often used to create box and whisker plots.
Image courtesy of Statistics Canada.
Standard Deviation: this is roughly the typical distance of data points from the mean.
The greater the standard deviation, the greater the variability of the data.
Otherwise said, the further the data points are from the mean, the greater the standard deviation.
You will not have to calculate the standard deviation on the ACT!! You will just have to be familiar with what it means.
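Although you will not need to calculate it on the test, a quick sketch can show what "greater standard deviation means greater variability" looks like. It uses the high-variability and low-variability data sets from the spread examples earlier, with `statistics.pstdev` (the population standard deviation) from Python's standard library.

```python
from statistics import pstdev

high_variability = [-957, -350, -312, -177, -94, -84, -73, -20,
                    0, 1, 35, 52, 77, 100, 644, 981]
low_variability = [-1, -0.5, -0.25, -0.1, 0, 0.2, 0.333, 0.6, 0.8, 1, 2, 2.5]

spread_high = pstdev(high_variability)
spread_low = pstdev(low_variability)

# The widely scattered data set has the much larger standard deviation.
print(spread_high, spread_low)
```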
Many of you may already be familiar with the concepts covered in this category, so it's especially important to make sure you are paying attention to detail to avoid making little mistakes!!
Practice, practice, practice, practice, practice!! If you're having a bit of trouble with any of these topics, practice will be your best friend. By doing practice questions and applying these topics, you'll be even more familiar with them for when you do take the real test!
With that being said....
This question asks us to rearrange the z-score equation, z = (x - μ)/σ, to solve for x. Let's rewrite the equation in terms of x.
z(σ) = x - μ so z(σ) + μ = x
Now we can plug in the given values (z = 2, σ = 6, μ = 78).
2(6) + 78 = 12 + 78 = 90
Therefore the answer is F) 90
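The rearranged equation is easy to check numerically. In this sketch the function names are made up for illustration, and the values z = 2, σ = 6, μ = 78 are the ones plugged in above.

```python
def x_from_z(z, mu, sigma):
    """Rearranged z-score equation: x = z * sigma + mu."""
    return z * sigma + mu


def z_score(x, mu, sigma):
    """Original z-score equation: z = (x - mu) / sigma."""
    return (x - mu) / sigma


x = x_from_z(z=2, mu=78, sigma=6)
print(x)
print(z_score(x, mu=78, sigma=6))  # plugging x back in recovers z
```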
Bivariate data is data that records two variables for each subject so that the relationship between them can be studied.
The two variables are
Dependent variable: This variable represents an effect. It is usually represented by the y variable.
Independent variable: This variable represents a cause. It is usually represented by the x variable.
Let's look at some examples
Correlation is a way of describing a relationship between variables.
Correlation does not equal cause and effect!!!!
There are three main types of possible correlation
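Positive correlation (coefficient between 0 and +1): as one variable increases, so does the other.
Negative correlation (coefficient between -1 and 0): as one variable increases, the other decreases.
No correlation (coefficient of 0): there is no linear relationship between variables.
The correlation coefficient is measured on a scale from -1 to +1. The closer it is to -1 or +1, the stronger the correlation.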
A correlation coefficient of 0.33 indicates a ______________ correlation.
A) Weak positive
B) Strong positive
C) Weak negative
Answer: A
Since 0.33 is positive, it is indicative of a positive correlation. However, since 0.33 is closer to 0 than it is to +1, this shows a weak positive correlation.
A correlation coefficient of -0.75 indicates a _______________ correlation.
A) Weak positive
B) Strong negative
C) Weak negative
Answer: B
Since -0.75 is negative, it shows a negative correlation. However, it is closer to -1 than it is to 0, showing a strong negative correlation.
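The reasoning in these two practice questions (check the sign, then check whether the value is closer to 0 or to ±1) can be sketched as a small function. The 0.5 cutoff is an assumption that simply encodes "closer to 0 than to ±1". It is a rule of thumb, not an official ACT threshold, and the function name is made up.

```python
def describe_correlation(r: float) -> str:
    """Describe a correlation coefficient using the sign and distance from 0."""
    if not -1 <= r <= 1:
        raise ValueError("a correlation coefficient must be between -1 and +1")
    if r == 0:
        return "no correlation"
    direction = "positive" if r > 0 else "negative"
    # Assumed cutoff: values at least halfway to +1 or -1 are called strong.
    strength = "strong" if abs(r) >= 0.5 else "weak"
    return f"{strength} {direction}"


print(describe_correlation(0.33))
print(describe_correlation(-0.75))
```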
Come up with a way for yourself to easily remember the meanings of a correlation coefficient and positive vs negative correlation.
Practice identifying correlation based on graphs! Many of the ACT questions on correlation could include graphs, so it's important to be familiar with them.
Image courtesy of Varsity Tutors.
First, let's read through the question and establish what we know. We already know that a correlation coefficient of 1.0 shows a strong positive correlation between the two variables, but not a cause and effect relationship. Now, we can look through the answer choices and identify which is not correct. We have already established that the association is positive, so we know it is not the first answer. We also know that correlation is measured on a scale from -1 to +1, so we can cross off the second choice as well. We also see that there is a strong correlation due to the coefficient of 1, so we can eliminate answer three. The fourth answer, however, says that one variable causes another. Correlation and causation are not the same. Causation cannot be assumed from correlation or a correlation coefficient.
So, the answer should be the fourth choice.
Standard Probability
Probability is a way to measure how likely something is to happen, often expressed as a percentage or fraction.
It involves placing the number of outcomes you want to focus on as the numerator, and the total number of possible outcomes as the denominator.
Image courtesy of BYJU's.
Let's look at a simple example!!
If I have a bag of 10 blue marbles, 4 green marbles, and 3 yellow marbles, what is the probability of drawing a yellow marble?
First, add up the total number of marbles in the bag. (10 + 4 + 3 = 17)
Next, consider the total number of yellow marbles in the bag. (3)
Create a fraction with the "object you want to find probability for" over the "total amount of options": 3/17
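Here is a minimal sketch of the marble example using Python's `fractions.Fraction`, which keeps the answer as an exact fraction. The dictionary of marble counts is just one convenient way to store the numbers from the question.

```python
from fractions import Fraction

marbles = {"blue": 10, "green": 4, "yellow": 3}

total = sum(marbles.values())                  # total number of outcomes
p_yellow = Fraction(marbles["yellow"], total)  # desired outcomes over total

print(p_yellow)         # exact fraction
print(float(p_yellow))  # the same probability as a decimal
```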
"And vs Or" Probability
"And" probability questions ask you to find the probability of one outcome AND another outcome.
"Or" probability questions ask you to find the probability of one outcome OR another outcome occurring.
Let's see some more examples
A teacher is teaching a class. They put 8 popsicle sticks, each with a different student's name written on it, in a bag. The teacher draws a stick every time there is an opportunity for participation in class. Every stick has an equal chance of being drawn. After a student's stick is drawn, it is put immediately back in the bag.
What is the probability that the teacher draws Helen's name five times in a row?
First, we need to determine the probability of the event happening once. In this case, the 'event' is Helen's name being drawn from the bag. The probability is 1/8.
Since we want to know the probability of Helen being chosen 5 times, we need to put this probability to the fifth power
(1/8) x (1/8) x (1/8) x (1/8) x (1/8) represents the chance that Helen is chosen the first time AND the second time AND the third time, and so on. Because the stick goes back in the bag, each draw has the same 1/8 chance, and the chances are multiplied together.
Writing out this process can be time-consuming, so instead we can do: (1/8) ^ 5 = 1/32768
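The "with replacement" calculation can be checked with exact fractions. This sketch assumes, as the scenario states, that every stick is equally likely and is returned to the bag after each draw.

```python
from fractions import Fraction

p_helen_once = Fraction(1, 8)   # one Helen stick out of 8
draws = 5

# Each draw is independent, so multiply the same probability 5 times.
p_five_in_a_row = p_helen_once ** draws

print(p_five_in_a_row)
```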
Let's imagine the same situation as before, but this time the teacher takes a stick out of the bag every time a student is chosen.
What is the probability that the teacher will choose Liam, Robert, and Helen in exactly that order?
First, we need to determine the probability of choosing Liam first. In this case, it is (1/8). This is because there are 8 total sticks in the bag, and only one Liam.
However, what happens once we select one person and take the stick out of the bag? There are now 7 total sticks in the bag, so the probability of choosing Robert next is (1/7).
The same pattern continues each time a student is selected, so the probability of choosing Helen third is (1/6). Multiplying gives (1/8) x (1/7) x (1/6) = 1/336.
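When sticks are not put back, the denominator shrinks by one on each draw. A short sketch of that pattern, using exact fractions and the 8 sticks from the scenario:

```python
from fractions import Fraction

sticks_in_bag = 8
names_in_order = ["Liam", "Robert", "Helen"]

probability = Fraction(1)
for draw_number, name in enumerate(names_in_order):
    # One stick is removed per draw, so fewer sticks remain each time.
    remaining = sticks_in_bag - draw_number
    probability *= Fraction(1, remaining)

print(probability)
```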
Combinations and Permutations
What are combinations and permutations?
Combinations are groups of events where the order of the events does not matter.
Permutations are groups of events where the order of the events does matter.
When answering combination and permutation questions, we are aiming to find how many different groups (combinations) or orderings (permutations) can be made.
Calculating Probabilities
Permutations
number of permutations = n! / (n - r)!
Image courtesy of Math is Fun.
Combinations
Repeating Combinations
In repeating combination questions, the order of events does not matter, and each event can also happen multiple times.
Let's start by looking at the equation.
number of combinations = (r + n - 1)! / (r! x (n - 1)!)
Image courtesy of Math is Fun.
Let's say I'm at the grocery store, and I need to buy 2 cartons of milk.
There are 4 different options of milk in the store:
Low-Fat Milk
Almond Milk
Soy Milk
Full-Fat Milk
I can buy any combination of milk I want to: two low-fat milk cartons, one soy carton with one almond carton, etc.
How many possible combinations of milk can I buy?
n is 4 because there are 4 options of milk, and r is 2 because I am buying 2 cartons.
(2 + 4 - 1)! / (2! x (4 - 1)!) = 5! / (2! x 3!) = 120/12 = 10 possible combinations
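One way to answer the milk question is to list every option and count. The sketch below uses `itertools.combinations_with_replacement` and compares the count with the standard formula for combinations with repetition, (r + n - 1)! / (r! x (n - 1)!), where n is the number of milk options and r is the number of cartons.

```python
from itertools import combinations_with_replacement
from math import factorial

milks = ["low-fat", "almond", "soy", "full-fat"]
n = len(milks)   # 4 options to choose from
r = 2            # 2 cartons to buy

# List every unordered pair, allowing the same milk twice.
choices = list(combinations_with_replacement(milks, r))
for choice in choices:
    print(choice)

# Count predicted by the formula for combinations with repetition.
by_formula = factorial(r + n - 1) // (factorial(r) * factorial(n - 1))
print(len(choices), by_formula)
```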
Non-repeating Combinations
In non-repeating combinations, the order of items does not matter, but they cannot be repeated.
Let's take a look at the equation:
number of combinations = n! / (r! x (n - r)!)
Image courtesy of Math is Fun.
Let's use another scenario. In this case, I am making a smoothie.
I want to be healthy, so I plan on adding 5 fruits and vegetables to my smoothie.
I have a total of 7 fruits and vegetables in my fridge:
Strawberries
Corn
Banana
Pineapple
Mango
Cucumber
Lemon
I will only be adding each produce item once; I don't want any ingredient to overpower the others!
How many possible smoothie combinations are there?
n in this case is 7 because I have 7 fruits and veggies to pick from
r is 5, because I only have space for 5 of these items in the smoothie.
(7!)/ ((5!) x (7-5)!)
7!/ (5! x 2!)
(7 x 6 x 5 x 4 x 3 x 2 x 1) / ((5 x 4 x 3 x 2 x 1) x (2 x 1))
Cancel the shared 5 x 4 x 3 x 2 x 1 from the top and bottom:
(7 x 6)/ (2 x 1)
42/2
21 possible smoothie combinations!!
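The smoothie count can be confirmed three ways: with the factorial formula, with `math.comb` from Python's standard library (available in Python 3.8 and later), and by listing every 5-item group with `itertools.combinations`.

```python
from itertools import combinations
from math import comb, factorial

produce = ["strawberries", "corn", "banana", "pineapple",
           "mango", "cucumber", "lemon"]
n = len(produce)  # 7 items to pick from
r = 5             # 5 items go in the smoothie

by_formula = factorial(n) // (factorial(r) * factorial(n - r))
by_library = comb(n, r)
by_listing = len(list(combinations(produce, r)))

print(by_formula, by_library, by_listing)
```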
Get familiar with going through the same processes and using the equations, especially using your calculator!
Try to use the same type of calculator to study as you plan to use on the test.
Don't feel overwhelmed if a problem seems complex at first glance. Read through the question and break it down into smaller chunks to make it easier to solve.
First, we can identify what type of question this is.
In this case, it is a non-repeating permutation problem. This is because the order of the plants matters and she cannot pick the same plant twice, so the number of plants she has to choose from decreases each time.
Next, let's identify our r and n values.
Emily has 6 plants to choose from, so n = 6. There are 3 spots she can put the plants in, so r = 3
Now, let's plug into the equation and solve.
6!/(6-3)! = 6!/3! = (6 x 5 x 4 x 3 x 2 x 1)/ (3 x 2 x 1) = 6 x 5 x 4 = 120
Therefore, the correct answer is D) 120
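The permutation count can be double-checked in the same way. In this sketch the plant labels A to F are hypothetical placeholders (the question's actual plant names are not given here), and `math.perm` requires Python 3.8 or later.

```python
from itertools import permutations
from math import factorial, perm

plants = ["A", "B", "C", "D", "E", "F"]  # hypothetical labels for the 6 plants
n = len(plants)  # 6 plants to choose from
r = 3            # 3 spots to fill, where order matters

by_formula = factorial(n) // factorial(n - r)
by_library = perm(n, r)
by_listing = len(list(permutations(plants, r)))

print(by_formula, by_library, by_listing)
```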
You've made it to the end of this guide, and you're one step closer to crushing the Math ACT!!!
Remember, you've got this!!
Good luck!