Visual Displays of Information (Lab 4)

Published

September 25, 2025

Today, you’ll be making some graphs in Jamovi and Excel/Sheets. You’ll also be playing around a bit with three different datasets. Again, you’ll turn in an “answer sheet” on Brightspace. Please turn that in by next lab. You needn’t turn in your data, just the answer sheet.

Data

You should start by downloading two datasets.

  • The updated and cleaned friends dataset is available on Brightspace, here.

  • The nightingale dataset is available on Brightspace, here, or for download here.

Nightingale data

Let’s start by looking at the nightingale data (nightingale.csv), which I also briefly introduced in class. This is the data from Florence Nightingale’s research in the 1850s on causes of death in the army during the Crimean War. Load it into Jamovi.

Date Month Year Army Disease Wounds Other Disease.rate Wounds.rate Other.rate
1854-04-01 Apr 1854 8571 1 0 5 1.4 0.0 7.0
1854-05-01 May 1854 23333 12 0 9 6.2 0.0 4.6
1854-06-01 Jun 1854 28333 11 0 6 4.7 0.0 2.5
1854-07-01 Jul 1854 28722 359 0 23 150.0 0.0 9.6
1854-08-01 Aug 1854 30246 828 1 30 328.5 0.4 11.9
1854-09-01 Sep 1854 30290 788 81 70 312.2 32.1 27.7
1854-10-01 Oct 1854 30643 503 132 128 197.0 51.7 50.1
1854-11-01 Nov 1854 29736 844 287 106 340.6 115.8 42.8
1854-12-01 Dec 1854 32779 1725 114 131 631.5 41.7 48.0
1855-01-01 Jan 1855 32393 2761 83 324 1022.8 30.7 120.0
1855-02-01 Feb 1855 30919 2120 42 361 822.8 16.3 140.1
1855-03-01 Mar 1855 30107 1205 32 172 480.3 12.8 68.6
1855-04-01 Apr 1855 32252 477 48 57 177.5 17.9 21.2
1855-05-01 May 1855 35473 508 49 37 171.8 16.6 12.5
1855-06-01 Jun 1855 38863 802 209 31 247.6 64.5 9.6
1855-07-01 Jul 1855 42647 382 134 33 107.5 37.7 9.3
1855-08-01 Aug 1855 44614 483 164 25 129.9 44.1 6.7
1855-09-01 Sep 1855 47751 189 276 20 47.5 69.4 5.0
1855-10-01 Oct 1855 46852 128 53 18 32.8 13.6 4.6
1855-11-01 Nov 1855 37853 178 33 32 56.4 10.5 10.1
1855-12-01 Dec 1855 43217 91 18 28 25.3 5.0 7.8
1856-01-01 Jan 1856 44212 42 2 48 11.4 0.5 13.0
1856-02-01 Feb 1856 43485 24 0 19 6.6 0.0 5.2
1856-03-01 Mar 1856 46140 15 0 35 3.9 0.0 9.1

Modern graphing software can try to do the Nightingale coxcomb plot we saw in class (see it on Wikipedia here), but you’ll see that it doesn’t quite look as nice as hers.

Note. Dashed line represents 1,000 deaths.

Jamovi can’t do a plot like this, but what it can do quite easily is create a histogram.

  1. Create a histogram using the full Disease data. Note that this histogram should only involve the disease data—a histogram shows frequencies of how often you get a certain response. Are these data normally distributed? How do you know? Answer these last two questions on your answer sheet, #1. Your plot should look something like this (although it might not have the title):

Now create a histogram of the deaths from Wounds in the Nightingale data. Is that one normally distributed? (You don’t need to answer on the answer sheet.)
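If you’d like to see what a histogram is doing under the hood, here is a minimal Python sketch that bins the Wounds column by hand. The bin width of 50 and the function name `bin_counts` are my own choices for illustration; Jamovi picks its own bins, so its plot will not match this exactly. Comparing the mean and the median is one informal way to think about the shape of a distribution.

```python
from statistics import mean, median

# Wounds column copied from the nightingale table above (Apr 1854 to Mar 1856).
wounds = [0, 0, 0, 0, 1, 81, 132, 287, 114, 83, 42, 32,
          48, 49, 209, 134, 164, 276, 53, 33, 18, 2, 0, 0]

BIN_WIDTH = 50  # assumed for illustration; Jamovi chooses its own bins


def bin_counts(values, width):
    """Count how many values fall in [0, width), [width, 2*width), and so on."""
    counts = [0] * (max(values) // width + 1)
    for v in values:
        counts[v // width] += 1
    return counts


counts = bin_counts(wounds, BIN_WIDTH)

# Print a sideways text histogram: one row per bin, one '#' per month.
for i, c in enumerate(counts):
    low = i * BIN_WIDTH
    print(f"{low:>3}-{low + BIN_WIDTH - 1:<3} {'#' * c} ({c})")

print("mean:", mean(wounds), "median:", median(wounds))
```

Each row is one bar of the histogram, and the number of `#` marks is the number of months that fell in that range.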

Okay, let’s make a scatterplot. This kind of plot compares two variables to one another—plotting one on the x-axis and the other on the y-axis. There are a few ways to do this in Jamovi, but we’ll use one that’s straightforward to carry out. Select “scatr”: Scatterplot under Analyses: Exploration. (If you don’t have it, install it under Modules; let me know if you need help.)

You can read about scatterplots in Jamovi here. If you need to install it, look on the Analyses ribbon all the way to the right, where there’s a plus sign and the word Modules. You can add a module.

In brief, scatterplots are used to visualize the relationship between two variables which are both numeric. (They’re what you think of as showing correlations.) That’s what we’re doing here. In almost all cases, smaller values are plotted closer to the origin.

If time is one of your variables, it will almost always be on the x-axis.

Plot deaths from Wounds against those from Disease. It’s up to you which is on the x-axis and which on the y. Add Year into the Group box. Can you draw any conclusions from this?

Suppose you wanted to plot cause of death over time for the entirety of the data we have… Line graphs like this are more challenging in Jamovi—feel free to see if you can make one—but you could pretty smoothly make it in Sheets/Excel. In this case, I’ll just include the plot below. Take a look.

  1. What conclusions do you draw from this figure? How does it compare to the Nightingale coxcomb diagram above? Include this answer as #2 in your answer sheet.

Teaching and Learning Research

Let’s turn to graphing in Excel/Google Sheets, and use a relatively simple example for this.

Fiorella & Mayer (2013) hypothesized that students would learn course material better if they thought they were going to later be asked to teach the material to the rest of the class. To test this, the researchers divided students into three groups. All groups read a short excerpt about the Doppler effect and were later given a 10-question quiz. The control group studied the excerpt and then immediately took the quiz. The preparation group was instructed that they would later teach the material to a group of students. This group also studied the excerpt and then immediately took the quiz. (They did not actually teach.) Finally, the teaching group was instructed that they would later teach the material to a group of students. This group studied the excerpt, actually taught it to a group of students, and then took the quiz. Fiorella & Mayer reported the following results:

Group         n     Comprehension score
                    M       SD
Control       31    6.2     3.3
Preparation   32    7.9*    2.4
Teaching      30    8.7*    2.8

* Significantly different from control group at p < .05

We’re going to plot these in a bar graph. Open Excel or Google Sheets and copy these data into a table. Make sure that the column headers line up correctly. Before doing anything else, delete the asterisks in your copied data. We want Sheets/Excel to recognize these as numbers.

Move the n (sample size) values to the far right column, and then delete the empty column that remains. Now, column A should be group, column B should be means, column C should be SD, and column D should be sample size.

Calculate the standard error of the mean (SEM) in column E for each group. Remember that \(\textrm{SEM}=\frac{SD}{\sqrt{n}}\). In Sheets or Excel (S/E), a formula starts with = and then refers to cells by name, and you take a square root with SQRT(). In cell E7, calculate the average of your SEMs using the =AVERAGE() formula. Your answer should be about 0.5094 (0.50938976 before rounding).

Your spreadsheet columns should look like the below. I’ve added the column letters (A-E) and row numbers (1-7) as you’ll see them in Excel/Sheets, and then the M, SD, n, and SE. The equation for cell E3 is =C3/SQRT(D3), which is then filled down for cells E4 and E5 (=C4/SQRT(D4) and =C5/SQRT(D5)). Finally, in cell E7, I’ve got an average of the three standard errors with =AVERAGE(E3:E5).

     A             B      C      D     E
1                  Comprehension score
2    Group         M      SD     n     SE
3    Control       6.2    3.3    31    0.5926975
4    Preparation   7.9    2.4    32    0.42426407
5    Teaching      8.7    2.8    30    0.51120772
6
7                                      0.50938976
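If you want to double-check your spreadsheet outside of Sheets/Excel, here is a small Python sketch of the same arithmetic. The SD and n values come from the table above; the names `sem`, `sems`, and `average_sem` are just mine for illustration.

```python
from math import sqrt

# (group, SD, n) copied from the Fiorella & Mayer table above.
groups = [
    ("Control", 3.3, 31),
    ("Preparation", 2.4, 32),
    ("Teaching", 2.8, 30),
]


def sem(sd, n):
    """Standard error of the mean: SD divided by the square root of n."""
    return sd / sqrt(n)


# Same as =C3/SQRT(D3) filled down to rows 4 and 5.
sems = {name: sem(sd, n) for name, sd, n in groups}

# Same as =AVERAGE(E3:E5).
average_sem = sum(sems.values()) / len(sems)

for name, value in sems.items():
    print(f"{name:<12} SEM = {value:.7f}")
print(f"Average SEM  = {average_sem:.8f}")
```

The printed values should agree with column E of your spreadsheet to the digits shown there.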

Select the cells representing the names of the groups (i.e., Control, Preparation, Teaching) and the means (6.2, 7.9, 8.7). This should be cells A3:B5 if you’re following along exactly.

Excel

In Excel, go to the Insert menu, then Chart, then Column.

Google Sheets

In Google Sheets, go to Insert, then click Chart. Then, from the dropdown menu at the top, be sure that “Column Chart” is selected.

Both Excel and Sheets should make a chart that compares the means and adds labels on the x-axis.

This part I want you to figure out how to do: give the graph a title, label the y-axis, and explore other possible settings, including (e.g.) colors. (You can probably get to the settings for the chart by double-clicking on it or right-clicking.)

Add the error bars we calculated as SEM: this works fully in Excel, but in Sheets we’ll need to only do it halfway.

Excel

In Excel, go to Add Chart Element: Error Bars: More Error Bars Options. Click on the picture of a column chart in the menu, and change “Error Amount” to Custom. Specify the values in E3:E5 as both positive and negative error values.

Google Sheets

In Google Sheets, after double-clicking on the chart, drop down the “Series” menu. At the bottom of it, check the Error Bars checkbox. Change Type from percent to Constant. Set the value to the average value of the SEM that we calculated above. (Google Sheets won’t let you have different error bars for different bars. This is not good for plotting; I would not use this plot in a senior project. But it’s sufficient for today.)

  1. Submit the plot as part of your Brightspace answer sheet, #3. Chat with a neighbor about what is “lost” by doing this in Sheets vs. Excel. Please copy the chart into your answers or take a screenshot; I’m happy to help you figure that out.

  2. Draw a conclusion from the graph, using the error bars for information. Which method of instruction results in the best scores? What information from the graph makes you feel more confident in that conclusion? Write this answer as #4 on your answer sheet.

I should note: this was only the “immediate test” experiment, which is Experiment 1 in their published article. In Experiment 2, they used a delayed comprehension test. In that test, only the group that actually taught the material did better than the control.

Friends data

Okay, now let’s talk about the data your class collected. You downloaded the cleaned-up version of it. Load it into Jamovi.

  1. Create a boxplot comparing gender (on the x-axis) to how many Instagram followers our participants have (gram.followers). Submit it as #5 on your answer sheet and comment on whether there are any differences.

  2. Create a scatterplot comparing how many hours people use social media (smed.hours: “How many hours per day do you spend on social media?”) and how many Instagram followers they have (gram.followers). Is there a trend? Submit the plot as #6 on your answer sheet.

One-tailed tests

Let’s return to one of the questions from last week, which we discussed as being two-tailed tests… but do this test now as a one-tailed test. (We can go through this quickly since you have the responses from last time already! You can use the mean and standard deviation you found in #5 last class as the \(\mu\) and \(\sigma\) for your population. Want to find them again? Use the variable height at the very end of the cleaned friends data.)

Let’s take a new person—5′1″ in height—and ask: Would this person be considered “significantly short” compared to other Bard students? Imagine in this case that we don’t care whether they’re on the tall side, so we don’t need to use a two-tailed test.

We’re still using the z-distribution, but now we’re only interested in scores that fall in the lower 5% of the distribution—where 95% of the scores are higher:

And just as we discussed in class, our cut-off (the critical z-score or \(z_{crit}\)) will be +1.64 for the right side (are they taller?) or, in this case, -1.64 for the left side (are they shorter?). Any z-score below (more negative than) -1.64 is “unlikely” and tells us that \(p<.05\). That is, if the z we calculate is more negative than -1.64, we will conclude that there is statistical significance and reject the null hypothesis.

Why 1.64 instead of 1.96? Well, \(\pm{}1.96\) corresponded to 2.5% in either tail (i.e., a total of 5% across both tails), and 1.64 corresponds to 5% just in one tail. Feel free to look at the z-table on Brightspace to confirm that.
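If you’d rather check those cut-offs with code than with the z-table, here is a minimal Python sketch using the standard library’s `statistics.NormalDist`. The variable names are mine. Note that the exact one-tailed value is closer to 1.645; the 1.64 we use in class is that number cut off at two decimals.

```python
from statistics import NormalDist

z = NormalDist()  # standard normal: mean 0, standard deviation 1

# z-score that leaves 5% of the distribution in the lower tail (one-tailed).
one_tailed_crit = z.inv_cdf(0.05)

# z-score that leaves 2.5% in the lower tail (half of a two-tailed 5%).
two_tailed_crit = z.inv_cdf(0.025)

print(f"one-tailed cut-off: {one_tailed_crit:.3f}")
print(f"two-tailed cut-off: {two_tailed_crit:.3f}")

# Going the other way: the proportion of scores below z = -1.64.
print(f"area below -1.64:   {z.cdf(-1.64):.4f}")
```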

  1. Carry out the test, using the person who’s 5′1″ as your X score and the mean and standard deviation of height as your \(\mu\) and \(\sigma\). You should (conceptually) use the steps of hypothesis-testing, but you only need to report the z-score and your conclusion about the null hypothesis. This is answer #7.
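Here is a sketch of the arithmetic in Python. The mean and standard deviation below are made-up placeholders, not the values from our friends data, and I’m assuming height is recorded in inches; swap in your own \(\mu\) and \(\sigma\) from last class.

```python
# Convert 5 feet 1 inch to inches.
x = 5 * 12 + 1

# HYPOTHETICAL population values for illustration only.
# Replace these with the mean and SD of height from the friends data.
mu = 66.0
sigma = 4.0

Z_CRIT = -1.64  # one-tailed cut-off for the lower 5%, as described above

z_score = (x - mu) / sigma
reject_null = z_score < Z_CRIT

print(f"X = {x} inches, z = {z_score:.2f}")
print("Reject the null hypothesis?", reject_null)
```

With these placeholder numbers the z-score is -1.25, which is not below -1.64. Your conclusion will depend on the real mean and standard deviation.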

Testing means of samples

In class, we discussed the idea of hypothesis testing with means of samples. We explored the rules that we get from the Central Limit Theorem. Now is a great time to play around with the tool I showed you in class. As you’ll recall, we conclude the following from the Central Limit Theorem:

For random samples of size n, selected from a population with mean \(\mu\) and standard deviation \(\sigma\), the distribution of sample means has…

  • a mean, \(\mu_X\), equal to the mean of the population: \(\mu_X=\mu\) regardless of size n of the sample
  • a standard deviation, \(\sigma_X\), equal to the standard deviation of the population divided by the square root of the sample size: \(\sigma_X=\frac{\sigma}{\sqrt{n}}\)—this is the Standard Error of the Mean (SEM) and will be described by this equation regardless of size n of the sample, and
  • a shape that is normal if the population is normal AND, for populations with finite mean and variance, the shape becomes more normal as sample size n increases
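You can also see these three rules in a quick simulation. The sketch below is my own illustration, not the tool from class: it assumes a skewed (exponential) population with \(\mu=1\) and \(\sigma=1\), draws many samples of size n = 25, and compares the mean and standard deviation of the sample means with what the rules predict.

```python
import random
from math import sqrt
from statistics import mean, stdev

random.seed(1)  # fixed seed so the run is repeatable

# Assumed population: exponential with rate 1, which has mean 1 and SD 1.
MU = 1.0
SIGMA = 1.0
N = 25              # size of each sample
NUM_SAMPLES = 5000  # how many samples we draw

# Draw NUM_SAMPLES samples and keep the mean of each one.
sample_means = [
    mean([random.expovariate(1.0) for _ in range(N)])
    for _ in range(NUM_SAMPLES)
]

observed_mean = mean(sample_means)
observed_sd = stdev(sample_means)
predicted_sem = SIGMA / sqrt(N)

print(f"mean of sample means: {observed_mean:.3f} (rule says {MU})")
print(f"SD of sample means:   {observed_sd:.3f} (rule says {predicted_sem})")
```

Try changing N: the mean of the sample means should stay near \(\mu\), while their standard deviation shrinks as N grows.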

Suppose I tell you that we know something about all Bard students (our population—and yes, I’m making this up): The average number of classes taken is \(\mu=4.00\), with a standard deviation of \(\sigma=0.25\). From the above, you should be able to determine the properties of the sampling distribution. (You want to have your n equal to the number of people in our sample for whom we have an answer for the variable numclasses. Jamovi will easily give you this n as well as the mean of numclasses.)

To carry out a z-test with means of samples, we’re going to use the formula you were introduced to in class. When we know the population mean and standard deviation (which we do here, since I gave them to you above), we can calculate the z-score for a sample as \(z=\frac{(M-\mu_X)}{\sigma_X}\) (where M is our sample mean, \(\mu_X\) is the mean of the sampling distribution and equivalent to the population mean, and \(\sigma_X\) is the standard error of the mean, \(\sigma_X=\frac{\sigma}{\sqrt{n}}\)).
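As a rough sketch of how that formula plays out, here it is in Python. The population values \(\mu=4.00\) and \(\sigma=0.25\) are the ones I gave above, but the sample size and sample mean are invented for illustration; use the n and mean of numclasses that Jamovi gives you instead.

```python
from math import sqrt
from statistics import NormalDist

MU = 4.00     # population mean, from the text above
SIGMA = 0.25  # population standard deviation, from the text above

# HYPOTHETICAL sample values for illustration only.
# Replace these with the n and mean of numclasses from Jamovi.
n = 36
sample_mean = 4.10

# Standard error of the mean: the SD of the sampling distribution.
sem = SIGMA / sqrt(n)

# z-score for the sample mean on the sampling distribution.
z_score = (sample_mean - MU) / sem

# Two-tailed test, since we are asking whether the sample differs at all.
p_two_tailed = 2 * (1 - NormalDist().cdf(abs(z_score)))
reject_null = abs(z_score) > 1.96

print(f"SEM = {sem:.4f}, z = {z_score:.2f}, p = {p_two_tailed:.4f}")
print("Reject the null hypothesis?", reject_null)
```

With these invented numbers the z-score comes out to about 2.4. Yours will differ once you plug in the real sample.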

  1. For answer #8, determine the properties of the comparison distribution, and then calculate the z-score for our sample’s mean number of classes. Does our sample differ from the population mean? Feel free to use the steps of hypothesis-testing, or just let me know (a) the sampling distribution’s properties (describe it based on the central limit theorem), (b) the z-score for our sample, and (c) the conclusion you reach about the null hypothesis.

Citation

BibTeX citation:
@online{dainer-best2025,
  author = {Dainer-Best, Justin},
  title = {Visual {Displays} of {Information} {(Lab} 4)},
  date = {2025-09-25},
  url = {https://faculty.bard.edu/jdainerbest/stats/labs/posts/04-visualizations-and-hypothesis-tests/},
  langid = {en}
}
For attribution, please cite this work as:
Dainer-Best, Justin. 2025. “Visual Displays of Information (Lab 4).” September 25, 2025. https://faculty.bard.edu/jdainerbest/stats/labs/posts/04-visualizations-and-hypothesis-tests/.