Working Group Session #10 - Christine Franklin and Robert Gould
Teaching Statistics to High School Teachers

Part I: Statistics, a Rapidly Growing Area within the K-12 Curriculum
     
K-12 Teachers Need Training with Content and Pedagogy

Christine Franklin, University of Georgia, Department of Statistics, chris@stat.uga.edu

Let's Estimate the Ages of the Rich and Famous

Here are some names of well-known individuals. Let’s see how well you can estimate the age of each individual.

Name                                     Estimated Age

Nancy Reagan
Mister Rogers
Sandra Day O’Connor
Chelsea Clinton
Hillary Clinton
Eddie Murphy
Tom Brokaw
Dan Rather
Roseanne
Ringo Starr
Oprah Winfrey
Jon Bon Jovi
Garth Brooks
Jane Fonda
Bill Gates

This is an activity that students love. The competitive nature of students emerges as the correct ages are given. So the question might be proposed to the students, “How can we decide which student did the best job of getting closest overall to the actual ages of the rich and famous?” Let the students think about this, then give them a scatterplot to plot the points of the estimated age versus the actual age. Ask them to think about a line that would help them answer the question of interest (the line y=x works nicely). They will soon see that they can look at the vertical distances from their points to this line and the discussion can lead to using these vertical distances to make a judgment as to which student did the best job. This activity is an example of how to motivate students to learn important mathematical and statistical concepts through the use of data. This activity is from the Data Driven Mathematics series [Exploring Linear Relations, Lesson 7, “Lines on Scatter Plots”, pp. 74-86], a collection of modules that attempt to accomplish the following:

• Designing each lesson around a problem or mathematical situation
• Beginning with a series of introductory questions or scenarios that will prompt discussion
• Evolving from those questions so that students begin to think about the problem and realize why it might be of interest to someone outside the classroom
• Emphasizing student involvement
• Emphasizing verbal expression through speech and writing
• Helping students understand not only how to compute with an algorithm but also the underlying process

The Data Driven Mathematics series offers excellent materials for illustrating many of the goals of the 2000 NCTM Principles and Standards for School Mathematics and of the MET document, as both relate to the use of data and statistics in the K-12 curriculum.
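The judging step in the age-estimation activity above can be sketched in a few lines of code. This is only a minimal illustration, assuming that "closest overall" is measured by the sum of the vertical distances from each plotted point (actual age, estimated age) to the line y = x. The function name, the student labels, and all of the ages below are hypothetical and are not taken from the Data Driven Mathematics lesson.

```python
def total_vertical_distance(estimated, actual):
    """Sum of |estimate - actual|, the vertical distance of each point from the line y = x."""
    if len(estimated) != len(actual):
        raise ValueError("each estimate needs a matching actual age")
    return sum(abs(e - a) for e, a in zip(estimated, actual))


# Hypothetical actual ages and student estimates, made up for illustration only.
actual_ages = [80, 45, 62, 21]
estimates = {
    "Student A": [75, 50, 60, 25],
    "Student B": [85, 40, 70, 30],
}

# A smaller total means the student's points sit closer to the line y = x.
scores = {name: total_vertical_distance(est, actual_ages)
          for name, est in estimates.items()}
best = min(scores, key=scores.get)

print(scores)
print("Closest overall:", best)
```

Students may propose other criteria, such as the largest single miss or the sum of squared distances, and comparing the rankings those criteria produce is a natural follow-up discussion.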

Background of Quantitative Literacy 

Let’s look briefly at how probability and statistics reached their current place in the K-12 curriculum. During the 1970’s, a movement began to improve the practical skills of high school and college graduates in dealing with numerical and quantitative issues; it was given the name Quantitative Literacy. In the 1980’s, the American Statistical Association (ASA) ran a program called the Quantitative Literacy Project. The project produced materials, called Quantitative Literacy (QL), and conducted workshops for teachers to demonstrate how to integrate data analysis and quantitative reasoning into the K-12 curriculum. This work formed the background for the infusion of probability and data analysis into the 1989 NCTM Math Standards. One of the immediate results seen in the 1990’s was the development and implementation of the Advanced Placement Statistics program at the high school level. The first AP Statistics exam was given in 1997, with 7,500 exams graded. In 2001, approximately 43,000 exams were graded.

With the rapid growth of AP Statistics, the need to train secondary teachers in statistics became crucial. In the early and mid-90’s, workshops were the primary means of training new and experienced teachers. One-day to one-week workshops are still crucial to meet the immediate needs of in-service teachers. However, it is difficult in this time frame to train teachers on what to teach and how to teach it while also helping them understand the material. Most math teachers feel that their backgrounds are inadequate to teach the concept-oriented curriculum and to use the available materials and resources, including technology such as computers or calculators. In 1997, the University of Georgia Department of Statistics approached the UGA College of Education about developing a course targeted toward secondary math teachers, both pre-service and in-service. The need for this type of course was reinforced, beginning in 1997, by the work and recommendations of the American Statistical Association Advisory Review Group (ASA-ARG) for NCTM’s 2000 Principles and Standards for School Mathematics (PSSM).

Connection between the 2000 PSSM, the UGA Course, and the MET Document 

There are 10 standards in the 2000 PSSM with one of the content standards being Standard 5, Data Analysis and Probability. Near the end of the ASA-ARG’s work, the committee gave some general thoughts about the upcoming standards. Some of the recommendations made were:

• We must make math fun, interesting, relevant, exciting, and useful for all students.

• One of the primary reasons for teaching statistics and data based mathematics in K-12 is to prepare students to become informed citizens, intelligent consumers, and better prepared quantitatively for the workforce.

• Statistics in K-12 is a great background for college work in statistics, both for disciplines which require some statistical training and for providing departments that teach statistics more opportunities.

• It’s important to make the CONNECTION between the Data Analysis and Probability standard and the other mathematical standards. Working with data provides a context for building core math concepts and skills. (The QL and DDM materials strive to meet this recommendation.)

• The standards strive to have students reach the level of understanding where they comprehend the concept and process behind a mathematical topic. We want students to go beyond just evaluating a formula or applying an algorithm. By the use of hands-on activities and technology, students can develop a deeper understanding of data analysis, inference, and probabilistic reasoning.

• Teachers will have to work hard to develop instructional strategies to deliver the recommended curriculum. To do this successfully, a teacher must be highly competent in both mathematical content (probability and statistics) and pedagogy.

The MET Document addresses what teachers need with respect to training. The UGA course developed in 1997 illustrates many of the recommendations in the MET document. 

Some MET Document Recommendations

Prospective teachers need math courses that develop a deep understanding of the math they will teach

The UGA course content includes the mathematical reasoning  (similar material that would be covered in a math stat course) necessary for development of topics covered in an AP Statistics course. Advanced statistical topics beyond the AP course are also discussed. 

Courses designed for teachers should develop careful reasoning and mathematical ‘common sense’ in analyzing conceptual relationships and in applied problem solving. Courses need to demonstrate flexible, interactive styles of teaching. 

The UGA course emphasizes statistics in everyday situations and in current literature, data collection (sampling and experimental design) and data analysis, and hands-on activities and technology that help to reinforce concepts. Major goals of the course are:

• To give teachers a lasting appreciation of the vital role statistics plays in empirical research

• To help teachers understand the importance of providing a data-driven curriculum in their math classes. Probability and statistics can be taught in such a way that it strengthens a student’s understanding of algebra and functions.

• To teach a course that intertwines content and pedagogy!

Teacher education must be recognized as an important part of mathematics departments’ mission at institutions that educate teachers. More mathematics faculty should consider becoming deeply involved in K-12 math education. The mathematics education of teachers should be seen as a partnership between mathematicians and math educators.

The UGA Department of Statistics recognizes the importance of providing the valuable service of training teachers to teach statistics at the K-12 level. UGA has a visionary College of Education that recognizes what the teachers’ needs are in the area of statistics. The relationship between the two departments has been successful because of mutual respect. The departments have worked together to produce a successful course that is in its fourth year of being offered. Currently, the two departments are working together to develop a similar course for K-8 teachers that will be taught in Spring 2002.

 There is a need for collaborations between mathematics faculty and school math teachers.

UGA Statistics faculty members have made frequent classroom visits and given guest lectures, which helps teachers and students see the relevance of statistics. Just as importantly, in-service teachers have provided valuable perspectives for the instruction of teachers in the area of statistics. These are teachers who are in the ‘trenches’, and they can give a real-world perspective on what is needed in the classroom.

Efforts to improve standards for school mathematics instruction will be strengthened by participation of the academic mathematics community.

Well respected statisticians have been actively involved since the 1970’s with developing quantitative literacy through the development of valuable statistical resources, AP Statistics, and working with developing the NCTM Math Standards and the MET document. The American Statistical Association and the University of Georgia are currently supporting and organizing a ‘Conference on Statistics in Teacher Preparation Programs’ to be held in 2003.

Where Do We Go from Here?

Quantitative literacy, also referred to as numeracy, is the ability of an individual to make use of mathematical skills for problem solving in quantitative situations both in everyday life and work.

Lynn A. Steen states in Education Week on the Web [Wednesday, September 5, 2001, Vol.21, No.1, p.58],

“…numeracy is not the same as mathematics, nor is it an alternative to mathematics. Today’s students need both mathematics and numeracy. Whereas mathematics asks students to rise above context, quantitative literacy is anchored in real data that reflect engagement with life’s diverse contexts and situations. The case for numeracy in schools is not a call for more mathematics, nor even for more applied mathematics. It is a call for a different and more meaningful pedagogy across the entire curriculum. In life, numbers are everywhere, and the responsibility for fostering quantitative literacy should be spread broadly across the curriculum. Quantitative thought must be regarded as much more than an affair of the mathematics classroom alone.”

We must work hard to train our teachers to make quantitative literacy and the use of data a main objective of K-12 education. An activity illustrating how we can attempt to train our teachers by integrating content and pedagogy is presented in the appendix. This activity, written by the author of this paper, will appear with full discussion in the upcoming NCTM Navigation Series, Data Analysis for the Secondary Level, scheduled to appear in print in April 2002. The activity is presented to the teachers as they would use it in their classrooms. After going through the activity, the teachers hopefully understand the concepts being illustrated and the pedagogy being used to help students visualize the concepts. It is then that the mathematical theory behind the activity can be presented to the teachers. In the past, teachers would have been presented with only the mathematical theory.

 

Part II:  Teaching the  Extra-Mathematical Statistics Content

Robert Gould, UCLA, Department of Statistics, rgould@stat.ucla.edu

Although the intersection between Statistics and Mathematics is quite broad, there is sufficient "extra-mathematical" content in Statistics to make it challenging to develop a curriculum that will adequately prepare a teacher to teach both math and Statistics.  I use "extra-mathematical" to refer to ideas and concepts that might be "math like", but essentially have no basis in mathematics.

This discussion will be motivated by examination of a particular case-study.  Such an examination does not survive well in translation from a group discussion to a static paper, and those interested in more details are encouraged to either consult the reference in which this case study is published (Trumbo 2001), view the notes from the MET workshop (www.stat.ucla.edu/~rgould/met), or -- best of all -- examine the data themselves at www.stat.ucla.edu/~rgould/met/lead.dat.

The data for this case study will be examined using Fathom, a software package designed for classroom instruction.  This study, while somewhat old (particularly by high-school student standards), is nonetheless of current interest.  In 1982, Morton et al. published a study examining lead absorption in the children of employees of a battery factory in Oklahoma.  Blood samples were taken from 33 children whose parents worked around lead (the "Exposed group") and from 33 children whose parents did not work around lead (the "Control group").  Ordinarily, when approaching a data set, one should gather as much information about the data as possible before beginning.  But here, in order to illustrate certain points, I'll hold onto some information and leak it at strategic moments.

Case Study

Our goal is to examine this data to see if there is evidence that the parents' work is inadvertently increasing the lead content of their children's blood.  High levels of lead in a child's blood can cause severe developmental problems, and so is of great concern.  Once in the blood, the body has no mechanism for cleaning out the lead, and therefore lead levels increase over time with repeated exposure.

A common approach in "chug-and-plug" statistics is to compute the average of the sample. Of course we'll learn nothing from considering only the average of the Exposed group; we must compare it to the average of the Control group.

We can ask Fathom to compute these, and we learn that the average lead level in the Exposed children is approximately 32 micrograms per deciliter (µg/dL), and in the Control group it is 16 µg/dL.

Now no one contests that 32 is a bigger number than 16, but the real question is does it matter?  There are (at least) two ways of approaching this.  One requires the input from medical specialists.  They would tell us that levels of 40 and above are considered alarming, and some experts say levels of 60 and above require immediate hospitalization.  We mention this because if a level of 32 were considered alarming, we would have some evidence that the groups were alarmingly different.  Another approach of asking if this difference matters is to examine the "spread" of scores.  If scores within each group of children have considerable spread, we could argue that 16 is "close" to 32.  After all, "closeness" is a relative term.  The loose concept of "spread" is most popularly measured by the standard-deviation.

The combined information of mean and standard-deviation tells us that the Exposed group had a mean level of 32, give-or-take 14, and the Control group a mean of 16 give-or-take 4.5.   This is our first sign that something is seriously amiss with the Exposed children.  Not only are the means different, but the standard deviations are substantially different.

It is difficult, and ill advised, to go much further without a picture.  A popular "rule  of thumb" says that about 68% of the data lie within 1 SD of the mean, which means that most exposed children have lead levels between 18 and 46.  This means some children have alarmingly high lead levels, but this rule of thumb requires a symmetric distribution of values.  Thus we examine a picture of the distributions of the Exposed and Control levels.
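The "give-or-take" ranges quoted above are simple arithmetic on the mean and SD, and the rule of thumb itself can be checked against any list of values. The sketch below shows both steps. The function names are my own, the means and SDs are the rounded figures from the text, and the short list of values at the end is made up purely to show the check; it is not the lead data.

```python
from statistics import mean, stdev


def one_sd_interval(center, sd):
    """Return (center - sd, center + sd), the 'give-or-take one SD' range."""
    return (center - sd, center + sd)


def fraction_within_one_sd(values):
    """Fraction of the values lying within one sample SD of their mean."""
    low, high = one_sd_interval(mean(values), stdev(values))
    return sum(low <= v <= high for v in values) / len(values)


# Rounded summary statistics quoted in the text.
print(one_sd_interval(32, 14))    # Exposed group: (18, 46)
print(one_sd_interval(16, 4.5))   # Control group: (11.5, 20.5)

# Hypothetical values, used only to demonstrate the check.
print(fraction_within_one_sd([1, 2, 3, 4, 5]))
```

Running fraction_within_one_sd on each group of the real data is a quick way to see how far a skewed distribution departs from the 68% figure.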

The Control group has a distribution that is quite symmetric.  With some thought, one should expect this.  The Exposed group, on the other hand, looks drastically different.  Values are spread out, and at least two children are in need of immediate hospitalization.  These two pictures say far more about the situation than a collection of summary statistics, and provide strong evidence of....of what?

Certainly this is not proof that working at the battery factory caused this difference.  But it is quite compelling evidence that a difference does exist.  And requiring a significance test at this point is pedantic, since no reasonable person would argue that there were no differences between these two groups.

But now we should ask what could account for these differences.  We might be quick to blame the factory, but students should be able to think of other causes.  For example, many of the Exposed kids might coincidentally live in an area that has a high level of lead in the environment (perhaps in the paint used on their homes).  This area could be near the factory, which would explain the discrepancy between the two groups.  It might also be that the Exposed children are older than the Control children -- for whatever reason -- and therefore have had more time for the lead to accumulate.

As it turns out, the researchers also had these thoughts.  (And here is an example of some information somewhat unfairly hidden until a strategic moment.)   The researchers selected the Control group to "match" the Exposed group.  For each child in the Exposed group, another child of the same age living in the same neighborhood who did not have a parent working around lead was chosen for the Control group.  So the data actually consist of pairs of Exposed and Control children.  How can we exploit this fact when we examine the data?

Because the data are paired, it makes sense to examine the difference in lead levels for each Exposed child and his or her pair.  A dot-plot is revealing: 

The vertical line highlights the fact that only  four pairs had negative values: these are pairs in which the Exposed child had less lead than the Control child.  One pair was the same, and the rest were all positive.   In short, in only four of 33 cases were the Exposed children better off than the Control.  (Although we should be careful of loose phrases like "better off" because a higher level of lead does not necessarily mean worse health or greater risk.)
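For readers who want to reproduce this tally outside Fathom, the following is a minimal sketch of the paired-difference step. It assumes the two lists are in matched order, so that position i in each list belongs to the same pair. The function name and the four pairs shown are hypothetical; the real values are in the data file cited earlier.

```python
def paired_sign_counts(exposed, control):
    """Return the Exposed-minus-Control differences and counts of (negative, zero, positive)."""
    if len(exposed) != len(control):
        raise ValueError("paired data must have the same number of values in each group")
    diffs = [e - c for e, c in zip(exposed, control)]
    negative = sum(d < 0 for d in diffs)
    zero = sum(d == 0 for d in diffs)
    positive = sum(d > 0 for d in diffs)
    return diffs, (negative, zero, positive)


# Hypothetical matched pairs, for illustration only (not the study data).
exposed = [38, 20, 15, 44]
control = [16, 20, 18, 19]

diffs, counts = paired_sign_counts(exposed, control)
print(diffs)    # differences for each pair
print(counts)   # (negative, zero, positive)
```

With the actual 33 pairs, the counts reported above would correspond to (4, 1, 28).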

Because of the paired structure of the data, we have eliminated environment and age as explanations for the differences between the groups.  Another explanation is chance:  were we to sample another 33 pairs, we might get a completely different outcome.  This is a good point for a discussion about how to choose a statistical significance test for these data.  Such tests come with assumptions, and by this point the students should be familiar enough with the data set to question those assumptions.  For example: can we think of these pairs as randomly selected from some larger population?  (Probably not.  There is no reason to assume any random selection mechanism was used.  The published report of the study also explains how difficult it was to find suitable controls.  It is also not clear exactly what the larger population should be in a rigorous probabilistic setting.  The set of all children of workers at this factory?  At all battery factories in Oklahoma?  At all factories with lead in the environment?)  One "probability model" might be to think of these pairs as coin flips.  If the observed outcomes are due solely to chance, then it should be equally likely that an Exposed child will have more lead than a Control as the other way around.  It is like flipping a coin: heads represents an Exposed child having less lead than a Control.  Tails represents equal amounts or more.  (Or you could throw the 0 observation into the "heads" side if you wish.)  Now if you flip a fair coin 33 times, what's the probability that four or fewer will land heads?  The probability is in fact tremendously small (about 5.5 times 10 to the negative sixth for four or fewer heads, and about 3.3 times 10 to the negative fifth for five or fewer heads).  Such small probabilities suggest that chance is not a good explanation for the discrepancy.  We must accept that something other than age, environment, or chance is at play.
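The coin-flip probabilities can be computed exactly from the binomial distribution. The sketch below assumes the model described above: 33 independent flips of a fair coin, with "heads" meaning the Exposed child has less lead than the matched Control. The function name is my own; math.comb requires Python 3.8 or later.

```python
from math import comb


def binomial_lower_tail(n, k, p=0.5):
    """P(X <= k) for X ~ Binomial(n, p): the chance of k or fewer heads in n flips."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))


# Four pairs were negative; five if the tied pair is counted as "heads".
p_four_or_fewer = binomial_lower_tail(33, 4)
p_five_or_fewer = binomial_lower_tail(33, 5)

print(f"{p_four_or_fewer:.2e}")   # 5.46e-06
print(f"{p_five_or_fewer:.2e}")   # 3.31e-05
```

This is the calculation behind what is usually called a sign test. It uses only the direction of each difference, not its size, which is why it needs so few assumptions about the shape of the distribution.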

By this point we have eliminated causes, but have we "proven" that the factory is the cause?  This is a good point for the association-is-not-causation discussion. But it is only fair to mention (more strategically revealed information) that the study also reported a negative  association between household cleanliness and lead levels in the children. In addition, Exposed children whose parents showered and changed before leaving the battery factory had lead levels comparable to the Control group.  Taken  together this makes persuasive evidence that the factory is the cause, but does not "prove" that this is so.

Lessons Learned

I'd like to point out some themes of this case study:

  1. Numerical summaries were incomplete; the story was better told through the dot-plots.  The shape of the distribution was informative.  The fact that the shape of the distribution of the Exposed group was so different from that of the Control group was compelling evidence of differences.   This change of shape is missed by a  standard numerical summary.

  2. Scientific context guides the analysis.  We calculate statistics and make comparisons guided by our goal of examining a particular scientific issue.

  3. The study design affects the analysis.  Once we were told about the paired structure of the data, we used a different analysis than we would had the data been unpaired.

Eight Extra-Mathematical Themes

With the lead case study as a context, let's examine 8 themes or topics of Statistics that I consider to be "extra mathematical."

1. Inference

The mathematical tools of probability allow us to quantify inference, but the basic act of inference -- generalizing about the whole based on a small part -- is more philosophical than mathematical.  Particularly in this case study, in which our inference was causal -- did working at the factory cause increased lead levels? -- mathematics can offer little support.  Of course we were not only interested in the 33 pairs of children at this factory.  We were concerned about any child whose parent's work exposes them to lead.  How can we generalize what we learned about these 66 children to any child?  This is not a mathematical issue.

2. Graphical Tools

Mathematics also relies on graphical tools, but differently.  Both Mathematics and Statistics want their students to learn to interpret graphs, but Statistics also wants its students to know how and when to use graphs.  I relied on dot-plots above, but why not histograms?  Or box-plots?  A data analyst needs to not only know which tools are available, but also which will most likely be useful.

3. Calculation of Statistics

Statistics is not concerned with calculation.  Yes, it is important for students to produce the correct average of a list of numbers.  But a calculator or computer will do this for them.  Statistics needs students to be able to interpret these numbers in the context in which they were collected, and also needs students to know when to use which statistics to help them answer questions.  One could go a long way in analyzing the data from  the lead case study without calculating any statistics.  What was more valuable was to know which statistics to calculate.  (And, to be frank, the mean and SD were quite possibly not the best choices, at least for the Exposed group.)  It was also important to know that the numerical value of the average was meaningless without knowing what a lead level of 32 meant medically, and without knowing how this compared to children who were not exposed to lead.

4. Science

Statistics is a tool of science, and as such is concerned with weighing evidence and not with proof.  Science does not prove (although it may disprove).  The same can be said of Business, Engineering and the Social Sciences, all of which weigh evidence and make decisions, but only in overly simplified situations are able to  -- or even interested in --  "proving" in the mathematical sense.

5. Data  Collection

The method by which the data were collected will affect the choices we make in our analysis.  In the lead study, knowing that we did not have a random sample meant that we would have to exercise extreme caution in applying a t-test.  The data collection method also affects the conclusions we might make.  If the collection method was flawed, we might not make any conclusions, but might still wish to provide a description of the data set. The collection method also imposes a structure on the data, and our analysis should take this into account.  Knowing that the children were "paired" results in a different analysis.  Mathematics says nothing about data collection methods, and in fact designing experiments that meet mathematical assumptions of statistical tests is a craft. 

6. Data Management

The first concern of a data analyst should be to check the quality of the data.  This means more than looking  into the data collection method.  It also means looking for influential observations, missing values, and possible errors in entry.  This is not such an issue in a data set as small as the lead case study, but even there we might wonder the extent to which our conclusions were driven by one or two extreme observations.  (Not much, as it turns out.) 

For large studies, such as most of the medical studies that are reported in our daily newspapers, data management is an every-day concern.  Data are collected on tens of thousands of people, on perhaps hundreds (or thousands) of variables, at collection sites spread throughout the country or even the world.  Organizing this data in a database, and retrieving the data required by a particular researcher, requires considerable skill and training.  Good and experienced database managers these days can almost set their own salary.  Even every-day consumers of Statistics need to understand the extent to which the data themselves are constructed, stored, and retrieved.

As a side note, the analysis of what are called Massive Data Sets is an area of active research that includes statisticians, mathematicians, computer scientists, and physical scientists. 

7. Computer

The computer has a very special relationship to Statistics, and, if I might go out on a limb, it is a relationship that mathematicians have a hard time understanding.  The computer is much more than a crutch that does tedious calculations very quickly.  It is instead a device that mediates between the data and the set of graphical and numerical tools at the statistician's disposal.  Which software the statistician uses influences the course of the analysis.  I would not go so far as to say that it affects the conclusions a statistician might make, but I do think such an argument could perhaps be made.  Certainly many journals insist on knowing what software was used to analyze the data.  An analogy can be made to music.  I have two recordings of Bach's Violin Partita No. 1.  One is played by Jascha Heifetz; the other is a piano transcription played by the pianist Jorge Bolet.  Same music, very different experience.  Statistics software is an instrument in the same sense a violin or piano is an instrument.

8. Consulting

By its very nature, Statistics is a collaborative activity.  A good analysis requires substantive knowledge, such as knowing that lead levels over 60 are extremely dangerous and that lead accumulates in the blood over time.  This means that Statisticians must collaborate.  Mathematics is increasingly collaborative, but still it is possible for a mathematician to be successful working alone.  A statistician, though, almost always works in a collaborative or consulting context, and needs to develop skills to assist in working with people from different disciplines. 

How To Train Teachers?

Given these differences between Statistics and Mathematics, how, then, should high school teachers be prepared?  There are three words to keep in mind: practice, practice, and practice.  Teachers of statistics need to practice using computers to analyze real data, in realistic contexts.  They need practice choosing and applying tools.  Ideally, I would like to think that at least once they would be confronted with that confusing paralysis that happens when one faces a complex and large data set and must choose how and where to begin.  They should also have practice in collecting data, or at least in dealing with data collected by different methods. 

My own experience in training teachers comes through a Center which we founded within the UCLA Department of Statistics: the Center for Teaching Statistics.  The CTS disseminates information about statistics teaching both to UCLA and to the local community.  One of its activities is to offer a course called "Data Analysis for High School Teachers."  This course is for current teachers of AP Statistics.  Each week we work (using Fathom or other statistical software such as ARC or Data Desk) on a different data set.  The course was designed after a meeting with a "focus group" of local AP Statistics teachers.  The consensus of that meeting was that while new Statistics teachers had the background necessary to conquer the mathematical content of Statistics, many were unprepared to deal with the applications side.  Many teachers, for example, felt uninformed in assigning or evaluating student projects.  This course, by providing participants with practice working with real data, helps them develop understanding of how actual statisticians apply tools. 

I have also been involved with a program run by the UCLA Mathematics Department called the UCLA Math Content Program for Teachers.  This program is designed for current teachers who are "under-prepared" to teach mathematics.  It consists of a two-year series of courses that focus on 4th to 9th grade mathematics.  Dealing with Data, an introductory statistics course, is now the most popular of their second-year offerings.  The course teaches the basic tools of statistics and is activity-oriented, involving participants in examples pulled from real life.  This course makes frequent use of simulations, both as an aid to understanding probability concepts and as a problem solving tool.  An important feature of the course is the end-of-term data collection and analysis project.  The Math Content Program is directed by Dr. Shelley Kriegler, and more information can be found at www.math.ucla.edu/teachers.

In conclusion, my belief is that training mathematics teachers to teach statistics requires a substantial effort.  This effort goes beyond the traditional mathematical statistics course most math majors take (if they take any statistics class at all), and requires at least one course -- perhaps called "data analysis"? -- that teaches this extra-mathematical content. 

REFERENCES to Part I 

1.                  Burrill, G., Kranendonk, H., Landwehr, J., Scheaffer, R., Witmer, J., et al., Data-Driven Mathematics (11 modules), Columbus, OH, Pearson Learning, 1998. www.pearsonlearning.com

2.                  Landwehr, J., Scheaffer, R., Watkins, A., et al., The Quantitative Literacy Series (5 books), Columbus, OH, Pearson Learning, 1995. www.pearsonlearning.com

3.                  Franklin, Christine, “Are Our Teachers Prepared to Provide Instruction in Statistics at the K-12 Level?”, NCTM Mathematics Education Dialogues, Vol.37, Issue 3, October 2000.  

4.                  Franklin, Christine, “Exploring Data: How Students Answer Questions on the AP Statistics Exam and What are the Desired Responses”, American Statistical Association’s 2000 Proceedings of the Section on Statistics Education, pp.192-197.

REFERENCES to Part II

1.         Morton, D., (1982), "Lead absorption in children of employees in a lead-related industry," American Journal of Epidemiology, Vol. 115, pp. 549-555.

2.         Trumbo, Bruce, (2001), Learning Statistics with Real Data, Duxbury Press 

3.         Gould, Robert and Bruce Trumbo, "Lead Absorption Case Study", appearing in  Cyberstats, Cybergnostics Inc., to be published.  www.cybergnostics.com

Software

Fathom: www.keycollege.com/Pages/ProdFathom.html
DataDesk: www.datadesk.com
ARC: www.stat.umn.edu/arc
JMP: www.jmpdiscovery.com
Stata: www.stata.com
SAS: www.sas.com

My opinions on choosing software can be found at www.stat.ucla.edu/~rgould/met

Related Links

www.stat.ucla.edu/~rgould/met             outline from presentation, and software advice
www.stat.ucla.edu/cts                           the UCLA Center for Teaching Statistics
www.math.ucla.edu/teachers                 the UCLA Math Content Program for Teachers

   

Appendix:  Activity

Chris Franklin

A Case of Possible Discrimination

Statisticians are often asked to look at data from situations where an individual or individuals believe that discrimination has taken place. A well-known study of possible discrimination was reported in the Journal of Applied Psychology. The scenario of this study is given below.

Scenario

In 1972, 48 male bank supervisors were each given the same personnel file and asked to judge whether the person should be promoted to a branch manager job that was described as “routine” or whether the person’s file should be held and other applicants interviewed. The files were identical except that half of the supervisors had files showing the person was male while the other half had files showing the person was female. Of the 48 files reviewed, 35 were recommended for promotion. (B. Rosen and T. Jerdee (1974), “Influence of sex role stereotypes on personnel decisions,” J. Applied Psychology, 59:9-14.)

Preliminary Questions

1.         Suppose there was no discrimination involved in the promotions.  Enter the expected numbers of males promoted and females promoted for this case in Table 1.

 

              PROMOTED    NOT PROMOTED    TOTAL
MALE            ____          ____          24
FEMALE          ____          ____          24
TOTAL            35            13           48

Table 1.  No discrimination

2.         Suppose there was strong evidence of discrimination against the women in those recommended for promotion.  Create a table that would show this case. 

 

              PROMOTED    NOT PROMOTED    TOTAL
MALE            ____          ____          24
FEMALE          ____          ____          24
TOTAL            35            13           48

Table 2.  Strong case of discrimination against the women
 

3.         Suppose the evidence of discrimination against the women falls into a ‘gray area’; i.e., a case where discrimination against the women is not clearly obvious without further investigation. Create a table that would show this case. 

 

              PROMOTED    NOT PROMOTED    TOTAL
MALE            ____          ____          24
FEMALE          ____          ____          24
TOTAL            35            13           48

Table 3.  ‘Gray’ case of discrimination against the women

Returning to the study described in the activity scenario: of the 24 male files, 21 were recommended for promotion, and of the 24 female files, 14 were recommended for promotion.

4.         Enter the data from the actual discrimination study in Table 4. 

 

              PROMOTED    NOT PROMOTED    TOTAL
MALE            ____          ____         ____
FEMALE          ____          ____         ____
TOTAL           ____          ____         ____

Table 4.  Actual discrimination study

   
5.         What percentage of males was recommended for promotion? What percentage of females was recommended for promotion?

6.         Without exploring the data any further, do you think there was discrimination by the bank supervisors against the females? How certain are you?

7.         Could the smaller number of recommended female applicants for promotion be attributed to chance?  What is your sense of how likely the smaller number of recommended females could have occurred by chance?    

8.         Suppose that the bank supervisors looked at files of actual female and male applicants. Assume that all of the applicants were identical with regard to their qualifications and use the same results as before (21 males and 14 females). If a lawyer retained by the female applicants hired you as a statistical consultant, how would you go about obtaining evidence to decide whether the observed results were due to chance variation or to discrimination against the women?

Statisticians would formalize the overriding question by giving two statements, a null hypothesis which represents no discrimination so that any departure from Table 1 is due solely to a chance process, and an alternative hypothesis which represents discrimination against the women.

9.         What initial thoughts do you have about the manner in which the experiment (study) was conducted? What would need to be assumed about the study in order to infer that gender is the cause of the apparent differences? 

[PSSM quote: Students should formulate questions that can be addressed with data and collect, organize, and display relevant data to answer them.]  

Simulation of the Discrimination Case

Using a deck of cards, let 24 black cards represent the males, and 24 red cards represent the females (remove 2 red cards and 2 black cards from the deck).  This will simulate the 48 folders, half of which were labeled male and the other half female.

1:         Shuffle the 48 cards at least 6 or 7 times to ensure that the cards counted out are from a random process.

2:         Count out the top 35 cards. These cards represent the applicants recommended for promotion to bank manager. The simulation could be conducted more efficiently by dealing out 13 “not promoted” cards, which would be equivalent to dealing out 35 “promoted” cards.

3:         Out of the 35 cards, count the number of black cards (representing the males).

4:         On the number line provided, Figure 1, create a dot plot by placing a blackened circle or X above the number of black cards counted. The range of values for possible black cards is 11 to 24.

5:         Repeat steps 1 – 4 nineteen more times for a total of 20 simulations.

FIGURE 1

DOT PLOT TO BE USED TO GRAPH THE 20 SIMULATED RESULTS

   

 

 

 

____________________________________________________________________
11   12   13   14   15   16   17   18   19   20   21   22   23   24

                      Number of Men Promoted

[PSSM quote: Students should use simulations to explore the variability of sample statistics from a known population and to construct sampling distributions.]

A sampling distribution is the distribution of possible values of a statistic for repeated samples of the same size from a population. For the scenario under consideration, the number of black cards (number of males promoted) from each of the simulations is the statistic.

6:         Using the results (the counts) plotted on the number line, estimate the chances that 21 or more black cards (males) out of 35 will be selected if the selection process is random; that is, if there is no discrimination against the women in the selection process.

The probability of observing 21 or more black cards if the selection process is due to randomness or chance variation is called the p-value.

7:         Look at the dot plot and comment on the shape, center, and variability of the distribution of the counts by answering the following.
(a) Is the distribution somewhat symmetric, pulled (skewed) to the right, or pulled to the left?
(b) Do you observe any unusual observation(s)?
(c) Where on the dot plot is the lower 50% of the observations?
(d) Estimate the mean of the distribution representing the number of black cards obtained out of the 20 simulations.
(e) What values occurred most often?
(f) Finally, using the dot plot, comment on the spread of the data.

[PSSM quote: Students should select and use appropriate statistical methods to analyze data for univariate measurement data, be able to display the distribution, describe its shape, and select and calculate summary statistics.]

8.         Is the behavior of this distribution what you might expect? Why or why not?

9.         Think about the question of possible discrimination against the women. Based on the exploration just made, does there appear to be evidence to support the claim that selecting 21 males (black cards) for promotion out of 35 candidates was not due to chance variation? That is, how does your data compare with that of the original study?

10.       How do your results compare with the results of your classmates?
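
For teachers who want to extend the card activity with software, the shuffle-and-count procedure above can be sketched in a few lines of Python. This is only an illustrative sketch, not part of the original activity: the function names (simulate_promotions, estimate_p_value, text_dot_plot) and the fixed seed are assumptions made for the example, and drawing 35 cards at random without replacement stands in for shuffling the deck and counting out the top 35.

```python
import random

def simulate_promotions(n_sims, seed=None):
    """Return the number of black cards (males) among 35 promoted, once per shuffle."""
    rng = random.Random(seed)           # seed is optional; fixed here only for repeatability
    deck = ["M"] * 24 + ["F"] * 24      # 24 black cards (males), 24 red cards (females)
    counts = []
    for _ in range(n_sims):
        promoted = rng.sample(deck, 35)  # deal 35 cards without replacement
        counts.append(promoted.count("M"))
    return counts

def estimate_p_value(counts, observed=21):
    """Fraction of simulations with at least `observed` males promoted."""
    return sum(c >= observed for c in counts) / len(counts)

def text_dot_plot(counts):
    """Build a simple text dot plot over the possible values 11 to 24."""
    lines = []
    for value in range(11, 25):
        lines.append(f"{value:2d} | " + "X" * counts.count(value))
    return "\n".join(lines)

if __name__ == "__main__":
    class_counts = simulate_promotions(20, seed=1)     # one student's 20 simulations
    print(text_dot_plot(class_counts))
    print("Estimated p-value from 20 runs:", estimate_p_value(class_counts))
    many_counts = simulate_promotions(10000, seed=1)   # a much larger run
    print("Estimated p-value from 10000 runs:", estimate_p_value(many_counts))
```

With only 20 simulations the estimate varies a good deal from student to student, which is itself a useful point for discussion; the larger run gives a steadier estimate to compare with the pooled class results.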

Using Probability Theory

In attempting to answer the question, “Is the difference between the number of observed promoted males and the number of expected males due to chance variation or is this difference a real difference?”, simulation provided an estimated probability of 21 or more males being promoted out of 35 promotions. This probability can also be found by the application of probability theory. The stated question of interest leads to a situation that results in “success-failure” outcomes. A group of N population objects is classified into two subgroups, where one group of size k is the success group and the other group of size N-k is the failure group. The number of objects selected from the population is denoted as n.  Since a male being promoted was defined as the variable of interest, a male being promoted is a success; a female being promoted is a failure. For this discrimination activity, N = 48 applicants, k = 24 males, and N-k = 24 females. Of the 48 applicants, n = 35 were selected for promotion. A random variable X is defined that represents the number of successes out of n; that is, the number of males selected out of the 35 promoted. The probability of a success occurring on any given promotion is k/N = 24/48 = 0.50. The expected value for the number of males to be promoted out of the 35 would equal 35(0.50) = 17.5.

[The expected value of a random variable X is another name given to the mean of the random variable X.]

[PSSM quote: All students should be able to compute and interpret the expected value of random variables in simple cases.]

The selection of 35 of the 48 applicants for promotion is viewed as sampling without replacement; that is, once an applicant is selected for promotion, the applicant is not returned to the population of applicants and so cannot be evaluated and promoted a second time. Mathematically, the probability of selecting 21 or more males for promotion out of the 35 promoted can be found as follows. Let N = k + (N-k), n = number of objects selected from the population of size N, and X = number of successes out of n. The desired probability is P(X ≥ 21) = P(X=21)+P(X=22)+P(X=23)+P(X=24).

To find the probability that exactly x males were promoted, we must first find the number of ways to select x males from the k = 24 males in the population and the number of ways to select (n-x) females from the (N-k) = 24 females in the population. These can be found by evaluating the combinations $\binom{k}{x}$ and $\binom{N-k}{n-x}$. By the multiplication principle, the product $\binom{k}{x}\binom{N-k}{n-x}$ equals the number of ways of promoting x males and n-x females. The combination $\binom{N}{n}$ gives the total number of ways of promoting n applicants out of N. Thus, P(X=x) = $\binom{k}{x}\binom{N-k}{n-x} / \binom{N}{n}$.

[The random variable X follows a hypergeometric distribution.]

[PSSM quote: Students should be able to represent and analyze mathematical situations using algebraic symbols.]

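Evaluating for x = 21, 22, 23, and 24:

P(X=21) = $\binom{24}{21}\binom{24}{14} / \binom{48}{35}$ = [24!/(21! 3!)] [24!/(14! 10!)] / [48!/(35! 13!)] = 0.021

P(X=22) = $\binom{24}{22}\binom{24}{13} / \binom{48}{35}$ = 0.004

P(X=23) = $\binom{24}{23}\binom{24}{12} / \binom{48}{35}$ = 0.000

P(X=24) = $\binom{24}{24}\binom{24}{11} / \binom{48}{35}$ = 0.000

Therefore, P(X ≥ 21) = 0.021 + 0.004 + 0.000 + 0.000 = 0.025.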

This mathematical probability is a long-term probability. It is the percentage of time that an outcome would be expected to occur in the long run if a simulation or experiment were replicated many times. Observe how close the mathematical probability of 0.025 is to the simulated probability of 0.028 from the class data sets considered in Sample Distributions 4 and 5.
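
The hand calculation can be checked with a short Python sketch that evaluates the same combinations directly. The function names hypergeom_pmf and upper_tail are assumptions made for this illustration, and math.comb requires Python 3.8 or later.

```python
from math import comb

def hypergeom_pmf(x, N=48, k=24, n=35):
    """P(X = x): x successes (males) among n promoted, from k males in N applicants."""
    return comb(k, x) * comb(N - k, n - x) / comb(N, n)

def upper_tail(x_min, N=48, k=24, n=35):
    """P(X >= x_min), summing the exact probabilities up to the largest possible x."""
    x_max = min(k, n)
    return sum(hypergeom_pmf(x, N, k, n) for x in range(x_min, x_max + 1))

if __name__ == "__main__":
    for x in range(21, 25):
        print(f"P(X={x}) = {hypergeom_pmf(x):.3f}")
    print(f"P(X>=21) = {upper_tail(21):.4f}")
```

Summing the unrounded terms gives a tail probability of about 0.0245; the 0.025 reported above comes from adding the individually rounded values.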

[Refer to a high school mathematics book that covers combinations for additional details on how to evaluate combinations and factorials.]