Chi-square test of association: Cross tabs

 

The Chi-square test is not only a simple and useful statistic that compares observed counts with the counts a theory predicts; it can also be used to see whether one variable in a study is related to another.  As such, it is one of the most widely used statistics in all of MR. For example, suppose that one question on a questionnaire asked a respondent whether she likes a new product.  She could answer “yes” or “no.”  Another question asked whether she was a regular user of a pre-existing product.  Again the answer is “yes” or “no.” Is the use of a pre-existing product related to liking the new product?  The Chi-square for association is designed to answer this type of question.

 

The Chi-square test still has the same formula:

\[ \chi^2 = \sum \frac{(o - e)^2}{e} \]

where o is the observed count and e is the expected count.  The sum sign indicates that this calculation is done for every possible outcome and the results are then added together.  In the example above, a “cross tabs” procedure would create a “cross tab table” that combines the results of the two questions.

 

 

                       “Yes” New Product    “No” New Product

“Yes” Old Product             60                   20

“No” Old Product              50                   70

 

There were 200 respondents to the questionnaire: 60 said “yes” to both the new and the old product, and 70 said “no” to both.  Each outcome in the cross tab table is called a cell.  Are the questions related?
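The arithmetic behind this question can be sketched in a few lines of Python. This is only an illustration, not part of the SPSS procedure. It assumes the usual rule for the expected count of a cell when the two questions are unrelated (row total times column total, divided by the number of respondents), which the text does not spell out. The variable names are made up for the example.

```python
# Observed counts from the cross tab table in the text.
# Rows: "Yes"/"No" old product; columns: "Yes"/"No" new product.
observed = [[60, 20],
            [50, 70]]

row_totals = [sum(row) for row in observed]        # old-product totals
col_totals = [sum(col) for col in zip(*observed)]  # new-product totals
n = sum(row_totals)                                # all respondents

# Assumed rule: expected count = row total * column total / grand total.
expected = [[r * c / n for c in col_totals] for r in row_totals]

# Chi-square: add up (o - e)^2 / e over every cell.
chi_square = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)

print(expected)              # expected counts, cell by cell
print(round(chi_square, 2))  # the chi-square statistic
```

Under these assumptions the expected counts come out as 44, 36, 66, and 54. Every observed count differs from its expected count by 16, which is why the statistic is large.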

This Chi-square also has “degrees of freedom.”  That is the number of cells whose counts you would have to know before you could work out all the other cells, assuming that you already know the totals for each question separately.  In the example above, if I know how many respondents were in one cell, I would know all the others, so the degrees of freedom (df) is one.
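To see why one cell is enough here, the sketch below rebuilds the whole table from the “yes/yes” cell and the totals for each question. The function names are hypothetical. The general rule df = (rows - 1) x (columns - 1) is the standard one for cross tab tables, although the text above states it only for the 2 x 2 case.

```python
# Totals for each question taken separately (product example).
row_totals = [80, 120]   # "Yes" / "No" old product
col_totals = [110, 90]   # "Yes" / "No" new product

def fill_table(yes_yes):
    """Rebuild the full 2x2 table from the single "Yes"/"Yes" cell."""
    yes_no = row_totals[0] - yes_yes   # rest of the first row
    no_yes = col_totals[0] - yes_yes   # rest of the first column
    no_no = row_totals[1] - no_yes     # whatever is left in row two
    return [[yes_yes, yes_no], [no_yes, no_no]]

def degrees_of_freedom(rows, cols):
    """Standard rule for a cross tab table: (rows - 1) * (cols - 1)."""
    return (rows - 1) * (cols - 1)

print(fill_table(60))            # the full table from one known cell
print(degrees_of_freedom(2, 2))  # product example
print(degrees_of_freedom(2, 3))  # a 2-row by 3-column table
```

A table with two rows and three columns, such as gender by three colors, would need two known cells rather than one.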

 

Every combination of a Chi-square value and a df corresponds to a probability (the significance, or “sig”) of getting a value at least that large if the null hypothesis is true, that is, if the two questions have nothing to do with each other.  The chi-square value in this problem is 21.55; sig is less than .001.  This means that a result this extreme would almost never occur if the two questions were unrelated.  I would conclude, therefore, that use of the old product is related to preference for the new product.
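The significance for this example can be checked without a table. The sketch below is an illustration that works only for df = 1. It relies on the fact that a chi-square value with one degree of freedom is the square of a standard normal score. The function name is hypothetical.

```python
import math

def p_value_df1(chi_square):
    """Upper-tail probability of a chi-square value with df = 1.

    With one degree of freedom, chi-square is the square of a standard
    normal score, so the tail area equals erfc(sqrt(chi_square / 2)).
    """
    return math.erfc(math.sqrt(chi_square / 2))

p = p_value_df1(21.55)
print(p < 0.001)  # is the result significant at the .001 level?
```

For tables with more than one degree of freedom this shortcut does not apply, and a statistics package or a printed chi-square table is needed.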

 

Assignment:

 

1. Read the following statement to 30 people, about 15 of each gender.

 

“You really want a certain new car, but find that it only comes in three colors.  Which of the following colors would you select?”

 

a) red        b) blue         c) white

 

 

Assume that gender and color preference are unrelated; this is the null hypothesis that the test will check.

 

Record the data on an SPSS spreadsheet.  The first column is “gender.”  Record men as 1, and women as 2.  Column two is color.  Record red as 1, blue as 2, and white as 3.
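The coding scheme above can also be pictured outside SPSS. The sketch below is a hypothetical Python illustration of how coded (gender, color) pairs turn into cross tab counts. The function name is made up, and the sample records are placeholders, not survey results.

```python
from collections import Counter

# Codes from the assignment: gender 1/2, color 1/2/3.
GENDER_LABELS = {1: "male", 2: "female"}
COLOR_LABELS = {1: "red", 2: "blue", 3: "white"}

def crosstab(records):
    """Count respondents in each gender-by-color cell.

    `records` is a list of (gender_code, color_code) pairs.
    Rows follow the gender codes, columns follow the color codes.
    """
    counts = Counter(records)
    return [[counts[(g, c)] for c in COLOR_LABELS] for g in GENDER_LABELS]

# Placeholder records for illustration only; these are NOT survey results.
sample = [(1, 1), (1, 2), (2, 3), (2, 3), (1, 1)]
print(crosstab(sample))
```

Each inner list is one gender, and each position within it is one color, in the same order as the codes.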

 

Go to the bottom left-hand corner of the spreadsheet, find the tab that says “variable view,” and click it.  Name your variables.  Go right until you find a column called “values.”  Click the cell, and then click the button that appears.  Enter the value labels (1 = male, for example) in the value box for both variables.

 

Do the following:

 

Analyze

            Descriptive Statistics

                        Crosstabs

                                    Input gender in rows, and color in columns

                                    Select Statistics

                                                Click Chi-square

                                    Select Cells

                                                Click Row, Column, and Total under Percentages
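To show what the Row, Column, and Total percentages mean, the sketch below computes them for the product example from the lesson (60, 20, 50, 70), not for the assignment data. It is a hand calculation in Python under hypothetical variable names, not a reproduction of SPSS output.

```python
# Observed counts from the product example.
# Rows: "Yes"/"No" old product; columns: "Yes"/"No" new product.
observed = [[60, 20],
            [50, 70]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

def pct(part, whole):
    """Percentage of `whole` made up by `part`, to one decimal place."""
    return round(100 * part / whole, 1)

# Row %: each cell as a share of its row total.
row_pct = [[pct(o, row_totals[i]) for o in row] for i, row in enumerate(observed)]
# Column %: each cell as a share of its column total.
col_pct = [[pct(o, col_totals[j]) for j, o in enumerate(row)] for row in observed]
# Total %: each cell as a share of all respondents.
total_pct = [[pct(o, n) for o in row] for row in observed]

print(row_pct)
print(col_pct)
print(total_pct)
```

Row percentages are usually the easiest to read for this kind of question: 75% of old-product users liked the new product, compared with about 42% of non-users.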

 

Run the statistic, and paste both the frequencies and the test statistics into the assignment sheet.

 

  1. Explain your answer. 
  2. Are the two questions related?  Yes or no, and why? 
  3. How is this statistic different from Chi-square for fit?