Include Top

# How Do I Perform Chi-Square Tests in Excel Using SigmaXL?

## Chi-Square Test (Stacked Column Format Data)

1. Open Customer Data.xls. Click Sheet 1 tab. The discrete data of interest is Complaints and Customer Type, i.e., does the type of complaint differ across customer type? Formally the Null Hypothesis is that there is no relationship (or independence) between Customer Type and Complaints.
2. Click SigmaXL > Statistical Tools > Chi-Square Test. Select Major-Complaint, click Count Category (X1) >>; select Customer Type, click Group Category (X2) >>.

3. Click OK. Results:

4. With the p-value = 0.142 we fail to reject H0, so we do not have enough evidence to show a difference in customer complaints across customer types.

Note: If more than 20% of Fitted Cells are Sparse - cells whose expected value is less than 5 – consider collecting more data, consolidation of levels, or removal of columns.

Tip: Use Chi-Square Analysis to complement Advanced Pareto Analysis.

## Chi-Square Test (Two-Way Table Data)

1. Open the file Attribute Data.xls, ensure that Example 1 Sheet is active. This data is in Two –Way Table format, or pivot table format. Note that cells B2:D4 have been pre-selected.
2. Click SigmaXL > Statistical Tools > Chi-Square Test – Two-Way Table Data. Note the selection of data includes the Row and Column labels (if we had Row and Column Totals these would NOT be selected).
3. Click Next. Results:

4. The p-value matches that of the 2 proportion test. Since the p-value of 0.1 is greater than .05, we fail to reject H0.
5. Now click Example 2 Sheet tab. The Yields have not changed but we have doubled the sample size. Repeat the above analysis. The resulting output is:

6. Since the p-value is < .05, we now reject the Null Hypothesis, and conclude that Day Shift and Night Shift are significantly different. The Residuals tell us that Day Shift failures are less than expected (assuming equal proportions), and Night Shift failures are more than expected.

Note, by doubling the sample size, we improved the power or sensitivity of the test.

7. Click the Example 3 Sheet tab. In this scenario we have 3 suppliers, and an additional marginal level. A random sample of 100 units per supplier is tested. The null hypothesis here is: No relationship between Suppliers and Pass/Fail/Marginal rates, but in this case we can state it as No difference across suppliers. Redoing the above analysis (for selection B2:E5) yields the following:

The p-value tells us that we do not have enough evidence to show that there is a difference across the 3 suppliers.
8. Click the Example 4 Sheet tab. Here we have doubled the sample size to 200 per supplier. Note that the percentages are identical to example 3. Redoing the above analysis yields the following:

With the p-value < .05 we now conclude that there is a significant difference across suppliers. Examining the residuals tells us that Supplier A has fewer failures than expected (if there was no difference across suppliers) and Supplier C has more failures than expected. Supplier B has more marginal parts than expected and Supplier C has fewer marginal parts than expected.

# Web Demos

Our CTO and Co-Founder, John Noguera, regularly hosts free Web Demos featuring SigmaXL and DiscoverSim