Chi-Square & Non-parametric Tests

The Chi-Square ($\chi^2$) Distribution

The Chi-Square ($\chi^2$) distribution is a continuous probability distribution that is widely used in hypothesis testing. It is particularly important for:

  • Goodness-of-Fit Tests: Determining if sample data matches a known theoretical distribution.
  • Tests of Independence: Assessing whether two categorical variables are independent.
  • Tests for Variance: Testing hypotheses about a population variance.

Characteristics of a Chi-Square Curve:

  • Non-negative: The $\chi^2$ value is always non-negative.
  • Skewed: The distribution is skewed to the right. As degrees of freedom increase, it becomes more symmetrical.
  • Degrees of Freedom (df): Its shape depends on degrees of freedom ($df$).
  • Area under the curve: Total area is 1.

The $\chi^2$ statistic measures the discrepancy between observed and expected frequencies.
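
As a quick illustration of these properties, the short sketch below prints the mean, variance, and skewness of the $\chi^2$ distribution for several degrees of freedom (Python with SciPy is assumed to be available). The skewness shrinks as $df$ grows, matching the curve becoming more symmetrical, and the mean of a $\chi^2$ distribution equals its $df$.

```python
# Sketch: chi-square distribution shape for several degrees of freedom.
# Reflects the facts mean = df, variance = 2*df, skewness = sqrt(8/df).
from scipy.stats import chi2

for df in (1, 5, 10, 30, 100):
    mean, var, skew = chi2.stats(df, moments="mvs")
    print(f"df={df:>3}: mean={mean:.0f}, variance={var:.0f}, skewness={skew:.3f}")
```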

Chi-Square Goodness-of-Fit Test

This test determines if the observed frequencies of categories in a sample significantly differ from a set of expected frequencies. It's often used to see if sample data conforms to a hypothesized distribution.

Hypotheses:

  • $H_0$: The observed data fits the expected distribution (i.e., there is no significant difference between observed and expected frequencies).
  • $H_1$: The observed data does not fit the expected distribution (i.e., there is a significant difference).

Test Statistic:

$$ \chi^2 = \sum_{i=1}^{k} \frac{(O_i - E_i)^2}{E_i} $$

Where:

  • $O_i$ is the observed frequency for category $i$.
  • $E_i$ is the expected frequency for category $i$.
  • $k$ is the number of categories; the sum is taken over all $k$ of them.

Degrees of Freedom (df):

$df = k - 1 - m$, where $m$ is the number of parameters estimated from the sample data in order to calculate the expected frequencies. For example, fitting a Poisson distribution whose mean is estimated from the sample gives $m = 1$. If the expected frequencies are given or derived without estimating any parameters from the current sample (e.g., assuming a uniform distribution), then $m = 0$ and $df = k - 1$.

Decision Rule:

If the calculated $\chi^2$ value is greater than the critical $\chi^2$ value from the Chi-Square distribution table (for the chosen significance level $\alpha$ and degrees of freedom $df$), or if the p-value is less than $\alpha$, then reject the null hypothesis ($H_0$).
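
To make the procedure concrete, here is a minimal sketch in Python (assuming SciPy is installed), applying the test to invented counts from 120 rolls of a supposedly fair die. The expected frequencies are uniform, so no parameters are estimated ($m = 0$ and $df = k - 1$).

```python
# Sketch: chi-square goodness-of-fit test on hypothetical die-roll counts.
from scipy.stats import chi2, chisquare

observed = [18, 24, 16, 25, 20, 17]   # hypothetical counts from 120 rolls
expected = [20] * 6                   # uniform: 120 rolls / 6 faces

# Test statistic: sum over the k categories of (O_i - E_i)^2 / E_i
chi2_stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
df = len(observed) - 1                # k - 1 - m, with m = 0 here
p_value = chi2.sf(chi2_stat, df)      # upper-tail area beyond chi2_stat

alpha = 0.05
print(f"chi2 = {chi2_stat:.3f}, df = {df}, p = {p_value:.3f}")
print("reject H0" if p_value < alpha else "fail to reject H0")

# Cross-check: SciPy's built-in test gives the same statistic and p-value.
print(chisquare(observed, f_exp=expected))
```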

Assumptions:

  • Data are frequencies (counts of occurrences).
  • Categories are mutually exclusive.
  • Observations are independent.
  • Expected frequencies should generally be at least 5 for each category. If some are less than 5, categories may need to be combined.

Distribution-Free (Non-parametric) Tests

Non-parametric tests, also known as distribution-free tests, are statistical methods that do not rely on assumptions about the shape or parameters of the underlying population distribution (e.g., they don't assume data is normally distributed). They are particularly useful when:

  • Data is nominal (categorical) or ordinal (ranked).
  • Sample sizes are small, and the normality assumption of parametric tests cannot be confidently met.
  • The data contains significant outliers that would unduly influence parametric tests.
  • The assumptions of a parametric test (like equal variances for t-tests) are violated.

While generally less powerful than their parametric counterparts if the assumptions of the parametric test *are* actually met, non-parametric tests are more robust and can provide valid inferences when those assumptions are not tenable.

Sign Test

The Sign Test is one of the simplest non-parametric tests. It can be used for single samples (testing the median against a hypothesized value) or for two related/paired samples (testing if one tends to have larger or smaller values than the other).

Procedure for Paired Samples:

  1. For each pair of observations, calculate the difference (e.g., Sample 1 value - Sample 2 value, or After - Before).
  2. Record the sign of each non-zero difference (+ if positive, - if negative). Ignore pairs where the difference is zero (ties).
  3. Let $N$ be the total number of non-zero differences.
  4. Count the number of positive signs ($S_+$) and the number of negative signs ($S_-$).
  5. The test statistic, often denoted as $S$ or $X$, is typically the smaller of $S_+$ and $S_-$.
  6. This test statistic is compared to critical values from a binomial distribution $B(N, 0.5)$; for larger $N$ (e.g., $N > 25$), a normal approximation can be used. A worked sketch follows the hypotheses below.

Hypotheses (Two-Tailed for Paired Samples):

  • $H_0$: The median of the differences between paired observations is zero. (Equivalently, $P(\text{Sample 1} > \text{Sample 2}) = P(\text{Sample 2} > \text{Sample 1}) = 0.5$)
  • $H_1$: The median of the differences is not zero.

One-sided hypotheses can also be formulated (e.g., $H_1$: Median difference is greater than zero).
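
The steps above reduce to an exact binomial test of $S_+$ successes in $N$ trials with $p = 0.5$. A minimal sketch, assuming SciPy (version 1.7 or later for `binomtest`) and invented before/after measurements:

```python
# Sketch: sign test on hypothetical paired (before, after) measurements.
from scipy.stats import binomtest   # requires SciPy >= 1.7

before = [72, 65, 80, 74, 68, 77, 70, 66, 79, 73]   # hypothetical data
after  = [75, 64, 85, 74, 72, 83, 78, 66, 81, 80]

diffs = [a - b for a, b in zip(after, before)]
nonzero = [d for d in diffs if d != 0]   # step 2: ignore ties (zero diffs)
n = len(nonzero)                         # step 3: N
s_plus = sum(d > 0 for d in nonzero)     # step 4: count of + signs

# Steps 5-6: exact two-tailed binomial test under H0: P(+) = 0.5.
result = binomtest(s_plus, n, p=0.5, alternative="two-sided")
print(f"N = {n}, S+ = {s_plus}, p = {result.pvalue:.4f}")
```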

Wilcoxon Signed-Rank Test

The Wilcoxon Signed-Rank Test is a non-parametric test used for paired data. It's an alternative to the paired t-test when the assumption of normality of differences is not met. It considers not only the signs of the differences (like the Sign Test) but also their magnitudes by ranking them.

Procedure:

  1. Calculate the difference for each pair.
  2. Discard any pairs with a difference of zero. Let $N$ be the number of remaining pairs with non-zero differences.
  3. Rank the absolute values of these non-zero differences from smallest to largest. If there are ties in the absolute differences, assign each tied value the average of the ranks they would have occupied.
  4. Affix the original sign (+ or -) of the difference to each rank.
  5. Calculate the sum of the ranks of the positive differences ($W_+$ or $T_+$) and the sum of the ranks of the negative differences ($W_-$ or $T_-$). Since the negative ranks carry a minus sign, the absolute value $|W_-|$ is used in the next step.
  6. The test statistic $T$ (or $W$) is usually the smaller of $W_+$ and $|W_-|$.
  7. Compare $T$ to critical values from a Wilcoxon Signed-Rank critical value table for the given $N$ and $\alpha$. For larger $N$ (above about 20-25), a normal approximation can be used.

Hypotheses (Two-Tailed):

  • $H_0$: The median of the differences is zero.
  • $H_1$: The median of the differences is not zero.
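
SciPy implements exactly this rank-and-sum procedure. The sketch below reuses the hypothetical paired data from the Sign Test example, with zero differences dropped as in step 2, and reports the smaller rank sum $T$ alongside a two-tailed p-value:

```python
# Sketch: Wilcoxon signed-rank test on the same hypothetical paired data.
# scipy.stats.wilcoxon ranks |differences|, averages tied ranks, and for a
# two-sided alternative returns the smaller of W+ and |W-| as the statistic.
from scipy.stats import wilcoxon

before = [72, 65, 80, 74, 68, 77, 70, 66, 79, 73]   # hypothetical data
after  = [75, 64, 85, 74, 72, 83, 78, 66, 81, 80]

diffs = [a - b for a, b in zip(after, before)]
nonzero = [d for d in diffs if d != 0]               # step 2: drop zeros

stat, p = wilcoxon(nonzero, alternative="two-sided")
print(f"T = {stat}, N = {len(nonzero)}, p = {p:.4f}")
```

Because it uses the magnitudes of the differences rather than only their signs, this test generally has more power than the Sign Test on the same data.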

Mann-Whitney U Test (Wilcoxon Rank-Sum Test)

The Mann-Whitney U Test (also known as the Wilcoxon Rank-Sum Test) is a non-parametric test used to compare two independent samples. It is the non-parametric alternative to the independent samples t-test. It tests whether the two samples come from populations with the same distribution, often interpreted as testing for a difference in medians.

Procedure:

  1. Combine all observations from both samples (let sample sizes be $n_1$ and $n_2$).
  2. Rank all these combined observations from smallest to largest. If there are ties in ranks, assign each tied observation the average of the ranks they would have occupied.
  3. Calculate the sum of the ranks for sample 1 ($R_1$) and the sum of the ranks for sample 2 ($R_2$). As a check, $R_1 + R_2 = \frac{N(N+1)}{2}$ where $N = n_1 + n_2$.
  4. Calculate the $U$ statistics:
    $U_1 = n_1 n_2 + \frac{n_1(n_1+1)}{2} - R_1$
    $U_2 = n_1 n_2 + \frac{n_2(n_2+1)}{2} - R_2$
    Alternatively, $U_2 = n_1n_2 - U_1$.
  5. The test statistic $U$ is the smaller of $U_1$ and $U_2$.
  6. Compare $U$ to critical values from a Mann-Whitney U critical value table for the given $n_1, n_2,$ and $\alpha$. For larger sample sizes (e.g., $n_1, n_2 > 20$), a normal approximation can be used. (Note: Tie correction for $\sigma_U$ in normal approximation can be complex and is often omitted in basic calculations but should be considered for high precision with many ties).

Hypotheses (Two-Tailed):

  • $H_0$: The two populations from which the samples are drawn have identical distributions (or, more commonly, identical medians).
  • $H_1$: The two populations have different distributions (or different medians).
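
Here is a sketch of the full procedure on two small hypothetical independent samples, computing $U$ by hand from the rank sums and cross-checking the p-value against SciPy:

```python
# Sketch: Mann-Whitney U test on two hypothetical independent samples.
from scipy.stats import mannwhitneyu, rankdata

sample1 = [14, 18, 22, 25, 30, 31]        # hypothetical data, n1 = 6
sample2 = [12, 15, 17, 19, 21, 23, 26]    # hypothetical data, n2 = 7

# Steps 1-2: pool and rank (rankdata assigns tied values average ranks).
ranks = rankdata(sample1 + sample2)
n1, n2 = len(sample1), len(sample2)
r1, r2 = ranks[:n1].sum(), ranks[n1:].sum()
assert r1 + r2 == (n1 + n2) * (n1 + n2 + 1) / 2   # check from step 3

# Steps 4-5: U statistics; the test statistic is the smaller one.
u1 = n1 * n2 + n1 * (n1 + 1) / 2 - r1
u2 = n1 * n2 - u1
print(f"U1 = {u1}, U2 = {u2}, U = {min(u1, u2)}")

# Cross-check. Note: SciPy uses the convention U = R1 - n1*(n1+1)/2 for
# sample1, so its statistic may be the larger of U1/U2; the p-value agrees.
res = mannwhitneyu(sample1, sample2, alternative="two-sided")
print(f"SciPy U = {res.statistic}, p = {res.pvalue:.4f}")
```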