In this article, I demonstrate how important it is to evaluate or test for the signal-to-noise ratio in statistical analysis, especially when the sample size is large or massive.

## What is the signal-to-noise ratio?

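Consider, as an example, a linear regression model of the following form:

$$Y = \beta_0 + \beta_1 X_1 + \cdots + \beta_K X_K + u$$

We often test a linear hypothesis such as

$$H_0: \beta_1 = \cdots = \beta_J = 0$$

against H1 where H0 is violated. The test is conducted using the F-test, which can be written as

$$F = \frac{T - K - 1}{J} \, m$$

where *T* is the sample size, and

$$m = \frac{R_1^2 - R_0^2}{1 - R_1^2}$$

Here $R_1^2$ is the R² of the unrestricted model and $R_0^2$ is the R² of the model estimated under H0.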

The quantity *m* is called the **signal-to-noise ratio**, and it measures **how much the restriction under H0 (or the corresponding X variables) contributes to the goodness-of-fit of the model**. Taking a simple case as an example, where H0 restricts only the coefficient of a single variable *X1*, the signal-to-noise ratio *m* measures the contribution of *X1* to the model's fit or in-sample predictability, relative to its noise component.
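The definition can be sketched in a few lines of Python. This is only an illustration under stated assumptions: *m* is computed from the R² of the unrestricted model and the R² of the model restricted under H0, the regression includes an intercept (so the residual degrees of freedom are T - K - 1), and the function names and input numbers are hypothetical.

```python
def signal_to_noise(r2_unrestricted: float, r2_restricted: float) -> float:
    """Return m = (R1^2 - R0^2) / (1 - R1^2)."""
    if not 0.0 <= r2_restricted <= r2_unrestricted < 1.0:
        raise ValueError("expected 0 <= R0^2 <= R1^2 < 1")
    return (r2_unrestricted - r2_restricted) / (1.0 - r2_unrestricted)


def f_statistic(m: float, T: int, K: int, J: int) -> float:
    """Return F = (T - K - 1) / J * m (assumes an intercept plus K regressors)."""
    return (T - K - 1) / J * m


# Hypothetical inputs, chosen only to show the arithmetic.
m = signal_to_noise(r2_unrestricted=0.30, r2_restricted=0.25)
F = f_statistic(m, T=100, K=3, J=1)
print(f"m = {m:.4f}, F = {F:.2f}")
```

Writing the F-statistic as a product makes the two ingredients explicit: *m* carries the information about fit, and the factor in front of it depends only on T, K, and J.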

## Why is it important?

Notice that the F-test statistic given above is the signal-to-noise ratio multiplied by a factor. The factor is driven by the sample size (*T*), given the values of *K* (number of *X* variables) and *J* (number of restrictions). In fact, many other statistical tests (such as the t-test) can be expressed similarly as a multiple of the signal-to-noise ratio.

The problem occurs when the sample size is large or massive. **Even if the value of *m* is small or even negligible, the F-test statistic can be large enough to reject the null hypothesis.**

Hence, even if the contribution of the variables being tested is negligible, your t-test and F-test can reject the null hypothesis. This is a serious limitation of hypothesis testing in the big data era. That is, when the sample size is large enough, H0 can be rejected with an infinitesimal p-value for any deviation from it, however negligible in practice.
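A short numerical sketch illustrates the point. It holds a negligible *m* fixed and lets the sample size grow. The values of `m` and `K` are hypothetical, the F-statistic is assumed to take the form (T - K - 1)/J × m, and the p-value uses a large-sample approximation for a single restriction (J = 1, so F = t² with t approximately standard normal). The approximation is rough for the smallest sample sizes.

```python
import math


def f_statistic(m: float, T: int, K: int, J: int) -> float:
    """F = (T - K - 1) / J * m (assumes an intercept plus K regressors)."""
    return (T - K - 1) / J * m


def approx_p_value_single_restriction(F: float) -> float:
    """Large-sample p-value for J = 1: P(|Z| > sqrt(F)) with Z ~ N(0, 1)."""
    return math.erfc(math.sqrt(F / 2.0))


m = 0.001  # hypothetical, practically negligible signal-to-noise ratio
K = 5      # hypothetical number of X variables

results = {}
for T in (100, 1_000, 10_000, 100_000, 1_000_000):
    F = f_statistic(m, T, K, J=1)
    p = approx_p_value_single_restriction(F)
    results[T] = (F, p)
    print(f"T = {T:>9,d}   F = {F:10.3f}   approx. p-value = {p:.3g}")
```

The signal-to-noise ratio never changes in this loop. Only T does, and that alone moves the test from a clear non-rejection to an overwhelming rejection.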

In fact, the signal-to-noise ratio is also known as Cohen's f² in behavioral science as a measure of effect size. According to Cohen (2013), the *m* values of 0.02, 0.15, and 0.35 respectively serve as thresholds for a small, medium, and large effect.

Hence, it is imperative to check the value of *m* (or effect size) in practical applications, especially when the sample size is large, using the values suggested by Cohen (2013) as benchmarks.
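As a rough sketch of how these benchmarks might be applied, the helper below maps a value of *m* to a label. The function name is hypothetical, and the "negligible" label for values below 0.02 is an assumption made here for illustration; the thresholds themselves are the 0.02, 0.15, and 0.35 quoted above.

```python
def classify_effect_size(m: float) -> str:
    """Label m (Cohen's f-squared) using the 0.02 / 0.15 / 0.35 benchmarks."""
    if m < 0:
        raise ValueError("m must be non-negative")
    if m >= 0.35:
        return "large"
    if m >= 0.15:
        return "medium"
    if m >= 0.02:
        return "small"
    # Assumption: anything below the 'small' threshold is called negligible.
    return "negligible"


for value in (0.001, 0.05, 0.20, 0.50):  # hypothetical values of m
    print(f"m = {value:.3f} -> {classify_effect_size(value)}")
```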

A testing procedure for *m* has been proposed in my working paper, which is posted here.

## Example

The above table reports the regression results from a paper published in a top journal in finance, where T = 119785. Each column represents an alternative regression model for the same dependent variable. Compare regressions (1) and (2). The variable (UEHIGH × HISR) in (2) shows a t-statistic of 4.58 (= 0.55/0.12), but the value of *m* is nearly 0 (= (0.043 - 0.042)/(1 - 0.043)). The statistical significance of (UEHIGH × HISR) is driven almost entirely by the large sample size, while the variable adds virtually nothing to the explanatory power of the model.
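The arithmetic in this example can be reproduced directly. The sketch below uses only the numbers quoted in the text and assumes that 0.043 and 0.042 are the R² values of regressions (2) and (1) respectively; the variable names are illustrative.

```python
# Numbers quoted in the text for the variable (UEHIGH x HISR).
coefficient = 0.55
standard_error = 0.12

# Assumption: these are the R^2 values of regression (2) and regression (1).
r2_with_variable = 0.043
r2_without_variable = 0.042

t_stat = coefficient / standard_error
m = (r2_with_variable - r2_without_variable) / (1.0 - r2_with_variable)

print(f"t-statistic = {t_stat:.2f}")
print(f"m = {m:.6f}")
print(f"m below the 0.02 'small effect' benchmark: {m < 0.02}")
```

On these numbers *m* is roughly twenty times smaller than the benchmark for a small effect, even though the t-statistic is far above conventional critical values.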
