General help

You can enter any of five kinds of test statistics (F, z, t, r, chi2) in the format shown by the default value of the analysis. In general, these are the same statistics and formats supported by Simonsohn et al's app. Lines highlighted in pink cannot be read as test statistics and are ignored. You can use this feature to add comments to your input.

Also, any text after a test statistic that follows a hash (#) will be read as a line-specific comment and highlighted in blue. You can use this to label specific test statistics; for instance, to note what study they've been taken from. The comments will be shown in the plot and in the data table.
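The comment handling described above can be sketched in a few lines of Python. This is an illustration only, not the app's parser: the syntax it accepts (forms such as t(88)=2.1, F(1,100)=9.1, z=2.3, r(45)=0.3 and chi2(1)=4.5) is an assumption based on the formats commonly used with Simonsohn et al's app, and the names STAT_RE and parse_line are hypothetical.

```python
import re

# Assumed syntax: kind, optional degrees of freedom in parentheses, "=", value.
STAT_RE = re.compile(
    r"^\s*(F|z|t|r|chi2)\s*"
    r"(?:\(\s*(\d+(?:\.\d+)?)\s*(?:,\s*(\d+(?:\.\d+)?)\s*)?\))?"
    r"\s*=\s*(-?\d*\.?\d+)\s*$",
    re.IGNORECASE,
)


def parse_line(line):
    """Split one input line into (statistic, comment).

    statistic is None when the text before the hash cannot be read
    as a test statistic; otherwise it is (kind, dfs, value).
    """
    body, _, comment = line.partition("#")
    match = STAT_RE.match(body)
    if match is None:
        return None, comment.strip()
    kind, df1, df2, value = match.groups()
    dfs = tuple(float(d) for d in (df1, df2) if d is not None)
    return (kind.lower(), dfs, float(value)), comment.strip()


for text in ["t(88)=2.1 # Study 1", "F(1, 100) = 9.1", "just a note"]:
    print(parse_line(text))
```

In this sketch, a line whose statistic comes back as None plays the role of a pink (ignored) line, and the second element of the result is the line-specific comment.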

You can also click the "📋 Link" button at the top of the page to obtain a link that will point back to the current analysis. The link will be copied to your clipboard.

If you would like to copy a simplified version of the analysis text for pasting into Simonsohn et al's app, click the "📋 Text" button below the analysis text area. This removes all comments and copies the simplified analysis to your clipboard.

The code base for this app is completely independent of Simonsohn et al's app, so it may produce slightly different results in some cases. For instance, as of version 4.10, Simonsohn et al's app truncates all p values to be greater than 2.2e-16 (corresponding to a Z statistic of about 8.21). Our app does not do this.
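The correspondence between the 2.2e-16 floor and a Z statistic of about 8.21 can be checked with Python's standard library. The sketch assumes the p value is two-sided; the variable names are illustrative.

```python
from statistics import NormalDist

p_floor = 2.2e-16  # the p value floor described above

# For a two-sided test, p = 2 * P(Z > z), so the matching z leaves p / 2
# in each tail. Working in the lower tail avoids rounding 1 - p / 2 to 1.
z_at_floor = -NormalDist().inv_cdf(p_floor / 2)

print(z_at_floor)
```

Any Z statistic larger in magnitude than this value has a p value below the floor, which is where the two apps can disagree.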

See the Examples section for demonstrations.

Visualization help

The plot shows the empirical cumulative distribution of the p values of the significant (at 0.05) test statistics entered into the app. Note that for clarity the x axis is logarithmic. The figure also shows what would be expected if the p values were independent draws from a uniform distribution.

  1. Each point represents a study entered into the textbox. Move your mouse over a point to get information about that point. Each point is treated as an order statistic.
  2. The light blue ribbon represents the 90% interval (5%-95%) in which the corresponding order statistic (e.g. the third smallest p value if the point is the third from the bottom) would be expected to be found if the distribution of p values were uniform. Note that this interval is point-wise, and not simultaneous (that is: there is a 90% probability that each point is within the ribbon, not that all points are within the ribbon). Note also that order statistics are not independent (so the probability that more than one is outside the ribbon, even given a uniform distribution, may be larger than one would expect).
  3. The dark blue line within the ribbon represents the median for the corresponding order statistic if the distribution of p values were uniform.
  4. The purple diamond represents the geometric mean of all the significant p values entered into the app.
  5. The light blue band represents the 90% interval (5%-95%) within which the geometric mean p value would be expected to be found if the distribution of p values were uniform. This is a form of Fisher's meta-analytic test conditioned on significance (that is, Simonsohn et al's 2014 test for "evidential value"); if the diamond is to the left of this band, this test is significant at the 5% level.
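The ribbon and median line in items 2 and 3 can be reproduced approximately with a short calculation. The sketch below is not the app's code. It assumes that, under the uniform hypothesis, the significant p values are independent draws from Uniform(0, 0.05), so each order statistic is 0.05 times the corresponding order statistic of Uniform(0, 1) draws. The function names are hypothetical.

```python
from math import comb


def order_stat_cdf(x, i, n):
    """P(i-th smallest of n independent Uniform(0, 1) draws <= x)."""
    # The i-th smallest is <= x exactly when at least i of the n draws are <= x.
    return sum(comb(n, j) * x**j * (1 - x) ** (n - j) for j in range(i, n + 1))


def order_stat_quantile(q, i, n, tol=1e-12):
    """Invert the CDF above by bisection on [0, 1]."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if order_stat_cdf(mid, i, n) < q:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def ribbon(n, alpha=0.05):
    """(5%, 50%, 95%) points on the p value scale for each of n order statistics."""
    return [
        tuple(alpha * order_stat_quantile(q, i, n) for q in (0.05, 0.5, 0.95))
        for i in range(1, n + 1)
    ]


for i, (low, median, high) in enumerate(ribbon(5), start=1):
    print(i, low, median, high)
```

Each row gives the lower edge of the ribbon, the median line, and the upper edge for one order statistic. As noted in item 2, these intervals are point-wise, so the rows should not be read as a joint 90% region.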

Tests help

The tests table contains the results of the various P-curve tests that Simonsohn et al have developed. Each column is described below. We do not show the half P-curve tests by default, for reasons that are explained in our paper. Click the "Include half P-curves?" check box to show them. We also do not show the "left skew" tests by default, given that the authors have not focused much on these tests. Click the "Include LS tests?" check box to show them.

Test
"EV" refers to Simonsohn et al's "evidential value" or "right skew" test; "LEV" refers to their "lack of evidential value" or "flatter than 33% power" test; "LS" refers to their "left skew" test. Note that the authors never developed the LS test for the Stouffer transform in 2015, but its p value is simply one minus the p value for the Stouffer "EV" test, so it is trivial.
α
α (alpha) represents the critical boundary chosen for the test. 0.05 is their "full" P-curve; 0.025 is their "half" P-curve.
Fisher χ²
The χ² test statistic for Simonsohn et al's (2014) P-curve test using a log transformation on the scaled p values (Fisher's method). The test statistic has 2×k degrees of freedom, where k is the number of significant studies.
Fisher p
The p value corresponding to the Fisher χ² test statistic.
Stouffer Z
The Z test statistic for Simonsohn et al's (2015) P-curve test using a probit transformation on the scaled p values (Stouffer's method).
Stouffer p
The p value corresponding to the Stouffer Z test statistic.
# studies
The total number of test statistics detected in the input.
# sig.
The number of studies significant at the α-level in the α column.
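The Fisher and Stouffer columns for the "EV" test can be sketched as follows. This is a minimal illustration, not the app's implementation: it assumes the scaled p value is p/α for each p below α, and the names chi2_sf_even and ev_tests are hypothetical. It uses the closed form for the upper tail of a χ² distribution with an even number of degrees of freedom, so only the standard library is needed.

```python
from math import exp, log, sqrt
from statistics import NormalDist


def chi2_sf_even(x, k):
    """Upper tail probability of a chi-square with 2*k degrees of freedom."""
    term, total = 1.0, 1.0
    for j in range(1, k):
        term *= (x / 2) / j
        total += term
    return exp(-x / 2) * total


def ev_tests(p_values, alpha=0.05):
    """Right-skew (EV) tests on the significant p values."""
    scaled = [p / alpha for p in p_values if p < alpha]  # assumed scaling to (0, 1)
    k = len(scaled)
    if k == 0:
        raise ValueError("no significant p values")
    norm = NormalDist()
    fisher_chi2 = -2 * sum(log(u) for u in scaled)  # 2*k degrees of freedom
    stouffer_z = sum(norm.inv_cdf(u) for u in scaled) / sqrt(k)
    return {
        "n_studies": len(p_values),
        "n_sig": k,
        "fisher_chi2": fisher_chi2,
        "fisher_p": chi2_sf_even(fisher_chi2, k),
        "stouffer_z": stouffer_z,
        "stouffer_p": norm.cdf(stouffer_z),  # small scaled p values give negative Z
    }


print(ev_tests([0.001, 0.012, 0.03, 0.2]))
```

Under these assumptions the geometric mean of the significant p values equals alpha * exp(-fisher_chi2 / (2 * k)), which is why the diamond in the plot and the Fisher test carry the same information. One minus stouffer_p gives the Stouffer "LS" p value mentioned above.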

Data table help

The data table shows all the test statistics detected in the input.

Line
The line number of the input on which that statistic was found.
Input
The test statistic entered on this line of the input.
Comment
The comment given for that test statistic (entered after a hash (#) on that input row).
p
The recomputed p value for the entered test statistic.
Sig.?
Is the p value in this row significant at 0.05? ✅ means p<0.05, and hence the value will be included in the full P-curve; ❌ means p≥0.05, and hence the value will not be included.
Fisher
The contribution of that row's test statistic to the overall Fisher's χ² statistic. The sum of this column is the total χ² statistic in the tests table (within rounding error).
Stouffer
The contribution of that row's test statistic to the overall Stouffer's Z statistic. The sum of this column is the total Z statistic in the tests table (within rounding error).
LEV NCP
The noncentrality parameter used for the full P-curve LEV test ("lack of evidential value", or "flatter than 33% power") for the test statistic in that row; i.e. the noncentrality parameter for which the distribution of the test statistic would yield a 1/3 probability of significance at 0.05.
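For a z statistic, the idea behind the LEV NCP column can be sketched with a root search. The code below is an illustration under stated assumptions, not the app's method: it treats the test as two-sided, models the statistic as Normal(ncp, 1), and covers z only (t, F, r and chi2 would need their own noncentral distributions). The function names are hypothetical.

```python
from statistics import NormalDist

NORM = NormalDist()


def z_power(ncp, alpha=0.05):
    """Probability that a two-sided z test is significant when Z ~ Normal(ncp, 1)."""
    crit = NORM.inv_cdf(1 - alpha / 2)
    return (1 - NORM.cdf(crit - ncp)) + NORM.cdf(-crit - ncp)


def ncp_for_power(target=1 / 3, alpha=0.05, tol=1e-12):
    """Bisect for the noncentrality parameter at which power equals target."""
    lo, hi = 0.0, 10.0  # power rises from alpha at ncp = 0 toward 1 as ncp grows
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if z_power(mid, alpha) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


ncp = ncp_for_power()
print(ncp, z_power(ncp))
```

The printed power should be 1/3 up to the bisection tolerance, which is the defining property of the noncentrality parameter in this column.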

P-curve analysis app

Morey & Davis-Stober

Scroll down, or choose from the menu above, to see the results of P-curve analysis. You can modify the analysis by editing the textbox to the left. Also, try some of the examples. The code for this app can be found on GitHub.


Visualization


Tests


Data table


Examples

Replications of published analyses

These replications of published analyses demonstrate how this app improves the transparency of P-curve analyses using comments and links.

Demonstrations of problems with the P-curve

For most of these we do not use Simonsohn et al's (2015) full/half rule for determining "evidential value"; to see why, see the nonmonotonicity demonstration.

Sensitivity

Evidential value

Nonmonotonicity

These six sets of studies have test statistics that dominate each earlier set. Set 1 is not significant by Simonsohn et al's (2015) half/full evidential value rule and set 2 is significant. But then, set 3 is not significant and set 4 is significant. Then set 5 is not significant, and set 6 is significant. The procedure is not monotone in the evidence.
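For reference, the half/full rule can be written as a small decision function. This follows our reading of Simonsohn et al (2015): evidential value is declared if the half P-curve test has p < 0.05, or if both the full and half P-curve tests have p < 0.1. The function name is hypothetical and the thresholds should be checked against the original paper.

```python
def has_evidential_value(p_full, p_half):
    """Half/full rule as we read it from Simonsohn et al (2015)."""
    return p_half < 0.05 or (p_full < 0.1 and p_half < 0.1)


# A significant full P-curve alone is not enough under this rule.
print(has_evidential_value(p_full=0.04, p_half=0.20))
# A significant half P-curve is enough on its own.
print(has_evidential_value(p_full=0.50, p_half=0.04))
```

Because the decision depends jointly on two tests computed on different subsets of the p values, a set of studies can move in and out of the significant region as the test statistics grow, which is the behaviour the six sets above exhibit.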


Citation

Our citations

General P-curve citations

  • Simonsohn, U., Nelson, L. D., & Simmons, J. P. (2014). P-curve: A key to the file-drawer. Journal of Experimental Psychology: General, 143(2), 534–547. doi:10.1037/a0033242
  • Simonsohn, U., Simmons, J. P., & Nelson, L. D. (2015). Better P-curves: Making p-curve analysis more robust to errors, fraud, and ambitious p-hacking, a reply to Ulrich and Miller (2015). Journal of Experimental Psychology: General, 144(6), 1146–1152. doi:10.1037/xge0000104