hku_psyc2071_and_psyc3052_-_autumn_2018-9 [2018/11/20 07:47] filination [Data analysis]
hku_psyc2071_and_psyc3052_-_autumn_2018-9 [2018/11/21 03:39] (current) filination [Data analysis]

===== Data analysis =====

==== Excluding extreme responses ====

Q:

<blockquote>

Apart from the main effects, I would like to ask something related to my extensions. My extension, after you kindly modified it, asks participants to fill in an amount for the safe/risky option so that they would choose that option. As I was analyzing my extension data, I found some participants filling in answers like $1 million, or an amount that is the same as the other option, which has an amount stated clearly (rendering a fail to measure risk preference).


Therefore, I am not sure how to set the exclusion criteria. I would like to ask if it is okay for me to use the IQR to exclude participants who give skyrocketing amounts? I suspect these participants are very likely to be outliers. Moreover, can I also exclude participants who fill in the same amount as stated in the other option? I am not quite sure how to define the exclusion criteria for this.


I am actually unsure if this constitutes p-hacking. Nevertheless, in reality it seems very unlikely that the amounts for the two options could differ by that much, so I am not sure what I can do with these data.

</blockquote>


A:

<blockquote>

This is a good example of the complexity of analyzing data :)


First off, just so we get the terminology right: p-hacking refers to analytic decisions taken to affect the p-value without transparency. Since you are stating the differences from the pre-registration, marking those analyses as exploratory, explaining every step of the way, reporting results both before and after exclusions, and sharing all your data and code, none of that is p-hacking.


We failed to anticipate this in advance, but that is okay. There are several ways to address this:

- A standard criterion for outliers in such cases is whether a participant's response was more than ±3 standard deviations from the mean. Someone who wrote a million US$ would qualify.

- Examine the distribution. If the distribution is not normal (check skewness and kurtosis), which it sounds like it might not be, then you can perform a log transformation: newvariable = ln(oldvariable). In any case, I suggest plotting the distributions of these variables and reporting their skewness and kurtosis in your descriptives.
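As a rough illustration of both steps, here is a minimal sketch in Python with made-up amounts (the data, variable names, and cutoffs are assumptions for illustration, not the study's actual values):

```python
import numpy as np
from scipy import stats

# Hypothetical responses (amounts in $), including one extreme entry
amounts = np.array([55, 60, 48, 72, 65, 80, 51, 59, 63, 70,
                    45, 68, 54, 77, 62, 58, 66, 49, 73, 61,
                    1_000_000], dtype=float)

# Step 1: drop responses more than 3 standard deviations from the mean
z = np.abs(stats.zscore(amounts))
kept = amounts[z <= 3]

# Step 2: inspect the shape; if skewed, log-transform (newvariable = ln(oldvariable))
print("skewness before:", stats.skew(amounts))
print("kurtosis before:", stats.kurtosis(amounts))
log_amounts = np.log(amounts)
print("skewness after log:", stats.skew(log_amounts))
```

Note that with very small samples the z-score of even an extreme value is bounded, so the ±3 SD rule works better the larger the sample; reporting results both with and without the exclusion sidesteps that concern.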


I am not sure I understand exactly what “the same as the other option, which has an amount stated clearly (rendering a fail to measure risk preference)” means, but you could code that, see how many participants qualify, and send me a follow-up email.
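If the issue is participants simply copying the other option's stated amount, one way to “code that” is to flag exact matches and count them. A minimal sketch with invented data (the variable names and values are assumptions, not from the study):

```python
import numpy as np

# Hypothetical data: the amount each participant typed, and the fixed amount
# that was displayed for the other option (both made up for illustration)
entered = np.array([100.0, 250.0, 100.0, 90.0, 100.0])
stated_other_option = 100.0

copied = entered == stated_other_option  # True where the typed amount matches exactly
n_copied = int(copied.sum())
print(n_copied)  # how many participants would be flagged
```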


</blockquote>

==== Which effect size for proportions ====