Wednesday, October 11, 2006

special topics in calamity psychology

(graph from the Wicherts et al. study)

My paper advocating for replication standards in sociology has been conditionally accepted by Sociological Methods and Research, and I finished the revisions yesterday. Today, I learned about the study with the above graph, which is from the most recent issue of the American Psychologist. The Code of Ethics standard for data sharing in sociology closely resembles the one in psychology (one borrows language from the other, or both from the same source; it's unclear). The difference is that in psychology, authors have to sign a statement of adherence to research ethics as a condition of publication, and that statement is presumed to encompass their agreement to the data-sharing standard. The relevant APA code:
After research results are published, psychologists do not withhold the data on which their conclusions are based from other competent professionals who seek to verify the substantive claims through reanalysis and who intend to use such data only for that purpose, provided that the confidentiality of the participants can be protected and unless legal rights concerning proprietary data preclude their release. (American Psychological Association, 2001, p. 396)
Anyway, the authors of this study asked the authors of 141 papers that appeared in American Psychological Association journals for the data from their papers for the purposes of re-analysis.* Only 27% ended up providing their data after repeated requests from the study's authors. Put another way, the data behind 73% of what you read as findings in the most esteemed journals in psychology are not available for independent verification by others.

Does someone really have to go and do a study like this for sociology for it to be believed that this is also a problem in our discipline?

* Wicherts, Jelte M.; Borsboom, Denny; Kats, Judith; Molenaar, Dylan. The Poor Availability of Psychological Research Data for Reanalysis. American Psychologist, 61(7), Oct 2006, 726-728. The authors used a census of articles from the last 2 issues of 2004, and thus the 141 articles do not comprise some sample selected for their being more or less likely to share. [HT: John Hoffmann]


Kim said...

I don't doubt that it's a problem in sociology, but it's hard to predict whether sociologists would be better or worse about sharing data than psychologists. On one hand, a larger proportion of articles in sociology are published using public use data. (Getting authors to share code is another matter, but if methods sections are adequately precise, users should be able to replicate data analytic decisions & results relatively well even without the code.) I wouldn't want to hazard a guess as to what the proportion of public-use data articles is -- obviously, it's going to vary across journals/subfields -- but it can't be trivial.

On the other hand, we have slower publication queues in sociology and, in some subfields, greater time investments in the data collection itself (e.g., years to collect and code data from primary sources, as opposed to a semester or less to run a psych experiment). Given this, you'd expect sociologists who rely on primary data collection to be a bit slower to release it, simply to give them more time to recoup their investments.

If your campaign for greater accountability is successful, will it have the unintended consequence of reducing sociologists' motivation to invest in collecting new data?

Lars said...

Jumping off Kim's second point, I wonder if (in some magical world) something like data collection finance reform would change willingness to share data, make data available, etc.? The general questions, I guess, are: what are the costs of data collection, who bears these costs, and do the costs (and their nature) have an effect on the data's availability?

I've thought for a while it would be great if field researchers could somehow share their data (assuming a way to protect the innocent, etc.). But, of course, many of the costs of field work are borne by researchers themselves.

Anonymous said...

This is a very important question. I'm an anthropologist and we have similar issues. Obviously, reproducibility is an important aspect of scientific verification. So it is important for scientists to check the veracity of claims. And this can only be done with the original data.

On the other hand, I am sympathetic to the concerns of kim and lars. Data are often hard-won. If collected via field research, they sometimes represent years of investment by the researcher. That researcher has an obligation to dessiminate their results. But they also have a right to publish their results without having them be scooped or parasitized by arm-chair workers. This dilemma causes a lot of conflict in many fields.

The strongest incentive to provide original data to other researchers should come from the unwillingness of funding agencies to provide future support to people who refuse to share data.

Anonymous said...

I meant to say "disseminate".

Anonymous said...

I'm still looking for a good definition of 'sociologist' and 'sociology' (no kidding).

Anonymous said...

Arm-chair workers? Like, say, Durkheim?

Anonymous said...

Yep. Pretty much.