As mentioned, I am doing a presentation at the sociology methodology meetings next month, and my intention is to make an argument so radical (for sociology) that I fear I could be judged insane.
The argument is that if quantitative sociologists have enough confidence in their results to publish them in one of the discipline's journals, they should have enough confidence to deposit the code that produces these results in an independent public online archive (like this one) at the time of the article's publication.*
Yes! I am really this crazy!
Everyone agrees that good data analytic practice implies having a set of code that takes one all the way from a pristine data set to the numbers that are presented in a paper. This code serves as an implicit technical appendix to any published quantitative article. So long as this code is already presumed to exist, why not make it publicly available? Not just "upon request," but available up front. Not just "available on your webpage," but available in a place where it will still be found even if you quit sociology.
And not just as a matter of good individual practice, but as a matter of collective practice. This is something we should insist that researchers do if they want their work to appear in the discipline's major journals. This should be part of the price of admission for publishing.
I look around quantitative sociology and ask: what is the simplest thing that could be normativized or institutionalized that would increase the quality and credibility of quantitative work done in the discipline? I think this is it. Besides which, I think it is absurd for sociologists to stand around and lament how everyone gives economics so much more credibility than sociology when the flagship journal of economics holds its researchers to this standard, while sociology has only a vague and completely toothless statement that researchers should "permit" others to verify their results.
The title of my presentation is "Reproducibility Standards for Quantitative Social Science: Why Not Sociology?" Let me know if you have any reactions to this blog précis.
* If they have custodial rights over the data, they should be depositing that, too, but I don't want the complications and politics that surround data-sharing to be used as grounds to dodge making the code available. The confidentiality and exclusivity arguments that are employed against broader data-sharing evaporate when you focus the standard on the code. (This is, indeed, the only part of my argument that is remotely original.) I do think that people who have custodial rights over data should be expected to say something about the availability of that data. In other words, if a researcher's stance is that "For confidentiality reasons, no extract of these data can be given to outside researchers, even for the purposes of verifying results," I think this is something that the reviewers and audience for the article have a right to know up front.
BTW, the ICPSR-PRA archive instructions are a little misleading in that they make it sound like it's only for depositing data, but you can deposit code there without data. The Murray Archive at Harvard also accepts code, and presumably there are others.