(My hope is that this will maybe make it into the lecture notes for my methods class rather than being simply an exercise for the weblog.)
Today, amidst a gazillion other things, I also find myself preoccupied with sampling weights. Generally, as has been known for some time, most quantitative researchers think about weights incorrectly. The reason for this, I suspect, is that most quantitative researchers have only a murky idea about the implications of having a nonrepresentative sample for regression estimates. Sociologists altogether exaggerate the virtues of a representative sample for analyzing associations between variables. This is relevant to weights because what weights do is provide some means of making a nonrepresentative sample more representative, by telling the computer to give more weight to those observations whose characteristics are underrepresented in the sample relative to the population.
There are two key things that I think most quantitative sociologists don't realize, or if they do realize them, they fail to put the consequences of the two things together. The first is that nonrepresentativeness with respect to the explanatory variables in a model does not bias one's regression coefficients as long as the model is correctly specified. The second is that for most analyses that sociologists conduct, the weights are constructed wholly with respect to variables that actually or potentially are explanatory variables. Put these together, and the implication is that for most analyses, the unweighted regression estimates should be similar to the weighted regression estimates. And the unweighted estimates will have smaller standard errors, and so be better in that respect. Moreover, if they are not similar, instead of automatically deciding to use the weighted estimates, you should pause and wonder why. Specifically, you should try to think about whether the reason they are different points to a problem with the specification of the model.
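Here is a quick simulation sketch of that claim, not a proof. Everything in it is an assumption I made up for illustration: the population model y = 1 + 2x + noise, the selection probabilities (0.10 when x > 0, 0.01 otherwise), and the helper names `fit` and `one_sample`. Weighted least squares is done by hand by scaling each row by the square root of its weight.

```python
import numpy as np

def fit(X, y, w=None):
    """Least squares; if weights w are given, rows are scaled by sqrt(w) (WLS)."""
    if w is None:
        w = np.ones(len(y))
    sw = np.sqrt(w)
    beta, *_ = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)
    return beta

def one_sample(rng, N=20_000):
    """Draw one nonrepresentative sample; return (unweighted, weighted) slopes."""
    # Hypothetical population in which the linear model is correctly specified.
    x = rng.normal(size=N)
    y = 1.0 + 2.0 * x + rng.normal(size=N)
    # Selection depends on the explanatory variable only: x > 0 is oversampled.
    p = np.where(x > 0, 0.10, 0.01)
    keep = rng.random(N) < p
    X = np.column_stack([np.ones(keep.sum()), x[keep]])
    w = 1.0 / p[keep]  # inverse-probability sampling weights
    return fit(X, y[keep])[1], fit(X, y[keep], w)[1]

rng = np.random.default_rng(0)
slopes = np.array([one_sample(rng) for _ in range(300)])
mean_unw, mean_w = slopes.mean(axis=0)
sd_unw, sd_w = slopes.std(axis=0)

print(f"unweighted slope: mean {mean_unw:.3f}, sd across samples {sd_unw:.3f}")
print(f"weighted slope:   mean {mean_w:.3f}, sd across samples {sd_w:.3f}")
```

Under these assumptions both estimators should center on the true slope of 2, and the weighted slopes should bounce around more from sample to sample, which is the simulation analogue of the larger standard error.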
For example, if the effect of income is different for whites and blacks and the model fails to include an interaction term that accounts for this, then the estimate of the effect of income is going to basically be a kind of average of the slope for blacks and the slope for whites. It's going to use one parameter to provide the best estimate of what are really two separate parameters, and so it will split the difference between the two of them. If there are more whites in the sample than blacks, the estimated income effect will be closer to the effect of income for whites than to the effect of income for blacks. If sampling weights are employed and these weights were constructed to account for some nonrepresentativeness of the sample by race, then this will adjust that estimate so that the estimated effect of income reflects the relative proportion of blacks and whites in the population rather than in the sample. But, of course, the better model is one that has an interaction term in it to begin with, that is, a model that uses separate parameters to estimate the different effects of income for blacks and whites.
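The example above can be sketched as a simulation. All of the numbers are hypothetical: a two-group population in which 20% have g = 1, a slope of 1 for g = 0 and 3 for g = 1, and a sample that oversamples g = 1 until the groups are roughly equal in size. The helper `fit` is my own hand-rolled weighted least squares, not a library routine.

```python
import numpy as np

def fit(X, y, w=None):
    """Least squares; if weights w are given, rows are scaled by sqrt(w) (WLS)."""
    if w is None:
        w = np.ones(len(y))
    sw = np.sqrt(w)
    beta, *_ = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)
    return beta

rng = np.random.default_rng(1)
N = 200_000
g = (rng.random(N) < 0.20).astype(float)   # group indicator, 20% of population
x = rng.normal(size=N)                     # stand-in for income
# True slopes differ by group: 1.0 when g = 0, 3.0 when g = 1.
y = 0.5 * g + (1.0 + 2.0 * g) * x + rng.normal(size=N)

# Oversample g = 1 so the sample is about half and half.
p = np.where(g == 1, 0.08, 0.02)
keep = rng.random(N) < p
gs, xs, ys, ws = g[keep], x[keep], y[keep], 1.0 / p[keep]

ones = np.ones(xs.size)
X_main = np.column_stack([ones, xs, gs])           # no interaction term
X_int = np.column_stack([ones, xs, gs, xs * gs])   # with interaction term

b_main_unw = fit(X_main, ys)
b_main_w = fit(X_main, ys, ws)
b_int_unw = fit(X_int, ys)
b_int_w = fit(X_int, ys, ws)

print("misspecified, unweighted slope:", round(b_main_unw[1], 2))
print("misspecified, weighted slope:  ", round(b_main_w[1], 2))
print("interaction model, unweighted (slope, extra slope):", b_int_unw[[1, 3]].round(2))
print("interaction model, weighted   (slope, extra slope):", b_int_w[[1, 3]].round(2))
```

With these made-up numbers the misspecified model should give a slope near 2 unweighted (the sample mix) and near 1.4 weighted (the population mix), while the interaction model should recover both group slopes whether or not the weights are used.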
I think much of the time when you get a difference between weighted and unweighted estimates, what is really going on is that you are pseudo-patching up a specification problem, and that very often what is being pseudo-patched are missing interaction terms. The practical determination of which interaction terms to include or exclude in a model is fraught with all kinds of complications. Two of them are that the number of potential interactions gets very large quickly as the number of explanatory variables increases, and that generally the kind of datasets that sociologists use provide very low power for testing interaction effects.
Anyway, having a nonrepresentative sample does make a real difference in regression estimates when the sample is nonrepresentative specifically with respect to the dependent variable. If you are studying the effects of independent variables on income and the sample is nonrepresentative with respect to income (and not just indirectly through the independent variables), the estimates you get from unweighted regression will be incorrect estimates of the population parameters. If the sampling weights account for the nonrepresentativeness of the sample on the dependent variable, they will fix the problem, although the standard errors will be wrong if you do not use robust standard errors.
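A minimal sketch of this case, again with invented numbers: the same population model y = 1 + 2x + noise, but now the chance of ending up in the sample depends directly on the outcome (0.20 when y > 1, 0.01 otherwise). The helper `fit` is hand-rolled weighted least squares, and the sketch only looks at the point estimates, not at robust standard errors.

```python
import numpy as np

def fit(X, y, w=None):
    """Least squares; if weights w are given, rows are scaled by sqrt(w) (WLS)."""
    if w is None:
        w = np.ones(len(y))
    sw = np.sqrt(w)
    beta, *_ = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)
    return beta

rng = np.random.default_rng(2)
N = 200_000
x = rng.normal(size=N)
y = 1.0 + 2.0 * x + rng.normal(size=N)   # true slope is 2

# Selection depends on the dependent variable itself, not just on x.
p = np.where(y > 1.0, 0.20, 0.01)
keep = rng.random(N) < p

X = np.column_stack([np.ones(keep.sum()), x[keep]])
w = 1.0 / p[keep]                        # inverse-probability sampling weights

b_unw = fit(X, y[keep])
b_w = fit(X, y[keep], w)

print("unweighted slope:", round(b_unw[1], 2))
print("weighted slope:  ", round(b_w[1], 2))
```

Under these assumptions the unweighted slope should come out noticeably below 2, and the weighted slope should land close to 2. That only works because the weights here are the exact inverse selection probabilities, which is rarely something one knows in practice.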
What makes this interesting is that most of the fuss about weighting is with respect to variables that we think of as explanatory variables and not with respect to what are more commonly outcomes (income and educational attainment being the main exceptions). The biggest example of this I can think of is health. I think there are good reasons to suspect that one of the reasons people may decline to participate in surveys is that they are in poor health. This would result in underestimating the effect of explanatory variables on health. Even so, I don't know if sampling weights are commonly adjusted to give more weight to the less healthy individuals who do participate. Maybe people in public health do this and I'm just unaware of it. I could see why they wouldn't, however, because one would have to guess at what the best weight would be.