While searching for material on GMM today, I came across a reply by Wooldridge under a forum thread that I strongly agree with. He also took the time to distinguish between a "method" and a "model" — a distinction I had honestly been sloppy about until now. 😂

Just a correction in terminology: OLS and GMM are not “models.” They are estimation methods. I would say that about FE, too. You have one model — an AR(1) model with an unobserved effect — and you are estimating in four different ways: OLS, FE, difference GMM, system GMM.

Just using OLS never consistently estimates the coefficient on l.depvar unless there is no heterogeneity, and the bias is upward. FE is also biased, usually downward, and the bias can be substantial unless T is large. So you get a large positive estimate using OLS and a negative estimate using FE? That is perfectly consistent with what is known about the estimators. GMM using the Arellano-Bond conditions is the most robust: it only uses the moment conditions implied by the AR(1) model, and it properly removes the heterogeneity. System GMM adds extra moment conditions that may be false. If the difference GMM estimate seems reasonably precise, and there is no evidence of weak instruments, I would go with that.
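The bias directions Wooldridge describes are easy to verify by simulation. Below is a quick numpy sketch (my own illustration, not code from the thread): a panel AR(1) with unobserved effects, estimated by pooled OLS and by the within (FE) transformation. OLS comes out above the true coefficient because the individual effect is absorbed into the error and is positively correlated with the lagged dependent variable; FE comes out below it (the Nickell bias), since T is small.

```python
import numpy as np

rng = np.random.default_rng(0)
N, T, rho = 500, 6, 0.5

# AR(1) panel with unobserved effects alpha_i; 50 burn-in periods
alpha = rng.normal(0, 1, size=N)
y = np.zeros((N, T + 50))
for t in range(1, T + 50):
    y[:, t] = rho * y[:, t - 1] + alpha + rng.normal(0, 1, N)
y = y[:, -T:]                          # keep the last T periods

y_lag, y_cur = y[:, :-1], y[:, 1:]

# Pooled OLS with a constant: alpha_i sits in the error term and is
# positively correlated with y_lag -> upward bias
x, z = y_lag.ravel(), y_cur.ravel()
ols = np.cov(x, z)[0, 1] / np.var(x)

# Within (FE) estimator: demeaning induces correlation between the
# transformed regressor and error -> downward (Nickell) bias for small T
xd = y_lag - y_lag.mean(axis=1, keepdims=True)
zd = y_cur - y_cur.mean(axis=1, keepdims=True)
fe = (xd * zd).sum() / (xd ** 2).sum()

print(f"true rho = {rho}, pooled OLS = {ols:.3f}, FE = {fe:.3f}")
```

With these settings the two estimates bracket the true value from above and below, exactly the pattern the original poster was puzzled by.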

Why are OLS and FE even in the running? Maybe FE with a very large T, but it has nothing over GMM unless GMM produces large standard errors.
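To see why the Arellano-Bond approach escapes both biases, here is the simplest version of the same idea, the Anderson-Hsiao IV estimator (a sketch of my own, using one instrument rather than Arellano-Bond's full moment set): first-differencing removes the unobserved effect, and the twice-lagged level instruments the differenced lag.

```python
import numpy as np

rng = np.random.default_rng(1)
N, T, rho = 2000, 6, 0.5

alpha = rng.normal(0, 1, size=N)
y = np.zeros((N, T + 50))
for t in range(1, T + 50):
    y[:, t] = rho * y[:, t - 1] + alpha + rng.normal(0, 1, N)
y = y[:, -T:]

# First-differencing removes alpha_i:
#   dy_t = rho * dy_{t-1} + d(eps_t)
# dy_{t-1} is correlated with d(eps_t), so instrument it with the
# level y_{t-2}, which is valid under the AR(1) assumptions.
dy = np.diff(y, axis=1)                # column j holds dy at period j+1
dy_cur = dy[:, 1:].ravel()             # dy_t       for t = 2..T-1
dy_lag = dy[:, :-1].ravel()            # dy_{t-1}
inst = y[:, :-2].ravel()               # y_{t-2}

# Just-identified IV: rho_hat = (z'y) / (z'x)
rho_iv = inst @ dy_cur / (inst @ dy_lag)
print(f"true rho = {rho}, Anderson-Hsiao IV = {rho_iv:.3f}")
```

Unlike OLS and FE above, this estimate is centered on the true coefficient; difference GMM stacks many such moment conditions and weights them efficiently.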

While I'm at it, here is another post, by dynamic-panel expert Sebastian Kripfganz, on variable selection, lag limits, and misleading specification tests in system GMM:

The results are not directly comparable (FE vs. system GMM). Your system GMM estimates are for a dynamic model, while the fixed-effects estimates are for a static model.

Your system GMM estimates are likely to suffer from instrument proliferation. Despite the `collapse` suboption, the number of instruments is still too large. You could restrict the number of lags used to form these instruments with the suboption `laglimits()`, or re-think your assumptions. Do you really need to assume that so many variables are endogenous?

`xtabond2` shows you a warning message that uncorrected two-step standard errors are unreliable. This should be taken seriously. Windmeijer-corrected robust standard errors should be computed with the `robust` option.

Too many instruments together with non-robust two-step standard errors make the Hansen test result unreliable. The "too high" p-value is a clear indication in that regard. The Sargan test is not really useful in the context of system GMM estimation because the one-step weighting matrix is not optimal. (In any case, a p-value of 0.000 is not a good sign, as you would reject the null hypothesis of correct model specification.)
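A back-of-the-envelope count makes the instrument-proliferation point concrete. The helper below is hypothetical (it is not part of `xtabond2`); it only counts how many GMM-style instrument columns one endogenous variable generates for the differenced equations, assuming lags `lmin`..`lmax` of the variable are available as instruments. The full set grows quadratically in T; `collapse` makes it linear; `collapse` plus `laglimits()` caps it at a constant.

```python
# Hypothetical counting helper (illustration only, not an xtabond2 API):
# number of GMM-style instrument columns for one endogenous variable
# in a panel observed over periods 0..T-1.
def n_gmm_instruments(T, lmin=2, lmax=None, collapse=False):
    lmax = T - 1 if lmax is None else lmax
    if collapse:
        # one column per lag distance, shared across all periods
        return lmax - lmin + 1
    # one column per (period, available lag) pair
    return sum(max(0, min(lmax, t) - lmin + 1) for t in range(2, T))

T = 10
print(n_gmm_instruments(T))                                  # full set: 36
print(n_gmm_instruments(T, collapse=True))                   # collapsed: 8
print(n_gmm_instruments(T, lmin=2, lmax=4, collapse=True))   # collapsed + laglimits(2 4): 3
```

Multiply these counts by the number of endogenous variables and it is easy to see how a modest specification can end up with more instruments than panel units, which is exactly what weakens the Hansen test.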