使用矩阵对ols进行估计

张剑

2020/03/28

Categories: 计量 Tags: R

矩阵形式计算多元OLS估计系数,标准误等

显然我们有:

\[\hat{\boldsymbol{\beta}}=\left(\mathbf{X}^{\prime} \mathbf{X}\right)^{-1} \mathbf{X}^{\prime} \mathbf{y}\]

同时我们有:

\[\hat{\mathbf{u}}=\mathbf{y}-\mathbf{X} \hat{\boldsymbol{\beta}}\]

于是乎我们又有:

\[\hat{\sigma}^{2}=\frac{1}{n-k-1} \hat{\mathbf{u}}^{\prime} \hat{\mathbf{u}}\]

最后我们有;

\[\widehat{\operatorname{Var}(\hat{\boldsymbol{\beta}})}=\hat{\sigma}^{2}\left(\mathbf{X}^{\prime} \mathbf{X}\right)^{-1}\]

然后就可以sqrt得到se了


使用计量经济学导论的一个例子进行验证

模型: \[lwage_i=\beta_0+\beta_1educ_i+\beta_2exper_i+\beta_3tenure+\mu_i\]

pacman::p_load(tidyverse,wooldridge,stargazer,equatiomatic)
wage1 <- wooldridge::wage1
res1 <- lm(lwage~educ+exper+tenure,data=wage1)

stargazer(res1,type = 'text')
## 
## ===============================================
##                         Dependent variable:    
##                     ---------------------------
##                                lwage           
## -----------------------------------------------
## educ                         0.092***          
##                               (0.007)          
##                                                
## exper                         0.004**          
##                               (0.002)          
##                                                
## tenure                       0.022***          
##                               (0.003)          
##                                                
## Constant                     0.284***          
##                               (0.104)          
##                                                
## -----------------------------------------------
## Observations                    526            
## R2                             0.316           
## Adjusted R2                    0.312           
## Residual Std. Error      0.441 (df = 522)      
## F Statistic           80.391*** (df = 3; 522)  
## ===============================================
## Note:               *p<0.1; **p<0.05; ***p<0.01
stargazer(res1,type='text')
## 
## ===============================================
##                         Dependent variable:    
##                     ---------------------------
##                                lwage           
## -----------------------------------------------
## educ                         0.092***          
##                               (0.007)          
##                                                
## exper                         0.004**          
##                               (0.002)          
##                                                
## tenure                       0.022***          
##                               (0.003)          
##                                                
## Constant                     0.284***          
##                               (0.104)          
##                                                
## -----------------------------------------------
## Observations                    526            
## R2                             0.316           
## Adjusted R2                    0.312           
## Residual Std. Error      0.441 (df = 522)      
## F Statistic           80.391*** (df = 3; 522)  
## ===============================================
## Note:               *p<0.1; **p<0.05; ***p<0.01

接下来矩阵形式计算

df <- wage1 %>% select(lwage,educ,exper,tenure) %>% drop_na()
n <- nrow(df)
k <- 3 # k是控制变量数量,就是为了计算n-k-1
y <- df$lwage
x <- cbind(1,df$educ,df$exper,df$tenure)
bhat <- solve(t(x) %*% x ) %*% t(x) %*% y
bhat
##             [,1]
## [1,] 0.284359541
## [2,] 0.092028988
## [3,] 0.004121109
## [4,] 0.022067218
uhat <- y - x %*% bhat
head(uhat,5)
##             [,1]
## [1,] -0.17351850
## [2,] -0.34793289
## [3,] -0.20630832
## [4,] -0.02804286
## [5,]  0.20601725
sigsqhat <- as.numeric(t(uhat) %*% uhat/(n-k-1))
vbetahat <- sigsqhat * solve(t(x) %*% x)  
se <- sqrt(diag(vbetahat))
se
## [1] 0.104190379 0.007329923 0.001723277 0.003093649

结果是一致的