_plot_added_variable_doc = """\
    Create an added variable plot for a fitted regression model.

    Parameters
    ----------
    %(extra_params_doc)sfocus_exog : int or string
        The column index of exog, or a variable name, indicating the
        variable whose role in the regression is to be assessed.
    resid_type : str
        The type of residuals to use for the dependent variable. If
        None, uses `resid_deviance` for GLM/GEE and `resid` otherwise.
    use_glm_weights : bool
        Only used if the model is a GLM or GEE. If True, the
        residuals for the focus predictor are computed using WLS, with
        the weights obtained from the IRLS calculations for fitting
        the GLM. If False, unweighted regression is used.
    fit_kwargs : dict, optional
        Keyword arguments to be passed to fit when refitting the
        model.
    ax : Axes
        Matplotlib Axes instance

    Returns
    -------
    Figure
        A matplotlib figure instance.
"""
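
# Illustrative sketch only, not the statsmodels implementation. It shows the
# idea behind an added variable plot for an ordinary least-squares model with
# one non-focus regressor: both the response and the focus variable are
# residualized on the other terms, and the two residual series are plotted
# against each other. The helpers `simple_ols` and `residuals` and the data
# are hypothetical, and only the standard library is used.

```python
def simple_ols(x, y):
    """Return (intercept, slope) of the least-squares line y ~ a + b*x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


def residuals(x, y):
    """Residuals of y after regressing it on an intercept and x."""
    intercept, slope = simple_ols(x, y)
    return [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]


# Made-up, noise-free data: y = 1 + 2*x1 + 3*x2, with x2 as the focus variable.
x1 = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
x2 = [1.0, 0.0, 2.0, 1.0, 4.0, 2.0]
y = [1.0 + 2.0 * a + 3.0 * b for a, b in zip(x1, x2)]

# Remove the part of y and of the focus variable explained by the other terms.
y_resid = residuals(x1, y)
focus_resid = residuals(x1, x2)

# An added variable plot shows y_resid against focus_resid. The slope of the
# least-squares line through those points equals the focus coefficient of the
# full model (the Frisch-Waugh-Lovell result).
_, av_slope = simple_ols(focus_resid, y_resid)
print(round(av_slope, 6))
```

# Curvature or outlying points in the (focus_resid, y_resid) scatter suggest
# that the linear term for the focus variable is not adequate.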

_plot_partial_residuals_doc = """\
    Create a partial residual, or 'component plus residual' plot for a
    fitted regression model.

    Parameters
    ----------
    %(extra_params_doc)sfocus_exog : int or string
        The column index of exog, or variable name, indicating the
        variable whose role in the regression is to be assessed.
    ax : Axes
        Matplotlib Axes instance

    Returns
    -------
    Figure
        A matplotlib figure instance.
"""
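
# Illustrative sketch only, not the statsmodels implementation. For a linear
# model, the "component plus residual" for the focus variable is taken here
# to be the model residual plus the fitted linear component of the focus
# variable. The helper `simple_ols` and the data are hypothetical, and the
# two-regressor fit is done by partialling out x1 to stay within the
# standard library.

```python
def simple_ols(x, y):
    """Return (intercept, slope) of the least-squares line y ~ a + b*x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


# Made-up data in which the focus variable x2 has a curved effect on y.
x1 = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
x2 = [1.0, 0.0, 2.0, 1.0, 4.0, 2.0]
y = [1.0 + 2.0 * a + b ** 2 for a, b in zip(x1, x2)]

# Fit y ~ 1 + x1 + x2. First obtain the focus coefficient by regressing the
# x1-adjusted response on the x1-adjusted focus variable.
a_y, b_y = simple_ols(x1, y)
a_f, b_f = simple_ols(x1, x2)
y_adj = [yi - (a_y + b_y * xi) for xi, yi in zip(x1, y)]
f_adj = [fi - (a_f + b_f * xi) for xi, fi in zip(x1, x2)]
_, beta_focus = simple_ols(f_adj, y_adj)

# With the focus coefficient fixed, the remaining coefficients follow from
# regressing y - beta_focus * x2 on x1.
beta_0, beta_1 = simple_ols(x1, [yi - beta_focus * fi for yi, fi in zip(y, x2)])

# Model residuals and partial residuals for the focus variable.
resid = [yi - (beta_0 + beta_1 * a + beta_focus * b)
         for a, b, yi in zip(x1, x2, y)]
partial_resid = [r + beta_focus * b for r, b in zip(resid, x2)]

# The plot would show partial_resid against x2; print the pairs instead.
for b, pr in sorted(zip(x2, partial_resid)):
    print(b, round(pr, 3))
```

# The least-squares line through (x2, partial_resid) has slope beta_focus, so
# systematic departures from that line point to a nonlinear effect of x2.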

_plot_ceres_residuals_doc = """\
    Conditional Expectation Partial Residuals (CERES) plot.

    Produce a CERES plot for a fitted regression model.

    Parameters
    ----------
    %(extra_params_doc)s
    focus_exog : {int, str}
        The column index of results.model.exog, or the variable name,
        indicating the variable whose role in the regression is to be
        assessed.
    frac : float
        Lowess tuning parameter for the adjusted model used in the
        CERES analysis. Not used if `cond_means` is provided.
    cond_means : array_like, optional
        If provided, the columns of this array span the space of the
        conditional means E[exog | focus exog], where exog ranges over
        some or all of the columns of exog (other than the focus exog).
    ax : matplotlib.Axes instance, optional
        The axes on which to draw the plot. If not provided, a new
        axes instance is created.

    Returns
    -------
    Figure
        The figure on which the partial residual plot is drawn.

    Notes
    -----
    `cond_means` is intended to capture the behavior of E[x1 |
    x2], where x2 is the focus exog and x1 are all the other exog
    variables. If all the conditional mean relationships are
    linear, it is sufficient to set `cond_means` equal to the focus
    exog. Alternatively, `cond_means` may consist of one or more
    columns containing functional transformations of the focus
    exog (e.g. x2^2) that are thought to capture E[x1 | x2].

    If nothing is known or suspected about the form of E[x1 | x2],
    set `cond_means` to None, and it will be estimated by
    smoothing each non-focus exog against the focus exog. The
    value of `frac` controls these lowess smooths.

    If `cond_means` contains only the focus exog, the results are
    equivalent to a partial residual plot.

    If the focus variable is believed to be independent of the
    other exog variables, `cond_means` can be set to an (empty)
    nx0 array.

    References
    ----------
    .. [1] RD Cook and R Croos-Dabrera (1998). Partial residual plots
       in generalized linear models. Journal of the American
       Statistical Association, 93:442.

    .. [2] RD Cook (1993). Partial residual plots. Technometrics 35:4.

    Examples
    --------
    Using a model built from the state crime dataset, make a CERES plot with
    the poverty rate as the focus variable.

    >>> import statsmodels.api as sm
    >>> import matplotlib.pyplot as plt
    >>> import statsmodels.formula.api as smf
    >>> from statsmodels.graphics.regressionplots import plot_ceres_residuals

    >>> crime_data = sm.datasets.statecrime.load_pandas()
    >>> results = smf.ols('murder ~ hs_grad + urban + poverty + single',
    ...                   data=crime_data.data).fit()
    >>> plot_ceres_residuals(results, 'poverty')
    >>> plt.show()

    .. plot:: plots/graphics_regression_ceres_residuals.py
"""


_plot_influence_doc = """\
    Plot of influence in regression. Plots studentized residuals vs. leverage.

    Parameters
    ----------
    {extra_params_doc}
    external : bool
        Whether to use externally or internally studentized residuals. It is
        recommended to leave external as True.
    alpha : float
        The alpha value to identify large studentized residuals. Large means
        abs(resid_studentized) > t.ppf(1 - alpha / 2, df=results.df_resid)
    criterion : str {{'DFFITS', 'Cooks'}}
        Which criterion to base the size of the points on. Options are
        DFFITS or Cook's D.
    size : float
        The range of `criterion` is mapped to the range from 10**2 to
        size**2 in points.
    plot_alpha : float
        The `alpha` of the plotted points.
    ax : AxesSubplot
        An instance of a matplotlib Axes.
    **kwargs
        Additional parameters passed through to `plot`.

    Returns
    -------
    Figure
        The matplotlib figure that contains the Axes.

    Notes
    -----
    Row labels are shown for the observations in which the leverage,
    measured by the diagonal of the hat matrix, is high or the residuals are
    large, since the combination of large residuals and high leverage
    indicates an influential point. The cut-off for large residuals can be
    controlled using the `alpha` parameter. Large leverage points are
    identified as hat_i > 2 * (df_model + 1)/nobs.

    Examples
    --------
    Using a model built from the state crime dataset, plot the influence in
    regression. Observations with high leverage or large residuals will be
    labeled in the plot to show potential influence points.

    >>> import statsmodels.api as sm
    >>> import matplotlib.pyplot as plt
    >>> import statsmodels.formula.api as smf

    >>> crime_data = sm.datasets.statecrime.load_pandas()
    >>> results = smf.ols('murder ~ hs_grad + urban + poverty + single',
    ...                   data=crime_data.data).fit()
    >>> sm.graphics.influence_plot(results)
    >>> plt.show()

    .. plot:: plots/graphics_regression_influence.py
    """
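
# Illustrative sketch only, not the statsmodels implementation. It applies
# the leverage rule quoted in the Notes above, hat_i > 2 * (df_model + 1)/nobs,
# to a simple regression with an intercept and one regressor. In that special
# case the hat matrix diagonal has the closed form
# h_i = 1/n + (x_i - mean(x))**2 / Sxx. The data are made up.

```python
# One regressor with a single far-out value at the end.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 20.0]
nobs = len(x)
df_model = 1  # one slope term; the intercept is counted by the "+ 1" below

x_bar = sum(x) / nobs
sxx = sum((xi - x_bar) ** 2 for xi in x)

# Diagonal of the hat matrix for y ~ 1 + x.
leverage = [1.0 / nobs + (xi - x_bar) ** 2 / sxx for xi in x]

# Cut-off from the docstring: twice the average leverage.
threshold = 2.0 * (df_model + 1) / nobs

# Indices of the observations that would be treated as high leverage.
high_leverage = [i for i, h in enumerate(leverage) if h > threshold]
print(high_leverage)
```

# The leverages sum to the number of estimated mean parameters (here 2), so
# the threshold is twice the average leverage.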


_plot_leverage_resid2_doc = """\
    Plot leverage statistics vs. normalized residuals squared

    Parameters
    ----------
    results : results instance
        A regression results instance
    alpha : float
        Specifies the cut-off for large standardized residuals. Residuals
        are assumed to be distributed N(0, 1), with significance level
        `alpha`.
    ax : Axes
        Matplotlib Axes instance
    **kwargs
        Additional parameters passed to the plot command.

    Returns
    -------
    Figure
        A matplotlib figure instance.

    Examples
    --------
    Using a model built from the state crime dataset, plot the leverage
    statistics vs. normalized residuals squared. Observations with
    large standardized residuals will be labeled in the plot.

    >>> import statsmodels.api as sm
    >>> import matplotlib.pyplot as plt
    >>> import statsmodels.formula.api as smf

    >>> crime_data = sm.datasets.statecrime.load_pandas()
    >>> results = smf.ols('murder ~ hs_grad + urban + poverty + single',
    ...                   data=crime_data.data).fit()
    >>> sm.graphics.plot_leverage_resid2(results)
    >>> plt.show()

    .. plot:: plots/graphics_regression_leverage_resid2.py
    """
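
# Illustrative sketch only, not the statsmodels implementation. It shows one
# way to read the `alpha` parameter above: residuals are standardized and
# compared against a two-sided N(0, 1) quantile. The two-sided form, the
# z-score standardization and the residual values are assumptions made for
# this example.

```python
from statistics import NormalDist, mean, pstdev

# Made-up raw residuals with one clearly unusual value.
resid = [0.1, -0.3, 0.2, 5.0, -0.2, 0.1, -0.1, 0.3, 0.0, -0.1]

# Standardize to zero mean and unit (population) standard deviation.
center = mean(resid)
scale = pstdev(resid)
z = [(r - center) / scale for r in resid]

# Two-sided cut-off under N(0, 1) at significance level alpha.
alpha = 0.05
cutoff = NormalDist().inv_cdf(1.0 - alpha / 2.0)

# Observations whose standardized residual exceeds the cut-off in magnitude.
large_resid = [i for i, zi in enumerate(z) if abs(zi) > cutoff]
print(round(cutoff, 4), large_resid)
```

# A smaller alpha raises the cut-off, so fewer observations are labeled.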