acgc.stats.bivariate_lines
Specialized methods of bivariate line fitting
- Standard major axis (SMA), also called reduced major axis (RMA)
- York regression, for data with errors in x and y
- Theil-Sen, non-parametric slope estimation (use numba to accelerate the function in this module)
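As a rough illustration of the Theil-Sen idea listed above, the following sketch takes the median of all pairwise slopes using only NumPy. The helper name `theil_sen_slope` and the sample data are hypothetical, and the sketch assumes all x values are distinct; it is not the module's implementation.

```python
import numpy as np

def theil_sen_slope(x, y):
    """Median of slopes between all pairs of points (hypothetical helper)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Indices of all pairs (i, j) with i < j
    i, j = np.triu_indices(len(x), k=1)
    slopes = (y[j] - y[i]) / (x[j] - x[i])
    return np.median(slopes), slopes

# Made-up data: an exact line y = 3x + 1 with one gross outlier
x = np.arange(10.0)
y = 3.0 * x + 1.0
y[7] = 100.0

slope, slopes = theil_sen_slope(x, y)
ols_slope = np.polyfit(x, y, 1)[0]   # ordinary least squares, for comparison
print(slope, ols_slope)
```

Because only the pairs that involve the outlier are affected, the median slope stays at the slope of the underlying line, while the least-squares slope is pulled away from it.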
# -*- coding: utf-8 -*-
"""Specialized methods of bivariate line fitting

* Standard major axis (SMA), also called reduced major axis (RMA)
* York regression, for data with errors in x and y
* Theil-Sen, non-parametric slope estimation (use numba to accelerate the function in this module)
"""
# from collections import namedtuple
import warnings
import numpy as np
import scipy.stats as stats
from sklearn.covariance import MinCovDet
import statsmodels.formula.api as smf
import statsmodels.robust.norms as norms
from scipy.stats import theilslopes
#from numba import jit

__all__ = [
    "bivariate_line_equation",
    "sma",
    "smafit",
    "sen",
    "sen_slope",
    "sen_numba",
    "york"
]
# Aliases
def sen_slope(*args,**kwargs):
    '''Alias for `sen`'''
    return sen(*args,**kwargs)
def smafit(*args,**kwargs):
    '''Alias for `sma`'''
    return sma(*args,**kwargs)

def bivariate_line_equation(fitresult,
                    floatformat='{:.3f}',
                    ystring='include',
                    include_error=False ):
    '''Write equation for the fitted line as a string

    Parameters
    ----------
    fitresult : dict
        results of the line fit
    floatformat : str
        format string for the numerical values (default='{:.3f}')
    ystring : {'include' (default), 'separate', 'none'}
        specifies whether "y =" should be included in result, a separate item in tuple, or none
    include_error : bool
        specifies whether uncertainty terms should be included in the equation

    Returns
    -------
    fitline_string : str
        equation for the fitted line, in the form "y = a x + b" or "y = a x"
        If uncertainty terms are included, then "y = (a ± c) x + (b ± d)" or "y = (a ± c) x"
    '''

    # Left-hand side
    lhs = "y_"+fitresult['method']

    # Right-hand side
    if fitresult['fitintercept']:
        if include_error:
            rhs = f'({floatformat:s} ± {floatformat:s}) x + ({floatformat:s} ± {floatformat:s})'.\
                    format( fitresult['slope'], fitresult['slope_ste'],
                            fitresult['intercept'], fitresult['intercept_ste'] )
        else:
            rhs = f'{floatformat:s} x + {floatformat:s}'.\
                    format( fitresult['slope'], fitresult['intercept'] )
    else:
        if include_error:
            rhs = f'({floatformat:s} ± {floatformat:s}) x'.\
                    format( fitresult['slope'], fitresult['slope_ste'] )
        else:
            rhs = f'{floatformat:s} x'.\
                    format( fitresult['slope'] )

    # Combine right and left-hand sides
    if ystring=='include':
        equation = f'{lhs:s} = {rhs:s}'
    elif ystring=='separate':
        equation = (lhs,rhs)
    elif ystring=='none':
        equation = rhs
    else:
        raise ValueError('Unrecognized value of ystring: '+ystring)

    return equation

def sma(X,Y,W=None,
        data=None,
        alpha=0.95,
        intercept=True,
        robust=False,robust_method='FastMCD'):
    '''Standard Major-Axis (SMA) line fitting

    Calculate standard major axis, aka reduced major axis, fit to
    data X and Y. The main advantage of this over ordinary least squares is
    that the best fit of Y to X will be the same as the best fit of X to Y.

    The fit equations and confidence intervals are implemented following
    Warton et al. (2006). Robust fits use the FastMCD covariance estimate
    from Rousseeuw and Van Driessen (1999). While there are many alternative
    robust covariance estimators (e.g. other papers by D.I. Warton using M-estimators),
    the FastMCD algorithm is default in Matlab. When the standard error or
    uncertainty of each point is known, then weighted SMA may be preferable to
    robust SMA. The conventional choice of weights for each point i is
    W_i = 1 / ( var(X_i) + var(Y_i) ), where var() is the variance
    (squared standard error).

    References
    Warton, D. I., Wright, I. J., Falster, D. S. and Westoby, M.:
        Bivariate line-fitting methods for allometry, Biol. Rev., 81(02), 259,
        doi:10.1017/S1464793106007007, 2006.
    Rousseeuw, P. J. and Van Driessen, K.: A Fast Algorithm for the Minimum
        Covariance Determinant Estimator, Technometrics, 41(3), 1999.

    Parameters
    ----------
    X, Y : array_like or str
        Input values. Must have the same length.
    W : array_like or str, optional
        array of weights for each X-Y point, typically W_i = 1/(var(X_i)+var(Y_i))
    data : dict_like, optional
        data structure containing variables. Used when X, Y, or W are str.
    alpha : float (default = 0.95)
        Desired confidence level for output, strictly between 0 and 1.
    intercept : bool, default=True
        Specify if the fitted model should include a non-zero intercept.
        The model will be forced through the origin (0,0) if intercept=False.
    robust : bool, default=False
        Use statistical methods that are robust to the presence of outliers
    robust_method: {'FastMCD' (default), 'Huber', 'Biweight'}
        Method for calculating robust variance and covariance. Options:
        - 'MCD' or 'FastMCD' for Fast MCD
        - 'Huber' for Huber's T: reduces, but does not eliminate, influence of outliers
        - 'Biweight' for Tukey's Biweight: reduces then eliminates influence of outliers


    Returns
    -------
    fitresult : dict
        Contains the following keys:
        - slope (float)
            Slope or Gradient of Y vs. X
        - intercept (float)
            Y intercept.
        - slope_ste (float)
            Standard error of slope estimate
        - intercept_ste (float)
            standard error of intercept estimate
        - slope_interval ([float, float])
            confidence interval for gradient at confidence level alpha
        - intercept_interval ([float, float])
            confidence interval for intercept at confidence level alpha
        - alpha (float)
            confidence level for slope and intercept intervals
        - df_model (float)
            degrees of freedom for model
        - df_resid (float)
            degrees of freedom for residuals
        - params ([float,float])
            array of fitted parameters
        - fittedvalues (ndarray)
            array of fitted values
        - resid (ndarray)
            array of residual values
        - method (str)
            name of the fit method
    '''

    def str2var( v, data ):
        '''Extract variable named v from Dataframe named data'''
        try:
            return data[v]
        except Exception as exc:
            raise ValueError( 'Argument data must be provided with a key named '+v ) from exc

    # If variables are provided as strings, get values from the data structure
    if isinstance( X, str ):
        X = str2var( X, data )
    if isinstance( Y, str ):
        Y = str2var( Y, data )
    if isinstance( W, str ):
        W = str2var( W, data )

    # Make sure arrays have the same length
    assert ( len(X) == len(Y) ), 'Arrays X and Y must have the same length'
    if W is None:
        W = np.zeros_like(X) + 1
    else:
        assert ( len(W) == len(X) ), 'Array W must have the same length as X and Y'

    # Make sure alpha is within the range 0-1
    assert (alpha < 1), 'alpha must be less than 1'
    assert (alpha > 0), 'alpha must be greater than 0'

    # Drop any NaN elements of X, Y, or W
    # Infinite values are allowed but will make the result undefined
    # idx = ~np.logical_or( np.isnan(X0), np.isnan(Y0) )
    idx = ~np.isnan(X) * ~np.isnan(Y) * ~np.isnan(W)

    X0 = X[idx]
    Y0 = Y[idx]
    W0 = W[idx]

    # Number of observations
    N = len(X0)

    include_intercept = intercept

    # Degrees of freedom for the model
    if include_intercept:
        dfmod = 2
    else:
        dfmod = 1

    method = 'SMA'

    # Choose whether to use methods robust to outliers
    if robust:

        method = 'rSMA'

        # Choose the robust method
        if ((robust_method.lower() =='mcd') or (robust_method.lower() == 'fastmcd') ):
            # FAST MCD

            if not include_intercept:
                # intercept=False could possibly be supported by calculating
                # using mcd.support_ as weights in an explicit variance/covariance calculation
                raise NotImplementedError('FastMCD method only supports SMA with intercept')

            # Fit robust model of mean and covariance
            mcd = MinCovDet().fit( np.array([X0,Y0]).T )

            # Robust mean
            Xmean = mcd.location_[0]
            Ymean = mcd.location_[1]

            # Robust variance of X, Y
            Vx = mcd.covariance_[0,0]
            Vy = mcd.covariance_[1,1]

            # Robust covariance
            Vxy = mcd.covariance_[0,1]

            # Number of observations used in mean and covariance estimate
            # excludes observations marked as outliers
            N = mcd.support_.sum()

        elif ((robust_method.lower() =='biweight') or (robust_method.lower() == 'huber') ):

            # Tukey's Biweight and Huber's T
            if robust_method.lower()=='biweight':
                norm = norms.TukeyBiweight()
            else:
                norm = norms.HuberT()

            # Get weights for downweighting outliers
            # Fitting a linear model is the easiest way to get these
            # Options include "TukeyBiweight" (totally removes large deviates)
            # "HuberT" (linear, not squared weighting of large deviates)
            rweights = smf.rlm('y~x+1',{'x':X0,'y':Y0},M=norm).fit().weights

            # Sum of weights and weights squared, for convenience
            rsum = np.sum( rweights )
            rsum2 = np.sum( rweights**2 )

            # Mean
            Xmean = np.sum( X0 * rweights ) / rsum
            Ymean = np.sum( Y0 * rweights ) / rsum

            # Force intercept through zero, if requested
            if not include_intercept:
                Xmean = 0
                Ymean = 0

            # Variance & Covariance
            Vx = np.sum( (X0-Xmean)**2 * rweights**2 ) / rsum2
            Vy = np.sum( (Y0-Ymean)**2 * rweights**2 ) / rsum2
            Vxy = np.sum( (X0-Xmean) * (Y0-Ymean) * rweights**2 ) / rsum2

            # Effective number of observations
            N = rsum

        else:

            # '{:s}' is the valid format spec; the original '{:%s}' raised ValueError
            raise NotImplementedError("sma hasn't implemented robust_method={:s}".\
                                      format(robust_method))
    else:

        if include_intercept:

            # Sum weights of the NaN-filtered points (W0), matching the sums below
            wsum = np.sum(W0)

            # Average values
            Xmean = np.sum(X0 * W0) / wsum
            Ymean = np.sum(Y0 * W0) / wsum

            # Covariance matrix
            cov = np.cov( X0, Y0, ddof=1, aweights=W0**2 )

            # Variance
            Vx = cov[0,0]
            Vy = cov[1,1]

            # Covariance
            Vxy = cov[0,1]

        else:

            # Force the line to pass through origin by setting means to zero
            Xmean = 0
            Ymean = 0

            wsum = np.sum(W0)

            # Sum of squares in place of variance and covariance
            Vx = np.sum( X0**2 * W0 ) / wsum
            Vy = np.sum( Y0**2 * W0 ) / wsum
            Vxy= np.sum( X0*Y0 * W0 ) / wsum

    # Standard deviation
    Sx = np.sqrt( Vx )
    Sy = np.sqrt( Vy )

    # Correlation coefficient (equivalent to np.corrcoef()[1,0] for non-robust cases)
    R = Vxy / np.sqrt( Vx * Vy )

    #############
    # SLOPE

    Slope = np.sign(R) * Sy / Sx

    # Standard error of slope estimate
    ste_slope = np.sqrt( 1/(N-dfmod) * Sy**2 / Sx**2 * (1-R**2) )

    # Confidence interval for Slope
    B = (1-R**2)/(N-dfmod) * stats.f.isf(1-alpha, 1, N-dfmod)
    ci_grad = Slope * ( np.sqrt( B+1 ) + np.sqrt(B)*np.array([-1,+1]) )

    #############
    # INTERCEPT

    if include_intercept:
        Intercept = Ymean - Slope * Xmean

        # Standard deviation of residuals
        # New Method: Formula from smatr R package (Warton)
        # This formula avoids large residuals of outliers when using robust=True
        Sr = np.sqrt((Vy - 2 * Slope * Vxy + Slope**2 * Vx ) * (N-1) / (N-dfmod) )

        # OLD METHOD
        # Standard deviation of residuals
        #resid = Y0 - (Intercept + Slope * X0 )
        # Population standard deviation of the residuals
        #Sr = np.std( resid, ddof=0 )

        # Standard error of the intercept estimate
        ste_int = np.sqrt( Sr**2/N + Xmean**2 * ste_slope**2 )

        # Confidence interval for Intercept
        tcrit = stats.t.isf((1-alpha)/2,N-dfmod)
        ci_int = Intercept + ste_int * np.array([-tcrit,tcrit])

    else:

        # Set Intercept quantities to zero
        Intercept = 0
        ste_int = 0
        ci_int = np.array([0,0])

    result = dict( method = method,
                   fitintercept = include_intercept,
                   slope = Slope,
                   intercept = Intercept,
                   slope_ste = ste_slope,
                   intercept_ste = ste_int,
                   slope_interval = ci_grad,
                   intercept_interval = ci_int,
                   alpha = alpha,
                   df_model = dfmod,
                   df_resid = N-dfmod,
                   params = np.array([Slope,Intercept]),
                   nobs = N,
                   fittedvalues = Intercept + Slope * X0,
                   resid = Intercept + Slope * X0 - Y0 )

    # return Slope, Intercept, ste_slope, ste_int, ci_grad, ci_int
    return result

def york( x, y, err_x=1, err_y=1, rerr_xy=0 ):
    '''York regression accounting for error in x and y
    Follows the notation and algorithm of York et al. (2004) Section III

    Parameters
    ----------
    x, y : ndarray
        independent (x) and dependent (y) variables for fitting
    err_x, err_y : ndarray (default=1)
        standard deviation of errors/uncertainty in x and y
    rerr_xy : float (default=0)
        correlation coefficient for errors in x and y,
        default to rerr_xy=0 meaning that the errors in x are unrelated to errors in y
    err_x, err_y, and rerr_xy can be constants or arrays of the same length as x and y

    Returns
    -------
    fitresult : dict
        Contains the following keys:
        - slope (float)
            Slope or Gradient of Y vs. X
        - intercept (float)
            Y intercept.
        - slope_ste (float)
            Standard error of slope estimate
        - intercept_ste (float)
            standard error of intercept estimate
        - slope_interval ([float, float])
            confidence interval for gradient at confidence level alpha
        - intercept_interval ([float, float])
            confidence interval for intercept at confidence level alpha
        - alpha (float)
            confidence level [0,1] for slope and intercept intervals
        - df_model (float)
            degrees of freedom for model
        - df_resid (float)
            degrees of freedom for residuals
        - params ([float,float])
            array of fitted parameters
        - fittedvalues (ndarray)
            array of fitted values
        - resid (ndarray)
            array of residual values
    '''

    # relative error tolerance required for convergence
    rtol = 1e-15

    # Initial guess for slope, from ordinary least squares
    result = stats.linregress( x, y )
    b = result[0]

    # Weights for x and y
    wx = 1 / err_x**2
    wy = 1 / err_y**2

    # Combined weights
    alpha = np.sqrt( wx * wy )

    # Iterate until solution converges, but not more than 50 times
    maxiter=50
    for i in range(maxiter):

        # Weight for point i
        W = wx * wy / ( wx + b**2 * wy - 2 * b * rerr_xy * alpha )
        Wsum = np.sum( W )

        # Weighted means
        Xbar = np.sum( W * x ) / Wsum
        Ybar = np.sum( W * y ) / Wsum

        # Deviation from weighted means
        U = x - Xbar
        V = y - Ybar

        # parameter needed for slope
        beta = W * ( U / wy + b*V / wx - (b*U + V) * rerr_xy / alpha )

        # Update slope estimate
        bnew = np.sum( W * beta * V ) / np.sum( W * beta * U )

        # Break from loop if new value is very close to old value
        if np.abs( (bnew-b)/b ) < rtol:
            break
        else:
            b = bnew

    else:
        # Reached only when the loop finishes without a break.
        # The original test `i==maxiter` after `range(1,maxiter)` could never be true.
        raise ValueError( f'York regression failed to converge in {maxiter:d} iterations' )

    # Intercept
    a = Ybar - b * Xbar

    # least-squares adjusted points, expectation values of X and Y
    xa = Xbar + beta
    ya = Ybar + b*beta

    # Mean of adjusted points
    xabar = np.sum( W * xa ) / Wsum
    yabar = np.sum( W * ya ) / Wsum

    # Deviation of adjusted points from their means
    u = xa - xabar
    v = ya - yabar

    # Variance of slope and intercept estimates
    varb = 1 / np.sum( W * u**2 )
    vara = 1 / Wsum + xabar**2 * varb

    # Standard error of slope and intercept
    siga = np.sqrt( vara )
    sigb = np.sqrt( varb )

    # Define a named tuple type that will contain the results
    # result = namedtuple( 'result', 'slope intercept sigs sigi params sigma' )

    # Return results as a named tuple, User can access as a regular tuple too
    # return result( b, a, sigb, siga, [b,a], [sigb, siga] )

    dfmod = 2
    N = np.sum( ~np.isnan(x) * ~np.isnan(y) )

    # NOTE: `alpha` below is the combined weight sqrt(wx*wy) defined above,
    # not a confidence level; no confidence intervals are computed here.
    result = dict( method = 'York',
                   fitintercept = True,
                   slope = b,
                   intercept = a,
                   slope_ste = sigb,
                   intercept_ste = siga,
                   slope_interval = [None,None],
                   intercept_interval = [None,None],
                   alpha = alpha,
                   df_model = dfmod,
                   df_resid = N-dfmod,
                   params = np.array([b,a]),
                   nobs = N,
                   fittedvalues = a + b * x,
                   resid = a + b * x - y )

    return result

def sen( x, y, alpha=0.95, method='separate' ):
    '''Theil-Sen slope estimate

    This function wraps `scipy.stats.theilslopes` and provides
    results in the same dict format as the other line fitting methods
    in this module

    Parameters
    ----------
    x, y : ndarray
        independent (x) and dependent (y) variables for fitting
    alpha : float (default = 0.95)
        Desired confidence level [0,1] for output.
    method : {'separate' (default), 'joint'}
        Method for estimating intercept.
        - 'separate' uses np.median(y) - slope * np.median(x)
        - 'joint' uses np.median( y - slope * x )

    Returns
    -------
    fitresult : dict
        Contains the following keys:
        - slope (float)
            Slope or Gradient of Y vs. X
        - intercept (float)
            Y intercept.
        - slope_ste (float)
            Standard error of slope estimate
        - intercept_ste (float)
            standard error of intercept estimate
        - slope_interval ([float, float])
            confidence interval for gradient at confidence level alpha
        - intercept_interval ([float, float])
            confidence interval for intercept at confidence level alpha
        - alpha (float)
            confidence level [0,1] for slope and intercept intervals
        - df_model (float)
            degrees of freedom for model
        - df_resid (float)
            degrees of freedom for residuals
        - params ([float,float])
            array of fitted parameters
        - fittedvalues (ndarray)
            array of fitted values
        - resid (ndarray)
            array of residual values
    '''

    slope, intercept, low_slope, high_slope = theilslopes(y,x,alpha,method)

    dfmod = 2
    N = np.sum( ~np.isnan(x) * ~np.isnan(y) )

    result = dict( method = 'Theil-Sen',
                   fitintercept = True,
                   slope = slope,
                   intercept = intercept,
                   slope_ste = None,
                   intercept_ste = None,
                   slope_interval = [low_slope,high_slope],
                   intercept_interval = [None,None],
                   alpha = alpha,
                   df_model = dfmod,
                   df_resid = N-dfmod,
                   params = np.array([slope,intercept]),
                   nobs = N,
                   fittedvalues = intercept + slope * x,
                   resid = intercept + slope * x - y )

    return result

#@jit(nopython=True)
def sen_numba( x, y ):
    '''Estimate linear trend using the Theil-Sen method

    This non-parametric method finds the median slope among all
    combinations of time points.
    scipy.stats.theilslopes provides the same slope estimate, with
    confidence intervals. However, this function is faster for
    large datasets due to Numba

    Parameters
    ----------
    x : array_like (N,)
        independent variable
    y : array_like (N,)
        dependent variable

    Returns
    -------
    sen : float
        the median slope
    slopes : array (N*(N-1)/2,)
        slope estimates from all pairs of points in x and y
    '''

    with warnings.catch_warnings():
        warnings.simplefilter('always', DeprecationWarning)
        warnings.warn('Sen function is slow unless numba.jit is used. Use scipy.stats.theilslopes instead.',
                      DeprecationWarning, stacklevel=2)

    if len( x ) != len( y ):
        print('Inputs x and y must have same dimension')
        return np.nan

    # Find number of time points
    n = len( x )

    # Array to hold all slope estimates
    slopes = np.zeros( np.ceil( n * ( n-1 ) / 2 ).astype('int') )
    slopes[:] = np.nan

    count = 0

    for i in range(n):
        for j in range(i+1, n):

            # Slope between elements i and j
            slopeij = ( y[j] - y[i] ) / ( x[j] - x[i] )

            slopes[count] = slopeij

            count += 1

    # Theil-Sen estimate is the median slope, neglecting NaN
    sen = np.nanmedian( slopes )

    return sen, slopes
Write equation for the fitted line as a string
Parameters
- fitresult (dict): results of the line fit
- floatformat (str): format string for the numerical values (default='{:.3f}')
- ystring ({'include' (default), 'separate', 'none'}): specifies whether "y =" is included in the result, returned as a separate item in a tuple, or omitted
- include_error (bool): specifies whether uncertainty terms should be included in the equation
Returns
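- fitline_string (str): equation for the fitted line, in the form "y = a x + b" or "y = a x". If uncertainty terms are included, then "y = (a ± c) x + (b ± d)" or "y = (a ± c) x". The left-hand side is written as "y_" followed by the name of the fit method. With ystring='separate', the result is a tuple (lhs, rhs) rather than a single string.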
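The string layout described above can be sketched without installing the package. The helper `format_line` and the values in `fit` below are hypothetical stand-ins; only the dictionary keys (`method`, `fitintercept`, `slope`, `intercept`, `slope_ste`, `intercept_ste`) are taken from the source listing.

```python
# Hypothetical fit result using the keys read by bivariate_line_equation
fit = {'method': 'SMA', 'fitintercept': True,
       'slope': 1.23456, 'intercept': 0.5,
       'slope_ste': 0.1, 'intercept_ste': 0.25}

def format_line(fit, floatformat='{:.3f}', include_error=False):
    """Illustrative stand-in that mirrors the documented string layout."""
    def term(value, error):
        # "(a ± c)" when errors are requested, otherwise just "a"
        if include_error:
            return ('(' + floatformat + ' ± ' + floatformat + ')').format(value, error)
        return floatformat.format(value)

    rhs = term(fit['slope'], fit['slope_ste']) + ' x'
    if fit['fitintercept']:
        rhs += ' + ' + term(fit['intercept'], fit['intercept_ste'])
    return 'y_' + fit['method'] + ' = ' + rhs

print(format_line(fit))                      # y_SMA = 1.235 x + 0.500
print(format_line(fit, include_error=True))  # y_SMA = (1.235 ± 0.100) x + (0.500 ± 0.250)
```

Passing a different `floatformat`, for example `'{:.1e}'`, changes the number formatting of every term at once.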
Standard Major-Axis (SMA) line fitting
Calculate standard major axis, aka reduced major axis, fit to data X and Y. The main advantage of this over ordinary least squares is that the best fit of Y to X will be the same as the best fit of X to Y.
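The symmetry property can be checked numerically. The following is a minimal sketch of an unweighted, non-robust SMA fit, not the module's `sma` function; the helper name `sma_slope_intercept` and the sample data are assumptions made for illustration.

```python
import numpy as np

def sma_slope_intercept(x, y):
    """Hypothetical helper: unweighted SMA slope and intercept of y vs. x."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    r = np.corrcoef(x, y)[0, 1]
    # SMA slope is the ratio of standard deviations, signed by the correlation
    slope = np.sign(r) * np.std(y, ddof=1) / np.std(x, ddof=1)
    intercept = np.mean(y) - slope * np.mean(x)
    return slope, intercept

# Illustrative data (made up for this sketch)
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])

b_yx, a_yx = sma_slope_intercept(x, y)   # fit y = a_yx + b_yx * x
b_xy, a_xy = sma_slope_intercept(y, x)   # fit x = a_xy + b_xy * y

# Solving the second line for y gives y = -a_xy/b_xy + x/b_xy,
# which should coincide with the first line.
print(np.isclose(b_yx, 1.0 / b_xy), np.isclose(a_yx, -a_xy / b_xy))
```

Ordinary least squares does not have this property, because it minimizes residuals in one variable only.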
The fit equations and confidence intervals are implemented following Warton et al. (2006). Robust fits use the FastMCD covariance estimate from Rousseeuw and Van Driessen (1999). While there are many alternative robust covariance estimators (e.g. other papers by D.I. Warton using M-estimators), the FastMCD algorithm is the default in Matlab. When the standard error or uncertainty of each point is known, weighted SMA may be preferable to robust SMA. The conventional choice of weights for each point i is W_i = 1 / ( var(X_i) + var(Y_i) ), where var() is the variance (squared standard error).
References
Warton, D. I., Wright, I. J., Falster, D. S. and Westoby, M.: Bivariate line-fitting methods for allometry, Biol. Rev., 81(02), 259, doi:10.1017/S1464793106007007, 2006.
Rousseeuw, P. J. and Van Driessen, K.: A Fast Algorithm for the Minimum Covariance Determinant Estimator, Technometrics, 41(3), 1999.
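As a small illustration of the conventional weights, the sketch below builds W from per-point standard errors. The arrays `x_err` and `y_err` are made-up values, and the resulting `W` is what one would pass as the `W` argument.

```python
import numpy as np

# Hypothetical standard errors for four (X, Y) points
x_err = np.array([0.1, 0.2, 0.1, 0.4])
y_err = np.array([0.3, 0.1, 0.2, 0.3])

# W_i = 1 / (var(X_i) + var(Y_i)), with variance = squared standard error
W = 1.0 / (x_err**2 + y_err**2)

# Points with larger combined uncertainty receive smaller weight
print(W)
```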
Parameters
- X, Y (array_like or str): Input values. Must have the same length.
- W (array_like or str, optional): array of weights for each X-Y point, typically W_i = 1/(var(X_i)+var(Y_i))
- data (dict_like, optional): data structure containing variables. Used when X, Y, or W are str.
- alpha (float (default = 0.95)): Desired confidence level [0,1] for output.
- intercept (bool, default=True): Specify if the fitted model should include a non-zero intercept. The model will be forced through the origin (0,0) if intercept=False.
- robust (bool, default=False): Use statistical methods that are robust to the presence of outliers
- robust_method ({'FastMCD' (default), 'Huber', 'Biweight'}):
Method for calculating robust variance and covariance. Options:
- 'MCD' or 'FastMCD' for Fast MCD
- 'Huber' for Huber's T: reduce, not eliminate, influence of outliers
- 'Biweight' for Tukey's Biweight: reduces then eliminates influence of outliers
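The difference between "reduce" and "reduce then eliminate" is visible in the weight each method assigns to a scaled residual. This is a sketch of the textbook Huber and Tukey biweight weight functions, not code from this module; the function names are hypothetical and the tuning constants 1.345 and 4.685 are the commonly used defaults, assumed here.

```python
import numpy as np

def huber_weight(r, t=1.345):
    """Huber weight: 1 inside the threshold, then decays as t/|r| (never zero)."""
    r = np.abs(np.asarray(r, dtype=float))
    return np.where(r <= t, 1.0, t / np.maximum(r, t))

def biweight_weight(r, c=4.685):
    """Tukey biweight: smooth decay to exactly zero at |r| = c and beyond."""
    r = np.asarray(r, dtype=float)
    return np.where(np.abs(r) <= c, (1.0 - (r / c) ** 2) ** 2, 0.0)

# Scaled residuals: a typical point, a moderate outlier, an extreme outlier
resid = np.array([0.0, 2.69, 10.0])
w_huber = huber_weight(resid)
w_biweight = biweight_weight(resid)
print(w_huber, w_biweight)
```

The extreme outlier keeps a small positive weight under Huber but is discarded entirely under the biweight.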
Returns
- fitresult (dict):
Contains the following keys:
- slope (float) Slope or Gradient of Y vs. X
- intercept (float) Y intercept.
- slope_ste (float) Standard error of slope estimate
- intercept_ste (float) standard error of intercept estimate
- slope_interval ([float, float]) confidence interval for gradient at confidence level alpha
- intercept_interval ([float, float]) confidence interval for intercept at confidence level alpha
- alpha (float) confidence level [0,1] for slope and intercept intervals
- df_model (float) degrees of freedom for model
- df_resid (float) degrees of freedom for residuals
- params ([float,float]) array of fitted parameters
- fittedvalues (ndarray) array of fitted values
- resid (ndarray) array of residual values
- method (str) name of the fit method
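For reference, the slope and intercept statistics computed in the source listing above can be written compactly. This is a transcription of that code, with shorthand introduced here: ν = N − df_model, f_crit is the upper (1 − α) critical value of the F(1, ν) distribution, and t_crit is the upper (1 − α)/2 critical value of Student's t with ν degrees of freedom, where α is the confidence level.

```latex
\begin{aligned}
\hat{b} &= \operatorname{sign}(r)\,\frac{s_y}{s_x}, &
\mathrm{SE}(\hat{b}) &= \sqrt{\frac{1}{\nu}\,\frac{s_y^2}{s_x^2}\,(1-r^2)}, \\
B &= \frac{1-r^2}{\nu}\, f_{\mathrm{crit}}, &
\hat{b}_{\pm} &= \hat{b}\left(\sqrt{B+1} \pm \sqrt{B}\right), \\
\hat{a} &= \bar{Y} - \hat{b}\,\bar{X}, &
\mathrm{SE}(\hat{a}) &= \sqrt{\frac{s_r^2}{N} + \bar{X}^2\,\mathrm{SE}(\hat{b})^2}, \\
s_r^2 &= \left(s_y^2 - 2\hat{b}\,s_{xy} + \hat{b}^2 s_x^2\right)\frac{N-1}{\nu}, &
\hat{a}_{\pm} &= \hat{a} \pm t_{\mathrm{crit}}\,\mathrm{SE}(\hat{a}).
\end{aligned}
```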
Alias for sma
def sen( x, y, alpha=0.95, method='separate' ):
    '''Theil-Sen slope estimate

    This function wraps `scipy.stats.theilslopes` and provides
    results in the same dict format as the other line fitting methods
    in this module

    Parameters
    ----------
    x, y : ndarray
        independent (x) and dependent (y) variables for fitting
    alpha : float (default = 0.95)
        Desired confidence level [0,1] for output.
    method : {'separate' (default), 'joint'}
        Method for estimating intercept.
        - 'separate' uses np.median(y) - slope * np.median(x)
        - 'joint' uses np.median( y - slope * x )

    Returns
    -------
    fitresult : dict
        Contains the following keys:
        - slope (float)
            Slope or Gradient of Y vs. X
        - intercept (float)
            Y intercept.
        - slope_ste (float)
            Standard error of slope estimate
        - intercept_ste (float)
            standard error of intercept estimate
        - slope_interval ([float, float])
            confidence interval for gradient at confidence level alpha
        - intercept_interval ([float, float])
            confidence interval for intercept at confidence level alpha
        - alpha (float)
            confidence level [0,1] for slope and intercept intervals
        - df_model (float)
            degrees of freedom for model
        - df_resid (float)
            degrees of freedom for residuals
        - params ([float,float])
            array of fitted parameters
        - fittedvalues (ndarray)
            array of fitted values
        - resid (ndarray)
            array of residual values
    '''

    # Note argument order: theilslopes takes y first, then x
    slope, intercept, low_slope, high_slope = theilslopes(y,x,alpha,method)

    dfmod = 2
    # Number of points where both x and y are not NaN
    N = np.sum( ~np.isnan(x) * ~np.isnan(y) )

    # Standard errors and the intercept interval are not computed (None)
    result = dict( method = 'Theil-Sen',
                   fitintercept = True,
                   slope = slope,
                   intercept = intercept,
                   slope_ste = None,
                   intercept_ste = None,
                   slope_interval = [low_slope,high_slope],
                   intercept_interval = [None,None],
                   alpha = alpha,
                   df_model = dfmod,
                   df_resid = N-dfmod,
                   params = np.array([slope,intercept]),
                   nobs = N,
                   fittedvalues = intercept + slope * x,
                   resid = intercept + slope * x - y )

    return result
Theil-Sen slope estimate
This function wraps scipy.stats.theilslopes and provides results in the same dict format as the other line fitting methods in this module.
Parameters
- x, y (ndarray): independent (x) and dependent (y) variables for fitting
- alpha (float (default = 0.95)): Desired confidence level [0,1] for output.
- method ({'separate' (default), 'joint'}):
Method for estimating intercept.
- 'separate' uses np.median(y) - slope * np.median(x)
- 'joint' uses np.median( y - slope * x )
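The two intercept options can give different answers when outliers are present. The sketch below computes both by hand on made-up data using only NumPy and itertools; it does not call `sen` or `scipy.stats.theilslopes`.

```python
import numpy as np
from itertools import combinations

# Illustrative data with one large outlier in y (made up for this sketch)
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([1.0, 3.0, 4.0, 7.0, 30.0])

# Theil-Sen slope: median of the slopes between all pairs of points
pair_slopes = [(y[j] - y[i]) / (x[j] - x[i])
               for i, j in combinations(range(len(x)), 2)]
slope = np.median(pair_slopes)

# 'separate': medians of x and y are taken independently
intercept_separate = np.median(y) - slope * np.median(x)

# 'joint': median of the per-point intercepts y_i - slope * x_i
intercept_joint = np.median(y - slope * x)

print(slope, intercept_separate, intercept_joint)
```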
Returns
- fitresult (dict):
Contains the following keys:
- slope (float) Slope or Gradient of Y vs. X
- intercept (float) Y intercept.
- slope_ste (None) Standard error of slope estimate; not computed by this function, always None
- intercept_ste (None) standard error of intercept estimate; not computed by this function, always None
- slope_interval ([float, float]) confidence interval for gradient at confidence level alpha
- intercept_interval ([None, None]) confidence interval for intercept; not computed by this function
- alpha (float) confidence level [0,1] for slope and intercept intervals
- df_model (float) degrees of freedom for model
- df_resid (float) degrees of freedom for residuals
- params ([float,float]) array of fitted parameters
- fittedvalues (ndarray) array of fitted values
- resid (ndarray) array of residual values
Alias for sen
def sen_numba( x, y ):
    '''Estimate linear trend using the Theil-Sen method

    This non-parametric method finds the median slope among all
    pairs of points.
    scipy.stats.theilslopes provides the same slope estimate, with
    confidence intervals. This function is only fast when compiled
    with numba.jit, which is currently disabled in this module, so
    scipy.stats.theilslopes is recommended instead.

    Parameters
    ----------
    x : array_like (N,)
        independent variable
    y : array_like (N,)
        dependent variable

    Returns
    -------
    sen : float
        the median slope
    slopes : array (N*(N-1)/2,)
        slope estimates from all pairs of points (i,j) with i < j
    '''

    with warnings.catch_warnings():
        warnings.simplefilter('always', DeprecationWarning)
        warnings.warn('Sen function is slow unless numba.jit is used. Use scipy.stats.theilslopes instead.',
                    DeprecationWarning, stacklevel=2)

    # Raise instead of printing and returning a bare NaN,
    # so callers that unpack two return values do not fail later
    if len( x ) != len( y ):
        raise ValueError('Inputs x and y must have same dimension')

    # Find number of points
    n = len( x )

    # Array to hold all slope estimates, one per pair (i,j) with i < j
    slopes = np.full( n * ( n-1 ) // 2, np.nan )

    count = 0

    for i in range(n):
        for j in range(i+1, n):

            # Slope between elements i and j
            slopeij = ( y[j] - y[i] ) / ( x[j] - x[i] )

            slopes[count] = slopeij

            count += 1

    # Theil-Sen estimate is the median slope, neglecting NaN
    sen = np.nanmedian( slopes )

    return sen, slopes
Estimate linear trend using the Theil-Sen method
This non-parametric method finds the median slope among all pairs of points. scipy.stats.theilslopes provides the same slope estimate, with confidence intervals. This function is only faster for large datasets when compiled with numba.jit, which is disabled in this module; as written it emits a DeprecationWarning recommending scipy.stats.theilslopes instead.
Parameters
- x (array_like (N,)): independent variable
- y (array_like (N,)): dependent variable
Returns
- sen (float): the median slope
- slopes (array (N*(N-1)/2,)): slope estimates from all pairs of points (i, j) with i < j
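The double loop over pairs can also be expressed without explicit Python loops. This is a sketch of an alternative using `np.triu_indices`, not the module's implementation; the helper name `pairwise_slopes` is hypothetical.

```python
import numpy as np

def pairwise_slopes(x, y):
    """Hypothetical helper: slopes between all pairs (i, j) with i < j."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Index pairs strictly above the diagonal: N*(N-1)/2 of them
    i, j = np.triu_indices(len(x), k=1)
    return (y[j] - y[i]) / (x[j] - x[i])

x = np.arange(6, dtype=float)
y = 2.0 * x + 1.0            # exact line, so every pairwise slope is 2
slopes = pairwise_slopes(x, y)
sen_estimate = np.nanmedian(slopes)
print(slopes.size, sen_estimate)
```

The index arrays need memory proportional to N*(N-1)/2, so this trades memory for speed on large inputs.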
def york( x, y, err_x=1, err_y=1, rerr_xy=0 ):
    '''York regression accounting for error in x and y
    Follows the notation and algorithm of York et al. (2004) Section III

    Parameters
    ----------
    x, y : ndarray
        independent (x) and dependent (y) variables for fitting
    err_x, err_y : ndarray (default=1)
        standard deviation of errors/uncertainty in x and y
    rerr_xy : float (default=0)
        correlation coefficient for errors in x and y,
        default to rerr_xy=0 meaning that the errors in x are unrelated to errors in y
        err_x, err_y, and rerr_xy can be constants or arrays of the same length as x and y

    Returns
    -------
    fitresult : dict
        Contains the following keys:
        - slope (float)
            Slope or Gradient of Y vs. X
        - intercept (float)
            Y intercept.
        - slope_ste (float)
            Standard error of slope estimate
        - intercept_ste (float)
            standard error of intercept estimate
        - slope_interval ([float, float])
            confidence interval for gradient at confidence level alpha
        - intercept_interval ([float, float])
            confidence interval for intercept at confidence level alpha
        - alpha (float)
            confidence level [0,1] for slope and intercept intervals
        - df_model (float)
            degrees of freedom for model
        - df_resid (float)
            degrees of freedom for residuals
        - params ([float,float])
            array of fitted parameters
        - fittedvalues (ndarray)
            array of fitted values
        - resid (ndarray)
            array of residual values
    '''

    # relative error tolerance required for convergence
    rtol = 1e-15

    # Initial guess for slope, from ordinary least squares
    result = stats.linregress( x, y )
    b = result[0]

    # Weights for x and y
    wx = 1 / err_x**2
    wy = 1 / err_y**2

    # Combined weights (York's alpha, not a confidence level)
    alpha = np.sqrt( wx * wy )

    # Iterate until solution converges, but not more than 50 times
    maxiter=50
    for i in range(maxiter):

        # Weight for point i
        W = wx * wy / ( wx + b**2 * wy - 2 * b * rerr_xy * alpha )
        Wsum = np.sum( W )

        # Weighted means
        Xbar = np.sum( W * x ) / Wsum
        Ybar = np.sum( W * y ) / Wsum

        # Deviation from weighted means
        U = x - Xbar
        V = y - Ybar

        # parameter needed for slope
        beta = W * ( U / wy + b*V / wx - (b*U + V) * rerr_xy / alpha )

        # Update slope estimate
        bnew = np.sum( W * beta * V ) / np.sum( W * beta * U )

        # Break from loop if new value is very close to old value
        if np.abs( (bnew-b)/b ) < rtol:
            break
        b = bnew

    else:
        # Reached only if the loop never hit `break`.
        # The previous `if i==maxiter` test could never be true.
        raise ValueError( f'York regression failed to converge in {maxiter:d} iterations' )

    # Intercept
    a = Ybar - b * Xbar

    # least-squares adjusted points, expectation values of X and Y
    xa = Xbar + beta
    ya = Ybar + b*beta

    # Mean of adjusted points
    xabar = np.sum( W * xa ) / Wsum
    yabar = np.sum( W * ya ) / Wsum

    # Deviation of adjusted points from their means
    u = xa - xabar
    v = ya - yabar

    # Variance of slope and intercept estimates
    varb = 1 / np.sum( W * u**2 )
    vara = 1 / Wsum + xabar**2 * varb

    # Standard error of slope and intercept
    siga = np.sqrt( vara )
    sigb = np.sqrt( varb )

    # Define a named tuple type that will contain the results
    # result = namedtuple( 'result', 'slope intercept sigs sigi params sigma' )

    # Return results as a named tuple, User can access as a regular tuple too
    # return result( b, a, sigb, siga, [b,a], [sigb, siga] )

    dfmod = 2
    N = np.sum( ~np.isnan(x) * ~np.isnan(y) )

    # No confidence intervals are computed, so there is no confidence level
    # to report. Previously `alpha` here held the combined weights.
    result = dict( method = 'York',
                   fitintercept = True,
                   slope = b,
                   intercept = a,
                   slope_ste = sigb,
                   intercept_ste = siga,
                   slope_interval = [None,None],
                   intercept_interval = [None,None],
                   alpha = None,
                   df_model = dfmod,
                   df_resid = N-dfmod,
                   params = np.array([b,a]),
                   nobs = N,
                   fittedvalues = a + b * x,
                   resid = a + b * x - y )

    return result
York regression accounting for error in x and y
Follows the notation and algorithm of York et al. (2004) Section III.
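The quantities iterated in the source listing above can be summarized as follows. This is a transcription of that code into notation, with ω(X_i) = 1/err_x² and ω(Y_i) = 1/err_y² the per-point weights, r_i the error correlation rerr_xy, and b the current slope estimate; the update for b is repeated until it stops changing.

```latex
\begin{aligned}
\alpha_i &= \sqrt{\omega(X_i)\,\omega(Y_i)}, \qquad
W_i = \frac{\omega(X_i)\,\omega(Y_i)}{\omega(X_i) + b^2\,\omega(Y_i) - 2\,b\,r_i\,\alpha_i}, \\
\bar{X} &= \frac{\sum_i W_i X_i}{\sum_i W_i}, \qquad
\bar{Y} = \frac{\sum_i W_i Y_i}{\sum_i W_i}, \qquad
U_i = X_i - \bar{X}, \qquad V_i = Y_i - \bar{Y}, \\
\beta_i &= W_i\left[\frac{U_i}{\omega(Y_i)} + \frac{b\,V_i}{\omega(X_i)} - \left(b\,U_i + V_i\right)\frac{r_i}{\alpha_i}\right], \\
b &\leftarrow \frac{\sum_i W_i\,\beta_i\,V_i}{\sum_i W_i\,\beta_i\,U_i}, \qquad
a = \bar{Y} - b\,\bar{X}.
\end{aligned}
```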
Parameters
- x, y (ndarray): independent (x) and dependent (y) variables for fitting
- err_x, err_y (ndarray (default=1)): standard deviation of errors/uncertainty in x and y
- rerr_xy (float (default=0)): correlation coefficient for errors in x and y. The default rerr_xy=0 means that the errors in x are unrelated to the errors in y. err_x, err_y, and rerr_xy can be constants or arrays of the same length as x and y.
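To show how the error parameters enter the fit, here is a condensed sketch of the slope iteration. It is not the module's `york` function: the name `york_sketch`, the looser tolerance, the `np.polyfit` starting guess, and the sample data are all assumptions, and standard errors are omitted.

```python
import numpy as np

def york_sketch(x, y, err_x=1.0, err_y=1.0, r=0.0, rtol=1e-12, maxiter=200):
    """Hypothetical helper: slope and intercept by York-style iteration."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Weights are inverse variances; scalars are broadcast to every point
    wx = 1.0 / np.broadcast_to(np.asarray(err_x, dtype=float), x.shape) ** 2
    wy = 1.0 / np.broadcast_to(np.asarray(err_y, dtype=float), x.shape) ** 2
    alpha = np.sqrt(wx * wy)

    b = np.polyfit(x, y, 1)[0]          # ordinary least squares starting guess
    for _ in range(maxiter):
        W = wx * wy / (wx + b**2 * wy - 2.0 * b * r * alpha)
        Xbar = np.sum(W * x) / np.sum(W)
        Ybar = np.sum(W * y) / np.sum(W)
        U = x - Xbar
        V = y - Ybar
        beta = W * (U / wy + b * V / wx - (b * U + V) * r / alpha)
        b_new = np.sum(W * beta * V) / np.sum(W * beta * U)
        converged = abs(b_new - b) <= rtol * abs(b_new)
        b = b_new
        if converged:
            break
    else:
        raise RuntimeError("slope iteration did not converge")
    return b, Ybar - b * Xbar

# Made-up data; equal, uncorrelated errors in x and y
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([0.9, 3.2, 4.8, 7.1, 9.0, 10.9])
slope, intercept = york_sketch(x, y)
print(slope, intercept)
```

With equal, uncorrelated errors the iteration's fixed point is the major-axis (orthogonal regression) slope, which gives a closed-form value to compare against.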
Returns
- fitresult (dict):
Contains the following keys:
- slope (float) Slope or Gradient of Y vs. X
- intercept (float) Y intercept.
- slope_ste (float) Standard error of slope estimate
- intercept_ste (float) standard error of intercept estimate
- slope_interval ([None, None]) confidence interval for gradient; not computed by this function
- intercept_interval ([None, None]) confidence interval for intercept; not computed by this function
- alpha (float) confidence level [0,1] for slope and intercept intervals
- df_model (float) degrees of freedom for model
- df_resid (float) degrees of freedom for residuals
- params ([float,float]) array of fitted parameters
- fittedvalues (ndarray) array of fitted values
- resid (ndarray) array of residual values