"""Statewide Crime Data"""
from statsmodels.datasets import utils as du

__docformat__ = 'restructuredtext'

COPYRIGHT = """Public domain."""
TITLE = """Statewide Crime Data 2009"""
SOURCE = """
All data is for 2009 and was obtained from the American Statistical Abstracts except as indicated below.
"""

DESCRSHORT = """State crime data 2009"""

DESCRLONG = DESCRSHORT

# suggested notes

NOTE = """::

    Number of observations: 51
    Number of variables: 8
    Variable name definitions:

    state
        All 50 states plus DC.
    violent
        Rate of violent crimes / 100,000 population. Includes murder, forcible
        rape, robbery, and aggravated assault. Numbers for Illinois and
        Minnesota do not include forcible rapes. Footnote included with the
        American Statistical Abstract table reads:
        "The data collection methodology for the offense of forcible
        rape used by the Illinois and the Minnesota state Uniform Crime
        Reporting (UCR) Programs (with the exception of Rockford, Illinois,
        and Minneapolis and St. Paul, Minnesota) does not comply with
        national UCR guidelines. Consequently, their state figures for
        forcible rape and violent crime (of which forcible rape is a part)
        are not published in this table."
    murder
        Rate of murders / 100,000 population.
    hs_grad
        Percent of population with a high school diploma or higher.
    poverty
        Percent of individuals below the poverty line.
    white
        Percent of population that is one race - white only. From 2009 American
        Community Survey.
    single
        Calculated from 2009 1-year American Community Survey obtained
        from Census. Variable is Male householder, no wife present, family
        household combined with Female householder, no husband present, family
        household, divided by the total number of Family households.
    urban
        Percent of population in Urbanized Areas as of 2010 Census. Urbanized
        Areas are areas of 50,000 or more people."""


def load(as_pandas=None):
    """
    Load the statecrime data and return a Dataset class instance.

    Parameters
    ----------
    as_pandas : bool
        Flag indicating whether to return pandas DataFrames and Series
        or numpy recarrays and arrays. If True, returns pandas.

    Returns
    -------
    Dataset instance:
        See DATASET_PROPOSAL.txt for more information.
    """
    return du.as_numpy_dataset(load_pandas(), as_pandas=as_pandas,
                               retain_index=True)

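
def load_pandas():
    """Load the statecrime data and return a Dataset with pandas objects."""
    data = _get_data()
    # endog_idx, exog_idx and index_idx are positions in the CSV column order.
    return du.process_pandas(data, endog_idx=2, exog_idx=[7, 4, 3, 5],
                             index_idx=0)


def _get_data():
    """Read statecrime.csv from the directory containing this module."""
    return du.load_csv(__file__, 'statecrime.csv')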
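

# Illustrative sketch only, not part of statsmodels. The integer arguments
# passed to du.process_pandas in load_pandas() are positions in the CSV
# column order, which makes the call hard to read on its own. The helper
# below (a hypothetical name) resolves such positions to column names in
# plain Python. The column order used in the usage comment is an ASSUMPTION;
# check the header of statecrime.csv for the real order.
def _describe_indices(columns, endog_idx, exog_idx, index_idx):
    """Map positional indices to column names (hypothetical helper)."""
    return {
        "index": columns[index_idx],
        "endog": columns[endog_idx],
        "exog": [columns[i] for i in exog_idx],
    }


# Usage with an assumed column order:
#
#     assumed_columns = ["state", "violent", "murder", "hs_grad",
#                        "poverty", "single", "white", "urban"]
#     _describe_indices(assumed_columns, endog_idx=2,
#                       exog_idx=[7, 4, 3, 5], index_idx=0)
#
# Under that assumption, "murder" would be the endogenous variable and
# "state" the index.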