Forest covertypes
The samples in this dataset correspond to 30×30m patches of forest in the US, collected for the task of predicting each patch’s cover type, i.e. the dominant species of tree. There are seven covertypes, making this a multiclass classification problem. Each sample has 54 features, described on the dataset’s homepage. Some of the features are boolean indicators, while others are discrete or continuous measurements.
Data Set Characteristics:
Classes | 7
Samples total | 581012
Dimensionality | 54
Features | int
sklearn.datasets.fetch_covtype() loads the covertype dataset.
It returns a dictionary-like “Bunch” object
with the feature matrix in the data member
and the target values in target. If the optional argument as_frame is
set to True, data is returned as a pandas DataFrame and target as a
pandas Series, and the object has an additional member, frame.
The dataset is downloaded from the web if necessary.
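
As a minimal sketch of how the loader described above might be used, the snippet below fetches the dataset and reports its shape and per-class sample counts. The helper names summarize and load_covtype_summary are hypothetical and chosen only for illustration; fetch_covtype and its as_frame argument come from scikit-learn. It assumes NumPy and scikit-learn are installed and that the machine can reach the network on the first call.

```python
import numpy as np


def summarize(data, target):
    """Return sample count, feature count, and per-class counts.

    Hypothetical helper (not part of scikit-learn); it accepts anything
    that np.asarray can convert, including pandas objects.
    """
    X = np.asarray(data)
    y = np.asarray(target)
    labels, counts = np.unique(y, return_counts=True)
    return {
        "n_samples": X.shape[0],
        "n_features": X.shape[1],
        "class_counts": dict(zip(labels.tolist(), counts.tolist())),
    }


def load_covtype_summary(as_frame=False):
    """Fetch the covertype dataset and summarize it.

    Hypothetical wrapper; the first call downloads the data if it is
    not already cached locally.
    """
    # Imported here so summarize() can be used without scikit-learn.
    from sklearn.datasets import fetch_covtype

    covtype = fetch_covtype(as_frame=as_frame)
    return summarize(covtype.data, covtype.target)
```

Calling load_covtype_summary() should report the sample count and dimensionality listed in the table above, along with one entry per covertype in class_counts. Passing as_frame=True exercises the pandas code path and should give the same summary.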