When we work with a new dataset, we should LOOK at the data first. What format is it in? What are its dimensions? What are the variable names? How are the variables stored? Is there missing data? Are there any flaws?
FIRST, we look at the class of the data:
> class(plants)
[1] "data.frame"
It's very common for data to be stored in data frames. In fact, it's the default class for data read into R with functions like read.csv() and read.table().
The class also tells us the data is rectangular (2D): it fits neatly into rows and columns. dim() gives the exact dimensions:
> dim(plants)
[1] 5166 10
So that's 5166 rows (observations) and 10 columns (variables). Rows come first, columns second.
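To try the same checks without the plants data, here is a small sketch on a made-up CSV. The names csv_text and toy are my own and hypothetical, and the three rows are only there for illustration. It reads the CSV from a string, then checks the class and the dimensions.

```r
# Hypothetical three-row CSV, read from a string instead of a file
csv_text <- "Scientific_Name,pH_Min,pH_Max
Abies balsamea,4,6
Zizania aquatica,6.4,7.4
Zoysia japonica,NA,NA"

toy <- read.csv(text = csv_text)

class(toy)  # "data.frame", the default for read.csv()
dim(toy)    # 3 3 (rows first, then columns)
nrow(toy)   # 3
ncol(toy)   # 3
```

nrow() and ncol() are handy when you only need one of the two numbers.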
If you're curious how much memory the dataset takes up:
> object.size(plants)
745944 bytes
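Raw bytes are hard to read, so here is a small sketch of printing the size in friendlier units. The data frame toy is a made-up stand-in (hypothetical, not the plants data), so no particular size is shown.

```r
# Hypothetical stand-in data frame; the real plants data is not needed here
toy <- data.frame(id = 1:1000, value = rnorm(1000))

size <- object.size(toy)

format(size, units = "Kb")    # size in kilobytes, as a string
format(size, units = "auto")  # let R pick a sensible unit
```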
To get the column names:
> names(plants)
[1] "Scientific_Name" "Duration" "Active_Growth_Period" "Foliage_Color" "pH_Min" "pH_Max"
[7] "Precip_Min" "Precip_Max" "Shade_Tolerance" "Temp_Min_F"
Now, you obviously can't look at the whole thing at once, so just peek at the top of it with head():
> head(plants)
               Scientific_Name          Duration Active_Growth_Period Foliage_Color pH_Min pH_Max Precip_Min Precip_Max Shade_Tolerance Temp_Min_F
1                  Abelmoschus              <NA>                 <NA>          <NA>     NA     NA         NA         NA            <NA>         NA
2       Abelmoschus esculentus Annual, Perennial                 <NA>          <NA>     NA     NA         NA         NA            <NA>         NA
3                        Abies              <NA>                 <NA>          <NA>     NA     NA         NA         NA            <NA>         NA
4               Abies balsamea         Perennial    Spring and Summer         Green      4      6         13         60        Tolerant        -43
5 Abies balsamea var. balsamea         Perennial                 <NA>          <NA>     NA     NA         NA         NA            <NA>         NA
6                     Abutilon              <NA>                 <NA>          <NA>     NA     NA         NA         NA            <NA>         NA
By default head() shows the first 6 rows. Use head(plants, 10) to get the first 10 rows instead.
If you want to preview the end of the dataset, use tail():
> tail(plants, 15) #default is 6 rows
                      Scientific_Name  Duration Active_Growth_Period Foliage_Color pH_Min pH_Max Precip_Min Precip_Max Shade_Tolerance Temp_Min_F
5152                          Zizania      <NA>                 <NA>          <NA>     NA     NA         NA         NA            <NA>         NA
5153                 Zizania aquatica    Annual               Spring         Green    6.4    7.4         30         50      Intolerant         32
5154   Zizania aquatica var. aquatica    Annual                 <NA>          <NA>     NA     NA         NA         NA            <NA>         NA
5155                Zizania palustris    Annual                 <NA>          <NA>     NA     NA         NA         NA            <NA>         NA
5156 Zizania palustris var. palustris    Annual                 <NA>          <NA>     NA     NA         NA         NA            <NA>         NA
5157                      Zizaniopsis      <NA>                 <NA>          <NA>     NA     NA         NA         NA            <NA>         NA
5158             Zizaniopsis miliacea Perennial    Spring and Summer         Green    4.3    9.0         35         70      Intolerant         12
5159                            Zizia      <NA>                 <NA>          <NA>     NA     NA         NA         NA            <NA>         NA
5160                     Zizia aptera Perennial                 <NA>          <NA>     NA     NA         NA         NA            <NA>         NA
5161                      Zizia aurea Perennial                 <NA>          <NA>     NA     NA         NA         NA            <NA>         NA
5162                 Zizia trifoliata Perennial                 <NA>          <NA>     NA     NA         NA         NA            <NA>         NA
5163                          Zostera      <NA>                 <NA>          <NA>     NA     NA         NA         NA            <NA>         NA
5164                   Zostera marina Perennial                 <NA>          <NA>     NA     NA         NA         NA            <NA>         NA
5165                           Zoysia      <NA>                 <NA>          <NA>     NA     NA         NA         NA            <NA>         NA
5166                  Zoysia japonica Perennial                 <NA>          <NA>     NA     NA         NA         NA            <NA>         NA
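Here is a quick sketch of how the second argument to head() and tail() behaves, using a made-up 20-row data frame (toy is a hypothetical name, not part of the plants data). A negative n means "everything except that many rows".

```r
# Hypothetical 20-row data frame just for counting rows
toy <- data.frame(id = 1:20, letter = letters[1:20])

nrow(head(toy))       # 6, the default
nrow(head(toy, 10))   # 10
nrow(tail(toy, 3))    # 3
nrow(head(toy, -5))   # 15, all but the last 5 rows
tail(toy, 3)$id       # 18 19 20
```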
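NOW, summary() is the really good one: a single call summarizes every column in the dataset. For numeric columns it gives the min, quartiles, median, mean, max, and the number of NAs. For character columns it only gives the length, class, and mode.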
> summary(plants)
 Scientific_Name   Duration          Active_Growth_Period  Foliage_Color     pH_Min         pH_Max          Precip_Min     Precip_Max
 Length:5166       Length:5166       Length:5166           Length:5166       Min.   :3.000  Min.   : 5.100  Min.   : 4.00  Min.   : 16.00
 Class :character  Class :character  Class :character      Class :character  1st Qu.:4.500  1st Qu.: 7.000  1st Qu.:16.75  1st Qu.: 55.00
 Mode  :character  Mode  :character  Mode  :character      Mode  :character  Median :5.000  Median : 7.300  Median :28.00  Median : 60.00
                                                                             Mean   :4.997  Mean   : 7.344  Mean   :25.57  Mean   : 58.73
                                                                             3rd Qu.:5.500  3rd Qu.: 7.800  3rd Qu.:32.00  3rd Qu.: 60.00
                                                                             Max.   :7.000  Max.   :10.000  Max.   :60.00  Max.   :200.00
                                                                             NA's   :4327   NA's   :4327    NA's   :4338   NA's   :4338
 Shade_Tolerance   Temp_Min_F
 Length:5166       Min.   :-79.00
 Class :character  1st Qu.:-38.00
 Mode  :character  Median :-33.00
                   Mean   :-22.53
                   3rd Qu.:-18.00
                   Max.   : 52.00
                   NA's   :4328
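Since summary() only reports Length/Class/Mode for character columns, here is a small sketch of two follow-up checks that could fill the gap: counting NAs per column and tabulating a character column. The data frame toy and its values are made up (hypothetical), and only the column names are borrowed from plants.

```r
# Hypothetical four-row data frame with some missing values
toy <- data.frame(
  Duration = c("Annual", "Perennial", NA, "Perennial"),
  pH_Min   = c(4, NA, 6.4, NA),
  Temp     = c(-43, NA, 32, 12),
  stringsAsFactors = FALSE
)

# Number of missing values in every column, character ones included
na_counts <- colSums(is.na(toy))
na_counts   # Duration 1, pH_Min 2, Temp 1

# Counts per value for a character column, with NA as its own category
dur_table <- table(toy$Duration, useNA = "ifany")
dur_table

# Compact overview of the column types and first few values
str(toy)
```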