R workshop: day #2

Mike Hammond
U. of Arizona
  1. Objects
    1. ls() — list currently loaded objects.
    2. rm(object) — delete a currently loaded object.
    3. summary(object) — print a short summary of an object (for numbers: min, quartiles, mean, max).
    4. str(object) — show the internal structure of an object: its type, size, and first few values.
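
The object commands above can be tried on a couple of throwaway objects. This is only a sketch: the names x and g and the values in them are made up for illustration and are not the workshop data.

```r
# Hypothetical objects, invented for this example.
x <- c(2, 4, 6, 8)
g <- data.frame(gender = c("f", "m", "f"), age = c(20, 22, 21))

ls()         # names of the objects in the current workspace
summary(x)   # min, quartiles, mean and max of a numeric vector
str(g)       # structure: one line per column with its type and first values

rm(x)        # remove x from the workspace
ls()         # x is no longer listed
```

Note that rm() gives no warning and cannot be undone, so the object has to be recreated or reloaded if it is needed again.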
  2. Accessing a dataframe
    1. names(g) — gets the names of all columns in a dataframe.
    2. g$gender — access a column by name.
    3. g[,1] — access a column by number.
    4. g[,'age'] — access a column by name.
    5. g[1,] — access a row.
    6. g[4,2] — access a cell.
    7. g[3:6,1:2] — access a block of cells.
    8. Subsetting: return specific rows and/or columns.
      g[g$gender=='f',]
      g[g$weight < 160,]
      g[g$gender=='m' & g$weight < 160,]
      g[g$gender=='f',2]
      g[g$gender=='f','major']
      
    9. attach(object) — make the parts of some currently loaded object directly available, e.g. g$gender is available as gender after attach(g).
    10. detach(object) — make the parts unavailable again, except with the dataframe prefix, e.g. g$gender.
    11. search() — examine what objects are attached.
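
A runnable sketch of the access patterns above. The workshop's g is not shown here, so this dataframe is invented; only the column names gender, age, weight and major follow the examples, and their order is an assumption.

```r
# Hypothetical data, invented for illustration.
g <- data.frame(
  gender = c("f", "m", "f", "m", "f", "m"),
  age    = c(19, 22, 20, 25, 21, 23),
  weight = c(130, 155, 142, 180, 125, 165),
  major  = c("ling", "math", "ling", "cs", "math", "ling"),
  stringsAsFactors = FALSE
)

names(g)       # column names
g$gender       # a column by name
g[, 2]         # a column by position (here: age)
g[4, 2]        # one cell: row 4, column 2
g[3:6, 1:2]    # rows 3 to 6 of columns 1 to 2

# Subsetting: the test goes in the row slot, and the comma stays even
# when the column slot is left empty (meaning "all columns").
light_men <- g[g$gender == "m" & g$weight < 160, ]
f_majors  <- g[g$gender == "f", "major"]

# attach() puts the columns on the search path; detach() removes them.
attach(g)
mean_w <- mean(weight)   # same as mean(g$weight) while g is attached
detach(g)
search()                 # g is no longer listed
```

Forgetting the comma, as in g[g$gender == "f"], asks for columns rather than rows and usually ends in an error.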
  3. Basic statistics for vectors
    1. mean() — mean.
    2. sd() — standard deviation.
    3. var() — variance.
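
A small worked example of the three functions, using an invented vector w. In R, var() and sd() are the sample versions, dividing by n - 1 rather than n; the last lines check this by hand.

```r
# Hypothetical numbers, invented for illustration.
w <- c(2, 4, 4, 4, 5, 5, 7, 9)

mean(w)   # 5
var(w)    # 32/7, about 4.57
sd(w)     # square root of var(w), about 2.14

# The same variance by hand: squared deviations from the mean,
# summed and divided by n - 1.
by_hand <- sum((w - mean(w))^2) / (length(w) - 1)
```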
  4. Manipulating parts of dataframes
    1. tapply() — apply a function to a vector, group by group, where the groups are the levels of a factor.
      tapply(weight,gender,mean)
      tapply(weight,gender,length)
      
    2. t() — transpose a dataframe or matrix; the result is a matrix.
    3. transform() — return a copy of a dataframe with columns added or changed.
      transform(g,age=age+weight)
      
    4. aggregate() — compute a summary for each group within a dataframe.
      aggregate(g,list(gender),length)
      
    5. as.factor() — convert a vector of numbers or strings into a factor.
    6. is.factor() — test if something is a factor.
    7. cut(vector,n) — break a vector of numbers into a factor with n levels (equal-width intervals).
      tapply(age,cut(weight,2),mean)
      
  5. Gotchas
    1. Input file format: text file, tabs, missing fields, column issues.
    2. Dataframe columns vs. rows, vector vs. factor.
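
A sketch of how these gotchas can show up in practice. The tab-separated text below is invented and is read from a string rather than a file, so the example runs on its own; with a real file, the file name would go in place of the text argument.

```r
# Hypothetical tab-separated input; the second data row has an empty age field.
raw <- "gender\tage\tweight\nf\t19\t130\nm\t\t155\nf\t20\t142\n"

d <- read.table(text = raw, header = TRUE, sep = "\t",
                stringsAsFactors = TRUE)

d$age                      # the empty field is read as NA
mean(d$age)                # NA, because of the missing value
mean(d$age, na.rm = TRUE)  # 19.5, ignoring the missing value

# Rows and columns behave differently: a row is still a dataframe,
# a single column comes back as a plain vector (or a factor).
is.data.frame(d[1, ])      # TRUE
is.data.frame(d[, 2])      # FALSE
is.factor(d$gender)        # TRUE, because of stringsAsFactors = TRUE

# Vector vs. factor: as.numeric() on a factor returns the level codes,
# not the original numbers.
f <- factor(c("10", "20", "10"))
as.numeric(f)                 # 1 2 1
as.numeric(as.character(f))   # 10 20 10
```

Running str() on a freshly loaded dataframe is a quick way to catch a column that was read as a factor or as text when numbers were expected.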