Eric Guo's

Hoping to write JS, Ruby & Rails, and Go articles, but falling back to DevOps notes

Frequently Used R Package List and Install Hint


Confirmed to work on Ubuntu 14.04 / 12.04.

  • knitr - A general-purpose package for dynamic report generation in R
  • rmarkdown - Dynamic Documents for R
  • SixSigma - Six Sigma Tools for Quality and Process Improvement
  • sqldf - Perform SQL Selects on R Data Frames
  • tseries - Time series analysis and computational finance
  • quantmod - Quantitative Financial Modeling Framework
  • doBy - Groupwise summary statistics, general linear contrasts, population means (least-squares-means), and other utilities
  • rJava - Low-Level R to Java Interface to support RWeka and xlsx
Run this before installing rJava:
R CMD javareconf
  • xlsx - Read, write, format Excel 2007 and Excel 97/2000/XP/2003 files
  • RCurl - General network (HTTP/FTP/…) client interface for R
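To install the RCurl package, first install the libcurl development headers:
apt-get install libcurl4-openssl-dev # libcurl-dev is a virtual package, so name a concrete provider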
  • devtools - Tools to make developing R code easier
  • gplots - Various R programming tools for plotting data
  • ggmap - A package for spatial visualization with Google Maps and OpenStreetMap
  • googleVis - Interface between R and the Google Chart Tools
  • rworldmap - Mapping global data, vector and raster.
  • rgdal - Bindings for the Geospatial Data Abstraction Library
To install the rgdal package, first install:
apt-get install libgdal1-dev
apt-get install libproj-dev # or libproj if not found
  • ROracle - Oracle database interface (DBI) driver for R.

    ROracle needs an Oracle Client installation and some additional setup steps.

  • pls - Multivariate regression methods Partial Least Squares Regression (PLSR), Principal Component Regression (PCR) and Canonical Powered Partial Least Squares (CPPLS).

  • assertthat - An extension to stopifnot() that makes it easy to declare the pre- and post-conditions that your code should satisfy.

  • dplyr - A fast, consistent tool for working with data frame like objects, both in memory and out of memory.

  • data.table - Fast aggregation of large data (e.g. 100GB in RAM), fast ordered joins, fast add/modify/delete of columns by group using no copies at all, list columns and a fast file reader (fread). Offers a natural and flexible syntax, for faster development.

  • earth - Build regression models using the techniques in Friedman’s papers “Fast MARS” and “Multivariate Adaptive Regression Splines”.

  • kernlab - Kernel-based machine learning methods for classification, regression, clustering, novelty detection, quantile regression and dimensionality reduction. Among other methods kernlab includes Support Vector Machines, Spectral Clustering, Kernel PCA, Gaussian Processes and a QP solver.

  • caret - Classification and Regression Training: miscellaneous functions for training and plotting classification and regression models.

  • rpart - Recursive Partitioning and Regression Trees

  • party - A Laboratory for Recursive Partytioning

  • RWeka - An R interface to Weka (Version 3.7.12). Weka is a collection of machine learning algorithms for data mining tasks written in Java, containing tools for data pre-processing, classification, regression, clustering, association rules, and visualization.

  • ipred - Improved predictive models by indirect classification and bagging for classification, regression and survival problems as well as resampling based estimators of prediction error.

  • randomForest - Breiman and Cutler’s random forests for classification and regression

  • gbm - Generalized Boosted Regression Models

  • Cubist - Regression modeling using rules with added instance-based corrections

  • VGAM - Vector Generalized Linear and Additive Models. An implementation of about 6 major classes of statistical regression models. At the heart of it are the vector generalized linear and additive model (VGLM/VGAM) classes. Currently only fixed-effects models are implemented, i.e., no random-effects models. Many (150+) models and distributions are estimated by maximum likelihood estimation (MLE) or penalized MLE, using Fisher scoring.

  • mda - Mixture and flexible discriminant analysis, multivariate adaptive regression splines (MARS), BRUTO…

  • klaR - Miscellaneous functions for classification and visualization developed at the Fakultaet Statistik, Technische Universitaet Dortmund

  • e1071 - Misc Functions of the Department of Statistics, Probability Theory Group (Formerly: E1071), TU Wien

  • C50 - C5.0 decision trees and rule-based models for pattern recognition.

  • ROCR - Visualizing the Performance of Scoring Classifiers

  • ISLR - Data for An Introduction to Statistical Learning with Applications in R
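
The hints above amount to: install the system libraries, reconfigure Java, then install the R packages. Here is a rough shell sketch of that order. The helper names (apt_install_cmd, r_install_expr), the -y flag, and the CRAN mirror URL are my own assumptions and are not part of the original notes. The functions only print commands, so nothing is installed until you run the output yourself.

```shell
#!/bin/sh
# Sketch only: both helpers are hypothetical, not part of R or apt.

# Print (do not run) the apt-get command for the given system libraries.
# The -y flag is an assumption for unattended installs.
apt_install_cmd() {
  printf 'apt-get install -y %s\n' "$*"
}

# Build an R install.packages() call from a list of package names.
# The repos URL is an assumed CRAN mirror; swap in your own.
r_install_expr() {
  quoted=""
  for pkg in "$@"; do
    # Append each name as a quoted, comma-separated R string.
    quoted="${quoted:+$quoted, }\"$pkg\""
  done
  printf 'install.packages(c(%s), repos = "https://cloud.r-project.org")\n' "$quoted"
}

# Usage (not executed here), following the order of the hints:
#   sudo $(apt_install_cmd libgdal1-dev libproj-dev)
#   sudo R CMD javareconf
#   Rscript -e "$(r_install_expr rJava xlsx rgdal)"
```

Keeping the helpers print-only makes it easy to review the exact commands before running them with root privileges.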