RHadoop supports all R packages

Install R and RHadoop (rhdfs / rmr2 / rhbase / RHive) on Cloudera Hadoop CDH

From: http://www.geedoo.info/installed-on-the-cloudera-hadoop-cdh-r-and-rhadoop-rhdfs-rmr2-rhbase-rhive.html

Preface: RHadoop is an open-source project started by Revolution Analytics that combines the statistical language R with Hadoop. The project currently includes three R packages: rmr, for writing MapReduce applications in R; rhdfs, for accessing HDFS from R; and rhbase, for accessing HBase from R.

First, system and required software versions

Server operating system: CentOS 6.3

R version: R-2.15.3 (I first tried the latest R 3 release but ran into various incompatibilities, so I chose the latest R 2 release instead.)

Download address: http://ftp.ctex.org/mirrors/CRAN/src/base/R-2/R-2.15.3.tar.gz

Cloudera Hadoop CDH Version: 4.4.0

JDK version: 1.6.0_31

Use the free installer cloudera-manager-installer.bin from Cloudera Manager to install both CDH and the JDK. Refer to CDH Installation for more information.

Download address: https://ccp.cloudera.com/display/SUPPORT/Cloudera+Manager+Free+Edition+Download

rJava (an interface between R and Java: it lets R call Java, and its bundled JRI lets Java call R; it can be installed from CRAN) version: rJava_0.9-5

Download address: http://www.rforge.net/src/contrib/rJava_0.9-5.tar.gz

The RHadoop packages used are the latest official releases. The project page (https://github.com/RevolutionAnalytics) provides the following:

  • rmr-2.2.2
  • rhdfs-1.0.6
  • rhbase-1.2.0

Download address: https://github.com/RevolutionAnalytics/RHadoop/wiki/Downloads

Documentation: https://github.com/RevolutionAnalytics/RHadoop/wiki
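Once the components are installed, a host can be checked against the versions listed above. The following is a minimal sketch, not part of the original procedure: the helper name has_version is made up for illustration, and it does a loose substring match only (so "6.3" would also match "6.30").

```shell
#!/bin/sh
# Hypothetical helper: succeed if the text in $2 mentions the version in $1.
# This is a plain substring match, not a real version comparison.
has_version() {
    case "$2" in
        *"$1"*) return 0 ;;
        *)      return 1 ;;
    esac
}

# Example checks (commented out so that sourcing this file has no side effects):
# has_version "2.15.3"   "$(R --version 2>&1 | head -n 1)"    || echo "unexpected R version"
# has_version "1.6.0_31" "$(java -version 2>&1 | head -n 1)"  || echo "unexpected JDK version"
# has_version "6.3"      "$(cat /etc/redhat-release)"         || echo "unexpected CentOS release"
```

Note that java -version writes to standard error, which is why the examples redirect 2>&1 before reading the first line.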

Second, install the dependencies (R and the rJava package)

Before installing RHadoop, you must install R and the rJava package on every host in the cluster, one host at a time; only then can the RHadoop packages be installed. The specific installation steps are as follows:

1. Install R

Before compiling R, install the following packages with yum:

# yum install gcc-gfortran

Otherwise configure reports the error "configure: error: No F77 compiler found"

# yum install gcc gcc-c++

Otherwise configure reports the error "configure: error: C++ preprocessor "/lib/cpp" fails sanity check"

# yum install readline-devel

Otherwise configure reports the error "configure: error: --with-readline=yes (default) and headers/libs are not available"

# yum install libXt-devel

Otherwise configure reports the error "configure: error: --with-x=yes (default) and X11 headers/libs are not available"
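The mapping from configure errors to yum packages described above can be written down as a small lookup. This is only a sketch: the function name suggest_pkg is made up for illustration, and the patterns assume configure's usual wording for these four messages.

```shell
#!/bin/sh
# Hypothetical helper: print the yum package(s) this article installs to fix
# a given configure error. Returns non-zero for messages it does not know.
suggest_pkg() {
    case "$1" in
        *"No F77 compiler found"*)  echo "gcc-gfortran" ;;
        *"C++ preprocessor"*)       echo "gcc gcc-c++" ;;
        *"--with-readline=yes"*)    echo "readline-devel" ;;
        *"--with-x=yes"*)           echo "libXt-devel" ;;
        *)                          return 1 ;;
    esac
}

# Example use after a failed configure run (commented out on purpose):
# msg=$(./configure 2>&1 | grep 'configure: error')
# pkg=$(suggest_pkg "$msg") && echo "try: yum install $pkg"
```

Since configure stops at the first failing check, a run like this surfaces one missing package at a time; installing all four packages up front avoids the repeated attempts.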

Then download and compile the source code:

# wget http://cran.rstudio.com/src/base/R-2/R-2.15.3.tar.gz

# tar -zxvf R-2.15.3.tar.gz

# cd R-2.15.3

# ./configure --prefix=/usr --disable-nls --enable-R-shlib

(The last two options, --disable-nls and --enable-R-shlib, are needed for the later RHive installation. If you do not plan to install RHive, you can omit them.)

# make

# make install
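Because the same build has to be repeated on every host, the download and compile steps above can be collected into one script. The following is a rough sketch under stated assumptions: the function names r_configure_args and build_r and the yes/no RHive switch are illustrative, not part of the original procedure, and the script is meant to be run as root.

```shell
#!/bin/sh
# Hypothetical helper: print the configure options. The two extra flags are
# only needed when RHive will be installed later.
r_configure_args() {
    if [ "$1" = "yes" ]; then
        echo "--prefix=/usr --disable-nls --enable-R-shlib"
    else
        echo "--prefix=/usr"
    fi
}

# Hypothetical helper: download, unpack, configure, build and install R 2.15.3.
# The body runs in a subshell so the caller's working directory is unchanged,
# and the && chain stops at the first step that fails.
# The unquoted $(...) is intentional: the options must be split into words.
build_r() (
    wget http://cran.rstudio.com/src/base/R-2/R-2.15.3.tar.gz &&
    tar -zxvf R-2.15.3.tar.gz &&
    cd R-2.15.3 &&
    ./configure $(r_configure_args "${1:-yes}") &&
    make &&
    make install
)

# Example (commented out so that sourcing this file does not start a build):
# build_r yes   # with the RHive-related options
# build_r no    # without them
```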

 
