Installing Development Tools on Hadoop

What Is It About?

This guide is a continuation of a previous guide. Its goal is to describe how to set up a comfortable development environment for Hadoop in a Linux virtual machine with HDP 2.2.

Linux Stuff

Adding Users to Sudoers

Run the commands

chmod 640 /etc/sudoers

gedit /etc/sudoers &

and find the line

root    ALL=(ALL)     ALL

Add one more line below it:

Ihor_Bobak    ALL=(ALL)     ALL

This will allow the user to launch commands with administrator privileges.
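As an alternative to editing /etc/sudoers by hand, the same rule can be placed in a separate drop-in file. The sketch below is only an illustration: write_sudoers_dropin is a hypothetical helper name, and it assumes that your /etc/sudoers contains the line "#includedir /etc/sudoers.d". The demo writes into a temporary directory, so nothing under /etc is touched.

```shell
# Sketch: write a sudoers drop-in for one user instead of editing /etc/sudoers.
# write_sudoers_dropin is a hypothetical helper; the target directory is a
# parameter so that the function can be tried on a scratch directory first.
write_sudoers_dropin() {
    user=$1
    dir=$2
    file="$dir/$user"
    # Same rule as the line added to /etc/sudoers in the text above.
    printf '%s    ALL=(ALL)     ALL\n' "$user" > "$file"
    # Keep the file read-only for owner and group.
    chmod 0440 "$file"
    echo "$file"
}

# Dry run against a temporary directory.
scratch=$(mktemp -d)
dropin=$(write_sudoers_dropin Ihor_Bobak "$scratch")
cat "$dropin"

# On the real machine (as root), assuming the includedir line is present:
#   write_sudoers_dropin Ihor_Bobak /etc/sudoers.d
#   visudo -c      # check the syntax before you log out
```

Checking the syntax with visudo -c before closing the root session is worth the extra step, because a broken sudoers file can lock you out of sudo.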

VNC Server

You will need a VNC server if you are installing everything on a remote machine and will do your development on it. If this is a local VM, you don't need it.

Run the following commands:

yum install tigervnc-server

sudo chkconfig vncserver on

gedit /etc/sysconfig/vncservers &

 

Add the following lines to the end:

VNCSERVERS="1:Ihor_Bobak 2:Someone_Else"

VNCSERVERARGS[1]="-geometry 1400x900"

VNCSERVERARGS[2]="-geometry 1400x900"

Save the file. Then switch to the user and set a VNC password:

su - Ihor_Bobak

vncpasswd

Then start the server and check that it is listening:

service vncserver start

netstat -antl | grep 590

It should give you the following output:

tcp 0 0 0.0.0.0:5901 0.0.0.0:* LISTEN

Now install UltraVNC on your client machine and connect to yourhostname:5901. You should see the desktop in the VNC window.
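Each VNC display N listens on TCP port 5900 + N, which is why display 1 shows up as port 5901 in the netstat output. The following sketch prints the expected port for every entry of a VNCSERVERS value; vnc_ports is a hypothetical helper name, not part of tigervnc.

```shell
# Sketch: list the TCP port that each configured VNC display should listen on.
# vnc_ports is a hypothetical helper; its argument is the value of VNCSERVERS.
vnc_ports() {
    # Word splitting on the unquoted $1 is intended: one entry per display.
    for entry in $1; do
        display=${entry%%:*}   # part before the colon, e.g. 1
        user=${entry#*:}       # part after the colon, e.g. Ihor_Bobak
        echo "$((5900 + display)) $user"
    done
}

vnc_ports "1:Ihor_Bobak 2:Someone_Else"
```

With the configuration from this guide it prints 5901 for Ihor_Bobak and 5902 for Someone_Else, so the second user would connect to yourhostname:5902.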

Development Software

Java

Java was installed by Ambari Server (JDK 1.7.0_67). We need to make JAVA_HOME visible to every user and put Java's bin directory on the path.

Run the commands:

cd /etc/profile.d

gedit java.sh

and put the following lines into the file that you opened:

export JAVA_HOME=/usr/jdk64/jdk1.7.0_67

export PATH=$PATH:/usr/jdk64/jdk1.7.0_67/bin

Save the file.
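If you prefer not to use an editor, the same file can be generated from the command line. This is a minimal sketch: write_java_profile is a hypothetical helper, and the demo writes to a temporary directory instead of /etc/profile.d so that it can be tried safely.

```shell
# Sketch: generate the profile script without an editor.
# write_java_profile is a hypothetical helper; both paths are parameters.
write_java_profile() {
    jdk_dir=$1
    profile_dir=$2
    # \$PATH and \$JAVA_HOME are escaped so that they are expanded at login
    # time, not while the file is being written.
    cat > "$profile_dir/java.sh" <<EOF
export JAVA_HOME=$jdk_dir
export PATH=\$PATH:\$JAVA_HOME/bin
EOF
}

# Dry run: write to a temporary directory and source the result.
scratch=$(mktemp -d)
write_java_profile /usr/jdk64/jdk1.7.0_67 "$scratch"
. "$scratch/java.sh"
echo "$JAVA_HOME"

# On the real machine (as root):
#   write_java_profile /usr/jdk64/jdk1.7.0_67 /etc/profile.d
```

Writing the file this way also avoids the backup file that gedit leaves next to it.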

But this is not all. Run

java -version

The output will say that this is OpenJDK, but I need Oracle JDK for one reason: in my experience OpenJDK is buggy.

Run the command

which java

It will show you the path from which java is running. In my case this is /usr/bin/java

Run

ls -l /usr/bin/java

In my case the output shows that this link points to

lrwxrwxrwx. 1 root root 22 May 15 04:38 /usr/bin/java -> /etc/alternatives/java

Now let us do

ls -l /etc/alternatives/java

The output shows where this link points:

lrwxrwxrwx. 1 root root 46 May 15 04:38 /etc/alternatives/java -> /usr/lib/jvm/jre-1.7.0-openjdk.x86_64/bin/java

So, here is how to fix it: we simply need to replace this link with one that points to the necessary JRE:

unlink /etc/alternatives/java

ln -s /usr/jdk64/jdk1.7.0_67/bin/java /etc/alternatives/java

And here you are:

[root@epmbigdata devtools]# java -version

java version "1.7.0_67"

Java(TM) SE Runtime Environment (build 1.7.0_67-b01)

Java HotSpot(TM) 64-Bit Server VM (build 24.65-b04, mixed mode)

This is exactly what we needed.
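The manual steps above (follow the chain of links, then repoint the last one) can be rehearsed on a scratch directory before touching /etc/alternatives. In this sketch show_link_chain is a hypothetical helper, the directory layout only imitates the real one, and the link targets are assumed to be absolute paths, as they are in the output above.

```shell
# Sketch: follow a chain of symlinks one hop at a time, then repoint a link.
# show_link_chain is a hypothetical helper; it assumes absolute link targets.
show_link_chain() {
    path=$1
    echo "$path"
    while [ -L "$path" ]; do
        path=$(readlink "$path")
        echo " -> $path"
    done
}

# Imitate /usr/bin/java -> /etc/alternatives/java -> OpenJDK in a temp dir.
scratch=$(mktemp -d)
mkdir -p "$scratch/openjdk" "$scratch/oracle" "$scratch/alternatives" "$scratch/bin"
touch "$scratch/openjdk/java" "$scratch/oracle/java"
ln -s "$scratch/openjdk/java" "$scratch/alternatives/java"
ln -s "$scratch/alternatives/java" "$scratch/bin/java"
show_link_chain "$scratch/bin/java"

# Repoint the middle link in one step: -f replaces the existing link,
# -n stops ln from following it.
ln -sfn "$scratch/oracle/java" "$scratch/alternatives/java"
resolved=$(readlink -f "$scratch/bin/java")
echo "$resolved"
```

Using ln -sfn replaces the unlink and ln -s pair with a single command. Keep in mind that a link changed by hand may be rewritten later if the alternatives tool is run for java.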

Maven

Go to the page https://maven.apache.org/download.cgi and download apache-maven-3.3.3-bin.tar.gz (the filename may differ depending on the current version).

cd /devtools

wget http://ftp.byfly.by/pub/apache.org/maven/maven-3/3.3.3/binaries/apache-maven-3.3.3-bin.tar.gz

tar -xvf apache-maven-3.3.3-bin.tar.gz

rm apache-maven-3.3.3-bin.tar.gz

Do the same thing as with Java: put a file maven.sh into the directory /etc/profile.d with these contents:

export M2_HOME=/devtools/apache-maven-3.3.3

export PATH=$PATH:/devtools/apache-maven-3.3.3/bin

Remove the file maven.sh~, which gedit creates as a backup!
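Since the version number appears in the archive name, the download URL and M2_HOME, it can help to derive all three from one value when a newer Maven is released. This is only a sketch: maven_vars is a hypothetical helper, and the archive.apache.org URL layout is my assumption (it is not the mirror used above), so check it before relying on it.

```shell
# Sketch: derive the archive name, download URL and M2_HOME from one version.
# maven_vars is a hypothetical helper; the URL layout is an assumption.
maven_vars() {
    version=$1
    prefix=$2
    archive="apache-maven-$version-bin.tar.gz"
    echo "ARCHIVE=$archive"
    echo "URL=https://archive.apache.org/dist/maven/maven-3/$version/binaries/$archive"
    echo "M2_HOME=$prefix/apache-maven-$version"
}

maven_vars 3.3.3 /devtools
```

The M2_HOME line it prints for version 3.3.3 and prefix /devtools matches the value used in maven.sh above.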

IntelliJ

Go to https://www.jetbrains.com/idea/download/ and get the latest version of IntelliJ IDEA Community Edition.

Put the file into /devtools and run the command

tar -xvf ideaIC-14.1.3.tar.gz

and delete the file ideaIC-14.1.3.tar.gz

If you have a BGR display, open this file in an editor:

gedit /devtools/idea-IC-141.1010.3/bin/idea64.vmoptions &

and change the last lines to

-Dawt.useSystemAAFontSettings=lcd_hbgr

-Dswing.aatext=true

-Dsun.java2d.xrender=false
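The same change can be scripted so that it is safe to repeat after an IDE upgrade. In the sketch below set_vmoption is a hypothetical helper, and the initial file content in the demo is made up for illustration; your real idea64.vmoptions will contain other lines.

```shell
# Sketch: set a -D option in a vmoptions file, replacing any earlier value.
# set_vmoption is a hypothetical helper name.
set_vmoption() {
    key=$1
    value=$2
    file=$3
    # Copy every line except an existing setting for this key. "--" stops grep
    # from reading the leading "-D" as its own flag; "|| true" covers the case
    # where grep selects no lines at all.
    grep -vF -- "$key=" "$file" > "$file.tmp" || true
    printf '%s\n' "$key=$value" >> "$file.tmp"
    # Write back through the original file to keep its owner and permissions.
    cat "$file.tmp" > "$file"
    rm "$file.tmp"
}

# Dry run on a made-up sample file in a temporary directory.
scratch=$(mktemp -d)
opts="$scratch/idea64.vmoptions"
printf '%s\n' '-Xmx750m' '-Dawt.useSystemAAFontSettings=lcd' > "$opts"
set_vmoption -Dawt.useSystemAAFontSettings lcd_hbgr "$opts"
set_vmoption -Dswing.aatext true "$opts"
set_vmoption -Dsun.java2d.xrender false "$opts"
cat "$opts"
```

Running the three calls a second time leaves the file unchanged apart from line order, so the helper can be rerun without piling up duplicate options.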

SquirrelSQL

Download this file

https://drive.google.com/file/d/0B3DMXMfcPWF3ck1MMkEtU01yOWc/view?usp=sharing

and unpack to /devtools.

Edit squirrel-sql.sh and make two changes in these lines (the paths shown are from my machine, so adjust them to your own JDK and unpack location):

IZPACK_JAVA_HOME=/usr/jdk64/jdk1.7.0_45/jre

And

if $macosx ; then

    SQUIRREL_SQL_HOME='/home/pdi/DevTools/squirrel-sql-3.6/Contents/Resources/Java'

else

    SQUIRREL_SQL_HOME='/home/pdi/DevTools/squirrel-sql-3.6'

fi

In SquirrelSQL, add the following driver:

Add to the “Extra Class Path” all the jar files from these directories:

/usr/hdp/2.2.4.2-2/hadoop

/usr/hdp/2.2.4.2-2/hadoop/lib

/usr/hdp/2.2.4.2-2/hive/lib
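To see exactly which jar files that means, you can list them from the shell first. This is a small sketch: list_jars is a hypothetical helper, and the demo runs on made-up files in a temporary directory rather than on /usr/hdp.

```shell
# Sketch: list the jar files that sit directly in the given directories.
# list_jars is a hypothetical helper name.
list_jars() {
    for dir in "$@"; do
        # -maxdepth 1 keeps find out of subdirectories, so hadoop/lib is not
        # listed twice when both hadoop and hadoop/lib are passed.
        find "$dir" -maxdepth 1 -name '*.jar' | sort
    done
}

# Dry run on a made-up layout.
scratch=$(mktemp -d)
mkdir -p "$scratch/hadoop/lib" "$scratch/hive/lib"
touch "$scratch/hadoop/a.jar" "$scratch/hadoop/lib/b.jar" \
      "$scratch/hive/lib/c.jar" "$scratch/hive/lib/readme.txt"
jars=$(list_jars "$scratch/hadoop" "$scratch/hadoop/lib" "$scratch/hive/lib")
echo "$jars"

# On the real machine:
#   list_jars /usr/hdp/2.2.4.2-2/hadoop /usr/hdp/2.2.4.2-2/hadoop/lib /usr/hdp/2.2.4.2-2/hive/lib
```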

Then add a new alias:

Now you can connect to the Hive server, run SQL queries and browse the objects:

 

MySQL Workbench

Go to http://dev.mysql.com/downloads/repo/yum/ and download the rpm file for your platform (in my case I have CentOS 6.6, therefore I download "Red Hat Enterprise Linux 6").

Install this rpm file.

As a result, you get two files in /etc/yum.repos.d whose names begin with "mysql". Your repository is set up. Now run the command:

yum repolist enabled | grep "mysql.*-community.*"

The output should look like this:

mysql-connectors-community MySQL Connectors Community 14

mysql-tools-community MySQL Tools Community 23

mysql56-community MySQL 5.6 Community Server 146
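If you set up several machines, this check can be scripted instead of read by eye. In the sketch below check_repos is a hypothetical helper that reads the repolist output on standard input; the demo feeds it the sample output from above, so it runs without yum.

```shell
# Sketch: verify that the expected repository ids appear in a repolist listing.
# check_repos is a hypothetical helper; it reads the listing on stdin and
# returns non-zero if any expected id is missing.
check_repos() {
    listing=$(cat)
    missing=0
    for repo in "$@"; do
        # The repository id is the first column of each line.
        if ! printf '%s\n' "$listing" | grep -q "^$repo[[:space:]]"; then
            echo "missing: $repo"
            missing=1
        fi
    done
    return $missing
}

# Dry run on the sample output from this guide.
check_repos mysql-connectors-community mysql-tools-community mysql56-community <<'EOF'
mysql-connectors-community MySQL Connectors Community 14
mysql-tools-community MySQL Tools Community 23
mysql56-community MySQL 5.6 Community Server 146
EOF
echo "exit status: $?"

# On the real machine:
#   yum repolist enabled | check_repos mysql-connectors-community mysql-tools-community mysql56-community
```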

Now run the following command:

yum install mysql-workbench-community

Run the command

service mysql start

It will tell you that the MySQL service has started.

Go to the top menu Applications -> Programming -> MySQL Workbench:

It should tell you that it successfully established a connection to MySQL.

Press OK and connect to this MySQL instance. Run the following SQL statements:

use hive;

select * from VERSION;

Office Software

Midnight Commander

yum install mc

EPEL Repository

yum install epel-release

FileZilla FTP Client

You may want a better FTP client to exchange files with an external host.

yum install filezilla

Configuration of the connection on the Linux machine:

 

The settings on the Windows side are as follows. In the FileZilla Server interface:

A firewall is set up in my case, therefore:

Press “advanced settings” and add the following rule:

And in the FileZilla Server configuration, configure it in the following way:

Now check the connection in the FileZilla server:

FTP Server on Linux

Run the following commands:

sudo yum install vsftpd

gedit /etc/vsftpd/vsftpd.conf &

Set the following line: anonymous_enable=NO

Uncomment the line

chroot_local_user=YES

Save the file. Run the commands

sudo service vsftpd restart

chkconfig vsftpd on

Then you can connect to this Linux machine from any other machine using FTP or FTPS with your Linux login and password.
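The two edits to vsftpd.conf can also be made without an editor. This is a minimal sketch: set_conf is a hypothetical helper that replaces a key=value line whether or not it is commented out, and the demo works on a made-up sample file rather than on the real /etc/vsftpd/vsftpd.conf.

```shell
# Sketch: set key=value in a config file, uncommenting the line if needed.
# set_conf is a hypothetical helper name.
set_conf() {
    key=$1
    value=$2
    file=$3
    if grep -q "^#*[[:space:]]*$key=" "$file"; then
        # Replace the existing line, commented out or not.
        sed "s|^#*[[:space:]]*$key=.*|$key=$value|" "$file" > "$file.tmp"
        cat "$file.tmp" > "$file"
        rm "$file.tmp"
    else
        # The key is not in the file yet, so append it.
        printf '%s\n' "$key=$value" >> "$file"
    fi
}

# Dry run on a made-up sample file.
scratch=$(mktemp -d)
conf="$scratch/vsftpd.conf"
printf '%s\n' 'anonymous_enable=YES' '#chroot_local_user=YES' 'local_enable=YES' > "$conf"
set_conf anonymous_enable NO "$conf"
set_conf chroot_local_user YES "$conf"
cat "$conf"

# On the real machine (as root), followed by the restart shown above:
#   set_conf anonymous_enable NO /etc/vsftpd/vsftpd.conf
#   set_conf chroot_local_user YES /etc/vsftpd/vsftpd.conf
```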

LibreOffice

Download LibreOffice from https://www.libreoffice.org/download/libreoffice-fresh/. You will get a file

LibreOffice_4.4.3_Linux_x86-64_rpm.tar.gz (the name may differ depending on the current version).

Run the following commands:

tar -xvf LibreOffice_4.4.3_Linux_x86-64_rpm.tar.gz

cd LibreOffice_4.4.3.2_Linux_x86-64_rpm

yum localinstall RPMS/*.rpm

Adobe Reader

The manual is here: http://www.if-not-true-then-false.com/2010/install-adobe-acrobat-pdf-reader-on-fedora-centos-red-hat-rhel/

In our case

cd ~/Downloads

wget http://ardownload.adobe.com/pub/adobe/reader/unix/9.x/9.5.5/enu/AdbeRdr9.5.5-1_i486linux_enu.rpm

yum localinstall AdbeRdr9.5.5-1_i486linux_enu.rpm

Fonts

For more comfortable work in CentOS I suggest installing the Ubuntu fonts. Download this file

https://drive.google.com/file/d/0B3DMXMfcPWF3R000WVhMVTV2a3M/view?usp=sharing

and extract it. You will get a folder ubuntu-fonts. Copy this folder to /usr/share/fonts

and run the following command:

fc-cache -fv
