Tags » Hadoop

Install Oracle Big Data SQL on Exadata

Oracle Big Data SQL is part of the Oracle Big Data family. It lets users run regular Oracle SQL against data in Oracle Database, Hadoop and other sources at the same time.


Rank and Dense_Rank

RANK and DENSE_RANK give you the ranking of each row within an ordered partition. Ties are assigned the same rank.

Difference between RANK and DENSE_RANK:

With RANK, after ties are assigned the same rank, the next rank(s) are skipped.
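
The difference is easiest to see on a small example. The sketch below is not Oracle or Hive code: it is a plain Python re-implementation of the two ranking rules over an already ordered list, and the function names and sample salaries are made up for illustration. The only fact it adds to the text above is that DENSE_RANK, unlike RANK, leaves no gaps after a tie.

```python
def rank(sorted_values):
    """RANK: ties share a rank, and the ranks after a tie are skipped."""
    ranks = []
    for position, value in enumerate(sorted_values, start=1):
        if position > 1 and value == sorted_values[position - 2]:
            ranks.append(ranks[-1])   # tie: reuse the previous row's rank
        else:
            ranks.append(position)    # otherwise the rank is the row position
    return ranks


def dense_rank(sorted_values):
    """DENSE_RANK: ties share a rank, and the next rank is not skipped."""
    ranks = []
    for index, value in enumerate(sorted_values):
        if index == 0:
            ranks.append(1)
        elif value == sorted_values[index - 1]:
            ranks.append(ranks[-1])       # tie: reuse the previous row's rank
        else:
            ranks.append(ranks[-1] + 1)   # next consecutive rank, no gap
    return ranks


# Hypothetical sample data, ordered as ORDER BY salary DESC would order it.
salaries = sorted([5000, 4000, 4000, 3000], reverse=True)
print(rank(salaries))        # [1, 2, 2, 4]
print(dense_rank(salaries))  # [1, 2, 2, 3]
```

In SQL the same comparison would be written with RANK() OVER (ORDER BY salary DESC) next to DENSE_RANK() OVER (ORDER BY salary DESC).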


Distribute By, Sort By, Order By and Cluster By in Hive

The ORDER BY clause is familiar from other SQL dialects. It performs a total ordering of the query result set.
This means that all the data is passed through a single reducer, which may take an unacceptably long time to execute for larger data sets.
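
To make the contrast between the four clauses concrete, here is a rough Python model of how rows flow to reducers. It is only a sketch under stated assumptions: the function names are invented, reducers are modelled as plain lists, and an integer key modulo the reducer count stands in for Hive's hash partitioning. It mirrors Hive's documented semantics: ORDER BY gives one total order, DISTRIBUTE BY sends equal keys to the same reducer, SORT BY orders rows within each reducer only, and CLUSTER BY is DISTRIBUTE BY plus SORT BY on the same column.

```python
from collections import defaultdict


def order_by(rows, key):
    """ORDER BY: every row goes through one reducer, giving a total order."""
    return sorted(rows, key=key)


def distribute_by(rows, key, num_reducers):
    """DISTRIBUTE BY: rows with the same key land on the same reducer.

    Assumption: an integer key modulo num_reducers stands in for Hive's hash.
    """
    reducers = defaultdict(list)
    for row in rows:
        reducers[key(row) % num_reducers].append(row)
    return [reducers[i] for i in range(num_reducers)]


def sort_by(partitions, key):
    """SORT BY: each reducer sorts its own rows; there is no global order."""
    return [sorted(partition, key=key) for partition in partitions]


def cluster_by(rows, key, num_reducers):
    """CLUSTER BY x: shorthand for DISTRIBUTE BY x followed by SORT BY x."""
    return sort_by(distribute_by(rows, key, num_reducers), key)


# Hypothetical (id, label) rows.
rows = [(3, "c"), (1, "a"), (4, "d"), (2, "b"), (1, "e")]
by_id = lambda row: row[0]

print(order_by(rows, by_id))
# [(1, 'a'), (1, 'e'), (2, 'b'), (3, 'c'), (4, 'd')]
print(cluster_by(rows, by_id, 2))
# [[(2, 'b'), (4, 'd')], [(1, 'a'), (1, 'e'), (3, 'c')]]
```

Each inner list in the second result is sorted, but concatenating them does not give a globally sorted result. That is the trade-off: the per-reducer variants scale out, while ORDER BY does not.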


Warp10 Distributed Installation

Thanks to David (@David Morin) I managed to get my test cloud running with minimal settings. There were several configuration details I didn’t know, since it’s my first time installing and configuring the whole stack of Hadoop, ZooKeeper, Kafka and HBase, with Warp10 on top.


Install Yarn on Ubuntu Cluster via Scripts

There is a beautiful blog post on using scripts to install YARN on Ubuntu.

You can use the script hadoop2-install-scripts.zip to install Hadoop on CentOS systems.


HBase Installation (Pseudo-Distributed)

1.) Download the stable HBase release from https://www.apache.org/dist/hbase
2.) Untar it with: tar -xvzf hbase-0.98.22-hadoop2-bin.tar.gz
3.) Configure hbase-site.xml:

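
The original configuration listing is not reproduced here, so the following is only a guess at what a pseudo-distributed hbase-site.xml typically contains. It is a small Python sketch that generates the file content. The property names are standard HBase settings, but every value (the HDFS address and port, the ZooKeeper data directory) is an assumption to replace with your own, and the helper name is made up.

```python
import xml.etree.ElementTree as ET

# Assumed values for a single-host, pseudo-distributed setup.
PROPERTIES = {
    # Assumption: the NameNode listens on localhost:9000; match fs.defaultFS.
    "hbase.rootdir": "hdfs://localhost:9000/hbase",
    # true makes HBase run its daemons as separate processes on this host.
    "hbase.cluster.distributed": "true",
    # Hypothetical local directory for the ZooKeeper instance HBase manages.
    "hbase.zookeeper.property.dataDir": "/tmp/zookeeper-data",
}


def build_hbase_site(properties):
    """Return hbase-site.xml content with one <property> block per setting."""
    root = ET.Element("configuration")
    for name, value in properties.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    return ET.tostring(root, encoding="unicode")


xml_text = build_hbase_site(PROPERTIES)
print(xml_text)
```

The printed XML would go into conf/hbase-site.xml inside the directory unpacked in step 2.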


User Defined Function

Sometimes the query you want to write can’t be expressed easily (or at all) using the built-in functions that Hive provides. By allowing you to write a user-defined function (UDF), Hive makes it easy to plug in your own processing code and invoke it from a Hive query.