Tags » Hadoop

Let's talk about the elephant in the room

If you know anything about the Big Data phenomenon called Hadoop (R) then you probably get that the reference in the title to “the elephant” is because Hadoop’s mascot is a… 476 more words


#Hadoop #HDFS in Safemode, here's how to get out of it.

It happens once in a while. When you are performing an operation on HDFS, such as adding new data, you may see this message:

WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... 47 more words

Storage Tutorial: Live from Spark Summit East with Continuum

This Storage Tutorial was filmed live at Spark Summit East.

Our host, Brian Chang, is joined by Peter Wang, president of Continuum, along with show regulars Irshad Raihan and Greg Kleiman of Red Hat Big Data. 602 more words


Apache Crunch Toolkit #3: Secondary Sort in Apache Crunch

Secondary sorting is one of the most common requirements in data processing, and a staple of MapReduce. Let’s look at Apache Crunch’s support for it. The… 795 more words


Hadoop MapReduce wordcount example in Java. Introduction to Hadoop job.

In this article we are going to review the classic Hadoop word count example, customizing it a little bit. As usual, I suggest using Eclipse with Maven to create a project that can be modified, compiled, and easily executed on the cluster. 1,368 more words


Hadoop Pig Installation, Pig Configuration in Local and MapReduce Mode

Hi all,

In this post we will see how to install Pig and run it in local mode or on a cluster using MapReduce. 281 more words

Map Reduce

Apache Crunch Toolkit #2: Viewing Pipeline Execution Plan Visualisations

What is a pipeline execution plan visualisation?

Apache Crunch has built-in support for creating dot files that visually show how Crunch pipelines are executed under the hood. For example, a simple operation involving a… 205 more words