Tags » Hadoop

Pig Tutorial

Pig Scalar DataTypes

Complex types
MAP: A map in Pig is a mapping from a chararray key to a data element, where that element can be of any Pig type, including a complex type.
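To make the idea of a chararray-to-anything mapping concrete, here is a minimal Python sketch that models it with a plain dictionary. This is only an analogy and not Pig itself: the names `PigMap`, `make_pig_map`, `lookup`, and the sample record are all hypothetical, chosen for illustration.

```python
from typing import Any, Dict, Optional

# Hypothetical stand-in for a Pig map: string (chararray) keys,
# values of any type, including nested (complex) values.
PigMap = Dict[str, Any]


def make_pig_map(**entries: Any) -> PigMap:
    """Build a map whose keys are always strings, mirroring the chararray-key rule."""
    return dict(entries)


def lookup(m: PigMap, key: str) -> Optional[Any]:
    """Return the value stored under key, or None when the key is absent."""
    return m.get(key)


# Illustrative record: scalar values next to nested, complex ones.
record = make_pig_map(
    name="alice",                                # scalar value
    age=30,                                      # scalar value of a different type
    address={"city": "Pune", "zip": "411001"},   # a map nested inside the map
    visits=[("home", 3), ("cart", 1)],           # a list of tuples as a complex value
)

print(lookup(record, "age"))
print(lookup(record, "address")["city"])
print(lookup(record, "missing"))
```

The point of the sketch is that the key type is fixed while the value type is not, so one map can hold scalars and nested structures side by side.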


Apache Pig in a blog - Part I

Pig is an Apache open source project in the Hadoop ecosystem that can be used to write parallelized dataflows on top of Hadoop.

Pig vs Hive…


Build Hadoop-2.2.0 Source on Windows and Configure in Eclipse

We can now build the Hadoop 2.2.0 source on Windows and configure it for use in Eclipse. Follow the steps below to configure the Hadoop source on Windows.


Apache Nutch - Apache Hadoop

What is Hadoop?

Hadoop is a software framework for distributed processing of large datasets across large clusters of computers, written in Java and developed by the Apache Software Foundation. It is designed to scale up from single servers to thousands of machines, each offering local computation and storage. Hadoop provides a reliable shared storage and analysis system.
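One widely used processing model on Hadoop is MapReduce, in which each machine works on its local chunk of data and the partial results are then combined. The following single-process Python sketch only imitates that flow with a word count; it does not use Hadoop, and the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) and sample chunks are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def map_phase(chunk: str) -> List[Tuple[str, int]]:
    """Emit a (word, 1) pair for every word in one chunk of input."""
    return [(word.lower(), 1) for word in chunk.split()]


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Group all emitted values by key, as happens between map and reduce."""
    grouped: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return dict(grouped)


def reduce_phase(grouped: Dict[str, List[int]]) -> Dict[str, int]:
    """Sum the grouped values to get one count per word."""
    return {key: sum(values) for key, values in grouped.items()}


def word_count(chunks: Iterable[str]) -> Dict[str, int]:
    """Run map over every chunk, then shuffle and reduce the combined output."""
    pairs = [pair for chunk in chunks for pair in map_phase(chunk)]
    return reduce_phase(shuffle(pairs))


# Each string stands in for a block of data held on a different machine.
chunks = ["hadoop stores data", "hadoop processes data"]
counts = word_count(chunks)
print(counts)
```

In a real cluster the map calls would run in parallel on the machines that hold each chunk; here they run sequentially so the sketch stays self-contained.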


Install Hadoop on Ubuntu


  • You need a machine running any version of Ubuntu.
  • You need a stable release of the Hadoop distribution.

If you do not currently have Hadoop downloaded on your machine, don't worry: we will download it later in this tutorial.


Vaidya Hadoop PostExecution Diagnostic Tool

Vaidya is a very helpful tool for identifying Hadoop performance issues in map and reduce jobs.
Please find more documentation at https://oracleebsblog.files.wordpress.com/2014/06/chz-2nd.pdf


Hadoop I/O and data formats

In this blog post I would like to summarize Hadoop I/O and its supported data formats and give a brief introduction to the topic. As things are constantly evolving, especially in the Hadoop area, I will be glad for comments in case I missed something important.