Tags » Hadoop

Pig Tutorial

Pig Scalar DataTypes
====================
int
long
float
double
chararray
bytearray

Complex types
=============
MAP: A map in Pig is a chararray to data element mapping, where that element can be any Pig type, including a complex type.
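As a rough illustration of the map type described above, a Pig map can be pictured as a Python dictionary whose keys are strings (the chararray side) and whose values may be scalars or nested complex values. This is only an analogy and not Pig Latin; the field names and the `lookup` helper are hypothetical.

```python
# Analogy only: a Python dict standing in for a Pig map.
# Keys play the role of chararray; values can be of any type, including nested ones.
record = {
    "name": "alice",                        # chararray-like value
    "age": 30,                              # int-like value
    "scores": [("math", 90), ("art", 75)],  # a collection of tuples as a value
    "address": {"city": "Pune"},            # a map nested inside a map
}


def lookup(pig_map, key):
    """Return the value stored under key, or None when the key is absent."""
    return pig_map.get(key)
```

In Pig Latin itself the corresponding lookup is written with the `#` operator, for example `info#'name'`, where `info` is an assumed field name.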

Hadoop

Apache Pig in a blog - Part I

Pig is an Apache open-source project in the Hadoop ecosystem that can be used to write parallel dataflows on top of Hadoop.

Pig vs Hive…

Hadoop

Build Hadoop-2.2.0 Source on Windows and Configure in Eclipse

We can now build the Hadoop 2.2.0 source on Windows and configure it for use in Eclipse. Follow the steps below to set up the Hadoop source on Windows.

Hadoop

Apache Nutch - Apache Hadoop

What is Hadoop?

Hadoop is a software framework for distributed processing of large datasets across large clusters of computers, written in Java by the Apache Software Foundation. It is designed to scale up from single servers to thousands of machines, each offering local computation and storage. Hadoop provides a reliable shared storage and analysis system.
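To make "distributed processing" a little more concrete, here is a minimal single-process sketch of the map, shuffle, and reduce phases that Hadoop's MapReduce model is built around, assuming a simple word-count job. It does not use any Hadoop API; all function names are hypothetical, and in a real cluster each phase would run in parallel across many machines.

```python
from collections import defaultdict


def map_phase(line):
    """Emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Group all emitted values by key, as happens between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Combine all values for one key into a single count."""
    return key, sum(values)


def word_count(lines):
    """Run the three phases in sequence on a list of text lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())
```

Calling `word_count(["Hadoop stores data", "hadoop processes data"])` counts each word across both lines, ignoring case.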

HADOOP

Install Hadoop on Ubuntu

Prerequisites

  • You need a machine running any version of Ubuntu.
  • You need a stable release of the Hadoop distribution.

If you do not have Hadoop downloaded on your machine yet, don’t worry; we will download it later in this tutorial.

Java

Vaidya Hadoop PostExecution Diagnostic Tool

Vaidya is a very helpful tool for identifying Hadoop performance issues in map and reduce jobs.
More documentation is available at https://oracleebsblog.files.wordpress.com/2014/06/chz-2nd.pdf

Hadoop

Hadoop I/O and data formats

In this blog post I would like to summarize Hadoop I/O and its supported data formats and give a brief introduction to the topic. As things are constantly evolving, especially in the Hadoop area, I will be glad for comments in case I missed something important.

Java