Tags » Hadoop

Installing and getting started with Hadoop on Mac - OS X

Many posts already cover installing and configuring Hadoop, and this is one of them. It mainly deals with installing Hadoop locally on a Mac, configuring it, and running a MapReduce example on it! 856 more words


Apache Pig in a blog - Part I

Pig is an Apache open source project in the Hadoop ecosystem that can be used to write parallelized dataflows on top of Hadoop.

Pig vs Hive… 848 more words


Teradata grabs startups Hadapt & Revelytix that make Hadoop easier

Teradata is not about to let Hadoop and its circle of companies get all the attention and revenue when it comes to big data.

Today it announced the acquisition of two startups that have built tools that make it easier to analyze data sitting in the Hadoop file system, which can store lots of different kinds of data. 211 more words


Books on Hadoop and its ecosystem part-1

Pig Design Patterns

Pig Design Patterns is a comprehensive guide that will enable readers to readily use design patterns that simplify the creation of complex data pipelines in various stages of data management. 752 more words

Data Science

Facts about MongoDB database

Here are some of the facts that I found on the internet about the MongoDB database.

I am currently in the process of learning about the NoSQL database. 150 more words


Avro end to end in hdfs - part 1: why avro?

This is a series of posts explaining how and why to set up compressed Avro in HDFS. It will be divided into a few posts; more will come if relevant. 729 more words


Apache Hadoop Setup

Hadoop 2.x is based on the YARN architecture, which uses a ResourceManager and per-application ApplicationMasters. The ResourceManager manages resources across the cluster, and each ApplicationMaster manages the life cycle of its own job.
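
To make that division of labor concrete, here is a minimal toy sketch in Python. It is not the YARN API: the class names only mirror the roles described above (the per-application manager is called ApplicationMaster in YARN), and the container counts, method names, and state strings are assumptions chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class ResourceManager:
    """Toy stand-in for the cluster-wide resource manager (not the real YARN class)."""
    total_containers: int
    allocated: dict = field(default_factory=dict)  # app_id -> containers currently held

    def free_containers(self) -> int:
        return self.total_containers - sum(self.allocated.values())

    def allocate(self, app_id: str, requested: int) -> int:
        """Grant up to `requested` containers, limited by what is currently free."""
        granted = min(requested, self.free_containers())
        self.allocated[app_id] = self.allocated.get(app_id, 0) + granted
        return granted

    def release(self, app_id: str) -> None:
        """Return every container held by `app_id` to the pool."""
        self.allocated.pop(app_id, None)


@dataclass
class ApplicationMaster:
    """Toy per-job manager: owns one job's life cycle and asks for resources."""
    app_id: str
    tasks: int
    state: str = "NEW"  # illustrative state names

    def run(self, rm: ResourceManager) -> None:
        self.state = "RUNNING"
        remaining = self.tasks
        while remaining > 0:
            granted = rm.allocate(self.app_id, remaining)
            if granted == 0:
                # Nothing available: give up (a real scheduler would queue and wait).
                self.state = "FAILED"
                rm.release(self.app_id)
                return
            remaining -= granted      # pretend each container completes one task
            rm.release(self.app_id)   # hand the containers back for the next wave
        self.state = "FINISHED"


rm = ResourceManager(total_containers=4)
am = ApplicationMaster(app_id="app_1", tasks=10)
am.run(rm)
print(am.state, rm.free_containers())
```

The point of the sketch is the split: the resource manager knows nothing about tasks or job state, and the application master never counts cluster capacity itself; it only requests and releases containers.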