Tags » Hadoop

Apache Oozie - Installation and configuration

Often we need to run a group of dependent data processing jobs, and we may also want to run some of them at regular intervals.

Hadoop

Five Factors In Building Giants Of The Big Data Era

Editor’s note: Navin Chaddha is managing director of Mayfield, an early-stage venture capital firm. Some of the companies he is currently championing include Gigya, Elastica, Lyft, MapR and Poshmark.

TC

Nitin reblogged this on HadoopEssentials.

Seven signs your hair is on fire: The challenges of scaling Hadoop

Every Hadoop implementation encounters the occasional crisis, including moments when the folks running Hadoop feel like their hair is on fire. Sometimes it happens before you get to production, which can cause organizations to throw the Hadoop baby out with the bathwater.

2014 Summer == Full Time Data Science Work

For the last three months I have been working at Change.org as a data scientist and engineer. It's been a great experience so far, and I’m blown away that this is where I landed after starting this journey more than a year ago.

Zipfian

HDFS Block Size

In Hadoop, data is stored as blocks. In every file system, the block is the basic unit of storage. The Hadoop block size is much larger than the disk-level and OS-level block sizes because Hadoop deals with large data: a small block size would mean more seek time and more metadata, which ultimately results in poor performance.
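
To put rough numbers on the metadata side of that argument, here is a minimal sketch that counts how many blocks a single file needs at two block sizes. The 1 GiB file, the 4 KiB "OS-level" block and the 128 MiB "Hadoop-style" block are assumptions chosen for illustration, not values taken from the post, and the helper `block_count` is a hypothetical name.

```python
def block_count(file_size_bytes: int, block_size_bytes: int) -> int:
    """Number of blocks needed to hold a file; the last block may be partly filled."""
    if block_size_bytes <= 0:
        raise ValueError("block size must be positive")
    # Integer ceiling division, avoids floating-point rounding on large sizes.
    return -(-file_size_bytes // block_size_bytes)


KIB = 1024
MIB = 1024 * KIB
GIB = 1024 * MIB

file_size = 1 * GIB              # assumed example file
os_style_block = 4 * KIB         # assumed small, OS-level block size
hadoop_style_block = 128 * MIB   # assumed large, Hadoop-style block size

small_blocks = block_count(file_size, os_style_block)
large_blocks = block_count(file_size, hadoop_style_block)

print(f"{small_blocks} blocks at 4 KiB vs {large_blocks} blocks at 128 MiB")
```

Every block is one more entry to track and potentially one more seek, so the gap between the two counts is the overhead the paragraph above is describing.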

Hadoop

5 Big Disadvantages of Hadoop for Big Data

As the backbone of so many implementations, Hadoop is almost synonymous with big data. Because it offers distributed storage, superior scalability, and ideal performance, many view it as the standard platform for high-volume data infrastructures.

Building a Recommendation Service on AWS with Mahout

Hi all,

Those (two) of my regular readers will know I frequently curse my fool-of-a-took laptop. With this in mind I’ve been meaning to write a tutorial on working with Amazon’s web services for when you don’t have the hardware you need.

Hadoop