Tags » Bigdata

Hive - Some commands

Hive is a data warehousing infrastructure built on Apache Hadoop. Hadoop provides massive scale-out and fault-tolerance capabilities for data storage and processing on commodity hardware.

Re-Structure Ahead in Big Data & Spark

Big Data used to be about storing unstructured data in its raw form: “Forget about structures and schema; they will be defined when we read the data.”


Setup Apache Spark on Docker

As promised, we have come up with a method to set up Apache Spark on Docker with Ubuntu as the underlying OS.


Docker is an open platform for developing, shipping, and running applications.

Crowdsourcing Innovation in Sports

Welcome Back to My Side of the Net

While I’m not big on New Year’s resolutions, I did decide 2016 is the right time to revive My Side of the Net.


Apache Spark - Part 1

This is the first of many upcoming posts discussing what I learn as I explore Spark. This post discusses how to set up Spark on Ubuntu 14.04.


Cloud Data Services Force Awakens

If you’ve been reading the storage blogs and analyst reports, you may conclude that storage growth is in flash arrays, hyper-converged systems, and maybe scale-out NAS or object storage.


ProActive Choice - The Prevention alternative to Medicine

I have come to realize that medicine is about the tactical response to disease. The body undergoes insult daily. Take cancer as an example.

Imagine I am on the beach and the enemy is firing on me.