
What is Hadoop?

What is Hadoop? Many have heard of it, but to make sense of it we need to know some history of how Hadoop came into being. I was writing this blog in a coffee shop, with music from Miles Davis playing in the background, while in the booth next to me the shop's new ‘Barista’ was being quizzed on his knowledge of coffee. The trainer was asking: do you know how to make a latte? If you do, do you know how to add flavour to it, and when would you add the flavour? And so on.

The ‘Barista’ was learning his work as a sequence: the trainer was guiding him on what comes first and what comes last. I wondered whether the coffee could be made in parallel, so that it reaches the consumer already flavoured, and whether the ‘Barista’ could still meet his customer’s need if the customer decided at the very end to change the flavour. Unfortunately, the answer is ‘NO’. We in the information industry had the same challenge as the Barista. Fortunately, smart people designed options that deliver what was on my wish list above: changing the flavour at the end.

So, how do we change the flavour? The answer is fairly simple: we allow parallel activities that are not confined to any relational attributes. How can that be done? That is a mouthful of jargon for anybody to digest. It is still a mouthful, but quite digestible, so let us understand it in a simple way.

We gather data every day, all the time. We decide what to keep and what not to keep, and this depends on the data’s use and our ability to store it. As storage and processing improved, we started to explore ways to put these volumes of data to good use instead of letting them just sit there. This gave rise to a system that allows information architects to create a bowl where all of this data can be mixed and recipes for different outputs created. This was called Hadoop.

So in short, Hadoop is the new buzzword. It is an open source project, written in Java, with a file system of its own called HDFS. It is optimized to handle massive, gigantic amounts of data (throw everything you have got into it: structured, unstructured, semi-structured). It was designed to use inexpensive hardware (OCP), so the system was designed to tolerate failures and is therefore redundant in itself.

However, Hadoop is not a replacement for an existing RDBMS; it complements one. It is not used for OLTP or OLAP, although tools have been designed to be used over existing OLAP systems to give better analytics, which is also called Big Data Analytics.
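
To make the idea of "parallel activities" a little more concrete, here is a minimal sketch in plain Python. It is not Hadoop code and does not use any Hadoop API. It only imitates the general pattern: cut the data into blocks, let several workers process the blocks at the same time, then merge the partial results. The function names (`split_into_blocks`, `count_words`, `merge_counts`, `parallel_word_count`) and the sample coffee orders are made up for illustration.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def split_into_blocks(records, block_size):
    """Cut the input into fixed-size blocks (hypothetical helper,
    loosely analogous to how a large file is stored in pieces)."""
    return [records[i:i + block_size] for i in range(0, len(records), block_size)]


def count_words(block):
    """Work done independently on one block: count the words it contains."""
    counts = Counter()
    for line in block:
        counts.update(line.lower().split())
    return counts


def merge_counts(partials):
    """Combine the per-block counts into a single overall result."""
    total = Counter()
    for partial in partials:
        total.update(partial)
    return total


def parallel_word_count(records, block_size=2, workers=4):
    """Process all blocks concurrently, then merge the partial results."""
    blocks = split_into_blocks(records, block_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(count_words, blocks))
    return merge_counts(partials)


if __name__ == "__main__":
    # Made-up sample data: one coffee order per record.
    orders = [
        "latte vanilla",
        "espresso",
        "latte caramel",
        "latte vanilla",
        "mocha",
    ]
    print(parallel_word_count(orders).most_common(2))
```

No worker needs to know what the others are doing, and the question asked of the data (here, which words are most common) can be changed at the end without reorganizing how the data is stored. That is the "change the flavour at the end" wish from the coffee shop, in miniature. A real Hadoop cluster spreads this kind of work across many machines rather than threads in one process.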

Hadoop Ecosystems as Sold by Data Science using OCP Infrastructure
