Hadoop Single Node Installation

HDFS Single Node Configuration

Hadoop 2.X Single Node Installation

Posted in Hadoop

Configuration Of Hadoop in Eclipse LUNA

It was really frustrating to build a job JAR every time I wanted to run my MapReduce driver code.
After a lot of experimenting, I found a straightforward way to run it directly from Eclipse.
Follow along and focus on your code.

Posted in Hadoop

JAVA Client for HDFS


Posted in Hadoop

What is big data in layman’s terms?

Answer by Balaji Viswanathan:

Big Data, Cloud, and Internet of Things are sexy marketing buzzwords that describe existing technologies that are ready for the mainstream. In fact, at LinuxCon I attended a talk that emphasized creating such marketing goo to help whip up excitement.

Big Data used to be called Analytics/Business Intelligence before the industry felt the need for a sexier term. If you have ever drawn a chart in Excel out of a column of data, you have used a tiny version of “Big Data”. The only difference is that the scale is massive. Big data just means making sense of a large volume of data.

Ok, enough of cynicism.

How is Big Data different from “little data”?

Let’s assume you have a leak in a water pipe in your garden. You take a bucket and some sealing material to fix the problem. After a while, you see that the leak is so much bigger that you need a specialist (a plumber) to bring bigger tools. In the meantime, you are still using the bucket to drain the water. After a while, you notice that a massive underground stream has opened and you need to handle millions of liters of water every second.

You don’t just need new buckets, but a completely new approach to the problem, just because the volume and velocity of the water have grown. To prevent the town from flooding, maybe you need your government to build a massive dam, which requires enormous civil engineering expertise and an elaborate control system. To make things worse, water is gushing out of nowhere everywhere, and everyone is scared by the variety.

Welcome to Big Data.

I will give you an example from my previous startup. [More details: Does Social Media Affect Capital Markets?] We had a hypothesis that we could understand the market psychology by looking at the tweets. For instance, if I want to predict the movement of Apple stock, I could look at the tweets related to:

  1. Media perceptions of Apple – how many times the company/product gets mentioned in major media.
  2. Customer perceptions of Apple – are the customers positive or negative about the upcoming iPhone 6? Will people continue to buy Apple?
  3. Employee perceptions of Apple – are there any tweets from Cupertino [the company’s location] that could be linked to some employees of the company? How happy or sad are they?
  4. Investor perceptions of Apple – what do sophisticated investors and analysts think about Apple?

The sum of all these perceptions will determine the future price of Apple’s stock. Getting that right could mean billions of dollars.

To put it in layman’s terms, if we could really understand what different people are saying about a particular company and its products, we could somewhat predict its future earnings and thus the direction in which the stock price would move. That would be a huge advantage to some investors.
Babson MBAs Use Social Media to Predict Moves in the Stock Market

5 Key Elements of Big Data

However, the problem is this:

  1. There are over 500 million tweets every day, flowing in every second. (High Volume & Velocity)
  2. We have to understand what each tweet means – where is it from, what kind of a person is tweeting, is it trustworthy or not. (High Variety)
  3. Identify the sentiment – is this person talking negatively or positively about the iPhone? (High Complexity)
  4. We need to have a way to quantify the sentiment and track it in real time. (High Variability)

What makes today’s Big Data different from yesterday’s analytics is that we have a lot more volume, velocity, variety, variability and complexity of data.
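
To make items 3 and 4 in the list above a little more concrete, here is a toy Python sketch of scoring tweets and tracking a rolling average. It is only an illustration, not what the startup actually built: the word lists, the `score_tweet` function, the `RollingSentiment` class and the sample tweets are all made up for this example, and a real system would need a trained model and a streaming platform to cope with the volume.

```python
from collections import deque

# Hypothetical mini-lexicons (assumption): real sentiment analysis
# needs far more than a handful of keywords.
POSITIVE = {"love", "great", "amazing", "buy"}
NEGATIVE = {"hate", "broken", "terrible", "sell"}


def score_tweet(text):
    """Count positive words minus negative words in one tweet."""
    words = (w.strip(".,!?#@") for w in text.lower().split())
    return sum((w in POSITIVE) - (w in NEGATIVE) for w in words)


class RollingSentiment:
    """Average score over the most recent `window` tweets."""

    def __init__(self, window=3):
        # deque with maxlen silently drops the oldest score when full
        self.scores = deque(maxlen=window)

    def add(self, text):
        self.scores.append(score_tweet(text))
        return sum(self.scores) / len(self.scores)


# Made-up sample stream standing in for a live feed of tweets.
stream = [
    "I love the new iPhone",
    "Screen arrived broken, terrible experience",
    "Great camera, will buy again",
]

tracker = RollingSentiment(window=3)
for tweet in stream:
    print(f"{tracker.add(tweet):+.2f}  {tweet}")
```

Even this toy version hints at the difficulty: it cannot tell sarcasm from praise, and it handles three tweets rather than hundreds of millions a day.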


Big data covers problems that involve such large data sets and solutions that require a complex connecting of the dots. You can see such things everywhere.

  1. Quora and Facebook use Big Data tools to understand more about you and provide you with a feed that, in theory, you should find interesting. The fact that the feed is not interesting should show how hard the problem is.
  2. Credit card companies analyze millions of transactions to find patterns of fraud. Maybe if you bought a Pepsi on the card followed by a big-ticket purchase, it could be a fraudster?
  3. My cousin works for a Big Data startup that analyzes weather data to help farmers sow the right seeds at the right time. The startup got acquired by Monsanto for big $$.
  4. A friend of mine works for a Big Data startup that analyzes customer behavior in real time to alert retailers when they should stock up.
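
The credit card example (item 2) can be sketched as a simple rule in Python. This is a guess at what such a pattern check might look like, not how any card company actually does it: the thresholds `SMALL`, `LARGE` and `WINDOW`, the `flag_suspicious` function and the sample transactions are all assumptions made up for illustration.

```python
from datetime import datetime, timedelta

# Assumed thresholds, chosen only for this example.
SMALL = 5.00                     # ceiling for a small "test" purchase
LARGE = 500.00                   # floor for a "big ticket" purchase
WINDOW = timedelta(minutes=30)   # how soon the big purchase must follow


def flag_suspicious(transactions):
    """Find small purchases followed shortly by a big one.

    `transactions` is a list of (timestamp, amount) tuples for a single
    card, sorted by time. Returns a list of (small, large) pairs.
    """
    flagged = []
    for i, (t_small, a_small) in enumerate(transactions):
        if a_small > SMALL:
            continue
        for t_big, a_big in transactions[i + 1:]:
            if t_big - t_small > WINDOW:
                break  # later transactions are even further away
            if a_big >= LARGE:
                flagged.append(((t_small, a_small), (t_big, a_big)))
                break
    return flagged


# Made-up transactions for one card.
start = datetime(2014, 1, 1, 12, 0)
card = [
    (start, 1.50),                          # a can of Pepsi
    (start + timedelta(minutes=10), 899.00),  # big purchase soon after
    (start + timedelta(hours=5), 2.00),
    (start + timedelta(hours=9), 950.00),   # too late to be linked
]

for small, large in flag_suspicious(card):
    print("suspicious:", small[1], "then", large[1])
```

A single rule like this is easy on one card. The Big Data part is running many such checks over millions of transactions as they happen.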

There are similar problems in defense, retail, genomics, pharma, and healthcare that require solutions.


Big Data is a group of problems and technologies related to the availability of extremely large volumes of data that businesses want to connect and understand. The reason why the sector is hot now is that the data and tools have reached a critical mass. This occurred in parallel with years of education effort that has convinced organizations that they must do something with their data treasure.

Posted in Hadoop

The Quora Topic Network

Post by Don van der Drift:

Posted in Uncategorized

What is the Data Science topic FAQ?

Posted in Uncategorized

What do Indian IT Companies like Infosys, Tech Mahindra, Cognizant do?

Answer by Subash Raj:

Do you like cupcakes? I am sure you do.

This is an interesting question and it would be great to look at it while enjoying cupcakes.

In technical terms, the companies that you have mentioned do what is termed IT consulting: helping other organizations figure out where they have a problem and what can be done about it from an IT perspective.

Ok, enough of the technical stuff. It is so boring. And these cupcakes look so delicious. I am more interested in them.

Now, let us make an assumption that you are living in America or Europe as most of the clients of the companies that you've mentioned are from these countries.

  • Imagine that your grandmother made great cupcakes

Why cupcakes and not something else? Well, I like cupcakes and this is my example. So, cupcakes it will be 🙂

Your lovely grandmother from your mother's side made super delicious cupcakes. You don't remember much of your childhood, but you do remember sitting in your grandmother's lap during your holidays and enjoying these cupcakes.

Chocolate, Pistachio, Cinnamon – don't get hungry just yet. There is a long way to go.

Now, you're a little older. You have graduated and are thinking about what you want to do next in life.

You think about your grandma and her cupcakes. You call her up. She is very old and can't get out of bed. You talk about her cupcakes and how much you loved them.

She decides to reveal her secret cupcake recipe to you so that you can make them on your own.

You feel like this. Whoa !

  • "I will make cupcakes" – that one decision in life which you will never regret

And that day, you decide that you will make cupcakes. You're pretty sure they will do well, because you have tasted them yourself. You love them, and your cousins and everyone else in your family love them.

You start off in a small way. You get the recipe from your grandma.

You register a company, rent a small place, buy the necessary equipment, hire a few employees and get going.

Oh boy, you're enjoying making cupcakes. Aren't you?

  • Everybody likes my cupcakes. I am loving it.

Your grandma's secret cupcakes are just too good. They are far tastier than any other brand on the market. Everybody is buying them.

You scale up your business. From an unbranded product, you create a brand for your cupcakes. You do neat marketing.

Your company keeps growing and growing.

And then comes a time, when along with your company, your problems also keep growing, slowly at first and then exponentially.

  • Everybody likes my cupcakes but I am facing so many problems now

Your sugar supplier complains about outstanding payments.

Your purchasing team says that they didn't order sugar in the first place and the sugar supplier is referring to a cancelled order.

Your packaging material supplier complains that their delivery trucks take too long to unload at your factory.

You get some funding from an angel investor and, as your business is doing well, you find it more and more difficult to maintain your financial books.

Walmart, one of your top customers, calls up your sales executive and places a huge order. The executive forgets about it.

You realize that you have produced too many chocolate cupcakes, as there is still a considerable amount left from the batch produced last week.

You don't know where to stop; the list seems endless. Making cupcakes isn't fun anymore.

  • I need help! Who can help me?

A little bird tells you that you can fix your problems by using information technology, which would help you make cupcakes without losing your peace of mind.

Interesting? Tell me more.

You make enough money to afford a good IT system. You should look at purchasing an ERP system. There are a lot of IT consulting companies, and they have their own ERP offerings that can help you.

And do look at Indian companies. They have a lot of offshore facilities that help you get great work done at a very reasonable price.

  • IT Consulting Companies – Where art thou?

Then you approach these companies and tell them about your problem. They say they can help you.

Your purchasing team can handle purchases more effectively.

Your packaging material supplier will know how long their trucks will need to wait, and you can also tell them the best time to make a delivery.

Your financial books will be easy to maintain.

The Walmart guy can directly log into a portal and place his order there. He doesn't need to talk to a sales guy anymore. This reduces human intervention and makes the process better.

You will be able to track your inventory better and thus avoid overproducing chocolate cupcakes.

Your problems can thus be solved by these IT companies.

Company A offers their Product A.
Company B offers their Product B… and so on.

You evaluate their products and compare their offerings, taking into account parameters such as:

Cost (It should be cheap for you)
Usability (It should be easy for you to use)
Support (You should get help if it stops working) and many other such factors.

You finally decide to make cupcakes using information technology.

  • Thank you IT Consulting companies

I have to thank you from the bottom of my heart as you've helped me to sell my cupcakes more efficiently. Thanks to you, many more people are enjoying my delicious cupcakes.

Without your help, I would have struggled a lot. But tell me something:

Why do you have so much work?

As you see, your business is made up of so many divisions.

You have a purchasing team to buy the right stuff, a production line to manufacture and pack delicious cupcakes, a great supply chain to ensure that your products reach your customers across the country, a great sales team to find new customers like Walmart, and a great marketing team to tell end consumers (who could even be 5-year-old kids) through advertisements that your cupcakes are the best.

From our side, we have a specialized team to help each of your divisions.

But, we are a company too. So we have our payroll team and yes, we have our own IT systems as well. There is just so much work to do.

Why do you need so many people?

Because we don't make cupcakes. We make products which can be built only by humans and not by machines.

Machines make your company's products. Our products are made by people like you and me.

I need more of them so that I get more business.

The next time you notice that your friend is on the 'bench' or in a 'free pool', it is not because we like them there. It is because it helps us get new clients and new projects.

We tell other companies that we have 500 people ready to help them with our IT solution if they do business with us.

So, we need more people.

Who all are your clients?

As you see, you are in the cupcake business. But we have a lot of clients across industries.

Next time you fly, you may notice that your plane's engine was manufactured by one of our clients.

Next time you do an online transaction, the web portal might have been developed by us.

Next time you walk into a supermarket, the receipt might have been printed using our system.

Next time you buy fuel for your car, our IT solutions would be running the software that pumps the crude oil.

You see, my cupcake friend, we are everywhere. We are IT.

Posted in Uncategorized
