Hadoop & Business Intelligence

The group at Yahoo! that I came from was using Hadoop for data analytics and data warehousing. We had something like 100,000 web servers across the world, and once we collected data from across all these servers, we dumped it into Hadoop, which became the place where we stored all of the data, instead of traditional network storage.

Our reasoning for doing that was a matter of economics, given the quantity of hardware. Hadoop let us process that data at scale, clean it up, and normalize it so we could pass it along to the systems that needed it.

Hadoop is getting very wide adoption in the data warehousing and business intelligence domains. One of the biggest uses within Yahoo! right now is dealing with all of the log information from servers. Analyzing that information allows for better spam filtering, ad targeting, content targeting, A/B testing for new features, et cetera.

It’s not web-specific. For example, everybody does data warehousing, and we see very strong adoption there.

Separate from that, your example of oil companies is a very good one, as is the financial sector. Right now, we do have a couple of very large financial institutions working with us on these exact problems, taking huge amounts of data from domains like credit card processing and building predictive models for fraud that enable better decisions, for example, about whether to block or allow a given transaction.

In the stock market, Hadoop is being used to do simulations that help predict option pricing and related problems. That’s another very healthy market that we’ve seen growth in.

Yahoo! is the biggest contributor to and adopter of Hadoop, and the company has used it to solve a range of problems, from data analytics and data warehousing (log processing, gene sequence mapping, which is basically a fuzzy string matching problem) to business intelligence domains (finance, the stock market …).

Rumor has it that a bank in Singapore invested millions of dollars to build a computation and prediction system from scratch in Haskell, a statically typed functional programming language, to guarantee scalability and performance.

I wonder why the bank did not take a look at a distributed file system (DFS) plus MapReduce (Hadoop is an open source implementation of both), a massively scalable approach on commodity hardware that has been used successfully at some of the biggest IT firms in the world (Google, Yahoo, Facebook, just to name a few) … or are they just re-implementing DFS + MapReduce themselves 😀
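For readers who have not seen the MapReduce model, here is a minimal single-process sketch in Python of the log processing use case mentioned above. It is only an illustration of the map, shuffle, and reduce steps, not Hadoop's actual API. The log format (`<host> <path> <status>`) and the names `map_status`, `reduce_count`, and `map_reduce` are assumptions made up for this example.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, Iterator, List, Tuple


def map_status(line: str) -> Iterator[Tuple[str, int]]:
    """Map step: emit (status_code, 1) for each well-formed log line."""
    parts = line.split()
    # Hypothetical log format: "<host> <path> <status>".
    # Lines that do not have exactly three fields are skipped (the "clean up" step).
    if len(parts) == 3:
        yield parts[2], 1


def reduce_count(key: str, values: List[int]) -> Tuple[str, int]:
    """Reduce step: sum all the counts emitted for one key."""
    return key, sum(values)


def map_reduce(
    records: Iterable[str],
    mapper: Callable[[str], Iterator[Tuple[str, int]]],
    reducer: Callable[[str, List[int]], Tuple[str, int]],
) -> Dict[str, int]:
    """Run map, group values by key (shuffle), then reduce, all in one process."""
    groups: Dict[str, List[int]] = defaultdict(list)
    for record in records:
        for key, value in mapper(record):
            groups[key].append(value)
    return dict(reducer(key, values) for key, values in groups.items())


# Made-up sample data, standing in for logs collected from many web servers.
LOG_LINES = [
    "web01 /index.html 200",
    "web02 /search 200",
    "web01 /missing 404",
    "garbage line",
    "web03 /index.html 200",
]

counts = map_reduce(LOG_LINES, map_status, reduce_count)
print(counts)
```

In a real cluster, the map and reduce functions would run in parallel on many machines, with the input split across the distributed file system. The programming model stays this small, which is a large part of its appeal.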

vinova
