This has led to an ever-increasing amount of data, which is causing the following issues within the enterprise:
- The business does not have access to real-time data feeds
- Queries run in hours or minutes rather than seconds
- Batch processes, ETL jobs, and data loads take too much time
- Constructing new models based on the data is time-consuming
- Scaling systems to keep up with ever-increasing data is becoming a problem
- Data storage and redundancy are another problem
- The rising license cost of software and hardware is another issue
When the business reads about how Facebook, Yahoo, Google, and others manage large amounts of data (BigData) and process it in real time, it wants to adopt some of their systems and techniques.
The newly open-sourced systems and technologies that enterprises are rapidly adopting are Hadoop and its various commercial versions (Cloudera, Hortonworks, Greenplum). In addition, other commercial vendors have jumped in with their own BigData offerings: IBM has BigInsights, and Oracle has the Exalytics In-Memory Machine. These vendors are trying to sell big machines, with more RAM and more CPU, to process more data.
But the question is: are enterprises looking to buy more hardware and software and acquire more licenses to process data, or do they want to solve these issues?
I believe the fundamental problem is speed of access to the data (FastData), which is a paramount requirement for the enterprise. BigData only promises to help solve the problem of large amounts of data, and it still has a long way to go before it can fulfill the rest of the enterprise's needs.