With every passing day, the amount of data in our world increases, and so does its complexity. The term “Big Data” refers to collections of datasets that are not only large but also complex, which makes them difficult to analyze with conventional data processing applications. The challenges include capture, curation, storage, search, sharing, transfer, analysis and visualization. What counts as “big” is also a moving target, because conventional DBMS technology keeps improving and newer databases such as NoSQL stores can handle larger amounts of data. To meet these challenges, new platforms of “big data” tools are being developed to handle various aspects of large quantities of data.
Why this large amount of data?
The trend toward larger data sets is due to the additional information that can be derived from analyzing a single large set of related data, compared with separate smaller sets holding the same total amount of data. A single large set allows correlations to be found that help to:
• Spot business trends
• Determine quality of research
• Prevent diseases
• Determine real-time roadway traffic conditions
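The point about related data can be illustrated with a minimal sketch. The figures below are invented purely for illustration (they are not real measurements), and the names `rainfall_mm`, `traffic_speed_kmh` and `pearson` are hypothetical. Kept apart, neither small set says anything about the other; once they are joined on a shared key, a correlation can be computed.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical data: two separate small sets, each keyed by day of the week.
rainfall_mm = {"mon": 0, "tue": 5, "wed": 12, "thu": 2, "fri": 15}
traffic_speed_kmh = {"mon": 62, "tue": 48, "wed": 35, "thu": 55, "fri": 30}

# Join the two sets on the days they have in common, so that each
# rainfall value is paired with the traffic speed of the same day.
days = sorted(rainfall_mm.keys() & traffic_speed_kmh.keys())
rain = [rainfall_mm[d] for d in days]
speed = [traffic_speed_kmh[d] for d in days]

r = pearson(rain, speed)
print(f"correlation between rainfall and traffic speed: {r:.2f}")
```

With this made-up data the coefficient comes out strongly negative, meaning that wetter days go with slower traffic. Real big data platforms perform the same kind of join and aggregation, only across far larger and messier sources.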
A few notable uses of big data in different fields:
• The NASA Center for Climate Simulation (NCCS) stores 32 petabytes of climate observations and simulations on the Discover supercomputing cluster.
• In 2012, US President Barack Obama announced the Big Data Research and Development Initiative, which explored how big data could be used to address important problems faced by the government. Big data analysis also played a large role in Barack Obama’s successful 2012 re-election campaign.
• Social media giant Facebook handles 50 billion photos from its user base.
• The White House announced a national “Big Data Initiative” involving six federal departments and agencies, which allotted more than $200 million to big data research projects.
Big data research will become a key basis of competition, underpinning new waves of productivity, growth, innovation and consumer surplus. The increasing volume and detail of information captured by enterprises, together with the rise of multimedia, social media and the Internet of Things, will fuel exponential growth in data for the foreseeable future.