What Is Big Data, and What Are Its Three Characteristics (3Vs)?
What is Big Data?
- Big Data may well be the Next Big Thing in the IT world.
- Big data burst upon the scene in the first decade of the 21st century.
- The first organizations to embrace it were online and startup firms. Firms like Google, eBay, LinkedIn, and Facebook were built around big data from the beginning.
- Like many new information technologies, big data can bring about dramatic cost reductions, substantial improvements in the time required to perform a computing task, or new product and service offerings.
- ‘Big Data’ is similar to ‘small data’, but bigger in size
- But because the data is bigger, it requires different approaches: techniques, tools, and architecture
- The aim is to solve new problems, or old problems in a better way
- Big Data generates value from the storage and processing of very large quantities of digital information that cannot be analyzed with traditional computing techniques
Examples of Big Data
- Walmart handles more than 1 million customer transactions every hour
- Facebook handles 40 billion photos from its users
- Decoding the human genome originally took 10 years to process; now it can be achieved in one week
- Twitter generates 7TB of data
- IBM claims 90% of today’s stored data was generated in just the last two years
How Is Big Data Different?
- Automatically generated by a machine (e.g., a sensor embedded in an engine)
- Typically, an entirely new source of data (e.g., use of the internet)
- Not designed to be friendly (e.g., text streams)
- May not have much value, so you need to focus on the important part
Three Characteristics of Big Data (3Vs)

1. Volume
- A typical PC might have had 10 gigabytes of storage in 2000
- Today, Facebook ingests 500 terabytes of new data every day
- A Boeing 737 will generate 240 terabytes of flight data during a single flight across the US.
- Smartphones and the data they create and consume, along with sensors embedded into everyday objects, will soon result in billions of new, constantly updated data feeds containing environmental, location, and other information.
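To get a feel for these volumes, it can help to convert them into a sustained ingest rate. The following is a minimal sketch, assuming decimal units (1 TB = 10^12 bytes) and assuming the 500-terabyte figure is spread evenly over one day. The function name `sustained_rate` is hypothetical and used only for illustration.

```python
SECONDS_PER_DAY = 24 * 60 * 60
BYTES_PER_TB = 10**12  # decimal terabyte; an assumption, not stated in the text


def sustained_rate(terabytes, period_seconds):
    """Return the average ingest rate in bytes per second."""
    if period_seconds <= 0:
        raise ValueError("period_seconds must be positive")
    return terabytes * BYTES_PER_TB / period_seconds


if __name__ == "__main__":
    # Assumed scenario: 500 TB arriving evenly over one day.
    rate = sustained_rate(500, SECONDS_PER_DAY)
    print(f"{rate / 10**9:.2f} GB/s")
```

Under those assumptions the average works out to a little under 6 gigabytes every second, around the clock, which is why a single machine or a single traditional database is not a realistic target for this kind of workload.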
2. Velocity
- Clickstreams and ad impressions capture user behavior at millions of events per second
- High-frequency stock trading algorithms reflect market changes within microseconds
- Machine to machine processes exchange data between billions of devices
- Infrastructure and sensors generate massive log data in real-time
- Online gaming systems support millions of concurrent users, each producing multiple inputs per second
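Velocity is usually measured as events per unit of time over a recent window. The sketch below shows one simple way to track that rate in memory. It is an illustration only, not how any of the systems named above work: the class name `SlidingWindowCounter` is hypothetical, and it assumes timestamps (in seconds) arrive in non-decreasing order.

```python
from collections import deque


class SlidingWindowCounter:
    """Track the event rate over the most recent `window_seconds`."""

    def __init__(self, window_seconds):
        if window_seconds <= 0:
            raise ValueError("window_seconds must be positive")
        self.window_seconds = window_seconds
        self._timestamps = deque()

    def _evict(self, now):
        # Drop events that have fallen out of the window ending at `now`.
        cutoff = now - self.window_seconds
        while self._timestamps and self._timestamps[0] <= cutoff:
            self._timestamps.popleft()

    def record(self, timestamp):
        """Register one event; assumes timestamps never go backwards."""
        self._timestamps.append(timestamp)
        self._evict(timestamp)

    def rate(self, now):
        """Return events per second over the window ending at `now`."""
        self._evict(now)
        return len(self._timestamps) / self.window_seconds


if __name__ == "__main__":
    counter = SlidingWindowCounter(window_seconds=1.0)
    for t in [0.0, 0.2, 0.4, 0.9, 1.1]:
        counter.record(t)
    print(counter.rate(now=1.1))
```

Keeping every timestamp is fine for a demonstration, but at millions of events per second a real system would more likely keep per-bucket counts or use an approximate structure, and would spread the work across many machines.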
3. Variety
- Big Data isn't just numbers and dates. Big Data is also geospatial data, 3D data, audio and video, and unstructured text, including log files and social media.
- Traditional database systems were designed to address smaller volumes of structured data, fewer updates, or a predictable, consistent data structure.
- Big Data analysis includes different types of data
Benefits of Big Data
- Real-time big data isn’t just a process for storing petabytes or exabytes of data in a data warehouse. It’s about the ability to make better decisions and take meaningful actions at the right time.
- Today, technologies like Hadoop give you the scale and flexibility to store data before you know how you are going to process it.
- Technologies such as MapReduce, Hive and Impala enable you to run queries without changing the data structures
- Recent research finds that organizations are using big data to target customer-centric outcomes, tap into internal data, and build a better information ecosystem
- Big Data is already an important part of the $64 billion database and data analytics market
- It offers commercial opportunities of a comparable scale to enterprise software in the late 1980s, the Internet boom of the 1990s, and the social media explosion of today
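The MapReduce model mentioned above can be illustrated with a small word count. This is a single-process sketch of the idea only, not Hadoop's API: the function names `map_phase`, `shuffle`, `reduce_phase`, and `word_count` are hypothetical, and a real framework would run the map and reduce steps in parallel across many machines.

```python
from collections import defaultdict


def map_phase(record):
    """Map step: emit a (word, 1) pair for every word in one input line."""
    for word in record.lower().split():
        yield word, 1


def shuffle(pairs):
    """Group values by key, standing in for the framework's shuffle step."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce step: combine all counts emitted for one word."""
    return key, sum(values)


def word_count(records):
    """Run map, shuffle, and reduce in sequence over an iterable of lines."""
    pairs = (pair for record in records for pair in map_phase(record))
    grouped = shuffle(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    lines = ["big data is big", "data is everywhere"]
    print(word_count(lines))
```

Because the map step looks at one record at a time and the reduce step looks at one key at a time, both can be split across machines without changing the logic, which is the property that lets this style of processing scale to very large inputs.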
Applications of Big Data Analytics
- Smarter Healthcare
- Multi-channel sales
- Homeland Security
- Traffic Control
- Manufacturing
- Telecom
- Trading Analytics
Leading Technology Vendors (Big Data)
- IBM – Netezza
- EMC – Greenplum
- Oracle – Exadata
Raju Singhaniya
Oct 14, 2021