Big Data System Part 1 Computer Science YouTube Lecture Handouts

1. What is data

2. Introduction and example of big data

3. Sources of big data

4. Uses of big data

5. Characteristics of big data

6. MCQ

What is Data

  • Data are raw facts and figures. Data must be processed to be useful to the user.
  • Data can be anything: text, numbers, audio, video, images, etc. When data is processed, organized, structured, or presented in a given context so as to make it useful, it is called information.
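The step from data to information can be illustrated with a minimal Python sketch. The exam scores and the `summarize` helper are hypothetical and made up for illustration: a bare list of numbers is raw data, and the summary computed from it is information a teacher could act on.

```python
from statistics import mean

# Raw data: hypothetical exam scores with no context attached.
raw_scores = [72, 85, 90, 64, 78]

def summarize(scores):
    """Process raw scores into a small, useful summary (information)."""
    return {
        "count": len(scores),
        "average": mean(scores),
        "highest": max(scores),
        "lowest": min(scores),
    }

print(summarize(raw_scores))
```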

Introduction of Big Data

  • Big data is data that is very large in size and that keeps growing over time.
  • The term big data describes a very large amount of data. We usually work with megabytes and gigabytes, but big data is measured in terabytes, petabytes, exabytes, or even more.
  • Big data comes in many different formats and grows constantly, so it cannot be handled by traditional tools and applications.


Example: Facebook's database generates more than 500 terabytes of data every day. This data comes mainly from comments, photo and video uploads, messages, etc.
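To get a feel for these sizes, the units can be converted with a few lines of Python. This is only a sketch: it assumes decimal (SI) units, where each step is a factor of 1000 (binary units use 1024 instead), and it assumes a constant rate of exactly 500 terabytes per day, which is a simplification of the figure quoted above. The `UNITS` table and the `convert` helper are names chosen for this illustration.

```python
# Decimal (SI) storage units expressed in bytes.
UNITS = {
    "KB": 10**3,
    "MB": 10**6,
    "GB": 10**9,
    "TB": 10**12,
    "PB": 10**15,
    "EB": 10**18,
}

def convert(value, from_unit, to_unit):
    """Convert a size between two units listed in UNITS."""
    return value * UNITS[from_unit] / UNITS[to_unit]

# One petabyte expressed in gigabytes.
print(convert(1, "PB", "GB"))        # 1000000.0

# Assumed constant rate: 500 TB per day for 365 days, in petabytes.
print(convert(500 * 365, "TB", "PB"))  # 182.5
```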

Sources of Big Data

Uses of Big Data

  • Healthcare: Big data plays an important role in healthcare. Hospitals can collect information on millions of patients and use it to decide which treatments work, so that new patients can be treated properly.
  • Banking: Big data plays an important role in the banking sector. A bank can directly access a customer's financial data, monitor the customer's financial activities, and offer loans and credit accordingly. With big data, a bank can increase its sales and control fraud.
  • Politics: Big data is also very important in politics. By analyzing previous voting data and finding out what interests new voters, parties can attract them to vote, and voters can also make the right decision based on the data.
  • Retail: Business activities can be monitored with big data. It reveals which products customers are using and what their requirements are, and new products are brought to market on this basis.
  • Education: Big data is also important in education. It is useful for both students and institutes.
  • The uses of big data in education are as follows:
    • Enhanced grading system
    • Career guidance to students
    • Proposing new learning plans
    • Improved student results
    • Reduced dropouts

Characteristics of Big Data

  • Volume is one of the characteristics of big data. It refers to the size of the data.
  • Big data contains a huge amount of data, which grows every second from various sources such as Facebook, the cloud, the web, social networking sites, Wal-Mart, etc.
  • The size of big data is very large: petabytes, exabytes, or even more.
  • This data is stored in data warehouses and brought together by a software framework such as Hadoop.


  • Variety is also one of the important characteristics of big data.
  • Variety refers to the source and nature of the data. Big data consists of data in structured, unstructured, and semi-structured forms. Earlier, most data was collected in databases and spreadsheets.
  • Nowadays, data also takes the form of emails, photos, video, audio, social media, text messages, PDFs, graphs, and all types of machine-generated output, such as cell phone GPS signals, machine logs, and device data.
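The three forms of data can be contrasted with a short Python sketch. All sample values here are hypothetical. Structured data has fixed columns, semi-structured data carries its own nested and flexible structure, and unstructured data is free text from which a program can extract only simple facts (here, a word count) without further analysis.

```python
import csv
import io
import json

# Structured: fixed columns, as in a database table or spreadsheet.
structured = "id,name,amount\n1,Asha,250\n2,Ravi,400\n"
rows = list(csv.DictReader(io.StringIO(structured)))

# Semi-structured: self-describing, with nested and optional fields.
semi_structured = '{"user": "asha", "tags": ["photo", "travel"], "location": {"city": "Pune"}}'
record = json.loads(semi_structured)

# Unstructured: free text with no predefined fields.
unstructured = "Loved the new phone, but the battery drains too fast."
word_count = len(unstructured.split())

print(rows[1]["amount"])           # value read by column name
print(record["location"]["city"])  # value read by nested key
print(word_count)                  # only a crude measure is available
```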


  • Value is an important characteristic of big data that we need to focus on. Data is of no use until it is converted into value.
  • Therefore, data needs to be made valuable so that correct information can be obtained from it.
  • What matters is not just that data is stored and processed, but that it is valuable, reliable, and trustworthy.


  • The term ‘velocity’ refers to the speed at which data is generated.
  • This is a main aspect of big data: the data has to be made available quickly, as demand requires.
  • Data flows very fast and continuously. Velocity is the speed at which data flows in from different sources such as social media, social influencers, legacy documents, data warehouse appliances, etc.


  • Big data consists of structured and unstructured data. Unstructured data can be messy, which makes it difficult to control its quality and accuracy. Veracity refers to the inconsistency and messiness of the data.
  • Because big data comes from many different sources, it arrives in different forms. This sometimes leads to inconsistencies and uncertainty, which cause confusion and are difficult to control.
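A minimal Python sketch of what such inconsistency can look like in practice is given here. The records and the `quality_report` helper are hypothetical. The same city is spelled three different ways, one age is missing, and one age is impossible; a simple report counts these problems before the data is used.

```python
# Hypothetical records collected from different sources.
records = [
    {"id": 1, "city": "Delhi", "age": 34},
    {"id": 2, "city": "delhi ", "age": None},
    {"id": 3, "city": "DELHI", "age": -5},
]

def quality_report(rows):
    """Count simple veracity problems: missing, invalid, inconsistent values."""
    missing = sum(1 for r in rows if r["age"] is None)
    invalid = sum(1 for r in rows if r["age"] is not None and r["age"] < 0)
    raw_spellings = {r["city"] for r in rows}
    # Normalizing case and whitespace reveals that the spellings refer to one city.
    normalized = {r["city"].strip().lower() for r in rows}
    return {
        "missing_age": missing,
        "invalid_age": invalid,
        "city_spellings": len(raw_spellings),
        "distinct_cities": len(normalized),
    }

print(quality_report(records))
```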


Q1. What does ‘velocity’ in big data mean?


1. Speed of input data generation

2. Speed of output data

3. Speed of processing

4. Speed of storing and processing data

Answer: 4

Q2. Which of the following are examples of real time big data processing?


1. Detection of fraudulent bank transactions

2. Stock market data analysis

3. Health sector

4. Both (1) & (3)

Answer: 4