2180710 BDA Syllabus
Big Data Analytics 2180710 Syllabus
GUJARAT TECHNOLOGICAL UNIVERSITY
COMPUTER ENGINEERING (07) & INFORMATION TECHNOLOGY (16)
BIG DATA ANALYTICS
SUBJECT CODE: 2180710 B.E. 8th SEMESTER
Type of course: Elective
Teaching and Examination Scheme:
|Teaching Scheme||Credits||Examination Marks||Total Marks|
|L||T||P||C||Theory Marks||Practical Marks|
|PA (M)||ESE (V)||PA
|Sr. No.||Content||Total Hrs||%
|1||INTRODUCTION TO BIG DATA
Introduction – distributed file system – Big Data and its importance, Four Vs, Drivers for Big Data, Big Data analytics, Big Data applications. Algorithms using MapReduce.
|2||INTRODUCTION TO HADOOP AND HADOOP
Big Data – Apache Hadoop & Hadoop Ecosystem, Moving Data in and out of Hadoop – Understanding inputs and outputs of MapReduce, Data Serialization.
|3||HDFS, HIVE AND HIVEQL, HBASE
HDFS: Overview, Installation and Shell, Java API; Hive Architecture and Installation, Comparison with Traditional Database, HiveQL: Querying Data, Sorting and Aggregating, MapReduce Scripts, Joins & Subqueries; HBase concepts, Advanced Usage, Schema Design, Advanced Indexing; Pig; ZooKeeper: how it helps in monitoring a cluster, how HBase uses ZooKeeper, and how to build applications with ZooKeeper.
Introduction to Data Analysis with Spark, Downloading Spark and Getting Started, Programming with RDDs, Machine Learning with MLlib.
What is it?, Where It is Used, Types of NoSQL databases, Why NoSQL?, Advantages of NoSQL, Use of NoSQL in Industry, SQL vs NoSQL, NewSQL
|6||Data Base for the Modern Web
Suggested Specification table with Marks (Theory):
|Distribution of Theory Marks|
|R Level||U Level||A Level||N Level||E Level||C Level|
Legends: R: Remembrance; U: Understanding; A: Application; N: Analyze; E: Evaluate; C: Create and above Levels (Revised Bloom’s Taxonomy)
Note: This specification table shall be treated as a general guideline for students and teachers. The actual distribution of marks in the question paper may vary slightly from the above table.
Reference Books:
- Boris Lublinsky, Kevin T. Smith, Alexey Yakubovich, “Professional Hadoop Solutions”, Wiley, ISBN: 9788126551071, 2015.
- Chris Eaton, Dirk deRoos et al., “Understanding Big Data”, McGraw Hill, 2012.
- Seema Acharya, Subhashini Chellappan, “Big Data and Analytics”, Wiley.
- Kyle Banker, Peter Bakkum, Shaun Verch, “MongoDB in Action”, Dreamtech Press.
- Tom White, “Hadoop: The Definitive Guide”, O'Reilly, 2012.
- Vignesh Prajapati, “Big Data Analytics with R and Hadoop”, Packt Publishing, 2013.
- Holden Karau, “Learning Spark: Lightning-Fast Big Data Analysis”.
Course Outcome:
Upon completion of this course, students will be able to do the following:
- Students will be able to build and maintain reliable, scalable, distributed systems with Apache Hadoop.
- Students will be able to write MapReduce-based applications.
- Students will be able to design and build MongoDB-based Big Data applications and learn the MongoDB query language.
- Students will learn the difference between the conventional SQL query language and basic NoSQL concepts.
- Students will learn tips and tricks for Big Data use cases and solutions.
List of Experiments:
- To understand the overall programming architecture using Map Reduce API
- Store basic information about students, such as roll no., name, date of birth, and address, using various collection types such as List, Set and Map
- Basic CRUD operations in MongoDB
- Retrieve various types of documents from students collection
- To find documents from Students collection
- Develop a MapReduce Work Application
- Create HDFS tables, load them into Hive, and learn to join tables in Hive
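As a starting point for the MapReduce experiments above, the following is a minimal sketch of the map, shuffle and reduce phases applied to a word count. It is plain Python that only imitates the programming model on a single machine; it does not use the Hadoop MapReduce API, and the function names (map_phase, shuffle_phase, reduce_phase, word_count) are illustrative assumptions, not part of any library.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_phase(line: str) -> Iterator[Tuple[str, int]]:
    """Emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1


def shuffle_phase(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Group all emitted values by key, as the framework would between map and reduce."""
    groups: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return dict(groups)


def reduce_phase(key: str, values: List[int]) -> Tuple[str, int]:
    """Combine all values for one key into a single count."""
    return key, sum(values)


def word_count(lines: Iterable[str]) -> Dict[str, int]:
    """Run the three phases in sequence on an in-memory list of lines."""
    mapped = (pair for line in lines for pair in map_phase(line))
    grouped = shuffle_phase(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    sample = ["big data big ideas", "data moves in and out of hadoop"]
    for word, count in sorted(word_count(sample).items()):
        print(word, count)
```

In a real Hadoop job the map and reduce steps would run on different nodes and the shuffle would be handled by the framework; the sketch keeps everything in memory only to make the data flow visible.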
Design based Problems (DP)/Open Ended Problem:
- Create a system which makes use of web search, web crawlers and web information retrieval.
- Analyze and implement a system with web graph mining.
- Implement and subscribe to RSS news feeds to get the latest news in India.
XMLSpy, RSS Feed, RSS Reader.
List of Open Source Software/learning website:
ACTIVE LEARNING ASSIGNMENTS: Preparation of PowerPoint slides, which include videos, animations, pictures and graphics for better understanding of theory and practical work. The faculty will allocate chapters or parts of chapters to groups of students so that the entire syllabus is covered. The PowerPoint slides should be put up on the website of the College/Institute, with the names of the students of the group, the name of the faculty, the Department and the College on the first slide. The best three works should be submitted to GTU.