21CS753 Introduction To Big Data

Course Learning Objectives
CLO 1. Understand Hadoop Distributed File System and examine MapReduce Programming
CLO 2. Explore Hadoop tools and manage Hadoop with Sqoop
CLO 3. Appraise the role of data mining and its applications across industries
CLO 4. Identify various Text Mining techniques
SYLLABUS COPY
MODULE - 1
Hadoop Distributed File System
HDFS Design, Features, HDFS Components, HDFS User Commands
Hadoop MapReduce Framework
The MapReduce Model, MapReduce Parallel Data Flow, MapReduce Programming
MODULE - 2
Essential Hadoop Tools
Using Apache Pig, Using Apache Hive, Using Apache Sqoop, Using Apache Flume, Apache HBase
MODULE - 3
Data Warehousing
Introduction, Design Consideration, DW Development Approaches, DW Architectures
Data Mining
Introduction, Gathering and Selection, Data Cleaning and Preparation, Outputs of Data Mining, Data Mining Techniques
MODULE - 4
Decision Trees
Introduction, Decision Tree Problem, Decision Tree Construction, Lessons from Constructing Trees, Decision Tree Algorithm
Regressions
Introduction, Correlations and Relationships, Non-Linear Regression, Logistic Regression, Advantages and Disadvantages
MODULE - 5
Text Mining
Introduction, Text Mining Applications, Text Mining Process, Term Document Matrix, Mining the TDM, Comparison, Best Practices
Web Mining
Introduction, Web Content Mining, Web Structure Mining, Web Usage Mining, Web Mining Algorithms
Course Outcomes
At the end of the course the students will be able to:
CO 1. Master the concepts of HDFS and MapReduce framework.
CO 2. Investigate Hadoop-related tools for Big Data Analytics and perform basic
CO 3. Infer the importance of core data mining techniques for data analytics.
CO 4. Use Machine Learning algorithms for real world big data.
Suggested Learning Resources
Textbooks
1. Douglas Eadline, “Hadoop 2 Quick-Start Guide: Learn the Essentials of Big Data Computing in the Apache Hadoop 2 Ecosystem”, 1st Edition, Pearson Education, 2016.
2. Anil Maheshwari, “Data Analytics”, 1st Edition, McGraw Hill Education, 2017.