Course Title: Web Mining

Number of Units: 4

Schedule: Three hours of lecture and one hour of discussion per week.

Prerequisites: Basic concepts in data mining

Catalog Description:
Web mining aims to discover useful knowledge from the Web, e.g. from Web hyperlink structure, Web page content and Web usage logs. Based on the primary type of data used in the mining process, Web mining tasks are classified into three main types: Web structure mining, Web content mining and Web usage mining. The goal of this course is to present these tasks and their essential algorithms.

Expanded Description: (Following Chapters from the Book)
1.   Introduction to Web Intelligence
     1.1   Historical Perspective
     1.2   Towards Intelligent Web
     1.3   Knowledge
     1.4   Web Mining
     1.5   Building Better Web sites using Intelligent Technologies
     1.6   Benefits of Intelligent Web

6.   Web Usage Mining
     6.1   Introduction to Web Mining
     6.2   Introduction to Web Usage Mining
     6.3   Web Log Processing
     6.4   Analyzing Web Logs
     6.5   Web Usage Mining Applications

7.   Web Content Mining
     7.1   Introduction
     7.2   Data Collections
     7.3   Search Engines
     7.4   Robot Exclusion
     7.5   Personalization of Web Content
     7.6   Multimedia Information Retrieval

8.   Web Structure Mining
     8.1   Introduction
     8.2   Modeling Web Topology
     8.3   Other Approaches to Studying the Web-Link Structure  

Course Objectives & Role in the Program:
This course has three objectives. First, to provide students with a sound basis in Web data mining tasks and techniques. Second, to ensure that students are able to read and critically evaluate Web mining research papers. Third, to ensure that students are able to implement and use some of the important Web mining algorithms.

Learning Outcome:
Students will learn: (a) how search engines index and rank Web documents,
(b) how to derive business intelligence from online resources, and
(c) how to apply Web mining strategies and algorithms in their workplace or research careers.

Method of Evaluation:
  1. Midterm: 25%
  2. Final Exam: 40%
  3. Projects from the Textbook: 
    • Project 1: Algorithm implementation (15%)
    • Project 2: Research project (including implementation) (20%)

Required Book:

  1. Building an Intelligent Web: Theory & Practice, R. Akerkar & P. Lingras; Jones & Bartlett, 2007. 

Reference Book:
  1. Mining the Web: Discovering Knowledge from Hypertext Data, S. Chakrabarti; Morgan Kaufmann Publishers, 2003.

© 2006-14 Technomathematics Research Foundation