Information Retrieval Systems course
This page contains the syllabus, lecture slides, reading material, and exam information for the course "Information Retrieval Systems".
For any questions or comments regarding the lectures or this website, please contact Rajendra Akerkar.
Course Syllabus
The objective of this course is to cover the fundamentals of information retrieval (IR): indexing, search, relevance, classification, organisation, storage, browsing, visualisation, and related topics. The focus is on the prominent algorithms and methods used in the field, from a computer scientist's perspective.
The course outline:
Session 1 Introduction to information retrieval and extraction
Session 2 Conventional information retrieval systems
Session 3 Document processing
Session 4 Automatic indexing
Session 5 Information retrieval models
Session 6 Retrieval performance evaluation
Session 7 Query operation
Session 8 Relevance feedback
Session 9 Clustering techniques
Session 10 Searching on the Web
Session 11 Information extraction
Learning outcomes:
- Learn:
  - How does Web search work?
  - What’s the future of Web search?
- Learn how to:
  - Analyze, discuss, and present research papers
  - Do projects at the frontier of Web search and information extraction
Project, Presentation Topics and Lecture Slides
The course material will be available via the course management system.
Course Structure & Resources
This is a two-week course. Each class session lasts 4 hours 30 minutes. Classes comprise lectures, hands-on practice, and discussion. Students are encouraged to participate in class discussion and will make at least one presentation during the course. Morning sessions consist mainly of lectures and group exercises. Afternoons focus mainly on assessment activities and personal study. Students are expected to organise their time to cover preparatory work and assessment activities.
The textbooks for the course are:
- Rajendra Akerkar and Pawan Lingras. Building an Intelligent Web: Theory & Practice. Jones and Bartlett, 2008, ISBN-13: 978-0-7637-4137-2, ISBN-10: 0-7637-4137-X.
- R. Baeza-Yates, B. Ribeiro-Neto. Modern Information Retrieval. Addison-Wesley, 2010, ISBN: 9780321416919.
Conferences
Journals
General Reading Material
- World-Wide Web Consortium (W3C)
- On-line textbook on Information Retrieval by C. J. van Rijsbergen (1979)
- Information Retrieval Links
- UMass Center for Intelligent Information Retrieval
- Bibliography on Zipf's Law
- Web Robots Pages
- Prosecuting Bots for Trespassing (e.g. eBay vs. Bidder's Edge) (or try a Google search on "robots.txt lawsuit")
- Search Engine Watch
- Search Tools for Web Sites
- History of Search Engines
- Scientific American articles on XML and the Semantic Web
- Web IR and IE
- Reading List on Machine Learning and Information Retrieval
- Repository of Online Information Sources Used in Information Extraction Tasks
- Bibliography on Automated Text Categorization
- Recommender Systems Links
- NY Times article on Text Mining
- Wired article on Google's Algorithm
- Weaving the Web: The original design and ultimate destiny of the World Wide Web, by its inventor, Tim Berners-Lee with Mark Fischetti, 1999.
- Speeding the Net: The Inside Story of Netscape and How It Challenged Microsoft, Joshua Quittner, Michelle Slatalla, 1998.
- The Search: How Google and Its Rivals Rewrote the Rules of Business and Transformed Our Culture, John Battelle, 2005.
- The Google Story, David Vise and Mark Malseed, 2005.
- Planet Google: One Company's Audacious Plan To Organize Everything We Know, Randall Stross, 2008.
- In The Plex: How Google Thinks, Works, and Shapes Our Lives, Steven Levy, 2011.
- Linked: The New Science of Networks: How Everything is Connected to Everything Else and What it Means for Science, Business and Everyday Life, A.-L. Barabási, 2002.
- The Long Tail: Why the Future of Business is Selling Less of More, Chris Anderson, 2006.
Assignments/Exam
This course is assessed by coursework (40%) and an online exam (60%). There are two types of coursework. The first is a Group Presentation: a 30-minute group talk by three students on a given topic, using PowerPoint slides. The second is a Project Report.
ASSIGNMENT 1: GROUP PRESENTATION – 10%
You will give a group presentation: a 30-minute talk presented to your seminar group in the second week. The talk will be followed by 5 minutes for questions. The presentation represents 10% of the total marks for the course.
ASSIGNMENT 2: PROJECT REPORT – 30%
For this project, students write a report on an IR and IE system, chosen from three options. The project report represents 30% of the total marks for the course.
EXAMINATION – 60%
60% of the marks for the course are allocated to a final exam, which takes place online through the course management system. Advice on what the exam consists of and how to approach it will be given in the last session of the course.
Details about the end-of-semester exam will be available soon.
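As a rough illustration of how the three weighted components combine into one course percentage, here is a minimal Python sketch. The weights (10%, 30%, 60%) come from the breakdown above; the function name `overall_percentage`, the component keys, and the assumption that each component is marked on a 0 to 100 scale are hypothetical and not part of the course rules.

```python
# Weights in percent, taken from the assessment breakdown above.
# The key names are illustrative, not official component names.
WEIGHTS = {"presentation": 10, "project_report": 30, "exam": 60}


def overall_percentage(marks):
    """Combine component marks (assumed 0-100 each) into one course percentage."""
    missing = set(WEIGHTS) - set(marks)
    if missing:
        raise ValueError(f"missing component marks: {sorted(missing)}")
    for name in WEIGHTS:
        if not 0 <= marks[name] <= 100:
            raise ValueError(f"{name} mark must be between 0 and 100")
    # Weighted sum, divided by 100 because the weights are stored as percents.
    return sum(WEIGHTS[name] * marks[name] for name in WEIGHTS) / 100


if __name__ == "__main__":
    example = {"presentation": 80, "project_report": 70, "exam": 65}
    print(overall_percentage(example))
```

Storing the weights as whole percents keeps the arithmetic exact for whole-number marks and makes it easy to check that they sum to 100.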
Marking & Grading
Candidates are evaluated on a 10-point scale. The grading pattern is as follows:
| Percentage | 96≤P≤100 | 90≤P≤95 | 80≤P≤89 | 70≤P≤79 | 60≤P≤69 | 55≤P≤59 | 50≤P≤54 | 40≤P≤49 | 31≤P≤39 | 00≤P≤30 |
| G | 10 | 9 | 8 | 7 | 6 | 5 | 4 | 3 | 2 | 0 |
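The grading pattern above can be read as a simple lookup from a percentage P to a grade G. The following Python sketch shows one way to express it. The band boundaries and grades are copied from the table; the function name `grade_for` is hypothetical, and the sketch assumes P is already a whole number, since the table lists whole-number bands and does not say how fractional percentages are rounded.

```python
# (lower bound of band, grade), highest band first, copied from the table above.
BANDS = [
    (96, 10), (90, 9), (80, 8), (70, 7), (60, 6),
    (55, 5), (50, 4), (40, 3), (31, 2), (0, 0),
]


def grade_for(percentage):
    """Return the grade G for a whole-number percentage P between 0 and 100."""
    # Assumption: P is an integer; rounding of fractional marks is not specified.
    if not isinstance(percentage, int):
        raise TypeError("percentage must be a whole number")
    if not 0 <= percentage <= 100:
        raise ValueError("percentage must be between 0 and 100")
    # The first band whose lower bound P reaches is the matching band.
    for lower_bound, grade in BANDS:
        if percentage >= lower_bound:
            return grade


if __name__ == "__main__":
    for p in (100, 95, 54, 31, 30):
        print(p, grade_for(p))
```

Because the bands are listed from highest to lowest, only the lower bound of each band needs to be checked.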
Specific criteria for judging the assignments will vary, but generally they will be judged on:
- knowledge of the literature and/or available information
- clarity of oral or written presentation
- originality of analysis or interpretation by the student or group
- appropriate application of the techniques used
- justification of design choices or value statements
This website is licensed under a Creative Commons Attribution-Share Alike 3.0 License.