
New: I have a couple of RA/TA positions, which could be converted into PhD positions depending on satisfactory performance. Contact me for details.

Information Retrieval

Lecture 1

Information Retrieval: Natural Language Processing - Introduction

Date

26th Aug, 2015

Lab/Assignment

Lab

Regular expressions for tokenization and sentence boundary identification.
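
As a starting point for this lab, here is a minimal Python sketch of regex-based tokenization and sentence splitting. The pattern, the names `TOKEN_RE`, `SENTENCE_END_RE`, `tokenize`, and `split_sentences`, and the choice of token classes are illustrative assumptions, not a reference solution. The sentence splitter is deliberately naive: it breaks after every `.`, `!`, or `?` followed by whitespace, so abbreviations such as "Dr. Smith" are split incorrectly and scripts with other sentence-final marks need an extended character class.

```python
import re

# Illustrative token pattern (an assumption, not a reference solution).
# Alternatives are tried left to right, so the more specific ones come first.
TOKEN_RE = re.compile(
    r"https?://\S+"          # URLs (may swallow trailing punctuation)
    r"|[@#]\w+"              # @mentions and #hashtags
    r"|\d+(?:[.,]\d+)*"      # numbers such as 3.14 or 1,000
    r"|\w+(?:['-]\w+)*"      # words, allowing internal apostrophes or hyphens
    r"|[^\w\s]"              # any other single non-space symbol
)

# Naive sentence boundary: whitespace that follows ., ! or ?
SENTENCE_END_RE = re.compile(r"(?<=[.!?])\s+")


def tokenize(text):
    """Return the list of tokens found in text, in order."""
    return TOKEN_RE.findall(text)


def split_sentences(text):
    """Split text into sentences at the naive boundary defined above."""
    return [s for s in SENTENCE_END_RE.split(text.strip()) if s]


if __name__ == "__main__":
    sample = "Hello @bob, see http://t.co/x #nlp! It costs 1,000 rupees."
    for sentence in split_sentences(sample):
        print(tokenize(sentence))
```

Listing the URL and mention patterns before the generic word pattern matters, because Python's `re` alternation takes the first alternative that matches rather than the longest one.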

Assignment

Find a data set for your respective language on Twitter/Facebook

Run tokenization and sentence boundary identification on the data

Lecture 2

Spell Correction - Minimum Edit Distance

Date

28th Aug, 2015

Lab/Assignment

Lab

Write code for minimum edit distance (MED): standard, with alignment, and weighted.
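
The three variants named above can be sketched with one dynamic-programming table in Python. This is a minimal sketch under stated assumptions, not a reference solution: the function names (`edit_distance_table`, `min_edit_distance`, `align`) and the `"*"` gap symbol are hypothetical, "weighted" is interpreted here as fixed per-operation costs (a per-character cost matrix would be a further extension), and the backtrace assumes integer costs so that equality comparisons are exact.

```python
def edit_distance_table(source, target, ins_cost=1, del_cost=1, sub_cost=1):
    """Fill the DP table; dist[i][j] is the cost of turning source[:i] into target[:j]."""
    n, m = len(source), len(target)
    dist = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dist[i][0] = dist[i - 1][0] + del_cost
    for j in range(1, m + 1):
        dist[0][j] = dist[0][j - 1] + ins_cost
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            same = source[i - 1] == target[j - 1]
            dist[i][j] = min(
                dist[i - 1][j] + del_cost,                       # delete from source
                dist[i][j - 1] + ins_cost,                       # insert into source
                dist[i - 1][j - 1] + (0 if same else sub_cost),  # match or substitute
            )
    return dist


def min_edit_distance(source, target, ins_cost=1, del_cost=1, sub_cost=1):
    """Standard MED with the default costs; weighted MED with other costs."""
    return edit_distance_table(source, target, ins_cost, del_cost, sub_cost)[-1][-1]


def align(source, target, ins_cost=1, del_cost=1, sub_cost=1):
    """Backtrace one optimal alignment as (source_char, target_char) pairs.

    "*" marks a gap: ("a", "*") is a deletion, ("*", "b") an insertion.
    """
    dist = edit_distance_table(source, target, ins_cost, del_cost, sub_cost)
    i, j = len(source), len(target)
    pairs = []
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            same = source[i - 1] == target[j - 1]
            if dist[i][j] == dist[i - 1][j - 1] + (0 if same else sub_cost):
                pairs.append((source[i - 1], target[j - 1]))
                i, j = i - 1, j - 1
                continue
        if i > 0 and dist[i][j] == dist[i - 1][j] + del_cost:
            pairs.append((source[i - 1], "*"))
            i -= 1
        else:
            pairs.append(("*", target[j - 1]))
            j -= 1
    pairs.reverse()
    return pairs


if __name__ == "__main__":
    print(min_edit_distance("intention", "execution"))              # unit costs
    print(min_edit_distance("intention", "execution", sub_cost=2))  # substitution costs 2
    print(align("kitten", "sitting"))
```

Keeping the table construction separate from the backtrace means the weighted variant needs no extra code: only the cost arguments change.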

Assignment

details

Topics to be covered

  • Introduction to Information Retrieval, NLP basics, Reg Exp, Tokenization, Stemming, Sentence Boundary Detection
  • TF-IDF and Compression
  • Page-Rank and Link-Analysis
  • Spell Correction, MED
  • Language Modelling, Language Identification
  • Text Classification. Naive Bayes, Spam-Ham, WEKA
  • Evaluation of IR
  • Clustering, Document Similarity, Cosine Scores
  • Vector space Models
  • Relevance feedback and Query Expansion
  • ===== Mid Sem: Oct 15-17 =====
  • Probabilistic information retrieval
  • Question Answering
  • Introduction to Big data Hadoop, Map-Reduce
  • ===== End Sem: Dec 11-17 =====

References

Textbooks

  • Introduction to Information Retrieval, by C. Manning, P. Raghavan, and H. Schütze. Cambridge University Press, 2008.
  • Readings in Information Retrieval. Karen Sparck Jones and Peter Willett. San Francisco: Morgan Kaufmann, 1997.

References

  • Search Engines: Information Retrieval in Practice, by Bruce Croft, Donald Metzler and Trevor Strohman.
  • Information Retrieval: Algorithms and Heuristics. David A. Grossman and Ophir Frieder. Dordrecht, The Netherlands: Springer, 2004
  • Modern Information Retrieval, by R. Baeza-Yates and B. Ribeiro-Neto.

Evaluation and Grading

  • Mid sems: 30%
  • End sems: 30%
  • Group Project: 20%
  • Class projects: 20%

Attendance Policy

Attendance will be taken every day, and missing class can be expected to significantly reduce your chances of success. Lectures will not be repeated.

Missing Exams

  • If you miss an exam due to an unexcused absence, you will receive a grade of 0 for that quiz/exam.
  • If you miss an exam due to an excused absence, you must provide appropriate verification within one week of the quiz/exam. You will then be allowed to take a make-up exam at a date/time to be decided later. The make-up exam may be SIGNIFICANTLY MORE DIFFICULT than the original exam.
  • If you cannot be at the final exam, let me know as soon as you know.
  • No excuses will be entertained for the final project. If you do not work on the project or fail to submit the report, you will receive a grade of 0.