Wednesday, October 29, 2008

Next topic: Text classification (required reading: Chapter 14 of Manning et al.'s book)

Our next topic is classification techniques. Classification is an enormous area and entire courses are devoted to it.
We will spend only about 1.5 classes on it, getting a bird's-eye view of the main issues and looking at the Naive Bayes classifier, a technique that works about as well
as a default strategy in text classification as K-means does as a default strategy for clustering.

Classification techniques are also useful in content-based filtering (which will come up in the discussion of recommender systems that starts next week).

The reading for tomorrow is Chapter 14 of Manning et al.'s book.

rao
