Tuesday, December 23, 2008



 The grades have been submitted, and I assume several of you already know yours.

Judging from a sampling of mails prior to the grades, it is possible that several of you
received better grades than you suspected. This is all part of a carefully cultivated delusion
on my part that this class is somehow harder than your other classes and should have
more Christmassy grades.. :->

If you did get a grade better than you suspected, just remember the climactic scene from Saving Private Ryan
and just "... earn it" (which could, for example, involve listening to the final exam answers audio so you don't
carry misunderstandings forward).

 I also wanted to take this opportunity to publicly thank Garrett--your long-suffering TA. He had to grade
three projects, four long homeworks and see a demo, and voluntarily held additional office hours over weekends
to help you with your projects.

 I wish you all great holidays and hope our paths will cross again in the future (and that we won't glare at each other if they do ;-).

 I myself am off to Hawaii where I expect to catch golf balls for my buddy Barack.


Monday, December 22, 2008

Final exam answers

As you wait to see your grades, I thought this is pretty much the last chance I have to get you to hear the answers for the final exam.

Here it is in audio form (imagine--you can put it on your iPod and hear it instead of those cloying Christmas songs...), and just 31 min long.


Seriously, if you got a low score on the final, then you probably want to listen to this so you won't carry the misunderstandings with you.


Cumulatives with late penalties and participation


 There were a bunch of late submissions for which we had forgotten to include the late submission penalty. The version below includes that. Also, the total+P column is the total that includes participation credit.



Project 3

If anyone wants to stop by to pick up project 3, I am here in the lab now (557AD).


Final Cumulatives (sans extra credit)--


 Here are the final "real" cumulatives (with the final exam marks thrown in). These are computed
with 40% for projects, 20% for homeworks and 40% for exams. [The participation credit is a subjective credit
given out of 5 points. It is not currently added into the cumulatives]
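
As a rough illustration of the weighting described above, here is a minimal sketch. It assumes each component is already expressed as a percentage; the helper name `cumulative` and the sample numbers are made up for the example and are not anyone's actual scores.

```python
# Weights taken from the post: 40% projects, 20% homeworks, 40% exams.
WEIGHTS = {"projects": 0.40, "homeworks": 0.20, "exams": 0.40}


def cumulative(scores):
    """Weighted total of component percentages (hypothetical helper)."""
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


# Made-up component percentages, not real class data.
example = {"projects": 90.0, "homeworks": 80.0, "exams": 70.0}
total = cumulative(example)
```

Participation credit is left out here, matching the note that it is not added into these cumulatives.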

*****I am willing to hear from the students at the top of each list (598 and 471) as to what the grade cutoffs should be
for the rest of the class (these two students are guaranteed an A at least).***** (Those two can send me email with their suggestions.)



final exam points

Here are the stats on the final exam that I just finished grading.


Saturday, December 20, 2008

Project 3

If anyone needs their project 3, please send me a mail.






Cumulatives (sans the final exam grade)


 Here are the current cumulatives (I am still grading the final--so those are not included). If you find any errors, please let me and Garrett know ASAP.


Tuesday, December 16, 2008

Blowing Steam Topic: Questions on the final that may have infuriated you etc.


 Here is a last topic for the blog where you are encouraged to discuss any questions on the final that you found insulting/infuriating/intriguing -- or just want to know what is supposed to be the right answer.


Attendance statistics...

According to the participation survey,

 Avg # classes missed: 1.3 (average missed with prior notification: 0.3)
Standard Dev: 1.4
Max: 6
Min: 0
[11 students had perfect attendance; so, surprisingly, did Rao!]


Monday, December 15, 2008

Re: Semantics and schemas

Not an easy question to answer via email, but I will try..

All I meant when I said databases and/or XML don't have semantics is that the columns don't have any of the meaning that you might associate with them based on their "English" names.

Databases do have semantics--they have basically "look-up" semantics. Each tuple describes a complete world and you can do
selection/projection/join/count etc inference (queries). However, just because the database tuple says the person is "dead", you can't assume that the database can answer a query "is the person currently reading a paper?".

The more background knowledge you provide, the deeper (i.e., non-lookup) inferences you can do. [A set of RDF triples provides no more inferential power than a set of relational tables conforming to a single schema. For example, a database tuple
<id=345, grade=A, gender=male> corresponds to the RDF triples t1--id--345; t1--grade--A; t1--gender--male.
It is the background knowledge--via RDF-Schema--that allows you to do more inferences.]
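
The tuple-to-triples correspondence above can be sketched in a few lines. This is only an illustration: plain Python tuples stand in for RDF triples, and the helper names `tuple_to_triples` and `lookup`, as well as the `is_reading` property, are hypothetical.

```python
def tuple_to_triples(subject, row):
    """Flatten one relational tuple into (subject, property, value) triples."""
    return [(subject, column, value) for column, value in row.items()]


def lookup(triples, subject, prop):
    """Look-up inference: return the stored values and nothing more."""
    return [v for s, p, v in triples if s == subject and p == prop]


# The tuple from the text, keyed by column name.
row = {"id": 345, "grade": "A", "gender": "male"}
triples = tuple_to_triples("t1", row)

grade = lookup(triples, "t1", "grade")         # a stored fact
reading = lookup(triples, "t1", "is_reading")  # never stated, so no answer
```

Both representations answer exactly the same look-up queries; neither can say anything about a property that was never stored.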

Look at it another way. If you remember the Magellan story: after hearing that Magellan is an explorer and that he went around the world three times, a look-up inference can answer "Is Magellan an explorer?" and "How many trips did he make?". A look-up inference, however, cannot answer the query "In which trip did Magellan die?"--that requires more background knowledge. [Of course, if you are happy with just look-up inferences, then you don't need background knowledge. However, when you have multiple autonomous databases and you want to "integrate" them, you in essence need at least a special type of background knowledge that maps the columns in both databases.]
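
To make the column-mapping point concrete, here is a minimal sketch of integrating two autonomous sources. Everything in it is assumed for illustration: the column names, the sample rows, and the helpers `rename` and `integrate` are hypothetical.

```python
# Two autonomous sources with different column names (made-up data).
db_a = [{"name": "Magellan", "job": "explorer"}]
db_b = [{"full_name": "Magellan", "num_trips": 3}]

# Hypothetical background knowledge: which columns mean the same thing.
COLUMN_MAP_B_TO_A = {"full_name": "name"}


def rename(row, mapping):
    """Rewrite a row's column names using the mapping."""
    return {mapping.get(col, col): val for col, val in row.items()}


def integrate(rows_a, rows_b, mapping, key):
    """Merge rows from both sources that agree on the key column."""
    merged = {row[key]: dict(row) for row in rows_a}
    for row in rows_b:
        row = rename(row, mapping)
        merged.setdefault(row[key], {}).update(row)
    return list(merged.values())


combined = integrate(db_a, db_b, COLUMN_MAP_B_TO_A, key="name")
```

Without the mapping, nothing tells the program that `full_name` and `name` refer to the same thing, so the two rows could not be joined.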

If you want a really clean grasp of what it means to talk about inference and semantics, I would recommend that you take Intro to AI ( http://rakaposhi.eas.asu.edu/cse471 ).


On Sat, Dec 13, 2008 at 10:42 PM, Pierre Bucher <Pierre.Bucher@asu.edu> wrote:
Dear Mr Kambhampati,

The answers to the question you put on the blog some time ago, about whether relational DB had semantics, made me doubt about the meaning of "semantics". I guess that XML Schema don't have semantics, because (1) we would not need RDF if it was the case (2) there is no concept of property and of inheritance in XML Schema, which seems to be the base of semantics. Then I would guess that relational DB don't have semantics either: they just have meta-data without meaning like XML and XML Schema. Is it true?

Moreover, is the fact that we have properties and relationships that gives RDF semantics? Then the concept of inheritance made possible by RDF Schema  would not give semantics, but just makes easier the processing of semantics (and actually we could do little without it). Is that right too?

Thank you.

Pierre Bucher.

Re: One question about Authority/Hub computation

By "different", I assume you are referring to them having different results during intermediate iterations, right?
We are not really interested in a1 and h1--but rather in a* and h*. Both reach the same fixed point. [I suspect the rate of convergence for the first method is going to be faster--for the same reason asynchronous iterations on PageRank make it converge faster.]
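
The two update schemes in the question quoted below can be compared with a small sketch. It is not the course's reference implementation. It assumes a made-up 4-page graph, all-ones starting vectors, and L2 normalization after every update (the question does not mention normalization; it is added here so the vectors stay bounded).

```python
import math


def matvec(m, v):
    """Multiply matrix m (list of rows) by vector v."""
    return [sum(x * y for x, y in zip(row, v)) for row in m]


def transpose(m):
    return [list(col) for col in zip(*m)]


def normalize(v):
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def method1(A, h0, iters):
    """a = A' h, then h = A a (uses the freshly updated a)."""
    At = transpose(A)
    h = list(h0)
    a = None
    for _ in range(iters):
        a = normalize(matvec(At, h))
        h = normalize(matvec(A, a))
    return a, h


def method2(A, a0, h0, iters):
    """a = (A'A) a and h = (AA') h, updated independently."""
    At = transpose(A)
    a, h = list(a0), list(h0)
    for _ in range(iters):
        a = normalize(matvec(At, matvec(A, a)))
        h = normalize(matvec(A, matvec(At, h)))
    return a, h


# Made-up graph: A[i][j] = 1 if page i links to page j.
A = [
    [0, 1, 1, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 0],
    [0, 1, 1, 0],
]
ones = [1.0] * 4

a1_first, h1_first = method1(A, ones, 1)
a1_second, h1_second = method2(A, ones, ones, 1)
a_star_first, h_star_first = method1(A, ones, 50)
a_star_second, h_star_second = method2(A, ones, ones, 50)
```

On this graph the authority vectors disagree after one iteration but agree to within floating-point error after 50, which is the sense in which only a* and h* matter.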


On Sun, Dec 14, 2008 at 5:05 PM, Jianhui Chen <Jianhui.Chen@asu.edu> wrote:
Hello, Dear Dr. Rao,

I have a question about the computation of Authority / Hub score as below:
If a0 and h0 are pre-given, let A be the adjacent matrix, we can compute a1 and h1 via following two approaches theoretically

1. a1 = A' * h0, h1 = A * a1
2. a1 = (A' * A) * a0, h1 = (A * A') * h0.

However, in generally, the two approaches would generate different result.

Could you please help to advice which one we should take for the computation?

Sorry for disturbing you during the weekend. Many thanks.


Re: Regarding the discussion

Basically yes..

What is interesting is that if you are learning only from positive examples, you need to have a pretty strong bias to begin with if you want to avoid over-generalizing..

Consider, for example, that your most general grammar hypothesis allows both languages that constrain word order (e.g. English) and those that don't (e.g. Hindi, Spanish--where you can, in essence, say "Tom Mary hit", while in English you have to say "Tom hit Mary" or "Mary hit Tom"). If I only give you positive examples of usage in English, you would not know that English doesn't allow sentences like "Tom Mary hit". In order to learn that, you will need to bias your
learner by saying that word order dependence and word order independence are mutually exclusive (so you won't over-generalize).
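
A toy sketch of the point above, with a deliberately tiny hypothesis space. Sentences are reduced to their role order (S = subject, V = verb, O = object). The two hypotheses, the helper names, and the "prefer the most specific hypothesis" rule are all assumptions made for illustration; that rule is just one simple stand-in for a learning bias.

```python
from itertools import permutations

ROLES = ("S", "V", "O")  # subject, verb, object

# Two candidate hypotheses about word order.
HYPOTHESES = {
    "fixed SVO order": {("S", "V", "O")},
    "free word order": set(permutations(ROLES)),
}


def consistent(hypotheses, positives, negatives=()):
    """Keep hypotheses that accept every positive and reject every negative."""
    return [
        name
        for name, accepted in hypotheses.items()
        if all(p in accepted for p in positives)
        and not any(n in accepted for n in negatives)
    ]


def most_specific(hypotheses, names):
    """Bias: among surviving hypotheses, prefer the one accepting the fewest orders."""
    return min(names, key=lambda name: len(hypotheses[name]))


# "Tom hit Mary" and "Mary hit Tom" both have the role order S V O.
english_positives = [("S", "V", "O"), ("S", "V", "O")]

no_bias = consistent(HYPOTHESES, english_positives)
with_negative = consistent(
    HYPOTHESES, english_positives, negatives=[("S", "O", "V")]
)
with_bias = most_specific(HYPOTHESES, no_bias)
```

With positive examples alone, both hypotheses survive. Either an explicit negative example ("Tom Mary hit") or a built-in bias is needed to settle on fixed word order.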

Of course, this is not just idle speculation---our current understanding is that children come into this world with a universal grammar which has all these kinds of constraints embedded, so they are able to learn grammar from mostly positive examples.


On Mon, Dec 15, 2008 at 4:13 PM, Shruti Gaur <sgaur2@asu.edu> wrote:
From the version space idea you explained, it seems the more positive examples we have, the more we can generalize (whereas the negative examples will break the current hypothesis we hold into the next level where we can evaluate each of them and remove the false ones.) which I guess is true in human learning as well.
This made the idea of generating grammar from positive examples more clear as grammar is also kind of a generalization of all the syntactically valid sentences we can make.eg (subject verb object) is the grammar or generalization where each constituent can have different values.

Please correct me if I am wrong

Thanks & Regards

Shruti Gaur
Graduate Student
MS(Computer Science)
Dept of Computer Science & Engineering
Arizona State University

Homework 4

If you would like to pick up your graded homework 4 before the final exam, please stop by the lab on the 5th floor (557AD).


REMINDER: Exam tomorrow Tuesday 12/16 12:10--2:00pm same room

availability today for final-exam-related questions..

I will be mostly in my office until about 3pm today. If you have questions, feel free to drop by.

(if you want to check before showing up, my office number is 480 965 0113)


The final exam will be closed-book (as had been agreed in the class)

Some of you sent mails asking whether the final will be open or closed book.

As had been agreed in the class a couple of weeks back, the final will be closed book. You just bring your
brains to the exam (fill them up first though..)

You can use standard (non-matrix) calculators during the exam.


Thursday, December 11, 2008

Final exam coverage..

The final exam is going to be comprehensive, but with significantly higher emphasis on the
topics not covered by the midterm.


Tuesday, December 9, 2008

in praise of intellectual swagger..


Just got done going through all your comments on the top-k ideas you appreciated. Thanks for the many thoughtful

I should probably sign off on that chummy note and resist my curmudgeon-like urge to kvetch (especially on the eve of teaching evaluations ;-), but I can't.

Despite the semester-long attempts to foster skepticism, there still was a little too much Google-swooning in the reviews for my taste. While I don't mean to legislate your predilections, I do want to suggest that it would be way more fun to be path-breakers rather than groupies. For the former you would need more skepticism and a lot more intellectual swagger.
No one ever changed/improved anything that they are smitten with.


Monday, December 8, 2008

Fwd: CEAS teaching evaluations..

Dear all:

 As per the message below, I encourage you to take part in the CEAS teaching evaluations.
I take the feedback seriously and pay particular attention to written comments on what worked and what didn't.

Here is hoping for a 110% turnout ;-)


---------- Forwarded message ----------
From: James Collofello <JAMES.COLLOFELLO@asu.edu>
Date: Mon, Dec 1, 2008 at 12:09 PM
To: "DL.WG.CEAS.Faculty" <DL.WG.CEAS.Faculty@mainex1.asu.edu>
Cc: Ann Zell <ann.zell@asu.edu>



The Fall 2008 teaching evaluations became available to students today, Monday, Dec 1, around 10:00 a.m. and will close on Wednesday, Dec 10 (reading day) at 12:00 midnight.  Students will be able to access the evaluation tool at: https://fultonapps.asu.edu/eval

Please encourage your students to complete the evaluations or face several nagging email requests.  Good luck on your scores!


Chairs – please distribute this message to your faculty associates.






James S. Collofello
Associate Dean of Academic and Student Affairs
Professor of Computer Science and Engineering
Ira A. Fulton School of Engineering
Arizona State University


Re: question about hw q3

Sorry for the confusion. You can (and should) have room information in the mediator schema.

[I didn't specifically mean to require that the mediator schema not have a room attribute (it would seem rather silly to have a course schema that doesn't give room information).]


On Mon, Dec 8, 2008 at 9:31 AM, Mijung Kim <Mijung.Kim.1@asu.edu> wrote:
Hi Professor,

I have a question regarding hw q3.

In the question,

"We have another source called ASU-CS-S02-Catalog- which exports the set of CS courses being taught in Spring 2003, the instructor who will be teaching them, and the rooms in which the classes will be taught. Write this source too as a materialized view on the mediator schema."

According to the mediator schema, we don't have "room" attribute. In this case, should we update mediator schema adding the room to it?

Could you enlighten me on this??



Saturday, December 6, 2008

participation evaluation sheet (please return in hardcopy on Tuesday)


 I put up a participation self-evaluation sheet at


Please print it, fill it out and bring it to class on Tuesday (I am sending this now rather than just bringing it to Tuesday's class since
the form asks you to remember how many classes you missed etc. and I would prefer getting better than order-of-approximation answers).


Thursday, December 4, 2008

Rescheduled Demo Slots

I forgot to block out the time slots which overlap with Tuesday's class. Would the 5 students who signed up during that time please contact me and let me know when you would like to do your demo?

Just to remind everyone, you will only have 20 minutes to complete the demo.  I completed one of the demos earlier today in about 12 minutes.  The demo was very thorough and included some waiting time as well as time to demo several extra credit functions, so 20 minutes should be more than enough time to complete it. 

Keep in mind that if you plan on using one of the lab machines for the demo, they sometimes take a while to boot/login.

I look forward to seeing everyone's demo next week.


The demo signup sheet

Hi Garrett (and the class)

 Here is the demo sign-up sheet with information on who took which slot.

I just noticed that several of those slots are right during the class time on Tue. This is a clear no-no (and I have already black-listed the folks who decided to sign up for them :-> ). Anyways, please reschedule those to times outside the class time.


Wednesday, December 3, 2008

Blog qn for Homework 4 (You should post your answer as a comment to this thread and also enclose it in your hw 4 submission)

Here is the "interactive review" question for homework 4 that I promised.

"List five nontrivial ideas you came to appreciate during the course of this semester"
(You cannot list generic statements like "I thought pagerank was kewl".) Here is an example:

The deep connection between feature selection and dimensionality reduction: By
understanding LSI, LDA and MI based feature selection, I was able to see the deep
connections between dimensionality reduction (which works without taking any particular
classification task into account) and feature selection (which typically takes the classification task
into account).


Tuesday, December 2, 2008

Topic for this week: Information Integration --slides available online..


 This week we will discuss issues in information integration; this will also be the last major topic for the semester.
I put up the slides for the class online. I strongly suggest
that you take a look at them before coming.