There will be three lectures and two exercise/discussion blocks per
lecturer. A morning or afternoon block will start with 1 to 1 1/2
hours of lecture, followed by 1 1/2 to 2 hours of exercises and rest, followed
by a discussion of the exercises. During the exercise periods, the
respective lecturer will be available, and fruit, snacks, and
drinks will be provided. The half blocks are intended as a reserve, and may
be either lectures or exercises or a mix of the two.
Below you will find, probably by August, short abstracts of the
lectures together with an indication of useful
prerequisites. Clicking on a photo takes you to the
respective lecturer's homepage.
Verity, Inc. / Stanford University
Elements of Text and Web Search
These lectures will cover the basics of inverted indexes to handle text
querying and scoring of retrieved results. Students are expected to be
familiar with basic concepts from data structures, linear algebra and
Here is a pdf version of the slides of
lecture 2, and
Matrix Decomposition Techniques in Information Retrieval and Machine Learning
The tutorial discusses various methods from statistics and machine
learning that are based on matrix decompositions. This includes
classical methods such as principal component analysis and factor
analysis, but also more recent achievements such as non-linear PCA,
independent component analysis, non-negative matrix factorization,
(probabilistic) latent semantic analysis and spectral clustering. The
lectures will not deal with numerical issues of matrix decompositions,
but rather illustrate how such methods can be used in the context of
machine learning and its applications. Special emphasis is put on
tasks from the domain of information retrieval such as semantic
search, collaborative filtering and hyperlink analysis.
Prerequisites: Some basic knowledge of linear algebra, probability, and statistics.
Here is a pdf version of the slides.
Indian Institute of Technology, Bombay
Using Graphs in Unstructured and Semistructured Data Mining
Abstract: Until recently, machine learning and data
mining techniques focused on single, flat tables of feature vectors.
However, in the last few years, there has been an explosion of data
mining applications where the underlying data has an irresistible
graphical interpretation. The Web is a standard example by now, but
there are many other domains spanning the physical Internet, emails,
USENET, blogs, keyword search in XML and relational data,
multi-relational mining, natural language processing, and biological
data. This short course will give a concise overview of the important
concepts and tools for characterizing, modeling, and analyzing graph
structures common in some of the application domains listed above, with
pointers to active research areas, latest publications, and available
Prerequisites: For data management researchers and
professionals who need to deal with data domains naturally represented by
graphs; algorithm designers. Basic (undergraduate level) probability and
systems performance maturity is
Soumen has posted on his homepage an up-to-date version of his slides,
some exercises & solutions, and a reading list.
ADFOCS 2004 is organized by Holger Bast & Matthias Bender.
Logo and help with web pages: Alexandra Zhilyakova.
Help with local arrangements:
Petra Mayer and
For comments or questions send an email to