Tuesday, December 20th, 9:00-10:30am

Michael J. Franklin
HiFi: Network-centric query processing in the physical world
Advances in wireless sensors, RFID technology, and mobile devices have enabled the development of information systems that
monitor and react to events in the real world.
When deployed on a large (e.g., national) scale, these systems assume a high fan-in architecture, in which vast numbers of
events measured at the edges of the network are continually refined, summarized, augmented, and aggregated as they flow
towards the interior. High fan-in systems present a wealth of new research problems reflecting the different concerns and
priorities at each level of the system as well as the interactions among the levels. The solutions will require insights
from recent efforts in data stream processing, sensor databases, event systems, data warehousing, and spatio-temporal data
management. In this talk I will identify the key characteristics and challenges presented by high fan-in systems, and
argue for a uniform, query-based approach towards addressing them. I will then present the design of HiFi, the system we
are building to embody these ideas, and describe an initial proof-of-concept prototype that is capable of combining data
from RFID readers, barcode scanners, and clusters of sensor motes.

Wednesday, December 21st, 9:00-10:30am

Jayant R. Haritsa
Drawing Out the Artistic Talents of Database Query Optimizers
A "plan diagram" is a pictorial enumeration of the execution plan
choices of a database query optimizer over the relational selectivity
space. We have recently developed a tool, called Picasso, for
automatically generating plan diagrams. In this talk, we present
and analyze representative plan diagrams produced by Picasso on a
suite of popular commercial query optimizers for queries based on the
TPC-H benchmark. These diagrams, which often appear similar to cubist
paintings, provide a variety of interesting insights, including that
current optimizers make extremely fine-grained plan choices, which may
often be supplanted by less efficient options without substantively
affecting the quality; that the plan optimality regions may have highly
intricate patterns and irregular boundaries, indicating strongly
non-linear cost models; that non-monotonic cost behavior exists
where increasing result cardinalities decrease the estimated cost;
and that the basic assumptions underlying the research literature on
parametric query optimization often do not hold in practice. The talk
will conclude with a discussion of the implications of these results
for next-generation database query optimizers.

Thursday, December 22nd, 9:00-10:30am

Andrei Z. Broder
The next stage in Web IR: From query based Information Retrieval to context driven Information Supply
In the past decade, Web search engines have evolved from a first
generation based on classic Information Retrieval (IR) scaled up to web
size and supporting only informational queries, to a second generation
supporting navigational queries using web-specific information
(primarily link analysis), and then to a third generation enabling
transactional and other "semantic" queries based on a variety of
technologies aimed at directly satisfying the unexpressed "user intent."
What is coming next? In this talk, we argue that the trend is towards
context-driven Information Supply; that is, the goal of Web IR will
widen to include the supply of relevant information without requiring
the user to make an explicit query.
The information supply concept long predates information retrieval
(consider newspapers, or even the "Acta Diurna" of ancient Rome). What is
new in the web framework is the ability to supply relevant information
specific to a given activity and a given user, while the activity is
being performed. A prime example is the matching of ads to content
being read; however, the information supply paradigm is starting to
appear in other contexts, such as social networks, e-commerce, and
browsers.