Talks & Seminars
Algorithms for Data Stream Systems
Mayur Datar, Stanford University
Date & Time: September 12, 2003 16:00
Venue: Seminar Hall
In a growing number of information processing applications, data takes the form of "continuous data streams" rather than traditional stored databases. These applications share several distinguishing features, such as the need for real-time analysis, huge volumes of data, and unpredictable, bursty data arrivals. They have spawned a considerable and growing body of research into data stream processing, ranging from algorithms for data streams to full-fledged data stream systems.

In this talk, I will present some of my work in this area, focusing on issues related to the design of a general-purpose data stream management system (DSMS). We will look at two runtime resource allocation issues: operator scheduling and load shedding. The operator scheduling problem is to design a strategy that decides, at every time instant, which job (operator) to schedule next for execution on the processor. We will present a near-optimal scheduling strategy called "Chain" that is designed to minimize the memory requirement of the system.

In the second part of the talk, I will discuss techniques for intelligently dropping unprocessed data (tuples) from the system, i.e., load shedding, so as to reduce the load on the system during periods of high input volume. We present a load shedding algorithm that minimizes the inaccuracy in query results while ensuring that the load on the system stays below the required threshold.
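As a rough illustration of the load shedding idea only, the sketch below drops tuples uniformly at random so that the expected load stays within a capacity threshold. It is not the algorithm presented in the talk, which additionally aims to minimize inaccuracy in query results; the function names, parameters, and the simple single-operator load model (arrival rate times per-tuple cost) are all assumptions made for this example.

```python
import random


def keep_probability(arrival_rate, cost_per_tuple, capacity):
    """Return the fraction of tuples to keep (hypothetical load model).

    Assumes load = arrival_rate * cost_per_tuple, measured in the same
    units as capacity (e.g. milliseconds of work per second).
    """
    load = arrival_rate * cost_per_tuple
    if load <= capacity:
        return 1.0  # system is not overloaded, so keep everything
    return capacity / load  # scale expected load down to the threshold


def shed(tuples, p_keep, rng=None):
    """Keep each tuple independently with probability p_keep."""
    rng = rng or random.Random()
    return [t for t in tuples if rng.random() < p_keep]


if __name__ == "__main__":
    # 2000 tuples/s at 1 ms each, with 1000 ms of processing time per second.
    p = keep_probability(arrival_rate=2000, cost_per_tuple=1, capacity=1000)
    kept = shed(range(10_000), p, rng=random.Random(0))
    print(f"keep probability: {p}, kept {len(kept)} of 10000 tuples")
```

Uniform random dropping is the simplest possible policy: it treats every tuple and every query alike, which is exactly the limitation a more careful, accuracy-aware load shedder would try to avoid.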