Records Retention: Addressing Insider Threats to Data Integrity
Prof. Marianne Winslett. University of Illinois at Urbana-Champaign (UIUC) and Advanced Digital Sciences Center, Singapore
Marianne Winslett has been a professor in the Department of Computer Science at the University of Illinois since 1987. She has been the director of the Advanced Digital Sciences Center in Singapore since 2009. She is an ACM Fellow and the recipient of a Presidential Young Investigator Award from the US National Science Foundation. She is the former vice-chair of ACM SIGMOD and has served on the editorial boards of ACM Transactions on the Web, ACM Transactions on Database Systems, IEEE Transactions on Knowledge and Data Engineering, ACM Transactions on Information and System Security, and the Very Large Data Bases Journal. She has received best paper awards for research on managing regulatory compliance data (VLDB and SSS), analyzing browser extensions to detect security vulnerabilities (Usenix Security), and keyword search (ICDE). Her PhD is from Stanford University.
Abstract: Inaccurate financial statements from major companies, dead people who still vote in elections, world-class gymnasts with uncertain birth dates: insiders often have the power and ability to make inappropriate changes to the content of electronic records. As electronic records replace paper records, it becomes easy to make such alterations without leaving behind evidence that can be used to detect the changes and determine who made them. The US Sarbanes-Oxley Act is perhaps the most (in)famous law that addresses these problems, but it is just one of many regulations that require long-term high-integrity retention of electronic records, all with the goal of ensuring societal trust in business and government at reasonable cost.
In this talk, we will discuss some of the technical challenges posed by the need for "tamper-proof" retention of records. We will describe how industry has responded to these challenges, the security weaknesses in current product offerings, and the role that researchers and government can play in addressing these weaknesses. We will give an overview of research progress to date and describe the major open research problems in this area.
Information Management in the Cloud - Parallel Dataflow Programming Beyond Map/Reduce
Prof. Volker Markl. Technical University Berlin (TU-Berlin)
Volker Markl is a Full Professor and Chair of the Database Systems and Information Management (DIMA) group at the Technische Universität Berlin (TU-Berlin). Prior to joining TU-Berlin, Dr. Markl led a research group at FORWISS, the Bavarian Research Center for Knowledge-based Systems in Munich, Germany, and was a research staff member and project leader at the IBM Almaden Research Center in San Jose, California, USA. His research interests include: information as a service, new hardware architectures for information management, information integration, autonomic computing, query processing, query optimization, data warehousing, electronic commerce, and pervasive computing.
Volker has presented over 100 invited talks in numerous industrial settings and at major conferences and research institutions worldwide. He has authored and published more than 50 research papers at world-class scientific venues. Volker regularly serves as member and chair of program committees of major international database conferences. He is also a member of the Board of Trustees of the VLDB Endowment. Volker has 5 patent awards, and he has submitted over 20 invention disclosures to date. Over the course of his career, he has garnered many prestigious awards, including the European Information Society and Technology Prize, an IBM Outstanding Technological Achievement Award, an IBM Shared University Research Grant, an HP Open Innovation Award, and the Pat Goldberg Memorial Best Paper Award.
Abstract: We have been researching a massively parallel data processor in the Stratosphere Research Unit, a DFG funded project among TU Berlin, FU Berlin, and HPI Potsdam. The research of the data management group at TU Berlin in this area focuses on a new flavor of data processors that goes beyond the popular map/reduce paradigm. We propose a programming model based on second order functions that describe what we call parallelization contracts (PACTs). PACTs are a generalization of the map/reduce programming model, extending it with additional higher order functions and output contracts that give guarantees about the behavior of a function. A PACT program is transformed into a data flow for a massively parallel execution engine, which executes its sequential building blocks in parallel and provides communication, synchronization and fault tolerance. The concept of PACTs allows the system to abstract parallelization from the specification of the data flow and thus enables several types of optimizations on the data flow. The system as a whole is as generic as map/reduce systems, but can provide higher performance through optimization and adaptation of the system to changes in the execution environment. Moreover, it enables the execution of tasks that traditional map/reduce systems cannot execute without mixing data flow program specification and parallelization, like joins, time-series analysis or data mining operations. We will present our research vision and preliminary research results that we have achieved during the last year. We will also highlight our research agenda for the upcoming year.
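The core idea of the abstract — second-order functions that fix the parallelization pattern while the user supplies the first-order function — can be illustrated with a toy sketch. The function names below are hypothetical and only hint at the PACT contracts described in the talk; a real system would execute each group on a parallel engine rather than in a sequential loop.

```python
from collections import defaultdict


def pact_map(udf, data):
    # MAP contract: every single record is an independent parallelization unit.
    return [out for rec in data for out in udf(rec)]


def pact_reduce(udf, data, key):
    # REDUCE contract: all records sharing a key form one parallelization unit.
    groups = defaultdict(list)
    for rec in data:
        groups[key(rec)].append(rec)
    return [out for group in groups.values() for out in udf(group)]


def pact_match(udf, left, right, lkey, rkey):
    # MATCH contract: each pair of records with equal keys forms one unit --
    # an equi-join expressed as a contract, which plain map/reduce can only
    # emulate by mixing join logic into the user code.
    index = defaultdict(list)
    for r in right:
        index[rkey(r)].append(r)
    return [out for l in left for r in index[lkey(l)] for out in udf(l, r)]


# Usage: word count via MAP + REDUCE, then a join via MATCH.
tokens = pact_map(lambda line: [(w, 1) for w in line.split()], ["a b", "b c"])
counts = pact_reduce(lambda g: [(g[0][0], sum(c for _, c in g))],
                     tokens, key=lambda rec: rec[0])
```

Because the contract (not the user code) declares how records may be grouped, an optimizer is free to choose partitioning and shipping strategies, and the output contracts mentioned in the abstract can let it skip repartitioning steps entirely.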
Challenges in High Dimensional Data Visualization
Prof. Kamal Karlapalem. International Institute of Information Technology, Hyderabad, India (IIITH)
Kamalakar Karlapalem is a faculty member at the International Institute of Information Technology, Hyderabad, and heads the Centre for Data Engineering. His research spans the areas of database systems, data mining, workflow management systems, multi-agent systems, and data visualization. He and his students have participated in many academic competitions, such as RoboCup, VAST, and TAC, and have won several awards. He has graduated eight PhD and thirty Masters by Research students.
He is an alumnus of the Indian Statistical Institute (M.Stat), IIT Kharagpur (M.Tech), and Georgia Tech (PhD, 1992), and was a faculty member in the Computer Science Department at HKUST (1992-2000) before joining IIIT-Hyderabad.
Abstract: High-dimensional real data sets need to be mined for applications such as social networks, bioinformatics, and many business-critical applications. A major challenge in mining such data is the lack of tools to comprehend both the data and the patterns mined from it. Traditionally, data visualization helps in comprehending the data and validating the mined patterns, especially for two- and three-dimensional data.
The challenge is to develop tools and solutions for visualizing very high-dimensional real data and the mined patterns. In this talk, I shall (i) present current approaches to address the problem, (ii) introduce three tools we have built - Heidi, Beads and CROVHD - for visualizing high dimensional data, and (iii) list a set of open problems to be addressed.