National Science Foundation renews support of AsterixDB

The ongoing collaboration between UC Riverside and UC Irvine has produced a one-of-a-kind open source database system

September 11, 2019
Author: Holly Ober

The National Science Foundation (NSF) has renewed its support of AsterixDB, an ongoing collaboration between UC Riverside and UC Irvine database researchers, with a $2 million grant.

Vassilis Tsotras and Ahmed Eldawy, professors of computer science and engineering in UC Riverside’s Marlan and Rosemary Bourns College of Engineering, will receive $860,000, and Michael Carey and Chen Li in UC Irvine’s Donald Bren School of Information and Computer Sciences will receive $1.14 million.

Apache AsterixDB is a highly scalable big data management system that stores, indexes, and manages large volumes of structured and semistructured data. It also supports a full query language that offers the expressiveness of SQL, the standard query language for relational databases, and more. The new grant sustains Apache AsterixDB’s development as a resource for the NSF Computer and Information Science and Engineering research community.

The funds will enable a variety of enhancements, including improved text handling and query processing, additional standards-based geospatial data support, new support for user-defined functions that run user-provided logic, and upgraded system storage and indexing capabilities.

The origins of this work go back a decade, to 2009, when a team of database researchers from three UC campuses (UCI, UCR, and UCSD) first embarked on the NSF-funded ASTERIX research project. The team, led by Carey at UC Irvine and Tsotras at UC Riverside, wanted to define — and also build and share — a next-generation data management platform. Their goal was to improve database storage and queries by bringing parallel database technology to bear on the emerging world of “big data,” which was new at that time. In addition to handling semistructured data, the new database system would have out-of-the-box support for temporal, spatial, and textual data. 

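The result, now an Apache project, is the only open source parallel NoSQL database system available today. NoSQL refers to a class of database systems that do not require the fixed table schemas of traditional relational databases, which makes them especially useful for semistructured data.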

Thumbnail/header image: DARPA (Wikimedia Commons).