About
Effectively using parallel resources such as cloud infrastructures is a challenge. Deployed software must scale well with the available resources and must tolerate faults, which are prevalent in large-scale distributed systems. Writing such software is hard for trained computer scientists, let alone for domain scientists whose background is not in distributed computing.
In the RUBIX project, we strive to provide a programming framework for the bioinformatics domain that allows scientists to write scalable applications without detailed knowledge of distributed computing. The RUBIX programming model is based on the Datalog language. As such, RUBIX programs do not specify how a computation is performed, but what should be computed. This declarative approach provides ample opportunities for automatic parallelization and fault tolerance. A key aspect of RUBIX is that the algorithms used for parallelization and fault tolerance are specified in RUBIX itself.
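To illustrate the "what, not how" idea, here is a minimal sketch that is not RUBIX or DatalogLB code. It assumes the textbook transitive-closure rules, written in generic Datalog notation in the comments, and evaluates them with a naive bottom-up loop in Python. The function name `transitive_closure` and the sample `edges` are hypothetical and chosen only for illustration.

```python
# Hypothetical illustration only; this is not RUBIX or DatalogLB syntax.
# The two Datalog rules being evaluated (textbook transitive closure):
#   path(x, y) <- edge(x, y).
#   path(x, z) <- path(x, y), edge(y, z).


def transitive_closure(edges):
    """Naive bottom-up evaluation: apply the rules until no new facts appear."""
    edges = set(edges)
    path = set(edges)  # first rule: every edge is a path
    while True:
        # second rule: extend each known path by one edge
        derived = {(x, z) for (x, y) in path for (y2, z) in edges if y == y2}
        new_facts = derived - path
        if not new_facts:
            return path  # fixpoint reached, nothing more can be derived
        path |= new_facts


edges = {("a", "b"), ("b", "c"), ("c", "d")}
paths = transitive_closure(edges)
```

The rules state only which facts follow from which others. The order in which the loop derives them does not affect the result, and that freedom is what a runtime could exploit when distributing the work.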
The RUBIX parallelization strategies will be implemented as extensions to the DatalogLB language developed by LogicBlox. In addition to giving us a mature database engine to build on, LogicBlox provides a second class of use cases from the business domain. We will design RUBIX so that it is applicable to both bioinformatics use cases and applications in the retail industry, and we hope that the general principles we discover carry over to other areas as well.
News
A poster describing our work in progress was presented at the NorCal database day (link). It is available here.
Team
Research and Development:
Science and Technology Advisors:
Bertram Ludaescher (UC Davis GenomeCenter)
Todd J. Green (CS Dept. UC Davis)
Martin Bravenboer (LogicBlox)
Shan Shan Huang (LogicBlox)
Collaborations
The RUBIX project is funded by members of the UC Davis GenomeCenter as well as LogicBlox.
