NASA SBIR 2009 Solicitation


PROPOSAL NUMBER: 09-1 S6.04-9644
SUBTOPIC TITLE: Data Management - Storage, Mining and Visualization
PROPOSAL TITLE: Towards Efficient Scientific Data Management using Cloud Storage

Open Research, Inc.
104 Fountain Green Lane
Gaithersburg, MD 20878 - 7851
(240) 751-4526

Qiming He
104 Fountain Green Lane
Gaithersburg, MD 20878 - 7851
(301) 525-6612

Begin: 4
End: 6

Building more in-house datacenters to back up explosively growing scientific datasets is neither cost-effective nor in line with the government's green initiatives. Cloud computing is emerging as a viable platform for data storage, collaboration, and disaster recovery. We will develop a suite of "backup-to-cloud" tools that allow users to back up scientific datasets and applications into the cloud and to use cloud storage as a distribution platform. The tools are optimized under the technical and economic constraints posed by common cloud storage services. We will use both public and private cloud platforms to conduct a feasibility study from the perspectives of performance, security, scalability, and cost.

Many NASA scientific applications are data-intensive, i.e., they generate large amounts of data on a regular basis. Data backup is a mandatory requirement for some valuable datasets. In the past, NASA has partnered with third-party vendors to set up remote backup. The deliverable of this proposal will provide NASA with a suite of automated, self-service data management tools that use the cloud as a data store. Using our deliverable, NASA can 1) back up its valuable datasets off-site reliably, securely, and economically, without dedicated personnel and facilities; 2) accelerate disaster recovery of its online applications in the cloud; and 3) distribute large scientific datasets to a large population of users at minimal cost.

Some "backup-to-cloud" tools are already offered on the market. Unlike most of these tools, which are designed for individual use and optimized for random access, our deliverable is optimized for bandwidth and space efficiency and will be more useful for non-NASA users with extremely large datasets. Enterprises with growing backup demands can use our deliverable to archive their datasets into the cloud and reduce IT operating costs.

Architectures and Networks
Computer System Architectures
Data Acquisition and End-to-End-Management

