NASA SBIR 2004 Solicitation


PROPOSAL NUMBER: 04 S1.05-9852
SUBTOPIC TITLE: Information Technology for Sun-Earth Connection Missions
PROPOSAL TITLE: Computing Infrastructure and Remote, Parallel Data Mining Engine for Virtual Observatories

SMALL BUSINESS CONCERN (Name, E-mail, Mail Address, City/State/Zip, Phone)
777 South Highway 101, Suite 108
Solana Beach, CA 92075-2623

PRINCIPAL INVESTIGATOR/PROJECT MANAGER (Name, E-mail, Mail Address, City/State/Zip, Phone)
Homa Karimabadi
777 South Highway 101, Suite 108
Solana Beach, CA 92075-2623

We propose to develop a state-of-the-art data mining engine that extends the functionality of Virtual Observatories (VOs) from a data portal to a science analysis resource. Our solution consists of two integrated products, IDDat and RemoteMiner:

(1) IDDat is an advanced grid-based computing infrastructure that acts as an add-on to VOs and supports processing and remote analysis of widely distributed space science data. The IDDat middleware is designed to reduce undue network traffic on the VO.

(2) RemoteMiner is a novel data mining engine that connects to the VO via IDDat. It supports multiple users, operates autonomously for automated systematic identification while still allowing advanced users to do their own mining, and can be used by data centers for pre-mining.

These innovations will significantly enhance the science return from NASA missions by giving data centers and individual researchers alike an unprecedented capability to mine vast quantities of data. Phase I aims to fully define the product design and to demonstrate a prototype of the major proposed innovations. Phase II will build a full commercial product with production-quality technical and user documentation.

NASA is a data-centric organization and, as such, shares with many industries an urgent need for sophisticated data mining technologies to deal with the tsunami of data. Today, the vast majority of spacecraft data from past missions remains unexplored, and this situation will worsen with the many planned multi-spacecraft missions (THEMIS, MMS, ST5, etc.). Our proposed solution provides the necessary data analysis infrastructure and tools for existing and planned missions. It leverages ongoing efforts in the grid computing community. Our technology is also expected to be relevant to other divisions within NASA, such as the Intelligent Systems Project, which supports development of autonomous systems.

All industries that deal with data are potential customers for our product. No commercial data mining engine offers all of these capabilities; the few systems that come close support only a small fraction of the solution. Given the customization of our solution to VOs, NASA will clearly remain one of our main target areas beyond Phase II. However, we have already identified several other important markets for deployment of our product, including NSF and DOE within the Federal Government, as well as pharmaceuticals, bioinformatics, health care, fraud detection, and network intrusion detection in the commercial sector.