NASA SBIR 2004 Solicitation

FORM B - PROPOSAL SUMMARY


PROPOSAL NUMBER: 04 E3.01-7624
SUBTOPIC TITLE: Automation and Planning
PROPOSAL TITLE: Taxonomy Enabled Discovery (TED)

SMALL BUSINESS CONCERN (Name, E-mail, Mail Address, City/State/Zip, Phone)
Inxight Software Inc
500 Macara Ave
Sunnyvale, CA 94085-2807
(408)738-6200

PRINCIPAL INVESTIGATOR/PROJECT MANAGER (Name, E-mail, Mail Address, City/State/Zip, Phone)
Ramana Rao
rrao@inxight.com
500 Macara Ave
Sunnyvale, CA 94085-2807
(408)738-6200

TECHNICAL ABSTRACT (LIMIT 200 WORDS)
This proposal addresses NASA's need to enable scientific discovery and the topic's requirements for: processing large volumes of data, commonly available on the Internet, into useful information; intelligent search of large, distributed data archives and data discovery through searches of heterogeneous data sets and architectures; and search agents that support the use of NASA data. A precondition for data discovery in large distributed data environments is the accurate and consistent characterization of the data stored in the archives. Such characterization in turn requires an enterprise policy and process for tagging data with metadata. Our proposed Taxonomy Enabled Discovery (TED) system provides a process and technology that assist with and automate the generation and harvesting of metadata. The approach employs a highly innovative taxonomy management platform, based on a hybrid of linguistic, statistical, machine learning, and advanced visualization techniques, enhanced with NASA data and supporting open metadata standards and a grid architecture. We demonstrate the feasibility of our approach with a prototype in a NASA NTRS OAI-PMH (Open Archives Initiative Protocol for Metadata Harvesting) environment.

POTENTIAL NASA COMMERCIAL APPLICATIONS (LIMIT 100 WORDS)
Potential NASA Applications: Such a system would have broad applicability across NASA. The primary target is the STI (Scientific and Technical Information) program's NTRS (NASA Technical Reports Server) network of distributed servers. Other applications could include enhanced text mining for systems such as ASRS (Aviation Safety Reporting System), the ExpertFinder database, or the PLLS (Public Lessons Learned System) database. More generally, TED could enhance any application that processes and stores unstructured content, such as Web content management systems, document management systems, and email systems.

POTENTIAL NON-NASA COMMERCIAL APPLICATIONS (LIMIT 100 WORDS)
Potential Non-NASA Commercial Applications: TED could provide enhanced processing of unstructured data in a wide range of enterprise systems including document management, Web content management, email, information retrieval, and knowledge management systems.