NASA STTR 2017 Solicitation

FORM B - PROPOSAL SUMMARY


PROPOSAL NUMBER: 171 T4.03-9857
RESEARCH SUBTOPIC TITLE: Coordination and Control of Swarms of Space Vehicles
PROPOSAL TITLE: Reinforcement Learning For Coordination And Control of Swarming Satellites

SMALL BUSINESS CONCERN (SBC):
NAME: ASTER Labs, Inc.
STREET: 155 East Owasso Lane
CITY: Shoreview
STATE/ZIP: MN  55126 - 3034
PHONE: (651) 484-2084

RESEARCH INSTITUTION (RI):
NAME: The Regents of the University of Minnesota
STREET: 200 Oak Street S.E.
CITY: Minneapolis
STATE/ZIP: MN  55455 - 2070
PHONE: (612) 624-5599

PRINCIPAL INVESTIGATOR/PROJECT MANAGER (Name, E-mail, Mail Address, City/State/Zip, Phone)
Dr. Suneel Ismail Sheikh
sheikh@asterlabs.com
155 East Owasso Lane
Shoreview, MN 55126 - 3034
(651) 484-2084

CORPORATE/BUSINESS OFFICIAL (Name, E-mail, Mail Address, City/State/Zip, Phone)
Dr. Suneel Ismail Sheikh
sheikh@asterlabs.com
155 East Owasso Lane
Shoreview, MN 55126 - 3034
(651) 484-2084

Estimated Technology Readiness Level (TRL) at beginning and end of contract:
Begin: 2
End: 3

Technology Available (TAV) Subtopics
Coordination and Control of Swarms of Space Vehicles is a Technology Available (TAV) subtopic that includes NASA Intellectual Property (IP). Do you plan to use the NASA IP under the award?
No

TECHNICAL ABSTRACT (Limit 2000 characters, approximately 200 words)
Inspired by the repetitive, learned swarm behavior frequently observed in nature, this program will develop and demonstrate new capabilities in decentralized control of large heterogeneous vehicle swarms limited in communication, sensors, and actuators, with direct application to communication-less coordination. These goals are accomplished through the adaptation and use of Reinforcement Learning solutions to the optimal control problem. Reinforcement Learning approaches define a value function, which represents the total reward for possible actions at a given state, from which a decentralized formulation is derived for each agent in a Multi-Agent System. The proposal implements the policy gradient method for Reinforcement Learning applied to swarming spacecraft control. Three major tasks are proposed for the development of swarming space vehicle coordination and control: Approximate Optimal Control for Large Swarms, Communication-Less Swarm Coordination Implementation, and Human-Swarm Interactions via Supervised Reinforcement Learning. Algorithm development in Phase I will extend to a Centralized Optimal Control Solution, Inverse Reinforcement Learning for the Local Decentralized Problem, Model-Free Learning, "Expert Solution" Conversions to the Local Modified Local Interaction, Inverse Learning for Behavior Determination and Classification, Human-Designed Dynamic Reward Functions, and Keep-Out Zone Models. Follow-on efforts are proposed for full implementation of the Reinforcement Learning swarm technology for real-time integrated system use and mission integration, including laboratory demonstrations with small robotic units and the development of flight-qualified software and hardware packages for fully integrated technology demonstrations.
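
As a rough illustration of the policy gradient idea named in the abstract, the sketch below trains a single agent with a softmax policy on a toy three-action problem using a REINFORCE-style update. It is not the proposal's algorithm: the action set, the reward function, the running-average baseline, and all function names and parameter values are assumptions made for illustration only. In a decentralized setting, each agent would presumably run such an update on its own locally observed reward, with no messages exchanged.

```python
import math
import random


def softmax(prefs):
    """Convert action preferences into a probability distribution."""
    m = max(prefs)  # subtract the max for numerical stability
    exps = [math.exp(p - m) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]


def grad_log_policy(prefs, action):
    """Gradient of log pi(action) w.r.t. the preferences of a softmax policy."""
    probs = softmax(prefs)
    return [(1.0 if i == action else 0.0) - p for i, p in enumerate(probs)]


def reinforce_step(prefs, action, reward, baseline, lr):
    """One policy gradient update: move along grad log pi, scaled by (reward - baseline)."""
    grad = grad_log_policy(prefs, action)
    return [p + lr * (reward - baseline) * g for p, g in zip(prefs, grad)]


def train_agent(reward_fn, n_actions, episodes=2000, lr=0.1, seed=0):
    """Train one agent from its own local reward only (hypothetical setup)."""
    rng = random.Random(seed)
    prefs = [0.0] * n_actions
    baseline = 0.0  # running average of observed rewards
    for _ in range(episodes):
        probs = softmax(prefs)
        action = rng.choices(range(n_actions), weights=probs)[0]
        reward = reward_fn(action)
        prefs = reinforce_step(prefs, action, reward, baseline, lr)
        baseline += 0.05 * (reward - baseline)
    return softmax(prefs)


def toy_reward(action):
    """Hypothetical local reward: action 0 ("hold spacing") pays 1, all others pay 0."""
    return 1.0 if action == 0 else 0.0


if __name__ == "__main__":
    print(train_agent(toy_reward, n_actions=3))
```

The baseline only reduces the variance of the update and does not change its expected direction, so the learned policy should concentrate on the rewarded action as training proceeds.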

POTENTIAL NASA COMMERCIAL APPLICATIONS (Limit 1500 characters, approximately 150 words)
NASA applications include precise, autonomous coordination of swarms of space vehicles in Earth orbit and beyond LEO, eventually extending into deep space, along with enhanced capabilities for precise vehicle control in Multi-Agent Systems. The Reinforcement Learning based swarming vehicle control algorithms and integrated software for on-board implementation in planned and upcoming multi-agent missions will provide reduced mission risk, expanded mission planning and analysis capabilities, and a significant reduction in inter-agent communication requirements. The system will offer significant value and cost savings by either augmenting or replacing current relative navigation and control technologies, and has the potential to reduce support costs and system station-keeping down-time. The proposed system's algorithmic approach and software capabilities hold key mission-enabling and mission-enhancing benefits for swarms of planetary rovers, Earth-orbiting swarms, and exploration missions to asteroids and comets.

POTENTIAL NON-NASA COMMERCIAL APPLICATIONS (Limit 1500 characters, approximately 150 words)
Non-NASA applications for the proposed technology include improved coordination and control of large fleets of unmanned aerial systems. This has direct application to Search and Rescue operations conducted by local or municipal first-response teams or Department of Homeland Security teams. Inter-team coordination of autonomous robotic land, sea, and aerial vehicles is a further application, enhancing Department of Defense capabilities by reducing communication and relay limitations. Formation control via Reinforcement Learning can also benefit commercial telecommunications satellite providers that maintain growing constellations of vehicles operating as nodes in inter-satellite networks for high-data-rate transfers.

TECHNOLOGY TAXONOMY MAPPING (NASA's technology taxonomy has been developed by the SBIR-STTR program to disseminate awareness of proposed and awarded R/R&D in the agency. It is a listing of over 100 technologies, sorted into broad categories, of interest to NASA.)
Algorithms/Control Software & Systems (see also Autonomous Systems)
Autonomous Control (see also Control & Monitoring)
Entry, Descent, & Landing (see also Planetary Navigation, Tracking, & Telemetry)
Models & Simulations (see also Testing & Evaluation)
Navigation & Guidance
Ranging/Tracking
Relative Navigation (Interception, Docking, Formation Flying; see also Control & Monitoring; Planetary Navigation, Tracking, & Telemetry)
Robotics (see also Control & Monitoring; Sensors)
Software Tools (Analysis, Design)
Telemetry (see also Control & Monitoring)

Form Generated on 04-19-17 12:45