Amazon Sumerian AI Features
User Interface Design for Crafting Voice User Interaction in a 3D Content Creation Tool
Amazon Sumerian, a VR, AR, and 3D content creation tool, lets users quickly add voice user interaction [VUI] to its digital characters using text-to-speech and conversation bots. These features are called the Speech and Dialogue Components, respectively, and they natively support the Amazon Polly and Amazon Lex services, the machine learning technologies that drive the Amazon Echo. The features aim to provide a simple, intuitive setup process for integrating other AWS services through Sumerian's inspector panel, and to enable VUI design from either the visual state machine editor or scripts, supporting customers with varying levels of programming expertise.
The Speech [text-to-speech] and Dialogue [conversation bot] components follow Sumerian's entity component system structure, in which each component controls a behavior. The Speech Component is in charge of text-to-speech and is a container for speech files, which support Speech Synthesis Markup Language [SSML], plain text, and Sumerian's custom emotes that play short character animations. By natively supporting Amazon Polly, we can dynamically sync our digital characters' speech with their facial animations and body gestures.
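As an illustration of the SSML input mentioned above, the following minimal sketch wraps plain-text sentences in a `<speak>` element and separates them with `<break>` pauses, two standard SSML tags. The helper names `escapeXml` and `toSsml` and the default pause length are assumptions made for this example and are not part of Sumerian or Amazon Polly. Sumerian's custom emote marks are left out because their syntax is not described here.

```typescript
// Escape the five characters reserved in XML so that arbitrary text can be
// placed safely inside an SSML document.
function escapeXml(text: string): string {
  return text
    .replace(/&/g, "&amp;") // must run first so later entities are not re-escaped
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&apos;");
}

// Hypothetical helper (not a Sumerian API): joins sentences into a single
// SSML document with a fixed pause between consecutive sentences.
function toSsml(sentences: string[], pauseMs: number = 300): string {
  const body = sentences
    .map((sentence) => escapeXml(sentence.trim()))
    .filter((sentence) => sentence.length > 0) // skip empty entries
    .join(`<break time="${pauseMs}ms"/>`);
  return `<speak>${body}</speak>`;
}

// Example: two sentences separated by a half-second pause.
const greeting: string = toSsml(["Hello there.", "Welcome to the scene."], 500);
```

The resulting string could be saved as a speech file or passed to any text-to-speech service that accepts SSML.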
The Dialogue Component allows users without programming skills to use a conversation bot in their projects. It provides a GUI for connecting Amazon Lex to Sumerian, so that voice and text-based user interactions can be built with a visual scripting tool or with code. For advanced users who want to extend the capabilities of this component, we've made an example available that shows how to call the Amazon Lex API directly from inside Amazon Sumerian, and how to use it in conjunction with the Web Audio API for real-time audio visualization and for downloading microphone recordings.
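To make the real-time audio visualization step concrete, here is a minimal sketch of one common approach: reducing a frame of byte frequency data (values from 0 to 255) to a small number of normalized bar heights. This is not the code from the example project. The function name `toBarLevels` and the averaging scheme are assumptions chosen for illustration.

```typescript
// Hypothetical helper (not part of Sumerian or the Web Audio API): averages
// groups of adjacent frequency bins and scales each average to the range 0..1.
// Trailing bins that do not fill a whole group are ignored, and fewer than
// `barCount` levels are returned when there are fewer bins than bars.
function toBarLevels(frequencyData: Uint8Array, barCount: number): number[] {
  if (barCount <= 0 || frequencyData.length === 0) {
    return [];
  }
  const binsPerBar = Math.max(1, Math.floor(frequencyData.length / barCount));
  const levels: number[] = [];
  for (let bar = 0; bar < barCount; bar++) {
    const start = bar * binsPerBar;
    if (start >= frequencyData.length) {
      break; // ran out of bins before reaching barCount
    }
    const end = Math.min(start + binsPerBar, frequencyData.length);
    const sum = frequencyData
      .subarray(start, end)
      .reduce((total: number, value: number) => total + value, 0);
    levels.push(sum / (end - start) / 255); // 255 is the maximum byte value
  }
  return levels;
}

// Example with synthetic data: four bins reduced to two bars.
const sampleLevels: number[] = toBarLevels(new Uint8Array([0, 255, 255, 255]), 2);
```

In a browser, the input array would typically be filled on each animation frame by `AnalyserNode.getByteFrequencyData`, with the analyser connected to the microphone stream. The returned levels can then drive bar heights in a canvas or the scale of entities in a scene.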
Contact
Please contact the Amazon Sumerian team with any questions about Amazon Sumerian.
Amazon Sumerian | 2017