
Audio for Interaction at Microsoft

[Photos of the presenters]

Report compiled from contributions by Bob Moses, Vince Dayton, and Aurika Hays of the AES PNW Section

The AES Pacific Northwest Section held a meeting at Microsoft on December 2 to learn about the company's latest developments in "Audio for Interaction." The speakers were David Thiel, Robin Goldstein, Ken Greenebaum, and David Yackley. The meeting began with an overview of Audio for Interaction by David Thiel. David explained how computers enable a new type of media. Traditional media has a linear structure in which events happen sequentially over time. Computers enable non-linear interaction with an effectively unlimited number of possible experiences, based on the user's input. Decisions are made by software at run time rather than at design time, which places new demands on the process by which this software is developed. Audio engineering for interactive applications resembles live sound production: you have to react in real time to situations and events that happen unexpectedly.
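The run-time-versus-design-time distinction David described can be illustrated with a small sketch. This is not code from the talk; the event names, cue table, and function are invented for illustration. A linear soundtrack is a fixed sequence, while an interactive one maps user events, arriving in any order, to cues chosen on the fly:

```python
import random

# Illustrative sketch only (not from the presentation). In linear media
# the cue order is fixed at design time; in interactive media, software
# chooses cues at run time in response to user events.

LINEAR_TRACK = ["intro", "theme", "climax", "credits"]  # fixed at design time

CUE_TABLE = {  # run-time mapping from user events to candidate cues
    "door_open": ["creak_a", "creak_b"],
    "footstep": ["step_soft", "step_hard"],
    "victory": ["fanfare"],
}

def choose_cue(event, rng=random):
    """Pick a cue for a user event at run time; vary choices to avoid
    obvious repetition when an event recurs."""
    candidates = CUE_TABLE.get(event)
    if not candidates:
        return None  # unexpected events must be handled gracefully
    return rng.choice(candidates)
```

The point of the sketch is the shape of the problem: the designer supplies the table, but the actual sequence of cues only exists once a user starts generating events.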

Interactive audio is a new field, and promises to become an area in which practitioners can build careers. In relative terms, Audio for Interaction is in its infancy - the movie industry took 20 years to figure out what to do with film. Audio for Interaction is even more complicated than film because it introduces a non-linear structure, and David does not believe we will see "maturity" in this field for decades. This opens the door to a great deal of creative innovation. We hope the many AIS (Art Institute of Seattle) students in attendance took note of that.

Robin Goldstein talked about her efforts to develop a "DirectSound Design Studio" (DS2). DS2 is a set of authoring tools, APIs (Application Programming Interfaces), and runtime components for constructing the dynamic audio soundscapes used in interactive media products, providing a comprehensive sound design environment for them. DS2 wraps DirectSound for use in web applications or CD-ROM products. Architecturally, it provides reusable sound design components and performs most housekeeping and I/O-related functions. Robin claims DS2 puts more power in the hands of sound designers, and even enables non-programmers to design and test complex interactive soundscapes. In theory, this reduces the iterative cycles between sound designer and developer, reduces code complexity, and therefore reduces the cost of both design and development. Robin briefly demonstrated the typical workflow a developer would follow to create a soundscape with DS2. She also showed several examples of the DS2 tools in action, and then turned the meeting over to Ken Greenebaum.
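The architectural idea behind DS2 - letting non-programmers design soundscapes by editing data rather than code - can be sketched as follows. DS2's actual file format and API were not detailed in the talk, so everything here (the dictionary schema, the class, the field names) is an invented stand-in:

```python
# Illustrative sketch only -- not the real DS2 format or API. The idea:
# the soundscape is described as data a sound designer can edit, and a
# generic runtime interprets it, so application code does not change
# between design iterations.

SOUNDSCAPE = {  # a designer edits this description, not the program
    "ambient": {"file": "wind.wav", "loop": True, "gain": 0.4},
    "on_click": {"file": "click.wav", "loop": False, "gain": 1.0},
}

class SoundscapeRuntime:
    def __init__(self, description):
        self.description = description
        self.log = []  # stands in for actual DirectSound playback calls

    def trigger(self, name):
        """Look up a named sound and 'play' it with the designed settings."""
        spec = self.description[name]
        self.log.append((spec["file"], spec["gain"], spec["loop"]))
        return spec

runtime = SoundscapeRuntime(SOUNDSCAPE)
runtime.trigger("on_click")
```

Because the runtime only interprets the description, a designer can iterate on gains, files, and looping without a rebuild - which is the cost-reduction argument Robin made.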

Ken talked about DirectAnimation, an API that is already implemented and available in Windows 98 and Internet Explorer 4. He presented the technology available now and also pointed to the improvements that are planned. DirectAnimation is a subset of the DirectX technology that Microsoft has been developing for multimedia developers. With DirectAnimation, a programmer now has available self-contained object technology (i.e., components) that supports the COM (Component Object Model) standard interface. This allows developers to provide full-featured, interactive, animated components that make for much more engaging applications for end users. Since the technology is built on the COM foundation, a developer using DirectAnimation components can make them available to other applications, thereby providing reusable code and extending the feature set of new applications written by other developers.

Next came David Yackley, who presented DirectMusic, the newest DirectX technology. DirectMusic is designed to provide a complete music solution to web site designers and title developers. David demonstrated DirectMusic Producer, an application that sound designers can use to create multimedia products. Some key features of this application are: Downloadable Sounds (DLS); integrated editors for all of the DirectMusic objects, such as styles, personalities, templates, and DLS instruments and collections; and real-time, context-sensitive music creation. The DirectMusic engine can create musical transitions on the fly to allow a smooth transition from one scene or page to another. Musical motifs can be played in response to a click or selection, and these embellishments are synchronized rhythmically and harmonically with the performance. In addition, mood, tempo, intensity, and volume changes can happen in real time in response to the user's actions. One of the more interesting features of DirectX technology is the ability of a graphical object not only to have audio-specific properties, but to make those properties sensitive to the real-time environment of the application using the object. As an example: if a gamer moving through one virtual environment went through a door (or some other portal) into another environment, any graphical objects with associated sounds (like a gun or an avatar) could change according to the requirements of the new environment. Riding inside a small vehicle, getting out of it, walking across a field, and entering a cave would all produce different-sounding environments. These environmental properties can be passed to an object, which will in turn respond with an appropriate sound and behavior.
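The vehicle-field-cave example above amounts to resolving an object's sound against the acoustics of whatever environment currently contains it. Here is a minimal sketch of that idea; the environment names, parameter names, and class are all invented for illustration and are not the DirectX API:

```python
# Hypothetical sketch (names invented, not the DirectX interfaces): a
# sound-emitting object whose rendering parameters are resolved against
# the current environment at run time.

ENVIRONMENTS = {  # acoustic properties a level designer might author
    "vehicle": {"reverb": 0.2, "muffle": 0.7},
    "field":   {"reverb": 0.0, "muffle": 0.0},
    "cave":    {"reverb": 0.9, "muffle": 0.1},
}

class SoundObject:
    def __init__(self, sample):
        self.sample = sample  # the object's own sound is fixed...

    def render_params(self, environment):
        """...but how it is rendered depends on where the object is now."""
        env = ENVIRONMENTS[environment]
        return {"sample": self.sample, **env}

gun = SoundObject("gunshot.wav")
```

Passing through a portal then reduces to calling `render_params` with a different environment key - the object itself never changes, only the context it is resolved against.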

David Thiel came back and closed the evening with a discussion of his "My Interactive Sound System" (MISS) technology. David's work is part of Microsoft's research arm, which has no requirement to release products; MISS is not a product, and consequently David's presentation was more conceptual than practical. MISS provides services for unifying the sound effects, music, and dialog in a soundtrack. David demonstrated the process by which he used procedural programming to create a pre-production version of soundtrack behavior. MISS separates the programming process from the sound design process, which allows practitioners to work more efficiently.
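One way to picture "unifying" the layers of a soundtrack procedurally is a rule that relates their levels, such as ducking music and effects while dialog plays. This is a conceptual sketch, not MISS itself; the rule and the layer names are invented. The behavior is expressed as a procedure over abstract layers, independent of any particular sounds, which is the separation of programming from sound design that David described:

```python
# Conceptual sketch, not MISS: soundtrack *behavior* (here, ducking
# music and effects under dialog) is written procedurally over abstract
# layers, independently of the actual sound assets.

def unify_levels(active):
    """Return per-layer gains for the currently active layers,
    ducking music and effects whenever dialog is playing."""
    levels = {"music": 1.0, "sfx": 1.0, "dialog": 1.0}
    if "dialog" in active:
        levels["music"] = 0.3
        levels["sfx"] = 0.5
    return {layer: (levels[layer] if layer in active else 0.0)
            for layer in levels}
```

A programmer can refine this rule while a sound designer replaces the underlying music, effects, and dialog assets, with neither blocking the other.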

Overall, the evening was rich with issues and challenges concerning audio for interaction.
