
Tina Blaine
Tina Blaine is on the faculty of Carnegie Mellon University's Entertainment Technology Center, where she explores new interface designs for gaming, collective music-making and interactive media. Blaine is a frequent guest artist on the topic of new media and the convergence of science, art and technology, and has written music for NPR, video games, TV and documentary soundtracks. Her musical interactivism has resulted in electronic MIDI controller instruments and audience participation devices for live performance, in addition to collaborative audiovisual tabletop systems such as the “JamODrum” and “Animateering” for public installation. Blaine’s work has been featured at the Experience Music Project in Seattle, SIGGRAPH's Emerging Technologies, Zeum in San Francisco, Ars Electronica’s Museum of the Future in Linz, Austria and Give Kids the World Resort in Orlando, Florida. She is currently exploring the sonic possibilities of glass rondels and is performing at the New Interfaces for Musical Expression conference.
By Tina Blaine
Developing interactive musical experiences is a new frontier in design. A variety of new interfaces are making music accessible to non-musicians in ways that give players access to a new community through their participation in shared musical encounters.
Recent years have brought a proliferation of inexpensive controller devices that enhance player interaction, primarily with musical video games, mobile devices and other interactive entertainment experiences. Camera peripherals, percussion controllers, dance pads, video games, cell phones, handheld toys, game controllers and more have been gaining in popularity. The related success of musical game experiences such as “Karaoke Revolution” and “Guitar Hero” by Harmonix Music Systems, Toshio Iwai’s “Electroplankton” for the Nintendo DS, and Sony’s EyeToy rhythm action games is indicative of this commercial crossover into mainstream culture.
Enabling technologies, coupled with the widespread availability of computers and low-cost sensors, make the development of a wide array of new interfaces related to sound and music possible. Changes in motion, light, gravity, pressure, velocity, skin conductivity or muscle tension are just a few of the measurable signals through which a player's gestural input can be turned into musical output. Designers of musical interfaces can also choose from an extensive selection of sensors, software and signal processing options. Joysticks, ultrasound, infrared, accelerometers, potentiometers, force-sensitive resistors, piezos, magnetic tags, and other sensor technologies are available to those interested in converting voltage data into MIDI or routing signals through sound synthesis systems such as Max/MSP™ [i], Pure Data (Pd), SuperCollider, or Open Sound World. Despite this vast realm of technological opportunity, the challenge that remains is how to integrate and transform this apparatus into coherently designed, meaningful musical experiences with emotional depth.
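To make the step of converting voltage data into MIDI concrete, here is a minimal Python sketch. It assumes a hypothetical 10-bit analog-to-digital converter (readings from 0 to 1023) sampling a sensor such as a potentiometer. The function names and the choice of controller number are illustrative only and are not taken from any of the systems named above.

```python
def adc_to_midi_value(reading: int, adc_max: int = 1023) -> int:
    """Scale a raw ADC reading (0..adc_max) to a 7-bit MIDI value (0..127).

    The 10-bit default range is an assumption for illustration.
    """
    clamped = max(0, min(reading, adc_max))
    return (clamped * 127) // adc_max


def control_change(channel: int, controller: int, value: int) -> bytes:
    """Build a three-byte MIDI Control Change message.

    channel is 0..15; controller and value are 0..127.
    """
    if not 0 <= channel <= 15:
        raise ValueError("channel must be in 0..15")
    if not (0 <= controller <= 127 and 0 <= value <= 127):
        raise ValueError("controller and value must be in 0..127")
    # 0xB0 is the Control Change status byte; the low nibble holds the channel.
    return bytes([0xB0 | channel, controller, value])


# Example: a sensor at roughly half scale drives controller 1 on channel 0.
message = control_change(0, 1, adc_to_midi_value(512))
```

The resulting bytes could be written to a MIDI output port or handed to a synthesis environment. Any smoothing or filtering of noisy sensor data would happen before the scaling step.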
Many designers seem ready for the challenge. Among the schools with an expanding array of academic course offerings in this realm are MIT, Stanford, Princeton, NYU, Brown, Tufts, CalArts, Carnegie Mellon, McGill and the University of British Columbia. Additionally, an emerging community of designers and instrument builders has gathered at the New Interfaces for Musical Expression (NIME) conference to share their work and push the state of the art. Now in its sixth year, the conference will converge with the Agora/Resonance festivals at IRCAM in Paris, France, June 4–8, 2006.
In addition to the commercial entertainment sphere and the realm of academia, musical interface development is also taking place in the arena of the arts and performing arts. There are far too many practitioners and examples of historical work in this field to be comprehensively covered herein, but what follows is a brief sampling of projects from three of these areas, all demonstrating unconventional and innovative approaches to musical interaction.
Open-ended musical experiences such as “Electroplankton” on the Nintendo DS console offer a new genre of “in-the-moment” audiovisual interactions (Figure 1). In each of Electroplankton’s ten modes of play, pulsing graphics are tightly coupled with musical output in a water-based world. Using a stylus on the dual screen (DS) console to activate sound, players can click, draw, spin, or tap to create music. In turn, these actions allow players to affect the corresponding scale, instrumentation, looping or playback of single/multiple notes. These elegant musical vignettes were developed by Toshio Iwai, a Japanese multimedia artist better known to some for his extensive history of visual music making in installation-based works such as “Musical Insects,” “Piano as Image Media,” “Resonance of 4,” and “Composition on the Table” (Figure 2).
When developing experiences for mobile devices, casual games and public exhibitions intended for “walk-up and play” interactions, the designer must account for the limited amount of time that someone can spend learning an interface. The accessibility of this approach can come at the expense of limiting the musical range and possible gestures associated with creating sound.
One of the reasons Iwai’s sound- and image-based experiences are so well crafted is his ability to seamlessly integrate complementary interactive design principles. Iwai’s designs are characterized by their easy-to-learn interfaces, while also providing an experience with enough depth and complexity to be continually engaging over time. The built-in microphone input lets players customize an experience with real-time vocal samples and shape playback in repeatable but slightly differing ways, which lends to the addictive nature of these musical modes. Iwai’s work consistently reinforces a strong correlation between the player’s actions and audiovisual feedback.
Figure 1: Iwai’s interface design experience in mixed reality installations such as “Composition on the Table [PUSH]” has clearly informed and influenced this adaptation of “Luminaria” (Figure 2). © 1999 Toshio Iwai
Figure 2: Electroplankton’s “Luminaria” for the Nintendo DS Console. © 2005 Toshio Iwai / Nintendo
The majority of devices and musically oriented games on the market currently are predicated upon an emerging genre of gameplay known as “rhythm action” or beat matching. These types of games tend to have progressively more difficult levels of gameplay, based on combinations of increasing tempo and number of cues. With few exceptions, this genre prompts players to perform a series of physical and/or rhythmic actions by integrating player feedback via custom controllers in combination with an onscreen interface to play in time with a predetermined musical sequence.
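The core of beat matching can be sketched in a few lines of Python. This is a simplified illustration, not the logic of any particular commercial game. The 80-millisecond timing window and the function names are assumptions chosen for the example.

```python
def cue_times(bpm: float, beats: int) -> list[float]:
    """Return the time in seconds of each cue, one cue per beat."""
    interval = 60.0 / bpm  # seconds per beat; shrinks as tempo increases
    return [i * interval for i in range(beats)]


def judge_press(press_time: float, cues: list[float], window: float = 0.08):
    """Compare a button press against the nearest cue.

    Returns a (result, error) pair, where error is how early (negative)
    or late (positive) the press was, in seconds.
    """
    nearest = min(cues, key=lambda t: abs(t - press_time))
    error = press_time - nearest
    result = "hit" if abs(error) <= window else "miss"
    return result, error


cues = cue_times(bpm=120, beats=4)   # cues at 0.0, 0.5, 1.0 and 1.5 seconds
print(judge_press(0.53, cues))       # slightly late, inside the window
print(judge_press(0.70, cues))       # too far from any cue
```

Raising the tempo, adding more cues per beat, or narrowing the window are the kinds of adjustments that make later levels progressively harder.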
In 2001, Harmonix Music Systems rocked the world of musical video games with their release of “Frequency” followed by “Amplitude” in 2003. In both games, players activate sequences of notes and musical tracks set in colorful futuristic settings with beat-driven button presses on a handheld game controller for the PlayStation2 (PS2). Harmonix followed these groundbreaking rhythm action games with the “Karaoke Revolution” (KR) series for Konami. Players are expected to sing at the appropriate time using a custom microphone controller by tracking the bars of varying heights and lengths that indicate pitch and duration of each note (Figure 3). In addition to an avatar onstage, the interface also displays feedback in the form of a song meter with an arrow to indicate how close to the pitch the player is singing. A “crowd response” meter gauges player performance accompanied by cheering or jeering from the virtual audience. KR3 and party games scale from solo or duet mode with two microphones, to several multiplayer offerings with up to eight players using a turn-taking protocol. In another twist inspired by the musical dance craze phenomenon “Dance Dance Revolution,” “KR Party” comes with a dance pad for “sing and dance” mode on the PS2, Xbox or GameCube consoles. Although the nature of this singing game is competitive, the KR games elicit a party atmosphere and provide mutually supportive environments for groups of players.
Figure 3: Karaoke Revolution Country Duet.
In their latest release, “Guitar Hero,” the Harmonix design team combined or extended successful elements from each of the previous games’ interface designs, including the virtual “crowd response,” and added a dueling guitar mode for two players. Players strap on a guitar controller that features a whammy bar, five fret buttons, and a “strummer” switch as an alternative to strings. Onscreen, notes are color-coded to the fret buttons, chords are indicated by multiple colors, and players are required to match the correct fret button(s) with their left hand while “strumming” with the right. For the aspiring guitar god or goddess, a peak performance entails hitting a series of notes, sustaining long notes with the whammy bar to build up the power bar, then tilting the guitar vertically to trigger “Star Power” (Figure 4). Success results in the virtual audience, lighting, scoring, staging and distorted guitar effects all going wild (Figure 5). The experience of playing a guitar controller in GH or singing into a microphone in KR enhances player performance and fuels the rock-star fantasies upon which these games are based.
Figure 4: Rocking out in “Guitar Hero.”
Figure 5: Virtual staging effects and audience response in GH based on player performance. GH photos appear “Courtesy of RedOctane”.
Sony’s EyeToy [ii] USB camera peripheral leverages the popularity of music video games, without the need for a handheld controller, by tracking players’ motions and projecting their image onscreen. “Groove” is one of the more popular gesture-based rhythm games, requiring players to dance and wave their arms to “hit” graphical icons moving around the screen in time to the music (Figure 6). The EyeToy’s easy accessibility and intuitive interface encourage high levels of physicality combined with opportunities for shared gameplay experiences, which together help foster a sense of community. Limiting the musical range to pre-composed elements allows for greater participation and communication with other players. From a social perspective, therefore, the controller becomes a vehicle for communication that helps to develop intimacy amongst groups through the act of musical gameplay. Party games such as “Karaoke Revolution,” “Guitar Hero,” “Dance Dance Revolution,” and “Groove” thrive on feedback, as do most musicians with a responsive audience. The device was almost not released at all, yet its surprisingly widespread appeal and the subsequent expansion of the EyeToy’s gaming demographic far exceeded Sony’s expectations.
Figure 6: Sony’s EyeToy “Groove” Interface supports single or multiplayer modes.
Experiments in subjective mappings between gesture and sound that empower the player might be traced back to the early 1920s, when the first free-gesture interface was invented by the Russian physicist Leon Theremin. Although the Theremin was wildly popular, mastering its “invisible” interface by positioning two hands in the air to control amplitude and pitch was extremely difficult. In the 1950s, Robert Moog revitalized the waning interest in Theremins by starting a small business to build Theremin kits prior to becoming the inventor of the Moog synthesizer [iii].
In the world of theater and dance, a new generation of hybrid dance artists is actively collaborating with composers and technologists, seeking to influence through movement alone the sonic and theatrical elements previously out of their control. Enhancing the illusion of control can also be achieved by creating a highly responsive system based on the dancers’ motions using supplemental effects such as lighting, visual imagery, sonic feedback, and more. These same principles can be applied to any interactive system based on user input.
Computer vision is often used to detect motion by measuring changes in light, converting the data in Max/MSP to MIDI and routing output to a sound source or other special effect. Cameras can also be positioned to detect the location and speed of the dancers’ movement. Mapping or translating those actions to affect the mood of a performance via lighting, rhythmic motifs, musical generation, or effects processing is a fundamental goal of this research. Under the direction of Antonio Camurri at the University of Genova’s Lab of Musical Informatics, the EyesWeb project uses computer vision to create a musical language to capture and differentiate between expressive gestures. Their free real-time software runs on a PC and, with only one video camera, is able to distinguish the amount of movement exerted by a dancer, how fast and fluid their movements are, and recognize the overall utilization of space (Figure 7).
Figure 7: EyesWeb Visual representation of “trails” of a dancer’s movement.
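The idea of detecting motion by measuring changes in light can be illustrated with simple frame differencing. The Python sketch below is a generic illustration under stated assumptions and does not describe how EyesWeb or any other named system works internally. Frames are assumed to be small grayscale images stored as lists of rows of brightness values from 0 to 255, and the threshold of 30 is an arbitrary choice.

```python
def motion_amount(prev_frame, curr_frame, threshold: int = 30) -> float:
    """Return the fraction of pixels whose brightness changed noticeably.

    Both frames are lists of rows of grayscale values (0..255) with the
    same dimensions. The threshold filters out small lighting flicker.
    """
    changed = 0
    total = 0
    for prev_row, curr_row in zip(prev_frame, curr_frame):
        for before, after in zip(prev_row, curr_row):
            total += 1
            if abs(after - before) > threshold:
                changed += 1
    return changed / total if total else 0.0


def motion_to_midi(amount: float) -> int:
    """Map a motion fraction (0.0..1.0) to a 7-bit MIDI value (0..127)."""
    return round(max(0.0, min(amount, 1.0)) * 127)


# Two tiny 2x2 frames: one of the four pixels brightens sharply.
previous = [[0, 0], [0, 0]]
current = [[0, 255], [0, 0]]
value = motion_to_midi(motion_amount(previous, current))
```

A value like this could drive volume, filter cutoff or lighting intensity, so that larger movements produce proportionally stronger effects.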
Choreographer Dawn Stoppiello and Mark Coniglio, who have been toying with MIDI since they were students at CalArts in the mid-1980s, have developed their own wireless flex sensor system called “MidiDancer.” Used in combination with Isadora, a custom software program written by Coniglio, it lets them interpret the incoming data to control a laserweb of lights, sound, digital cameras and projections. Ultimately, interpreting and translating this data into sonic and timbral effects that read to an audience is challenging, and the data must be carefully mapped to create a direct relationship between gesture and musical intent. Their ensemble, known as Troika Ranch, does this rather well, as it has been actively crafting digital dance works for more than fifteen years and is currently featuring a piece called “16 (R)evolutions.”
Todd Winkler is another veteran composer and multimedia/installation artist with a history in motion-sensing music at Brown University. Winkler has collaborated with various dancer/choreographers to create a series of interactive media works using Dave Rokeby’s Very Nervous System (VNS) to control the manipulation of sound and visual elements of the performance. Some of his installation-based works, such as “Light Around the Edges,” operate on a similar premise where the participants’ motions and gestures are tracked, but scale to the number of people in the sensing space to control audiovisual feedback. By changing modes and levels of interactivity in response to the number of people present, Winkler creates a sonic architecture that acknowledges that social interactions within a space are as important as, if not more important than, the methods for generating sound.
At Rensselaer Polytechnic Institute (RPI), composer/programmer Curtis Bahn and dancer/ethnomusicologist Tomie Hahn conjured a collaboration called “Pikapika” (Figure 8). Hahn’s many years of traditional Japanese dance training fed into the project, in which her actions are sonified and accentuated by hand sensors that measure tilt and pressure. She sports a microcontroller and radio transmitter on her hips to translate and broadcast the MIDI data to a computer running Max/MSP. Her movements are transformed into sonic material that is sent to speakers embedded on each arm. The speakers receive the sounds and broadcast them back onto her body. In another performance piece called “Streams,” the team set out to “compose the body” with a custom Sensor/Speaker Performance interface combined with spherical speakers and data mappings determined by Hahn’s improvisational movements. “Streams” used voice and physical modeling of natural sound systems produced with Perry Cook’s Synthesis Tool Kit (STK) and Dan Trueman and Luke DuBois’ PeRColate for Max/MSP. The net result? Hahn said she was often unsure whether she was dancing the music, or the music was moving the dance.
Figure 8: Tomie Hahn performing “Pikapika.” Photo by Curtis Bahn.
In January 2006, the UK-based company Sonalog announced the first commercially available motion capture MIDI controller, called GypsyMIDI (Figure 9). Built around a clunky but customizable wireless exoskeletal suit for the upper body, the hardware features six rotational sensors per arm and plugs directly into a MIDI interface. The accompanying eXo-software comes configured to work with a number of commercially available digital audio applications such as Logic, Cubase, Reason, ProTools and DJ Traktor as well as Autodesk’s MotionBuilder software. EXo allows performers to configure MIDI parameters to trigger notes or melodies, or to map their arm movements for discrete or continuous control of volume, pitchbend, modulation, cross-fades or other real-time effects, and so instantly become a dancing DJ.
Figure 9: Gypsy Motion Capture MIDI Controller.
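Continuous control of a parameter such as pitchbend from a rotational sensor might be mapped along the following lines. This Python sketch is a generic illustration and does not reflect the actual GypsyMIDI hardware or eXo-software. The joint angle range of -90 to 90 degrees and the function names are assumptions.

```python
def angle_to_pitch_bend(angle_deg: float,
                        min_deg: float = -90.0,
                        max_deg: float = 90.0) -> int:
    """Map a joint rotation to a 14-bit MIDI pitch bend value (0..16383).

    The midpoint of the angle range lands on 8192, the MIDI value for
    "no bend". The angle range itself is an assumed calibration.
    """
    clamped = max(min_deg, min(angle_deg, max_deg))
    normalized = (clamped - min_deg) / (max_deg - min_deg)
    return round(normalized * 16383)


def pitch_bend_message(channel: int, bend: int) -> bytes:
    """Build a three-byte MIDI Pitch Bend message (channel 0..15)."""
    if not 0 <= channel <= 15:
        raise ValueError("channel must be in 0..15")
    if not 0 <= bend <= 16383:
        raise ValueError("bend must be in 0..16383")
    lsb = bend & 0x7F          # low 7 bits are sent first
    msb = (bend >> 7) & 0x7F   # high 7 bits follow
    # 0xE0 is the Pitch Bend status byte; the low nibble holds the channel.
    return bytes([0xE0 | channel, lsb, msb])


# An arm held at its neutral position produces a centered pitch wheel.
neutral = pitch_bend_message(0, angle_to_pitch_bend(0.0))
```

Discrete control, such as triggering a note when an elbow bends past a set angle, would instead compare the same reading against a threshold.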
The rising popularity of robotic art, coupled with increasingly complex musical compositions that are difficult if not impossible for humans to play, has led to several intriguing recent collaborations. Paul Lehrman, professor at Tufts University, and Eric Singer, founder of the League of Electronic Musical Urban Robots (LEMUR), joined forces to create a contemporary version of composer George Antheil’s 1926 composition “Le Ballet Mécanique.” Originally envisioned for sixteen player pianos, three xylophones, four bass drums, seven bells, tam-tam, three airplane propellers and a siren (Figures 10 and 11), the composition was ahead of its time. Lehrman and Singer’s version uses sixteen computer-controlled baby grand pianos instead of the player pianos and industrial fans in place of the airplane propellers. The remaining instruments have been surgically enhanced with a variety of solenoids, beaters and other mechanical components to create an autonomous robot orchestra. Daily performances of the resulting glorious cacophony, which integrates industrial, atonal and jazz elements along with Dadaist history, are currently on display at the National Gallery of Art in Washington, DC.
Figures 10 and 11: Modern Adaptations of instruments in “Ballet Mécanique.” Photos by Charles Amirkhanian.
Singer’s collaborations with other members of the LEMUR tribe have resulted in an ensemble of robotic instruments that “play themselves.” The “GuitarBot,” “TibetBot,” “ForestBot” and the “!rBot” conjure up the quasi-organic nature of their inspiration, materiality and expected musical output. For example, the “TibetBot” is made from a series of Tibetan “singing” bowls that offer various combinations of atonal rhythms, rings and harmonic drones depending upon the strikes of the six robotic arms that hit either side of each bowl. Similarly, the “GuitarBot” is a modular MIDI-controlled device with four single-stringed microtunable slide guitar elements and a rotating pick/damper mechanism that can pluck melodies, play ostinatos and sustain or dampen notes (Figure 12).
Figure 12: Virtuoso violinist Mari Kimura is a frequent performer with LEMUR’s “GuitarBot.” Photo by Eric Singer.
The underlying premise of many collaborative experiences is that, with various design constraints, playing music can be made accessible to non-musicians. However, this goal should be counterbalanced by providing opportunities for extended exploration and more virtuosic expression. “Audiopad” allows multiple players to collaborate in the performance and interpretation of a composition. The system, developed by James Patten and Ben Recht at MIT’s Media Lab, tracks the movement of electronically tagged objects on a tabletop to convert motions into sound (Figure 13). Several players can engage with “Audiopad” simply by positioning objects around the table to activate playback of samples, loops, melodies, synth textures or combinations of these to make entirely new remixes. While the use of pre-determined musical events limits certain aspects of an individual’s creative control, it has the benefit of creating more cohesive sound spaces, particularly with multiple players. The straightforward interface makes it easy for novices to instantly make music without any instruction, but also supports more expert levels of play for those willing to invest time and practice. “Audiopad” has been used in performance as part of the Motion Graphics Series at Museo Guggenheim Bilbao and has been a sound installation at SONAR and Ars Electronica.
Figure 13: “Audiopad’s” visual and tactile approach to musical interpretation. Photo by Mariliana Arvelo.
“ReacTable” is another project able to support novices in installation settings and expert players in performance. Where “Audiopad” uses antennas to track the movement of objects on the table, “ReacTable” has a video camera to track the movement of objects on its translucent surface (Figure 14). Dr. Sergi Jordà at the Universitat Pompeu Fabra in Barcelona, Spain, led a team in the development of this collaborative device that allows multiple performers to play together to compose what its creators call “audio topologies.” The team wanted to make the system easily accessible, sonically stimulating and learnable without any instruction. As with “Audiopad,” simply moving objects around the table allows players to intuitively figure out the causal relationships between their actions and the resultant sound. Players also receive visual feedback based on their actions and amount of collective activity via a projector mounted under the table.
Figure 14: The orientation of objects on the “ReacTable” affects tempo, frequency and dynamics of music playback. Photo by Diemo Schwarz.
The audiovisual performance artist Golan Levin, in collaboration with Zachary Lieberman, excels in designing cleverly orchestrated musical experiences that range from intimate performance-oriented events to large-scale audience interactions. Their custom software created for “Messa di Voce” projects graphics that visually interpret live singers’ voices to distinguish dynamics and vocal nuance (Figure 15). Another experience called “Manual Input Sessions” explores the musical expressivity of hand gestures with a vision system that interprets the light generated by combining a low-tech overhead projector with video projection. Silhouettes of performers’ finger movements and gestures are projected and converted into sound to make a unique form of shadow play (Figure 16). Levin also led a team of artist/engineers in the creation of the “Dialtone Telesymphony” at Ars Electronica in 2001. Through a pre-registration process, a tonal sound palette was distributed to the participants’ cell phones along with a specified seating assignment. This allowed the composers to weave the audience together for a spatially distributed communal composition of sound and light with a free ringtone as a takeaway bonus.
Figure 15: Playful graphic interpretations of vocalizations in “Messa di Voce” by Golan Levin and Zachary Lieberman with Jaap Blonk and Joan La Barbara.
Figure 16: Making sounds from shapes in the “Manual Input Sessions” by Golan Levin and Zachary Lieberman.
One of the new acronyms bouncing around the mobile gaming community at the 2006 Game Developers Conference in San Jose, CA, was mdME: music-driven Mobile Entertainment. Jennifer Hruska, president of Sonic Network, showed off a few applications in development, including a camera phone that scans colors and converts them into sound. She and co-developer Shayne Guiliano also won a prize from Motorola for “PyroSonix™,” a Java-based rhythm game developed for use with Motorola’s RAZR, PEBL and SLVR handsets. As the name implies, “PyroSonix™” is based on the metaphor of “catching music” by snatching fiery sonic objects that rhythmically race across players’ handset screens. Developing mobile music-making applications is primed to be one of the next frontiers for innovation and exploration.
A new wave of musical interface design has embraced the act of participation and extended the potential community of players to include people with or without musical training. Accordingly, the role of music in the design process has moved away from traditional metrics based on classical training and toward metrics that involve the player experience. Designing easily accessible musical experiences that can sustain continued exploration is an ongoing challenge of this emerging field. As collective knowledge about how to achieve greater depth, creativity, expressivity, and emotion is accumulated, it can also be applied to other areas of design and interactive experience still lacking in these qualities.
[i] Max/MSP is a trademark of Cycling ’74, 379A Clementina Street, San Francisco, CA 94103 USA.
[ii] Maher, D. EyeToy Story: Part One. November 2003.
[iii] The MiniMoog was the first easily transportable synthesizer, which led not only to its commercial success but paved the way for an industry that still emulates many of Moog’s early ideas.
Causality: Creating a strong correlation between a player’s actions and immediate feedback for the player and, if relevant, the audience. The perception of cause and effect is progressively more difficult to distinguish as the number of participants that engage with a system increases.
Affordance: The ways in which a physical interface and sensors are integrated are of primary importance as they provide the affordances that make the interaction obvious, particularly to the novice. Whenever possible, design with materials and interactions that are self-revealing and convey what the player is supposed to do.
Mapping: Natural mapping behaviors evolve from the creation of a direct relationship between gesture and musical intent. Enhancing the illusion of control can also be achieved with supplemental effects such as lighting, visual imagery, tactile feedback and more, to create a highly responsive system based on player input.
Usability/Learnability: Supporting a broad range of musical possibilities for expressive performance is often tempered by restricting the range of musical possibilities available to the player through computer-mediation to enable ease of learning. The learning curve must balance simplicity of the interface to make it readily accessible and easily learnable for the intended demographic, while also creating an experience with enough depth and complexity to be continually engaging over time.
Musical Range: In multiplayer environments, limiting the musical range allows for greater participation and communication with the instruments and other participants. While the use of pre-composed musical events or sequences severely limits certain aspects of an individual's creative control, it has the benefit of creating more cohesive sound spaces.
Scalability: The number of players greatly influences the type of interface and music that is appropriate for a shared experience. An interface built for one person is generally quite different from one designed to accommodate multiple players. When considering scale, factors such as turn-taking protocols and gesture-sound correspondences should shift as the number of players increases.
Repeatability: Experiences should be responsive in a consistent way but also offer subtle changes in feedback based on player performance that encourage continued exploration and repeated use.
Due to growing interest in this field, a number of universities around the world are expanding their music/engineering/technology offerings to include courses and workshops in programming, electronics, interaction design, digital audio, physical modeling, DSP and interactive music. Joe Paradiso and Tod Machover at MIT’s Media Lab have an extensive history of innovations in musical interface design, including Hyperinstruments, controller development and the development of responsive environments. Perry Cook is the head of the Sound Lab at Princeton with a focus on sound synthesis techniques and controllers to augment human performance. Dave Wessel, Adrian Freed and Matt Wright at the Center for New Music and Audio Technologies (CNMAT) at UC Berkeley conduct interdisciplinary research to explore the intersection of music and technology. Max Mathews and Bill Verplank offer courses in Human Computer Interaction theory, design and controllers at Stanford’s Center for Computer Research in Music and Acoustics (CCRMA). Two-week summer workshops that include C programming for Atmel AVR microcontrollers and Pd and/or Max/MSP for music synthesis are frequently offered at Stanford and the Banff Centre for the Arts. Gideon D’Arcangelo, an interactive media artist and purveyor of “walkman busting,” offers a New Interfaces for Musical Expression (NIME) course at NYU’s Interactive Telecommunications Program, where Luke DuBois also teaches “Live Image Processing and Performance” (LIPP). They join forces at the end of the semester with a showcase of student work at Tonic in New York City. Paul Lehrman teaches a course at Tufts University in Electronic Musical Instrument Design. Todd Winkler and Butch Rovan codirect the Multimedia & Electronic Music Experiments (MEME) at Brown University. Golan Levin, Roger Dannenberg, and the author can be found straddling the Art, Computer Science and Entertainment Technology departments at Carnegie Mellon University.
In Canada, Sidney Fels directs the Media and Graphics Interdisciplinary Centre (MAGIC) at the University of British Columbia in Vancouver with a focus on human-communication technologies, musical interfaces and multimedia. Andrew Schloss investigates new musical instruments, interactive performance and improvisation at the University of Victoria. McGill University in Montreal offers an extensive music technology program that includes the Input Devices and Music Interaction Laboratory (IDMIL) and the collective influence of Marcelo Wanderley, Gary Scavone and Daniel Levitin on the faculty. At the University of Western Sydney in Australia, Garth Paine develops interactive installations for public spaces in the School of Communication Arts. Other schools with programs investigating electronic musical interfaces, installations, performance, research and development include CalArts, Mills College, UCLA, Rensselaer Polytechnic Institute’s Electronic Media, Arts and Communication (EMAC), University of Limerick in Ireland, Universitat Pompeu Fabra in Barcelona, Spain, University of Genova’s Lab of Musical Informatics, and Shizuoka University of Art and Culture (SUAC) in Hamamatsu, Japan. The Studio for Electro Instrumental Music (STEIM) in Amsterdam offers concerts, workshops and residencies to artists in the electronic performing arts. In Paris, the Institut de Recherche et Coordination Acoustique/Musique (IRCAM) offers internships to students from postgraduate and engineering studies to better understand music-related issues.
There are many trailblazers who helped launch this area of practice, many of whom still actively perform, direct and/or teach in a variety of interdisciplinary electronic arts/media/music/theater programs. Unfortunately, it was not possible to cover all of their contributions.