
JavaScript motion detection

by Romuald Quantin (@soundstep • soundstep.com)

Content

  • HTML5 getUserMedia API
  • Blend mode difference
  • Assets preparation
  • JavaScript motion detection
  • Where to go from here

Modified

12 June 2012

Tags: HTML5, JavaScript, jQuery, mobile

Requirements

Prerequisite knowledge

JavaScript beginner/intermediate, HTML5 beginner/intermediate, and basic knowledge of jQuery. Because it uses the getUserMedia and audio APIs, the demo requires Chrome Canary and must be run from a local web server.

User level

All

Sample File

  • javascript_motion_detection_soundstep.zip

Additional required products (third-party/labs/open source)

  • Chrome Canary
  • A local web server, such as one of the following:
    • XAMPP (Mac, Windows, Linux)
    • MAMP (Mac)
  • Video encoder for the webm file format, such as:
    • Online ConVert Video converter to convert to the WebM format (VP8)
  • Video encoder for the mp4 file format, such as one of the following:
    • Adobe Media Encoder
    • Adobe After Effects
    • HandBrake
  • Video encoder for the ogg file format, such as:
    • Easy HTML5 Video

In this article, I discuss how to detect a user's movement from a webcam stream in JavaScript. The demo shows video gathered from the user's webcam. The user can play notes on a xylophone built in HTML, using their own physical movements, in real time. The demo is based on two HTML5 works in progress: the getUserMedia API displays the user's webcam, and the Audio API plays the xylophone notes.

I also show how to use a blend mode to capture the user's movement. Blend modes are a common feature in languages that have a graphics API and in almost any graphics software. To quote Wikipedia about blend modes, "Blend modes in digital image editing are used to determine how two Layers are blended into each other." Blend modes are not natively supported in JavaScript but, since a blend mode is nothing more than a mathematical operation between pixels, I build the "difference" blend mode myself.

I made a demo to show where web technologies are heading. JavaScript and HTML5 will provide tools for new types of interactive web applications. Exciting!

HTML5 getUserMedia API

As of the writing of this article, the getUserMedia API is still a work in progress. With the fifth major release of HTML by the W3C, there has been a surge of new APIs offering access to native hardware devices. The navigator.getUserMedia() API provides the tools to enable a website to capture audio and video.

Many articles on different blogs cover the basics of how to use the API, so I do not explain it in much detail. At the end of this article is a list of useful links about getUserMedia if you wish to learn more. Here, I show you how I enabled it for this demo.

Today, the getUserMedia API is usable only in Opera 12 and Chrome Canary, neither of which is a public release yet. For this demo, you must use Chrome Canary, because it supports the AudioContext needed to play the xylophone notes; Opera does not support AudioContext.

Once Chrome Canary is installed and launched, enable the API. In the address bar, type: about:flags. Under the Enable MediaStream option, click the toggle to enable the API.

Figure 1. Enable MediaStream.

Lastly, due to security restrictions, you must run the sample files from a local web server; camera access is denied for local file:/// URLs.
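If you don't have XAMPP or MAMP handy, any static server will do. For example, from the directory containing the sample files (a minimal sketch, assuming Python 2 is installed):

python -m SimpleHTTPServer 8000

Then browse to http://localhost:8000 to load the demo page.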

Now that everything is ready and enabled, add a video tag to play the webcam stream. The received stream is attached to the src property of the video tag using JavaScript.

<video id="webcam" autoplay width="640" height="480"></video>

In the JavaScript, you go through two steps. The first one is to find out if the user's browser can use the getUserMedia API.

function hasGetUserMedia() {
    return !!(navigator.getUserMedia || navigator.webkitGetUserMedia ||
              navigator.mozGetUserMedia || navigator.msGetUserMedia);
}
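For example, you might gate the rest of the setup on this check (a minimal sketch; the fallback behavior is yours to choose):

if (!hasGetUserMedia()) {
    // getUserMedia is unavailable: warn the user and fall back
    // to the pre-rendered demo video instead of the webcam
    alert('getUserMedia is not supported in this browser');
}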

The second step is to try to get the stream from the user's webcam.

Note: I’m using jQuery in this article and in the demo, but feel free to use any selector you like, such as a native document.querySelector.

var webcamError = function(e) {
    alert('Webcam error!', e);
};

var video = $('#webcam')[0];

if (navigator.getUserMedia) {
    navigator.getUserMedia({audio: true, video: true}, function(stream) {
        video.src = stream;
    }, webcamError);
} else if (navigator.webkitGetUserMedia) {
    navigator.webkitGetUserMedia({audio: true, video: true}, function(stream) {
        video.src = window.webkitURL.createObjectURL(stream);
    }, webcamError);
} else {
    //video.src = 'video.webm'; // fallback
}

You are now ready to display a webcam stream in an HTML page. The next section provides an overview of the structure and assets used in the demo.

If you hope to use getUserMedia in production, I recommend following Addy Osmani’s work. Addy has created a shim for the getUserMedia API with a Flash fallback.

Blend mode difference

The detection of the user’s movement is performed using a blend mode difference.

Blending two images using difference is nothing more than subtracting pixel values. The Wikipedia article about blend modes describes difference this way: "Difference subtracts the top layer from the bottom layer or the other way round, to always get a positive value. Blending with black produces no change, as values for all colors are 0."

To perform this operation, you need two images. The code loops over each pixel and subtracts each color channel of the first image from the corresponding color channel of the second image.

For example, given two solid red images, the color channels at every pixel in both images are:

- red: 255 (0xFF)

- green: 0

- blue: 0

The following operations subtract the color values of these images:

- red: 255 – 255 = 0

- green: 0 – 0 = 0

- blue: 0 – 0 = 0

In other words, applying a blend mode difference to two identical images produces a black image. Let's discuss how that is useful and where those images come from.

Taking the process step by step: first, draw an image from the webcam stream into a canvas at a certain interval. In the demo, I draw 60 images per second, which is more than you need. The current image displayed in the webcam stream is the first of the two images you blend.

The second image is another capture from the webcam, taken at the previous time interval. Now that you have two images, subtract their pixel values. If the images are identical (in other words, if the user is not moving at all), the operation produces a black picture.

The magic happens when the user starts to move. The image taken at the current time interval will be slightly different from the image at the previous time interval. If you subtract different values, some colors start to appear, which means that something moved between these two frames!

Figure 2. Example of a blended image from a webcam stream.

The motion detection process is now almost complete. The last step is to loop over all of the pixels in the blended image and determine whether any of them are not black.

Assets preparation

To make this demo work, you need several things that are common to most websites. You need, of course, an HTML page that contains some canvas and video tags, and some JavaScript to make the application run.

Then, you need an image of the xylophone, ideally with a transparent background (PNG). In addition, you need a separate image for each key on the xylophone, with a small brightness change. Used as rollovers, these images give the user visual feedback by highlighting the triggered note.

You need an audio file containing each note's sound; mp3 files will do. Playing a xylophone without sound wouldn't be much fun.

Finally, I think it is important to show a video fallback of the demo in case the user can't or doesn't want to use their webcam. You need to create a video and encode it to mp4, ogg, and webm to cover all browsers. The next section provides a list of the tools used to encode the videos. You can find all the assets in the demo zip file, js_motion_detection.zip.

Encode HTML5 videos

I encoded the video demo fallback into the three common formats needed to display HTML5 videos.

I had some trouble with the webm format, as most of the encoding tools I found would not let me choose a bitrate. The bitrate sets both the video file size and quality, and is usually written as kbps (kilobits per second). I ended up using the Online ConVert Video converter to convert to the WebM format (VP8), which worked well.

For the mp4 version, you can use tools from the Adobe Creative Suite, such as Adobe Media Encoder or After Effects, or a free alternative such as HandBrake.

For the ogg format, I used tools that encode to all formats at once. I wasn't happy with the quality of their webm and mp4 output, and I couldn't seem to change the quality settings, but the ogg video was fine. You can use either Easy HTML5 Video or Miro.

Prepare the HTML

This is the last step before starting to code some JavaScript. Here are the required HTML tags. (I won't list everything I used, such as the video demo fallback code, which you can find in the downloadable source.)

You need a simple video tag to receive the user's webcam feed. Don't forget to set the autoplay attribute; without it, the stream pauses on the first frame received.

<video id="webcam" autoplay width="640" height="480"></video>

The video tag will not actually be displayed; its CSS display style is set to none (a one-line jQuery call, shown after the canvas markup below). Instead, the received stream is drawn into a canvas so it can be used for the motion detection. Create the canvas that receives the webcam stream:

<canvas id="canvas-source" width="640" height="480"></canvas>

You also need another canvas to show what’s happening in real time during the motion detection:

<canvas id="canvas-blended" width="640" height="480"></canvas>

Create a div that contains the xylophone images. Place the xylophone on top of the webcam so the user can virtually play it with their hands. On top of the xylophone sit the note images, hidden, which are displayed as rollovers when a note is triggered.

<div id="xylo"> <div id="back"><img id="xyloback" src="images/xylo.png"/></div> <div id="front"> <img id="note0" src="images/note1.png"/> <img id="note1" src="images/note2.png"/> <img id="note2" src="images/note3.png"/> <img id="note3" src="images/note4.png"/> <img id="note4" src="images/note5.png"/> <img id="note5" src="images/note6.png"/> <img id="note6" src="images/note7.png"/> <img id="note7" src="images/note8.png"/> </div> </div>

JavaScript motion detection

The JavaScript steps to make this demo work are as follows (a skeleton tying them together appears right after the list):

  • detect whether the application can use the getUserMedia API
  • detect whether the webcam stream is being received
  • load the sounds of the xylophone notes
  • start a time interval and call an update function
  • at each time interval, draw the webcam feed onto a canvas
  • at each time interval, blend the current webcam image into the previous one
  • at each time interval, draw the blended image onto a canvas
  • at each time interval, check the pixel color values in the areas of the xylophone notes
  • at each time interval, play a specific xylophone note if motion is detected
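Tying these steps together, the top level of the demo might look roughly like the sketch below. The names initialize and loadSounds are assumptions for illustration; the real wiring ships in the demo source.

function initialize() {
    // steps 1-2: check the API, then request the webcam stream (shown earlier)
    if (!hasGetUserMedia()) return;
    // step 3: load the xylophone note sounds (hypothetical helper)
    loadSounds();
    // steps 4-9: start the draw/blend/detect loop
    update();
}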

First step: prepare variables

Prepare some variables to store what you need for the drawing and motion detection. You need references to the two canvases, a variable for each canvas context, and a variable to store the last drawn webcam frame. Also store the x-axis position of each xylophone note, the note sounds to play, and some sound-related variables.

var notesPos = [0, 82, 159, 238, 313, 390, 468, 544];
var timeOut, lastImageData;
var canvasSource = $("#canvas-source")[0];
var canvasBlended = $("#canvas-blended")[0];
var contextSource = canvasSource.getContext('2d');
var contextBlended = canvasBlended.getContext('2d');
var soundContext, bufferLoader;
var notes = [];
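The article doesn't show how the notes array is populated. A plausible sketch follows, using notesPos for the x positions; the y position, rectangle size, and buffer property are assumptions for illustration:

for (var n = 0; n < notesPos.length; ++n) {
    notes.push({
        // detection rectangle over one xylophone key (y/width/height assumed)
        area: { x: notesPos[n], y: 300, width: 60, height: 60 },
        // rollover image revealed when the note is triggered
        visual: $('#note' + n)[0],
        // decoded audio sample, assigned once the sounds load (assumption)
        buffer: null
    });
}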

Also invert the x axis of the webcam stream so the user feels like they are in front of a mirror; this makes it a bit easier for them to reach the notes. Here is how to do that:

contextSource.translate(canvasSource.width, 0);
contextSource.scale(-1, 1);

Second step: update and draw video

Create a function called update that executes 60 times per second and calls the other functions that draw the webcam stream onto a canvas, blend the images, and detect the motion.

function update() {
    drawVideo();
    blend();
    checkAreas();
    timeOut = setTimeout(update, 1000/60);
}
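As an aside, a requestAnimationFrame loop (prefixed in Chrome Canary at the time) would sync the updates with the display instead of relying on a fixed timeout. A sketch of the same function on that basis:

function update() {
    drawVideo();
    blend();
    checkAreas();
    // schedule the next update on the browser's paint cycle
    window.webkitRequestAnimationFrame(update);
}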

Drawing the video onto a canvas is quite easy; it takes only one line:

function drawVideo() {
    contextSource.drawImage(video, 0, 0, video.width, video.height);
}

Third step: build the blend mode difference

Create a helper function to ensure that the result of the subtraction is always positive. You could use the built-in Math.abs, but I wrote an equivalent with bitwise operators, which most of the time performs better: it XORs the value with its sign mask and then subtracts the mask, flipping negative numbers to positive. You don't have to understand it exactly; just use it as it is:

function fastAbs(value) {
    // equivalent to Math.abs()
    return (value ^ (value >> 31)) - (value >> 31);
}
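A quick sanity check (a trivial sketch; both lines print 25):

console.log(fastAbs(-25));  // 25
console.log(Math.abs(-25)); // 25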

Now write the blend mode difference. The function receives three parameters:

  • a flat array of pixels to store the result of the subtraction
  • a flat array of pixels of the current webcam stream image
  • a flat array of pixels of the previous webcam stream image

The arrays of pixels are flattened and contain the color channel values of red, green, blue, and alpha:

  • pixels[0] = red value
  • pixels[1] = green value
  • pixels[2] = blue value
  • pixels[3] = alpha value
  • pixels[4] = red value
  • pixels[5] = green value
  • and so on…

In the demo, the webcam stream has a width of 640 pixels and a height of 480 pixels, so the size of the array is 640 * 480 * 4 = 1,228,800.

The best way to loop over the array of pixels is to advance four values at a time (red, green, blue, alpha), one pixel per iteration, meaning there are only 307,200 iterations instead of 1,228,800. Much better.

function difference(target, data1, data2) {
    var i = 0;
    while (i < (data1.length / 4)) {
        // read the four channels of pixel i; the finished version
        // below performs the subtraction on each of them
        var red = data1[i*4];
        var green = data1[i*4+1];
        var blue = data1[i*4+2];
        var alpha = data1[i*4+3];
        ++i;
    }
}

You can now subtract the pixel values of the two images. For performance (and it seems to make a big difference), skip the subtraction when the color channel is already 0, and set the alpha to 255 (0xFF) unconditionally. Here is the finished blend mode difference function (feel free to optimize!):

function difference(target, data1, data2) {
    // blend mode difference
    if (data1.length != data2.length) return null;
    var i = 0;
    while (i < (data1.length * 0.25)) {
        target[4*i] = data1[4*i] == 0 ? 0 : fastAbs(data1[4*i] - data2[4*i]);
        target[4*i+1] = data1[4*i+1] == 0 ? 0 : fastAbs(data1[4*i+1] - data2[4*i+1]);
        target[4*i+2] = data1[4*i+2] == 0 ? 0 : fastAbs(data1[4*i+2] - data2[4*i+2]);
        target[4*i+3] = 0xFF;
        ++i;
    }
}

I used a slightly different version in the demo to get better accuracy. I created a threshold function to apply to the color values: it maps a value to black below a certain limit and to white above it. You can use it as is:

function threshold(value) {
    return (value > 0x15) ? 0xFF : 0;
}

I also average the three color channels, which results in an image whose pixels are either black or white.

function differenceAccuracy(target, data1, data2) {
    if (data1.length != data2.length) return null;
    var i = 0;
    while (i < (data1.length * 0.25)) {
        var average1 = (data1[4*i] + data1[4*i+1] + data1[4*i+2]) / 3;
        var average2 = (data2[4*i] + data2[4*i+1] + data2[4*i+2]) / 3;
        var diff = threshold(fastAbs(average1 - average2));
        target[4*i] = diff;
        target[4*i+1] = diff;
        target[4*i+2] = diff;
        target[4*i+3] = 0xFF;
        ++i;
    }
}

The result is something like this black and white image.

Figure 3. The blend mode difference results in a black and white image.

Fourth step: blend canvas

The function to blend the images is now ready; you just need to send it the right values: the arrays of pixels.

The JavaScript drawing API provides a method to retrieve an instance of an ImageData object. This object contains useful properties, such as width and height, plus the data property, which is the array of pixels you need. Also create an empty ImageData instance to receive the result, and store the current webcam image for the next iteration of the time interval. Here is the function that blends the images and draws the result onto a canvas:

function blend() {
    var width = canvasSource.width;
    var height = canvasSource.height;
    // get the current webcam image data
    var sourceData = contextSource.getImageData(0, 0, width, height);
    // create an image if the previous image doesn't exist
    if (!lastImageData) lastImageData = contextSource.getImageData(0, 0, width, height);
    // create an ImageData instance to receive the blended result
    var blendedData = contextSource.createImageData(width, height);
    // blend the two images
    differenceAccuracy(blendedData.data, sourceData.data, lastImageData.data);
    // draw the result onto the blended canvas
    contextBlended.putImageData(blendedData, 0, 0);
    // store the current webcam image for the next iteration
    lastImageData = sourceData;
}

Fifth step: search for pixels

The final step of this demo is the motion detection itself, using the blended images created in the previous section.

When preparing the assets, you placed eight note images over the xylophone to use as rollovers. Use these note positions and sizes as rectangular areas in which to retrieve pixels from the blended image, then loop over those pixels looking for white ones.

In the loop, average the color channels of each pixel and accumulate the result in a variable. After the loop completes, compute a global average over all the pixels in the area.

Avoid noise and small motions by setting a limit of 10: if the average is greater than 10, consider that something has moved since the last frame. That is the motion detection!

Then, play the corresponding sound and show the note rollover. The function looks like this:

function checkAreas() {
    // loop over the note areas
    for (var r = 0; r < 8; ++r) {
        // get the pixels in a note area from the blended image
        var blendedData = contextBlended.getImageData(
            notes[r].area.x, notes[r].area.y,
            notes[r].area.width, notes[r].area.height);
        var i = 0;
        var average = 0;
        // loop over the pixels
        while (i < (blendedData.data.length / 4)) {
            // average the color channels of this pixel
            average += (blendedData.data[i*4] + blendedData.data[i*4+1] + blendedData.data[i*4+2]) / 3;
            ++i;
        }
        // calculate the average of the color values over the whole note area
        average = Math.round(average / (blendedData.data.length / 4));
        if (average > 10) {
            // over this small limit, consider that a movement is detected:
            // play a note and show visual feedback to the user
            playSound(notes[r]);
            notes[r].visual.style.display = "block";
            $(notes[r].visual).fadeOut();
        }
    }
}
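One thing the article never shows is playSound. A minimal Web Audio sketch, assuming soundContext is a webkitAudioContext created at startup and each note carries its decoded sample in note.buffer (both assumptions), might look like this:

function playSound(note) {
    // create a one-shot source node for the decoded sample
    var source = soundContext.createBufferSource();
    source.buffer = note.buffer; // assumes the sample was decoded at load time
    source.connect(soundContext.destination);
    source.noteOn(0); // later builds renamed this to start(0)
}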

Where to go from here

I hope this brings you some new ideas for creating video-based interactions, or for stepping into territory that Flash developers such as myself have owned for years: video-interactive campaigns based on either pre-rendered videos or webcam streams.

You can learn more about the getUserMedia API by following the development of the Chrome and Opera browsers, and by reading the draft of the API on the W3C website.

Creatively, if you are looking for ideas for motion detection or video-based applications, I recommend following the experiments made by Flash developers and by motion designers using After Effects. Now that JavaScript and HTML are gaining momentum and new capabilities, there is a lot to learn and try. Experiment with the following resources:

  • Soundstep blog
  • Blend modes
  • GetUserMedia API
  • GetUserMedia article to get started
  • GetUserMedia shim
  • Great developers blogs to write better code and get ideas from:
    • Grant Skinner's blog
    • Keith Peter's blog
  • Motion designers blogs to get ideas from:
    • Video Copilot
    • ProVideo Coalition
    • Motionographer

This work is licensed under a Creative Commons Attribution-Noncommercial-Share Alike 3.0 Unported License. Permissions beyond the scope of this license, pertaining to the examples of code included within this work, are available at Adobe.

