Tab Completion

I'm Tab Atkins Jr, and I wear many hats. I work for Google on the Chrome browser as a Web Standards Hacker. I'm also a member of the CSS Working Group, and am either a member or contributor to several other working groups in the W3C. You can contact me here.
<video> + <canvas> = <magic>

You've already learned about the <video> and <canvas> elements, but did you know that they were designed to be used together? In fact, the two elements are absolutely wondrous when you combine them. I'm going to show off a few super-simple demos using the two elements together, which should help suggest cool future projects for you fellow web authors. All of these demos work in every modern browser except Internet Explorer.

First, the basics

If you're just starting in HTML5 hacking you may not yet be familiar with the <video> tag and how to use it. Here's a super-simple example which we'll be using in the later demos:

<video controls loop>
	<source src=video.webm type=video/webm>
	<source src=video.ogg type=video/ogg>
	<source src=video.mp4 type=video/mp4>
</video>

The <video> tag hosts two attributes - @controls and @loop. @controls tells the browser to give the video the standard set of video controls: play/pause, scrubber, volume, etc. @loop tells the browser to start the video over again from the beginning when it ends.

Then, inside the <video> element, we have three <source> children, each pointing to a different encoding of the same video. The browser will try each source in order, and play the first one that it understands.
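
The browser does this source selection for you, but the idea is easy to model. Here's a minimal sketch, assuming a made-up helper called pickSource and a canPlay callback that stands in for the browser's own format check (in a page you could base it on the real video.canPlayType() method, which returns an empty string for types it can't play). It is only a rough model of "first understood source wins", not the browser's actual algorithm.

```javascript
// Hypothetical helper: walk the sources in order and return the first one
// whose type passes the supplied check. `canPlay` is injected so the logic
// can run anywhere; in a page it could be
//   function (type) { return video.canPlayType(type) !== ''; }
function pickSource(sources, canPlay) {
	for (var i = 0; i < sources.length; i++) {
		if (canPlay(sources[i].type)) return sources[i].src;
	}
	return null; // nothing playable
}

var sources = [
	{ src: 'video.webm', type: 'video/webm' },
	{ src: 'video.ogg', type: 'video/ogg' },
	{ src: 'video.mp4', type: 'video/mp4' }
];

// A pretend browser that only understands MP4 ends up with the third source.
var chosen = pickSource(sources, function (type) { return type === 'video/mp4'; });
```

Order matters here: a browser that understands all three formats would stop at video.webm and never look at the other two.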

See this code in action, playing the intro to one of the greatest cartoon series of all time.

(A note about fallback - all of these demos assume that your browser has <video> support, which isn't true in IE8 or earlier. Normally it is good practice to specify a Flash fallback or similar for those browsers, but that wouldn't accomplish much here - all of the techniques I demonstrate rely on the basic integration between the <video> element and the <canvas> element, which you can't achieve with a Flash player. Thus, I've omitted non-<video> fallback in these examples. I've still provided multiple sources, though, so all current browsers that do support <video> will be able to play it.)

Now, a simple example

Now that we know how to play a video, let's mix in some <canvas> shenanigans. First, check out the demo, then come back here for a code walkthrough. I'll wait.

...

Done? Cool! Now, how does this work? Surely it requires a few hundred lines of javascript to do? If you've cheated and already looked at the source code of the demo page, you'll know how easy it is. We start with this HTML:

<!DOCTYPE html>
<title>Video/Canvas Demo 1</title>

<canvas id=c></canvas>
<video id=v controls loop>
	<source src=video.webm type=video/webm>
	<source src=video.ogg type=video/ogg>
	<source src=video.mp4 type=video/mp4>
</video>

Same video code as before, but now with a <canvas> element thrown into the mix. Kinda empty and useless at the moment, though. We'll script it into action later.

Now, let's pair that with some CSS to get things positioned right:

<style>
body {
	background: black;
}

#c {
	position: absolute;
	top: 0;
	bottom: 0;
	left: 0;
	right: 0;
	width: 100%;
	height: 100%;
}

#v {
	position: absolute;
	top: 50%;
	left: 50%;
	margin: -180px 0 0 -240px;
}
</style>

This just centers the video in the screen, and then stretches the canvas to the full width and height of the browser window. Since the canvas comes first in the document, it'll be behind the video, exactly where we want it.

Here comes the magic!

<script>
document.addEventListener('DOMContentLoaded', function(){
	var v = document.getElementById('v');
	var canvas = document.getElementById('c');
	var context = canvas.getContext('2d');

	var cw = Math.floor(canvas.clientWidth / 100);
	var ch = Math.floor(canvas.clientHeight / 100);
	canvas.width = cw;
	canvas.height = ch;

	v.addEventListener('play', function(){
		draw(v,context,cw,ch);
	},false);

},false);

function draw(v,c,w,h) {
	if(v.paused || v.ended)	return false;
	c.drawImage(v,0,0,w,h);
	setTimeout(draw,20,v,c,w,h);
}
</script>

Take that in for a moment. Just breathe deeply and absorb it. So short, so sweet, so pretty. Now, let's step through it.

	var v = document.getElementById('v');
	var canvas = document.getElementById('c');
	var context = canvas.getContext('2d');

	var cw = Math.floor(canvas.clientWidth / 100);
	var ch = Math.floor(canvas.clientHeight / 100);
	canvas.width = cw;
	canvas.height = ch;

This part is simple. I grab hold of the video and canvas elements on the page, and grab the canvas's 2D-context as well, so I can draw on it. Then I do some quick calculation to find out how many pixels wide and tall I want the canvas's drawing surface to be. The <canvas> element itself is already stretched to the size of the screen by the CSS, so this'll make each pixel of the drawing surface equal to about 100x100 pixels on the screen.

That last bit may need some explanation if you're new to canvas. Normally, the visual size and the drawing-surface size of a <canvas> element will be the same. In that case, drawing a line 50px long will display a line 50px long. But that doesn't have to be true - you can set the drawing surface's size through the @width and @height properties on the <canvas> element itself, and then change the visual size of the canvas with CSS to be something different. The browser will then automatically upscale or downscale the drawing appropriately to make the drawing surface fill the visual size. In this case, I'm setting the drawing surface of the canvas to be very small - on most screens it'll be about 10px wide and 7px tall - and then stretching the visual size with CSS so that each pixel I draw gets blown up 100-fold by the browser. That's what causes the cool visual effect in the demo.
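
The size calculation is small enough to try by hand. This is a minimal sketch, assuming a hypothetical helper named surfaceSize and a 1024x768 window (both are illustrative choices, not part of the demo):

```javascript
// Hypothetical helper: choose a drawing-surface size so that each surface
// pixel covers roughly blockSize x blockSize pixels of the stretched canvas.
function surfaceSize(clientWidth, clientHeight, blockSize) {
	return {
		width: Math.floor(clientWidth / blockSize),
		height: Math.floor(clientHeight / blockSize)
	};
}

// Assumed 1024x768 window: the surface comes out 10 wide and 7 tall.
var size = surfaceSize(1024, 768, 100);
```

In the page itself you would then assign the result with canvas.width = size.width and canvas.height = size.height, and leave the CSS to stretch those few pixels across the window.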

	v.addEventListener('play', function(){
		draw(v,context,cw,ch);
	},false);

Another simple part. Here I attach some code to the "play" event on the video element. This event gets fired whenever the video starts playing, which normally means the user hit the "play" button. All I do is call the draw() function with the appropriate parameters - the video itself, the canvas's drawing context, and the canvas's width and height.

function draw(v,c,w,h) {
	if(v.paused || v.ended)	return false;
	c.drawImage(v,0,0,w,h);
	setTimeout(draw,20,v,c,w,h);
}

The first line just makes the function stop immediately if the user pauses or stops the video, so it's not burning CPU when nothing's changing. The third line calls the draw function again, allowing the browser a little breathing space to do other things like update the video itself. I'm putting in a 20ms delay so we'll get roughly 50fps, which is more than enough.
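
To make the stop-and-reschedule logic easier to poke at, here's a minimal sketch of the same loop shape with the scheduler passed in as a parameter. The names runLoop, drawFrame and schedule are made up for illustration; the demo itself just calls setTimeout directly.

```javascript
// Sketch of the draw loop with the drawing step and the scheduler injected.
function runLoop(video, drawFrame, schedule, delay) {
	if (video.paused || video.ended) return false; // stop: nothing is changing
	drawFrame(video);
	schedule(function () {
		runLoop(video, drawFrame, schedule, delay);
	}, delay);
	return true;
}

// A 20ms delay caps the loop at about 1000 / 20 = 50 calls per second.
var maxFps = 1000 / 20;

// Try it with a stand-in "video" that pauses itself after three frames and a
// fake scheduler that just queues the callbacks.
var frames = 0;
var fakeVideo = { paused: false, ended: false };
var queue = [];
runLoop(fakeVideo, function () {
	frames++;
	if (frames === 3) fakeVideo.paused = true;
}, function (fn) { queue.push(fn); }, 20);
while (queue.length) queue.shift()();
```

In a page, the scheduler argument could be something like function (fn, ms) { setTimeout(fn, ms); } and drawFrame would wrap the drawImage() call.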

The second line's where the magic happens - it draws the current frame of the video directly onto the canvas. Yes, it's exactly as simple as it looks. Just pass the video element, then the x, y, width, and height of the rectangle on the canvas you want it to draw into. In this case, it's filling up the entire canvas, but you can do less (or more!) if you wanted.

I'm using another trick here - remember how the canvas is really tiny? The video will be at least 20 times bigger than the canvas on most screens, so how do we draw it onto such a tiny canvas? The drawImage() function handles that for us - it automatically scales whatever you hand it in the first argument so that it fills the rectangle you specify. That means we authors don't have to worry about averaging the pixel colors (or extrapolating, if you're drawing a small video into a big rectangle), because the browser does it all for us. I'll use this trick more in the future, so watch out for it.
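
For a feel of the work drawImage() is saving you, here's a minimal sketch of averaging one color channel over a block of pixels. It's an illustration only: averageChannel is a made-up helper, the array is laid out as four numbers per pixel (red, green, blue, alpha), and browsers are free to use whatever resampling method they like when scaling.

```javascript
// Illustration only: average one channel over a size x size block of an RGBA
// pixel array that is `width` pixels wide. channel is 0 (red) to 3 (alpha).
function averageChannel(data, width, x0, y0, size, channel) {
	var sum = 0;
	for (var y = y0; y < y0 + size; y++) {
		for (var x = x0; x < x0 + size; x++) {
			sum += data[(y * width + x) * 4 + channel];
		}
	}
	return Math.round(sum / (size * size));
}

// A 2x2 image whose red values are 0, 100, 200 and 100.
var tiny = new Uint8ClampedArray([
	0, 0, 0, 255,     100, 0, 0, 255,
	200, 0, 0, 255,   100, 0, 0, 255
]);
var avgRed = averageChannel(tiny, 2, 0, 0, 2, 0); // (0 + 100 + 200 + 100) / 4 = 100
```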

And... that's it! The entire demo is done in 20 lines of easy-to-read javascript code, instantly producing a pretty cool background effect for any video you wish to play. You can trivially adjust the size of the "pixels" on the canvas by adjusting the lines that set the cw and ch variables.

Directly Manipulating Video Pixels

The last demo was cool, but it just leaned on the browser to do all the heavy lifting. The browser downscaled the video, drew it onto the canvas, and then upscaled the canvas pixels, all automatically. Let's try our hands at doing some of this ourselves! Check out the demo to see this in action, where I convert the video to grayscale on the fly.

The HTML for the page is basically identical:

<video id=v controls loop>
	<source src=video.webm type=video/webm>
	<source src=video.ogg type=video/ogg>
	<source src=video.mp4 type=video/mp4>
</video>
<canvas id=c></canvas>

Nothing new here, so let's move onto the script.

document.addEventListener('DOMContentLoaded', function(){
	var v = document.getElementById('v');
	var canvas = document.getElementById('c');
	var context = canvas.getContext('2d');
	var back = document.createElement('canvas');
	var backcontext = back.getContext('2d');

	var cw,ch;

	v.addEventListener('play', function(){
		cw = v.clientWidth;
		ch = v.clientHeight;
		canvas.width = cw;
		canvas.height = ch;
		back.width = cw;
		back.height = ch;
		draw(v,context,backcontext,cw,ch);
	},false);

},false);

function draw(v,c,bc,w,h) {
	if(v.paused || v.ended)	return false;
	// First, draw it into the backing canvas
	bc.drawImage(v,0,0,w,h);
	// Grab the pixel data from the backing canvas
	var idata = bc.getImageData(0,0,w,h);
	var data = idata.data;
	// Loop through the pixels, turning them grayscale
	for(var i = 0; i < data.length; i+=4) {
		var r = data[i];
		var g = data[i+1];
		var b = data[i+2];
		var brightness = (3*r+4*g+b)>>>3;
		data[i] = brightness;
		data[i+1] = brightness;
		data[i+2] = brightness;
	}
	idata.data = data;
	// Draw the pixels onto the visible canvas
	c.putImageData(idata,0,0);
	// Start over!
	setTimeout(function(){ draw(v,c,bc,w,h); }, 0);
}

The script is a bit longer this time, because we're actually doing some work. It's still really simple, though!

document.addEventListener('DOMContentLoaded', function(){
	var v = document.getElementById('v');
	var canvas = document.getElementById('c');
	var context = canvas.getContext('2d');
	var back = document.createElement('canvas');
	var backcontext = back.getContext('2d');

	var cw,ch;

	v.addEventListener('play', function(){
		cw = v.clientWidth;
		ch = v.clientHeight;
		canvas.width = cw;
		canvas.height = ch;
		back.width = cw;
		back.height = ch;
		draw(v,context,backcontext,cw,ch);
	},false);

This is almost the same as I had before, with two real differences. The first is that I'm creating a second canvas and pulling the context out of it as well; this is a "backing canvas", which I'll use to perform intermediate operations before painting the final result into the visible canvas in the markup. The backing canvas doesn't even need to be added to the document; it can hang out here in my script just fine. This strategy will be used a lot in later examples, and is really useful in general, so take note of it.

The second change is that I'm waiting to resize the canvases until the video is played, rather than just sizing them immediately. This is because the <video> element probably hasn't loaded its video up when the DOMContentLoaded event fires, so it is still using the default size for the element. By the time it's ready to play, though, it knows the size of the video and has sized itself appropriately, so we can set up the canvases to be the same size at that point.
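
The resizing step can be pulled into a tiny function if you want to reuse it. This is a minimal sketch, assuming a hypothetical helper called matchSize; it only relies on the clientWidth, clientHeight, width and height properties, so it can be tried with plain objects as well as real elements.

```javascript
// Hypothetical helper: copy the video's displayed size onto each canvas.
function matchSize(video, canvases) {
	var w = video.clientWidth;
	var h = video.clientHeight;
	for (var i = 0; i < canvases.length; i++) {
		canvases[i].width = w;
		canvases[i].height = h;
	}
	return { width: w, height: h };
}

// Stand-ins for the <video> element and the two canvases.
var fakeVideo = { clientWidth: 480, clientHeight: 360 };
var visible = { width: 300, height: 150 };
var backing = { width: 300, height: 150 };
var dims = matchSize(fakeVideo, [visible, backing]);
```

Inside the "play" handler this would be called as matchSize(v, [canvas, back]), at which point the video has sized itself.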

function draw(v,c,bc,w,h) {
	if(v.paused || v.ended)	return false;
	bc.drawImage(v,0,0,w,h);

Same as the first demo, the draw function begins by checking if it should stop, then just draws the video onto a canvas. Note that I'm drawing it onto the backing canvas, which, again, is just sitting in my script and isn't displayed in the document. The visible canvas is reserved for displaying the grayscale version, so I use the backing canvas to load up the initial video data.

	var idata = bc.getImageData(0,0,w,h);
	var data = idata.data;

Here's the first new bit. You can draw something onto a canvas with the normal canvas drawing functions or drawImage, or you can just manipulate the pixels directly through the ImageData object. getImageData() returns the pixels from a rectangle of the canvas - in this case I'm just getting the whole thing.

Warning! If you're following along and trying to run these demos on your desktop, this is where you'll probably run into trouble. The <canvas> element keeps track of where the data inside of it comes from, and if it knows that you got something from another website (for example, if the <video> element you painted into the canvas is pointing to a cross-origin file) it'll "taint" the canvas. You're not allowed to grab the pixel data from a tainted canvas. Unfortunately, file: urls count as "cross-origin" for this purpose, so you can't run this on your desktop. Either fire up a web server on your computer and view the page from localhost, or upload it to some other server you control.
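
If you'd rather fail quietly than have the script die, you can catch the error. Here's a minimal sketch, assuming a made-up wrapper named readPixels and assuming the exception carries the name "SecurityError", which is what current browsers use for a tainted canvas:

```javascript
// Hypothetical wrapper: return the pixels, or null if the canvas is tainted.
function readPixels(ctx, w, h) {
	try {
		return ctx.getImageData(0, 0, w, h);
	} catch (e) {
		if (e && e.name === 'SecurityError') return null;
		throw e; // some other problem; don't hide it
	}
}

// Stand-in context that behaves like a tainted canvas.
var taintedCtx = {
	getImageData: function () {
		var err = new Error('The canvas has been tainted by cross-origin data.');
		err.name = 'SecurityError';
		throw err;
	}
};
var pixels = readPixels(taintedCtx, 480, 360); // null
```

In draw() you could then bail out, or show a message, whenever the result is null.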

	for(var i = 0; i < data.length; i+=4) {
		var r = data[i];
		var g = data[i+1];
		var b = data[i+2];

Now, a quick note about the ImageData object. It returns the pixels in a special way, to make them easy to manipulate. If you have, say, a 100x100 pixel canvas, it contains a total of 10,000 pixels. The ImageData array for it will then have 40,000 elements, because the pixels are broken up by component and listed sequentially. Each group of four elements in the ImageData array represents the red, green, blue, and alpha channels for one pixel. To loop through the pixels, just increment your counter by 4 every time, like I do here. Each channel, then, is an integer between 0 and 255.
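
The arithmetic for finding a particular pixel follows directly from that layout. A minimal sketch, where pixelIndex is an illustrative name rather than anything the canvas API provides:

```javascript
// Offset of pixel (x, y) in an RGBA array for an image `width` pixels wide.
function pixelIndex(x, y, width) {
	return (y * width + x) * 4;
}

var width = 100, height = 100;
var total = width * height * 4;        // 40000 array elements
var last = pixelIndex(99, 99, width);  // 39996: red of the bottom-right pixel
// Its green, blue and alpha values sit at last + 1, last + 2 and last + 3.
```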

		var brightness = (3*r+4*g+b)>>>3;
		data[i] = brightness;
		data[i+1] = brightness;
		data[i+2] = brightness;

Here, a quick bit of math converts the rgb of the pixel into a single "brightness" value. As it turns out, our eyes respond most strongly to green light, slightly less so to red, and much less to blue. So, I weight the channels appropriately before taking the average. Then, we just feed that single value back to all three channels; as we should all know, when all three values are equal in rgb you get gray. (In this whole process I'm completely ignoring the fourth member of each group, the alpha channel, because it's always going to be 255.)
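
To see the weighting at work, here's the same math as a standalone function run on three pure-color pixels. The name grayscale and the sample array are just for illustration.

```javascript
// Same 3/8 red, 4/8 green, 1/8 blue weighting, applied to an RGBA array.
function grayscale(data) {
	for (var i = 0; i < data.length; i += 4) {
		var brightness = (3 * data[i] + 4 * data[i + 1] + data[i + 2]) >>> 3;
		data[i] = brightness;
		data[i + 1] = brightness;
		data[i + 2] = brightness;
		// data[i + 3] (alpha) is left alone
	}
	return data;
}

var px = grayscale(new Uint8ClampedArray([
	255, 0, 0, 255,   // pure red
	0, 255, 0, 255,   // pure green
	0, 0, 255, 255    // pure blue
]));
// px is now [95, 95, 95, 255,  127, 127, 127, 255,  31, 31, 31, 255]
```

Pure green comes out brightest and pure blue darkest, which matches the ordering described above.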

	idata.data = data;

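Shove the modified pixel array back into the ImageData object... (Strictly speaking, this line isn't needed: data is the very same array that idata.data points to, so the changes are already in there, and since idata.data is read-only the assignment doesn't actually do anything.)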

	c.putImageData(idata,0,0);

...and then shove the whole thing into the visible canvas! We didn't need to do any complicated drawing at all; just grab the pixels, manipulate them, and shove them back in. So easy!

A final note - real-time full-video pixel manipulation is one of those rare places where micro-optimizations actually matter. You can see their effects in my code here. Originally, I didn't pull the pixel data out of the ImageData object, and just said "var r = idata.data[i];" and so on each time, which meant several extra property lookups in every iteration of the loop. I also originally just divided the brightness by 8 and floored the value, which is slightly slower than bitshifting by 3 bits. In normal code these sorts of things are completely insignificant, but when you're doing them several million times a second (the video is 480x360, and thus contains 172,800 pixels, each of which is individually handled roughly 100 times a second) those tiny delays add up into a noticeable lag.
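
If swapping a division for a shift makes you nervous, it's cheap to check that the two agree over the whole range the loop can produce. A minimal sketch (shiftMatchesFloor is a made-up name; the largest weighted sum is 8 * 255 = 2040):

```javascript
// For non-negative integers up to `max`, does an unsigned shift by 3 give the
// same answer as dividing by 8 and flooring?
function shiftMatchesFloor(max) {
	for (var n = 0; n <= max; n++) {
		if ((n >>> 3) !== Math.floor(n / 8)) return false;
	}
	return true;
}

var same = shiftMatchesFloor(2040); // true
var pixelsPerFrame = 480 * 360;     // 172800 loop iterations per frame
```

This only shows the two forms are interchangeable here; how much faster the shift is depends on the browser's JavaScript engine.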

More Advanced Pixel Manipulation

You can operate on more than just a single pixel at a time, too, composing some fairly complex visual effects. As I noted at the end of the previous section, performance matters a lot here, but you'd be surprised what you can squeeze out with a little creativity. As you can see in the demo, I'll be creating an emboss effect in this example, which requires you to use several input pixels together to compute the value of each output pixel.

Here's the code. The HTML and most of the beginning code is identical to the previous example, so I've elided everything but the draw() function:

function draw(v,c,bc,cw,ch) {
	if(v.paused || v.ended)	return false;
	// First, draw it into the backing canvas
	bc.drawImage(v,0,0,cw,ch);
	// Grab the pixel data from the backing canvas
	var idata = bc.getImageData(0,0,cw,ch);
	var data = idata.data;
	var w = idata.width;
	var limit = data.length;
	// Loop through the subpixels, convolving each using an edge-detection matrix.
	for(var i = 0; i < limit; i++) {
		if( i%4 == 3 ) continue;
		data[i] = 127 + 2*data[i] - data[i + 4] - data[i + w*4];
	}
	// Draw the pixels onto the visible canvas
	c.putImageData(idata,0,0);
	// Start over!
	setTimeout(draw,20,v,c,bc,cw,ch);
}

Now let's step through that.

function draw(v,c,bc,cw,ch) {
	if(v.paused || v.ended)	return false;
	// First, draw it into the backing canvas
	bc.drawImage(v,0,0,cw,ch);
	// Grab the pixel data from the backing canvas
	var idata = bc.getImageData(0,0,cw,ch);
	var data = idata.data;

Same as the last example. Check to see if we should stop, then draw the video onto the backing canvas and grab the pixel data from it.

	var w = idata.width;

The significance of this line needs some explanation. I'm already passing the canvas's width into the function (as the cw variable), so why am I re-measuring its width here? Well, I was actually lying to you earlier when I explained how large the pixel array will be. The browser might have one pixel of canvas map to one pixel of ImageData, but browsers are allowed to use higher resolutions in the image data, representing each pixel of canvas as a 2x2 block of ImageData pixels, or even 3x3 or greater!

If they use a "high-resolution backing store", as this is called, it means better display, as aliasing artifacts (jagged edges on diagonal lines) become much smaller and less noticeable. It also means that rather than a 100x100 pixel canvas giving you an ImageData.data object with 40,000 numbers, it might have 160,000 numbers instead. By asking the ImageData for its width and height, we ensure that we loop through the pixel data properly no matter whether the browser uses a low-res or high-res backing store for it.

It's very important that you use this properly whenever you need the width or height of the data you pulled out as an ImageData object. If too many people screw it up and just use the canvas's width and height instead, then browsers will be forced to always use a low-res backing store to be compatible with those broken scripts!
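
In practice that just means reading the dimensions off the ImageData object whenever you walk its pixels. Here's a minimal sketch, assuming a made-up helper named forEachPixel; it works on anything shaped like an ImageData (width, height and data properties).

```javascript
// Walk an ImageData-like object using its own width and height, never the
// canvas's. `visit` receives the array, the pixel's offset, and its x and y.
function forEachPixel(idata, visit) {
	var data = idata.data, w = idata.width, h = idata.height;
	for (var y = 0; y < h; y++) {
		for (var x = 0; x < w; x++) {
			visit(data, (y * w + x) * 4, x, y);
		}
	}
}

// Stand-in for a 3x2 ImageData: 6 pixels, 24 numbers.
var fake = { width: 3, height: 2, data: new Uint8ClampedArray(24) };
var visited = 0;
var lastOffset = -1;
forEachPixel(fake, function (data, offset) {
	visited++;
	lastOffset = offset;
});
// visited is 6 and lastOffset is 20
```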

	var limit = data.length;
	for(var i = 0; i < limit; i++) {
		if( i%4 == 3 ) continue;
		data[i] = 127 + 2*data[i] - data[i + 4] - data[i + w*4];
	}

I'm grabbing the data's length and stuffing it into a variable, so I don't have to pay for a property access on every single iteration of the loop (remember, micro-optimizations matter when you're doing real-time video manipulation!). Then I just loop through the pixels, like I did before. If the pixel happens to be for the alpha channel (every fourth number in the array), I can just skip it - I don't want to change the transparency. Otherwise, I'll do a little math to find the difference between the current pixel's color channel and the similar channels of the pixels below and to the right, then just combine that difference with the "average" gray value of 127. This has the effect of making areas where the pixels are the same color a flat medium gray, but edges where the color suddenly changes will turn either bright or dark.

There's another optimization here - because I'm only comparing the current pixel with pixels 'further ahead' in the data which I haven't looked at yet, I can just store the changed value right back in the original data, because nothing will ever look at the current pixel's data again after this point. This means I don't have to allocate a big array to hold the results before turning it back into an ImageData object.
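
The loop is easy to try on a toy image. This minimal sketch wraps it in a function (emboss is an illustrative name) and runs it on a 2x2 image held in a Uint8ClampedArray, which clamps out-of-range results to 0-255 when they're stored:

```javascript
// The emboss loop as a standalone function. `data` is an RGBA array and
// `w` is the image width in pixels.
function emboss(data, w) {
	var limit = data.length;
	for (var i = 0; i < limit; i++) {
		if (i % 4 == 3) continue; // leave alpha alone
		data[i] = 127 + 2 * data[i] - data[i + 4] - data[i + w * 4];
	}
	return data;
}

// 2x2 image: one bright pixel (200) in the top-left, the rest at 100.
var img = new Uint8ClampedArray([
	200, 200, 200, 255,   100, 100, 100, 255,
	100, 100, 100, 255,   100, 100, 100, 255
]);
emboss(img, 2);
// Top-left:  127 + 400 - 100 - 100 = 327, clamped to 255 (a bright edge).
// Top-right: 127 + 200 - 100 - 100 = 127 (flat gray).
// Bottom row: there is no row below, so the lookup runs off the end of the
// array, the sum is NaN, and a Uint8ClampedArray stores that as 0.
```

That last point is worth knowing about: with this exact formula the final row of the frame has nothing below it to compare against, so in this sketch it comes out black.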

	c.putImageData(idata,0,0);
	setTimeout(draw,20,v,c,bc,cw,ch);

Finally, draw the modified ImageData object into the visible canvas, and set up another call to the function in 20 milliseconds. The drawing step is the same as in the previous example; the 20ms timer is the one from the first demo.

The Wrapup

So, we've explored the basics of combining HTML5's <canvas> and <video> elements today. The demos were very basic, but illustrated all the essential techniques you'll need to do something even cooler on your own:

  1. You can draw a video directly onto a canvas.
  2. When you draw onto a canvas, the browser will automatically scale the image for you if necessary.
  3. When you display a canvas, the browser will again scale it automatically if the visible size is different from the size of the backing-store.
  4. You can do direct pixel-level manipulation of a canvas by just grabbing the ImageData, changing it, and drawing it back in.

In Part 2 of this article, I'll explore some more interesting applications of video/canvas integration, including a real-time video-to-ascii converter!

#1 - Tom Potts:

Hi, Tab; I can't seem to get to the demos -- I get a 403 forbidden. Have you taken them down deliberately, or could you fix it so I can see how it all hangs together? :o)
