A little bit about myself, first: I'm a product engineer at a company called Stripe. We build an API and related tools to help developers and businesses process all sorts of payments, from credit card payments to ACH to Alipay.
We're located in sunny San Francisco, California, and I'm currently in one of our conference rooms. This one's called Zebra, as you can probably tell from the pictures behind me.
Now, we don't actually use WebRTC here at Stripe, but we do use Google Hangouts for our ultra-time-sensitive communications.
I don't know if you've ever experienced this, but oftentimes, Google Hangouts will be pretty laggy.
Chats will appear out of order momentarily, and often take more than 5 seconds to be delivered.
I wish I had a dime every time I got this message.
So, we could just turn around and talk to each other face-to-face, but sometimes I find it hard to express myself fully without being able to use emoji.
Jokes about in-person-communication aside, the way Hangouts works right now doesn't make too much sense when I'm talking to a person who's sitting 20 feet away from me.
My message has to travel from my computer here in San Francisco, to Stripe's router, all the way to some...
...random data center in The Dalles, Oregon.
That's over 600 miles (or 1000 km) away!
My message not only takes the time to physically travel across the wire to Oregon,
but once it's there, I no longer have control over what I sent! It's out there. On some random lurker's (or Google's) server!
And after all that, it still needs to make that 600 mile/1000 km journey back to Stripe, where my message will end up 20 feet from where it started.
So, let's see if we can't eliminate some parts of this process.
Here's the lifecycle of a Google Hangouts chat message. The chicken emoji represents a chat message that I commonly send, and you can see some third party servers up top.
http://cdn.peerjs.com/demo/videochat/demo.html
I've set up a second laptop at my desk. It's a few hundred feet from this conference room, so I hope the connection is fast, or you'll all laugh at me.
Okay, so I'll just open up this tab with the demo here...
Just gotta allow access to my mic/video, and we'll be able to sneak a peek at what people are talking about around my desk.
Hopefully it's nothing too confidential.
Whoo! Usually I do this demo with a person on the other end, to prove it's not prerecorded or something. It didn't work out this time, but trust me, it's not prerecorded! I'm actually directly streaming to and from a computer in the other room, without going through any third party server. It's pretty nice quality on my end, but because you may be seeing a stream through a stream, I'm not sure how nice it'll seem.
// Purple: I want to have a video chat with my friend!
var purpleConnection = new RTCPeerConnection(...);
// I'll create a data channel to relay chat messages with my friend.
var p2bChat = purpleConnection.createDataChannel('CHAT', ...);
// and another one for sending files.
var p2bFiles = purpleConnection.createDataChannel('FILES', ...);
// I'll create a video/audio stream so we can videochat.
navigator.getUserMedia({audio: true, video: true}, function(stream) {
  // And I'll add that stream to my connection.
  purpleConnection.addStream(stream);
  ...
}, errorHandler);
What I'm about to explain is a very simplified version of the raw WebRTC API, so if something seems magical right now to you, I'll probably come back and fill in the blanks later.
The pseudo-WebRTC code in the next two slides can all run on the client side, in your browser, which is a WebRTC client.
I would first create an RTCPeerConnection object with some configuration object.
At this point I should decide if I want to have a video call or just a text chat over
DataChannel or even a filesharing session over DataChannel.
If I want to create a data channel, at this point I would call createDataChannel
on my peer connection object.
If I want to add a mediastream, I can use `getUserMedia` to access my webcam and
microphone.
You'll notice that `getUserMedia` sticks out a bit. We'll come back to it later, because it's a pretty cool API on
its own.
Theoretically, there's no reason I can't both create a data channel and add a
media stream, because another cool feature of PeerConnections is that they can
multiplex many mediastreams/datachannels.
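For a slightly more concrete (but still simplified) version of that setup, here's a minimal sketch. The helper names `openChannels` and `shareMedia` are made up for this example, and it assumes the newer `addTrack` call, which has since replaced the `addStream` call on the slide.

```javascript
// Hypothetical helpers; `pc` is an RTCPeerConnection (or anything shaped like one).

// Open one data channel for chat and one for files on the same connection.
function openChannels(pc) {
  return {
    chat: pc.createDataChannel('CHAT'),
    files: pc.createDataChannel('FILES'),
  };
}

// Add every audio/video track of a MediaStream to that same connection.
function shareMedia(pc, stream) {
  stream.getTracks().forEach(function (track) {
    pc.addTrack(track, stream);
  });
}

// In a browser, usage might look like this:
//   const pc = new RTCPeerConnection();
//   const channels = openChannels(pc);
//   navigator.mediaDevices.getUserMedia({ audio: true, video: true })
//     .then(function (stream) { shareMedia(pc, stream); });
```

Both helpers take the connection as an argument, so the same peer connection ends up carrying the chat channel, the file channel, and the media tracks.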
// I want to talk to Blue, so I'll make Blue an offer to chat.
purpleConnection.createOffer(function(offer) {
  // I'll save it locally...
  purpleConnection.setLocalDescription(offer, function() {
    // ...and pass it on to Blue.
    magicallySend(offer, blueClient);
    // we'll talk about our magical sending apparatus in a bit.
  }, errorHandler);
}, errorHandler);
So now that I, Purple, have decided who I want to talk to and how I'm going to talk to them, I'll create something called an
"offer".
The format of the offer is called "SDP", or Session Description
Protocol. SDP doesn't actually deliver any media, but rather serves as a way of
letting your peer know of your configuration--like the media format or type you want to
share, or the transport protocol you're using for your data channel.
I record this offer on my peer connection locally using setLocalDescription,
then magically send it to you.
(The offer is an RTCSessionDescription object.)
This means that if you ever add a new stream or change an existing stream on
your peer connection, we'll have to go through this negotiation process again.
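To make the SDP part a little less abstract, here's a small sketch that pulls the media sections out of an offer. The `sampleOffer` text is a trimmed, illustrative SDP written for this example (a real one is much longer), and `listMediaSections` is a hypothetical helper, not part of any WebRTC API.

```javascript
// Illustrative SDP only: each "m=" line announces one kind of media or data.
const sampleOffer = [
  'v=0',
  'o=- 4611731400430051336 2 IN IP4 127.0.0.1',
  's=-',
  't=0 0',
  'm=audio 9 UDP/TLS/RTP/SAVPF 111',
  'a=rtpmap:111 opus/48000/2',
  'm=video 9 UDP/TLS/RTP/SAVPF 96',
  'a=rtpmap:96 VP8/90000',
  'm=application 9 UDP/DTLS/SCTP webrtc-datachannel',
].join('\r\n');

// Return the kind and transport protocol named on every "m=" line.
function listMediaSections(sdp) {
  return sdp
    .split(/\r?\n/)
    .filter(function (line) { return line.startsWith('m='); })
    .map(function (line) {
      const [kind, , protocol] = line.slice(2).split(' ');
      return { kind: kind, protocol: protocol };
    });
}

console.log(listMediaSections(sampleOffer));
```

In this sample, the audio and video sections travel over RTP and the data channel shows up as an `application` section over SCTP, which is the kind of configuration the offer is there to communicate.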
// Blue: I've magically received an offer, and I want to chat.
var blueConnection = new RTCPeerConnection(...);
blueConnection.setRemoteDescription(purpleOffer, function() {
  // I'll share my own media, but I only want to share video.
  navigator.getUserMedia({video: true}, function(stream) {
    blueConnection.addStream(stream);
    blueConnection.createAnswer(function(answer) {
      blueConnection.setLocalDescription(answer, function() {
        magicallySend(answer, purpleClient);
      }, errorHandler);
    }, errorHandler);
  }, errorHandler);
}, errorHandler);
Blue magically receives purple's offer, and she decides to answer.
Blue is pretty shy, so she's only going to share video, not audio.
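And at various points during this process, events for streams and data channels
would've fired. Once Purple magically receives Blue's answer and records it with setRemoteDescription, the connection can be established and they become usable.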
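The slides stop at Blue sending her answer, so here's a rough sketch of what each side might do when a description magically arrives. It uses the newer promise-based forms of these calls rather than the callback style on the slides, and `handleSignal` and its `send` argument are hypothetical names standing in for the magical sending apparatus.

```javascript
// Hypothetical handler; `pc` is an RTCPeerConnection, `message` is an
// offer or answer ({ type, sdp }), and `send` delivers a reply to the peer.
function handleSignal(pc, message, send) {
  if (message.type === 'offer') {
    // Blue's side: record the offer, answer it, and send the answer back.
    return pc.setRemoteDescription(message)
      .then(function () { return pc.createAnswer(); })
      .then(function (answer) { return pc.setLocalDescription(answer); })
      .then(function () { return send(pc.localDescription); });
  }
  if (message.type === 'answer') {
    // Purple's side: record Blue's answer to complete the negotiation.
    return pc.setRemoteDescription(message);
  }
  return Promise.reject(new Error('Unknown signal type: ' + message.type));
}
```

Purple would call this with Blue's answer, and Blue would call it with Purple's offer. The same function covers both halves of the exchange.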
Whoo, that was a lot.
So let's take a bit of a mental break, because we all feel our eyes glazing over
when we see code on slides.
Remember getUserMedia, the API I said was a little different from the others?
Recall that it is a browser API that is able to take control of your webcam and
mic.
You can use `getUserMedia` without even knowing that WebRTC exists.
It was the first part of the WebRTC spec that was available in any browser, and this was as early as Chrome 21/33 and Firefox 17/27, which was over 2 years ago.
Although it's not directly tied to peer-to-peer connections, it's considered a gateway to WebRTC because it allowed the PeerConnection API to stream media between two peers very early in its development. Data channel support did not come until much later.
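Here's a minimal sketch of using `getUserMedia` entirely on its own, with no peer connection in sight. `getMedia` is a hypothetical wrapper written for this example: it prefers the promise-based `navigator.mediaDevices.getUserMedia` and falls back to the older callback-style, vendor-prefixed functions that browsers of that era shipped.

```javascript
// Hypothetical wrapper; `nav` is the browser's `navigator` object.
function getMedia(nav, constraints) {
  // Newer browsers: promise-based API.
  if (nav.mediaDevices && nav.mediaDevices.getUserMedia) {
    return nav.mediaDevices.getUserMedia(constraints);
  }
  // Older browsers: callback-based, often vendor-prefixed.
  const legacy = nav.getUserMedia || nav.webkitGetUserMedia || nav.mozGetUserMedia;
  if (!legacy) {
    return Promise.reject(new Error('getUserMedia is not supported here'));
  }
  return new Promise(function (resolve, reject) {
    legacy.call(nav, constraints, resolve, reject);
  });
}

// In a browser, showing your own webcam might look like this:
//   getMedia(navigator, { video: true }).then(function (stream) {
//     document.querySelector('video').srcObject = stream;
//   });
```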
iswebrtcreadyyet.com
(&yet)
In this really cool browser support table from iswebrtcreadyyet.com, you
can see that there's a lot of red and yellow, and these are the parts you really end up pulling your hair out over.
They're parts of the API that are not fully up to spec or not interoperable.
Compared to the browser scorecard from almost a year ago when I first gave this talk, there's not an amazing amount of growth in percentage of the green portions. A lot of the work on WebRTC in recent months has been in nailing down the API and in supporting a broader range of data channel and media stream options.
Similarly, if you try to find info about the WebRTC APIs on MDN, you might first get a page that tells you that it's outdated...
and then you'll go to the page they claim to be migrating to, and find this.
Last year, there was an issue filed on PeerJS, where mobile devices on Chrome
31/32 could not communicate with desktop browsers of the same version.
How strange. So I got my hands on an Android device and checked the Chrome flags
settings.
In Chrome 31, SCTP transport was indeed behind a flag, so this was somewhat expected. (SCTP was the type of transport we wanted to see, because that was what was in desktop browsers.)
But even with the flag enabled, I couldn't get a WebRTC connection to succeed.
The story here is that I couldn't quickly figure out how to take a screenshot on an
Android phone, so I took a picture with my iPhone.
So I searched the equivalent of Stack Overflow for WebRTC: the discuss-webrtc Google group.
Sorry, the less searchable equivalent of Stack Overflow.
It appears that blue censor bar here knows that it's not supported until 33. I don't know
who blue censor bar is, and
I spent a good 5 minutes getting to this page from the last one because Google
Groups now has Google Plus tipsies hanging around,
but blue censor bar seems legit. It's not working in
Android. Which means Android is lying to me.
Anyone less jaded than I am about WebRTC might hesitate to believe that.
But after months of struggling with standards noncompliance, trying to implement
WebRTC browser interoperability with two browsers that did not have a complete
implementation, "Firefoxisms",
versions of Firefox only supporting servers specified by IP address, and random breaking
changes in both browsers, I was more than willing to believe blue censor bar.
Issue #138
The bug today? It's closed, but I never fixed it or anything. I wrestled with a few hacks to detect whether
SCTP was really enabled, but nothing felt satisfying. Eventually I decided that it wasn't worth the time. Chrome for Android would just roll out its fixes soon anyway. And indeed, now we're on like version 40 of Chrome, so it's no longer an issue!
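For the curious, here's one hypothetical shape such a check could take. This is not necessarily anything PeerJS actually shipped, and the talk doesn't establish whether it would have caught the Chrome 31 Android case. It assumes the offer advertises its data channel on an `m=application` line whose protocol field mentions SCTP. The two sample lines are illustrative, and `offerUsesSctp` is a made-up name.

```javascript
// Hypothetical check: does this offer's SDP advertise an SCTP data channel?
function offerUsesSctp(sdp) {
  return sdp.split(/\r?\n/).some(function (line) {
    return /^m=application\s/.test(line) && /SCTP/.test(line);
  });
}

// Illustrative SDP fragments, not captured from a real browser.
const sctpStyle = 'v=0\r\nm=application 1 DTLS/SCTP 5000';
const rtpStyle = 'v=0\r\nm=application 1 RTP/SAVPF 101';

console.log(offerUsesSctp(sctpStyle)); // true
console.log(offerUsesSctp(rtpStyle));  // false
```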
// A simple config for an RTCPeerConnection...
var pcDotDotDot2 = {'iceServers': [
  { url: 'stun:stun.l.google.com:19302' },
  { url: 'turn:homeo@turn.bistri.com:80', credential: 'homeo' }
]};
// What it looks like for some older versions of some browsers...
var pcDotDotDot1 = {'iceServers': [
  { url: 'stun:23.21.150.121:19302' }
]};
// For an unreliable, UDP-like data channel on some browsers...
var dcDotDotDot1 = {
  maxRetransmits: 0,
  ordered: false
};
// For an unreliable, UDP-like data channel on older versions of some browsers...
var dcDotDotDot2 = {
  reliable: false
};
Even more servers, right?
You'll notice that the two servers passed in are a STUN and a TURN server,
respectively. Let's talk a bit about what those are.
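As an aside on those old-versus-new config shapes, here's a minimal sketch of how a wrapper might paper over them. Both helper names are made up for this example. It assumes that newer browsers expect the ICE server key to be `urls` rather than `url`, and that you already know (somehow) whether the browser supports SCTP data channels.

```javascript
// Hypothetical helper: copy a legacy `url` key over to the newer `urls` key.
function normalizeIceServers(servers) {
  return servers.map(function (server) {
    const { url, ...rest } = server;
    if (url !== undefined && rest.urls === undefined) {
      return { ...rest, urls: url };
    }
    return server;
  });
}

// Hypothetical helper: pick the options for an unreliable data channel.
function unreliableChannelOptions(supportsSctp) {
  return supportsSctp
    ? { ordered: false, maxRetransmits: 0 } // newer, SCTP-based channels
    : { reliable: false };                  // older channels
}

console.log(normalizeIceServers([{ url: 'stun:stun.l.google.com:19302' }]));
console.log(unreliableChannelOptions(true));
```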
But it doesn't have to be that scary.
(WebRTC can be easy, remember?)
So at this point you're probably like, "But the first slide of the talk says that WebRTC can be easy! And all you're doing is scaring me!"
Well, despite all
the scary things I've just shown you, the good news is that you can play around with WebRTC without understanding any of it. There are a few libraries out there that'll make
it super easy to prototype quickly with WebRTC. WebRTC is a native browser API like any other, and native browser APIs are often hard to digest without some nice wrappers.
Here are a few that have been around.
Definitely look some of these up. As I mentioned earlier, I help maintain PeerJS.
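To give a taste of how little code that can be, here's a rough sketch of a two-way text chat using PeerJS. `startChat` is a hypothetical helper, the ids and greeting are made up, and the `Peer` constructor is passed in as an argument so the sketch doesn't assume how you load the library.

```javascript
// Hypothetical helper; `Peer` is the PeerJS constructor.
function startChat(Peer, myId, friendId, greeting, onMessage) {
  const peer = new Peer(myId);

  // Someone connected to me: listen for their messages.
  peer.on('connection', function (conn) {
    conn.on('data', onMessage);
  });

  // Once I'm registered with the server, connect to my friend.
  peer.on('open', function () {
    const conn = peer.connect(friendId);
    conn.on('open', function () { conn.send(greeting); });
    conn.on('data', onMessage);
  });

  return peer;
}

// In a browser with PeerJS loaded, usage might look like this:
//   startChat(Peer, 'purple', 'blue', 'hello!', function (msg) {
//     console.log('got', msg);
//   });
```

All of the offer, answer, and config wrangling from earlier happens inside the library, which is the whole appeal.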
And as a maintainer, one thing I find that makes open source WebRTC libraries a bit different from other JS libraries is
that they require a bit of background knowledge about the WebRTC APIs, which
can understandably seem scary. This means that great developers, like yourselves, don't tend to contribute as
much.
But now that you've sat through this talk, I'd like to encourage you to try your hand at
contributing to some of these libraries! It continues to be a really exciting time for WebRTC, and the more folks we have using and contributing to these libraries, the better they can become for everyone.