ILYA: Yeah, this time zone thing is annoying.
JAMES: I know, right? Why do we have that round earth thing? I don’t know.
[Hosting and bandwidth provided by the Blue Box Group. Check them out at BlueBox.net.]
[This podcast is sponsored by New Relic. To track and optimize your application performance, go to RubyRogues.com/NewRelic.]
[This episode is sponsored by SendGrid, the leader in transactional email and email deliverability. SendGrid helps eliminate the cost and complexity of owning and maintaining your own email infrastructure by handling ISP monitoring, DKIM, SPF, feedback loops, whitelabeling, link customization and more. If you’d rather focus on your business than on scaling your email infrastructure, then visit www.SendGrid.com.]
[This episode is sponsored by Code Climate. Code Climate automated code reviews ensure that your projects stay on track. Fix and find quality and security issues in your Ruby code sooner. Try it free at RubyRogues.com/CodeClimate.]
CHUCK: Hey everybody and welcome to episode 135 of the Ruby Rogues podcast. This week on our panel, we have Josh Susser.
JOSH: Hey, good morning everyone.
CHUCK: Avdi Grimm.
AVDI: Hello from Pennsylvania.
CHUCK: James Edward Gray.
JAMES: Good morning everyone.
CHUCK: I’m Charles Max Wood from DevChat.TV. And I just want to let you know I’m going to do Rails Ramp Up again. Sign up by the beginning of the year. I’m doing a 30% discount and I will probably never do that again. So, if you want in, get it now. We also have a special guest and that’s Ilya Grigorik.
ILYA: Hey everyone. I’m glad to be here.
JOSH: Hey, welcome to the show.
CHUCK: I asked you before the show and I still probably slaughtered your name.
ILYA: No, that was good. That was right.
CHUCK: Yehey! [Chuckles] So, since you haven’t been on the show before, do you want to introduce yourself?
ILYA: Sure. So nowadays, I guess I get to play the role of internet plumber in Google. I guess that’s the unofficial title.
JAMES: Internet plumber, I love it.
JOSH: Do you have a really cool tool belt to wear?
ILYA: Yeah. I wear one all the time. Maybe the more official title is Web Performance Engineer, Developer Advocate, some mix of those two, which is to state that primarily I work on making the internet fast, which includes things like better and faster protocols, browser optimizations. So, things like how to make Google Chrome faster and then also educating developers on best practices around how do we build fast sites, fast apps, and all the other stuff in between. So, that’s kind of the gist of it.
JAMES: We have a lot of listeners that may choose to email you when the internet is slow. I’m just saying.
ILYA: [Chuckles] Uh-oh.
CHUCK: Yeah, but then he has to go fix the internet commode. [Chuckles]
ILYA: That’s awesome.
JAMES: That’s really cool, actually. I’ve seen lots of posts of things you’re working on and stuff and making a lot of effort into making sites perform and stuff. It’s really cool stuff.
ILYA: Yeah, it’s definitely a fun topic and an area that never exhausts itself. There’s always something else to fix. There are always more things to do.
JAMES: Right, that’s for sure. So, we had you on because we keep hearing about this HTTP 2.0 change. Can you tell us what started all this?
ILYA: Sure. So, maybe before we get to HTTP 2.0, there’s actually a little bit of history that may help. Let’s rewind history, maybe actually 15 years.
CHUCK: Once upon a time… [Chuckles]
ILYA: Yeah, once upon a time.
JAMES: That’s right. Tell us a bedtime story.
ILYA: So circa 1993, we have the first HTTP 0.9. This is, right now, the minimal viable product. This is all the rage. Well, Tim Berners-Lee was doing that in spades back in the day. If you look at the actual protocol for HTTP, it was literally two words. It was GET followed by the resource name. You hit enter and then you got your resource. And that was the beginning. That was, “Hey, I think we should try this. It may work. It may be interesting.” And it turns out, it did work and it did become interesting. And then we grafted a lot of things onto that since, so kind of from those two words. And by the way, this is a cool hack, an interesting thing to try. I think Apache and Nginx both still support HTTP 0.9. So, if you actually just open up a telnet session to your server and just type in GET / whatever the name of your page, nothing else, just hit enter, you should actually get a page back. So, they still speak that protocol.
JOSH: Wow! That sounds a lot like Gopher.
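[Illustration, not part of the recording: a minimal Python sketch of the two-word request Ilya describes. The function names `build_http09_request` and `fetch_http09` are made up for this example. Whether a particular server still answers a bare HTTP 0.9 request is an assumption; many current server configurations may reject it.]

```python
import socket


def build_http09_request(path: str) -> bytes:
    """Return an HTTP 0.9 style request: the word GET, a space, the path, a line ending."""
    # No version string and no headers: that is what sets 0.9 apart from later versions.
    return f"GET {path}\r\n".encode("ascii")


def fetch_http09(host: str, path: str = "/", port: int = 80, timeout: float = 5.0) -> bytes:
    """Send the bare request and read until the server closes the connection."""
    chunks = []
    with socket.create_connection((host, port), timeout=timeout) as conn:
        conn.sendall(build_http09_request(path))
        while True:
            data = conn.recv(4096)
            if not data:  # an HTTP 0.9 response simply ends when the server hangs up
                break
            chunks.append(data)
    return b"".join(chunks)


if __name__ == "__main__":
    # Only builds the request; call fetch_http09("your-server.example") to try it live.
    print(build_http09_request("/index.html"))
```

[This is the same thing as typing the line into a telnet session by hand; the host name in the comment is a placeholder.]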
ILYA: [Chuckles] Pretty much, yeah.
JOSH: Do people remember Gopher, still?
JAMES: I remember it, but I don’t think I ever knew the protocol.
CHUCK: I’m not sure what it is.
JOSH: Oh, Gopher is the text-based predecessor to the web. And for a couple of years before HTTP and what we think of as the web developed, there were a bunch of sites that were running this Gopher protocol that basically let you put a hierarchical file system online and have it be accessible. And there was no visual component to it, except for just hierarchical text.
JAMES: I think it was used largely by universities for academic storage, right?
JOSH: Yes. Yeah. But there were a lot of hobbyists who put up their content using Gopher, too. Anyway, enough about Gopher. But it was just a very simple protocol and you interacted with it over a text terminal. So, you could telnet in and do text or you could use one of the apps that bundled it all up and let you browse like you’re in the finder.
AVDI: I used to go to the public library and get on a terminal and access Gopher.
CHUCK: I wonder how much ASCII art went across that.
JOSH: A lot.
JOSH: It’s like old BBSs. Okay. So, back to HTTP 2.0.
ILYA: Yeah. So, that’s actually very similar to 0.9, HTTP 0.9 because there’s literally just hypertext, hence the name Hypertext Transfer Protocol. There are no images. There’s no nothing. So, I guess the big innovation there was that we added this hyperlink context and you can just navigate across different objects. So, fast-forward a little bit, let’s say 1995. We realized that, “Hey, this is actually pretty cool. Let’s add a bunch of other things,” like, we’d actually like to have images, maybe. We actually had some browsers come out around that time. And we started adding new things. Images, after that came style sheets and other types of resources. So before you know it, we kind of added a lot more into the HTTP protocol itself so that you could do things like, “Well, here’s the date when I generated this resource. Here’s the cache key for it or when it should expire. Here’s the content type that I’m serving,” because now we’re serving multiple file types over that same connection, and just on and on and on. We just kept adding a lot of these things. And one thing that a lot of people don’t realize is that HTTP 1.0 is not actually an official standard. It was basically just an initiative for IETF to say, “Hey, let’s look around at all the crazy things that people are doing, pick the most popular ones and just document them,” because that’s basically all that was.
ILYA: Different servers coming around, there are different implementations of browsers, and they’re all just experimenting on completely wild stuff. So, they just picked the stuff that kind of stuck around and just documented it. And that was around, I think 1997, that the 1.0 standard, if you will, came out. And it’s just documenting what’s out there. And then after that, there was another kind of two-year effort which took that 1.0 and started to add more language around it. So, clarifying things like how does HTTP caching work and all the rest. And that was published in, I think, late 1999 or somewhere around that. So basically, since then, the protocol hasn’t really changed. But as we all know, the web certainly has. So, imagine or think back to the sites that we saw back in 1999. This is still the Geocities era with animated gifs everywhere, although that seems to be making a comeback. [Chuckles]
JAMES: Blink tags.
ILYA: Yeah, yeah, and all that good stuff. And of course now, it’s just a very different web. We’re building not just pages, we’re building applications. And then we have email and docs and all the crazy stuff, all living in the browsers. So, the transport really hasn’t changed, but the things and how we build them has changed significantly. And basically, we’re realizing that, “Hey we need to, like this is mission critical infrastructure now.” Performance matters, both financially, in that we can show that faster load times lead to better revenue, conversions, and all the rest, and also just for experiencing the web in a better way where it shouldn’t take 10 seconds to load a page on your mobile phone. So, what can we do to fix that? So, HTTP 2.0 is an initiative around that to address some of those core limitations within the HTTP protocol.
JAMES: So, you mentioned performance being one of the major concerns. What are the other major concerns you’re trying to get around?
ILYA: I guess performance is actually the primary one. One of the interesting things that… Before we get to HTTP 2.0, there’s an interim step in there. Around 2008 or so, at Google, we ran a couple of experiments where basically we just set up a lab environment and varied two things. We picked, I think a hundred sites, a hundred popular sites, and said, “Well, let’s try and figure out where the bottlenecks are in terms of the actual load times of the pages.” And we varied two factors. One was latency and the second one was bandwidth. So, you just fix latency at whatever, 100 milliseconds and then you start with 1 Megabit per second and then you double that to 2 and just see how that affects things. And basically what happens is, when you look at the graph when you keep the latency fixed but vary bandwidth, is that when you go from 1 to 2 megabits, you almost get a double improvement in performance. So, you halve the loading time, which is great. That’s exactly what you want to see. You go from 2 to 3, you kind of get a little bit of diminishing returns. It’s 30%. And then unfortunately, it gets into that diminishing returns curve very, very quickly. So by the time that you’re at 5 megabits, basically you’re looking at single percentage points in terms of the actual load time improvements. So the takeaway there is a lot of our ISPs love to sell us bandwidth. It’s like, “Here’s 40 megs and 20 megs,” or whatever, 100 or a gigabit even. But in reality, at least for loading webpages, it wouldn’t actually help you at all, or very little I should say, for speeding up browsing the web. It will certainly help downloading large media streams like you’re streaming a movie or something else. But for downloading pages, bandwidth is no longer an issue for most people. Like an average internet connection in the US is over 5 megabits now. So, upgrading to a data plan or a provider that gives you more bandwidth is just not going to give you much.
But latency on the other hand is much more interesting because basically you look at that graph and you see that there’s a direct correlation. That is, it’s just a linear relationship: the lower the latency, the faster we load the page. And unfortunately with latency, it’s a tricky problem because we have this speed of light thing, which is rather annoying.
ILYA: And we haven’t figured out how to fix it yet. A couple of years ago, there was…
JAMES: Are you suggesting that Google’s working on that?
ILYA: I have no idea.
JOSH: Quantum tunneling, quantum tunneling, spooky networking at a distance. [Chuckles]
CHUCK: You totally should have said, “I can’t talk about that.”
ILYA: I can’t talk about that, yeah. Well, we did have some news from, what is it, CERN a couple of years back where they reported that they found something that was travelling faster than speed of light for some experiment and then found that it was a faulty cable.
JAMES: Yeah, it was a faulty cable.
JOSH: Yeah, that was the neutrino experiment between Switzerland and Italy, yeah.
JAMES: That’s it, yeah.
ILYA: That’s right. Yeah, so hey, if somebody solves that, that’s great. Because then my job is done.
ILYA: Basically, the insight there is a lot of our performance problems on the web are due to latency today and if we can fix that, then that’s awesome. I can move on to the next great project.
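[Illustration, not part of the recording: a toy Python model of the latency versus bandwidth trade-off Ilya describes. It is not a reproduction of the Google experiment, and it will not match the percentages quoted above. The formula, the function name `page_load_seconds`, the page size, and the number of round trips are all assumptions chosen only to show the shape of the two curves.]

```python
def page_load_seconds(total_bytes: int, round_trips: int, rtt_ms: float, bandwidth_mbps: float) -> float:
    """Toy model: time spent waiting on round trips plus time spent moving bytes."""
    latency_cost = round_trips * (rtt_ms / 1000.0)
    transfer_cost = (total_bytes * 8) / (bandwidth_mbps * 1_000_000)
    return latency_cost + transfer_cost


if __name__ == "__main__":
    PAGE_BYTES = 1_000_000  # assumed page weight
    ROUND_TRIPS = 30        # assumed number of sequential round trips

    # Fix latency, vary bandwidth: each doubling buys less than the one before.
    for mbps in (1, 2, 5, 10):
        seconds = page_load_seconds(PAGE_BYTES, ROUND_TRIPS, 100, mbps)
        print(f"{mbps:>2} Mbps at 100 ms RTT: {seconds:.2f} s")

    # Fix bandwidth, vary latency: the load time falls in a straight line.
    for rtt in (200, 100, 50):
        seconds = page_load_seconds(PAGE_BYTES, ROUND_TRIPS, rtt, 5)
        print(f"5 Mbps at {rtt:>3} ms RTT: {seconds:.2f} s")
```

[In this model the bandwidth term shrinks toward zero as the pipe gets fatter, while the round-trip term never goes away, which is the point being made about latency.]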
CHUCK: So, how does this affect me using BitTo — I mean, helping people back up their files?
ILYA: [Chuckles] Well, it doesn’t. If you have sufficient bandwidth, like a couple of megabits — so let’s say you have, whatever, a 20Mb connection and you’re using some portion of that. If you still have some bandwidth left, a couple of megabits, then you’re probably fine. It may affect you in other ways like there are a lot of problems with things like buffer bloat where if you’re doing BitTorrent or any other large media streaming, playing a video game even, some local routers actually do a pretty poor job of scheduling or buffering too much data, which introduces extra latency and extra delay. So, there’s a lot of work actually. It’s a whole separate topic in the space of buffer bloat and how do we address that.
CHUCK: So yeah, so going back to webpages then, I don’t completely follow how HTTP 2.0 helps, say our clients or our employers or even ourselves make our applications appear to load more quickly and things like that.
JOSH: Can we get a definition for binary framing?
ILYA: Sure. So, as opposed to just a text protocol, I guess let’s see, what’s a good definition of binary framing?
AVDI: So first of all, we’re talking about going beyond 8-bit ASCII here.
JAMES: That’s the binary part. I would like to know the framing part.
ILYA: So, I guess the idea with binary protocol is we can instead of using plain text, the question is in a data stream, how do you find the right delimiters? What’s the end of stream? What’s the end of message? In HTTP 1.0, that’s basically newlines. You send a header, you send a newline, you send a header, you send a newline, and then you’d send two newlines and then you send the body. And that’s how we know that the message is done. In binary framing, we’re basically defining a new set of delimiters which are just a specific frame format. Like every frame will start with this specific sequence of bytes. And then after that sequence of bytes, it will say ‘I’m a data frame’ as opposed to a headers frame. And then I’ll say ‘I’m of this length’. So then, you know exactly how much data you should read from a socket to figure out if you care about that data or not. So, it’s just a more efficient way to encode the data on the wire. And because we partitioned this data into these small chunks, it allows us to interleave that data as well. So, we can now send a whole lot of requests and get data back that is completely interleaved and mixed together. Don’t know if that helped.
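[Illustration, not part of the recording: a minimal Python sketch of length-prefixed binary framing and interleaving. The header layout used here (2-byte length, 1-byte type, 4-byte stream id) is hypothetical and is not the HTTP 2.0 or SPDY wire format. The names `encode_frame`, `decode_frames`, and `reassemble` are made up for this example.]

```python
import struct
from collections import defaultdict

# Hypothetical header layout, for illustration only:
# 2-byte payload length, 1-byte frame type, 4-byte stream id, all big-endian.
HEADER = struct.Struct(">HBI")
TYPE_HEADERS = 0x1
TYPE_DATA = 0x2


def encode_frame(frame_type: int, stream_id: int, payload: bytes) -> bytes:
    """Prefix the payload with a fixed-size header so a reader knows how much to read."""
    return HEADER.pack(len(payload), frame_type, stream_id) + payload


def decode_frames(buffer: bytes):
    """Yield (frame_type, stream_id, payload) for each whole frame in the buffer."""
    offset = 0
    while offset < len(buffer):
        length, frame_type, stream_id = HEADER.unpack_from(buffer, offset)
        offset += HEADER.size
        # No scanning for newlines: the length field says exactly where the frame ends.
        yield frame_type, stream_id, buffer[offset:offset + length]
        offset += length


def reassemble(buffer: bytes) -> dict:
    """Group DATA payloads by stream id, undoing the interleaving on the wire."""
    bodies = defaultdict(bytes)
    for frame_type, stream_id, payload in decode_frames(buffer):
        if frame_type == TYPE_DATA:
            bodies[stream_id] += payload
    return dict(bodies)


if __name__ == "__main__":
    # Two responses chopped into frames and interleaved on a single connection.
    wire = b"".join([
        encode_frame(TYPE_DATA, 1, b"<html>"),
        encode_frame(TYPE_DATA, 3, b"body{"),
        encode_frame(TYPE_DATA, 1, b"</html>"),
        encode_frame(TYPE_DATA, 3, b"}"),
    ])
    print(reassemble(wire))
```

[Because every frame carries its own length and stream id, the payload can contain newlines or any other bytes without confusing the reader, and frames from different responses can be mixed freely.]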
JAMES: So yeah, that’s kind of interesting. I think a concern I seem to see from tech people, and I share this, is that we lose a lot of transparency that way. How do you feel about that?
ILYA: To some degree, to the extent that you can’t open a telnet window and just type in GET and then do something like that. But honestly, how many people do that today? And second of all, we’re already using a lot of the same protocols. For example, if you’re using TLS. IP itself is binary framed. And there are all these protocols that run below it that you don’t have visibility into per se if you just dump it to a terminal. You just need better tooling. So, that’s why we have things like Wireshark and tcpdump and other things which just analyze that data stream. And nothing stops us from building kind of a little shim that would, a command-line client that would open up and allow you to type in GET in plain text and it would just translate it to HTTP 2.0 or to a binary frameset used in HTTP 2.0. So that’s not really, I think, a big concern. Another win of actually going to this sort of formatting is while it’s easy for us to understand the HTTP protocol in plain text for humans who grok it, it’s actually harder to parse, surprisingly, frankly. Building an HTTP 1.0 parser is surprisingly hard. [Chuckles] It feels very simple at first when you start it. And then you discover all kinds of interesting and annoying edge cases. So, going to a binary framing protocol is actually much easier because when you write it, it’s just like, “Okay, I saw these two bytes. I know I’m getting a frame. I know the length of the frame so I know exactly how much data I need to read and I know the type of it.” It just makes implementing this a heck of a lot easier. And I say that as someone who’s had experience building both an HTTP 1.0 Ruby client and also working with HTTP 2.0 and building a parser for it. It’s just so much easier once you know binary framing because you basically have this contract for how everything should look on the wire.
AVDI: You know, we’re talking about some of the drawbacks of HTTP 1.0 but I think it’s hard to deny that HTTP 1.0 has been really, really, really successful as a protocol. And that’s the dream of a protocol designer. Do you have any insights into — and before we get into all the things that 2.0 improves, do you have any insights into what were the right decisions that they made? What made HTTP 1.0 such a successful protocol from a design standpoint?
ILYA: Yeah, that’s an interesting question. So, I do think that that simplicity of it at the beginning was actually important. So, it was not over-engineered. It was just like, “Here’s the simplest thing that could work and let’s try it.” And I think that’s fine. That’s exactly, that’s part of it. I don’t think the protocol itself is what made the web. There’s also the fact that the web was actually incredibly useful and it just happens to run over this protocol so there are two things in there. So, find a good use case for it and also…
AVDI: But I mean, we saw so many other things built on it that you wouldn’t have expected to use HTTP but they did anyway, beyond just the web itself. And so, I find that interesting.
JOSH: WebDAV, WebDAV.
AVDI: Yeah, exactly.
ILYA: [Chuckles] Yeah. So, I think it kind of feeds on itself because the more clients you have, the more applications you’ll have. You can talk to [inaudible] over HTTP today, so that’s pretty awesome. And so, there’s definitely a lot to be said for just a simple text-based protocol that allowed a lot of people to experiment with this stuff initially. It’s just very simple to get a demo up and running.
JAMES: But now, you’re saying we have to grow up and use the binary stuff?
ILYA: Yeah. Yeah, exactly. It’s grown to a point where it’s mission-critical, basically infrastructure. Everything runs, well not everything, but significant portions of everything work today on the internet, runs over HTTP. And as developers, we just see a lot of issues in terms of we can’t do a lot of the things that we want with HTTP 1.0 performance-wise and there’s a need to address those.
AVDI: It’s just interesting to me though that HTTP 1.0 arose in a time of many binary framing protocols. The conventional wisdom was, of course, a text-based format is way too verbose. It’s not efficient enough. And people just had oodles of problems with those binary protocols, whether it was UNIX RPC or whether it was CORBA or any of the number of other protocols which have more or less fallen by the wayside at this point.
AVDI: Yeah, NFS is a great example. These text-based protocols including HTTP but also including SMTP and a few others just kind of stomped all over those with, I don’t know, their approachability, et cetera. Again, against the conventional wisdom that binary framing was better.
ILYA: Yeah, maybe I wouldn’t pose it as directly as binary versus text because if you look at other layers of the stack, TLS, so all of our HTTPS traffic, that’s binary framed. IP, that’s binary framed.
AVDI: Right, but I’m talking about the application level. I’m not talking about the transport level so much. Are you saying that what used to be the application level now needs to be pushed down to the transport level?
ILYA: Let’s see. Maybe? Not sure that…
AVDI: That sounds kind of like what you’re saying. Because HTTP used to be considered application level if you look at the network layer cake. And if you wanted to do multiplexing, well that was the domain of TCP or some other thing at the transport level. And so, it sounds like you’re kind of pushing HTTP down into the transport layer, whereas it used to be this text-based application protocol on top of lower binary transport layers.
ILYA: Yeah, I guess the layer cake is kind of confusing at this point, because what’s an application protocol? What’s a transport protocol at this point? I think that’s partially true. Maybe the observations made is we started with something very simple. It proved its worth. We’re finding that we’re pushing more and more data over these protocols. If you actually look, as an interesting point, we use HTTP over the public web but whenever you walk into any large organization and you look at what they use on the inside to communicate between all of your services, most of the time it’s not HTTP primarily because of performance and a few other concerns. Because they have to invent basically their own protocol, whether that’s something like Stubby…
JAMES: I’m not sure I agree with that. We’ve seen large movements like SOA, service-oriented architecture, and things like that. And I think that the kind of default there seems to be HTTP because it’s so well-known. It’s so easy to set up. Nginx is running on everything. Am I wrong in that?
ILYA: Well, so if you look at the large — so, SOA itself as an architecture is independent of HTTP. Nobody said that it has to be HTTP. So for example…
AVDI: Yeah, but practically it is HTTP.
ILYA: I’m not sure. A lot of large organizations…
AVDI: That’s what they said about every protocol. It was like, “Well, this is actually transport-independent,” but practically it was all over one transport.
ILYA: Right. So, let’s look at some examples that at least I’m aware of. Within Google, we have our own protocol called Stubby which is basically we start with an HTTP connection and then we upgrade to this other form of binary framing protocol. Facebook, of course, invented their own, Thrift. Twitter is using their own. Although, I believe, they’re actually migrating to HTTP 2.0, so that’s great. And there’s a lot of other binary or transport protocols.
AVDI: Because you’re not talking about traditional enterprises. You’re talking about service providers that have their own internal plumbing to provide a massively-scaled service.
ILYA: Right. Right, yeah.
ILYA: So internally, you’ll still have your SOA architecture where you have different services and all the rest. But the protocol over which you communicate is just practically oftentimes something other than HTTP just because it introduces a lot of overhead, unnecessary overhead.
AVDI: I think I understand what you’re saying. I’m not sure I would agree with the oftentimes simply because I think you’re referring to kind of a pretty small subset of companies that are, as I said, they’re offering a massively-scaled service.
JAMES: Yeah, I think Avdi’s…
AVDI: Like your average enterprise is not going to use anything like that.
JAMES: Like for example, Google in more recent years has said, “Well, Python doesn’t really scale to our particular level of needs.” But the truth is you have to get to Google’s particular level of needs before Python stops scaling to that level, right? It seems like that’s the top of the top. That’s the problems Google, Twitter, Facebook, that’s a massive amount of content that I do not think the average website application has. That’s, I think, what Avdi’s trying to say.
ILYA: Yeah, fair enough. But I’m not trying to knock on HTTP, right? As we said earlier, it’s an extremely successful protocol. It’s awesome. I love it. I’ve done a lot of work with it in the past. And the idea is that we can just make it better. So, that’s the goal of the project.
AVDI: Yeah. If I pushed back a little, it’s only because I just wonder, are the improvements coming solely from the perspective of a Google or Facebook, or are they also coming from the perspective of the hundreds of thousands of people developing smaller applications and websites?
AVDI: [Chuckles] Yeah, that always struck me as a horrible hack.
AVDI: Sorry, a horrible kluge. It shouldn’t be elevated to the level of hack.
ILYA: Well, it is a hack. I would actually call it a hack. And it’s a terrible one too because the other downside to it is first of all, it’s a pain in the ass to manage. But second, it actually costs a lot in terms of memory on the browser. So, let’s say you have this giant sprite, whatever, 1000 x 1000 or maybe something smaller. You have to decode the entire image which actually occupies a lot of memory. And perhaps you’re just using a little tiny icon from it. So, that’s an issue. And those icons can only be displayed once the entire image is downloaded. So, you add all of these small things up and you quickly realize that it’s just a burden on the developers to manage this. And most of them actually get it wrong.
JAMES: That seems really, really weird to me though. Everything has been moving in that direction and you’re saying our data on that’s just wrong. It’s not faster?
ILYA: Yeah. Part of it is the connectivity profiles are also changing. So when we first started advocating for those sorts of changes back in, whatever it was, 2005, 2007, when this stuff started showing up, the connection speeds were different. We were primarily maybe DSL was state of the art and bandwidth was really an issue there. So, you spend more time just downloading resources. Now that bandwidth is much less of an issue, latency is the problem. And because of that, these “best practices” are changing. And with HTTP 2.0, you actually don’t have to do that at all. And in fact, some of those things will actually hurt your performance.
CHUCK: So, when do we actually get to start seeing HTTP 2.0? When does it start solving some of these problems for us?
ILYA: You can actually play with it today. So, when I was talking about that experiment that Google did back in 2008, that actually prompted another project called SPDY. And the idea there was, “Well, let’s try and experiment with this thing.” Let’s try and build a new protocol, in Chrome at the time, and see if we actually get any performance benefits from it. And it turns out that it did. And two years after the project started or three years, Firefox adopted it. Opera had it installed. Facebook and Twitter enabled it on their sites and a whole lot of other sites as well. And it was kind of becoming this new de facto protocol. And at that point, we took it to the IETF and the IETF started a new initiative around HTTP 2.0 which went through a round of proposals. And basically what happened was this version of SPDY at that time, which was SPDY Version 2, was adopted as a starting point for HTTP 2.0 protocol. And those two things have been evolving in parallel. So, the HTTP 2.0 protocol itself is not yet ready in terms of getting it out in production. But SPDY is the experimental version, if you will, that is running in production. And basically, the way that works is the HTTP working group has these interim meetings every couple of months, every quarter or so. And we just sync up and talk about, “Here are the things that we were thinking about. We tried them. We prototyped them in SPDY. Here’s what we learned. And let’s try the next iteration. Let’s tweak it in this way and let’s see if that helps. Let’s change these framing flags or let’s add this other new feature.” And that kind of coevolution of the two protocols has actually helped quite a bit because the best way to make decisions is based on data. And we can do that because we can just build it into Chrome or Firefox, or even IE now supports SPDY, and try these ideas and then feed them back into HTTP 2.0. 
So long story short, you can get most of the benefits of what HTTP 2.0 will deliver today if you just configure SPDY on your server. There are modules for Apache, Nginx, and most of the other popular backends that will do SPDY. And most browsers support it today. And sometime in 2014, fingers crossed, we’ll actually get the official HTTP 2.0 spec. At which point, we’ll just deprecate SPDY and just rename it to be HTTP 2.0. So, it should be a pretty seamless transition.
ILYA: Yeah. So that’s one of the interesting, perhaps, gotchas. First of all, any application that’s delivered over HTTP 1.0 will work over HTTP 2.0. There’s nothing changing there. The semantics are all the same. It could be the case that certain optimizations that you’ve done for HTTP 1.1 will actually hurt in HTTP 2.0. And when I say hurt, in practice at least from what I’ve seen today, it doesn’t mean that your site is actually going to be slower. It’s just that it won’t be any better than HTTP 1.0. So, you may not see that much of a benefit in terms of performance. But that’s fine because then you can just tweak your implementation and adjust from there. The question is how do you go about doing that and that’s where it gets a little bit more tricky depending on how you’ve currently built your site. How do you potentially deliver two different versions of these assets? And there are some simple strategies for migration there. I think we’re still going to have to work through a lot of the quirks as we move forward on this stuff. But the simplest thing you can do, actually the thing that may hurt you the most is domain sharding. So, we mentioned the six connection limit a few times now. We have this so-called best practice with domain sharding where we said, “Hey, well that’s six connections per origin. What the heck. I’ll just have multiple origins.” [Chuckles] “And then I can open 12 or 18 or 24 connections.” Whatever, right? And that actually hurts performance quite a bit with HTTP 2.0 because HTTP 2.0 kind of tries to go over one connection and get the best performance out of that one connection. So, if there’s only one thing you can do is just undo some of the domain sharding. In practice, most sites overextend themselves with domain sharding. They actually hurt themselves in the process anyway. So, if you just disable that, you’re probably on track to have a well-performing site in both. That’s a very simple strategy.
JAMES: So, it’s interesting because it sounds like our current best practices are kind of backwards from what we’re going to be moving to. So, it does seem like that makes it hard during the transition period, knowing which set of strategies we’re supposed to play to. And it almost sounds like it’s going to have to be both for a little while.
ILYA: Yeah. I may be making it sound a lot more complicated than it actually is. The way I think about it, to be quite honest, is I just can stop doing things I don’t like doing. So things like concatenating files and doing all that other stuff. Like you just take out the Rails pipeline and you’re done. [Chuckles]
JAMES: I’m sure a lot of people will…
JOSH: Well, that sounds like a win.
JAMES: Yeah. I’m sure a lot of the people would like to do that for different reasons, but… [Chuckles]
JOSH: So Ilya, I’d like to talk a little bit more. We’ve skirted around this issue a bit, but talk a little bit more about the web developer experience because you’ve been talking about how some optimizations for 1.1 aren’t great. I’m really curious about if HTTP 2.0 has any kind of direct support for XMLHttpRequests because those things get used differently from the typical load-a-page request.
ILYA: Right. So, what we’re changing in HTTP 2.0 is just how the data flows on the wire. It doesn’t actually change the semantics. So for XHRs, it’s actually no different. It’s just [inaudible]…
ILYA: Request as far as we’re concerned. So, that shouldn’t be affected. That said, there are some interesting other benefits. For example, with XHR let’s say you’re doing something like long polling. Like you’ve opened a request to the server and you’re just waiting to get an update. One of the issues with that with HTTP 1.0 is you now occupy that connection and you can’t use it for anything else. So in fact, you can actually run a self-inflicted DOS attack on yourself. If you open six hanging connections to your origin and then you try to fetch anything, let’s say an image asset, you can’t. It’ll just hang until one of those connections becomes available. The cool thing with HTTP 2.0 is we can actually multiplex as many requests as we want. So, you can open 50 hanging GETs or use server-sent events or even transport web sockets all over the same connection. So, you just don’t have to worry about those limitations, which is a nice win.
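[Editor’s note: the multiplexing idea above can be illustrated with a toy sketch in Python. This is not the real HTTP 2.0 wire format. It only shows several responses being cut into frames tagged with a stream ID, interleaved over one simulated connection, and reassembled on the other side. The names `Frame`, `multiplex`, and `demultiplex`, and the 4-byte chunk size, are illustrative assumptions.]

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, NamedTuple


class Frame(NamedTuple):
    stream_id: int   # which request/response this chunk belongs to
    payload: bytes   # a slice of that stream's data


def multiplex(streams: Dict[int, bytes], chunk_size: int = 4) -> Iterator[Frame]:
    """Interleave chunks from every stream, round-robin, onto one 'connection'."""
    offsets = {sid: 0 for sid in streams}
    pending = list(streams)
    while pending:
        still_pending = []
        for sid in pending:
            start = offsets[sid]
            chunk = streams[sid][start:start + chunk_size]
            if chunk:
                yield Frame(sid, chunk)
            offsets[sid] = start + chunk_size
            # Keep the stream in rotation only while it has data left.
            if offsets[sid] < len(streams[sid]):
                still_pending.append(sid)
        pending = still_pending


def demultiplex(frames: Iterable[Frame]) -> Dict[int, bytes]:
    """Reassemble each stream from the interleaved frames."""
    buffers = defaultdict(bytes)
    for frame in frames:
        buffers[frame.stream_id] += frame.payload
    return dict(buffers)


if __name__ == "__main__":
    # Hypothetical payloads: a long-poll update, an image, and a tiny reply.
    streams = {1: b"long-poll update", 3: b"image bytes", 5: b"ok"}
    frames = list(multiplex(streams))
    print([f.stream_id for f in frames])
    print(demultiplex(frames) == streams)
```

Because every frame carries its stream ID, a slow or hanging stream does not block the others from making progress on the same connection, which is the property described in the discussion.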
JOSH: Yeah, that’s cool. It seems like some sort of host aliasing thing could be really helpful for bridging the transition from 1.1 where you do all the domain sharding stuff. That if you could put a meta tag or header or something that says, “Oh, if you’re getting assets.google.com, oh we’ll just map that to google.com for HTTP 2.0.”
ILYA: Yeah, that’s certainly something that you could experiment with. So actually, this brings up a good point which is how do you even know whether you’re running over HTTP 1.0 or SPDY or HTTP 2.0 for that matter?
JOSH: You look in the header, right? [Chuckles]
JAMES: You can’t. It’s binary. [Chuckles]
ILYA: Well, by the time you get it into your browser or your clients, you’ll have your hash map of header keys and values and that’s fine. But the way the actual upgrade is done today is over, it’s negotiated in the TLS Handshake. So, this is kind of an interesting side point actually. In practice, there are two ports open on the web today, 80 and 443. There’s encrypted HTTP and then there’s port 80. For various reasons, most of the other ports are closed, like you have corporate or private firewalls, you have [inaudible] and all that kind of stuff. So, those are the two ports that you can deploy stuff over without trying to reinvent the web and upgrade all the infrastructure. The second case, the second issue now is, “Okay, great. But 80 and 443 are already being used for other things.” So, what if we try to run this new protocol over port 80. Well, in most cases it turns out to work pretty well. But about 20% of the time, the connection just randomly fails.
ILYA: And usually what happens is you have these very helpful proxies or intermediaries in between which look at the data stream. And for various reasons, like maybe a caching proxy or maybe scanning for malware, they’ll look at this protocol and be like, “Hey, this doesn’t feel like HTTP 1.1 because that’s what I’ve been taught to analyze. So, this is a bad thing. I’m just going to close the connection.” And in practice, this is a huge issue because if 20% of your connections just randomly fail, that’s not a very useful web service. And even antivirus software fails that quite a bit. So, if you run antivirus software on your machine, it’ll just look, inspect all the traffic for port 80 and be like, “That’s it. It’s closed.” And by the way, this is the same problem with web sockets. If you ever try deploying web sockets especially in mobile context, you’ll know that you need to deploy over HTTPS. It’s exactly for this reason, because there are all these intermediaries which get in the way and just mess with your data. So, to deploy HTTP 2.0, practically speaking, you’ll be deploying it over TLS which creates this end-to-end encrypted tunnel and cuts out the intermediaries. And then during the TLS Handshake, you can actually negotiate. We have a new extension, Next Protocol Negotiation for SPDY, where you declare like, “Hey, I support HTTP 2.0,” and then the server acknowledges that and says, “Great. We’ll speak HTTP 2.0.” So, we don’t have to incur that extra roundtrip to negotiate the protocol. So, by the time the connection is established, basically you already know which protocol you want to speak. And then at the server level, you can make decisions as well, like which assets I’m going to serve.
JOSH: So, listening to you describe that, that sounds like that would have a huge impact on intermediate caches.
ILYA: To some degree, to transparent caches specifically.
JOSH: Yes, yeah.
JOSH: Yeah. So, if somebody just wanted to run Varnish, it would be completely different how that worked.
JOSH: How do you cache a fragment of this binary framed chunk?
ILYA: Right. So, for most sites that have a caching tier, today if you’re doing TLS, you’ll have to have something that terminates TLS to begin with.
ILYA: So, that’ll be your load balancer or maybe you’re running HAProxy or Nginx or what have you. It’ll terminate the TLS connection and then what you do after that is your own business. You can talk to any cache that you want. Same thing applies to a CDN. So, if you’re using a CDN to serve content over HTTPS, basically you have to give them your certificate and they will terminate that connection such that they can actually look at what the client’s requesting and then serve out the right asset. So, that really doesn’t change. The question more so is what if there was an intermediate cache, if my carrier deployed a cache that’s just transparently doing stuff with its content? And therein is a problem because some of those caches actually do things that we don’t want to the protocol. So yes, they will be affected.
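[Editor’s note: the protocol negotiation inside the TLS handshake described earlier can be sketched in Python. This sketch uses ALPN, the TLS extension that later superseded Next Protocol Negotiation, because that is what Python’s standard `ssl` module exposes reliably. `select_protocol` is a hypothetical helper that only models the choice. `h2` and `http/1.1` are the registered ALPN protocol IDs.]

```python
import ssl
from typing import List, Optional


def select_protocol(client_offers: List[str], server_prefs: List[str]) -> Optional[str]:
    """Toy model of the negotiation: the server picks its most preferred
    protocol that the client also offered, or None if there is no overlap."""
    for proto in server_prefs:
        if proto in client_offers:
            return proto
    return None


def make_client_context(protocols: List[str]) -> ssl.SSLContext:
    """Build a client TLS context that advertises `protocols` in the handshake."""
    ctx = ssl.create_default_context()
    if ssl.HAS_ALPN:  # ALPN support depends on the linked TLS library
        ctx.set_alpn_protocols(protocols)
    return ctx


if __name__ == "__main__":
    print(select_protocol(["h2", "http/1.1"], ["h2", "http/1.1"]))
    print(select_protocol(["http/1.1"], ["h2", "http/1.1"]))
    make_client_context(["h2", "http/1.1"])
```

On a real connection wrapped with such a context, `SSLSocket.selected_alpn_protocol()` reports what was agreed once the handshake completes, so no extra round trip is spent on choosing the protocol.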
JOSH: Okay. So, what other web developer changes do we need to think about?
ILYA: Let’s see. So, we talked about a couple of things. We talked about multiplexing, the fact that you can send many different requests. Another cool feature with HTTP 2.0 is server push. So, the idea here is you can send multiple responses to one request.
ILYA: And that sounds a little crazy.
JOSH: [Chuckles] It’s like asking a question of our panel and we all have our answers for it. [Chuckles]
JAMES: What about things like streaming and stuff? I think that is kind of a good example of where HTTP 1.0 is super, super bolted on.
ILYA: Yeah. So, we already talked about the XHR case where you run into issues with the long-hanging GET. With HTTP 2.0, that’s basically addressed. But then some of the other issues, I guess, streaming, I’m assuming you’re talking about XHR streaming, for example, that really doesn’t exist. And that’s more a limitation of the XMLHttpRequest API than it is of the HTTP protocol today. And there’s a bunch of work to actually address that, to add new APIs that will enable that. But of course then sending that over HTTP 2.0 will also make it much more efficient as well. Let’s see, what else is there? We have header compression, which is also quite nice in HTTP 2.0. So, it turns out that we send quite a bit of header metadata for every HTTP request. That’s actually one of the issues, one of the reasons why a lot of people go to their own protocol on the backend for when they run their stuff at scale. It turns out that an average request on the web today adds about 800 bytes of headers. That’s including request and response. And then if you add cookies, all bets are off, because it could be well over a kilobyte, or multiple kilobytes, of data. And that’s kind of unfortunate because if you think about it, more and more so we’re building apps which are just sending small requests. Like here’s a little JSON packet of an update that I did, or here’s a new message from whatever, from a chat application. And that’s 30 bytes of JSON data. And then you wrap it with 800 bytes of HTTP header metadata.
ILYA: It’s like, oh okay, something went wrong there, right? So with HTTP 2.0, there’s header compression which is to say we can avoid sending the metadata that we’ve sent before. So basically, we kind of keep state on both ends and if nothing has changed, for example, your user agent. We send that on every single HTTP request. How often does your user agent change between requests? Like, honestly.
JAMES: [Chuckles] You never know when you might need to switch browsers mid-site.
CHUCK: [Chuckles] Yeah.
ILYA: Right. Midway, while you’re requesting all the assets. A simple optimization would be to say, “Hey, I sent you a user-agent header at the beginning of the connection. Just assume that that’s the user agent that I’m using and then if it changes, I’ll let you know. But otherwise, just assume that it’s the same so I don’t have to transfer that data.” So, it significantly reduces the overhead of HTTP as well. So actually, in the best case, if you’re just stuck in a loop and you’re just re-requesting the same resource, the actual overhead of making HTTP request goes down from 800 bytes to, I think 8 or 10. So, it’s a factor of a hundred, which makes it obviously a lot more efficient. We’re kind of in the WebSockets territory of just very little overhead for each and every message.
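[Editor’s note: the “keep state on both ends and only send what changed” idea can be sketched in Python as below. This is a deliberately simplified model, not the actual header compression scheme used by HTTP 2.0. The class names `HeaderEncoder` and `HeaderDecoder` and the sample header values are illustrative assumptions.]

```python
from typing import Dict, List, Tuple

Headers = Dict[str, str]


class HeaderEncoder:
    """Sender side: remembers what was last sent on this connection."""

    def __init__(self) -> None:
        self.last_sent: Headers = {}

    def encode(self, headers: Headers) -> Tuple[Headers, List[str]]:
        # Only headers whose value differs from the previous request go on the wire.
        changed = {k: v for k, v in headers.items() if self.last_sent.get(k) != v}
        # Headers that were present before but are gone now must be signalled too.
        removed = [k for k in self.last_sent if k not in headers]
        self.last_sent = dict(headers)
        return changed, removed


class HeaderDecoder:
    """Receiver side: applies each delta to its own copy of the state."""

    def __init__(self) -> None:
        self.state: Headers = {}

    def decode(self, changed: Headers, removed: List[str]) -> Headers:
        for key in removed:
            self.state.pop(key, None)
        self.state.update(changed)
        return dict(self.state)


if __name__ == "__main__":
    enc, dec = HeaderEncoder(), HeaderDecoder()
    first = {"user-agent": "ExampleBrowser/1.0", "accept": "*/*", ":path": "/a"}
    second = {**first, ":path": "/b"}
    for request in (first, second, second):
        changed, removed = enc.encode(request)
        print(changed, removed, dec.decode(changed, removed) == request)
```

In this model the user-agent is transmitted once and then assumed unchanged, and re-requesting the same resource sends an empty delta, which mirrors the best case described above.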
JOSH: Okay, so how does this interact with browser development? Obviously, you need support for new protocols in browsers. Is the browser development community embracing this? Are they pushing back? What’s going on?
ILYA: Yeah. So, I think we actually have very good progress there. So as of today, Chrome supports SPDY, which is the experimental version implementation of it. We actually have an HTTP 2.0 implementation as well. It’s under a flag. It’s just that practically speaking, there are no servers on the internet today that will speak to you in that protocol. But if you want to test it locally, you can. Firefox also has SPDY and HTTP 2.0 implemented. The IE team…
JAMES: IE’s going to have it next week, right?
ILYA: Well actually, IE 11 supports SPDY.
JAMES: Wow, that’s awesome.
ILYA: Yeah, yeah. So, Martin Thomson who is the editor of the spec is actually at Microsoft.
JAMES: That’s awesome. That’s really great.
ILYA: Yeah. I think we’re well on track. And because the SPDY and HTTP 2.0 protocols are not exactly the same, they’re always a little bit out of tune because one gets ahead and then one falls behind and all the rest, but they’re very similar. So, by the time we finish, in air quotes, “finish” HTTP 2.0 spec, we’ll already have a well-implemented version in most browsers and then just a small [inaudible] to update it. And if all goes well, we should actually, I’m hoping we will see this in production in 2014 which is a pretty aggressive timeline if you think about it.
JOSH: It is.
JAMES: What do you think will be the – I mean, obviously HTTP 1.0 and 1.1 are not going away any time in the near future/ever. How soon do you think we’ll see sites switching over? Probably the bigger sites I think would switch sooner, probably because they have more to gain maybe.
ILYA: Yeah, I think so. So, as we said at the beginning actually, even HTTP 0.9 is still supported by some servers. I’m not sure that they’re inclined to actually use that. But chances are we’ll still have HTTP 1.1 for another decade at least. And the question is, is there enough of a benefit to most sites to make the switch? And I’m hoping the answer is yes. We actually just recently ran some stats on a bunch of Google services which are using SPDY. And we compared it to just regular HTTP 1.0 and we’re seeing anywhere between 20% and 40% reduction in latency of the actual page load times. So those are significant wins, right? That’s something I can take to other Google teams or projects and say, “Hey, you should enable SPDY because it’s going to make your pages faster.” And if those wins are big and good enough, then I think that makes it simple. The other problem, of course, is also just having server support. So how easily do Apache and Nginx and all the other infrastructure that you have allow you to upgrade to that? Because there are certain issues in just implementation that you need to do there. And Nginx actually has a SPDY implementation today. So, if you’re using the latest version, it’s literally just a matter of enabling a couple of config flags. Same thing for Apache. And then if you’re running custom hardware or other things, that’s where you may have to wait a little bit just for your vendor to integrate support. That said, I know that F5 and a lot of other vendors already have products that support SPDY. So, there’s definitely some adoption curve in there. But I think the, at least the current numbers that we have, show that there’s enough of a performance win such that it is a compelling argument to actually take it to the team. And it actually doesn’t require that much. You’re basically saying, “Look, we’re just going to enable this. All of our existing applications are going to work over it just fine.” And then after you enable it, you can start thinking about what can I do to take advantage of some of these new features? How do I make it go even faster?
JAMES: Gotcha. So, in the beginning, turn it on and everything should be good. And then going forward, you re-architect to favor that strategy basically, and you should get bigger wins, is what you’re saying?
ILYA: Yeah. Yeah, exactly. Actually, one of the things I’m really interested in is maybe not web-web developments but coming back to work on SOA architectures and all that kind of stuff. In Ruby land, when you look at the HTTP libraries that are available and that we use today, most of them frankly have a terrible API for exposing things like multiplexing and pipelining and all of these things. And the typical request that we make is like, “Okay, Net::HTTP. Here’s a URL,” and I get my response. And that basically maps to a new connection. And now we need to change all those interfaces and educate developers to like, “Hey, it’s a good thing to reuse your connections. You don’t have these limitations anymore.” And how do we go about updating all of those client libraries or just designing new APIs around it? That, if anything, I see as kind of a more challenging problem.
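[Editor’s note: the connection-reuse point can be sketched as follows. Python’s standard `http.client` stands in here for the Ruby client libraries under discussion, and the throwaway local server exists only so the example is self-contained. `EchoHandler`, `fetch_all`, and `demo` are hypothetical names. Note this shows HTTP 1.1 keep-alive reuse, not HTTP 2.0 multiplexing.]

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer
from typing import List, Tuple


class EchoHandler(BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"  # lets the server keep connections alive
    connections = set()            # distinct client sockets seen by the server

    def do_GET(self):
        EchoHandler.connections.add(self.client_address)
        body = self.path.encode()
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo quiet


def fetch_all(host: str, port: int, paths: List[str]) -> List[str]:
    """Issue every request over ONE connection instead of one connection each."""
    conn = http.client.HTTPConnection(host, port, timeout=5)
    try:
        bodies = []
        for path in paths:
            conn.request("GET", path)
            response = conn.getresponse()
            bodies.append(response.read().decode())  # drain before the next request
        return bodies
    finally:
        conn.close()


def demo() -> Tuple[List[str], int]:
    EchoHandler.connections = set()
    server = ThreadingHTTPServer(("127.0.0.1", 0), EchoHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        bodies = fetch_all("127.0.0.1", server.server_address[1], ["/a", "/b", "/c"])
        return bodies, len(EchoHandler.connections)
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    print(demo())
```

The server-side counter is what makes the reuse visible: three requests arrive, but only one client socket is ever opened. A one-shot “here’s a URL, give me the response” call would typically open a new socket each time.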
JAMES: Yeah, it seems pretty significant. Everything that’s architected that way. It sounds like unfortunately, we need changes at basically every layer in between client and server. And that’s a lot of things to change. It’s like if tomorrow we had a better gasoline but it required changes in all gas stations and all cars, it’s a big problem. [Chuckles]
ILYA: Yeah. The one difference here would be that that same gasoline can still work in the old cars. [Chuckles] So, it’ll still run your old car. But then you can ask questions like what can I do to make it run better? So, it doesn’t mean that you need to get an entirely new car. This stuff, we’re trying to make the upgrade process as simple as possible. How do we negotiate which protocol you’re going to use? And all the same applications still run over it and there are going to be no changes there.
JAMES: That’s a good point.
CHUCK: Alright. Well, I think we’ve kind of hit our time limit. I’m sure there’s a lot more to talk about. If people want to know more about HTTP 2.0 and keep track of what’s going on with SPDY and with the protocol, what are the best ways to do that? Is there a mailing list or blog articles or what?
ILYA: Yeah, so there is. If you’re interested in those kinds of stuff, there is the IETF HTTP working group. It’s just if you search for http-wg, so that’s the working group, you can look at the mail archives. You can join the mailing list and there are lots of ongoing discussions about all of this stuff, if you really want to get into it. If you’re looking for maybe a deeper dive to understand what this stuff is all about, this is a shameless plug. I actually have a book out with O’Reilly called ‘High Performance Browser Networking’ and it’s actually available online and free. And I have an entire chapter on HTTP 2.0. So, if you go to hpbn.co, you can actually just pull it up there and read more about it.
CHUCK: Alright, cool.
JOSH: So, I have one last question.
JOSH: It’s not about HTTP 2.0 per se, but I think it’s related. It’s about ‘The Extensible Web Manifesto’. So, I assume you’re familiar with that?
ILYA: Yup, yup.
ILYA: Yeah, absolutely. So, I can’t speak on behalf of all the browser developers.
ILYA: But I work quite closely with Chrome and I know that there’s a lot of interest in that. And you’re seeing that in a lot of the new standards that are being developed, things like Web Components and other things which deserve their own show. But the way I think about it is there is a component that you’ve described but it’s also to me about exposing low-level primitives that allow you to build your own higher-level abstractions as opposed to just giving you the end API, right?
ILYA: Instead of giving you, “Here’s,” I don’t know, “An awesome camera filter effect. And we give you five,” instead of just saying, “Well, here’s access to the locked camera feed. You have CSS. You have WebGL. Go nuts.” And if we find later that everybody’s using the same effect, great. We’ll just take that and provide it as a native thing such that you don’t have to do that. Actually, a great example of things like that is a lot of the changes that even jQuery brought about. Working with DOM was a pain in the butt. JQuery showed us that it doesn’t have to be, at least less so. And we’ve made it easier by just taking parts of it and putting it directly into the browser. So, I think that’s definitely the direction that we want to head. And I’m personally pushing for a whole bunch of things which I think we need that are like that in a browser.
JOSH: Great, cool. Okay, so that’s it on that.
CHUCK: Alright. Well then, let’s go ahead and do the picks. Josh, since you’re just already talking, do you want to start us off?
JOSH: [Chuckles] Sure, great. So hey, it’s almost a new year. And everybody needs a printed calendar for their new year, right?
JOSH: Paper calendars, they’re awesome.
CHUCK: I am not getting into a swimsuit for you, Josh. [Chuckles]
JOSH: Okay. Well, I prefer my pictures of nebulas and other celestial entities. So, I’m pretty sure I’ve picked the astronomy picture of the day before, the NASA website showing you awesome pictures of things related to space. And that’s just asterisk.apod.com. But they now have this, for the last couple of years, they’ve been doing a fan-created calendar that is 12 nice pictures of celestial phenomena and the little calendar grid beneath it with all sorts of meteor showers and things like that. And it’s all nicely laid out in PDFs that you can just take it to Kinkos or whatever and print it out on spiral-bound stuff and have your own little flip calendar. So, that looks really great. I’m going to be getting this one done soon so I can have something to put up in my bathroom. [Chuckles]
JAMES: Wait a minute. You keep a calendar in your bathroom?
JOSH: I need something on the back door of the calendar, right? I mean, of the bathroom, right?
JOSH: You always need something there to look at. [Chuckles] And then I have a silly one and that’s The “Blog” of “Unnecessary” Quotations. [Chuckles] And this is great. It’s a lot of funny pictures of people abusing one of the best punctuation marks there is. So, that’s it for me. I haven’t been doing much programming in the last week or two. So, no programming picks. Okay, I’m done.
CHUCK: James, what are your picks?
JAMES: First, I talked recently about problems in gender diversity in our field and stuff like that, and how we all need to be, I think, increasing our awareness on that. There’s a really great write-up of the recent Node.js issue by Joyent, the company that sponsors Node.js. It’s really short. You can read through it quick. Super insightful as far as why this is a problem and how we should be thinking about this and stuff. So, I’m recommending everybody read this because it’s a great write-up. So, that’s my first pick. And then second, I’ve been playing a bunch of games lately. And I’m finding some pretty good stuff. One of those is Papers, Please. And if you have not played this game, you absolutely have to. It is fun how fast this game can turn you into a horrible person, which is always interesting. You’re a paper checker at a customs checkpoint, checking passports and tickets and various complications. And you have to do so many a day to make money and you’re keeping track of your family and you end up not being able to pay your heat or whatever if you don’t do enough. Then your kids get sick and you start thinking, “Ah, I need to just get this person through here as quick as possible.” It’s amazing how quickly it gets you to start reassessing these things and stuff. It’s really a total blast. Papers, Please and [inaudible]. Those are my picks.
CHUCK: Alright. Avdi, what are your picks?
AVDI: So, I’m going to pick something topical and that is Ilya’s blog. I have been following your blog for years and years and it has always been one of my very, very favorites. I want to thank you for the years of amazing articles. These articles are always in depth on technical topics. They’re beautifully illustrated. They have great code samples and they’re just incredibly insightful. I learned so much from this blog over the years. It’s igvita.com. And gosh, archives go all the way back to what, 2005 it looks like. So yeah, I’ll put the URL for that in the show notes.
JOSH: That blog was a big part of why Ilya’s a Ruby hero.
JAMES: Yeah, that blog is amazing especially in the EventMachine stuff and things like that. It’s awesome.
AVDI: Yeah, and I guess okay, I’ll pick something fun. My current Netflix brainless guilty pleasure is the show Revenge which is pretty much just about somebody getting back at people. And it’s just good brainless schadenfreude fun.
CHUCK: Awesome. Alright, I’ve got a couple of picks. My first pick is, and I know it’s been picked on the show before, but I’ve really gotten into The Walking Dead. It’s not something that helps me unwind at night. Well, it helps me unwind, but it doesn’t help me go to sleep like some of the other shows that I watch sometimes in the evening. But anyway, I’m really enjoying the show. I’ve just been buried with work, so I really don’t have any great programming picks. But I have been listening to audio books and one of the books that I’ve been listening to is ‘Duct Tape Marketing’ by John Jantsch. So, if you’re in business for yourself and you’re looking for something that can guide you through the process of setting up your marketing, then that’s a really terrific book. And those are my picks. Ilya, what are your picks?
ILYA: So ‘Duct Tape Marketing’ is actually a great book. I really, really enjoyed that. My picks, let’s see. I actually have three, I guess. One [inaudible], so I’ve already mentioned it before. I do have this new book out called ‘High Performance Browser Networking’. So, if you like my blog, chances are you may like this one as well. So, do check it out. That’s at hpbn.co. Then the second one, actually a book I just finished reading just recently which was really interesting called ‘Exploding The Phone’ by Phil Lapsley which basically takes you through the history of AT&T and phone phreaking which is something that I, to be quite honest, I didn’t know much about. Certainly familiar with the term, but this is a very well-researched history of how it came to be, what they were doing, and the early explorers of this network plus all the legal repercussions that happened and all the rest. So, really interesting read. And then the last one which is just good comic relief whenever I have a long day and need a break, TheCodingLove.com. It’s a really awesome Tumblr blog of basically just animated gifs with annotations for programmers. For what happens when an intern joins your team, to how do you solve bugs, and all the rest.
ILYA: It never fails to give me a good laugh when I need one. So, that’s definitely a good place to check out. And I think that’s it.
CHUCK: Alright, cool. Well, before we wrap up, I want to remind you of our Book Club book. We’re reading financial, or… Financial [Chuckles]
JAMES: Financial programming? Awesome.
CHUCK: Financial programming!
JAMES: That is going to be an awesome episode.
CHUCK: Yeah, James is going to tell us how he gets rich. No, it’s ‘Functional Programming For The Object-Oriented Programmer’.
JOSH: Or as David calls it, FPOOP.
CHUCK: Yeah. Well, you were talking about the bathroom earlier, so…
JOSH: [Chuckles] Well, David’s not on the show. Somebody has to stand in for him.
AVDI: In his defense, that is the acronym.
JOSH: I know! It’s great. [Laughter]
CHUCK: Anyway, so we’ll have links in the show notes and I believe the discount code.
JOSH: Hey, hey, hey, let’s plug Parley too. We haven’t done that in a while.
CHUCK: No, we haven’t.
JAMES: Go for it. Plug away.
JOSH: Yeah, so for those not in on it, Parley is the Ruby Rogues private discussion group. And we moved it from an email list to a Discourse site. So, it now has a lot of great features for managing conversations and stuff. And you can have conversations with the Rogues and other listeners about stuff on the show and other random things. We have job postings. We have all sorts of crazy technology conversations. And we have many of our guest Rogues there available for discussion too. And Ilya, I hope you come check it out, too. By the way, it’s private and you pay anywhere from $10 a year to $50 a month, whatever you want. And that’s a way for supporting the podcast. Done with plug.
JAMES: Another way to support the podcast is go buy a Ruby Rogues shirt.
JOSH: Oh my god, yes!
JAMES: We have our shirt campaign up. We got an awesome design from Beth Morris over at Littlelines. And we’re super, super in love with it. They are awesome-looking shirts. You absolutely need to wear one. So, go check it out. Buy a shirt. We’ll have a link in the show notes.
CHUCK: Alright, I think that’s it. We’ll catch you all next week.