036 RR RubyGems with Nick Quaranto
Published on: January 5, 2012
00:43 – Nick Quaranto Introduction
01:35 – Ruby Gems
- Started around Jekyll
- Runs GitHub Pages
- github gems
- Josh Nichols: jeweler
- Tom Preston-Werner
- A Gems API
- How to Sell It
- Sinatra Apps
12:46 – GitHub's Biggest Regret: Writing RubyGems Server
14:35 – How RubyGems.org Has Changed the Landscape to Manage Gems
- Release of gemcutter
17:08 – gemcutter’s Process (Big Shift in 2009)
21:30 – Traffic
24:25 – Gemcutter using S3
- Gemspecs/Gems on Cloud Front
- Updated 200 times/day – Under a minute
27:50 – RubyGems uses 3 Diff Lists
- Latest Version of Every Gem for Every Platform
- Every Single Gem Available Ever
- Pre-released Gems
30:18 – Pending RubyGems Changes:
- New Endpoint
31:05 – Decentralization of Ruby Environment
- Great, until it fails – “everyone is toast”!
33:40 – How do we NOT become Rubyforge
- Having or Not Having Comments
35:44 – Traffic
- Tracks CPAN, PyPI, Hackage
- CPAN is the Perl Archive
38:00 – Mirroring and Problems
- Trust Issues
43:56 – Gem Building Tools
52:19 – gemcutter Future Additions
- More People Using APIs
NICK: I know you guys had an episode where you tore apart the code and I thought that was really cool and weird to listen to.
JOSH: We’re going to ask you about your feedback about that on the show.
JOSH: We want you to rate our reading.
JAMES: Rate our reading?
NICK: A dramatic reading.
CHUCK: Hello everybody, and welcome to Episode 36 of the Ruby Rogues podcast. This week on our panel, we have guest Rogue, Nick Quaranto.
NICK: Hey ho.
CHUCK: Do you want to introduce yourself really quickly?
NICK: Sure. I’m Nick, I live in Buffalo, New York. I work for 37Signals and I’ve hacked a lot on RubyGems and RubyGems.org, more than any human should possibly want to. Although not the most, I don’t hold the top banner there.
CHUCK: Alright. Well, thanks for joining us. We also have James Edward Gray.
JAMES: I’m back.
CHUCK: Yeah, it was kind of different without you.
JAMES: It was really weird being on the other side. I got to listen to Ruby Rogues as a consumer instead of a producer.
CHUCK: [Chuckles] Josh Susser.
JOSH: Good morning, folks.
CHUCK: Avdi Grimm.
AVDI: Hello again.
CHUCK: And I’m Charles Max Wood from TeachMeToCode.com. And this week, we’re going to be talking about RubyGems, kind of where it came from and then all the good stuff that surrounds that. So, it was suggested by Josh. So, I’m going to go ahead and let him kind of lead us into this topic.
JOSH: Okay, cool. So, RubyGems.org, that’s this website that lets us see what gems are available and it also drives the downloading of RubyGems when you type gem install, whatever. So, there’s a whole website there. And Nick is the driving force behind that so we wanted to talk a bit about the history of how that came to be and what it does and what the future is. Does that sound okay?
CHUCK: Yeah, but what does RubyGems have to do with anybody programming Ruby really?
JOSH: Well, what does it? Somebody tell me.
CHUCK: Well, I remember when I first got into the Ruby community and we had, what was it? Rubyforge? And so, I kind of expected collective groans but it did actually work mostly.
NICK: I groaned internally for you.
CHUCK: So, I’m wondering, what were the issues with Rubyforge that prompted the gemcutter rewrite?
NICK: That’s a good question. So, I got into gem publishing around when Jekyll started to get popular and basically, that’s the thing that runs GitHub Pages. And I knew nothing of RubyGems. I kind of knew Ruby, I was doing some Rails stuff and I was actually looking for open source projects to contribute to. I actually have a post on Reddit from forever ago that’s like, “I’m in college and I heard this open source thing is cool. What do I do?” And somebody pointed me at Rubinius and I went running the other way as fast as I could. But then, I found Jekyll and I decided to rewrite my blog in it.
And from there, I basically knew I had to hack some things into it, I had to hack some features and I found out about gems that way. It was more of like a natural thing, like, “Oh, I need to hack this feature in. How do I do that?” And I started looking into gems like that.
So basically through that experience, I learned about GitHub gems which at the time, were starting to get kind of popular. You could press a little button on GitHub and it would build a gem for you, which I thought was cool so I could click that. Then I started to look into like, “Okay, what if I wanted to put this on Rubyforge?” Because that’s basically what you had to do to make gem install work, because what it would look at by default was at Rubyforge. And I found out that you needed to sign up for an account and then request for access to a project. I think I put something like RoflCopter or something in there, just to like mess around and see what it would do. And I ended up getting rejected to make a gem.
JOSH: Why would anyone reject a RoflCopter?
NICK: I don’t know.
JAMES: That’s so awesome.
NICK: I got rejected and they were like, “This isn’t a project, you’re silly.” And then, I started talking more to two other guys at the time. One was Josh Nichols, who was writing Jeweler which was a little gem to help you make gems. He was in Boston and so was I at the time. Basically, we would drink at Boston RB and talk about the problems he was having with Jeweler.
One of the things he talked about a lot was with Rubyforge and how it just was not fun to work with. So apparently, we had to upload things through a form to get to index your gem. That’s how you would push your gem. Before is that you would upright, is this correct?
JAMES: Yeah. I think you’re kind of way under selling the process. It was one of those awesome 1970’s interfaces so it was really comfortable. And if you wanted to upload three files, you could go through the process three times instead of one and that just kind of added to the excitement.
CHUCK: And you forgot the blood letting.
NICK: That’s right. I performed a séance and danced around a pop three times, and then my gem was indexed. But anyway, I started looking more into Rubyforge and I talked to Tom Preston-Werner from GitHub about the problems they were having at GitHub which apparently, building gems is not easy. They would have all sorts of problems and they actually had a support queue full of, “Hey, my gem won’t build because of blah.” And they would sometimes fix it but not usually.
I actually started looking through Rubyforge like what is going on in this thing? I found out that it was just a really old GForge clone so when SourceForge used to be open source, that’s when they kind of copied it over. Then there’s just a ton of weird things and modifications that they’ve put on and obviously it’s been awesome and it’s served us well. But I actually wrote out a list of all the things that we shouldn’t ever have, and I think it was like, “Well, we don’t need a job board.” You can have a journal on Rubyforge. This is a whole mess of features.
JAMES: Dear Diary, today I compiled a gem.
NICK: Yup. So, after talking to those, to Tom and Josh, we basically said like, “It would be really cool if we had a site that was better.” And I think the number one thing was we wanted an API for interacting with gems. One of the things that Josh had to do a lot was scrape the site, there’s actually entire gems, I think Hoe and the Rubyforge gem, they both scrape the Rubyforge HTML in order to submit your gem. So, we were like, “This is silly, we need an API to do this.” And that would be nice. I had no idea what that meant. I was like, “Let’s have an API to do that.” We also wanted better project pages that were more transparent.
Dr Nic at the time, he wrote this newgem thing that would produce a little project page for you and it would show you the kind of how to install your gem and the version number. And I thought that was the neatest thing. Because nothing, like if you went to a Rubyforge page, good luck figuring out what the current version of it even was. Or like what was the most recently pushed thing, or how many downloads it had. So, I thought it would be neat if we had an easier way that would be accessible for everyone to get that information.
Then the final thing is that I wanted to be in Ruby. That’s kind of a no brainer but I didn’t want to hack PHP. Certainly one of the options could have been, “Oh, let’s make Rubyforge better.” But I wanted to write Ruby and I didn’t think that would be possible. So, it’s kind of the goals I set out with.
JAMES: Well, after that, you threw together the site first? Or what?
NICK: Yeah. So, that was I think before Rails Conf in 2009. What was the one that was in Vegas?
CHUCK: 2008, I think.
NICK: Okay, it was that one. I went to that one and we had the idea before that. Basically, I hadn’t worked on it much. And I talked to a few people at the conference and they gave me a lot of great feedback. They kind of inspired me to try it out. I think, I forgot who but someone was like, “You got to learn how to sell it.” The idea was very nebulous at the time. Now, it makes sense, make gems easy to release and to share. But before, it was kind of this vague thing. We knew that there were these goals but solving the clear problem wasn’t there yet.
So, I basically went home from that Rails Conf and spent the next few months just trying to figure out, getting those things working. I read a lot of the RubyGems source code which was frightening at times. And I ended up writing two little Sinatra apps, one for the website portion and the other for the gem server. It was basically a database-backed version of the gem server that’s in RubyGems itself because the gem server that’s in RubyGems itself is file system based and that’s not good to work with thousands of gems.
I basically just went forward from there. And eventually, I think things really started exploding when it got up on Ruby Inside. There was like a little article, “Here’s this gem source that’s taking on GitHub and Rubyforge.” When in reality, it wasn’t a battle. I’m not saying that I wasn’t fighting for it, but saying like I was trying to work towards improving the whole situation. Especially at GitHub, Tom was telling me about the problems they were having and it seemed like everyone was unhappy with Rubyforge and how it worked.
JOSH: I just really like the idea of you getting a drinking battle with Tom to see who could own the space of hosting Ruby gem downloads.
NICK: I think I’d lose, I think I’d definitely lose.
CHUCK: I think it’s interesting too because it kind of got to the point where GitHub basically said, “Rubygems.org or gemcutter,” I don’t remember exactly what it was being referred to at the time, “Is good enough and so, we’re not going to host gems anymore.”
NICK: Yeah. And another thing that was said at that Rails Conf, there was like a panel that all the GitHub guys were on. And I think it was PJ who was asked, “What’s the thing you regret most?” Like about their whole process so far. This was in 2008 and he said, “Writing the RubyGems server.” So, that kind of like, it made total sense. That’s not core to their business, and even now it’s more apparent that it’s not a thing. If they went more in that direction, I don’t think they would have been as successful. So, I think they were willing and happy for someone else to deal with it.
JAMES: It’s kind of a vital step in the process though, right? Because we talked about how Rubyforge was ancient and bizarre and then GitHub had plenty of problems with its gem server but the one thing that was amazing about it was how simple it was, right?
NICK: Right. I think that was an early thing as well, it was like, “Oh, I have to put forth more effort in order to use gemcutter.” But I think that was more of a lesson in education that it’s just difficult to build gems in a robust and automatic way that works across all platforms, like they would never build Windows gems.
NICK: So, I think it makes more sense to have it be done client side and the proof is that GitHub tried to do it server side and it didn’t work.
JAMES: Yeah, I agree. Actually, I wasn’t dissing gemcutter there. I was saying that I thought GitHub kind of led the way as far as, “Look how easy this can be, chuck a buck, get a gem.”
JAMES: Then when we got gemcutter, I felt like you kept that, “Look how easy this can be, gem push and you’re done,” kind of thing.
NICK: Yeah, definitely.
CHUCK: So, that kind of leads me into another question that I have. And that is, how do you think that RubyGems.org has changed the landscape on the way that we manage our gems? It definitely changed the ease with which you can submit a gem. I never actually submitted any to Rubyforge but it’s pretty darn easy to get one into RubyGems.org. Are there any other things that people have told you that they’ve noticed as far as how it’s affected their work flow or anything?
NICK: I think, just because it’s so dang easy now that people are doing it more, and they’re doing it more frequently. And they’re putting things up there that they wouldn’t have done before. Jeremy Hinegardner had a talk at Ruby Conf in the year. It was the first one that was in New Orleans. And he kind of like mined data out of RubyGems and out of the database we had. And he found out that since the amount of gems released in Rubyforge from the beginning until when the cutover was, was the same as one year of gemcutter, and that’s just the first year. I’m really pumped to see what the second year is going to be, if there’s going to be another huge — if the growth curve continues to skyrocket. So, I think that just the data is really showing it. I think we’ll have to find that for the show notes.
JAMES: I was actually in that talk and it was a really good talk. He showed the graph and it’s interesting because you’ve got this kind of steady climb of gems being published as things go up and then this like, whoa! Where it just shoots way up, and Jeremy dropped an arrow there and he was like, “This is the release of gemcutter.”
NICK: That was really humbling. I’m really glad to be a part of that. It’s changing how we’ve been behaving but what I’m more excited is to see that same principle be applied to other communities. And we’re starting to see there are some other package manager sites that are adopting a more automatic, ‘let the users do what they want’, ‘you don’t need permission for everything’ approach. And I think that other communities will see a similar growth and I’m really excited to see that happen.
JAMES: So, with gemcutter’s process, originally you wrote it as a gem plugin I believe, and you had to install the gemcutter gem and add the source and then you could push and stuff. Then it caught the eyes of Ruby Central, I guess, and kind of became official. Is that how it went?
NICK: Kind of. So basically, after the blog post on Ruby Inside, it kind of got to the point where people were starting to get confused. I was seeing it in a lot of readme’s, like, “Install your gem from gemcutter.” And I think the point of it had always been this is going to be the main thing. I’m going big or going home. It’s like, “If this doesn’t work, either we’re taking over or it’s going to be the real thing.” I really didn’t want it to be like a mean, “Screw you guys, I’ve got the cool hipster thing now.” No, it was supposed to be like the better thing to use and to improve what’s going on, and not be this mean backhanded way.
So, in order to do that, I had to talk to the RubyGems team which is mostly based out of Seattle and it’s mostly Eric Hodel and Ryan Davis. And now, Evan Phoenix is really involved too and he helped out a lot. So basically, it was talking to those guys and figuring out what’s possible. Basically I had a proposal I sent out to the RubyGems developers’ mailing list which is kind of like a deep dark hole of where all the RubyGems bugs are discussed and yelled about. I had no idea it existed and some scary thing in Rubyforge. I basically laid out like, “Here’s what we could do.” to make these two work together so we don’t make everyone’s lives more painful by having these three different competing, at the time, it was basically three at the time, it was GitHub, Rubyforge, and gemcutter.
So, instead of having these three community services, let’s make it one so we’re not all going crazy about where gems are fetched or where do I get this? If someone pushed a Rails to a — I think my biggest fear that was driving me for a while was like, “Okay. What if Rails was not here and how do I make sure that a bad version of Rails does not get pushed or something?”
JOSH: I think that requires a lot of conversation with Rails core team.
NICK: Yeah, that’s true. [Chuckles] But I was recently just working together with them. Ruby Central was definitely involved as well because they run the Rubyforge infrastructure, they finance it. And although they don’t work too much on RubyGems itself, they’re the non-profit backbone that basically owns, they write the checks.
So the way it went on was kind of, the gemcutter brand so to speak would go away, Eric, RubyGems maintainer, he owned RubyGems.org but wasn’t really using it for anything so we decided to move it to that because that sounds official. And then, we would move off to some separate hosting as well. We were on Heroku and we moved off to Rackspace with the goal of at some point getting the Rubyforge infrastructure off onto Rackspace.
It was basically this big shift that happened in 2009, I think? And I don’t remember exactly when but that’s basically how it went down. You’d think it would be like, we all got together kind of old west style and shot at each other, but it really wasn’t. It was more like, we kind of sat down and figured out what the actual work was.
CHUCK: Yeah, leave the shooting old west style to the Rogues.
NICK: I was really humbled by the amount of enthusiasm people put forth. And it definitely sounded like everyone was like, “Oh finally, someone did this. Thank you!” Instead of, “Ugh, here’s something we have to do.” So, I was really blown away by everyone helping out.
JAMES: So, I was wondering how’s your traffic, Nick? I would assume you would have quite a bit just searching all the gems and stuff.
NICK: Yeah, I haven’t looked at us. We just hooked up Gauges. I can’t say that right.
JOSH: That’s right. It’s not Gau-ches.
NICK: [Chuckles] Hold on, sorry I’m looking up these things, I don’t keep them on the top of my head. But it looks like we get around 20,000 or so people on the site a day. I like sharing this data too like the up time, we do up time through Pingdom, I have Gauges and New Relic. This is more like a general offer because we have performance problems and other things. So, if people want access to like a real high traffic Rails site, we’ll do this at the end too but I can do that for you. You can play around with it and see what’s going on. So, I think that’s a fun part and why I love running this service is because it is real and we’re using it every day. I do lose some sleep over it now and again but I like that. It’s fun.
CHUCK: So, you can’t tell us how many bundle installs are run a day?
NICK: I wish. That would be a good metric to track.
JAMES: I would actually think Bundler gave you a lot more traffic, is that true? When Bundler got popular, the traffic got worse?
NICK: I don’t think it’s been a bump. I just think it’s been an eventual rise because I mean, if you think about it, to be honest, I don’t have the data on that. But I feel like it’s just been going up and up because there’s before Bundler, I feel like we were still requesting the same amount of gems. I think now, it’s just requesting the same amount of gems and the right amount of things, and possibly even more. So, it can do all of the dependency checking. I think it’s hard to track specific Bundler installs because especially on the older version of Bundler, it’s grabbing tons of gems and gem specs.
The newer version of Bundler, there’s a specific API endpoint that we wrote to make it go faster. So, maybe that would be more conclusive, but there’s nothing like Bundler would have to say, at the end when it’s done bundling, it would have to make another HTTP request or something. I don’t know if it would be possible to like snapshot an entire install. But as for installs, I haven’t looked at the stats for S3 or Cloud Front for a while but I do know that it was expensive. [Laughs]
JAMES: Yeah, that’s a good point. Gemcutter uses S3. We saw that when we were poking around on it. So, the gems themselves are on S3?
JAMES: But the index is on gemcutter server, right?
NICK: No. The index is a list of gems that’s available for download. Gemcutter is basically a fancy wrapper. If you were to put it in the most simple terms, it’s a fancy web app around regenerating that index all the time. So, every time someone pushes a new gem, it has to regenerate that index and shove it off to S3 which then gets cached by Cloud Front. And right now, the indexes are on S3 and the gem specs and gems are on Cloud Front. Previously at Gemcutter, no previously on Cloud Front. So, Cloud Front’s a CDN. So basically, once it gets a file out in Japan, it’s going to keep that file for 44 hours and cache it. So, we tried at first to put the indexes on Cloud Front, but you wouldn’t be able to update them more than once a day and all at different times. So for right now, the indexes are on S3 because you can update that constantly.
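The caching problem Nick describes can be pictured with a toy model. This is a minimal sketch, not how Cloud Front or gemcutter actually work: `EdgeCache`, the gem names, and the TTL value are all invented for illustration. It only shows why a file that changes on every push is a poor fit for a cache that holds its copy for a fixed period.

```ruby
# Toy edge cache: keeps its copy of a file until ttl_seconds have elapsed.
# EdgeCache is a hypothetical class, not part of any real CDN client.
class EdgeCache
  def initialize(origin, ttl_seconds)
    @origin = origin          # callable that returns the current file
    @ttl_seconds = ttl_seconds
    @copy = nil
    @fetched_at = nil
  end

  # `now` is a timestamp in seconds, passed in to keep the sketch deterministic.
  def get(now)
    if @copy.nil? || now - @fetched_at >= @ttl_seconds
      @copy = @origin.call
      @fetched_at = now
    end
    @copy
  end
end

index = ["rails-3.1.3"]                  # the index held at the origin
origin = -> { index.dup }
ttl = 24 * 3600                          # arbitrary TTL, assumed for the example
edge = EdgeCache.new(origin, ttl)

edge.get(0)                              # edge fetches and caches the index
index << "example-0.0.1"                 # a push regenerates the origin index

stale_view = edge.get(60)                # one minute later: edge still has the old copy
origin_view = origin.call                # reading the origin directly sees the push
later_view = edge.get(ttl)               # only after the TTL does the edge refresh
```

Gem files themselves never change once pushed, which is why caching them at the edge is safe while caching the index is not.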
CHUCK: How often is it updated?
NICK: My last count was 200 a day. I can whip that up in a second.
CHUCK: How long does it take to generate?
NICK: Under a minute.
CHUCK: So, that’s not bad.
NICK: Yeah. Evan Phoenix did some really good performance work. The goal was under a minute when I first had it out. And as we got more and more gems, and especially after I brought in all the Rubyforge gems, it was starting to creep up around two minutes plus and then Evan optimized the heck out of it.
CHUCK: You guys have seen X-Men because the Phoenix was like the Class 5 mutant. And he’s totally delivered on that, I think.
JOSH: I believe it’s Level 5.
CHUCK: Level 5. Okay.
JAMES: Get your reference right.
JOSH: Sorry, I’ve just been watching Wolverine and the X-Men.
JAMES: So, you talked a little bit about the traffic problem, so it was that you got everything together, you regenerated the index. And then when we moved all of the Rubyforge things in, it got slower. And then, Evan went and used his magic and sped it back up again. Then Bundler, people have definitely complained Bundler was slow. And now, we modified Bundler and you also modified the application to help Bundler go forward, right?
CHUCK: Yeah. But in my opinion, it only made it less slow.
CHUCK: So, I’m curious though if it’s a file that’s already been generated that’s sitting up on the Cloud, already cached, then what takes so long for it still to get the index and stuff together? Granted it is faster, is that Bundler or is that RubyGems?
NICK: So, I think we’re talking about two separate things here. So, RubyGems itself only provides a way to find and install a gem. Basically, there’s three different lists that are available. The first one is the latest version of every gem for every platform. So, for example Rails, the latest version is 3.1 something. Then if I was dealing with a gem like Nokogiri, then it would be the latest version for Nokogiri, then the latest Java version. So, that’s one index. Another index is every single gem available ever, that’s huge. Then there’s another index that is only pre-released gems.
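The three lists can be sketched in a few lines of Ruby. This is an illustration under assumptions: the `Release` struct and the sample data are invented, the real indexes are generated files rather than in-memory arrays, and the sketch assumes the "every gem ever" list leaves out pre-releases, which the conversation does not spell out. `Gem::Version` is the version class that ships with RubyGems.

```ruby
require "rubygems" # provides Gem::Version

# Hypothetical record type and sample data, for illustration only.
Release = Struct.new(:name, :version, :platform)

RELEASES = [
  Release.new("rails", "3.0.9", "ruby"),
  Release.new("rails", "3.1.3", "ruby"),
  Release.new("rails", "3.2.0.rc1", "ruby"),
  Release.new("nokogiri", "1.4.7", "ruby"),
  Release.new("nokogiri", "1.5.0", "ruby"),
  Release.new("nokogiri", "1.5.0", "java"),
]

def prerelease?(release)
  Gem::Version.new(release.version).prerelease?
end

# Every stable release ever pushed (assumed to exclude pre-releases).
def all_releases(releases)
  releases.reject { |r| prerelease?(r) }
end

# Newest stable release for each gem on each platform.
def latest_releases(releases)
  all_releases(releases)
    .group_by { |r| [r.name, r.platform] }
    .map { |_key, group| group.max_by { |r| Gem::Version.new(r.version) } }
end

# Pre-release versions only.
def prerelease_releases(releases)
  releases.select { |r| prerelease?(r) }
end
```

With this sample data, the latest list holds one Rails entry and two Nokogiri entries (one per platform), matching the Rails and Nokogiri example in the conversation.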
Basically, those three lists are all that’s available and that’s pretty much all that Bundler has to work with, along with the gemspec. And the gemspec is what we all know and love when we write to basically say what’s in our gem, and you also write what the dependencies are. So, if I’m in Rails, I depend on ActiveSupport and ActiveRecord and whatnot. So, the old Bundler, basically the one that we’re all used to and is slow, basically, in order to figure out dependencies, that’s its whole point. It has to look at that gem spec which has a dependency. That’s one HTTP request. So then, if you specify five dependencies in your gemspec, it then has to make five more HTTP requests. It basically has to go all the way down the tree until it figures out everything. So, that takes a long time and that’s really the slow down. And it’s that, there’s no way to easily find out dependency information with the stuff that’s been available in RubyGems.
And definitely, I don’t think that was a thing that RubyGems was designed for. So, the way that we’ve worked around it is like James is talking about, is we wrote a new endpoint on the gemcutter side to say, if I say in my Gemfile, “Rails, Nokogiri, Rest Client and RedCloth, what do those four gems depend on?” And it will give you, for those four, it will give you the dependencies for those four. So basically, it can group up those requests and make less of them.
So, that’s the real reason why it’s slow. I think the file generation and stuff is a whole separate problem. I think Bundler in itself is an interesting mix of problems that RubyGems just simply wasn’t designed for. So now, we’re all dealing with it and we all stare at our screen watching it.
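The difference between the two approaches comes down to counting round trips. The sketch below simulates both against a small, simplified dependency table; no network calls are made, the table is illustrative rather than a faithful copy of any gem's real dependencies, and both method names are invented. It also assumes the client remembers gems it has already looked up.

```ruby
require "set"

# Simplified, illustrative dependency table (not real gemspec data).
DEPENDENCIES = {
  "rails"         => ["activesupport", "activerecord"],
  "activerecord"  => ["activesupport", "arel"],
  "activesupport" => [],
  "arel"          => [],
  "nokogiri"      => [],
}

# Old style: one simulated HTTP request per gemspec, walking down the tree.
def requests_one_at_a_time(roots)
  seen = Set.new
  queue = roots.dup
  requests = 0
  until queue.empty?
    name = queue.shift
    next if seen.include?(name)
    seen << name
    requests += 1                         # fetch this gem's gemspec
    queue.concat(DEPENDENCIES.fetch(name))
  end
  requests
end

# Batched style: one simulated request per group of not-yet-seen gems.
def requests_batched(roots)
  seen = Set.new
  pending = roots.uniq
  requests = 0
  until pending.empty?
    requests += 1                         # ask about the whole group at once
    seen.merge(pending)
    pending = pending
      .flat_map { |name| DEPENDENCIES.fetch(name) }
      .uniq
      .reject { |name| seen.include?(name) }
  end
  requests
end

one_by_one = requests_one_at_a_time(["rails", "nokogiri"])
batched = requests_batched(["rails", "nokogiri"])
```

In the one-at-a-time version the request count grows with the number of gems in the tree; in the batched version it grows only with the depth of the tree, which is the saving Nick describes as grouping up requests.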
JOSH: So Nick, I’ve heard comments here and there about there have been some changes pending with RubyGems itself to make Bundling work better. Do you know what’s going on with that? Anything you can say about that?
NICK: The thing I just mentioned was all I know about, that new endpoint. I think that endpoint is the first of many iterations, although it’s been done since last summer, two summers ago. But I feel that now that the service is in Ruby, we can mess around with it more and try out different things.
JAMES: Yeah, I feel like the Bundler endpoint is kind of the testament to why we did that, right?
JOSH: I think one of the cool things about what I’ve seen happen around this piece of the Ruby environment is its decentralization. There used to be Rubyforge and it was this one place that did all of this stuff. Now, we have GitHub where people keep their code repositories and RubyGems.org where gems get uploaded and downloaded from. And there’s probably a couple other key pieces. I guess, there’s the Bundler tool and the Gem Command line tool. But things are now decentralized and there’s multiple teams of people taking care of the tools that we all use which I think makes for a more robust ecosystem. But I’d be interested in hearing your perspective as someone who is in that process of centralization versus decentralization. Just your perspective on that?
NICK: Yeah, I think that’s something I definitely think is a good thing. I’d rather have a lot of smaller services that are in charge of one thing than one big one that is easy to fail. That being said, I think RubyGems.org is even too big to fail. Like if we’re down, everyone’s toast. Like it went down on New Year’s Day, like at 5:00 in the morning Eastern time and I didn’t know until my dog woke me up at 8:30.
JOSH: Master! Master! Timmy’s down a well and RubyGems.org is down.
NICK: Yeah, really.
JAMES: Holy crap! Am I understanding this correctly? You’ve trained your dog to tell you when your website’s down. That is just awesome.
CHUCK: It’s a different kind of shock collar.
NICK: You have to pay extra on Pingdom for it but it works.
NICK: Yeah. I mean, that shouldn’t happen and there’s a lot of infrastructure things that we can do, that Evan and I are going to be working on in the next few months, along with the other. There’s three other guys that have been helping out a lot, Chris Michael John, I’m probably saying his name wrong, I’m sorry. Michael Zober, and Gabriel Horner. And they’ve all been helping out a ton. And I think basically, having a decentralized service is definitely important. The constant battle that I’ve been waging, that I don’t think will end anytime soon since we’ve become the official thing is, how do we not become Rubyforge?
That’s always in my mind. Like if we add this feature, what will happen? The good example is comments. If we add comments on the gem page, what will happen? And that’s something I’m totally opposed to because I think it will just create another bug tracker, right? I’ll now have to look at my gems page for comments. When someone comes along and has a problem and says, “What do I do?” And just comments there, I’ll never see it. So, that’s something that I want to keep people from doing, let people decide where they want to have their bug tracker, let people have their source code, wherever, not on Rubyforge. RubyGems is fine, throw it somewhere else and that’s great.
That’s definitely what I want to push for is services that do one thing and do it great. I want gemcutter to be its main job is to focus solely on serving those gems. That problem is big enough in itself and it’s only going to get worse. It’s only going to get more difficult and more hairy. Especially as now, one of the bigger problems is mirroring, right? That’s definitely going to be a whole another service that I hope someone else will lead, and lead up the infrastructure and stuff with. So, I think that having more separate things that are in charge of one thing instead of more is good.
JAMES: I thought that was really insightful, what you said about how you don’t want to add comments because then people would have to track them there and plus it’s something gemcutter’s going to have to do then. And then, you’re going to have to deal with spam and all that kind of crap, which I think is really cool. You did bring up a good point about mirroring, though, that is I would guess the one major thing we lost from Rubyforge.
JAMES: Rubyforge did have mirrors so that not everything was served from one central place.
NICK: Agreed. By the way, I found the stats. So, for last month, for Cloud Front, it looks like we pushed 4,000 gigs out. So, that’s four terabytes, and that’s just from the US, two terabytes from Europe, almost a whole terabyte from the Singapore endpoint. And I’d say like a half, split between Japan and South America. That’s not even counting S3. Oh, there’s a total. Total for Cloud Front is a lot. [Laughs] No, wait! That’s internal. I’m sorry, this report’s confusing.
But anyway, so there’s a lot of traffic and I found the gem stats too. So, we did just over 200 a day. I guess last week was kind of weird because it was a holiday week. But like on the 30th of December, we did 200 pushes. Monday, January 2nd, so when everybody was back to work, 240, Tuesday, 244, and so far, today, 173. [Chuckles]
JAMES: So, those are pushes. Those are people publishing gems.
NICK: Yes. And there’s a really nice site called Module Counts that has been tracking across all of the different sites. So, it tracks CPAN and PyPI and some other ones like Hackage for Haskell, NPM. And we were by far blowing the other ones out of the water. We’re over 30,000. According to them, we’re at 30,000. My count, we’re around 33,000, almost 34,000.
CHUCK: That’s packages on the system?
NICK: Yeah, that’s unique gems. Specific versionings, there’s a lot of those too. There’s even more. That’s not even, that’s of gem, right? So, Rails has like how many versions? Has like 30 versions or whatever. Versions, there’s 182,000. So, that’s the actual gem amount that we’re storing. So, it’s 182,000+ gem files hanging around.
So, that brings in the whole mirroring thing. The mirroring problem is something that we need to solve. I’ve been talking about it for a long time. I’m a really bad Sys Admin. And mirroring is an entirely Sys Admin problem. It’s like an infrastructure set up and deploy, and make it easy to deploy problem. And I’m pretty bad at that. So, I need help with that. Evan and I have found some people that are interested and we’re going to start to look into it. But if you want to help, let me know.
Basically, the problem we’ve had with Rubyforge was that the mirrors stunk, right? You would throw a gem up there and at least from what DHH would tell me is that they would have to wait for days to get a Rails release out because they would have to wait for all of the gems to get rsynced across the several mirrors that were bound [inaudible] DNS’d to gems at RubyGems.org.
So basically, the lesson there is, yeah, it really stunk that we lost it but it’s clear that we want gems to be easy to install. Now, it’s a little more than a minute from when you type gem push to when you can install it and that’s great. That’s one of the huge selling points that gemcutter kind of won out. But if we were to go back I think to like 15, I think it was even more than 10 minutes, I think people would start complaining and start wondering what’s going on. I would.
I think as for mirroring, the system we had before was a pull system. Like you would set up a mirror in God knows where, and then you would pull down new gems. I think instead we need it to be a push system where the main server, so we say to you, “Here’s the new gem, index it.” And then once you have it, there’s got to be some way of saying like, “Oh, you’ve got that now and you’re ready. So, we’ll redirect to you for it.”
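As a rough illustration of the push model Nick describes, here is a minimal Ruby sketch. The `Mirror` and `MainServer` classes and their methods are hypothetical and exist only for this example; they are not part of RubyGems.org. The main server pushes a gem to each mirror, records which mirrors acknowledge it, and only redirects to a mirror that has confirmed it is ready.

```ruby
# Hypothetical sketch of push-based mirroring; none of these classes
# exist in RubyGems.org.
require "set"

# A mirror that indexes a gem when told to and reports back.
class Mirror
  attr_reader :name

  def initialize(name)
    @name = name
    @gems = Set.new
  end

  # Called by the main server: "Here's the new gem, index it."
  # Returns true as the acknowledgement that the gem is ready to serve.
  def receive(gem_file)
    @gems << gem_file
    true
  end

  def has?(gem_file)
    @gems.include?(gem_file)
  end
end

# The main server pushes to every mirror and remembers who acknowledged.
class MainServer
  def initialize(mirrors)
    @mirrors = mirrors
    @ready = Hash.new { |hash, key| hash[key] = [] }
  end

  def push(gem_file)
    @mirrors.each do |mirror|
      @ready[gem_file] << mirror if mirror.receive(gem_file)
    end
  end

  # Redirect only to a mirror that confirmed; otherwise serve it ourselves.
  def source_for(gem_file)
    mirror = @ready[gem_file].first
    mirror ? mirror.name : "main"
  end
end

europe = Mirror.new("eu-mirror")
server = MainServer.new([europe])
server.push("rails-3.1.3.gem")
puts server.source_for("rails-3.1.3.gem")  # eu-mirror
puts server.source_for("unknown-1.0.gem")  # main
```

A real system would deliver the push over the network and would have to cope with mirrors that never acknowledge, which is where something like the PubSub idea raised next could fit.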
JOSH: That seems like something for PubSub.
JOSH: I think that putting all of that publication logic into RubyGems.org itself and having to maintain all of that stuff, like figuring out which mirrors you want to push to, you don’t want to do that.
NICK: Definitely. So, there’s some other background too. I know for sure the maintainers don’t want to put in something that makes you choose a lot of things. I think it’s CPAN, that it makes you choose a local mirror.
JAMES: Yeah. That’s like one of the worst questions in CPAN. It’s like, “How do I know which one of these are good.”
JOSH: Yeah. It should just pick it randomly.
JAMES: Right. It just should.
AVDI: It auto-detects now.
JAMES: Oh, really? By what? Location?
AVDI: Actually, it’s done that for a while. As you’re going through the CPAN setup, it basically says, “It’s time to pick a mirror. Would you like me to pick one for you?”
JAMES: Interesting. And is it using geographic location in that, you think?
AVDI: I think so, or it might be. I think what it does, I don’t think it’s like you put in your location. I think it basically hits a bunch of them and figures out which one came back the fastest. It’s basically looking at ping times.
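The "hit a bunch of them and see which came back fastest" idea Avdi describes can be sketched in a few lines of Ruby. This is only an illustration of the general technique, not how CPAN actually does it. The `fastest_mirror` method and the mirror names are hypothetical, and the probe is passed in as a block so the example runs without a network.

```ruby
# Hypothetical sketch: choose the mirror whose probe returns fastest.
# The probe is supplied as a block, so no real network request is made.
def fastest_mirror(mirrors)
  mirrors.min_by do |mirror|
    started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
    yield mirror
    Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
  end
end

# Simulated latencies in seconds stand in for real requests.
latencies = {
  "mirror-a.example" => 0.05,
  "mirror-b.example" => 0.01,
  "mirror-c.example" => 0.03
}

best = fastest_mirror(latencies.keys) { |mirror| sleep latencies[mirror] }
puts best
```

In practice the block would issue a small request to each mirror, and a single measurement is noisy, so a real tool would probably probe more than once.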
CHUCK: Well, so CPAN is the Perl one.
AVDI: The Perl Archive.
AVDI: It’s done that for a while.
NICK: I am uneducated in the ways of Perl. So, I apologize for my insult.
NICK: But I think there’s a whole other problem that comes with the mirroring and I don’t know how others fix it yet. I think this is the thing that we need to look at other communities for how this is done. Because it’s frankly, in my opinion, embarrassing that we don’t have it. And we always get offers for help like, “Oh, we’ll mirror your stuff.” And we were just like, “I don’t know how to do that.” We need to fix that. And one of the bigger problems along with this whole push/pull thing is let’s say I give you a gem, right? How do I know to mirror? And I am going to trust you. How do I really trust you? How do I know like you haven’t jumped into Rails and added a delete from star or something?
How do I know that the mirror is the same thing that the original author published? That problem is difficult, right? So usually, it’s solved by hashing, like you do an MD5. However, RubyGems doesn’t have any of that right now. There is a gem certification, like security cert thing but nobody really uses that. Eric Hodel has a decent post about the web of trust, like how do we trust someone who’s going to put gems up? If we’re going to let anyone put a mirror up and then redirect anybody, how do we know that they’re doing the right thing?
So basically, RubyGems.org or something that is community-run and official, they basically have to become a CA, to become a Certificate Authority to provide trust. So, that’s a whole other problem that goes along with mirroring, that I don’t think anyone on the team knows. I don’t think any of us knows how to do it easily, otherwise it would have been done already.
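The hashing approach Nick mentions can be sketched with Ruby's standard `digest` library. This is a minimal illustration, not anything RubyGems did at the time. Nick says MD5; the sketch uses SHA-256 instead, and the `matches_published?` helper and the sample bytes are hypothetical.

```ruby
require "digest"

# Hypothetical helper: true when the mirrored bytes hash to the digest
# that the original source published.
def matches_published?(mirrored_bytes, published_digest)
  Digest::SHA256.hexdigest(mirrored_bytes) == published_digest
end

original  = "pretend these are the bytes of rails-3.1.3.gem"
published = Digest::SHA256.hexdigest(original)  # recorded when the gem was pushed

puts matches_published?(original, published)                # true
puts matches_published?(original + " tampered", published)  # false
```

The check is only as trustworthy as the place the published digest comes from, which is exactly the certificate authority problem Nick raises.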
CHUCK: Yeah, that’s interesting.
JOSH: I know there’s more to say about mirroring but there’s a couple other topics that I want to throw out there. One is gem building tools. I know that Jeweler was all the rage for a while and now Bundler has built in support for building gems. And the gem command line tool, you can push stuff. So, what’s the lowest friction recommended way of building gems and putting them up on RubyGems.org?
NICK: Oh, everyone has their own opinions with this. [Laughs]
JOSH: Yeah, but you’re on the show so give us yours.
NICK: Alright. I typically, I usually use Bundler now. Sorry, Josh, if you’re listening. But it’s just easier and it’s installed already usually. So, I’ll just throw a Bundler, is it new or init, I don’t even remember what it is. So, I typically just do that and it works okay.
I also sometimes I’ll build the gem specs by hand but I don’t think that’s a common case. I don’t think most people know how to build a gem spec by hand. So, I think encouraging the tools is a good thing. Yeah, I will say in a talk when I’m teaching RubyGems like, “Here’s a gem spec. It’s a simple little DSL to help you make a gem.” But I’m not going to make you remember it. We all learn an equation right and hope the teacher lets you get the cheat sheet out and not have to remember it. So, I’d rather people use the kind of automation tools and not have to worry.
There’s more interesting problems to solve, right? I’d rather you solve the problems that your gem has, like if you’re going to hook up an API or make my life easier, write another test framework or something. Worry less about the thing that makes the gem and just write it and share it.
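For readers who have not seen one, this is roughly what the "simple little DSL" of a gem spec looks like. It is a minimal sketch: every value is a placeholder for a hypothetical gem called `example_gem`, and a real gem spec usually sets more fields than this.

```ruby
require "rubygems"

# All values below are placeholders for a hypothetical gem.
spec = Gem::Specification.new do |s|
  s.name    = "example_gem"
  s.version = "0.1.0"
  s.summary = "A placeholder gem used to show the gemspec DSL"
  s.authors = ["Example Author"]
  s.files   = ["lib/example_gem.rb"]
end

puts spec.full_name  # example_gem-0.1.0
```

Normally this block lives in a file such as `example_gem.gemspec`, and the `gem build` and `gem push` commands mentioned later in the episode take it from there.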
CHUCK: So, I’m curious what the rest of the panel is using. I’m pretty sure all the rest of us have written gems as well.
JOSH: Yeah, I use Bundler. It’s really simple and I will just use that gem spec and hand tweak it myself. I often don’t like using git to figure out what files should go in the manifest. I’ll do that some other way.
JAMES: It’s actually been a recent discussion and so, I’ll confess that I’m guilty of it, using git to throw stuff in there. And I never really thought about why that was bad until I listened to Ruby Rogues while I was on vacation.
JOSH: Well, I have to confess, I do it all the time. And I think probably most of my gems that are out there, the couple of them that are out there, use the git-ls, or git files or whatever the command is to enumerate the files in the gem. But I have hand built that a couple times and I’m fairly ambivalent about it although I try and avoid that shelling out.
NICK: Why would you not do that? I don’t care either way I’m just curious as to why?
JAMES: So, Aaron actually discussed this on the show. That he’s been working to make Rails start up faster. And he went into looking, why is Rails slow starting up, basically. And what happened was just running Rake and Fireman in Rails was taking a ton of time and he was trying to figure out, “What’s it doing?” Because at that point, it hasn’t even loaded Rails, so why was it so slow? And one of the reasons is that everybody is doing that git trick to get the list of files in their gem spec, so in order to just load Rake and stuff like that, the gems, we were shelling out tons of times to get these lists of files and stuff.
NICK: That shouldn’t be happening. Maybe that’s happening with his local install, but that kind of change doesn’t get thrown up on the server, right? When you build a gem…
JOSH: No it doesn’t. It’s just the load time for Rails that I guess the gem spec format, it’s just running Ruby. So, it’s shelling out as part of doing the Ruby.
NICK: But as a general PSA though, you can’t have runnable things in a gem spec, like you can’t throw shell commands because that would stink, right? Install a gem and remove your own record?
JOSH: So load times aside, one of the reasons that I don’t shell out and do the git thing is that if I’m doing development and I’m not necessarily doing a git commit every time that I’m trying to regenerate a new version of the gem and just doing like very short iteration stuff and making a lot of changes as I go and refactoring, et cetera, it’s just a pain in the ass to have to do all of the git commits to have the git-ls command find the right files to package in the gem for me to test. So, it’s easier to do it with just a Dir glob or some manual thing.
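The difference Josh is describing can be shown with a small sketch. The git approach typically shells out to `git ls-files` from inside the gem spec; the alternative below uses `Dir.glob`, which sees uncommitted files and never leaves Ruby. The `gem_files` helper, the directory layout, and the glob pattern are assumptions made for the example.

```ruby
require "tmpdir"
require "fileutils"

# Hypothetical helper: list the files to package using a pure-Ruby glob.
def gem_files(root)
  Dir.chdir(root) { Dir.glob("{lib/**/*.rb,README.md}").sort }
end

# Build a throwaway project layout so the example is self-contained.
listing = Dir.mktmpdir do |root|
  FileUtils.mkdir_p(File.join(root, "lib", "example_gem"))
  FileUtils.touch(File.join(root, "lib", "example_gem.rb"))
  FileUtils.touch(File.join(root, "lib", "example_gem", "version.rb"))
  FileUtils.touch(File.join(root, "README.md"))
  gem_files(root)
end

puts listing
```

The trade-off is that a glob will also pick up stray files that match the pattern, whereas the git listing only includes what is tracked.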
CHUCK: I’ve always just done it manually but I haven’t written any gems that require more than a handful of files. I also have to admit that the tool that I usually use to build my gems is some form of a sophisticated copy paste algorithm.
JAMES: [Chuckles] That’s me.
CHUCK: No, I’m serious. When I write a new gem, I copy the gem spec out of one of my old gems and I put it in the new one and just rewrite part of it.
JAMES: Yeah. I do the exact same thing. I go like, “Oh, let’s go grab this gem spec from FasterCSV.” And then, I copy it and then I clean it up.
CHUCK: And then I gem build, gem push. And yeah, that’s pretty much it.
JAMES: I will say that I’m against a hand kept manifest file. I did actually do that once and really regretted it because I would miss little files and then the test wouldn’t run or something like that.
NICK: I’ve definitely done that too.
JAMES: Like I’m fine with keeping a manifest file or something like that but if you are going to keep a manifest file, then at least make a Rake task that does the [inaudible] or the git-ls or whatever it is to spit it out. That’s my opinion.
CHUCK: Well, if you’re going to go that way, why don’t we start a fight over which fields are valid in a gem spec? Didn’t we have that last year?
JAMES: Yeah, which fields are valid? That was actually part of Jeremy’s talk. When he went through all the stats, he talked about what the gems had and what they didn’t have. Which ones were extensions? Some of them have neat requirements like requiring Python or things like that. He gave all those statistics and then there has been talk about, which you’re probably referring to, Chuck, about making arbitrary metadata available.
CHUCK: I have to mention, James, I’ve never heard of a Python gem.
AVDI: I have used Bones a lot in the past which is kind of a skeleton builder for gems. It has some fun little features like if you have some common fields that you usually fill in the same way, you can save a name skeleton, or just like a default skeleton and it will always use that in the future. So, it will just go ahead and insert your usual information.
But something I just heard about is something called RakeGem which a friend of mine used to gemify a project I’ve been working on. And it says here, it’s not a library, it’s just a few simple file templates that you can copy into your projects. There you go, copy and paste, and easily customize to match your specific needs. And this is from Tom Preston-Werner.
JAMES: So, we probably need to go to picks pretty soon, Nick, but I definitely have one other question. Gemcutter is a great project for people to get involved in and work with. So, what would you like to see done other than mirroring and stuff which we’ve definitely talked about? What things would you like to see added?
NICK: One of the things that has been requested for a while and somebody might be working on it, I haven’t seen it yet is change logs. So, people put change logs in their gems but they don’t really go anywhere. And you only get them really when you’re like, “Oh, God! What did this guy change?” Then you look at it and hopefully, they filled it out. But I think if the site had, not so much the site but if we had a separate service to parse those change logs and show them in a nice way, and it could be opinionated, I think, in a way that we can’t. I don’t think we should dictate what the change log format is. But if some other guy wants to, more power to him. So, I think that would be fun, like a little change log service, so we could actually stay accountable for things.
I would love more people to use the API and they are. I can look up how many are active right now. So, we have a web hook API that you can get an HTTP POST sent to you every time a gem is pushed. And then, you can do whatever you want with it. You can download it, generate stats on it, run tests. There’s a project that does that right now. It’s at Test.RubyGems.org. I think they’re looking for help as well. So, more projects to interact with our API and mess around with it would be awesome.
I think that’s pretty much it. I’m really happy with where we are right now. Obviously, there’s a lot of bugs and stuff that we need to clean up and a lot of infrastructure problems. But I think future-wise, I don’t think we should really grow anymore. I feel like we should instead let the community really build out things and basically expose all the data that we have. And see what other people can do with it, instead of continually bolting on more crap to the site. Let’s keep it where it is and let’s make sure the API is awesome, so others in the community can build on top of it.
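To give a feel for what consuming the web hook might look like, here is a minimal Ruby sketch of the receiving side. It only covers parsing the body of the HTTP POST. The `handle_gem_push` method is hypothetical, and the `name` and `version` fields in the payload are assumed for illustration rather than taken from the actual API documentation.

```ruby
require "json"

# Hypothetical handler for the body of a web hook POST.
# The "name" and "version" fields are assumed for illustration.
def handle_gem_push(request_body)
  payload = JSON.parse(request_body)
  "#{payload.fetch("name")} #{payload.fetch("version")} was pushed"
end

sample_body = JSON.generate("name" => "example_gem", "version" => "0.1.0")
puts handle_gem_push(sample_body)  # example_gem 0.1.0 was pushed
```

A real receiver would sit behind a small web app, the kind of Sinatra app mentioned earlier in the episode, and would then download the gem, gather stats, or run tests as Nick suggests.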
CHUCK: Alright. Well, thanks Nick. I’m going to go ahead and cut this off and we’ll go into the picks. Let’s have James start us off since he hasn’t been around for a while.
JAMES: This is my punishment for not being here, is that what you’re saying?
JOSH: Yeah. Next time, you’ll show up.
JAMES: Yeah, geez. While I was on vacation, I did a couple of things. So, all of my recommendations will be based on that. One of the things I had to do that was actually semi Ruby related, I needed a very simple IRC bot that did a few things. And I just wanted to whip one up real quick and so I went looking at the various Ruby IRC frameworks. I was surprised at how difficult it actually was to find one that just quickly allowed me to throw together a bot. So many of them wanted to do a Rails-y kind of thing, “Here, let me generate your project with these 50 files you’ll need to modify.” And it was like, that was so not what I was interested in. I wanted something I could just post and something like that.
I did find it though, it’s called Cinch. And it’s a great little bot framework in Ruby. It’s stupid simple to get started with and I was able to whip up an IRC bot very quickly. It did exactly what I wanted. It has kind of a rich plugins environment so you can add on things like identifying with NickServ and stuff like that. So, it’s really great to work with. If you want to mess with IRC and do it in a low ceremony sort of way, then Cinch is really great for that.
The other thing I did while I was on vacation was just to play a ridiculous amount of games because that’s how I unwind and have fun, which is important. We should all get away from the keyboard once in a while. There’s kind of a wild world out there and we should all remember to go play, I think, however you do it. But if you’re like me and you like games, I’ll tell you what I was playing. I played some Skyrim which I think pretty much the whole world is playing right now. It’s a great game. It’s a really deep role playing game. I actually spent eight hours one day just reading books in the game. It’s really surprising how deep the world is and stuff and you just sit there reading these old books. It’s pretty cool stuff. So, I recommend checking out Skyrim.
The other game that we were playing over the break is called Catherine. It’s on the PS3 and XBox, I believe. Skyrim is on the same and also on PC. But Catherine is kind of unusual. I would call it kind of an adult anime platform puzzler. That’s probably about the best description of it. It’s surprisingly good, very old school. And its game style, it’s just brutally punishing levels that kill you over and over again. So, if that brings back fond memories for you, I know currently it’s not very popular in the game world but I’m old and I enjoy it.
And like I said, the story is pretty adult. So, it doesn’t really talk down to you. It’s kind of neat how it all comes together. So, I would recommend checking out Catherine as well. Okay that’s it. I’m done.
CHUCK: Alright, Avdi.
AVDI: So, I have a Ruby pick. It’s one of those libraries that I just always use in every project and I forget that maybe not everyone knows it exists. It’s the Addressable URI gem and it’s basically just a better replacement for Ruby’s built in URI library, and it does a lot of things. There’s a bunch of things that it handles just better, like it has fewer bugs in how it parses and represents URLs and URIs. It also has some interesting added features like support for URI templates which are kind of neat. So, Addressable URI.
And your booze pick for today is Johnnie Walker Double Black, another of my Holiday gifts from my wonderful wife. And if you’ve ever had Johnnie Walker Black, it’s a sort of a smokier take on Johnnie Walker Black Scotch. And if you like that sort of thing, it’s pretty good.
CHUCK: Charles Max Wood Black is a smokier take on me.
JOSH: I want to see that outfit.
CHUCK: Yeah, we’re not going there today. Okay, Josh.
JOSH: Oh, man! So, I kind of got nothing. I’ve been looking around trying to find this thing that I know exists but I haven’t had a chance to use yet. And I’m sure that Nick knows what it is because it’s that notification service that lets you know when a gem that you’re dependent on is updated.
JOSH: Gemnasium. Thank you. It’s really hard to Google that. My Google skills failed this morning. So, that’s all I got. [Chuckles] It’s been that kind of week for me. I’ll do better next week.
CHUCK: Alright, I’ll jump in next. One of my picks is, I’m trying to think of the best way to describe it but when I was working at Public Engines and I worked with Dave there, one of our coworkers came in and he had this Screaming Flying Monkey. And you put your fingers in the little holes in his hands and then you pull him back. And you let go and he goes across the room [monkey sound]. It’s really funny. Anyway, so I was watching GeekBeat.TV and they were shooting them at each other and it just reminded me how fun those things are. So, that’s one of my picks.
My other pick this week, it’s been kind of a weird week. And so, I really haven’t gotten a ton of code done. So, it’s not a code pick. The other pick is The Office, the television show, which is freaking hilarious. Now, I haven’t gotten to any of the episodes where Michael Scott isn’t the boss there, because I understand Steve Carell left the show. I’ve been watching it on Netflix. But I have to say that show, if you’ve ever worked in any kind of office environment with any kind of extreme personality, then this show will make you laugh out loud because all of the characters embody the different extreme personalities that you’d get in an office and it’s really just a scream. Anyway, those are my picks and we’ll let Nick pick a couple for us.
NICK: Sure. I think I’ve got one and I started playing with it yesterday. It’s this site called Showoff.io and it basically is really cool. It lets you share a server you’ve got running on your local machine out on the web. It’s kind of mind blowing. What I needed to use it for yesterday, I needed to test something on my iPhone and I didn’t want to hook up all the junk with my router and whatnot. And this site just lets you do it. You install the gem and you just do show, and then either the host name, either locally or a port. And it will throw that on a website for you externally so you can hit it from your phone or wherever you need it. It just kind of blew my mind how easy that was and they now have a new customer from me. So, it was really cool and I definitely suggest it.
CHUCK: Alright. Well, if that’s it, then we’ll go ahead and end the show. If you want to get more episodes, you can find them at RubyRogues.com. Or you can also find us in iTunes and all of the episodes are still up there, or at least they should be. We also, I keep having people now telling me, “We’ve left a review on iTunes.” We really, really appreciate that. It helps the show move up, it helps more people find it. And hopefully, we can help a few more people become better developers through what we’re sharing.
JOSH: I think we should get more forceful about that and say that if we don’t get 50 new ratings by next week, we’re not doing an episode.
JAMES: That is awesome.
CHUCK: That would ruin my week. I love doing this show.
JAMES: We’re blackmailing our listeners now?
JOSH: Yeah, right.
CHUCK: I was watching Vimcasts and I guess the guy that did that, he said, “Unless I get a hundred dollar sponsorship,” or something, “I’m not going to release the next episode.” And so, two guys ponied up 50 bucks each and he released it.
JOSH: Nice. Okay, unless we have a corporate sponsorship by the end of the month, we’re done. No.
CHUCK: I will be working on that anyway. Anyway, but really appreciate that. If you do work for a company that might want to sponsor Ruby Rogues, have them contact me, Chuck@TeachMeToCode.com.
JAMES: Yeah. But if we don’t get T-shirts, we’re not doing it.