[This episode is sponsored by Hired.com. Every week on Hired they run an auction where over a thousand tech companies in San Francisco, New York, and L.A. bid on Ruby developers, providing them with salary and equity upfront. The average Ruby developer gets 5 to 15 introductory offers and an average salary offer of $130,000 a year. Users can either accept an offer and go right into interviewing with a company or deny them without any continuing obligations. It’s totally free for users. And when you’re hired, they give you a $1,000 signing bonus as a thank you for using them. But if you use the Ruby Rogues link, you’ll get $2,000 instead. Finally, if you’re not looking for a job but know someone who is, you can refer them to Hired and get a $1,337 bonus if they accept the job. Go sign up at Hired.com/RubyRogues.]
[Snap is a hosted CI and continuous delivery service that is simple and intuitive. Snap’s deployment pipelines deliver fast feedback and can push healthy builds to multiple environments automatically or on demand. Snap integrates deeply with GitHub and has great support for different languages, data stores, and testing frameworks. Snap deploys your application to cloud services like Heroku, DigitalOcean, AWS, and many more. Try Snap for free. Sign up at SnapCI.com/RubyRogues.]
[This episode is sponsored by DigitalOcean. DigitalOcean is the provider I use to host all of my creations. All the shows are hosted there along with any other projects I come up with. Their user interface is simple and easy to use. Their support is excellent. And their VPS’s are backed on solid-state drives and are fast and responsive. Check them out at DigitalOcean.com. If you use the code RubyRogues, you’ll get a $10 credit.]
CHUCK: Hey everybody and welcome to episode 252 of the Ruby Rogues Podcast. This week on our panel we have Coraline Ada Ehmke.
CORALINE: Hey, everybody.
CHUCK: Pete Hodgson. I guess you’re the guest but anyway, I’ll introduce you anyway.
PETE: Ah, I’m the guest. I guess you’re right. Hello, everybody.
CHUCK: I’m Charles Max Wood from DevChat.tv. Quick shout-out about… well actually I think it’s too late for Ruby Remote Conf. But anyway, keep an eye out for the other conferences at AllRemoteConfs.com. Pete, do you want to introduce yourself? I know you’ve been on the show before and you’ve done several shows over at the iPhreaks Show. But it’s been a while.
PETE: Hello, my name is Pete Hodgson. I am a consultant with a consulting company called ThoughtWorks, also known as that company where Martin Fowler works. We do have other members at ThoughtWorks besides him. And yeah, I guess I describe my job as helping my clients deliver software and helping my clients get better at delivering software. So, part of what I do is kind of [inaudible] software but also part of what I do is advising clients on good engineering practices or agile practices, architectural stuff to help them get better at building software.
CHUCK: Awesome. Now, I just want to let our listeners know we did have a little bit of a technical snafu where the first half of our original recording of this episode got messed up. Avdi was on that call. Coraline’s on this call. So, it should be interesting. And I guess I’m just a little bit more well-informed, since I was on both.
CHUCK: But anyway…
CORALINE: You’re cheating, Chuck.
CHUCK: I am.
CHUCK: Well, and I’ve had conversations similar to this. We had Neal Ford on the iPhreaks Show and he talked about trunk-based development and good development practices there and made a case for feature toggles, which is what we’re talking about today. So, and I think that’s a good place to start. Pete, do you want to explain why or where feature toggles come in and especially what the benefits are of trunk-based development versus long-lived branches?
PETE: Yeah, sure. So, maybe a good place to start would be some history. So, back in the day maybe 10 or 15 years ago companies like Flickr and then later Etsy really made a name for themselves in these practices that we now sometimes refer to as continuous delivery or continuous deployment. And they were doing crazy things like deploying their production applications once or twice a day which at the time just sounded bonkers. Nowadays it’s a lot more standard but they were really blazing the trail in terms of continuously delivering their codebase into production at least once a day. I think Flickr were quite well-known for saying that… I think it was Flickr, or maybe this is Etsy, but I think it was Etsy maybe, said that one of their guiding principles was someone should be able to join the company and on their first day at work commit production code, like make a production change to the codebase on their first day at work. So, these companies were really pushing the boundaries and now these practices have become more widespread.
And one of the ways that both Flickr and Etsy and lots of other organizations now were achieving this ability to release to production very, very frequently was a technique called feature toggles. And feature toggles which are also sometimes referred to as feature flags, feature bits, feature flippers, the basic idea is to be able to decouple deployment of your software from release of functionality. And we achieve that by shipping latent code into production. So, we have code that is in our codebase and maybe is being tested but isn’t actually turned on for production, for users in our production environment. And we can choose to flip that feature off or on based not on a code change necessarily but on a configuration change.
And this allows us to do what you were talking about, Chuck, this idea of trunk-based development where we don’t create long-lived feature branches for work in progress. We instead essentially hide that work in progress or mask that work in progress behind a feature toggle which allows us to do all of that work on the same branch, on master or trunk, whatever you want to call it. And that allows us to avoid merge hell that you get from long-lived feature branches.
CHUCK: It’s funny that you talk about that. I actually have an experience with merge hell and I dealt with it on a weekly basis. I was working about 10 or 15 hours for a client that had a team that was running ahead with their application. I was working on the next generation of their application but it was still enough with the one or two people that were working on that to where they were consistently changing the system out from under me while I was working on that long-lived feature branch. So, I’d work on it my 10 or 15 hours, the two of them would put in 20 or 30 hours each. And what I was doing was actually pulling in reporting and building graphs and things like that with D3.js.
And I’d go merge it and inevitably my graphs would be broken because the data had moved or changed. It had changed shape to the point where I had to rejigger D3 to pull the data out again. And that would be next week’s work and then things would have changed again. And needless to say, the client wasn’t super happy and I wasn’t super happy. And anyway, it was just really interesting. If I had been working on trunk then I think, well I think what would have happened was I would have merged my changes in and then they would have done their work, run the tests, seen that my stuff was broken, and then realize that something else was dependent on what they were doing. And it wouldn’t have been an issue. But yeah, it was week after week after week. It doesn’t even take a long time.
What you usually hear when you hear merge hell is two or three people went off and they did their own thing for two or three months. They made big changes to the system and then they came back to merge it back to master and master had moved either because another big branch similar to theirs had been merged in or because master had just advanced due to other work being committed. And anyway, so then they got merge hell and then you’ve got to figure out, “What do we keep from the two of these?” and all of that stuff.
PETE: Right. And I think…
CORALINE: So, we’re talking about not having long-lived branches but how short-lived is a branch when you’re using this feature toggle approach?
PETE: So, that’s a good question. I guess I don’t have a… I can say from my kind of sense more than anything else, anything that has been on a branch for more than a few days, like a handful of days, maybe two or three days, is probably getting to the point that I would get uncomfortable that we’re diverging. Because any of that work isn’t being integrated with other people’s work and that normally makes me pretty uncomfortable. As with everything there are exceptions to that rule perhaps. But to me if it’s longer than about two or three, if a branch has been around for more than two or three days I start to get nervous.
CORALINE: Why not just constantly rebase?
PETE: Ah, so that’s a great question, because that is often what people say is like, oh well if I’m constantly rebasing or merging in from master then I’m getting everyone else’s changes. That is true if everyone is working on master. But if you have two long-lived feature branches, let’s say we’ve got Sally’s working one feature branch and Kiran’s working on another feature branch and they’re both pulling in from master, they’re pulling in changes that are happening on master but Sally’s not seeing Kiran’s changes and Kiran’s not seeing Sally’s changes.
PETE: And then you get of course, Sally hears that Kiran’s been on this branch for two weeks and Sally makes a mental note after standup to make sure that she merges before Kiran so that [inaudible]
CHUCK: That’s exactly what I was thinking. “I’m merging first, dang it.”
PETE: Right. I think Martin Fowler on his website has a pretty good, it’s either a bliki article, I think it’s a bliki post on this idea of trunk-based development. And he has a pretty good diagram that shows that visually, this idea of first merge wins. And this meme of, “Oh, if I’m rebasing from master then I’m still doing continuous integration,” is a really, I hear it a lot. And it really, it kind of frustrates me because I… well, it doesn’t frustrate me but it makes me a bit sad because I feel like a lot of people read into that, that they really can have these long-lived branches and that they’re still doing continuous integration. You’re not doing CI. If you’re not integrating with a shared branch on a very regular basis, ideally [daily], then you are just not doing CI. You’re still building software. That’s fine. But don’t kid yourselves that you’re doing CI if you’re not all integrating with a shared branch on a very regular basis.
CHUCK: So, how do feature toggles actually save your bacon here, then? So, you get in, everybody’s pushing everything to master, and we’re talking Git. I guess we’re assuming Git instead of SVN or something. But so everybody’s pushing to the master or the trunk branch. And you have stuff that you don’t want to release so you put a feature toggle around it. Is that kind of the idea? It’s just essentially a fancy name for an if statement?
PETE: [Chuckles] Yes. Actually, it’s kind of funny. So, I wrote this big article about feature toggles and Martin Fowler was nice enough to host it on his website so obviously it got a lot of visibility. And some of the comments that I saw on like Hacker News or reddit or something was basically that. Like, “Geez, these consultants. They write 200 lines or 200 words to describe an if statement,” was I think one of them.
PETE: So, I think…
CORALINE: [Inaudible] 200 words to describe [to a nerd] why it’s [needed]. So…
PETE: It is fundamentally. So, the way that I advise people to start down this path is to not even do an if statement, just to have commented out blocks of code. And I think that’s kind of how some people start down this path without even really thinking about it, is they’ll have some work in progress but they want to merge some other changes. They want to integrate their changes so they’ll just comment out the stuff that they’re working on or they’ll say ‘if true’ and then ‘if false’ and comment out one of those lines. And that’s conceptually sure, that’s a type of feature toggle. And it gets a lot more sophisticated.
CHUCK: It’s a universal feature toggle, right? Because it’s a comment.
PETE: Right, right.
CORALINE: But the point of what you said earlier is that the feature being enabled or disabled is a matter of configuration. So, having something in the code like that isn’t really a configurable option. So, how do you make that configurable?
PETE: So, we could talk about that. But I would actually… so, I’ll push back on that a little bit in that I still think… so, let’s say that we change from commenting out a line and saying ‘if true’, ‘if false’ and basically hard-coding it. Let’s say we change from that to reading from a YAML file and then saying, if this property is there then it’s true otherwise it’s false. Now if I as part of my deployment pipeline take that YAML file in my code and package it into some artifact that essentially can be running in an environment, either way whether I’ve hard-coded the feature toggle configuration or if I’m reading it from a YAML file I would argue that conceptually that’s actually pretty similar. And the difference… there’s a difference in the implementation and it’s more, it’s clearly more manageable to use something like a configuration [inaudible] for advanced use. But I would say in terms of [inaudible] changes are flowing through a deployment pipeline, they’re actually kind of the same thing, having [inaudible]…
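As a minimal sketch of the YAML approach Pete describes here: the `StaticFeatures` class, the flag names, and the idea of a `config/features.yml` file are all assumptions made for illustration, and the YAML is inlined as a string so the example runs on its own. A flag that is missing from the file is treated as off.

```ruby
require "yaml"

# Hypothetical contents of a config/features.yml file that would be packaged
# into the deployed artifact. The flag names are made up for illustration.
FEATURES_CONFIG = <<~CONFIG
  recommendations_panel: true
  new_checkout_flow: false
CONFIG

# Reads release toggles once at startup; a flag missing from the file is off.
class StaticFeatures
  def initialize(yaml_text)
    @flags = YAML.safe_load(yaml_text) || {}
  end

  def enabled?(name)
    @flags[name.to_s] == true
  end
end

features = StaticFeatures.new(FEATURES_CONFIG)

# The toggle point is still just an if statement.
if features.enabled?(:recommendations_panel)
  puts "render recommendations panel"
else
  puts "skip recommendations panel"
end
```

Because the file travels through the deployment pipeline with the code, flipping a flag this way is still a deploy, which is the similarity to hard-coding that Pete points out.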
CORALINE: Yeah, that’s not actually what I was thinking about. I was thinking about a database table with features.
CORALINE: With a value, with key/value pairs that would say this feature is enabled or not enabled. So, that would really allow you to make that switch without [another] deploy.
PETE: Yeah. So, that’s often the path people walk down, is to start off with something hard-coded and then we move to a configuration file. Quite often what I see at a lot of places is that configuration file gets more sophisticated and there’s some ability to layer, have a base configuration and then overlay some environment-specific configuration. Maybe we could get back to this, but I sometimes see this being taken a little bit too far and it’s actually really hard for anyone to understand what the current configuration is in an environment.
But yeah, and then often people will move past that to some kind of dynamic runtime configuration, often read from an application database. Usually it’s like, “Oh, we’ve already got this database that we’re reading the users and the accounts and stuff from. So, we’ll just add an extra table for features.” And then sometimes it gets taken as far as using a special purpose runtime system [with] a data store and such. And quite often that’s one of these fancy new key/value stores like etcd or ZooKeeper or Consul or something like that.
CHUCK: I’m still…
PETE: But it gets quite expensive to… well not expensive but as soon as you move to that runtime configuration of these toggles, you introduce a lot more flexibility but you also introduce some challenges in terms of making sure that all of the nodes in a cluster for example are configured the same way, and also in terms of how do you detect that the configuration has changed and what do you do in response to that change? Do you need to [bounce] the process and restart your Rails app or can you reconfigure it on the fly? All of that kind of stuff can actually, there’s some subtlety in there that can get people the first time they work on this kind of stuff.
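The runtime variant Coraline and Pete discuss can be sketched roughly as follows. `FeatureStore` is a hypothetical in-memory stand-in for a `features` table or a key/value store, used only so the example runs without a database; none of these names come from the episode.

```ruby
# Stand-in for a `features` table of key/value rows. A plain Hash keeps the
# sketch runnable; a real app would query its database or a key/value store.
class FeatureStore
  def initialize(rows = {})
    @rows = rows
  end

  def get(key)
    @rows[key]
  end

  def set(key, value)
    @rows[key] = value
  end
end

# Looks the flag up on every call, so a change takes effect without a deploy.
class RuntimeFeatures
  def initialize(store)
    @store = store
  end

  def enabled?(name)
    @store.get(name.to_s) == true
  end
end

store = FeatureStore.new("recommendations_panel" => false)
features = RuntimeFeatures.new(store)

before = features.enabled?(:recommendations_panel)
store.set("recommendations_panel", true) # e.g. flipped from an admin screen
after = features.enabled?(:recommendations_panel)

puts "before=#{before} after=#{after}"
```

Reading the store on every call sidesteps the restart question inside one process, but it does nothing about keeping several nodes consistent with each other, which is the harder part Pete is warning about.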
CHUCK: I’m still thinking about that basic case of feature toggle where it’s just like if this condition. So for example, if feature enabled or if they’re in the beta user group. Let’s say that you’re only…
CHUCK: You’re toggling it on for a certain group but not other groups or if they signed up before a specific date. How do you keep track of those? Because I can see it becoming sort of unwieldy where you have a whole bunch of ‘if feature dot enabled’ in there or ‘if this feature is enabled or if that feature is enabled or people have signed up here or there’, right? So, how do you differentiate the if statements that are actually logical deviations in your code from the ones that are actual feature toggles that turn things on and off for people?
PETE: So, the way that I advise people to approach this is to try and decouple the decision point, like where that actual conditional statement or whatever you’re using to make the decision of which code path to go down, decouple that from the reason for choosing one way or another. So essentially, keep distinct the toggle point from the decision logic, so the reason that’s motivating you to do something. And so often, a good way of doing that is to introduce some kind of class or objects like oftentimes it will be a singleton that you can ask all of these toggling questions and that class encapsulates the reason behind the routing decisions.
So, let’s say for example you have, you’re working on an e-commerce system and you have a recommendations panel that you’re adding to the home page to show the user recommended products. Now there’s probably going to be a few different motivations behind showing or hiding that panel. So, you could be not showing it because the user’s not logged in. Or you could not be showing it because it’s still a feature that’s in development and it’s not ready to be released to production but you still want to have it in your codebase. Or you could not be showing that recommendations panel because you’re undergoing heavy load and operation staff want to disable recommendations because it introduces a bunch of extra load on your backend systems. Or perhaps you’re doing an A/B test and you want to show the recommendations panel to one cohort of users and not show it to another cohort and see if it affects their behavior in a positive or negative way. So, all of those are different decision or different reasons for making a decision. But you’re still always saying, “Should I show or hide the recommendations panel?”
So, what I would advise in most cases is to have a little, a toggle router class. Usually I see this being [called] features. And you can just say ‘features dot show recommendation panel question mark’. Inside of that method is all of the potentially different types of decisions like is the user logged in, are they in the right cohort, are we undergoing heavy load? All of that logic can be encapsulated inside of your toggle router and at the toggle point where that if statement is, you just have ‘if features dot show recommendations panel’. So, you can clearly understand where all of the toggle points are in your code but you’ve not got the logic behind those routing decisions smeared throughout the codebase, which happens a lot. It actually happens a lot in Rails apps. I see in a lot of Rails app in the rendering code somewhere, ‘if user dot something’. And really to me, that’s a violation of single responsibility because you’ve got both the decision, both the switching between two code paths and the reason for that all mixed in, in your rendering code. So, [let’s say]…
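A rough sketch of that toggle router, using the recommendations panel example. The `User` fields, the constructor arguments, and the order of the checks are assumptions; Pete mentions a singleton, whereas this version builds one router per request so the user can be passed in.

```ruby
# Hypothetical request context; the field names are assumptions.
User = Struct.new(:logged_in, :cohort, keyword_init: true)

# Toggle router: the one place that knows *why* a feature is on or off.
class Features
  def initialize(user:, release_flags:, heavy_load:)
    @user = user
    @release_flags = release_flags
    @heavy_load = heavy_load
  end

  def show_recommendations_panel?
    # Still in development and not released yet.
    return false unless @release_flags.fetch(:recommendations_panel, false)
    # Operations switched it off to shed backend load.
    return false if @heavy_load
    # Anonymous visitors never see it.
    return false unless @user.logged_in

    # A/B test: only one cohort gets the panel.
    @user.cohort == :recommendations
  end
end

# Toggle point: asks the question without knowing the reasons behind it.
def home_page_sections(features)
  sections = ["header", "product grid"]
  sections << "recommendations panel" if features.show_recommendations_panel?
  sections
end

user = User.new(logged_in: true, cohort: :recommendations)
features = Features.new(user: user,
                        release_flags: { recommendations_panel: true },
                        heavy_load: false)

puts home_page_sections(features).inspect
```

The rendering code only ever contains `if features.show_recommendations_panel?`, so adding or removing a reason means editing one method rather than hunting through views.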
CORALINE: I’ve seen that [inaudible] as well. I think Rails does a pretty rotten job of enforcing [SRP]. Active Record itself violates SRP because it is both persistence and a [container] for business logic. So, as soon as you make your first Active Record class you’re already on a slippery slope. But I love the idea of having a single feature class that [handles] the routing for all the different things. That’s what I was hoping you were going to say.
CORALINE: [Inaudible] to hear that. One thing we touched on, Chuck touched on a little bit talking about beta users and you just touched on as well is A/B testing. So, how does feature toggling interact with A/B testing functionality in an application?
PETE: In the article that I wrote I took a very broad view of what a feature toggle is and I included things like A/B testing as a type of feature toggle. Now if you’d have asked me a year ago or if Pete from a year ago had heard Pete today saying that, Pete from a year ago would be quite annoyed at Pete from today because it didn’t really fit into my definition of feature toggle. And I used to get quite frustrated when people would lump all this stuff together. But I guess I’ve come to the conclusion that because people often get all of this stuff mixed up and they often use similar code to either manage these toggles or to do the routing decisions, it makes sense to conceptually lump them all together and then look at different types of toggles differently and manage them differently.
So for me, a routing decision for a multivariate test like an A/B test, you can still make that routing decision based on, you can still use the same kind of feature class or toggle router to make both a routing decision based on A/B testing but also a routing decision based on let’s say permissions. So, if the user is a premium user or is part of an administration group then give them extra capabilities, we can still use the same kind of patterns to manage which code paths we go down. They’re just the underlying reasons behind that decision [inaudible].
CORALINE: That makes a lot of sense. I can understand why Pete from a year ago would take exception to that because the problem you’re trying to solve was a problem of continuous delivery. So…
CORALINE: Feature flags for continuous delivery are one thing and feature flags for A/B tests are another thing. But when you think about the conceptual solutions, yeah it’s a [class] [inaudible] does this thing show up or not?
CHUCK: One thing that…
PETE: And the other thing… the other thing that I’ve seen happen lots of times is a product team or a delivery team will start using feature toggles for that continuous delivery reason to manage releases and decouple feature release from deployment. And then the product manager or the operations folks that that team is working with get whiff of this awesome capability and they start saying, “Huh, maybe we could use these feature toggles to do A/B testing.”
PETE: Maybe we could use… and the tech leader, in this case the tech leader is perhaps me, starts cringing and getting really frustrated because it’s like, “No, that’s not what they’re for. They’re not for that kind of stuff.” But I think you’ve got to embrace that this is a really useful capability and say, “Okay. This is clearly an attractive thing for not just the developers on our team but also the product managers on our team, the folks running operations. So, let’s figure out how we can give them some of these capabilities but use them in a different way or exposing it in a different way so that we don’t get everything all kind of mixed up together and used in an inappropriate context.”
CHUCK: So, do you split those up?
CORALINE: Novel application of [inaudible] anyway. So, [inaudible].
CHUCK: So, I’m just wondering. Do you split those up then? So, you have a features class and then you have an A/B test class?
PETE: So, I would say you really want to encapsulate that. You want to hide that detail from the toggle point. So, that if statement or however you’re making that routing, implementing that kind of routing between different code paths, it shouldn’t really care. I don’t care whether I’m hiding the recommendations panel because it’s an A/B testing thing or because it isn’t ready for production yet or what. Just tell me whether to show it or hide it. So, abstract that away from the toggle point. But then under the covers in that implementation you’re probably going to be pulling from different sources to decide, to make that routing decision.
So for example, release toggles, so these things that we’re using to decide whether to show or hide a half-finished feature, I would argue that they’re best implemented very statically. So, I wouldn’t want to be able to manage those at runtime because I think that just adds extra complexity to your deployment pipeline. On the other hand, something like a permission toggle or a multivariate testing, an experiment toggle is what I call them, so for like A/B testing, that has to be a runtime decision because it has to take into account the context of which user is making a web request, if it’s a web app. So, you need to implement these things differently and under the covers you’re going to be composing together decisions coming from several different places. But in the code that’s making the actual routing through the different code paths, I think that you want to hide that complexity from the rest of your application.
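One way to picture that composition, as a sketch under stated assumptions: release toggles live in a frozen constant fixed at build time, while the experiment toggle is computed per request. The `Experiment` and `FeatureRouter` names are hypothetical, and bucketing users by a CRC32 hash of their id is just one simple way to get a stable cohort, not something described in the episode.

```ruby
require "zlib"

# Release toggles: fixed at build time and frozen, so nothing flips them
# while the app is running.
RELEASE_FLAGS = { recommendations_panel: true }.freeze

# Experiment toggle: decided per request from the user making it.
class Experiment
  def initialize(name, percent_enabled)
    @name = name
    @percent_enabled = percent_enabled
  end

  # Hashing the user id gives each user a stable bucket from 0 to 99,
  # so the same user always lands in the same cohort.
  def includes?(user_id)
    Zlib.crc32("#{@name}:#{user_id}") % 100 < @percent_enabled
  end
end

# The router composes both sources; callers only see a yes or no.
class FeatureRouter
  def initialize(user_id, experiments)
    @user_id = user_id
    @experiments = experiments
  end

  def show_recommendations_panel?
    RELEASE_FLAGS.fetch(:recommendations_panel, false) &&
      @experiments.fetch(:recommendations_panel).includes?(@user_id)
  end
end

experiments = { recommendations_panel: Experiment.new("recommendations_panel", 50) }
router = FeatureRouter.new(42, experiments)

puts "user 42 sees panel: #{router.show_recommendations_panel?}"
```

The toggle point cannot tell which of the two sources said no, which is the encapsulation Pete is arguing for.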
CHUCK: Okay. So, then I get to pick on one of my favorite gems to pick on and that is the CanCan gem, which does authorization.
CHUCK: And I’ve used it and I’ve had a class that was 500 or a thousand lines long because there were so many different ways of doing permissions correct. So, the issues that I see here is that if you have them all in that feature class or that toggle manager class, how do you keep it from becoming so long that you don’t know which toggles do what?
PETE: So, the one thing I’d say in general is any team that’s using feature toggles should be trying as hard as they can to use as few of them as possible and to retire them as soon as possible. That’s the number one stumbling block I see with teams using these [inaudible]…
CHUCK: Yeah, but authorization toggles are probably going to be longer-lived.
PETE: Yeah, yeah. So, there’s some stuff that you can’t get away from. It needs to be there. I mean, I don’t know. I think you use the same techniques that you use with anything where you want the business logic to be clear, right? You spend… this is one of the things that I, one of the frustrations I have in general is when people think of this kind of stuff as it’s like, it’s just authorization or it’s just configuration so I won’t actually do a good job of making it readable code. It’s still code in your system. It’s actually code that you probably have to modify quite a lot. And in my opinion, the code that you modify a lot is the code that you should be taking extra time over to make sure it’s coherent. So…
CORALINE: What about a [centered refactoring] practice applied to that feature code? So, if you see for example, one of the smells for me is if I have namespaced methods. Like if it’s recommendation engine show and recommendation engine such and such.
CORALINE: I’ll say, “Hey, I probably have a [inaudible].” So, there’s no reason why you couldn’t create feature modules under a feature namespace and keep the code divided up nicely so you don’t end up with 5,000-line kitchen sink of feature information.
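A small sketch of the split Coraline suggests, with hypothetical module and method names: instead of one class full of methods such as `recommendation_engine_show?`, each feature area gets its own module under a shared `Features` namespace.

```ruby
# Hypothetical user record; the fields are assumptions for the example.
User = Struct.new(:logged_in, :premium, keyword_init: true)

module Features
  # Everything about recommendations lives here, rather than in a pile of
  # recommendation_engine_* methods on one large class.
  module Recommendations
    def self.show_panel?(user)
      user.logged_in
    end

    def self.show_email_digest?(user)
      user.logged_in && user.premium
    end
  end

  # A second feature area, kept separate so neither module grows unbounded.
  module Checkout
    def self.one_click?(user)
      user.premium
    end
  end
end

user = User.new(logged_in: true, premium: false)

puts Features::Recommendations.show_panel?(user)
puts Features::Checkout.one_click?(user)
```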
PETE: Yeah. I really like it. That’s an awesome smell. I’ve never heard of that one before but that’s actually [chuckles]…
CORALINE: I have a…
PETE: That’s actually a really good one.
CORALINE: I have a gem called snuffle that looks for… it’s basically like do you have an [object] [inaudible] and one of the smells [inaudible] is namespaced methods. Those namespaced methods and data [clumps].
PETE: That’s awesome.
CHUCK: So then, let’s go back to the idea of having as few feature toggles as possible. So, if you have a team, a large team or a large number of teams working on the same application how do you keep track of all of the different toggles? Do you know when you can get rid of one?
PETE: So, a few techniques that I like, the toggles that intentionally [inaudible] we’re expecting to be short-lived. So, a release toggle is the most common one of those where we just put it in place while we’re working on a feature and once the feature is ready for production and tested, we want to get rid of that toggle. For those kinds of toggles and there’s others that fit into that short-lived category, there’s a couple of techniques.
One is when you create the toggle, if you’re doing some kind of agile, write a story to remove the toggle and put it onto the backlog. Someone may still continuously deprioritize that story, so that’s not a silver bullet. But that will at least help you remember that it’s to be done. I’ve seen teams have some success with actually putting time bombs in their toggles: timestamping each new toggle, like when was this toggle introduced into the codebase, putting that into the configuration file, and then having some code where the app will crash at launch, will refuse to launch, if it detects a feature toggle that’s older than two months or something.
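The time bomb idea could look something like this sketch. The toggle names, the `introduced_on` field, and the 60-day limit (standing in for "two months or something") are all assumptions; a real app would call the check during boot and let the error stop the launch rather than rescuing it as the demo does.

```ruby
require "date"

MAX_TOGGLE_AGE_DAYS = 60 # assumed stand-in for "two months or something"

# Hypothetical toggle configuration, each entry stamped with the date the
# toggle was introduced into the codebase.
TOGGLES = {
  recommendations_panel: { enabled: false, introduced_on: Date.new(2016, 1, 4) },
  new_checkout_flow: { enabled: true, introduced_on: Date.new(2016, 3, 15) }
}.freeze

class StaleToggleError < StandardError; end

# Names of toggles that have outlived the allowed age as of `today`.
def stale_toggles(toggles, today)
  toggles.select { |_name, toggle| (today - toggle[:introduced_on]).to_i > MAX_TOGGLE_AGE_DAYS }.keys
end

# Intended to run at application start: refuses to continue if any toggle
# is past its expiry.
def check_toggles!(toggles, today: Date.today)
  stale = stale_toggles(toggles, today)
  return if stale.empty?

  raise StaleToggleError, "Remove stale feature toggles: #{stale.join(', ')}"
end

# Demo with a fixed date so the result does not depend on when it is run.
begin
  check_toggles!(TOGGLES, today: Date.new(2016, 4, 1))
rescue StaleToggleError => e
  puts e.message
end
```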
CORALINE: [You get that with a failed test, too].
PETE: Yeah, yeah, yeah. Just putting a test around that is another way of doing it. I’ve seen some dysfunctional [inaudible]…
CHUCK: I like that better.
PETE: That have failing tests that they don’t care about.
PETE: They will definitely care if the app refuses to launch.
CORALINE: That’s true.
CHUCK: Yeah, but then that’s a process issue, not a…
CHUCK: Not a feature toggle issue.
CHUCK: I like the idea of putting a failing test in just because then hopefully your CI will catch it and warn you before you try and… you know, and at least prompt you to go refactor before you deploy.
PETE: Yeah. And the other…
CORALINE: And you’ll know why.
CORALINE: You’ll know that it’s like, “Oh, [inaudible].”
CHUCK: Well, you should just put a big piece of ASCII art in the error.
PETE: Do it.
CORALINE: Don’t all of your tests have ASCII art in them, Chuck?
CHUCK: I’m going to start doing that. Darth Vader.
CHUCK: I find your lack of refactoring disturbing.
PETE: So, I once wrote a Cucumber plugin that would use the say command on OS X to…
PETE: To narrate BDD tests. If you look up ‘cuke puke’, I made…
PETE: I recorded a video of doing some iOS testing with it. It was the most annoying thing. I managed to get my team to the point that they wanted to throw me out of the room within about two minutes.
CORALINE: I wrote a gem called ambient-spec which has the opposite effect. It’s an RSpec formatter that plays ambient music to the tune of your specs.
CORALINE: And if there’s a test failure there’s this gong sound. But it’s all very gentle. It’s like very soothing. So, if you have a long-running test suite that keeps you entertained [inaudible].
PETE: That’s so cool. You know, I remember reading this article from… it was the very start of my career so it was probably about 15 years ago. These operations teams that had wired up various metrics like CPU usage and memory and stuff like that to different ambient noises. So, like a babbling brook indicated, how loud the river was indicated the CPU usage and how many birds were chirping was like the number of thread context switches or something like that.
CHUCK: [Chuckles] Cool.
PETE: You just have it always running in the background and it really leverages the pattern-matching in human brains, because you notice when things sound different than usual, right?
PETE: So, if you suddenly start hearing more bird chirps than usual, people start maybe noticing and wondering what’s going on with the system. And I’m actually a really big fan of that idea. I’ve never actually seen it really rolled out. I just read this article about it a long time ago. And actually, I would love to see a team that’s doing that, that has like a sound in the background of their system in production.
CORALINE: Try the [inaudible] out and see if [inaudible].
PETE: Yeah. That’s awesome. That’s very cool.
CHUCK: Speaking of gems, are there gems that people have written that allow you to do feature toggles?
PETE: Yes, there are. I have one that myself and another ThoughtWorker open sourced that’s not really a general purpose one but it’s a very simple gem that’s focused primarily on, I don’t know, a very minimal implementation. I’m trying to remember what we called it. Rack-flags, rack-flags.
CORALINE: Does it basically give you a DSL? Or…
PETE: All it does is it lets you, it manages… like a very simple way of just reading a feature flag configuration from a YAML file, displaying those flags, and then having a little admin UI where you can go to this admin UI and see what the state of the flags is in the current environment and override them. And when you override… and this is generally for QAs or for devs who are working on a release toggle that’s not ready to go into production. What you do is when you override them it just shoves a special cookie into the browser. And then every time you hit the Rack application, so a Rails app or any other kind of Rack-based application, it will sniff that cookie and do a little jiggery-pokery to change what the state of that flag is for the context of that request.
So, it was a very simplistic… well, not simplistic. But it was a very straightforward way that we used for a specific client. I think that that client is still using it in production as it were. It was pretty lightweight, not super-powerful but good enough for us to be able to ship latent code into production and allow that to be overridden. And you can even override it in a production environment. So, you could do testing in production if you want, if you have access to the special admin page that will set the cookies for you.
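[Editor's note: the cookie override described above can be sketched roughly as follows. This is a minimal illustration and not the rack-flags API: the class name, the cookie name `flag_overrides`, the `flag:on` cookie format, and the `x.feature_flags` env key are all assumptions made for the example. It parses the cookie header by hand so it runs without the rack gem.]

```ruby
require "yaml"

# Defaults would normally live in a YAML file; inlined here so the sketch runs as-is.
DEFAULT_FLAGS = YAML.safe_load(<<~CONFIG)
  new_forgot_password: false
  recommendations_panel: true
CONFIG

# Hypothetical Rack-style middleware: merges per-request cookie overrides
# over the configured defaults and exposes the result in the env.
class FlagOverrides
  COOKIE_NAME = "flag_overrides" # assumed name, not taken from rack-flags

  def initialize(app, defaults)
    @app = app
    @defaults = defaults
  end

  def call(env)
    overrides = overrides_from(env["HTTP_COOKIE"].to_s)
    env["x.feature_flags"] = @defaults.merge(overrides)
    @app.call(env)
  end

  private

  # Assumed cookie value format: "flag_a:on,flag_b:off".
  # Unknown flag names are ignored.
  def overrides_from(cookie_header)
    pair = cookie_header.split(/;\s*/).find { |c| c.start_with?("#{COOKIE_NAME}=") }
    return {} unless pair

    pair.split("=", 2).last.split(",").each_with_object({}) do |entry, acc|
      name, state = entry.split(":", 2)
      acc[name] = (state == "on") if @defaults.key?(name)
    end
  end
end

# A tiny downstream app that branches on the flag state for this request.
app = lambda do |env|
  body = env["x.feature_flags"]["new_forgot_password"] ? "new flow" : "old flow"
  [200, { "content-type" => "text/plain" }, [body]]
end

stack = FlagOverrides.new(app, DEFAULT_FLAGS)

# Simplified env hashes, not full Rack environments.
plain      = stack.call({})
overridden = stack.call("HTTP_COOKIE" => "session=abc; flag_overrides=new_forgot_password:on")
```

[The override only applies to requests carrying the cookie, so a QA or developer can see the new behavior in an environment where the configured default stays off for everyone else.]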
So, that’s one that I’m aware of because I built it. There are other ones out there. To be honest, there’s something about these kinds of systems that people always end up building their own bespoke ones. I don’t know why. It’s just the [certain size] problem or something. But it seems like…
CHUCK: I thought it through my head and I think after about five minutes of thinking through all the scenarios I completely over-engineered it in my head.
PETE: Yeah. Well, that’s the thing that you see, that I see a lot as well, is it’s a really nice problem to… engineers want to over-engineer it. There’s something about it that people start thinking like, “Oh, we can have a general purpose system and it can read from a configuration file and then we can overlay per environment configs and maybe we’ll do like a runtime override as well.” My advice in general is: wait until you actually need, like really need all of that functionality. Start simple and respect the YAGNI principle basically.
CHUCK: I had 10 different toggles that were all implemented differently and stuff in my head and I was like, “No, that’s probably too much.”
CORALINE: What are the implications for testing?
PETE: Yeah, that’s a great question. So, when you first introduce this idea to a team, anyone who is focused on testing will probably freak out a little bit at the idea of, now I have to test with every toggle off and on and every combination or [inaudible] combinatoric version of these toggles is the only way I can verify the code that could be in production. That’s not… it’s not quite as bad as that because these toggles tend to not interact with each other. So, you don’t really need to test every combination. Let’s say you’ve got a forgot password, a toggle that exposes a new forgot password functionality and a toggle that manages your recommendations panel. You don’t really need to test all of the combinations of those being on and off because they’re not going to interfere with each other.
But these toggles do introduce an extra burden on testing because you do have to verify the behavior that the system works as expected both with the toggle on or with the toggle off, if you’re assuming that that toggle is going to be turned on at some point soon in production. If it’s still early days then you don’t necessarily have to totally fully test the system with that toggle on, if you’re not planning on turning it on in production any time soon. But you do want to figure out ways to test your system with toggles off or on. And you need to figure out ways to temporarily override those toggles in the context of a test so you can do that verification. So, there is an overhead there.
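[Editor's note: one way to temporarily override a toggle in the context of a test is a block-scoped override that always restores the previous state. The `Toggles` class, its `with` method, and the message strings below are hypothetical names for illustration, not part of any particular gem.]

```ruby
# Hypothetical in-memory toggle store with a block-scoped override for tests.
class Toggles
  def initialize(defaults)
    @state = defaults.dup
  end

  def on?(name)
    @state.fetch(name)
  end

  # Force a toggle to a value for the duration of the block, then restore it,
  # even if the block raises.
  def with(name, value)
    previous = @state.fetch(name) # raises KeyError for unknown toggles
    begin
      @state[name] = value
      yield
    ensure
      @state[name] = previous
    end
  end
end

TOGGLES = Toggles.new(new_forgot_password: false)

# Code under test: behaves differently depending on the toggle.
def forgot_password_message(toggles)
  if toggles.on?(:new_forgot_password)
    "Check your email for a reset link"
  else
    "Contact support to reset your password"
  end
end

# Exercise the same code path with the toggle on and with it off.
results = {}
[true, false].each do |state|
  TOGGLES.with(:new_forgot_password, state) do
    results[state] = forgot_password_message(TOGGLES)
  end
end
```

[Because unrelated toggles rarely interact, running each toggle on and off like this is usually enough, without testing every combination.]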
CHUCK: You said that they don’t usually interact. So, what about the cases where you do have a toggle inside of a toggle or you have a toggle inside of a toggle inside of a toggle inside of a toggle, et cetera, et cetera.
PETE: Well, hopefully you don’t have a toggle inside of a toggle inside [chuckles]. There are some cases I suppose where you have the kind of nesting of features. And in that case in some ways, it’s not really… again it’s not… you don’t have to test every combination because if one toggle is inside of the condition of another one, then if the outer one is turned off then the inner one is never going to get exercised. But you do sometimes have these features that interact with each other.
What I’ve observed is in general teams will be smart and they won’t try and work on two interacting interfering bits of functionality at the same time. Because it’s an ineffective way to build out software, exactly the same as if you were, let’s say you’re not doing trunk-based development with feature toggles and instead you’re using feature branches. You’re unlikely to sign the team up for two parallel streams of work that are working on the same area of the codebase. Because that’s going to cause a bunch of confusion and merge conflicts and stepping on each other’s toes. So, that’s the same with feature toggles as it is with feature branches, I think.
CHUCK: I just had a Ghostbusters moment. “Don’t cross the streams.”
CORALINE: Back off man, I’m a scientist.
CORALINE: So, how do you introduce the idea of feature toggles to your development team that is used to feature branches?
PETE: Yeah, it’s a… you’re not just introducing feature toggles. You’re really… the main thing you’re introducing in that context is the idea of trunk-based development. And feature toggles is one of the tools that you’re using. So, I guess the way I start getting feature toggles into a codebase is again I try as much as possible to respect YAGNI. And I will literally start the first feature toggle.
I actually just did this with a team a couple of weeks ago. We realized that we had some work in progress that we wanted to hide. And it was too big for us to just have a two or three day branch. So, we decided we needed feature toggles and we hadn’t set it up yet. And I literally just put in an if statement that was with a… it was like, ‘is dollar sign turn feature on’ and that was just a global variable that we set at the very top of the application boot code. And that was how we got started. And we rolled that out and then we immediately came back to it and started making it a bit more sophisticated. We introduced this toggle router. But the implementation of the toggle router was just a hard-coded return true or return false. And the configuration was just to change that return false to a return true when you wanted to test this thing out.
And that’s how I recommend teams get started. Just do the simplest thing you need, but be aware that you will… your needs will become more sophisticated very soon and you just need to be willing to keep going back to that toggle router to your feature toggling infrastructure and upgrade it.
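[Editor's note: the two starting points Pete describes might look something like this. The global variable follows his spoken description; the `ToggleRouter` class, its method name, and the "checkout" feature are assumptions for the sake of the example.]

```ruby
# Step 1: a global set at the very top of the application boot code.
$turn_feature_on = false

def checkout_with_global
  if $turn_feature_on
    "new checkout"
  else
    "old checkout"
  end
end

# Step 2: the same decision moved behind a toggle router whose
# implementation is still just a hard-coded return value.
class ToggleRouter
  def new_checkout_enabled?
    false # change to true when you want to try the feature out
  end
end

def checkout_with_router(router)
  router.new_checkout_enabled? ? "new checkout" : "old checkout"
end
```

[The second step changes nothing about behavior, but it gives the team one place to upgrade later, for example to read from a configuration file, without touching the call sites.]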
CORALINE: What kind of push-back do you typically get?
PETE: It’s more work. It feels like more work than feature branches. Because feature branching, the cost is hidden. And there’s not a direct obvious correlation that people see between them working on feature branches and the pain of their merge hell. They believe that that’s just a cost of doing business. But when you start asking people to wrap conditionals around their code and actually support their software being able to do either being able to run with the feature off or with the feature on, that feels like additional overhead that a team doesn’t want to do. And so, that’s the most common push-back I get, is like, “Oh, look how much more complicated that codebase has become,” because you’ve got all of these if statements. And my response to that is, “Yes, but you haven’t had to deal with the merge conflicts since you started doing this and that’s a huge win.”
And also, there’s some discussion at some point about moving beyond just conditionals. So, I think teams will start by just sprinkling if statements through the code and it does start to feel painful. Again, this is software. It’s not… we can use the same principles we use for abstracting over business decisions and use them for our toggling decisions as well. So, we can use things like the strategy pattern and common interfaces with different implementations to allow us to do that toggling without having to have a ton of if statements all over the codebase.
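[Editor's note: a minimal sketch of the strategy-pattern approach, assuming a hypothetical recommendations feature. All class and method names here are invented for illustration. The toggle is consulted once, where the object is built, and the rest of the code only sees a common interface.]

```ruby
# Two implementations behind one interface (items_for).
class LegacyRecommendations
  def items_for(_user)
    ["bestsellers"]
  end
end

class PersonalizedRecommendations
  def items_for(user)
    ["picked for #{user}"]
  end
end

# The only toggle conditional lives here, at construction time.
def build_recommendations(toggle_on)
  toggle_on ? PersonalizedRecommendations.new : LegacyRecommendations.new
end

# The panel knows nothing about toggles; it just uses whatever strategy it is given.
class RecommendationsPanel
  def initialize(strategy)
    @strategy = strategy
  end

  def render(user)
    @strategy.items_for(user).join(", ")
  end
end

old_panel = RecommendationsPanel.new(build_recommendations(false))
new_panel = RecommendationsPanel.new(build_recommendations(true))
```

[When the toggle is retired, removing it means deleting one implementation and one conditional rather than hunting for if statements across the codebase.]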
CORALINE: How do you manage code reviews without feature branches?
PETE: Ah, so that’s interesting. So, I kind of have a quick what’s the correlation? I think feature branches… I am absolutely fine with a short-lived… I’m absolutely fine with that GitHub style approach of short feature branch with a pull request. The pull request triggers a code review before a merge. That’s fine with me. That’s totally compatible with trunk-based development if those feature branches are short-lived. This is I think one of the big disconnects that the two rival camps have. You have folks on the trunk-based development side that are talking about feature branches being bad and you have people who use GitHub-based or the GitHub style workflow of short-lived story branches plus a pull request or code review. And they feel like they’re in disagreement. I actually think that anyone who does trunk-based development would never say, “Oh, you can never have any branches ever.” What they’re saying is, “Don’t have long-lived branches.” So, you can still use that same practice.
The thing that I think people don’t get is before GitHub we were doing the equivalent [or so], before Git, back when more people were using tools like SVN, we were doing the equivalent of short-lived feature branches all the time. We just did it by having our local copy that we weren’t committing, right? Like it used to be, you would work on something on your machine for two or three days. And then when it was ready to get integrated with the rest of the codebase, you would commit it into version control. Now what we do is we commit into our local version control and maybe we work on a branch or maybe the branch is just the fact that we haven’t pushed up to a remote. It’s still conceptually the same as what we used to do. It’s just that we’ve got better tooling to support us. But I still don’t think that that kind of workflow is at all incompatible with the ideas behind trunk-based development. The key thing is the integrating on a very regular basis.
CORALINE: That makes sense. Your talk about SVN took me back to the dark days when I was doing .NET development. We had something called Visual Source Safe.
CORALINE: Where you actually had to check out a file and only one person could have a file checked out at a time.
PETE: My first…
CORALINE: That successfully avoided merge conflicts.
PETE: My first job was… part of the many things I did at my first job was managing Visual Source Safe for the team. Microsoft’s recommended best practice was to defragment it every week because it would get itself all up in a tizzy.
PETE: [Inaudible] that was possibly the worst Microsoft tool I’ve ever used. [Inaudible]
CORALINE: And kids today complain about Git and I’m like, “You have no idea how good you have it.” [Chuckles]
PETE: Yeah. Yeah, try what’s… Visual Source Safe. What’s the one…? Oh, ClearCase. If you’ve ever used ClearCase then you would think SVN is amazing.
CHUCK: Alright, anything else we should hit before we get to the picks?
PETE: So, I think the thing that I would say to anyone who’s doing this or thinking about doing this, the most important single piece of advice I would have is work really hard to keep the number of toggles low. And work really hard to not let long-lived toggles affect the quality, the internal quality of your codebase. I think if you can do those two things, then you will see a lot of the benefits that feature toggling gives you without getting bitten. When I see teams struggling with this, it’s usually because they’ve allowed the number of toggles to grow exponentially. Or they just grow out of control. Or they’ve not paid enough care and attention to how they’re actually implementing these toggles in their codebase. If you do those two things, I think you will end up a net positive from using this approach.
CORALINE: Very cool.
CHUCK: Well, let’s go ahead and get to some picks. Coraline, do you have some picks for us?
CORALINE: I have a couple of picks. The first one is a document called ‘Cryptic Ruby Global Variables and their Meanings’. It is a comprehensive list of Ruby Global Variables. There are a couple that I learned like ‘dollar sign colon’ which is a shortcut for load path. And ‘dollar sign zero’ which is the name of the program that’s currently being run. The author is a guy named Jim Neath. He knows tons of these cool variables and has put together this cheat sheet to help you out. So, if you know them and remember them, oh my gosh, [inaudible] includes like a bunch [inaudible] regexes that are [inaudible] interesting. So, knowing them can save you some [inaudible] when you’re trying to figure out details for your environment among other things. So, I’ll post a link to that cheat sheet in the show notes.
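[Editor's note: the two globals mentioned above can be checked directly in Ruby. This is a small illustration using only built-in aliases; the variable names on the left are made up for the example.]

```ruby
# `$:` is the same array object as $LOAD_PATH.
load_path_is_aliased = $:.equal?($LOAD_PATH)

# `$0` holds the name of the program being run; $PROGRAM_NAME is its readable alias.
program_name_is_aliased = ($0 == $PROGRAM_NAME)

# A common idiom built on $0: run code only when this file is executed directly.
running_as_script = (__FILE__ == $0)
```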
The second thing is a repo called RailsBridge Installfest. And I’m one of the organizers for a women’s hackathon. It’s taking place at the end of this month in Chicago. We’re hacking on social justice projects. And we’re going to have varied people there. They’re from different backgrounds and have different technology at their disposal. We are anticipating some Windows users for example. Not everyone can afford a Mac. So, the RailsBridge Installfest documentation gives you detailed setup information toward getting a development environment for Ruby on Rails up and running and includes specific examples for what to do for people who are running Windows. So, it’s great if you are putting together a class or a workshop or a hackathon. It’s a great guide to [inaudible] open sourced it and made it available to the world in general. So, that is my second pick.
CHUCK: Alright. I’m going to not pick. I’m actually going to just really briefly put a few things out there. I’m going to be traveling a bit. I think this episode goes out next week.
A few days later, on the 2nd and 3rd, I’m actually going to be in Las Vegas for about a week. One of my mastermind groups is doing our retreat there and then we’re going to all go to MicroConf which is a small business conference. Anyway, we’re going to do a meetup I think on the 2nd or 3rd. I’m still working out the details there. I’ve been to Vegas plenty of times so I kind of know where I want to do it. But anyway, we’ll get the details out on that, too. So, if you’re going to be in Las Vegas the first weekend in April then keep an eye out for stuff going on there.
Finally or semi-finally, in July I’m going to be in Chicago for Podcast Movement. And I’ve decided to stay an extra day and do a meetup on the 9th of July. So, if you’re in Chicago feel free to show up. I’m going to try and get people who live in that area that I know. Coraline lives somewhere in Illinois I think.
CORALINE: I’m in Chicago, yeah.
CHUCK: You’re in Chicago?
CHUCK: So, we’ll try and get Coraline to come. But yeah, I’ll be there and I’m just going to have a meetup and see who wants to come and hang out and eat some food. And at all of these I try and meet everybody who comes. I’ve had as many as 50 people RSVP. So, it just depends on how many people are there. But I will definitely be happy to meet you. The reason that I do this is just that it’s one thing to talk to people online. It’s another thing to have the 15-minute podcast listener calls that I do. And then it’s yet another thing to be able to meet people in person. And I really want to shake your hand, find out who you are, find out what you’re about. And it’s just a great way for me to do that. So, I really appreciate people coming out. I think it’s a really… if you want to do me a favor, come. [Chuckles]
Finally, there is a small chance that I will be in Nashville in November. There’s also a small chance that I’ll be in London in September. But I don’t have any firm plans because I’m waiting to hear back from people involved there as to whether or not I’ll actually be going. So, just keep an ear out for those. If you join the mailing list, go to RubyRogues.com, you’ll see a little thing slide down that gives you the top 10 Ruby Rogues episodes in your inbox. If you don’t want to do that, then just on the top of the page there’s actually another place you can sign up just to get the episodes in your inbox every week and that’s where I’ve been sending those emails. So, if you want to be informed I’m going to send out another email this week or next week and let people know where we’re going to be in San Francisco and Las Vegas.
So anyway, those are my sort of picks. And we’ll let Pete do some picks.
PETE: You’re picking every city in the world. You’re very [jet-setter]. I’m very [impressed]. My picks. So, pick number one is a new CI/CD tool called Concourse from the folks at Pivotal who are, it’s by the folks that are behind Cloud Foundry. And it’s a really interesting CI tool. It’s focused on builds in containers. So, every test or build or whatever, job that you’re running, is in an isolated container which means A, it’s isolated and won’t interfere with previous builds or get interfered with by previous builds. And B, it means that you can have this homogeneous pool of workers and you don’t need to have all of your builds queuing up waiting up for the one agent that has the right version of Selenium installed or whatever. It also, Concourse does a really good job of modeling your continuous delivery pipeline as a true pipeline, something that Jenkins does not do. Stop kidding yourself. Jenkins does not do this, even with that plugin. It still doesn’t really work. So, if you’re really focused on continuous delivery, this is a new tool that I think is worth taking a look at. It’s fairly young. But it’s really pretty interesting stuff.
My next pick is a technique called architectural decision records. And…
CORALINE: We do those. I love those. Those are amazing.
PETE: Okay. Cool, that’s cool. I’m glad to hear from someone who actually does them. Because I’ve heard about them. I just read about them the other day. And they seem really cool but I’ve never… I’ve not had a chance to experiment with them yet. So, I’m glad that you’re having some success. Maybe I’ll start advocating even more strongly [inaudible]. So, it’s this idea of writing, having just a very lo-fi simple documentation record of the big architectural decisions that you’re making in your codebase. Particularly helpful I suspect when you have new people joining the team and they want to revisit all of the old decisions that you made. So, I’ll post a link to a blog post that introduced this idea way back in 2011. And there’s also a tool by a very smart man called Nat Pryce. And he’s produced a tool to manage these flat file ADR records implemented using Bash for his [inaudible].
CORALINE: We did them… we have a Git repo of architecture decisions then we do, we [JDR] as a pull request [inaudible] the repo. And lots of people will ask questions and…
PETE: Oh, wow.
CORALINE: [Inaudible] code review practices to ask questions and get feedback.
PETE: Do you have anything…
CORALINE: That works pretty well for us.
CHUCK: I think we need an episode on this.
PETE: Do you have any public…
CORALINE: [Docs about it]?
PETE: Yeah, yeah. I think it needs a mini-episode or something. What I’d love is to have something that I could point people to, like a blog post that talks about how you’re doing it or like an example of the repo or something like that would be really interesting.
CORALINE: I don’t have anything like that. But that’s an excellent idea.
PETE: You’re welcome. I was just volunteering you to do something.
CORALINE: I appreciate that. Thank you.
CORALINE: I was just wondering: what should I do next?
PETE: [Laughs] Yeah, in your copious spare time.
CORALINE: I’m going to GitHub so I’ll [inaudible]. I’m going to create that, yeah. [There you go.]
CHUCK: I need a sound bite that goes “geek tangent”. Anyway.
PETE: My next pick is a non-pick, an un-pick, an anti-pick. I anti-pick hotel coffee. It is terrible. Those machines are the scourge of society and the coffee that they give you [to put in them] is even worse. I highly recommend bringing a bag of good coffee and a little portable grinder and an AeroPress with you if you are a regular hotel traveler. And in an act of shameless self-promotion, I have a blog post on my [inaudible] my coffee setup with a terrible link-bait-y title and I’m going to put that in the show notes. And I would encourage you to read it and then tell me that I’m wrong.
CORALINE: I was just in the UK and I had a machine in my hotel room called a Nespresso.
CORALINE: And I determined that ‘nes’ as in Nescafé and Nespresso actually means ‘not’. [Inaudible]
PETE: Yeah, like fake or pseudo.
PETE: [Chuckles] And then my last pick is a beer pick because I love picking beer. I’m going to pick Red Chair Northwest Pale Ale from Deschutes. I have… currently I’m having a love affair with Deschutes at the moment. They [inaudible] loads of really good beers. This is another one that’s really good. It’s a nice hoppy pale ale, pretty heavy on the malt, pretty toffee and dark and chewy caramel and those kinds of flavors. But it’s not over the top on alcohol or hops. So, I really… this is one of my go-to beers now. I’m really, really, really, really digging it. So, if you can get a hold of Deschutes where you live and you are a fan of beer then I recommend trying it out. And those are my picks.
CHUCK: Awesome. If people want to find out more about what we’ve talked about today or follow you Pete, what do they do?
PETE: If they want to find out more about feature toggles they should read all the words in my really long article about feature toggles. That’s at MartinFowler.com/articles/feature-toggles.html. Or just google feature toggles. It will be the first or second link probably by now. So yeah, that has more details on feature toggles.
If folks want to find out more about me, they can hit me up on Twitter. PH1 is my Twitter handle. And my blog is, or my website is ThePete.net, T-H-E-P-E-T-E dot net. And I really love talking about this stuff. I love talking about a lot of stuff, but feature toggles in particular is something [inaudible] something I’m pretty passionate about. So, I would love to hear from folks who are either having trouble with this stuff or having some success and want to share their successes or just want to tell me I’m an absolute idiot and that feature branch is the way forward. I welcome the opportunity to discuss this with you. So please, hit me up on Twitter or find me on GitHub or LinkedIn or whatever. And yeah, I’d love to talk about it.
CHUCK: Alright. Well, we’ll go ahead and wrap this up. Thanks for coming, Pete.
PETE: Thanks for having me.
CHUCK: Yeah. We’ll catch you all next week.
[Hosting and bandwidth provided by the Blue Box Group. Check them out at Bluebox.net.]
[Bandwidth for this segment is provided by CacheFly, the world’s fastest CDN. Deliver your content fast with CacheFly. Visit C-A-C-H-E-F-L-Y dot com to learn more.]
[Would you like to join a conversation with the Rogues and their guests? Want to support the show? We have a forum that allows you to join the conversation and support the show at the same time. You can sign up at RubyRogues.com/Parley.]