CHUCK: We love AWS! Yes!
[This episode is sponsored by Hired.com. Every week on Hired, they run an auction where over a thousand tech companies in San Francisco, New York, and L.A. bid on Ruby developers, providing them with salary and equity upfront. The average Ruby developer gets an average of 5 to 15 introductory offers and an average salary offer of $130,000 a year. Users can either accept an offer and go right into interviewing with the company or deny them without any continuing obligations. It’s totally free for users. And when you’re hired, they give you a $2,000 signing bonus as a thank you for using them. But if you use the Ruby Rogues link, you’ll get a $4,000 bonus instead. Finally, if you’re not looking for a job and know someone who is, you can refer them to Hired and get a $1,337 bonus if they accept the job. Go sign up at Hired.com/RubyRogues.]
[This episode is sponsored by Codeship. Codeship is a hosted continuous delivery service focusing on speed, security and customizability. You can set up continuous integration in a matter of seconds and automatically deploy when your tests have passed. Codeship supports your GitHub and Bitbucket projects. You can get started with Codeship’s free plan today. Should you decide to go with the premium plan, you can take 20% off any plan for the next three months by using the code RubyRogues.]
[Snap is a hosted CI and continuous delivery that is simple and intuitive. Snap’s deployment pipelines deliver fast feedback and can push healthy builds to multiple environments automatically or on demand. Snap integrates deeply with GitHub and has great support for different languages, data stores, and testing frameworks. Snap deploys your application to cloud services like Heroku, Digital Ocean, AWS, and many more. Try Snap for free. Sign up at SnapCI.com/RubyRogues.]
[This episode is sponsored by DigitalOcean. DigitalOcean is the provider I use to host all of my creations. All the shows are hosted there along with any other projects I come up with. Their user interface is simple and easy to use. Their support is excellent and their VPS’s are backed on Solid State Drives and are fast and responsive. Check them out at DigitalOcean.com. If you use the code RubyRogues, you’ll get a $10 credit.]
CHUCK: Hey everybody and welcome to episode 218 of the Ruby Rogues Podcast. This week on our panel, we have Jessica Kerr.
JESSICA: Good morning.
CHUCK: I’m Charles Max Wood from DevChat.tv. I just want to give a quick shout out. I got RailsClips launched so if you’re looking for videos on Ruby on Rails, I release one free one and one paid one every week. Just go to RailsClips.com and it’ll take you to the right place. We have two special guests this week. We have Alex Wood.
ALEX: Good morning.
CHUCK: We should tell people we’re brothers.
ALEX: Right, long lost relatives.
CHUCK: That’s right. We also have Trevor Rowe.
TREVOR: Hey there.
CHUCK: Do you want to introduce yourselves?
ALEX: Sure. So we work on the AWS SDK for Ruby here at Amazon Web Services. I’ve been with the team here for about two years and I just turned over the five-year mark here at Amazon overall. Before I came to AWS, I was working on inventory management algorithms. So if you’ve gone to Amazon.com and anything was out of stock, I might be partially to blame.
CHUCK: It’s all your fault. Trevor.
TREVOR: I’ve been at Amazon for just over four years now. I work on the AWS SDK for Ruby and a lot of other internal tools. We work on a really unique team in Amazon where we could do work on the software and get paid for it. So it’s been a really good ride.
CHUCK: So I have to say when somebody comes to me and they say, “We want to use AWS,” it sounds to me like they’re saying, “I want to do IT”. There are so many services. It does all kinds of different things. So telling me you want to use AWS just tells me you want to be in that Cloud.
Can you give us just quick overview of some of the things that are offered by AWS?
ALEX: I think one of the big concepts that AWS is trying to drive is that you shouldn’t have to spend your time putting a whole bunch of effort into something that isn’t actually differentiating your business or your product.
For example, if you’re writing a web application, do you want to spend weeks or months or years re-implementing queuing, which turns out to be a very complicated problem? There’s a lot that goes into it. Rather than trying to reinvent the wheel, you can use services that we’ve worked on and have dedicated teams working on. They have all the experience of working at scale and working on these problems for a long time. You can use the services we provide instead of having to reinvent the wheel. I think that’s where a lot of the breadth of offerings comes from.
TREVOR: Yes, we have, I think, about 50 public services today. Most of them are aimed at providing infrastructure as a service rather than a platform or software as a service. But we do have services in a lot of different areas. So odds are if you need something like email or file storage or help with testing or building software, there’s something there.
CHUCK: Yes, I think that’s the thing that’s interesting. It’s not Heroku which kind of says, “Here’s the package. Fit your app into it.” And it’s not DigitalOcean or Rackspace that says, “Here’s a server, go for it.” You provide all of the different things that you’re probably going to need to build an app in today’s development paradigm.
ALEX: Yes, there’s no particular golden path, I think. We do have, for example, Elastic Beanstalk, which was one of our first deployment services. It does aim to say, “Hey, I’d like to get started on AWS, and here’s a really simple way to do it that will handle for you a lot of the questions of management, scaling, and pushing new code.” It has built-in Rails support. But also, you have access individually.
TREVOR: With Beanstalk, you can actually push any Rack application.
ALEX: But it has some of those built-ins to help make things quick. And you also have direct access to all the machines you’re running on. You can start with simplicity and if you want more control all the way down to very fine levels of control, there’s different other services.
AWS OpsWorks is a good example that uses Chef, and the subject of my workshop at RailsConf this year was OpsWorks for Rails deployments. That also has a lot of the built-ins, kind of a particular golden path you can use to deploy Rails applications. But since all the source cookbooks are open source and it has a lot of hooks for adding your custom stuff, you have a lot more control to customize how your application is deployed and manage the details of things like auto scaling. There are a lot more options for what kind of database you want to bring. It’ll hook up to any Amazon Relational Database Service database you bring, for example, so you can switch your database back end rather easily.
JESSICA: A few weeks ago, we had Noah Gibbs on, and he remarked that in between Heroku, which is “your app better fit in our box,” and the enterprise-level platform and infrastructure automation often built on top of AWS, there’s a gulf: it’s really difficult to customize past Heroku. Like you said, there isn’t a golden path.
ALEX: I don’t think that it’s quite as wide of a gulf as that. For example, the workshop that I gave on OpsWorks, we had, I think, well over a hundred people go from zero to you have a Rails app deployed and functioning and you’ve done a couple of integrations. We had 90 clock minutes. I think, edited down, it was a good deal less than that.
And the nice thing is once you’ve got that starting side done, so I think we deployed the Rails application, set up some of the identity management parts, and started sending emails through Amazon’s Simple Email Service, for example.
Once you’ve done that, adding on things like caching your static assets, it’s just one incremental step. Setting up auto scaling is a couple of incremental steps. I think there is actually the ability to go with something that gives you a little bit more control and tinker and alter as you go.
TREVOR: I think that gulf is experienced by a lot of people though. I think it’s primarily because we sit here and we architect and we build our applications kind of in our desktop environments or whatever our development environment is. Then all of a sudden we have this big step of “okay, now I want to deploy it somewhere.” And there’s this path that’s pretty easy as long as the app fits inside of that box. But if I want to step outside that box, there are a lot of decisions to be made.
And that takes us out of wearing our developer hat, and now we’re putting on our DevOps hat. That seems to be almost necessary secondhand, extra knowledge that all of us need to learn and pick up if we want. It’s new, it’s different and whatnot, but there’s a lot to learn.
Alex mentioned a couple of technologies like identity management. That’s a service, Identity and Access Management, that AWS has. It revolves just around managing policies and secure access to your Cloud resources. Each one of those things kind of tends to be a learning curve no matter what service or solution you choose.
JESSICA: Amazon makes this possible. I mean, it makes it possible for me as a developer to set up things like identity and access management. But exactly, Trevor, it’s a learning curve. It’s that path from dev to DevOps. It’s totally possible. And there’s no freaking way I can do all of it and understand all of it and get all of it right.
TREVOR: It’s like when you first bought your first domain name and you’re like, “Oh, this is easy.” And then you’re like, “Oh, but now I want to set up my own email server.” Now it’s not so easy. You start thinking about spam filtering and MX records and all these other things. It’s like, “What did I get myself into?” It can feel a little bit overwhelming, but there are a lot of resources out there. AWS does have a lot of user guides and training materials, a lot of it free.
ALEX: And almost to the point of the feeling of the gulf and the works-on-my-machine approach, one other big advantage of the Cloud as a paradigm is that if you want to run the development stages of your code, where what you’re developing is not nearly ready for production but you want to see how it’s going to run on a real bona fide server that’s not your desktop, you can do that. You can spin up a whole cluster that’s all productionized, ready to go, deployed across multiple data centers, auto scaling, ready to take on a huge amount of load if it comes in, and make sure everything works. Then you can just take the whole thing down once you’re done testing.
JESSICA: [Laughing] Please pause, please pause.
JESSICA: Make sure everything works. Can we expand that?
ALEX: Oh, sure. In the sense of if you write your big monolithic application on your desktop, no matter what you’re deploying to – if you’re deploying to a server in your local data center, if you’re deploying to any service, there’s the possibility that things aren’t going to work exactly the way they worked on your desktop for any number of reasons. The most mundane that I ran into more than once is the Ruby version on your desktop is not exactly the one that you have set up on the server you deployed it to. Any number of reasons can have your production environment working differently than your desktop. It’s a pretty standard problem you run into.
The nice thing about something like a Cloud service is you can be running in a production like environment much earlier in the process. So you can do your development live in action in the Cloud. You can spin up your own little private cluster, test things out on the feature you’re working on. When you’re done for the day, you can spin it down and you’re not paying for your resources anymore.
JESSICA: Yes, which is awesome and AWS makes that possible and I think it’s amazing. And it introduces the possibility of verifying whether our software works with both our code and our deployment, because if you’re spinning all that up and shutting it down at the end of the day, you’re not doing that by hand. I hope.
So, my team right now at Monsanto is working on automating provisioning and deployment to VPCs, Amazon Virtual Private Clouds, using CloudFormation. What I’ve noticed about Amazon recently is that it’s got the very basic services: you’ve got RDS, the database; you’ve got S3, the storage; you’ve got EC2, the servers. On top of that, we’re getting layers of abstraction, which is making it easier for us to set this up correctly. CloudFormation is a great example of that. It’s a JSON template where you tell it what you want and how all the different pieces fit together, and then it figures out the order in which to spin this up.
TREVOR: Yes, CloudFormation is a really interesting technology. The idea is that you can build a template for what your infrastructure should look like and inject variables into that template that modify it. So when you set it up, you can say, “I just need a dev version” or “I need a full production version of this stack,” if you will. And they do deal with all the fun edge cases like eventual consistency, which is something you see with many services, where you need to poll and wait for resources to enter a state before you can do the next thing and whatnot. CloudFormation is just one of the many tools that we have for infrastructure management. And it’s growing rapidly in the number of services that it supports.
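[Editor’s note: a minimal sketch, not something shown on the episode, of the "one template, injected variables" idea Trevor describes. It builds a CloudFormation-style JSON document from a Ruby hash. The parameter name `EnvType`, the mapping `SizeByEnv`, the instance sizes, and the AMI ID are all illustrative assumptions.]

```ruby
require 'json'

# Build a CloudFormation-style template as a plain Ruby hash.
# EnvType, SizeByEnv, the instance sizes, and the AMI ID are assumptions
# made up for this sketch.
def build_template
  {
    'AWSTemplateFormatVersion' => '2010-09-09',
    'Description' => 'Sketch: one template, dev or prod sizing',
    'Parameters' => {
      'EnvType' => {
        'Type' => 'String',
        'Default' => 'dev',
        'AllowedValues' => %w[dev prod]
      }
    },
    'Mappings' => {
      'SizeByEnv' => {
        'dev'  => { 'InstanceType' => 't2.micro' },
        'prod' => { 'InstanceType' => 'm3.large' }
      }
    },
    'Resources' => {
      'AppServer' => {
        'Type' => 'AWS::EC2::Instance',
        'Properties' => {
          'ImageId' => 'ami-00000000', # placeholder, not a real AMI
          # Look up the instance size from the EnvType parameter.
          'InstanceType' => {
            'Fn::FindInMap' => ['SizeByEnv', { 'Ref' => 'EnvType' }, 'InstanceType']
          }
        }
      }
    }
  }
end

puts JSON.pretty_generate(build_template)
```

[The same document could then be launched once with `EnvType=dev` and once with `EnvType=prod`, which is the "personal copy of production" idea that comes up next.]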
ALEX: It also leads to some pretty cool functionality you can do because if you’ve defined your production environment in CloudFormation, that’s another example of something where you can copy that template and spin up your own personal devo copy, so to speak, of your entire production environment and test new features on the exact environment you’re going to be running on.
So it kind of enables some other cool paradigms that didn’t necessarily make sense in an era where you had physical data centers that you had to walk to every time you want to deploy something.
JESSICA: Totally. One of the things we’re struggling with right now as we’re creating our CloudFormation templates is how do we know it worked? How do we verify that the environment that we’ve spun up in Amazon has the properties we want?
CHUCK: Lots of clicking.
TREVOR: So that’s one area where I would look at automation. Lots of clicking is the first step. So we do ship an AWS SDK for Ruby. It’s the aws-sdk gem. The v2 SDK has a client for every AWS service and it allows you to query the state. We also have a unified CLI. It’s written in Python. It also provides similar coverage. We have tools in lots of different languages.
We’re on a Ruby podcast so I’m going to push the SDK that we work on, but you can get in there and set up tooling too. You can write tests about your deployment tools that can reason about the state of the environment after they’ve run. And that’s always an option. We like writing tests. It’s a Ruby necessity.
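[Editor’s note: a hedged sketch of a test that "reasons about the state of the environment." To run without credentials or the gem, it uses a hand-written stand-in instead of the SDK. The stand-in mirrors the shape the real `Aws::CloudFormation::Client#describe_stacks(stack_name:)` call is assumed to return (a response with `stacks`, each exposing `stack_status`). The stack names are hypothetical.]

```ruby
# Stand-ins for the SDK response objects, so the sketch is self-contained.
FakeStack = Struct.new(:stack_name, :stack_status)
FakeResponse = Struct.new(:stacks)

# Hypothetical fake client: answers describe_stacks from a canned hash.
class FakeCloudFormation
  def initialize(statuses)
    @statuses = statuses
  end

  def describe_stacks(stack_name:)
    FakeResponse.new([FakeStack.new(stack_name, @statuses.fetch(stack_name))])
  end
end

# Statuses treated as "the stack came up the way we wanted".
HEALTHY_STATUSES = %w[CREATE_COMPLETE UPDATE_COMPLETE].freeze

# Works with any client that responds to describe_stacks(stack_name:).
def stack_healthy?(client, stack_name)
  status = client.describe_stacks(stack_name: stack_name).stacks.first.stack_status
  HEALTHY_STATUSES.include?(status)
end

client = FakeCloudFormation.new({
  'dev-stack' => 'CREATE_COMPLETE',
  'broken-stack' => 'ROLLBACK_COMPLETE'
})

puts stack_healthy?(client, 'dev-stack')
puts stack_healthy?(client, 'broken-stack')
```

[Passing the client in as an argument is a design choice for the sketch: the same check can be pointed at a real SDK client in a deployment pipeline and at a fake in a unit test.]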
CHUCK: [Laughs] So I want to go and back up a little bit that it looks like you have different levels for different people. So you got the CloudFormation, VPC stuff for people that want to deploy their own Cloud and manage the way that it operates. On the other end of the spectrum, it seems like there are few people out there that just use like S3 and maybe CloudFront and SES.
JESSICA: What’s SES?
CHUCK: That’s the Simple Email Service.
TREVOR: So AWS has a lot of data-centric services, things like SES, which is Simple Email Service; SQS, Simple Queue Service; and SNS, Simple Notification Service. Notifications means things like sending push notifications, hitting a webhook, things like that. We have S3, which is the storage, and DynamoDB, which is a key-value store, a NoSQL data back end.
We have a lot of different services that are data centric and a lot of people will pick and choose. They’ll build their application. They’ll host it in-house. They don’t have these grand scaling needs. They may be a hobbyist and they only run on one small little server that wherever it runs, it doesn’t matter. They can connect to these data centric services. They fill a really good void because those are the hard things that you hit anytime you’re running on some other server other than your desktop. “Now, how do I persist these assets? How do I get reliable email transport?”
ALEX: And another thing you see is you might start piecemeal. So if you already have a large application running on your own servers, for example, maybe you start by doing some of your file backup in S3, or maybe you move your static assets out to S3 and then put a CDN like CloudFront in front of it. Maybe you set up a disaster recovery environment in AWS where, if your local servers fail, your DNS will fail over to a Cloud deployment you have set up for that occasion.
With a lot of larger customers, we’ll definitely see that they move over piecemeal and may eventually become fully in the Cloud. But a lot of those services also provide – like you can move over piece by piece. It doesn’t have to be all or nothing.
A lot of people scripted to see. You can upload functions in whatever language you’d like. But the idea is you can program your own webhooks. So somebody uploads a file to one of your buckets in S3 and then you can configure event [inaudible] so that you can then go generate thumbnails from that; or you can copy that file to another bucket for back up; or you can download that file to some other server for processing.
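[Editor’s note: a rough sketch of the "file lands in a bucket, a function reacts" flow described above. It assumes a Lambda runtime that can run Ruby and the standard S3 event notification layout (`Records`, then `s3`, then `bucket` and `object`). The `thumbnails/` naming scheme and the sample bucket are hypothetical, and no image processing is actually done.]

```ruby
# Lambda-style handler for an S3 "object created" notification.
# It only plans the work: for each uploaded object it reports where the
# source lives and where a thumbnail would be written.
def handler(event:, context: nil)
  event.fetch('Records', []).map do |record|
    bucket = record.dig('s3', 'bucket', 'name')
    key    = record.dig('s3', 'object', 'key')
    {
      source: "#{bucket}/#{key}",
      thumbnail_key: "thumbnails/#{File.basename(key)}" # hypothetical naming
    }
  end
end

# A trimmed-down sample event with only the fields the handler reads.
sample_event = {
  'Records' => [
    { 's3' => { 'bucket' => { 'name' => 'uploads' },
                'object' => { 'key' => 'photos/cat.jpg' } } }
  ]
}

p handler(event: sample_event)
```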
ALEX: There’s actually a lot of interesting Lambda cases. One that is interesting from the deployment perspective is – are you familiar with the Amazon EC2 Container Service?
CHUCK: Yes, I am.
ALEX: The short story there is if you use containers with Docker, it can be a great choice because you can scale your “docker run my Rails app” to hundreds of containers, and it gives you a lot of helpful abstractions. So you can say, “Launch this Docker instance. Give me 100 containers across my instances.”
And when you think about something like auto scaling for the ECS cluster, one way to do it is you can have alarms around the load on your containers. If you reach a certain threshold where you want to spin up more containers or spin up more instances, that usage alarm can kick off an SNS notification, which can then trigger an AWS Lambda script that basically says, “Okay, look at how many instances I have. Look at how much load I’m dealing with right now. Set the new size of my cluster.” It runs, and then the Lambda script is done and shuts down. And you have just scaled your cluster up or down depending on your load.
So that’s one way that people view services like Lambda: your little one-off management scripts are a thing that you can use there. And you don’t have to run dedicated servers just to be a control plane for everything else and then worry about having fault tolerance for your control plane, which is trying to handle fault tolerance for the rest of your application.
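[Editor’s note: a minimal sketch of the decision such a one-off scaling script might make: look at the load, compute a new cluster size, stay inside bounds. The per-instance target and the minimum and maximum sizes are assumptions chosen for illustration. Actually applying the new size through an AWS API is left out.]

```ruby
# Decide a new cluster size from the current load.
# target_per_instance, min, and max are illustrative assumptions.
def desired_cluster_size(current_load:, target_per_instance:, min: 2, max: 200)
  needed = (current_load.to_f / target_per_instance).ceil
  needed.clamp(min, max)
end

# Turn the decision into a small plan the script could act on.
def scaling_plan(current_size:, current_load:, target_per_instance:)
  desired = desired_cluster_size(current_load: current_load,
                                 target_per_instance: target_per_instance)
  action = if desired > current_size
             :scale_up
           elsif desired < current_size
             :scale_down
           else
             :hold
           end
  { action: action, desired_size: desired }
end

p scaling_plan(current_size: 4, current_load: 950, target_per_instance: 100)
p scaling_plan(current_size: 10, current_load: 50, target_per_instance: 100)
```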
JESSICA: Wow! That’s a really fascinating use case of Lambda. I never thought of that.
Question, although one thing that frightens me about that is the provenance, the finding out what happened where. What kind of logging happens around that kind of Lambda invocation?
TREVOR: I don’t know if you’re familiar with CloudTrail. CloudTrail is this integrated logging solution. That’s your first level of entry for logging because it will allow you to see what Lambda functions were invoked and when and with what parameters.
And then your actual Lambda script has the ability to dump logs directly into Amazon CloudWatch. And CloudWatch Logs is another service that we have that allows you to persist logs.
ALEX: So CloudWatch Logs originally came into my view because it’s a great way to, for example, aggregate your Rails logs across the cluster. You’ll see a lot of people who use CloudWatch Logs, for example, to keep a live tail on their log stream. And say, for example, keep real-time metrics on “Hey, I’m suddenly getting a bunch of 500 errors. I want to know this in real time.”
So it’s one interesting use of something like CloudWatch Logs where you can do various things on your log statements in real time. But they also have integration with AWS Lambda. So you might have a CloudWatch Logs group and anytime a Lambda script runs, those logs will be available to you in that group.
ALEX: And that can be one way to say like, “Hey, my Lambda script didn’t work. I got to go see what’s up with that.” You can go take a look at the logs that it generated.
JESSICA: Can you have Lambdas respond to the CloudWatch Logs like those 500s?
TREVOR: CloudWatch Logs and CloudWatch can trigger alerts that will trigger Lambda functions. So, absolutely.
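[Editor’s note: a small sketch of the "suddenly getting a bunch of 500 errors" check, written as plain Ruby over a batch of log lines rather than as a real CloudWatch Logs metric filter. The Rails-style `Completed 500` log format and the threshold of three errors are assumptions.]

```ruby
# Matches Rails-style request completion lines that ended in a 500.
STATUS_500 = /Completed 500\b/

# Count the 500 responses in a batch of log lines.
def error_count(lines)
  lines.count { |line| line.match?(STATUS_500) }
end

# Decide whether the batch should raise an alarm.
# The default threshold is an arbitrary value for illustration.
def alarm?(lines, threshold: 3)
  error_count(lines) >= threshold
end

batch = [
  'Completed 200 OK in 12ms',
  'Completed 500 Internal Server Error in 48ms',
  'Completed 500 Internal Server Error in 51ms',
  'Completed 302 Found in 9ms',
  'Completed 500 Internal Server Error in 47ms'
]

puts error_count(batch)
puts alarm?(batch)
```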
JESSICA: Sweet. So then you can have heuristic on top of your heuristic on top of heuristic which sounds frightening but that’s how humans work.
ALEX: And the point is yes, you can do that. Obviously, if you’re adding many, many layers of abstraction to your application, to some extent that’s on you. We just give you the tools. How you use them is up to you.
But yes, if you’re comfortable with that, and you totally can be, especially if you’re developing your application incrementally and testing at every phase, we’re not going to stop you from having as many layers of abstraction and different levels of monitoring, updating, and scripts to do anything from the mundane to the really useful. All those hooks are there and you can put whatever you want on top of it. Absolutely.
TREVOR: And personally, I’m a fan of exactly what you were suggesting is that data-driven monitoring and response. It’s what allows you to make objective decisions about do I need to change my infrastructure? Should I scale it up? Where are the problem areas? What’s causing me operational pain?
JESSICA: True and that lets you ask later, did it work?
TREVOR: Right. You then have actual history that you can go back and compare it to and see if your change made a difference, for better or worse.
JESSICA: Exactly. Not just did it run? But did it achieve the outcome I was aiming for?
ALEX: And from a control plane yes, that can be a great way because it does seem on the surface, you may feel like I’m adding many different layers of abstraction. This is a lot of stuff to keep track of. But if you’re running it all on a control plane server, you just have the same problem somewhere else.
The nice thing about something like an AWS Lambda is you can have small, sharp components. Almost like the Unix approach – small, sharp tools that do a particular job well triggered by events.
JESSICA: And you have traceability of those events and what they triggered.
JESSICA: That’s what’s missing from a lot of microservices, for instance, that kind of visibility.
CHUCK: I want to change directions a little bit. I’ve played with EC2 and OpsWorks and I had a client that used Elastic Beanstalk. After I struggled with it for a while, I finally just gave up and said, “You’re the Ops guy”. When should you be using one or the other or the other?
ALEX: To start off with, when you look at say Elastic Beanstalk versus AWS OpsWorks, it’s simplicity versus control is the first part of it. So Elastic Beanstalk does a lot of stuff for you and it’s very simple to get started.
TREVOR: It’s like the Heroku model. You upload an application and we’ll run it.
ALEX: And it gives you the ability to look at those resources, but it’s a very simple option. The service says, “We’re going to bring you a few recommendations for how you should run and manage your app, and we will take care of it for you.” When you go to something like AWS OpsWorks, there’s a little bit more lifting to do at the beginning, but it gives you a lot more control. Now you have the Chef scripts that you can use to manage the way that your application is set up, the way that your environments are installed. So you have a little bit more control, a little bit more responsibility. But you know you have both options. You can take it a bit further.
So there’s a service we released called AWS CodeDeploy, which, as the name implies, is used to automate your code deployments to any server, including servers running in your own data centers or virtual machines running essentially anywhere. It handles a lot of the minutiae of deployment, like rolling updates to your servers so you can deploy without downtime, a dashboard to control and monitor your deployments, and triggering deployments on things like “I pushed a change to GitHub, run the deployment.” But you have a lot more control because you really bring your own for how that deployment works.
You’re going to do a lot more of the setup of your environment. But you get even more control over how your environment is set up. So you have the ability to decide how much control you need for setting up your cluster versus how much time you want to put into it. You can start from the very simple and you can get all the way down to controlling things at a fine-grained level, as is appropriate for your use case. You don’t have to go all or nothing.
Do you have any other recommendations or tricks that you’ve seen people employ that make their apps run better, work better, act nicely? The CloudFront idea I could see working really nicely for an Angular or Ember app that’s a single-page app, where they’ve at least encapsulated a lot of the behavior in there, because it comes down quickly. Any other tricks or recommendations?
CHUCK: Oh, interesting.
ALEX: Yes. It is a thing you can do for sure. It’s a great advantage, yes. You can cut down on your site latency quite a bit if you have as much as possible going through static caching. As far as the general optimizations, there are a couple of nice things.
So one, it’s the value proposition of the Cloud to begin with. That is the lack of the need for upfront infrastructure investment. So if I’m a small company, maybe I’m running on two instances, generally low traffic. I have one in each of two availability zones for a little bit of redundancy and I’m humming along. Then one day I’m mentioned on the front page of Hacker News. I’m on the front page of Reddit. And you’re getting a ton of traffic from all different directions. You can spin up to 200 instances without having to buy a rack and set up a data center, something like that. And if that traffic all goes away at the end of the day, you can kill off those instances and go all the way back down to two. It’s a thing you can do to scale up and down very quickly and to be able to support that.
The other nice thing there is you have the fault tolerance of being able to exist in multiple data centers around the world with the click of a button. That’s definitely a thing that we encourage: you want to make sure that you geographically spread out your servers, that you have more than one running. And there are things like, with the Amazon RDS service for databases, you can have multi-availability-zone deployments. So in the unlikely case an entire availability zone goes down or one of your two databases goes down, you can fail over behind a single host name, which is really handy for a Rails application. Or even just the Cloud paradigm that allows things like immutable deployments: you don’t have to alter anything running on your application server to do a deployment. You can spin up an entirely parallel cluster and then only move over after that cluster is smoke-tested and you know it’s ready to go.
I can only imagine how it would have gone over at my first job if I proposed that every time we wanted to do a code push we bought a whole new set of servers and installed them.
ALEX: By the time it was over, I would take the old servers and throw them all in the dumpster. But really that’s exactly what you can do now and it makes perfect sense.
TREVOR: I’m a fan of a couple of our services like SES. I know we mentioned this earlier, but it’s dead simple to start sending email through SES. You can send emails through the API. You can send emails through regular SMTP. So it means you can have working email sending from your dev environment that’s the same as from your production environment. That’s often just one of those headaches. They even have sandbox tools for testing.
Another one is Route 53. If you host your DNS records through Route 53, it’s a very low-cost service, and it allows you to set up routing policies like weighted, latency-based, failover, and geolocation routing. It’s really easy when you set up a record in a hosted zone to pick one. Say, I want to send 40% of my traffic to this server and 60% to the other, or respond to traffic based on latency, or set up failover policies and whatnot.
That’s one of those low-hanging-fruit easy wins that can provide a lot of reliability to your system.
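[Editor’s note: a toy model of the 40%/60% split Trevor mentions. It shows only the general idea of a weighted choice and makes no claim about how Route 53 implements weighted routing internally. The server names are hypothetical and weights are assumed to be positive integers.]

```ruby
# Pick one target in proportion to its integer weight.
# weights maps a target name to its weight, for example 40 and 60.
def pick_target(weights, rng: Random.new)
  total = weights.values.sum
  roll = rng.rand(total) # integer in 0...total
  weights.each do |target, weight|
    return target if roll < weight
    roll -= weight
  end
end

weights = { 'server-a' => 40, 'server-b' => 60 } # hypothetical targets

# Simulate many lookups with a fixed seed so runs are repeatable.
rng = Random.new(42)
tally = Hash.new(0)
10_000.times { tally[pick_target(weights, rng: rng)] += 1 }

tally.each do |target, hits|
  puts "#{target}: #{(100.0 * hits / 10_000).round(1)}%"
end
```

[Over many lookups the shares settle near the configured weights, even though any single answer goes entirely to one server.]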
ALEX: One infrastructure example on that note that I really like to point to is the Obama campaign in 2012, the Obama for America campaign. They released a lot of the technical information on how they had set things up on AWS. It’s just really cool if you think about it, because a lot of companies don’t want to talk about the secret sauce, the stuff that they set up. But as an election campaign, partially for electoral law reasons, they had to disclose a lot more than a company often would. But also the team was just really excited to share what they did.
One of the cool things that they did along those lines is they were running all of their applications in a particular region. But for the extremely unlikely event, which I don’t believe ended up happening, that an entire region is unavailable, they actually had a ghost, like a parallel version of their application running in a different region that they could have failed over to. So they could have actually failed over their whole application to a standby version running in a different region.
So it’s a thing you can do with things like the Route 53 failover routing. Or, in a more common use case, you can say that if somehow my entire application goes down, like a critical software failure or anything like that, where for some reason my servers aren’t reachable, you can fail over to a static site you have stored on S3.
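[Editor’s note: a minimal sketch of the failover idea: answer with the first endpoint whose health check passes, and fall back to a last-resort static site otherwise. The endpoint names and the health check lambda are hypothetical. Real DNS failover involves health checkers and TTLs that are not modeled here.]

```ruby
# Return the first healthy endpoint, or the last one as a final fallback.
# endpoints is ordered by preference; healthy is any callable that takes
# an endpoint and returns true or false.
def resolve(endpoints, healthy:)
  endpoints.find { |endpoint| healthy.call(endpoint) } || endpoints.last
end

endpoints = [
  'app.primary-region.example',   # hypothetical primary
  'app.standby-region.example',   # hypothetical standby in another region
  'static-site.s3.example'        # hypothetical static fallback
]

# Pretend the primary region is unreachable.
down = ['app.primary-region.example']
health_check = ->(endpoint) { !down.include?(endpoint) }

puts resolve(endpoints, healthy: health_check)
```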
JESSICA: All of this is amazing potential and it’s almost within grasp. I hear the word [can] over and over again in this, and yet there’s that cognitive overhead to every piece. I love that there are things like SES and Route 53 that take something you have to do anyway and just make it easy. That’s just the most beautiful kind of service, like Stripe for payments. You have to do this. Let me make it easy for you.
On the other hand, all of these failover things and the scaling, all of these are new potential things that we didn’t do before. I’ve observed in talking to people at conferences and of course we’re doing this at work that there are a lot of companies making teams to learn AWS and implement all of these [cans], all of these new things we can do.
Do you see that, or do you have the expectation that a small development team is going to take advantage of all of these things in addition to producing their software?
TREVOR: There’s definitely, as we talked about earlier, that learning curve. You’ve mentioned companies putting together teams to learn all of this. I don’t see that necessarily going away. We, as developers, are accustomed to learning, and learning about infrastructure and scaling technologies and whatnot. That’s just par for the course. AWS does offer a lot. We have partners that offer coursework and whatnot to aid in learning this. There are a lot of materials on the web.
ALEX: The other way to look at this too is there are a whole lot of things you can do. At first, I looked at all the options of the things you can do, all these best practices. If you’re developing a Rails app and you look at the list of security things you can do. Even that almost feels overwhelming at first. But when we talk about things, you can do autoscaling. You can do failover. I think the way to look at this is you can do this one step at a time.
One thing I was trying to get across in my lab, for example, is okay, let’s just get the first step out of the way. You’ve got an application. Let’s get it deployed. Once you’ve done that, it’s a small incremental step to say, “Okay, now let’s tie Rails email sending into the Simple Email Service.” It’s a small incremental step to cache all of your static assets with something like Amazon CloudFront. It’s a small incremental step to spin up some more app server instances and put them behind the load balancer.
Each of the individual steps to get to this really robust architecture, each step can be handled on its own. Then eventually you find like, “Wow, my application is getting better and better. It’s getting more scalable. It’s getting more fault-tolerant. I can see that the availability story is getting strong.” You don’t have to feel like, “I have to do everything at once.” You just kind of add a bit, add a bit. You do it one bite at a time.
The other thing I like is when you start to pick up this mentality of the cloud paradigm, you start to write your apps in an appropriate way. When you treat your application servers like they’re disposable, you start to write your applications to be stateless. You really think about how any state that I’m storing should be in a database, or it could be in DynamoDB, like with the DynamoDB session handling plug-ins, for example; or my static assets could be in S3. You take the things that should be stateful and put them into stateful services and then you write your application servers to be stateless. Then it really starts to come together.
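A minimal sketch of that stateless-server idea, in plain Ruby. The `ExternalSessionStore` class is a hypothetical in-memory stand-in for a database or DynamoDB table, and `AppServer` is an invented name. The point is only that the server object keeps no per-user state, so it can be thrown away and replaced.

```ruby
require "securerandom"

# Hypothetical stand-in for an external stateful service
# (a database or a DynamoDB table in a real deployment).
class ExternalSessionStore
  def initialize
    @rows = {}
  end

  def read(session_id)
    @rows.fetch(session_id, {})
  end

  def write(session_id, data)
    @rows[session_id] = data
  end
end

# An app server that holds no per-user state of its own.
class AppServer
  def initialize(store)
    @store = store
  end

  # Handles one request and returns the updated visit count.
  def handle(session_id)
    session = @store.read(session_id)
    visits = session.fetch(:visits, 0) + 1
    @store.write(session_id, session.merge(visits: visits))
    visits
  end
end

store = ExternalSessionStore.new
session_id = SecureRandom.hex(8)

first = AppServer.new(store).handle(session_id)
# Discard the first server; a replacement picks up where it left off
# because the session lives in the external store.
second = AppServer.new(store).handle(session_id)
puts "visits seen by replacement server: #{second}"
```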
TREVOR: Yes, I think the key here is that you got to eat the elephant one bite at a time and you got to just learn as you go.
CHUCK: Well, the other thing is that I know people get really concerned about security or they get really concerned about this, that, or the other. Those are also things that you can tackle one thing at a time. You do the best you can but then as you learn more then you take that next step. Unless you’re a big target or a lucrative target in some way, you’re probably not going to get – it’s probably not going to matter a ton. I’m not saying that’s an excuse not to do security or some of these other things but you can do good enough security and then as you learn more about it, you can take those next steps.
The other thing is that if you’re using something like OpsWorks which has Chef recipes and things like that, you can pull in recipes that do the security to some degree for you.
ALEX: And at AWS obviously, security is our top priority.
ALEX: It comes before anything else. One thing we look at for an application is we’ll give you for example a white paper that says, “Here are some security best practices when you’re deploying in the cloud.”
That’s another thing where you’re just going to go step by step as you’re developing things to make sure that you’re developing your applications in a secure way. Much like you’re going through securing your Rails application and you have those best practices. You go through step by step. Then you find no individual step is necessarily that hard. Then when you’re done going through all those like, “Wow, I feel really productive and this is really cool now.”
JESSICA: Frankly, Chuck, what you said kind of terrifies me.
CHUCK: It terrifies me a little bit too.
JESSICA: Certainly it terrifies business people because you said, and I’m not denying that this is reality, “We are just going to count on good luck and not getting hacked until we have time to learn more about security.”
CHUCK: It happens a lot but the thing is that I see AWS and some of these other services solve a lot of it for you.
JESSICA: That’s a good [inaudible].
CHUCK: Yes and there are a lot of ways. I guess I didn’t say it well but that’s kind of what I was saying. They solve a lot of it for you and there are a lot of other tools to solve a lot of it for you. Then you can incrementally get better.
So you can say, “Okay, well you know our application, it looks like we’re doing this other thing that we realize now is a bad idea.” So you don’t have to know everything upfront. You just start with a good platform and work from there.
ALEX: And one of the nice things to me is Amazon scale allows us to put significantly more investment into security, policing and countermeasures than almost any large company could afford themselves.
One example that we look at is a lot of CIOs might worry about the rogue server on a developer’s desk running something destructive or something they don’t want running. Today, it’s really hard, if not impossible, for these CIOs to know how many orphans there are and where they might be.
With something like AWS, you can make a single API call to something like CloudTrail or describing instances in EC2 and you can see everything running in your VPC. So you can audit in a way where you can say with some certainty there are no hidden servers under the desk, and nobody put a server anonymously on your rack and plugged it into the corporate network. You can describe with an API call the fullness of your AWS resources.
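As a rough sketch of that kind of audit in Ruby: the function below takes any client object that answers `describe_instances` with reservations and instances, which is the response shape of the AWS SDK for Ruby's EC2 client as far as I know. The `FakeEc2` class and the structs are stubs invented here so the example runs without credentials, and pagination of large accounts is ignored.

```ruby
# Lists every running instance the client reports, grouped by VPC id.
# `ec2` is assumed to respond to describe_instances the way the
# AWS SDK's EC2 client does; pagination is not handled in this sketch.
def running_instances_by_vpc(ec2)
  instances = ec2.describe_instances.reservations.flat_map(&:instances)
  instances
    .select { |instance| instance.state.name == "running" }
    .group_by(&:vpc_id)
    .transform_values { |list| list.map(&:instance_id) }
end

# Stub objects (hypothetical) that mimic the response shape.
State = Struct.new(:name)
Instance = Struct.new(:instance_id, :vpc_id, :state)
Reservation = Struct.new(:instances)
Response = Struct.new(:reservations)

class FakeEc2
  def describe_instances
    Response.new([
      Reservation.new([
        Instance.new("i-aaa", "vpc-1", State.new("running")),
        Instance.new("i-bbb", "vpc-1", State.new("stopped"))
      ]),
      Reservation.new([
        Instance.new("i-ccc", "vpc-2", State.new("running"))
      ])
    ])
  end
end

inventory = running_instances_by_vpc(FakeEc2.new)
inventory.each { |vpc, ids| puts "#{vpc}: #{ids.join(', ')}" }
```

With the real SDK you would pass an actual EC2 client in place of the fake one; the grouping logic stays the same.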
TREVOR: Also AWS does a lot of things like when you spin up an EC2 instance, the default behavior is to put it inside of the VPC and to lock it down so that you have minimal to no access to it unless you open up holes in the firewall and whatnot. Sometimes this gets in the way of it being really easy to use, but it also is secure by default and AWS takes that fairly seriously.
ALEX: We have a shared security model where we’ll spell out the things that we’ll secure for you. We will take care of the physical security of the data center. You have to take care of the security of the application you’re running. But we try to give you more tools to be able to do that. It’s nice to be able to know, if you’re running primarily on the cloud, these are all the resources that are running. These are all the API calls that have been made. You can audit and say CloudTrail says that these are all the EC2 API calls that have been made. You can account for every single one if that’s a concern that you have.
TREVOR: There’s also a difference between security at the infrastructure level and at the application level. As developers, I think we’re more focused on application level security and things like cross-site scripting, SQL injection and those kinds of things. Those are strictly always going to fall on our shoulders. If we put on that ops hat, you do take more responsibility onto your shoulders to not, like, run an open SSH port on your server, things like that.
ALEX: At the same time, as you learn services like VPC and security groups for EC2, it gives you a lot of tools to make sure that you have control over the traffic coming to your instances. So a lot of the tools are there and we have things like whitepapers on the AWS website to point you in the right direction of the things you should think about. As you layer more and more security and robustness on your applications, again one step at a time, you just go through these things and make your application a little bit better and a little bit better.
JESSICA: These are some great points. I’m really encouraged by it. But you mentioned one about running an open SSH port.
JESSICA: Yes, I know that’s bad and I know we do it because as developers, we don’t know any other way to find out what’s running than to log in and look at it. Did you have any suggestions around that? How do you create visibility into what’s happening on the machine without opening an SSH port?
TREVOR: Sure. So CloudWatch and CloudWatch Logs are tremendous tools for collecting metrics, custom metrics, standard metrics on your instances running in the cloud without having to poke onto them and those servers’ disks. It really depends on how you deploy your applications to those servers and whatnot. But you can also get around some of this by running this inside of a VPC, which is the default. So now you can tune that VPC to say, “I have an internet gateway to this VPC only from my office.”
JESSICA: We do that. We do that.
TREVOR: If you’re going to run that open SSH port, you start locking that down. You can also turn it off. You say, “I don’t need to access these servers so we’re going to turn off that SSH port.” By default, it’s always a good security practice.
ALEX: You get into things like bastion servers as well. There are definitely a lot of options for [inaudible].
JESSICA: Yes, we do that too. We have a whole layer of things. It just kind of terrifies me. The whole DevOps movement, which really should be DevSecOps, I guess, because you can’t. You should never do ops without worrying about security.
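One way to picture "locking that down" is a small check that flags SSH rules reachable from outside a known network. This is a plain-Ruby sketch using the standard `ipaddr` library. The `Rule` struct is a hypothetical stand-in for a security group ingress rule, the office range is a documentation-only address block, and only IPv4 is considered.

```ruby
require "ipaddr"

# Hypothetical stand-in for one ingress rule: a port and a source CIDR.
Rule = Struct.new(:port, :cidr)

# Assumed office network (203.0.113.0/24 is a documentation range).
OFFICE = IPAddr.new("203.0.113.0/24")

# True when an SSH rule admits addresses outside the allowed network.
# A CIDR is inside the allowed network only if both its first and
# last address are inside it.
def ssh_too_open?(rule, allowed = OFFICE)
  return false unless rule.port == 22

  range = IPAddr.new(rule.cidr).to_range
  !(allowed.include?(range.begin) && allowed.include?(range.end))
end

rules = [
  Rule.new(22, "0.0.0.0/0"),       # SSH open to the world
  Rule.new(22, "203.0.113.7/32"),  # SSH from one office address
  Rule.new(443, "0.0.0.0/0")       # public HTTPS, not an SSH concern
]

flagged = rules.select { |rule| ssh_too_open?(rule) }
flagged.each { |rule| puts "SSH reachable from #{rule.cidr}" }
```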
CHUCK: That should always happen.
JESSICA: Yes, it should.
ALEX: No, you bring up a good point, I think, Chuck. Let’s get into it, so why doesn’t it always happen? Sometimes it seems to me like it doesn’t happen because people think security is really hard and to be fair, there is a lot to security. I feel like I learn something new about security every day. So you’re like, “Security is hard, I’m overwhelmed. I’m just going to hope that I get by with security through obscurity.”
I think what we try to do is to try to make security a little bit easier or truly try to guide you in the right direction. Obviously at the infrastructure provider level, we’re not going to be able to make sure that you write your Rails application in a way that it doesn’t accept arbitrary input from users. But we can try to say, “Okay, if you’re worried about ‘how do I make sure my cloud deployment is secure’, we can tell you what you should be looking for and what you should look to handle,” because we have the aggregated experience of having done this for a really long time.
When you start to get into the bigger company side of things, you can get, for example, enterprise level support. You can talk one on one with solutions architects and others at Amazon who can walk through your application, for example, depending on how much direct contact you need. We’ll start by giving you whitepapers that point you in the right direction and try to make it more comfortable. You can take it all the way to having us help you directly.
TREVOR: And not to promote any kind of fear, uncertainty and doubt, but I think a certain level of fear around security is healthy because it makes us think about like, “Gosh, should I really open this port in that firewall or should I really install this application on that server?” Because it’s when we ask those questions that we say, “Is there a better way to do this?”
JESSICA: Yes, is there a way to get my logs stored into a place that I can see them quickly enough that I never need to log in to my box?
ALEX: With things like CloudWatch Logs and live monitoring of that, you can have it such that as soon as, say, the word ‘fatal’ shows up in your logs, you get a text message or an email or whatever kind of notification you like, through something like Simple Notification Service.
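The pattern here, scan log lines and notify when something like ‘fatal’ appears, can be sketched without any AWS calls. In this hypothetical Ruby example the `notifier` is just a lambda that collects messages. In a real setup that role would be played by a notification service, and the log lines are made up.

```ruby
# Calls the notifier once for every log line that matches the pattern.
# `notifier` is any callable; here it stands in for a text or email alert.
def watch_logs(lines, pattern:, notifier:)
  lines.each_with_index do |line, index|
    notifier.call("line #{index + 1}: #{line.strip}") if line.match?(pattern)
  end
end

alerts = []
log = [
  "INFO  request served in 12ms\n",
  "FATAL database connection lost\n",
  "INFO  request served in 9ms\n"
]

watch_logs(log, pattern: /fatal/i, notifier: ->(message) { alerts << message })
alerts.each { |message| puts "ALERT #{message}" }
```

The application still has to print the line in the first place, which is the point Jessica raises next.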
JESSICA: That’s cool.
ALEX: You have to keep an eye on the things you really care about.
JESSICA: And you have to build that into your app. You have to know what you care about in your app and build in, even if it’s just printing to the log, some way for you to know. You have to think about that when you’re coding. Like we used to think about making it testable so we started test-driven development. You have to ask yourself, “How am I going to know this is working,” and build that in.
ALEX: How do you sleep soundly knowing that your application is working while you’re not there to look at it?
CHUCK: Yes, but can’t you use the logging system to hook into a Lambda that hits SES?
JESSICA: Yes, but you have to print those logs.
ALEX: You don’t even need to have that many layers of abstraction. You can have something like SNS send directly to SMS. You can have it text you. You can have it email you directly.
TREVOR: I think the point is you have to identify those hot paths that you’re concerned with.
CHUCK: Yes, totally. I’m just saying from an infrastructure standpoint, once you know what you want it to notify you about, you can plug some of these things together and they’ll notify you automatically.
TREVOR: There are plenty of reliable channels for tracking, monitoring and persisting that data. It’s just a matter of identifying what data do you care about. What is that metric that’s going to — if it’s green, we’ll give you [inaudible].
ALEX: How much latency before it’s a problem and you want to hear about it? How many error statements before it’s a problem and you want to hear about it? How many 500s? It depends on your application and what kind of service level you’re trying to give to your customers.
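Those questions amount to picking thresholds per metric. A small Ruby sketch, with invented metric names and limits (the right numbers depend on the service level you are promising):

```ruby
# Hypothetical thresholds; these names and numbers are assumptions,
# not recommendations.
THRESHOLDS = {
  p95_latency_ms: 500,
  error_lines: 10,
  http_500s: 5
}.freeze

# Returns the names of the metrics whose value exceeds its threshold.
# Metrics missing from the sample are treated as zero.
def breaches(sample, thresholds = THRESHOLDS)
  thresholds.select { |metric, limit| sample.fetch(metric, 0) > limit }.keys
end

sample = { p95_latency_ms: 620, error_lines: 3, http_500s: 5 }
puts "breached: #{breaches(sample).inspect}"
```

Note that a value exactly at the limit does not count as a breach in this sketch; whether it should is another decision that depends on the application.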
CHUCK: Very cool.
JESSICA: One thing that I feel like is a conflict – Trevor said eat the elephant one bite at a time and I really like the points about you can gradually add scalability, add better availability, add robustness and security, layers of security to your application.
When we made the box, whatever, one of the objectives of the DevOps movement, and DevSecOps too, is to align the interests of ops, devs, security, of the whole ‘create this application and run it’, directly with the business, if our purpose is to have business impact and help with that. But once you put all those on the same team, suddenly the most visible output of that team is not robustness. It is not availability. We only see those when you don’t have them. The most visible output is features and that’s what people are demanding. Then that’s what the whole team is pressured and motivated to do. And these side tasks of maintaining the operations and maintaining the security and continually improving these things, which is fantastic, become nobody’s obvious job.
ALEX: Yes, I like the sports metaphor for this where it’s like the offensive linemen of the development team. You only notice them when they do something wrong.
ALEX: I sympathize with that strongly, believe me. I think it’s a culture thing. As a culture, you want to value like hey making sure the things are up and running on a regular basis. It’s a hard job and we try to build the tools as providers and as a community to make the job easier and easier over time. I’m really excited with the progress I’ve seen and the speed of innovation and making it easier and easier to make scalable high availability applications.
But it’s definitely important when you have people dedicated to do this to really value that work. I feel like outages happen to everyone, either for reasons you can or cannot control. Every single time that I’ve seen an outage, I see a chance to learn from it and get better. Part of that is we try to bring, as AWS, as a provider, our over 10 years of experience of doing this kind of thing, and 20 years of experience as a company building scalable applications. You see many, many ways that things can go wrong. You make the system better and better over time and try to pass on that tribal knowledge and pass on that experience in the form of something more reliable.
I think having those teams that see this is why our site went down and this is how we can make sure it never happens again and to have that work be respected is a great thing. I think that’s a sign of a very strong, positive company or team culture when you have teams that value that. When you have teams who only see you when things are going wrong, then that to me would be a toxic culture.
JESSICA: True. One of the things I like is we’re tasked with making all of this easy for the other development teams at Monsanto. So while we do have specifically security and operations as a focus, we answer to the development teams, and they’re only going to use our stuff if it makes their lives easier. It feels like a good balance, at least for an enterprise.
ALEX: Well, that’s fantastic. The exciting thing about teams that are doing that and these tools that are coming out is that they’re really force multipliers. It’s not just additive.
I go back even to when I was in college and it felt like deploying an application to the “real world” was the scariest, most inaccessible thing ever. Nowadays, you can condense it into a very short tutorial and that’s exciting. It’s almost like we’re into this new paradigm of the ability to tinker and experiment. I really think it’s a force multiplier for everyone.
ALEX: Yes, and within teams if you can say, “Hey, you as a team don’t have to worry about how your system is going to scale as long as you build your application a certain way.” That’s exciting. That’s a force multiplier. When you as a team are doing something like that, that’s fantastic. That’s the kind of thing that I’d love to see and be around.
JESSICA: We’re hiring.
TREVOR: So are we.
CHUCK: Alright. Well, I think we need to get to the picks. Jessica, do you want to start us off with picks?
JESSICA: Okay, I have one pick today. This is one I’ve been hoarding for a little while. It’s an article. It’s an interview with Laurent Bossavit who wrote the 10X Programmer and Other Myths in Software Engineering. I believe this is one of Avdi’s favorite books. The article is short and it makes really clear that most of what we do is learning as we’ve been talking about today. Oh my gosh! There are so many new things we can do and all of them sap our cognitive energies. I recommend this article as inspiration to be patient with that.
CHUCK: Alright. I’m going to pick Paracord. So I’ve been involved in Cub Scouts for about seven or eight years. I’m on the Roundtable staff which is the monthly training for Cub Scout leaders. One of the guys there is really into Paracord. If you don’t know what it is just go look it up. It’s basically this nylon rope. It’s really thin and pretty strong. Basically, you can tie it into all kinds of different things.
So I’ve been tying it into different things like I’ve got this little fob for my keychain. I tied myself a Neckerchief Slide, it’s a Turk’s head knot. You can look that up too. But anyway, it’s been a lot of fun and I’ve really been enjoying it. So I’m going to pick that. There are a whole bunch of websites out there to show you how to do it. There are a ton of videos on YouTube that will show you how to tie different things with Paracord. So, go check it out.
Alex, do you have some picks for us?
ALEX: Yes. So the first one I want to bring up is the workshop that I gave at RailsConf this year. The video for that is up. It covers, from scratch, how to get a Rails application up on AWS. I think the one takeaway for me is that it’s a great starting point to experiment. I definitely want to hear from anyone if they try it out and they want to experiment with different things. I think it’s an exciting starting point.
To go a different direction for a second, I’ve also been flying a lot more this year. So I’ve been going through the backlog of books I wanted to read. And one of the better ones I read, I think it got me through two cross-country flights and made them just breeze right on by, is Stranger in a Strange Land. So this is kind of the Robert A. Heinlein sci-fi classic. I thought it was just fantastic. So that’s a book I couldn’t put down.
Last one is the Kalzumeus podcast. So I believe Patrick McKenzie was a guest on your show previously.
CHUCK: Yes, quite a while ago.
ALEX: I think this podcast has about 11 episodes. He doesn’t post them very often but some of them are just solid gold for looking at the intersection and meeting points between business and software. I find a lot of the content is a great reminder that we’re not writing code and solving deployment issues or optimizing [inaudible] and performance in a vacuum. We’re doing this to solve real world problems. We want to stay focused on making life better for people in some way.
CHUCK: Very cool. Yes, I really enjoyed that stuff. I think he does some stuff with Keith Perhac and he’s also a lot of fun to talk to.
So Trevor, do you have some picks for us?
TREVOR: Sure. Actually first I have a Paracord key fob.
ALEX: He’s holding it right now.
CHUCK: Awesome. What color is it?
TREVOR: I work with a local scout troop. So we did the same thing. So my two picks: First one is gitter.im. I don’t know if you guys are familiar with this. It is a – think of IRC for the web for your GitHub projects. It’s a really great way to collaborate with users of tools that you work on, for, or with. It’s free. They have history. It’s searchable. You can mention people. You can get GitHub project activities right there in the feed. Really cool tool. Definitely check it out. I use it on a daily basis.
My second pick, this one’s a little bit of a selfish one, is ruby.awsblog.com. That’s where we blog about stuff we’re doing. So that’s it.
CHUCK: Awesome! Are there other places that people should go if they want to find out more about the two of you or about what’s going on with AWS and Ruby?
TREVOR: Sure, the blog is a great place. The Gitter channel, our GitHub organization/project, you can get information all there. We love talking to people who use our tools. We listen to feedback. We definitely cycle feedback in to drive our development efforts. We’re very accessible. You can catch us on Twitter. You can catch us on all the usual suspects.
CHUCK: All right, terrific. Well, thank you both for coming.
ALEX: Thank you.
CHUCK: I guess we’ll wrap up the show. We’ll catch you all next week.
[This episode is sponsored by MadGlory. You’ve been building software for a long time and sometimes it gets a little overwhelming. Work piles up, hiring sucks, and it’s hard to get projects out the door. Check out MadGlory. They’re a small shop with experience shipping big products. They’re smart, dedicated, will augment your team and work as hard as you do. Find them online at MadGlory.com or at Twitter @MadGlory.]
[Hosting and bandwidth provided by the Blue Box Group. Check them out at BlueBox.net.]
[Bandwidth for this segment is provided by CacheFly, the world’s fastest CDN. Deliver your content fast with CacheFly. Visit CacheFly.com to learn more.]
[Would you like to join a conversation with the Rogues and their guests? Want to support the show? We have a forum that allows you to join the conversation and support the show at the same time. You can sign up at RubyRogues.com/Parley.]