056 iPhreaks Show - Mobile Performance Monitoring with Brit Young


The panelists talk to Brit Young of New Relic about performance monitoring.

Transcript

BRIT: I'm in Portland, Oregon. CHUCK: Okay. PETE: Yeah, West Coast represent! BRIT: Yeah, I'm on the West Coast. JAIM: Cascadia! BRIT: Yeah. JAIM: Living the dream of the 90s. CHUCK: [Laughing] BRIT: I am, indeed. [Laughter] [Would you like to join the conversation with the iPhreaks and their guests? Want to support the show? We have a form that allows you to join the conversation and support the show at the same time. You can sign up at iphreaksshow.com/form] CHUCK: Hey everybody and welcome to episode 56 of the iPhreaks Show. This week on our panel we have Ben Scheirman. BEN: I can’t think of anything clever to say. Hello, from Houston. CHUCK: Pete Hodgson. PETE: Good morning from Eastbani. CHUCK: Jaim Zuber. JAIM: Hello, from Baton Rouge. Wait, I don’t live there. [Laughter] CHUCK: I'm Charles Max Wood from DevChat.tv and we have a special guest this week, and that is Brit Young. BRIT: Hello, from Portland, Oregon. CHUCK: Do you wanna give us a quick introduction, since you haven't been on the show before? BRIT: Sure! I just moved to Portland actually, recently. I'm the current mobile product manager at New Relic, and so I'm overseeing the products for managing and monitoring the performance of your native iOS and Android applications. Prior to that, I was the co-founder of a startup in Chicago for a few years and we built Mac and iOS applications. We had a couple of apps in the store, and we also did some client work. CHUCK: Cool! I didn’t know that New Relic did monitoring for mobile-type stuff. BEN: I think it’s relatively new, right? 
BRIT: Yeah, it’s only been around about a year, but essentially what we do is we provide performance monitoring of your real-time user base, so we will monitor the Objective-C code that’s running in real sessions of your application out there in the world and we will report back just health information, some information about the network calls that are happening in the background, how long different things are taking to load or process. And then we will also actually trace your Objective-C code so we have visibility into anything that's taking longer than expected, you can sort of drill in and see what was happening in these traces. Essentially the only way that you can get this kind of information after you ship an app is to have some sort of real-time performance monitoring tool in place, and so that’s what we focus on. PETE: Do you guys do the magic thing where you can associate a mobile request and a backend request so I can see – like, as it goes slow on a client – what was happening in the server as well? BRIT: Yes. New Relic, we do web monitoring – that’s obviously where we started. If your app is talking to an API or web app or web service that you are also monitoring with New Relic then we can make that link and you would see from the mobile device level perspective what the user experienced from a performance standpoint, and then you can also see how the server experienced that same network request. PETE: And I would drill all the way down to database queries, right? BRIT: Yes. PETE: It’s pretty trippy. The first time I saw – I never used the iOS stuff but I use New Relic for backend stuff – and the first time I saw that where it was a server that called a server that called a server that called a database, and they had the whole, like, they drilled through the whole thing. It was pretty magical. 
BRIT: Yeah, that’s really what we’re trying to do with mobile as well – bringing that sort of power to what's happening in your Objective-C code and your device-level code and give you insight into the whole experience that a user’s having, not just the worst-case scenario where something actually crashed, but there's a lot of stuff that can happen in-between that. Without that kind of visibility, you can’t really know how your code’s performing. BEN: A lot of this stuff is dependent on the data specific to that user, right? So when you're testing, an application might perform just fine, but as the data grows, or as you have maybe power users who generate a lot of data, sometimes you [inaudible] prohibitively bad experiences, especially if you're doing stuff like on startup – you can actually have the app in a state where it can’t even launch because it can’t respond to user events before the application has finished launching. We’ve definitely seen that where we did this thing where – it was one of those things where you kinda find your friends by uploading a hashed version of your address book to the server and we compare with stuff that we have on the server. Basically we’re sending an address book – not an actual, raw address book, but a hashed version – and that worked fine on my phone. I have a couple hundred contacts, but our client has 5,000 contacts in his phone, so he would sit there for 10 minutes, and we didn’t do any memory optimization on it or anything and that was just something I didn’t even realize would be an issue. BRIT: Yeah, yeah exactly. JAIM: He’s got 5,000 contacts, he’s running iPhone 3. [Chuckling] BRIT: Yeah, people run some pretty old stuff. But even just – we find that what carrier someone’s network communication is going over can impact the experience that they're having in a negative way, or just the type of connection. 
Maybe the performance problem is only happening when someone’s on Wi-Fi but not over their data plan, or vice versa. And also just the user data’s big. There's so many things that you can’t anticipate even with the best test suite in the world. CHUCK: So what kinds of things should you be looking for? I mean, you mentioned the data size and the data usage – are there specific other things that you should be particularly looking for? BRIT: Sure. A lot of our customers that do heavy networking, they will monitor response times to the various APIs that they're communicating with, then monitor to make sure that the HTTP errors that are coming back are expected and they're not getting anything weird, and then monitor for spikes in their error rate. You can also monitor the network failure rate just to have a better understanding of how many requests are never making it to your server because for whatever reason, the mobile device wasn’t able to connect and just understand across the user base how often that’s happening and for what reasons. That’s sort of on the network side, and you can also trace through the various transactions that are happening so it’s great for troubleshooting network requests. If something is going wrong and there's a high response time, then you can actually drill in and see why and what's happening. And then on the other side of things, it’s just understanding the other stuff that happens in your app, so independent of network monitoring, you may be parsing huge JSON files, for example, or just making a lot of database calls or you're communicating with Core Data a lot. And you wanna make sure not only that the code is efficient but that you're not blocking on the main thread or missing opportunities to optimize some things that should be happening in the background or should be happening in parallel in the background. So kinda just being able to visualize and see what app-specific things you might be able to optimize is really powerful. 
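[Editor's note: the three network metrics Brit lists here (response time, HTTP error rate, network failure rate) can be illustrated with a small sketch. This is not New Relic's implementation or API. `RequestSample` and `summarize` are hypothetical names, and treating a missing status code as "the request never reached the server" is an assumption made for the example.]

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class RequestSample:
    """One network request as seen from the device (hypothetical record shape)."""
    url: str
    duration_ms: float
    status: Optional[int]  # None: the request never reached the server


def summarize(samples: List[RequestSample]) -> dict:
    """Compute average response time, HTTP error rate and network failure rate."""
    total = len(samples)
    # Requests that got any HTTP response back, successful or not.
    completed = [s for s in samples if s.status is not None]
    # Responses with a 4xx or 5xx status count as HTTP errors.
    http_errors = sum(1 for s in completed if s.status >= 400)
    return {
        "avg_response_ms": (
            sum(s.duration_ms for s in completed) / len(completed)
            if completed else 0.0
        ),
        "http_error_rate": http_errors / len(completed) if completed else 0.0,
        # Failures are requests the server never saw, so only the client can count them.
        "network_failure_rate": (total - len(completed)) / total if total else 0.0,
    }
```

[The split between the two rates mirrors the distinction in the conversation: an HTTP error is a response the server sent, while a network failure is a request that never connected.]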
JAIM: So if I wanna test the different APIs that my app might be using, maybe we've got a home service or going to the Twitter API – do I have to do anything in the app to get that information, or is it magic? BRIT: It’s magic. It’ll track every network request that’s going out from your app in real time and it will trace that for you. JAIM: Do I get a dashboard view on a website somewhere? BRIT: Yeah, so you would just log in to New Relic, and you just click on Mobile. We have several different sections of our performance tool – we have network information in a separate section and you would just kind of see what's happening and you can change the time frame over which you're viewing your performance data. You could see, for example, what's been happening in the last 30 minutes, or you could say, what's been happening in the last month and anywhere in between. If you see something that’s spiking or that looks a little bit odd, or your response time to your own API is hitting over 10 seconds or something crazy, then you could actually drill in and see what was happening. JAIM: Oh, very cool. BRIT: Yeah. JAIM: What other things do we get out of the box? BRIT: Out of the box, we will monitor – everything that I talked about networking is out of the box. We will monitor common system or API calls that happen, so for example, we will track what we call an interaction trace, which is every time a new view loads up we will track your viewDidLoad [inaudible] and we’ll start tracing. At that point, it’s going to be app-specific what actually happens after that, but we’ll trace anything that seems like a major activity that’s happening in your code without any extra work. 
And then if you wanna put specific things that you'd like to trace, like, say, there's a really specific method that you're interested in or you just wanna get a lot more detail about what's happening in a certain view – that’s when you could go and use our custom API and you could put in some of your own very specific metrics to report up to us, so you can do pretty much as detailed or standard as you want to, depending on how much effort you wanna put into it, but we try our best to monitor a lot of things out of the box. Some other thing – we also will monitor your JSON serialization calls, calls to fetch results from core data – I think those are the key ones that we monitor. PETE: What are some of common custom ones that you see customers creating? BRIT: I've seen some interesting stuff. Sometimes people are specifically interested in what is happening in a certain screen, so maybe you have a lot of different UI elements in there and you just wanna know how those are performing or loading up. I've also seen where people have [inaudible] to report purchase information or values in addition to whatever interaction we’re tracing. They will capture some piece of data that’s specific to what's going on in their app and kinda send it along with the normal trace. If it’s a viewController for your purchasing, or your store – maybe you'd also send a value as an extra metric. JAIM: If you're a newer developer – it sounds like this is kind of a nice, little test suite to find things that are problematic in your application, whether you're spending too much time on the UI thread, or something, a nice, little – this test, if you're doing your app right. Does that sound about right? BRIT: Yeah, that’s a good way to put it; I like that. 
Yeah, we’re basically trying to provide as much out of the box performance information as possible that otherwise you wouldn’t really see because we don’t have visibility into how our code is running on every single instance once our customers get a hold of it, so that’s the idea – is you’ve done your testing, you shipped. Now, we wish we could say that things are going to be perfect, but that’s never the reality, so even you have the opportunity to drill in and see what's going on, now that you have the information that’s specific to the types of network connections, the types of data that people are inputting to your app, etc. CHUCK: Can you run monitoring on the emulators that come with Xcode? BRIT: Yeah, you should be able to. It’s just an SDK that you would drop into your code, so it will report whether you're just running one session on your development or even in the simulator, it will report in to New Relic. CHUCK: Does it report at any different link so that you know, “Oh, this is my development setup” as opposed to running it live? BRIT: Well you can create different versions, which is what most of our customers will create a different token in our UI for their dev builds versus their production builds, so it would just report to a different place to keep the development work separate from your real-time metrics. JAIM: I love it when the marketing team is excited about test data that we've generated. You know, “Oh my gosh, they use this feature like crazy!” Well, because we’ve been developing it and testing it. [Laughter] So it’s good that you can switch the token out; it’s a very valuable tip. BRIT: Yeah. CHUCK: One other question I have regarding monitoring is if they are disconnected from the internet for a long period of time, do you wind up losing data? BRIT: That’s a good question. 
One of the things that’s really important, since this is an SDK and it does have overhead, is we have to make sure that we build something that’s going to be extremely lightweight from the user’s perspective that’s not going to impact – we’re monitoring performance; we don’t wanna impact performance, so we have to keep our network calls and everything very lean. What we will do in this situation where a user loses network connection is we will start buffering some of the performance information for about five minutes and we’ll trace that information. At a certain point, we have to stop tracing because we would be accumulating a massive amount of data that we would not wanna send – have to send over the network when the user comes back online, so that’s kinda how we draw the line there. Once the network does come back online, we will queue up and send the information in a way that won’t impact performance. PETE: Are you doing aggregation on the client to reduce the amount of network traffic or is it all just kind of sent over to the server in raw form? BRIT: Yeah, we do a lot of work to try and make it very lean and we do aggregate information. We also sample across your user base, so we’re not going to be collecting every single thing that happens for every single user – that would be crazy. So we have a lot of logic and algorithms that will kinda figure out what we should be collecting, and then we will sample across and make sure that we’re aggregating a good picture of performance without bogging down, if you will, the app sessions that are running. CHUCK: Performance monitoring is an interesting problem; do you do much in the way of business process monitoring? For example, if I have an app that I built and I want to put it out for people to use and then I add this feature to it and I want to see specifically, in detail, how these people are using it? I want the general performance metrics, but then I want to know specific things about this feature. 
Is it possible to do that with New Relic? BRIT: No, that’s not something that we have in our performance monitoring tools. We have something that’s for that type of thing on the website, which is our product we call Insights. We don’t have [inaudible] that for mobile right now, but our performance monitoring does report usage information so you will get uptake information for the various versions that you ship. And then we also have geographic information so you can see where your sessions are across the world, geographically. And you can also sort of drill in and see in this region how are my response times and that sort of thing – what's the network failing rate in China, for example. We have that level of statistics, but nothing that’s more of what you're describing. CHUCK: Do you have any specific use-cases where you’ve seen people actually use monitoring to fix problems or address certain performance issues? BRIT: Yeah, so you definitely have – obviously, most of our customers’ situations are proprietary and I can’t really provide specific examples, but some of the things that –. Because we have a lot of customers that are running some apps that have millions of users at any given moment, and one of the things that they will see is they can quickly figure out if a performance issue is on the server side or on the mobile device side, specifically when it relates to communicating at APIs or just making network calls where the problem was. We’ve also seen situations where someone’s shipping something that is on various different hardware – for example, Kindle – and they will see that they're not leveraging the hardware to the best of their abilities and that they could be spawning more background work than they are, and as a result, they're kinda missing out on opportunity to speed things up, make things a little nicer for the user. 
So then it’s pretty clear when you look at the trace to see that, “Oh, we’re only spinning out a couple of background threads here and we could put more things in parallel, and this would definitely improve performance in this particular spot in my app.” We’ve seen a lot of things like that where it’s just not realizing that you could do more to optimize, and what you're doing in the background versus foreground, and how the background work is actually organized. CHUCK: Does it actually make those recommendations to you? Spin up more background threads? BRIT: That would be awesome. We don’t specifically say, “Here’s where your problem is” any more than just we show you the data so that you can parse it and make those kinds of calls on your own. But that would be pretty sweet if you could just ask the product and it would say, “Here’s what you need to do next.” I think that’s something that we would love to get to, eventually. CHUCK: I'm still smarter than a computer; I feel good. BEN: [Chuckles] BRIT: You are, yup! BEN: So, given the – I don’t know – the data that you guys collect, do you have some recommendations on things that people should be aware of? Like, time and time again, we’re seeing x. For instance, if this is web, and probably with a tool like New Relic you could find select n+1 problems where somebody is selecting a list of records and then iterating over the list, and in turn, in the iteration, they are executing another SQL call, which is called select n+1 problems. So instead of doing one SQL query, you're doing 26 – something like that. Those are some pretty common problems and mistakes that people make that tools can help find – I was just wondering if there are similar things for core data or networking or something like that, that you guys see a lot that would be beneficial for people to be aware of. 
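[Editor's note: the "select n+1" pattern Ben describes can be reproduced in a few lines. The sketch below uses an in-memory SQLite database; the `authors` and `posts` tables, the `CountingDB` wrapper and both function names are hypothetical and exist only to count queries. With 25 parent rows the naive loop issues 26 queries, matching Ben's figure, while a single JOIN issues one.]

```python
import sqlite3


class CountingDB:
    """Tiny wrapper that counts how many SQL statements are executed."""

    def __init__(self, parents=25):
        self.query_count = 0
        self.conn = sqlite3.connect(":memory:")
        self.conn.executescript(
            """
            CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT);
            CREATE TABLE posts (id INTEGER PRIMARY KEY, author_id INTEGER, title TEXT);
            """
        )
        ids = range(1, parents + 1)
        self.conn.executemany(
            "INSERT INTO authors (id, name) VALUES (?, ?)",
            [(i, f"author{i}") for i in ids],
        )
        self.conn.executemany(
            "INSERT INTO posts (author_id, title) VALUES (?, ?)",
            [(i, f"post by author{i}") for i in ids],
        )

    def run(self, sql, params=()):
        self.query_count += 1
        return self.conn.execute(sql, params).fetchall()


def titles_n_plus_one(db):
    """One query for the list, then one more query per row in the list."""
    result = {}
    for author_id, name in db.run("SELECT id, name FROM authors"):
        rows = db.run("SELECT title FROM posts WHERE author_id = ?", (author_id,))
        result[name] = [title for (title,) in rows]
    return result


def titles_single_query(db):
    """The same data fetched with a single JOIN."""
    result = {}
    rows = db.run(
        "SELECT a.name, p.title FROM authors a JOIN posts p ON p.author_id = a.id"
    )
    for name, title in rows:
        result.setdefault(name, []).append(title)
    return result
```

[Both functions return the same mapping; only the number of round trips differs, which is the kind of difference a trace makes visible.]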
BRIT: Everything that I've seen as the product manager – everything that I've seen has tended to be specific to the customer or their app, or what their code is actually trying to accomplish. But I'd say, one of the biggest things is just not realizing where you might be blocking or where you might be – not even just blocking, but just being a little bit inefficient in what's happening on the UI thread. I think there's a lot of situations where people think that they're doing just fine, and then they look at the data and they realize, “Okay, I should be using my background work a little bit more efficiently here.” It doesn’t really matter what the background work is, it’s more about just not realizing instances where you could optimize for a better experience from the user’s perspective so that your app is more responsive, the UI is more responsive. The actual specific problems, so far I haven't really seen anything that’s a common, overwhelming trend. It’s been a lot of specific instances of those types of issues, I would say. BEN: So how would you embed it in your application? Is this like a CocoaPod, or is it a framework? BRIT: It’s a framework SDK; you just download it and then you'll drop a couple of lines of code in, you'll put the app token in. It’s pretty straightforward; it’s just like three or four steps, and then you're good to go, and then it will start reporting as soon as a session begins, as soon as your app is up and running. JAIM: How do you do the monitoring on, say, viewController, viewDidLoad – how is that done? Do you know? BRIT: I do. We use something called Objective-C methods [inaudible] and some other techniques, some logic to –. Basically, you can think of it as we’re watching the code, so we’re in there, we’re running inside your app and we’re watching for common things that we are specifically looking for. 
As soon as we see that we will say, “Okay, start tracing.” We will start watching that code, and then we have logic that kinda determines when we wanna stop watching the code, and then we turn that into, we parse that into some data that we can send up to New Relic to eventually display to you in a meaningful way. It’s just kinda in there watching what's happening and to do that we have to know specifically what we’re looking for in order to find a good starting point and ending point to trace things. We also will be in there watching what's going on in the networking. Can’t talk to you specifically about how that’s implemented. CHUCK: I'm a little bit curious. I've used New Relic for my Ruby on Rails applications – not so subtle plug, if you need backend work, call me. What I am curious about is, so you collect all this data. I'm assuming you have all the nice graphs and stuff like you do for the backend systems – how do you go about looking at that data and evaluating it to identify your problem areas? BRIT: We basically will show you trends in key metrics like error rate or response time, or interaction time. Along with that you can set alerts so if you have a specific API that you're concerned about, you might set an alert that’s like, ‘If the error rate goes above 5%, I wanna know.’ You can take a passive approach, and then the monitoring tool basically tells you, ‘Hey, we think you need to come and take a look.’ At that point, you would get an alert and you would say, “Okay, I'm going to go and take a look.” You would see that exact API called out and you could drill in and see what's going on. Same kind of idea for interaction traces: we aggregate the trend in how long typical interactions tend to take and where your code is spending time, and we’ll break that out. So it’s pretty straightforward when you go in; you will know pretty much instantaneously, “Wow, that is abnormal. That’s spiking. 
That’s high” or “Everything looks normal.” The idea being that we just wanna show you at a high level what's going on so that you can clearly note when something is spiking or something is just going crazy, and you can then, at that point, troubleshoot by going into more detail. I think that answers your question, hopefully. CHUCK: Yeah, I think it does. I mean, I'm pretty familiar with the tool and I'm assuming that it works the same way for both apps, more or less. I mean, being able to see the trend data, having it identify your slowest or least efficient processes and things like that – they're just really handy ways of going, “Okay, if this is the slowest or if this is the most common slow-down, then how do I solve this?” Or do I need to solve it? Because sometimes it’s the slowest, but it’s still fast enough or efficient enough – just making those judgment calls. One other thing I really have liked about New Relic is that you can set your own thresholds, so you can say this kind of performance is bad and this kind of performance is good. BRIT: Yeah, exactly. Because it is specific, I think, for every app and for – depending on what you're trying to accomplish – even to the point of, you're looking at a trace and it’s telling you what your CPU usage is but I think you need that knowledge of your app and what it’s doing in that particular instance to know whether that’s appropriate or not, whether you're hogging the CPU or whether that’s what your performance should be given the amount of work that you're doing. Yeah, I think there's a level of developer judgment that has to happen when you parse this kind of information, but we try and do the best we can to show overall, here’s the trend and kinda call out spikes or call out abnormal instances of things, so that you can get to the problem faster. CHUCK: Now one other thing I seem to remember about New Relic is that it’s a good price point for some people but not for others. 
How do you go about evaluating a monitoring solution like New Relic to decide what is the best fit for what you have given your budget or other constraints? BRIT: Well, we provide a light plan that’s free that you can use for as long as you want to try out some of the functionality. We also provide an enterprise trial, so for 30 days you can try out all the features for free and sort of evaluate whether or not you think it’s something that you wanna invest in and have it included in your product or not. We do provide a standard plan, so we provide a plan that’s for non-enterprise. We’re kinda targeting individual app developers or small app shops, because we try to provide a price point that we think will work for no matter how large or small your team is. CHUCK: What kind of features and what kind of integrations should a good monitoring solution have? BRIT: I think one of the most important things is that it should be easy to incorporate into your code. A lot of monitoring tools that are out there require quite a bit of work to get up and running, so I think it’s important, first of all, that you have something that is not a huge overhead for the developers to actually get it up in reporting data. In terms of what it reports from a mobile app, I think the big gap right now that we see and that we try to fill is what's happening that could cause your user to have a poor experience that isn’t a crash. Anything other than a crash, there are great crash-reporting tools out there already in the market, so if you're looking to get serious about performance, there's a lot more. In that category, I would say, anything that has a pretty comprehensive network monitoring and anything that has code device-level visibility – so anything that’s tracing the common activities that are happening in your code and providing some insight and some standard metrics around that is what you would wanna look for if you're looking to get into real-time performance. 
BEN: On that note, if you're looking to get a sense of your network performance, my approach is, just most of the time – we use New Relic on the server – is just to watch API response times from the server end. Why would you perhaps also want it on the client end? BRIT: There are a couple of reasons. First of all, network failure is something that your server will never see because basically the connection never made it through. The other reason is because when you look at it from the perspective of geography or carrier, you can see where problems are isolated in your mobile network. We’ve had customers who had a sports app that was around a certain event, like maybe the Olympics or something like that, so there’s a huge spike in usage in a geographic region. What would frequently happen with network monitoring is depending on where the call is coming from, the type of network, what carrier they're on, your users can be having a really good experience or a really bad experience. So you need to look at the networking from that side as well as how your server actually handled the response. I think both sides of it are very important, but you're just going to learn different things from watching the perspective of the mobile device or the mobile user. BEN: Okay. Yeah, I think one other thing that just comes to mind is sometimes you're just returning boatloads of JSON and the client then has to go and parse it – and who knows what else they're doing during the parsing? Maybe that’s just another cause for concern in terms of, I'd fire this network request and even though it took 300 ms to respond, they're still unpacking all of that data they have. BRIT: Yeah, absolutely. Yup, that is a really good one as well. CHUCK: Are there things that New Relic doesn’t do that there are products out there that will cover those types of monitoring? 
BRIT: Yeah, we don’t have crash reporting, so that would be one thing where people use our tool in conjunction with a crash reporting tool. I think –. CHUCK: Are there one or two out there that you like? BRIT: We usually recommend Crashlytics – it’s a pretty good solution. PETE: Is there a reason why you guys don’t do crash reports? It seems like that would be convenient if someone – they’ve already dropped the agent in and they could get crash reports in as well. BRIT: Yeah, it’s definitely something that we’re looking into. I think we just wanted to start by – the [inaudible] about a year. We wanted to try and start with the area of the performance monitoring world where we didn’t feel there was adequate coverage, and so that’s sort of just where we started – the real-time performance that is not necessarily the worst-case scenario but definitely impacts user reviews. PETE: Yeah, makes sense. Do you intercept or do you have a feature for tracing logs? I know you can do custom metrics, but if I wanted to, say, as a developer I want to log an error when something non-fatal occurs or something –. BRIT: Not at this time. We don’t have a logging solution right now. PETE: I actually looked the other day and I couldn’t find a SaaS log aggregation thing for iOS, which is kind of surprising because you would think that there would be a need for that. BEN: I've considered building one on numerous occasions. [Chuckles] PETE: Yeah, me too. BEN: But we’ve actually done some remote logs before, just like one-off solutions – nothing reusable. 
But with CocoaLumberjack, it’s very much more like a log4net, log4j style logging tool where you can have lots of event sources and lots of event sinks, and you can control log levels and you could say, “Okay, this type of event, I want to go to this sink” and that sink happens to be a remote logging thing, and you [inaudible] buffered, kinda like the performance data you were talking about where you say, ‘I'm going to wait until I get 20 log lines and then I'm going to zip them up and send them’ or whatever. The problem with that is logs are often super – like you can generate a whole lot of data in logs. It’s funny, as we’re talking I'm actually looking at server logs right now to troubleshoot an issue and we have well over a million log lines in our Papertrail server. CHUCK: Oh, wow. BEN: Actually, sorry. All systems – 20 million lines. [Chuckling] It’s just a lot of data and this is truncated. I think they keep it for maybe a month or two months, so it’s just a lot of data that you wanna be able to turn on and off. If a savvy individual out there wants to build the dream remote logging tool, my wish list would be to have the device phone home and say, ‘what is my log level’ every so often and then it could log conditionally. That way, if something happens I could sort of flip a switch on the server and then just watch the events start trickling in and I could turn it off later so I'm not using all that processing power and bandwidth on the phone just all the time, which I think would be kinda cool. The downside is when you decide that there's a problem and you wanna turn on logging later, oftentimes you will have lost what you were trying to capture, so it’s a double-edged sword. But certainly, you can build something like that with CocoaLumberjack and I don’t see why you couldn’t use something like Papertrail to collect the logs. 
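[Editor's note: Ben's wish list (buffer log lines, ship them in batches, and let the server decide the log level) can be sketched with a standard logging library. The example below uses Python's `logging` module rather than CocoaLumberjack or Papertrail, whose APIs are not shown in this episode. `BufferedRemoteHandler`, `send_batch` and `apply_remote_level` are hypothetical names; the network upload and the periodic "what is my log level" request are left to the caller.]

```python
import logging

# Level names a server might hand back; anything else falls back to WARNING.
_LEVELS = {
    "DEBUG": logging.DEBUG,
    "INFO": logging.INFO,
    "WARNING": logging.WARNING,
    "ERROR": logging.ERROR,
}


class BufferedRemoteHandler(logging.Handler):
    """Collects formatted log lines and hands them off in batches."""

    def __init__(self, send_batch, batch_size=20):
        super().__init__()
        self.send_batch = send_batch  # hypothetical uploader: callable(list of str)
        self.batch_size = batch_size
        self._pending = []

    def emit(self, record):
        self._pending.append(self.format(record))
        # Ship a batch once enough lines have accumulated.
        if len(self._pending) >= self.batch_size:
            self.flush()

    def flush(self):
        if self._pending:
            batch, self._pending = self._pending, []
            self.send_batch(batch)


def apply_remote_level(logger, level_name):
    """Apply the level the server returned when the device 'phoned home'."""
    logger.setLevel(_LEVELS.get(level_name.upper(), logging.WARNING))
```

[Usage would look like attaching the handler to a logger, calling `apply_remote_level(logger, "debug")` when the server flips the switch, and calling it again with a quieter level later. As Ben notes, anything logged below the active level before the switch is lost.]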
PETE: Yeah, there's a bunch of backend tools out there that you can [inaudible] in – Loggly, or Papertrail, or Splunk if you feel like dropping 5 bazillion dollars a day. [Laughter] BRIT: Yeah, I think –. CHUCK: That was bazillion with a “buh-za”. PETE: Yup. BRIT: The trick is exactly what you were describing of being lightweight with how much data you were actually collecting and having to send over the network because that’s the real danger with any type of monitoring or logging solution. PETE: Yeah. It’s tricky as well because, yeah, if you're using CocoaLumberjack or something like that then you’ve got log levels, but [inaudible] iOS developer is just using NSLog, and NSLog is just like all or nothing, right? It’s like I'm logging it or I'm not, so [crosstalk]. BEN: And I guess this is a good chance to get on the [inaudible] and say, if you're NSLogging stuff, somebody could just plug in their iPhone and look at the Xcode console. They're going to see that stuff and you may be revealing a little too much information about how your app works and API, or if you ever log anything secretive like maybe an auth token or something, like you're exposing yourself to more than just developers. It’s pretty easy to crack that open and look at it, and not only that, NSLog is really slow. If you're logging inside of a loop – one of the Apple engineers at dub dub mentioned the reason why it’s slow is they have a NSGregorianCalendar [inaudible] in it for every single line. And so they can put the timestamp in there. I'm not really sure why they don’t just have a static date formatter or something like that, but anyway, there's some sort of date object or calendar object that gets initialized for every log call. If you're logging in a tight loop, it will affect performance negatively. I like to hit the bare minimum – just have my own funky log or whatever macro that is just defined out in ad hoc or production builds. 
That way you get them in development and you don’t have to worry about them in the other builds. BRIT: Yeah, I feel like that’s really – I think what this kinda gets back to is the fact that there are two sides to performance monitoring. There's basically this side where you're anticipating and the side where you're putting in place some monitoring for the things that you couldn’t anticipate, and so we focused on the latter thus far, but both are important, I think. PETE: Yeah, you’ve gotta kinda save enough data so that when something’s happened you can retroactively drill in and see what's causing it, I guess. BRIT: Yeah. PETE: Is there a – sorry, this is not related to that, but just because I'm poking around on New Relic. I guess I'm answering my own questions [inaudible]. Is there an easy way [inaudible] to kind of mark when a new app went out? If I'm looking at trends and I wanna say – so I know it’s quite easy for New Relic to track server-side deployments or events that happened, like we just rolled out a new version of the app and, oh look, performance is getting worse. Is there some magical thing where the iOS agent will notice that the version number’s changed and kind of posted it up to the server or something like that? BRIT: Yes. Everything is version-specific data, so we will parse out your information based on which version of your app it is; it is really straightforward to see. As soon as you deploy the latest version, you're going to start getting data for that version but you'll also still be able to go back and see who’s running my older versions and that data will still report as well. PETE: I can imagine that being really useful if you want an end-of-life, an old API or like some old functionality but you don’t know what percent of your users are still using version 1.0 of your app. BRIT: Yeah, definitely. 
To just know what the usage is is something that you can just see in our tool in addition to all the performance-level information you can just see – who’s using it, where are they, how many sessions are reporting, how many users are using my app at any particular point in time. PETE: One of the challenges I always have with all of these tools is they kind of like, the sweet spot for one tool bleeds into the sweet spot for another tool. For example, with New Relic the sweet spot is obviously performance monitoring but then there's also this option of monitoring user behavior and user analytics. Have you got any insight on where to draw the line there? When should I stop using New Relic to get insight into this stuff and start using Google Analytics or one or the other mobile-specific analytics tool? BRIT: I think they serve different purposes, so I think it really – it depends on what you're trying to accomplish, but I think –. Obviously I'm going to recommend that you always have real-time performance monitoring otherwise, you won’t know what's going wrong, but I think there's good situations for saying, “I'm going to use this particular tool while working on trying to AB test. Is this screen performing the way I want? Are people tapping on the button that I want them to tap on and at what rate? How long are they spending on a certain view?” I think those tools are really coming at things not so much from a performance perspective, but from learning how your user behaves perspective. I think that the reason that it’s kind of fragmented is because we see things as solving different problems for people and having sort of a different area of expertise and also a different mindset coming at the problem. Is it about learning behavior? 
Is it about understanding retention rates and how many people click on the purchase button, versus ‘I just want to make sure that my app is responsive and performing from the developer’s perspective and that I'm doing everything I can to build something great’? I don’t think that one tool can necessarily solve both those things perfectly. I think that’s why there are so many different things out there, but I definitely see what you're saying that we tend to bleed into each other a little bit by trying to broaden our offering and provide a little bit of a taste of those other types of special monitoring. It can be a little confusing. PETE: I mean, it makes a lot of sense to have it in the same tool because you can aggregate on the same thing, right? I can say, with your AB testing example, for example, if I can segment my performance metrics into everyone using feature A versus feature B or whatever then, that’s actually really useful because I can see some correlation there. But if that data is in two different tools, it’s really hard for me to – for example, say, is there a correlation between where’s someone located geographically or the performance of the network when they're using the device and their propensity to click the purchase button or doing app purchases or something. Ideally, I want all of that data smushed together so that I can slice and dice it, and I think that’s part of the reason why people want it all in one place. It’s tough because then you become the jack of all trades and the master of none, right? You're trying to solve everyone’s problem and maybe not solving any specific problem really well. BRIT: No, but that’s a really good point. This whole podcast has been great for feature requests. [Laughter] I'm jotting all this down, no. PETE: Don’t steal Ben’s logging idea. He’s going to be a millionaire. 
BEN: I don’t know if it’s a sign of getting old but after a while, you're just like, “I'm just going to let these ideas go because I just want somebody to build it.” CHUCK: You don’t want to be the one to build it, huh? BEN: It’s not that – it would be fun, but when am I ever going to do that? I'd rather just let somebody else do it. I actually think there's money in that idea, so. PETE: I do, too. I was actually thinking the other day – you know, I asked, “Is there something out there?” because I was thinking the other day, “There isn’t anything out there? Why isn’t there – I'm sure someone would pay me money to build that for them.” But that’s the problem, right? They wouldn’t pay me to build it; I'd have to build it first and then find out if they'd pay me money. BEN: Yeah, now that I think about it, I actually did this idea at a hackathon once, me and a buddy of mine. We got it working, sort of, but you know – the first 80% was done, and the remaining 80% [laughter] was a lot more time-consuming. PETE: Very [inaudible]. BEN: But I did have my own log statement when the app launches and I ran through a loop and collected a bunch of logs and I buffered them and sent them over to a server. He was using Goliath on the server, and basically just a really thin evented frontend that would publish things to – I can’t remember what queue system we were using, but that way we could have some elasticity if it takes a while to log and then storing the actual log save in some other DB. But that was two or three hours of furious coding. It sort of worked. [Chuckling] PETE: Have you got any amazing numbers on how many events you're monitoring every second or something? I kinda imagine New Relic has a pretty beefy amount of data going through it. BRIT: Just on the mobile product alone we capture hundreds of millions of events, so it’s a lot of data and definitely in the hundreds of millions. CHUCK: Do you ever throw it away? 
I mean, I can see that data from one version to the next would be handy, or data from this month to last month, but if you have five-year-old data I probably don’t care about it. Do you [crosstalk] anyway? BRIT: No, we don’t. Our product has – depending on the level of plan that you're on, you have a different level of data retention. Our maximum right now is three months of data retention for that same reason: we’re trying to help you in the moment, what's going on, help you troubleshoot. It’s a tradeoff, obviously, and if we could keep everything forever that’d be great, but that’s just too much information and it has to be stored somewhere. CHUCK: Well, and the value declines so quickly. BRIT: Yeah, exactly. PETE: There was a really cool presentation at Strange Loop last year that was the guy – I can’t remember his name; he works at Stripe now – Avi Bryant or something like that? He was talking about the mathematics of capturing this data basically and the tricks you can use to capture the aggregate, but also be able to modify the aggregate as new data is coming in so you don’t have to store everything in the past, but you can still update the aggregate information as new data comes in. I think he used the word ‘monad’ several times, which makes me smile, because it always makes me feel like I don’t know what the hell is going on. [Chuckling] I really wanna find this presentation now to put a link in the show notes but I can't find it, maybe they deleted it. BRIT: Yeah, it sounds really interesting. PETE: There's also a really good presentation at RailsConf this year from Tom Dale and Yehuda Katz who are building – I guess they're pretty much building a competitor to New Relic called Skylight, and they're focused on Rails apps. 
Very focused on Rails performance monitoring but they talked a lot about their infrastructure and their architecture for dealing with this crazy huge amount of data that’s coming in really fast and you don’t wanna impact the user, but you wanna capture it, but you wanna be able to process it and do interesting things on it – it’s a really tricky problem to solve. It’s an interesting challenge. BRIT: Yeah, and ideally one would do as much work as possible on the server, which is where you have processing power, so you wanna keep things pretty simple and minimal when it’s on the device level. CHUCK: I have one more question. I'm actually signed into New Relic at this point right now and I see all of my clients’ projects in here. I've got – I don’t even know – 50 or so applications that are being monitored in here. These are web applications. If I wanna monitor my own applications, do I have to set up a different account, or can I segregate them? BRIT: For our enterprise customers, we can do interesting things with sub-accounts and you can have lots of different sets of accounts with apps in them, but some people also just have a personal account and a work account; they’ll just log in separately. You can do it either way, but if you have an enterprise level account, then I can help you get set up with separate sub-accounts. CHUCK: What, you think I'm made of money? [Chuckling] You said ‘enterprise.’ Anyway, that’s interesting. This account, they added my email to it and stuff, it’s pretty well just tied into their system then? BRIT: Yeah. CHUCK: Okay. Good to know! BRIT: Yeah, it’s awesome to hear that you’ve been using it and hopefully finding value. CHUCK: Do you guys have any other questions about monitoring or New Relic for Brit? BEN: Like you said, I didn’t think there was any alternative in this particular space. 
I know there's lots for analytics and crash reporting, some for AB testing and user behavior, but not a whole lot that I've – in fact, I'm not aware of any for performance monitoring so it’s definitely a good space to be in. PETE: That’s a good question actually for Brit: what other alternatives are out there apart from – actually what I’d be most interested in is if there's something open source that people can use? New Relic doesn’t open source any of the agent stuff, right? BRIT: Our mobile agents are not open source. There are a couple of tools out there that kind of do some of the same stuff that we’re doing. The one that I hear about sometimes is Crittercism, but they actually – they're crash reporting and I think they may do some monitoring with the network as well. But yeah, it’s a relatively new space, and it’s kind of got an interesting history because web APM – or application performance monitoring – is something that people in the web world and server monitoring are pretty familiar with, and a lot of teams will have their dev ops people who are focused specifically on these types of issues. But once mobile development became big, we started to mature as an industry. It became more and more necessary to bring some of these principles and concepts over, and how do you translate the types of performance problems that you should care about that we’re familiar with on the web side to the mobile side? That’s actually a pretty complicated question, because mobile apps – just the anatomy of them – are extremely different and most of the time they're interacting with third parties, they're interacting with web apps, so there's just a lot of moving pieces and different types of problems. It’s just now starting to take off as a space, and I think we’ll see more work in mobile application performance management as we go forward. But yeah, the offerings right now, I'm not familiar with too many alternatives. PETE: That’s interesting. 
It’s an interesting point because I didn’t know that much about the mobile stuff that New Relic did before this podcast and I just assumed it was like New Relic on the server, like tracking network requests or whatever, and I didn’t really think through the –. Actually, the performance, the things you care about on a mobile app are very, very different because it’s only one user using it so you don’t care about will it scale, but you do care about what those experiences are like for that individual user, I guess. BRIT: Yeah, exactly. And trying to find any patterns or just understand where [inaudible] performances you might be isolated to specific types of hardware or only a certain version of the operating system or what have you. PETE: I imagine that’s even more useful on Android because there's so much stuff floating around them. BRIT: Yeah, absolutely. It’s a big deal on Android with the fragmentation of that market. To just be able to say, “Okay, well which devices are causing me the most problems?” PETE: I had an interesting statistic from someone the other day that works for a company that knows a lot about their user’s platforms and they said that the most popular Android phone and OS combination – the most popular – is 3% market share, and it just goes down from there. It’s crazy, right? How do you decide what to build for when the most popular thing has 3%? BRIT: Yeah, it’s a crazy – it’s been interesting watching the change in the global market. I remember I was working on Mac apps at the time that the iPhone App Store opened, and I remember when there was just a few hundred apps in there, and it’s just been insane watching things blow up. The change in how competitive you have to be and how business-focused you have to be in order to make an app successful has been really dramatic. 
PETE: Are you – I guess, legally or ethically – allowed to look at the really super aggregated data to report on trends like the users are using which phones in which market and stuff like that because that would be a really interesting thing to find out. Because you have so much data flow through the system you could share that with the community, I guess. BRIT: Yeah, we are to a certain extent, and we sometimes publish our infographics. If you go on the New Relic blog, we’ve done several series of infographics of some of the things that we’ve learned from watching that type of information across our whole user base, anything that’s not sensitive. PETE: Cool. Everyone loves infographics. CHUCK: Oh yeah. BRIT: Yeah, this is a way to parse data into something that’s a little bit more friendly to look at than just a bunch of numbers. CHUCK: Alright, well let’s go ahead and do the picks. Jaim, do you wanna start us off with picks? JAIM: Sure, I've got a few picks today. One is a conference that you probably have never heard of. It’s kinda underground – it’s called WWDC. CHUCK: Never heard of it. [Chuckling] JAIM: Anyone heard of that? I don’t know. I’ll be there. I think Ben’s going to be there. Actually [inaudible] WWDC, but I’ll be around, all [inaudible] it, Pete? There's another conference coming up this summer if you didn’t get tickets for dub dub and you don’t wanna spend $300 for a hotel, and Pete’s garage is full, so you're out of luck there. [Chuckling] I’ll be speaking at That Conference, which is not very near San Francisco at all. It’s in Wisconsin – Wisconsin Dells. If you remember the TV show “That ‘70s Show” – based in Wisconsin. Actually a pretty decent mobile track; we’ve got some other people around the iPhreaks family who will be speaking there, so it’s pretty good. And it’s based in a water park. PETE: Yeah! JAIM: So, water slides. With your iPhone. Will be fantastic. 
PETE: That’s like the sister conference to – or the sister water park to the one that they do CodeMash in, the other water park in Ohio. Anyway, sorry. I'm interrupting your picks. JAIM: Alright, so conferences with water parks, +1. PETE: It’s a thing; it’s a thing. JAIM: That’s a pick. So [inaudible] pretty good – August 11 to 13th. Tickets are on sale and you get to hang out with me if you want. Those are my picks. CHUCK: Yeah, and it says that – I just went to the website. The call for proposals – it says it’s still open? But then it says it’s only open through the 14th of April, so I'm confused. Anyway, Ben, what are your picks? BEN: Just a couple of picks today. I use HipChat and love it at work, however, my client also used HipChat and we’ve been wanting to sort of integrate our teams a little bit more, and so I convinced them to switch to Slack because HipChat doesn’t really have a way to switch accounts without – if I'm using the Mac app, I would have to use the web app to join theirs so it would be really awkward. And then when you get, you have your iOS app where you get push notifications on, that would only be tied to one account. I convinced them to switch to Slack and I've been slowly trying out Slack. I really like it; it’s pretty awesome. I can't necessarily say it’s vastly better than HipChat, but they do have a free tier, so if you're on a small team you could probably get by just with the free tier. Second pick is Raspberry Pi, which I know has been picked in the past, I think. We just picked one up; I found a kit on Amazon for $60-$62, and that is the cheapest way to get a dashboard up for performance stuff on a second monitor, so I was kinda tired. We used Librato and we pumped a bunch of stats from our running service through StatsD and into Librato. 
So we’d get technical metrics, like amount of memory and CPU percentage on our servers to queue sizes and how many events are flowing through our system to business metrics, like how many tracks are played or how many people have signed up today and things like that. Now that that’s on a little Raspberry Pi, it just loads up one webpage and puts it on a monitor, so it’s pretty awesome. Those are my picks. CHUCK: Very nice. Pete, what are your picks? PETE: I have a ridiculously epicly long list of picks this week, [chuckles] so I'm going to go really quick. This WWDC conference is coming up. It’s the What Would Developers Create Conference – that’s what my wife calls it. Since I live in San Francisco, I did know some places to go that aren’t within two feet of the Moscone Center, so I'm going to run through a bunch of places and I’ll have links. City Beer Store is a short walk and has an amazing selection of beer both in bottles and on tap. It’s a really fun place to go down in SoMa. If you don’t wanna walk too far from Moscone, Amber Indian is right next to Tropisueño, which I picked last year and I was told I wasn’t allowed to pick again. Right next to Amber Indian is a place called Beard Papa that does puff pastries filled with cream, which are pretty much amazing, so you should go get those in the afternoon. Ben told me I wasn’t allowed to pick Special Xtra, which is a coffee place nearby, but I'm going to pick it anyway. CHUCK: [Chuckles] PETE: But since I wasn’t allowed to pick that one, I will also pick Epicenter Café, which is a little bit of a – more of a walk, but is also a nice café. One of the things that I want to tell people who are at WWDC is not to spend all their time around Moscone, because it’s actually kind of a pretty crappy part of San Francisco, to be honest. If you walk up to Montgomery Street, there's an awesome bar called House of Shields. 
Right next to that, there's a really nice lunch place that’s literally a hole-in-the-wall lunch place called Sentinel. If you don’t want to eat those mediocre WWDC lunches, then take a stretch, take a walk, enjoy that wonderful San Francisco sunshine – or fog – and go to Sentinel and get some food there. A food truck that you should go to, which is a little bit further than Sentinel, is Curry Up Now. They do Indian burritos wrapped up in tortillas – it’s super yummy. Keeping going, sorry, this is the longest set of picks ever. I'm going to put this in the blog post as well, I think. It’s really easy to get on BART and ride a little bit down into the Mission and actually see a fun part of San Francisco. Once you're down there, you should go to Zeitgeist, which is an awesome beer bar/biker bar. Little Star Pizza has amazing pizza; Fourbarrel Coffee has amazing coffee; and The Monk’s Kettle, which is just a very short walk from 16th street [inaudible] has a really eclectic beer selection of weird Belgian beers. Those are my picks. CHUCK: Cool. JAIM: You giving tours? PETE: Yeah, and I'm available for hire, $400/hour. CHUCK: Segway tours, right? PETE: Yeah, that’s right. CHUCK: Alright. I've got a couple of picks; they're books and one game. The first one is I just read EntreLeadership by Dave Ramsey. I'm a big fan of Dave Ramsey, and it was just a terrific book and really drove home for me some of the things that I need to be doing better in my business. I've also been reading Winning by Jack Welch. He was the CEO of GE for a long time, and his is kind of the same kind of thing that EntreLeadership is. It’s a playbook for running your business, and I've really, really been enjoying that one as well. I've been playing Hearthstone, which is kind of like Magic – if you’ve played Magic, The Gathering or one of those card games like that, Pokemon – it’s kinda like that, except it’s on your computer. That’s been a lot of fun. 
It’s a Blizzard game, and it’s kind of based around characters or character types from Warcraft and stuff. Anyway, those are my picks. Brit, what are your picks? BRIT: Something that I've been having a lot of fun with is this thing called SmartThings. Basically a kit of sensors that you can put throughout your house and then there's a nice iPhone app that you can use to track things. I have played around with these sensors and I've put a sensor on the front door, and when I leave town or when I'm out of town, you can set it so that it’ll notify you if your door were to open or shut, which is important because we had a situation where we had some people who were going to be coming and doing some work, and we had some people checking in on our house. I wanted to know and be notified if someone went in or out, and that worked really well. You can also have a temperature monitor, motion sensors – I played around with both of those, and that’s pretty nice. I'm definitely, definitely having fun with the SmartThings. My other pick is, if you're going to be in San Francisco for dub dub, then please join the WWDCGirls. We are hosting a benefit party on Wednesday, June 4th. It’s going to be held in the evening at the New Relic offices, and we’re going to be raising money for our App Camp for Girls, which, I guess, would be my third pick. It’s a really great organization, not for profit, based here in Portland, and expanding to Seattle this summer, that hosts a one-week camp for middle school-aged girls to learn about and get exposure to building software, and they actually build their own apps, so it’s pretty awesome. But if you can join us on June 4th, we’re going to be hosting a great, really fun party, so hopefully we’ll see some folks there. JAIM: Do you have to be a WWDC girl? BRIT: No, everyone is welcome. JAIM: Everyone? Okay. CHUCK: That sounds really cool. BRIT: It’ll be awesome. 
CHUCK: It’s interesting because most of the programs like that where it’s some girls, or it’s focused around getting more women into the community – there are terrific, terrific things that are going on there, and most of the ones I've seen are open to everybody. They give a little bit of preferential – so if it fills up, they're going to let the women in first, which is fine with me. If you're interested, show up; and if you can help, go volunteer. BRIT: Yeah, absolutely. CHUCK: Alright! Well, let’s go ahead and wrap up the show. Thanks for coming, Brit! BRIT: Thanks so much for having me. CHUCK: And if people want to know a little bit more about New Relic or find out a little bit more about what you're working on, what's the best place to do that? BRIT: I have a personal website, it’s cocoabythefire.com; welcome to check me out on there. Otherwise, just through New Relic’s website you can learn everything we’re doing with mobile monitoring, and definitely check out our blog as well. CHUCK: Alright. Thanks for coming, and thanks for listening, guys. We’ll catch you all next week!  [Hosting and bandwidth provided by the Blue Box Group. Check them out at BlueBox.net.] [Bandwidth for this segment is provided by CacheFly, the world’s fastest CDN. Deliver your content fast with CacheFly. Visit cachefly.com to learn more]
