iPhreaks Core Data with Saul Mora
01:22 - Core Data
07:50 - Stores and Contexts
- Persistent Store Coordinator
- Core Data Editor
- Creating a CoreData Model in Code | Cocoanetics
21:17 - Faulting and Fetching
- The Law of Leaky Abstractions
- -com.apple.CoreData.SQLDebug 1
- Base 2
27:48 - Is Core Data the right tool for the job?
29:46 - Managed Object Context
- Core Data and Threads, Without the Headache | Cocoa Is My Girlfriend
- Core Data: Data Storage and Management for iOS, OS X, and iCloud by Marcus S. Zarra
38:22 - Importing Data
40:08 - Predicates
SAUL: I like your style.
CHUCK: Hey everybody and welcome to Episode 6 of iPhreaks! This week on our panel we have Rod Schmidt.
ROD: Hello from Salt Lake City!
CHUCK: We also have Pete Hodgson.
PETE: Good morning from San Francisco!
CHUCK: We also have Ben Scheirman.
BEN: Hello from Houston, Texas!
CHUCK: I'm Charles Max Wood from DevChat.tv. And we have a special guest this week, that is Saul Mora!
SAUL: Hello from Denver!
CHUCK: Denver? I thought you said Fort Collins? Is that not the same thing?
SAUL: [Laughs] No, that's where the beer is. Okay. [Laughs]
CHUCK: Oh, I see.
SAUL: Right. But yeah, that wouldn't be so bad to go and get some beer now.
CHUCK: If you go and get too much beer, is it a one-way trip [inaudible]?
SAUL: [Chuckles] Yeah, well, I have to take some guests with me.
CHUCK: Oh, here you go.
SAUL: But no...yeah, that's where the New Belgium Brewery is, so I take guests over there quite often. So for anybody who comes and visits me in Denver, definitely head on up there.
CHUCK: Well I don't drink alcohol, but I'm going to be in Denver this weekend.
CHUCK: Maybe I'll come and shake your hand, buy you lunch, or something.
SAUL: Yeah! Just let me know!
ROD: You get to be the driver. [Laughter]
SAUL: There you go!
CHUCK: I don't know what my wife would say about that.
SAUL: Oh, there are plenty of breweries out here to visit. So, we can visit them anywhere.
CHUCK: Awesome! Well this week, we're going to be talking about CoreData. Or, do you call it Core Data?
SAUL: [Laughs] I thought that was an English thing; Pete might know.
PETE: I'll refer to it as Core Data!
PETE: It's the French pronunciation. I still say Data; it's one of the few English things that I still say in the English way [inaudible].
SAUL: So do you say Beta or Beta?
PETE: Oh, that's a good (question). I think I say Beta now just because it's like a -- I was going to say just because it's a software thing, but Beta was a software thing. So, I don't know.
SAUL: We have gotten to him! Great!
PETE: Yeah. My cover is blown.
PETE: Actually, I'm native Texan.
CHUCK: Yeah, you've seen that? Now you only sound cool when you're talking about things other than computers.
CHUCK: Alright. So CoreData, it's a way of storing data on your iOS device. Is that correct?
SAUL: It is. CoreData is an Object Graph Persistence framework. So yeah, it's not really like the classic ORM thing that maybe Java or C# people might be used to with tools like Hibernate or NHibernate, things like that. So, it's an interesting framework of classes that let you store data on a disk or anywhere, really.
CHUCK: So, do you use it to interface with your APIs? Or, do you use other libraries for that one?
SAUL: You really don't have CoreData talk to anything else; it's like you tell CoreData what to do and it just kind of stores stuff for you. By other APIs, I'm guessing you might be meaning --
CHUCK: External APIs, yeah...
SAUL: Like some other datastore or a database or database server? CoreData does not connect to any other databases without some significant implementation by somebody. So like, you might be used to doing -- I don't know if you're from Java or something -- but they have like a JDBC connectors interface thing, which is a lot more along the lines of a classic SQL backend kind of thing, where you have a server sitting off somewhere or maybe like a local server on your local machine and it connects through sockets or network pipes or something like that. CoreData is basically running in the same process as your app, and it's really just an interface for your app to store data somewhere. And I keep using the word 'somewhere' because the data can be stored in a number of different formats. One of them, the most common one, is SQLite; I'm sure everybody's familiar with SQLite. CoreData uses that as its backend just because it has got really nice benefits. I'm sure a lot of people are familiar with SQL; if you don't use CoreData and you use iOS, you probably store stuff using FMDB, or some other data persistence mechanism that way. So, SQLite is really nice because it serializes everything to disk and you have this kind of relational database lookup format. What's really nice about this is that when you're debugging your Core Data store and whether or not data made it in, or into the right format or not, you can just open it up in a tool like Base and just read the data straight off, and it's really nice that way. But it's also nice because data stays on the disk longer, and you don't have to have all the stuff in memory so it's very efficient -- it's kind of a balanced efficiency between having all the data in memory versus having it on disk and having the right objects in memory at any particular time.
BEN: Has anyone really used the XML store?
SAUL: I think the XML store is very legacy. Originally, it was presented as a store that you should use during debug mode because it was human readable. Nowadays --
CHUCK: XML is human readable?
SAUL: Right [Chuckles]
PETE: My eyes! My eyes! [Chuck laughs]
SAUL: [Chuckles] Right. Those angle brackets get in your eye, and they won't come out, right?
PETE: Stabbing me! Stabbing me!
SAUL: Right. Yeah, I don't really know how people got along before SQLite; I guess maybe XML was enough. But yeah, the XML store is available; I don't really use it that often, like you're saying. There are two other options, though. There's the binary store, so CoreData has its own custom binary format, which is not human readable at all. So you can't just open this up in a text editor and really see anything useful; you won't be able to see structures or anything, it's just literally a binary format. You just say "Hey, use binary store", and it writes it all and it just keeps track of this in an internal Apple kind of file format. The other one, and I think it also uses the same scheme or a similar one, is the in-memory store, so you can have like an in-memory cache of your data objects there, and just let them live in memory. That's not too recommended on iOS devices, given the memory constraints and things. That's why SQLite is a really good option for data storage on iOS, because everything is still present but is not actively living in memory when you're not using it.
PETE: It sounds like none of these other options are really used, apart from SQLite. What use would there be for using something like any --
SAUL: For all practical purposes, you just want to use the SQLite store, but there may be occasions; I mean sometimes, if you have temporary objects, you might want to have those in an in-memory store. But even then, that might not be the best thing; you could just have a different context and have some temporary objects in there. I wrote an article way back in the day about basically transient entities; it's on Cocoa Is My Girlfriend. There's really just 3 options, like I said: one is to have an in-memory store, the other is just a scratch context. I forgot what the other one was, I have to look at my article, I guess; it's a good thing I wrote these things down. But another option is to not set a store at all for the object, I believe. So when you initialize it, you just don't give it a context and it just kind of lives, hangs around, but doesn't really connect to any persistent store whatsoever.
BEN: It may be a good time to take a broader look at all the pieces involved, like if you were starting from scratch: what is a store, what is a context?
SAUL: Yeah. So, CoreData has a lot of moving pieces, and when you open that default template, when you do File > New Project and you select CoreData as an option, you'll see all the pieces in your app delegate. For one, putting them all in your app delegate is not something I recommend; I'll talk more about that later. But yeah, you can see all the pieces there. And the first piece that you're going to interact with a lot when you're using CoreData is the NSManagedObject. You can think of this thing as basically a row in your database, in your table. This thing is just a key-value store of all your attributes and the values that you're trying to store. This managed object is managed by a managedObjectContext. And the context, you can think of it as, I guess, just a workspace, a storage area for the in-memory objects. So [inaudible], is what handles fetching the objects, saving them, and some of the other overhead in managing what managed objects are in memory at any one time.
BEN: So yeah, is that the Unit of Work pattern?
SAUL: Unit of work implies an algorithm, right?
BEN: No, I think it's just a way to batch things so that you can undo them or isolate saves to that one specific operation. And the alternative would be like say, each one of your objects has a save method on it, in which case you can't really do atomic operations that involve multiple entities. So, it would almost map to a transaction. Is that [inaudible]?
PETE: That's what I was going to say.
SAUL: Right. In your code, you could think of it as more or less a transaction. In the database, I think it doesn't -- I'd have to look at the actual SQL logs that are generated -- but it doesn't explicitly use transactions, so it will just use one statement and kind of let the database manage the transactions. But I could be wrong on that; I haven't looked at the logs in quite a while. But yeah, when you do saves, you do want to do them on your managedObjectContext. And this throws a lot of people off because I think a lot of people want to just have "I have one object, and then I want to save that object right away". They do that on the context; but the thing is, if you have a lot of objects with all of your data updates in a particular context and you save that context, you're going to save any pending changes that context has. So, you need to kind of just be aware of what's changed there, and what needs to be saved. Save can kind of take a long time, but we'll get into that in a little bit here. Let's continue down the rest of the stack; so those are the two objects that you're going to be working with most often: the ManagedObject, and the managedObjectContext. The next one kind of down the stack is the NSPersistentStoreCoordinator. This is the thing that will coordinate the data transfer between your contexts and your persistent stores. This coordinator is actually, it's weird, it's kind of like this locking funnel; it manages threading to a certain extent, so you can have multiple persistent stores connected to a single persistentStoreCoordinator. It handles a lot of the threading and the locking and things that are needed for that to happen. And yeah, it's a nice way to just have really simple access to multiple stores. So, we have the coordinator; and the coordinator knows how to transform the data that you give it via that context. It knows what the storage should look like via the NSManagedObjectModel.
The managedObjectModel correlates directly to what database people might know as a 'schema'. What I really like about the managedObjectModel is the entity designer. Some people might hate that thing, but I find it really handy because what it lets you do is design your object hierarchy visually. And you can add entities, and attributes and relationships, and subclasses and things like this, all visually; and you can kind of get a visual sense for what your object graph looks like. I find that super handy when I'm debugging things, when I'm designing stuff. A lot of times I'll just design my app data in the visual designer first without doing anything just to kind of get a sense of what the data will look like. And just having it visually there, it lets me kind of mentally walk through a lot of things. So I find that pretty handy. And it's pretty straightforward: when you add attributes and things, you can set properties like you would on Query Entities, so you can have a date attribute or a string attribute or a float, things like that. So in code, they look like the objects that you would expect, but they get translated via the coordinator framework to whatever the database, the underlying storage mechanism, wants.
PETE: Are there options to do that with code? Or, do you have to do it graphically?
SAUL: You could do it with code if you are so inclined to keep stabbing yourself in the eye like that.
SAUL: If I could do it in the Visual Editor and it works fine, I will do it that way. That's why I prefer nibs over coded interfaces. That's why I'll take this Visual Object Designer versus writing in code. But you can create everything you see in the visual designer in code; you have to learn some of the meta Core Data objects that describe that. So every time you create an entity, for example, in that Entity Designer, you're actually describing an NSEntityDescription object. That is the thing that is your Meta Entity, and that is basically like Class Objects for managed objects. So, you're telling it what the attributes are, what the relationships are, what types things are, all that stuff. And for each one of those things, there's like an NSRelationshipDescription and NSAttributeDescription; all the pieces in that design are actual classes under the covers. And that designer, it just makes it a lot easier to work with, in my opinion.
PETE: Okay. Because I know that there's some kind of a slightly, I don't know, weird school of thought that people don't like nibs and want to do stuff programmatically, but maybe there's less of a movement for --
SAUL: Yes! The Core Data Editor is really nice now. So, I think people don't like nibs for the reason that they're XML based, but you really don't know what's going on in all that XML jungle; they're hard to merge, they're hard to --
BEN: But, it's human readable!
SAUL: [inaudible] It is - XML is human readable, I'm sorry I forgot that. So surprisingly, the CoreData file that's generated by this editor is actually really readable by actual humans, not these mechanical humans that these other things were talking about.
CHUCK: Evil robots.
SAUL: Right. Or, bots or things, I don't know. But the entity description, or that model file, you could open that up in your text editor; you can actually see what's going on, you could see the entities, their names, their attributes. You can see how they're related. It is very readable; it is a very readable form of XML versus the object dump that is nib files.
CHUCK: So, you said that -- I'm still trying to get my head around some of this -- so the models or the CoreData structures or whatever you want to call them, you said that it's not an ORM in the traditional sense. Is it just kind of a class with an interface to the data layer? Or...I still don't completely understand how that all works.
SAUL: How what works? Let me finish off with the persistent store; maybe this will help clear it up. So we've got kind of a top-down stack, so when you're interfacing with CoreData from your app’s perspective, generally you're going to start with the managed objects. And your, maybe, ViewControllers or some other controllers might have references to the managedObjectContext. And when you initialize your managedObjectContext, you want to give it a persistent store, and you're going to tell it which persistentStoreCoordinator you're working with, and you're going to tell it a model that you want to work with. From there, you're going to add a persistent store, so the store is a separate object. So you kind of have this linear flow of data; it starts at the managed object, then it goes down to the persistent store through those other pieces. The persistent store can be any storage format that CoreData supports. You can also have custom storage formats, which we can talk about in a little bit here. But by default, like we were talking about, you have the XML, the binary store, the in-memory store, and the SQLite store out of the box from CoreData. And basically, what you want to do is, when you initialize your persistent store coordinator, you're going to tell it "Hey, add this persistent store at this URL; it's going to be this filename. And it is going to be of this type, and it has certain options on it, and tell me if there are any errors", and that's all in a single method on the persistentStoreCoordinator. (There's too many of these class names, they kind of start running together.) But yeah, when you interface with it, you're going to really just talk to it through the context and the objects themselves; that's really your interface. That's why I really want people to understand that it’s an object graph; you’re dealing with managed objects, you’re not dealing with managed rows.
BEN: Let's say we had a hypothetical model with like books and authors, so inside our Core Data Editor, we would drag in an entity that is a book and it's got a title and an ISBN, and then drag in an author. And since books can have multiple authors and authors can have multiple books, then you would set a to-many relationship on both sides, let's say, and CoreData will make you refer back to the parent; is that correct in all cases? So if you have a relationship to a child object, the child has to have a relation back to the parent?
SAUL: Right. This is one of the things I was thinking about as far as trying to explain the difference between like normal objects and managed objects. In CoreData, like you're saying, when you define relationships between entities, CoreData really wants bidirectional relationships; that is, entity 1 points to entity 2, and entity 2 points back to entity 1. This reinforces the concept of a graph. If you think back to some of your college days and maybe some networking theory, or just remember network graphs and things, CoreData is really just an object graph in the exact same sense. So you could do, at least, shortest-distance-between-objects algorithms with CoreData, and it would manage all the memory and stuff for you. So, CoreData is a lot different than an ORM, at least in that regard. But the thing is, again getting back to the difference between regular objects and managed objects, in regular objects, you don't have to have bidirectional relationships. When you have object A having a reference to object B, object B doesn't necessarily point back to object A; if it did, you'd have a memory cycle. So, that would be bad. But in CoreData, you want that because you want to be able to traverse a graph. The idea with CoreData being a graph is, you can get to any data that's related to any other piece of data from any entry point. So you can fetch any of the objects from CoreData, and then you can traverse the relationship graph to get to another related set of data, which might be possible in ORMs, but I don't think that's really the key.
CHUCK: Yeah, I don't think that's what they focus on.
BEN: I think that it's definitely -- I have a pretty extensive background with NHibernate -- and you can definitely do that. However, it has implications for your database performance. And I think the same is true for CoreData; however, they want you to forget that you're running on a database because you might not be, and so it wants to manage that for you. So I guess this is a good time to get into faulting and what that means.
SAUL: Right. So faulting, like I said, CoreData handles a lot of the memory for you. So with regular objects, now we have the wonderful advent of ARC and we don't have to do retains and releases. But with CoreData, it really does a lot of memory management for you as well. And it has events and things, so faulting means that your object has an ID reference; you kind of have a shell object available to use, so you have a book entity. When it's faulted in, it's in memory; when it's a fault, that means the data is still in the store, but you have the shell object kind of laying around. So when you have the shell object, all you have is a really lightweight managed object and all it has is a ManagedObjectID. This ID is basically a unique record identifier in your store. And when you access any property on that particular object, CoreData will then go and fault all of the data from the store into memory, and then you have instant access. And what's nice about it, and what I really like about it, is that you didn't have to do anything to trigger that memory loading. So even though you had to fetch the object, if you fetched it as just shell objects, you don't have to trigger the framework to go and populate all that data; it does that for you when you actually need it. It's a really nice, convenient way to lazy-load all of your data.
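A minimal sketch of the faulting idea, in plain Python rather than Core Data itself: a shell object holds only an ID and pulls its data from the store the first time a stored attribute is touched. The `Store` and `Fault` classes and the sample record are hypothetical names made up for illustration; Core Data's real mechanism is considerably more involved.

```python
class Store:
    """Stands in for a persistent store; counts how often it is actually read."""

    def __init__(self, rows):
        self._rows = rows
        self.loads = 0

    def load(self, object_id):
        self.loads += 1
        return dict(self._rows[object_id])


class Fault:
    """Shell object: knows only its ID until a stored attribute is touched."""

    def __init__(self, object_id, store):
        self.object_id = object_id
        self._store = store
        self._data = None  # None means the data has not been loaded yet

    @property
    def is_fault(self):
        return self._data is None

    def __getattr__(self, name):
        # Python calls __getattr__ only when normal lookup fails, so this
        # runs for stored attributes such as "title", not for object_id.
        if name.startswith("_"):
            raise AttributeError(name)
        if self._data is None:
            self._data = self._store.load(self.object_id)  # fire the fault
        try:
            return self._data[name]
        except KeyError:
            raise AttributeError(name) from None


store = Store({1: {"title": "The Odyssey"}})
book = Fault(1, store)
loads_before = store.loads   # the store has not been read yet
title = book.title           # first access loads the record
title_again = book.title     # later accesses are served from memory
```

The caller never asks for the load explicitly; touching `book.title` is enough, which is the convenience being described.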
PETE: Is there a way to kind of do an eager, I guess it would be eager faulting, to kind of avoid N+1 type stuff?
SAUL: There is another object in the CoreData framework called "NSFetchRequest". This is what handles all of the requests that you want to get out of CoreData. So you have a book entity, and you want to have all books that start with the word, I don't know, Odyssey or something. So you have this request and you have this kind of clause that you're looking up, so you create a request and you set your predicate and you set a whole bunch of other things that you want to happen on this request. And what you can also set on this request is this property that I think is returnsObjectsAsFaults. I believe you can set that to 'No' -- you should definitely read the docs, I don't have it in front of me. But there's a message in there that just says "Return these as faulted or non-faulted", and you just set the proper Boolean attribute. That will avoid the lazy-loading feature if you really want to do that.
PETE: So it's kind of one of those "Don't think about it as a database until you have to remember that it is a database" kind of things.
SAUL: Right. There will be times where the type of persistent store that you're using, if you understand what's going on there, it will help you optimize some of the performance. And that's only going so far as making sure your predicates are written the right way, basically in the right order. Some of the WWDC videos actually explain a lot of the things that you need to think about with the predicates. You really shouldn't have to know the underlying store as far as optimizing your CoreData usage, but sometimes it helps.
PETE: I guess this N+1 problem is probably less of an issue when the SQLite database is in the same process; it's not like you're making N+1 network requests, you're just maybe hitting the disk a few more times or just --
BEN: Yeah, but you could be doing that while scrolling, which would be really bad. So I think the same problems apply. While you're not supposed to know that you're running on a database and never treat it like that, the reality is, it's a leaky abstraction. And sometimes, you do need to know so that you can see what exactly CoreData is doing under the hood. And because your interface with it is not SQL queries and parameters and statements and that sort of thing, you don't know when it's making a query. So, one of the tricks I like to do is, if you edit your active scheme and you edit your Arguments, you can add a command line argument to your program. And if you add -com.apple.CoreData.SQLDebug 1, then it will log out all of your SQL statements, which of course makes your app really slow, but it will tell you exactly when queries are being run, and that sort of thing. And I like to run with that on occasionally so I can see if it's doing what I expect it to do.
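To make the N+1 point and the value of statement logging concrete, here is a small sketch using Python's built-in `sqlite3` module instead of Core Data. The `set_trace_callback` hook plays the role that the SQLDebug launch argument plays above; the `book` and `author` tables and both helper functions are hypothetical and exist only for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE author (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE book (id INTEGER PRIMARY KEY, title TEXT,
                       author_id INTEGER REFERENCES author(id));
    INSERT INTO author VALUES (1, 'Author A'), (2, 'Author B');
    INSERT INTO book VALUES (1, 'Book 1', 1), (2, 'Book 2', 2), (3, 'Book 3', 1);
""")

statements = []
conn.set_trace_callback(statements.append)  # log every statement SQLite runs


def titles_with_authors_lazy(conn):
    """One query for the books, then one more query per book (the N+1 shape)."""
    books = conn.execute("SELECT title, author_id FROM book ORDER BY id").fetchall()
    result = []
    for title, author_id in books:
        (name,) = conn.execute(
            "SELECT name FROM author WHERE id = ?", (author_id,)
        ).fetchone()
        result.append((title, name))
    return result


def titles_with_authors_joined(conn):
    """The same data fetched eagerly with a single JOIN."""
    return conn.execute(
        "SELECT book.title, author.name FROM book "
        "JOIN author ON author.id = book.author_id ORDER BY book.id"
    ).fetchall()


def count_selects(fetch):
    """Run a fetch function and count the SELECT statements it caused."""
    statements.clear()
    rows = fetch(conn)
    selects = [s for s in statements if s.lstrip().upper().startswith("SELECT")]
    return rows, len(selects)


lazy_rows, lazy_count = count_selects(titles_with_authors_lazy)
joined_rows, joined_count = count_selects(titles_with_authors_joined)
print(f"lazy: {lazy_count} SELECTs, joined: {joined_count} SELECTs")
```

Both functions return the same rows; only the log shows that one of them issued a query per book, which is the kind of thing the SQL debug output is useful for spotting.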
CHUCK: Is there a good way of seeing everything that is in the database?
SAUL: Well if you're using SQLite, you can open up the actual SQLite files. It's really easy to get to if you're running the simulator; you just peek into your application folder on your hard drive and just open that up in a tool like Base -- I don't know, it was like $15 I think, maybe some more -- but yeah, it's a really useful tool. You just kind of load up the database file and just peek inside. But the leaky abstraction thing is not as leaky as you might think because CoreData will let you talk to any persistent storage mechanism that you want to use. So there's other things like CouchDB or Mongo or Tokyo Cabinet; somebody could go and write an NSAtomicStore for those particular types; I don't think they use SQL to do their thing. And even if they did, you could also have an atomic store that stores your file as JSON; JSON doesn't support SQL at all, you'd have to write your own query language for that. So, it's not necessarily the leaky abstraction that you think it is. It's only leaky because you happen to know that SQLite is the underlying storage mechanism.
BEN: Yeah, so you probably have different performance considerations if you're running under a different store, and hopefully, would have some visibility into avoiding some of the edge cases that make things a little bit problematic. Like if you're fetching a gigantic entity and all you really wanted was one attribute from that entity, then it would be faster in SQL to only use that column in the SELECT clause. And you can influence CoreData to make that query by specifying the attributes to fetch. But again, I don't think you need to optimize this stuff from day 1; I just think that there may come a time when you need to know what the underlying store is doing, so you can influence it to do the right thing.
SAUL: Right. And there's been a lot of debate as to whether CoreData is actually the right tool for the job. Back when CoreData was first released for iPhone OS 3 back in the day, I know a number of like news reader apps that were coming out on to the App Store, and they were using CoreData. And then they eventually just said "Hey, CoreData is too slow!" It can't be slow! So the thing is, the case where SQLite, having direct access to the database, might be faster is doing really large single-value attribute updates. So, the most famous article on the interwebs out there is by Brent Simmons, and it was like "Why I Moved Away from CoreData", or something like that. It was for NetNewsWire, and the performance was not happening when somebody had a thousand plus articles and they wanted to set them all to read. With CoreData, what you have to do is fetch all of your objects into memory, loop over those objects, set the 'is read' attribute to true, or whatever it happens to be, and then save them at the end of that loop. And now in the database world, that's called "a cursor", I believe, and that is extremely slow because you're loading all this data into memory, flipping one bit, and then setting it back. Whereas, if you do that on any SQL database whatsoever including SQLite, you just do update these values or set (I forgot what it is; I don't even remember SQL, that's how often I use it)...
SAUL: So it's like update column with this value where the row matches this predicate, whatever. You'd do that in one line in SQL, and it doesn't have to load anything into memory; it just loops over those rows and sets that bit one by one. That's far more efficient than what CoreData can do in that particular example.
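The contrast described here can be sketched with Python's built-in `sqlite3` module. This is plain SQLite, not Core Data, and the `article` table, its columns, and both function names are hypothetical: one function loads rows and flips the flag one at a time, the other issues the single UPDATE statement being described.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE article ("
    "id INTEGER PRIMARY KEY, feed TEXT, is_read INTEGER NOT NULL DEFAULT 0)"
)
conn.executemany(
    "INSERT INTO article (feed) VALUES (?)",
    [("news",)] * 1000 + [("blog",)] * 5,
)
conn.commit()


def mark_read_one_by_one(conn, feed):
    """Object-style approach: fetch every matching row, then update each one."""
    rows = conn.execute(
        "SELECT id FROM article WHERE feed = ? AND is_read = 0", (feed,)
    ).fetchall()
    for (article_id,) in rows:
        conn.execute("UPDATE article SET is_read = 1 WHERE id = ?", (article_id,))
    conn.commit()
    return len(rows)


def mark_read_in_bulk(conn, feed):
    """Set-based approach: a single UPDATE, nothing loaded into the program."""
    cursor = conn.execute(
        "UPDATE article SET is_read = 1 WHERE feed = ? AND is_read = 0", (feed,)
    )
    conn.commit()
    return cursor.rowcount


def unread_count(conn):
    return conn.execute("SELECT COUNT(*) FROM article WHERE is_read = 0").fetchone()[0]


bulk_updated = mark_read_in_bulk(conn, "news")
unread_after_bulk = unread_count(conn)
loop_updated = mark_read_one_by_one(conn, "blog")
unread_after_loop = unread_count(conn)
```

Both functions end in the same state; the difference is how many statements run and how much data has to pass through the program, which is the cost being discussed for the "mark all as read" case.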
BEN: I wonder if we could talk about like managedObjectContext strategies? Because I think the docs state pretty clearly that you aren't supposed to use managedObjectContext across threads, and so the question then becomes "Well, do I want one main context for my whole app? Or, do I want one main context per ViewController? Or perhaps, create one for whatever particular operation I'm doing and then destroy it." What is the best strategy there?
SAUL: There's a few strategies, I believe, that you can use. Kind of stepping back a bit, I guess the one reason I think you guys wanted me on here talking about CoreData is that I wrote a little framework called "MagicalRecord", and that helps to make a lot of this process easier. What MagicalRecord does is give you kind of the idea of a default context. That's kind of what those templates, those app templates, give you as well; they put their default context in the AppDelegate. I kind of let it live separate from the delegate, and basically, as a [inaudible] of variable, you can access it through a category on NSManagedObjectContext. But as far as some strategies, MagicalRecord helps do that by having that default context. And everything that you do, like in the ViewController, you should have (well, not should) it's recommended that you have a context per ViewController. But even then, that might break down. So, the reason why you have all these managedObjectContexts is just to manage a certain subset of data. It's kind of hard to explain without an example. So one example is, if you have a list of data in an app, so you have a TableView and it scrolls through all your data and you're kind of in editing mode, so you tap on one row and it goes to this detail screen. So that first ViewController will have its own NSManagedObjectContext to load all of that data. Now, that detail ViewController might have its own as well because when you're in edit mode or you want to edit that particular object, you will need a scratch context because you could cancel that operation as well. So by having a secondary managedObjectContext and having your editing object work in that particular context, you can basically dealloc the ViewController, dealloc the managedObjectContext, which basically just lets everything be erased from memory.
And the thing is this, like all those changes that you made in that particular View when you go back without saving, that original context doesn't know anything about those changes; it just has all the original values.
PETE: Oh, okay. So this is kind of like what, I guess, Ben was getting at with the unit of work; you're kind of starting off a separate managedObjectContext to do some work, and then you can either choose to commit all those changes in one chunk or throw them away as if nothing happened. Is that right?
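A minimal sketch of the scratch-context idea in plain Python, not Core Data: edits pile up in a context of their own and reach the shared store only on save, so cancelling just means dropping the context. `Store`, `ScratchContext`, and the sample record are hypothetical names, and one simplification is assumed: reads of unedited values go straight through to the store.

```python
class Store:
    """Shared storage: object id -> attribute dictionary."""

    def __init__(self, rows):
        self.rows = {key: dict(values) for key, values in rows.items()}


class ScratchContext:
    """Collects edits locally; nothing touches the store until save()."""

    def __init__(self, store):
        self._store = store
        self._pending = {}  # object id -> {attribute: new value}

    def get(self, object_id, attribute):
        pending = self._pending.get(object_id, {})
        if attribute in pending:
            return pending[attribute]
        return self._store.rows[object_id][attribute]

    def set(self, object_id, attribute, value):
        self._pending.setdefault(object_id, {})[attribute] = value

    @property
    def has_changes(self):
        return bool(self._pending)

    def save(self):
        # Every pending change in this context is written in one go.
        for object_id, changes in self._pending.items():
            self._store.rows[object_id].update(changes)
        self._pending.clear()


store = Store({1: {"title": "Original"}})
list_context = ScratchContext(store)   # the list screen's view of the data

# Edit, then cancel: the edit context is simply thrown away.
cancelled_edit = ScratchContext(store)
cancelled_edit.set(1, "title", "Abandoned edit")
title_inside_edit = cancelled_edit.get(1, "title")
title_seen_by_list = list_context.get(1, "title")
del cancelled_edit
title_after_cancel = store.rows[1]["title"]

# Edit, then save: all pending changes are committed as one chunk.
saved_edit = ScratchContext(store)
saved_edit.set(1, "title", "Saved edit")
saved_edit.save()
title_after_save = store.rows[1]["title"]
```

The list screen's context never sees the abandoned edit, which is the behavior described for backing out of the detail screen without saving.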
SAUL: Right, exactly. So yeah, that's definitely a really good way to use a managedObjectContext. But when we're talking about threads though, that’s where it really starts to get hairy. Because managed objects are not thread-safe, and Apple says this in their documentation: don't do that; [inaudible] but people inevitably do. Getting back to some previous work as well, I wrote a blog post on Cocoa Is My Girlfriend that also talks about the strategy that I use in MagicalRecord to help make the threading easier. I think it was called "Core Data Threads Without the Headache" or something. And what you need to do is you need to follow the rules, basically. If you look at the threading docs (and this is for the old-school thread isolation management of CoreData objects and contexts and threads and things), the thread isolation model is that you should have one managedObjectContext per thread. And the thing is, you need to do some setup in order to get those objects over to that new thread. So you need to transfer an object from one context to the other and also one thread to the other. And the way we do that is by managedObjectIDs; those are thread-safe, those are unique, those are not writable, so you should be able to read those across threads and they are indeed safe within the Cocoa Framework. So that is the magic sauce in getting managed objects from one thread to the next. Now the thing is that, before MagicalRecord, you had to kind of roll your own code to do all of the things that Apple says in their documentation. Their documentation says that “You're going to need one context per thread, you need to transfer it via ID” and all that stuff, and then you need to save it or whatever you're going to do with the context. So, MagicalRecord provides a little codified set of instructions that follow those guidelines using blocks.
So what you would do is, say you have a managed object on one thread and you want to make changes to that object and save them in the background. With MagicalRecord, and with the same strategy as in that article, you just create a block: you do MagicalRecord, save data in background with block, and it gives you a managedObjectContext to work from. Now, it doesn't know about the data, so you're going to have to use the block to load the data that you want to update inside the block, and you'll have to use the local context (you see this in the MagicalRecord docs), local context, I think it's MR_inContext now, and then you pass in the ObjectID or the object, and that method will go in and do the right thing as far as looking up the record through CoreData and returning you a thread-safe object that you can then access in that block. And then when that block exits, it will save it for you in the background, and it also has things like a completion block that will notify you on the main thread. Things like that make all of that boilerplate stuff go away, which was the main idea behind MagicalRecord. There's a lot of boilerplate in CoreData and it was really annoying to have to do it everywhere. So, I just made it not boilerplate. Hopefully, that kind of explains how threading should work in the thread isolation model of CoreData. Is that confusing enough? Or, [chuckles] do I need more examples?
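[Editor's sketch] The pattern Saul describes (one context per thread, hand over an ID rather than the object, re-load inside the background block, save when the block ends, then notify) can be illustrated outside Core Data. This is not MagicalRecord or Core Data code; it is a minimal Python analogy that assumes a SQLite file as the store, a connection standing in for a context, and a row id standing in for a managed object ID. The names `save_in_background`, `book`, and `db_path` are hypothetical.

```python
import os
import sqlite3
import tempfile
import threading

# Assumed stand-in store: a SQLite file in a temporary directory.
db_path = os.path.join(tempfile.mkdtemp(), "store.sqlite")

# "Main thread" context: create the store and one record.
main_conn = sqlite3.connect(db_path)
main_conn.execute("CREATE TABLE book (id INTEGER PRIMARY KEY, title TEXT)")
book_id = main_conn.execute("INSERT INTO book (title) VALUES ('Draft')").lastrowid
main_conn.commit()


def save_in_background(record_id, new_title, completion):
    """Hypothetical helper: update one record on a worker thread."""

    def work():
        # One "context" (connection) per thread; never share main_conn here.
        local_conn = sqlite3.connect(db_path)
        # Only the ID crossed the thread boundary, so look the record up again.
        row = local_conn.execute(
            "SELECT id FROM book WHERE id = ?", (record_id,)
        ).fetchone()
        if row is not None:
            local_conn.execute(
                "UPDATE book SET title = ? WHERE id = ?", (new_title, record_id)
            )
            local_conn.commit()  # save when the block of work is finished
        local_conn.close()
        # Simplification: this callback runs on the worker thread, whereas the
        # completion block described above is delivered on the main thread.
        completion()

    worker = threading.Thread(target=work)
    worker.start()
    return worker


done = threading.Event()
save_in_background(book_id, "Final", done.set).join()

# Back on the main thread, the main connection sees the saved change.
title = main_conn.execute(
    "SELECT title FROM book WHERE id = ?", (book_id,)
).fetchone()[0]
```

The design point is that the worker never touches `main_conn` or any object loaded through it; it receives `book_id` and builds everything else locally.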
BEN: Yeah, I think that's one of CoreData's problems; that even after reading the docs over and over again, reading the CoreData book, Marcus Zarra's book, I still felt as if I was doing it wrong, and I found numerous cases where I was actually doing something wrong inadvertently. One of those -- and I think this is kind of a common problem that people fall into -- is when you're running an NSOperation. I wanted to take an entity and do something with it, and I created a context to work with inside of the NSOperation, in the initializer, and then later on, the main method gets called on your operation and that operation gets run on a different thread, perhaps. I thought I was doing the right thing, but when the operation gets created, it's created on the main thread and I had the context sitting there, and then it got run on a different thread and so bad things happened; you really have to be careful about where your objects are created and where they're used.
PETE: Is there a way to like [inaudible] and know when you're doing things on the wrong thread? Like, is a managedObjectContext bound to a thread, so that if you use it from the wrong thread, it will throw rather than just subtly going wrong?
SAUL: It is not. CoreData gives you enough rope to hang yourself with, so it's up to you to really manage that. Although, I've found that there are a lot of people that don't understand all that. So again, MagicalRecord tried to codify the way that you're supposed to do that and make the API just far easier to work with. My suggestion is to look there first, or at least look at that article for the motivation, for what you're supposed to do and how to build that yourself if you really don't want to use MagicalRecord. But yeah, you definitely want to follow the rules in all cases.
BEN: I have a specific question. So say you're importing a bunch of data, and the data might already exist in your CoreData store, so you need to do a fetch request, say, for an object existing with this particular unique identifier, and then if it does exist, you fetch it and just update the properties, and if it doesn't exist, you insert a new one. That becomes problematic with large batches. I'm just wondering, what types of approaches would you take to make that process seamless?
SAUL: Yeah, so the thing that really makes that slow is that fetch request. So what happens is, anytime you do an NSFetchRequest, basically, you're reaching all the way down to disk; and disk I/O is significantly slower than memory I/O, so you want to do as much in memory as possible. But in a batch import like that, you're more likely going to have many records that you're going to be doing that for. So rather than doing it one by one, you want to batch that fetch request. The recommended thing from Apple is to say "Batch all of those records into memory before you start your import process; before you have to start your lookup, kind of batch them into an array or a dictionary" so that you can do in-memory lookups versus disk lookups. The other thing that you can do is maybe optimize that batch array so that you can traverse it along with your data import structure. So, you kind of have an array of existing objects that matches the array of data things to import, and you just step through both arrays together, and that way you don't have to do any lookups, really; you just kind of follow the trail there.
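[Editor's sketch] The batched find-or-create import described above can be sketched as one fetch for all incoming identifiers, an in-memory dictionary keyed by identifier, and then a loop that never goes back to disk for lookups. This is a Python analogy using the standard `sqlite3` module as a stand-in store, not Core Data code; the `book` table and the `import_books` function are assumptions made for illustration.

```python
import sqlite3


def import_books(conn, incoming):
    """Upsert records with one batched lookup instead of one query per record.

    Returns a tuple (inserted_count, updated_count).
    """
    if not incoming:
        return (0, 0)

    # One batched fetch for every identifier we are about to import.
    # (Very large imports would need to chunk this list of parameters.)
    ids = [rec["id"] for rec in incoming]
    placeholders = ",".join("?" for _ in ids)
    rows = conn.execute(
        f"SELECT id, title FROM book WHERE id IN ({placeholders})", ids
    ).fetchall()

    # In-memory lookup table: identifier -> current title.
    existing = {row[0]: row[1] for row in rows}

    inserted = updated = 0
    for rec in incoming:
        if rec["id"] in existing:
            conn.execute(
                "UPDATE book SET title = ? WHERE id = ?", (rec["title"], rec["id"])
            )
            updated += 1
        else:
            conn.execute(
                "INSERT INTO book (id, title) VALUES (?, ?)",
                (rec["id"], rec["title"]),
            )
            inserted += 1
        # Keep the lookup table current so a repeated id becomes an update.
        existing[rec["id"]] = rec["title"]

    conn.commit()
    return (inserted, updated)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE book (id INTEGER PRIMARY KEY, title TEXT)")
conn.execute("INSERT INTO book (id, title) VALUES (1, 'Old Title')")
conn.commit()

result = import_books(
    conn,
    [{"id": 1, "title": "New Title"}, {"id": 2, "title": "Second Book"}],
)
titles = dict(conn.execute("SELECT id, title FROM book").fetchall())
```

If both the existing records and the incoming data are sorted by identifier, the dictionary can be replaced by walking the two sorted lists in step, which is the second optimization mentioned above.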
PETE: So, can we talk about predicates a little bit? Because we kind of touched on that a little bit, but as I understand it, that's the way you look things up, like say I want to find new books by a specific author, or something like that. Can you kind of go over how that works?
SAUL: Predicates are a whole other topic because they're so in-depth; they're so complex. I highly recommend reading the "NSPredicate Programming Guide". That has a lot of good information as far as what they are, how to build them, the syntax, things like that. But the basic idea of a predicate is that the predicate is a way to specify a test on an object. I think of it as a filter, even though that might be an overly simplistic simplification. For SQL people, they might think of it as their WHERE clause. But a predicate lets you specify objects and the properties of things that you need to match. And what you want to do is, we're talking about predicate optimization and stuff, so predicates are a whole complex thing by themselves. They do a lot of stuff, and I highly recommend reading the "Predicate Programming Guide" that's referenced in the CoreData docs. That specifies all the syntax, all the keywords, all the things that you can do with the predicate; how to construct one, things like that. But the basic idea of a predicate is that it's a filter. So as a filter, predicates are a way to do a test on your objects. Does this object pass the test? And that's your filter. If not, it gets rejected; if it does, it gets included in the list. And that's where you can use predicates outside of CoreData; you can use them on an NSArray or NSSet. There are methods in there like filteredArrayUsingPredicate, or something like that. And there's the same thing with NSSet. So you can apply these to just normal Cocoa objects in general, but what you want to do is -- again, these are your filters for CoreData; predicates, again, they're the filter -- so with CoreData, they're a natural fit for that. And deep down in the framework, CoreData will translate a predicate into a WHERE clause, because you can take a predicate; you have your kind of nice predicate format.
And that gets broken down into basically a syntax tree or an evaluation tree...actually that's an expression tree, sorry. And that gets translated into a SQL WHERE clause eventually, and gets executed on the CoreData database. That's why, if you know the store type, you can optimize the predicate by reordering the WHERE clause; these are basically the same SQL tricks that you would use otherwise, just with predicates versus the WHERE clause, and you put the right things first. Apple's general rule of thumb is to eliminate as much data as possible with that leftmost predicate, and that will get you a lot of gain. That's very simple advice, but it's one of those things that will do 90% of your optimization in CoreData.
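[Editor's sketch] The two ideas above, a predicate as a pass-or-reject filter and the rule of putting the most selective test leftmost, can be shown in a few lines. This is a Python analogy with plain functions standing in for predicates, not NSPredicate code; the sample data and the names `by_author_42`, `title_contains_7`, and `and_predicate` are invented for illustration. The counter only exists to show how often the "expensive" test runs under each ordering.

```python
# Sample data: 1000 books spread evenly across 100 authors (made up).
books = [{"author": f"Author {i % 100}", "title": f"Book {i}"} for i in range(1000)]

calls = {"count": 0}


def title_contains_7(book):
    """Stand-in for an expensive comparison (for example, a text search)."""
    calls["count"] += 1
    return "7" in book["title"]


def by_author_42(book):
    """Cheap, highly selective test: only 1 book in 100 passes."""
    return book["author"] == "Author 42"


def and_predicate(*tests):
    """Compound predicate; evaluates left to right and stops at the first failure."""
    return lambda obj: all(test(obj) for test in tests)


# Most selective test first: the expensive test only sees the survivors.
selective_first = and_predicate(by_author_42, title_contains_7)
matches_a = [b for b in books if selective_first(b)]
expensive_calls_selective_first = calls["count"]

# Same tests, opposite order: the expensive test runs for every object.
calls["count"] = 0
expensive_first = and_predicate(title_contains_7, by_author_42)
matches_b = [b for b in books if expensive_first(b)]
expensive_calls_expensive_first = calls["count"]
```

Both orderings return the same objects; only the amount of work differs, which is why reordering is a safe optimization.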
CHUCK: Alright. Well, we're just about at the end of our time, so we need to get into the picks. I really appreciate you coming on, Saul. It's been really interesting to talk about this aspect of iOS programming.
SAUL: Yeah, I hope it hasn't been too confusing. CoreData is one of those things that people expect to just work, to do a lot more. I know people who come from Rails and things see CoreData and they think ActiveRecord. And that is definitely not the case. Like I said, CoreData is quite verbose; it has a lot of objects, a lot of moving parts. And MagicalRecord is an attempt to make a lot of that verbosity shorter. Having one-line fetch requests, a really easy way to set it up, things like that. So the one thing I want to say for MagicalRecord, though, is that it's not a direct ActiveRecord implementation. There's a reason that it does not have an object save method, because the save method on an object would have to then go to the context, and then that would have side effects. And I don't want to promote unintended side effects in end users' code. So if CoreData by itself is a hard thing to use, definitely use MagicalRecord and let it do its magic for you, and that'll get you 90% of the way there. But remember, there's a reason that it's not an actual ActiveRecord clone kind of imitation.
CHUCK: Awesome! Alright well, Ben, why don't you start this off with picks?
BEN: Alright. So I get to go first [chuckles]. I have two picks related to CoreData. One of them is "mogenerator". Mogenerator is a really old open source project that's still super useful. What it will do is, your NSManagedObjects that you create in the designer don't actually create classes for you unless you ask Xcode to do it. And if you ask Xcode to do it, it will generate a file that has properties and that sort of thing like a first-class object would. But you obviously can't add behavior to that without either subclassing or dealing with your own private category so that you can add behavior to that object. And what mogenerator does is it reads your CoreData model and generates the code and does the subclass for you. So if we're doing the books and authors example, then mogenerator will implement your CoreData classes as _Book and _Author and give you all of the properties that reference all of the attributes that you're interested in, and then it subclasses those to create a Book and an Author class that you then have control of, and it won't overwrite those. So as you make changes to your model, those would be picked up. So I use that in a lot of apps that use CoreData. And my second pick is "PonyDebugger", which is a cool little utility that you can launch from your iOS app during development. And you can plug the Chrome development tools into this little web server it runs, and Chrome has an inspector that lets you do things like view hierarchy debugging and logging network requests just like you would on any kind of website. And then there's a data tab that is typically used for HTML5 offline storage, but it lets you view the data in your SQLite database or your CoreData model. So that's a pretty cool tool, PonyDebugger. And those are my picks!
CHUCK: Nice! Pete, what are your picks?
PETE: I'm amazed I got a way of getting my pick in because I thought that [inaudible]. My first pick is a new tool that was recently open sourced by Facebook called "xctool". So this is basically kind of a wrapper over xcodebuild; xcodebuild is the way that you build your Objective-C apps from the command line. Like all things Xcode, it has some wrinkles and a little bit of personality; it doesn't always do what you want it to do. Xctool is actually not just a wrapper over xcodebuild; it actually kind of hooks into the Objective-C code around xcodebuild and does magical stuff so that you can do things like running your unit tests in CI, and stuff like that. So, that's really interesting because it's from Facebook; everyone got really excited when they released it. So, that's pretty cool. My next pick is a graph database called "Neo4j". We were kind of talking about how CoreData is like an object graph, which was reminding me a lot of this thing called "Neo4j". So this is not something that you're going to use directly as an Objective-C programmer because it's actually a Java database. But if you're also writing backend code, it's something to look into as an alternative to those boring, old, relational databases. It's NoSQL but it's less trendy than the other NoSQL databases, but I really really like it. So, that's my second pick. And then my last pick is a thing called the "AeroPress". This is a super duper cheap way of making really nice coffee; it's like $20 or something, and it's just two pieces of plastic and a piece of silicone. Basically, you put the coffee in and you add some water and then you kind of push it all through, kind of like a hydraulic press, and it makes really good coffee. It's great if you're travelling and you're in a hotel and you want to actually drink coffee that's drinkable. It's also great if you're backpacking or something like that. So the AeroPress is my third pick, and that's it!
CHUCK: Awesome! Rod, what are your picks?
ROD: Alright, I'm going to pick -- a lot of people have problems with CoreData syncing, obviously -- and there is an alternative CoreData syncing framework that is actually open source, and it's written and used by the MoneyWell people; it's called "TICoreDataSync". I haven't used it myself, but I wanted to make people aware of that if they're having issues and they want to investigate that. And my second pick is an app analytics service that just went out of beta called "Countly"; they have iOS and Mac OS APIs. So I would be investigating that. Those are my picks!
CHUCK: Nice! Alright, I've got a couple of picks here. Inevitably, somebody's going to ask me what I use for my podcasting setup, and so I'm just going to run through a few of the pieces of equipment here, probably the most important ones. That way, if you want to know what I'm using, then you know! So the first one is the "Heil PR-40 Microphone"; I think at full price, it's usually like $360 or something. But I've seen it on Amazon off and on for closer to $250. So depending on when you go and look, you'd probably see it there. Another piece of equipment that I just can't live without is my "Edirol R-09HR". Now, they don't make that anymore; they've updated it to the "Roland R-05", so I'll put a link to that in the show notes. I actually have them both and they work really well. I've just had the Edirol forever and that's what I'm using, but I really really like that piece of equipment for recording audio. And so, I'm looking forward to this working for a long time because I'd be in trouble if it broke. The last thing that I'm working on is, I've picked up Aaron Hillegass' -- and this has already been picked on the show, but I'm going to pick it again -- it's the "Big Nerd Ranch Guide to iOS Programming". I'm only a couple of chapters in, and already there are a few things that are different in Xcode from what it has in the book, but it's still a terrific guide. So I'm looking forward to becoming much more proficient at iOS. So, those are my picks. Saul, what are your picks?
SAUL: Yeah! So I do this little podcast called "NSBrief", and yeah, so head on over to nsbrief.com; we talk about a lot of fun developer-y stuff. Maybe not as introductory, but more topical things like this, technical for sure. We recently talked about TICoreDataSync with Michael Fey, the maintainer, on our latest episode, so it would be something to look up if you're looking for that. So related to sync, another pick: I want to give a shout-out to some friends of mine at "Wasabi Sync". This is also a Core Data cloud syncing service. It's pretty cheap to get started with; I think it's free, actually, for quite a bit of syncing hits or whatever it is, syncing data traffic; wasabisync.com. It's a nice little framework that uses CoreData. The guys who wrote it, I know them pretty well and they do a pretty good job. So if you're looking for yet another alternative for sync that's for pay, where they maintain the server and you can access client data in the cloud, that might be something to look into. I've got two more picks, sorry [chuckles]. I've been doing some stuff with UIColors and NSColors and just colors in code in general. And there is a tool on the App Store called "Sip"; it's on the Mac App Store. It lives in your toolbar, or the menu bar, sorry, on your Mac. And what you can do with it is tell it to pick a color on your screen, and it gives you a little color picker thing. But when you pick a color, it will copy it as either a CSS code or HEX code, or it'll copy it as a UIColor or NSColor based on some of the display preferences and things like that. So, Sip is pretty cool for that, just to directly take any color on the screen that you want to include in your app, and basically paste that color directly into your code. And my last pick is kind of a reference from my tweet from yesterday. So yesterday, I tweeted about CoreData. It does get frustrating at times even for advanced users.
And yeah, my tweet was very (I don't know) antagonistic. So it referenced the movie "Star Trek II: The Wrath of Khan", so that's my other pick. And you can get that for free on Amazon Prime; I believe it's on Netflix as well. But definitely, the scene where Kirk is like "Khaaan!!!!" That's how you feel with CoreData sometimes...
SAUL: Like "Geez! Core Dataaaa!!! Come on!"
BEN: Shake your fist in the air.
SAUL: Right. So yeah, that's my last pick; it brings back some memories. But, yeah.
CHUCK: Awesome! Well, thanks for sharing. It's always good to get some other resources that people can go and use to get better. So, go check out NSBrief! I'll definitely be subscribing myself. But yeah, thanks for coming again. It's so awesome just to get some other experts in here. I mean we love the experts we have, but --
SAUL: Well, thanks! Yeah, thanks for having me! Hopefully, everybody understands my blabbering on CoreData. It can be complicated, but I try to keep it simple.
CHUCK: Yeah, but if you have any questions, go ask Saul.
SAUL: Oh, geez!
CHUCK: No, leave a comment on the show notes on iphreaksshow.com, and we'll try and get them answered.
SAUL: For sure!
CHUCK: Alright, well, thanks again! We'll catch you all next week!