Thursday, 28 December 2017

Exploring Core Data database with sqlite3 on command line

Core Data abstracts a lot of the low-level details of storage away from programmers, and that’s a good thing. However, sometimes it’s useful to lift the lid and see what’s happening beneath. In this post I’ll look specifically at the SQLite persistent store, and I’ll assume you already have an app using Core Data and want more insight into its inner workings. Fortunately the Mac comes with a command-line utility for opening and exploring SQLite databases (sqlite3), but before we can delve into the database file, we must first find it.

Finding the database file for an iOS App
The easiest way to look inside the database file is to run your app in the iOS Simulator from Xcode. The simulator stores its entire file structure in a directory on your Mac; the trick is finding it. By default the simulator stores its files in Home Directory ▷ Library ▷ Developer ▷ CoreSimulator ▷ Devices. Normally the Library folder is hidden to prevent users mucking things up, but you can reach it from the Go menu in Finder by holding down the Option key while the menu is displayed.



However, since we’re starting to use more advanced techniques, it may be easier to work on the command line. Open Terminal and type the following to search all the simulator devices for a file with a given name (substitute your own file name):
find ~/Library/Developer/CoreSimulator/Devices -name CoreDataDemo.sqlite

This prints every file found with the given name. If you have been running your app on multiple simulators, multiple files will be listed, and it may not be obvious which file belongs to which simulator because the folders are named after the device identifier rather than a human-readable name. The results are also unsorted, so the first file may not be the most recent one.


To find the identifier of a simulated device in Xcode, go to the Window menu and select Devices and Simulators. In the window that opens, switch to the Simulators tab and select the simulator that you’re interested in. The Identifier is displayed along with the summary information.


Since this can be rather painstaking, I've written a script that searches for a given database file, displays the five most recently modified matches along with a human-readable name for each simulator, and lets you open one of them directly. The findcoredata.py script can be found on GitHub.
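The core of such a search fits in a few lines. The following Swift sketch is not the findcoredata.py script; it is a minimal illustration that assumes the default simulator location described above, and the function name recentFiles(named:under:limit:) is made up for this example. It walks the Devices folder, collects every file with the requested name, and returns the most recently modified ones first.

```swift
import Foundation

/// Returns files called `fileName` found anywhere under `root`, newest first.
/// Hypothetical helper for illustration only.
func recentFiles(named fileName: String, under root: URL, limit: Int = 5) -> [URL] {
    let keys: [URLResourceKey] = [.contentModificationDateKey]
    guard let enumerator = FileManager.default.enumerator(
        at: root, includingPropertiesForKeys: keys) else { return [] }

    var found: [(url: URL, modified: Date)] = []
    for case let url as URL in enumerator where url.lastPathComponent == fileName {
        // Files whose date cannot be read sort last.
        let values = try? url.resourceValues(forKeys: Set(keys))
        let modified = values?.contentModificationDate ?? Date.distantPast
        found.append((url: url, modified: modified))
    }

    return found
        .sorted { $0.modified > $1.modified }
        .prefix(limit)
        .map { $0.url }
}

// Assumes the default CoreSimulator location on the Mac.
let devices = FileManager.default.homeDirectoryForCurrentUser
    .appendingPathComponent("Library/Developer/CoreSimulator/Devices")
for url in recentFiles(named: "CoreDataDemo.sqlite", under: devices) {
    print(url.path)
}
```

To map the identifier in each printed path back to a simulator, xcrun simctl list devices prints every simulator's name next to its identifier.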


You’ll notice that the SQLite database file may not be alone. You may also see a file with a ‘wal’ suffix, which is the write-ahead log, and a file with a ‘shm’ suffix, which is the shared-memory file. If you intend to copy the database, it is important to copy these files as well.

Look, but don’t touch
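It is important to remember that you should avoid modifying the SQLite data file, as Core Data manages it. The sqlite3 command line tool can be used to execute commands against the database; for example, sqlite3 -readonly followed by the path to the file opens it without allowing writes, and the .tables and .schema commands then list the tables and their definitions.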
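The same look-but-don't-touch rule can be enforced if you ever inspect the file from code rather than from the sqlite3 tool. The sketch below is only an illustration: it uses the SQLite C API that ships with the Mac, opens the file with SQLITE_OPEN_READONLY so nothing can be written through the connection, and lists the table names. The function name tableNames(inDatabaseAt:) is hypothetical, and the path is taken from the first command-line argument.

```swift
import Foundation
import SQLite3

/// Lists the table names in the SQLite file at `path`, opened read-only.
/// Hypothetical helper for illustration only.
func tableNames(inDatabaseAt path: String) -> [String] {
    var db: OpaquePointer?
    // SQLITE_OPEN_READONLY makes SQLite reject writes through this connection.
    guard sqlite3_open_v2(path, &db, SQLITE_OPEN_READONLY, nil) == SQLITE_OK else {
        sqlite3_close(db) // a handle may be returned even when opening fails
        return []
    }
    defer { sqlite3_close(db) }

    var statement: OpaquePointer?
    let sql = "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    guard sqlite3_prepare_v2(db, sql, -1, &statement, nil) == SQLITE_OK else { return [] }
    defer { sqlite3_finalize(statement) }

    var names: [String] = []
    while sqlite3_step(statement) == SQLITE_ROW {
        if let text = sqlite3_column_text(statement, 0) {
            names.append(String(cString: text))
        }
    }
    return names
}

// Usage: pass the path to the .sqlite file as the first argument.
if let path = CommandLine.arguments.dropFirst().first {
    tableNames(inDatabaseAt: path).forEach { print($0) }
}
```

In a Core Data store each entity's table is named after the entity with a Z prefix, so a Person entity appears as ZPERSON.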


Perhaps I'll blog more on the internal structure of the database, but for now you know how to find and open it. Happy exploring.

Tuesday, 19 December 2017

Consistent property attribute formatting in Objective-C

For those of us that care about consistent code formatting there are a few tools out there, including uncrustify with its Objective-C support. However, one thing I have not found is a tool that puts property attributes in a consistent order. Often I'll see code such as the following:

@interface ViewController()

@property (weak, nonatomic) IBOutlet UIButton *button1;
@property (weak, nonatomic) IBOutlet UIButton *button2;
@property (null_resettable, nonatomic) UIColor *highlightColor;
@property (nonatomic, copy) NSArray *names;
@property (getter=isActive, assign) BOOL active;

@end

This can be a bit of a jumble: with no consistent ordering, I have to read the full attribute list on every line to spot any pattern, which adds a lot of cognitive work. To that end I've created a Python script that re-orders the attributes in these source files for me. For this example it would produce:

@interface ViewController()

@property (nonatomic, weak) IBOutlet UIButton *button1;
@property (nonatomic, weak) IBOutlet UIButton *button2;
@property (nonatomic, null_resettable) UIColor *highlightColor;
@property (nonatomic, copy) NSArray *names;
@property (atomic, assign, getter=isActive) BOOL active;

@end

Now all the nonatomic attributes are moved to the front, as this is the most common attribute. You'll also notice that the active property gained an atomic attribute that was not in the original source code. This is because atomic is the default unless nonatomic is specified. However, atomic is rarely needed unless you are dealing with multiple threads; more often than not people simply forget to add nonatomic. Adding it makes the behaviour explicit.

To run the script, simply supply a list of source files. It works with both header and implementation files. Before modifying a file the script creates a copy of the original, with the same name plus an .orig suffix.

sort_prop_attrs.py <sourcefile1> <sourcefile2> <sourcefile3> ...
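To illustrate the re-ordering rule itself, here is a minimal Swift sketch. It is not the sort_prop_attrs.py script: it handles a single line, does no file handling or backups, and the priority list in attributeOrder is an assumption. Only the relative order visible in the example above (atomicity first, then weak/copy/assign, then null_resettable and getter=) comes from this post; everything else in the list, and every name in the code, is a guess made for illustration.

```swift
import Foundation

// Assumed priority order; attributes not listed here sort last.
let attributeOrder = [
    "atomic", "nonatomic",
    "strong", "weak", "copy", "assign", "retain", "unsafe_unretained",
    "readonly", "readwrite",
    "nullable", "nonnull", "null_resettable", "null_unspecified",
    "class", "getter", "setter",
]

/// Attribute name without any "=value" part, e.g. "getter=isActive" -> "getter".
func baseName(_ attribute: String) -> String {
    return attribute.components(separatedBy: "=")[0].trimmingCharacters(in: .whitespaces)
}

/// Position of an attribute in the assumed priority list.
func rank(_ attribute: String) -> Int {
    return attributeOrder.firstIndex(of: baseName(attribute)) ?? attributeOrder.count
}

/// Re-orders the attribute list of one @property line; other lines pass through.
func reorderAttributes(inPropertyLine line: String) -> String {
    guard let keyword = line.range(of: "@property"),
          let open = line.range(of: "(", range: keyword.upperBound..<line.endIndex),
          let close = line.range(of: ")", range: open.upperBound..<line.endIndex)
    else { return line }

    // The "(" must follow @property directly, otherwise it belongs to the type.
    let gap = line[keyword.upperBound..<open.lowerBound]
    guard gap.allSatisfy({ $0 == " " || $0 == "\t" }) else { return line }

    var attributes = line[open.upperBound..<close.lowerBound]
        .split(separator: ",")
        .map { $0.trimmingCharacters(in: .whitespaces) }
        .filter { !$0.isEmpty }

    // Make the default atomicity explicit when neither keyword is present.
    let names = Set(attributes.map(baseName))
    if !names.contains("atomic") && !names.contains("nonatomic") {
        attributes.append("atomic")
    }

    // Sort by rank; the original position breaks ties so the result is stable.
    let sorted = attributes.enumerated()
        .sorted { (rank($0.element), $0.offset) < (rank($1.element), $1.offset) }
        .map { $0.element }

    return line.replacingCharacters(in: open.upperBound..<close.lowerBound,
                                    with: sorted.joined(separator: ", "))
}

print(reorderAttributes(inPropertyLine: "@property (getter=isActive, assign) BOOL active;"))
// prints "@property (atomic, assign, getter=isActive) BOOL active;"
```

A property declared without an attribute list is returned unchanged by this sketch.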

Monday, 18 December 2017

Named capture groups with NSRegularExpression in iOS11 / High Sierra

After years of no change, Apple slipped a small improvement to NSRegularExpression into iOS 11 / High Sierra. The Cocoa Foundation Framework section of the macOS 10.13 and iOS 11 release notes mentions the update to NSRegularExpression, but gives little detail. So let’s explore and see if we can find out more. We’ve always been able to use indexed capture groups. However, as a regular expression grows more complex, numbered indexes become unmanageable and fragile: a small change to the pattern can throw the numbering out of kilter.

Apple’s class reference for NSRegularExpression links to the ICU user guide, which lists the syntax for named groups as (?<name>pattern). So let’s experiment to see it in action.

NSRegularExpression is still very much an API that works with Objective-C’s NSString and its UTF-16 code unit model, so I’ll use a simple extension on String to make life easier.


Consider a regex pattern for matching formatted U.S. domestic telephone numbers, such as (123) 456-7890.
\(\d{3}\)\s\d{3}-\d{4}
This pattern was chosen for simplicity in demonstrating the point rather than as the most flexible pattern for general-purpose matching. It matches the following:
  1. an opening parenthesis
  2. three digits
  3. a closing parenthesis
  4. a single whitespace character
  5. three digits
  6. a hyphen
  7. four digits
Let’s say that we’re particularly interested in the area code, which is item 2 in the list above. While the pattern matches the whole telephone number, we can create a capture group around a section of interest, in this case the area code, by adding parentheses around it.
\((\d{3})\)\s\d{3}-\d{4}
This is what we’ve always been able to do, and we’d access this capture group at index 1, using the method range(at:) on NSTextCheckingResult. Index zero is reserved for matching the whole pattern.
let areaCodeRange = match.range(at: 1)
With named capture groups, however, rather than thinking of it as the capture group at index 1, we can give the capture group a name like so:
\((?<areacode>\d{3})\)\s\d{3}-\d{4}
This allows us to extract the area code using the new method range(withName:) on NSTextCheckingResult:
let areaCodeRange = match.range(withName: "areacode")
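Putting the pieces together, the following is a minimal, self-contained sketch of the whole round trip. The substring(in:) extension and the areaCode(in:) function are hypothetical names invented for this example, standing in for whatever String helper you prefer; the NSRegularExpression and NSTextCheckingResult calls are the real API and require iOS 11 / macOS 10.13 or later.

```swift
import Foundation

extension String {
    /// Text covered by an NSRange (measured in UTF-16 units), or nil when the
    /// range is not valid for this string, e.g. a location of NSNotFound.
    func substring(in nsRange: NSRange) -> String? {
        guard let range = Range(nsRange, in: self) else { return nil }
        return String(self[range])
    }
}

/// Returns the area code of the first formatted phone number found in `text`.
func areaCode(in text: String) -> String? {
    // Same pattern as above, with backslashes escaped for a Swift string literal.
    let pattern = "\\((?<areacode>\\d{3})\\)\\s\\d{3}-\\d{4}"
    guard let regex = try? NSRegularExpression(pattern: pattern) else { return nil }

    let wholeText = NSRange(text.startIndex..<text.endIndex, in: text)
    guard let match = regex.firstMatch(in: text, options: [], range: wholeText) else {
        return nil
    }
    return text.substring(in: match.range(withName: "areacode"))
}

print(areaCode(in: "Call (123) 456-7890 today") ?? "no match")
// prints "123"
```

Because the group is looked up by name, adding or removing other groups in the pattern later does not affect this code.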
  

  
Named back references
Named capture groups are not just for extraction; they can also be used in back references, using the syntax \k<name>. For example, to match a balanced pair of HTML tags, we can use a named capture group for the tag name and then refer back to that name to match the closing tag.

<(?<tag>\w+)[^>]*>(?<content>.*?)</\k<tag>>
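As a quick check of this pattern, here is a small Swift sketch that runs it over a string containing one balanced pair and one mismatched pair. The function name balancedTags(in:) and the sample text are made up for illustration; the pattern is the one shown above.

```swift
import Foundation

/// Returns the tag name and inner text of every balanced tag pair in `html`.
/// Hypothetical helper for illustration only.
func balancedTags(in html: String) -> [(tag: String, content: String)] {
    // \k<tag> only matches the exact text captured by the group named "tag".
    let pattern = "<(?<tag>\\w+)[^>]*>(?<content>.*?)</\\k<tag>>"
    guard let regex = try? NSRegularExpression(pattern: pattern) else { return [] }

    let whole = NSRange(html.startIndex..<html.endIndex, in: html)
    return regex.matches(in: html, options: [], range: whole).compactMap { match in
        guard let tag = Range(match.range(withName: "tag"), in: html),
              let content = Range(match.range(withName: "content"), in: html)
        else { return nil }
        return (tag: String(html[tag]), content: String(html[content]))
    }
}

for pair in balancedTags(in: "<b>bold</b> and <i>mismatched</b>") {
    print("\(pair.tag): \(pair.content)")
}
// prints "b: bold"
```

The second pair is skipped because the closing tag must repeat the captured name: after capturing i, the pattern looks for </i> and finds only </b>.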

Saturday, 16 December 2017

Core Data SQLDebug Log Levels

The first step in any performance tuning exercise is to identify the amount of execution work that is being incurred and to evaluate how performant each element is. Within the context of Core Data backed by the SQLite persistent store those elements are usually SQL statements.

Core Data attempts to hide the SQLite backing store through abstraction, but for performance monitoring it can be beneficial to understand what's going on behind the curtain.

One useful tool Core Data provides is the SQLDebug launch argument, which causes Core Data to log various events to the console. Enable it by adding -com.apple.CoreData.SQLDebug followed by an integer log level (for example, -com.apple.CoreData.SQLDebug 1) to the launch arguments in the project's scheme.


Edit Scheme dialog pane with SQL Debug specified on launch arguments.

Logging Levels for SQLDebug

I could not find a detailed explanation of what each log level outputs, so I decided to experiment and publish my findings here. To do that I created a project with a simple Core Data model containing a Person entity, added one million rows of test data, and then captured the log output for a simple query for Person rows where the age is greater than 18.

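import CoreData
// `container` is assumed to be an NSPersistentContainer whose stores are already loaded.
let request = NSFetchRequest<NSManagedObject>(entityName: "Person")
request.predicate = NSPredicate(format: "age > %@", 18 as NSNumber)
// The result is discarded; only the SQLDebug console output is of interest here.
_ = try! container.viewContext.fetch(request)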

The following shows the log output from the query at different log levels:

1 and above:
CoreData: sql: SELECT 0, t0.Z_PK, t0.Z_OPT, t0.ZAGE, t0.ZFIRSTNAME, t0.ZLASTNAME, t0.ZPERSONID FROM ZPERSON t0 WHERE t0.ZAGE > ?
2 and above:
CoreData: details: SQLite bind[0] = 18
1 and above:
CoreData: annotation: sql connection fetch time: 0.4789s.
2 only:
CoreData: annotation: fetch using NSSQLiteStatement <0x60800008bf40> on entity 'Person' with sql text 'SELECT 0, t0.Z_PK, t0.Z_OPT, t0.ZAGE, t0.ZFIRSTNAME, t0.ZLASTNAME, t0.ZPERSONID FROM ZPERSON t0 WHERE t0.ZAGE > ? ' returned 688187 rows
3 and above:
CoreData: annotation: fetch using NSSQLiteStatement <0x60c00009b620> on entity 'Person' with sql text 'SELECT 0, t0.Z_PK, t0.Z_OPT, t0.ZAGE, t0.ZFIRSTNAME, t0.ZLASTNAME, t0.ZPERSONID FROM ZPERSON t0 WHERE t0.ZAGE > ? ' returned 688187 rows with values: (
"<Person: 0x604000098470> (entity: Person; id: 0xd0000000000c0000 <x-coredata://804E31B9-8EB9-4A42-B73D-F3B53FB47D27/Person/p3> ; data: <fault>)",
"<Person: 0x600009283700> (entity: Person; id: 0xd000002632dc0000 <x-coredata://804E31B9-8EB9-4A42-B73D-F3B53FB47D27/Person/p625847> ; data: <fault>)",
...
"<Person: 0x60c008c939c0> (entity: Person; id: 0xd000003d08fc0000 <x-coredata://804E31B9-8EB9-4A42-B73D-F3B53FB47D27/Person/p999999> ; data: <fault>)",
"<Person: 0x60c008c93a10> (entity: Person; id: 0xd000003d09000000 <x-coredata://804E31B9-8EB9-4A42-B73D-F3B53FB47D27/Person/p1000000> ; data: <fault>)"
)
1 and above:
CoreData: annotation: total fetch execution time: 0.6390s for 688187 rows.
4 and above:
CoreData: details: SQLite: EXPLAIN QUERY PLAN SELECT 0, t0.Z_PK, t0.Z_OPT, t0.ZAGE, t0.ZFIRSTNAME, t0.ZLASTNAME, t0.ZPERSONID FROM ZPERSON t0 WHERE t0.ZAGE > ?
     0 0 0 SCAN TABLE ZPERSON AS t0


After looking at the output, we can say that the levels generally stack: specifying a log level of 3 gives all the logging events of levels 1 and 2 as well. The one exception is the truncated NSSQLiteStatement annotation, which appears only at level 2; from level 3 upwards it is replaced by the version that lists the returned managed objects. The following table summarises what each log level adds.

Value | Description
1 | SQL statements, row count and execution time
2 | Bind values and the truncated version of the NSSQLiteStatement annotation, which does not list the returned managed objects
3 | List of managed objects returned for the query. These objects have not been faulted into memory, so only the managed object ID is output
4 | SQLite EXPLAIN QUERY PLAN


Friday, 15 December 2017

Welcome

This will be my tech blog for posting an assortment of programming and technical things I find interesting. There is a predominant skew toward iOS development, as that is my field of expertise. I hope that some of this will prove useful to others, or at least that it will give me a venue to dump ideas out of my head and store them somewhere.