Saturday 13 January 2018

Using sqlite3_analyzer to identify space usage with Core Data

When using SQLite as the backing store for your Core Data persistence store it's easy to take a hands off approach. However lifting the lid on that abstraction can provide useful insights on its effectivness. If you are wondering why the database file seems to have grown beyond your expectations or just have a general interest in digging deeper then this post is for you.

Analysing the database

One of the tools the SQLite project develops is sqlite3_analyzer to measure the efficiency of space used by individual tables and indexes within an SQLite database file. There are precompiled binary versions of the sqlite-tools available for the mac on their download page.

Update - The precompiled binary available from the SQLite web site works on macOS Sierra, but fails on High Sierra with the following error:-
dyld: Library not loaded: /System/Library/Frameworks/Tcl.framework/Versions/8.4/Tcl
  Referenced from: /Users/raymond/Downloads/sqlite-tools-osx-x86-3210000/./sqlite3_analyzer
  Reason: image not found
Abort trap: 6
This is because High Sierra ships with version 8.5 of the Tcl framework. Of course you could compile your own version of the tool from source to resolve this, but for simplicity we can update the executables rpath with the following command:-
install_name_tool -change "/System/Library/Frameworks/Tcl.framework/Versions/8.4/Tcl" "/System/Library/Frameworks/Tcl.framework/Versions/8.5/Tcl" sqlite3_analyzer


I'd previously written about finding the underlying sqlite database file here.

So what does sqlite3_analyzer give us, let's use an example by creating a database via Core Data, with a entity Person and add 1 million entries into the database. sqlite3_analyzer gives a lot of information, so i've included only an extract of the overall output here.


The report starts with statistics for the overall database file and then breaks down in to sections for each table and index. For this discussion let us constrain ourselves to the top level information of the overall database. The report contains a handy definitions section at the bottom in case you require a more verbose description of each term in the report.

The section "Page counts for all tables with their indices" is very useful to see which tables are using most of the space within the file. In this example we can see the ZPERSON table is consuming 99.84% of the overall file. The other tables prefixed with Z_ are used by Core Data to store metadata. Therefore things are looking pretty healthy in this database.

Vacuuming the clutter

One of the aspects of SQLite when deleting data is that the file size of the database does not reduce. Instead the released space is marked as been available for reuse. This makes sense as the general case of application use will cause the growth to plateau, as the inserts and deletes reach a state of equilibrium. However under special use cases where large fluctuations in the quantity of data occur, the database can be left consuming significantly more space than required. To resolve this problem SQLite has the VACUUM command. This will cause the database to be repacked to used the smallest possible space required. It is worth noting that during the process of vacuuming SQLite will temporarily copy the database, therefore as much as twice the size of the original database file is required in free space in storage.

Let's demonstrate this by deleting 30% of the data from Person in our demo database. Notice "Size of the file in bytes" remains unchanged, however "Pages on the freelist" has jumped to 29.7%.


So far we've talk purely at the SQLite level, inevitably the next question becomes how can we use the vacuum command via Core Data within our applications. Fortunately Apple has us cover in the form of the persistent store option NSSQLiteManualVacuumOption. When specified as part of the options on the NSPersistentStoreDescription will cause a full database vacuum operation to be perform when opening the persistent store.




Finally we run the sqlite3_analyzer report again after the vacuum operation. We can see the file size is now 29155328, a reduction of 36% from the original size. Additionally the values for "Pages on the freelist" are now back to zero indicting all the space within the file is in use.