Polyfill

IndexedDB Polyfill

Although IndexedDB API is well supported in Firefox, Chrome and IE, Safari currently does not have plan to support IndexedDB API. Instead Safari browser support Web SQL Database API. Additionally older browsers will never have database API. For web developers, leaving out Safari, hence iOS, is too much stake. IndexedDB polyfill to WebSQL has been used, with some miss and hit. Two well known open source are IndexedDBShim and IndexedDB-polyfill from Facebook. In this article, I will discuss implementation of YDN-DB indexeddb polyfill for websql and localStorage.

Should polyfill be IndexedDB or WebSQL

Before diving into detail, let us think about overview of database api. Instead of IndexedDB polyfill, the reverse WebSQL polyfill is possible and even make more sense since SQL is more powerful and existing code are in it. Of course, 100% polyfill will be difficult, but implementing only commonly use SQL features is more easier than IndexedDB polyfill, since there is only one method called, executeSql. This is long term goal of YDN-DB.

Proponent of IndexedDB API point out IndexedDB API is better suitable for SQL in javascript language. For large scale app development, application logic does not directly SQL, but instead use midware ORM framework, like Hibernate, JPA, JPO, LINQ, etc. Large scale javascript web app will finally be find useful to use ORM as well. IndexedDB API is, in fact, a good ORM layer with low level API without relational consistency management. YDN-DB is indeed wholehearted embrace the intent and concept of IndexedDB API.

In YDN-DB implementation, polyfill is from both side.

Transaction

We are fortunate that both IndexedDB and WebSql use auto commit transaction through javascript run-to-completion concurrency model. If were not so, it will not be possible to implement direct polyfill between the two.

Transaction implementation in WebSql is very simple. There is only two locks, read or write. Transaction scope is irrelevant to WebSql. Each lock close whole database. IndexedDB API spec intentionally avoid specifying transaction lock and left browser vendor to choose. However the spec said that "readwrite" transactions with overlapping scope are created according to order placed. Other than that transactions are not necessary to lock whole database or object store. Chrome implementation base on levelDB are poised to exploit very fast read requests. As side node, Chrome implementation is slowest among browsers, IE (fastest) and Firefox.

In addition to database read and write transaction, there is version change transaction. It is special and only transaction to mutate database schema. In older IndexedDB spec and websql, database schema can be change throughout the connection. Finally it turn out not a good idea and version change transaction is only available only at opening of database. WebSql also have similar concept for creating version change transaction at opening, but due incomplete specification, its usage is limited to opening known database version. But we do not know what database version will be in user browser, it cannot be used. A usual practice for WebSql is version change transaction is call just after opening database. However browser implementations are so relaxed that, you can change database schema in usual write transaction too. This practice is so popular that version change transaction in websql get lost.

Web sql database api forgot to specify, rare but crucial to abort a transaction. Fortunately there is not-too-bad workaround. If invalid SQL statement is place against the transaction that want to be aborted, causing SQLStatementErrorCallback to be invoked. There, transaction can be aborted by returning true. The following is YDN-DB abort implementation for WebSql.

/**
* @param {SQLTransaction} tr transaction.
* @param {SQLError} error error.
* @return {boolean} true to roll back.
*/
var error_callback = function(tr, error) {
  // console.log(error);
  return true;
};

this.tx_.executeSql('ABORT', [], callback, error_callback);

All IO operation are handled via request, which is placed against an active transaction. Request are executed asynchronously. Transaction is still active when receiving the result. If transaction is not reuse, the transaction is automatically committed. This concept is same for both IndexedDB and websql, but one important detail. IndexedDB API specified that request are executed in order placed against the transaction, but WebSql does not. Implementation are different from browser to browser or even version to version. For example the following code is unreliable:

db.transaction(function(tx) {
  // Insert 2 rows
  tx.executeSql('INSERT INTO city (id, name) VALUES (?, ?)', 1, 'Roma');
  tx.executeSql('INSERT INTO city (id, name) VALUES (?, ?)', 2, 'Bari');
  tx.executeSql('SELECT * FROM city', [], function (tx, results) {
    // may not get 2 rows
  }
});

Correct usage requires taken care of execution sequence by reusing the transaction from the success handler, as follow:

db.transaction(function(tx) {
  // Insert 2 rows
  tx.executeSql('INSERT INTO city (id, name) VALUES (?, ?)', [1, 'Roma'], function(t) {
    t.executeSql('INSERT INTO city (id, name) VALUES (?, ?)', [2, 'Bari'], function(t) {
     t.executeSql('SELECT * FROM city', [], function (tx, results){
       // get 2 rows
      }
    });
  });
});

Key or column type

Web Sql database does not specify column data type, and hence we can assume all SqLite column data type are supported, which include NULL, INTEGER, REAL, TEXT and BLOB. Column In IndexedDB API, key can be numeric, string or array of valid keys. It is not necessary to specify key type in advance. Since key data type is included in key ordering specification, a record can be specified by these valid data type. To polyfil indexeddb key (IDBKey) to websql, TEXT is the only choice. Storing in BLOB is possible but indexing is only allowed in TEXT data type. IDBKey is encoded into text with padding data type and array length according to key comparison algorithm, also available as indexedDB.cmp. An excellent key encoding implementation from facebook polyfill is conveniently grand fathered in YDN-DB. YDN-DB also allow optionally type column, in that case key are not encoded and save a bit of cpu time.

Composite and compound key

If IDBKey is an array, it is effectively a composite key. Composite key and compound key are convenient way to make database constraint in IndexedDB as well as faster query. For example, a channel object can have only unique sender and receiver by using composite primary key as array keyPath of ['sender', 'receiver'].

Compound key are used for multiple equal join query or sorting. In IndexedDB, compound key is specified also by array keyPath. Essentially these key are same as array key and previous key encoding method can be used to polyfill compound key. In YDB-DB, compound key does not have corresponding column. Compound key query are simply execute as JOIN query of respective index columns.

Unfortunately IE11 does not support compound key nor array key. YDN-DB is yet to polyfill this feature if Microsoft could not deliver it in time.

Nested object key

Once simple trick, but important in designing database schema is IndexedDB ability to specify key path in javascript dotted notation. Consider the following record object:

var contact = {
  name: 'John',
  label: 'relative',
  phone: ['2325232', '23523242']
  address: {
   zip: 123456,
   country: 'US'
  }
}

If we want to query record by zip, contact and address must be store in separate table. Since IndexedDB allow dotted annotation, zip can be indexed by 'address.zip' keyPath. Polyfil to websql is straight forward as well.

multiEntry key

Another beautiful trick in indexedDB AI isindexing array element. Often the data we want to indexed are in array such as phone number of previous contact object. In this case we want to index each element, rather than array as a whole. multiEntry indexing is very simple and effective. multiEntry index is used for one-to-many and many-to-many relationship. Profiling multiEntry in WebSql use of junction table, in which primary key is value of each element of multiEntry index key with referencing back to primary key of the record. YDN-DB manages updating, creating and deleting of multiEntry keys.

Key generator

If an application just need to store object without query, IndexedDB can generate primary key automatically by setting autoIncrement attribute to true. This is similar to Sqlite INTEGER PRIMARY KEY AUTOINCREMENT. Both database generate increasing integer key as row counter. IndexedDB is liberal in that any key (including string and array) can be used along with auto generated integer key. Sqlite is limited to INTEGER data type for autoIncrement key. Since YDN-DB use native Sqlite key generator, only integer data type can be used for autoIncrement.

Structured cloning algorithm and schema-less database

Making a database schema-less is nothing new or revolutionary. In its essence, schema-less database is ability to store serialize record object. Sqlite provides BLOB data type for that purpose. The only problem is serialization. IndexedDB API uses newly define, still in draft, structured cloning algorithm to serialize record object. Structured cloning is more capable than JSON serialization. It can serialize javascript object like Blob, File and CanvasPixelArray, etc. Since it is not exposed, YDN-DB only use JSON serialization. For File object, it is possible to circumvent with base64 encoding. When YDN-DB create a database table, a special column, named as _default_ is created to store stringified record value. The object is parsed on retrieval.

A database is interface to a b-tree

Essentially a database is nothing more than an interface to b-tree, which enable fast retrieval ordered list of keys. B-tree data structure are so fast to retrieve of keys that retrieval time is independent of size of database or at most log time. Most of database query belong to b-tree key query. Complex query require in-memory sorting, which again happen in b-tree.

IndexedDB API, which is inspired by Oracle Berkeley DB, exposes database api to abstraction level of b-tree interface. Most query in indexedDB are happen in cursor, which transverse ascending or descending with ability to skip duplicated key.

So far we haven't discuss webstorage, more commonly know as localStorage. It is key-value store without ability to enumerate sorted key order. Query will get linearly slower with number of records. It neither have serialization nor transaction. In polyfill webstorage into IndexedDB API, YDN-DB index key into in-memory avl tree data structure, which is a read-fast/write-slow variance of b-tree. Again array key is supported via facebook key encoding algorithm and JSON serialization are used. Nested key and indexing are implemented. Basically most of IndexedDB API feature are available in this polyfill.

Database index and secondary key

IndexedDB is one step ahead of most other key-value database api on that it manages secondary key or index. Key-value alone is not enough to retrieve record if the one we want to query is not primary key. In this case, query field is indexed becoming a secondary key which map to primary key. It is important to note that IndexedDB API require primary key to read or write record record value. In WebSql, column of INTEGER, REAL and TEXT data type can be indexed using CREATE INDEX command.

For basic key (none-composite key), there are two approach to polyfil IndexedDB index. Create a column for each index and use CREATE INDEX to index is. This method give the best performance since indexing is natively managed by Sqlite engine. This also permit executing SQL statement to the table. The limitation is this approach is schema cannot be change. Although Sqlite has ALTER COLUMN command to add new column, the command is not exposed to WebSql. Sqlite also does not removing column, but index can be dropped.

Second approach is to use separate relationship table for each index. This is simple and flexible, but managing index for each write request. Query performance can be suffered due to additional JOIN query.

Parashuram's IndexedDBshim and Facebook's polyfill take second approach. YDN-DB use the first approach. multiEntry index however uses separate table. Current implementation in YDN-DB causes data lost on schema change.

Key constraint

Being a key-value database, indexedDB does not have foreign key constraint. The only database constraint is unique key constraint. Primary key must, indeed, be unique. Secondary key may or may not be unique. Polyfill is straight foward since Sqlite has UNIQUE key constraint. However, when unique constraint is violated during record insert or replace, SqlError is issued but error code is incorrect. Chrome websql implementation correctly use CONSTRAINT_ERR (code 6), but Safari implementation give generic UNKNOWN_ERR. The bug was reported but not fix. In fact, none of the bugs are fixed in WebSql.

Schema reflection

IndexedDB API provide synchronous reflection for object store properties and index properties. Sqlite can query table information through master table. Since database query is asynchronious, YDN-DB query schema information at the opening of database.

Currently IE implementation luck reflection for autoIncrement property.

Cursor Iteration

IndexedDB cursor iteration is very low level. IndexedDB provides two way to iterate cursor, either on object store or index. When iterating on object store, cursor always has record value. On iterating on index, taking record value is optional. In YDN-DB, cursor iteration polyfil is pretty straight forward, but logistic on keeping requests are rather complex. Unique secondary key iteration is achieved by GROUP BY clause. Sqlite DISTANT means for unique rows or primary key.

Database delete

WebSql does not expose dropping database nor listing database. The workaround used in YDN-DB is dropping all tables as seen in master table. Browser clean up database without tables on next restart. Basically it work fine in application use. But it has complication is unit testing. WebSql have race condition for creating and deleting by multiple connection. IndexedDB also have race condition in similar situation, but it can easily be avoided by properly closing connection.

SQL execution

On reverse polyfil, IndexedDB can be polyfilled to execute WebSQL statements. Which is more complex and equally more useful and powerful. Due to declearative nature of SQL statement, query can be express easily and understandably regardless of how it achieve the result. At the same time, "regardless of how" is psycho path to poor performance app. However history has indicate SQL is still arguably the best query language. YDB-DB has attempted to parse SQL statement, analyze for optimal query procedure and infracture for executing them. After key joining algorithm, expression evaluator and buffer utilities are completed, most of the SQL statement will be executed on YDN-DB library.

This article is not completed yet. More content will be added or correct.