Blog

03
March 2015

Gavin Pickin

Loading or Dumping Large Amounts of Data into WebSql - Options with Performance Numbers

JavaScript, Mobile Development, WebSQL

Working with databases, there will come a time when you need to dump large amounts of data out of, or into, your database. Dumping data out has always been pretty easy, but what about inserts? We’ll look at a couple of solutions, and a couple of issues, and see what seems to be the best approach.

I mentioned in an earlier post, “Using WebSQL in my Mobile App - Lessons Learned - The hard way”, that although the standard covers a pretty good amount of SQL, not everything is supported in every implementation. In Safari the following syntax works, but it will not in Chrome:

'INSERT INTO myTable ( ID ) values ( ? ), ( ? )', [ null, null ]

 

In Chrome, as far as I know, each INSERT can only insert one record. I am going to look at the different ways to do a bulk insert, using the single-insert syntax:

'INSERT INTO myTable ( ID ) values ( ? )', [ null ]

 

Scenario 1 - 500+ Transactions - 500+ Queries

Normally, I wrap my inserts in a transaction, so my code will look something like this:

create: function( companyData ){
    app.db.transaction(function (tx) {

        // note: no trailing comma after the last column name
        var thesql = 'INSERT INTO company ' +
                     '( ' +
                     'ID, ' +
                     'company, ' +
                     'updatedAt ' +
                     ') VALUES (' +
                     'null, ' +
                     '?, ' +
                     '?)';
        tx.executeSql(thesql,
                      [
                          companyData.company,
                          companyData.updatedAt
                      ],
                      function( tx, data ){ console.log('done'); },
                      function( tx, error ){
                          console.log( 'error inserting company in local db' );
                          console.log( error );
                      }
                     );
    });

}


This works well for simple inserts, but when you are bulk inserting, the cost of setting up and tearing down each transaction adds up, especially when you are doing 500+ of them.

Running scenario 1: 15000 - 15500 milliseconds

 

Scenario 2 - 1 Transaction - 500+ Queries

Let's set up a special batch function and pass in the array to loop over inside the transaction, executing 1 transaction but 500+ queries.

createBatch: function( companyData ){

    app.db.transaction(function (tx) {
        var thesql = 'INSERT INTO company ' +
                     '( ' +
                     'ID, ' +
                     'company, ' +
                     'updatedAt ' +
                     ') VALUES (' +
                     'null, ' +
                     '?, ' +
                     '?)';
        // one transaction, but one executeSql call per record
        for (var i = 0; i < companyData.length; i++){
            tx.executeSql(thesql,
                [
                    companyData[i].company,
                    companyData[i].updatedAt
                ],
                app.companyDAO.genericSqlResult,
                app.companyDAO.genericSqlError
            );
        }
    });
}

 

The only difference, other than wrapping the tx.executeSql inside the for loop, is that I removed the anonymous functions and used named functions instead. Why did I do that? First, JSHint told me to… which is pretty smart, really. Creating 500+ anonymous functions is probably not smart; defining one function and calling it from each iteration of the loop is a big win for efficiency.
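I haven't shown genericSqlResult and genericSqlError here; they can be as simple as something like this (just a sketch to adjust to your own DAO — only the names come from the code above):

genericSqlResult: function( tx, result ){
    // the WebSQL result set exposes rowsAffected, insertId and rows
    console.log( 'statement ok, rowsAffected: ' + result.rowsAffected );
},

genericSqlError: function( tx, error ){
    console.log( 'error executing sql in local db' );
    console.log( error );
    // returning false tells WebSQL to keep going instead of rolling back the transaction
    return false;
}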

Running scenario 2: 2200 - 2500 milliseconds

 

Scenario 3 - 1 Transaction - 1 Query - 500+ Values

Now, assuming we are in a WebSQL environment that supports the Safari syntax, where one query can insert multiple rows, we should see even faster times. I wonder how much faster. I adjusted my code to match this syntax:

'INSERT INTO myTable ( ID ) values ( ? ), ( ? )', [ null, null ]

 

To do this, we need to loop over the number of values, building up the SQL, and then add all of the parameters into one big array. Here is how I did it.

createBatch2: function( companyData ){

    console.log('query starting');
    app.db.transaction(function (tx) {
        var thesql = 'INSERT INTO company ' +
                     '( ' +
                     'ID, ' +
                     'company, ' +
                     'updatedAt ' +
                     ') VALUES ';
        // build one ( null, ?, ? ) group per record
        for (var i = 0; i < companyData.length; i++){
            thesql = thesql +
                     '(' +
                     'null, ' +
                     '?, ' +
                     '?)';

            // add a comma after each set of values except the last
            if (i < companyData.length - 1){
                thesql = thesql + ',';
            }
        }
        // add all values into one flat parameter array,
        // in the same order as the ? placeholders
        var paramArray = [];
        for ( i = 0; i < companyData.length; i++){
            paramArray.push( companyData[i].company );
            paramArray.push( companyData[i].updatedAt );
        }
        // execute the single query
        tx.executeSql(thesql,
            paramArray,
            app.companyDAO.genericSqlResult,
            app.companyDAO.genericSqlError
        );
    });
}

 

Run the query, and now I get an error. Great, what did I do wrong? Here is the error:

{"code":5,"message":"could not prepare statement (1 too many terms in compound SELECT)"}

 

After some Googling, I found out that SQLite has a default maximum of 500 terms in a compound SELECT, which caps a multi-value insert at 500 rows… so to get it to run, I just switched out companyData.length for the number 500. Now when I run it, I get a result.

Summary of Results

Running scenario 1: 15000 - 15500 milliseconds - 561 records
Running scenario 2: 2200 - 2500 milliseconds - 561 records
Running scenario 3: 90 - 100 milliseconds - 500 records

Of course Scenario 3 only inserts 500 records, but the numbers are obviously much better for Scenario 3, whether you compare against 500 or 561 records.

In summary, if you are doing bulk imports, you need to know your environment, because the fast method is only available on some implementations. If you are using Scenario 3, you will need to batch your queries into 500-value chunks, or override the limits as specified at SQLite.org. A sketch of that chunking follows below.
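Here is a minimal sketch of the chunking, reusing the createBatch2 function from above (createChunked and chunkSize are illustrative names, and the 500-row chunk size matches SQLite's default compound SELECT limit):

createChunked: function( companyData ){
    var chunkSize = 500;
    for (var start = 0; start < companyData.length; start += chunkSize){
        // slice copies up to 500 records; the last chunk may be shorter
        var chunk = companyData.slice( start, start + chunkSize );
        app.companyDAO.createBatch2( chunk );
    }
}

Each chunk runs in its own transaction, which is still far cheaper than one transaction per record.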

If you are not sure whether you can use multiple values inside a single query, batching your queries into one transaction is still much better than using a normal create DAO function, so take the extra time to implement a batchCreate method too.
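If you would rather detect support at runtime than hard-code it per platform, one option (just a sketch — multiValueTest and supportsMultiValueInsert are illustrative names, not part of my DAO) is to try a tiny multi-value insert against a scratch table and record whether it succeeds:

app.db.transaction(function (tx) {
    tx.executeSql('CREATE TABLE IF NOT EXISTS multiValueTest ( ID INTEGER )');
    tx.executeSql(
        'INSERT INTO multiValueTest ( ID ) VALUES ( ? ), ( ? )',
        [ 1, 2 ],
        function(){ app.supportsMultiValueInsert = true; },
        function(){
            app.supportsMultiValueInsert = false;
            return false; // swallow the error so the transaction is not rolled back
        }
    );
    tx.executeSql('DROP TABLE IF EXISTS multiValueTest');
});

Since the callbacks are asynchronous, check the flag after the transaction completes before deciding which batch method to call.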

I hope you found this interesting, and that the code helps you out when working with your WebSQL database and tackling large bulk data imports and inserts.
Now I know which method is best, and which one I'll be using in the future.
