
Stop making your SPA data second class.

It's time to rethink client-side data.

This is the first in a series of posts about best practices and approaches I've learned while working on cross-platform HTML5 applications with Ember.

It takes dedication to craft a user experience in HTML5 that attempts to rival native, and many of the conventional approaches to web applications simply don't translate well to single-page applications (SPAs) or perform well enough on mobile.

When I began building a cross-platform mobile app with Ember, I already had one large and several small Ember applications under my belt. I felt I had a pretty good grasp of what Ember was capable of, and even where it fell short (ahem, rendering performance), I could see the core team dedicated to iterative improvement, with a clear high-performance future ahead.

Ultimately, it wasn't Ember's initial render or re-render speed that held back the app's UX the most (though they did hold it back a lot, and that will be the subject of many posts to come). In the grand scheme of things, a user spends most of their time on a single screen, and while on that screen re-renders are hardly noticed. It's the transition to that screen that matters most.

In native app development, a little bit of lag time while data is retrieved and processed is easily hidden by a transition animation. This works because data is retrieved and processed on non-UI threads.

Our application uses liquid-fire for transitions, which utilizes velocity.js to combine JS optimizations with GPU acceleration to get silky smooth animations. This works well as long as your UI thread is unblocked, but if the view you are entering has a lot of calculations to perform and models to fetch, you instead get a choppy, uncomfortable experience.

Getting route transitions to be smooth was a multi-layered responsibility.

1: Use Backburner.js (aka Ember.run)

Ember has many micro-lib tool belts it relies upon. One of these tool belts is backburner, a small lib that lets you schedule asynchronous callbacks into named queues so that they can be executed in a meaningful order.

Say you want to fetch some data from the server, but you know you should render what you already have first?

You would use Ember.run.schedule('afterRender', ....

Or perhaps you need to perform a calculation before render occurs? Ember.run.schedule('actions' ....

Ember has a number of built-in backburner queues you should know about, and you can add your own. In our app.js, I added two additional queues.

Ember.run._addQueue('backgroundPush', 'afterRender');  
Ember.run._addQueue('backgroundLoad', 'backgroundPush');

var App = Ember.Application.extend({ ... })  

Adding these queues here (vs. in an initializer) avoided errors that could be thrown when an object that uses one of them is instantiated before the queue exists. I added these before services were created; now that many of my custom service-like additions are actual services, and with the new instance-initializer setup, I suspect I could get away with simply ensuring the queues are initialized first thing.

Here's how I handle these queues.

  • afterRender is for processing data for the current screen.
  • backgroundPush is for processing data that's already available but not pertinent to what's on screen.
  • backgroundLoad is for fetching new data (again, not on screen).

By using the appropriate queues to schedule work and pre-load data, we could (mostly) keep the thread clear of any non-UI work while the transition animation was occurring.
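To make the queue ordering concrete, here is a toy model in plain JavaScript of named queues that flush in a fixed order. It is only a sketch of the idea and not Backburner's implementation: `QueueRunner`, its `addQueue` method, and the shortened default queue list are assumptions made for illustration.

```javascript
// Toy model of named, ordered work queues (NOT Backburner itself).
class QueueRunner {
  constructor(queueNames) {
    this.order = queueNames.slice();
    this.queues = new Map(queueNames.map((name) => [name, []]));
  }

  // Insert a new queue directly after an existing one.
  addQueue(name, after) {
    const index = this.order.indexOf(after);
    if (index === -1) throw new Error('Unknown queue: ' + after);
    this.order.splice(index + 1, 0, name);
    this.queues.set(name, []);
  }

  schedule(name, fn) {
    const queue = this.queues.get(name);
    if (!queue) throw new Error('Unknown queue: ' + name);
    queue.push(fn);
  }

  // Run one job from the earliest non-empty queue, then start over, so
  // work scheduled into an earlier queue always runs before later queues.
  flush() {
    let i = 0;
    while (i < this.order.length) {
      const queue = this.queues.get(this.order[i]);
      if (queue.length === 0) {
        i++;
        continue;
      }
      queue.shift()();
      i = 0;
    }
  }
}

const runner = new QueueRunner(['actions', 'render', 'afterRender']);
runner.addQueue('backgroundPush', 'afterRender');
runner.addQueue('backgroundLoad', 'backgroundPush');

const log = [];
runner.schedule('backgroundLoad', () => log.push('fetch new data'));
runner.schedule('afterRender', () => log.push('process on-screen data'));
runner.schedule('backgroundPush', () => log.push('push cached data'));
runner.schedule('render', () => log.push('render'));
runner.flush();
// log is now: render, process on-screen data, push cached data, fetch new data
```

No matter what order the work is scheduled in, the background queues only run once everything tied to the current screen has finished.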

The problem is, while this helps ensure a smooth animation, that animation partially exists to cover up the fact that we're loading data, something that we just agreed we don't want to do during the animation. Curse you, JavaScript, and your single-threaded nature! (jk). If only there were threads in JavaScript?

There are!

Enter Web Workers.

2: Use Web Workers

If you need IE8/9, your worker interface will need to fall back to an interface on the main thread. Otherwise, support for Web Workers is strong.
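A fallback like that can be hidden behind a single interface. The sketch below is one hypothetical way to do it, not the code from our app: `createDataChannel`, its option names, and the `data-worker.js` script name are all assumptions. The `WorkerCtor` option exists only so the constructor can be swapped out or disabled.

```javascript
// Hypothetical helper: talk to a Web Worker when one is available,
// otherwise do the same work on the main thread behind the same API.
function createDataChannel({ scriptUrl, fallbackHandler, onMessage, WorkerCtor }) {
  // Allow injection; otherwise detect the global Worker constructor.
  if (WorkerCtor === undefined) {
    WorkerCtor = typeof Worker !== 'undefined' ? Worker : null;
  }

  if (WorkerCtor) {
    const worker = new WorkerCtor(scriptUrl);
    worker.onmessage = (event) => onMessage(event.data);
    return {
      usesWorker: true,
      send: (message) => worker.postMessage(message),
    };
  }

  // No worker support: run the handler on the main thread, but defer it
  // so callers still see asynchronous delivery, as they would with a worker.
  return {
    usesWorker: false,
    send: (message) => {
      setTimeout(() => onMessage(fallbackHandler(message)), 0);
    },
  };
}

const channel = createDataChannel({
  scriptUrl: 'data-worker.js', // hypothetical worker script
  fallbackHandler: (message) => ({ id: message.id, normalized: true }),
  onMessage: (payload) => console.log('received', payload),
  WorkerCtor: null, // force the main-thread fallback for this demo
});
channel.send({ id: 1 });
```

The rest of the app only ever calls `send` and handles `onMessage`, so it never needs to know which path is in use.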

Web Workers aren't all there yet. For one, it's hard to transfer objects from the worker to the main thread (if you could, we would truly be entering the golden age of web applications). For another, there's no DOM (though it is possible to get a simple-dom micro-lib to work in one). Web Workers can, however, establish web socket connections and make ajax requests.

Ideally, a record created by the Web Worker after it normalizes a JSON payload could be transferred directly to the app's store as a ready-to-go record. Unfortunately, even with transferable objects, that's not possible (yet), but we can do almost as well (discussed below).

Ideally, we could also pre-generate DOM Fragments within the worker and transfer them to the main thread for use. That's also not possible (at all). The best that can be done is some HTML string generation that the main thread will still have to parse into actual DOM.

To keep an uncluttered UI thread, make your requests and do your data normalization in the web worker. From there you have two options for transferring the payload to the store.

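Option one is to use JSON.stringify and postMessage to send the data to the UI thread. Option two is to push the normalized data into localStorage under an agreed-upon key, then alert the UI thread (via postMessage) that the data is available in localStorage upon request. (Browsers do not expose the Web Storage API inside a standard worker scope, so the write itself has to go through whatever storage path your worker can actually reach.)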

I've found loading and parsing JSON from local storage to be roughly 5-15% faster than via postMessage. But the real gain isn't in how you get the data, it's in the peripheral advantages. By loading via localStorage you automatically cache your data, and you can pre-load data easily, alerting the UI thread to what's available.
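Here is a minimal sketch of that second option: write the payload under an agreed-upon key, send a small notice, and let the UI side parse it only when it needs it. Every name in it (`publishPayload`, `createInbox`, `createMemoryStorage`, the `data-available` message type) is hypothetical. The storage is passed in as an object with `getItem` and `setItem`, the same shape as the Web Storage API, and an in-memory stand-in is used so the example runs anywhere; the `notify` callback stands in for postMessage.

```javascript
// Producer side: store the normalized payload, then send a small notice
// instead of the payload itself.
function publishPayload(storage, notify, key, payload) {
  storage.setItem(key, JSON.stringify(payload));
  notify({ type: 'data-available', key: key });
}

// UI side: remember which keys are available, parse only when asked.
function createInbox(storage) {
  const available = new Set();
  return {
    onNotice(message) {
      if (message.type === 'data-available') available.add(message.key);
    },
    has: (key) => available.has(key),
    load(key) {
      const raw = storage.getItem(key);
      return raw === null ? null : JSON.parse(raw);
    },
  };
}

// In-memory stand-in with the same getItem/setItem shape as Web Storage.
function createMemoryStorage() {
  const map = new Map();
  return {
    getItem: (key) => (map.has(key) ? map.get(key) : null),
    setItem: (key, value) => {
      map.set(key, String(value));
    },
  };
}

const storage = createMemoryStorage();
const inbox = createInbox(storage);

publishPayload(storage, (message) => inbox.onNotice(message), 'conversation-1', {
  id: 1,
  title: 'Hello',
});

inbox.has('conversation-1'); // true
inbox.load('conversation-1').title; // 'Hello'
```

Because the notice carries only the key, the UI thread pays the parsing cost when it actually wants the data, and anything written earlier is already sitting there as a cache.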

This mindset (of using available data) is key to performance, and it leads to step number 3 and the premise of our title.

3: Learn to trust your local data.

SPAs are a different beast than a traditional web app. SPAs on mobile are a different beast than SPAs on desktop. But all of these benefit from an improved data flow story.

What do I mean?

In a server-rendered web app, data requests are made per page request and aren't cached locally. There's no need: the next request for that page is served by the server anyway. You might be able to cache the page HTML, and you may even cache the result of the data request, but those are details the server-based application doesn't think about. For all intents and purposes, the server-based web app re-requests all the data each time.

This means the data is (with obvious exceptions) fresh, and trustworthy.

In an SPA, I find most developers fall into a trap by approaching data from a server-based app mentality.

In your Ember/Angular/React app, do you do this?

export default Ember.Route.extend({
    model: function() {
       // always asks the server, even if the record is already in the store
       return this.store.fetch('foo', 1);
    }
});

Or this?

export default Ember.Route.extend({
    afterModel: function(model) {
       // the resolved model is passed in as the first argument
       model.reload();
    }
});

If so, you are thinking about your data from a server-based perspective. You feel that on each page load the data must be requested in order to ensure validity. I think this exact way of thinking is what led to fetch being added to ember-data (yes, fetch is very necessary, but the need for it is actually much lower than you probably feel it is).

Does this mean that the following is a better approach?

export default Ember.Route.extend({
    model: function() {
       // uses the record from the store when it is already loaded
       return this.store.find('foo', 1);
    }
});

In some ways, yes. This approach has other problems though. In its current state, it clutters the UI thread. It also doesn't know if the particular record has gone stale.

Instead, I return live records/arrays with locally available data in them, and trigger background loading or reloading as necessary. The advantage of doing so is that I can render immediately with whatever is available, and update immediately as new data becomes available.

Doesn't this lead to layout thrashing? Actually, no, not if you know a little bit about how Ember works. It did pre-Glimmer, until I devised a proxied approach and later abstracted the approach into the magicArray in smoke-and-mirrors. The proxied array ensures that even if the underlying array is completely changed out, existing views and their HTML are reused instead of destroyed.

Those extra backburner queues I added above? That's right, those exist entirely for this approach to data loading. In its first pass, the model code looked something like this.

model: function (params) {
    var cache = this.get('_REQUESTS');
    var route = this;

    // stale or never requested: log the request and go to the server
    if (cache.isStale('conversation-' + params.id)) {
      cache.logRequest('conversation-' + params.id);
      return this.store.fetch('conversation', params.id);
    }

    // otherwise use whatever is already in the store
    var model = this.store.getById('conversation', params.id);

    if (!model) {
      // alert user with notification
      ...
      route.transitionTo('index');
    } else {
      return model;
    }
  }

The trouble here was that if a conversation hadn't been pre-loaded or was stale, we were left with a cluttered UI thread. What's not seen here is that the adapter and web socket connection are all in a Web Worker. We're most of the way to an efficient solution, but not all the way there. What we really need is a way to bail late in the process if (while background loading) the request fails.

Take 2:

model: function (params) {
    var route = this;
    return this.store.request('conversation', params.id, function() {
      // alert user with notification
      ...
      route.transitionTo('index');
    });
}

Obviously, store here is a highly modified version of ember-data. Here's the gist of how the mechanics of this work.

  • check if data exists
  • if no data or data is stale, schedule a background task to retrieve data
  • return either the existing record or a ProxyObject that will resolve to the record later.
  • execute the fail callback if the background load fails.

If the record isn't available yet, we don't create a temporary record. This keeps us from encountering empty ghost records in our store.
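The mechanics above can be sketched in plain JavaScript. This is not our modified ember-data store, just a hypothetical stand-in: `createStore`, `peek`, and the injected `loader`, `isStale`, and `scheduleBackground` functions are assumptions made for illustration, and the proxy is a bare object rather than an Ember ProxyObject.

```javascript
// Hypothetical sketch of the request flow described above.
function createStore({ loader, isStale, scheduleBackground }) {
  const records = new Map(); // only real, loaded records live here

  function request(type, id, onFail) {
    const key = type + '-' + id;
    const existing = records.get(key);

    // A proxy stands in for a record that has not arrived yet. It is never
    // placed in `records`, so the store holds no empty ghost records.
    const proxy = existing ? null : { content: null, isLoaded: false };

    if (!existing || isStale(key)) {
      scheduleBackground(() => {
        loader(type, id, (data) => {
          const record = records.get(key) || {};
          Object.assign(record, data); // update in place so holders see it
          records.set(key, record);
          if (proxy) {
            proxy.content = record;
            proxy.isLoaded = true;
          }
        }, onFail);
      });
    }

    return existing || proxy;
  }

  return {
    request,
    peek: (type, id) => records.get(type + '-' + id) || null,
  };
}

const tasks = [];
const store = createStore({
  // Fake loader: only conversation 1 exists.
  loader: (type, id, succeed, fail) =>
    id === 1 ? succeed({ id: 1, title: 'Hello' }) : fail(new Error('not found')),
  isStale: () => false,
  scheduleBackground: (task) => tasks.push(task),
});

const result = store.request('conversation', 1, () => {});
// result.isLoaded is false here; the route can render right away.
tasks.shift()(); // the background queue runs later
// result.isLoaded is now true and result.content.title is 'Hello'.
```

In a real app `scheduleBackground` would hand the task to a background queue and the loader would talk to the worker; here both are injected so the flow is easy to follow.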

Okay, so this looks pretty neat, but how do you know if data is stale? That leads me to my last point in this post.

4: Trust your data until you don't.

How long you trust data or how much you trust a specific data source is up to you, the developer. Maybe your API uses ETags? Maybe you know you only trust your data for 15 minutes from the last fetch? Maybe you know you only trust your data until X event occurs (such as a sign off or your web socket breaking)?

Build a request cache that smartly invalidates its data based on rules you define. Mine is just a simple hash map of URLs requested that namespaces based on the record type returned, is evented (you can invalidate an entire namespace on event), and has per-record invalidation rules (based on time).
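As a rough illustration, here is a minimal sketch of such a cache. It borrows the `isStale` and `logRequest` names from the route code earlier in this post, but everything else is an assumption: the `RequestCache` class, keys shaped like `<namespace>-<id>`, the `maxAgeMs` lifetimes, and the injectable `now` clock are hypothetical, not my actual implementation.

```javascript
// Hypothetical request cache. Keys look like "<namespace>-<id>".
class RequestCache {
  constructor({ maxAgeMs, now = Date.now }) {
    this.maxAgeMs = maxAgeMs; // lifetimes per namespace, plus a `default`
    this.now = now; // injectable clock
    this.entries = new Map(); // key -> time of the last logged request
  }

  namespaceOf(key) {
    return key.split('-')[0];
  }

  logRequest(key) {
    this.entries.set(key, this.now());
  }

  isStale(key) {
    if (!this.entries.has(key)) return true; // never requested
    const namespace = this.namespaceOf(key);
    const hasOwnRule = Object.prototype.hasOwnProperty.call(this.maxAgeMs, namespace);
    const maxAge = hasOwnRule ? this.maxAgeMs[namespace] : this.maxAgeMs.default;
    return this.now() - this.entries.get(key) > maxAge;
  }

  // Call this from an event handler, e.g. on sign off or socket loss.
  invalidateNamespace(namespace) {
    for (const key of Array.from(this.entries.keys())) {
      if (this.namespaceOf(key) === namespace) this.entries.delete(key);
    }
  }
}

let clock = 0;
const cache = new RequestCache({
  maxAgeMs: { default: 15 * 60 * 1000, conversation: 60 * 1000 },
  now: () => clock,
});

cache.isStale('conversation-1'); // true: never requested
cache.logRequest('conversation-1');
cache.isStale('conversation-1'); // false: just fetched
clock += 61 * 1000;
cache.isStale('conversation-1'); // true: older than one minute
```

Time-based rules cover the "15 minutes from the last fetch" case, while `invalidateNamespace` covers the "until X event occurs" case.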

The point is, until such time as you know you can't trust your data any longer, go ahead and trust it. Or, as the title of this post says, stop making your SPA data second class. This isn't a server-rendered application. We live in a mobile-first world where connectivity is neither guaranteed nor constant. The fewer requests you make and the faster you return data to the screen, the snappier your applications will be, and the happier your users will be when their connection switches from LTE to 3G to none to Wi-Fi as they walk from their car to their office.