7.12.07

Curiosity is bliss: "Take It With You" Wiki

January 17, 2006

Although this blog has been silent for a while, I haven't been idle. I was working on an AJAX-based web application with transparent support for disconnected operations.

TiwyWiki is a prototype wiki that runs both online and offline without any install (besides Flash Player).
Here's a demonstration scenario:

  1. load the demo (requires Flash 8) and browse a couple of pages,
  2. unplug your computer from the network and put your browser in Offline mode,
  3. re-open the wiki using the same url,
  4. while offline, continue reading and editing the cached pages of the wiki, create new pages,
  5. go back online and sync your updates back to the server.

I've only tested TiwyWiki on IE and Firefox on Windows, and I've heard that it runs properly on Mac (Safari, I think). Let me know if it runs for you on any other platform.

This is just the skeleton of a wiki, but it gives a feeling for the possibilities of web applications that can deal gracefully with being intermittently disconnected. I'm especially interested in hearing back about whether this approach is valuable to you, in comparison to the traditional web and rich client models.
What other applications would you find most appealing, and why?
Here are the ones I've brainstormed so far: a personal wiki, various other personal or group GTD tools (such as a todo list or calendar), a community wiki, an email reader and/or composer, a blog editor, an RSS reader, an app for driving directions.


Some background:

Two problems ran in circles in my head while I was on vacation a couple of weeks ago: how to make cross-domain XMLHttp requests before cross-domain is actually supported by browsers and how to allow web applications to run offline?
I started by focusing on the first one, probably because I've been toying recently with cross-domain XMLHttp and client-side storage through Greasemonkey. Also, I was thinking that it would help for running local/offline copies of web apps.

The problem with using Greasemonkey to extend the browser is that it's not widely available and it doesn't offer good control over cross-domain requests. A Flash and Javascript combination, such as the Flash-based Canvas or AMASS storage, seemed like a better solution.
As I learnt more about Flash 8 and its security model, my original plan of running a local copy of a web page for offline use didn't seem convenient enough: you would have to explicitly save the app locally and sync it before going offline.

When I found out that Flash Player did cache Flash apps properly, the idea of serving both the online and offline scenarios from the same cached app took the lead. This avoided the new security restrictions for local apps in Flash 8, the need to keep two local caches of the data (one for the online domain and one for the local copy), and any installation step.
Instead you would be able to use the app locally as soon as you used it online. First, whatever content you had already accessed would be cached and persisted locally (in the Flash app/storage). You could use pre-fetching to ensure your local cache would have the data that you want.
Second, the Flash app would act as a buffer for disconnected operations, such as local updates while running offline.


Design philosophy:

One interesting thing to realize is how and why the pieces fit together.
As a starting point, you should understand that the AJAX trend is not simply about rich UI and eye candy, but more generally about providing a more responsive experience by optimizing the bottleneck resource (the network): you cache the data that doesn't change (some HTML, Javascript or CSS), and transfer only the information that is dynamic.
Once you have a web application that is entirely cacheable, you can support offline operations. You just need to have all the dynamic data go through a smart proxy that can do disconnected reads and updates.
That's where Flash comes into play, as it offers large persistent local storage and easy interfacing with Javascript.
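
As a rough illustration of such a smart proxy, here is a minimal TypeScript sketch. It is not TiwyWiki's code: the names `RemoteStore`, `OfflineProxy` and `PendingEdit` are made up for this example, the cache lives in memory rather than in Flash storage, and the server calls are synchronous to keep things short (real network calls would be asynchronous). It also ignores conflicts between local and server edits.

```typescript
// Hypothetical server interface: both methods throw when the network is down.
interface RemoteStore {
  load(page: string): string;
  save(page: string, text: string): void;
}

// One locally buffered update, waiting to be sent to the server.
interface PendingEdit {
  page: string;
  text: string;
}

class OfflineProxy {
  private remote: RemoteStore;
  private cache: Record<string, string> = {};
  private pending: PendingEdit[] = [];

  constructor(remote: RemoteStore) {
    this.remote = remote;
  }

  private hasPending(page: string): boolean {
    for (let i = 0; i < this.pending.length; i++) {
      if (this.pending[i].page === page) return true;
    }
    return false;
  }

  // Disconnected read: refresh from the server when possible, otherwise
  // serve the cached copy. Pages with unsent local edits are never
  // overwritten by the server version.
  read(page: string): string | undefined {
    if (!this.hasPending(page)) {
      try {
        this.cache[page] = this.remote.load(page);
      } catch (e) {
        // Offline or server error: fall through to the cached copy.
      }
    }
    return Object.prototype.hasOwnProperty.call(this.cache, page)
      ? this.cache[page]
      : undefined;
  }

  // Disconnected update: apply locally right away and queue for later.
  write(page: string, text: string): void {
    this.cache[page] = text;
    this.pending.push({ page: page, text: text });
  }

  // Replay queued edits in order. Stops at the first failure and keeps the
  // remaining edits queued. Returns the number of edits sent.
  sync(): number {
    let sent = 0;
    while (this.pending.length > 0) {
      const edit = this.pending[0];
      try {
        this.remote.save(edit.page, edit.text);
      } catch (e) {
        break;
      }
      this.pending.shift();
      sent++;
    }
    return sent;
  }

  pendingCount(): number {
    return this.pending.length;
  }
}
```

The application code only ever talks to the proxy, so it doesn't need to know whether it is online: `read` and `write` always succeed against the local copy, and `sync` can simply be retried until the queue is empty.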

I don't see Flash as the long term solution, but rather a temporary workaround that allows for some early experimentation. Instead of waiting for new browser infrastructures, I wanted to demonstrate that web apps with offline support and no install were already feasible, relying only on a new combination of existing techniques.

That's why I tried to keep the extensions to the browser as clean and simple as possible, minimizing the amount of Flash and relying more on the common skillset (HTML+Javascript). I think this will motivate other developers to try this approach.

In this case, Flash actually turned out to be rather unobtrusive.
First, if you don't have it installed, the web app will still work fine, except with no offline support or persistent data caching.
Second, Flash offers some benefits that I hadn't anticipated. For example, the storage is shared between IE and Firefox. This makes for a nicer experience than I would expect from any native browser API, such as IE's client storage API or the drafted storage API from the WHAT working group.

For those who want to avoid Flash, other alternative storage techniques could possibly be used to achieve similar results, such as IE's storage API, a Java applet, an ActiveX object or some other kind of browser extension.

In the long run, I hope this proof of concept and the following uses of this technique will help identify the right set of APIs to implement natively in browsers.

Caching:

Caching is at the heart of this solution and needs to be configured properly. When the expiration header (Expires, using mod_expires in Apache or directly in IIS) is correctly set for all the static content, both Firefox and IE let you run the application offline without complaining.
Overall, IE appears to be more sensitive to mis-configured caching headers and in that case, it would often display some prompts to work offline or return online to continue the current operation.
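
To make the header policy concrete, here is a small TypeScript sketch of the decision a server would make for each response. It's only an illustration: the function names, the list of extensions, the one-year lifetime and the `no-cache` choice for dynamic data are my assumptions for this example, not TiwyWiki's actual configuration (which sets Expires through the web server).

```typescript
// Assumed list of file types that make up the static part of the app.
const STATIC_EXTENSIONS = [".html", ".js", ".css", ".swf", ".gif", ".png"];

// Assumed lifetime for static content: one year, in seconds.
const ONE_YEAR_SECONDS = 365 * 24 * 60 * 60;

// A path counts as static when its extension is in the list above.
function isStaticPath(path: string): boolean {
  const dot = path.lastIndexOf(".");
  if (dot < 0) return false;
  return STATIC_EXTENSIONS.indexOf(path.substring(dot).toLowerCase()) >= 0;
}

// Returns the caching headers to send for a given path.
// Static content gets a far-future Expires so the browser can serve it
// offline without asking; dynamic data is marked for revalidation.
function cachingHeaders(path: string, now: Date): Record<string, string> {
  if (isStaticPath(path)) {
    const expires = new Date(now.getTime() + ONE_YEAR_SECONDS * 1000);
    return {
      "Expires": expires.toUTCString(),
      "Cache-Control": "public, max-age=" + ONE_YEAR_SECONDS
    };
  }
  return { "Cache-Control": "no-cache" };
}
```

For example, `cachingHeaders("/tiwy/app.js", new Date())` would return an `Expires` date one year ahead, while a hypothetical data URL such as `/pages/Home` would only get `Cache-Control: no-cache`.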

Loading Flash when offline:
During a troubleshooting session, I noticed something unexpected. The common markup for including Flash objects in IE actually causes a request to Macromedia, which usually replies with a 302 (but no caching headers).
Besides my surprise at discovering that Macromedia's server is hit every time a Flash app is opened in IE, this meant that the Flash object wouldn't load offline. So TiwyWiki uses its own Flash loading technique (yet another) to support running offline.


Busting the cache:
One downside of forcing the application to be cached is that if a new version of the application becomes available, the browser won't notice it until the current version expires from the cache.

I'm still looking for some ideas on how to let the application deal with this update scenario, so that it could have some logic to check for updates and trick the browser into reloading its cache (force refresh). There may be some solutions using the XMLHttp API with the right request headers, if the different browsers cache the responses properly.
As a last resort, one could imagine a new browser API that would allow invalidating the cache for a given domain and path.
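
Here is one way the update check could be sketched in TypeScript. This is speculative: `planRefresh` and `RefreshRequest` are hypothetical names, the version strings are assumed to come from a small version resource on the server and a constant baked into the cached code, and whether browsers actually replace their cached copies in response to such requests is exactly the open question above.

```typescript
// One request the app would issue to try to refresh a cached file.
interface RefreshRequest {
  url: string;
  headers: Record<string, string>;
}

// Decide which files to re-request, given the version baked into the
// cached code and the version reported by the server.
// serverVersion is null when the server can't be reached (offline):
// in that case the app just keeps running from the cache.
function planRefresh(
  cachedVersion: string,
  serverVersion: string | null,
  assets: string[]
): RefreshRequest[] {
  if (serverVersion === null || serverVersion === cachedVersion) {
    return [];
  }
  return assets.map(function (url: string): RefreshRequest {
    return {
      url: url,
      // Request headers asking caches along the way to bypass stored copies.
      headers: { "Cache-Control": "no-cache", "Pragma": "no-cache" }
    };
  });
}
```

The application would then send each planned request through XMLHttp and reload itself once they have all completed.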

Locking files in the cache:
The other problem with running the application out of the browser's cache is that the user could "uninstall" the application by accidentally clearing the cache, or the application could be erased from the cache to make room when the cache fills up.
I'm still looking for ideas on how to achieve proper locking of the files in the cache.

In IE, that should be possible using the "Offline Favorites" feature. Whenever you bookmark a page, IE gives you the option to "Make [the favorite] available offline". If you check that option, IE will use a crawler (MSIECrawler) to pre-fetch and cache the content for offline reading. You can hint the crawler using a CDF file, linked from a tag.
But I implemented and ran various experiments with "Offline Favorites", and couldn't get the files to be properly frozen in the cache (they would still get scavenged to make room).


Making a framework:

A wiki turned out to be a rather complex application in terms of synchronization and error handling. I originally wanted to write a generic framework for occasionally connected web applications, to deal with these problems.
But besides the reusable Flash component, most of the code so far is specific to the schema and synchronization model for the application. My work on a second application (an RSS reader) hasn't helped me bubble the right abstractions yet.

Do you know any generic synchronization framework which could be ported or mimicked in Javascript? Something like TrimQuery would be great if it supported INSERT and UPDATE.

Also, are there any existing libraries that would offer a rich logical view of a persistent storage that only supports sets of name-value pairs?
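
To show the kind of thing I have in mind, here is a toy TypeScript sketch of a table-like view layered on name-value pairs. All the names (`NameValueStore`, `MemoryStore`, `RecordTable`) are invented for this example, the key layout is arbitrary, and the in-memory store only stands in for a real persistent one such as Flash's storage.

```typescript
// The only thing the underlying storage is assumed to offer.
interface NameValueStore {
  get(name: string): string | undefined;
  set(name: string, value: string): void;
}

// In-memory stand-in for a persistent store, for illustration only.
class MemoryStore implements NameValueStore {
  private data: Record<string, string> = {};
  get(name: string): string | undefined {
    return Object.prototype.hasOwnProperty.call(this.data, name)
      ? this.data[name]
      : undefined;
  }
  set(name: string, value: string): void {
    this.data[name] = value;
  }
}

type Row = Record<string, string>;

// A named table: each row is stored as JSON under "<table>:row:<id>",
// and the list of row ids under "<table>:ids".
class RecordTable {
  private store: NameValueStore;
  private table: string;

  constructor(store: NameValueStore, table: string) {
    this.store = store;
    this.table = table;
  }

  ids(): string[] {
    const raw = this.store.get(this.table + ":ids");
    return raw === undefined ? [] : (JSON.parse(raw) as string[]);
  }

  // Insert or replace the row with the given id.
  put(id: string, row: Row): void {
    this.store.set(this.table + ":row:" + id, JSON.stringify(row));
    const ids = this.ids();
    if (ids.indexOf(id) < 0) {
      ids.push(id);
      this.store.set(this.table + ":ids", JSON.stringify(ids));
    }
  }

  get(id: string): Row | undefined {
    const raw = this.store.get(this.table + ":row:" + id);
    return raw === undefined ? undefined : (JSON.parse(raw) as Row);
  }

  // Simple query: all rows whose field equals value (linear scan).
  where(field: string, value: string): Row[] {
    const result: Row[] = [];
    const ids = this.ids();
    for (let i = 0; i < ids.length; i++) {
      const row = this.get(ids[i]);
      if (row !== undefined && row[field] === value) result.push(row);
    }
    return result;
  }
}
```

With this, `new RecordTable(store, "pages").where("status", "modified")` would list the pages that still need to be synced, without the application having to know how the rows are laid out in the store.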

Developing with Flash:

This was my first time working with Flash and overall I found it easier than expected. ActionScript is a sibling of Javascript (both follow the ECMAScript specification), which made it easy to pick up. I was happy to interact with Flash authoring tools as little as possible and ended up building the Flash component entirely with the MTASC compiler.
I haven't met too many problems with the Flash APIs. ExternalInterface is quite convenient, although I've had to work around a performance issue when passing large data across.
I wouldn't expect too much performance from the storage API, SharedObject, which serializes objects into files. But this hasn't been a problem so far.

Open problems:

Besides the problems already mentioned (building a richer storage abstraction, building a generic synchronization framework, getting more control over caching), I've been banging my head against the back button behavior in IE.
The usual hacks rely on iframes pointing to a blank html page on the server, with some unique querystring parameters. Unfortunately, such queries don't work offline, because the unique querystring values essentially keep busting the cache.

I've also encountered some weird issues with Flash in Firefox 1.5, showing "Bad NPObject as private data" in the Javascript console and sometimes popping up warnings that an extension mis-behaved. My guess at this point is that it was some interaction between Flash and some other extension, possibly AdBlock.

And finally, I'm still battling some memory leak issues. Although the code does use closures quite a bit, I can't see how it would create circular reference chains between the DOM and the Javascript engine.


Posted by Julien.