14

The project requirements are odd for this one, but I'm looking to get some insight...

I have a CSV file with about 12,000 rows of data and roughly 12-15 columns. I'm converting that to a JSON array and loading it via JSONP (it has to run client-side). Any kind of query against the data set to return a smaller, filtered data set takes many seconds. I'm currently using JLINQ to do the filtering, but I'm essentially just looping through the array and returning a smaller set based on conditions.
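For context, a loop-per-query approach re-scans all 12,000 rows on every filter. If the same columns are queried repeatedly, building a one-time index (a plain `Map` keyed on the hot column) turns each query into a single lookup. This is a minimal sketch with made-up rows and column names standing in for the real CSV data:

```javascript
// Hypothetical rows standing in for the parsed CSV/JSONP payload.
const rows = [
  { id: 1, state: "OH", price: 10 },
  { id: 2, state: "CA", price: 25 },
  { id: 3, state: "OH", price: 40 },
];

// Build the index once, after the data arrives.
const byState = new Map();
for (const row of rows) {
  if (!byState.has(row.state)) byState.set(row.state, []);
  byState.get(row.state).push(row);
}

// Each subsequent query is a single lookup instead of a full-array scan.
const ohioRows = byState.get("OH") || [];
console.log(ohioRows.length); // 2
```

The index costs one pass and some memory up front, which is usually a good trade when the data is loaded once and queried many times.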

Would WebDB or IndexedDB allow me to do this filtering significantly faster? Do you know of any tutorials or articles that tackle this particular type of problem?

2
  • To offer you any useful specifics, we'd have to see what JSON format you have the data in and what queries/filters you're trying to run. The match between the data format (or indexes) and the desired filter/query operation is what gives you speed.
    – jfriend00
    Commented May 7, 2012 at 16:09
  • I'm not familiar with WebDB, but client-side SQL might help. In the end, though, you're at the mercy of the browser's engine.
    – DA.
    Commented May 7, 2012 at 16:10

3 Answers

12

http://square.github.com/crossfilter/ (no longer maintained; see https://github.com/crossfilter/crossfilter for a newer fork)

Crossfilter is a JavaScript library for exploring large multivariate datasets in the browser. Crossfilter supports extremely fast (<30ms) interaction with coordinated views, even with datasets containing a million or more records...

2
  • 1
    This is an amazing library that I just found, and so clearly the right answer for OP's problem. Commented May 13, 2012 at 19:02
  • 6 years later I'm seeing this and it's still amazing. Thank you.
    – dave4jr
    Commented Mar 23, 2018 at 5:33
3

This reminds me of an article John Resig wrote about dictionary lookups (a real dictionary, not a programming construct).

http://ejohn.org/blog/dictionary-lookups-in-javascript/

He starts with server-side implementations and then works toward a client-side solution. It should give you some ideas for ways to improve what you are doing right now:

  • Caching
  • Local Storage
  • Memory Considerations
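The caching and local-storage ideas above can be sketched roughly as follows. The storage object is injected so the same function works with `window.localStorage` in the browser; the `"csv-cache"` key name and the stand-in storage object are assumptions for illustration:

```javascript
// Load rows from cache if present; otherwise fetch, parse, and cache them.
function loadRows(storage, fetchRows) {
  const cached = storage.getItem("csv-cache");
  if (cached) return JSON.parse(cached); // skip re-download and re-parse

  const rows = fetchRows(); // e.g. the JSONP payload
  storage.setItem("csv-cache", JSON.stringify(rows));
  return rows;
}

// A plain-object stand-in for localStorage, purely for illustration.
const fakeStorage = {
  data: {},
  getItem(k) { return k in this.data ? this.data[k] : null; },
  setItem(k, v) { this.data[k] = String(v); },
};

let fetches = 0;
const fetchRows = () => { fetches++; return [{ id: 1 }, { id: 2 }]; };

loadRows(fakeStorage, fetchRows);               // first call fetches and caches
const rows = loadRows(fakeStorage, fetchRows);  // second call hits the cache
console.log(fetches, rows.length); // 1 2
```

In a real browser, `localStorage.setItem` can throw when the quota (commonly ~5 MB) is exceeded, so the caching step should be wrapped in a try/catch that falls back to fetching every time.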
1
  • Not precisely apples-to-apples, but I'm hoping it'll give you some ideas. Commented May 7, 2012 at 16:12
3

If you require loading an entire data object into memory before you apply some transform to it, I would leave IndexedDB and WebSQL out of the mix, as they typically both add complexity and reduce the performance of apps.

For this type of filtering, a library like Crossfilter will go a long way.

Where IndexedDB and WebSQL can come into play for filtering is when you don't need, or don't want, to load an entire dataset into memory. These databases are best utilized for their ability to index rows (WebSQL) and attributes (IndexedDB).

With in-browser databases, you can stream data into a database one record at a time and then cursor through it, one record at a time. The benefit here for filtering is that this means you can leave your data on "disk" (a .leveldb in Chrome and a .sqlite database in Firefox) and filter out unnecessary records either as a pre-filtering step or as the filter itself.
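A browser-only sketch of that cursoring pattern, assuming hypothetical store and index names (`"sales"`, `"region"`); only the records matching the key range are pulled off disk:

```javascript
// Browser-only sketch; database, store, and index names are assumptions.
const open = indexedDB.open("demo-db", 1);

open.onupgradeneeded = (e) => {
  const db = e.target.result;
  const store = db.createObjectStore("sales", { keyPath: "id" });
  store.createIndex("region", "region"); // attribute index used for filtering
};

open.onsuccess = (e) => {
  const db = e.target.result;
  const tx = db.transaction("sales", "readonly");
  const index = tx.objectStore("sales").index("region");

  const results = [];
  // Cursor only the records whose region is "east"; the rest stay on disk.
  index.openCursor(IDBKeyRange.only("east")).onsuccess = (ev) => {
    const cursor = ev.target.result;
    if (cursor) {
      results.push(cursor.value);
      cursor.continue();
    } else {
      console.log(results); // every matching record, streamed one at a time
    }
  };
};
```

Note the API is fully asynchronous, so the filtered results arrive in the cursor's `onsuccess` callback rather than as a return value.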
