The document discusses the development of node-mdb, an open source emulation of Amazon's SimpleDB NoSQL cloud database using Node.js and the GT.M database. It describes why the author chose to emulate SimpleDB, use GT.M as the database backend, and develop it using Node.js. It also provides details on how node-mdb implements the SimpleDB APIs by mapping the database schema and operations to GT.M globals and leveraging Node.js asynchronous programming patterns.
Developing node-mdb: a Node.js-based clone of SimpleDB
1. Developing node-mdb SimpleDB emulation using Node.js and GT.M Rob Tweed M/Gateway Developments Ltd http://www.mgateway.com Twitter: @rtweed
2. Could you translate that title?
  - SimpleDB: Amazon’s NoSQL cloud database
  - Node.js: evented server-side JavaScript (using V8)
  - GT.M: open source global-storage based NoSQL database
  - node-mdb: open source emulation of SimpleDB
3. SimpleDB
  - Amazon’s cloud database
  - Pay as you go
  - Secure HTTP interface
  - Schema-free NoSQL database
  - Spreadsheet-like database model
    - Domains (= tables)
    - Items (= rows)
    - Attributes (= cells)
    - Values (1+ per attribute allowed)
  - SQL-like query API
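To make the spreadsheet analogy concrete, here is a minimal sketch of one domain pictured as a plain JavaScript object. The item names, attribute names and values are invented for illustration, and `itemsWhere` is a hypothetical helper; this is not how SimpleDB or node-mdb store data internally.

```javascript
// Illustrative only: one SimpleDB domain pictured as nested objects.
// domain (table) -> item (row) -> attribute (cell) -> one or more values
var books = {
  item001: {
    title: ['Example Title'],
    author: ['A. Writer', 'B. Writer'],  // an attribute may hold several values
    year: ['2011']                       // SimpleDB values are always strings
  },
  item002: {
    title: ['Another Title']             // schema-free: items need not share attributes
  }
};

// Hypothetical helper: names of the items whose attribute contains the value.
var itemsWhere = function(domain, attribute, value) {
  return Object.keys(domain).filter(function(itemName) {
    var values = domain[itemName][attribute] || [];
    return values.indexOf(value) !== -1;
  });
};
```

Calling `itemsWhere(books, 'author', 'B. Writer')` picks out the one item that carries that value, which is roughly what a simple equality condition in the SQL-like query API expresses.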
5. Why emulate SimpleDB?
  - To provide a free, locally-available database that behaved identically to SimpleDB
  - Lots of off-the-shelf available clients
    - Standalone
      - Bolso
      - Mindscape’s SimpleDB Management Tools
    - Language-specific clients
      - boto (Python)
      - Official AWS clients for Java, .Net
      - Node.js
      - etc…
6. Why emulate SimpleDB?
  - To perform local tests prior to committing to production on SimpleDB
  - To provide a live, local backup database
  - A SimpleDB database for private clouds
  - To provide an immediately-consistent SimpleDB database
    - SimpleDB is “eventually consistent”
7. Why the GT.M database?
  - I’m familiar with it
  - Free Open Source NoSQL database
  - Schema-free
  - “Globals”: sparse persistent multi-dimensional arrays
    - Hierarchical database
    - Completely dynamic storage
    - No pre-declaration or specification needed
  - Result: trivial to model SimpleDB in globals
  - node-mdb: good way to demonstrate the capabilities of the otherwise little-known GT.M
  - More info: Google “GT.M database”, “universalnosql”
8. Why write it using Node.js?
  - M/DB originally written in late 2008
    - Implemented using GT.M’s native scripting language (M)
    - Apache + m_apache gateway to GT.M for HTTP interface
  - I’ve been working with Node.js for about a year now
  - Rewriting M/DB in JavaScript would make it more widely interesting and comprehensible
  - Some performance issues reported with M/DB when being pushed hard
9. Why Node.js?
  - Conclusion: re-implementing M/DB using Node.js should provide better performance and scalability
  - Fewer moving parts:
    - Apache + m_apache + GT.M / multi-threaded
    - Node.js + GT.M as child processes / single-thread
  - Cool Node.js project to attempt
  - Great example of non-trivial use of Node.js + database
10. How does SimpleDB work?
    Diagram: Incoming SDB HTTP Request -> HTTP Server -> Authenticate Request (HMacSHA, using Security Key Id and Secret Key) -> Execute API Action (against SimpleDB Database Copy 1, Copy 2 ... Copy n) -> Generate HTTP Response -> Outgoing SDB HTTP Response (Error, or Success and/or data/results)
11. Node.js can emulate all this
    Diagram: Incoming SDB HTTP Request -> HTTP Server -> Authenticate Request (HMacSHA, using Security Key Id and Secret Key) -> Execute API Action (against SimpleDB Database Copy 1, Copy 2 ... Copy n) -> Generate HTTP Response -> Outgoing SDB HTTP Response (Error, or Success and/or data/results)
12. GT.M can emulate this
    Diagram: Incoming SDB HTTP Request -> HTTP Server -> Authenticate Request (Security Key Id and Secret Key) -> Execute API Action (against a single SimpleDB Database Copy 1) -> Generate HTTP Response -> Outgoing SDB HTTP Response (Error, or Success and/or data/results)
13. Node.js characteristics
  - Single threaded process
  - Event loop
  - Non-blocking I/O
    - Asynchronous calls to functions that handle I/O
    - Event-driven call-back functions when function completes
      - Data fetched
      - Data saved
14. Result: deeply nested call-backs
    Diagram: HTTP Server -> Authenticate Request (Security Key Id and Secret Key) -> Execute API Action -> Generate HTTP Response (Error, or Success and/or data/results)
15. Flattening the call-back nesting
    Diagram: http server -> processSDBRequest() -> executeAPI() -> sendResponse()
    http.createServer(function(req,res) {..}
    var processSDBRequest = function() {…};
    var executeAPI = function() {…};
16. Node.js HTTP Server
    var http = require('http');
    var url = require('url');
    http.createServer(function(request, response) {
      // Accumulate the request body as it arrives
      request.content = '';
      request.on("data", function(chunk) {
        request.content += chunk;
      });
      request.on("end", function() {
        // SDB carries the per-request state through the call chain
        var SDB = {
          startTime: new Date().getTime(),
          request: request,
          response: response
        };
        var urlObj = url.parse(request.url, true);
        // Name/value pairs come from the body (POST) or the query string (GET)
        if (request.method === 'POST') {
          SDB.nvps = parseContent(request.content);
        } else {
          SDB.nvps = urlObj.query;
        }
        var uri = urlObj.pathname;
        if ((uri.indexOf(sdbURLPattern) !== -1) || (uri.indexOf(mdbURLPattern) !== -1)) {
          processSDBRequest(SDB);
        } else {
          var uriString = 'http://' + request.headers.host + request.url;
          var error = {code: 'InvalidURI', message: 'The URI ' + uriString + ' is not valid', status: 400};
          returnError(SDB, error);
        }
      });
    }).listen(httpPort);
17. processSDBRequest()
    var processSDBRequest = function(SDB) {
      var accessKeyId = SDB.nvps.AWSAccessKeyId;
      // Same error for a missing key id and for an unknown key id
      var authError = {code: 'AuthMissingFailure', message: 'AWS was not able to authenticate the request: access credentials are missing', status: 403};
      if (!accessKeyId) {
        returnError(SDB, authError);
      } else {
        // Look up the secret key stored against this access key id
        MDB.getGlobal('MDBUAF', ['keys', accessKeyId], function (error, results) {
          if (!error) {
            if (results.value !== '') {
              accessKey[accessKeyId] = results.value;
              validateSDBRequest(SDB, results.value);
            } else {
              returnError(SDB, authError);
            }
          }
        });
      }
    };
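18. validateSDBRequest()
    var validateSDBRequest = function(SDB, secretKey) {
      // crypto.createHmac() expects a hash name such as 'sha1' or 'sha256',
      // so translate the SimpleDB SignatureMethod value
      var type = (SDB.nvps.SignatureMethod === 'HmacSHA1') ? 'sha1' : 'sha256';
      var stringToSign = createStringToSign(SDB, true);
      var hash = digest(stringToSign, secretKey, type);
      if (hash === SDB.nvps.Signature) {
        processSDBAction(SDB);
      } else {
        errorResponse('SignatureDoesNotMatch', SDB);
      }
    };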
19. stringToSign()
    POST{lf}
    192.168.1.134:8081{lf}
    /{lf}
    AWSAccessKeyId=rob&Action=ListDomains&MaxNumberOfDomains=100&SignatureMethod=HmacSHA1&SignatureVersion=2&Timestamp=2011-06-06T22%3A39%3A30%2B00%3A00&Version=2009-04-15
    i.e. reconstruct the same string that the SDB client used to sign the request, then use rob’s secret key to sign it:
20. digest()
    var crypto = require("crypto");
    // type is a Node.js hash name, e.g. 'sha1' or 'sha256'
    var digest = function(string, secretKey, type) {
      var hmac = crypto.createHmac(type, secretKey);
      hmac.update(string);
      return hmac.digest('base64');
    };
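The two signing steps above can be sketched end to end. This is a minimal sketch of AWS Signature Version 2 as publicly documented, not the deck's own `createStringToSign()` (whose body is not shown). The names `rfc3986`, `buildStringToSign` and `sign` are hypothetical.

```javascript
var crypto = require("crypto");

// RFC 3986 percent-encoding: encodeURIComponent leaves ! ' ( ) * unescaped,
// so escape those as well.
var rfc3986 = function(value) {
  return encodeURIComponent(value).replace(/[!'()*]/g, function(c) {
    return '%' + c.charCodeAt(0).toString(16).toUpperCase();
  });
};

// Hypothetical helper: rebuild the string the client signed.
var buildStringToSign = function(method, host, path, nvps) {
  var names = Object.keys(nvps).filter(function(name) {
    return name !== 'Signature';        // the signature itself is never signed
  }).sort();                            // parameter names in byte order
  var query = names.map(function(name) {
    return rfc3986(name) + '=' + rfc3986(nvps[name]);
  }).join('&');
  return [method, host.toLowerCase(), path, query].join('\n');
};

// Hypothetical helper: HMAC the string with the secret key, base64-encoded.
var sign = function(stringToSign, secretKey, signatureMethod) {
  // Map the SimpleDB SignatureMethod name onto a Node.js hash name
  var algorithm = (signatureMethod === 'HmacSHA256') ? 'sha256' : 'sha1';
  return crypto.createHmac(algorithm, secretKey).update(stringToSign).digest('base64');
};
```

Fed the ListDomains parameters from slide 19, `buildStringToSign('POST', '192.168.1.134:8081', '/', nvps)` reproduces the four-line string shown there. The server then compares `sign(...)` with the `Signature` parameter the client sent.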
21. Ready to execute an API!
    Diagram: Incoming SDB HTTP Request -> HTTP Server -> Authenticate Request (Security Key Id and Secret Key) -> Execute API Action (against SimpleDB Database Copy 1, Copy 2 ... Copy n) -> Generate HTTP Response -> Outgoing SDB HTTP Response (Error, or Success and/or data/results)
23. Accessing the GT.M Database
  - Accessed via node-mwire
    - TCP-based wire protocol
    - Extension of Redis protocol
    - Adapted redis-node module
  - APIs allow you to set/get/delete/edit Globals
24. GT.M Globals
  - Globals = unit of persistent storage
    - Schema-free
    - Hierarchically structured
    - Sparse
    - Dynamic
  - “persistent associative array”
25. GT.M Globals
  - A Global has:
    - A name
    - 0, 1 or more subscripts
    - String value
  - globalName[subscript1,subscript2,..subscriptn]=value
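The shape of a global can be mimicked in memory. The sketch below is a toy stand-in written for illustration, not the GT.M or node-mwire API. `createGlobal`, `set` and `get` are hypothetical names, and nothing here is persistent. It returns an empty string for a node with no value, matching the `results.value !== ''` check used earlier in the deck.

```javascript
// Toy in-memory stand-in for one global: name[sub1,sub2,...] = value
var createGlobal = function() {
  var root = { children: Object.create(null) };

  // Walk down the subscripts, optionally creating missing nodes on the way.
  var locate = function(subscripts, create) {
    var node = root;
    for (var i = 0; i < subscripts.length; i++) {
      var key = String(subscripts[i]);
      if (!(key in node.children)) {
        if (!create) return null;       // sparse: absent nodes take no space
        node.children[key] = { children: Object.create(null) };
      }
      node = node.children[key];
    }
    return node;
  };

  return {
    set: function(subscripts, value) {
      locate(subscripts, true).value = String(value);  // values are strings
    },
    get: function(subscripts) {
      var node = locate(subscripts, false);
      return (node && node.value !== undefined) ? node.value : '';
    }
  };
};

// No schema or pre-declaration: nodes appear when they are first set.
var MDB = createGlobal();
MDB.set(['rob', 'domains', 1, 'name'], 'books');
MDB.set(['rob', 'domains', 1, 'created'], 1304956337618);
```

Intermediate nodes such as `['rob', 'domains']` exist only as parents here and hold no value until one is set, which is what "sparse" and "hierarchical" mean in practice.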
30. Key Node.js async patterns for db I/O
  - Dependent pattern:
    - Can’t set the global nodes until the value of the increment() is returned
  - Parallel pattern:
    - Global nodes can be created in parallel
    - No interdependence
    - BUT: need to know when they’re all completed
31. Dependent pattern
    Diagram: global MDB with subscripts ‘rob’ -> ‘domains’ (diagram labels 1, 2; IncrBy) -> ‘name’ = ‘books’, ‘created’ = 1304956337618, ‘modified’ = 1304956337618
    MDB.increment([accessKeyId, 'domains'], 1, function (error, results) {
      var id = results.value;
      //….now create the other global nodes inside callback
    });
33. Parallel Pattern (semaphore)
    var count = 0;
    MDB.setGlobal([accessKeyId, 'domains', id, 'name'], domainName, function (error, results) {
      count++;
      if (count === 4) sendCreateDomainResponse(count, SDB);
    });
    MDB.setGlobal([accessKeyId, 'domains', id, 'created'], now, function (error, results) {
      count++;
      if (count === 4) sendCreateDomainResponse(count, SDB);
    });
    MDB.setGlobal([accessKeyId, 'domains', id, 'modified'], now, function (error, results) {
      count++;
      if (count === 4) sendCreateDomainResponse(count, SDB);
    });
    MDB.setGlobal([accessKeyId, 'domainIndex', nameIndex, id], '', function (error, results) {
      count++;
      if (count === 4) sendCreateDomainResponse(count, SDB);
    });
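The repeated count-and-check in the four call-backs above can be factored into one small helper. This is a sketch of the same semaphore idea, not code from node-mdb. `whenAllDone` and `fakeSetGlobal` are hypothetical names, and the stand-in setter exists only so that the example runs without a database.

```javascript
// Hypothetical helper: returns a call-back that invokes onAllDone once it
// has itself been called `expected` times, passing on the first error seen.
var whenAllDone = function(expected, onAllDone) {
  var count = 0;
  var firstError = null;
  return function(error) {
    if (error && !firstError) firstError = error;
    count++;
    if (count === expected) onAllDone(firstError);
  };
};

// Stand-in for MDB.setGlobal (assumption: same argument order as the slide).
var fakeSetGlobal = function(subscripts, value, callback) {
  setImmediate(function() { callback(null, { ok: true }); });
};

var accessKeyId = 'rob';
var id = 1;
var now = new Date().getTime();

// All four writes share one completion call-back.
var done = whenAllDone(4, function(error) {
  console.log(error ? 'CreateDomain failed' : 'all four nodes saved');
});
fakeSetGlobal([accessKeyId, 'domains', id, 'name'], 'books', done);
fakeSetGlobal([accessKeyId, 'domains', id, 'created'], now, done);
fakeSetGlobal([accessKeyId, 'domains', id, 'modified'], now, done);
fakeSetGlobal([accessKeyId, 'domainIndex', 'books', id], '', done);
```

Keeping the counter inside a closure means the expected total is stated once, and an error from any of the writes reaches the final call-back instead of being dropped.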
38. Demo using Bolso
  - List Domains
  - Create Domain
  - Add an item (row) and some attributes (columns + cells)
39. Node.js Gotchas
  - Async programming is not immediately intuitive!
  - Loops
    - Calling functions that use call-backs inside a for..in loop will go horribly wrong!
  - Understanding closures
    - How externally-defined variables can be used inside call-back functions
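One common way the loop gotcha shows up can be reproduced in a few lines. This is an illustrative sketch, not code from node-mdb. `fakeSave` is a hypothetical stand-in for any database call that completes later, and the attribute names are invented.

```javascript
var attributes = { colour: 'red', size: 'large', shape: 'round' };

// Stand-in for an async database call: the call-back runs after the
// current code has finished.
var fakeSave = function(attr, callback) {
  setImmediate(callback);
};

// Broken: all three call-backs share the single loop variable, and the
// loop has already finished by the time any of them runs.
var brokenLog = [];
for (var attr in attributes) {
  fakeSave(attr, function() {
    brokenLog.push(attr);      // sees the final value of attr every time
  });
}

// Fixed: each iteration gets its own function scope, so each call-back
// closes over its own attrName.
var fixedLog = [];
Object.keys(attributes).forEach(function(attrName) {
  fakeSave(attrName, function() {
    fixedLog.push(attrName);
  });
});
```

Once the call-backs have run, `brokenLog` holds the last attribute name three times, while `fixedLog` holds each name once.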
40. Example
  - BatchPutAttributes
    - Intuitively a for..in loop around PutAttributes
    - Had to be serialised
      - Completion of one PutAttributes calls the next
    - Copy state of SDB object and use for..in?
      - var SDBx = SDB;
      - SDBx is a pointer to SDB, not a clone of it!
41. Conclusions
  - node-mdb is now nearly complete
    - Only BatchDeleteAttributes not implemented
    - Other APIs emulate SimpleDB 100%
  - Free Open Source
    - https://github.com/robtweed/node-mdb
  - Give it a try!
  - Use mdb.js for examples to build your own Node.js database applications
  - Check out GT.M!
  - Follow me on Twitter at @rtweed
  - Slides: http://www.mgateway.com/node-mdb-pres.html
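Both points on this slide can be sketched briefly. The code below illustrates the general technique and is not node-mdb's actual BatchPutAttributes implementation. `putSerially` and `cloneRequest` are hypothetical helpers, and `putOne` stands for whatever function performs a single PutAttributes.

```javascript
// Hypothetical helper: call putOne(item, callback) for each item, one at a
// time, so that the completion of one starts the next.
var putSerially = function(items, putOne, done) {
  var index = 0;
  var next = function(error) {
    if (error) return done(error);                  // stop at the first failure
    if (index === items.length) return done(null);  // everything processed
    var item = items[index++];
    putOne(item, next);
  };
  next(null);
};

// Hypothetical helper: give each item its own copy of the request state.
// "var SDBx = SDB;" would only create a second reference to the same object.
var cloneRequest = function(SDB) {
  var copy = Object.assign({}, SDB);        // new top-level object (shallow copy)
  copy.nvps = Object.assign({}, SDB.nvps);  // separate name/value pairs too
  return copy;
};

// Usage with a stand-in putOne that completes on a later tick.
var saved = [];
putSerially(['item1', 'item2', 'item3'], function(itemName, callback) {
  setImmediate(function() {
    saved.push(itemName);
    callback(null);
  });
}, function(error) {
  console.log(error ? 'batch failed' : 'batch complete: ' + saved.join(', '));
});
```

Note that `Object.assign` copies only one level, which is why `nvps` is copied explicitly; any deeper nested state would need the same treatment.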