NDB/Bindings 0.7.1 Released

Title pretty much says it all. Today I uploaded the latest tarball of the NDB/Bindings project to Launchpad.

NDB/Bindings is a project which provides Java, Python, Perl, Ruby, C# and Lua bindings to the NDB API and the MGM API for talking to MySQL Cluster. 

As the version number indicates, it's not quite ready for full production use yet, although for Java and Python it is mostly functionally complete (unless something has been missed that I'm unaware of - also, for those of you hardcore NDB API hackers out there, no, we are not providing direct NdbRecord support at this time - but we probably won't for the 1.0 release anyway, so there). There are a few projects that have already been working with the code, one of which found a performance bug that I've been considering in the back of my head.

Tags: mysql cluster

All in the assumptions

So I'm not going to claim to be Kevin Closson - because I'm not. I'm also not going to wade into a shared-nothing vs. shared-storage architecture debate. And here's why: there is no right answer.
As with anything else, it comes down to what you want to do. Look at what Kevin says in his very long-windedly (yet nicely) titled: Nearly Free or Not, GridSQL for EnterpriseDB is Simply Better Than Real Application Clusters. It is Shared-Nothing Architecture After All! « Kevin Closson’s Oracle Blog: Platform, Storage & Clustering Topics Related to Oracle Databases
Folks, today’s applications are built on large numbers of tables and complex joins. The reason shared-nothing is nothing like RAC is because instead of only shipping functions (or tasks) and lock messages to the clustered nodes, as is the case with RAC, shared-nothing requires the shipping of data
I'm going to do my best not to make this sound like a shot - because it's not supposed to be. But today's applications ON ORACLE are built on large numbers of tables and complex joins, because that's what you use if you have a setup that needs that sort of thing. Yeah - a shared-nothing, single-contiguous-image database in the 1000GB scale is going to have a whole host of issues. Likewise, I'd love to see a RAC setup try to beat MySQL Cluster in the HLR world. Good luck. There are still plenty of setups where the system is not built on large numbers of tables and complex joins. And in many of the cases where they are overly massive and overly complex, the schema can be fixed.
Consider massive web property scale-out where you can partition the data at the application layer. Say, an enormous social network, for instance. In a typical social network, you may have millions of users, but most of the interaction is clustered by user, so you can spread the users' data across multiple machines. It really doesn't matter all that often to Alice what's in Bob's user data, except at specified points of interaction (read: friends lists). In the places where the data does cross user boundaries, it's in such a small subdomain of data that it can be easily split out into a smaller and simpler vertical app that does one or two smaller tasks.
In this case, you're doing shared nothing, but you don't have to do a join across all of the data, because of the overall structure and relation of the data. The cost-per-unit-of-storage in trying to scale to numbers that big would be ridiculous on RAC or any shared-storage solution. And part of the reason there is that you just don't need the contiguous system. You can divide and subdivide. So the data shipping problem isn't nearly as much of a problem.
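To make the application-layer partitioning idea concrete, here is a minimal sketch of routing by user. It is not taken from any particular site: the shard host names and both function names are made up for illustration, and a stable hash of the user ID is just one of several reasonable ways to pick a shard.

```python
import hashlib

# Hypothetical shard list; in practice these would be separate database hosts.
SHARDS = ["db0.example.com", "db1.example.com", "db2.example.com", "db3.example.com"]


def shard_for_user(user_id, shards=SHARDS):
    """Map a user ID to one shard using a stable hash of the ID."""
    digest = hashlib.md5(str(user_id).encode("utf-8")).hexdigest()
    return shards[int(digest, 16) % len(shards)]


def group_by_shard(user_ids, shards=SHARDS):
    """Group user IDs by the shard that holds them (e.g. for a friends list)."""
    groups = {}
    for uid in user_ids:
        groups.setdefault(shard_for_user(uid, shards), []).append(uid)
    return groups
```

Everything about Alice lives on `shard_for_user("alice")`, so her normal page views touch one machine. Only the cross-user cases, such as a friends list, need `group_by_shard` to fan out, and even then each shard gets a simple lookup rather than a join across all of the data.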
The approach I’ve taken to the shared-nothing versus shared-disk architecture topic is one of theory versus reality. I don’t care how many people say shared-nothing is the best for one workload or the other. The point is that by measured results it is not clearly the best for one (DSS) and is certainly not fit at all for the other (OLTP).
Interesting, because I was about to make the same point in the opposite direction. First of all - no argument about shared nothing in the DSS space. It just doesn't make a whole lot of sense, for exactly the reasons listed here. But in the OLTP space, I'd beg to differ, and I'd like to use the same theory-vs-reality moniker. The "reality" listed above is based on benchmark testing. Now I'm as big a fan as anyone of benchmarks. They're great in some contexts. But an OLTP benchmark is theory when compared to systems out in the field doing real work. Many OLTP systems can quite easily be sharded to handle enormous data volumes and transaction rates. Having thousands of parallel systems each working on their own small slice of the puzzle seems to scale in a much more linear fashion in practice than does purchasing massive RAC-like systems.
Then there is the ability to run shared-nothing on tons of smaller pieces of commodity hardware with a built-in assumption that pieces of hardware are going to fail. If you want the "reality" backing that one up - see Google.
Again, I'm not trying to claim a this-vs-that victory. I'm just trying to deflect an attempt at one.
It's not that simple. It's not cut-and-dried. There are situations where shared nothing makes a lot of sense, and there are situations where shared storage makes sense. Nothing is built with just a hammer - you usually also need a saw and a screwdriver ... and maybe even a concrete truck.
Tags: cluster

Updates to NDB/Connectors

The NDB/Connectors have added support for Ruby, as well as Asynchronous Transaction support for Java, Python and Perl.
The Ruby support, of course, means that you can now interact with your MySQL Cluster installation using the NDBAPI from all your Ruby code.
The async stuff is especially cool, because it means you can send transactions to the Cluster and get responses by way of callbacks defined in the connector language. So you can do something like this:
[PYTHON]
class testaclass(object):
    def __init__(self, recAttr):
        self.recAttr = recAttr

    def __call__(self, ret, myTrans):
        print "value = ", self.recAttr.get_value()

#snip
myTrans = myNdb.startTransaction()
myOper = myTrans.getNdbOperation("mytablename")
myOper.readTuple(ndbapi.NdbOperation.LM_Read)
myOper.equal("ATTR1", 245755)
myRecAttr = myOper.getValue("ATTR2")
a = testaclass(myRecAttr)
myTrans.executeAsynchPrepare(ndb.Commit, a)
myNdb.sendPollNdb(3000, 1)
[/PYTHON]
May not seem as exciting with just a single operation - but you can toss tons of them down there and then poll for the results.
We've also added support for exceptions in the connector language. So instead of checking whether return values are null or -1, the wrapper code will throw a Java or Perl or Python exception.
There is also a new mailing list at lists.mysql.com for discussion of the development of the NDB/Connectors. Come join us and have some fun!
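For a rough idea of what turning error return values into exceptions looks like, here is a small pure-Python sketch. It is not the connectors' actual wrapper code: `NdbApiError`, `raise_on_error` and `fake_get_operation` are all hypothetical names invented for this illustration, and the stand-in function only imitates a call that reports failure by returning None.

```python
import functools


class NdbApiError(Exception):
    """Hypothetical exception type raised in place of an error return value."""


def raise_on_error(func):
    """Wrap a call that signals failure with None or -1 so it raises instead."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        result = func(*args, **kwargs)
        if result is None or result == -1:
            raise NdbApiError("%s failed (returned %r)" % (func.__name__, result))
        return result
    return wrapper


# Stand-in for a low-level call; this is NOT a real NDB API function.
@raise_on_error
def fake_get_operation(table_name):
    known_tables = {"mytablename"}
    if table_name in known_tables:
        return "operation on " + table_name
    return None  # the "old style" way of reporting failure
```

With something like this in place, calling code can use an ordinary try/except around a whole batch of calls instead of testing every single return value.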
Tags: cluster

NDB/Connectors for MySQL Cluster on Launchpad

I've been given the go-ahead to release my NDB/Connectors code. These connectors wrap the NdbApi for a variety of languages, including Python, Perl, Java and C# at the moment. I'm managing development using Launchpad, so go to
https://launchpad.net/ndb-connectors
to get the latest version or status of the code. If you would like to contribute, feel free to branch a copy of the source using bzr and send me a revision bundle. There is also an ndb-connectors team on Launchpad you can join if you'd like to participate more directly in the development. For either of these options to work, you need to first sign the MySQL Code Contributor License Agreement to assign copyright of your contributions to MySQL, Inc.
I hope to have a mailing list set up soon for discussion.
Tags: cluster

NDB/Connectors 0.1

So I've expanded the scope of the NDB/Python wrappers I was working on. Now I've got Python, Perl and C# wrappers working, at least for basic functionality. I've set up a Trac instance and put up a roadmap and all of that type of stuff. If you are interested in hacking, let me know and we can talk about Subversion access and all that.
For the moment, I've turned off code downloading. I'll post again when I've re-enabled it.
Tags: cluster