Sunday, May 1, 2016

Two gig just isn't what it used to be

The Apache Derby database system is now in its twenty-first year of existence, having been started in 1995 by a stealth-mode startup named Cloudscape, which rose out of the ashes of a skunk-works project at Sybase to write an object-oriented database.

It turned out that the world didn't really want that object-oriented database, but the Cloudscape team wrote a fine modern database system in 100% pure Java. Although Cloudscape the company failed (it was sold to Informix, which was in turn sold to IBM, which then donated Derby to the Apache Software Foundation more than a decade ago), Cloudscape the database (now Apache Derby) lives on.

One of the features that Derby has always had is support for the BLOB and CLOB datatypes.

Well, I happened to be working on a Derby bug this last month, involving the BLOB/CLOB support.

The bug shows up when you first export all of the BLOB or CLOB values in a single table into an external file, by calling SYSCS_UTIL.SYSCS_EXPORT_TABLE_LOBS_TO_EXTFILE, and then, subsequently, import those values back into the database by calling SYSCS_UTIL.SYSCS_IMPORT_TABLE_LOBS_FROM_EXTFILE.
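For readers who haven't used these procedures, here is a rough sketch of what that round trip might look like from JDBC. The class name, method names, and the choice to pass null for the delimiters and codeset (to take the defaults) are my own assumptions for illustration; the Derby reference manual has the authoritative parameter descriptions.

```java
import java.sql.CallableStatement;
import java.sql.Connection;
import java.sql.SQLException;

// Hypothetical helper class; only the two SYSCS_UTIL procedure names come from Derby.
public class LobRoundTrip {
    // Parameters: schema, table, data file, column delimiter, character delimiter,
    // codeset, LOB file. The three nulls ask Derby to use its defaults.
    static final String EXPORT_CALL =
        "CALL SYSCS_UTIL.SYSCS_EXPORT_TABLE_LOBS_TO_EXTFILE(?, ?, ?, null, null, null, ?)";

    // Parameters: schema, table, data file, column delimiter, character delimiter,
    // codeset, replace flag (0 = add to existing rows, non-zero = replace them).
    static final String IMPORT_CALL =
        "CALL SYSCS_UTIL.SYSCS_IMPORT_TABLE_LOBS_FROM_EXTFILE(?, ?, ?, null, null, null, ?)";

    // Writes the table's rows to dataFile and its BLOB/CLOB values to lobsFile.
    static void exportLobs(Connection conn, String schema, String table,
                           String dataFile, String lobsFile) throws SQLException {
        try (CallableStatement cs = conn.prepareCall(EXPORT_CALL)) {
            cs.setString(1, schema);
            cs.setString(2, table);
            cs.setString(3, dataFile);
            cs.setString(4, lobsFile);
            cs.execute();
        }
    }

    // Reads dataFile back in; the LOB values are fetched from the external
    // file that the data file's entries point at.
    static void importLobs(Connection conn, String schema, String table,
                           String dataFile) throws SQLException {
        try (CallableStatement cs = conn.prepareCall(IMPORT_CALL)) {
            cs.setString(1, schema);
            cs.setString(2, table);
            cs.setString(3, dataFile);
            cs.setShort(4, (short) 0);
            cs.execute();
        }
    }
}
```

The import only needs the data file, because each row in it records where in the external file its BLOB or CLOB value lives.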

For the bug to occur, at least one of the BLOB/CLOB values in the external file must begin at an offset greater than 2147483647 bytes from the start of the external file. That number is 2^31 - 1, the largest value a Java int can hold.

The bug, of course, was simply that the Derby code was using a Java int variable for the file offset, rather than a Java long variable, and so I think I've nearly finished fixing it.
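Here is a tiny illustration of the general failure mode. This is not the Derby code; the class and method names are made up for the example, and it only shows what happens when an offset held in an int is advanced past the two gigabyte mark.

```java
// Hypothetical demo class, not taken from Derby.
public class OffsetOverflow {
    // Buggy shape: int arithmetic silently wraps around past Integer.MAX_VALUE.
    static int nextOffsetInt(int offset, int length) {
        return offset + length;
    }

    // Fixed shape: the int length is widened to long before the addition.
    static long nextOffsetLong(long offset, int length) {
        return offset + length;
    }

    public static void main(String[] args) {
        int atLimit = Integer.MAX_VALUE; // 2147483647
        System.out.println(nextOffsetInt(atLimit, 1));  // prints -2147483648
        System.out.println(nextOffsetLong(atLimit, 1)); // prints 2147483648
    }
}
```

Java doesn't throw on int overflow, so nothing complains until somebody tries to seek to a negative file position.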

As part of the fix, I looked at a companion variable in the code, the one that holds the length of the BLOB or CLOB, and that sent me off to look up the maximum length of a BLOB or CLOB value in Derby. And, indeed, the maximum length of such a value is 2147483647, so I didn't need to change the length variable, just the offset variable.
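To make the mixed widths concrete, here is a sketch of a little holder for "where a value starts and how long it is". The LobLocation class and the "offset.length" string format it parses are assumptions I invented for this example, not Derby's actual classes or file format.

```java
// Hypothetical value class: a long offset paired with an int length.
public final class LobLocation {
    final long offset; // position in the external file; can exceed 2147483647
    final int length;  // size of one value; capped at 2147483647, so int is enough

    LobLocation(long offset, int length) {
        if (offset < 0 || length < 0) {
            throw new IllegalArgumentException("offset and length must be non-negative");
        }
        this.offset = offset;
        this.length = length;
    }

    // Parses a reference of the assumed form "offset.length".
    static LobLocation parse(String ref) {
        int dot = ref.indexOf('.');
        if (dot < 0) {
            throw new IllegalArgumentException("expected offset.length, got: " + ref);
        }
        long offset = Long.parseLong(ref.substring(0, dot));    // not Integer.parseInt
        int length = Integer.parseInt(ref.substring(dot + 1));
        return new LobLocation(offset, length);
    }

    // First byte after this value; computed in long so it cannot wrap.
    long endOffset() {
        return offset + length;
    }
}
```

With Integer.parseInt on the offset, a reference like "3000000000.10" would fail with a NumberFormatException; with Long.parseLong it parses fine, and the length can stay an int.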

Cloudscape was originally intended, in those heady days of 1995, to be run on mobile devices, which back then were things like the Apple Newton, or the Research In Motion BlackBerry.

Those mobile devices typically had an ENTIRE STORAGE CAPACITY of, at the high end, 64 megabytes or 128 megabytes, so I'm sure that having a two gigabyte limit on the length of a BLOB or CLOB value seemed vastly over-engineered at the time.

But the original designers probably didn't anticipate that Derby would live for 21 years.

Today's generation of mobile devices routinely has 256 GB, 512 GB, and sometimes even more, of storage.

I'm sure that, at some point, enough people will be impacted by the two gigabyte limit on BLOB and CLOB values in Derby that somebody, maybe even me, will change it.

But that time hasn't come yet. For now, I'll just fix the file offset, and move on.
