Imaginative Blog Name

Thursday, July 19, 2007

Quicktime / iTunes Update breaks Eclipse?

I upgraded my Quicktime and iTunes today and found that all of a sudden I couldn't compile anything in Eclipse! Apparently, the update got rid of the /System/Library/Java/Extensions/QTJSupport.jar file, and that shows up as part of the default Java 1.5 JRE definition in Eclipse. When it went missing, Eclipse freaked out.


The fix, found at note19, turned out to be simple.



Open the Eclipse Preferences... menu and select Java > Installed JREs...; make sure that Eclipse can locate the OS X Java 1.5. If it cannot (as in my case), add it manually. It is in the following folder:

/System/Library/Frameworks/JavaVM.framework/Versions/1.5.0/Home

Works like a charm now.


Wednesday, July 11, 2007

ProScope HR: Coolest Geek Toy Ever?

In preparation for my impending fatherhood, I've been keeping an eye out for fun things to do with my daughter once she's got a few years on her. Like any good geek, I'm looking for fun, geeky things to do that will pique her interest in the world around her and stimulate her little brain. I think I found something that fits the bill: the Pro Scope HR.


We were watching Alton Brown's Pretzel Logic episode and he was using this microscope hooked up to his PowerBook to look at the differences between various kinds of salt. Apparently it's also featured heavily on the various CSI shows.


I remember the microscope I had growing up. It was your standard light microscope; I think it went up to 200x magnification. It came with a number of prepared slides of various microorganisms, which were cool, but making new slides was kind of tedious. Plus, dealing with a microscope while wearing glasses has always been a pain. With a ProScope, you can take a look at anything you damn well please, not to mention taking high-resolution still photos and live-action and time-lapse videos. I can just see us following our daughter around the house, laptop in tow, looking at everything from pennies to raspberries to bugs in the backyard to our cats' paws.


I think that'd be pretty cool, anyway.


Properly setting up a crontab entry using 'date' to generate timestamps

I have a custom backup script that I want to run every night. I'd also like to have all the output (standard and error) redirected to a timestamped log file for subsequent review. My first stab at defining a cron job to handle this was something like the following:

00 0 * * * ~/bin/backup.sh > backup_$(date +%Y-%m-%d).log 2>&1

The only problem is it doesn't work!

Cron sends you messages pertaining to failed jobs in the system mail queue, which on Mac OS X you can access using the 'mailx' program, which comes with the system. Doing so, I saw this message:

/bin/sh: -c: line 1: unexpected EOF while looking for matching `)'

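A little browsing of the cron man pages turned up that an unescaped "%" in the command field is not passed through literally: cron turns it into a newline, and everything after the first "%" is fed to the command as standard input. The shell therefore only saw the part of the command before the first "%", with no closing parenthesis. Looks like my timestamp generation was causing the job to fail. Grrr.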

The solution? Escape each "%" with a backslash. The functional cron job definition is

00 0 * * * ~/bin/backup.sh > backup_$(date +\%Y-\%m-\%d).log 2>&1
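
To see why the escaping matters, here is a rough Java model of what cron does to the command field before handing it to the shell. It is only a sketch based on the behaviour described in crontab(5), not cron's actual code, and the class and method names are made up for illustration.

```java
// Rough model of how cron treats "%" in the command field, per crontab(5):
// an unescaped "%" becomes a newline, the first line is the command handed to
// the shell, and the rest is sent to it as standard input. "\%" is a literal "%".
// CronPercentSketch and split() are hypothetical names for this illustration.
final class CronPercentSketch {

    /** Returns {command, stdin} for the command field of a crontab entry. */
    static String[] split(String field) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < field.length(); i++) {
            char c = field.charAt(i);
            if (c == '\\' && i + 1 < field.length() && field.charAt(i + 1) == '%') {
                sb.append('%');   // escaped: keep a literal percent sign
                i++;              // skip the '%' we just consumed
            } else if (c == '%') {
                sb.append('\n');  // unescaped: cron turns it into a newline
            } else {
                sb.append(c);
            }
        }
        String expanded = sb.toString();
        int nl = expanded.indexOf('\n');
        if (nl < 0) {
            return new String[] { expanded, "" };
        }
        return new String[] { expanded.substring(0, nl), expanded.substring(nl + 1) };
    }

    public static void main(String[] args) {
        String broken = "~/bin/backup.sh > backup_$(date +%Y-%m-%d).log 2>&1";
        String fixed = "~/bin/backup.sh > backup_$(date +\\%Y-\\%m-\\%d).log 2>&1";
        // The broken entry is cut off right after "date +", leaving "$(" unclosed.
        System.out.println(split(broken)[0]);
        // The escaped entry reaches the shell in one piece.
        System.out.println(split(fixed)[0]);
    }
}
```

For the unescaped entry, the command that reaches /bin/sh ends at "$(date +", which matches the "unexpected EOF while looking for matching `)'" message above.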

Sanity restored.

Tuesday, July 10, 2007

Liskov's Substitution Principle, equals, and Hibernate Proxies

Hibernate's CGLIB dynamic proxy classes reared their ugly heads today. I've been using the Eclipse-generated hashCode() and equals() methods in my domain objects for quite a while, with no problems. Then today I wrote a test that does a simple equality check and everything blew up!


The problem seems to be due to the way my equals() methods were written, and how Hibernate's default dynamic proxy strategy interacts with that. I was doing this:



if (getClass() != obj.getClass())
    return false;

However, in my exploding test, one of the objects being compared was a regular old domain object, while the other was a proxied object. Since the proxy that Hibernate creates is technically a subclass of my class (with additional Hibernate-specific methods and such), my getClass()-based equality test failed; after all, com.foo.MyClass is most definitely not com.foo.MyClass$$EnhancerByCGLIB$$beb95050.
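
The failure is easy to reproduce without Hibernate at all. In the sketch below, Account and AccountProxyStandIn are hypothetical names, and the hand-written subclass merely stands in for the class CGLIB would generate at runtime.

```java
// Hypothetical domain class using the getClass()-based check described above.
class Account {
    private final long id;

    Account(long id) {
        this.id = id;
    }

    @Override
    public boolean equals(Object obj) {
        if (this == obj) return true;
        if (obj == null) return false;
        if (getClass() != obj.getClass()) return false; // rejects every subclass
        return id == ((Account) obj).id;
    }

    @Override
    public int hashCode() {
        return Long.hashCode(id);
    }
}

// Stand-in for a runtime-generated proxy subclass; not Hibernate's real class.
class AccountProxyStandIn extends Account {
    AccountProxyStandIn(long id) {
        super(id);
    }
}

class GetClassEqualsDemo {
    public static void main(String[] args) {
        Account plain = new Account(42L);
        Account proxied = new AccountProxyStandIn(42L);
        System.out.println(proxied instanceof Account); // true: it is an Account
        System.out.println(plain.equals(proxied));      // false: classes differ
        System.out.println(proxied.equals(plain));      // false: classes differ
    }
}
```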

It turns out that this is due to something called the Liskov Substitution Principle, which basically formalizes the intuition behind the inheritance portion of the object-oriented programming model; if S is a subtype of T, then you can use an instance of S wherever an instance of T is called for and nothing breaks. The corollary would be that if something does break, then some of your assumptions might need re-examining.

In this case, specifying the equals() method in terms of getClass() is too restrictive; Hibernate proxies should be able to be used anywhere one of my domain objects is used (that's the whole point!). A way to solve this problem is offered by Josh Bloch in his Effective Java book: use an instanceof-based test instead:



if (!(obj instanceof MyClass))
    return false;

Here, the proxied object, as an instance of a subclass of MyClass, is also an instanceof MyClass. Since the CGLIB proxy doesn't override equals(), the proxy inherits the same implementation of equals() as the base class, thus maintaining the symmetric property any valid equals() implementation must have. If the subclass did override equals(), then things would be different, but then you'd have a violation of the Liskov Substitution Principle. Also, as Bloch states in Effective Java, page 30:


"It turns out that this is a fundamental problem of equivalence relations in object-oriented languages. There is simply no way to extend an instantiable class and add an aspect while preserving the equals contract." (emphasis in original)

If you're paranoid, you can declare your implementation of equals() to be final, so you can be sure that it is never overridden. Since the CGLIB proxy doesn't try to override it, you're safe.
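
Put together, a final, instanceof-based equals() might look roughly like this. Customer, its email field, and CustomerProxyStandIn are invented for the example, and the subclass is again only a stand-in for a generated proxy. Comparing through the getter rather than the raw field is my own assumption about what plays nicely with lazy proxies, not something taken from Bloch.

```java
import java.util.Objects;

// Hypothetical domain class; the name and field are illustrative only.
class Customer {
    private final String email;

    Customer(String email) {
        this.email = email;
    }

    public String getEmail() {
        return email;
    }

    // instanceof-based test: subclasses (including generated proxies) pass.
    // final, so no subclass can break symmetry by overriding it.
    @Override
    public final boolean equals(Object obj) {
        if (this == obj) return true;
        if (!(obj instanceof Customer)) return false; // also handles null
        Customer other = (Customer) obj;
        return Objects.equals(getEmail(), other.getEmail());
    }

    @Override
    public final int hashCode() {
        return Objects.hashCode(getEmail());
    }
}

// Stand-in for a runtime-generated proxy subclass; not Hibernate's real class.
class CustomerProxyStandIn extends Customer {
    CustomerProxyStandIn(String email) {
        super(email);
    }
}

class InstanceofEqualsDemo {
    public static void main(String[] args) {
        Customer plain = new Customer("a@example.com");
        Customer proxied = new CustomerProxyStandIn("a@example.com");
        System.out.println(plain.equals(proxied)); // true
        System.out.println(proxied.equals(plain)); // true: symmetric
        System.out.println(plain.hashCode() == proxied.hashCode()); // true
    }
}
```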

The "downside" of this approach, if it can be called that, is that each "terminal" domain object needs its own implementation of equals() (note that it's (obj instanceof MyClass), not (obj instanceof getClass())); in other words, you can't define a general equals() in a superclass and let it do all the heavy lifting for inheriting classes. However, in the grand scheme of things, I don't really see that as much of a downside. Yeah, it's a bit of a pain to write equals() methods, but it has to be done anyway (it's your job as a designer), and it is arguably a more accurate approach to take. As a designer, you need to be aware of the implication of what you code. If the getClass() method works for you, fine; just be aware of what that implies. Ditto for the instanceof method. I'm convinced that in my particular case, the instanceof approach is the semantically correct one.


Update: I just checked my copy of Java Persistence with Hibernate and, sure enough, they use instanceof in their equals() implementations. Clearly, the interaction with the proxies is a driving reason to use this formulation. Apart from that, though, I still think that using instanceof is more semantically correct.



Thursday, June 28, 2007

Cost-based vacuum delay caveat in Postgres

I've been trying to vacuum a 25M row table in Postgres and it has been taking forever; we're talking over 22 hours (thought I'd let it run as I flew to Philadelphia for this conference). A bit of Googling turned up this thread:

VACUUM ANALYZE taking a long time, %I/O and %CPU very low

This guy was seeing the same behaviour as I was: VACUUM ANALYZE was taking forever, and CPU and I/O percentages were hovering around 0. He had the "vacuum_cost_delay" parameter set to 70, which means that Postgres will go to sleep for 70ms whenever the accumulated I/O cost reaches a certain limit ("vacuum_cost_limit"). Since a 25M row table isn't going to fit into memory, there's going to be a good deal of reading in blocks from the disk, and thus you're going to hit that cost limit regularly.

Somehow I had set my delay to 500ms. No wonder it was taking so long. I dropped it down to 0, effectively disabling the cost-based delay feature. Now, 10 minutes later, my table has been vacuumed and analyzed.
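
Some back-of-the-envelope arithmetic shows how quickly those naps add up. The sketch below assumes a cost of 10 per page read from disk and a vacuum_cost_limit of 200, which I believe are the defaults for this era of Postgres (check your own settings). The 500,000-page table size is purely hypothetical, and the estimate ignores index scans and the extra cost of dirtying pages.

```java
// Rough estimate of time spent sleeping under cost-based vacuum delay.
// All names here are hypothetical; the parameter values passed in main()
// are assumptions, not measurements from my table.
final class VacuumDelaySketch {

    /** Total sleep time in ms if every page read counts as a cache miss. */
    static long sleepMillis(long pages, int costPerPageMiss, int costLimit, int delayMs) {
        long totalCost = pages * costPerPageMiss;
        long naps = totalCost / costLimit; // one nap each time the cost limit is reached
        return naps * delayMs;
    }

    public static void main(String[] args) {
        long pages = 500_000L; // hypothetical table size in pages
        for (int delayMs : new int[] { 500, 70, 0 }) {
            long ms = sleepMillis(pages, 10, 200, delayMs);
            System.out.println("delay " + delayMs + "ms -> about "
                    + (ms / 60_000) + " minutes asleep");
        }
    }
}
```

Under those assumptions, a 500ms delay means roughly three and a half hours of doing nothing but sleeping, before any real work is counted.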

Now, you can use the autovacuum daemon to vacuum your tables, and the pg_autovacuum table (where you specify table-specific vacuum parameters) will let you set a value for vacuum_cost_delay. Thus, you can set the attribute "vac_cost_delay" to 0 to get quick autovacuums of your big tables, while still allowing you to set a system-wide vacuum_cost_delay for other smaller, less critical tables. It looks like if you manually kick off a vacuum, though, it still uses the system-wide defaults, instead of the values from pg_autovacuum (why?). Since you can set vacuum_cost_delay without reloading the server, if you need to do a manual vacuum, do a
SET vacuum_cost_delay = 0;
first (or something higher than 0 if you can't afford to peg your disk I/O), and then VACUUM (remembering to set vacuum_cost_delay back to what it was afterwards!). If you do this from the command line, you might want to write a small wrapper script that does this instead of running vacuumdb.

The lesson here? Always read the directions, kids.

Monday, June 25, 2007

Transformers: Members of the Coalition of the Willing?

I'm going to a conference tomorrow, and decided to check on the TSA's website to make sure I wasn't going to be breaking any of their wonderfully inane rules, like bringing 4 oz. of shampoo (horrors!) in my carry-on luggage.

I was quite surprised to find that they specifically allow "Toy Transformer Robots" (scroll down near the bottom). Even without that, Megatron would still be OK, because toy guns (so long as they don't look like real guns) are cool.

Furthermore, meat cleavers are prohibited by name in carry-on luggage (come on, you ban sabers and swords, ninja stars, and ice picks, and with all that, you still have to call out meat cleavers?!?!)

I'm glad our government is hard at work protecting us from Shampoo Bombers and insane butchers, but alas, they are falling behind in preventing the impending robot invasion!

Thursday, June 21, 2007

Unreal

If anyone ever doubts that the Internet can truly be a powerful democratizing force in the world, where the average person can say something and have it matter, check this out.

I started this blog last week. I've never blogged anywhere before, and a search for my name on Google isn't going to bring up any significant hits to me (except now for this post I'm about to talk about!). In other words, I'm not a "big voice" on the Internet.

A few days ago, I posted my third blog post ever to this free Blogger account. I wrote about how I liked David Weinberger's book Everything is Miscellaneous, and made an observation about how the themes he develops tie into what I work with, namely the human genome. Nothing big, maybe a little insightful (I thought it was neat, anyway). I wasn't really writing "for" anyone... this blog is just a place I can write some of my own thoughts down, and if that might be useful or interesting to someone somewhere, then all the better.

Today I'm sifting through my newsfeeds, and I see that David Weinberger has linked to my post on the main page of his book's website.

Think about that for just a second.

Thanks to the infrastructure that has been built up surrounding the Internet (Google indexing, Technorati blog indexing, folksonomic tagging, etc.), the words that I wrote were found and read by the author of the book I was talking about. This isn't a top-down organization, either: there aren't professional indexers, catalogers, and abstractors out there reading and organizing everything that gets published online. This is truly bottom-up organization, growing organically out of the miscellaneous pile of information we're accumulating online: the content, the usage patterns, the metadata, everything. Nobody needs to say, "Ah, Christopher Maier has published a post on 'Everything Is Miscellaneous.' We need to properly file his post in the 'Everything Is Miscellaneous' bin (or was it the 'genomics' bin, or...)." Furthermore, very few people, in the grand scheme of things, are going to particularly care that I've done such a thing. However, for the people that would care about it and are looking for something about Everything Is Miscellaneous, or genomes or whatever else I talk about, this infrastructure presents it to them, as if by magic.

It is difficult, if not downright impossible, to imagine this kind of thing happening prior to the advent of the Internet. And it's really exciting to see where this will ultimately lead.