Sunday, July 12, 2009

Why I don't like Autobox

So I was looking over Michael Schwern's perl5i module recently (after hearing about it at this year's YAPC::NA) and I noticed that it enables the autobox module. This reminded me of all the debates I have had with Matt Trout over the years about the various pros and cons of autobox. I figured this would probably make a decent blog post, so here goes.

My core objection to autobox is that it is an illusion. It works by hijacking the normal perl method resolution process: right before perl would die with the error Can't call method "foo" on unblessed reference, it checks specific packages to see if a matching method is available. This gives the illusion that these core perl types are in fact objects, when in reality they are very much not. If they were proper objects, they would always be objects instead of just objects within the lexical scope of the autobox pragma. Here is some code that illustrates what (for me) is the big abstraction leak of autobox.

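# autobox dispatches array refs to the ARRAY package by default,
# so give it a length method to find
sub ARRAY::length { scalar @{ $_[0] } }

my $test;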
{
    use autobox;
    my $foo = [ 1, 2, 3, 4, 5 ];
    warn $foo->length;
    $test = sub {
        warn $foo->length; # succeeds ...
        $foo;
    };
}

my $x = $test->();
warn $x->length; # fails

This example shows how the lexical scoping of the autobox pragma allows the $test closure to still work correctly, but once outside of the lexical scope the value is no longer autoboxed. This just seems really backwards to me because it requires the users of your code to also enable autobox in their code to use elements from your code. The result of them (for whatever reason) not doing this is that your internal usage of a data element can greatly differ from external usage of the same element. This is an API disconnect that does not sit well with me.

In short, autoboxing is a feature of the lexical environment and not something intrinsic to the element itself. 

My second issue with autobox is that it is very shallow. In languages where the core types are proper objects (Smalltalk, Ruby, JavaScript, etc.) it is possible to subclass/extend these core types using normal OO practices. Autobox provides the illusion of normal OO, but as soon as you look any deeper than the surface the illusion starts to crumble at an alarming rate.

While it is possible to do something close to subclassing/extending with autobox code by using the following technique, it has some severe drawbacks and serious inconsistencies. 

{
    package ARRAY;
    sub length { scalar @{ $_[0] } }
    
    package MyArray;
    use base 'ARRAY';
    # do something silly here for illustration
    sub length { (shift)->SUPER::length + 1 }
}

{
    use autobox;
    my $foo = [ 1, 2, 3, 4, 5 ];
    warn $foo->length; # 5

    my $bar = bless [ 1, 2, 3, 4, 5 ] => 'MyArray';
    warn $bar->length; # 6
}

The most obvious issue is that this only works for reference types (ARRAY, HASH and CODE) since Perl only allows blessing of references. So you cannot use this with SCALAR, INTEGER, FLOAT, NUMBER, STRING and UNDEF, which leaves out more than half of the functionality of autobox.
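Here is a minimal core-Perl sketch of that restriction, with no autobox involved. MyNumber is a hypothetical class name used only for illustration. bless dies when handed a plain value, and while a reference to a scalar can be blessed, what you get back is a wrapper object rather than the number itself.

```perl
use strict;
use warnings;

# Blessing a plain (non-reference) value is a runtime error.
my $ok    = eval { bless 42, 'MyNumber'; 1 };
my $error = $ok ? '' : $@;    # "Can't bless non-reference value at ..."

{
    package MyNumber;    # hypothetical class, for illustration only
    sub value { ${ $_[0] } }
}

# A reference to a scalar can be blessed, but the result is a
# wrapper around the number, not the number itself.
my $n     = 42;
my $boxed = bless \$n, 'MyNumber';
my $value = $boxed->value;
```

So any "subclass" of a non-reference type ends up being a separate wrapper class, which is a different thing from what autobox gives you for the plain value.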

Also, the manual blessing of the subclassed array ref seems a little odd since it differs from how the regular autoboxed array ref works. Of course you could create a MyArray::new method to hide this if you want. If you did this then perhaps for consistency's sake you would want to add ARRAY::new as well. But unless you blessed the array ref into the ARRAY package, a user of your code would need to have autoboxing enabled for ARRAY->new to return anything useful, because (as I said above) the autoboxing is not intrinsic functionality, but instead functionality of a given lexical environment.

Now, my last issue with autobox is that if used with the wrong kind of laziness it can expose the internals of an object, defeat encapsulation and make bad APIs. This was the original motivation behind my writing MooseX::AttributeHelpers after having written Moose::Autobox. Take this example for instance.

{
    package MyThings;
    use Moose;
    use Moose::Autobox;
    
    has 'things' => (
        is      => 'ro',
        isa     => 'ArrayRef',   
        default => sub { [] },
    );
    
    my $me = MyThings->new;
    
    $me->things->push( 1 );
}

It is very tempting to just let the autoboxing provide the API to add things to your object, but this exposes a lot of internal details to your object's consumer. If at some point you want to change how things are stored you will have a lot of work to do. Of course this is better than if users had been doing push(@{ $me->things }, 1) because you still have the encapsulation of the autoboxed APIs. But having to write an interface to match ARRAY for whatever you change things to use is just going to get nasty after a while.

Perceptive readers will also note that $me->things->push( 1 ) will not work unless autoboxing is enabled in that particular lexical environment. Again, this places a lot of responsibility on the users of your code just to use the API you're providing.

In contrast the MooseX::AttributeHelpers (soon to be core Moose) version is much more encapsulation friendly and is much more amenable to future changes to the storage type of things.

{
    package MyThings;
    use Moose;
    use MooseX::AttributeHelpers;
    
    has 'things' => (
        traits   => [ 'Collection::Array' ],
        is       => 'ro',
        isa      => 'ArrayRef',   
        default  => sub { [] },
        provides => {
            push => 'add_thing'
        }
    );
    
    my $me = MyThings->new;
    
    $me->add_thing( 1 );
}

If you change how things are stored, you simply need to re-write the add_thing method. Everything is properly encapsulated within your object as it should be.
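To make that concrete, here is a rough sketch in plain Perl (no Moose, so it runs anywhere) of what such a storage change might look like. The hash-based storage and the thing_count method are my own assumptions for illustration and are not something MooseX::AttributeHelpers generates. The point is only that callers keep using add_thing and never notice the swap.

```perl
use strict;
use warnings;

{
    package MyThings;    # plain-Perl stand-in for the Moose class above

    # Storage is now a hash rather than an array ref.
    sub new { bless { things => {} }, shift }

    # Same public method name as before; only the body changed.
    sub add_thing {
        my ( $self, $thing ) = @_;
        my $next = scalar keys %{ $self->{things} };
        $self->{things}{$next} = $thing;
        return $self;
    }

    # Hypothetical accessor, added only so the example can be checked.
    sub thing_count { scalar keys %{ $_[0]{things} } }
}

my $me = MyThings->new;
$me->add_thing(1);
$me->add_thing(2);
my $count = $me->thing_count;
```

Had consumers been calling $me->things->push( 1 ) instead, every one of those call sites would have needed to change along with the storage.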

So anyway, that's enough of my autobox ranting. I think that autobox is an extremely interesting piece of software and by no means do I think people should not use it if they are so inclined. But I think it should be used carefully and with full knowledge of its limitations and issues.


3 comments:

  1. A propos your first objection - maybe I am interpreting it wrongly - but if autobox is enabled globally then it will work outside of the closure.

  2. @zby, yes if it could be enabled globally that would help my first objection. However, it does not fix the second and third issues.

    To be honest, I used to look at the first issue as a "feature" and not a bug since it "allowed" the consumer of your code to choose if they wanted to use autobox or not. However the more I played with autobox and the more I thought about it, the API inconsistency felt wrong.

  3. Hi, Stevan.

    Thanks for your comments.

    I disagree with your first point, but I rather suspect that you disagree with it as well :-) If nothing else, describing autobox's scoped behaviour as a "leaky" abstraction or API is a creative interpretation of Joel's original definition. It may be sucky, un-perlish, or incomplete, but, it doesn't leak implementation details, as far as I know, and certainly doesn't leak syntactic sugar outside of its scope - which is the very thing your first point laments. Call it a constipated abstraction and I think we might agree :-)

    Your second point can be worked around in a couple of ways, none of which require the autoblessing you opted for, but I take the point that it could be a) better documented and b) DRYer. At the very least I need to add a way to clobber/prepend a binding rather than always augmenting:

    http://pastie.org/559428

    Your third point looks like a best-practices issue, which I almost certainly agree with, although I haven't yet had a chance to play with Moose (unfortunately), so I can't say much more than that.

    P.S.

    "In languages where the core types are proper objects (Smalltalk, Ruby, Javascript, etc.) it is possible to subclass/extend these core types using normal OO practices."

    You can't subclass Fixnum, Bignum, Integer or Float in Ruby, and can't subclass Array in JavaScript without iframe hacks and/or contaminating Array.prototype. In addition, of course, both of those languages have suffered from "Chainsaw Infanticide Stress Disorder" precisely because their extension methods aren't lexically-scoped.

    http://blade.nagaokaut.ac.jp/cgi-bin/scat.rb/ruby/ruby-talk/153380
    http://webreflection.blogspot.com/2008/03/sorry-dean-but-i-subclassed-array-again.html
    http://avdi.org/devblog/2008/02/23/why-monkeypatching-is-destroying-ruby/
    http://www.codinghorror.com/blog/archives/001151.html