Stevan Little's Blog: api-design

So I was looking over Michael Schwern's perl5i module recently (after hearing about it at this year's YAPC::NA) and I noticed that it enables the autobox module. This reminded me of all the debates I have had with Matt Trout over the years about the various pros and cons of autobox. I figured this would probably make a decent blog post, so here goes.

My core objection to autobox is that it is an illusion. It works by hijacking the normal perl method resolution process: right before perl would die with the error Can't call method "foo" on unblessed reference, it checks specific packages to see if there are available methods. This gives the illusion that these core perl types are in fact objects, when in reality they are very much not. If they were proper objects, they would always be objects instead of just objects within the lexical scope of the autobox pragma. Here is some code that illustrates what (for me) is the big abstraction leak of autobox.

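    # By default autobox dispatches array refs to the ARRAY package,
    # so that package needs a length method for the calls below.
    sub ARRAY::length { scalar @{ $_[0] } }

    my $test;
    {
        use autobox;
        my $foo = [ 1, 2, 3, 4, 5 ];
        warn $foo->length;
        $test = sub {
            warn $foo->length; # succeeds ...
            $foo;
        };
    }
    my $x = $test->();
    warn $x->length; # fails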

This example shows how the lexical scoping of the autobox pragma allows the $test closure to still work correctly, but once outside of the lexical scope the value is no longer autoboxed. This just seems really backwards to me because it requires the users of your code to also enable autobox in their code to use elements from your code. The result of them (for whatever reason) not doing this is that your internal usage of a data element can greatly differ from external usage of the same element. This is an API disconnect that does not sit well with me.

In short, autoboxing is a feature of the lexical environment and not something intrinsic to the element itself.
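The contrast with something intrinsic can be seen without autobox at all. What follows is a minimal sketch in core Perl; the Counted package is hypothetical and exists only for illustration. A blessed array ref carries its class wherever it goes, while a plain array ref dies on a method call in any scope where autobox is not in effect.

```perl
use strict;
use warnings;

{
    # Hypothetical class, used only for illustration.
    package Counted;
    sub new    { my ($class, @items) = @_; return bless [@items], $class }
    sub length { return scalar @{ $_[0] } }
}

# A blessed array ref knows its class in every scope.
my $blessed        = Counted->new( 1, 2, 3, 4, 5 );
my $blessed_length = $blessed->length;

# A plain array ref has no class, so without autobox the call dies.
my $plain        = [ 1, 2, 3, 4, 5 ];
my $plain_length = eval { $plain->length };
my $error        = $@;

print "blessed: $blessed_length\n";
print "plain:   ", ( defined $plain_length ? $plain_length : "died: $error" );
```

Blessing is a property of the value, so nothing about the caller's lexical scope matters here. That is exactly the guarantee autobox cannot give.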

My second issue with autobox is that it is very shallow. In languages where the core types are proper objects (Smalltalk, Ruby, JavaScript, etc.) it is possible to subclass/extend these core types using normal OO practices. Autobox provides the illusion of normal OO, but as soon as you look any deeper than the surface the illusion starts to crumble at an alarming rate.

While it is possible to do something close to subclassing/extending with autobox code by using the following technique, it has some severe drawbacks and serious inconsistencies.

    {
        package ARRAY;
        sub length { scalar @{ $_[0] } }

        package MyArray;
        use base 'ARRAY';

        # do something silly here for illustration
        sub length { (shift)->SUPER::length + 1 }
    }

    {
        use autobox;

        my $foo = [ 1, 2, 3, 4, 5 ];
        warn $foo->length; # 5

        my $bar = bless [ 1, 2, 3, 4, 5 ] => 'MyArray';
        warn $bar->length; # 6
    }

The most obvious issue is that this only works for reference types (ARRAY, HASH and CODE) since Perl only allows blessing of references. So you cannot use this with SCALAR, INTEGER, FLOAT, NUMBER, STRING and UNDEF, which leaves out more than half of the functionality of autobox.

Also, the manual blessing of the subclassed array ref seems a little odd since it differs from how the regular autoboxed array ref works. Of course you could create a MyArray::new method to hide this if you want. If you did this, then perhaps for consistency's sake you would want an ARRAY::new as well. But unless you blessed the array ref into the ARRAY package, a user of your code would need to have autoboxing enabled for ARRAY->new to return anything useful, because (as I said above) the autoboxing is not intrinsic functionality, but instead functionality of a given lexical environment.
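A minimal sketch of that constructor idea, reusing the ARRAY and MyArray packages from above. The new method is hypothetical and is written here in core Perl with no autobox involved. Because it blesses into whichever class it was called on, the returned value works in any scope.

```perl
use strict;
use warnings;

{
    package ARRAY;

    # Hypothetical constructor: bless into the class it was called on.
    sub new    { my ($class, @items) = @_; return bless [@items], $class }
    sub length { return scalar @{ $_[0] } }

    package MyArray;
    our @ISA = ('ARRAY');    # inherits new() from ARRAY

    # Same silly override as before, for illustration.
    sub length { my $self = shift; return $self->SUPER::length() + 1 }
}

my $foo = ARRAY->new( 1, 2, 3, 4, 5 );
my $bar = MyArray->new( 1, 2, 3, 4, 5 );

print $foo->length, "\n";    # 5
print $bar->length, "\n";    # 6
```

Note that this only gets consistency by giving up on autoboxing altogether: every array now has to be created through a constructor instead of a plain [ ... ] literal.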

Now, my last issue with autobox is that if used with the wrong kind of laziness it can expose the internals of an object, defeat encapsulation and make bad APIs. This was the original motivation behind my writing MooseX::AttributeHelpers after having written Moose::Autobox. Take this example for instance.

    {
        package MyThings;
        use Moose;
        use Moose::Autobox;

        has 'things' => (
            is      => 'ro',
            isa     => 'ArrayRef',
            default => sub { [] },
        );

        my $me = MyThings->new;
        $me->things->push( 1 );
    }

It is very tempting to just let the autoboxing provide the API to add things to your object, but this exposes a lot of internal details to your object's consumer. If at some point you want to change how things are stored you will have a lot of work to do. Of course this is better than if users had been doing push(@{ $me->things }, 1) because you still have the encapsulation of the autoboxed APIs. But having to write an interface to match ARRAY for whatever you change things to use is just going to get nasty after a while.

Perceptive readers will also note that $me->things->push( 1 ) will not work unless autoboxing is enabled in that particular lexical environment, which again places a lot of responsibility on the users of your code just to use the API you're providing.

In contrast, the MooseX::AttributeHelpers (soon to be core Moose) version is much more encapsulation friendly and is much more amenable to future changes to the storage type of things.

    {
        package MyThings;
        use Moose;
        use MooseX::AttributeHelpers;

        has 'things' => (
            traits   => [ 'Collection::Array' ],
            is       => 'ro',
            isa      => 'ArrayRef',
            default  => sub { [] },
            provides => {
                push => 'add_thing'
            }
        );

        my $me = MyThings->new;
        $me->add_thing( 1 );
    }

If you change how things are stored, you simply need to re-write the add_thing method. Everything is properly encapsulated within your object as it should be.
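To make that concrete, here is a plain-Perl sketch (no Moose) of what MyThings might look like after such a change, with the storage switched from an array to a hash. The internals and the thing_count method are hypothetical; the point is only that callers of add_thing do not need to change.

```perl
use strict;
use warnings;

{
    package MyThings;

    # Storage used to be an array ref; in this sketch it is a hash ref
    # that also de-duplicates. Callers never see the difference.
    sub new { my ($class) = @_; return bless { things => {} }, $class }

    sub add_thing {
        my ($self, $thing) = @_;
        $self->{things}{$thing} = 1;
        return $self;
    }

    # Hypothetical helper so callers never touch the storage directly.
    sub thing_count {
        my ($self) = @_;
        return scalar keys %{ $self->{things} };
    }
}

my $me = MyThings->new;
$me->add_thing(1);
$me->add_thing(2);
$me->add_thing(2);    # duplicate, absorbed by the hash-based storage

print $me->thing_count, "\n";    # 2
```

Had callers been using $me->things->push( 1 ), this change would have broken every one of them.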

So anyway, that's enough of my autobox ranting. I think that autobox is an extremely interesting piece of software and by no means do I think people should not use it if they are so inclined. But I think it should be used carefully and with full knowledge of its limitations and issues.

So Ovid recently discovered that Moose does not create any accessors by default. Which was surprising to him and truthfully has surprised many people over the years. We on #moose have discussed this many times and the general consensus has always been to leave it as it is. I explain in the comments to Ovid's post why this is so, but I figured that for my inaugural blog post I should expand on this topic.

DWIMery and the Slippery Slope

DWIMery ("Do What I Mean"-ery) can be a very valuable thing when designing APIs but it does come at a cost. The more specific your API, the easier it is to DWIM since the option set is likely pretty small and defaults are usually obvious. But the more general your API, the harder it is to strike a balance. The problem gets even more so when you are designing something like an object system or a language. A system like Moose needs to not only support doing what I mean, but also doing what everyone else means as well.

Opinionated Software vs. TimToady

Recently there has been a trend towards more "opinionated" software (Ruby on Rails) and even "opinionated" languages (Python). The popularity of both these pieces of software shows that many people like this trend. However, Moose is Perl, and in Perl we subscribe to TIMTOWTDI (There Is More Than One Way To Do It). On some level, you could say that opinionated software is actually the antithesis of TIMTOWTDI.

Now this is not to say that Moose is not opinionated or is somehow the pinnacle of TIMTOWTDI. In fact Moose is actually pretty opinionated and I strongly believe that too much TIMTOWTDI is one of the reasons that Perl has the negative reputation it has for maintainability and code clarity. But what Moose does differently is to be humble about its opinions and make it easy (for some value of "easy") to override those opinions and inject your own.

Chris Prather actually suggested just such a solution in one of his comments to Ovid's post. The syntax looks something like this:

package Foo;
use Moose -traits => ['ReadOnly'];
has 'bar';
has 'baz';

This could be accomplished by making a "trait" (the Moose term for a role that is applied to a meta-level object) which would affect the metaclass such that any time an attribute was created it would force a default read-only accessor to be created. While this sounds complicated it would actually be fairly simple, the trickiest part being dealing with merging your default read-only-ness with any user specified options.

Why Moose doesn't create accessors by default

Moose has always aimed to be as Perl-ish as possible, which means trying to embody the spirit of TIMTOWTDI. As I mentioned in one of my responses to Ovid's post, the choice of which type of accessors Moose should create is not so simple. My personal inclination is towards generating simple read-only accessors; others might expect read/write accessors to be the default (which is what other common Perl OO modules like Class::Accessor provide). But this ignores the suggestions that Damian made in Perl Best Practices or the people who like semi-affordance accessors (->foo for reading and ->set_foo for writing) or the people who prefer public readers/private writers. The list can go on and on, and each and every one of these is an equally valid choice.

In my mind the only solution when faced with all these differing and equally valid viewpoints is to actually favor none of them, but allow all of them. And of course, this is exactly what Moose does. I believe that this is most in keeping with the spirit of TIMTOWTDI and therefore the most Perl-ish.
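As a rough illustration, each of the accessor styles mentioned above can be spelled out explicitly with the standard is, reader and writer options to has. This is only a sketch: the Point class and its attribute names are made up, and it assumes Moose is installed.

```perl
package Point;
use Moose;

has 'x' => ( is => 'ro' );                           # read-only accessor
has 'y' => ( is => 'rw' );                           # combined read/write accessor
has 'z' => ( reader => 'z', writer => 'set_z' );     # semi-affordance style
has 'w' => ( reader => 'w', writer => '_set_w' );    # public reader, private writer
has 'v';                                             # the default: no accessors

package main;

my $p = Point->new( x => 1, y => 2, z => 3, w => 4, v => 5 );

$p->y(20);        # read/write accessor used as a setter
$p->set_z(30);    # separate writer method

print join( ', ', $p->x, $p->y, $p->z, $p->w ), "\n";
print $p->can('v') ? "v has an accessor\n" : "v has no accessor\n";
```

None of these is privileged over the others; the author of the class picks the one they mean, per attribute.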

Sunday, July 12, 2009

Why I don't like Autobox

Wednesday, June 3, 2009

Moose and DWIMery
