taw's blog: June 2008

Sunday, June 15, 2008

How to create value by blogging

I'm hardly blogging these days, so I thought I'd at least open a Twitter account - and here it is. Some stuff I've done recently, like London Barcamp 4 and getting an OLPC XO-1 laptop, wasn't blogged at all, just tweeted. This is all caused by a certain problem with blogging - people really like to read long blog posts. Just look at the "Popular posts" sidebar - it's one long post after another. People even like insanely long blog posts like those by Steve Yegge. But long posts take too much time and energy, so most of the posts written by me and other bloggers are pretty short. So I somehow thought that writing short blog posts isn't really worthwhile, and tweeted instead.

Today I thought maybe it's time to verify it, so for some hard data I collected statistics from Google Analytics (unique page views) and del.icio.us (number of bookmarks, excluding mine), and divided my 202 blog posts into ten buckets depending on their plain text size.

And indeed - it seems that almost nobody cares about the short blog posts, and the longer the post is, the more people read it. The difference is even more pronounced when counting del.icio.us bookmarks. I feel that the number of del.icio.us bookmarks is a much better indicator of a post's "value" than page views, as a page view is generated before the reader has even seen the post, while a bookmark is generated only after the post was read and judged to be valuable. Search good vs experience good.

Bucket	Posts	Average size	Average page views	Average bookmarks	Average page views per kB	Average bookmarks per kB
1	20	16742	1776	20.0	126	1.4
2	20	6890	1513	11.0	224	1.6
3	21	4590	1036	4.9	245	1.1
4	20	3456	496	3.8	149	1.1
5	20	2803	314	2.3	114	0.9
6	20	2129	301	1.1	146	0.5
7	20	1656	402	1.9	238	1.1
8	21	1188	223	0.2	177	0.2
9	20	789	276	1.2	399	1.7
10	20	421	143	0.2	384	0.5
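
The bucketing behind this table can be sketched roughly as follows. This is only a sketch under assumptions: the dictionary keys size, views and bookmarks and the function names are made up for illustration, 1 kB is taken as 1024 bytes, and the per-kB columns are assumed to be averages of per-post ratios. The post states none of these. Rounded bucket boundaries happen to give the same 20/21 split of 202 posts as the table, but the real split may have been done differently. Written for Python 3.

```python
def split_into_buckets(posts, n_buckets=10):
    """Sort posts by size (largest first) and cut them into near-equal buckets."""
    ordered = sorted(posts, key=lambda p: p["size"], reverse=True)
    # Rounded boundaries spread the leftover posts across the buckets.
    bounds = [round(i * len(ordered) / n_buckets) for i in range(n_buckets + 1)]
    return [ordered[bounds[i]:bounds[i + 1]] for i in range(n_buckets)]


def mean(values):
    values = list(values)
    return sum(values) / len(values)


def summarize(bucket):
    """Compute one table row for a bucket of posts (sizes must be non-zero)."""
    def kb(post):
        # Assumption: 1 kB = 1024 bytes of plain text.
        return post["size"] / 1024

    return {
        "posts": len(bucket),
        "avg_size": mean(p["size"] for p in bucket),
        "avg_views": mean(p["views"] for p in bucket),
        "avg_bookmarks": mean(p["bookmarks"] for p in bucket),
        # Assumption: per-kB figures are averaged per post, not computed
        # from the bucket-wide averages.
        "avg_views_per_kb": mean(p["views"] / kb(p) for p in bucket),
        "avg_bookmarks_per_kb": mean(p["bookmarks"] / kb(p) for p in bucket),
    }
```

Calling summarize on each bucket returned by split_into_buckets gives one row per bucket, with the longest posts in bucket 1.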

This confirms my beliefs and disproves the commonly held ADHD theory of blog readers, which states that most blog readers have very short attention spans and would much rather look at the kittens. It seems that, to the contrary, readers really love long posts. At least my readers. You'll still be getting kittens of course, my blog would look quite empty without them.

On the other hand, a completely different picture arises when page views per kB and bookmarks per kB are measured. Bookmarks per kB is pretty flat, while page views per kB goes down fast as posts get longer. So if kBs of text are a good measure of a blogger's effort, then the best way of generating value is writing tons of short posts.

I'm so undecided. Is it better to write fewer long posts, many of which would be big hits (relative to the blog's popularity of course, this isn't I CAN HAS CHEEZBURGER), or rather many short posts which would generate less value per post but more value overall? I'm kinda writing for myself, but I still consider whether the post would be valuable to the average reader before posting it. I should probably simply keep posting instead of thinking too much.

Thursday, June 12, 2008

Bolting Aspect Oriented Programming on top of Python

The biggest difference between native support and bolting things on top of a programming language is that you can only bolt so much before things start to collapse. In C++ even strings, arrays, and hashtables are bolted on - and while they work just fine, any interoperability between different libraries using strings, arrays, and hashtables is almost impossible without a massive amount of boilerplate code.

In Perl and Python these basic data structures are native and well supported, but the next step of supporting objects is bolted on. So the objects work reasonably well, but metaprogramming with them is very difficult and limited (in Python) or outright impossible in any sane way (in Perl).

Ruby takes a step further and real object-oriented programming is native, so people can bolt other things on top of it like aspect-oriented programming. AOP in Ruby (before_foo, after_bar, alias_method_chain, mixins, magical mixins, many method_missing hacks etc.) works reasonably well, but I wouldn't want to bolt anything on top of that, or the whole thing would fall apart.

This is the problem with bolting stuff on - bolting stuff on is a valid technique (just like design patterns, code generation and other band-aids), and bolted-on stuff like objects in Perl/Python or arrays/strings/hashtables in C++ does work, it's just infinitely less flexible than native types when it comes to further extending.

But I really miss AOP in Python. Multiple inheritance can kinda simulate a very weak kind of mixins, but it is rather cumbersome to use. I wanted to write a test suite using aspect-oriented mixins, but there were simply so many super(omg, who).made(up, this, syntax) calls that it looked as painful as Java inner classes. So I thought - would it already collapse if I added very simple AOP support?
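
For comparison, here is roughly what the plain super() approach looks like for a small test hierarchy. It is a sketch, not the original test suite: the class names mirror the distilled example, and the log list is a hypothetical stand-in for real setup work so the call order can be inspected.

```python
class BaseTest(object):
    def __init__(self):
        self.log = []  # hypothetical: records calls instead of doing real work

    def setup(self):
        pass

    def teardown(self):
        pass


class WidgetMixin(object):
    # Only works when mixed into a BaseTest subclass, because
    # plain object has no setup() or teardown() to delegate to.
    def setup(self):
        super(WidgetMixin, self).setup()
        self.log.append("setup widget")

    def teardown(self):
        self.log.append("teardown widget")
        super(WidgetMixin, self).teardown()


class Foo(BaseTest):
    def setup(self):
        super(Foo, self).setup()
        self.log.append("setup foo")

    def teardown(self):
        self.log.append("teardown foo")
        super(Foo, self).teardown()


class Bar(WidgetMixin, Foo):
    def setup(self):
        super(Bar, self).setup()
        self.log.append("setup bar")

    def teardown(self):
        self.log.append("teardown bar")
        super(Bar, self).teardown()
```

Every class has to repeat the same super() dance in the right place, before its own work in setup and after it in teardown, which is what prompted the question of bolting on AOP support instead.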

It turned out not to be so bad. Here's a distilled example. There's a bunch of classes inheriting from BaseTest. Their setup methods should be called from superclass down to subclass, while their teardown methods should be called from subclass up to superclass. If there are multiple AOP methods on the same level, all should be called, in some consistent order (I do alphanumeric, order of definition would be better but Python metaprogramming isn't powerful enough to do that). It's also possible to override a parent's AOP methods (you could even compose an AOP method override using super if you really needed). Or you could override the whole setup/teardown method if you really needed - this is very flexible.

class BaseTest(object):
  def setup(self):
    aop_call_down(self, 'setup')
  def teardown(self):
    aop_call_up(self, 'teardown')

class WidgetMixin(object):
  def setup_widget(self):
    print("* Setup widget")
  def teardown_widget(self):
    print("* Teardown widget")

class Foo(BaseTest):
  def setup_foo(self):
    print("* Setup foo")
  def teardown_foo(self):
    print("* Teardown foo")

class Bar(WidgetMixin, Foo):
  def setup_bar(self):
    print("* Setup bar")
  def teardown_bar(self):
    print("* Teardown bar")

class Blah(Bar):
  def setup_blah(self):
    print("* Setup blah1")

  # Overrides WidgetMixin.setup_widget but keeps its place in the call order.
  def setup_widget(self):
    print("* Setup widget differently")

  def setup_blah2(self):
    print("* Setup blah2")

  def teardown_blah(self):
    print("* Teardown blah1")

  def teardown_blah2(self):
    print("* Teardown blah2")

The output of a = Bar(); a.setup(); a.teardown() is exactly what we would expect:

* Setup foo
* Setup widget
* Setup bar
* Teardown bar
* Teardown widget
* Teardown foo

The more difficult case of b = Blah(); b.setup(); b.teardown() is also handled correctly - notice that the widget mixin's setup was overridden:

* Setup foo
* Setup widget differently
* Setup bar
* Setup blah1
* Setup blah2
* Teardown blah2
* Teardown blah1
* Teardown bar
* Teardown widget
* Teardown foo

The code that makes it possible isn't strikingly beautiful, but it's not any worse than some of my Django templatetags.

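def aop_call_order(obj, prefix):
  # Yield prefix_* method names, walking the MRO from base class to subclass.
  already_called = set()
  for cls in reversed(obj.__class__.mro()):
    # dir() also lists inherited names, so skip any we have already yielded.
    for name in sorted(dir(cls)):
      if not name.startswith(prefix + '_'):
        continue
      if name not in already_called:
        already_called.add(name)
        yield name

def aop_call_up(obj, prefix):
  # Subclass first, superclass last (teardown order).
  for name in reversed(list(aop_call_order(obj, prefix))):
    getattr(obj, name)()

def aop_call_down(obj, prefix):
  # Superclass first, subclass last (setup order).
  for name in aop_call_order(obj, prefix):
    getattr(obj, name)()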

aop_call_order yields the names of methods called prefix_* defined in obj's class and its ancestors. It walks Python's multiple inheritance resolution order from the most basic class down to obj's own class, falling back to alphabetical order for methods on the same layer. Overriding a method in a subclass doesn't affect the order, making the "Setup widget differently" trick possible. The aop_call_down and aop_call_up functions then call these methods in straight or reverse order.
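
On newer Pythons (3.6 and later), a class keeps its own attributes in definition order, so the alphabetical fallback could be swapped for order of definition. A minimal sketch, assuming such an interpreter; aop_call_order_by_definition and the Base/Child classes are hypothetical names used only for this illustration:

```python
def aop_call_order_by_definition(obj, prefix):
    """Yield prefix_* method names base-first, in definition order per class."""
    seen = set()
    for cls in reversed(type(obj).__mro__):
        # vars(cls) holds only names defined directly on cls, and on
        # Python 3.6+ it iterates in the order they were written.
        for name in vars(cls):
            if name.startswith(prefix + "_") and name not in seen:
                seen.add(name)
                yield name


class Base(object):
    def __init__(self):
        self.log = []  # hypothetical: records calls instead of printing

    def setup(self):
        for name in aop_call_order_by_definition(self, "setup"):
            getattr(self, name)()

    def setup_zebra(self):
        self.log.append("zebra")

    def setup_apple(self):
        self.log.append("apple")


class Child(Base):
    def setup_mango(self):
        self.log.append("mango")

    # Overrides Base.setup_apple but keeps the position it had in Base.
    def setup_apple(self):
        self.log.append("apple overridden")
```

With this variant setup_zebra runs before setup_apple because it was written first, whereas the sorted(dir(cls)) version would run setup_apple first.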

Of course, like all other multilayer bolted-on features, it's going to collapse horribly if you use it together with other metaprogramming features. If you don't like that - switch to Ruby.

Coming up next - bolting closures on top of Fortran.