James Tauber

James Tauber's Blog 2008/11/29

Endo vs Exo

Yesterday I was starting to get nervous about the path I'd started to go down with the generic groups app for Pinax. I was basically building a single centralized app through which different types of groups would be managed via configuration. I had a good chat with Eric Florenzano about it and we agreed that it just felt wrong but I couldn't think of an alternative.

Then this morning I had a brainwave—a shift in approach that I described on the mailing list as an "exo" approach rather than the previous "endo" approach. Instead of having a single, centralized "groups" app that's highly configurable and lets you plug different pieces in, I realised a better approach would be to simply provide the building blocks for site developers to create their own group apps bottom-up.

The advantage of the "exo" approach is that it makes group customization more like normal Django development. It's more flexible if you want to do some things differently.

After thinking about it some more, it occurred to me that the endo vs exo distinction is quite important when crafting extensible software. It's not that the exo approach is always better. It's just different.

The endo approach is that of a framework whereas the exo approach is that of a library. Even a single system like Django may have some aspects that are endo and others that are exo.

An endo approach says to a developer: "we'll provide you the core with slots you can plug your pieces into". An exo approach says to a developer: "we'll provide pieces you can plug together yourself".

In Pinax, django-notifications takes a more endo approach (you register your notification types with the notification subsystem) whereas django-mailer takes an exo approach (it's just there when you want to send mail).

If you need to "register" an entity, an endo approach is probably being used.

If you put your B in the configuration of A then A is endo. Whereas if your B just calls A to do something then A is exo.
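To make the distinction concrete, here is a minimal sketch in plain Python. The names `NotificationCore`, `register`, and `send_mail` are hypothetical and are not the actual django-notifications or django-mailer APIs; they only illustrate who is in charge of the control flow in each style.

```python
# Endo: a hypothetical core (A) that owns the control flow and exposes a slot.
class NotificationCore:
    def __init__(self):
        self._handlers = {}  # notice type label -> handler callable

    def register(self, label, handler):
        """Plug a piece (B) into the configuration of the core (A)."""
        self._handlers[label] = handler

    def send(self, label, payload):
        # The core decides when and how the plugged-in piece runs.
        return self._handlers[label](payload)


# Exo: a hypothetical building block (A) that does nothing until called.
def send_mail(subject, body, recipients):
    """A piece the caller (B) invokes directly when it wants mail built."""
    return [f"To: {r}\nSubject: {subject}\n\n{body}" for r in recipients]


# Endo usage: we register our handler, then the core calls it for us.
core = NotificationCore()
core.register("friend_invite", lambda who: f"{who} invited you")
endo_result = core.send("friend_invite", "alice")

# Exo usage: our own code stays in charge and simply calls the library.
exo_result = send_mail("Hi", "Welcome", ["bob@example.com"])
```

In the endo half, the presence of `register` is the giveaway; in the exo half there is nothing to configure, only something to call.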

Seems a useful distinction. What do people think?

by James Tauber : Created on Nov. 29, 2008 : Last modified Nov. 29, 2008 : Categories software_craftsmanship pinax : 4 comments (permalink)

Voronoi Canvas Tutorial, Part III

In Part I, we wrote some code that enabled the user to draw points on a canvas (as long as they weren't too close to an existing point). In Part II we added the drawing of a horizontal line wherever the mouse is.

Next we're going to draw one more type of object which will be at the heart of Fortune's algorithm for Voronoi diagrams. Remember earlier this month in From Focus and Directrix to Bezier Curve Parameters I wanted to be able to calculate quadratic Bézier curve parameters from a focus and horizontal directrix. Now I can explain why :-)

A point (called the focus) and a line (called the directrix) are enough to define a parabola. For Fortune's algorithm, I need to, for each point and for the sweep line, draw the corresponding parabola. The canvas element doesn't have a parabola-drawing primitive. However, it does support Bézier curves.

A quadratic Bézier curve is actually a section of a parabola, so what I wanted was a way of converting the focus and directrix into the parameters for a quadratic Bézier curve that canvas would understand. That was the motivation for the mathematics in that post.

Implementing those equations in JavaScript gives us:

function drawParabola(fx, fy, dy) {
    // Half-width of the parabola where it crosses the top edge (y = 0).
    var alpha = Math.sqrt((dy*dy)-(fy*fy));
    // Start point, control point and end point of the quadratic Bézier curve.
    var p0x = fx - alpha;
    var p0y = 0;
    var p1x = fx;
    var p1y = fy + dy;
    var p2x = fx + alpha;
    var p2y = 0;

    context.strokeStyle = "rgb(100, 100, 100)";
    context.fillStyle = "rgba(0, 0, 0, 0.05)";
    context.beginPath();
    context.moveTo(p0x, p0y);
    context.quadraticCurveTo(p1x, p1y, p2x, p2y);
    context.stroke();
    context.fill();
}

This not only draws the parabola but fills the region above it (which is relevant to our purpose).
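As a sanity check on the control points used in drawParabola, the following Python sketch samples the resulting quadratic Bézier curve and measures how far each sample is from being equidistant to the focus and the directrix. The helper names and the sample values (focus at (300, 100), directrix at y = 250) are my own assumptions for illustration.

```python
import math

def bezier_from_focus_directrix(fx, fy, dy):
    """Control points as in drawParabola: both endpoints lie on y = 0."""
    alpha = math.sqrt(dy * dy - fy * fy)
    return (fx - alpha, 0.0), (fx, fy + dy), (fx + alpha, 0.0)

def quadratic_bezier(p0, p1, p2, t):
    """Evaluate a quadratic Bezier curve at parameter t in [0, 1]."""
    u = 1.0 - t
    x = u * u * p0[0] + 2 * u * t * p1[0] + t * t * p2[0]
    y = u * u * p0[1] + 2 * u * t * p1[1] + t * t * p2[1]
    return x, y

def max_equidistance_error(fx, fy, dy, samples=11):
    """Largest gap between distance-to-focus and distance-to-directrix."""
    p0, p1, p2 = bezier_from_focus_directrix(fx, fy, dy)
    worst = 0.0
    for i in range(samples):
        x, y = quadratic_bezier(p0, p1, p2, i / (samples - 1))
        to_focus = math.hypot(x - fx, y - fy)
        to_directrix = abs(dy - y)  # the directrix is the horizontal line y = dy
        worst = max(worst, abs(to_focus - to_directrix))
    return worst

error = max_equidistance_error(fx=300.0, fy=100.0, dy=250.0)
```

The error should be zero up to floating-point rounding, since the curve traces the parabola exactly rather than approximating it.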

Now all we need to do is run that for each point:

function drawParabolae(dy) {
    $.each(points, function() {
        if (dy > this[1]) {
            drawParabola(this[0], this[1], dy);
        }
    });
}

and then add a call to that function to our mousemove handler:

...
context.clearRect(0, 0, 600, 400);
drawHorizontalLine(oy);
drawParabolae(oy);
redrawDots();
....

You can see the result here. I was actually surprised how snappy it is, even with a lot of points. Using the Bézier curve rather than drawing the points manually was definitely the way to go.

If you think about the fact that a parabola is the locus of points equidistant from the focus and the directrix, you can start to see how Fortune's algorithm is going to work. I'll make that more explicit in the next and final post.

by James Tauber : Created on Nov. 29, 2008 : Last modified Nov. 29, 2008 : Categories mathematics web javascript jquery : 2 comments (permalink)

Bayesian Classification of Pages on This Site

A year ago, in Automatic Categorization of Blog Entries, I talked about automatically categorizing (or at least suggesting categories for) blog posts using a Bayesian classifier.

I finally decided to give it a go, using Reverend.

To train it, all I basically did was:

from reverend.thomas import Bayes
from leonardo.models import Page

guesser = Bayes()

for page in Page.objects.all():
    for category in page.categories.all():
        guesser.train(category.term, page.content)

Let's pick 10 random blog entries and see how it goes guessing them:

By "nothing conclusive" I mean that the highest guess was less than 2%. It is interesting that guesses were either < 2% or were around 40% and, in the latter case, they were always correct. So at least no false positives. I wonder what the reason for the false negatives was, though.

Next I tried it against all pages (that had a category). There were 284 cases where no prediction over 5% was made. But in the 288 cases where a prediction over 5% was made, in 287 cases the prediction was correct. In only 1 case was a wrong prediction over 5% made. And it was simply that the classifier thought poincare project should have been tagged "poincare project" :-)
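A cutoff like the 5% one could be applied with a small helper along these lines. This is only a sketch: `confident_guess` is a hypothetical name, and it assumes the classifier's guesses arrive as a list of (category, score) pairs with scores between 0 and 1.

```python
def confident_guess(guesses, threshold=0.05):
    """Return the top-scoring category if it beats the threshold, else None.

    `guesses` is assumed to be a list of (category, score) pairs in any order.
    """
    if not guesses:
        return None
    category, score = max(guesses, key=lambda pair: pair[1])
    return category if score > threshold else None

# Made-up scores, purely for illustration.
conclusive = confident_guess([("python", 0.41), ("django", 0.012)])
inconclusive = confident_guess([("python", 0.015), ("web", 0.009)])
```

Here `conclusive` is a category name, while `inconclusive` is None because no score clears the threshold.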

So the precision was basically 100% (287 of 288 predictions) but the recall was only about 50% (287 of 572 pages).
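The arithmetic behind those two figures, using the counts above. It assumes each page counts as a single classification and that a page with no prediction over 5% counts as a miss.

```python
correct = 287        # predictions over 5% that matched the page's category
wrong = 1            # predictions over 5% that did not
no_prediction = 284  # pages with no prediction over 5%

# Precision: of the predictions made, how many were right?
precision = correct / (correct + wrong)

# Recall: of all categorized pages, how many got a right prediction?
recall = correct / (correct + wrong + no_prediction)

print(f"precision = {precision:.1%}, recall = {recall:.1%}")
# prints: precision = 99.7%, recall = 50.2%
```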

by James Tauber : Created on Nov. 29, 2008 : Last modified Nov. 29, 2008 : Categories python this_site mathematics : 6 comments (permalink)