James Tauber's Blog
Welcome to my blog. It's a haphazard collection of thoughts on various interests of mine as well as updates on projects. If you're interested in any of blogging, personal information management, Python, Django, XML, RDF, software development, Web 2.0, open source, free-market economics, Mac OS X, web architecture, REST, music theory, record producing, filmmaking, linguistics, the Greek New Testament, pure mathematics or general relativity, there's a chance you may find something of interest.
Conference Time
I have four conferences coming up in the next eight weeks.
From 12th-14th February, I'll be attending Kiwi Foo Camp in New Zealand—one of those trips where the travel time is longer than the length of the conference :-)
The day after I get back, I'm off to Atlanta for PyCon. I'm involved in the Pinax tutorial at the start and will be staying all the way through the sprints where we hope to get lots of Pinax done!
Then March 10th-12th I'm in Montréal for ConFoo, the first conference I've been to in a while that's all expenses paid for speakers. I'll be giving a talk on, you guessed it, Pinax. Will be fun to introduce Pinax at a general Web conference.
I'll finish off the month in San Jose for BibleTech March 26th and 27th. I'll be giving two talks there, one on my graded reader project and one on using Pinax for collaborative corpus linguistics (partly talking about Pinax in general and partly talking about some early stage work I'm doing specifically on corpus annotation tools in Pinax).
Hope to see many of you at at least one of them!
by James Tauber : Created on Feb. 5, 2010 : Last modified Feb. 5, 2010 : Categories python linguistics morphgnt travel conferences speaking language_learning pycon greek new_testament_greek read_john pinax graded_reader : (permalink)
Zeno Processing
Say you have a stream of incoming data. Perhaps it's a database table that's monotonically increasing.
You want to do some processing on it that will take a long time because of the size of the data. Say it's one million records.
You take a snapshot and process the million records. Say that takes 4 hours. In the meantime, ten thousand new records have come in. So you take a snapshot and process those. Say that takes two minutes. In the meantime, a hundred new records have come in. So you take a snapshot of those...and so on.
The analogy with the paradoxes of Zeno of Elea is obvious and so Nicholas Tollervey and I have decided "Zeno processing" might be a useful term for this approach.
At some point the processing is quick enough that either no new data comes in or you can take the stream down for enough time to finish off the processing.
I'm sure there's an existing name for this technique, but I like "Zeno processing".
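Here is a minimal sketch of the loop in Python. It is only an illustration under stated assumptions: the stream is modelled as an in-memory list, and `make_simulated_batch_processor` is a hypothetical stand-in that pretends one new record arrives for every hundred records processed. A real version would snapshot a table (say, by highest id seen) instead.

```python
from typing import Callable, List


def zeno_process(stream: List[int], process_batch: Callable[[List[int]], None]) -> List[int]:
    """Process `stream` in successive snapshots; return the size of each pass."""
    processed = 0  # number of records handled so far
    pass_sizes = []
    while True:
        snapshot_end = len(stream)     # snapshot: everything that exists right now
        if snapshot_end == processed:  # nothing new arrived during the last pass
            break
        process_batch(stream[processed:snapshot_end])
        pass_sizes.append(snapshot_end - processed)
        processed = snapshot_end
    return pass_sizes


def make_simulated_batch_processor(stream: List[int], records_per_arrival: int):
    """Hypothetical slow processor: new records arrive while a batch is running."""
    def process_batch(batch: List[int]) -> None:
        new_records = len(batch) // records_per_arrival
        stream.extend(range(len(stream), len(stream) + new_records))
    return process_batch


stream = list(range(1_000_000))
sizes = zeno_process(stream, make_simulated_batch_processor(stream, 100))
print(sizes)  # [1000000, 10000, 100, 1]
```

Each pass is a hundredth the size of the one before, so the loop ends as soon as a pass is short enough that nothing new arrives while it runs. If the arrival rate were high enough that passes never shrank, this loop would not terminate, which is where taking the stream down briefly comes in.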
by James Tauber : Created on Feb. 1, 2010 : Last modified Feb. 1, 2010 : (permalink)
Good Week for Launches
Last week was a pretty amazing week for the Eldarion team. Amidst a ton of client work, we managed to:
- launch the iPhone version of typewar
- launch Apple Predictions
- launch Lost Predictions
Readers of this blog who remember Potter Predictions will immediately recognize aspects of the second and third sites.
On Friday I managed to squeeze in time to attend a workshop at Harvard on Morphological Complexity where I saw a lot of familiar faces from when I was more active in my PhD.
Over the weekend, I worked on a new website for the Pinax project but didn't get that done so unfortunately missed out on launching a fourth site. I guess I also need to get back to redoing this site at some point too :-)
by James Tauber : Created on Jan. 25, 2010 : Last modified Jan. 25, 2010 : (permalink)
Fake JKM
I thought I'd kick off my blogging in 2010 with this video of my short opening talk at DjangoCon 2009 in Portland.
The talk was inspired, in part, by Rives on 4 AM.
by James Tauber : Created on Jan. 2, 2010 : Last modified Jan. 2, 2010 : Categories django conferences speaking : 0 comments (permalink)
Information Architecture On This Site
It seems such a shame to have done so much musing about this site and its contents over the last 13 years and then not to blog about it here. So here is today's effort for my site sprint:
http://journeymanofsome.com/information_architecture/
by James Tauber : Created on Nov. 17, 2009 : Last modified Nov. 17, 2009 : Categories this_site blogging site_sprint : 27 comments (permalink)
Site Sprint
I've been thinking about re-doing this site for a while, so I've decided to do it as part of SiteSprint II.
My goals are to:
- freshen up the design
- rethink the information architecture of my site
- adapt to the shifting role blogging now has for me
- explore some new technologies like HTML5 and typekit
- rewrite the underlying code and make it more reusable
I'll be prototyping the new site over at http://journeymanofsome.com during the sprint and then will move it over to jtauber.com before the end of the year.
I'll be making daily changes there so you might want to check it out often!
In other news: today marks the 3-year anniversary of my decision to switch to using Django for my sites rather than continue to build my own framework (see Quisition Going Django)
by James Tauber : Created on Nov. 15, 2009 : Last modified Nov. 15, 2009 : Categories blogging web this_site leonardo django web_design site_sprint : 23 comments (permalink)
DjangoCon Talks on Pinax
DjangoCon is on next month in Portland Oregon and the initial talk schedule has just gone up. There are three talks on Pinax I'm directly involved with (and maybe others that will touch on Pinax):
- an introductory tutorial for Pinax beginners
- a short talk on how to contribute to Pinax
- a general State of Pinax talk
If you're new to Pinax, you should consider attending the first talk. If you're already using Pinax you may still learn some stuff but we'll definitely tread some ground that would already be familiar to you.
The second talk is just a brief introduction to how our development process works, how you can get involved and contribute and what sorts of things you might want to work on. If you've already started using Pinax and are looking to learn how to get more involved in the project itself, this short talk is for you.
The third talk is an updated State of Pinax talk that will cover what's in 0.7 and what the plans are for 0.8 and 0.9 and beyond. It should have something of interest for everyone, Pinax beginners and experts alike.
We'll also be sprinting on Pinax following the conference itself. If you're interested in joining the sprint, the How to Contribute to Pinax talk is probably a must.
I look forward to seeing a bunch of you at DjangoCon.
Eldarion is a silver sponsor of DjangoCon.
by James Tauber : Created on Aug. 11, 2009 : Last modified Aug. 11, 2009 : Categories django conferences speaking pinax : (permalink)
Eldarion Launched
With my visa finally getting processed, me being able to be employed by the company I helped found and my return to the US, I am delighted to be able to announce that we've launched the Eldarion website and with it, the company.
http://eldarion.com/
Just a single page at the moment, but I think it's a good start. Thanks to Ryan Berg for turning my initial design into something good looking. And of course thanks to Greg Newman for the logo, which I've previously talked about.
by James Tauber : Created on June 29, 2009 : Last modified June 29, 2009 : Categories django announcements pinax : (permalink)
Affianced
Well, it's been a two-and-a-half month blog drought but it's worth breaking for this news.
Two weeks ago, the love of my life, Lisa, and I got engaged in Paris.
Of course if you follow me on Twitter or Facebook, you knew that at the time (or shortly thereafter) but it felt like it should be declared here too!
I first met (and fell for) Lisa in 1992 although didn't see her again until 1996. People often ask me what took me so long to ask. Truth is, I would have proposed in 1996 (okay, maybe 1997) if I thought she would have said yes. I guess entrepreneurs and lawyers have different risk profiles :-)
My time with her, especially the last four years, has been the best of my life.
I'd always imagined I'd marry my best friend. And now I will be.
by James Tauber : Created on June 25, 2009 : Last modified June 25, 2009 : Categories announcements personal : 5 comments (permalink)
Pinax Video and Move to GitHub
The audio isn't great (they didn't seem to be directly recording the mic I was holding) and it got cut short, but for what it's worth, here's the video of my talk on Pinax at PyCon 2009:
Following the conference proper, we had a great sprint that got us within a stone's throw of a 0.7 beta release. One of the first things we did was move from svn at Google Code Project Hosting to git at GitHub. The new home of the Pinax source code is now:
http://github.com/pinax/pinax/
We also moved our wiki and task tracking off Google Code Project Hosting to our own, Pinax-based system:
Enjoy!
by James Tauber : Created on April 7, 2009 : Last modified April 7, 2009 : Categories pycon pinax : 4 comments (permalink)
The Long Pay Off: Battlestar Galactica and Lost
NOTE: This may contain SPOILERS for Babylon 5, Battlestar Galactica and Lost.
I'm a big fan of episodic television with long arcs. I love pay-offs that are separated by months or even years from their initial set up, where you can go back and watch earlier episodes and see things you never noticed before.
Back when I was watching the X Files, it was always the so-called "myth arc" episodes I liked and never the "monster of the week". But it eventually became clear that even the myth arc episodes in the X Files were being made up as they went along. I felt duped.
So it was with great anticipation in 2000 that I started watching reruns of Babylon 5, a show famous for having been planned out in advance. Despite numerous flaws, this planning paid off multiple times. There were multiple OMFG moments where you realize how pieces of the puzzle from previous seasons fit perfectly together. In one classic multi-season arc there is a mysterious episode in season 1 involving time travel two years in the future. In season 3 you see the other half. Amazing stuff that would be very hard to pull off without the show being planned in advance.
Of course there are dangers with planning in advance. B5 suffered from story changes due to actors leaving.
This brings me to two of my favourite shows on TV at the moment: Battlestar Galactica and Lost.
Lost has supposedly been planned in advance and BSG largely made up as they went along.
If you'd asked me at the end of last year, I would have said the two shows were counter examples to the need for planning in advance. I didn't feel like Lost was giving me the payoffs and I felt like BSG was doing a great job without the long arcs. This year I've changed my mind. Lost is starting to get some pretty cool payoffs (although it could still screw things up) and BSG is breaking up under the stress of not being worked out in advance (although perhaps the finale will redeem it).
At least BSG had a definite end point planned for a while. There's nothing worse than a show that pretends to be teleological but is being written ad hoc with no real end in sight.
There are also opportunities available to the story teller when things aren't planned too much in advance. With BSG I didn't feel there were any truly amazing payoffs (again, the finale could prove me wrong) but there were certainly moments that were so out of left field and unpredictable it made for awesome story telling. It leads to a completely different kind of OMFG moment. Had BSG been planned in advance, those events might have been easier to see coming.
I guess that's the trade off. Plan in advance and you can get amazing payoffs but you run the risk of leaving too many clues so there are no real surprises. Write ad hoc and you can totally surprise the audience but also expose yourself to too many plot holes. BSG has one more shot to see if it can successfully close some of those holes.
Despite their flaws, I think Battlestar Galactica and Lost are both excellent examples of what works about each approach.
UPDATE: Here's a (spoiler filled) post on 12 Plotholes That Must Be Filled in the Battlestar Finale
by James Tauber : Created on March 20, 2009 : Last modified March 20, 2009 : 2 comments (permalink)
Eldarion Logo
For years, I've had "Eldarion" in the back of my mind as a name I'd like to use for some creative endeavour. It is a reference to Tolkien: Eldarion was the son of Aragorn and Arwen and the Second High King of the Reunited Kingdom.
Much to my surprise, the domain was available when I checked back in 2002, so I registered it immediately.
At first I thought I'd use the name for my film and music endeavours. When I attended SXSW Film and Music in 2005, I actually had business cards printed that said "Eldarion". They were black cards with white writing, a reference to the Gondorian flag.
I planned a logo that would feature the White Tree of Gondor and the seven stars of the House of Elendil but never had the skill to pull it off.
With a need for a logo coming up again recently, I approached my friend Greg Newman. My brief to him was pretty much as follows:
- use the text "eldarion" in Anivers (the font family I'd chosen a couple of months earlier)
- feature some reference to the White Tree and/or the Seven Stars
- make it silver on dark grey rather than white on black to give it more life
He blew me away with the result and the response has been phenomenal.
It may strike some as unusual to choose a fantasy reference for a high-tech startup. But I'm reminded of a wonderful scene in the British comedy Yes Prime Minister where the PM is advised that if his first television broadcast is to say nothing new and exciting then he should wear a modern suit, the background should feature abstract paintings and the opening music should be Stravinsky. On the other hand, if the broadcast is to contain radical new announcements, then he should wear a dark suit, the background should feature oak paneling, leather volumes and 18th century portraits and the opening music should be Bach.
The same idea applies, I think, to choosing ancient symbols from the legendarium of Tolkien for what is intended to be a very modern and forward thinking new company.
Stay tuned for lots more about Eldarion.
by James Tauber : Created on March 2, 2009 : Last modified March 2, 2009 : 6 comments (permalink)
Kindle 2: First Impressions
I was asleep when I heard a knock at the door. I knew it was UPS. I jumped up and threw some clothes on but by the time I got to the door, they had gone. I ran down the hall to the elevator. I had just missed them. Caught the next elevator and caught them in the lobby of my apartment building just as they were leaving. Phew! I had my Kindle 2.
I didn't have the original Kindle and I've never used any kind of electronic reader so this was a new experience for me. I love books and own A LOT of them. But spending time on three continents, at conferences and on airplanes means I'm looking forward to more books at my fingertips and avoiding the agony of "which 0.1% of my books can I take with me on this flight".
Here are my first impressions:
- they put thought into the unpackaging experience
- even though I've seen the photos, it still seemed smaller once I held it in my hand than I thought it would be
- the Amazon leather cover I bought adds a bit of weight and bulk to the device
- the fact the screen can show stuff while the device is turned off freaked me out at first
- the device is comfortable to hold and I imagine being able to read for long periods with this
- as an electronic-ink newbie, I was very impressed by the readability of the screen
- I didn't like the font when I first looked at it but once I started reading it didn't bother me
- it worked out of the box, was linked to my Amazon account and had a book I'd bought before the device arrived
- knowing how far I am through a book as a percentage is a little freaky at times
- it took me a little while to get the hang of the navigation (beyond basic turn pages, which is easy)
- text-to-speech is impressive but I can't imagine using it
- I wish trying to go up from the first selection would wrap around to the bottom selection. It's too cumbersome to select a choice towards the end of the screen
- downloading books is FAST — couple of seconds for each of the two books I bought
- browsing web pages is like turning CSS off
- User Agent came through as "Mozilla/4.0 (compatible; Linux 2.6.10) NetFront/3.4 Kindle/1.0 (screen 600x800)"
- I'm already thinking about web applications for it (especially my graded reader ideas and flashcards)
- the immediacy and ease with which you can buy books could be dangerous :-)
I'm very happy so far and can imagine buying the majority of my books for the Kindle from now on. The real test will be whether I'll go back and re-buy any of my existing books (especially the ones back in storage in Australia)
UPDATE: just discovered the Web browser has an Advanced Mode that does CSS and Javascript. This site doesn't look too great with it, though :-)
by James Tauber : Created on Feb. 25, 2009 : Last modified Feb. 25, 2009 : Categories books kindle : 1 comment (permalink)
Leaving mValent
People who follow me on twitter or are friends on Facebook already know this, but last week I officially resigned from mValent.
It was over seven years ago that Duane Tharp and Clyde Logue approached me about joining them in a new venture post-Bowstreet. I had already made the decision to move back to Australia from the US and they were still willing to hire me despite being remote. Within a year we were venture funded, had hired a CEO and VP Engineering (who became my boss) and were starting to build the team.
mValent went through distinct phases and so I don't necessarily feel like I've worked in the same place for seven years but rather three or four different companies. I don't think I ever really did the job I thought I would be doing, but I learnt a tremendous amount on the business side of things and made a number of really good friends.
But it's time for me to move on. The technology I helped create and the people I helped hire now have an excellent home at Oracle. I'm ready for something new.
As to what that is: I hope to have more to say very soon...
by James Tauber : Created on Feb. 23, 2009 : Last modified Feb. 23, 2009 : Categories mvalent announcements personal entrepreneurship : 4 comments (permalink)
Reading Apple ][ DOS 3.3 Disk Images with Python
I was feeling nostalgic for the days of Apple ][ DOS 3.3 and started re-familiarizing myself with the disk layout (VTOC, catalog entries, track/sector lists, etc)
Of course, I couldn't help but then implement them in Python.
Here is a python module that reads Apple ][ DOS 3.3 disk images and can both list the disk catalog and also dump the contents of a file.
http://jtauber.com/2009/02/15/a2disk.py
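To give a flavour of the layout, here is a minimal sketch of walking the catalog. It is not the module linked above, just an illustration that assumes a DOS-ordered 140K image (35 tracks of 16 sectors of 256 bytes, stored sequentially), the VTOC at track 17 sector 0, and 35-byte catalog entries starting at offset 0x0B of each catalog sector. `make_demo_image` is a hypothetical helper that builds a tiny synthetic image so the code can be run without a real disk.

```python
TRACKS, SECTORS_PER_TRACK, SECTOR_SIZE = 35, 16, 256
VTOC_TRACK, VTOC_SECTOR = 17, 0
ENTRY_START, ENTRY_SIZE, ENTRIES_PER_SECTOR = 0x0B, 0x23, 7
FILE_TYPES = {0x00: "T", 0x01: "I", 0x02: "A", 0x04: "B"}  # common types only


def read_sector(image, track, sector):
    """Return the 256 bytes of one sector from a DOS-ordered image."""
    offset = (track * SECTORS_PER_TRACK + sector) * SECTOR_SIZE
    return image[offset:offset + SECTOR_SIZE]


def list_catalog(image):
    """Return (name, type, locked, sector_count) for each live catalog entry."""
    vtoc = read_sector(image, VTOC_TRACK, VTOC_SECTOR)
    track, sector = vtoc[1], vtoc[2]  # link to the first catalog sector
    entries, visited = [], set()
    while track != 0 and (track, sector) not in visited:
        visited.add((track, sector))  # guard against a looping chain
        data = read_sector(image, track, sector)
        for i in range(ENTRIES_PER_SECTOR):
            start = ENTRY_START + i * ENTRY_SIZE
            entry = data[start:start + ENTRY_SIZE]
            if entry[0] in (0x00, 0xFF):  # never used, or deleted
                continue
            # file names are 30 characters stored with the high bit set
            name = bytes(b & 0x7F for b in entry[3:33]).decode("ascii").rstrip()
            locked = bool(entry[2] & 0x80)
            file_type = FILE_TYPES.get(entry[2] & 0x7F, "?")
            sectors = entry[0x21] | (entry[0x22] << 8)  # little-endian length
            entries.append((name, file_type, locked, sectors))
        track, sector = data[1], data[2]  # link to the next catalog sector
    return entries


def make_demo_image():
    """Hypothetical helper: a blank image with one Applesoft file, HELLO."""
    image = bytearray(TRACKS * SECTORS_PER_TRACK * SECTOR_SIZE)
    vtoc = (VTOC_TRACK * SECTORS_PER_TRACK + VTOC_SECTOR) * SECTOR_SIZE
    image[vtoc + 1], image[vtoc + 2] = 17, 15  # catalog starts at T17 S15
    cat = (17 * SECTORS_PER_TRACK + 15) * SECTOR_SIZE + ENTRY_START
    image[cat:cat + 3] = bytes([18, 15, 0x02])  # T/S list location, type A
    image[cat + 3:cat + 33] = bytes(ord(c) | 0x80 for c in "HELLO".ljust(30))
    image[cat + 0x21] = 2  # length in sectors (low byte)
    return bytes(image)


print(list_catalog(make_demo_image()))  # [('HELLO', 'A', False, 2)]
```

Dumping a file's contents would follow the same pattern: read the track/sector list the entry points at, then read each data sector it names.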
Enjoy!
by James Tauber : Created on Feb. 15, 2009 : Last modified Feb. 15, 2009 : 3 comments (permalink)
Oracle to buy mValent
People who follow me on twitter already know this, but yesterday Oracle announced its intention to acquire mValent, the company I helped start in 2002. Congratulations to all involved! We built a great team, then a great product, then made our customers successful with that product. As with any startup, there was always a gap between what we wanted to do and what we could do given our resources. As part of Oracle, mValent is going to be able to close that gap in a huge way.
by James Tauber : Created on Feb. 5, 2009 : Last modified Feb. 5, 2009 : Categories mvalent : 7 comments (permalink)
Population Ratios of Top Three Urban Agglomerations
Some countries have a single large urban agglomeration that is much bigger than other urban areas in the country. Others have three or more major agglomerations that are pretty close in size.
For a while I've wanted to visualize different countries depending on whether they were more like the former or more like the latter.
So I took some data from http://www.citypopulation.de/ and used HTML Canvas to create a bubble plot of all countries with three or more agglomerations over one million people. I used the ratio of the population of the largest agglomeration and the second largest as the y coordinate and the ratio of the population of the second largest and the third largest as the x coordinate.
Here is the result:
Population Ratios of Top Three Urban Agglomerations
Besides this criterion, I can't immediately think of other similarities Argentina, the Philippines, France and South Korea have; or India, China, Italy and the Netherlands; or Russia and Bangladesh.
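The coordinates described above can be sketched as follows. The plot itself was drawn with HTML Canvas; this Python snippet only shows the ratio calculation, and the country names and population figures in it are made up for illustration rather than taken from citypopulation.de.

```python
from typing import Dict, List, Optional, Tuple

THRESHOLD = 1_000_000  # only agglomerations over one million count


def ratio_point(populations: List[int]) -> Optional[Tuple[float, float]]:
    """Return (x, y) for one country, or None if it has fewer than three
    agglomerations over the threshold.

    x = second largest / third largest
    y = largest / second largest
    """
    large = sorted((p for p in populations if p > THRESHOLD), reverse=True)
    if len(large) < 3:
        return None
    first, second, third = large[:3]
    return (second / third, first / second)


# Hypothetical figures, purely for illustration.
hypothetical_countries: Dict[str, List[int]] = {
    "Country A": [12_000_000, 3_000_000, 1_500_000],  # one dominant city
    "Country B": [5_000_000, 4_000_000, 3_200_000],   # three similar cities
    "Country C": [2_000_000, 1_200_000, 800_000],     # only two over a million
}

points = {}
for country, populations in hypothetical_countries.items():
    point = ratio_point(populations)
    if point is not None:
        points[country] = point

print(points)  # {'Country A': (2.0, 4.0), 'Country B': (1.25, 1.25)}
```

A country dominated by one agglomeration lands high on the y axis, while one with three of similar size sits near (1, 1).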
by James Tauber : Created on Feb. 2, 2009 : Last modified Feb. 2, 2009 : Categories human_development : 0 comments (permalink)
Moving to Distutils
reposted from http://code.google.com/p/django-hotclub/wiki/MovingToDistutils
the how and why of Pinax's move to distutils
Pinax is changing the way that external dependencies are brought in during development on trunk. Note that this document is only talking about changes in how things work and will work on trunk, NOT necessarily how they will work with a released version of Pinax.
Until recently, Pinax had two choices for a given external dependency:
- use svn:externals and point to the external dependency's svn repository
- include the external dependency code in the Pinax codebase
However, there are problems with this approach:
- it largely relies on external dependencies being in svn and this is increasingly not the case (although it was when Pinax started)
- it makes it difficult for Pinax itself to move away from svn
- there is no management of dependencies between external dependencies, nor between particular projects in Pinax and their individual dependencies
To solve these problems and more, Pinax is switching to a distutils-based approach. This means:
- external dependencies are encouraged to be released as distutils-compliant packages with a valid setup.py and put on PyPI
- development versions of dependencies can be pulled in in a variety of different ways including from git, hg or bzr repositories
In order to develop from the Pinax trunk, you will need to use pip. Because some external dependencies are retrieved via git and bzr you will also need those if using Pinax trunk.
Although we will eventually have per-project requirements files, there are currently two requirements files that describe to pip what dependencies to bring in and how:
- pinax/requirements/libs.txt
- pinax/requirements/external_apps.txt
The former is actually a requirement of the latter so you can bring in all external dependencies with:
pip install -r pinax/requirements/external_apps.txt
We strongly recommend the use of virtualenv in conjunction with pip to allow isolated environments to be set up without Pinax having to hack PYTHONPATH.
by James Tauber : Created on Jan. 31, 2009 : Last modified Jan. 31, 2009 : Categories python django pinax : 2 comments (permalink)
Retweet
@gruber: Theory: the people who send out “RT” tweets are the same fuckers who came in and ruined Usenet a decade ago.
@jtauber: couldn't resist: RT @gruber Theory: the people who send out “RT” tweets are the same fuckers who came in and ruined Usenet a decade ago.
@akuchling: @jtauber ouch! Now I'll feel guilty every time.
@pydanny: @jtauber So very true!
@jtauber: @akuchling I disagree with him but couldn't resist the irony
@jtauber: @pydanny I was just trying to be ironic :)
@curtclifton: @jtauber LOL
@edwelker: Damn straight.
@TokyoDan: @jtauber How’s that? I RT when I think people who follow me might find the subject interesting. And it gets the originator more followers.
@jtauber: okay, I think I'm going to have to spell this out: I don't mind RTs. I was just RTing @gruber's dislike of RTs to be funny
@evilrob: @jtauber: you don't like RTs?
@brosner: RT @evilrob: @jtauber: you don't like RTs?
@20seven: RT @brosner: RT @evilrob: @jtauber: you don't like RTs?
@jtauber: okay, this whole RT thread needs to go on xkcd right now
@mak1e: RT @brosner: @evilrob: @jtauber: you don't like RTs?
@ericflo: RT @20seven RT @brosner: RT @evilrob: @jtauber: you don't like RTs?
@bryanveloso: RT @ericflo: RT @20seven RT @brosner: RT @evilrob: @jtauber: you don't like RTs?
UPDATE: Okay, I now agree with @gruber
by James Tauber : Created on Jan. 28, 2009 : Last modified Jan. 28, 2009 : 6 comments (permalink)
Serving Up User Contributed Media From A Separate Server
One commonly recommended practice in Django (although applicable elsewhere) is to serve up your static media from a different server than the one running Django for dynamic pages.
This becomes a slight challenge when you have user-contributed media (like allowing users to upload photos).
Here are some possibilities I can think of:
- mount the media server on the Django server via NFS so that Django writes directly to the media server
- have the Django server write locally but then subsequently move the files to the media server via rsync (or similar)
- like the previous case but to avoid 404s before the file has been moved, the relevant model has a flag for whether the file is ready or not and Django returns an "in progress" message until the file is done (not sure how Django would know the file has successfully been moved, though, unless a Django-based worker process was doing it instead of just rsync)
- similar to the previous two cases but initially store the image on (and serve it from) the Django server until the flag says the URL to the media server can be used
- have Django running on the media server just for the purpose of receiving media POSTs (but no dynamic page generation)
- have some other WSGI (or similar) process running on the media server to receive media (and then communicate with the Django server on the backend)
I guess S3-based solutions add some extra issues but ideas 2, 3 and 4 would be applicable.
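To make the fourth idea a little more concrete, here is a rough sketch in plain Python rather than Django. Everything in it is an assumption for illustration: the `Upload` class stands in for a model with a boolean flag, the two base URLs are hypothetical, and `shutil.copy2` between two local directories stands in for rsync (or an S3 upload) run by a worker process.

```python
import shutil
import tempfile
from dataclasses import dataclass
from pathlib import Path

LOCAL_URL = "http://app.example.com/uploads/"    # hypothetical: served by the Django box
MEDIA_URL = "http://media.example.com/uploads/"  # hypothetical: served by the media box


@dataclass
class Upload:
    """Stand-in for a model row describing one uploaded file."""
    filename: str
    on_media_server: bool = False

    def url(self) -> str:
        # Serve locally until the worker has confirmed the transfer.
        base = MEDIA_URL if self.on_media_server else LOCAL_URL
        return base + self.filename


def transfer(upload: Upload, local_dir: Path, media_dir: Path) -> None:
    """Copy one file across and flip the flag only once the copy checks out."""
    source = local_dir / upload.filename
    target = media_dir / upload.filename
    shutil.copy2(source, target)  # stand-in for rsync or an upload
    if target.stat().st_size == source.stat().st_size:
        upload.on_media_server = True


with tempfile.TemporaryDirectory() as tmp:
    local_dir, media_dir = Path(tmp, "local"), Path(tmp, "media")
    local_dir.mkdir()
    media_dir.mkdir()
    (local_dir / "photo.jpg").write_bytes(b"not really a jpeg")

    upload = Upload("photo.jpg")
    print(upload.url())  # http://app.example.com/uploads/photo.jpg
    transfer(upload, local_dir, media_dir)
    print(upload.url())  # http://media.example.com/uploads/photo.jpg
```

Because the worker that moves the file is also the thing that sets the flag, the question of how Django would know the move has finished goes away, at the cost of running that worker somewhere with access to the database.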
Anyone have experiences (good or bad) with any of these? Any possibilities I'm missing?
by James Tauber : Created on Jan. 26, 2009 : Last modified Jan. 26, 2009 : Categories django web : 16 comments (permalink)