James Tauber's Blog 2005/07
Home
Finally home in Perth after almost six months away.
Flights were uneventful other than losing my luggage somewhere between Boston and Brisbane. Still no word whether they've found it.
I have a tremendous amount to catch up on, not sure where to start.
Stay tuned. I'll be back to blogging lots now.
UPDATE (2005-08-02): Luggage has been found and returned.
by jtauber : Created on Aug. 1, 2005 : Last modified Aug. 2, 2005 : Categories personal travel : 1 comment (permalink)
I'm Proud
I'm proud of Elliot Cohen, who I'm mentoring in Google's Summer of Code. He's writing unit tests before implementing code.
Be careful Elliot; it's addictive :-)
by jtauber : Created on July 27, 2005 : Last modified July 27, 2005 : Categories software_craftsmanship summer_of_code : 0 comments (permalink)
Last Days
These are the last few days of my marathon trip to the US. On Friday, I'll be heading home after being away for almost six months.
Things are very busy at work (in a good way) and that, combined with preparations for leaving (like working out how to ship all the stuff I've accumulated here, in Austin, Palm Beach and Europe) means that I probably won't have time for much else until I get home on Sunday.
Apologies for the paucity of posts of late. I'll be back in full force next week :-)
by jtauber : Created on July 27, 2005 : Last modified July 27, 2005 : 0 comments (permalink)
Leonardo and Atom 1.0
Dave Warnock has been working on Atom 1.0 support in Leonardo. We decided it would be a good opportunity to start a better separation of the Leonardo core from individual plugins, so he is working on the plugin itself and I'll work on updates to the core, which I'll release as 0.7.
by jtauber : Created on July 23, 2005 : Last modified July 23, 2005 : Categories python leonardo : (permalink)
Kaju Katli
My favourite Indian sweet is Kaju Katli.
Kaju means cashew. Are the two words cognate? Or is one a loan word (and in which direction?)
UPDATE (2005-07-22): Merriam-Webster claims cashew is from the Portuguese acajú. Platt's Dictionary of Urdu, Classical Hindi and English says that काजू is probably from...you guessed it—the Portuguese word acajú. But even the Portuguese acajú is just a loan word from the Tupi acajú.
by jtauber : Created on July 22, 2005 : Last modified July 22, 2005 : Categories linguistic_observations : 3 comments (permalink)
Indexing Time
Dave Warnock and I have been talking about indexing entries in Leonardo by last updated time.
We want to be able to retrieve the entries between A and B, or the n entries after A or the n entries before B where A and B can be either ordinals or times.
I'm guessing the right way of doing it would be some sort of balanced tree.
The nature of the data is that insertions will almost entirely be at the end, retrieval will largely be at the end and deletions will probably be fairly distributed.
Just as a preliminary, I've written an unbalanced tree, although I haven't finished implementing the kinds of queries we want to be able to do on the tree.
Any suggestions on algorithms and/or implementations in Python?
Most implementations don't seem to come out of the box with the kind of "slice" queries we want to do (or even both key- and ordinal-based queries).
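As a point of comparison, here is a minimal sketch of the queries described above. It uses a sorted list with Python's bisect module rather than a balanced tree. That is a swapped-in technique, and it only makes sense because insertions are almost always at the end; a deletion in the middle costs O(n). The class and method names (TimeIndex, between_times and so on) are hypothetical and not part of Leonardo.

```python
import bisect


class TimeIndex:
    """Entries kept sorted by last-updated time (hypothetical helper)."""

    def __init__(self):
        self._times = []  # sorted timestamps
        self._ids = []    # entry ids, parallel to _times

    def add(self, time, entry_id):
        # Appending at the end is the common case; bisect keeps order otherwise.
        i = bisect.bisect_right(self._times, time)
        self._times.insert(i, time)
        self._ids.insert(i, entry_id)

    def remove(self, time, entry_id):
        # Walk over entries sharing the same timestamp to find the right id.
        i = bisect.bisect_left(self._times, time)
        while i < len(self._times) and self._times[i] == time:
            if self._ids[i] == entry_id:
                del self._times[i], self._ids[i]
                return
            i += 1
        raise KeyError(entry_id)

    def between_times(self, a, b):
        """Entries with a <= time < b."""
        lo = bisect.bisect_left(self._times, a)
        hi = bisect.bisect_left(self._times, b)
        return self._ids[lo:hi]

    def between_ordinals(self, a, b):
        """Entries at positions a <= i < b in time order."""
        return self._ids[a:b]

    def after(self, a, n):
        """Up to n entries at or after time a."""
        lo = bisect.bisect_left(self._times, a)
        return self._ids[lo:lo + n]

    def before(self, b, n):
        """Up to n entries strictly before time b."""
        hi = bisect.bisect_left(self._times, b)
        return self._ids[max(0, hi - n):hi]
```

Because the ordinal of an entry is just its position in the list, key-based and ordinal-based slices come from the same structure. A balanced tree would need subtree sizes stored in each node to answer ordinal queries.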
by jtauber : Created on July 19, 2005 : Last modified July 19, 2005 : Categories python leonardo algorithms : 1 comment (permalink)
Problems Setting Up Firewall on OS X to Accept Mail
I'm really struggling with setting up my remote mac mini to receive mail. The problems seem to be in the configuration of ipfw.
Even with:
allow tcp from any to any dst-port 25 in
ipfw is logging a Stealth Mode connection attempt when I attempt to send mail to it.
Any ideas?
UPDATE (2005-07-18): It wasn't the firewall, it was Postfix. Apple's default main.cf file sets inet_interfaces twice and I had made changes to the first instance (which was then overridden by the second).
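A quick way to catch this kind of problem is to scan main.cf for parameters that are set more than once. The following is a rough sketch rather than a real Postfix parser: it assumes simple "name = value" lines and skips continuation lines, and the function name and sample values are made up for illustration (they are not Apple's actual defaults).

```python
from collections import defaultdict


def duplicate_settings(text):
    """Map each parameter set more than once to a list of (line_no, value)."""
    seen = defaultdict(list)
    for line_no, line in enumerate(text.splitlines(), start=1):
        # Skip blank lines, comments and continuation lines (leading whitespace).
        if not line.strip() or line.lstrip().startswith("#") or line[0].isspace():
            continue
        name, sep, value = line.partition("=")
        if sep:
            seen[name.strip()].append((line_no, value.strip()))
    return {name: hits for name, hits in seen.items() if len(hits) > 1}


# Illustrative content only, not the real default file.
sample = (
    "inet_interfaces = all\n"
    "# a comment\n"
    "mydomain = example.com\n"
    "inet_interfaces = localhost\n"
)
print(duplicate_settings(sample))
```

Any parameter that shows up in the result is worth checking, since the later setting is the one that takes effect.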
by jtauber : Created on July 18, 2005 : Last modified July 18, 2005 : Categories os_x : (permalink)
Equivalence Relations
To formalize path homotopy as a way of distinguishing certain topological spaces, we need to introduce the notion of an equivalence relation and an equivalence class. We'll introduce the former here.
Consider a set A of objects. We pick certain pairs of elements in A and say they have a particular relation to one another. In other words, a relation R on A (or more accurately, a binary relation on A) is simply a choice of pairs—a subset of A x A. If <a, b> is in R then we say that a has the relation R with b. We can also write this aRb.
If A is a set of people, R might be something like "is the father of". And so if d is Darth Vader and l is Luke Skywalker, then dRl.
A relation is said to be an equivalence relation iff it is a relation with the following properties:
- Reflexivity: xRx for all x in A
- Symmetry: if xRy then yRx
- Transitivity: if xRy and yRz then xRz
Our "is the father of" relation violates all three and so is certainly not an equivalence relation.
Something like "is less than" on the set of reals is transitive but not reflexive or symmetric and so is not an equivalence relation.
Something like "is less than or equal to" on the set of reals is transitive and reflexive but still not symmetric and so is not an equivalence relation.
Equality is an equivalence relation as it has all three necessary properties. Two topological spaces being homeomorphic is also an equivalence relation.
Importantly for us, path homotopy is an equivalence relation.
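For a finite set, the three properties can be checked mechanically. This sketch represents a relation as a Python set of pairs, matching the "subset of A x A" definition above. It uses a small range of integers as a stand-in for the reals, and the function names are my own choice.

```python
from itertools import product


def is_reflexive(A, R):
    return all((x, x) in R for x in A)


def is_symmetric(R):
    return all((y, x) in R for (x, y) in R)


def is_transitive(R):
    # Whenever xRy and yRz both hold, xRz must hold too.
    return all((x, z) in R for (x, y) in R for (w, z) in R if y == w)


def is_equivalence(A, R):
    return is_reflexive(A, R) and is_symmetric(R) and is_transitive(R)


A = range(4)
less_than = {(x, y) for x, y in product(A, A) if x < y}
less_or_equal = {(x, y) for x, y in product(A, A) if x <= y}
equal = {(x, y) for x, y in product(A, A) if x == y}

for name, R in [("<", less_than), ("<=", less_or_equal), ("==", equal)]:
    print(name, is_reflexive(A, R), is_symmetric(R), is_transitive(R))
```

Only equality passes all three checks, in line with the examples above.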
UPDATE: next post
by jtauber : Created on July 16, 2005 : Last modified Sept. 28, 2005 : Categories poincare_project : (permalink)
Parts of Speech and Number of Accents
I thought I'd write a quick Python script to check how many accents were on each of the lemmata in MorphGNT 5.06.
Here are the counts by part of speech and number of accents on lemma:
| POS | 0 | 1 | 2 |
| A | - | 9159 | - |
| C | 924 | 17361 | - |
| D | 1592 | 4606 | - |
| I | - | 17 | - |
| N | 30 | 28271 | 1 |
| P | 5433 | 5488 | - |
| RA | 19862 | 4 | - |
| RD | - | 1744 | - |
| RI | - | 1165 | - |
| RP | - | 11584 | - |
| RR | - | 1677 | - |
| V | 8 | 28101 | 1 |
| X | 147 | 844 | - |
Some of the low numbers are definitely errors in the database. Now to investigate...
UPDATE (2005-07-16): both 2-accent cases were mistakes. The 30 0-accent nouns and 5 of the 0-accent verbs were foreign loan words that intentionally weren't accented but 3 of the 0-accent verbs were mistakes. The 4 accented articles were the result of crasis with the following noun and the word should probably be analyzed as a noun rather than an article. I guess there'll be a 5.07 release soon. NOTE: I haven't looked at the particles, adverbs, conjunctions or prepositions yet.
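For anyone wanting to reproduce this kind of count, here is a minimal sketch. It is not the original script. It assumes the lemmata are available as Unicode strings and that the input is a sequence of (part of speech, lemma) pairs, which is a hypothetical format and not necessarily how the MorphGNT file is laid out. Accents are counted by decomposing each character and looking for the combining grave, acute and circumflex (perispomeni) marks; breathings are ignored.

```python
import unicodedata
from collections import Counter

# Combining grave, acute and Greek perispomeni (circumflex).
ACCENTS = {"\u0300", "\u0301", "\u0342"}


def count_accents(word):
    """Number of accent marks on a word, ignoring breathings and other marks."""
    decomposed = unicodedata.normalize("NFD", word)
    return sum(1 for ch in decomposed if ch in ACCENTS)


def tabulate(rows):
    """Count occurrences by (part of speech, number of accents on the lemma)."""
    counts = Counter()
    for pos, lemma in rows:
        counts[pos, count_accents(lemma)] += 1
    return counts


# Hypothetical sample rows, for illustration only.
rows = [("N", "λόγος"), ("RA", "ὁ"), ("N", "πνεῦμα")]
for (pos, n), total in sorted(tabulate(rows).items()):
    print(pos, n, total)
```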
by jtauber : Created on July 16, 2005 : Last modified July 16, 2005 : Categories morphgnt : (permalink)
MorphGNT 5.06 Released
Well, it's been about a hundred hours work over the last six months, but I'm pleased to announce the release of a new version of MorphGNT, the morphologically parsed Greek New Testament database made available under a Creative Commons license.
Besides some corrections to the text (mostly rho-breathing) and a couple of parsing code changes, this release has a huge number of corrections to the lemmata—160 lemma changes in 465 places. See this blog entry for how potential errors for this round of corrections were discovered.
You can download the new file at:
by jtauber : Created on July 16, 2005 : Last modified July 16, 2005 : Categories announcements morphgnt : 1 comment (permalink)
Thank You Fred Smith
At 5pm yesterday, I took my Mac Mini to the local FedEx store in Burlington, Mass. By 10.30am this morning it will be at macminicolo.net in Nevada ready to be installed in their data center. There is a good chance the machine will be online by 5pm today.
FedEx really is an amazing thing.
UPDATE (2005-07-15): Yep. It's online (and I've corrected the link to macminicolo.net — thanks to Joe Weaks)
by jtauber : Created on July 15, 2005 : Last modified July 16, 2005 : 2 comments (permalink)
Headless Tiger
I've created a page to put my ongoing notes on running non-Server Mac OS X 10.4 (Tiger) remotely via ssh.
See Headless Tiger.
I've opened up comments on that page so you can add tips.
by jtauber : Created on July 14, 2005 : Last modified July 14, 2005 : Categories os_x : (permalink)
Updating OS X From Command Line
As I get ready to send my Mac Mini off to the data center, I've been seeing just how much I can do via ssh.
So far so good.
As 10.4.2 just came out, the big question was whether one can do a software update from the command line.
Sure enough, it's possible:
sudo softwareupdate -l
will list the available updates.
sudo softwareupdate -i -r
will install recommended updates.
See
man softwareupdate
for more information.
UPDATE (2005-07-14): Now see Headless Tiger.
by jtauber : Created on July 13, 2005 : Last modified July 14, 2005 : Categories os_x : 0 comments (permalink)
Summer of Code Blogs
Elliot Cohen, whose Summer of Code project I am mentoring, has started a project blog at http://elliotpbnt.blogspot.com/
I also noticed that http://planet.python.org/ has started aggregating a bunch of SoC project blogs (including Elliot's)
by jtauber : Created on July 12, 2005 : Last modified July 12, 2005 : Categories python software_craftsmanship summer_of_code : 0 comments (permalink)
Isometric Games in Python
A couple of months ago, I started investigating free libraries for developing isometric games in Python.
I found the pygame-based project Pyplace but there hadn't been a release since 2001.
So I decided to start my own, which I've called pyso.
As a starting point, in particular because I have no experience with either pygame or writing isometric games, I've just cleaned up Pyplace (which was, how shall I say this politely, quite idiosyncratic in parts).
You can get my initial effort at:
It currently is really just the last Pyplace release taken apart, cleaned up a little and put back together again.
The next release will likely be quite different and more my own work.
by jtauber : Created on July 10, 2005 : Last modified July 10, 2005 : Categories python announcements pyso : 13 comments (permalink)
Dr Seuss's Oscars
I was looking at Amazon's List of Bestselling Authors and noticed the claim that Dr Seuss (which rhymes with "voice", by the way) won three Academy Awards.
However, a look at his award page on IMDb doesn't list any.
Turns out that two films he co-wrote won Oscars (one for animated short and one for feature-length documentary) but, of course, those Oscars go to the producer(s).
Haven't found the third yet.
by jtauber : Created on July 10, 2005 : Last modified July 10, 2005 : 0 comments (permalink)
Leonardo 0.6.2 Released
I am pleased to announce the release of Leonardo 0.6.2.
Leonardo is the Python-based content management system that runs this site and provides blogging and wiki-style content.
This is a major bug fix release which:
- corrects Not Modified handling that prevented the Atom feed from being readable in some cases
- adds support for HTTP HEAD requests (required for things like right-click save as in some browsers)
- removes stray print statements that prevented use of the wikiBNL formatter
You can download it at:
http://jtauber.com/2005/leonardo/leonardo-0.6.2.tgz
by jtauber : Created on July 9, 2005 : Last modified July 9, 2005 : Categories python announcements leonardo : (permalink)
Homotopy as a Way of Distinguishing Topological Spaces
Path homotopy can be used to distinguish topological spaces that otherwise share the same topological properties.
For example, consider two topological spaces that locally resemble R^2 but globally look like the following:
In other words, both are compact manifolds and the one on the right differs from the one on the left in that it has a "hole" in the middle.
Are the two homeomorphic? Our intuition tells us not because of the hole in the one on the right. But if they are not homeomorphic, there must be a topological property that one has that the other does not.
We'll get to what that property is formally later, but for now, I want to show informally that homotopy is the key.
Look at the two paths, f and g on each of the topological spaces. In the space on the left, they are path homotopic whereas on the right, they are not. In other words, the existence of the hole means that not all paths with the same start and end are path homotopic to one another.
There's no way you can continuously transform f to g when there is a hole between them.
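One informal way to see the obstruction numerically is to follow f from start to end, come back along g, and count how many times that closed loop goes around the hole. This sketch assumes the space is the plane with a single point removed, which simplifies the picture here and is not something this post defines. In that setting, a non-zero count means f cannot be deformed into g with the endpoints fixed. The function and variable names are made up for illustration.

```python
import math


def winding_number(loop, hole=(0.0, 0.0)):
    """How many times a closed polygonal loop winds around the point `hole`."""
    hx, hy = hole
    total = 0.0
    # Pair each vertex with the next one, wrapping around to close the loop.
    for (x0, y0), (x1, y1) in zip(loop, loop[1:] + loop[:1]):
        a0 = math.atan2(y0 - hy, x0 - hx)
        a1 = math.atan2(y1 - hy, x1 - hx)
        d = a1 - a0
        # Wrap the angle change into (-pi, pi] so each step takes the short way.
        while d > math.pi:
            d -= 2 * math.pi
        while d <= -math.pi:
            d += 2 * math.pi
        total += d
    return round(total / (2 * math.pi))


n = 50
# f goes from (-1, 0) to (1, 0) over the top of the hole, g goes underneath.
f = [(math.cos(math.pi - math.pi * k / n), math.sin(math.pi - math.pi * k / n))
     for k in range(n + 1)]
g = [(math.cos(math.pi + math.pi * k / n), math.sin(math.pi + math.pi * k / n))
     for k in range(n + 1)]

# Go out along f, then back along g.
loop = f + g[::-1]

print(winding_number(loop, hole=(0.0, 0.0)))
print(winding_number(loop, hole=(5.0, 5.0)))
```

With the hole at the origin, between the two paths, the count is non-zero. Moving the hole well away from both paths gives zero, which corresponds to the space on the left, where f and g are path homotopic.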
We'll explore this idea a little more formally over the next couple of weeks and then we'll finally be able to state the Poincaré Conjecture.
UPDATE (2005-07-10): I just changed "homotopic" to "path homotopic" twice in the third last paragraph. It's important that we're talking about path homotopy not just homotopy here as we require the start and end of the path to remain fixed during the transformation from f to g.
UPDATE: next post
by jtauber : Created on July 9, 2005 : Last modified July 10, 2005 : Categories poincare_project : 4 comments (permalink)
Switching to Dedicated Hosting
Thanks to Mac Mini colocation services like macminicolo, it's now cost effective for me to switch some (and perhaps eventually all) of my web sites to dedicated hosting.
So I got an account with macminicolo and ordered my Mac Mini which arrived yesterday.
One thing that I need to experiment with before I send it off to the datacenter is whether I'll be able to remotely manage it effectively as-is or whether I'll need to buy something like Apple Remote Desktop.
by jtauber : Created on July 9, 2005 : Last modified July 9, 2005 : (permalink)
Simulating Mechanical Clock Movement
When we were in Switzerland we spent a bit of time in stores and I spent most of that time studying pendulum clocks whose movement was exposed.
I was delighted to discover an almost identical approach in every single case: a gear train with weights causing torque on the slowest moving gear and a pendulum connected to a piece (a type of what I later found is called an escapement) that regulates the motion of the fastest moving gear. (Wikipedia has a nice diagram showing an escapement in action).
Of course, the devil is in the detail, but the pattern was enough to get me excited about getting deeper into horology.
I've been thinking since about simulating the movement in software. I wonder how easy it would be to build something in ODE, the Open Dynamics Engine, which I know has a Python binding.
by jtauber : Created on July 8, 2005 : Last modified July 8, 2005 : Categories python horology : 1 comment (permalink)
Interesting Observations Come With Ambiguity
In an email to the Leonardo mailing list, I almost said:
If I use Kid, I'll ship Leonardo with it.
but then was worried that would be interpreted the wrong way around. So I considered saying:
If I use Kid, I'll ship it with Leonardo.
but was still worried that it would be interpreted the wrong way around.
A similar incident happened a few weeks ago when I was talking to my colleague James Marcus about whether he had the right A to use with B. I said:
I'm sure A comes with B.
and he looked confused. I realised he thought I was suggesting that A includes B (rather than the other way around)
Sentences of the form:
- A ships with B; or
- A comes with B
are strange in that the relationship between A and B is clearly not symmetrical and yet, for me at least, A and B are often syntactically interchangeable.
Even if I clearly intend to express that A includes B, either of the following in most cases conveys that to me:
- A comes with B
- B comes with A
I wonder if there are other phrasal verbs in English that have clearly distinct grammatical roles but ambiguous syntactic position.
by jtauber : Created on July 8, 2005 : Last modified July 8, 2005 : Categories linguistic_observations : 0 comments (permalink)
My Sister and Holly Lisle Meet in the Blogosphere
My sister, Jenni, is a 19-year-old aspiring fantasy writer. Her favourite author for many years has been Holly Lisle.
Well, Jenni just discovered that her blog is listed on Holly Lisle's blogroll.
Isn't the blogosphere great!
by jtauber : Created on July 6, 2005 : Last modified July 6, 2005 : (permalink)
MorphGNT Roadmap
This month I should be doing another release of my morphologically-parsed Greek New Testament. This will be release 5.06.
I thought I'd outline my future plans (as they currently stand).
At some point, I'll start doing 6.xx releases. This will involve a format change that includes some more information. I'll probably continue the 5-series releases for people used to the format. The 5-series data is just a subset of the 6-series data so it's always possible (and easy) for me to generate a 5 from a 6.
From Series-7, MorphGNT's format will likely change dramatically to adopt a graph structure rather than a simple tabular structure. This will enable much greater extensibility and annotation.
Series-7 will be the last that is based on the CCAT database. From Series-8 onwards, the data will hopefully be completely the results of my own parsing work.
First things first, though—getting 5.06 out. I'm down to 299 mismatches to resolve.
by jtauber : Created on July 4, 2005 : Last modified July 4, 2005 : Categories morphgnt : 1 comment (permalink)
Developments on Atlanta Reality Show
This next week is an exciting week for the reality show concept I've been helping out Tom Bennett on. I can't say too much more at this point other than that a bunch of industry veterans who've seen the demo love the concept and it will be shown to some important people this week.
by jtauber : Created on July 3, 2005 : Last modified July 3, 2005 : 0 comments (permalink)
Unreadable Canon RAW Files on Compact Flash
Before going on my trip to Europe, I switched my Canon 10D to RAW mode and bought two 1.0GB compact flash cards.
Half way into the trip, my camera started getting "Err 99" problems. I lost a lot of shooting opportunities re-booting the camera after each error, but when a photo did successfully get taken, I had no problems downloading it to iPhoto on my PowerBook.
Then, on the second last day, I was transferring one of the 1GB cards to my PowerBook and it complained that the files were not a recognizable format. Judging from the fact the .CRW files were sitting in a temp directory, the transfer seemed to go okay. And I can view the photos without issue on the camera itself.
Anyone experienced this problem before? Any ideas how I can recover at least the embedded JPEGs from the CRWs?
I'm going to have to send the camera in to Canon to get the Err 99 problems fixed. That I can live with. Losing 160 photos is more upsetting.
by jtauber : Created on July 2, 2005 : Last modified July 2, 2005 : Categories photography : 7 comments (permalink)
More Old XML Posts
Trying to dig up some old posts on behaviour sheets, I came across two interesting posts I made to xsl-list back in August 1998:
http://www.biglist.com/lists/xsl-list/archives/199808/msg00053.html
My feeling on the issue is that a spec be developed for tree addressing patterns that serves the needs of both XPointers and XSL patterns. Such a spec could stand apart (but be normative to) both XLink and XSL.
http://www.biglist.com/lists/xsl-list/archives/199808/msg00134.html
It occurs to me that maybe the formatting objects could be separate too. I would actually like XSL to consist of three separate things:
1. Pattern Language for Tree Addressing; 2. DTD and Specification of Formatting Objects; 3. Specification for Stylesheets themselves, Tree Transforms, etc.
Given that XPath, XSL-FO and XSLT now have very separate existences, it's funny to think they started off as essentially one spec.
by jtauber : Created on July 2, 2005 : Last modified July 2, 2005 : 0 comments (permalink)
Behaviour Sheets Becoming A Reality
In the first couple of years of XML, I remember having discussions with people like Steve Ball and Paul Prescod about a hypothetical beast we called "behaviour sheets". The idea was that, just like stylesheets associate a style with particular elements or patterns of elements, a "behaviour sheet" associates behaviour (e.g. what to do when clicked on or moused over or dragged) with particular elements or patterns of elements.
Netscape submitted a spec to the W3C, although they called them Action Sheets.
Well, the idea (and an implementation) has emerged again in the form of a Javascript library called Behaviour by Ben Nolan. Ben's a Kiwi so he spells it correctly too! :-)
by jtauber : Created on July 2, 2005 : Last modified July 2, 2005 : 7 comments (permalink)
Mount Pilatus Myst
Is it just me or is there something Myst-like about this shot I took from the top of Mount Pilatus in Lucerne?

by jtauber : Created on July 2, 2005 : Last modified July 2, 2005 : (permalink)
Path Homotopy
Previously we defined the notion of homotopy.
Two functions that are continuous deformations of one another are homotopic even if the two functions aren't paths.
But if the two functions are paths, then we can further define a stricter notion called path homotopy.
Two paths are path homotopic iff they are homotopic and they have the same start point and end point throughout the deformation.
In other words, if our paths are functions f and g from the interval [0, 1] to a topological space X, then path homotopy means not only the existence of a continuous map F : [0, 1] x [0, 1] -> X where
- F(x, 0) = f(x)
- F(x, 1) = g(x)
but also that:
- F(0, t) = f(0) = g(0)
- F(1, t) = f(1) = g(1)
for all t in [0, 1].
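As a worked example, suppose X is a convex subset of R^n (an assumption made just for this example) and f and g are any two paths in X with f(0) = g(0) and f(1) = g(1). Then the straight-line interpolation between them satisfies all four conditions:

```latex
\begin{align*}
F(x, t) &= (1 - t)\, f(x) + t\, g(x), \qquad x, t \in [0, 1] \\
F(x, 0) &= f(x) \\
F(x, 1) &= g(x) \\
F(0, t) &= (1 - t)\, f(0) + t\, g(0) = f(0) = g(0) \\
F(1, t) &= (1 - t)\, f(1) + t\, g(1) = f(1) = g(1)
\end{align*}
```

F is continuous because f and g are, and convexity guarantees that F(x, t) stays inside X. So in a convex space any two paths with the same endpoints are path homotopic.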
UPDATE: next post
by jtauber : Created on July 1, 2005 : Last modified July 1, 2005 : Categories poincare_project : 1 comment (permalink)
Summer of Code Kick-off
Congratulations to Elliot Cohen and all the other successful applicants to Google's Summer of Code.
I will be mentoring Elliot's project to create a Python library for Bayesian networks. Thank you to the Python Software Foundation for giving me the opportunity to do this.
The Summer of Code requires the project be hosted by a site like SourceForge. Much to my delight, Elliot is keen to use Subversion rather than CVS so we're likely going to give BerliOS a go. BerliOS uses the SourceForge code but already has support for Subversion.
I've also suggested Elliot start a blog and wiki.
by jtauber : Created on July 1, 2005 : Last modified July 1, 2005 : Categories python software_craftsmanship summer_of_code : 0 comments (permalink)