Wednesday, February 11, 2004

Polyblog II:

Hmm. I recently got a hit from Mexico, looking for "pleibol," that led me to rediscover the Google translator, so I decided to see what my page looked like in languages Google handles that I can actually understand.

Here's my ranking of the machine-translated versions, from the best translation to the worst:

1. French;
2. Spanish;
3. Portuguese;
4. Italian; and
5. German.

This ranking has several problems. First, it's done by me, and therefore also reflects my decreasing understanding of these languages (except for French and Spanish - I'm quite confident that Google is having more trouble translating my page into Spanish than into French). Second, it's a translation of pieces written by me, and they tend to be difficult: odd sentence structure, odd diction, and so on. I'm not an easy read, so I'm sure I'm an even harder translation.

Google seems to be getting ready to offer translation into more languages - I tried changing the hl flag to sv for Swedish and ru for Russian, and although the page itself is not translated, the upper frame that Google returns is in Swedish or Russian.

I also tried translating my page from these languages into English by simply switching the hl and sl flags. The results are odd, to say the least. I'm asking the machine to translate English into English, but to listen with a French, Spanish, Portuguese, Italian or German ear.
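For anyone who wants to repeat the experiment without editing the address bar by hand, here is a rough sketch of the flag juggling in Python. Only the hl and sl flags come from what I describe above; the host, the path, and the u parameter holding the page address are placeholders I made up for illustration, not Google's actual URL layout.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode


def set_flag(url, name, value):
    """Return url with the query flag `name` set to `value`."""
    parts = urlsplit(url)
    params = dict(parse_qsl(parts.query, keep_blank_values=True))
    params[name] = value
    return urlunsplit(parts._replace(query=urlencode(params)))


def swap_flags(url, a="hl", b="sl"):
    """Return url with the values of flags `a` and `b` exchanged.

    Raises KeyError if either flag is missing from the query string.
    """
    parts = urlsplit(url)
    params = dict(parse_qsl(parts.query, keep_blank_values=True))
    params[a], params[b] = params[b], params[a]
    return urlunsplit(parts._replace(query=urlencode(params)))


# Hypothetical address: the host, path and `u` parameter are assumptions.
example = "http://translate.example/translate?hl=fr&sl=en&u=http%3A%2F%2Fblog.example%2F"

swedish_frame = set_flag(example, "hl", "sv")   # the Swedish experiment
french_ear = swap_flags(example)                # the hl/sl switch
```

Both helpers leave every other part of the address untouched, so the same page is requested each time and only the language flags change.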

This has actually happened to me once - I heard English as non-speakers must hear it. For several minutes, as an in-flight announcement was being made on an Avianca flight into Bogotá, I could not tell what language was being spoken. For some reason, I could not parse the sounds and place the spaces between words correctly, and I heard a stream of gibberish. Right at the end, something clicked, and suddenly I could understand that it was English. Try as I might, I could not hear gibberish again.

It's much like when one sees an interesting pattern, only to realize after a while that it is actually a highly stylized font. Once you can read the words, it is extremely difficult to recapture the pure pattern - the 'wordness' interferes too much.