Fink, Mac Ports...


If you want to install Fink on Leopard, you'll need to go to the source release page because "there is not currently a binary release for you" (quote from the download page).

Download the package and follow the instructions found in INSTALL.html.

If you get a message like:

Use of uninitialized value in string ne at /sw/lib/perl5/Fink/ line 1579.

as happened to me when I tried to install a package, run:

fink selfupdate-rsync

and that will fix it. It looks like somebody ran into the same problem on the Fink user list.

The list of packages can be found here.

Mac Ports

MacPorts has been updated to version 1.6 recently and the web site also looks like it has been updated. The install procedure for Leopard users is straightforward. Just click on "Installing Mac Ports" and follow the instructions.

The list of available ports is here.

Learn Cocoa !

For the readers who have too much free time during the holiday, what about starting to learn programming applications the easy way ?

And I do mean easy, like not a single line of code written ! Yes, creating applications for your favorite machine can be that easy.

Scott Stevenson, the writer behind has updated his "Learn Cocoa" tutorial for Leopard. The old Tiger version is still available for those of you who have not updated your OS yet.

You'll need to install the developer tools that come with your OSX DVD and for the rest, just follow the tutorial. It is not only amazingly well written, it is also beautiful.

The Tiger version has a sequel (Learn Cocoa II) that is not yet updated for Leopard, but you should be able to make sense of most of it since it is mostly an introduction to Objective-C in the context of Cocoa applications.

Comments regarding the updated contents and Scott's replies are very useful too, in case you run into problems.

"Open With" too many applications...

All this started a few weeks ago, and it is most probably related to Leopard's Time Machine. It may be related to other things too but then I have no idea what they are. The result is that you end up with duplicates of all your applications popping up here and there, first in the Spotlight result window, then in Mail, when you right-click on an attachment, in Finder too...

Sometimes you launch the backup instead of the original, and you realize it when the backup disk starts to scramble like crazy...

Eventually, I found a solution. It is a little drastic, and maybe not all of it was necessary, but it also helped me rationalize my backup procedures.


Before:
1) Time Machine copied _all_ my disk to an external backup disk
2) I had Sync!Sync!Sync! do an extra daily backup of my whole home hierarchy (including all the "user space" applications)
3) I had Spotlight index everything

What I wanted was Spotlight indexing everything _but_ the applications.

The problem was that the Privacy setting of Spotlight did not allow for the selection of a subfolder in the Time Machine backup (like Applications, for example).


Now:
1) Time Machine copies everything but the system Applications folder, the user Applications folder, the whole Developer folder and the Downloads folder.
2) Sync!Sync!Sync! does an extra daily backup of my whole home hierarchy, including the user space Applications folder
3) Sync!Sync!Sync! also does a daily backup of the system's Applications folder
4) I have Spotlight index everything (including the Time Machine backup, of course) except the Sync!Sync!Sync! backups, where all the application duplicates are now.

I have also reindexed Spotlight using the following command in Terminal:

 $ sudo mdutil -Es / /Volumes/backup_disk_name/

And I have rebuilt the LaunchServices database, the one that contains all the "knowledge" about which application is supposed to launch which file, with the following command, also in Terminal:

 $ /System/Library/Frameworks/CoreServices.framework/\
-kill -r -domain local -domain system -domain user

Since the whole "recipe" is based on saving applications outside Time Machine, I also had to erase all the data Time Machine had accumulated for a month... Telling Time Machine to stop saving /Applications does not remove /Applications from the previous days' backup sets... There may be smarter ways to do that, but I thought that with my already existing extra daily backup I was on the safe side.

Now, I have much saner "Open With" drop down lists when I right-click a file with my mouse...

NeoOffice 2.2.2 Patch 5 and 6, OSX Security Update (and its own update), more QuickLook sites

NeoOffice 2.2.2 patch 5, then 6

The patch fixes a regression introduced in the previous patch and expands Asian text input and layout enhancements to include Chinese, Japanese, and Korean punctuation.


It seems patch 5 introduced regressions in the layout enhancements for CJK punctuation and patch 6 specifically removed all the new code introduced by patch 5.

OSX Security Update 2007-009

For OSX 10.5.1 and 10.4.11.
Available from Software Update (under the Apple menu) or from the Apple Download page.

Details are here.


Apple released an update to that security update a few days later. It seems the first update introduced a number of new crashes in Safari...


Two sites dedicated to QuickLook plugins.

It looks like the number of domain names that include "quick", "look", and "plugin" is decreasing... and

I have a preference for the second in terms of visibility and design.

Paperless office, PDF, XML, Zip...

From the Mac Dev Center, "My paperless office" by Gordon Meyer.

Gordon writes about how he used a Fujitsu ScanSnap scanner to archive all his paper documents to digital form. He also mentions DevonThink, a digital document managing system, and 2 PDF utilities: Skim, a PDF reader and note taker for OSX released under the Modified BSD License, and hence Free software, and PDFPen, a tool to organize and overwrite PDFs.

Considering the Leopard version of Preview, it looks like Skim as well as PDFPen are becoming obsolete, as some comments seem to confirm...

The other day there was a link on Slashdot pointing to Jim King's "Inside PDF" blog, about the fact that PDF 1.7 was becoming an ISO standard.

I've been reading a number of Jim's posts and here are a few that may interest some of you. They are all a little theoretical but give very interesting insights on a number of existing major documentation formats.

Quotes from the respective pages.

PDF by Design

I named this blog "Inside PDF" because I anticipated telling you a lot about PDF technology – what is inside of a PDF file and why. I have spent most of the time so far talking about PDF and standards. So, I thought it was about time to do an entry about PDF itself. I believe that PDF has been so successful because of the caldron out of which it was brewed. By 1990 Adobe was quite successful with PostScript. By then we had helped over 60 other companies make printers, image setters, and other imaging devices that used PostScript. We had also shipped Display PostScript and the Steve Jobs NeXT machine was a computer whose operating system's imaging model was Display PostScript. So Adobe had had considerable experience in displaying documents on a screen...

XML Documents

Today I hope to tie together two previous blogs about OOXML and about XML For ....

I am sure you have often heard the term “XML Document.”  I hope you realized that that term is nearly meaningless just like the term “XML.”  We should never use either in polite conversation. Let me tell you some of the totally different uses for the term “XML Document” which render it a useless term, and maybe you will agree with me to banish it from our vocabularies...

Archiving Documents

Archiving is a rather loaded word since doing it can be a widely varying activity. In many situations, archiving PDF files is a very good solution. In fact it was so attractive to some US Government agencies that they encouraged their personnel to work on an ISO committee/working group to define a special subset of PDF called PDF/A that meets their needs better than plain old PDF might...

ZIP Archives and Portable Directories

This is a topic that is dear to my heart and I would love to spur some interest in creating an open source project or something like that. Since about 1999 I have been talking to my colleagues about a concept that I call "portable directories." It is a simple idea once you "get it."

File systems, organized around the notion of directories or folders in which to collect files and other directories, have been the staple for how we save computer material on our hard drives, data CDs and DVDs, etc. I suppose it had its invention from an analogy with a file cabinet, but on the computer we can nest folders inside folders to any depth, something hard to do with real physical folders...

Automation on the Mac ! - Python

An Automator action "Run Python script" is provided by

That comes in addition to the already existing "Run AppleScript" and "Run Shell Script" provided by Apple.

This Automator action is Free Software, released under the Modified BSD License.

Apple Java 1.4 and 1.5 Update for OSX 10.4.10 and later...

Released yesterday: Java for Mac OS X 10.4, Release 6.

Make sure you don't get a version number wrong...

From the page:
Java for Mac OS X 10.4, Release 6 delivers improved reliability and compatibility for Java SE 5.0 and Java 1.4 on Mac OS X 10.4.10 and later. This release updates J2SE 5.0 to version 1.5.0_13 and Java 1.4 to version 1.4.2_16.

The release notes are here.

Happy Holiday season !

Santa Claus is coming to town !


Thanks to Jost Zetzsche and his Tool Kit Newsletter where a link to this study (441 KB PDF) was made available.

The 1.6 MB presentation file that comes with it is also very interesting.

A very thorough study on current QA tools that, of course, focuses on Windows tools and for good reason.

Excerpts from the document:

As expected, most of the respondents (141 or 86.5%) represented translation/localisation service provider companies while a few (more specifically, 11 people) were from service buyer side and 2 were software developer representatives. 3.07% of other organisations were consulting and academic institutions, and one respondent reported his organisation to be multilingual quality assurance service provider.


The most popular operating system is Microsoft Windows, and 62.58% of respondents confirm their companies work only in MS Windows with no other OS’s. Users of both Windows and MacOS who follow Windows users comprise only 19.35%. Users of three OS’s (Windows, MacOS and Unix/Linux) account for approx. 9%, and those who work under Windows and Linux comprise 7.1% of all respondents. 0.65% (1 respondent per each category) work only in MacOS, Unix/Linux and other (medical hardware) OS. So, the amount of translation professionals who never uses Windows was below 2%.


After SDL/Trados merger, SDL translation memory tools are indeed prevailing. Almost 60% of respondents use Trados and/or SDLX as their translation memory solution. Star Transit (11.11%) is the third popular TM according to the feedback, and Wordfast and Déjà Vu account for 9.8% and 7.84% respectively. Other tools mentioned were across, Idiom, Logoport, MemoQ, Lingotek, Heartsome, MultiTrans, OmegaT, WordFischer and proprietary tools. Many respondents also named Passolo, Catalyst, RC-WinTrans, Helium, LocStudio and other localisation tools which, however, are beyond the scope of the paper. 4.9% of the respondents stated they don’t use any translation memory tool at all.

NeoOffice 2.2.2 Patch 4

NeoOffice 2.2.2 Patch 4 has just been released.

It comes with (from the announcement):

  • Improved Microsoft 2007 file handling
  • Improved Asian text input and layout
  • Closed a security hole in NeoOffice Base's underlying HSQLDB database engine

Patch 4 can be downloaded from the following URL:

OpenOffice.org localization: an easy way to deal with .sdf files

What are .sdf files ?

A few days ago I wrote about the OpenOffice.org 2.4 localization update.

For some reason related to the way SUN manages the UI/Help strings, the translation source file comes in a weird format: all the XML "<" and ">" etc are escaped with "\" and the file is structured as a set of two-line pairs, the first line being the en-US original and the second line a placeholder for the target string.

This placeholder sometimes contains the en-US string and sometimes a close approximation of what the translation of the source string would be in the target language. All this is nicely embedded into a lot of meta information that makes the file impossible to parse with normal human senses...

Here is an example (without the meta information):

String in the .sdf:
\<ahelp hid=\".\" visibility=\"hidden\"\>something in the .sdf\</ahelp\>

(.sdf is the extension SUN has created to name the format)
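
To make the escaping concrete, here is a minimal Python sketch that strips and restores the backslashes on the example string. It assumes that only "<", ">" and the double quote are escaped (the "etc" above is not spelled out, so real files may well escape more), and the function names are made up for illustration; this is not part of any SUN tool.

```python
import re


def unescape_sdf(text: str) -> str:
    """Drop the backslash placed in front of <, > and " (assumed set)."""
    return re.sub(r'\\([<>"])', r'\1', text)


def escape_sdf(text: str) -> str:
    """Put a backslash back in front of every <, > and "."""
    return re.sub(r'([<>"])', r'\\\1', text)


if __name__ == "__main__":
    raw = r'\<ahelp hid=\".\" visibility=\"hidden\"\>something in the .sdf\</ahelp\>'
    readable = unescape_sdf(raw)
    print(readable)
    # The round trip should give back the original .sdf string.
    print(escape_sdf(readable) == raw)
```

Working on the unescaped form is what makes the string readable; the escaping only has to be restored when the translation goes back into the .sdf.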

SUN also provides translators with TMX files of the whole UI/Help for a number of languages (de, es, fr, hu, it, ja, ko, nl, pl, pt-BR, pt, ru, sv, zh-CN, zh-TW, at the time of this writing).

The TMX files seem to have been created not from the original XML (with nicely encapsulating TMX 1.4 level2 tags) but from the funky .sdf file. Which means that all the original XML tags are found escaped as per the .sdf, alongside the translatable contents...

So the above string would be exactly the same in the TMX:
\<ahelp hid=\".\" visibility=\"hidden\"\>something in the .tmx\</ahelp\>

How to translate that ?

So, how to practically translate such files while making use of the TMX data ?

The no brainer way...

Edit the .sdf file directly, possibly after renaming it to .csv and importing it into OpenOffice.org, where all the {tab} separated meta information fields will nicely fill their own columns and leave the translatable contents on their own...

It is not exactly translator friendly... But with a little playing with the column width you'll manage to have only the translatable parts displayed...

This procedure allows translators to separately (and manually) do searches in the TMX or the glossary (Sun Gloss) and to use the matched contents directly without having to play with the "\" too much.

It is not very practical because the TMX data is embedded in plenty of XML tags and the result is thus not exactly pretty...

The PO way

The PO way is not the best way to leverage the TMX contents. It also requires quite some editing from translators who want to use TMX matches... Still, it seems to be the most common way to localize OpenOffice.org.

PO files are provided by the team coordinators. They are created with the Translate Toolkit's oo2po tool.

The above .sdf contents would be converted like this:

\\<ahelp hid=\\\".\\\" visibility=\\\"hidden\\\"\\>something in the .po\\</ahelp\\>

The reason is that oo2po wants to be smart and adds an extra layer of escape characters (the ugly and ubiquitous "\"). And as you see above, the number of added "\" depends on what has been escaped: a simple [\] will become [\\], but [\"] will become [\\\"] because PO wants to escape both [\] and ["] with another [\]...
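
The escaping rule just described can be sketched in a few lines of Python. This is only an illustration of the rule (every "\" and every double quote gets one more "\" in front of it), not oo2po's actual code, and the function names are hypothetical.

```python
import re


def po_escape(text: str) -> str:
    """Escape backslashes first, then double quotes, as a PO string requires."""
    return text.replace("\\", "\\\\").replace('"', '\\"')


def po_unescape(text: str) -> str:
    """Undo po_escape: a backslash followed by any character yields that character."""
    return re.sub(r"\\(.)", r"\1", text)


if __name__ == "__main__":
    sdf = r'\<ahelp hid=\".\" visibility=\"hidden\"\>something\</ahelp\>'
    po = po_escape(sdf)
    print(po)
    # Removing the PO layer gives back the string as it appears in the .sdf and the TMX.
    print(po_unescape(po) == sdf)
```

Stripping that extra layer before looking up TMX matches is one way around the mismatch, at the cost of yet another conversion step.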

Now, it does not take much to see that matching that against the TMX data will be a problem. Even if the translator uses a smart PO editor to refer to the TMX there will still be a need to add all the ugly extra "\" that oo2po has added to the .sdf contents.

Basically, oo2po adds a useless extra layer of complexity to an already complex process that also happens to render TMX matching pretty much useless.

The smart way that also happens to really ease the translator's work

Here we are. Now, to keep the post to a reasonable length, let me refer you to the mail I just wrote to the OOo-l10n-dev list where everything is explained.

The idea is basically that, since the TMX matches the structure of the .sdf, then it is easier to work from the .sdf. But to make the TMX really useful it is necessary to make the .sdf contents easily handled by a tool that will also make full use of the TMX contents.

OmegaT for example...

Within OmegaT you can have automatic TMX and glossary (Sun Gloss export) matching, automatic file encoding handling, automatic file naming handling etc...

So, there is a very small Java utility sdf2txt.jar that basically extracts all the translatable contents of the .sdf file and outputs it as a "key=value" format that OmegaT can parse natively.

From there you see what needs to be done...


  • put the extracted files in the /source/ folder of your newly created OmegaT translation project,

  • put the TMXs in /tm/,

  • put the glossary files (if any) in /glossary/,

  • load the project...

and enjoy translating in a Nice and Friendly to the translator Professional yet Free Computer Aided Translation tool....

Another smart but regexpy way...

Before using the CSV trick above ensure that the line pairs are converted so that the 2 lines are put on one line.

To do that in a text editor that supports regular expressions, search for:

replace with:

Now that your .sdf is "linearized", change its name to .csv and open it in OpenOffice by using "tab" as field separator and "nothing" as text delimiter.

The tabs in the original .sdf create a number of columns from where you just need to copy the column with the en-US translatable contents.

Paste that into a text file with the ".utf8" extension, load into OmegaT... Et voilà !

Once the translation is done, you'll have to paste the contents of the translated file into the target part of the CSV file and convert the file back to a set of two-line pairs.

The pattern we need to find to revert the 1 line blocks to 2 line blocks is something like:

(something)(followed by lots of en-US stuff)a tab(the same something)(followed by lots of translated stuff)


and we need to replace it with:

Make sure there are no mistakes (if there are any they are likely to appear right in the first lines).

Now you should have your 2 lines block.
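
For those who would rather script the round trip than type regular expressions into an editor, here is a rough Python sketch of both directions. It is not the search/replace pair mentioned above: it assumes that the "something" shared by both lines of a pair is the first tab separated field, and the sample data in the usage part is invented, so check the result against a real .sdf before delivering anything.

```python
import re


def join_pairs(lines):
    """Merge each (en-US line, target line) pair into one tab-joined line."""
    if len(lines) % 2:
        raise ValueError("expected an even number of lines")
    return [src + "\t" + tgt for src, tgt in zip(lines[0::2], lines[1::2])]


# (something)(en-US stuff) TAB (the same something)(translated stuff)
# Assumption: "something" is the first tab separated field of the line.
PAIR = re.compile(r"^([^\t]+)(\t.*?)\t\1(\t.*)$")


def split_pairs(lines):
    """Turn each one-line block back into a two-line block."""
    out = []
    for line in lines:
        match = PAIR.match(line)
        if match is None:
            raise ValueError("not a joined pair: " + line)
        out.append(match.group(1) + match.group(2))
        out.append(match.group(1) + match.group(3))
    return out


if __name__ == "__main__":
    # Invented sample: first field shared, then a language code, then the text.
    sample = ["helpcontent2\ten-US\tHello", "helpcontent2\tfr\tBonjour"]
    joined = join_pairs(sample)
    print(joined)
    print(split_pairs(joined) == sample)
```

If the shared prefix also happens to appear inside the en-US text, the split point will be wrong, which is one more reason to look at the first lines of the output.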

Rename the file to .sdf and deliver...


There are plenty of ways to deal with OpenOffice.org's localization files. But to make sure that the contents of the TMX can be fully leveraged (and with close to 70,000 segments, it would be a waste if it were not) there is a real need to avoid the PO files created by oo2po. Problem is, anything that involves the .sdf files directly requires a little bit of massaging...

Ideally, SUN would provide XLIFF files that are created directly from the original XML files (and with empty targets), as well as properly encapsulating TMX files...


sdf2txt.jar has been created by Alex Buloichik. The word count included in the output may not be 100% exact but the extraction/merge works, which is what matters for now. The code is within the Jar file and the whole thing is GPLed. Thank you very much Alex.

TMX, XLIFF, etc...

Just to make sure you have them right at hand:


Localization Industry Standards Association

LISA's TMX page

LISA's standards page

LISA's Globalization Insider

LISA is also working on SRX, the segmentation exchange standard and TBX, the terminology exchange standard.


Organization for the Advancement of Structured Information Standards


OASIS' XML Daily Newslink

OASIS' Standards page (you may want to take a look at DocBook and OpenDocument)


World Wide Web Consortium

W3C's Internationalization activity page

W3C's Internationalization Tag Set page

W3C's Translated articles page


Web Hypertext Application Technology Working Group

And while we are at it, something that is likely to appear on your desktops earlier than you think, HTML 5.0

La traducción del software libre. Por Juan Rafael Fernández García.

A series of 5 articles, in Spanish, that describe everything you need to know to localize Free Software, and software with free tools too. Most of the tools described in the paper are available for the Mac.

From Juan Rafael Fernández García.

  1. Una oPOrtunidad de colaborar
    Even if we are not programmers, there is a great opportunity to contribute in the fields of documentation and translation. For once, where there is a will there is a way: do we want to be active members of the community?

  2. Los problemas de PO y el abrazo fuerte
    In the first installment of this series we talked about gettext technology; now it is time to summarize its advantages but also to point out its shortcomings. How sad the article would be if we could not also talk about the solutions, the alternatives...

  3. Memorias compartidas
    In the second installment we talked about the easy moment of destructively enumerating problems. It is now time to face those that had to do with the need to share efforts, tools and results. Let us examine the answers.

  4. ¿El momento de cambiar de herramientas?
    We cannot close the study without examining the proposals of the translation industry: the XLIFF specification and its tools. Is free software up to the task?

  5. Cerrando el ciclo
    The immediate goal of this series of articles is to bring volunteers into the translation teams; let us get to know these teams a little more closely.

TMX Editor, locale4j, File2XLIFF4j

A bunch of new exotic names for Mac...

TMX Editor

TMX Editor is a Java Swing GUI built for working with files supporting the TMX localization standard.
I found the application yesterday and tried to open an OmegaT tmx file just to see what the tool was able to do and, well, nothing happened. I sent a mail to Matthew Gagne, the project manager and received this answer a few minutes ago:

It seems as if this problem relates to the version of the TMX standard that is implemented currently. lists 1.4b as being the official revision level. The editor, however implements the newest TMX 2.0 draft standard using the locale4j library. This was a decision made when we began localization of our other projects here. Unfortunately there seems to be compatibility issues between the two TMX formats (1.4b vs 2.0).

Since TMX 2.0 is unlikely to be in widespread use for a while, Matthew also mentioned that he'd love it if Java developers could join the project to work on backward compatibility with the current TMX version.

locale4j is the library that is at the core of TMX Editor. It currently works only with TMX 2.0 data.


Both TMX Editor and locale4j are licensed under the Mozilla Public License 1.1, which is a Free Software License not compatible with the GPL.


File2XLIFF4j is quite another beast: File2XLIFF4j is a java based library for converting files to the XLIFF standard. The overview shows that the library is not exactly for the normal translator, but it is still important to know that such conversion libraries exist.

Interesting to see that the overview author, Weldon Whipple, has a "lingotek" email.


File2XLIFF4j is licensed under the GPL.

CEDICT for Apple Dictionary 1.0

One more dictionary for

Regarding the attempts at converting edict to the format (see previous posts), well they have been successful, but the issue is how to distribute the data ? The solution Prof. Breen has chosen is to make the XML data available from the site, along with a how-to that explains how to build the data. This how-to is currently being written and hopefully, everything will be ready around Christmas...

Looking for a QuickLook plugin ?

OSX Network

OSX Network is provided by Jeff Biggus and has a wonderful list of applications for OSX. OSX Network is also known for its extensive OSX development articles listing, to be found here.
Here is the page that lists all the registered QuickLook plugins. At the time of this writing, you can find plugins for the following file formats:

  • Brainsight QuickLook Generator - Supports medical image file formats

  • Colorxml-QuickLook - XML QuickLook

  • EPSQLPlugIn - EPS files

  • flv.qlgenerator - Flash FLV files

  • Folder.qlgenerator - folder contents

  • illust.qlgenerator - Illustrator files

  • Mac2SpecQLPlugin - Spectrum SCR file

  • QLEnscript - Programming code

  • Quickcomic - Zip/cbz file

  • QuickLook Script - AppleScript source

  • TextMate in QuickLook - Renders QuickLook previews using TextMate highlighting

  • Zip.qlgenerator - zip file contents

  • ZipQuickLook - Zip file

Keep the URL in your bookmarks to check the updates.

Another site that has a list of QuickLook plugins (the two overlap a lot) is The page also has screenshots. Here again, the list seems to be updated quite often.


And if you want to create your own plugin, here are Apple's developer docs.

Apple does not seem to have a section dedicated to QL plugins on its download page...

Nice Xmas present...

It is not a Mac, but the machine has generated a lot of buzz since its inception.

The XO, also known as the OLPC, is there for you to get !

Here are various links that actually tell something about the machine and the concept...

XO, the next lisp machine ?

Ivan Krstic's Google Tech Talk

One Laptop Per Child (New Version), Reviewed by 12-Year-Old

And of course, Alan Kay's keynote at EuroPython 2006 part 1, part 2 and part 3, with a summary by Guido van Rossum

Computing does not have to be dumb !

References to emails in other applications

DaringFireball has an article on how emails can be linked to in other applications.

Interesting if you need to keep track of clients' mails for a given project. John Gruber even published an AppleScript that makes this feature even easier to use.

Maxprograms is back !

About Maxprograms

Maxprograms is manned by Rodolfo Raya, of Heartsome and XLIFF fame. Maxprograms used to provide a few free Java utilities that were later included in Heartsome's Translation suite. Rodolfo quit Heartsome a few weeks ago and decided to put all his utilities back on Maxprograms' site.

The tools

The free tools that are now distributed directly from his site are:


TMXValidator checks your documents against the TMX DTD and also verifies that they follow the requirements described in the TMX specifications.


TBXMaker converts glossaries stored in CSV (Comma Separated Values) to TBX (TermBase eXchange) format.


CSVConverter converts glossaries stored in CSV (Comma Separated Values) to TMX (Translation Memory eXchange) standard.

Java Properties Viewer

A tool specially created for viewing translated Java .properties files comprising languages not supported by the ISO 8859-1 character encoding.

MARTIF to TBX Converter

A program designed to convert glossaries in MARTIF format, also known as ISO 12200, to TBX format.


RTFCleaner removes hidden text and Trados/Wordfast markup from translated Tagged RTF files.

It is very nice to see that they are now available without having to download the full HTS package.


The utilities require Java 1.5 or better. Which means that for the time being they can be used only on OSX 10.4.8 or better. See Apple's download page if you only have Java 1.4. Java 1.5 is the default version on Leopard.

Since the utilities were included with Heartsome's Translation Suite they can be put to even better use in a workflow that involves HTS.

Free software ?

Currently, only TMXValidator has been released as a free to use source package. The license Maxprograms used is the Eclipse Public License v. 1.0. This makes TMXValidator free (as in free speech) software, but since the EPL v. 1.0 is not compatible with the GPL, it won't be possible to include the source code in GPLed products.

All the other utilities are not (yet ?) available as free software, they are just free as in "free beer".

Anyway, thank you very much for your work Rodolfo ! And good luck to Maxprograms !

Kazunari Hirano interview

Kazunari Hirano is a long time contributor to the Japanese community and has recently been involved with Open Solaris and its localization community.

He was recently interviewed by both Reiko Saito, Japanese Language lead at SUN for Solaris, Java and Sun Java Enterprise System and by Jim Grisanzio, Sr. Program Manager, OpenSolaris Engineering at Sun.

Reiko is also very active in the Japanese localization community where she helps us a lot.

The interview has been conducted in Japanese and English and is available on both Reiko's blog and Jim's.

Apples and Windows...

Just like with oranges, apples and windows do not compare.

Still, if you really need to have both on the same machine, there are ways to do that.

As a minimal introduction, check Bill Clementson's article on the subject.

Then, if you really need to stay on the Mac with the occasional Windows application, you may want to take a look at a few recent threads on the MacLingua forum (subscription required), where a few CAT tools are discussed in the context of Windows on Mac.

Now, if you want to dream a little, check this article on Ars Technica or this thread (for geeks) on the Wine HQ mailing lists...

By the way, Wine is yet another means to get Windows on your Mac. From the site:

Wine is an Open Source implementation of the Windows API on top of X, OpenGL, and Unix.

Think of Wine as a compatibility layer for running Windows programs. Wine does not require Microsoft Windows, as it is a completely free alternative implementation of the Windows API consisting of 100% non-Microsoft code, however Wine can optionally use native Windows DLLs if they are available. Wine provides both a development toolkit for porting Windows source code to Unix as well as a program loader, allowing many unmodified Windows programs to run on x86-based Unixes, including Linux, FreeBSD, Mac OS X, and Solaris.

BetterZip Quick Look Generator

Another developer has created a Quick Look plugin that accesses the contents of all types of archives:

From the site:
The currently supported archive formats are: ZIP, TAR, GZip, BZip2, ARJ, LZH, ISO, CHM, CAB, CPIO, RAR, 7-Zip, DEB, RPM, StuffIt's SIT, DiskDoubler, BinHex, and MacBinary.

The software costs $19.95 and comes with a one month free trial.

Quick Look for zip files, folders

Currently, Quick Look does not display anything interesting when you hit a folder or a zip file.

With 2 free utilities you can hit the space bar and see a listing of the contents of a folder or of a zipped file...

- Folder Quick Look Plugin
- Zip Quick Look Plugin

Keep the pages in your bookmarks to see updates when they are released.

OmegaT 1.7.3 (with Mac bundle) released !

First of all I'd like to thank all the people who have tested my Mac bundles for OmegaT.

I used your comments to create the official version that is released today.

As you probably know, the OmegaT project puts the "test" label on versions that do not have up to date manuals but that are at least as stable as the stable version... Basically, test versions have already been thoroughly tested by a number of power users and all the problems are supposed to have been ironed out.

So go ahead, it won't bite you !

The file, once unzipped, becomes OmegaT.dmg and opens with 2 files:

- documentation.html

The documentation.html file is exclusive (now) to the Mac package, and groups all the information per available language. I put the up to date manuals in bold so that you can see them right away. This is the case for the English manual exclusively as of today.

The manual has been fully updated thanks to the work of the current documentation manager to reflect all the new features (check the changes.txt file) and extra information like bidi languages handling and command line arguments.

It's been a long time since OmegaT was released with a Mac bundle (1.4.4 was the last one if I remember well) and the current release manager and I are working on automating this process (instead of having to do everything by hand here).

People who still want the rough edges of the "pure Java" version can still use the " file. It will behave exactly like previous OmegaT versions (as far as integration with OSX is concerned) and will allow for all sorts of command line arguments passing. The bundle is a little bit trickier to modify, you need to edit the .plist file inside the package to obtain the same results.

Java 1.6 for Mac !

Apple has been criticized for not including Java 1.6 in Leopard. The current default is Java 1.5 while all the other platforms (Windows, Linux, Solaris etc) have access to Java 1.6...

Since Sun opened the code of Java, it has been freely available for porting and modification. What was bound to happen eventually happened: somebody took the source code and ported it to Mac.

The result is "SoyLatte: Java 6 Port for Mac OS X 10.4 and 10.5 (Intel)" by Landon Fuller. The performance seems to be pretty good too. See the benchmark comparing Ruby on OSX and JRuby on Java 1.6/OSX by Charles Nutter.

With this release, OSX users are getting closer and closer to a stable release of Java 1.6 for their machine (only Tiger or Leopard).

There is a developer preview from Apple, but it only works under Leopard and you need to be registered as a developer to access it (registration is free).

OpenOffice.org 2.4 localization

Almost two weeks since the last post. Amazing how 3 kids can suck your energy into nether...

Today's first post is an announcement. OpenOffice.org is a free office suite that a lot of translators already use for its compatibility with MS Office and the fact that, well, it is free to download and free to use. OpenOffice.org is developed in part by Sun Microsystems, contributions come from IBM and other major players in the software industry, and there is a very strong community of users and volunteers who communicate in a variety of languages. The "Native Language Confederation" is where all the non-English things take place. OpenOffice.org is thus localized by this community of communities under a separate project called, obviously, the "Localization Project".

The currently available version of OpenOffice.org is version 2.3. Version 2.4 is expected to be released sometime at the beginning of March and the localization efforts will thus start very soon.

Translators on Mac who do not use OpenOffice.org but prefer NeoOffice should be aware that all the localization work that goes into OpenOffice.org is automatically "recycled" into NeoOffice.

So, the deal is: you're enjoying a wonderful free office suite, and somehow you feel guilty for not having had to pay for it, or you feel that you'd love to "pay something back" but, not being a programmer, you are not sure where to start...

Well, you are a translator by trade, aren't you ? Localization is where your skills can be put to best use. Here is where you'll be able to find all the necessary information for this version's translation.

There are TMX files available for some language communities and since the source files are in the PO format you can translate them in your favorite CAT tool.

First, get in touch with the translation group within your language community (from the Native Project page: click on your language community, go to the relevant page from there, either "contributions" or "participation" or "projects" etc.) and offer your help !

OSX 10.5.1, Safari 3.0, TextEdit...

It took Apple three weeks to address the most problematic bugs in Leopard. Good job ! The security contents of the update are detailed here. See also the relevant TidBits link in the feed box below.

Update: Heise Security's opinion on the fixes.

Safari 3.0, shipped in 10.5, includes a WebKit update that will certainly improve our Web experience. The Ars Technica article gives an interesting perspective on the WebKit market and gives a technical review that highlights the main new features. This new version of the WebKit is available in Leopard and in the latest Tiger update (10.4.11) released yesterday.

If you make a search in TextEdit now, you'll see that the search results are highlighted as they are in Safari. Much easier to see the searched string !

Bento, NeoOffice, OSX 10.4.11

Bento is "the new personal database from FileMaker that's as easy to use as a Mac".

Daringfireball's analysis is quite interesting and convinced me to take a new look at Numbers for my simple database needs...

NeoOffice has just released the second patch for NeoOffice 2.2.

And Apple has also released a point update: OSX 10.4.11, that contains Safari 3.0 and ships a number of other items and fixes.

XLIFF 1.2, TMX 2.0, etc.

For some reason, my Google Alert for XLIFF decided to inform me today of the existence of the XLIFF 1.2 specification on the OASIS site...

Since we're at it, the other relevant standards body, LISA, is also working very hard to produce the next version of TMX: TMX 2.0. With Heartsome's Rodolfo Raya as the standard's editor, we can be sure that HTS will be one of the first application suites to support the standard.

Meanwhile, the Internationalization Activity of the World Wide Web Consortium has not ceased to produce very interesting documents, like an Updated Working Draft about the Best Practices for XML Internationalization.

The W3C's i18n group is also working on the Internationalization Tag Set. Yves Savourel of Okapi Framework fame being the chair of that Working Group, you've already guessed that the framework already supports a part of this tag set...


Christmas in November !!!

After the Okapi for Mono package 2 days ago, another package usable on the Mac has just been released: OpenWordFast, a macro for OpenOffice.org that accepts WordFast translation memories.

The project was registered on October 8th, which means that it is still a little early to expect feature parity with WordFast. Currently it only accepts 100% matches from the TM... But since the project is free software (GPL) I have no doubt that it will find a lot of contributors.


I received a mail from Oleg, OpenWordFast's developer after congratulating him for his work:

Hi, Jean-Christophe.

Thanks for your post. But OpenWordFast in the raw Beta stage. I'm not tested it on Mac yet.
Its lacks of vital functionality - Glossary, Terminology Recognition, search of not full match TU.

But I plan to work on this list in the future releases.

Best regards, Oleg Tsygany.

Here we go !

Transmug !

Yves, apologies !

Jost Zetzsche of The Tool Kit, the newsletter to read, even if Mac news are scarce, just reminded me of the existence of Transmug, your group of Mac using translators...

I found the Preaching to a Choir post and liked the PDF (10 mb) presentation attached to it.

ps: Amazing the number of Yves related to Mac and translation. Yves Averous of Transmug, Yves Savourel of the Okapi Framework, Yves Champollion of WordFast...

Okapi tools for Mono !

The Okapi framework is a set of applications designed to ease the work of the translator:

"The Okapi Framework is a set of interface specifications, format definitions, components and applications that provides an environment to build interoperable tools for the different steps of the translation and localization process.

The goal of the Okapi Framework is to allow tools developers and localizers to build new localization processes or enhance existing ones to best meet their needs, while preserving a level of compatibility and interoperability. It also provides them with a way to share (and re-use) components across different solutions. The project uses and promotes open standards, where they exist. For the aspects where open standards are not defined yet, the framework offers its own. The ultimate goal is to adopt the industry standards when they are defined and useable.

In short, the Okapi Framework aims at being a crucible where we forge common components that can be used in any localization and translation application, providing faster development time and better interoperability, but still allowing for the diversity of solutions."

(quote from

Problem is, Okapi is developed on the .NET platform, basically a Windows only platform.

A few years ago, people on the Linux side decided that .NET was a valuable platform and set out to create an implementation of .NET for Linux that could run .NET applications out of the box. Mono was born.

Mono was also made to run on OSX... The problem was that until recently Mono's support for .NET was not sufficient to run the Okapi tools and that the Okapi tools had not been written with the lowest common denominator in mind to run on Mono.

Yesterday, Yves Savourel, lead developer of the Okapi Framework Project, released a first Okapi for Mono package for testing on existing Mono environments (including OSX and Linux). Not all the tools are available yet, but the command line tools are said to work.

As far as OSX workflows are concerned, Okapi can produce XLIFF files (or OmegaT projects, or XLIFF files for OmegaT) from a number of localization/translation formats. It is now relatively trivial for OSX translators to deal with InDesign files, for example, as long as they are saved in the InDesign XML format (.inx). Okapi will convert the .inx files to XLIFF for translation in OmegaT and will convert the target files back to .inx for delivery...

Very good news for translators on OSX and warm thanks to the Okapi team !

ps: I'll post a detailed description of how to install Mono and Okapi on OSX in a few days for the readers who don't feel too adventurous. Meanwhile, here are the respective download pages:

Dictionary development kit

The Dictionary application comes with a few dictionaries already, but they are limited to 2 languages: English and Japanese.

What if you want to use other data sets there ?

Apple has released a Dictionary Development Kit that you will find in /Developer/Extras/Dictionary Development Kit/ if you have installed the Xcode tools.

From a first glance, a dictionary file is basically a twisted XHTML file that gets massaged with perl scripts and a few command line applications, for use in the dictionary application.

Here is the first part of the abstract from the file Dictionary Format.rtf located in ./documents

"This document explains an XML schema that enables developing dictionaries / references that are compatible with and other Dictionary Services. The schema defines the source code format for the dictionary. The dictionary source needs to be validated to make sure it is in the correct format. It is then processed by the Dictionary Build tool together with css and other auxiliary files, and packaged into a dictionary bundle. The dictionary bundle can be installed into one of the Library/Dictionaries folders to make it accessible from"

And here is the first entry in one of the samples provided (./samples/JapaneseDictionarySample.xml):

<?xml version="1.0" encoding="UTF-8"?>
<!--
This is a sample dictionary source file.
It can be built using Dictionary Development Kit.

Entry examples for Japanese dictionary, English-Japanese dictionary, and Japanese-English dictionary.
-->
<d:dictionary xmlns="" xmlns:d="">

<d:entry id="annojou_j" d:title="案の定">
<d:index d:value="アンノジョウ" d:title="案の定" d:yomi="あんのじょう" />
<d:index d:value="あんのじょう" d:title="案の定" d:yomi="あんのじょう" />
<d:index d:value="案の定" d:title="案の定" d:yomi="あんのじょう" />
<span class="headword">あん‐の‐じょう</span>
<span class="hyouki">【案の定】</span>
<span class="meaning">予想通りに事が運ぶさま。</span>

(Side note: interesting to see that .xml files open in Dashcode by default.)

Update: Apple has a full "Dictionary Services Programming Guide" available here. It also comes as a downloadable 1.3mb PDF file.

I have been trying to build a sample of EDICT provided for this purpose by Prof. Breen, but even though the build process seems to be successful, the Dictionary application only reports that it finds an entry without actually displaying it... More later...

Update 2: I have spent a good part of the weekend testing Prof. Breen's sample. The documentation provided by Apple is really minimal so we had to try a lot of different things.
  • The IDs that are meant to be unique are not validated if they are entirely numerical, but it seems numerical values are handled without problems by the build process.

  • The build process seems to have difficulty building the dictionary properly if the project is not in the /Developer/Extras/Dictionary Development Kit/ directory.
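
Since the IDs are meant to be unique but numerical ones are apparently not validated, a quick check before building can help. Here is a minimal sketch in Python, not part of Apple's kit: the function name duplicate_entry_ids and the sample string are made up for illustration, and the check is a plain text search on the id attribute of the d:entry elements, not a real XML validation.

```python
import re
from collections import Counter


def duplicate_entry_ids(source):
    """Return the id values that appear on more than one <d:entry> element."""
    # Plain text search: grab the id attribute of every d:entry opening tag.
    ids = re.findall(r'<d:entry\b[^>]*?\bid="([^"]*)"', source)
    return sorted(value for value, count in Counter(ids).items() if count > 1)


# Hypothetical dictionary source with one numerical id used twice.
sample = (
    '<d:entry id="1" d:title="a"></d:entry>\n'
    '<d:entry id="annojou_j" d:title="b"></d:entry>\n'
    '<d:entry id="1" d:title="c"></d:entry>\n'
)

duplicates = duplicate_entry_ids(sample)
print(duplicates)
```

An empty list means that no id is repeated. Anything else is worth fixing before running the build.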

Regular expressions and text editing

Regular expressions

When you edit glossaries or translation memories, a few regular expressions always come in handy.

Regular expressions are pattern matching expressions. You create a pattern in the search field of a text editor and the text editor will look for anything that matches the pattern. Similarly, once you have found the pattern, you can replace it with a second pattern. That way, you end up with super powerful search-replace routines that can save you hours of stress on thorny texts...
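
To make the search-replace idea concrete, here is a small sketch in Python (any editor with regular expression support works along the same lines). It assumes a hypothetical glossary where each line holds a source term, a tab and a target term, and it swaps the two columns. The sample terms are made up for illustration.

```python
import re

# Hypothetical glossary: source term, a tab, then target term, one pair per line.
glossary = "hard disk\tdisque dur\nkeyboard\tclavier\n"

# Search pattern: capture what comes before the tab and what comes after it.
# re.MULTILINE makes ^ and $ match at the start and end of every line.
pattern = re.compile(r"^([^\t\n]+)\t([^\t\n]+)$", re.MULTILINE)

# Replacement pattern: \1 and \2 refer back to the captured groups, here swapped.
swapped = pattern.sub(r"\2\t\1", glossary)
print(swapped)
```

The parentheses in the search pattern are what make the second pattern possible: each pair captures a piece of the match that the replacement can reuse.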

Here are two articles that you can use to get up to speed on the topic:

Regular Expressions from Princeton University
Regular Expressions Unfettered from Apple Developer Connection

As you can see, regular expression creation is not always an easy task and a helper application can sometimes save you a lot of time.

There are a few free regexp testers for OSX but they basically all work from outside your text editor.

The one that seems to be the easiest to use is reggy, a nice piece of software licensed under the GNU General Public License v2.

On his blog, Bill Clementson talks about regex-tool.el. As the name indicates, regex-tool.el is an elisp library for Emacs.

Besides text editors, regular expressions are supported by all kinds of software, including, of course, the major office applications on the market. Look at their user manuals to find more information about regexp handling there.

Text editing

A number of very good text editors are available for the Mac. TextEdit, OSX's default text editing application, is fine but lacks even basic regular expression support. It does plenty of other things though, and can be considered a simple word processor with most of what is needed in that field (reading and writing Word files etc).

Other major text editors on Mac include:
Smultron (free as in "speech"),
TextWrangler (free as in "beer"),
SubEthaEdit (not free, except for the old version),
BBEdit (not free at all) and
TextMate (not free either and with very bad multibyte characters support).

There are also the "ancestors", VI (VIM) and Emacs. Both are available from the Terminal application but require some time to get used to. Still, they are definitely some of the most powerful applications OSX hosts...

Emacs is not exactly the text editor that I'd recommend to my wife. But Aquamacs, a "Mac" version in terms of interface is much friendlier and can almost right away be used as a replacement for TextEdit as far as, well, text editing is concerned. Being an adaptation of Emacs, it is just as free and also distributed under the GNU General Public License...

Emacs is largely written in Elisp. So anything you write in Elisp within Emacs can de facto extend the functionality of Emacs. In other words, Emacs is just a huge macro editing environment...

Whichever text editor you decide to use, don't forget to read the user manual and especially the "searches" and "regular expressions" chapters.

Automation on the Mac !


Apple just updated the AppleScript section of its site.

I have never found a practical way to use AppleScript in my workflows... Hopefully the new version and better integration between applications and OS will make that a no-brainer now...


I have more hopes for Automator, especially the new version that has full access to the user interface (see here).

Here is the state of the documentation about Automator on Apple's developer's site. It is not updated yet for Leopard but is a very good start (after the Help files themselves) for people who want to take a serious look at this technology.

It is also possible to create actions with AppleScript or other easy to learn languages.

OmegaT development status for October

It is nice to see on OmegaT's development site that October has been quite an active month.

For those who don't know what OmegaT looks like, take a look at the screenshots, and read the online documentation.

Back to the stats.

Project rank was 137, which is the highest ever in the project's history (out of more than 100,000 projects). The downloads figure was 3,466, which is second best, after November 2006 (4,127), when the first build in the 1.6 series was released. Of course, the downloads figure includes the current stable and test versions as well as a few other packages.

This 3,466 figure comes right after the latest test version (OmegaT 1.7.2) was released, at the end of September.

The Mac OSX zip packages that are available for the stable version (1.6.1_04) and for the test version (1.7.2) are soon going to be replaced with a real MacOSX application bundle that will make OmegaT even easier to use on the Mac.

The development code is available through the Terminal application, or any CVS interface.

Here is a rudimentary shell script that I use to update the sources and build them (you will need the OSX developer's tools installed, available from the install DVD):

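# go to the local working copy (adjust the path to your own setup)
cd ~/Documents/OmegaT/application/current_cvs/
# check out (or refresh) the omegat module, compressed, pruning empty directories
cvs -z3 co -P omegat
cd omegat
# build step: assumed to be Ant here, since the lines above only fetch the sources
ant
cd dist
# open the readme and the change log in TextEdit, then launch the fresh build
open -e readme.txt &
open -e changes.txt &
java -jar OmegaT.jar &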

I am sure there are things to improve here, but it works for me :).

The current code in CVS is labelled OmegaT 1.8 and includes the spellchecking engine etc.

The changes.txt file indicates the features that have been implemented, the bugs that have been fixed and the localizations that have been added.

Feel free to test and look for bugs !

Office 2007 files (.docx, .xlsx, .pptx) on Mac

(updated to reflect the release of StarOffice 9 for Mac and the OOXML conversion software for Office 2004)

Microsoft Office 2007 for Windows (and its Mac counterpart: Microsoft Office 2008) uses a new file format that has been available for a while now as .docx, .xlsx or .pptx ("x" to distinguish them from the standard MS formats).

The file format is commonly known as OOXML or OpenXML, or more simply as Microsoft Office 2007 format.

Even if the new files don't seem to be very widely used, they sometimes end up on a Mac user's desktop, especially since they are the default file format of the two suites (i.e., you need to jump through a number of hoops to save to a different format)...

What to do when you encounter such files ?

Since I do not own Office 2008 and I did not have the OOXML update for Office 2004 at the time of writing, I had to test access with OOXML files created with NeoOffice from "real" Microsoft .doc, .ppt and .xls files. All of the test files were pretty complex and quite heavy and had all been created originally on various versions of Microsoft Office for Windows.

Access through proprietary applications

The iWork '08 way

iWork: $79 from Apple

As far as I can tell, the iWork '08 applications Pages and Keynote opened the .docx and .pptx files I had created without any problems.

And the result was as good looking as the original files. Very impressive.

When I tried to open the .xlsx file, Numbers was considered as the default application (even the converter was not listed) but it was unable to open it correctly. I'll need to have a "genuine" .xlsx file to test Numbers' capacities.

The problem with iWork is that it cannot save a file to the new format. It can save it to the iWork default format or to the old Microsoft format, along with a few other more classical formats.

The Microsoft way

Microsoft Office 2008: $399.95 retail, $284.99 online, $239.95 retail upgrade version, $194.99 online upgrade version from Microsoft.
(The prices given correspond to the cheapest available version for professionals, the "Home & Student" package is not available for commercial activity.)

The Mac equivalent of Office 2007, Office 2008, has been available for a few months already. Office 2008 is the quickest way to access the new file format in a relatively smooth and painless way.

If you don't want to acquire Office 2008, you can download Microsoft's "Open XML File Format Converter for Mac". The application is available from here. It is at the bottom of the page, if the URL has not changed...

The converter requires OSX 10.4.8 or later. Microsoft also says that to view the files, you need either Microsoft Office 2004 11.3.4 or later, or Microsoft Office v.X 10.1.9 or later.

If you also install "Microsoft Office 2004 for Mac 11.5.0 Update" (description available here), you'll also enable "Office 2004 for Mac to read and to write Office documents that are in Open XML Format".

The StarOffice 9 (beta) way

StarOffice: $69.95 (StarOffice 8 price, 9 is still beta), from Sun Microsystems.

StarOffice 9 beta is available from here:

It should work pretty much like OpenOffice.org 3.0 beta. See below.

System wide support on Leopard (OSX 10.5)

Leopard: $129, from Apple.

If you don't (plan to) own any recent version of Office for Mac what can you do ?

Leopard users have the free option of using the new TextEdit. It can open and save the new file format.

OOXML support is system wide, which means that the Finder and other applications will also give you a "quicklook" of such files, although not all files are equal under Quicklook. Some are displayed properly, while others are displayed as a white icon with no content shown... The test .pptx worked, the .docx and .xlsx did not.

So, support is not extremely good and I would not rely on it to check the translatable contents of a client file...

Access through free applications

OpenOffice.org and NeoOffice anyone?

Users on Panther (10.3) and above can use NeoOffice 2.2. NeoOffice is a sister application of OpenOffice.org.

The currently available version of the standard OpenOffice.org (2.4) does not include OOXML support but NeoOffice includes special goodies, like OOXML support, that are found in Novell's version of OpenOffice.org, which is, sadly, not available for the Mac...

As of May 7th, the beta version of OpenOffice.org 3.0 is available. This version does include support for OOXML.

As written above, I used NeoOffice to create Office 2007/OOXML files with various degrees of success in terms of interoperability. I am pretty sure NeoOffice could open relatively complex files since the files I fed it for OOXML output were fairly complex, although I'd need to test that.

As text ?

An extreme way to access the contents of such files is to handle them as if they were zipped, unzip them and find the document.xml located somewhere in the folder hierarchy that appears (it would be under /word/ for a Word document). This file is standard XML and can be opened in any text editor.
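
The unzip-and-look approach can be sketched in a few lines of Python, using only the standard library. This is a rough illustration under stated assumptions, not a proper extraction tool: the helper names read_document_xml and rough_text are made up, the tag stripping is deliberately crude (it does not decode XML entities), and the tiny archive built in memory only stands in for a real .docx file so that the sketch runs on its own.

```python
import io
import re
import zipfile


def read_document_xml(docx_file):
    """Return the raw XML of word/document.xml from a .docx (a zip archive)."""
    with zipfile.ZipFile(docx_file) as archive:
        return archive.read("word/document.xml").decode("utf-8")


def rough_text(xml):
    """Very rough text view: replace every tag by a space, collapse whitespace."""
    without_tags = re.sub(r"<[^>]+>", " ", xml)
    return re.sub(r"\s+", " ", without_tags).strip()


# Stand-in for a real file: a minimal archive with the expected internal path.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr(
        "word/document.xml",
        "<w:document><w:body><w:p><w:r><w:t>Hello</w:t></w:r></w:p></w:body></w:document>",
    )
buffer.seek(0)

xml = read_document_xml(buffer)
print(rough_text(xml))
```

With a real file, pass its path instead of the in-memory buffer. The result is only good for a quick look at the translatable contents.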

To properly access the contents of the file, you'd need to use Okapi's Tikal utility, available for the (free) Mono runtime environment. Tikal should be able to extract the contents of the XML into an XLIFF file that you can later load into a translation tool...


Once you have access to the file, you can translate it by overwriting it in the application of your choice. Saving the resulting file to .docx will produce results that vary with the application you used. A best bet would be to save the result to .rtf for delivery.

OmegaT and other Java based applications

If you want to use a translation memory tool, the few I know that directly handle OOXML are OmegaT, Swordfish, the newborn from Maxprograms, and Heartsome's Translation Suite.


If you have converted the file to .rtf or HTML before translation, AppleTrans should be able to handle it directly.

Okapi's Tikal for conversion to XLIFF

Or, as written above, you can use Okapi's Tikal command line utility to convert its contents to XLIFF and translate it in any of the above mentioned applications.


The Microsoft converter opens the file in Word in the RTF format and you can then use WordFast to translate it directly (from within Word 2004 / Word v.X).


With hacks, you can also translate the document.xml file in OpenLanguageTools.

Have I forgotten your favorite tool ?

Java applications on Leopard

It seems Java developers are very disappointed by the fact that not only does Leopard not ship with Java 1.6 but that the Java 1.6 preview available from the Apple Developer Connection is simply gone.

Does that mean that Java applications will stop working on Leopard ? No.

Leopard comes with Java 1.5. Quote from Apple: "Mac OS X includes the full version of J2SE 1.5, pre-installed with the Java Development Kit (JDK) and the HotSpot virtual machine (VM), so you don't have to download, install, or configure anything."

So, you can still run OmegaT, OpenLanguageTools, Heartsome's Translation Suite and any other software that does not depend on Java 1.6 without a glitch, as far as I've tested.

Some developers have reported that the Java 1.6 preview totally broke Java on Leopard but it is not the case for me... [see the update at the bottom]

Anyway, it is likely that Java 1.6 will be released soon, as Eric Burke of "It's just a bunch of Stuff That Happens" seems to believe...

Check this Apple Developer Connection page to see if you are missing a Java 1.5 update on Tiger.

ps1: OpenLanguageTools is not advertised as a Mac application but the UNIX version runs very well on Mac.

ps2: Heartsome will soon release version 7 of its suite.

update: Java 1.6 did indeed break a few things. When Leopard is installed the Java preferences in /Applications/Utilities/Java allow you to change the default JVM to Java 1.6. Don't do that! If you do, a number of processes that are meant to run with an unspecified version of Java will try to run with Java 1.6 and won't work. Since the Java Preferences also seems to call that unspecified version of Java, you will be stuck without being able to get back to a default Java 1.5.

Well, there may be tricks to get back there quickly, but the safest route definitely seems to be removing Java 1.6. Javablog indicates a straightforward procedure to do so. Not for the faint of heart! But that is the cost of using developer preview beta software...

What I like in OSX 10.5

Apple advertises over 300 new features in OSX 10.5. A lot of them have no immediate use for the translator but some of them really rock.

300+ New Features

Here is my current and most immediate pick. The list will grow as I use them in real life.


I never really used Spotlight, even on my relatively fast MacBook. Instead I had Butler as an application launcher and the standard Finder for file searches (even though the Finder used Spotlight technology).

Now, Spotlight is all this and more.

  • it displays applications first, so that you can really use it as an application launcher. Even though it is not as sophisticated as Butler, it does the job very properly.
  • searches are faster than on Tiger.
  • it calculates as you type a formula. I used PlainCalc for that before, and even though PlainCalc is much more powerful than Spotlight, the latter is currently just what I need most of the time.
  • it acts as an (English) dictionary by showing you the first line of the definition of the word you type. Clicking on the result will bring you to the Dictionary application.

Spotlight on Apple's "New features" page.


Dictionary now has a set of 3 Japanese dictionaries (Shogakukan) and also accesses Wikipedia in the main available languages. It is faster than launching Safari to find a reference online. The only drawback, when compared to Safari, is that Dictionary does not display the language version links of the article you are reading, so the user cannot jump to the target language in which the reference is sought.

Dictionary on Apple's "New features" page.


Automator now has a "Record" function that allows it to remember mouse activity on any application's menus. That allows for quick "macro" recording in applications that are not well integrated with OSX and/or that do not support AppleScript. Java applications for translators come to mind !

Automator on Apple's "New features" page.

What I miss in OSX 10.5

The ability to have a different input system per window

This setting was present in Tiger under "International > Input Menu" at the bottom of the window. The same location in Leopard only indicates the keyboard shortcut to change the input method.

Use case: you translate from a language that requires a different input system from the language you translate to. When you need to type a word in a dictionary to check its meaning, you change windows, change input systems, type, get the results, go back to the original window, go back to the original input system and type. Having the OS remember which input system goes with which application is quite a time and annoyance saver...

Clear visual clues to identify folders in the Dock

John Siracusa has a full page about that on Ars Technica, with plenty of nice screenshots:

Mac OS X 10.5 Leopard: the Ars Technica review: The Dock By John Siracusa | Published: October 28, 2007 - 11:36PM CT
