09 November 2007

Dictionary.app development kit

Dictionary.app already comes with a few dictionaries, but they are limited to two languages: English and Japanese.

What if you want to use other data sets in it?

Apple has released a Dictionary Development Kit, which you will find in /Developer/Extras/Dictionary Development Kit/ if you have installed the Xcode tools.

At first glance, a dictionary file is basically a twisted XHTML file that gets massaged with Perl scripts and a few command-line applications for use in the dictionary application.

Here is the first part of the abstract from the file Dictionary Format.rtf, located in ./documents:

"This document explains an XML schema that enables developing dictionaries / references that are compatible with Dictionary.app and other Dictionary Services. The schema defines the source code format for the dictionary. The dictionary source needs to be validated to make sure it is in the correct format. It is then processed by the Dictionary Build tool together with css and other auxiliary files, and packaged into a dictionary bundle. The dictionary bundle can be installed into one of the Library/Dictionaries folders to make it accessible from Dictionary.app."

And here is the first entry in one of the samples provided (./samples/JapaneseDictionarySample.xml):

<?xml version="1.0" encoding="UTF-8"?>
<!--
This is a sample dictionary source file.
It can be built using Dictionary Development Kit.

Entry examples for Japanese dictionary, English-Japanese dictionary, and Japanese-English dictionary.
-->
<d:dictionary xmlns="http://www.w3.org/1999/xhtml" xmlns:d="http://www.apple.com/DTDs/DictionaryService-1.0.rng">

<d:entry id="annojou_j" d:title="案の定">
<d:index d:value="アンノジョウ" d:title="案の定" d:yomi="あんのじょう" />
<d:index d:value="あんのじょう" d:title="案の定" d:yomi="あんのじょう" />
<d:index d:value="案の定" d:title="案の定" d:yomi="あんのじょう" />
<span class="headword">あん‐の‐じょう</span>
<span class="hyouki">【案の定】</span>
<span class="meaning">予想通りに事が運ぶさま。</span>
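
To make the structure of such an entry easier to see, here is a minimal Python sketch that reads a dictionary source and lists each entry with its index values. It is not part of the kit: the function name list_entries is made up for illustration, the embedded source is a shortened copy of the sample above, and the closing tags are added here on the assumption that the real file closes its elements the usual way.

```python
import xml.etree.ElementTree as ET

# Namespace URI copied from the root element of the sample.
D = "http://www.apple.com/DTDs/DictionaryService-1.0.rng"

# Shortened copy of the sample entry. The closing tags are an assumption,
# added so that the fragment parses on its own.
SOURCE = """<?xml version="1.0" encoding="UTF-8"?>
<d:dictionary xmlns="http://www.w3.org/1999/xhtml" xmlns:d="http://www.apple.com/DTDs/DictionaryService-1.0.rng">
<d:entry id="annojou_j" d:title="案の定">
<d:index d:value="アンノジョウ" d:title="案の定" d:yomi="あんのじょう" />
<d:index d:value="あんのじょう" d:title="案の定" d:yomi="あんのじょう" />
<d:index d:value="案の定" d:title="案の定" d:yomi="あんのじょう" />
<span class="headword">あん‐の‐じょう</span>
</d:entry>
</d:dictionary>
"""


def list_entries(xml_text):
    """Return (id, title, [index values]) for every d:entry in the source."""
    root = ET.fromstring(xml_text.encode("utf-8"))
    entries = []
    for entry in root.iter(f"{{{D}}}entry"):
        # "id" is a plain attribute; "title" and "value" live in the d: namespace.
        entry_id = entry.get("id")
        title = entry.get(f"{{{D}}}title")
        values = [idx.get(f"{{{D}}}value") for idx in entry.findall(f"{{{D}}}index")]
        entries.append((entry_id, title, values))
    return entries


if __name__ == "__main__":
    for entry_id, title, values in list_entries(SOURCE):
        print(entry_id, title, values)
```

Each d:index line gives one more string under which the same entry can be looked up, which is why the sample repeats the word in katakana, hiragana and kanji.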

(Side note: interesting to see that .xml files open in Dashcode by default.)

Update: Apple has published a full "Dictionary Services Programming Guide". It is also available as a downloadable 1.3 MB PDF file.

I have been trying to build a sample of EDICT provided for this purpose by Prof. Breen, but even though the build process seems to succeed, Dictionary.app only shows that it finds an entry without actually displaying it... More later...

Update 2: I have spent a good part of the weekend testing Prof. Breen's sample. The documentation provided by Apple is really minimal, so we had to try a lot of different things.
  • The IDs, which are meant to be unique, do not pass validation if they are entirely numerical, but the build process seems to handle numerical values without problems.

  • The build process seems to have trouble building the dictionary properly if the project is not in the /Developer/Extras/Dictionary Development Kit/ directory.
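
On the ID point, one plausible explanation (an assumption, not something Apple's documentation confirmed for us) is that the schema types the id attribute as an XML ID, and an XML ID may not begin with a digit. The following Python sketch pre-checks a source file for duplicate IDs and for IDs that would fail such a rule. The function name check_ids is hypothetical, and the regular expression is a simplified, ASCII-only approximation of the XML name rule.

```python
import re
import xml.etree.ElementTree as ET

# Namespace URI used by the kit's sample dictionary sources.
D = "http://www.apple.com/DTDs/DictionaryService-1.0.rng"

# Simplified ASCII-only approximation of an XML name: a letter or
# underscore first, then letters, digits, '.', '-' or '_'.
NAME_RE = re.compile(r"[A-Za-z_][A-Za-z0-9._-]*\Z")


def check_ids(xml_text):
    """Return a list of (id, reason) pairs for problematic entry IDs."""
    root = ET.fromstring(xml_text.encode("utf-8"))
    seen = set()
    problems = []
    for entry in root.iter(f"{{{D}}}entry"):
        entry_id = entry.get("id", "")
        if entry_id in seen:
            problems.append((entry_id, "duplicate"))
        seen.add(entry_id)
        if not NAME_RE.match(entry_id):
            problems.append((entry_id, "not a valid XML name"))
    return problems


if __name__ == "__main__":
    sample = (
        f'<d:dictionary xmlns:d="{D}">'
        '<d:entry id="1000"/><d:entry id="e1000"/><d:entry id="e1000"/>'
        "</d:dictionary>"
    )
    for entry_id, reason in check_ids(sample):
        print(entry_id, reason)
```

If the assumption holds, prefixing numeric IDs with a letter (for example e1000 instead of 1000) should be enough to get them through validation.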