02 November 2007

Office 2007 files (.docx, .xlsx, .pptx) on Mac

(updated to reflect the release of StarOffice 9 for Mac and the OOXML conversion software for Office 2004)

Microsoft Office 2007 for Windows (and its Mac counterpart: Microsoft Office 2008) uses a new file format that has been available for a while now as .docx, .xlsx or .pptx ("x" to distinguish them from the standard MS formats).

The file format is commonly known as OOXML or OpenXML, or more simply as Microsoft Office 2007 format.

Even if the new files don't seem to be very widely used, they sometimes end up on a Mac user's desktop, especially since they are the default file format of the two suites (i.e., you need to go through a number of loops to save to a different format)...

What to do when you encounter such files ?

Since I do not own Office 2008 and I did not have the OOXML update for Office 2004 at the time of the writing, I had to test access with OOXML files created with NeoOffice, from "real" Microsoft .doc, .ppt and .xls. All of the test files were pretty complex and quite heavy and had all been created originally on various versions of Microsoft for Windows.

Access through proprietary applications



The iWorks '08 way



iWorks: $79 from Apple

As far as I can tell, iWorks '08 applications Pages and Keynotes opened the .docx and .pptx files I had created without any problems.

And the result was as good looking as the original files. Very impressive.

When I tried to open the .xlsx file, Numbers was considered as the default application (even the converter was not listed) but it was unable to open it correctly. I'll need to have a "genuine" .xlsx file to test Numbers' capacities.

The problem with iWorks it that it cannot save a file to the new format. It can save it to the iWords default format or to the old Microsoft format, along with a few other more classical formats.


The Microsoft way


Microsoft Office 2008: $399.95 retail, $284.99 online, $239.95 retail upgrade version, $194.99 online upgrade version from Microsoft.
(The prices given correspond to the cheapest available version for professionals, the "Home & Student" package is not available for commercial activity.)

The Mac equivalent of Office 2007, Office 2008, has been available for a few months already. Office 2008 is the quickest way to access the new file format in a relatively smooth and painless way.

If you don't want to acquire Office 2008, you can download Microsoft's "Open XML File Format Converter for Mac". The application is available from here. It is at the bottom of the page, if the URL has not changed...

The converter requires OSX 10.4.8 or later. Microsoft also says that to view the files, you need either Microsoft Office 2004 11.3.4 or later, or Microsoft Office v.X 10.1.9 or later.

If you also install "Microsoft Office 2004 for Mac 11.5.0 Update" (description available here: http://support.microsoft.com/kb/953824) you'll also enable "Office 2004 for Mac to read and to write Office documents that are in Open XML Format".

The StarOffice 9 (beta) way


StarOffice: $69.95 (StarOffice 8 price, 9 is still beta), from Sun Microsystems.

StarOffice 9 beta is available from here:
http://www.sun.com/software/staroffice/get_beta.jsp

It should work pretty much as OpenOffice.org 3.0 beta. See below.

System wide support on Leopard (OSX 10.5)


Leopard: $129, from Apple.

If you don't (plan to) own any recent version of Office for Mac what can you do ?

Leopard users have the free option of using the new TextEdit. It can open and save the new file format.

OOXML support is system wide, which means that the Finder and other applications will also give you a "quicklook" of such files. Although not all files are equal under Quicklook. Some are displayed properly, some are displayed as a white icon and no contents is shown... The test .pptx worked, the .docx and xlsx did not.

So, support is not extremely good and I would not rely on it to check the translatable contents of a client file...

Access through free applications



OpenOffice.org and NeoOffice anyone?



Users on Panther (10.3) and above can use NeoOffice 2.2. NeoOffice is a sister application of OpenOffice.org.

The current available version of the standard OpenOffice.org (2.4) does not include OOXML support but NeoOffice includes special goodies, like OOXML support, that are found in Novell's version of OpenOffice.org, which is, sadly, not available for the Mac...

As of May 7th, the beta version of OpenOffice.org 3.0 is available. This version does include support for OOXML.

As written above, I used NeoOffice to create Office 2007/OOXML files with various degrees of success in terms of interoperability. I am pretty sure NeoOffice could open relatively complex files since the files I fed it for OOXML output were fairly complex, although I'd need to test that.

As text ?



An extreme way to access the contents of such files it to handle them as if they were zipped, unzip them and find the document.xml located somewhere in the folder hierarchy that appears (it would be under /word/ for a Word document). This file is standard XML and can be opened in any text editor.

To properly access the contents of the file, you'd need to use Okapi's Tikal utility, available for the Mono (free) running environment. Tikal should be able to extract the contents of the XML into an XLIFF file that you can later load into a translation tool...


Translation



Once you have access to the file, you can translate it by overwriting it in the application of your choice. Saving the resulting file to .docx will produce results that vary with the application you used. A best bet would be to save the result to .rtf for delivery.

OmegaT and other Java based applications



If you want to use a translation memory tool, the few I know that directly handle OOXML are OmegaT, Swordfish, the newborn from Maxprograms, and Heartsome's Translation Suite.

Appletrans



If you have converted the file to .rtf or HTML before translation, AppleTrans should be able to handle it directly.

Okapi's Tikal for conversion to XLIFF



Or, as written above, you can use Okapi's Tikal command line utility to convert its contents to XLIFF and translate it in any of the above mentioned applications.

Wordfast



The Microsoft converter opens the file in Word in the RTF format and you can then use WordFast to translate it directly (from within Word 2004 / Word v.X).

OpenLanguageTools



With hacks, you can also translate the document.xml file in OpenLanguageTools.

Have I forgotten your favorite tool ?