How to support this blog?

To support this blog, you can hire me as an OmegaT consultant/trainer, or you can send translation and project management jobs my way.

Oracle JDK7 for OSX

Remember when Apple said they would no longer maintain Java? That was just 12 months ago:

Java is dead! Long live Java? (on this blog)

I just noticed that Oracle released a new preview edition of JDK7 for OSX yesterday (b215).

http://jdk7.java.net/macportpreview/

I installed it and, after changing my Java preferences (search for "Java Preferences" in Spotlight), I tried the preview version of OmegaT.

The result?


54443: Info: OmegaT-2.5.0_1 (Thu Oct 27 15:05:55 JST 2011) Locale en_US
54443: Info: Java: Oracle Corporation ver. 1.7.0-ea, executed from '/Library/Java/JavaVirtualMachines/JDK 1.7.0 Developer Preview.jdk/Contents/Home/jre' (LOG_STARTUP_INFO)


It works!

No crash yet so I think I'll finish the current job with that new version of Java.

Be extra cautious though when you use preview versions of software. A bug can bite you in the middle of a job...

Update (a few hours later)
It works, but there are a few issues that make it not practical to work with this preview right now. I've reverted to Java 1.6 until Oracle delivers something closer to a release candidate :-)

Detailed information is here:

http://wikis.sun.com/display/OpenJDK/Mac+OS+X+Port+Project+Status

Update (a few days later)
Apple has just released a new update for Java 1.6 for Snow Leopard and Lion. Check Software Update. The JDK7 port also has a new build: b217.

New fun to come with OmegaT 2.5...

OmegaT 2.5, the preview version that you can get from:

https://sourceforge.net/projects/omegat/files/OmegaT%20-%20Latest/

includes a really nice new feature that, unfortunately, is not yet available for Mac users...

(Nov. 10 update: the latest version of the plugin works fine on Mac now)


If it is not available, you can rightly ask why I bother mentioning it here at all. Well, the answer is simple. It is kind of available, but because of a user interface design issue, the buttons that make it run are not available on Mac... This is going to be fixed real soon. In the meantime, get ready for...

A scripting interface to OmegaT's internals.

See the announcement here:

http://tech.groups.yahoo.com/group/OmegaT/message/22988

People who know what they are doing can already check this Java documentation page:

Scripting for the Java Platform

According to the scripting plugin source code, the possible languages for use in OmegaT are:

  • JavaScript
  • Jacl
  • NetRexx
  • Java
  • BML
  • VBScript
  • JScript
  • PerlScript
  • Perl
  • JPython
  • Jython
  • LotusScript
  • XSLT
  • Pnuts
  • BeanBasic
  • BeanShell
  • Ruby
  • JudoScript
  • Groovy
  • ObjectScript
  • Prolog
  • Rexx


There are already plenty of exchanges on the OmegaT mailing list regarding the scripting extension. Check this thread for example:

http://tech.groups.yahoo.com/group/OmegaT/message/23260

We'll have an announcement here when the feature works on Mac...

Dennis Ritchie and John McCarthy too...

A few days after Jobs, Dennis Ritchie and John McCarthy passed away too, but that did not trigger international interest.

Dennis Ritchie is called the "father of C", C as in "C language". Everyone who's done a little bit of programming knows about the importance of C in the computing world.

http://en.wikipedia.org/wiki/C_(programming_language)

A few days after Ritchie, John McCarthy, the "father of Lisp", passed away too. Lisp is the language that was mostly used for artificial intelligence work "back then".

http://en.wikipedia.org/wiki/Lisp_(programming_language)

Lisp is 11 years older than C. Lisp was born in 1958 and C in 1969. But both languages are still commonly used in computing today...

Of course, both languages can be used on Macs. If you install the developer tools that come on your DVD, you have access to a C compiler. Lisp, being a family of languages, requires you to make a few choices (either get an ANSI standardized Lisp, or a Scheme, or a new Lisp like Clojure, which runs on Java, etc.)

As for introductory books, "Land of Lisp" by Conrad Barski, M.D., from No Starch Press has been very well reviewed.

http://nostarch.com/lisp.htm

"Practical Common Lisp" by Peter Seibel from Apress is really nice too and sparked a renewed interest in the language. Plus, the PDF is freely available.

http://www.gigamonkeys.com/book

As for C, well, there are so many books about C programming that the only one I can think of is Kernighan and Ritchie's "The C Programming Language, Second Edition" from Prentice Hall.

http://cm.bell-labs.com/cm/cs/cbook/

Objective-C is a strict superset of C and is mostly known for being the language behind OSX applications.

A good introduction I found is "Programming in Objective-C" by Stephen G. Kochan, from Pearson Education.

http://www.pearsonhighered.com/educator/product/Programming-in-ObjectiveC-3E/9780321711397.page

Of course, you can find plenty of free tutorials that can get you started in both languages.


Programming is fun and if it is not already the case, you should really give it a try.

Steve Jobs passed away

It's going to be analyzed all over the world. Daring Fireball linked to Jobs's Commencement Address in 2005. Here it is:

'You've got to find what you love,' Jobs says

Virtaal running on Mac! Part II

After the previous post, it appeared that Virtaal has problems with my configuration. Some testing and a few mails later here is a new announcement from the Virtaal team:



From: Dwayne Bailey
Date: 19 May 2011 08:40:17 UTC+09:00
To: translate-devel@lists.sourceforge.net
Subject: [Translate-devel] Mac builds for Virtaal 0.7.0 rc1


You can get Mac OS X builds for Virtaal 0.7.0 rc1 here:
http://translate.sourceforge.net/snapshots/virtaal-0.7.0-rc1/Virtaal-Intel-0.7.0-rc1-1.dmg

Fixes since the beta 5 for Mac:
* The build uses its own Python (should solve your problem JC)
* Spell checking works - like Windows builds we download the spell checkers
* TM server is now running

Issues:
* We get a solid hang with some keyboard shortcuts e.g. Ctrl-W to close
the translation file.  Navigation seems to work
* The installer is massive 43M, we'll put it on diet when we've got
stable builds
* The keyboard shortcuts are still mapped to Linux/Windows and haven't
been remapped to Mac
* Doesn't seem to work at all on 10.5 (Leopard) - well it does start but
you can't open anything
* Pango is still messing up Arabic


Basically, Virtaal launches properly now. Welcome to the world of Mac Virtaal!

(Update: rc1-1 still had problems but rc1-2 worked fine. Check the snapshots located here to get the latest file: http://translate.sourceforge.net/snapshots/)

Virtaal running on Mac!

From Dwayne Bailey, on the Translate Toolkit development list:


Hi Virtaalers,

I've just got Virtaal running on a Mac, in a bundle and in a disk image.

It needs a testers love so please head over here and get your disk image
http://translate.sourceforge.net/snapshots/virtaal-0.7.0-beta5/Virtaal-Intel-0.7.0-beta5.dmg

I'd love to hear your feedback

To install:
1) Download
2) Click on the disk image
3) Either drag Virtaal to your applications folder or run it directly
from the folder

Testing:
1) Try to translate things, looking around is great but real work brings
out the bugs
2) Report any bugs at bugs.locamotion.org

What works:
* The file menu is integrated like a Mac app
* Translating works
* TM and MT

What doesn't work... yet:
* Spell checking - we'll probably need to use the approach we're doing
in Windows to download
* The menu appears in the application window
* No fancy installer
* Keybindings are not Cmd+ but still Ctrl+ - some don't work like
pasting placeables.

Comments on the blog

Apologies to the people who left comments recently. The notifications were sent to an old address and I just noticed the comments today. I've changed the notification reception address and I'll be better next time!

Introduction to regular expressions

[2018 update]

TextWrangler has now been replaced by BBEdit, which offers a free trial and then turns into a BBEdit "free mode". BBEdit "free mode" is a better TextWrangler and also happens to be a 64-bit application compatible with High Sierra.
https://www.barebones.com/products/bbedit/comparison.html

I mention Aquamacs at the end of the article. Regular expressions for Aquamacs but also for Emacs are documented in the Emacs manual available here:
https://www.gnu.org/software/emacs/manual/html_node/emacs/Regexps.html

Check this article too:
https://www.johndcook.com/blog/2018/01/27/emacs-features-that-use-regular-expressions/

In OmegaT, you lose a lot if you're not familiar with regular expressions. OmegaT uses the Java flavor of regular expressions:
https://omegat.sourceforge.io/manual-standard/en/appendix.regexp.html

OmegaT allows you to create "tags" that can be protected and checked during the translation by using regular expressions. In a recent project I created an 872-character-long regular expression that described 71 different tags.

As of February 2018, OmegaT also allows for replacements with capture groups as described below:
https://sourceforge.net/p/omegat/feature-requests/953/

And you can also use regular expressions to search for empty segments, as I document in this article:
https://mac4translators.blogspot.com/2018/03/searching-for-empty-translations-in.html




The technology that had the most impact on my workflow is definitely "regular expressions".

I discovered them at the end of the '90s when I was working on the conversion of a database output to a set of about 6000 static HTML pages. At the time, the editor of choice on the Mac was BBEdit from Bare Bones Software, but its free "lite" version, "BBEdit Lite", was also immensely popular. BBEdit Lite has now been replaced by TextWrangler and, just like its predecessor, TextWrangler can be used without paying a user license fee*.



What are regular expressions?


Regular expressions are a "search" function on steroids. Regular expressions were created to find patterns in strings. They can find simple patterns like the word "pattern" in this text, or more complex patterns like "a string that starts with 'pa', followed by a letter that's repeated twice, followed by any three characters, none of which is a space, '@' or '^', and followed by a space".

This document uses its first two paragraphs (the paragraphs in italics, above) as a test ground. Paste those paragraphs into your favorite text editor that supports regular expressions (I use TextWrangler for all the descriptions, so you might want to use it too), then open the search window, check the "grep" box at its bottom and search for:

re[^ ]*

You should see colors appearing while you type the search terms.

What that expression means is:
r followed by e followed by a group of characters that are not a space, or by nothing.

Hit Next and see what you get, then hit Cmd+G and see what you get. If you start from the top of the paragraph, you should have 8 "matches".


Normal characters


Most characters represent themselves in regular expressions (regex), like a "normal" search.

r means r and e means e, " " means a space. In the same sequence. No magic here.


Special effects


Some characters have special effects:

→ [ starts a group of characters
→ ] ends that group
→ ^ means "not" (when it comes first inside a group)
→ * means "zero or more of what just came"
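
If you'd rather see those four characters at work outside a text editor, here is a minimal sketch in Python. It assumes that Python's re module treats this simple pattern the same way TextWrangler's grep search does, and the sample sentence is invented for the sketch (it is not the test paragraph used in this article).

```python
import re

# Sample sentence invented for this sketch (not the article's test paragraph).
sample = "Regular expressions are a search function; they find recurring patterns."

# r, then e, then zero or more characters that are not a space.
pattern = re.compile(r"re[^ ]*")

matches = pattern.findall(sample)
print(matches)  # ['ressions', 're', 'recurring']
```

Note that "Regular" does not match because the search is case sensitive, and that the bare "re" at the end of "are" does match because * accepts zero characters.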

So, our simple regular expression above, re[^ ]*, means:

"look for any string that has an r followed by an e followed by zero or more characters that are not a space."

Now, what if you need to find characters like ^, [, ] or *?


Cancelling special effects


When you want to find characters that have a special effect without "triggering" that special effect, you put a "\" in front of them:

\* means the character *
\[ means the character [


And since the character "\" has the special effect of removing the special effect of a character that has a special effect... then:

→ \\ means the character \

etc.

By the way, the character . has the special effect of matching "any one character" so if you're looking for a period, then you really want to look for the \. string...


Examples:


The regular expression ". " (. followed by space) will match any one character followed by a space. There are 78 strings that match this pattern in the paragraph.

The regular expression "\. " (\ followed by . followed by space) will match any period followed by a space. There are only 2 strings that match this pattern in the paragraph.

The regular expression "re*." (r followed by e followed by * followed by .) will match any string that is composed of an r followed by zero or more e, followed by any one character. There are 22 matches in the paragraph. Verify that you understand them all.

The regular expression ".e\*\." (. followed by e followed by \ followed by * followed by \ followed by .) will match the 4 characters string ee*. that you find at the end of the paragraph.
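
The difference between a special character and its escaped version is easy to check in code. This is only a sketch, again assuming Python's re module behaves like the editor for these patterns; the sample string is made up, so the counts below are for that string and not for the test paragraph.

```python
import re

# Made-up sample string for this sketch.
sample = "It works. It really works. 2*3 costs 3.50"

# ". " : any one character followed by a space.
any_then_space = re.findall(r". ", sample)

# "\. " : a real period followed by a space.
period_then_space = re.findall(r"\. ", sample)

# "\*" : a real asterisk, not the "zero or more" operator.
stars = re.findall(r"\*", sample)

print(len(any_then_space), len(period_then_space), stars)  # 7 2 ['*']
```

The period inside "3.50" is not counted by the second search because no space follows it.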


Triggering special effects


Some characters work the other way round: by themselves they do not have a special effect but if you stick the \ character before them, then their special effect is triggered.

t means t but \t means tabulation
r means r but \r means line break (specifically "carriage return")
s means s but \s means all sorts of white space, which includes spaces, tabulations, line breaks, etc.

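If the character does not have a special effect then using \ has no effect (in TextWrangler at least; some other regex flavors, Java's included, reject an unknown escape instead).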

i means i and \i too means i

Such sequences (\ followed by a character) are usually called "escape sequences".
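
Here is a small sketch of those escape sequences in Python, assuming its re module gives \t, \r and \s the meanings described above. The sample string is invented; it is written as a normal Python string so that it really contains a tabulation and a line break, while the patterns are raw strings so that the backslashes reach the regex engine untouched.

```python
import re

# Invented sample: a tab after "one", a space after "two", CR+LF after "three".
sample = "one\ttwo three\r\nfour"

letter_t = re.findall(r"t", sample)     # the plain letter t
tabs = re.findall(r"\t", sample)        # tabulations only
returns = re.findall(r"\r", sample)     # carriage returns only
words = re.split(r"\s+", sample)        # split on any run of white space

print(len(letter_t), len(tabs), len(returns))  # 2 1 1
print(words)  # ['one', 'two', 'three', 'four']
```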


Remembering matches


If you want to "memorize" a match, for later use in the expression or in the "Replace:" field, then you put the corresponding expressions between parentheses:

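→ (re)[^ ]+ will produce almost the same matches as above (+ means "one or more of what just came", so a bare "re" followed by a space no longer matches), but will memorize the re part and not the rest.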

→ re([^ ]+) will produce the same matches as above, but will not memorize the re part and instead will memorize the rest.

→ (re)([^ ]+) will produce the same matches as above and will memorize the 2 parts separately.
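
The three variants can be compared side by side in a short Python sketch. It assumes Python's re module numbers the memorized parts the same way the editor does; the sample text is invented for the occasion.

```python
import re

# Invented sample text for this sketch.
text = "a regular expression"

first_only = re.search(r"(re)[^ ]+", text)    # memorizes "re" only
rest_only = re.search(r"re([^ ]+)", text)     # memorizes what follows "re"
both = re.search(r"(re)([^ ]+)", text)        # memorizes the two parts separately

print(first_only.group(0), first_only.group(1))  # regular re
print(rest_only.group(0), rest_only.group(1))    # regular gular
print(both.groups())                             # ('re', 'gular')
```

In all three cases the whole match (group 0) is the same word; only what gets memorized changes.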


Using memorized matches


Now that the matches are remembered, you can use them. Use \1 to refer to the first memorized string, \2 to refer to the second memorized string, etc.

→ (e)\1\*\. will match the "ee*." string that you find at the end of the text.

→ search for (re)([^ ]+) and put \2\1 in the Replace: field:

(re) is the first group
([^ ]+) is the second group

\2\1 will thus put the second group before the first group.

The term "regular" matches the pattern: (re) matches re and ([^ ]+) matches gular. The replaced string will thus be "gularre".

→ search for (re)([^ ]+) and put \1\1_\[\2\] in the Replace: field:


(re) is the first group
([^ ]+) is the second group

\1\1_\[\2\] will put 2 instances of the first group, then an underscore, then [, then the second group, then ].

In the case of "regular", we'd have the following replacement string:
rere_[gular]
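
The same searches and replacements can be tried in Python. This is a sketch under one assumption worth stating: Python's re.sub plays the role of the "Replace:" field, and its replacement syntax does not want [ and ] escaped, so the second replacement is written \1\1_[\2] here instead of \1\1_\[\2\].

```python
import re

# \1 inside the search pattern: an e followed by the same thing again, then "*."
tail = re.search(r"(e)\1\*\.", "It ends with ee*.")
print(tail.group(0))  # ee*.

# \2\1 in the replacement: second group first, then first group.
swapped = re.sub(r"(re)([^ ]+)", r"\2\1", "regular")
print(swapped)  # gularre

# Two copies of group 1, an underscore, then group 2 between brackets.
decorated = re.sub(r"(re)([^ ]+)", r"\1\1_[\2]", "regular")
print(decorated)  # rere_[gular]
```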


That's only the beginning...


What you need to check now is the special effects of some characters. If you use TextWrangler, it is all in the user manual, page 133, Chapter 8 (Searching with Grep)**, or you can call the Help with Cmd+? and you'll find a relevant link right away.

TextWrangler's regex is pretty standard, so once you're used to it there, you can use it in other editors too. If what works in TextWrangler does not work in another editor, check the idiosyncrasies of that editor.

Now, take a real-world document and try to transform it by using a few regular expressions. A typical use case for a translator would be to convert a TMX file into a 2-column tab-separated data set, or the opposite: to convert a 2-column tab-separated data set into a TMX file. If you manage to do that, you've created your first alignment-based TMX converter!
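
To give an idea of what the TMX direction could look like, here is a rough sketch in Python. Everything in it is an assumption made for illustration: the tiny TMX fragment is hand-written, each unit is supposed to hold exactly two plain-text segments, and the pattern uses two things not covered above, the "lazy" .*? (as few characters as possible) and an option that lets . match line breaks. A real TMX file has a header, attributes and inline tags, so a proper XML parser is the safer tool there.

```python
import re

# Tiny hand-written TMX fragment, invented for this sketch.
tmx = """<tu>
  <tuv xml:lang="en"><seg>Hello</seg></tuv>
  <tuv xml:lang="fr"><seg>Bonjour</seg></tuv>
</tu>
<tu>
  <tuv xml:lang="en"><seg>Thank you</seg></tuv>
  <tuv xml:lang="fr"><seg>Merci</seg></tuv>
</tu>"""

# For each <tu>: memorize the text of its two <seg> elements.
# [^<]* keeps a memorized part inside one segment;
# .*? skips the markup in between, and re.DOTALL lets . cross line breaks.
pair = re.compile(
    r"<tu>.*?<seg>([^<]*)</seg>.*?<seg>([^<]*)</seg>.*?</tu>\s*",
    re.DOTALL,
)

# Replace each unit with "source<TAB>target<line break>".
tsv = pair.sub(r"\1\t\2\n", tmx)
print(tsv)
```

Going the other way is the same idea in reverse: search for two groups separated by \t and put them back between the tags in the replacement.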



* I try to use or discuss free software when possible because I think that is the way to go. People who want to use a free text editor on the Mac can use Aquamacs. It comes with all the goodness of emacs (including the same regular expressions) and looks and feels a lot like a "normal" Mac text editor.
** [2018 Update] In the BBEdit manual, Chapter 8 is on page 165.

When your external disks are not willing to go to bed...

We've had problems with our iMacs' bedtime in the office pretty much since we started using them.

At night, they seem to refuse to fall asleep.

We've tried all the possible settings provided by the OSX interface. But nothing worked. Until I realized that the problems seemed to be related to the external back-up disks attached to them.

We use a LaCie and a Western Digital, both with a 1TB capacity, respectively connected to a first generation 24" Intel iMac and to a before-the-latest generation 27" iMac.

  1. Sleep settings don't work for the hard disks
  2. Manually unmounting the disks seems to solve the issue, but mounting them back is not trivial
  3. OSX seems to unmount the disks when I log out from the Apple menu but when I log back in I have to restart all my applications

I have found a solution that requires the use of the "Fast user switch menu". The settings are as follows:

  1. Go to System Preferences > Accounts
  2. Check "Show fast user switch menu as" and choose the option you prefer
  3. Make sure the "Show the Restart, Sleep, and Shut Down buttons" box is checked

Now, when you want to put the whole system to sleep, the solution is:

  1. Click on the Fast user switch menu (top right of your screen)
  2. Call the "Login Window..."
  3. Press "Sleep"

Et voilà!

My guess is that the sleep option from the login window actually unmounts the hard disks and leaves them inactive until you enter your account again.

As a side note, those sleep problems also had a disturbing side effect: when I simply put the system to sleep the "old" way (either by calling "Sleep" from the Apple menu or by using the automation settings in "Energy Saver") and tried to wake it up in the morning, I'd get what looked like system freezes that sounded (yes, actually "sounded") related to the external hard disk activity. With the new system, I've had only one case of "freeze" in a few weeks, and it still kind of looks related to the external disk behavior.

In any case, this system seems to be allowing our Macs to spend their whole night sleeping and not struggling with hectic external disks... I'll need to check further what is the cause of the disks' inability to sleep, and also what is the cause of my freezes (not happening on the 24"), but that's for another post...

Java is not so dead after all...

The announcement came from Oracle, a few weeks after Apple's declaration (see Java is dead! Long live Java?).

Basically, Oracle will take over all of Apple's Java efforts within the OpenJDK project, and Oracle, not Apple, will release Java for Apple machines.

If you are interested in declarations, check this link: OpenJDK News, Nov. 12.

Before OpenJDK took charge of the OSX version of Java, there were a number of projects that attempted to use the FreeBSD version of OpenJDK to create something that would run under OSX.

Now, OSX has its own development project at OpenJDK and things are slowly progressing, which is a very good thing...

The project goals are:

  • Pass all appropriate certification tests for Java SE 7
  • Include a complete, native Cocoa-based UI Toolkit
  • Provide excellent performance

Which basically translates into: what Apple has been providing us with since the beginning of OSX, but by the people who are behind Java (and hopefully with simultaneous Windows/Linux/OSX releases).

If you want to build Java on OSX, you can. Follow the instructions on this page: Mac OS X Port. I have built Java 1.7 on my machine and it worked well with OmegaT. The only problem is that the UI toolkit still depends on X11, which means that your favorite Java application will not look like a native citizen of OSX. But that will happen!

If you want to see how the project advances and check the discussions (and participate in case you have a problem building the thing for example) check the following links:

The questions that remain unanswered are the following:

  1. Will Java 1.7 on OSX be available at the time OSX 10.7 Lion is released next summer?
  2. If not, will OSX 10.7 Lion include the current Apple-released Java 1.6 by default on all new machines?
  3. Once Java 1.7 is released by Oracle, will Apple include it in OSX bundles or will users have to go fetch it from Oracle's site?

My take is that Java 1.7 will not be finished by the time Lion is released, and for the rest, I have no idea.
I'd like to say "yes" for a default Java 1.6 in Lion and "possible" for inclusion of Oracle's Java in further releases (after all OSX bundles plenty of third party development tools). We'll see.

But the future is definitely much brighter than it was back in October.
