emacs regex with emacs lisp

June 28 update: using \#1 instead of (string-to-number \1)

A reader on Reddit mentioned that the manual also has the "\#d" construct, which replaces the often-used (string-to-number \d) call.

That regex-replace improvement was mentioned in Steve Yegge's Emacs 22 introduction, back in 2006.

Last but not least, I noticed that the whole post was originally written with (string-as-number ...) when the correct function name is (string-to-number ...).
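With that construct, the replacement string used further down can be shortened to fn:\,(+ 21 \#1). The sketch below only illustrates the difference between the two forms: \1 stands for the matched text as a string, and \#1 for the same text already converted to a number. The helper name and the sample text are made up for illustration.

```elisp
;; Hypothetical helper, for illustration only.
(defun my-match-as-string-and-number (text)
  "Return the first fn:N number in TEXT as a (STRING NUMBER) list."
  (with-temp-buffer
    (insert text)
    (goto-char (point-min))
    (when (re-search-forward "fn:\\([[:digit:]]+\\)" nil t)
      ;; The first element is what \1 gives, the second what \#1 gives.
      (list (match-string 1)
            (string-to-number (match-string 1))))))

(my-match-as-string-and-number "Some text.[fn:12]")
;; evaluates to ("12" 12)
```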


Not strictly related to translation but here is what's happening...

I resumed my studies last year, trying to finish an MA in Japan Studies that I started 25 years ago.

For the first year, I only have to write a 30-page-ish dissertation on my subject (the representation of women in kendo magazines in Japan), and I decided to go the emacs + org-mode way, with the easy export-to-ODF function that's packaged with the thing.

So I decided to write each chapter in a different org file, and send them one by one to my director. But then, for the final delivery I needed to put all that in one big file and was faced with the fact that all my footnotes would need to be re-indexed manually because each file had notes starting at 1...

I usually use BBEdit for any serious regex work, mostly because the interface is clearer than emacs and the regexp syntax feels more modern (\d vs [[:digit:]]).

But one thing you can't do in BBEdit is to send commands to the replace string. For example, in my case, adding 21 to the matching number seems pretty trivial when you think of it, but doing it involves other technologies, like perl or some other command-line tool.

In emacs, however, everything can be interpreted as an expression, hence you can insert code wherever you want and get the result from that code right in the document.

The org-mode footnotes all look like [fn:12], where "12" is the note number that I need to replace with an incremented number. Since there are no instances of fn:\d+ without the brackets that are not footnotes, I figured I could just be searching for that string:

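fn:\([[:digit:]]+\)

Notice that in emacs, "(" and ")" need to be escaped to form a group, and that a class like [:digit:] only works inside a bracket expression, hence the double brackets. I could also have used [0-9].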
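Before running the replace, the pattern can be tried out from Lisp. What follows is a minimal sketch, with the character class written inside a bracket expression as [[:digit:]]; the helper name and the sample text are made up for illustration. Note that inside a Lisp string every backslash has to be doubled, which is not the case when the regexp is typed at the replace prompt.

```elisp
;; Hypothetical helper, for illustration only.
(defun my-footnote-digits (text)
  "Return the digit strings of all fn:N markers in TEXT, in order."
  (let ((start 0)
        (found '()))
    ;; `string-match' returns nil once there is no further match.
    (while (string-match "fn:\\([[:digit:]]+\\)" text start)
      (push (match-string 1 text) found)
      (setq start (match-end 0)))
    (nreverse found)))

(my-footnote-digits "See the note.[fn:12] And another.[fn:3]")
;; evaluates to ("12" "3")
```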

In BBEdit I'd just need:

fn:(\d+)

And now I need to replace that with the expression that will add 21 to the number.

In BBEdit, I'd be stuck here. I just can't add anything to a match. In emacs, I can replace the match with this:

fn:\,(+ 21 (string-to-number \1))

The emacs lisp expression is "(+ 21 (string-to-number \1))", which means "convert the \1 match that is a string into its numerical value and add 21 to it".

But wasn't \1 supposed to match [0-9]+, which is a number? Well, yes, but really what gets matched is just a sequence of digit characters, hence a string, which has no numerical value whatsoever. So first, we need the expression to convert it to a numerical value before adding 21 to it.

Now, the trick is to have the expression be handled as an operation and not as an arbitrary string, and that's where the "\," prefix comes into play.

"\," tells the replace engine that the string that follows must be interpreted as an emacs lisp expression and not as a mere string. With it, the regexp replace properly adds 21 to my note numbers, and I get two dozen footnotes updated in one fell swoop...
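For comparison, here is how the same renumbering might look as a small Lisp function rather than an interactive replace. This is only a sketch: the function name and the offset parameter are made up for illustration, and it relies on replace-regexp-in-string accepting a function as the replacement, since the "\," shorthand is, as far as I know, only understood at the interactive replace prompt.

```elisp
;; Hypothetical helper, for illustration only.
(defun my-shift-footnotes (text offset)
  "Return TEXT with every fn:N marker renumbered to fn:(N + OFFSET)."
  (replace-regexp-in-string
   "fn:\\([[:digit:]]+\\)"
   (lambda (match)
     ;; The match data is relative to MATCH when this function is
     ;; called, so group 1 is read from MATCH itself.
     (format "fn:%d" (+ offset (string-to-number (match-string 1 match)))))
   text
   t))  ; FIXEDCASE: leave the case of the replacement alone

(my-shift-footnotes "First.[fn:1] Second.[fn:2]" 21)
;; evaluates to "First.[fn:22] Second.[fn:23]"
```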

I love BBEdit and its people, but emacs is really a gift that keeps on giving.


Here are some handy references:

the emacs replace-regexp command: Regexp Replacement
the emacs regexp syntax: Syntax of Regular Expressions
the emacs-lisp string-to-number function: Conversion of Characters and Strings

And here is a really super short introduction to lisp syntax

There is no real need to have a very deep understanding of emacs lisp to use this replace-regexp command. Just remember that a lisp expression generally looks like this:

(operator operands)

Where the operator is generally a function, like + or string-to-number above, and the operands can be any expression that is accepted by the operator. So, here:

(+ 21 (string-to-number \1))

means:

add 21 to the result of the expression (string-to-number \1)

with (string-to-number \1) meaning:

convert the string matched by \1 to its numeric value

Obviously, if \1 is not a string, the conversion will fail and the addition won't work. And without that conversion, if we had just added \1 as a string, the addition that expects numbers as operands would have failed.
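The evaluation order and the failure case can be checked directly in emacs, for instance in the *scratch* buffer. This is a small sketch with literal values standing in for \1; the helper name is made up for illustration.

```elisp
;; Inner expression first: "12" becomes the number 12, then 21 is added.
(+ 21 (string-to-number "12"))          ; evaluates to 33

;; Hypothetical helper, for illustration only: try the addition
;; without the conversion and report what happens.
(defun my-add-without-conversion (digits)
  "Try (+ 21 DIGITS) and return the sum, or the name of the error."
  (condition-case err
      (+ 21 digits)
    (wrong-type-argument (car err))))

(my-add-without-conversion "12")        ; evaluates to wrong-type-argument
(my-add-without-conversion 12)          ; evaluates to 33
```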

I just realized that this is my first emacs lisp related post ever! I'd like to thank that person I met in Tokyo about 15 years ago who showed me the way. It's an egg that definitely took some time to hatch...
