Tuesday, July 6, 2010

Using AppleScript to Extract Hyperlinks from a Word Document

I was working with a Word document that contained a lot of hyperlinks, and I wanted a table of URLs, much like a list of figures or a list of tables. AppleScript to the rescue! This script creates a new two-column table (link text and URL) at the insertion point. It skips duplicate entries, treating any links that share the same display text as duplicates, and uses the table sort command to sort the results.


tell application "Microsoft Word"
    activate
    -- Display texts already added to the table, used to skip duplicates
    set allTexts to {}
    set allLinks to every hyperlink object of active document

    -- Create the table with a single header row
    set theTable to make new table at active document ¬
        with properties {number of rows:1, number of columns:2}
    insert text "Link" at text object of cell 1 of row 1 of theTable
    insert text "URL" at text object of cell 2 of row 1 of theTable

    -- Add one row per hyperlink whose display text has not been seen yet
    repeat with theLink in allLinks
        set theText to text to display of theLink
        if allTexts does not contain theText then
            make new row at end of theTable with properties ¬
                {allow break across pages:false}
            insert text (theText) at text object of cell 1 ¬
                of last row of theTable
            insert text (hyperlink address of theLink) ¬
                at text object of cell 2 of last row of theTable
            set allTexts to allTexts & {theText}
        end if
    end repeat

    -- Sort by the link text column, leaving the header row in place
    table sort theTable field number 1 with exclude header
end tell
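
The script above treats two links as duplicates whenever their display text matches, so links with the same text but different URLs collapse into one row. If that matters for your document, one possible variation is to compare the text and the URL together. The handler below is only a sketch of that idea, not part of the original script: the name uniqueLinks and the {text, URL} pair format are made up for illustration, and it runs on plain lists so you can try it in Script Editor without Word.

-- Hypothetical helper: keeps the first occurrence of each text/URL combination.
-- linkPairs is assumed to be a list of {linkText, linkURL} pairs.
on uniqueLinks(linkPairs)
    set seenKeys to {}
    set uniquePairs to {}
    repeat with aPair in linkPairs
        set {linkText, linkURL} to contents of aPair
        -- Join text and URL with a tab so both take part in the comparison
        set pairKey to linkText & tab & linkURL
        if seenKeys does not contain pairKey then
            set end of seenKeys to pairKey
            set end of uniquePairs to {linkText, linkURL}
        end if
    end repeat
    return uniquePairs
end uniqueLinks

-- Example: the exact repeat is dropped, the second URL is kept
uniqueLinks({{"Apple", "http://www.apple.com/"}, ¬
    {"Apple", "http://www.apple.com/"}, ¬
    {"Apple", "http://store.apple.com/"}})

Keep in mind that AppleScript compares strings without regard to case by default, so "Apple" and "apple" count as the same text here unless the comparison is wrapped in a considering case block.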