Posting del.icio.us Links to WordPress: Finishing Up

On Wednesday, I posted more info about how to clean up my weekly del.icio.us links. There are a few things I’d like to do before I wrap this up.

  1. change all tags and attributes to lowercase
  2. close every dt element
  3. close every dd element
  4. make things a bit more automatic

If we take a closer look at the code for each entry we will see a pattern.


One line has a <DT> followed by the anchor. The next line has a <DD> followed by my comments.

<DT><A HREF="url" LAST_VISIT="1238086010" ADD_DATE="1238086010" TAGS="tagone,tagtwo">Link text</A>
<DD>comments

The only thing that makes this tricky at all is that the comments sometimes span more than one line. We can get around this fairly easily, though. All we need to do is put the closing </dd> before every <DT> tag except the first one. Let’s make that easier by changing the first one to lowercase, so a replacement that matches only the uppercase <DT> skips it. We’ll change part of what we did yesterday to accomplish this. Instead of replacing

<DL><p>

with

<dl>

we will replace

<DL><p><DT><A HREF=

with

<dl><dt><a href=
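To see why lowercasing the first tag helps, here is a minimal sketch run on a made-up two-entry sample (the URLs, link text, and comments are placeholders, not real export data). Once the first <DT> is lowercase, the second substitution can only match the later entries, so the </dd> lands in front of the next entry even when a comment spans several lines.

```shell
# Hypothetical sample: two entries, the first with a two-line comment.
sample='<DL><p><DT><A HREF="http://example.com/one">One</A>
<DD>first comment line
second comment line
<DT><A HREF="http://example.com/two">Two</A>
<DD>another comment'

result=$(printf '%s\n' "$sample" | awk '
  { sub(/<DL><p><DT><A HREF=/, "<dl><dt><a href=") }  # first entry only
  { sub(/<DT><A HREF=/, "</dd><dt><a href=") }        # every later entry
  { print }')

printf '%s\n' "$result"
```

Only the line for the second entry picks up a leading </dd>. The first line starts with <dl><dt> and is left alone by the second substitution.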

The rest of the cleanup is pretty straightforward.


Replace

<DT><A HREF="

with

</dd><dt><a href="

and

</A>

with

</a></dt>

and then

LAST_VISIT=[^<]*TAGS="

with

tags="

since I don’t need two of those attributes anyway.

And I almost forgot: replace

<DD>

with

<dd>
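Here is a small sketch of those replacements applied to the sample entry line from earlier in this post, just to check the patterns. The variable names `line` and `cleaned` are mine, for illustration only.

```shell
# The sample <DT> line from earlier in the post.
line='<DT><A HREF="url" LAST_VISIT="1238086010" ADD_DATE="1238086010" TAGS="tagone,tagtwo">Link text</A>'

cleaned=$(printf '%s\n' "$line" | awk '
  { sub(/<DT><A HREF="/, "</dd><dt><a href=\"") }   # close the previous dd, lowercase the tags
  { sub(/<\/A>/, "</a></dt>") }                     # close the anchor and the dt
  { sub(/LAST_VISIT=[^<]*TAGS="/, "tags=\"") }      # drop LAST_VISIT and ADD_DATE
  { print }')

printf '%s\n' "$cleaned"
# prints: </dd><dt><a href="url" tags="tagone,tagtwo">Link text</a></dt>
```

The `[^<]*` swallows everything between LAST_VISIT= and TAGS=" in one go, which works because no < appears inside the opening tag.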

Wrap it all up, adding a final </dd> by hand since no <DT> follows the last entry, and we have

grep '^> ' < links.diff |awk '{sub(/<DL><p><DT><A HREF=/,"<dl><dt><a href=")};{sub(/<\/A>/,"</a></dt>")};{sub(/<DT><A HREF=/,"</dd><dt><a href=")};{sub(/<DD>/,"<dd>")};{sub(/LAST_VISIT[^<]*TAGS=/,"tags=")};{sub(/^> /, "")};!/<\/DL>/{print}' > foo.html;echo "</dd></dl>" >> foo.html

All we need now is to make the whole process more automatic. Since we have to add that line break in the old export file we can change things up once again to do that automatically. And since we will probably want to save this as a shell script, we can go ahead and make it more readable. I changed a couple of things I didn’t detail here and this is what I ended up with:

First I generalize a bit so I can change things later if I want to

diff $OLDLINKS $NEWLINKS | grep '^> ' | awk '
  {sub(/<\/A>/,"</a></dt>")}
  {sub(/<DL><p><DT><A HREF=/,"<dl><dt><a href=")}
  {sub(/<DT><A HREF=/,"</dd><dt><a href=")}
  {sub(/<DD>/,"<dd>")}
  {sub(/LAST_VISIT[^<]*TAGS=/,"tags=")}
  {sub(/^> /, "")}
  !/<\/DL>/{print}
' > $MYLINKS
# the last <dd> has no <DT> after it, so close it by hand
echo "</dd></dl>" >> $MYLINKS

then decide on path names, which have to be set before that pipeline runs (I like to let Firefox save to Downloads automatically, and I’m going to delete the new links file anyway, so I set the pathname accordingly.)

export LINKSDIR=$HOME/Documents/Personal/blogging
export OLDLINKS=$LINKSDIR/old-delicious.htm
export NEWLINKS=$HOME/Downloads/delicious-`date "+%Y%m%d"`.htm
export MYLINKS=$LINKSDIR/mylinks.html
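Since NEWLINKS is built from today’s date, the script only finds an export that was downloaded the same day. A small guard can make that failure obvious. This is a sketch of one way to do it, not part of the script I posted, and `require_file` is a hypothetical helper name.

```shell
# Hypothetical helper, not in the original script:
# complain and return non-zero if the given file does not exist.
require_file() {
  if [ ! -f "$1" ]; then
    echo "preplinks: cannot find $1" >&2
    return 1
  fi
}

# Intended use inside the script, right after the exports:
#   require_file "$OLDLINKS" || exit 1
#   require_file "$NEWLINKS" || exit 1
```

With that in place, running the script on a day with no fresh export stops before diff runs or anything is overwritten.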

then we make our new links file the old one for next week. We should also add that line break while we’re at it (and remove the new links file)

awk '{sub(/<DL><p>/,"<dl>\n")};{print}' < $NEWLINKS > $OLDLINKS
rm $NEWLINKS
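As a quick sanity check of what that awk line does to the top of the export, here is a sketch on a made-up first line (the URL and link text are placeholders). The \n in the replacement splits the line so that <dl> sits on a line of its own.

```shell
# Hypothetical first line of an export file.
first_line='<DL><p><DT><A HREF="http://example.com/">Example</A>'

split=$(printf '%s\n' "$first_line" | awk '{sub(/<DL><p>/,"<dl>\n")};{print}')

printf '%s\n' "$split"
```

The output is two lines: <dl> by itself, then the untouched <DT> entry, so next week’s diff can match that entry on its own.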

and I like to go ahead and open my links file so I can make any quick edits and then post

mate $MYLINKS

I save it as preplinks, then put it in my PATH and make it executable

sudo mv preplinks /usr/bin/
sudo chmod 755 /usr/bin/preplinks

You can grab the script here and do the same.

Now every week I go to del.icio.us, export my bookmarks as HTML, and then run

preplinks

and TextMate launches with my HTML all ready to be checked and posted.

Works for me.

This is the last in a series of posts. The first two posts are here and here.
