Cleaning Up My del.icio.us Links
On Monday, I posted some info about how I am thinking of posting my weekly links.
Today I want to make one correction to the process, talk details about how to clean up the diff file, and then put together a quick script to do that part automatically. Once again, I am going to do this for the first time as I write this. I will summarize the process below.
First, the correction. After my first use of this method I discovered that one more quick edit to the html export will make the parsing of the diff file much easier. Before I move ~/delicious.htm to ~/delicious-old.htm I need to add a line break just after <DL><p>. It may not seem like much but it makes a big difference.
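That edit can also be made from the command line. Here is a minimal sketch, assuming the first link sits on the same line as <DL><p> (as in the sample line shown in the first post). The function name split_dl is made up for illustration and is not part of the original process.

```shell
# Hypothetical helper: insert a line break right after the opening <DL><p>
# so that the first <DT> entry starts on a line of its own.
# Assumes <DL><p> and the first link share a line in the export.
split_dl() {
    awk '{ sub(/<DL><p>/, "<DL><p>\n") } { print }' "$1"
}
```

Something like split_dl ~/delicious.htm > ~/delicious.tmp && mv ~/delicious.tmp ~/delicious.htm would apply it before the export is renamed.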
As it turns out, cleaning up the diff file is fairly easy to do with grep and awk. Let’s take a look at exactly what I want to do first.
I am only interested in lines that start with > and a space so I start with
grep '^> ' < links.diff
I want to replace the <DL> with <dl> and I don’t need the <p> at all. So now I have
grep '^> ' < links.diff |awk '{sub(/<DL><p>/,"<dl>")};{print}'
Now we get rid of the > and the space at the beginning of each line.
grep '^> ' < links.diff |awk '{sub(/<DL><p>/,"<dl>")};{sub(/^> /, "")};{print}'
Then we skip the last line, the closing </DL>, so it is not printed at all.
grep '^> ' < links.diff |awk '{sub(/<DL><p>/,"<dl>")};{sub(/^> /, "")};!/<\/DL>/{print}'
This gives me everything I need but I still have uppercase tags and attributes, some attributes I don’t really care about, and none of the elements are closed. We can take care of closing the <dl> with a simple echo "</dl>" after it.
echo "</dl>"
So, if we want to save all this to a file we can do this.
grep '^> ' < links.diff |awk '{sub(/<DL><p>/,"<dl>")};{sub(/^> /, "")};!/<\/DL>/{print}' > foo.html;echo "</dl>" >> foo.html
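The one-liner above could be wrapped in a small script so it does not have to be retyped each week. This is only a sketch: the function name clean_links and its two arguments are assumptions for illustration, and the pipeline inside is the same grep and awk shown above.

```shell
#!/bin/sh
# Hypothetical wrapper around the grep/awk pipeline above.
# $1 = diff file to read, $2 = HTML file to write (both names are up to the caller).
clean_links() {
    diff_file=$1
    out_file=$2
    {
        # Keep only the added lines, swap <DL><p> for <dl>,
        # strip the leading "> ", and drop the closing </DL> line.
        grep '^> ' < "$diff_file" |
            awk '{sub(/<DL><p>/,"<dl>")};{sub(/^> /, "")};!/<\/DL>/{print}'
        # Close the list ourselves.
        echo "</dl>"
    } > "$out_file"
}
```

Calling clean_links links.diff foo.html should produce the same foo.html as the command above.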
Now all I need to do is clean up those uppercase letters and close all the other elements. I’ll take a look at that on Friday.
This is the second in a series of posts. The first post is here and the next one is here.
Posting del.icio.us Links Weekly to WordPress
I’ve been using del.icio.us to share links since 2005. I’ve always used another method for bookmarking links for myself, but del.icio.us has been my favorite method for the sharing of interesting links. Before del.icio.us I had a separate linkblog so right away I wanted a way to display my shared links in a similar format. I started out by replacing my linkblog with an html rendering of the RSS feed from del.icio.us. I quickly realized that I wanted more than that so I set the blog back up and used a cron job to auto-post my links to the WP database. I’ve written about all of this before.
After a while, I gave up on the linkblog completely and just used a widget to show the links on my blog. Not quite what I wanted but good enough for a while. Recently I decided to set up the daily blog posting feature that del.icio.us provides. This is a very nice feature but doesn’t work well for me because my links come in waves. So, I turned that off earlier this week and set off in search of a way to post the links as a weekly roundup. I’ve seen other sites do this and I like it a lot.
After a few quick searches, I didn’t find anything I thought was worth spending time fooling with. It seems to me it’s just as easy to come up with something on my own. As a hacker I would prefer something as automatic as possible, but I don’t mind having to do something manually. I will probably want to tweak the weekly posting a touch anyway.
The first thing that came to mind was using the RSS feed but I dismissed that because it will only show a maximum of 100 items. That would probably do for my purposes but I’d like to go ahead and set up something I don’t have to worry about – did I get all the links? etc.
So I decided on a different approach. I haven’t done any of this yet. I am going to work on it while I write this.
Here is the plan:
- export the links as html
- grab out the html I need and paste it into a new post in WP
- post it
Simple, except for a few points.
- How do I know what html I need?
I decided to use a simple diff between this week’s export and last week’s. Seems simple enough.
- I’d like valid xhtml and it doesn’t look like it comes out that way
I should be able to take care of this with some regex.
I actually came up with this idea a few days ago and I grabbed an export then. I checked my blog and found that the latest link posted was the trash vortex page at greenpeace.org so I simply removed all links above that and saved this file as ~/delicious.htm.

Now it’s time to grab the new links for this week, so I go to del.icio.us and export the html and save it to the desktop. Then,
mv ~/delicious.htm ~/delicious-old.htm
mv ~/Desktop/del*.htm ~/delicious.htm
diff ~/delicious-old.htm ~/delicious.htm > links.diff
The only thing to do now is clean it up and post it. Let’s start by doing it manually. I’ve stripped most of the new links out for demonstration. Take a look.

First, I remove the first three lines and the last five lines. I’ve run a few tests now and it looks as though this will always be the case. This should make automation easier. This procedure is obviously going to require a bit of manual intervention so I should be able to notice when a problem crops up.
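Trimming those lines by hand is easy enough, but it can also be sketched in awk. The example below assumes the counts really are always three at the top and five at the bottom, as the tests so far suggest. The function name trim_diff is hypothetical.

```shell
# Hypothetical helper: print a file without its first three and last five lines.
# Lines are buffered in memory, which is fine for a small diff file.
trim_diff() {
    awk 'NR > 3 { line[NR] = $0 }
         END { for (i = 4; i <= NR - 5; i++) print line[i] }' "$1"
}
```

Running trim_diff links.diff would leave only the lines for the new links, still prefixed with > and a space.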
After removing those lines I am left with a bunch of lines like those below.
> <DL><p><DT><A HREF="http://online.wsj.com/article/SB123731266862258869.html" LAST_VISIT="1238172267" ADD_DATE="1238172267" TAGS="fun,economics,games,culture,scrabble,words">Scrabble and Other Games Have Overvalued Points - WSJ.com</A>
> <DD>Scrabble is a great game and should be left alone.
The only thing necessary to make this “work” is to remove the > at the beginning of each line, but we will make it “right” by changing uppercase tags and attributes to lowercase, closing all elements, and wrapping all of it in
<dl></dl>
Then I copy and paste it into a new post in WP and I’m all set. Requires a bit of input but not hard to do. I will see how much I can automate it on Wednesday.
This is the first in a series of posts. The next post is here.
Rent Back Direct
Amazing understanding of WordPress, this guy really knows what he is doing. Excellent work. Thanks very much Bill!!
WordPress as a CMS
I set up a custom WordPress theme to be used as a complete CMS for this website.
WordPress Theme
I finished up a custom WordPress theme for this website.
Betsy Flanagan
Excellent work, he’s very conscientious, responsive and upfront. I highly recommend him and will definitely use him again.
–Betsy Flanagan, StartupStudio.com
Startup Studio
I set up and configured a WordPress-powered site for this client. The work included a custom template based on the client’s original design, several plug-in modifications, and two custom plug-ins. The site is inactive now, but you can see it at http://startupstudio.com/.
WordPress Theme
I do a lot of WordPress work – themes, plugins, core mods, etc.
This client wanted a standards-compliant theme with an AdSense-style look. He provided an image and I built the xhtml/CSS and the WP theme.
I am not sure where you can find his use of it, but there is a slightly modified version here.
WordPress Modifications – selfinvestors.com
Bill dealt with a project that was far more difficult and time consuming than imagined with great professionalism and patience, continuing to work to find solutions. Clearly, Bill is a highly experienced programmer with WordPress expertise. I would use him again.
WordPress Modifications – selfinvestors.com
I wrote a few custom plugins and theme modifications for this website.
