Cleaning Up My del.icio.us Links

On Monday, I posted some info about how I am thinking of posting my weekly links.

Today I want to make one correction to the process, go into detail about how to clean up the diff file, and then put together a quick script to do that part automatically. Once again, I am doing this for the first time as I write. I will summarize the process below.

First, the correction. After my first use of this method I discovered that one more quick edit to the html export will make the parsing of the diff file much easier. Before I move ~/delicious.htm to ~/delicious-old.htm I need to add a line break just after <DL><p>. It may not seem like much but it makes a big difference.
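That edit can be done by hand in any editor, but it could also be scripted. Here is a minimal sketch using awk. The function name split_dl and the temporary file name in the usage comment are assumptions for illustration, not part of the original process.

```shell
# Hypothetical helper: put a newline right after every "<DL><p>" in an export.
# Reads the file named in $1 and writes the result to stdout.
split_dl() {
    awk '{ gsub(/<DL><p>/, "<DL><p>\n"); print }' "$1"
}

# Possible use before rotating the files (the .tmp name is an assumption):
#   split_dl ~/delicious.htm > ~/delicious.tmp && mv ~/delicious.tmp ~/delicious.htm
```

Writing to a temporary file first avoids truncating the export while awk is still reading it.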

As it turns out, cleaning up the diff file is fairly easy to do with awk and grep. Let’s take a look at exactly what I want to do first.

I am only interested in lines that start with > and a space so I start with

grep '^> ' < links.diff

I want to replace the <DL> with <dl> and I don’t need the <p> at all. So now I have

grep '^> ' < links.diff |awk '{sub(/<DL><p>/,"<dl>")};{print}'

Now we get rid of the > and the space at the beginning of each line.

grep '^> ' < links.diff |awk '{sub(/<DL><p>/,"<dl>")};{sub(/^> /, "")};{print}'

Then we don’t print the last line at all.

grep '^> ' < links.diff |awk '{sub(/<DL><p>/,"<dl>")};{sub(/^> /, "")};!/<\/DL>/{print}'

This gives me everything I need but I still have uppercase tags and attributes, some attributes I don’t really care about, and none of the elements are closed. We can take care of closing the <dl> with a simple echo "</dl>" after it.

echo "</dl>"

So, if we want to save all this to a file we can do this.

grep '^> ' < links.diff |awk '{sub(/<DL><p>/,"<dl>")};{sub(/^> /, "")};!/<\/DL>/{print}' > foo.html;echo "</dl>" >> foo.html
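For repeated use, the same stages could be wrapped in a small shell function. This is only a sketch: the function name clean_diff is an assumption, and it takes the diff file as an argument instead of hard-coding links.diff. The grep and awk stages are the ones built up above.

```shell
# clean_diff DIFF_FILE
# Hypothetical wrapper: print the cleaned-up <dl> fragment to stdout.
clean_diff() {
    # Keep only added lines, swap <DL><p> for <dl>, strip the "> " prefix,
    # and drop the line holding the closing </DL>.
    grep '^> ' < "$1" |
        awk '{sub(/<DL><p>/,"<dl>")};{sub(/^> /, "")};!/<\/DL>/{print}'
    # Close the list that the first added line opened.
    echo "</dl>"
}

# Possible use:
#   clean_diff links.diff > foo.html
```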

Now all I need to do is clean up those uppercase letters and close all the other elements. I’ll take a look at that on Friday.

This is the second in a series of posts. The first post is here and the next one is here.

Posting del.icio.us Links Weekly to WordPress

I’ve been using del.icio.us to share links since 2005. I’ve always used another method for bookmarking links for myself, but del.icio.us has been my favorite method for the sharing of interesting links. Before del.icio.us I had a separate linkblog so right away I wanted a way to display my shared links in a similar format. I started out by replacing my linkblog with an html rendering of the RSS feed from del.icio.us. I quickly realized that I wanted more than that so I set the blog back up and used a cron job to auto-post my links to the WP database. I’ve written about all of this before.

After a while, I gave up on the linkblog completely and just used a widget to show the links on my blog. Not quite what I wanted but good enough for a while. Recently I decided to set up the daily blog posting feature that del.icio.us provides. This is a very nice feature but doesn’t work well for me because my links come in waves. So, I turned that off earlier this week and set off in search of a way to post the links as a weekly roundup. I’ve seen other sites do this and I like it a lot.

After a few quick searches, I didn’t find anything I thought was worth spending time fooling with. It seems to me it’s just as easy to come up with something on my own. As a hacker I would prefer something as automatic as possible, but I don’t mind having to do something manually. I will probably want to tweak the weekly posting a touch anyway.

The first thing that came to mind was using the RSS feed but I dismissed that because it will only show a maximum of 100 items. That would probably do for my purposes but I’d like to go ahead and set up something I don’t have to worry about – did I get all the links? etc.

So I decided on a different approach. I haven’t done any of this yet. I am going to work on it while I write this.

Here is the plan:

  1. export the links as html
  2. grab out the html I need and paste it into a new post in WP
  3. post it

Simple, except for a few points.

I actually came up with this idea a few days ago and I grabbed an export then. I checked my blog and found that the latest link posted was the trash vortex page at greenpeace.org so I simply removed all links above that and saved this file as ~/delicious.htm.

Remove all html above the last posted link

Now it’s time to grab the new links for this week, so I go to del.icio.us and export the html and save it to the desktop. Then,

mv ~/delicious.htm ~/delicious-old.htm
mv ~/Desktop/del*.htm ~/delicious.htm
diff ~/delicious-old.htm ~/delicious.htm > links.diff
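These three commands could be collected into one function so the weekly rotation is a single step. What follows is a sketch under stated assumptions: the name rotate_and_diff and the optional directory argument are hypothetical. It names the old and new files explicitly, so the result does not depend on the order in which the shell expands a wildcard.

```shell
# rotate_and_diff NEW_EXPORT [DIR]
# Hypothetical helper: DIR defaults to $HOME; links.diff is written to the
# current directory.
rotate_and_diff() {
    new_export=$1
    dir=${2:-$HOME}
    # Last week's export becomes the "old" file.
    mv "$dir/delicious.htm" "$dir/delicious-old.htm" || return 1
    # The fresh export takes its place.
    mv "$new_export" "$dir/delicious.htm" || return 1
    # diff exits 1 when the files differ, which is the expected case here;
    # only a status above 1 is treated as failure.
    diff "$dir/delicious-old.htm" "$dir/delicious.htm" > links.diff || [ $? -eq 1 ]
}

# Possible use (the export file name is an assumption):
#   rotate_and_diff ~/Desktop/delicious.htm
```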

The only thing to do now is clean it up and post it. Let’s start by doing it manually. I’ve stripped most of the new links out for demonstration. Take a look.

Diff file

First, I remove the first three lines and the last five lines. I’ve run a few tests now and it looks as though this will always be the case. This should make automation easier. This procedure is obviously going to require a bit of manual intervention so I should be able to notice when a problem crops up.
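If the counts really do stay fixed at three and five, the trimming could be scripted as well. This is a minimal sketch. The function name trim_diff is an assumption, and the counts are hard-coded from the observation above, so the output still deserves a quick look each week.

```shell
# trim_diff DIFF_FILE
# Hypothetical helper: print the file without its first three and last five
# lines.
trim_diff() {
    awk 'NR > 3 { lines[NR] = $0 }
         END { for (i = 4; i <= NR - 5; i++) print lines[i] }' "$1"
}

# Possible use (the output file name is an assumption):
#   trim_diff links.diff > links-trimmed.diff
```

The lines are buffered in awk because the last five cannot be identified until the whole file has been read.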

After removing those lines I am left with a bunch of lines like those below.

> <DL><p><DT><A HREF="http://online.wsj.com/article/SB123731266862258869.html" LAST_VISIT="1238172267" ADD_DATE="1238172267" TAGS="fun,economics,games,culture,scrabble,words">Scrabble and Other Games Have Overvalued Points - WSJ.com</A>
> <DD>Scrabble is a great game and should be left alone.

The only thing necessary to make this “work” is to remove the > at the beginning of each line, but we will make it “right” by changing uppercase tags and attributes to lowercase, closing all elements, and wrapping all of it in

<dl></dl>

Then I copy and paste it into a new post in WP and I’m all set. It requires a bit of input but is not hard to do. I will see how much I can automate it on Wednesday.

This is the first in a series of posts. The next post is here

LeftLink

LeftLink was a collection of interesting links with re-written headlines. The project slowed to a complete halt due to the time necessary to maintain it manually.

So, it was brought back to life as an aggregation of progressive info using the simple mechanism I put together for iPhoneDeck.

Rent Back Direct

Amazing understanding of WordPress; this guy really knows what he is doing. Excellent work. Thanks very much, Bill!!

RentBackDirect

WordPress as a CMS

I set up a custom WordPress theme to be used as a complete CMS for this website.

Google Maps

I created a Google Maps site which helped users locate medical testing centers. You can see it at http://www.reallycheckyourself.org/.

WordPress Theme

I finished up a custom WordPress theme for this website.

1000 Moms

I took an existing PHP website set up by a novice and worked in some real programming without disrupting what was already very comfortable for the client. You can see the site at http://www.1000moms1000dollars.com/.

Betsy Flanagan

Excellent work. He’s very conscientious, responsive and upfront. I highly recommend him and will definitely use him again.

–Betsy Flanagan, StartupStudio.com

Startup Studio

I set up and configured a WordPress-powered site for this client. The work included a custom template based on the client’s original design, several plug-in modifications, and two custom plug-ins. The site is inactive now, but you can see it at http://startupstudio.com/.

WordPress Theme

I do a lot of WordPress work – themes, plugins, core mods, etc.

This client wanted a standards-compliant theme with an AdSense-style look. He provided an image and I built the XHTML/CSS and the WP theme.

I am not sure where you can find his use of it, but there is a slightly modified version here.

Another eBay Listing

Another client needed a custom eBay listing that would get by eBay’s sausage machine.

eBay Listing

Worked very hard on the project. Reliable and easy to communicate with. Will definitely work with again.

eBay Listing

eBay Listing

I created xhtml/CSS that would create the custom look the client wanted and survive eBay’s sausage machine.

Listings no longer active.

Layout Jerk

Great work, and will be using for more work in the near future.

Layout Jerk

Layout Jerk

I created hundreds of layouts for MySpace pages for this site.

Layout Jerk

WordPress Modifications – selfinvestors.com

Bill dealt with a project that was far more difficult and time consuming than imagined with great professionalism and patience, continuing to work to find solutions. Clearly, Bill is a highly experienced programmer with WordPress expertise. I would use him again.

selfinvestors.com

WordPress Modifications – selfinvestors.com

I wrote a few custom plugins and theme modifications for this website.

selfinvestors.com

Sport.co.uk – Sport Resources and Information.

I was hired to build a complete sports website in the UK. The site is no longer in operation but you can see it in the Wayback Machine.

Sport.co.uk – Sport Resources and Information.

Greg – Music Utopia

bsoist is a very good programmer, who worked on my project to 110% of my satisfaction. The programming he did was clean, and works great! He performed the project in a very reasonable amount of time. His communication throughout the project was outstanding. I recommend him to everyone and I will be using him again myself. Thanks for the great job bsoist!!

Music Utopia

Music Utopia

Greg hired me to add the storefront to his indie music website.

Music Utopia Home Page

Surreal Art, Fantasy Art & The Contemporary Surrealism of Domen Lombergar

Domen hired me to add a couple of new features to his website and fix up some SQL problems.

Surreal Art, Fantasy Art & The Contemporary Surrealism of Domen Lombergar

Domen Lombergar

Very qualified programmer. Provided lots of feedback during the creation process and did a marvelous job in the end. Highly recommended.

Surreal Art, Fantasy Art & The Contemporary Surrealism of Domen Lombergar

Amazon API

I’ve been experimenting with Amazon.com’s APIs for several projects.

You can see my work here.

Experience