Carnivorous coffee

I don’t know why this morning is different from any other, but the coffee I drank on the way to work has grown claws and long, sharp fangs, and is slowly but surely chewing its way out of my stomach. Ugh. I think I’d better throw a honeybun or a bearclaw down there. Maybe it will act as a buffer…

Comment spam notes

There’s been more talk about comment spammers around the net lately, especially around Jay Allen’s MT-Blacklist plugin for MovableType. We’re going to be doing some similar things for WordPress.

I’ve commented in a few discussions about an idea of mine that would (hopefully) be more of a deterrent than a blocking mechanism. It would work by encoding URLs in submitted comments, randomly replacing characters with their numeric entity equivalents. This change is invisible to browsers, but the idea is that Google would index the URL differently every time it was posted in a blog’s comments, which would bust the ranking for that link. The downside is that it would also bust Google ranking for non-spammers. I plan to run some experiments later to verify whether Google automatically decodes the entities before indexing links.
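Here is a minimal sketch of the encoding step, written in Python rather than WordPress’s PHP just so it is easy to run. The function name encode_url and the rate parameter are made up for illustration; nothing like this exists in WordPress yet.

```python
import html
import random


def encode_url(url, rate=0.5, rng=None):
    """Randomly replace characters of url with numeric entities.

    rate is the (hypothetical) probability that any one character
    gets encoded; rng lets a caller pass a seeded random.Random.
    """
    if rng is None:
        rng = random.Random()
    pieces = []
    for ch in url:
        if rng.random() < rate:
            # "&#NNN;" is the decimal numeric entity for the character
            pieces.append("&#{};".format(ord(ch)))
        else:
            pieces.append(ch)
    return "".join(pieces)


if __name__ == "__main__":
    encoded = encode_url("http://example.com/")
    print(encoded)
    # A browser decodes the entities, so the link still works:
    print(html.unescape(encoded))
```

Each call gives a different-looking string that decodes back to the same URL. Whether a search engine treats those strings as different links is exactly the open question.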

If this technique does bust the ranking, we might still be able to use it without punishing regular users too much. We could use a scoring technique to attempt to decide whether a comment is spam or not. Ones that we can decide are definitely spam get blocked completely. Ones that definitely are not spam go through unchanged. Comments that score in a gray area would be encoded, but the admin would be able to retroactively whitelist them.
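The three-way split could look something like this sketch. The score range (0.0 to 1.0), both thresholds, and the function name are assumptions for illustration only; how the score itself would be computed is not worked out here.

```python
def triage_comment(score, ham_max=0.2, spam_min=0.8):
    """Decide what to do with a comment given a spam score in [0.0, 1.0].

    ham_max and spam_min are hypothetical cutoffs, not tuned values.
    """
    if score >= spam_min:
        return "block"    # definitely spam: reject outright
    if score <= ham_max:
        return "pass"     # definitely not spam: leave URLs untouched
    return "encode"       # gray area: entity-encode URLs, admin may whitelist later


if __name__ == "__main__":
    for s in (0.05, 0.5, 0.95):
        print(s, triage_comment(s))
```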

Thoughts?

Clothes don’t make the man

It hasn’t been a good day for me and clothes. First, when I was getting the milk out of the fridge this morning, I knocked down a little plastic container of salsa, which splattered all over my slacks, put a few drops on my shirt, and sent a healthy blob almost into my left eye (which closed reflexively, just in time). So I had to quickly strip and change clothes before I left for work. Then during my lunch break, I was leaning in through the passenger-side door of my car to retrieve my sunglasses from under the seat, and got grease on the leg of my current slacks. I think it will wash out, but it’s pretty annoying to go through half of your day with black grease smudges on your khakis.

I don’t know why we have to wear slacks where I work, anyways. They hardly ever bring any visitors back into our cube farm. Seems to me we’d be fine wearing decent jeans and anything dressier than a t-shirt.

Simplog 0.9pre1

More news in the blogware arena. MyPHPBlog has been renamed to Simplog, and release 0.9pre1 is now available for testing. I used to be a developer on the MPB project (I contributed the RSS feed code and rough initial support for pingback/trackback), and I still keep up with their announcements. Simplog has some interesting design points, and supports multiple blogs from a single installation.

WordPress 0.72 is out

After a flurry of minor bug fixes, WordPress 0.72 is released.

Don’t let the tiny version number change fool you — there is lots of yummy goodness inside, with many cool improvements over version 0.71.

  • Almost all of the configuration options are managed through your browser. Edit your config file once and never touch it again!
  • A new default style
  • Improved Post/Edit interface makes it easy to edit/delete recent comments
  • Improved HTML auto-formatting
  • Password protected posts
  • New get_links_list() function builds your entire blogroll for you
  • Improved Links Manager
  • Support for blogging clients that use the metaWeblog and MovableType APIs
  • ‘Quicktags’ give more intuitive HTML formatting options when editing posts
  • Image upload feature compatible with more PHP server setups
  • RSS feeds for comments
  • RSS feeds now support “Conditional GET” to help conserve bandwidth
  • Geoposition support
  • Security fixes
  • …and tons of minor little bug fixes…

Fun Filters

UPDATE 2008-09-22: This code was superseded by my Text Filter Suite plugin. You can download the current version of the plugin from the WordPress Plugin Directory.


Updated Oct 4. Fixed code to not convert sequences inside the regex example into smilies. Updated funfilters.zip file with a properly formatted README.

I finally got around to improving and cleaning up my blog filters. If any other WordPress users would like to play with my hacks, I’m making the code available: funfilters.zip

Instructions

Inside the zip, you will find three text files:

The first step is to insert the contents of fun_filters.php.txt at the end of your b2-includes/b2functions.php file (but before the final ‘?>’ line).

Next, open up your b2-includes/b2vars.php file and scroll down to the bottom. Look for this line:

    add_filter('all', 'wptexturize');

Just before that line, insert the contents of the activate_filters.php.txt file. With this code in place, your blog will automatically switch on the pirate filter on Talk Like a Pirate Day (September 19). Also, any of the other filters can be activated by adding ‘?filter=filtername’ to the end of your URL (where filtername is one of ‘pirate’, ‘chef’, ‘fudd’, ‘jive’, or ‘kraut’).
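For anyone curious about the selection logic, here is a rough sketch of it in Python (the real code in the zip is PHP, and this is not a copy of it). The function name pick_filter is made up, and letting the URL parameter win over the date is an assumption on my part.

```python
from datetime import date

# Filter names listed in the instructions above
KNOWN_FILTERS = ("pirate", "chef", "fudd", "jive", "kraut")


def pick_filter(query, today):
    """Return the name of the filter to apply, or None for no filter.

    query is a dict of URL query parameters, e.g. {"filter": "chef"}.
    today is a datetime.date.
    """
    requested = query.get("filter")
    if requested in KNOWN_FILTERS:
        return requested          # explicit ?filter=name (assumed to take priority)
    if (today.month, today.day) == (9, 19):
        return "pirate"           # Talk Like a Pirate Day
    return None


if __name__ == "__main__":
    print(pick_filter({"filter": "chef"}, date.today()))
    print(pick_filter({}, date(2003, 9, 19)))
```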

This started when I added a pirate filter to my blog for Talk Like a Pirate Day. My first version was easier than I expected it to be (though it had flaws), which inspired me to locate and convert some of Kalsey’s MovableJive filters.

The biggest flaw in the earliest version was that it would filter text inside of HTML tags, causing it to mangle links and such. I fixed this by borrowing an idea from Simon Willison. Simon’s use of a callback function to only match text that was not part of a tag was good, but it included the ‘>’ and ‘<’ brackets from surrounding tags in the matched text being substituted, requiring you to hack them back in at the end of your content filter.

After an afternoon studying the PCRE pattern syntax and wrestling regexes with the help of the Regex Coach, I came up with an improved pattern, which doesn’t require us to tack the ‘>’ and ‘<’ back on manually. Cool, huh? Here’s the regex pattern I came up with:

    (?(?<=>)|\A)([^<>]+)(?(?=<)|\Z)

And yes, I know, it’s not perfect. But it should work okay for HTML that’s moderately clean. You’d probably have to try hard to bust it. It doesn’t even seem to get confused if you have angle brackets inside of an attribute in a tag, even though I thought it would.
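To show the callback idea in runnable form, here is a small Python sketch. Python’s re module doesn’t support PCRE’s lookaround conditionals, so the pattern is rewritten with plain alternation, which should match the same text for these cases. The helper name filter_text_only is made up, and str.upper is only a stand-in for a real filter like the pirate one.

```python
import re

# Same idea as the PCRE pattern above, with alternation in place of conditionals:
# a run of non-bracket text that starts right after '>' (or at the very start)
# and ends right before '<' (or at the very end).
TEXT_OUTSIDE_TAGS = re.compile(r"(?:(?<=>)|\A)([^<>]+)(?:(?=<)|\Z)")


def filter_text_only(html_text, text_filter):
    """Apply text_filter to text between tags, leaving the tags alone."""
    return TEXT_OUTSIDE_TAGS.sub(lambda m: text_filter(m.group(1)), html_text)


if __name__ == "__main__":
    sample = 'Hello <a href="hello.html">hello</a> world'
    print(filter_text_only(sample, str.upper))
```

Run on the sample, this upper-cases the visible words but leaves the href alone, and the brackets never end up inside the matched text.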

The Bongo Project

A quick update while I have a break between phases of a programming project at work….

The Bongo Project is at least as cool as the Carrier Pigeon Internet Protocol implementation (an implementation of RFC 1149, “A Standard for the Transmission of IP Datagrams on Avian Carriers”). A student took up a professor’s challenge to implement the physical layer of the OSI model with bongo drums, using a couple of Linux boxes with microphones listening for the primitive ping packets…