Over the past several days, there has been an interesting discussion on the wp-testers mailing list (though it really belonged on the wp-hackers list, but that’s beside the point) about permalink structures in WordPress. The original question came from matthijs, who asked why WordPress was storing rewrite rules for every page on his site in a database option. Further discussion revealed that this was a side effect of his particular permalink structure, and it brought out some really good information about good and bad permalink patterns. This information could be important for sites that use non-standard URL structures, and I thought it deserved a summary.
First, let’s look at the original question and the situation that brought it about:
Recently I discovered that the current way wordpress handles permalinks is not scalable. All rewrite_rules are at the moment held in a single database field in the wp_options table. If you have a few dozens pages and posts, you have maybe a few hundred rewrite_rules in it and all is well. But as soon as you start to have a few hundred pages and attachments, the amount of rewrite_rules explodes as well as the field size. This also depends on the permalinks settings. On one of my sites I can’t even open the database field to take a look because my browser and text editor crash because of its size.
Before anyone starts to panic, let me say that this is not a general problem in WordPress. This person had a particular permalink structure which forced WordPress to store extra rules for every page. This is a situation which can be avoided by choosing a permalink pattern which allows WordPress to find your posts in an efficient way.
WordPress gives site builders a lot of flexibility in how their post URLs are created. There are several attributes which can be used, and ordered however the site builder likes. The default “pretty permalink” structure looks like this:
Which results in permalink URLs that look like:
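As a rough illustration of how a structure string turns into a URL, here is a minimal Python sketch. The date-and-name structure, the `example.com` site, the sample post, and the `build_permalink` helper are all assumptions made for the example; this is not WordPress code.

```python
from datetime import datetime

# Assumed example structure (date and name); chosen for illustration only.
STRUCTURE = "/%year%/%monthnum%/%day%/%postname%/"


def build_permalink(structure, post, site="http://example.com"):
    """Expand structure tags into a URL for a hypothetical post dict."""
    published = post["date"]
    values = {
        "%year%": f"{published.year:04d}",
        "%monthnum%": f"{published.month:02d}",
        "%day%": f"{published.day:02d}",
        "%postname%": post["slug"],
        "%post_id%": str(post["id"]),
    }
    url = structure
    # Replace each known tag with the matching value from the post.
    for tag, value in values.items():
        url = url.replace(tag, value)
    return site + url


# Hypothetical post used only for this example.
sample_post = {"id": 123, "slug": "sample-post", "date": datetime(2009, 1, 15)}
print(build_permalink(STRUCTURE, sample_post))
# prints http://example.com/2009/01/15/sample-post/
```

Swapping in a different structure string, such as `/%post_id%/%postname%/`, changes the shape of every URL on the site without touching the posts themselves.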
There are several structure tags which can be used to form permalinks: %year%, %monthnum%, %day%, %hour%, %minute%, %second%, %postname%, %post_id%, %category%, %tag%, and %author%. As mentioned before, this gives a lot of flexibility in how your URLs can appear. However, Ryan Boren pointed out:
Verbose rules are used for structures beginning with %category%, %tag%, %postname%, and %author%. Avoiding such structures is best.
This important note was subsequently added to the Codex page about Using Permalinks:
For performance reasons, it is not a good idea to start your permalink structure with the category, tag, author, or postname fields. The reason is that these are text fields, and using them at the beginning of your permalink structure it takes more time for WordPress to distinguish your Post URLs from Page URLs (which always use the text “page slug” as the URL), and to compensate, WordPress stores a lot of extra information in its database (so much that sites with lots of Pages have experienced difficulties). So, it is best to start your permalink structure with a numeric field, such as the year or post ID.
This would be a problem for any dynamic CMS, not just WordPress. If there isn’t some way to narrow down the information in the URL and map it to a specific page or post, the system must perform a lot of database searches to find the correct entry. Otto provides a really good hypothetical example:
Actually, I think this deserves a bit more discussion… Let’s consider a permalink like %category%/%postname%.
So you’re handed a URL like /mycat/mypost. You start by parsing it into mycat and mypost. You don’t know what these are. They’re just strings to you. So, first, you have to consider what “mycat” is.
First, you query to see if “mycat” is a pagename. This is a select from wp_posts where post_slug = mycat and post_type = page. No joy there.
Next, you query to see if “mycat” is a category. This is a select from wp_terms join wp_term_taxonomy on (term_id = term_id) where term = mycat and taxonomy = category. We found a mycat, so that’s good. Unfortunately, this just tells us that it’s a category, which is rather useless in retrieving the actual post we’re looking for. So we ignore the category.
Now, we move on to the “mypost”. Again, we start querying:
1. Is it a page? select from wp_posts where post_slug = mypost and post_type = page. Nope.
2. Is it a category? select from wp_terms join wp_term_taxonomy on (term_id = term_id) where term = mypost and taxonomy = category. Nope.
3. Is it a post? select from wp_posts where post_slug = mypost and post_type = post. Bingo.
The whole goal is to determine the specific post being asked for. The category is not helpful in this respect, and we have to do a couple queries just to figure out that we need to ignore it. Five queries to determine what the post is with this structure. Five queries, two of them expensive (joins ain’t cheap). And these have to happen on every load of a post on your site.
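Otto’s walk-through can be sketched in a few lines of Python to make the query count concrete. Everything here is hypothetical: the in-memory sets stand in for wp_posts and the term tables, and `QueryCounter` and `resolve` are names invented for the example. It only mirrors the naive lookup order described above.

```python
# Hypothetical in-memory stand-ins for the database tables.
PAGES = {"about", "contact"}
CATEGORIES = {"mycat"}
POSTS = {"mypost"}


class QueryCounter:
    """Counts every lookup, as if each one were a database query."""

    def __init__(self):
        self.count = 0
        self.log = []

    def lookup(self, kind, table, slug):
        self.count += 1
        self.log.append((kind, slug))
        return slug in table


def resolve(url, db):
    """Naively resolve a /%category%/%postname%/ URL, one lookup at a time."""
    first, second = url.strip("/").split("/")
    # First segment: is it a page?
    if db.lookup("page", PAGES, first):
        return ("page", first)
    # Is it a category? Either way, it does not identify the post.
    db.lookup("category", CATEGORIES, first)
    # Second segment: page, category, then post.
    if db.lookup("page", PAGES, second):
        return ("page", second)
    if db.lookup("category", CATEGORIES, second):
        return ("category", second)
    if db.lookup("post", POSTS, second):
        return ("post", second)
    return None


db = QueryCounter()
result = resolve("/mycat/mypost", db)
print(result, db.count)
# prints ('post', 'mypost') 5
```

The two category lookups correspond to the joins in the description, which is where most of the cost would land on a real database.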
Otto then goes on to explain that this isn’t what WordPress actually does. Instead, when WordPress detects that you have an inefficient permalink structure, it stores extra rewrite rules in an option in the database, which it then refers to when presenting a page.
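To see why that stored option grows with the number of pages, consider the following simplified sketch. The rule format, the `build_rules` and `match` helpers, and the query-string targets are assumptions made for illustration, not the actual contents of the WordPress rewrite_rules option. The point is only that one explicit rule per page must come before the generic pattern.

```python
import re


def build_rules(page_slugs):
    """Return an ordered list of (pattern, target) pairs.

    Simplified, hypothetical format: one explicit rule per page,
    followed by a single generic rule for /category/postname/ URLs.
    """
    rules = [
        (re.compile(rf"^{re.escape(slug)}/?$"), f"pagename={slug}")
        for slug in page_slugs
    ]
    rules.append(
        (re.compile(r"^([^/]+)/([^/]+)/?$"), r"category_name=\1&name=\2")
    )
    return rules


def match(path, rules):
    """Return the expanded target of the first rule that matches."""
    for pattern, target in rules:
        m = pattern.match(path)
        if m:
            return m.expand(target)
    return None


rules = build_rules(["about", "contact"])
print(match("about", rules))         # prints pagename=about
print(match("mycat/mypost", rules))  # prints category_name=mycat&name=mypost

# The rule list grows with the number of pages on the site.
print(len(build_rules([f"page-{i}" for i in range(300)])))  # prints 301
```

With explicit page rules in place, a URL is matched against patterns in memory rather than probed with a series of database queries, at the cost of a rule list that scales with the site.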
To finish up, let’s look at a couple of quick examples.
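As a couple of quick illustrations, the small Python check below classifies a few structures according to Ryan Boren’s note. The sample structures and the `first_tag` and `starts_with_text_tag` helpers are assumptions for the example; this is a sketch of the rule of thumb, not the test WordPress itself performs.

```python
import re

# Tags that the note above associates with verbose rules
# when they lead the permalink structure.
TEXT_TAGS = {"%category%", "%tag%", "%postname%", "%author%"}


def first_tag(structure):
    """Return the first %tag% found in the structure, or None."""
    m = re.search(r"%[a-z_]+%", structure)
    return m.group(0) if m else None


def starts_with_text_tag(structure):
    """True if the first structure tag is one of the text fields."""
    return first_tag(structure) in TEXT_TAGS


examples = [
    "/%category%/%postname%/",         # leads with a text field
    "/%postname%/",                    # leads with a text field
    "/%year%/%monthnum%/%postname%/",  # leads with a numeric field
    "/%post_id%/%postname%/",          # leads with a numeric field
]
for structure in examples:
    verdict = "verbose rules" if starts_with_text_tag(structure) else "ok"
    print(structure, "->", verdict)
```

The first two structures are the kind to avoid; the last two start with a numeric field and still keep the post name in the URL.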
In conclusion, when building a site’s permalink structure, choosing carefully can help WordPress locate your articles in the most efficient way possible.