Proper RFC 4122 UUIDs as GUIDs in WordPress

UUIDs (Universally Unique IDentifiers), also known as GUIDs (Globally Unique IDentifiers), are strings that identify pieces of information in computer systems. WordPress uses GUIDs to identify each individual post, but uses URLs (kind of) as GUIDs, and thus does not follow the standard definition (RFC 4122) of a UUID (or GUID).

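A WordPress GUID (by default a URL of the form http://example.com/?p=123, with example.com standing in for your site):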
A proper RFC 4122 UUID as a URN: urn:uuid:65396530-3934-5930-a563-343736343835

As you can see, the WordPress GUID isn’t even the regular permalink to a post (if you have pretty permalinks enabled, and most people do). Slugs in permalinks may change, but an identifier must be immutable, so we need a GUID that never changes. It therefore makes sense that WordPress uses that URL rather than the permalink. This does make it easy to mistake the GUIDs in WordPress for URLs you can use for something other than identification. But the GUIDs in WordPress should never be treated as URLs. They simply are not. They are IDs. They just happen to also be URLs in the default WordPress implementation. They are not the URLs we want to expose anywhere, though.

Both URLs and URNs are URIs, but a GUID should be a URN, as it is meant to identify, not to locate. The difference between URI, URL and URN is well explained here.
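To make the distinction concrete, here is a minimal sketch that tells an RFC 4122 UUID in URN form apart from a URL-style GUID. The function name is_uuid_urn() is hypothetical, and the pattern assumes the RFC 4122 variant with versions 1 to 5:

```php
<?php
/**
 * Check whether a GUID is an RFC 4122 UUID in URN form (hypothetical helper).
 */
function is_uuid_urn( string $guid ): bool {
	// urn:uuid: prefix, 8-4-4-4-12 hex digits, version digit 1-5, variant digit 8, 9, a or b.
	$pattern = '/^urn:uuid:[0-9a-f]{8}-[0-9a-f]{4}-[1-5][0-9a-f]{3}-[89ab][0-9a-f]{3}-[0-9a-f]{12}$/i';

	return 1 === preg_match( $pattern, $guid );
}

var_dump( is_uuid_urn( 'urn:uuid:65396530-3934-5930-a563-343736343835' ) ); // bool(true)
var_dump( is_uuid_urn( 'http://example.com/?p=123' ) );                      // bool(false)
```

A check like this only says something about the format. It does not tell you whether the identifier is actually in use anywhere.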

To avoid any confusion about whether WordPress GUIDs are URLs, and to make them compatible with the UUID format that the rest of the world uses, we can use the wonderful Plugin API and hook into WordPress to use proper RFC 4122 UUIDs.

About UUIDs and versions (subtypes)

A UUID is 128 bits long, and requires no central registration process.

Adoption of UUIDs and GUIDs is widespread, with many computing platforms providing support for generating them, and for parsing their textual representation.

– Wikipedia (on RFC 4122 UUIDs)

RFC 4122 defines different versions, or subtypes, of UUIDs. Version 4 is the easiest to use, as it is based entirely on cryptographically random (or pseudo-random) bits. UUID v3 and v5 specify how a name, such as a URL in the URL namespace, can be used as the basis for a UUID. The difference between v3 and v5 is that v3 uses MD5 whereas v5 uses SHA-1. SHA-1 should be used where backwards compatibility with MD5 isn’t necessary.

UUID versions (subtypes) that are interesting to us:
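UUID Version 4: Based on random bits. Of the 128 bits, 6 are fixed to mark the version and variant, which leaves 2^122 different combinations. That should never be an issue. Really. There are 7.38e26 possible UUIDs for each human being on the planet.
UUID Version 5: Based on a SHA-1 hash generated from the URL namespace UUID and a URL. The same URL always yields the same UUID, and two different URLs only get the same UUID if their truncated SHA-1 hashes collide, which is about as unlikely as two random version 4 UUIDs colliding.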
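The numbers for version 4 can be checked with a few lines of PHP. This is only a back-of-the-envelope sketch: the world population of 7.2 billion is my assumption, chosen because it reproduces the per-person figure above.

```php
<?php
// 128 bits in total, minus 4 version bits and 2 variant bits that are fixed.
$random_bits = 128 - 4 - 2;

// PHP switches to a float here, which is precise enough for an estimate.
$combinations = 2 ** $random_bits;

// Assumption: a world population of roughly 7.2 billion.
$population = 7.2e9;
$per_person = $combinations / $population;

printf( "%.2e version 4 UUIDs in total\n", $combinations );
printf( "%.2e per person\n", $per_person );
```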

Hooking into WordPress

Because of how WordPress saves new posts, the most efficient option is UUID v4, as the UUID can be included when the SQL insert for a new post is performed.

By default, WordPress sets the GUID with an SQL update after the first insert, as the new post ID is required to create the “permalink”. If we want to use UUID v5, based on a “permalink”, there is unfortunately no filter for the GUID update, so we have to hook in a little later, where we check if the GUID is set to the “permalink” and then run yet another SQL update to set the GUID field to a proper UUID v5 string.

However, IMHO, since we are in fact dealing with articles that are assigned unique URLs (permalinks), we shouldn’t have to resort to using random UUIDs (v4). I think I’ll settle on using version 5, but it is up to you to make your own decision.

Using UUID version 4

This is the computationally most efficient option, as we filter the UUID into the GUID field of the post before it is inserted into the database.

To use UUID version 4 for your GUIDs in WordPress, you can add this snippet, e.g. as an mu-plugin:
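A minimal sketch of such an mu-plugin could look like the following. It assumes that the wp_insert_post_data filter is a suitable place to set the field, and that new posts reach the filter with an empty GUID. The function name my_uuid4_guid() is hypothetical.

```php
<?php
/**
 * Plugin Name: UUID v4 GUIDs (sketch)
 */

/**
 * Give new posts a version 4 UUID URN as GUID before the row is inserted.
 *
 * @param array $data Post fields about to be written to the database.
 * @return array The post fields, with the GUID filled in where it was empty.
 */
function my_uuid4_guid( $data ) {
	// Assumption: new posts arrive with an empty GUID; existing posts keep what they have.
	if ( isset( $data['guid'] ) && '' === $data['guid'] ) {
		$data['guid'] = 'urn:uuid:' . wp_generate_uuid4();
	}

	return $data;
}

// Register the filter only when WordPress has loaded this file.
if ( function_exists( 'add_filter' ) ) {
	add_filter( 'wp_insert_post_data', 'my_uuid4_guid' );
}
```

Since the GUID is no longer empty after the insert, WordPress should skip its own update of the field. Existing posts are left untouched and keep their URL-style GUIDs.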

And that’s really everything that’s needed!

(Thanks to Dominik Schilling for pointing out to me that WordPress introduced wp_generate_uuid4() in version 4.7, so you don’t need to bring your own implementation.)

Using UUID version 5

This is not based on (pseudo) randomness: each UUID is derived from the unique URL of the post, so the same URL always gives the same UUID. It does, however, require two additional SQL update queries to be run after the initial insert. That should not really be an issue in most (any?) cases.

The UUIDs are based on the URLs that WordPress uses for GUIDs by default, but follow a standardized format for UUIDs as URNs, and will not be mistaken for URLs.

To use UUID version 5 for your GUIDs in WordPress, you can add this snippet, e.g. as an mu-plugin:
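A minimal sketch of such an mu-plugin could look like the following. It assumes that the wp_insert_post action is late enough for the default GUID to be in place, and that a uuid_v5( $namespace, $name ) function returning the UUID as a plain string is available. The names my_guid_to_uuid5_urn(), my_uuid5_guid() and MY_UUID_NAMESPACE_URL are hypothetical.

```php
<?php
/**
 * Plugin Name: UUID v5 GUIDs (sketch)
 */

// The URL namespace UUID defined in RFC 4122, Appendix C.
const MY_UUID_NAMESPACE_URL = '6ba7b811-9dad-11d1-80b4-00c04fd430c8';

/**
 * Turn a URL-style GUID into a version 5 UUID URN.
 * Returns null if the GUID is empty or already a UUID URN.
 */
function my_guid_to_uuid5_urn( $guid ) {
	if ( '' === $guid || 0 === strpos( $guid, 'urn:uuid:' ) ) {
		return null;
	}

	// Assumption: uuid_v5( $namespace, $name ) returns the UUID as a plain string.
	return 'urn:uuid:' . uuid_v5( MY_UUID_NAMESPACE_URL, $guid );
}

/**
 * Replace the default GUID once WordPress has finished saving a new post.
 */
function my_uuid5_guid( $post_id, $post, $update ) {
	global $wpdb;

	// Only touch new posts, so that existing GUIDs stay immutable.
	$urn = $update ? null : my_guid_to_uuid5_urn( $post->guid );
	if ( null === $urn ) {
		return;
	}

	// There is no filter for the GUID update, so run one more UPDATE query.
	$wpdb->update( $wpdb->posts, array( 'guid' => $urn ), array( 'ID' => $post_id ) );
	clean_post_cache( $post_id );
}

// Register the action only when WordPress has loaded this file.
if ( function_exists( 'add_action' ) ) {
	add_action( 'wp_insert_post', 'my_uuid5_guid', 10, 3 );
}
```

Keeping the conversion in its own small function makes it easy to test without a database, and the check for the urn:uuid: prefix guards against converting a GUID twice.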

Unlike UUID version 4, you need to bring your own UUID implementation (a uuid_v5() function in the example above).

UUID version 5 implementation

Here’s a ready-made, RFC 4122-compliant implementation of UUID version 5 (name-based with SHA-1 hashing). Save it as an mu-plugin, e.g. uuid.php:
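The following is a sketch of such an implementation, following the algorithm in section 4.3 of RFC 4122. The parameter order, namespace first and name second, is my assumption, so adjust it to match however you call uuid_v5():

```php
<?php
/**
 * Generate an RFC 4122 version 5 (name-based, SHA-1) UUID.
 *
 * @param string $namespace Namespace UUID, e.g. the URL namespace
 *                          6ba7b811-9dad-11d1-80b4-00c04fd430c8.
 * @param string $name      The name to hash, e.g. a URL.
 * @return string The UUID in its canonical 8-4-4-4-12 form.
 */
function uuid_v5( string $namespace, string $name ): string {
	$namespace_hex = str_replace( '-', '', $namespace );
	if ( 1 !== preg_match( '/^[0-9a-f]{32}$/i', $namespace_hex ) ) {
		throw new InvalidArgumentException( 'Invalid namespace UUID.' );
	}

	// Hash the namespace as 16 raw bytes, followed by the name.
	$hash = sha1( hex2bin( $namespace_hex ) . $name );

	// Only the first 128 bits (32 hex digits) of the 160-bit hash are used.
	$time_hi   = ( hexdec( substr( $hash, 12, 4 ) ) & 0x0fff ) | 0x5000; // Set version 5.
	$clock_seq = ( hexdec( substr( $hash, 16, 4 ) ) & 0x3fff ) | 0x8000; // Set RFC 4122 variant.

	return sprintf(
		'%s-%s-%04x-%04x-%s',
		substr( $hash, 0, 8 ),
		substr( $hash, 8, 4 ),
		$time_hi,
		$clock_seq,
		substr( $hash, 20, 12 )
	);
}
```

As a sanity check, uuid_v5( '6ba7b810-9dad-11d1-80b4-00c04fd430c8', 'python.org' ) should return 886313e1-3b8a-5372-9b90-0c9aee199e5d, which is the value the Python documentation lists for the DNS namespace and the same name.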

One last word of caution

Please do not think that UUIDs have anything at all to do with security. Do not use them as such.

Do not assume that UUIDs are hard to guess; they should not be used as security capabilities (identifiers whose mere possession grants access), for example.

– RFC 4122
