2009-02-11 - Years ago, when Blogs and other publishing platforms were becoming popular, someone, somewhere, decided that using the <title> of the Blog Post for the website address (URI) was a good idea. Little did they know that they would be responsible for a cottage industry developing to help manage those long, unruly, and totally unfriendly URIs.
Who would have known that such a service would ever be needed? And who would have ever thought that a service like Twitter would become one of the most popular means of communication amongst peers? What does Twitter have to do with this? Read further: that 140-character SMS limit is surely wreaking havoc on all those using the long hyphenated URIs. These publishers are forced to use a URI Shortening Service, thereby inserting a middle man that is outside the brand domain. From my perspective, this is not good practice and may have negative side effects in the overall scheme of things.
2009-02-25 Update - Phishing Scam Hits Google Mail Users
Gmail users who are logged into the accompanying chat service Google Chat, as most are, have been getting messages that appear to be from friends, urging them to click on a Web address starting with TinyURL.com that takes them to a site called ViddyHo. The site asks for the person's Gmail login information and then hijacks the account, sending out chat messages to all of the user's contacts and spreading itself further.
Did you know that not all emails are read and/or sent in HTML mode? Did you also know that when email is sent in Text Mode and/or converted to text during its travels, lines are commonly wrapped at 72 characters? For me, that is the maximum URI length I'd be working with to ensure that my links do not break during their travels. And I'm surely not going to use a URI Shortening Service to present what looks like a spoofed URI string of some sort, which may cause users to not click the link. I'm serious, there are some challenges in this area when you have links that look like the links in email spam that many are used to receiving. It is a cause and effect that you as a Publisher should be proactive in avoiding.
Another challenge that comes into play with email is when the replying and/or forwarding of the original message begins. Depending on the user's email settings, the original message may be converted to plain text and each line prefixed with a quote marker ( > ). When this happens, it takes away from the original 72 characters you started with. Usually you'll have the quote character along with a space and then the line from the original message. In this scenario, we've just lost 2 of our original 72 character maximum, so now we're down to 70.
What does a 70 character URI look like? Here are some examples using our domain which is 30 characters to start. That means we have a maximum of 40 characters to work with in our directory and file naming conventions before wrapping occurs.
That last 66 character example is from an older article posted here at the directory by Gord Collins. At that time, we were not 100% certain how we were going to move forward with directory and file naming conventions. I was not comfortable at all with the number of hyphens appearing, and when we hit that third hyphen, I made the decision to start working towards a shorter, more intuitive URI structure, which in turn forced me to tighten up the overall taxonomy.
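The line-width arithmetic above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the 72-character width and the two-character quote prefix come from the discussion above, the 30-character domain prefix is assumed to be the scheme, host, and trailing slash, and the names `fits_quoted_line`, `path_budget`, and `quote_depth` are hypothetical.

```python
WRAP_WIDTH = 72        # plain-text line width discussed above
QUOTE_PREFIX = "> "    # marker assumed to be added per reply or forward


def fits_quoted_line(uri: str, quote_depth: int = 1,
                     wrap_width: int = WRAP_WIDTH) -> bool:
    """Return True if the URI still fits on one line after being quoted
    quote_depth times (each level is assumed to add one "> " prefix)."""
    budget = wrap_width - quote_depth * len(QUOTE_PREFIX)
    return len(uri) <= budget


def path_budget(domain_prefix: str, quote_depth: int = 1,
                wrap_width: int = WRAP_WIDTH) -> int:
    """Characters left for directory and file names after the domain prefix."""
    return wrap_width - quote_depth * len(QUOTE_PREFIX) - len(domain_prefix)
```

With one level of quoting, `path_budget("http://www.SEOConsultants.com/")` works out to 72 - 2 - 30 = 40, which matches the 40 characters mentioned above. Each further reply or forward would, under the same assumption, cost another two characters.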
Page titles in the SERPs will normally truncate at around 70 characters depending on the word composition at the point of truncation. Yahoo! recommends 67 characters as a limit.
More important, search engines use titles to index web sites, and often display them in search engine results. To make your page most appealing to search engines, we recommend that you limit your page title to 67 characters and do not include images in the page title area.
Note: The 67 character limit for the TITLE Element is a Yahoo! suggestion and is not a hard rule although the above verbiage from Yahoo! would tend to make you think otherwise. Longer character counts perform just fine.
We've seen Google state in writing 2-20 words, while never mentioning character counts, for article title lengths in various services that they provide such as Google News. How that may translate over to the TITLE Element is beyond the scope of this article.
It is always a good practice to target your primary keyword phrases at the beginning of the TITLE. Well balanced forward and reverse thinking is beneficial in this area.
Did you know that there is an HTTP Status Code, 414 Request-URI Too Long, that the server can return if your URI exceeds a certain length?
The server is refusing to service the request because the Request-URI is longer than the server is willing to interpret. This rare condition is only likely to occur when a client has improperly converted a POST request to a GET request with long query information, when the client has descended into a URI "black hole" of redirection (e.g., a redirected URI prefix that points to a suffix of itself), or when the server is under attack by a client attempting to exploit security holes present in some servers using fixed-length buffers for reading or manipulating the Request-URI.
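The status described above is 414 (Request-URI Too Long). As a rough sketch of how a server-side length check might map onto it, consider the following. The 2048-character limit and the function name `status_for_request_uri` are hypothetical; the specification does not fix a number, and real servers choose their own.

```python
MAX_REQUEST_URI_LENGTH = 2048  # hypothetical limit; real servers choose their own


def status_for_request_uri(request_uri: str,
                           max_length: int = MAX_REQUEST_URI_LENGTH) -> int:
    """Return 414 when the Request-URI is longer than the server is
    willing to interpret, otherwise 200."""
    if len(request_uri) > max_length:
        return 414  # Request-URI Too Long
    return 200      # OK (a real server would go on to handle the request)
```

A short path such as `/uris/` passes straight through, while a runaway query string or a redirect loop that keeps appending to itself would eventually cross the limit and be refused.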
/topcat/, /topcat/document, /topcat/subcat/document, /topcat/subcat/subcat/document
When naming categories, I try to keep them to one primary top level keyword. If two or even three keywords are necessary, one or maybe two hyphens may be acceptable. But, I'll work my magic and figure out some way to categorize a destination using single word category paths. There is a place for everything and everything in its place.
I see many hopping on the Flat Structure bandwagon and I'm here to tell you that the future holds quite a few challenges for you. I'll let you figure out all the technical details, but the foremost issue you will be faced with in a flat structure is file naming conventions. I can't begin to tell you the number of cons that far outweigh the pros in this situation. You may also end up with a plethora of multi-hyphenated-URIs-which-may-not-be-real-user-friendly. And I don't think they are really as SEO friendly as some claim them to be.
I feel most, if not all, websites with sufficient content SHOULD have at least one sub-directory level. This allows you to categorize your content and removes the naming restrictions that a flat structure imposes. You can still use the flat structure concept, but only for top level pages. Those should always be at the top of the structure (top of the click path), whether it be at the root or within a sub-directory. Click paths will determine the structural flow of your website.
We are in the process now of creating a tutorial for Windows that will allow you to utilize ISAPI_Rewrite to develop your own URI Shortening routines. Take out the 301 middle man and end the fragmentation of your brand. For example, I've set up a quick shortening routine for this article. I can utilize a link like this: http://www.SEOConsultants.com/2009/02/11 which will 301 to http://www.SEOConsultants.com/uris/. While not the most optimal example, since we use shorter URIs to begin with, it does illustrate a technique that can be easily implemented by most website owners, whether your website is hosted on Windows or Apache (*nix) servers.
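The idea behind a same-domain shortening routine can be sketched as a simple lookup. This is not the ISAPI_Rewrite configuration the tutorial will cover, just a minimal Python illustration of the mapping; the `SHORT_PATHS` table and the `resolve` function are hypothetical names, and only the one entry shown comes from the example above.

```python
from typing import Optional, Tuple

# Short path -> full path on the same domain. The single entry mirrors the
# example in the text; a real table would come from the publishing system.
SHORT_PATHS = {
    "/2009/02/11": "/uris/",
}


def resolve(path: str) -> Optional[Tuple[int, str]]:
    """Return (301, target) for a known short path, or None so the
    request falls through to normal handling."""
    target = SHORT_PATHS.get(path.rstrip("/"))
    if target is None:
        return None
    return 301, target
```

Because the short link and its destination share one domain, the redirect never leaves the brand, and the permanent (301) status tells clients that the long form is the real address.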
Here's a great example of our URI shortening routine at work on one of our directory level testing areas...
The above permanently redirects (301) to a 76 character URI that is 23 levels deep...
If you are relegated to utilizing long URIs and find yourself promoting your content within Social Media outlets or platforms that limit your character counts (usually the 140-character SMS limit), you may want to consider rolling your own URI Shortening Service. If you have a domain that falls within the reasonable character limits (*59 or less) and can manage a flat URI shortening routine, I'd suggest this option before using a third party service.
* Based on utilizing an ISO 8601 date string for the shortening routines. How you handle additional posts on the same date is your choice. You could do /2009/02/11/abc, whatever works best for you from both usability and scalability perspectives. I surely don't want to make recommendations that may cause challenges for you in the long term.
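As one possible way to handle several posts on the same date, here is a small sketch that builds the date-based short path and appends a single letter when the plain date is already taken. The letter suffix is purely an assumption for illustration (the text leaves the scheme up to you), and `short_path` and `taken` are hypothetical names.

```python
from datetime import date
from typing import Set


def short_path(published: date, taken: Set[str]) -> str:
    """Build a /YYYY/MM/DD short path; if that path is already used,
    append /a, /b, ... (a hypothetical tie-breaking scheme)."""
    base = published.strftime("/%Y/%m/%d")
    if base not in taken:
        return base
    for letter in "abcdefghijklmnopqrstuvwxyz":
        candidate = f"{base}/{letter}"
        if candidate not in taken:
            return candidate
    raise ValueError("too many posts on one date; choose another scheme")
```

The first post on a date keeps the bare 11-character path, and each later post on that date costs only two more characters, which keeps the routine short while leaving room to grow.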
I'm finding that dates are very important when working with publisher type content. It is only natural to utilize them in the URI for archiving purposes. And, they work well when developing a same domain shortening routine. Just remember to keep them as short as you can. Shorter URIs placed just right in SMS messages do not get converted in many instances. This allows you to post branded links to your quality content.
Note: Date based URIs can be formatted in a variety of ways. Some may use a continuous date string, others may categorize dates further due to content volume. Again, this is something you should give careful consideration to before launching a date based taxonomy.
You'll also want to heed Google's advice here and pay very close attention to how they suggest the formatting of URI strings for Google News Publishers.
Display a three-digit number. The URL for each article must contain a unique number consisting of at least three digits. For example, we can't crawl an article with this URL: http://www.example.com/news/article23.html. We can, however, crawl an article with this URL: http://www.example.com/news/article234.html. Keep in mind that if the only number in the article consists of an isolated four-digit number that resembles a year, such as http://www.example.com/news/article2006.html, we won't be able to crawl it.
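The rule quoted above can be approximated with a small check. This is a sketch of one reading of the guideline, not Google's actual logic: it assumes "resembles a year" means a four-digit run from 1900 to 2099, it looks only at the path portion of the URL, and it does not verify that the number is unique across articles. The function name `has_news_friendly_number` is hypothetical.

```python
import re
from urllib.parse import urlparse


def has_news_friendly_number(url: str) -> bool:
    """Return True if the URL path contains a run of at least three digits
    that is not an isolated four-digit, year-like number."""
    path = urlparse(url).path
    for run in re.findall(r"\d+", path):
        if len(run) < 3:
            continue
        # Assumption: "resembles a year" means a four-digit run, 1900 to 2099.
        if len(run) == 4 and 1900 <= int(run) <= 2099:
            continue
        return True
    return False
```

Run against the three URLs in the quoted guideline, the check rejects `article23.html` and `article2006.html` and accepts `article234.html`, which agrees with the examples given there.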
Using the title of your Blog Posts for the file naming is not best practice from a variety of usability standpoints, as I've outlined above and have referenced below. And please, no references to the Google Blog or any other trusted resource that utilizes the long multi-hyphenated URI structure. Just because they opted to inflict that usability nightmare upon their users is surely no reason that you should do the same. Take my advice: don't follow in this instance. Set your own standards following best practices in this area.