W3C Compliance - Not an SEO Factor?

2010-03-05 - BREAKING SEO NEWS!

AlanBleiweiss: W3C Compliance is NOT an SEO factor to Google #MattCuttsQuote #SMX
1:29 PM Mar 4th, 2010 http://Twitter.com/AlanBleiweiss/statuses/9991551911

Extra! Extra! BREAKING INTERNET NEWS! But wait, there's more!

Table of Contents

  1. That's Not What Was Said!
  2. Basic Example of W3C Compliance being an SEO Factor
  3. What is the relationship between SEO and W3C?
  4. Questions for Matt Cutts
  5. HTML and XHTML Techniques for WCAG 2.0
  6. Using HTML According To Spec
  7. My pages do just fine with all the errors!
  8. Why would you rely on error recovery routines?
  9. Valid vs Invalid - All Things Being Equal
  10. Is Validation Important for SEO?
  11. Validation and ROI - A Failed Argument
  12. Fix Your Website Now!
  13. On a Side Note: Matt Cutts' Quote About Website Speed
  14. Bing: Clean code can help quite a bit in indexing on all the SEs.

That's Not What Was Said!

During an exclusive WebProNews interview conducted by Mike McDonald on 2010-03-04, featuring the handsome and illustrious Matt Cutts from Google, the following question was asked...

Question: What is the relationship between SEO and W3C?
Mike McDonald - We've answered that a million times. There is, you know, it's not going to hurt you but, it's, it's not going to help you in the rankings right?
Matt Cutts - Yeah, there's so many people that write invalid HTML with syntax errors, that still is good content, that we need to be able to rank that good content even if somebody doesn't, you know, have something that is completely lint free in terms of validation.

Now, take a look at the opening Tweet where Matt Cutts was quoted...

W3C Compliance is NOT an SEO factor to Google

I do not see where Matt Cutts made that exact statement. I can see how one could interpret it that way based on how the initial question was posed and how the lead-in was delivered by Mike McDonald.

W3C Compliance is NOT an SEO factor to Google

The above statement contradicts itself. You mean to tell me that an amateur oversight like missing alt attributes on a logo, or on primary category links that were designed as graphics, is NOT an SEO factor? That's false! Proper alt attributes are required for compliance, and they are what the user agent SEEs when indexing your content. It is written in the HTML specifications; you can't argue with that.

Note: Content is "equivalent" to other content when both fulfill essentially the same function or purpose upon presentation to the user.
http://www.W3.org/TR/WCAG10/#equivalent, http://www.W3.org/TR/UAAG20/#def-text-eq


Basic Example of W3C Compliance being an SEO Factor

Here's a very simple and basic example of how validation and SEO are directly related. I see this regularly, even on sites that are supposedly optimized for the search engines, not to mention humans.

<div id="header">
<a href="/"><img src="/images/logo.png" width="300" height="100"></a>
</div>

Hint: something important is missing from the markup above.

I'm sure most of you reading this know exactly what is wrong with the above. If that is the case, this article is probably not for you, but you'll want to read along for the fun and excitement we're going to have!

For those of you who don't SEE what is wrong with the above, let me add a little semantic flavoring to it...

<div id="header">
<a href="http://www.example.com/"><img src="/images/company-name.png" width="300" height="100" alt="Company Name"></a>
</div>

Hint: note the additions to the markup above.

I've added the alt="Company Name" and changed the image file name to company-name.png. If the alt attribute is not present, the UA may fall back on the file name as a text alternative. If you're going to overlook alt attributes, hopefully your file naming conventions provide a semantic alternative. Be sure to use hyphens to separate words for optimal parsing.

Some would suggest adding a title attribute to the anchor, and that's fine too. It does present a repetitive instance (stuttering) if the title and alt are an exact match, so be careful in how you construct your alternative and/or equivalent text values.

<div id="header">
<a title="Company Name" href="http://www.example.com/"><img src="/images/company-name.png" width="300" height="100" alt="Company Name"></a>
</div>

Hint: note the title attribute added to the anchor.


What is the relationship between SEO and W3C?

The original question was asked in a suggestive way that set the stage for a generic answer, which of course can be interpreted many ways.

Answering "What is the relationship between SEO and W3C?" in 246 characters surely doesn't do the topic justice.

Mike McDonald's lead-in placed Matt Cutts in a position where a casual yeah came out freely as a preface to the statement that followed. I don't think it was fair for Mike to pre-load the question with (emphasis mine)...

What is the relationship between SEO and W3C?
Mike McDonald - We've answered that a million times. There is, you know, it's not going to hurt you but, it's, it's not going to help you in the rankings right?
Matt Cutts - Yeah, there's so many people that write invalid HTML with syntax errors, that still is good content, that we need to be able to rank that good content even if somebody doesn't, you know, have something that is completely lint free in terms of validation.

Okay, I can understand the exaggeration, but that last part of the question leads Matt right into a casual yeah response, followed by a statement that raises many more questions. I believe that once the right questions are asked, Matt's responses could easily be steered in favor of validation having a major impact on SEO.

I believe I know why Mike McDonald led with the above question: one look at the validation status of WebProNews and it seems they're just looking for Matt Cutts to confirm that their crappy code is okay. And get this: they are missing 35 alt attributes. Must be a Professional SEO working on that site.


Questions for Matt Cutts

[Image: Matt Cutts as Dr. Evil]

My questions would have been posed in this manner...

Matt, you handsome devil you, I've got a few questions for you.

  1. How can invalid markup affect a document's performance in the SERPs? For example, how are WebProNews' 121 HTML Errors, 19 Warnings, and 35 Missing Alt Attributes affecting their documents as of 2010-03-05?
  2. Are there any types of markup errors that Google is aware of that may prevent the proper indexing of a document and therefore have an impact on the SEO initiatives in place? For example, I see a few parsing errors in the validation report for WebProNews that would concern me. How about you?
  3. What happens if a primary navigation menu is in graphic format and is void of alt attributes and descriptive file names to describe what those graphic headers are? Are these not issues that arise during a validation routine? Are these not issues that present potential challenges for one's SEO objectives? (See the sketch following this list.)
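To illustrate question 3, here's a hypothetical graphic navigation menu; the URLs and file names are made up for illustration. The first version gives the UA nothing to index; the second gives it a text equivalent for every graphic link.

<!-- Graphic menu with no text equivalents: the UA has nothing to index -->
<ul id="nav">
<li><a href="/products/"><img src="/images/btn01.png" width="120" height="40"></a></li>
<li><a href="/services/"><img src="/images/btn02.png" width="120" height="40"></a></li>
</ul>

<!-- Same menu with descriptive file names and alt attributes -->
<ul id="nav">
<li><a href="/products/"><img src="/images/products.png" width="120" height="40" alt="Products"></a></li>
<li><a href="/services/"><img src="/images/services.png" width="120" height="40" alt="Services"></a></li>
</ul>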

I have a very long list of questions that would put Matt Cutts into a position where he would surely agree that validation does have an impact on SEO. I believe the three questions above do just that. It would be very difficult for anyone to dispute that the above factors have a direct impact on your SEO initiatives, however minor you may think they are.


HTML and XHTML Techniques for WCAG 2.0

Here comes the nitty-gritty, as they say. For those of you who persistently question what the ROI of validation is, here are your answers. These come from the authority on the subject, the folks who write the instructions for the Internet protocols, the World Wide Web Consortium (W3C), the instructions that your development team have apparently failed to read.

This next section comes from WCAG 2.0 (Web Content Accessibility Guidelines). I will address each section and try to paraphrase things in layman's terms.

H88: Using HTML According to Spec

Description: The objective of this technique is to use HTML and XHTML according to their respective specifications. Technology specifications define the meaning and proper handling of features of the technology. Using those features in the manner described by the specification ensures that user agents, including assistive technologies, will be able to present representations of the feature that are accurate to the author's intent and interoperable with each other.
http://www.W3.org/TR/2008/NOTE-WCAG20-TECHS-20081211/html.html#H88-description

The above is pretty straightforward. That last <em>phasized sentence should give you a few clues as to the importance of writing well formed and valid markup.

At the time this technique was published, the appropriate versions of these technologies is HTML 4.01 and XHTML 1.0. HTML 4.01 is the latest mature version of HTML, which provides specific accessibility features and is widely supported by user agents. XHTML 1.0 provides the same features as HTML 4.01, except that it uses an XML structure, has a more strict syntax than the HTML structure. Later versions of these technologies are not mature and / or are not widely supported by user agents at this time.

See that part above about providing specific accessibility features that are widely supported by user agents? You do realize that Googlebot is a user agent, correct? Okay, I know you knew that, I just wanted to make sure before we continue.


Using only features that are defined in the specification

HTML defines sets of elements, attributes, and attribute values that may be used on Web pages. These features have specific semantic meanings and are intended to be processed by user agents in particular ways. Sometimes, however, additional features come into common authoring practice. These are usually initially supported by only one user agent. When features not in the specification are used, many user agents may not support the feature for a while or ever. Furthermore, lacking standard specifications for the use of these features, different user agents may provide varying support. This impacts accessibility because assistive technologies, developed with fewer resources than mainstream user agents, may take a long time if ever to add useful support. Therefore, authors should avoid features not defined in HTML and XHTML to prevent unexpected accessibility problems.

Emphasis and Strong Emphasis mine. Specific semantic meanings. Does everyone reading this know what that means? It means when you are constructing your document, those buttons you push up there in your WYSIWYG Editor have specific meanings behind them.

[Image: WYSIWYG editor blockquote buttons]

For example, that is not an INDENT button (those two little buttons with arrows pointing left and right, see image above), that is a <blockquote> button, and it has a very specific meaning. Were you aware of that? For many, probably not, and this is common. Most are accustomed to a word processing environment, which has distorted their view of WYSIWYG. Careful: a web WYSIWYG is much more meaningful than a word processing WYSIWYG. The buttons look the same, but now they have real semantic value.
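To make that concrete, here's a minimal sketch, with made-up content, of the difference. The first snippet uses <blockquote> purely for visual indentation, conveying a semantic message (this is a quotation) the author never intended; the second uses it as specified, for an actual quotation, with its source in the cite attribute.

<!-- Misuse: blockquote pressed into service as an indent -->
<blockquote>
<p>Our spring sale starts Monday.</p>
</blockquote>

<!-- Proper use: blockquote marks up an actual quotation -->
<blockquote cite="http://www.example.com/interview">
<p>Clean code can help quite a bit in indexing on all the SEs.</p>
</blockquote>

If all you want is the indentation, that's a job for CSS (e.g. margin-left), not for a semantic element.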


Using features in the manner prescribed by the specification

The HTML specification provides specific guidance about how particular elements, attributes, and attribute values are to be processed and understood semantically. Sometimes, however, authors use features in a manner that is not supported by the specification, for example, using semantic elements to achieve visual effects without intending the underlying semantic message to be conveyed. This leads to confusion for user agents and assistive technologies that rely on correct semantic information to present a coherent representation of the page. It is important to use HTML features only as prescribed by the HTML specification.

Emphasis mine. There are HTML specifications which many Webmasters, Developers, Designers, SEOs, etc. have overlooked; I don't think they even know of their existence. Ask your Webmaster why your website has all of those markup errors. Ask them if they can explain each of those markup errors to you and what impact they could potentially have on your marketing efforts. Have them document the error details and explain why those errors are okay to have; you'll want that for future reference when requesting a refund of your monies - you're getting poor-quality building materials. Hey, don't shoot the messenger.


Making sure the content can be parsed

HTML and XHTML also define how content should be encoded in order to be correctly processed by user agents. Rules about the structure of start and end tags, attributes and values, nesting of elements, etc. ensure that user agents will parse the content in a way to achieve the intended document representation. Following the structural rules in these specifications is an important part of using these technologies according to specification.

Emphasis mine. There are rules for every single HTML Element and Attribute you will use in your documents. Not adhering to the rules for their usage may negate their parsed semantic value.
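As a minimal, made-up illustration of those structural rules: in the first snippet the elements overlap instead of nesting, so the parser has to guess where the emphasis actually ends; the second is well formed and parses exactly as the author intended.

<!-- Malformed: improperly nested tags force the parser into error recovery -->
<p>This is <strong>very <em>important</strong> text</em>.</p>

<!-- Well formed: elements close in the reverse order they were opened -->
<p>This is <strong>very <em>important</em></strong> text.</p>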

Please do keep in mind that we are discussing machine-readable grammar. If Google had their way (which they do), they wouldn't have to crawl the web anymore; they could assign a unique ID to each domain and you'd present them with XML files via an API, full of well formed, semantically structured, quality content. No more wasted resources crawling documents that invoke millions of error recovery routines.

There are many websites doing this already; they have been for years, and that's why they are outperforming YOU in the SERPs. If you haven't noticed, today's date is Mar 5, 2010, not Mar 5, 1996. Get with the program!
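For what it's worth, something close to that already exists: the Sitemaps XML protocol (sitemaps.org), supported by Google, Bing, and Yahoo!, lets you hand the UAs a well formed, machine-readable list of your URLs instead of leaving discovery entirely to crawling. A minimal sketch, with a made-up URL:

<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <!-- one url entry per document; loc is required, the rest are optional hints -->
  <url>
    <loc>http://www.example.com/</loc>
    <lastmod>2010-03-05</lastmod>
    <changefreq>weekly</changefreq>
  </url>
</urlset>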


My pages do just fine with all the errors!

Of course they do. The crawlers are quite advanced these days and are designed to recover from a large percentage of the errors that are present. If they can't recover, they stop indexing the document at that point in the code. You can see this behavior when viewing cached copies of websites where the crawler couldn't understand the code behind the scenes and just stopped right in the middle of the page. Obviously not the desired result.
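A classic, hypothetical example of the kind of error a parser may not recover from gracefully: an unclosed comment. Depending on the UA's recovery routine, everything after the broken comment may be treated as part of the comment and never indexed.

<!-- This snippet is deliberately broken for illustration -->
<div id="content">
<!-- begin main content ... note the missing closing marker
<p>To a strict parser, this paragraph is still inside the comment
above, so none of this quality content gets indexed.</p>
</div>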


Why would you rely on error recovery routines?

The objective for the crawler is to get from point A to point Z in the shortest amount of time. Along the way, it has to catalog A through Z. Of course, a straight line between point A and point Z is the ultimate scenario, and that is what valid and well formed markup provides.

The user agent, e.g. Googlebot, MSNBot, Slurp, starts at point A with just enough resources to reach point Z, allowing additional resources for common markup errors along the way. If at any point the markup errors present require the UA to divert those initial resources and travel to point Z along a less than optimal path, having to recover from hundreds or even thousands of markup errors, that's your loss, and the UA's. That is why it is important that the UA be able to recover from markup errors, and why Matt Cutts specifically states this...

What is the relationship between SEO and W3C?
Matt Cutts - There's so many people that write invalid HTML with syntax errors, that still is good content, that we need to be able to rank that good content even if somebody doesn't, you know, have something that is completely lint free in terms of validation.

Do you really want to leave the fate of your document indexing to error recovery routines? Are you sure those parsing errors reported by the validator that you've shown to your developer are going to be okay? Why can't they just fix them now? Why do the errors have to be there to begin with? That's not right.


Valid vs Invalid - All Things Being Equal

I have another question for Matt Cutts from Google:

Let's say there are 5 websites competing in the same industry, each with identical link profiles, all with quality content and similar informational value.

One of those websites utilizes well formed and valid markup. They've also used the latest technologies for semantically presenting that content to the UA, e.g. HTML5, Semantic Markup, Microformats, Document Link Relationships, etc.

The other 4 present a wide range of markup errors, most basic in nature, but a few of which may prevent the proper indexing of their domain by Googlebot. Which of the 5 would Google Search Engineers favor?

I think any Search Engineer will admit that crawling well formed, valid markup is nirvana for their UAs. Why? Have you ever wondered how many resources are used when performing error recovery routines on malformed markup? How do you think Googlebot reacts to thousands of markup errors?
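As a hedged sketch of what that first website's markup might look like, here's a fragment combining valid structure with an hCard microformat and a document link relationship. The class names follow the published hCard format (microformats.org); the company, URLs, and locality are made up for illustration.

<!-- hCard: machine-readable contact info layered onto ordinary markup -->
<div class="vcard">
  <a class="fn org url" href="http://www.example.com/">Example Widgets, Inc.</a>
  <span class="adr">
    <span class="locality">Springfield</span>,
    <span class="region">IL</span>
  </span>
</div>

<!-- and, in the document head, a link relationship the major engines support -->
<link rel="canonical" href="http://www.example.com/widgets/">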


Is Validation Important for SEO?

Yes, it is probably the single most important thing you as an SEO can do for on-page optimization. Why do I say that? How else are you going to find and fix semantic markup errors? You can't see them in most instances; you have to inspect the code, you know, that stuff they refer to as HTML markup. Code? What's that? Ya, I kind of figured you'd say that.
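To show what "you can't see them" means, here's a made-up fragment that renders just fine in a browser yet carries real markup errors that only surface when you inspect or validate the code:

<!-- Renders fine, validates badly: duplicate id, an unquoted attribute
     value containing a space, and an unclosed <p> -->
<div id="content">
<p class=product summary>Our flagship widget.
<div id="content">See it in action.</div>
</div>

Run a page like that through the W3C Markup Validator (validator.w3.org) and every one of those invisible problems is enumerated for you.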

Here's a good argument for markup validation and its potential impact on SEO...

In the above-referenced article, you'll find this statement...

Another problem was a lack of appreciation for standards. W3C has HTML validators that we knew we failed but there was always a feeling of "so what?" The site rendered perfectly fine on the browsers we had access to. Why should this cause a problem?

That line sounds vaguely familiar to me. How about you?


Validation and ROI - A Failed Argument

Here's a good argument for markup validation and its potential impact on ROI. To make a long story short, the above type of attitude was adopted by Target. It cost them $6,000,000.00 USD. That's a lot of zeroes!

If any one of you ever brings up the ROI argument again, I will publicly humiliate you by taking your home page document and ripping it apart byte for byte while performing a public review, explaining to YOU in detail where your challenges are. YOU know who YOU are.

YOU act as if this is something that comes after the fact? It is not. It happens before the fact. Don't be dissin' validation just because YOU don't understand it. Or better yet, don't even dare dis validation while your Developer sits there trying to justify his or her invalid existence; I will publicly humiliate you, twice, maybe even thrice.

You want to know why your website markup is invalid? Because your lazy-ass developer and/or designer and/or webmaster forgot to read the instructions. Unfortunately, when you took your website out of the box, it did not contain a 128-page fold-out set of instructions showing steps 1 through 1,000 in pictures, with exact counts for all the nuts and bolts.

But hey, all you really have to do these days is enter some information in a few fields, check this box, and that box, select from a few dropdowns, verify, click the Publish button. Done.

Ya, if only things were that simple, you'd be in the number one spot right now, wouldn't you?


On a Side Note: Matt Cutts' Quote About Website Speed

I also see Matt being misquoted about the speed of your documents. If you were smart, you'd address these issues now and not wait until it becomes an extremely competitive situation and the leaders have distanced themselves from you. Stop following. Be proactive and take the lead - before your competitors do.

Matt Cutts - We're always going to care first and foremost about quality. How good is a page for users, how much does it help them find the information that they needed. Only in cases where it's like roughly the same quality of content page but this one is a lot faster or this one is a lot slower, those are the sort of situations where it might make a difference.


Bing: Clean code can help quite a bit in indexing on all the SEs.

Brett Yount, Program Manager, Bing Webmaster Center - Clean code can help quite a bit in indexing on all the SEs. If you are just starting out, I suggest finding a W3C compliant template.
Posted on 2 Mar 2010 10:23 AM
http://www.bing.com/community/forums/p/657771/9587249.aspx


 
