13 March 2009
A common issue with websites that contain SWF content is that search engine spiders cannot enter the site to crawl its content. It's as if the website has a bouncer and the spider doesn't have proper identification to get into the club. To prevent this scenario, you need to ensure that all of your detection scripts for checking browser compatibility and Adobe Flash Player versions follow established best practices. This article walks you through the issues to watch out for and explains techniques you can use now to ensure that search engine spiders can walk right in through your website's front door.
To get the most out of this article, you should have knowledge of Adobe Flash technologies and an advanced understanding of web development techniques related to SWF-based websites and applications. You should also read Search optimization techniques for RIAs, which provides detailed explanations of the techniques and topics discussed.
Search engine spiders are actually rather delicate creatures. They can be thrown off by a wide variety of problems that we call spider traps. These barriers, which prevent spiders from crawling a site, usually stem from techniques for displaying web pages that work fine for browsers but do not work for spiders. By eliminating these techniques from your site, you allow spiders to index more of your site.
Unfortunately, many spider traps are the product of JavaScript code that determines what version of Adobe Flash Player is installed, or whether the user's browser is compatible with the website. After all the budget and time that were invested to build your site, you don't want to hear that it shuts out search engines.
Review all the steps and detection scripts that are required before displaying the SWF content to your user:
If you want to see how many sites that feature SWF content actually block search spiders, simply do a search for "flash player required" and see how many results you find. There are millions of sites out there that show a message like "MyWebSite.com requires the Flash Player plug-in." Spiders are not web browsers, so they cannot interact with your site to download any software requirements. They can read only document formats, such as HTML and PDF files. So when they run into these software download requirements, they go elsewhere.
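As a rough illustration of a detection step that does not shut spiders out, the sketch below decides what a page should show once a detection script has run. The function names (meetsVersion, chooseContent) and the return values are hypothetical and are not part of SWFObject or any Adobe library. The assumption is that a visitor with no detectable Flash Player, which is how a spider appears, should be left with the crawlable HTML already in the page rather than sent to a "Flash Player required" message.

```javascript
// Hypothetical helper: compare dotted version strings such as "9.0.0".
// Returns true when `installed` is the same as or newer than `required`.
function meetsVersion(installed, required) {
  const a = installed.split(".").map(Number);
  const b = required.split(".").map(Number);
  for (let i = 0; i < Math.max(a.length, b.length); i++) {
    const x = a[i] || 0;
    const y = b[i] || 0;
    if (x !== y) return x > y;
  }
  return true; // versions are equal
}

// Hypothetical decision step. `installedVersion` is null when no
// Flash Player was detected (this is also what a spider looks like).
function chooseContent(installedVersion, requiredVersion) {
  if (installedVersion !== null && meetsVersion(installedVersion, requiredVersion)) {
    return "swf";
  }
  // Spider trap to avoid: redirecting here to a download-only page.
  // Instead, keep the HTML alternative content that is already in the page.
  return "alternative-html";
}
```

With this shape, the "no Flash Player" branch and the "spider" branch are the same branch, so whatever a spider receives is also what a visitor without the plug-in receives.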
That said, technology does change and becomes more advanced every day. In fact, last year Adobe announced a major breakthrough with the release of optimized Adobe Flash Player technology (since dubbed "Flash Player for Search Engines"), which is essentially a "headless" version of Flash Player that can change states of SWF content and gain access to the text content residing within. For a quick overview of how Flash Player for Search Engines works, please watch Duane Nickull's video blog post about it.
Luckily, spiders become more sophisticated every year. Designs that trapped spiders a few years ago are now OK. Still, you need to keep up with spider advances to employ some cutting-edge techniques with your SWF content.
The odds are that you're using the popular SWFObject 2 to embed your SWF content. If you're not, you should be. It has proven to be the most effective way to determine browser and Flash Player version compatibility.
If you are using the latest version of SWFObject 2, you know that the alternative content for your SWF goes into a <div> tag, which SWFObject automatically replaces with the SWF when the required Flash Player version is available. For example:
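<head>
<script type="text/javascript" src="swfobject.js"></script>
<script type="text/javascript">
// Arguments: SWF URL, id of the element to replace, width, height, minimum Flash Player version
swfobject.embedSWF("myContent.swf", "myContent", "300", "120", "9.0.0");
</script>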
</head>
<body>
<div id="myContent">
<p>Alternative content</p>
</div>
</body>
The challenge with this is that in some cases using a <div> tag is really "hiding" content from the user. Many experts in the search industry have warned against hiding content from the user, which a search engine may treat as a spam technique. For example, suppose your site was about basketball shoes, but you wanted to rank really well for football shoes. You could put the basketball shoes content in your main body content and add a hidden <div> tag containing information about football shoes.
There is no definitive documentation that says "do not use <div> tags", but the theory is that using the <noscript> tag to display the alternative content is a safer method because it shows the alternative content to users who do not have JavaScript enabled. Search engines cannot read JavaScript, so they read the <noscript> content instead, and the approach also provides good accessibility, so it's a win-win situation:
<noscript>
<!-- alternative SWF content -->
</noscript>
For further details on the benefits and techniques of this method, see the Place SWF content in HTML source section of my article, Search optimization techniques for RIAs.
Pop-up windows create unique headaches for search engines trying to crawl your site.
Because search spiders cannot see pop-up windows, structure your SWF content in a way that doesn't rely on pop-up windows.
Avoid using a "Launch Flash site" button on your index HTML file that launches a pop-up window containing your SWF content.
A robots.txt file is a good way to steer search engines away from irrelevant content. See the Create a robots.txt file section of my article, Search optimization techniques for RIAs, to learn more.
This plain-text file that you place in the root directory of your web server tells the spider what files it is allowed to look at on that server. It's a simple way to control what content you want the search engine to crawl.
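To make the file format concrete, here is a minimal sketch that builds the text of a robots.txt file from a list of paths that spiders should skip. The function name buildRobotsTxt and the example paths are hypothetical; only the User-agent and Disallow directives come from the robots.txt convention itself. The resulting text would be saved as robots.txt in the root directory of your web server.

```javascript
// Hypothetical helper: build robots.txt text from a list of path
// prefixes that crawlers should not visit.
function buildRobotsTxt(disallowedPaths, userAgent = "*") {
  const lines = [`User-agent: ${userAgent}`];
  if (disallowedPaths.length === 0) {
    // An empty Disallow value means nothing is blocked.
    lines.push("Disallow:");
  }
  for (const path of disallowedPaths) {
    lines.push(`Disallow: ${path}`);
  }
  return lines.join("\n") + "\n";
}

// Example with made-up paths:
const robotsTxt = buildRobotsTxt(["/cgi-bin/", "/tmp/"]);
// robotsTxt now holds:
// User-agent: *
// Disallow: /cgi-bin/
// Disallow: /tmp/
```

Be careful not to list the directories that hold your SWF files or their alternative HTML content, since a Disallow rule there would itself become a spider trap.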
Visit the Creating a robots.txt file page on Google Webmasters Help for additional documentation.
Here are the steps you need to take to make sure your website is structured in a way that does not block search engines from crawling your SWF content:
Use the <noscript> method instead of <div> tags for displaying your alternative SWF content.
This work is licensed under a Creative Commons Attribution-Noncommercial-No Derivative Works 3.0 Unported License.