Using the PDF IFilter
Windows: When I search PDF files on a Web site that I indexed using the PDF IFilter on Microsoft Site Server, I get inconsistent results. Sometimes a search returns five documents, sometimes more, sometimes fewer. What is going on?
The inconsistency arises because Microsoft Site Server reads the PDF IFilter slightly differently than it reads other IFilters. The Adobe PDF IFilter was designed to work with Microsoft Index Server, so your best bet is to use that product instead. Index Server (unlike Site Server, which is the full-featured commercial site-management tool) is included free of charge with Internet Information Server (IIS) in the Microsoft Windows Option Pack.

Here is how the process works in general: The Adobe® PDF IFilter extends Microsoft Index Server so that visitors to your Web site can search for text within PDF files on the site, in addition to the document types Index Server supports by default.

Essentially, an IFilter is a DLL (dynamic link library) file that applications call to extract words from a given file format. For example, when a "caller" such as Microsoft Index Server or Site Server indexes the contents of a directory that contains a PDF file, it checks in the Windows NT Registry for a DLL that can handle files with the .pdf extension. The PDF IFilter DLL recognizes the file format, and is then used to extract all of the words from the file.
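
To make the lookup-and-extract flow above more concrete, here is a minimal sketch in Python. It is only a model of the idea, not the actual IFilter interface or the real Registry layout: the REGISTRY dictionary stands in for the Windows NT Registry, and the names plain_text_filter, toy_pdf_filter, find_filter, and index_directory are hypothetical. The toy PDF filter only pulls words out of parenthesized literal strings, which is far simpler than what a real PDF filter has to do.

```python
import re
from typing import Callable, Dict, List, Optional

# A "filter" here is any function that turns raw file bytes into a list of words.
FilterFn = Callable[[bytes], List[str]]

WORD = re.compile(r"[A-Za-z0-9]+")


def plain_text_filter(data: bytes) -> List[str]:
    """Hypothetical filter for plain text: every alphanumeric run is a word."""
    return WORD.findall(data.decode("latin-1"))


def toy_pdf_filter(data: bytes) -> List[str]:
    """Hypothetical PDF filter: reads only words inside (...) literal strings.

    Real PDF text extraction is far more involved; this is an illustration.
    """
    words: List[str] = []
    for literal in re.findall(r"\(([^()]*)\)", data.decode("latin-1")):
        words.extend(WORD.findall(literal))
    return words


# Stand-in for the Registry: file extension -> filter that handles it (assumed layout).
REGISTRY: Dict[str, FilterFn] = {
    ".txt": plain_text_filter,
    ".pdf": toy_pdf_filter,
}


def find_filter(filename: str, registry: Dict[str, FilterFn]) -> Optional[FilterFn]:
    """Look up the filter registered for the file's extension, if any."""
    dot = filename.rfind(".")
    if dot == -1:
        return None
    return registry.get(filename[dot:].lower())


def index_directory(files: Dict[str, bytes],
                    registry: Dict[str, FilterFn]) -> Dict[str, List[str]]:
    """Play the role of the "caller": build a word -> filenames index."""
    index: Dict[str, set] = {}
    for name, data in files.items():
        extract = find_filter(name, registry)
        if extract is None:
            continue  # no filter registered for this extension, so skip the file
        for word in extract(data):
            index.setdefault(word.lower(), set()).add(name)
    return {word: sorted(names) for word, names in index.items()}
```

In this sketch the caller never needs to understand the PDF format itself; it only asks the lookup table which filter handles the extension, which is why adding a new filter is enough to make a new file type searchable.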

There is another option for searching within your PDFs: You can use a third-party PDF search engine. For example, Verity (a company that worked with Adobe Systems to create the Acrobat® Find and Search functions, as well as Acrobat Catalog) offers search engines that can search your entire site, including PDF files.