How do you get SharePoint 2007 to read and index content inside a PDF file?

This is an easy one, but it takes a little work to get right. SharePoint's search service indexes document contents, but it doesn't look inside PDFs by default. Searching inside PDF documents requires an iFilter from Adobe, which they designed so that third-party systems can read the PDF file format. Adobe includes this filter with Adobe Reader, or you can download the iFilter separately from Adobe's site if you don't want Reader installed on your SharePoint servers.

http://www.adobe.com/products/reader – Latest version of Adobe Reader

or

http://www.adobe.com/support/downloads/detail.jsp?ftpID=2611 – x86 iFilter
http://www.adobe.com/support/downloads/detail.jsp?ftpID=4025 – x64 iFilter

 
CENTRAL ADMINISTRATION
Now in SharePoint itself, you need to configure the search service to index files with the .pdf extension:

1. Go to Central Administration (CA) and open your Shared Service under Shared Services Administration.
2. Click Search Administration under the Search section.
3. Click File Types in the left nav bar and then click New File Type.
4. Enter “pdf” and click OK.

ICONS
You will also want to display the PDF icon next to PDF Documents in SharePoint.  You can download the icon from here:

http://www.adobe.com/images/pdficon_small.gif

and copy it into the 12 hive folder here:

C:\Program Files\Common Files\Microsoft Shared\Web Server Extensions\12\TEMPLATE\IMAGES

Then open up this XML template file:

C:\Program Files\Common Files\Microsoft Shared\Web Server Extensions\12\TEMPLATE\XML\DOCICON.XML

and add this line inside the <ByExtension> element (under <DocIcons>) if it isn't there already:

<Mapping Key="pdf" Value="pdficon_small.gif"/>

After saving the file, run iisreset so SharePoint picks up the new icon mapping.
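If you have several front-end servers, the DOCICON.XML edit can be scripted instead of done by hand. Here is a minimal Python sketch of the idea, assuming the file has the usual <DocIcons> root with a <ByExtension> child. The function name add_pdf_mapping and the tiny sample document are made up for illustration; they are not part of SharePoint.

```python
import xml.etree.ElementTree as ET


def add_pdf_mapping(docicon_xml, icon="pdficon_small.gif"):
    """Return DOCICON.XML text with a pdf mapping added if it is missing.

    Assumes the root element is <DocIcons> with a <ByExtension> child.
    """
    root = ET.fromstring(docicon_xml)
    by_ext = root.find("ByExtension")
    if by_ext is None:
        raise ValueError("no <ByExtension> element found")

    # Leave the file alone if a pdf mapping already exists.
    for mapping in by_ext.findall("Mapping"):
        if mapping.get("Key", "").lower() == "pdf":
            return docicon_xml

    ET.SubElement(by_ext, "Mapping", {"Key": "pdf", "Value": icon})
    return ET.tostring(root, encoding="unicode")


# Hypothetical, trimmed-down sample; the real file has many more mappings.
sample = (
    "<DocIcons><ByExtension>"
    '<Mapping Key="doc" Value="icdoc.gif"/>'
    "</ByExtension></DocIcons>"
)
updated = add_pdf_mapping(sample)
```

Note that ElementTree drops comments and the XML declaration when it rewrites a document, so keep a backup copy of the original DOCICON.XML before writing the result back.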

REGISTRY
Now on to the registry changes you need to make on each index server. Make sure to back up your registry before making any changes. These two changes register the Adobe PDF iFilter with the Office Search service. The values that need to be changed are:

HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Office Server\12.0\Search\Setup\ContentIndexCommon\Filters\Extension\.pdf

HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Shared Tools\Web Server Extensions\12.0\Search\Setup\ContentIndexCommon\Filters\Extension\.pdf

Both values should be changed to:

{E8978DA6-047F-4E3D-9C78-CDBE46041603}
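If you would rather script the registry change than click through regedit on every index server, something like the following Python sketch could work. It assumes the GUID goes into the (Default) value of each .pdf key, that Python is installed on the server, and that the script runs from an elevated prompt. The names FILTER_KEYS and register_pdf_ifilter are made up for this example.

```python
# CLSID of the Adobe PDF iFilter, as given above.
PDF_IFILTER_CLSID = "{E8978DA6-047F-4E3D-9C78-CDBE46041603}"

# The two keys under HKEY_LOCAL_MACHINE listed above.
FILTER_KEYS = [
    r"SOFTWARE\Microsoft\Office Server\12.0\Search\Setup"
    r"\ContentIndexCommon\Filters\Extension\.pdf",
    r"SOFTWARE\Microsoft\Shared Tools\Web Server Extensions\12.0\Search\Setup"
    r"\ContentIndexCommon\Filters\Extension\.pdf",
]


def register_pdf_ifilter():
    """Write the iFilter CLSID into both keys (Windows only, needs admin)."""
    import winreg  # imported here because the module exists only on Windows

    for subkey in FILTER_KEYS:
        with winreg.CreateKeyEx(
            winreg.HKEY_LOCAL_MACHINE, subkey, 0, winreg.KEY_SET_VALUE
        ) as key:
            # An empty value name targets the key's (Default) value,
            # which is an assumption about where the GUID belongs.
            winreg.SetValueEx(key, "", 0, winreg.REG_SZ, PDF_IFILTER_CLSID)
```

Call register_pdf_ifilter() once per index server, after backing up the registry. On 64-bit Windows, use a 64-bit Python so the write is not redirected to the 32-bit registry view.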

SYSTEM PATH
Now you need to add the Adobe install directory to the system Path environment variable so that the search service can find the DLL that provides the iFilter:

1. Right click My Computer
2. Click Properties
3. Click Advanced
4. Click Environment Variables
5. In the bottom half of the window, find the Path variable and double click it.
6. At the end of the value, add:

;C:\Program Files\Adobe\Reader 9.0\Reader
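It is easy to end up with a duplicate entry or a doubled semicolon when editing Path by hand. The small Python sketch below shows the append logic on a plain string, so you can check what the new value should look like before pasting it in. The helper name append_to_path is made up for this example, and it only builds the string; it does not change the environment variable itself.

```python
def append_to_path(path_value, new_dir):
    """Return path_value with new_dir appended, unless it is already listed.

    Comparison ignores case and trailing backslashes, as Windows does.
    """
    entries = [e for e in path_value.split(";") if e]
    known = {e.rstrip("\\").lower() for e in entries}
    if new_dir.rstrip("\\").lower() in known:
        return path_value  # already present, nothing to do
    if not entries:
        return new_dir
    return path_value.rstrip(";") + ";" + new_dir


current = r"C:\Windows\system32;C:\Windows"
reader_dir = r"C:\Program Files\Adobe\Reader 9.0\Reader"
new_value = append_to_path(current, reader_dir)
```

Running it a second time on new_value returns the same string, so the directory is never added twice.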

RESTART SEARCH SERVICES
Now you need to restart the Office Search service so that all the changes take effect. Open a command prompt and type:

sc stop osearch [press enter]
sc start osearch [press enter]

Or just restart it via the Services MMC.
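One catch with sc is that sc stop returns right away, before the service has actually finished stopping, so an immediate sc start can fail. If you script the restart, it helps to poll sc query until the state reads STOPPED. A rough Python sketch under that assumption follows; the function names and the 120-second timeout are arbitrary choices for this example, and the state names are assumed to appear in English in the sc output.

```python
import subprocess
import time


def is_stopped(sc_query_output):
    """True if the STATE line of `sc query` output reports STOPPED."""
    for line in sc_query_output.splitlines():
        if "STATE" in line:
            return "STOPPED" in line
    return False


def restart_service(name="osearch", timeout=120):
    """Stop the service, wait until it is really stopped, then start it."""
    subprocess.run(["sc", "stop", name], check=False)
    deadline = time.time() + timeout
    while time.time() < deadline:
        result = subprocess.run(
            ["sc", "query", name], capture_output=True, text=True
        )
        if is_stopped(result.stdout):
            break
        time.sleep(2)
    else:
        raise TimeoutError(f"{name} did not stop within {timeout} seconds")
    subprocess.run(["sc", "start", name], check=True)
```

Call restart_service() from an elevated prompt on the index server. The Services MMC route does the same waiting for you.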

If you already have PDF documents in SharePoint that you want to search inside, you have to "Reset all crawled content" in Search Settings and then begin a new "Full Crawl" under Content Sources.
