PDF attachments are excluded from full-text search indexes

Environment: Domino Release 8.5.1 FP2 (same problem on 8.5.1 FP3). The database is an NSF at ODS 51, full-text indexed with the options “Index attached files” + “With found text”.

Attachments: stored as embedded objects in the NSF (NOT using DAOS).

	The embedded PDF files themselves contain searchable text.

My problem: with full-text search, a keyword is found only in documents with .txt, .xls, .doc… attachments, NOT in documents with PDF attachments.

So, if somebody knows this problem:

Full-text searching inside attached PDF files does not work!

	Why?

Thank you for your help

Note: I ran various tests on different PDF formats, such as 1.3 (Acrobat 3.x), 1.4 (Acrobat 5.x), and 1.5 (Acrobat 6.x).

These formats should be supported.

Subject: is the PDF searchable natively?

Not all PDFs are searchable, for instance if the PDF is a scanned document that contains only an image. Try opening the PDF document in Adobe and see if it’s searchable that way. If not, this may explain why it’s not being included in the FTI.
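
If there are many files to check, a rough alternative to opening each one by hand is to look for font references in the PDF, since an image-only scan usually has none. The sketch below is a heuristic only, not a real PDF parser. The function name `pdf_mentions_fonts` is made up for illustration, and it assumes that compressed streams can be inflated with plain zlib. An encrypted or unusually structured file could fool it, so searching in Adobe remains the reliable check.

```python
import re
import zlib

# Captures the raw bytes between "stream" and "endstream" markers.
STREAM_RE = re.compile(rb"stream\r?\n(.*?)\r?\nendstream", re.DOTALL)


def pdf_mentions_fonts(data: bytes) -> bool:
    """Heuristic: return True if the PDF bytes reference any font object.

    A PDF that draws text needs a /Font resource; a page that is only
    a scanned image normally does not have one.
    """
    if b"/Font" in data:
        return True
    # Newer PDFs can pack dictionaries into Flate-compressed streams,
    # so try to inflate each stream and look inside it as well.
    for match in STREAM_RE.finditer(data):
        try:
            inflated = zlib.decompressobj().decompress(match.group(1))
        except zlib.error:
            continue  # not Flate data (for example a JPEG image), skip it
        if b"/Font" in inflated:
            return True
    return False
```

To try it on a file, read the file in binary mode and pass the bytes to the function, for example `pdf_mentions_fonts(open(path, "rb").read())`, where `path` is whatever PDF you want to test.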

Subject: …My PDF files are text-searchable…

Thank you for your response, but I checked before posting here, and all my PDF files contain searchable text.

Searching for the word in Adobe works, and so does Windows Explorer search when the files are on my disk.

Normally PDF files are indexed by Domino, but in this case I don’t understand why they are not.

Subject: Which option did you choose?

Which option did you choose when creating the index?

Without conversion filters

or

Using conversion filters on known file types?

You have to use the second option to be able to search accurately in PDFs.

Are your PDFs protected with a password? I’m not sure about this one, but indexing might fail depending on the security settings in the PDF.
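
A quick way to rule this out is to check whether the file declares an encryption dictionary. This is a minimal sketch under some assumptions: the name `pdf_declares_encryption` is hypothetical, the check is a byte-level search rather than a real PDF parse, and a PDF can be encrypted with only permission restrictions and no password needed to open it, so a positive result does not prove that indexing will fail.

```python
import re

# Matches "/Encrypt 12 0 R" (indirect reference) or "/Encrypt <<" (inline
# dictionary). The unrelated key /EncryptMetadata matches neither form.
ENCRYPT_RE = re.compile(rb"/Encrypt\s*(?:\d+\s+\d+\s+R|<<)")


def pdf_declares_encryption(data: bytes) -> bool:
    """Heuristic: return True if the PDF bytes reference an /Encrypt dictionary."""
    return ENCRYPT_RE.search(data) is not None
```

As with any raw byte search, treat the answer as a hint and confirm it in the document's security properties in Adobe.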

Hope this helps

Renaud

Subject: Index created without conversion filters

That is exactly the purpose of my question, because:

  1. The index in the NSF database was created with the basic option (index attachments without filters)

  2. The PDF files contain searchable text

  3. The PDF files aren’t protected by a password

  4. For exactly the same search request, a word that appears only in a .txt, .xls, or .doc file is found, but a word that appears in a PDF file isn’t.

Thank you for your help.

Regards

Sylvie

Subject: I don’t know why but…

The only thing I know is that you have to use filters to index PDFs properly. :wink:

Why, I don’t know; maybe someone here does. Sorry. :wink:

Renaud

Subject: [SOLVED] - Thank You - Renaud !!!

When I rebuilt my FTI WITH THE OPTION “conversion filters”, searching worked for PDF files and for the other text files.

I was confused because I had already tried this option without success, but perhaps the delay in rebuilding the index led me to the wrong conclusion.

Thank You for your help.

Kind regards