Certain attachments not working in full text index

Test setup: Created a new database: testfti.nsf

Database is located on a Linux server running Domino 8.5.2

Database has one form and one rich text field

Added 4 documents:

Document 1 – rich text field has the term “Indianapolis”

Document 2 – rich text field has an MS Word 2003 document attached that contains the term “Miami”

Document 3 – rich text field has an MS Word 2007 document attached that contains the term “Chicago”

Document 4 – rich text field has a text file attached that contains the term “Dallas”

Created a full-text index that included attached documents and specified using conversion filters. From the debug file it looked like all documents and all attachments were indexed (see below).

Results:

The expected documents were found when searching for the specified terms, with the exception of the MS Word 2007 attachment. The search term “Chicago” returned no results and generated “OUT FTGSearch error = F22” on the server (see below).

A local replica was made using an 8.5.2 Notes client and indexed using the same options. On the local replica the MS Word 2007 document was returned when searching for the term it contained, “Chicago”.

I thought this was an 8.5 issue, but just for grins I made a replica of the database on a Domino 7.0.2 server running on Windows. I was surprised to see the same results: the search term was not found in the MS Word 2007 attachment but was found in the MS Word 2003 document.

Similar results for PDF attachments – search terms not found in PDFs of any version.

I get the feeling that I must be missing something obvious. Formats.ini?

Any ideas would be appreciated.

Mike

FTInit: before call to LoadFTFilterLibrary – pftt->pftp = EDCADAF8

FTInit: after call to LoadFTFilterLibrary – pftt->pftp = EDCADAF8

written: 0

FTInit: before call to LoadFTFilterLibrary – pftt->pftp = EDCADAF8

FTInit: after call to LoadFTFilterLibrary – pftt->pftp = EDCADAF8

FTInit: before call to LoadFTFilterLibrary – pftt->pftp = F5142370

FTInit: after call to LoadFTFilterLibrary – pftt->pftp = F5142370

FTInit: before call to LoadFTFilterLibrary – pftt->pftp = F5142370

FTInit: after call to LoadFTFilterLibrary – pftt->pftp = F5142370

FTGIndex: After call to FTGIndexStart 1 ms for [/local/notesdata/testfti.ft]

FTGIndex: Get modified NoteID table 0 ms for [/local/notesdata/testfti.ft]

FTGIndex: Before calling IDEnumerate 0 ms for [/local/notesdata/testfti.ft]

FTGetDocStream: INIT: Opened NoteID 916 in DB /local/notesdata/testfti.nsf

FTGetDocStream: TERM: Finished NoteID 916 in DB /local/notesdata/testfti.nsf

FTGetDocStream: INIT: Opened NoteID 91A in DB /local/notesdata/testfti.nsf

Indexing Attachment Object: ‘This is a 2007 Word document.docx’ Size = 10545

Now try brute force method - File = ‘This is a 2007 Word document.docx’ Size = 10545

FTGetDocStream: TERM: Finished NoteID 91A in DB /local/notesdata/testfti.nsf

FTGetDocStream: INIT: Opened NoteID 91E in DB /local/notesdata/testfti.nsf

Indexing Attachment Object: ‘This is a 2003 Word document.doc’ Size = 22528

Now try brute force method - File = ‘This is a 2003 Word document.doc’ Size = 22528

FTGetDocStream: TERM: Finished NoteID 91E in DB /local/notesdata/testfti.nsf

FTGetDocStream: INIT: Opened NoteID 922 in DB /local/notesdata/testfti.nsf

Indexing Attachment Object: ‘This is a text file.txt’ Size = 27

Now try brute force method - File = ‘This is a text file.txt’ Size = 27

FTGetDocStream: TERM: Finished NoteID 922 in DB /local/notesdata/testfti.nsf

FTGIndex: Finished: 17 ms. for [/local/notesdata/testfti.ft]

4 documents added, 0 updated, 0 deleted: 13479 text bytes; 96 numeric bytes for [/local/notesdata/testfti.ft]

FTGIndex: All Done: 13 ms for [/local/notesdata/testfti.ft]

OUT FTGIndex rc = 0 (No error) - for [/local/notesdata/testfti.ft]

FTInit: before call to LoadFTFilterLibrary – pftt->pftp = EDCADAF8

FTInit: after call to LoadFTFilterLibrary – pftt->pftp = EDCADAF8

FTInit: before call to LoadFTFilterLibrary – pftt->pftp = EDCADAF8

FTInit: after call to LoadFTFilterLibrary – pftt->pftp = EDCADAF8

FTInit: before call to LoadFTFilterLibrary – pftt->pftp = EDCADAF8

FTInit: after call to LoadFTFilterLibrary – pftt->pftp = EDCADAF8

FTInit: before call to LoadFTFilterLibrary – pftt->pftp = EDCADAF8

FTInit: after call to LoadFTFilterLibrary – pftt->pftp = EDCADAF8

IN FTGSearch

Search for Dallas in the text attachment - Success:

Query: (dallas)

Engine Query: (“dallas”%STEM)

GTR query performed in 1 ms. 1 documents found

0 documents disualified by deletion

0 documents disqualified by ACL

0 documents disqualified by IDTable

0 documents disqualified by NIF

Results marshalled in 0 ms. 1 documents left

OUT FTGSearch error = 0

FTGSearch: found=1, returned=1, start=0, count=0, limit=0

Total search time 1 ms.

Search for Miami in the Word 2003 attachment - Success:

Query: (miami)

Engine Query: (“miami”%STEM)

GTR query performed in 1 ms. 1 documents found

0 documents disualified by deletion

0 documents disqualified by ACL

0 documents disqualified by IDTable

0 documents disqualified by NIF

Results marshalled in 0 ms. 1 documents left

OUT FTGSearch error = 0

FTGSearch: found=1, returned=1, start=0, count=0, limit=0

Total search time 1 ms.

Search for Chicago in the Word 2007 attachment - Fail:

Query: (chicago)

Engine Query: (“chicago”%STEM)

OUT FTGSearch error = F22

FTGSearch: found=0, returned=0, start=0, count=0, limit=0
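For anyone comparing search results in debug output like the above, here is a rough Python sketch that pairs each "Query:" line with the "OUT FTGSearch error" line that follows it. It assumes the log has exactly the line shapes shown in this post; the function name and the sample text are made up for illustration and are not part of Domino.

```python
import re

def summarize_searches(log_text):
    """Pair each 'Query: (...)' line with the next 'OUT FTGSearch error = ...' line.

    Assumes the line formats seen in the debug output quoted in this thread.
    """
    results = []
    current_query = None
    for raw in log_text.splitlines():
        line = raw.strip()
        # re.match anchors at the start, so 'Engine Query:' lines are skipped.
        m = re.match(r"Query:\s*\((.*)\)\s*$", line)
        if m:
            current_query = m.group(1)
            continue
        m = re.match(r"OUT FTGSearch error = (\w+)", line)
        if m and current_query is not None:
            results.append((current_query, m.group(1)))
            current_query = None
    return results

# Hypothetical sample, trimmed from the kind of output shown above.
sample = """\
Query: (dallas)
Engine Query: ("dallas"%STEM)
OUT FTGSearch error = 0
Query: (chicago)
Engine Query: ("chicago"%STEM)
OUT FTGSearch error = F22
"""

for query, code in summarize_searches(sample):
    status = "ok" if code == "0" else "FAILED"
    print(f"{query}: error={code} ({status})")
```

Run against a full console log, this gives a quick list of which search terms came back with a non-zero error code, such as the F22 for "chicago".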

Subject: Same here

It appears I’m having the same problem with Office 2007 documents. Did you find a resolution?

Subject: Fix full text search results within PDF and DocX attachments

We found that several installations of Lotus Domino 8.5.1 and 8.5.2 had issues when searching a full-text indexed database with attachment conversion filters turned on. When searching for content within the attachment, no results were displayed.

This is caused by a wrong character set in the KeyView settings. It can be fixed by adding the following notes.ini entries:

FT_BINARY_FILTER_OFF=0

OS400_KEYVIEW_CSID=0052

PLATFORM_CSID=052

Here 0052 stands for code page 1252 (Western European Latin). After changing the notes.ini settings, completely remove the full-text index and re-create it.
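If you want to double-check that a server's notes.ini actually contains the three entries above before rebuilding the index, a small Python sketch like the following could help. It is only an illustration: the helper names are invented, the sample text is made up, and treating setting names as case-insensitive is an assumption on my part.

```python
# The three entries recommended above, with the values given in this post.
REQUIRED = {
    "FT_BINARY_FILTER_OFF": "0",
    "OS400_KEYVIEW_CSID": "0052",
    "PLATFORM_CSID": "052",
}

def parse_ini(text):
    """Read simple KEY=VALUE lines; anything without '=' (e.g. '[Notes]') is skipped."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Assumption: setting names are compared case-insensitively.
        settings[key.strip().upper()] = value.strip()
    return settings

def missing_or_wrong(text, required=REQUIRED):
    """Return {name: (found_value_or_None, expected_value)} for every mismatch."""
    found = parse_ini(text)
    return {
        name: (found.get(name), expected)
        for name, expected in required.items()
        if found.get(name) != expected
    }

# Hypothetical notes.ini content for demonstration only.
sample_ini = (
    "[Notes]\n"
    "Directory=/local/notesdata\n"
    "FT_BINARY_FILTER_OFF=0\n"
    "PLATFORM_CSID=52\n"
)

for name, (found, expected) in missing_or_wrong(sample_ini).items():
    print(f"{name}: found {found!r}, expected {expected!r}")
```

In practice you would read the real file with `open(path).read()` and pass the text to `missing_or_wrong`; an empty result means all three entries match the values listed above.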

See also:

A suggestion from IBM Support is to have all packages installed on the Domino server:

http://www-01.ibm.com/support/docview.wss?&uid=swg27013075