Notes Invoking MS Word - Improving Efficiency

Our Notes application builds Word documents. An agent creates a document collection of employee documents, then goes through the collection and builds a Word document applicable to each employee. There are several MS Word .dot files used as templates for various permutations and combinations of final Word documents, depending on each employee’s data. The agent pulls in various templates and populates various form fields in each of the templates to build the final Word document and then saves it in the Notes database as an attachment.

This is a very long process: we have over 10,000 employees, and it takes 3 to 6 seconds to create each employee's Word document. We're trying to make the process as efficient as possible in an attempt to cut hours off the agent runtime.
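To put rough numbers on that, here is a back-of-the-envelope estimate using only the figures above. It assumes exactly 10,000 employees (the real count is higher) and that the per-document time is the only cost, so treat the result as a lower bound.

```python
# Rough agent runtime from the figures in the post.
# Assumption: exactly 10,000 employees and no overhead beyond document creation.
EMPLOYEES = 10_000
SECONDS_PER_DOC = (3, 6)  # low and high estimate per Word document


def total_hours(count, seconds_each):
    """Convert a per-document time in seconds into total hours for the run."""
    return count * seconds_each / 3600


low, high = (total_hours(EMPLOYEES, s) for s in SECONDS_PER_DOC)
print(f"{low:.1f} to {high:.1f} hours")  # prints "8.3 to 16.7 hours"
```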

It’s been suggested that instead of using a doc collection and simply looping through it, we could group employees with common data together, so that the same templates are used for perhaps thousands of consecutive employees rather than jumping back and forth between employees who require different templates. The thought behind this is that, while the Word template files are in use, they’re cached in server memory. If we can group employees together, we can take better advantage of the cache and not have to constantly move templates in and out of memory. Keeping in mind that there are 15 templates and each employee requires at least 4 of them to build their document, is this an option?

So, for example, we have templates 1, 2, 3, 4, 5, 6, and 7.

We access the templates for each employee as follows:

1, 2, 3, 4

1, 2, 3, 4

1, 2, 3, 4

1, 2, 5, 6, 4

1, 2, 7, 4

1, 2, 5, 6, 4

1, 2, 3, 4

Would we gain efficiency if instead we processed as follows:

1, 2, 3, 4

1, 2, 3, 4

1, 2, 3, 4

1, 2, 3, 4

1, 2, 5, 6, 4

1, 2, 5, 6, 4

1, 2, 7, 4

I’m aware that using a view or other means to group the incoming employee documents is going to be less efficient than using a document collection, so the gain from using and caching common templates may be offset by the loss of efficiency in moving away from a doc collection.
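To make the proposed reordering concrete, here is a minimal sketch in Python (not LotusScript) that sorts the seven example employees by the templates they need and counts how often the template set changes from one employee to the next. The employee ids and the `template_switches` helper are hypothetical, and the switch count is only a proxy: whether fewer switches translates into less runtime depends on whether template loading is really the bottleneck.

```python
# Hypothetical records: (employee id, templates required, in order of use).
employees = [
    ("E1", (1, 2, 3, 4)),
    ("E2", (1, 2, 3, 4)),
    ("E3", (1, 2, 3, 4)),
    ("E4", (1, 2, 5, 6, 4)),
    ("E5", (1, 2, 7, 4)),
    ("E6", (1, 2, 5, 6, 4)),
    ("E7", (1, 2, 3, 4)),
]


def template_switches(records):
    """Count how often the template set differs between consecutive employees."""
    return sum(1 for prev, cur in zip(records, records[1:]) if prev[1] != cur[1])


# Sorting on the template tuple puts employees with identical needs side by side.
# The sort is stable, so employees keep their original order within each group.
grouped = sorted(employees, key=lambda rec: rec[1])

print(template_switches(employees))  # prints 4
print(template_switches(grouped))    # prints 2
```

Note that in this example templates 1, 2, and 4 are used by every employee, so only the middle templates ever change. Sorting an in-memory list of keys like this is one possible way to group the work without replacing the document collection with a view.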

Ideas?

Subject: There is only one way to improve this - and that is dropping Application Automation

You are doing what we call Application Automation. You are opening Word and, through code, automating the application. There is some serious overhead in just opening Microsoft Word, let alone building, editing, and working with the document(s). Even if you turn off the display painting, this is fairly slow. You might be able to speed up how you get the Notes data - but there is little you can do to improve the Word automation itself.

It is my belief that Application Automation is going to go away over the next 5 to 10 years. We will see Document Generation take over. This is possible today, with both ODF and OOXML document formats now being supported in Office, OpenOffice.org, and Lotus Symphony. Creating a document via the XML formats is super fast - and can be done without loading the application. This allows for server-side document generation. PSC has been working with Document Generation for over a year now - and is very invested in this technology. It is not for the faint of heart - and it's definitely somewhere between 1.0 and 2.0 software. But it is possible - we have a client that is generating PowerPoint presentations on a server. These aren’t simple presentations - they are 440 slides with 372 charts & tables. The Application Automation solution used to take 8 hours for each presentation. The Document Generation solution, using OOXML, takes 10 minutes. Since they generate 200 of these at a time, that is going from 1,600 hours (200 x 8 hours) for the Application Automation solution to about 33 hours (200 x 10 minutes) for the Document Generation solution using OOXML. ODF would be roughly the same improvement.
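As a rough illustration of what generating a document without loading Word can look like, here is a minimal Python sketch that writes a bare-bones OOXML word-processing file (.docx) using only the standard library. This is not PSC's implementation: the `build_docx` function is hypothetical, and it produces plain paragraphs only. Real templates with form fields, styles, and headers need more package parts than the three shown here.

```python
import zipfile
from xml.sax.saxutils import escape

# The three parts a minimal .docx package needs: content types,
# the package relationships, and the main document body.
CONTENT_TYPES = (
    '<?xml version="1.0" encoding="UTF-8" standalone="yes"?>'
    '<Types xmlns="http://schemas.openxmlformats.org/package/2006/content-types">'
    '<Default Extension="rels" '
    'ContentType="application/vnd.openxmlformats-package.relationships+xml"/>'
    '<Default Extension="xml" ContentType="application/xml"/>'
    '<Override PartName="/word/document.xml" ContentType="application/vnd.'
    'openxmlformats-officedocument.wordprocessingml.document.main+xml"/>'
    '</Types>'
)
RELS = (
    '<?xml version="1.0" encoding="UTF-8" standalone="yes"?>'
    '<Relationships '
    'xmlns="http://schemas.openxmlformats.org/package/2006/relationships">'
    '<Relationship Id="rId1" Type="http://schemas.openxmlformats.org/'
    'officeDocument/2006/relationships/officeDocument" '
    'Target="word/document.xml"/>'
    '</Relationships>'
)
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def build_docx(target, lines):
    """Write one paragraph per line to `target` (a path or a binary file object)."""
    paragraphs = "".join(
        f'<w:p><w:r><w:t xml:space="preserve">{escape(line)}</w:t></w:r></w:p>'
        for line in lines
    )
    document = (
        '<?xml version="1.0" encoding="UTF-8" standalone="yes"?>'
        f'<w:document xmlns:w="{W_NS}"><w:body>{paragraphs}</w:body></w:document>'
    )
    with zipfile.ZipFile(target, "w", zipfile.ZIP_DEFLATED) as package:
        package.writestr("[Content_Types].xml", CONTENT_TYPES)
        package.writestr("_rels/.rels", RELS)
        package.writestr("word/document.xml", document)
```

Calling `build_docx("employee.docx", ["Name: Jane Doe", "Department: Finance"])` (made-up values) would write the file directly, with no Word process involved. The result could then be attached to the Notes document the same way the automated output is today.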

If you would like to discuss more, please contact me at jhead AT psclistens DOT com