Puzzle: What's the most efficient way to search & replace usernames

THE CHALLENGEWhat’s the most efficient way to locate and replace user names across 100,000 documents?

THE BACKGROUND

Approximately 600 user names are changing.

For example “Jane Doe/Germany/SuperCo” is changing to “Jane Marie Doe/Philipines/SuperCo”.

In order to maintain document access, all AUTHORS, READERS and NAMES fields need to be updated.

Some fields store a single-value while other fields store multiple-values.

Some fields store values in Abbreviated format while others store in Canonical format.

There are approximately 30 fields per document that have the potential to be updated.

(Side Note: I did not create the database, I just inherited it.)

CURRENT DESIGN

The current approach is to build a collection of documents, loop through the documents and search for all user names stored in any non-richtext field. If a match is found replace it and save the document. This approach touches each document only ONE time. Below is working code, but it takes a long time run when testing with a single name replacement. It will take much longer once the code is expanded to search for 600+ users. I do plan to run this code in an agent “On Server” which will help speed up the processing.

Does anyone have any suggestions on a more efficient approach?

WORKING CODE…

Option Public

Option Declare

Sub FixNames

Dim s As New NotesSession

Dim db As NotesDatabase

Dim collection As NotesDocumentCollection

Dim doc As NotesDocument

Dim cntr As Integer

Dim debug As Integer

Dim NewName As String

Dim OrigName As String

Dim OrigPerson As NotesName

Dim NewPerson As NotesName

Dim ErrorMsg As String

Dim SearchCriteria As String

Dim i As Integer

On Error GoTo Oops

'-----------------------------------------------------------------------------

’ Initialize variables

'-----------------------------------------------------------------------------

Cntr = 0

debug = 0

'SearchCriteria = “/Germany”

SearchCriteria = “Jane Doe/Germany/SuperCo”

'Placeholder: eventually add code here to import the old and new names from spreadsheet

'Hardcoding for testing purposes

OrigName = “Jane Doe/Germany/SuperCo”

NewName = “Jane Marie Doe/Philipines/SuperCo”

Set db = New NotesDatabase( “TestServer1”, “TestDB.nsf”)

Set OrigPerson = New NotesName( OrigName )

Set NewPerson = New NotesName( NewName )

If debug Then Print "Searching for " & OrigPerson.Common

'-----------------------------------------------------------------------------

’ Build a collection of documents that meet the search criteria

'-----------------------------------------------------------------------------

Set collection = db.FTSearch( SearchCriteria ,0)

'-----------------------------------------------------------------------------

’ Loop through the collection of documents for the current database

'-----------------------------------------------------------------------------

Set doc = collection.GetFirstDocument()

While Not(doc Is Nothing)

cntr = cntr + 1    

Print "Processing document " & cntr & " of " & collection.Count



'-------------------------------------------------------------------------

' Loop through all fields for the current document

'-------------------------------------------------------------------------

ForAll item In doc.Items      



  ' Placeholder: Eventually add code to loop through all user names to be changed

            

  '---------------------------------------------------------------------

  ' Does the current field contain the original user name string?

  ' If yes, update the field. Otherwise skip it. 

  '---------------------------------------------------------------------

  If InStr ( item.text, OrigName ) > 0 Then

    If debug Then Print "Curr value " + item.name + " = " + item.text

              

    If item.type = 1 Then     

      'Skip RichText fields

      Print "Skipping RichText field: " + item.name  

    else    

      

      '-------------------------------------------------------------

      ' Determine total values in the field. 

      ' Field contains a single value if Ubound equals 0

      ' Field contains a multi values if Ubound is greater than 0 

      '-------------------------------------------------------------

      If UBound(item.Values) = 0 Then 

    

        '---------------------------------------------------------

        ' Single value field. Check field format and replace value.

        '---------------------------------------------------------

        If (item.text = OrigPerson.Abbreviated) Then 

          ' Found single value name in abbreviated format

          Call ReplaceFieldValue ( doc, item.name, NewPerson.Abbreviated )            

          If debug Then Print "New value " + item.name + " = " + item.text

        Else 

          ' Found single value name in a alternate format. Set to canonical 

          Call ReplaceFieldValue ( doc, item.name, NewPerson.Canonical )            

          If debug Then Print "New value " + item.name + " = " + item.text

        End If  

        

      Else

        '---------------------------------------------------------

        ' Multi-value field. Locate element and replace value. 

        '---------------------------------------------------------

        If debug Then Print "Found Multi Value field."  

        i=0

        ReDim ReturnArray(0) As Variant

        ForAll v In item.Values          

          ReDim Preserve ReturnArray (i) As Variant

          If InStr ( v , OrigPerson.common ) > 0 Then

                          

            ' Found match. Check the name format and replace it

            If v = OrigPerson.Abbreviated Then

              ReturnArray(i) = NewPerson.Abbreviated

            Else

              ReturnArray(i) = NewPerson.Canonical

            End If            



          Else 

            ' Value does not contain the search term, skip it. 

            ReturnArray(i) = v

          End If

          i = i + 1

        End ForAll

        Call ReplaceFieldValue ( doc, item.name, ReturnArray )      



      End If  ' End of Multi-value check

    End If    ' End of RichText check  

  End If      ' End of found string in field check

  

  ' Eventually... Put end of loop through user names here

  

  doc.Save True, True  

  

End ForAll     ' End of loop through fields in document    

    

Set doc = collection.GetNextDocument(doc)

Wend

Print "Update complete. Total documents found: " & cntr

'------------------------------------------

’ Exit subroutine statement

'------------------------------------------

Finished:

Exit Sub

Oops:

'------------------------------------------

’ Build and display the error message

'------------------------------------------

ErrorMsg = “Error #” & Err & Chr$(10) &_

|“| & Error$ & |”| & Chr$(10) & Chr$(10) & "Line " &_

Erl & | in object: “| & LSI_Info(2) & |”|

MsgBox ErrorMsg, 16, “Unexpected Error”

Resume Finished

End Sub

Sub ReplaceFieldValue ( doc As NotesDocument, FieldName As String, NewValue As Variant )

Dim item As NotesItem

Set item = doc.ReplaceItemValue( FieldName, NewValue )

End Sub

Subject: Adminp !

Hello,

I might be late to the party, but the most efficient solution to your problem would be to use adminp (the Administration Process) and related Lotusscript or Java classes.

Start here :

then move on to there :

https://notes.helsinki.fi/help/help8_designer.nsf/b3266a3c17f9bb7085256b870069c0a9/a4cb3788838967448525731b0049f3e9?OpenDocument

… and let good old adminp do the job for you !

Subject: Some ideas…

One way to make it faster is to read all the original names and new names into a list.Your code indicates that you have a text file (or .csv file) with the name pairs.

Put the old (existing) name as the list tag, then you can very easily use IsElement() to check if a specific name is in the list.

To solve the problem with some fields containing the canononical name, and some containing the abbreviated name, you can put code in the section where you read the names from the csv file:

Dim oldname As string

Dim newname as String

Dim tempname as NotesName

Dim user List As String

Open “c:\namefile.csv” for input as #1

Do Until eof(1)

Input #1, oldname, newname

Set tempname = New NotesName(oldname)

user(tempname.Abbreviated)=newname

user(tempname.Canonical)=newname

user(tempname.Common)=newname 'Optional

Loop

Close #1

Now you have 3 list items for each name, for easy comparison later in your code.

Looking further at your code, I see that you use db.FTSearch(). Make sure the database is full-text indexed, or it will be very slow.

I would also optimize your loop, when you process all the NotesItems in the document. Check the field type first, before you make any other comparisons, that will make the code faster. Always put code that skip out of an if-statement or similar as early as you can, to avoid extra processing time.

I would optimize the loop substantially. First of all, all your print statements takes a lot of time. You may want to remove them to speed up processing.

Since all Notes fields technically are multi-value fields, simply loop through all the values in each field instead of writing code that handles them differently depending on if they contain multiple values or not.

Something like this:

Dim tmpvalue As Variant

ForAll item In doc.Items

If item.Type <> 1 Then

tmpvalue = item.Values

ForAll t in tmpvalue

'*** Check if field value “t” is in list of users

If IsElement(user(t)) Then

t = user(t)

End If

End ForAll

Call doc.ReplaceItemValue(item.Name,tmpvalue)

Call doc.Save(true,false)

End If

End ForAll

Something like that. I haven’t tested the code, so use at your own risk. :slight_smile:

Subject: Thank you.

Excellent. Thank you very much.