Strip HTML using LotusScript

I needed a LotusScript routine to strip HTML out of some text I was importing from an ODBC data store. I ended up creating a solution that works pretty nicely, and it will also conditionally strip orphan tag characters (a “<” or “>” with no matching partner). Full writeup is on my site at http://www.devinolson.net/devin/spankysplace.nsf/plinks/BDOZ-6B8TTY

Here is the code (two functions, both commented):

Function StripHTML (strSource As String, bool_StripOrphans As Boolean) As String
%REM
This function will strip HTML tags from a passed-in string,
and return the resulting string.
Orphan Tags ("<" & ">") will be handled based on the value of bool_StripOrphans.
The Orphan Tags will be removed if bool_StripOrphans is True,
and will be ignored otherwise.
%END REM
    Dim intPosOpen As Integer
    Dim intPosClose As Integer
    Dim strTarget As String

    strTarget$ = strSource

    If bool_StripOrphans Then
        ' Strip out Orphan Tags
        Do
            intPosOpen% = Instr(strTarget$, "<")
            intPosClose% = Instr(strTarget$, ">")
            If intPosOpen% < intPosClose% Then
                ' Either the first open indicator occurs prior to the first close indicator,
                ' or doesn't exist at all.
                If intPosOpen% = 0 Then
                    ' The first open indicator doesn't exist.
                    ' If the Orphan close indicator exists, then strip it out.
                    If (intPosClose% > 0) Then strTarget$ = StripFirstSubstr(strTarget$, ">")
                Else
                    ' The first open indicator exists, and occurs prior to the first close indicator.
                    ' THIS INDICATES STANDARD MARKUP. STRIP IT OUT
                    strTarget$ = StripFirstSubstr(strTarget$, Mid$(strTarget$, intPosOpen%, (intPosClose% - intPosOpen%) + 1))
                End If ' intPosOpen% = 0
            Else
                ' Either the first close indicator occurs prior to the first open indicator,
                ' or doesn't exist at all.
                If intPosClose% = 0 Then
                    ' The first close indicator doesn't exist.
                    ' If the Orphan open indicator exists, then strip it out.
                    If (intPosOpen% > 0) Then strTarget$ = StripFirstSubstr(strTarget$, "<")
                Else
                    ' The first close indicator occurs prior to the first open indicator,
                    ' and is therefore an Orphan. Strip it out.
                    strTarget$ = StripFirstSubstr(strTarget$, ">")
                End If ' intPosClose% = 0
            End If ' intPosOpen% < intPosClose%
        Loop While ((intPosOpen% + intPosClose%) > 0)
    Else
        ' Orphan tags are to be ignored.
        Do
            intPosOpen% = Instr(strTarget$, "<")
            If intPosOpen% > 0 Then
                ' An open indicator exists. Find the subsequent close indicator
                intPosClose% = Instr(intPosOpen%, strTarget$, ">")
            Else
                ' No open indicator exists. Set the close position to zero and bail out.
                intPosClose% = 0
            End If ' intPosOpen% > 0
            If intPosClose% > intPosOpen% Then
                ' The first open indicator exists, and occurs prior to the first close indicator.
                ' THIS INDICATES STANDARD MARKUP. STRIP IT OUT
                strTarget$ = StripFirstSubstr(strTarget$, Mid$(strTarget$, intPosOpen%, (intPosClose% - intPosOpen%) + 1))
            Else
                ' No close indicator exists. Set the open position to zero and bail out.
                intPosOpen% = 0
            End If ' intPosClose% > intPosOpen%
        Loop While ((intPosOpen% + intPosClose%) > 0)
    End If ' bool_StripOrphans

    StripHTML$ = strTarget$
End Function ' StripHTML
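
To make the effect of the bool_StripOrphans flag concrete, here is a minimal usage sketch. The Sub name and the sample strings are made up for illustration, and it assumes both functions from this post are in the same script. The trailing comments show what tracing the loops by hand gives for each call.

```lotusscript
Sub DemoStripHTML
    ' Hypothetical demo routine; the sample strings are invented for illustration.
    Dim strSample As String

    ' Well-formed markup: every tag is removed.
    strSample$ = "<p>Hello <b>world</b></p>"
    Print StripHTML(strSample$, False)   ' Hello world

    ' A stray ">" in the text is kept when orphans are ignored...
    strSample$ = "5 > 3 and <i>so on</i>"
    Print StripHTML(strSample$, False)   ' 5 > 3 and so on

    ' ...and removed when orphans are stripped (a doubled space is left behind).
    Print StripHTML(strSample$, True)    ' 5  3 and so on
End Sub
```

One thing to watch, judging from the loop logic: a stray "<" that sits before a later tag is taken as the start of markup, so the text between it and the next ">" is removed in either mode.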

Function StripFirstSubstr (strSource As String, strSubstr As String) As String
%REM
This function strips the first occurrence of a substring from a string,
and returns the result.
If the substring is not contained within the source string,
this function returns the source string.
%END REM
    If (Instr(strSource$, strSubstr$) > 0) Then
        StripFirstSubstr$ = Strleft(strSource$, strSubstr$) & Strright(strSource$, strSubstr$)
    Else
        StripFirstSubstr$ = strSource$
    End If ' (Instr(strSource$, strSubstr$) > 0)
End Function ' StripFirstSubstr
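
The helper can also be used on its own. A small sketch, again with a hypothetical Sub name and invented sample strings, showing that only the first match is removed and that a non-matching substring leaves the source untouched:

```lotusscript
Sub DemoStripFirstSubstr
    ' Hypothetical demo routine; the sample strings are invented for illustration.
    Print StripFirstSubstr("a-b-c", "-")   ' ab-c  (only the first "-" is removed)
    Print StripFirstSubstr("abc", "x")     ' abc   (no match, source returned unchanged)
End Sub
```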

Hope this helps!

-Devin