Originally Posted by swaggerbox
It is not only Escherichia Coli but for all scientific names enclosed in <SN> tags.
I thought I'd address this because most suggestions haven't.
Originally Posted by swaggerbox
I have attached a representative sample of the WB.
It's not especially representative because there's only one scientific name!
In the attached workbook I've had a go with two macros:
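Sub blah()
    ' Keeps the first occurrence of each <SN>...</SN> name in full and abbreviates the repeats.
    Set Dict = CreateObject("scripting.dictionary")
    TheString = Sheet1.TextBox1.Text
    Debug.Print TheString
    SplitString = Split(TheString, "<SN>")
    For Each PartString In SplitString
        Posn = InStr(PartString, "</SN>")
        If Posn > 0 Then
            ' Rebuild the complete tag, e.g. <SN>Escherichia Coli</SN>
            TagContents = "<SN>" & Left(PartString, Posn - 1) & "</SN>"
            If Not Dict.exists(TagContents) Then
                ' Cut the first word back to "<SN>" plus its initial letter, then add a dot
                AbbreviatedTag = Split(Application.Trim(TagContents))
                AbbreviatedTag(0) = Left(AbbreviatedTag(0), 5) & "."
                AbbreviatedTag = Join(AbbreviatedTag)
                Dict.Add TagContents, AbbreviatedTag
                ' Count the occurrences from the change in length a full substitution would cause.
                ' The whole difference has to be bracketed before dividing; assumes the two lengths differ.
                myCount = (Len(TheString) - Len(Application.Substitute(TheString, TagContents, AbbreviatedTag))) / (Len(TagContents) - Len(AbbreviatedTag))
                ' Replacing instance 2 each time leaves instance 1 untouched
                For i = 2 To myCount
                    TheString = Application.Substitute(TheString, TagContents, AbbreviatedTag, 2)
                Next i
            End If
        End If
    Next PartString
    Sheet1.TextBox2.Text = TheString
End Sub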
and
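Sub blah2()
    ' Walks through the text tag by tag; the first occurrence of each name stays in full, later ones are abbreviated.
    Set Dict = CreateObject("scripting.dictionary")
    TheString = Sheet1.TextBox1.Text
    Debug.Print TheString
    SearchFrom = 1
    Posn1 = InStr(SearchFrom, TheString, "<SN>")
    ' Testing before the first pass avoids an error when the text contains no <SN> tag at all
    Do While Posn1 > 0
        Posn2 = InStr(Posn1, TheString, "</SN>")
        If Posn2 = 0 Then Exit Do ' unclosed tag: stop rather than raise an error
        ' Opening tag plus the name, without the closing tag, e.g. <SN>Escherichia Coli
        TagContents = Application.Trim(Mid(TheString, Posn1, Posn2 - Posn1))
        If Dict.exists(TagContents) Then
            ' Seen before: splice the abbreviation in between the text either side
            LeftBit = Left(TheString, Posn1 - 1)
            RightBit = Mid(TheString, Posn2)
            TheString = LeftBit & Dict(TagContents) & RightBit
            ' Resume just past the 5-character closing tag
            SearchFrom = Len(LeftBit & Dict(TagContents)) + 6
        Else
            ' First occurrence: store its abbreviation but leave the text as it is
            AbbreviatedTag = Split(TagContents)
            AbbreviatedTag(0) = Left(AbbreviatedTag(0), 5) & "."
            AbbreviatedTag = Join(AbbreviatedTag)
            Dict.Add TagContents, AbbreviatedTag
            SearchFrom = Posn2 + 5
        End If
        Posn1 = InStr(SearchFrom, TheString, "<SN>")
    Loop
    Sheet1.TextBox2.Text = TheString
End Sub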
If there's no space within a tag, one of them puts 2 dots after the first letter and the other a single dot, so one of them needs some attention.
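One possible way to deal with the no-space case is to do the abbreviating in a small function that works on the name alone, without the tags. This is only a rough sketch: AbbreviateName is a made-up name that isn't in the attached workbook, and leaving a single-word name unchanged is my assumption about what's wanted.

```vba
' Hypothetical helper, not part of the attached workbook.
' Expects the text found between <SN> and </SN>, without the tags themselves.
Function AbbreviateName(ByVal FullName As String) As String
    Dim Words() As String
    ' Collapse runs of spaces so Split gives clean words
    FullName = Application.Trim(FullName)
    Words = Split(FullName, " ")
    If UBound(Words) < 1 Then
        ' Assumption: a single-word (or empty) name has nothing to shorten, so return it as it is
        AbbreviateName = FullName
    Else
        ' Reduce the first word to its initial letter plus a dot
        Words(0) = Left(Words(0), 1) & "."
        AbbreviateName = Join(Words, " ")
    End If
End Function
```

Either macro could then build its replacement as "<SN>" & AbbreviateName(...) & "</SN>", which also keeps the closing tag out of the abbreviating step.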