View Full Version : [SOLVED:] Convert Excel Cell Text in Bold or Italic to HTML Tags/Convert Cell Line Breaks
gmaxey
10-07-2020, 09:03 AM
I have a very large Excel file where several of the columns contain multi-line descriptive data containing mixed formatted text (in addition to the normal font, some formatted with bold and some with italic). I need to replace the formatted "Bold" or "Italic" text in the cells with "<B>text</B> and "<I>text</I>" flags and add paragraph tags <P></P> to the cells replacing the Chr(10) linebreaks.
A real nub with Excel coding and all I could put together (from pecking through examples on the web) follows:
Problem: This is really slow as each character in each cell is evaluated. Is there anything equivalent in Excel to Words Find.Execute command such that I could simply set a range to each Cell then find "BOLD" text and insert the flags before and after each instance of a found instance?
Select the column to process:
Sub ConvertFormatToHTML()
Dim oRng As Range
Dim oCell
Dim lngCount As Long
Dim lngScope As Long
Set oRng = Selection.Columns.Item(1)
lngScope = GetLastRow(oRng)
For Each oCell In oRng.Cells
oCell.Value = AddTags(oCell)
lngCount = lngCount + 1
DoEvents
If lngCount = lngScope Then Exit Sub
Next
End Sub
Function AddTags(ByVal oCell As Range) As String
Dim lngIndex As Long
Dim strResult As String
Dim bIsBold As Boolean, bIsItalic As Boolean
bIsBold = False
bIsItalic = False
For lngIndex = 1 To Len(oCell.Value)
If oCell.Characters(lngIndex, 1).Font.FontStyle = "Bold" Then
If bIsBold = False Then
strResult = strResult + "<B>"
bIsBold = True
End If
Else
If bIsBold = True Then
strResult = strResult + "</B>"
bIsBold = False
End If
End If
If oCell.Characters(lngIndex, 1).Font.FontStyle = "Italic" Then
If bIsItalic = False Then
strResult = strResult + "<I>"
bIsItalic = True
End If
Else
If bIsItalic = True Then
strResult = strResult + "</I>"
bIsItalic = False
End If
End If
strResult = strResult + oCell.Characters(lngIndex, 1).Text
If oCell.Characters(lngIndex, 1).Text = Chr(10) Then
strResult = strResult & "</P><P>"
End If
Next lngIndex
If bIsBold = True Then strResult = strResult + "</B>"
If bIsItalic = True Then strResult = strResult + "</I>"
If Len(strResult) > 0 Then
AddTags = "<P>" & strResult & "</P"
Else
AddTags = vbNullString
End If
AddTags = Replace(AddTags, "<P><P>", "<P>")
AddTags = Replace(AddTags, "</P></P>", "</P>")
AddTags = Replace(AddTags, Chr(10), "")
lbl_Exit:
Exit Function
End Function
Function GetLastRow(oRng As Range) As Long
GetLastRow = Cells(Rows.Count, oRng.Column).End(xlUp).Row
lbl_Exit:
Exit Function
End Function
The only thing I can think of is to have a Class Object "FontToTag"
The first step would be to trigger F2T.Reset to set its state to null
I see it as checking each character to see if that fontstyle is the same as a previous character and returning the proper tag to Excel for emplacement before that character in the string.
F2T.Reset
If Not {All] fontstyle = "Regular" Then
For each character in String
strResult = F2T(character) & strResult
Your code would have to handle IF lngIndex = 1 then add <P>
Class FontToTag would need to check Fontstyles regular, bold, italic, and bolditalic, with the understanding that changing from B&I to one of the other three only applies a Closing Tab for the one(s) that changed
The only other way I can see to speed this up is to first check the string for any FontStyle NOT regular and only then send the cell for full processing.
gmaxey
10-07-2020, 10:56 AM
Sam,
Thanks for your reply. I'm afraid the class approach is a little over my head. Taking baby steps with Excel:
As for your: "The only other way I can see to speed this up is to first check the string for any FontStyle NOT regular and only then send the cell for full processing."
Sounds promising, but how? Also there is the matter of the line breaks. I think If Instr(oCell.Value, Chr(10)) > 0 would work send to processing any cell with line breaks but how would I determine if a cell was or was not all normal text?
As a start
Function AddTags(ByVal oCell As Range) As String
Dim lngIndex As Long
Dim strResult As String
Dim bIsBold As Boolean, bIsItalic As Boolean
'Skip any Cells that don't need Tags
If oCell.Font.Regular Then
AddTags = oCell.Value
Exit Function
End if
.
.
.
Just for ideas, I'm pretty sure about Split and Join with the arrays. But the Character Properties has me flummoxed.
Function SamT(oCell As Range) As String
Dim arrLines '() 'As Excel.Characters
Dim arrWords '() 'As Excel.Characters
Dim cWord 'As Excel.Characters
'Check for singletons
If Not InStr(oCell.Text, Chr(10)) Then arrLines = oCell.Text
If Not InStr(oCell.Text, " ") Then cWord = oCell.Text
'Make some decisions
'is not a single line, is not a single word
arrLines = Split(oCell.Text, Chr(10))
'Loop thru all the lines
For i = LBound(arrLines) To UBound(arrLines)
arrWords = Split(arrLines(i), " ")
For j = LBound(arrWords) To UBound(arrWords)
cWord = arrWords(j)
CheckWord:
If cWord.Font.FontStyle = Regular Then GoTo SkipWord
'Else use your tag processing code
'cWord = AddTags(cWord)
SkipWord:
arrWords(j) = cWord
Next j
arrLines(i) = Join(arrWords, " ")
Next i
SamT = Join(arrLines, "</p>chr(10)<p>")
End Function
Paul_Hossler
10-07-2020, 08:32 PM
Edit - I realized I could simplify the code a bit more from my ver3
Edit^2 - cleanup some debug code :(
Try something like this to see if it's faster for you
1. I used .SpecialCells to just get text cells
2. Arrays for speed so I only need to use .Characters as little as possible
3. short circuit if entire cell is bold and/or italic
4. I put the result in the next cell
Option Explicit
Sub ConvertFormatToHTML()
Dim oRng As Range
Dim oCell As Range
ActiveSheet.Columns(1).Select ' testing
Set oRng = Nothing
On Error Resume Next
Set oRng = Selection.Columns(1).SpecialCells(xlCellTypeConstants, xlTextValues)
On Error GoTo 0
If oRng Is Nothing Then Exit Sub
For Each oCell In oRng.Cells
oCell.Offset(0, 1).Value = AddTags(oCell)
Next
End Sub
Function AddTags(oCell As Range) As String
Dim sResult As String, sChar As String
Dim i As Long, L As Long
Dim bBold As Boolean, bItal As Boolean
Dim A() As String
With oCell
'check the entire cell
If Not .Font.Bold And Not .Font.Italic Then
sResult = .Value
ElseIf .Font.Bold And .Font.Italic Then
sResult = "<b><i>" & .Value & "</i></b>"
ElseIf .Font.Bold And Not .Font.Italic Then
sResult = "<b>" & .Value & "</b>"
ElseIf Not .Font.Bold And .Font.Italic Then
sResult = "<i>" & .Value & "</i>"
Else
L = Len(.Value)
'number of chars
ReDim A(1 To L)
For i = 1 To L
sChar = .Characters(i, 1).Text
bBold = .Characters(i, 1).Font.Bold
bItal = .Characters(i, 1).Font.Italic
If bBold And bItal Then
A(i) = "<b><i>" & sChar & "</i></b>"
ElseIf bBold Then
A(i) = "<b>" & sChar & "</b>"
ElseIf bItal Then
A(i) = "<i>" & sChar & "</i>"
Else
A(i) = sChar
End If
Next i
sResult = Join(A, "")
'InStr faster than Replace if nothing to Replace
If InStr(sResult, "</b><b>") > 0 Then sResult = Replace(sResult, "</b><b>", vbNullString)
If InStr(sResult, "</i><i>") > 0 Then sResult = Replace(sResult, "</i><i>", vbNullString)
If InStr(sResult, "</i></b><b><i>") > 0 Then sResult = Replace(sResult, "</i></b><b><i>", vbNullString)
End If
End With
sResult = "<p>" & sResult & "</p>"
AddTags = Replace(sResult, Chr(10), "</p><p>")
End Function
Not too fast either:
Sub M_snb()
ActiveSheet.UsedRange.Copy
With CreateObject("Word.document")
.Windows(1).Visible = True
.Characters.first.PasteExcelTable 0, 0, 1
.tables(1).converttotext
.Application.Selection.homekey 6
.Application.Selection.collapse
With .Application.Selection.Find
.ClearFormatting
.Font.Italic = 1
.Text = "*"
.Forward = 1
.Wrap = 1
.Format = 1
.MatchWildcards = 1
.Execute
End With
With .Application
Do Until .Selection.Font.Italic = False
.Selection.InsertBefore "<i>"
.Selection.EndOf 1
.Selection.Find.Font.Italic = Not .Selection.Find.Font.Italic
.Selection.Find.Execute
.Selection.InsertAfter "</i>"
.Selection.EndOf 1
.Selection.Find.Font.Italic = Not .Selection.Find.Font.Italic
.Selection.Find.Execute
Loop
End With
sn = Filter(Split(.Content, "i>"), "</")
.Close 0
End With
For j = 0 To UBound(sn)
ActiveSheet.Cells.Replace Replace(sn(j), "</", ""), "<i>" & sn(j) & "i>", 2
sn = Filter(sn, sn(j), 0)
If UBound(sn) = -1 Then Exit For
Next
End Sub
gmaxey
10-08-2020, 09:34 AM
Sam, Paul, snb
Thanks guys for all of your input and assistance. There is no perfect world but I think this will work for me:
Option Explicit
Sub ConvertFormatToHTML()
Dim oRng As Range
Dim oCell
Dim lngCount As Long
Dim lngScope As Long
Set oRng = Selection.Columns.Item(1)
lngScope = GetLastRow(oRng)
For Each oCell In oRng.Cells
If lngCount > 0 Then oCell.Value = AddTags(oCell)
lngCount = lngCount + 1
DoEvents
Debug.Print lngCount
If lngCount = lngScope Then Exit Sub
Next
End Sub
Function AddTags(ByVal oCell As Range) As String
Dim lngIndex As Long
Dim strResult As String
Dim bIsBold As Boolean, bIsItalic As Boolean, bIsBoldItalic As Boolean
bIsBold = False
bIsItalic = False
bIsBoldItalic = False
Select Case oCell.Font.FontStyle
Case Is = "Regular": strResult = oCell.Value
Case Is = "Bold"
strResult = "<B>" & oCell.Value & "</B>"
oCell.Font.FontStyle = "Regular"
Case Is = "Italic"
strResult = "<i>" & oCell.Value & "</i>"
oCell.Font.FontStyle = "Regular"
Case Is = "Bold Italic"
strResult = "<B><i>" & oCell.Value & "</i></B>"
oCell.Font.FontStyle = "Regular"
Case Else
For lngIndex = 1 To Len(oCell.Value)
If oCell.Characters(lngIndex, 1).Font.FontStyle = "Bold Italic" Then
If bIsBoldItalic = False Then
strResult = strResult + "<i><B>"
bIsBoldItalic = True
End If
Else
If bIsBoldItalic = True Then
strResult = strResult + "</i></B>"
bIsBoldItalic = False
End If
End If
If oCell.Characters(lngIndex, 1).Font.FontStyle = "Bold" Then
If bIsBold = False Then
strResult = strResult + "<B>"
bIsBold = True
End If
Else
If bIsBold = True Then
strResult = strResult + "</B>"
bIsBold = False
End If
End If
If oCell.Characters(lngIndex, 1).Font.FontStyle = "Italic" Then
If bIsItalic = False Then
strResult = strResult + "<i>"
bIsItalic = True
End If
Else
If bIsItalic = True Then
strResult = strResult + "</i>"
bIsItalic = False
End If
End If
strResult = strResult + oCell.Characters(lngIndex, 1).Text
Next lngIndex
If bIsBold = True Then strResult = strResult + "</B>"
If bIsItalic = True Then strResult = strResult + "</i>"
If bIsBoldItalic = True Then strResult = strResult + "</i></B>"
End Select
If InStr(strResult, Chr(10)) > 0 Then
strResult = Replace(strResult, Chr(10), "<P></P>")
End If
If Len(strResult) > 0 Then
AddTags = "<P>" & strResult & "</P>"
Else
AddTags = vbNullString
End If
AddTags = Replace(AddTags, "<P><P>", "<P>")
AddTags = Replace(AddTags, "</P></P>", "</P>")
AddTags = Replace(AddTags, Chr(10), "")
lbl_Exit:
Exit Function
End Function
Function GetLastRow(oRng As Range) As Long
GetLastRow = Cells(Rows.Count, oRng.Column).End(xlUp).Row
lbl_Exit:
Exit Function
End Function
Paul_Hossler
10-08-2020, 12:26 PM
OK, but I have to say that it doesn't look very efficient
Not sure it'll work on strings like: nnnBBBBXXXXIIIInnnnBBBB when XXXX = Bold Italic, n = none, B = bold and I = italic
gmaxey
10-08-2020, 05:04 PM
Paul,
You are right. It wouldn't. I could be missing the tree for the forest, but I don't see how to avoid the For Each Character loops when the cell contains mixed formatting
Feel free to "Give me a fish" here if you wish. I don't want the content to replace the existing content. I'll attach a sample file.
Option Explicit
Sub ConvertFormatToHTML()
Dim oRng As Range
Dim oCell
Dim lngCount As Long
Dim lngScope As Long
Set oRng = Selection.Columns.Item(1)
lngScope = GetLastRow(oRng)
For Each oCell In oRng.Cells
If lngCount > 0 Then oCell.Value = AddTags(oCell)
lngCount = lngCount + 1
DoEvents
Debug.Print lngCount
If lngCount = lngScope Then Exit Sub
Next
End Sub
Function AddTags(ByVal oCell As Range) As String
Dim lngIndex As Long
Dim strResult As String
Dim bIsBold As Boolean, bIsItalic As Boolean, bIsBoldItalic As Boolean
bIsBold = False
bIsItalic = False
bIsBoldItalic = False
Select Case oCell.Font.FontStyle
Case Is = "Regular": strResult = oCell.Value
Case Is = "Bold"
strResult = "<B>" & oCell.Value & "</B>"
oCell.Font.FontStyle = "Regular"
Case Is = "Italic"
strResult = "<i>" & oCell.Value & "</i>"
oCell.Font.FontStyle = "Regular"
Case Is = "Bold Italic"
strResult = "<B><i>" & oCell.Value & "</i></B>"
oCell.Font.FontStyle = "Regular"
Case Else
For lngIndex = 1 To Len(oCell.Value)
Select Case oCell.Characters(lngIndex, 1).Font.FontStyle
Case Is = "Bold Italic"
If bIsBold = True Then strResult = strResult + "</B>": bIsBold = False
If bIsItalic = True Then strResult = strResult + "</i>": bIsItalic = False
If bIsBoldItalic = False Then
strResult = strResult + "<B><i>"
bIsBoldItalic = True
End If
Case Is = "Bold"
If bIsBoldItalic = True Then strResult = strResult + "</i></B>": bIsBoldItalic = False
If bIsItalic = True Then strResult = strResult + "</i>": bIsItalic = False
If bIsBold = False Then
strResult = strResult + "<B>"
bIsBold = True
End If
Case Is = "Italic"
If bIsBold = True Then strResult = strResult + "</B>": bIsBold = False
If bIsBoldItalic = True Then strResult = strResult + "</i></B>": bIsBoldItalic = False
If bIsItalic = False Then
strResult = strResult + "<i>"
bIsItalic = True
End If
Case Else
If bIsBold = True Then strResult = strResult + "</B>": bIsBold = False
If bIsItalic = True Then strResult = strResult + "</i>": bIsItalic = False
If bIsBoldItalic = True Then strResult = strResult + "</i></B>": bIsBoldItalic = False
End Select
strResult = strResult + oCell.Characters(lngIndex, 1).Text
Next lngIndex
If bIsBold = True Then strResult = strResult + "</B>": bIsBold = False
If bIsItalic = True Then strResult = strResult + "</i>": bIsItalic = False
If bIsBoldItalic = True Then strResult = strResult + "</i></B>": bIsBoldItalic = False
End Select
If InStr(strResult, Chr(10)) > 0 Then
strResult = Replace(strResult, Chr(10), "<P></P>")
End If
If Len(strResult) > 0 Then
AddTags = "<P>" & strResult & "</P>"
Else
AddTags = vbNullString
End If
AddTags = Replace(AddTags, "<P><P>", "<P>")
AddTags = Replace(AddTags, "</P></P>", "</P>")
AddTags = Replace(AddTags, Chr(10), "")
lbl_Exit:
Exit Function
End Function
Function GetLastRow(oRng As Range) As Long
GetLastRow = Cells(Rows.Count, oRng.Column).End(xlUp).Row
lbl_Exit:
Exit Function
End Function
27285
gmaxey
10-09-2020, 06:24 AM
Paul,
Was in a rush yesterday to get something done and didn't have the time to full review (or appreciate) what you had suggested. I revised it a little to better suit my style and eliminated some of the boolean variables. It seems to work very well with your samples and my text as well. Thank you.
Sub ConvertFormatToHTML()
Dim oRng As Range
Dim oCell As Range
ActiveSheet.Columns(1).Select
Set oRng = Nothing
On Error Resume Next
Set oRng = Selection.Columns(1).SpecialCells(xlCellTypeConstants, xlTextValues)
On Error GoTo 0
If oRng Is Nothing Then Exit Sub
For Each oCell In oRng.Cells
'oCell.Offset(0, 1).Value = AddTags(oCell)
oCell.Value = AddTags(oCell)
Next
lbl_Exit:
Exit Sub
End Sub
Function AddTags(oCell As Range) As String
Dim strResult As String, strChar As String
Dim lngIndex As Long, lngLen As Long
Dim arrChrs() As String
With oCell
'Check the entire cell
If Not .Font.FontStyle = "Bold" And Not .Font.FontStyle = "Italic" And Not .Font.FontStyle = "Bold Italic" Then
strResult = .Value
ElseIf .Font.FontStyle = "Bold Italic" Then
strResult = "<b><i>" & .Value & "</i></b>"
ElseIf .Font.FontStyle = "Bold" Then
strResult = "<b>" & .Value & "</b>"
ElseIf .Font.FontStyle = "Italic" Then
strResult = "<i>" & .Value & "</i>"
Else
lngLen = Len(.Value)
'Number of chars
ReDim arrChrs(1 To lngLen)
For lngIndex = 1 To lngLen
strChar = .Characters(lngIndex, 1).Text
Select Case .Characters(lngIndex, 1).Font.FontStyle
Case "Bold Italic"
arrChrs(lngIndex) = "<b><i>" & strChar & "</i></b>"
Case "Bold"
arrChrs(lngIndex) = "<b>" & strChar & "</b>"
Case "Italic"
arrChrs(lngIndex) = "<i>" & strChar & "</i>"
Case Else
arrChrs(lngIndex) = strChar
End Select
Next lngIndex
strResult = Join(arrChrs, "")
End If
'InStr faster than Replace if nothing to Replace
If InStr(strResult, "</b><b>") > 0 Then strResult = Replace(strResult, "</b><b>", vbNullString)
If InStr(strResult, "</i><i>") > 0 Then strResult = Replace(strResult, "</i><i>", vbNullString)
If InStr(strResult, "</i></b><b><i>") > 0 Then strResult = Replace(strResult, "</i></b><b><i>", vbNullString)
End With
strResult = "<p>" & strResult & "</p>"
AddTags = Replace(strResult, Chr(10), "</p><p>")
lbl_Exit:
Exit Function
End Function
@greg
Reduce
ActiveSheet.Columns(1).Select
Set oRng = Nothing
On Error Resume Next
Set oRng = Selection.Columns(1).SpecialCells(xlCellTypeConstants, xlTextValues)
to
On Error Resume Next
Set oRng = activesheet.Columns(1).SpecialCells(2,2)
Paul_Hossler
10-09-2020, 08:41 AM
1. It takes time to read a property from an object so I used the booleans to hold the result if I needed it again (you did emphasize the need for speed), or just for readability (my style)
2. Something like .Font.FontStyle = "Bold" is MS Word oriented (I think) and involves two object levels and a string compare to return a T/F. Something like .Font.Bold returns T/F directly, so I thought it'd be a tad faster
3. There's been many discussions here in Excel-land about using arrays to improve speed since directly reading from the worksheet N times to manipulate data one cell at a time is a lot slower than bringing the N cells into an array all at once, crunch the array in memory, and then put all N cells back to the worksheet all at once. Of course, all that the array has are Values, no formatting. Join-ing the pieces into a single String for Replace is probably the equivalent
4.
Followup to snb's comment
a. You do not need to .Select something to use or act on it (MS Word is very 'Selection' object oriented)
b. I captured the returned Set of the .SpecialCells so that I could exit neatly if by chance there were no text cells
c. My style is to use enumerations instead of the numerical values for readability. The computer doesn't care, but I do when reading the code
d. If you were making this a general purpose sub, then you could just use
If Not TypeOf Selection Is Range Then Exit Sub
Set oRng = Nothing
On Error Resume Next
Set oRng = Selection.SpecialCells(xlCellTypeConstants, xlTextValues)
If oRng Is Nothing Then Exit Sub
5. You could make AddTags into a UserDefinedFunction and use it directly in cells on the worksheet, or call it inside a macro to do something with the tagged text
:rotlaugh::rotlaugh::rotlaugh:
(Still in Hossler-style :*) )
Option Explicit
Sub TagActiveCell()
MsgBox AddTags(ActiveCell)
End Sub
Function AddTags(oCell As Range) As String
Dim sResult As String, sChar As String
Dim i As Long, L As Long
Dim bBold As Boolean, bItal As Boolean
Dim A() As String
AddTags = vbNullString
With oCell
'check the entire cell
If Len(Trim(.Value)) = 0 Then Exit Function
If Not .Font.Bold And Not .Font.Italic Then
sResult = .Value
ElseIf .Font.Bold And .Font.Italic Then
sResult = "<b><i>" & .Value & "</i></b>"
ElseIf .Font.Bold And Not .Font.Italic Then
sResult = "<b>" & .Value & "</b>"
ElseIf Not .Font.Bold And .Font.Italic Then
sResult = "<i>" & .Value & "</i>"
Else
L = Len(.Value)
'number of chars
ReDim A(1 To L)
For i = 1 To L
sChar = .Characters(i, 1).Text
bBold = .Characters(i, 1).Font.Bold
bItal = .Characters(i, 1).Font.Italic
If bBold And bItal Then
A(i) = "<b><i>" & sChar & "</i></b>"
ElseIf bBold Then
A(i) = "<b>" & sChar & "</b>"
ElseIf bItal Then
A(i) = "<i>" & sChar & "</i>"
Else
A(i) = sChar
End If
Next i
sResult = Join(A, "")
'InStr faster than Replace if nothing to Replace
If InStr(sResult, "</b><b>") > 0 Then sResult = Replace(sResult, "</b><b>", vbNullString)
If InStr(sResult, "</i><i>") > 0 Then sResult = Replace(sResult, "</i><i>", vbNullString)
If InStr(sResult, "</i></b><b><i>") > 0 Then sResult = Replace(sResult, "</i></b><b><i>", vbNullString)
End If
End With
sResult = "<p>" & sResult & "</p>"
AddTags = Replace(sResult, Chr(10), "</p><p>")
End Function
gmaxey
10-09-2020, 09:11 AM
Paul,
Good points. I've revised as follows:
Sub ConvertFormatToHTML()
Dim oRng As Range
Dim oCell As Range
ActiveSheet.Columns(1).Select
Set oRng = Nothing
On Error Resume Next
Set oRng = Selection.Columns(1).SpecialCells(xlCellTypeConstants, xlTextValues)
On Error GoTo 0
If oRng Is Nothing Then Exit Sub
For Each oCell In oRng.Cells
oCell.Offset(0, 1).Value = AddTags(oCell)
'oCell.Value = AddTags(oCell)
Next
lbl_Exit:
Exit Sub
End Sub
Function AddTags(oCell As Range) As String
Dim strResult As String, strChar As String
Dim lngIndex As Long, lngLen As Long
Dim arrChrs() As String
Dim bBold As Boolean, bItalic As Boolean
With oCell
'Check the entire cell
If Not .Font.Bold = True And Not .Font.Italic = True Then
strResult = .Value
ElseIf .Font.Bold = True And .Font.Italic = True Then
strResult = "<b><i>" & .Value & "</i></b>"
ElseIf .Font.FontStyle = "Bold" Then
strResult = "<b>" & .Value & "</b>"
ElseIf .Font.FontStyle = "Italic" Then
strResult = "<i>" & .Value & "</i>"
Else
lngLen = Len(.Value)
'Number of chars
ReDim arrChrs(1 To lngLen)
For lngIndex = 1 To lngLen
strChar = .Characters(lngIndex, 1).Text
bBold = .Characters(lngIndex, 1).Font.Bold
bItalic = .Characters(lngIndex, 1).Font.Italic
Select Case True
Case bBold And bItalic: arrChrs(lngIndex) = "<b><i>" & strChar & "</i></b>"
Case bBold: arrChrs(lngIndex) = "<b>" & strChar & "</b>"
Case bItalic: arrChrs(lngIndex) = "<i>" & strChar & "</i>"
Case Else: arrChrs(lngIndex) = strChar
End Select
Next lngIndex
strResult = Join(arrChrs, "")
End If
'InStr faster than Replace if nothing to Replace
If InStr(strResult, "</b><b>") > 0 Then strResult = Replace(strResult, "</b><b>", vbNullString)
If InStr(strResult, "</i><i>") > 0 Then strResult = Replace(strResult, "</i><i>", vbNullString)
If InStr(strResult, "</i></b><b><i>") > 0 Then strResult = Replace(strResult, "</i></b><b><i>", vbNullString)
End With
strResult = "<p>" & strResult & "</p>"
AddTags = Replace(strResult, Chr(10), "</p><p>")
lbl_Exit:
Exit Function
End Function
Paul_Hossler
10-09-2020, 09:41 AM
NOT a big deal (style comment), but .Font. ... is a Boolean so this
If Not .Font.Bold = True And Not .Font.Italic = True Then
could be shortened to
1.
If .Font.Bold = False And .Font.Italic = False Then
or
2.
If Not .Font.Bold And Not .Font.Italic Then
or
3.
If Not (.Font.Bold Or .Font.Italic) Then
Personally, I find #2 more readable (but as I said, it's my style and I'm comfortable / used to it)
Example: "Cat Dog Fox"
Excel: Cat.FontStlye = Bold; a.Fontstyle = Bold; c.Fontstyle = Bold; " ".Fontstyle = Regular; D.Fontstyle = BoldItalic; o.Fontstyle = BoldItalic; etc for each character
By default, each new character's Font Object inherits the fontstyle of the preceding character. A Fontstyle 'ends' when a new FontStyle is initiated
Cell.FontStyles provides the default character Fontstyle
Html: <b>Cat <i>Cat</i></b> Dog
Font tags Encapsulate Strings, Encapsulated Tags are additive. A Font 'ends' when the closing Tag is presented.
Encapsulating each character is allowed, <b>C</b><b>a</b><b>t</b> <b><i>D</i></b><b><i>o</i></b><b><i>g</i></b>, etc, and HTMLTidy should clean it up. I would start and end the HTM string(entire cell) with the Tag for the Cell's FontStyle
Personally, I would save all output to a *.htm file, (a basic text file that starts with "<html><body>" & NewLine + results,) and feed that to HTMLTidy.
Option Explicit
Sub test()
Dim re1 As Object
Dim re2 As Object
Dim s As String
Dim r As Range
Dim m, k
Set r = Selection
s = r.Value(xlRangeValueXMLSpreadsheet)
Set re1 = CreateObject("VBScript.RegExp")
re1.Global = True
Set re2 = CreateObject("VBScript.RegExp")
re2.Global = True
'
re1.Pattern = "<B>(<Font[\s\S]*?>)([\s\S]*?)(</Font>)*</B>"
s = re1.Replace(s, "$1<B>$2</B>$3")
re1.Pattern = "<I>(<Font[\s\S]*?>)([\s\S]*?)(</Font>)*</I>"
s = re1.Replace(s, "$1<I>$2</I>$3")
re2.Pattern = "<Style ss:(ID=""s\d*"")>[\n\s]*?<Font[^>]*?ss:Bold=""1"""
If re2.test(s) Then
For Each m In re2.Execute(s)
re1.Pattern = "(<Cell ss:Style" & m.submatches(0) & "><Data ss:Type=""String"">)([\s\S]*?)(</Data>)"
s = re1.Replace(s, "$1<B>$2</B>$3")
Next
End If
re2.Pattern = "<Style ss:(ID=""s\d*"")>[\n\s]*?<Font[^>]*?ss:Italic=""1"""
If re2.test(s) Then
For Each m In re2.Execute(s)
re1.Pattern = "(<Cell ss:Style" & m.submatches(0) & "><Data ss:Type=""String"">)([\s\S]*?)(</Data>)"
s = re1.Replace(s, "$1<I>$2</I>$3")
Next
End If
re1.Pattern = "((<ss:|<)Data[\s\S]*?>)([\s\S]*?)(<(/ss:|/)Data>)"
s = re1.Replace(s, "$1<P>$3</P>$4")
re1.Pattern = "
"
s = re1.Replace(s, "<P></P>")
r.Offset(, 2).Value(xlRangeValueXMLSpreadsheet) = s
r.Offset(, 2).ClearFormats
r.Offset(, 2).Font.Bold = False
r.Offset(, 2).Font.Italic = False
'
Set re1 = Nothing
Set re2 = Nothing
End Sub
マナ
re1.Pattern = "
"
s = re1.Replace(s, "<P></P>")
re1.Pattern = "& # 10 ;" <---Remove space
s = re1.Replace(s, "<P></P>")
gmaxey
10-10-2020, 05:01 AM
Mana,
Very interesting and fast. However, at this point, it is not dealing with FontSytle "Bold Italic"
This Bold&Italic <B><i>Bold&Italic</B></i>
and this Bold&ItalicBold should end up <B><i>Bold&Italic</i>Bold</B>
Paul_Hossler
10-10-2020, 07:59 AM
@mana
that xlRangeValueXMLSpreadsheet is interesting. Never saw that before
Option Explicit
Sub test2()
Dim re1 As Object, re2 As Object
Dim s As String
Dim r As Range
Dim m As Object
Set r = Selection
s = r.Value(xlRangeValueXMLSpreadsheet)
Set re1 = CreateObject("VBScript.RegExp")
re1.Global = True
Set re2 = CreateObject("VBScript.RegExp")
re2.Global = True
re1.Pattern = "<I>(<Font[\s\S]*?>)([\s\S]*?)(</Font>)*</I>"
s = re1.Replace(s, "$1<I>$2</I>$3")
re1.Pattern = "<B>(<Font[\s\S]*?(?!Style)>)([\s\S]*?)(</Font>)*</B>"
s = re1.Replace(s, "$1<B>$2</B>$3")
re2.Pattern = "<Style ss:(ID=""s\d*"")>([\s\S]*?)</Style>"
If re2.test(s) Then
For Each m In re2.Execute(s)
If InStr(m.submatches(1), "ss:Bold=""1""") > 0 Then
re1.Pattern = "(<Cell ss:Style" & m.submatches(0) & "><Data ss:Type=""String"">)([\s\S]*?)(</Data>)"
s = re1.Replace(s, "$1<B>$2</B>$3")
End If
If InStr(m.submatches(1), "ss:Italic=""1""") > 0 Then
re1.Pattern = "(<Cell ss:Style" & m.submatches(0) & "><Data ss:Type=""String"">)([\s\S]*?)(</Data>)"
s = re1.Replace(s, "$1<I>$2</I>$3")
End If
Next
End If
'
re1.Pattern = "((<ss:|<)Data[\s\S]*?>)([\s\S]*?)(<(/ss:|/)Data>)"
s = re1.Replace(s, "$1<P>$3</P>$4")
re1.Pattern = "& # 10;" '<--remove space
s = re1.Replace(s, "<P></P>")
r.Offset(, 2).Value(xlRangeValueXMLSpreadsheet) = s
r.Offset(, 2).ClearFormats
r.Offset(, 2).Font.Bold = False
r.Offset(, 2).Font.Italic = False
'
Set re1 = Nothing
Set re2 = Nothing
End Sub
マナ
Or:
Sub M_snb()
sp = Range("A1:B20")
sn = Filter(Split(Replace(Replace(Range("A1:D20").Value(11), "</Cell>", "<Cell>"), "</Font>", ""), "<Cell>"), "<Font")
For j = 1 To UBound(sn)
If InStr(sn(j), "<I>") + InStr(sn(j), "<B>") Then
sn(j) = Replace(Replace(Replace(Replace(Replace(sn(j), "<I>", Chr(0) & "I" & Chr(1)), "<B>", Chr(0) & "B" & Chr(1)), "</I>", Chr(0) & "/I" & Chr(1)), "</B>", Chr(0) & "/B" & Chr(1)), ">", "<")
sp((j - 1) \ UBound(sp, 2) + 1, (j - 1) Mod UBound(sp, 2) + 1) = Replace(Replace(Join(Filter(Split(sn(j), "<"), ":", 0), ""), Chr(0), "<"), Chr(1), ">")
End If
Next
Range("A1:B20").Offset(25) = sp
End Sub
gmaxey
10-11-2020, 07:20 AM
Very nice. Thanks mana
gmaxey
10-11-2020, 07:24 AM
snb, Your code throws a type mismatch error. Since it has nothing to do with converting the content of the selected range or column, why do you even bother?
Paul_Hossler
10-11-2020, 07:35 AM
Very nice. Thanks mana
Agreed - I learned a little something
However, I don't think I could get all the parameters, etc. correct. I'll have to opt for simplicity over efficiency :(
gmaxey
10-11-2020, 07:48 AM
Paul,
Likewise. This is a case where I was given the fish. A "Deep Sea" fish. I better stay closer to the shore :-). Thanks all of you for your help.
gmaxey
10-11-2020, 10:42 AM
Sorry guys, one more question. In mana's code he sets his range to the the selection.
For my purposes, I need to select a column of data and run the code. I don't want to process the first heading row. How do I adapt:
Set r = Selection
such that the range excludes the first row.
I tried this (among other things) but get and error:
Set r = Selection.Offset(1, 0)
Thanks.
Paul_Hossler
10-11-2020, 01:37 PM
In Excel, ActiveCell is used a lot
One if the data has no blanks, another if it does
Of course, you could use the second for everything, but I thought I'd show two different ways
This could stand some error checking, but we'll leave that as a homework assignment :devil2:
Option Explicit
'click somewhere in Col A
Sub Fragment()
Dim rngToProcess As Range, rStart As Range, rEnd As Range
Set rStart = ActiveCell.EntireColumn.Cells(2, 1)
Set rEnd = rStart.End(xlDown)
Set rngToProcess = Range(rStart, rEnd)
On Error Resume Next
Set rngToProcess = rngToProcess.SpecialCells(xlCellTypeConstants, xlTextValues)
On Error GoTo 0
MsgBox rngToProcess.Address
End Sub
'click somewhere in Col C
Sub Fragment2()
Dim rngToProcess As Range, rStart As Range, rEnd As Range
Set rStart = ActiveCell.EntireColumn.Cells(2, 1)
Set rEnd = ActiveSheet.Cells(ActiveSheet.Rows.Count, ActiveCell.Column).End(xlUp)
Set rngToProcess = Range(rStart, rEnd)
On Error Resume Next
Set rngToProcess = rngToProcess.SpecialCells(xlCellTypeConstants, xlTextValues)
On Error GoTo 0
MsgBox rngToProcess.Address
End Sub
gmaxey
10-11-2020, 02:13 PM
Thanks Pual
rambleon
02-06-2021, 07:18 AM
Hi Mana,
In addition to converting Bold and Italic text to web format - which is an excellent piece code, would it be possible to also include conversion of special characters for web, such as ampersand (&), quotes (") and apostrophe (').
Thanks
@gmax
To exclude columnheaders:
Sub M_snb()
sn=cells(1).currentregion.columns(3).offset(1).specialcells(2)
End Sub
rambleon
02-07-2021, 01:08 AM
Hi Mana,
This code was working but now it just shows [Running] then Not Responding.
I'm selecting the column to convert on a sheet with 2,200 rows.
Any ideas?
Thanks
rambleon
02-07-2021, 02:20 AM
I went back to an earlier version of the file, and your code to convert selected column to html worked fine, so problem was mine.
Sorry,
Thanks
Powered by vBulletin® Version 4.2.5 Copyright © 2025 vBulletin Solutions Inc. All rights reserved.