[SLEEPER:] Remove encoding from an XML file using regular expressions
kunguito
11-23-2009, 03:58 AM
Hi everyone,
I want encoding="XX" removed from a string using regular expressions, where XX could be any possible encoding: UTF-8, UTF-16, ISO...
Here is a sample:
<?xml version="1.0" encoding="UTF-8"?>
And this is how it should look after the replacement:
<?xml version="1.0"?>
What would the pattern look like?
Note that I'm not asking how to use regular expressions but how to build a pattern.
kunguito
11-23-2009, 04:48 AM
Code snippet based on a piece of MSDN documentation.
Set objRegExp = New RegExp
' myPattern = "encoding=" & Chr(34) & "UTF-8" & Chr(34) ' ==> This works fine but is only a subcase of what I want.
' myPattern = "encoding=" & Chr(34) & "[+]" & Chr(34) ' ==> Fails the Test in the If statement below.
myPattern = "encoding=" & "\w+" ' ==> So does this one.
'Set the pattern by using the Pattern property.
objRegExp.Pattern = myPattern
' Set Case Insensitivity.
objRegExp.IgnoreCase = True
'Set global applicability.
objRegExp.Global = True
'Test whether the pattern matches the string.
If objRegExp.Test(firstLine) Then
    'Replace the matches.
    firstLine = objRegExp.Replace(firstLine, "")
End If
JScript and VBScript regex syntax appears to be different.
Some documentation here:
http://msdn.microsoft.com/en-us/library/6wzad2b2(VS.85).aspx
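For what it's worth, the two failing patterns can probably be explained without any JScript/VBScript difference. "[+]" is a character class that matches only a literal plus sign, and "encoding=\w+" cannot match because the character right after the equals sign is a double quote, which \w does not cover (nor does it cover the hyphen in UTF-8). A minimal sketch of one possible fix follows, assuming the value is always quoted and that the whitespace in front of the attribute should go too. The function name StripEncoding is made up for illustration and is not from the snippet above.

```vbscript
' Hypothetical helper: the name StripEncoding is an assumption, not from the thread.
' Removes the encoding pseudo-attribute from an XML declaration line.
Function StripEncoding(firstLine)
    Dim objRegExp
    Set objRegExp = CreateObject("VBScript.RegExp")

    ' \s+        whitespace in front of the attribute (removed along with it)
    ' encoding   the attribute name
    ' \s*=\s*    equals sign, optionally surrounded by whitespace
    ' "[^"]*"    a double-quoted value with any content (UTF-8, ISO-8859-1, ...)
    ' '[^']*'    or the same value in single quotes, which XML also allows
    objRegExp.Pattern = "\s+encoding\s*=\s*(" & _
                        Chr(34) & "[^" & Chr(34) & "]*" & Chr(34) & _
                        "|'[^']*')"

    ' Case insensitivity is kept from the original snippet.
    objRegExp.IgnoreCase = True
    ' An XML declaration carries at most one encoding, so one replacement is enough.
    objRegExp.Global = False

    ' Replace returns the input unchanged when nothing matches,
    ' so no separate Test call is needed.
    StripEncoding = objRegExp.Replace(firstLine, "")
End Function
```

Called as firstLine = StripEncoding(firstLine), the sample line from the first post should come back as <?xml version="1.0"?>. Using [^"]* rather than \w+ is a deliberate choice here: it accepts whatever sits between the quotes, so hyphens and digits in encoding names are not a problem.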