Conversion Problem in word to xml



jothi
12-13-2009, 11:58 PM
Hi All,

I am facing a problem converting Word to XML using VBA. If I save a Word document as XML, some of the symbols are not recognized and appear as nothing in the XML file (the symbol fonts are not recognized). Recognized symbols appear as Unicode characters. So how can I change the symbol font?

If anyone can help with this I would be very thankful.

Thank you,

Jothi.R

lucas
12-14-2009, 09:26 AM
Moved to the Word help forum.

TonyJollans
12-14-2009, 10:21 AM
Firstly, what version of Word?

Secondly, if you save as XML without using VBA do you see the same problem? In other words, does this issue have anything to do with VBA? I suspect it doesn't.

I don't know the answer, but, if you give a bit more information I can take a look - what symbols are giving you trouble? What do you mean when you say they are present as unicode characters? Present where? And when you say 'present as nothing' in the xml file, do you mean they are not there? Or what? And what is not recognising the symbol font?

jothi
12-14-2009, 11:54 PM
1. I am using Word 2003.
2. If I save the Word document as XML without using VBA, I get the same problem.

If I insert the symbol from Insert->Symbol->Font: Times New Roman

jothi
12-15-2009, 12:10 AM
Hi Tony,

I am new to VBA.

Please ignore my previous reply.

1. I am using Word 2003.
2. If I save the Word document as XML without using VBA, I get the same problem.

I will explain my problem in detail:
Insert a symbol from Insert->Symbol->Font: (e.g. Times New Roman), e.g. !
If I save this as XML, I get this symbol in Unicode format.
But:
Insert a symbol from Insert->Symbol->Font: (Symbol), e.g. !
If I save this as XML, I get null in the XML file.

I have one solution for this problem:
VBA code to replace the symbol font:
Function call: Call ReplaceAllSymbols(FC:=ChrW(-4063), FF:="Symbol", RC:=33, RF:="Times New Roman") ' exclamation mark
Function definition:

Private Sub ReplaceAllSymbols(FC As String, FF As String, RC As Long, RF As String)
    ' Replaces every occurrence of character FC that is set in font FF
    ' with character number RC from font RF.
    Dim OriginalRange As Range

    Application.ScreenUpdating = False
    Set OriginalRange = Selection.Range
    ActiveDocument.Range(0, 0).Select
    With Selection.Find
        .ClearFormatting
        .Text = FC
        .Replacement.Text = ""
        .Forward = True
        .Wrap = wdFindStop
        .Format = False
        .MatchCase = False
        .MatchWholeWord = False
        .MatchWildcards = False
        .MatchSoundsLike = False
        .MatchAllWordForms = False
        Do While .Execute
            ' The Insert Symbol dialog reports the font of the selected symbol
            If Dialogs(wdDialogInsertSymbol).Font = FF Then
                Selection.InsertSymbol Font:=RF, _
                    CharacterNumber:=RC, Unicode:=True
            Else
                Selection.Collapse wdCollapseEnd
            End If
        Loop
    End With
    On Error GoTo CleanUp

    ' Restore the selection the user had before the macro ran
    OriginalRange.Select
CleanUp:
    Set OriginalRange = Nothing
    Application.ScreenUpdating = True
End Sub

What I am struggling with is that I don't know how to find the character code (e.g. -4063 for &excl;) for all the symbols in the Symbol font. I hope you understand my question.

Thanks,
Jothi.
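
One possible way to find those character codes, offered as a rough sketch rather than a tested solution: the value -4063 is just the Private Use Area code F021 (hex) read as a signed 16-bit number, and F021 is F000 plus the ordinary Symbol-font character number 33 (hex 21). The helper names below (ToUnsignedCode, SymbolCharToSignedPua, ShowSelectedSymbolCode) are made up for illustration. The last routine assumes that the Insert Symbol dialog exposes a CharNum argument alongside the Font argument used in the macro above; that should be checked in your own copy of Word.

```vba
Option Explicit

' Convert the signed value Word reports (e.g. -4063) into the
' unsigned code (e.g. 61473, which is F021 in hex).
Public Function ToUnsignedCode(ByVal charNum As Long) As Long
    If charNum < 0 Then
        ToUnsignedCode = charNum + 65536
    Else
        ToUnsignedCode = charNum
    End If
End Function

' Convert an ordinary Symbol-font character number (e.g. 33) into the
' signed Private Use Area value that can be passed to ChrW (e.g. -4063).
Public Function SymbolCharToSignedPua(ByVal ansiCode As Long) As Long
    ' The trailing & keeps the hex literal a Long; a bare &HF000 would
    ' be read as a negative Integer.
    SymbolCharToSignedPua = (&HF000& + ansiCode) - 65536
End Function

' Report the font and character number of the currently selected symbol.
' Assumption: the dialog's CharNum argument is available, like Font.
Public Sub ShowSelectedSymbolCode()
    Dim symbolFont As String
    Dim charNum As Long
    With Dialogs(wdDialogInsertSymbol)
        symbolFont = .Font
        charNum = .CharNum
    End With
    MsgBox "Font: " & symbolFont & vbCr & _
           "CharNum: " & charNum & vbCr & _
           "Hex: " & Hex$(ToUnsignedCode(charNum))
End Sub
```

With this, SymbolCharToSignedPua(33) gives -4063, the value used in the call above, so the FC argument for any other Symbol-font character can be worked out from its character number in the same way.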

TonyJollans
12-15-2009, 09:04 AM
The more I use it, the more I dislike XML!

I can well believe there is a problem with Symbols that use Private Use Area characters, but I'm not sure where it is exhibiting itself for you. Word files (whether XML or not) are designed, primarily, for consumption by Word. Do you see a problem when Word reads the file, or are you using a different consumer?

I still don't really understand when you say you have a null in the XML, as I am not seeing this. I see:

<w:sym w:font="Symbol" w:char="F021"/>

.. which is correct, as far as I understand it, although the way Word uses Symbols leaves a lot to be desired, and I always use ‘proper’ Unicode characters now.

Could you post the relevant parts of your XML, please?

jothi
12-16-2009, 06:24 AM
Yes, you are correct. I am having a problem with Private Use Area characters. Actually, I used an XSL transformation to get the XML, and in it I forgot to transform the Private Use characters (<w:sym w:font="Symbol" w:char="F021"/>), so I got null in the transformed output. Only after your reply did I notice that tag. I still don't know how to get the ISO Unicode &#x0021; (&excl;) from this <w:sym w:font="Symbol" w:char="F021"/>.

Thank you,
Jothi.R

TonyJollans
12-16-2009, 07:23 AM
There is no way you can automate the transformation without hard coding all the values you want to transform. By definition, PUA characters can be anything the designer of the Font wants, and not necessarily with unicode equivalents.

If the Symbol Font is available to your ultimate consumer, you should be able to use the characters within it, and transform:


<w:sym w:font="Symbol" w:char="F021"/>

into the same as you would transform this:


<w:rPr>
<w:rFonts w:ascii="Symbol" w:h-ansi="Symbol"/></w:rPr>

<w:t>&#xF021;</w:t>
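
If the Symbol font is not available to the consumer, the hard-coded approach described above might look something like the following sketch. The function names (SymbolCharToUnicode, ToCharRef) are hypothetical, and the table is deliberately partial: the five entries follow the usual Symbol font layout as far as I know it, and every other value you need would have to be added by hand and checked against the font.

```vba
Option Explicit

' Returns the Unicode code point for a w:char value taken from a
' <w:sym w:font="Symbol" .../> element, or -1 when the value is not
' in this (incomplete, illustrative) table.
Public Function SymbolCharToUnicode(ByVal hexChar As String) As Long
    Select Case UCase$(hexChar)
        Case "F021": SymbolCharToUnicode = &H21&      ' exclamation mark
        Case "F022": SymbolCharToUnicode = &H2200&    ' for all
        Case "F061": SymbolCharToUnicode = &H3B1&     ' Greek small alpha
        Case "F062": SymbolCharToUnicode = &H3B2&     ' Greek small beta
        Case "F0B1": SymbolCharToUnicode = &HB1&      ' plus-minus sign
        Case Else:   SymbolCharToUnicode = -1         ' unknown: map by hand
    End Select
End Function

' Formats a code point as a numeric character reference, padded to at
' least four hex digits, e.g. &#x0021;
Public Function ToCharRef(ByVal codePoint As Long) As String
    Dim digits As String
    digits = Hex$(codePoint)
    If Len(digits) < 4 Then
        digits = String$(4 - Len(digits), "0") & digits
    End If
    ToCharRef = "&#x" & digits & ";"
End Function
```

For example, ToCharRef(SymbolCharToUnicode("F021")) yields &#x0021;, which is the reference asked for earlier in the thread. The same table could just as well be written as a lookup inside the XSL transformation; VBA is used here only because it is the language of the thread.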

fumei
12-16-2009, 11:48 AM
The more I use it, the more I dislike XML!

Amen to that.

jothi
12-17-2009, 07:06 AM
Thank you so much, Tony...