Consulting

Results 1 to 14 of 14

Thread: Find and replace text using a data file and "Save As"

  1. #1
    VBAX Regular
    Joined
    Sep 2013
    Location
    Oakland, Califorbia
    Posts
    7
    Location

    Find and replace text using a data file and "Save As"

    This is a little embarrassing. I have not needed to code anything in over 20 years. I used to hang out on forums like this answering questions like this about Visual Basic back when Windows 3.1 was the OS. But it has been too long.

    Please guide me as best you can. If you could just supply the right words for the concepts in the pseudocode, that would be great. If you could code it, I would be ever so grateful. I would prefer having it work across multiple versions of Word if possible. I am hoping the code is pretty generic and will not have changed much, if at all. I am using 2003, 2007 and 2010 versions of Word, and I imagine the people I will be sharing this with will be as well.

    I use Adobe Premiere Pro to edit videos. If I export a title I get a text file (XML) filled with all kinds of stuff, but the text I want to change is just plain text. So it could be <XYZ> and I could find and replace it to achieve my goal. It seems to me that it should look something like this:

    I open the Data file which has multiple lines of text - anywhere from 2 to 1000 or more. This is the file where the Macro button resides.

    The psuedocode:

    File Open Dialog ''''find and open the text file that I will change over and over again
    Search for XYZ ''''XYZ being the temporary text I use to help identify the place in the file that needs to be changed
    Read first line of data file ''''usually not more than 40 characters but could be up to 255
    Replace XYZ with first line
    Save As filename plus -n ''''Save text file to new name - n is a number that increments from 0001 to 9999
    Read 2nd line of data file
    Replace 1st line with 2nd line
    Save As filename - incrementing n
    Repeat until EOF
    EOF post a message saying "Done"

    This should leave me with a directory full of individual files numbered sequentially.

    In case it helps, the section of the file I am changing looks like this:

    <TRString>XYZ</TRString>

    The file extension is .prtl

    It seems like it should be easy but the actual code is escaping me.

    Thank you for taking the time to read this and if you can help, I would be appreciative. If you would be willing to code it and want credit for your work, the entire Adobe video editing community would be happy to shower you with praise. We have the pedestal all ready for you to stand on.

    -- Steven

  2. #2
    Welcome to VBA!

    Give this a whirl. Start with an empty document and save it as a Macro-Enabled (*.docm) file or as a Word 97-2003 (*.doc) file. Either will work in any version as long as the Word 2003 users have the Compatibility Pack installed; if they don't have it, or you aren't sure, then use the .doc format. This document will have to be stored in a trusted folder so the macro will be allowed to run.

    In the first paragraph of the document, insert a MacroButton field with this code (use Ctrl+F9 to insert the field markers):
    { Macrobutton EditVideoFile Double-click here to start video file processing }

    Then paste in the lines of data (the replacements for the XYZ).

    Open the macro editor, insert a module in the document's project, and paste in this code:

    Option Explicit
    
    Sub EditVideoFile()
        Dim dataFile As Document    ' this data doc or template
        Dim baseFile As Document    ' file being modified
        Dim baseFileFN As String    ' path\name of file from Adobe
        Dim rgData As Range         ' range of data file containing replacements
        Dim para As Paragraph       ' paragraph in rgData
        Dim rgPattern As Range      ' a range used in baseFile
        Dim replaceText As String   ' current line from this data file
        Dim iterNum As Long         ' number of current line / saved file
        Dim saveFN As String        ' path\base name of saved file
        
        Set dataFile = ActiveDocument
        
        MsgBox "Select the data file (*.prtl) to be processed."
        With Dialogs(wdDialogFileOpen)
            .Name = "*.prtl"
            If .Show = -1 Then
                Set baseFile = ActiveDocument   ' file just opened becomes Active
                baseFileFN = baseFile.FullName  ' this path\name will be used to open it again
            Else
                Exit Sub
            End If
        End With
        
        MsgBox "Select the folder and filename (without numbers) to save as."
        With Dialogs(wdDialogFileSaveAs)
            If .Display = -1 Then   ' just get the path\name, don't actually save
                saveFN = WordBasic.FileNameInfo(.Name, 5) & WordBasic.FileNameInfo(.Name, 4) & "-"
                'MsgBox saveFN
            Else
                Exit Sub
            End If
        End With
    
        On Error GoTo Errhdl
        iterNum = 1
        Set rgData = dataFile.Range
        rgData.MoveStart unit:=wdParagraph, Count:=1 ' exclude paragraph containing macro button
        For Each para In rgData.Paragraphs
            replaceText = Left(para.Range.Text, Len(para.Range.Text) - 1) ' exclude paragraph mark
            
            ' force all-caps XYZ to lower-case xyz
            ' to avoid inserting text from data file in all caps
            Set rgPattern = baseFile.Range
            With rgPattern.Find
                .Text = "XYZ"
                .Wrap = wdFindStop
                While .Execute
                    rgPattern.Text = LCase(rgPattern.Text)
                Wend
            End With
            
            ' do the replacement
            baseFile.Range.Find.Execute _
                FindText:="xyz", _
                MatchCase:=False, _
                ReplaceWith:=replaceText, _
                Replace:=wdReplaceAll
            
            baseFile.SaveAs _
                FileName:=saveFN & Format(iterNum, "0000") & ".prtl", _
                FileFormat:=wdFormatText, _
                AddToRecentFiles:=False
            
            ' close the modified file, then reopen the unmodified base
            baseFile.Close SaveChanges:=wdDoNotSaveChanges
            Set baseFile = Documents.Open(FileName:=baseFileFN, AddToRecentFiles:=False, Visible:=False)
            
            iterNum = iterNum + 1
        Next para
        
        baseFile.Close SaveChanges:=wdDoNotSaveChanges
        MsgBox "Done"
        Exit Sub
        
    Errhdl:
        MsgBox "Error " & Err.Number & " at line " & iterNum & vbCr & Err.Description
    End Sub
    Open the document and double-click the macrobutton field to start it up.

    There are a couple of interesting wrinkles in this code.
    • The XML file and the output files can be in any folder, and not necessarily in the same folder with each other.
    • If the XML file contains 'XYZ' in all caps, and the macro tried to replace that with lower-case or mixed-case text from the data file, the result would be all caps. That's a shortcoming of the Range.Find object. To avoid that, this macro first loops through all the occurrences of 'XYZ' and changes them to 'xyz'; then the later replacement preserves the case of the data.
    • Each time the XML file is opened, modified, and saved to another name, the macro closes it and then reopens the unmodified file to prepare for the next iteration. This opening and closing causes the status bar to flicker. If that bothers you, it can be suppressed, but I see it as a sign that the macro is still running.

  3. #3
    VBAX Regular
    Joined
    Sep 2013
    Location
    Oakland, Califorbia
    Posts
    7
    Location
    OK, that is simply amazing. Thank you so much. It would have taken me a month of Sundays to get that code written, and I probably would not have been anywhere near that efficient.

    There is a problem with the formatting, however. For some reason it stripped out all of the bracketed information and left just the values of the parameters. It is like Word recognized the XML like structure and just got rid of it. Might it be the way the files are saved? Or the method by which they are opened?

    For example, the area around the text to be changed looks like this before the macro is run:

    "<txBase>173.01</txBase><XPos>42.7166</XPos><angle>0</angle><verticalText>false</verticalText><objectLeading>0</objectLeading></BaseProperties><EnclosingObjectType>block</EnclosingObjectType><Alignment>left</Alignment><TRString>xyz</TRString>"

    After the replacement, the new title, with the text changed to "This is a test" looks like this:
    "
    173.01
    42.7166
    0
    false
    0


    block
    left
    This is a test
    "

    Perhaps the brackets are confusing Word? Is there a way to tell Word that this is just text, not formatted?

    Once again, thank you so much for doing this. I will start taking it apart line by line to understand it all. It will take a while, but will certainly be worth the effort.

  4. #4
    I didn't test my code with an actual XML file, but with just a text file that I threw together with some bracketed nodes, because I didn't expect a *.prtl file to have the line
    <?xml version="1.0" encoding="utf-8"?>
    at the top. When you use Word's Open command to open any file that starts with that line, Word automatically strips out all the XML coding. (This is a result of a patent lawsuit that Microsoft lost, http://www.zdnet.com/blog/microsoft/...soft-word/3712.)

    It took me a few false starts to figure out how to get the coding to stay. Using the Range object's InsertFile method didn't do it. I wound up using what are perhaps the oldest functions in VBA, the Open #, Line Input #, and Close # functions, in the subroutine you'll find at the end of this version. The subroutine inserts the text line by line into a new blank document. And I've tested this one with a real XML file renamed as a .prtl file...

    Option Explicit
    
    Sub EditVideoFile()
        Dim dataFile As Document    ' this data doc or template
        Dim baseFile As Document    ' file being modified
        Dim baseFileFN As String    ' path\name of file from Adobe
        Dim rgData As Range         ' range of data file containing replacements
        Dim para As Paragraph       ' paragraph in rgData
        Dim rgPattern As Range      ' a range used in baseFile
        Dim replaceText As String   ' current line from this data file
        Dim iterNum As Long         ' number of current line / saved file
        Dim saveFN As String        ' path\base name of saved file
        
        Set dataFile = ActiveDocument
        
        MsgBox "Select the data file (*.prtl) to be processed."
        With Dialogs(wdDialogFileOpen)
            .Name = "*.prtl"
            If .Display = -1 Then   ' just get the path\name, don't actually open
                baseFileFN = WordBasic.FileNameInfo(.Name, 1) ' full path\name
            Else
                Exit Sub
            End If
        End With
        
        MsgBox "Select the folder and filename (without numbers) to save as."
        With Dialogs(wdDialogFileSaveAs)
            If .Display = -1 Then   ' just get the path\name, don't actually save
                saveFN = WordBasic.FileNameInfo(.Name, 5) & WordBasic.FileNameInfo(.Name, 4) & "-"
                'MsgBox saveFN
            Else
                Exit Sub
            End If
        End With
    
        'On Error GoTo Errhdl
        iterNum = 1
        Set rgData = dataFile.Range
        rgData.MoveStart unit:=wdParagraph, Count:=1 ' exclude paragraph containing macro button
        For Each para In rgData.Paragraphs
            replaceText = Left(para.Range.Text, Len(para.Range.Text) - 1) ' exclude paragraph mark
            
            Set baseFile = Documents.Add(Visible:=False)
            InsertAsNotXML FN:=baseFileFN, baseFile:=baseFile
            
            ' force all-caps XYZ to lower-case xyz
            ' to avoid inserting text from data file in all caps
            Set rgPattern = baseFile.Range
            With rgPattern.Find
                .Text = "XYZ"
                .Wrap = wdFindStop
                While .Execute
                    rgPattern.Text = LCase(rgPattern.Text)
                Wend
            End With
            
            ' do the replacement
            baseFile.Range.Find.Execute _
                FindText:="xyz", _
                MatchCase:=False, _
                ReplaceWith:=replaceText, _
                Replace:=wdReplaceAll
            
            baseFile.SaveAs _
                FileName:=saveFN & Format(iterNum, "0000") & ".prtl", _
                FileFormat:=wdFormatText, _
                AddToRecentFiles:=False
            
            ' close the modified file without saving
            baseFile.Close SaveChanges:=wdDoNotSaveChanges
            
            iterNum = iterNum + 1
        Next para
        
        MsgBox "Done"
        Exit Sub
        
    Errhdl:
        MsgBox "Error " & Err.Number & " at line " & iterNum & vbCr & Err.Description
    End Sub
    
    Sub InsertAsNotXML(FN As String, baseFile As Document)
        Dim aLine As String
        Dim pos As Long
        
        Open FN For Input As #1
        Do While Not EOF(1)
            Line Input #1, aLine
            
            ' There may be other characters at the start of the line,
            ' so remove them.
            pos = InStr(LCase(aLine), "<?xml")
            If pos > 0 Then
                aLine = Right(aLine, Len(aLine) - pos + 1)
            End If
            
            baseFile.Range.InsertAfter aLine & vbCr
        Loop
        Close #1
    End Sub

  5. #5
    VBAX Regular
    Joined
    Sep 2013
    Location
    Oakland, Califorbia
    Posts
    7
    Location

    I did something wrong

    Jay,

    You have done me a phenomenal favor. Well, me and potentially thousands of other people who will probably want to use this macro.

    I knew it was a mistake not posting a copy of the actual file. Sorry. I can't post a link since my post count is too low, but here it is for you to copy/paste. http: // stevengotz . com / files / Title.prtl (had to fool the forum software)

    There are only two more small favors I must ask.

    The Save As dialog starts with the default file name of "Create Titles.docm" using .docm as the Save As type. How can I get it to use something like "NameOfTitle" and use the Plain Text (*.txt) format?

    I thought it was saving as the docm format but I was wrong. I must have done something else wrong. It works fine just the way it is. But I would still like to see the default name changed.

    I will set about trying to figure it out myself, but I have forgotten more than I realized. My first guess is that there are parameters to the wdDialogFileSaveAs command that I need to supply. But if you want to chime in one last time, I would not be offended.

    The second favor is simply this: Please let me know if you would like to have your name as the author of this macro show up in the beginning of the document. I would love to see you get the credit and the thanks of a grateful video editing community, but I don't want to expose you to a bunch of people asking for favors, either.
    Last edited by Steven Gotz; 09-16-2013 at 09:35 PM. Reason: change of circumstances

  6. #6
    VBAX Regular
    Joined
    Sep 2013
    Location
    Oakland, Califorbia
    Posts
    7
    Location
    OK. I thought I had it working. Then I didn't. Then I did.

    Well, I fooled myself. I replaced "xyz" with "This is a test" and it failed. I tried it again with a simple "123" and it worked. I thought I was going nuts for a while there.

    It finally got through my thick skull that there was a problem with the number of characters. It is being stored in a separate parameter named "Run Count". See below. I changed xyz to xyz1 and the run count of 4 crashed the program. It should be 5 as seen below. Apparently the program is counting an extra character of some sort since three characters requires a Run Count of 4.

    <RunLengthEncodedCharacterAttributes><CharacterAttributes RunCount="5" StyleRef="4097" TextRef="4097" TXKerning="0" TXPostKerning="0" BaselineShifting="0" /></RunLengthEncodedCharacterAttributes><tagName>

    So, now the problem is counting the number of characters. I can figure that one out. Then I have to search for the Run Count parameter. Using a modified version of the ' do the replacement section of the code, I should be able to work this out.

    If this sounds OK, great! If I am heading down the wrong path, please stop me.

    Thanks again.

    I think I will wait until I am fresh. Doing this late at night is just asking for trouble.

  7. #7
    You just need a couple of minor tweaks.

    To get the name and file type correct in the SaveAs dialog box, add these two lines after the With Dialogs(wdDialogFileSaveAs) statement:

            .Name = "NameOfTitle.txt"
            .Format = wdFormatText
    To fix the RunCount, add this section after the existing Find/Replace:

            baseFile.Range.Find.Execute _
                FindText:="RunCount=" & Chr(34) & "4" & Chr(34), _
                MatchCase:=False, _
                ReplaceWith:="RunCount=" & Chr(34) & Len(replaceText) + 1 & Chr(34), _
                Replace:=wdReplaceAll
    That replaces the "4" with one greater than the length of the string used to replace "XYZ".

    I did download your Title.prtl file and tested the code with it. As best I can tell without the Adobe software, it works.

  8. #8
    VBAX Regular
    Joined
    Sep 2013
    Location
    Oakland, Califorbia
    Posts
    7
    Location
    Jay,

    You are a CHAMP!!!

    Please do me one more favor. Find someone to pat you on the back for me please.

    I help people every day on the Adobe forums, and I remember posting code all those many years ago, on bulletin boards and then Compuserve. But this would have taken more brain power and cost me a lot of hair pulling to get this done. You just saved me so much work, and I am going to be sharing this on LinkedIn as well as on the Adobe Premiere Pro forum.

    So, as I said, I will be more than happy to give you credit for a job well done. Just let me know if it is OK to use your name. Otherwise I will just credit it to a great guy on a forum I found. That way you won't get bugged for special favors.

    Thank you, thank you, thank you. I knew this was possible, but you made it look so easy. I have to tell you though, I actually understood the second version. The first one was like Greek to me. This one makes a LOT more sense to an old coder.

    I will come back and post when I finish my tutorial on how to use this and put it on YouTube. I am just going to get so much use out of this.

  9. #9
    VBAX Regular
    Joined
    Sep 2013
    Location
    Oakland, Califorbia
    Posts
    7
    Location
    I just tested it to 9999 titles. It took 50 minutes. Not bad. Not for that much time savings. Not bad at all.

  10. #10
    Yes, you can put my name on it, with my web address, http://jay-freedman.info.

    I don't mind being asked for help, if I can occasionally say no. Like you, I've been doing this since the time when dial-up bulletin boards were the only places available.

  11. #11
    VBAX Regular
    Joined
    Sep 2013
    Location
    Oakland, Califorbia
    Posts
    7
    Location
    Jay,

    (Or anyone else who might happen to stop by...)

    Everything is working great and I have a bunch of happy, English speaking beta testers.

    The Spanish speakers? Not so much.

    Apparently, the code can't handle the diacritical marks that Spanish and some other languages use. I assume it is because we are using code for ASCII text.

    So, obviously, the question is: How do we accommodate non-ASCII text? Is it inherent in the .Text parameter or just how the String is declared?

    The text that is causing the trouble looks like this:
    Hablaré acerca de los factores
    factores podrían
    en células
    Hace 45 años

  12. #12
    Steven,

    I'm quite sure it isn't anything to do with ASCII vs. non-ASCII, because all of those characters (é, í, ñ, as well as ¿, ö, ü, and a lot of others) are within the ASCII range of character codes below 256. Further, when I pasted your examples of Spanish text into my data document and ran the macro, the correct characters appeared in the output files.

    Since Word refuses to open the output files, I've been using a text editor (UltraEdit) to look at them. When I used Notepad the first time, it came up with some odd font that showed arrows and other strange glyphs for the accented characters, but when I switched to Arial or Times New Roman they came out as expected. Maybe Adobe Premiere or whatever program you're using to open the files doesn't like ASCII values above 128?

  13. #13
    OH MY GOD

    THANK YOU THANK YOU THANK YOU... 100 beers for this man!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

    May I ask 1 little change PLEASE!!!!!!!!!!!!

    Instead of tormenting the file name. Can the file name be of the test.

    E.G (in the word file)

    Hello world /ænd / -- Please NOTE I WILL BE USING FOR STRANGER CHARACTER - I get this message, The document may contact text content that will be lost upon conversion etc etc
    Peter pan /fender/
    But face /yay/

    The files will be

    Hello world - ænd .prtl
    Peter pan - fender.prtl
    But face - yay.prtl

    THANK YOU SOOOO MUCH!!!!!!!!!!!!!!!


    Quote Originally Posted by Jay Freedman View Post
    You just need a couple of minor tweaks.

    To get the name and file type correct in the SaveAs dialog box, add these two lines after the With Dialogs(wdDialogFileSaveAs) statement:

            .Name = "NameOfTitle.txt"
            .Format = wdFormatText
    To fix the RunCount, add this section after the existing Find/Replace:

            baseFile.Range.Find.Execute _
                FindText:="RunCount=" & Chr(34) & "4" & Chr(34), _
                MatchCase:=False, _
                ReplaceWith:="RunCount=" & Chr(34) & Len(replaceText) + 1 & Chr(34), _
                Replace:=wdReplaceAll
    That replaces the "4" with one greater than the length of the string used to replace "XYZ".

    I did download your Title.prtl file and tested the code with it. As best I can tell without the Adobe software, it works.
    Last edited by authorleon; 03-08-2015 at 05:29 PM.

  14. #14
    I have fixed the special character part by
               FileName:=saveFN & Format(iterNum, "0000") & ".prtl", _
                FileFormat:=wdFormatText, _
                Encoding:=msoEncodingUnicodeLittleEndian, _
                AddToRecentFiles:=False

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •