Solved: Need to Extract File Name from String



JohnnyBravo
05-08-2010, 09:52 AM
I've got a list of downloads that I'm trying to keep track of. I need the file names in a new Word document. What I have is a simple list:

http://domain_name/0249.022d68278b05d70a10585e7303/rar.html done 411.42 MB
http://domain_name/1991.106d4da5421d659c8024ac5fbf/dsf545.rar.html 383 MB
http://domain_name/1288.f979f962afc34242b5e778ba23/TL_E21.wmv.html done 317.21 MB
http://domain_name/3840.3b1d727d3f3aa9d923e3cb561/ilms_armstrong.mpg.html 359.42 MB
http://domain_name/1631.149a5b233c45b0a8c9eb9a011/molod36.rar.html done 30 MB
http://domain_name/4946.4b13098d07b8f776b6133fb4684ab395/jessica_mazury_hd.avi.html 431
http://domain_name/3683.3fa87165c8c418cb4b4e72f692f46546/rar.html 419.22 MB

What I'm looking for is a new Word document that looks like this:
rar 411.42 MB
dsf545.rar 383 MB
TL_E21.wmv 317.21 MB
ilms_armstrong.mpg 359.42 MB
molod36.rar 305 MB
jessica_mazury_hd.avi 431 MB
rar 419.22 MB

Notice that the file name always comes just before the .html at the end of the URL string. I'm a VBA newbie, so I have no idea how to get this started.

TonyJollans
05-08-2010, 11:13 AM
You don't need VBA - you can do it with a Find and Replace:

Copy the list to a new file, hit Ctrl+H, and then ...

Find what: http*/([!/]@).html
Replace with: \1

Check "Use wildcards" and hit "Replace All".
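For anyone who would rather run this from a macro, the same wildcard search can be sketched in VBA roughly as follows. The pattern and replacement text are the ones given above; the Sub name is made up for illustration, and the macro is assumed to run against the active document (so work on a copy of the list).

```vba
' Illustrative sketch only: the Sub name is hypothetical.
' It applies the wildcard Find and Replace described above
' to the whole of the active document.
Sub ReplaceUrlsWithFileNames()
    With ActiveDocument.Content.Find
        .ClearFormatting
        .Replacement.ClearFormatting
        .Text = "http*/([!/]@).html"   ' "Find what" pattern from the post
        .Replacement.Text = "\1"       ' keep only the captured file name
        .Forward = True
        .Wrap = wdFindStop
        .MatchWildcards = True         ' same as ticking "Use wildcards"
        .Execute Replace:=wdReplaceAll
    End With
End Sub
```

Everything after the .html on each line is left untouched, because the pattern stops matching at that point.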

JohnnyBravo
05-08-2010, 03:20 PM
Thanks Tony. I've been using F & R for a long time - didn't know it had such a capability. It worked out just fine.

Just out of curiosity, let me ask about a different scenario. If I wanted a list that looked like the following, could F & R produce it?

rar done 411.42 MB
dsf545.rar 383 MB
TL_E21.wmv done 317.21 MB
ilms_armstrong.mpg 359.42 MB
molod36.rar done 305 MB
jessica_mazury_hd.avi 431 MB
rar 419.22 MB

Essentially I'm wondering if the F & R you gave me could be modified so that the word "done" could be retained just after the file name.

TonyJollans
05-08-2010, 04:16 PM
Hmm! I hadn't noticed the 'done's before. What I gave you does actually keep them, along with everything else after the file name, because all it changes is the URL, which it swaps for the file name. Dropping them would be more involved (but it should be possible with some kind of numeric pattern if you want that option as well).
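As a rough illustration of the two variants (keeping or dropping the word "done"), here is a minimal VBA sketch that works on a single line of text. It uses plain string functions rather than the numeric wildcard pattern mentioned above. The function name and its KeepDone parameter are invented for this example, and it assumes "done" only ever appears as a separate word between the file name and the size.

```vba
' Hypothetical helper, not part of the thread.
' Returns the text after the last "/" with ".html" removed,
' optionally dropping a standalone "done" as well.
Function ExtractEntry(ByVal LineText As String, _
                      ByVal KeepDone As Boolean) As String
    Dim Offset As Long
    Dim Result As String

    ' Position of the last "/" (0 if the line has none).
    Offset = InStrRev(LineText, "/")
    Result = Replace(Mid(LineText, Offset + 1), ".html", "")

    ' Assumption: "done" is always surrounded by single spaces.
    If Not KeepDone Then
        Result = Replace(Result, " done ", " ")
    End If

    ExtractEntry = Result
End Function
```

Called from the Immediate window with the third line of the list, for example, ExtractEntry("http://domain_name/1288.f979f962afc34242b5e778ba23/TL_E21.wmv.html done 317.21 MB", False) should give "TL_E21.wmv 317.21 MB", and passing True instead should leave the "done" in place.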

Tinbendr
05-10-2010, 08:40 PM
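Sub GetTrailingData()

    Dim A As Long
    Dim Offset As Long

    With ActiveDocument
        ' Work through the paragraphs from last to first.
        For A = .Paragraphs.Count To 1 Step -1
            With .Paragraphs(A).Range
                ' Position of the last "/" in the line (0 if there is none).
                Offset = InStrRev(.Text, "/")
                ' Keep what follows that slash and drop the ".html".
                .Text = Replace(Mid(.Text, Offset + 1, Len(.Text)), ".html", "")
            End With
        Next A
    End With

End Sub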

fumei
05-11-2010, 11:17 AM
Nice, Tinbendr, very nice. Clean and efficient.

fumei
05-11-2010, 11:45 AM
And as an alternative, using Tinbendr's starting point...

Sub GetTrailingData_B()
    Dim Offset As Long
    Dim oPara As Paragraph
    For Each oPara In ActiveDocument.Paragraphs()
        With oPara
            Offset = InStrRev(.Range.Text, "/")
            If Offset <> 0 Then
                .Range.Text = _
                    Replace(Mid(.Range.Text, Offset + 1, _
                    Len(.Range.Text)), ".html", "")
            End If
        End With
    Next oPara
End Sub
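
Both macros above rewrite the list in place. Since the original request was for the names to end up in a new Word document, a variation along the following lines might be closer to it. This is an untested sketch: the Sub name and the variable names are made up, and it assumes the list is in the active document when the macro starts.

```vba
' Illustrative sketch only: names are hypothetical.
' Leaves the source list untouched and writes the trimmed
' lines into a newly created document instead.
Sub ExtractToNewDocument()
    Dim Source As Document
    Dim Target As Document
    Dim oPara As Paragraph
    Dim LineText As String
    Dim Offset As Long

    Set Source = ActiveDocument      ' assumed to hold the list
    Set Target = Documents.Add       ' new blank document

    For Each oPara In Source.Paragraphs
        LineText = oPara.Range.Text  ' includes the paragraph mark
        Offset = InStrRev(LineText, "/")
        ' Skip lines with no "/" (blank lines, headings and so on).
        If Offset <> 0 Then
            Target.Content.InsertAfter _
                Replace(Mid(LineText, Offset + 1), ".html", "")
        End If
    Next oPara
End Sub
```

Because each paragraph's text carries its own paragraph mark, every trimmed entry should land on its own line in the new document.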

JohnnyBravo
05-12-2010, 11:59 PM
Works great. My thanks to all of you.