Need to apply heading formatting to first line of blocks of text



passinthru
01-22-2012, 02:41 PM
Oh, wait, it's not that simple. :) I hope someone will see this as an interesting challenge and take a crack at it.


Here's my problem:

I have a document which is a scan of an oooollllld songbook. (1800s, actually. Yes, it's in the public domain.) There are over 300 songs in it. I want to build a table of contents for it.

Each song is one or more blocks of text, like this:

Lorem ipsum dolor sit amet, consectetur adipiscing elit.
Suspendisse sagittis diam id ligula blandit eget pretium.
Stellus pellentesque. Morbi adipiscing ligula sed felis
Pretium a feugiat magna molestie. Nunc molestie ante id

Urna bibendum feugiat. Donec sit amet tortor at ipsum
Mattis fermentum ac sagittis felis liquam volutpat tellus
Ut lacus placerat vestibulum a sit amet justo. Proin congue
Vehicula justo vitae tristique. Maecenas ut eros erat, eu.

Suscipit tellus. Phasellus mollis, massa ultrices bibendum
Egestas, ligula est semper est, eget feugiat nibh semend
Magna. Nam nec mi dui. Phasellus molestie sodales odio nec
Consequat. Pellentesque eleifend metus, at consectetur


IN BETWEEN each song is a line with a number (1 to 300-something), followed by 4 spaces, followed by mixed text & numbers.

Then a blank line, then the song, followed by a blank line.


I would like to apply Heading 1 to the first line of each song, and then Heading 2 to the first line of each stanza after the first. Like this:

9 9.9.8.4. ABC

Lorem ipsum dolor sit amet, consectetur adipiscing elit.
Suspendisse sagittis diam id ligula blandit eget pretium.
Stellus pellentesque. Morbi adipiscing ligula sed felis
Pretium a feugiat magna molestie. Nunc molestie ante id

Urna bibendum feugiat. Donec sit amet tortor at ipsum
Mattis fermentum ac sagittis felis liquam volutpat tellus
Ut lacus placerat vestibulum a sit amet justo. Proin congue
Vehicula justo vitae tristique. Maecenas ut eros erat, eu.

Suscipit tellus. Phasellus mollis, massa ultrices bibendum
Egestas, ligula est semper est, eget feugiat nibh semend
Magna. Nam nec mi dui. Phasellus molestie sodales odio nec
Consequat. Pellentesque eleifend metus, at consectetur

1 8.4.5. TDS

More lorem ipsum sit amet, consectetur adipiscing elit.
Suspendisse sagittis diam id ligula blandit eget pretium.
Stellus pellentesque. Morbi adipiscing ligula sed felis
Pretium a feugiat magna molestie. Nunc molestie ante id

Urna bibendum feugiat. Donec sit amet tortor at ipsum
Mattis fermentum ac sagittis felis liquam volutpat tellus
Ut lacus placerat vestibulum a sit amet justo. Proin congue
Vehicula justo vitae tristique. Maecenas ut eros erat, eu.

Suscipit tellus. Phasellus mollis, massa ultrices bibendum
Egestas, ligula est semper est, eget feugiat nibh semend
Magna. Nam nec mi dui. Phasellus molestie sodales odio nec
Consequat. Pellentesque eleifend metus, at consectetur




Once I have it formatted that way, I can easily create the TOC, and an additional "index" to the first lines of each stanza.


I have no clue about VBA for Word. I have done some simple macros in Excel, and some shell programming, but I am a looong way from being a programmer.

Is anyone interested in walking me through this, or creating a macro that will work?

rruckus
01-23-2012, 03:14 PM
This is pretty easy: just loop through each paragraph, check its character style or text, and apply the paragraph style or OutlineLevel. Is your text already formatted exactly like the sample?
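
A minimal sketch of that loop, assuming every line of the songbook is its own paragraph, number lines start with one to three digits followed by four spaces, and stanzas are separated by empty paragraphs. The names ApplySongHeadings and IsNumberLine are made up for illustration, and any text before the first number line would come out as Heading 2.

```vb
Sub ApplySongHeadings()
    ' Sketch only: one pass over the paragraphs, styling stanza first lines.
    Dim para As Paragraph
    Dim txt As String
    Dim atBlockStart As Boolean   ' True when the previous paragraph was blank
    Dim firstStanza As Boolean    ' True from a number line until Heading 1 is applied

    atBlockStart = True
    firstStanza = False

    For Each para In ActiveDocument.Paragraphs
        ' Drop the trailing paragraph mark before testing the text
        txt = para.Range.Text
        txt = Left$(txt, Len(txt) - 1)

        If Len(Trim$(txt)) = 0 Then
            ' Blank line: the next non-blank paragraph starts a block
            atBlockStart = True
        ElseIf atBlockStart And IsNumberLine(LTrim$(txt)) Then
            ' Number line: leave its style alone; the next block is a first stanza
            firstStanza = True
            atBlockStart = False
        ElseIf atBlockStart Then
            If firstStanza Then
                para.Style = ActiveDocument.Styles("Heading 1")
                firstStanza = False
            Else
                para.Style = ActiveDocument.Styles("Heading 2")
            End If
            atBlockStart = False
        End If
    Next para
End Sub

Private Function IsNumberLine(ByVal txt As String) As Boolean
    ' Assumed layout: one to three digits, then four spaces
    Dim gap As String
    gap = Space$(4)
    IsNumberLine = (txt Like "#" & gap & "*") _
        Or (txt Like "##" & gap & "*") _
        Or (txt Like "###" & gap & "*")
End Function
```

Because the number lines are only used as markers and never restyled, they stay out of a TOC built from Heading 1.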

passinthru
01-23-2012, 07:32 PM
It's currently all plain text. I want the first line of the first stanza of each song to be Heading1, and the first line of the remaining stanzas (if any), to be Heading2.

The number of stanzas is quite variable, from 1 to perhaps a dozen or so.

entwined
01-23-2012, 08:36 PM
Hi there,

This one's tough indeed. But I recorded a lot of macros, edited the code a little, and ended up with this:


Sub Macro1()

    ' Reset the whole document to Normal
    Selection.WholeStory
    Selection.Style = ActiveDocument.Styles("Normal")

    ' Pass 1: find each number line and apply Heading 1 two paragraphs below it
    Selection.HomeKey Unit:=wdStory
    Selection.Find.ClearFormatting
    With Selection.Find
        .Text = "^#.^#"
        .Forward = True
        .Wrap = wdFindStop
    End With
    Selection.Find.Execute

    Do While Selection.Find.Found = True
        Selection.MoveDown Unit:=wdParagraph, Count:=2
        Selection.Style = ActiveDocument.Styles("Heading 1")

        Selection.Find.ClearFormatting
        With Selection.Find
            .Text = "^#.^#"
            .Forward = True
            .Wrap = wdFindStop
        End With
        Selection.Find.Execute

    Loop

    ' Pass 2: temporarily color each number line red so that pass 3 does not match it
    Selection.HomeKey Unit:=wdStory
    Selection.Find.ClearFormatting
    With Selection.Find
        .Text = "^#.^#"
        .Forward = True
        .Wrap = wdFindStop
    End With
    Selection.Find.Execute

    Do While Selection.Find.Found = True
        Selection.MoveUp Unit:=wdParagraph, Count:=1
        Selection.MoveDown Unit:=wdParagraph, Count:=1, Extend:=wdExtend
        Selection.Font.Color = wdColorRed
        Selection.MoveDown Unit:=wdParagraph, Count:=1

        Selection.Find.ClearFormatting
        With Selection.Find
            .Text = "^#.^#"
            .Forward = True
            .Wrap = wdFindStop
        End With
        Selection.Find.Execute

    Loop

    ' Pass 3: a blank line followed by Normal, automatic-color text starts a later stanza
    Selection.HomeKey Unit:=wdStory
    Selection.Find.ClearFormatting
    Selection.Find.Style = ActiveDocument.Styles("Normal")
    Selection.Find.Font.Color = wdColorAutomatic
    With Selection.Find
        .Text = "^p^p^?"
        .Forward = True
        .Wrap = wdFindStop
    End With
    Selection.Find.Execute

    Do While Selection.Find.Found = True
        ' Collapse the selection into the first line of the stanza
        Selection.MoveRight
        Selection.MoveLeft
        Selection.Style = ActiveDocument.Styles("Heading 2")

        Selection.Find.ClearFormatting
        Selection.Find.Style = ActiveDocument.Styles("Normal")
        Selection.Find.Font.Color = wdColorAutomatic
        With Selection.Find
            .Text = "^p^p^?"
            .Forward = True
            .Wrap = wdFindStop
        End With
        Selection.Find.Execute

    Loop

    ' Remove the temporary red color and return to the top of the document
    Selection.WholeStory
    Selection.Font.Color = wdColorAutomatic
    Selection.HomeKey Unit:=wdStory

End Sub


I only tried it on your sample. I don't know whether it will work perfectly on your actual document, because the document has to be consistent: exactly one blank line between blocks (no extra blank lines), and every number line following the same pattern. It's worth a try though. :)

passinthru
01-24-2012, 07:31 PM
Many thanks! :)

Just to be clear, the lines with the numbers all start with a numeral, then 4 spaces. After that there's no consistency. The number of characters varies.

I can manage to get rid of any double blank lines with a simple search and replace. Since a single blank line is two paragraph marks in a row, a global find for three paragraph marks, replaced with two and repeated until none are left, should do it.
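
A sketch of that cleanup as a macro, assuming each line ends in a paragraph mark. It searches for three marks in a row and replaces them with two, so the single blank lines between stanzas survive. The macro name and the cap of 20 passes are arbitrary choices, not anything Word requires.

```vb
Sub CollapseExtraBlankLines()
    ' Sketch only: reduces runs of blank lines to a single blank line.
    Dim rng As Range
    Dim pass As Long
    Dim found As Boolean

    ' Cap the number of passes in case a run at the very end of the
    ' document cannot be shortened any further
    For pass = 1 To 20
        Set rng = ActiveDocument.Content
        With rng.Find
            .ClearFormatting
            .Replacement.ClearFormatting
            .Text = "^p^p^p"
            .Replacement.Text = "^p^p"
            .Forward = True
            .Wrap = wdFindStop
            .Format = False
            .MatchWildcards = False
            found = .Execute(Replace:=wdReplaceAll)
        End With
        If Not found Then Exit For
    Next pass
End Sub
```

Running it on a copy of the document first is the safer way to check the result.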

Hmmm. If I give the number of songs, would that help? Since it's a known number, perhaps just go to #1, find the blank line, find the next line of text, make it Heading 1, then find the next blank line, make the line after it Heading 2, then loop until the next number?

The total number is 300-something. Let's call it 350 for now.

Does that help?

entwined
01-24-2012, 09:56 PM
You mean to say those line numbers are like this?

2[space][space][space][space]2.4.6.8. ABC

If these 4 spaces are fixed, then the macro should work just fine. Just replace "^#.^#" with "^#[space][space][space][space]" (typed as four real spaces) in the code that I gave you.
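
A small sketch of one such search with the new pattern, in case the spaces get lost when copying from the forum. Building the text with Space$(4) keeps the four spaces explicit; the macro name FindNextNumberLine is only for illustration.

```vb
Sub FindNextNumberLine()
    ' Sketch only: "^#" matches any digit in a non-wildcard search,
    ' and Space$(4) supplies the four literal spaces that follow it
    With Selection.Find
        .ClearFormatting
        .Text = "^#" & Space$(4)
        .Forward = True
        .Wrap = wdFindStop
        .MatchWildcards = False
        .Execute
    End With
End Sub
```

The same .Text value would go into each of the With Selection.Find blocks that currently search for "^#.^#".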

passinthru
01-25-2012, 04:29 AM
That's correct. I'll give it a try today!

macropod
01-26-2012, 03:39 AM
Hi passinthru,

Try:
Sub Demo()
    Application.ScreenUpdating = False
    Dim RngDoc As Range, Rng As Range
    Set RngDoc = ActiveDocument.Range
    With RngDoc.Find
        .ClearFormatting
        .Replacement.ClearFormatting
        ' Strip spaces that follow a paragraph mark (leading spaces on a line)
        .Text = "(^13)[ ]{1,}"
        .Replacement.Text = "\1"
        .Forward = True
        .Wrap = wdFindStop
        .Format = True
        .MatchCase = True
        .MatchWholeWord = False
        .MatchWildcards = True
        .MatchSoundsLike = False
        .MatchAllWordForms = False
        .Execute Replace:=wdReplaceAll
        ' Heading 2: a digit-free line that follows two or more paragraph marks
        .Text = "[^13]{2,}[!0-9]@^13"
        .Replacement.Text = ""
        Do While .Execute
            Set Rng = RngDoc.Duplicate
            With Rng
                ' Skip the two leading paragraph marks
                .Start = .Start + 2
                .Style = "Heading 2"
            End With
        Loop
        ' Restart from the top of the document
        RngDoc.Start = ActiveDocument.Range.Start
        ' Heading 1: from a line starting with a digit through the next line of text
        .Text = "^13[0-9][!^13]@[^13]{1,}[!^13]@^13"
        Do While .Execute
            Set Rng = RngDoc.Duplicate
            With Rng
                ' Skip the leading paragraph mark
                .Start = .Start + 1
                .Style = "Heading 1"
            End With
        Loop
        RngDoc.Start = ActiveDocument.Range.Start
        ' Collapse runs of paragraph marks into one, removing the blank lines
        .Text = "(^13){2,}"
        .Replacement.Text = "^p"
        .Execute Replace:=wdReplaceAll
    End With
    Application.ScreenUpdating = True
End Sub

passinthru
01-29-2012, 03:43 PM
Thank you, Paul. That worked perfectly.... except..... (you knew that was coming, right?)

Your macro set every first line to Heading1, but also set the number line to Heading1.

So, my TOC looks like this:


9 9.9.8.4. ABC

Lorem ipsum dolor sit amet, consectetur adipiscing elit.

1 8.4.5. TDS

More lorem ipsum sit amet, consectetur adipiscing elit.

macropod
01-29-2012, 03:55 PM
Hi passinthru,

I thought that's what you wanted:

"I would like to apply Heading 1 to the first line of each song, and then Heading 2 to the first line of each stanza after the first"

What Style is supposed to be applied to the numbered lines?

entwined
01-29-2012, 04:11 PM
There will be no styles applied to the number lines. Right, passinthru? Have you tried my code? First replace "^#.^#" with "^#[4 spaces]". Hope it works for you. :)

macropod
01-29-2012, 04:47 PM
Hi entwined,

As you'll see from passinthru's posts, the numbered lines have been bolded, which implies a Style with that attribute (or direct formatting - spit!). In any event, there's no such thing as 'no style'.

entwined
01-29-2012, 05:04 PM
Macropod,

OK, I'm sorry about that. Yes, the number lines are bold in his post, but I think what he really meant is that he wants styles applied only to the first lines of the stanzas, with the number lines keeping their original style. Hope this clears everything up... :)

passinthru
01-29-2012, 07:14 PM
You are correct, entwined. The style is not applied to the number lines.

Sorry for the confusion!

entwined
01-29-2012, 11:13 PM
Ok good. Try my code then... :)