Sorting "Encyclopedia" like document



MacroShadow
12-31-2012, 07:13 AM
Happy New Year to all!

My question this time deals with sorting text. I am working on an encyclopedia type document. The document makes heavy use of custom styles.

Each entry may span several paragraphs. I need to sort the document alphabetically by entry title, with each title followed by its body text.

I hope my intention is clear.

Thanks in advance.

Frosty
01-07-2013, 01:44 PM
Like many of your questions, I have to answer your question with a question: why do you want to do this? What is the desired output? You want to re-sort the actual document?

There are a number of different ways you could do this, but the "best" way depends on what you want the output to be. Can you give a simple before and after document, or post the code you've started with?

There are a number of posts about sorting arrays (WordBasic.SortArray has some well-documented issues), but you could also change the actual document format (depending on how it is set up) to take all of your custom-styled paragraphs, dump them into a 2-column table... with your headings in the first column, and the text associated with those headings in the 2nd column... and then sort the table.

Or you could do everything in code. Or you could dump it all into a database.

The basic process is-- grab all of the data into some kind of sortable format, then sort it, then dump it back out. Easy, right?
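As a rough illustration of that grab / sort / dump process, here is a minimal sketch done entirely in code. It rests on assumptions: every entry is assumed to begin with a paragraph in a style called "ParagraphTitle" (a style name that comes up later in this thread), everything up to the next such paragraph is treated as that entry's body, and the sub name SortEntriesIntoNewDocument is made up for the example. It writes the sorted result to a new document rather than touching the original.

```vba
Option Explicit

' Assumption: entries start with a paragraph in this style. Adjust as needed.
Private Const TITLE_STYLE As String = "ParagraphTitle"

' Hypothetical helper: copies the entries of the active document into a new
' document, sorted by title.
Public Sub SortEntriesIntoNewDocument()
    Dim docSrc As Document, docDst As Document
    Dim para As Paragraph
    Dim titles() As String, starts() As Long, ends() As Long
    Dim n As Long, i As Long, j As Long
    Dim tmpTitle As String, tmpStart As Long, tmpEnd As Long
    Dim rngDst As Range

    Set docSrc = ActiveDocument

    ' 1. Grab: record the title and the extent of each entry.
    For Each para In docSrc.Paragraphs
        If para.Style = TITLE_STYLE Then
            n = n + 1
            ReDim Preserve titles(1 To n)
            ReDim Preserve starts(1 To n)
            ReDim Preserve ends(1 To n)
            titles(n) = para.Range.Text
            starts(n) = para.Range.Start
            ends(n) = para.Range.End
        ElseIf n > 0 Then
            ' Body paragraph: extend the current entry to include it.
            ends(n) = para.Range.End
        End If
    Next para
    If n = 0 Then Exit Sub

    ' 2. Sort: insertion sort on the titles, case-insensitive.
    For i = 2 To n
        tmpTitle = titles(i): tmpStart = starts(i): tmpEnd = ends(i)
        j = i - 1
        Do While j >= 1
            If StrComp(titles(j), tmpTitle, vbTextCompare) <= 0 Then Exit Do
            titles(j + 1) = titles(j)
            starts(j + 1) = starts(j)
            ends(j + 1) = ends(j)
            j = j - 1
        Loop
        titles(j + 1) = tmpTitle: starts(j + 1) = tmpStart: ends(j + 1) = tmpEnd
    Next i

    ' 3. Dump: append each entry, with its formatting, to a new document.
    Set docDst = Documents.Add
    For i = 1 To n
        Set rngDst = docDst.Content
        rngDst.Collapse wdCollapseEnd
        rngDst.FormattedText = docSrc.Range(starts(i), ends(i)).FormattedText
    Next i
End Sub
```

Anything before the first title paragraph is skipped, and paragraphs in any other style (including other title-like styles) are carried along as body text of the preceding entry, so this would need adapting to the real document's structure.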

MacroShadow
01-07-2013, 01:47 PM
I was hoping to use something along the lines of the following:
Public Sub test()
    Dim rngDocContent As Range, rngTemp As Range
    Dim rngMyRange() As Variant
    Dim intCounter As Integer
    Set rngDocContent = ActiveDocument.Content
    intCounter = 1
    With rngDocContent.Find
        .ClearFormatting
        .Text = "~"
        .Replacement.Text = ""
        .Forward = True
        .Wrap = wdFindContinue
        .Execute
        Do While .Found = True
            Set rngTemp = rngDocContent.Duplicate
            rngTemp.Select
            Selection.Extend
            With Selection.Find
                .ClearFormatting
                .Text = "%"
                .Replacement.Text = ""
                .Forward = True
                .Wrap = wdFindAsk
                .Execute
            End With
            ReDim Preserve rngMyRange(0 To intCounter)
            Set rngMyRange(intCounter) = Selection.Range
            Selection.Start = Selection.End
            intCounter = intCounter + 1
            ReDim Preserve rngMyRange(intCounter)
            .Execute
        Loop
    End With
End Sub

Public Function Sort(ByRef str() As String, ByVal booAsc As Boolean)

    Dim iLower As Integer, iUpper As Integer, iCount As Integer
    Dim str2 As String, Temp As String

    iUpper = UBound(str)
    iLower = 1

    Dim bSorted As Boolean
    bSorted = False
    Do While Not bSorted
        bSorted = True
        For iCount = iLower To iUpper - 1
            If booAsc Then
                str2 = StrComp(str(iCount + 1), str(iCount), vbTextCompare)
            Else
                str2 = StrComp(str(iCount), str(iCount + 1), vbTextCompare)
            End If
            If str2 = 1 Then
                Temp = str(iCount + 1)
                str(iCount + 1) = str(iCount)
                str(iCount) = Temp
                bSorted = False
            End If
        Next iCount
        iUpper = iUpper - 1
    Loop
End Function

gmaxey
01-07-2013, 02:29 PM
One easy way is to sort headings using OUTLINE view:

Example Text:
Zebras (Heading 1)
Black and white horsey looking animals
Apes (Heading 1)
Cousin to man and monkey
Dog (Heading 1)
Furry friends

Shift to outline view. Show level 1, then select and sort by paragraph. Show all levels, shift back to normal view.

Frosty
01-07-2013, 03:53 PM
Agreed, there are methods through the user interface that will let you sort on the fly.

But, MacroShadow, your description of the problem and the code you posted don't really have a relationship. You said you had custom styles, but your code doesn't deal with that at all... is this just code you copied and pasted from somewhere else, or have you properly applied styles but are using tildes and percent characters as defining what text is associated with what?

I suggest you post a 2 page document... page 1 with some dummy text but accurate use of styles, unsorted... and page 2 being the result of the sort process you'd like to have happen.

Your sort method is a bubble sort, which may be fine: bubble sorts give correct results, although they can be slow on long lists.

But your main methodology appears to be building an array of ranges, and then sorting that array based on the first character of each range. That probably works, but I'm questioning how you build your array of ranges, since your use of the Selection object doesn't appear useful to me (and will, in fact, slow you down a good bit).
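For illustration, here is a minimal sketch of collecting those ranges without the Selection object. It assumes, going only by the posted code, that each entry starts at a "~" and ends at the next "%"; the sub name CollectDelimitedRanges is made up for the example. It uses a wildcard search on a Range, so nothing is selected on screen.

```vba
Option Explicit

' Hypothetical helper: gathers every "~ ... %" span into a Collection of
' Range objects without using the Selection object.
' Assumption: "~" opens an entry and "%" closes it.
Public Sub CollectDelimitedRanges()
    Dim rngSearch As Range
    Dim colEntries As Collection

    Set colEntries = New Collection
    Set rngSearch = ActiveDocument.Content

    With rngSearch.Find
        .ClearFormatting
        .Text = "~*%"              ' wildcard pattern: "~", any text, "%"
        .MatchWildcards = True
        .Forward = True
        .Wrap = wdFindStop         ' stop at the end instead of looping around
        Do While .Execute
            ' After a successful Execute, rngSearch covers the match.
            colEntries.Add rngSearch.Duplicate
            ' Continue searching from the end of this match.
            rngSearch.Collapse wdCollapseEnd
        Loop
    End With

    Debug.Print colEntries.Count & " entries found"
End Sub
```

Using wdFindStop rather than wdFindContinue keeps the loop from wrapping back to the top of the document and finding the same entries again.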

MacroShadow
01-08-2013, 12:24 AM
Greg,

I couldn't get your suggestion to work. Anyway, I would prefer to do it via VBA.

Frosty,

I hadn't made use of styles in the code, because I couldn't figure out how, but I mentioned it because it does seem important.

Attached please find the sample document.

gmaxey
01-08-2013, 04:43 AM
It didn't work because your headings are not built-in headings, which are outlined automatically. This may work for you.

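Sub ScratchMacro()
    Dim lngView As Long
    Dim oRng As Word.Range
    ' Give the custom title styles an outline level so outline view treats them as headings.
    ActiveDocument.Styles("LetterTitle").ParagraphFormat.OutlineLevel = wdOutlineLevel1
    ActiveDocument.Styles("ParagraphTitle").ParagraphFormat.OutlineLevel = wdOutlineLevel1
    ' Remember the current view, then collapse outline view to level 1 only.
    lngView = Application.ActiveWindow.View.Type
    Application.ActiveWindow.View.Type = wdOutlineView
    Application.ActiveWindow.View.ShowHeading 1
    ' Sort the collapsed headings; each heading carries its body text along.
    Set oRng = ActiveDocument.Range
    oRng.Select
    Selection.Sort ExcludeHeader:=False, FieldNumber:="Paragraphs", _
        SortFieldType:=wdSortFieldAlphanumeric, SortOrder:=wdSortOrderAscending
    ' Expand all levels again and restore the view that was active before.
    Application.ActiveWindow.View.ShowAllHeadings
    Application.ActiveWindow.View.Type = lngView
End Sub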

MacroShadow
01-09-2013, 10:11 AM
Greg,

Thanks. While it did work for the sample document, it totally messed up a different document (which I cannot upload). I will try analyzing the document to find the issue. In the meantime, I'd appreciate alternative options.

Frosty
01-09-2013, 10:17 AM
MacroShadow... if code works on a sample document but doesn't work on a real document, I think you should probably work up a new sample before asking for additional code.

The important "lessons" from this process are the following:
1. How you identify a single "encyclopedia" entry. Whether it is style based or something else.
2. *You* researching the .ParagraphFormat.OutlineLevel property, and the implications of that for your real documents. Properly setting up your custom styles with an appropriate Outline Level is critical to the kind of operation you desire (especially when working with a document like this without VBA-- the Document Map is hugely useful for long encyclopedic-type documents).
3. Record a macro which simply finds the first instance of a Style -- you'll see how to use a style with the find object (which is an alternative approach to what Greg is suggesting).
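For reference, a recorded macro for point 3 would look roughly like the sketch below once it is tidied up to use a Range instead of the Selection. The style name "ParagraphTitle" is taken from Greg's macro and is an assumption about the real document; the sub name FindFirstTitle is made up for the example.

```vba
Option Explicit

' Hypothetical example: finds the first run of text formatted with a given
' style, using the Find object on a Range.
' Assumption: the document contains a style named "ParagraphTitle".
Public Sub FindFirstTitle()
    Dim rng As Range

    Set rng = ActiveDocument.Content
    With rng.Find
        .ClearFormatting
        .Text = ""                 ' no text criterion, search by formatting only
        .Style = ActiveDocument.Styles("ParagraphTitle")
        .Format = True
        .Forward = True
        .Wrap = wdFindStop
        If .Execute Then
            ' rng now covers the text that was found.
            Debug.Print "First title: " & rng.Text
        Else
            Debug.Print "No text uses that style."
        End If
    End With
End Sub
```

Collapsing rng to its end and calling .Execute again in a loop would step through every title in turn, which is one way to identify where each entry begins.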

Personally, I would look at what Greg's suggesting before I'd look at what I'm suggesting. His code is considerably simpler than what my methodology would require. If all you need is some additional code to 'tweak' the styles in your "real" document in order to get his methodology to work-- I'd do that, instead of researching alternatives.

gmaxey
01-09-2013, 11:19 AM
MacroShadow,

I'm bowing out. I've offered you a simple solution and don't have the time or desire to try to build better mouse traps.

