Get the full Chapter Number



stephan.zehr
03-11-2008, 10:37 AM
Hi,

I am writing a script to export all comments of a Word document to an Excel comment sheet.

The approach is simple: at the moment I extract information such as Author, Page, and Comment Text by looping over all comments.

But one thing is missing: a chapter reference for the comment (the chapter number), because the page is often not precise enough.

I know that with


ActiveDocument.Comments(1).Scope


I get the text that the comment refers to.

So any idea to extract the full chapter numbering string?

e.g.

2.3.5 Headline 235
Bla Bla Bla
Bla Bla Bla <- Comment

I need 2.3.5.

thx

Stephan

---- code ---



For Each c In ActiveDocument.Comments
    xlsObj.value(rowIndex, 1) = c.Initial
    xlsObj.value(rowIndex, 2) = c.Reference.Information(wdActiveEndAdjustedPageNumber)
    'xlsObj.value(rowIndex, 3) = ??? Chapter Number
    xlsObj.value(rowIndex, 5) = c.Range.Text
    rowIndex = rowIndex + 1
Next c

fumei
03-12-2008, 10:05 AM
More details required.

"So any idea to extract the full chapter numbering string?

e.g.

2.3.5 Headline 235
Bla Bla Bla
Bla Bla Bla <- Comment"

If 2.3.5 is a string, then it is just a string. Word does not know anything about "chapters". It is a meaningless term to Word.

stephan.zehr
03-12-2008, 10:28 AM
"If 2.3.5 is a string, then it is just a string. Word does not know anything about "chapters". It is a meaningless term to Word."

Well, I did not say it is simple, but it should be possible to "calculate" the chapter number, shouldn't it? That's why I am asking here. :)

fumei
03-12-2008, 11:24 AM
Calculate with what? Word does not know what "chapter" is. What is a "chapter"?

I will repeat. If 2.3.5 is just a string, then it is just a string. This is why I am asking.

If it is NOT just a string - that is, you are using a list style which you are not saying if you are - then you can get the value using ListString.

Say:

2.3.5 Headline 235

is using a list style, and say it is paragraph 34, then:
MsgBox ActiveDocument.Paragraphs(34).Range.ListFormat.ListString
would return "2.3.5"

Or, say you have various paragraphs using a list style - which you do not say you are using....:

1. Headline
2. a;jdfa;jdfa;
2.1. sdfnslfn
2.1.1. Headline
2.1.2. yadda tadada
2.1.2.1. Headline

then code like:
Dim r As Range
Set r = ActiveDocument.Range
With r.Find
    Do While .Execute(FindText:="Headline", Forward:=True) = True
        MsgBox r.ListFormat.ListString
    Loop
End With


would return:

1 - the first "Headline"
2.1.1 - the second "Headline"
2.1.2.1 - the third "Headline"

"Chapter" is a meaningless term. Word does not know what a chapter is, so how can it "calculate" one? It does know what list numbers are (see examples above); it does know what Sections are - if you have your chapters in their own Section, then sure it can "calculate" what Section something is in....but "chapter" is still meaningless to Word.

stephan.zehr
03-13-2008, 07:55 AM
"Calculate with what? Word does not know what "chapter" is. What is a "chapter"?"


Yes, I saw that Word's (meta)model is very generic here.
My definition is:
A Chapter start with a Headline, IMHO even this is not in the model of word available. But how generate Word a "table of content"?



then code like:
Dim r As Range
Set r = ActiveDocument.Range
With r.Find
    Do While .Execute(FindText:="Headline", Forward:=True) = True
        MsgBox r.ListFormat.ListString
    Loop
End With


would return:

1 - the first "Headline"
2.1.1 - the second "Headline"
2.1.2.1 - the third "Headline"


Well, but only in your version, not in a German one, where headlines are called "Überschrift".

Is there another way to identify headlines?


"Chapter" is a meaningless term. Word does not know what a chapter is, so how can it "calculate" one?
Well, this is the reason: it does not know anything about a chapter or chapter number (the TOC being the exception). Therefore I am searching for a script that does this. I am just wondering why it seems nobody has had this problem before, or maybe I just did not find working code.

So I started to write my own, but it is not mature enough yet.

fumei
03-13-2008, 09:40 AM
"Is there another way to identify headlines?"

Other way? I can not repeat this again. WHAT is a headline? Word does not know what a headline is, or a chapter is...unless you tell it something.

You tell it by using Sections (which Word does understand), or Styles (which Word also understands).
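For the built-in heading styles, one language-independent way to recognise "headlines" is the paragraph's outline level rather than the style's display name. A minimal sketch, assuming the headings use Word's built-in Heading 1-9 styles (or custom styles with an outline level set); the procedure name is made up for illustration:

```vba
' Sketch: print the number and text of every paragraph that Word treats
' as a heading, judged by outline level instead of by style name.
' ListHeadingNumbers is a hypothetical name, not part of the thread's code.
Sub ListHeadingNumbers()
    Dim para As Paragraph
    For Each para In ActiveDocument.Paragraphs
        ' Body text has wdOutlineLevelBodyText; levels 1-9 are headings.
        If para.OutlineLevel <> wdOutlineLevelBodyText Then
            ' ListString is empty when the heading is not numbered.
            Debug.Print para.Range.ListFormat.ListString; vbTab; _
                Replace(para.Range.Text, vbCr, "")
        End If
    Next para
End Sub
```

In a German installation the style is displayed as "Überschrift 1", but its outline level is the same. ActiveDocument.Styles(wdStyleHeading1) likewise resolves the built-in style whatever its local name is.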

"A Chapter start with a Headline, IMHO even this is not in the model of word available. But how generate Word a "table of content"?"

WHAT is a Headline? Word has no idea what that means.

"Well, but only in your version, not in a German one, where headlines are called "Überschrift"."

I am repeating myself, and I will have to let this go because I do not seem to be getting through. I have asked - repeatedly - whether Headlines (or Überschrift....who cares???) is a string. Is it only text? Is it using a style? You have not bothered to answer.

I have told you - and shown you - how to get 2.3.5 IF, repeat IF, you are using a list style. No answer.

You do not understand how Word works. I will say for the last time...

"A Chapter start with a Headline, IMHO even this is not in the model of word available. "

Maybe in your document, in YOUR mind, a chapter starts with a Headline...but Word does not know what a chapter is, nor a headline, nor an Überschrift (if Überschrift is just text).

Word knows lists, it knows Sections.

How does Word generate a Table of Contents? Like this:

Word generates a TOC by Style. These can be the default styles (Heading 1, Heading 2....), or whatever styles you tell it to use. But in ANY case, it uses styles to generate a TOC. Just for an example, in an instructor manual for a course, I have the module title use the style ModuleTitle, the named main chunks of each module use ModuleHeading style, and the teaching point chunks use TeachingPoint style.

I changed the TOC to generate using those styles, to get:

stephan.zehr
03-14-2008, 04:49 AM
:) ... ok I see, I frustrate you ... sorry.

I really do understand that Word does not know about "Headlines" and "Chapters".



"A Chapter start with a Headline, IMHO even this is not in the model of word available. But how generate Word a "table of content"?"

WHAT is a Headline? Word has no idea what that means.


When I speak about headlines and chapters, I am referring to a real document structure. And I am pretty sure you know how documents are usually structured.

Because Word does not know anything about this, I have this problem.

As I said, a real document has chapters (even if Word does not care), and I am searching for a script which is able to analyze this stupid Word object model and try to collect the string which I would interpret as the chapter number of a comment or of the start of a selected range.

It does not help to say several times that Word doesn’t natively know about this stuff.

I am not stupid!

Yes, sorry, I already know that in Word a headline is just a paragraph with a predefined style, and yes, I know the headline numbers are just list numbering.

So, nevertheless, I have a document written in Word (because I have to use it) and I have to extract chapter numbers even if Word doesn’t know them.
At least I know what they are, and so does e.g. LaTeX.

So the answer I expected is something like:

Here is a script which tries to detect chapter numbers, but it has these restrictions and works in only about 70% of typical test cases for this reason.
That is the reason why I ask on a VBA forum site.

What I didn’t expect is a long description of what Word is not able to do, what Word does not know, and that what I expect is just a list in Word.
That I have to write a VBA script, I know (I am here on a VBA forum). That it isn’t just 2 lines of code, I have already realized for myself (sorry, maybe I did not tell you).

Does someone have a ready-written script, or is somebody able to write one in just 5-10 minutes because he/she has much more practice than me, or does somebody have another, simpler solution (or know where I can find one)?

But don’t start again to discuss with me about the definition of chapters and headlines.
If you don’t know what it is (e.g. in printed document) I have not the patience to tell you, sorry.

Greetings

Stephan

Tinbendr
03-14-2008, 06:47 AM
Stephan,

I found this code. It extracts the chapter reference at the cursor.

Maybe this will get you started.

Sub ListParagraphs()
    'Helmut Weber 2003
    Dim s As String ' the accumulated liststring
    Dim l As Integer ' the listlevel
    Dim p As Integer ' a counter for listparagraphs
    Dim r As Range ' a range
    Selection.Collapse direction:=wdCollapseStart
    Set r = Selection.Range
    r.Start = 0
    r.End = Selection.Start
    l = Selection.Range.ListFormat.ListLevelNumber
    s = Selection.Range.ListFormat.ListString & s
    For p = r.ListParagraphs.Count - 1 To 1 Step -1
        If r.ListParagraphs(p).Range.ListFormat.ListLevelNumber < l Then
            s = r.ListParagraphs(p).Range.ListFormat.ListString & s
            l = r.ListParagraphs(p).Range.ListFormat.ListLevelNumber
        End If
    Next
    MsgBox s
End Sub

fumei
03-14-2008, 12:28 PM
What I do not have patience for is someone who does not answer questions.

I asked if you are using a list style. Did you answer? No. I showed how, if you were using a list style, you can get the numbering for it - ListString.

I showed you how you can use a string ("Headlines", but it could just as easily be Überschrift, or yadda, or anything) and get the numbered list - the 2.3.5

"If you don’t know what it is (e.g. in printed document) I have not the patience to tell you, sorry."

Don't be absurd. That is just plain silly. Of course I know what TEXT someone could make as a chapter start.

I can make a chapter be "Yadha,ha aldhh", and the next chapter "akdghsak jl;qw"...but so what???

Word needs a definition. I explained that Section starts would work, I explained how numbered lists would work.

"analyze this stupid word object model"

The object model is quite fine, it is how people use it that is stupid.

The problem is that text is NOT an object. So your issue has nothing to do with the object model.

Tinbendr's suggestion uses the same thing I suggested, ListString.

You do not even answer direct questions. If you used Word properly, then EVERY "chapter" starting paragraph would use an explicit style. I use a style called...."ChapterStart". It is only used for the opening/start of a chapter.

That way I can quickly and easily, using the Object Model, get any information I want on the "chapter".

I explained how Word generates a table of contents - which YOU asked! - and gave an example of how you can use Styles to define "chapters" (or any other sectioning of text).

You do not appear to be willing to listen, or learn.

"But don’t start again to discuss with me about the definition of chapters and headlines.
If you don’t know what it is (e.g. in printed document) I have not the patience to tell you, sorry."

That is rude and insulting. The key thing there is "printed document". Of course you can see what a chapter is in a printed document. So what? Print it all you want, but unless you define it as an object, then it is just text to Word. You can say ignore definitions...but until you DO understand them, AND define them, you are out of luck.

Good luck.

stephan.zehr
03-17-2008, 01:17 AM
First, thx for the script; I found the same one on the internet too. Sorry, it does not help. The problem is to detect the paragraph with a headline style under which e.g. a selected range object lies (starts).

To answer the questions:
Yes, I use list styles applied to the standard "headline" styles and so on. Thx for your explanations.

I personally think the object model of Word is too generic. In the domain of "writing" it is typical to have headlines, chapters and so on. In other models, as in LaTeX, OpenDocument, HTML or DocBook, they exist, and OOXML will not change this. But this is my personal opinion.

As I said, I had hoped somebody would have a script. Well, I am able to develop software (it was my job for several years), but it is simpler to get a ready-made one.

Nevertheless thx for your support.

gwkenny
03-17-2008, 02:26 AM
"I personally think the object model of Word is too generic. In the domain of "writing" it is typical to have headlines, chapters and so on. In other models, as in LaTeX, OpenDocument, HTML or DocBook, they exist, and OOXML will not change this. But this is my personal opinion."

List numbers have more uses than just in your domain of "writing". Thus they are not called chapter numbers. To do so would pigeonhole the use of list numbers. If you are selling a product, you want to market to the widest distribution possible, so it wouldn't make sense to pigeonhole your product.

Why is it so hard to use Microsoft's definition? It's not a great leap in concept and it helps when conversing because everybody is using the same definition. Half the problem in this thread is angst because people are not using Microsoft's definitions. It's a Microsoft product and thus those of us who develop in it use its definitions because it makes communication much easier. If you ask for help, please use common terms that everyone in the community will understand.


"As I said, I had hoped somebody would have a script. Well, I am able to develop software (it was my job for several years), but it is simpler to get a ready-made one."

Gerry gave you the appropriate object property: .Range.ListFormat.ListString

If the selection/range doesn't have a list string, just keep stepping one paragraph to the front until you get one (or reach the beginning of the document). It's a few lines of code. Given the property, the time it would take you to "look up code" or request code, you could write it if you are adept as you indicate. I know it would take me just a minute or two if that.

If you are not adept with VBA, feel free to post the code you attempted and several folks on the board would be happy to help.
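The stepping-back idea might look roughly like this. It is only a sketch: NearestListString is a hypothetical helper name, and it returns the first numbered paragraph it meets going backwards, which may be an ordinary numbered list item rather than a heading.

```vba
' Sketch: walk backwards from the paragraph containing rng until a
' numbered paragraph is found, and return its list string.
' NearestListString is a hypothetical name used for illustration.
Function NearestListString(ByVal rng As Range) As String
    Dim para As Paragraph
    Set para = rng.Paragraphs(1)
    Do While Not para Is Nothing
        ' wdListNoNumbering means the paragraph is not part of any list.
        If para.Range.ListFormat.ListType <> wdListNoNumbering Then
            NearestListString = para.Range.ListFormat.ListString
            Exit Function
        End If
        ' Previous returns Nothing at the first paragraph, ending the loop.
        Set para = para.Previous
    Loop
    NearestListString = ""   ' reached the start of the document
End Function
```

It could be tried from the Immediate window with something like `? NearestListString(Selection.Range)`. Stepping through Previous one paragraph at a time can be slow in long documents.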

fumei
03-17-2008, 02:03 PM
Look, I am sorry if I came across overly annoyed, but gwkenny is right.

You are using Microsoft Word. It has an Object Model (and a Document Model). I will absolutely agree that it is a bit different, but there you go. If you are happier working with other applications...then please do that. Stop using Word.

However, if you are going to use Word, then you have to work with....Word.

For example, you asked about table of contents. You mention printing the document. I responded with who cares. Yes of course a printed document is structured with chapters and headings. But did you know that in Word, a generated Table of Contents has nothing - Zip - ZERO - Nada - to do with chapters and headings?

There is absolutely NO relationship - within the Word Object Model - between a Table of Contents and structures like chapters and/or headings. NONE.

Yes, it looks like there is.

Chapter 1 - Yadda......
.....sub Heading A......
.....sub Heading B......
Chapter 2 - Blah Blah......
.....sub Heading A......
...........sub sub Heading whatever......
.....sub Heading B......

BUT...in fact, there is no relationship. Word has no knowledge of "Chapter 1 Yadda" being the start of anything; or that "sub Heading A" has some relationship to "Chapter 1 Yadda".

Again, once generated, the Table of Contents looks like - just like a printed document looks like - there is some connection.

However, there is no connection.

A Word Table of Contents is generated by ONE thing, and ONE thing only.

The existence of a specified Style. Period. And that Style can be anywhere - Word does not care.

Take a look at the image below. This is a very real Word generated Table of Contents. I just made this right now. It only took a couple of seconds. Look at it.

The "chapter" levels are....question marks.
The first "sub-heading" are.....a semi-colon
The second "sub-heading" are....quotation marks.

Please note that the document itself - in the text structure and would print out structured - has:

Chapter 1 - The VBA Userform
Design Basics
CommandButtons

blah blah

Chapter 2 - Logic Statements
For Each
Select Case

blah blah

fumei
03-17-2008, 02:08 PM
What I am trying to say, and have felt frustrated with your responses, is that I have no doubt we could come up with something that will work for you IF, IF, IF, you are willing to listen (and actually answer questions when asked) and learn.

Otherwise, maybe it would be better if you used other software. Word is a very powerful, and quite sophisticated, word processor. However, it DOES have its own Object Model, and that is the way it is. You have to work within that, or you just simply go crazy.

fumei
03-17-2008, 02:13 PM
As an aside to sneaky knowledgeable Word users, I used to drive this one guy crazy by altering his defaults to generated ToC to something like the above ?, ;, " heading levels.

bad dog.

gwkenny
03-18-2008, 04:52 AM
"There is absolutely NO relationship - within the Word Object Model - between a Table of Contents and structures like chapters and/or headings. NONE.

Yes, it looks like there is...."

Gerry, I understand where you are coming from, but I think you are looking at things too narrowly. The key is to create the relationship between the concept of chapter/heading and TOC using styles. Basically the User defining the relationship so it becomes valid.

But we're beating a dead horse here in this thread. Think we can reach 25 posts in this thread?

Tinbendr
03-18-2008, 06:14 AM
Stephan,

Can you provide a sample document?

fumei
03-18-2008, 06:48 AM
g-, yes of course, but read your own words.

"The key is to create the relationship between the concept of chapter/heading and TOC using styles."

my bolding.

I do not think I am being too narrow, it is Word that is being explicit. There IS no relationship until YOU, the Word user, create one. Word will not do it. YOU must do it, by using Styles.

You can have all the chapters and headings you want in your document, but until YOU create the relationship - and it still remains, in fact,....no relationship at all.

The fact is, that it IS narrow. A ToC is generated by one thing - and ONE is narrow, n'est pas?

You can have all the chapters and headings you want, structurally, in a document, but if they do not use a style used by the ToC...the ToC will always generate a blank ToC.

Period.

A ToC ONLY generates by the existence of the defined style(s). ONLY. That is very narrow, and it is not me that is doing it. And, it still does not really define any relationship. There is still NO relationship to the chapters. It is still just one thing....the styles defined as to be used in the ToC.

Period.

Look, suppose I generate a ToC using MyChapter style as Level1; MySubHeading_A style as Level2; MySubHeading_B as Level3.

My "chapters" and subheadings are generated into the ToC. It looks right, prints right. Is there a real relationship to the structure (chapters and headings)? No. In my stupid example, I simply reset the Levels to StupidStyle1, StupidStyle2 and StupidStyle3....voila! The ToC is generated to those. My structure has not changed, the document still has a structure of chapters and headings, but they are no longer in the ToC...because there IS no real relationship to the structure. The ONLY (narrow) relationship of a ToC is to the existence/usage of the styles set for the ToC.

Styles. Not structure.

However, I agree this is possibly beating a dead horse.

I also agree that we could help the OP more if he posted a sample file.

stephan.zehr
03-18-2008, 08:14 AM
"Gerry gave you the appropriate object property: .Range.ListFormat.ListString"


Yes, I use this in my code, but only because I had already found the code that Tinbendr posted on the internet before my post, when I started implementing.




"If the selection/range doesn't have a list string, just keep stepping one paragraph to the front until you get one (or reach the beginning of the document). It's a few lines of code. Given the property, the time it would take you to "look up code" or request code, you could write it if you are adept as you indicate. I know it would take me just a minute or two if that."

Yes and no, because it is always the same. At first glance it always looks simple, but when you start coding and running the stuff, more and more test cases occur. E.g. the selected text could be in an enumeration, so now you have to check whether it is a headline or not; next, the selected text could be in the header or in the footer; next, the selected text could be on the start page (without any chapter) ... and those are just a few cases, where I stopped handling exceptions ... now I have a trainee for this job :)
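A sketch of how those cases could be handled, under the assumption that headings carry an outline level (as Word's built-in heading styles do) and are numbered through a list. ChapterNumberOf and ListCommentChapters are hypothetical names, and the result is printed to the Immediate window rather than written to Excel:

```vba
' Sketch: return the number of the nearest preceding heading, or ""
' when rng is outside the main text or lies before the first heading.
Function ChapterNumberOf(ByVal rng As Range) As String
    Dim para As Paragraph
    ' Headers, footers, footnotes etc. live in other stories; skip them.
    If rng.StoryType <> wdMainTextStory Then Exit Function
    Set para = rng.Paragraphs(1)
    Do While Not para Is Nothing
        ' Only paragraphs with an outline level count as headings, so
        ' numbered enumerations in body text are passed over.
        If para.OutlineLevel <> wdOutlineLevelBodyText Then
            ChapterNumberOf = para.Range.ListFormat.ListString
            Exit Function
        End If
        Set para = para.Previous   ' Nothing at the first paragraph
    Loop
End Function

' Usage: one line per comment with its initials and chapter number.
Sub ListCommentChapters()
    Dim c As Comment
    For Each c In ActiveDocument.Comments
        Debug.Print c.Initial; vbTab; ChapterNumberOf(c.Scope)
    Next c
End Sub
```

An unnumbered heading yields an empty string, as does a comment placed before the first heading, so the calling code would still have to decide what to show in those cases.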

stephan.zehr
03-18-2008, 08:18 AM
fumei ... SORRY ...

I already understood you in your first explanation about the TOC.

Thanks !!!

fumei
03-18-2008, 11:59 AM
It sounds like your document is not designed fully for use in Word.

Here is another example. I am working on a 500 page document (on Word VBA), and say I am in the midst of a bunch of stuff...and I have no idea what "chapter" I am in.

Aside: that can never happen for the same reason the following works. But anyway...

In the header of every "chapter", is...the chapter name. Let's say "Unit 6 - Controlling Code". The full text content of the header is:

Advanced Word - Selected Techniques (Doc Title, left justified)
Unit 6 - Controlling Code ("chapter" title - right justified)
Student Guide (new line, left justified)

Now, the "chapter" title (in the header) uses a character style - HeaderUnitName. The document title uses a character style - HeaderTitle.

Note: both styles are formatted EXACTLY THE SAME, because I want the header text to look the same throughout. However, by using a different style for just the "chapter", I can:
Sub GetUnitName()
    Dim r As Range
    Set r = Selection.Sections(1) _
        .Headers(wdHeaderFooterPrimary).Range
    With r.Find
        .Style = "HeaderUnitName"
        .Execute
        ' here is the "chapter" name
        MsgBox r.Text
    End With
    Set r = Nothing
End Sub

The ONLY instance of the style "HeaderUnitName" in the range of the Selection (where the cursor is) header, is the "chapter" name. Voila! I can get the chapter name....if I need it.

Of course I don't really, as I can simply glance at the header and see it. However, if I want to get the chapter name programmatically, it is a snap. It is the range of text in the header formatted with HeaderUnitName.

Say I am in the middle of a bunch of stuff, and I want to know what is the current heading I am working under.

Within a "chapter", these could be: Objective, Notes, Teaching Points, Exercise, Lab, New Terms, etc.....

Sub GetCurrentHeadingName()
    Dim r As Range
    Set r = Selection.Range
    With r.Find
        .Style = "HdLeft1"
        .Execute Forward:=False
        MsgBox r.Text
    End With
    Set r = Nothing
End Sub

Voila! Because I do not actually want to move the Selection - just find out what the heading is - I make a range of the Selection, then use Find on it.

The code looks backwards (Forward:=False) for the style HdLeft1 (which is the style used for headings AND the ToC Level2)...and...voila! There it is, the current heading level, by text.

Design. Design. Design.

These are design issues. Properly understood and used within the context of how Word is designed, Word is a very sophisticated application.
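The same backward Find can be pointed at a built-in heading style through its constant instead of its name, which avoids depending on the localized style name. A sketch, assuming the chapters use the built-in Heading 1 style; the procedure name is hypothetical:

```vba
' Sketch: look backwards from the cursor for text in the built-in
' Heading 1 style and show its list number and text.
' ShowPreviousHeading1 is a hypothetical name used for illustration.
Sub ShowPreviousHeading1()
    Dim r As Range
    Set r = Selection.Range
    r.Collapse wdCollapseStart
    With r.Find
        .ClearFormatting
        .Text = ""
        .Format = True
        ' wdStyleHeading1 resolves to "Heading 1", "Überschrift 1", etc.
        .Style = ActiveDocument.Styles(wdStyleHeading1)
        .Forward = False
        .Wrap = wdFindStop
        If .Execute Then
            ' On success r has been redefined to the found heading text.
            MsgBox r.ListFormat.ListString & " " & Replace(r.Text, vbCr, "")
        Else
            MsgBox "No Heading 1 found before the cursor."
        End If
    End With
End Sub
```

Because the search starts from the collapsed start of the selection, a cursor sitting inside a Heading 1 paragraph would report the heading before it, not the current one.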

gwkenny
03-18-2008, 07:51 PM
"but until YOU create the relationship - and it still remains, in fact,....no relationship at all."


This is exactly what I am saying.



"The fact is, that it IS narrow. A ToC is generated by one thing - and ONE is narrow, n'est pas?

...

A ToC ONLY generates by the existence of the defined style(s). ONLY. That is very narrow, and it is not me that is doing it. And, it still does not really define any relationship. There is still NO relationship to the chapters. It is still just one thing....the styles defined as to be used in the ToC."


You contradict yourself. Is there or is there not a relationship?

To me, when you use styles appropriately you create the relationship between the concept of chapters and the TOC. YES, Word operates via the style and its table/index coding, but the fact remains, the TOC gets generated with the correct chapter information. How can it do that without some sort of relationship?

It's all about how you look at it. For example with that guy who wanted unformatted text. I just said to apply Normal style format. You said:


"However, it is NOT unformatted. It is formatted to whatever is the format of Normal."

Given the way Word operates, it is the same thing. It is all in the way you look at it. If we wanted to be extremely narrow, we could have told the original poster there is no such thing as unformatted text in Word cause everything has a style attached to it, so can he please be more specific. But we all understood what he was asking.

(The preceding paragraph does not hold for the most recent version, Word 2007. Thanks Tony)

We'll just have to agree to disagree.

And hey, are we getting damn close to 25 posts in this thread or what???

fumei
03-19-2008, 10:22 AM
Yup.

"You contradict yourself. Is there or is there not a relationship?"

IMO, there is no real relationship. There is one within our perception, but NOT one in Word.

"the TOC gets generated with the correct chapter information. How can it do that without some sort of relationship?"

No...it does not have a relationship. It only looks that way IF - and ONLY IF - you have actually made the text of your "chapters" the style used by the ToC.

You say there is a relationship between the ToC and the "correct" chapter information? Really.

I go to a chapter title. I change the style to Yadda. And what is the relationship to the ToC now?

The actual text of the Chapter has not changed, its location has not changed; its "relationship" to the ToC however is....poof! Gone.

There is no relationship between a ToC item and text, or typographical structure in the document.

But, shrug, I guess we will have to agree to disagree.