Code Integration Query



deedii
02-05-2012, 08:36 PM
Hi guys, I am new to VBA and I am trying to learn it. I am a writer and I want to create a macro that compares two paragraphs for duplication. If duplication is found, each duplicated sentence should be highlighted in a different color. For example, if I find 5 duplicates, I want to see five colors in the output so that I can easily count the number of duplicated sentence fragments or phrases. I found code that does almost exactly what I need, except that it cannot set the length of matching parts and it has only two colors for highlighting duplicates. I created a userform that contains two rich text boxes: the first box is for the source and the second is for the article to be compared against the source. I also added a command button, "Compare", and a combo box so that I can select the minimum length of a matching part. Below is the code I found:



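Sub Compare()

    Const NMin As Long = 7      ' minimum length of matching parts, in words
    Dim R As Range, W As Range
    Dim C As Range, C2 As Range, N As Long

    Set R = ActiveDocument.Content

    For Each W In R.Words
        ' Only start a search from words that are not highlighted yet
        If W.HighlightColorIndex = wdNoHighlight Then
            N = NMin

            Do
                ' C = the current word extended by N more words
                Set C = W.Duplicate
                C.MoveEnd wdWord, N
                If C.End = ActiveDocument.Range.End Then Exit Sub
                ' C2 = the rest of the document after C, searched for a repeat
                Set C2 = ActiveDocument.Range(C.End, ActiveDocument.Range.End)

                ' Stop growing C when its text is too long to search for, when it
                ' has mixed highlighting (9999999 = wdUndefined), or when no
                ' repeat is found; otherwise try one more word
                Select Case True
                    Case Len(C.Text) > 256, _
                         C.HighlightColorIndex = 9999999, _
                         C2.Find.Execute(FindText:=C.Text, Wrap:=wdFindStop) = False

                        If N > NMin Then DoHighLight C
                        Exit Do

                    Case Else
                        N = N + 1

                End Select
            Loop

        End If
    Next
End Sub

Sub DoHighLight(C As Range)

    Dim C2 As Range

    ' Drop the last word added (the extension that ended the match),
    ' then mark the first occurrence in yellow
    C.MoveEnd wdWord, -1
    C.HighlightColorIndex = wdYellow
    ' Mark every later occurrence of the same text in red
    Set C2 = ActiveDocument.Range(C.End, ActiveDocument.Range.End)
    While C2.Find.Execute(FindText:=C.Text, Wrap:=wdFindStop)
        C2.HighlightColorIndex = wdRed
    Wend

End Sub

Public Sub Show()
    frmMain.Show
End Sub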




Now my question is how can I integrate this code into my own modified format. Please see attached file.

Hope you can spare time on my problem.
Thank you so much.

Credits to Tony Jollans for the code.

deedii
02-07-2012, 03:55 PM
UP. Can someone please help me? :)

fumei
02-07-2012, 10:03 PM
Use Option Explicit.

What is NMin?????

deedii
02-08-2012, 03:09 PM
NMin is the minimum length of matching parts, meaning it will highlight a sentence when seven consecutive words are duplicated.

e.g
Source - RichBox A
The quick brown fox jumps over the lazy dog.
Article - RichBox B
The quick brown fox jumps over the lazy dog.

That code works just fine. I just need to integrate it into a form with two rich text boxes and compare the two, as in the layout I created in the attached file. :)

fumei
02-08-2012, 09:09 PM
NMin is NOT defined...use Option Explicit. The code only works, supposedly, fine because you are not using Option Explicit.

It is hard to test as you only have two examples in your document. Plus I am getting object errors from your code.

fumei
02-08-2012, 09:18 PM
Plus you have TWO subroutines with the same name - Compare. This is not good. I am having a hard time getting anything to work. Could you explain how it is supposed to work? Are you selecting text and putting it, somehow, into the userform? Are you making some text highlighted and looking for similar text? It is impossible to tell from the sample document.

You have combobox1 - a dropdown - with NO items. If you are typing text (3, 4, 5, etc.), why is it a combobox (a dropdown)? Plus if LMatch equals the value in combobox1, why are you doing a Select Case? Just make them equal.
Select Case ComboBox1.Text
    Case "3"
        LMatch = 3
    Case "4"
        LMatch = 4
    Case "5"
        LMatch = 5
    Case "6"
        LMatch = 6
    Case "7"
        LMatch = 7
    Case "8"
        LMatch = 8
    Case "9"
        LMatch = 9
    Case "10"
        LMatch = 10
    Case Else
End Select


LMatch = ComboBox1.Value

or more carefully

LMatch = CInt(ComboBox1.Value)
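
A slightly more defensive variant, as a rough sketch only: CInt raises a run-time error when the combo box is empty or holds non-numeric text, so the conversion can be wrapped in a small helper. The name ParseMinLength, the fallback parameter, and the cap of 1000 are made up for illustration and are not from the attached file.

```vba
Option Explicit

' Hypothetical helper: turn the combo box text into a minimum match length.
' Returns fallback when the text is empty, non-numeric, or outside 1..1000
' (the cap of 1000 is an arbitrary assumption).
Function ParseMinLength(ByVal rawText As String, ByVal fallback As Long) As Long
    Dim cleaned As String

    ParseMinLength = fallback
    cleaned = Trim$(rawText)

    If IsNumeric(cleaned) Then
        If CDbl(cleaned) >= 1 And CDbl(cleaned) <= 1000 Then
            ParseMinLength = CLng(cleaned)
        End If
    End If
End Function
```

In the form it could be called as LMatch = ParseMinLength(ComboBox1.Text, 7), using .Text rather than .Value so that a Null value never reaches the helper.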

I am not trying to be critical. I am trying to figure what is going on.

fumei
02-08-2012, 09:37 PM
For example, I cleared all highlights, put new highlight on the first four words (which ARE duplicated), typed 3 into the...ahem, empty dropdown...click Compare and get a messagebox "Dogs"

HUH????

deedii
02-08-2012, 09:45 PM
OK, sorry for the confusion in the file. I'm just playing with it to find a workaround. Here's the scenario: I am writing an article from a source that my editorial adviser gave me. After I rewrite the article, I compare my rewritten article against the source to see if there are duplications. The code I posted here actually does the job of finding the duplicated text and highlighting it, only it works on a whole document, meaning I have to put the source and the article I wrote into one document. To avoid confusion, I decided to make a form with two rich text boxes: one where I paste the source, and the other where I paste the article I wrote, to be compared against the source. Never mind what is written in the documents, because I will just copy and paste it into the form. The most important part is the macro in the form that will compare Source and Article for duplication. :)

I don't highlight it manually; the code is what highlights the duplicated sentences.

The code is working fine. My problem is how to integrate that code into the form I created, which will compare Rich Box A to Rich Box B.

deedii
02-08-2012, 09:57 PM
Oh, I'm so sorry I confused you. That combo box is actually there to change the value of NMin so that it will be more flexible. I don't want to fix NMin at 7, so I experimented with adding another variable. I am thinking of making it an input box instead of a combo box so I can just type in the NMin value. Please don't mind the code in the form, because it's actually a mess from my experimentation; the real code that works, and that I want to integrate, is in the module. I forgot to clean up the mess in the form when I attached it here. Sorry :(

fumei
02-08-2012, 10:46 PM
Hopefully someone else can help you because as stated the code does not work fine for me.

Again though, I strongly recommend you start using Option Explicit.

deedii
02-08-2012, 11:47 PM
What code, fumei? Please download the attached file "DupTest" and run "Compare" to see what I mean. The code actually works, but only inside the document. The form I created has no code, since that's my problem: integrating the code into the form. This is the result if you run the code in the attached document.

http://i42.tinypic.com/fe4482.png

As you can see, the code only works inside the document itself. So the code compares Page 1 of the document, which is the SOURCE, to Page 2, which is the Article.

I want to achieve the same result but using a form so it should be like this;

http://i44.tinypic.com/w5qi9.png

So when I click "Compare" the result should be like this. (Note: this is only a screenshot of what the result should be.)

http://i42.tinypic.com/9r6jog.png

So in general I want to integrate the code inside the file I attached into the Form.

Hope it is clear now. :)

fumei
02-10-2012, 12:12 AM
The file DupTest is even worse. The userform module has no code at all. So clicking Compare does absolutely nothing.


Ah, wait. You did it as a procedure right in the document. Let me see....

deedii
02-10-2012, 05:59 AM
Oh, as I posted: the reason the userform has no code is that this is actually my problem, integrating the working code in the module into the userform. So all of those command buttons and input boxes still have no function at all.

fumei
02-10-2012, 06:52 AM
Wow. I have seen some brute force code before, but that is incredible. The small sample in the demo executes more than 4,000 instructions! Good grief.

I am stepping through it. Can you repeat clearly what it is you want to happen?

deedii
02-10-2012, 07:44 AM
I want to integrate the code in the module to the userform. The final result should be like below image.

http://i42.tinypic.com/9r6jog.png

So instead of running the code on the active document, I will just copy and paste the data (source and article) into the userform, which has two rich text boxes (source and article) to be compared.

(Note: the image above was just created so that I can show you what the result will be when you click "Compare".)

fumei
02-10-2012, 03:51 PM
Well you are NOT going to get the colour part for the text in the userform.

I do not see how the code will work for text IN the userform. Are you saying you have made this work? In that you copy text into each textbox and get this kind of result?

Lastly, there is only so much text that can fit into the textboxes. Does this not limit things?

deedii
02-10-2012, 04:17 PM
No, it's not a normal textbox; it's a RichTextBox. I didn't make it work. What I did to get the screenshot of the possible result was just copy the result from the document after I ran the code and paste it into the two rich text boxes in the userform. So I think the RTB can handle that format, because when I pasted it there it also kept the highlighting, as shown in the screenshot.

Paul_Hossler
02-11-2012, 08:14 AM
I just restructured the code in this version; I didn't actually try to make the compare work. This is probably the way I'd do it, but there are lots of different ways.



Now my question is how can I integrate this code into my own modified format


The code that the userform handles or needs -- I put in the user form module. I also added an 'Exit' CommandButton

The macro ('Compare') that is exposed to the user is in mod_Main

The 'helper' subs, etc. are in a Option Private Module called mod_Subs

Like I said, it really doesn't do the compare, but the restructuring / integration might make it easier for you to incorporate.

Any questions ... just ask, and I'm sure someone will help

Paul

Paul_Hossler
02-11-2012, 08:25 AM
Also ..

It seems like a lot of work and you won't get the color highlighting

Word 2010 has a nice document compare built in, so if 2010 is an option, you might investigate it

This is a screen shot of all the Compare options pasted into the Compare document, and the Add/Deletes on the left

Paul

fumei
02-11-2012, 01:24 PM
paul, why do you code separate Load/Show instructions?
Sub Compare()
    Load frmMain
    frmMain.Show
End Sub
The Show instruction fires the Load automatically. Technically speaking, the Show instruction causes VBA to check if the object IS Loaded, and if it is not, Load it. Actually even more technically speaking, the Show instruction does a Load (sending an internal instruction to allocate memory), but if it already is loaded (memory is allocated) - an internal IF...THEN - continues on to execute the internal Show instructions (sending object attributes to the GUI).

fumei
02-11-2012, 01:33 PM
By the way, I ask only out of curiosity. There is no reason NOT to use two instructions (and once in a while a reason TO use two instructions), as there is virtually no performance hit. Just wondering why you would, without such a real reason to use two instructions.

Paul_Hossler
02-11-2012, 05:27 PM
paul, why do you code separate Load/Show instructions?


You're right, it's just a bad habit that I have a hard time breaking :dunno

I've been working on converting old code to use strings more efficiently, Load/Show just sort of fell off my radar screen

http://www.aivosto.com/vbtips/stringopt.html#whyslow


Paul

fumei
02-11-2012, 06:21 PM
While old (not many people are still using VB6), that article is still very valid and should be required reading.

fumei
02-11-2012, 06:24 PM
BTW, on review I can not think of any instance for needing a separate Load from Show instruction. The bottom line is a Show does a Load regardless.

It is kind of like that somewhat annoying combobox _Change event that fires when a userform starts up. If I understand it correctly, even though there is no apparent "change", technically speaking the combobox object is created, and THEN a null is given...thus a "change". Seems a bit wonky to me.

Paul_Hossler
02-12-2012, 06:47 AM
BTW, on review I can not think of any instance for needing a separate Load from Show instruction. The bottom line is a Show does a Load regardless.



I do think it's better to use the Load/Unload pair, because just .Hide doesn't seem to release memory

While the Load might not be required to make the VBA work, IMHO it's better style/preference/habit to always 'bracket' chunks of code, i.e. Load ... Unload

Test1 does not use Load/Unload and appears to consume memory, while Test2 gives the memory back when it's not needed anymore


Option Explicit
Private Declare Sub GlobalMemoryStatus Lib "kernel32" (lpBuffer As MEMORYSTATUS)

Type MEMORYSTATUS
    dwLength As Long
    dwMemoryLoad As Long
    dwTotalPhys As Long
    dwAvailPhys As Long
    dwTotalPageFile As Long
    dwAvailPageFile As Long
    dwTotalVirtual As Long
    dwAvailVirtual As Long
End Type

Dim MemInfo As MEMORYSTATUS

Sub Test1()
    Dim dwBefore As Long, dwAfter As Long
    dwBefore = 0
    dwAfter = 0

    Call GlobalMemoryStatus(MemInfo)
    dwBefore = MemInfo.dwAvailVirtual

    UserForm1.Label1.Caption = "Welcome back"
    UserForm1.Show
    UserForm1.Hide

    Call GlobalMemoryStatus(MemInfo)
    dwAfter = MemInfo.dwAvailVirtual

    MsgBox (dwAfter - dwBefore)

End Sub


Sub Test2()
    Dim dwBefore As Long, dwAfter As Long

    dwBefore = 0
    dwAfter = 0

    Call GlobalMemoryStatus(MemInfo)
    dwBefore = MemInfo.dwAvailVirtual

    Load UserForm1
    UserForm1.Label1.Caption = "Welcome back"
    UserForm1.Show
    UserForm1.Hide
    Unload UserForm1

    Call GlobalMemoryStatus(MemInfo)
    dwAfter = MemInfo.dwAvailVirtual
    MsgBox (dwAfter - dwBefore)

End Sub


1. I think I'm doing the MemStatus correctly
2. It might not make any real difference
3. It might make a difference

Paul

fumei
02-12-2012, 03:58 PM
Ah, but that is when you use .Hide. It is VERY rare that I ever use .Hide. I can not even think of the last time, as I fail to see how it is useful really. You MUST use .Show to unHide...which fires Load again. So what is gained?

"Test1 does not use Load/Unload and appears to consume memory, while Test2 gives the memory back when it's not needed anymore"

HUH??????

Well yes...but...HUH? Consume memory? Test1 still has memory allocated because it is still Loaded. Test2 releases it with Unload. They are not equivalent tests.

To see a (IMO) more significant value, get dwAfter after the .Hide (before the Unload).
UserForm1.Show
UserForm1.Hide
Call GlobalMemoryStatus(MemInfo)
dwAfter = MemInfo.dwAvailVirtual
Unload UserForm1


A big goose egg. Bupkus. Nada. There is NO memory advantage to using .Hide. Which of course makes sense, as memory is still allocated.




"I do think it's better to use the Load/Unload pair, because just .Hide doesn't seem to release memory

While the Load might not be required to make the VBA work, IMHO it's better style/preference/habit to always 'bracket' chunks of code, i.e. Load ... Unload"

.Hide does not just "seem" to not release memory...it does not release memory.

Load IS required to make VBA work. Show fires .Load....always.

I am not sure what you mean by bracket.

A Show userform (which ALWAYS fires .Load) must always have an Unload.

fumei
02-12-2012, 04:09 PM
There is a small reason for using .Hide, in that focus returns to the application. Whenever that comes into play (rare) I have found that a rethink is helpful. Perhaps making the userform non-modal.

However, this becomes a design and usability issue (rather than a memory one), and almost invariably a rethink of the PURPOSE of the userform.

BTW: do you know of a way to identify ONLY the GUI resources involved with a userform. Remember the resources you are testing with your code are system resources, not specifically GUI resources.

Paul_Hossler
02-12-2012, 06:33 PM
big goose egg. Bupkus. Nada. There is NO memory advantage to using .Hide. Which of course makes sense, as memory is still allocated.



True, no argument, I was only going by what Bill Gates says:



Unload Statement

Removes an object from memory.

Syntax

Unload object

The required object placeholder represents an object expression that evaluates to an object in the Applies To list.

Remarks

When an object is unloaded, it's removed from memory and all memory associated with the object is reclaimed. Until it is placed in memory again using the Load statement, a user can't interact with an object, and the object can't be manipulated programmatically.






BTW: do you know of a way to identify ONLY the GUI resources involved with a userform. Remember the resources you are testing with your code are system resources, not specifically GUI resources






Really wish I did. I'd really like to find some way to see just how much the application is using



So if I'm reading you and Chairman Bill right, I could just use .Show and Unload if I wanted to, correct????



Paul

deedii
02-12-2012, 08:09 PM
By the way is there any chance to count the highlighted words? Only Highlighted words?

fumei
02-13-2012, 01:39 PM
Paul: "So if I'm reading you and Chairman Bill right, I could just use .Show and Unload if I wanted to, correct????"

Yes, precisely. Show loads into memory. Unload unloads from memory. 99.999999% of the time there is no real need for using an explicit Load instruction, as .Show ALWAYS has an implicit Load instruction.

deedii, a running count? A total count?

deedii
02-13-2012, 06:22 PM
Just count the total highlighted words, but only in the "Article Page". I want to compute the percentage of duplicated words, so I will just count all the words highlighted in red. The formula I'm thinking of is

number of highlighted words in the "Article Page" divided by the total words in the "Source Page", times 100.

I presume that will give the percentage of words duplicated from the source.

fumei
02-13-2012, 09:40 PM
You count highlighted words by...counting highlighted words.
Sub CountHighlight()
    Dim r As Range
    Dim W As Range
    Dim i As Long
    Set r = ActiveDocument.Range
    For Each W In r.Words
        If W.HighlightColorIndex = wdRed Then
            i = i + 1
        End If
    Next
    MsgBox i & " words are highlighted."
End Sub

fumei
02-13-2012, 10:31 PM
You could also do a count by adding a counter to your Sub DoHighLight procedure. If so, because the procedure is closed each time, you will need to make the counter variable a PUBLIC variable in a standard module.
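
A minimal sketch of that counter idea, under stated assumptions: gRedWordCount, ResetRedWordCount, AddRedWords and DoHighLightCounted are illustrative names, not code from the attached file. DoHighLightCounted is the DoHighLight procedure from the first post with one extra line. Note that Word's Words collection also counts punctuation marks as separate items, so the total is approximate.

```vba
Option Explicit

' Running total of red-highlighted words. Declared Public at module level in a
' standard module so the value survives after each procedure call ends.
Public gRedWordCount As Long

' Call once before running the compare macro.
Sub ResetRedWordCount()
    gRedWordCount = 0
End Sub

' Add n words to the running total.
Sub AddRedWords(ByVal n As Long)
    gRedWordCount = gRedWordCount + n
End Sub

' Same logic as DoHighLight from the first post, plus the counter update.
Sub DoHighLightCounted(C As Range)
    Dim C2 As Range

    C.MoveEnd wdWord, -1
    C.HighlightColorIndex = wdYellow
    Set C2 = ActiveDocument.Range(C.End, ActiveDocument.Range.End)
    While C2.Find.Execute(FindText:=C.Text, Wrap:=wdFindStop)
        C2.HighlightColorIndex = wdRed
        ' C2 now spans the repeat just found; count its words
        AddRedWords C2.Words.Count
    Wend
End Sub
```

After the compare macro finishes, MsgBox gRedWordCount would report the total, provided ResetRedWordCount was called first.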

deedii
02-14-2012, 12:59 AM
fumei, thanks for that, it works. Oh, and another thing: can it also count the total words in a page and divide the highlighted words by it, times 100? My workaround is to ask the user for the number of highlighted words to compute the percentage, but I think that will be a hassle. Is there any way to include the page word count (not all pages, only the SOURCE page) in Sub CountHighlight() to compute the percentage automatically?

Thanks much :)

fumei
02-14-2012, 01:54 AM
"total of words in a page" This is difficult because of the fluid nature of pages.

As for percentages, just do the math.
As for automatically, as I stated, do the count in your DoHighlight procedure.
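
For the math itself, a rough sketch, not a tested solution. It sidesteps the page problem by ASSUMING the Source and the Article are separated by a section break, so that Sections(1) is the source and Sections(2) is the article; that layout is an assumption, not something the attached file is known to use. DuplicatePercent, CountHighlighted and ReportDuplicatePercent are made-up names. The Words collection counts punctuation and paragraph marks as items, so the figure is approximate.

```vba
Option Explicit

' highlighted words / source words * 100; returns 0 for an empty source.
Function DuplicatePercent(ByVal highlightedWords As Long, _
                          ByVal sourceWords As Long) As Double
    If sourceWords <= 0 Then
        DuplicatePercent = 0
    Else
        DuplicatePercent = highlightedWords / sourceWords * 100
    End If
End Function

' Count the words in rng that carry the given highlight colour.
Function CountHighlighted(ByVal rng As Range, _
                          ByVal colourIndex As WdColorIndex) As Long
    Dim w As Range
    Dim n As Long

    For Each w In rng.Words
        If w.HighlightColorIndex = colourIndex Then n = n + 1
    Next w
    CountHighlighted = n
End Function

Sub ReportDuplicatePercent()
    Dim src As Range, art As Range
    Dim pct As Double

    ' ASSUMPTION: section 1 = Source, section 2 = Article
    If ActiveDocument.Sections.Count < 2 Then
        MsgBox "Expected a section break between Source and Article."
        Exit Sub
    End If

    Set src = ActiveDocument.Sections(1).Range
    Set art = ActiveDocument.Sections(2).Range

    pct = DuplicatePercent(CountHighlighted(art, wdRed), src.Words.Count)
    MsgBox Format(pct, "0.0") & "% of the source word count is highlighted red in the article."
End Sub
```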

Frosty
02-15-2012, 10:51 AM
Sorry to hijack deedii's thread, but I wanted to weigh in on show/load part of the discussion. Show loads into memory if you haven't yet loaded into memory, obviously. But my preferred standard is not to dimension something with the New keyword... so that, for me, my forms are loaded prior to the Show command.


Sub FormDemo()
    'This sets aside the memory space
    Dim myForm As frmMyForm

    'this loads it
    Set myForm = New frmMyForm
    'This shows it
    myForm.Show

    'this unloads it
    Unload myForm
    'this clears the memory space
    Set myForm = Nothing
End Sub

I know that garbage collection isn't always effective in VBA, but apart from using New on the same line as the Dim statement, I actually like to have a separate action between loading my form into memory and displaying it to the user... it allows me to perform other actions I may or may not want to do before getting user interaction (apart from stuff within Initialize/Activate events for the userform itself).

But that is because I prefer to structure my code in such a way that the only code within a user form is related to the userform itself and the GUI for the user-- not any actions which will be performed after the form is dismissed.

Thoughts? Is this a stupid way of doing it?

EDIT: Fumei, it's good to see you back here :)

Frosty
02-15-2012, 10:58 AM
Oh, and I use .Hide all the time-- from within my form when the user hits "OK" ... and then my calling routine does the "work" of whatever my user has told my form to do.

fumei
02-15-2012, 03:22 PM
OK, I will weigh in.

The radical difference is you are declaring a userform object. In none of the code posted so far has this been the case.


But my preferred standard is not to dimension something with the New keyword... so that, for me, my forms are loaded prior to the Show command.

Nor has the New keyword been used.

Oh, and I use .Hide all the time-- from within my form when the user hits "OK" ... and then my calling routine does the "work" of whatever my user has told my form to do.

But...why.

Frosty
02-15-2012, 04:05 PM
Well, you learn something new every day. I didn't realize that even with Option Explicit, you can avoid declaring your userforms as objects when you want to use them. I am in the habit of declaring userforms, so I didn't realize this loophole to the Option Explicit concept existed.

So even Paul's Sub "Test1" in my normal subroutine would include the following lines (at the appropriate location)

Dim f As UserForm1
Set f = New UserForm1

As well as garbage collection later to...
Unload f
Set f = Nothing

I *think* this is validating what I typically do, since Paul's code in Test1 is actually loading the form twice in some cases (UserForm1.Hide will also load the form, if it doesn't exist in memory -- i.e., someone has hit the Red X, or the ESC key if a command button has the CANCEL property set to true, and that command button has Unload Me in the _Click event, etc).

The reason I use Hide is because I typically do not allow a form to unload itself (since I always separate out my "information gathering" phase with my "do stuff" phase). So I require the form to remain loaded in my calling routine until I am done with it, in order to utilize the information I've collected. I realize this is more a preference rather than a requirement, but is the way I typically structure my code.

In practical application, this basically means that the code in my userforms is strictly restricted to modifying the form based on user interaction, and the code which performs actions on a document exists in the subroutine which called the form. There are exceptions to every rule, of course, but this is my general practice.

The reason behind it is, in really complicated forms which have to do extensive processing during initialization, I don't want to accidentally have those events happen more than once for a single routine (if the Userform falls out of scope, but code then later directly refers to it-- now I get a reloaded form, but I can't know whether the information from this reloaded form is accurate or not).

Of course, if all "action code" is within the form, then it is a self-encapsulated thing... so you don't have to worry about whether the form is loaded or not.

But, for me, it seems a bit more difficult to modularize (and thus, test) code which always requires me to run a userform, click on the option I'm testing, and then step through code to see if it works.

My practice is to set up routines with appropriate parameters, and be able to test those routines independently of any UI.

I realize this is a bit more theoretical in the various ways to structure code... but I'm always pleased to increase my understanding of items.

In looking at Paul's restructuring, he wouldn't run into an issue, because his main form calling routine "Compare" doesn't have any code after frmMain.Show.

However, even a frmMain.Hide after the frmMain.Show would reinitialize the form-- which would cause performance hits, depending on how much is going on in the Initialize event.

It's probably easier to demonstrate with a modification of the document code. Stand by ;)

Perhaps my posts won't be such a hijack after all...

Frosty
02-15-2012, 04:40 PM
To the OP, Deedii --

Short answer: what you want isn't easy, and it might not be possible. I would investigate using Word's native comparison feature (even in Word 2003, it isn't bad).

Rich Text form controls are possible on VBA userforms, but you're trying to reinvent the wheel which has been programmed by Microsoft. You just want to compare two bits of text (source and article), to see what is similar.

The problem is, you're trying to add a feature (and thus, your brute force method) of getting an overall length of a string to compare. Why do you need to specify the "minimum length of matching parts"? In the document comparison world, this falls under the concept of "readability."

Are you trying to see if someone has plagiarized you?

I think we need a bigger picture of what you *really* want to accomplish in order to give you an answer.

I could, for example, program a dialog which allows you to
1. Paste text into a text box called "source"
2. Paste other text into a text box called "article"
3. Click "compare"
4. See a document which uses Word's native redlining to show you the comparison.

But you could just as easily:
1. Paste source text into a new, blank document you have saved called Source.doc
2. Paste article text into a new blank document you have saved called Article.doc.
3. Run the native redline comparison on those two documents, and analyze the results.

And you don't need any coding what-so-ever. And this will work in version Word 2000 on up (if memory serves).

So... what version of Word are you using?

And what do you want to do after you look at the results of your "comparison" function?

Frosty
02-15-2012, 05:22 PM
Here is a proof of concept for Deedii for using the native comparison feature, as well as a structure which (maybe) better articulates what I was talking about.

It will work in both 2003 and 2010 (have not checked 2007, although I think it should work). The major difference is that in 2003, you would be forced to save two documents before completion (just save to desktop and delete afterwards-- code could easily be added to delete those "temp" files)

Frosty
02-15-2012, 05:28 PM
Quick note: this is just a proof of concept, because the major limiting factor here is the actual pasting of text into the text boxes of the form. That strips away formatting (which may or may not be important).

Even if you added programmatic references to use a "rich text box" control (which are available somehow, but I haven't done something like that recently), I'm pretty sure you're going to lose at least some of what you think you might get, so I'm not sure what that would add in this kind of proof-of-concept.

In any event... hope this helps you, Deedii.

Fumei: I hope this at least identifies what I was talking about, structurally/conceptually.

deedii
02-16-2012, 06:54 AM
Frosty thank you so much for your kind answer.
Yes, something like that, only I want to achieve the same result in the RTB as the code does in the document. BTW, sorry to challenge you guys, but I really appreciate it a lot :)

Frosty
02-16-2012, 07:56 AM
There is no RTB available from VBA within a userform. C'est impossible.

What version of Word are you using?
What gave you the idea of a Rich Text Box?
Why do you need the results in a form at all?
We are trying to solve your problem for free, not meet your exact design requirements.

What you want might be possible in a stand alone application (VB, .NET addin, a COM addin, etc)... But it is not available in VBA.

deedii
02-16-2012, 07:32 PM
Thank you for that :)

I am using 2007.
Well, the idea came to me because I think an RTB can handle highlight formatting.
Because I think it would be easier for me to count how many sentences are duplicated.
I know, and I'm very thankful for that; that's why I am open to good suggestions from you guys :).

Hmm, I didn't think this would be so complicated; I thought it would be easy to integrate. For now I am using it as is, with some tweaks from your ideas and fumei's and Paul's. It somehow manages to do what I want, so I'm very thankful to you guys. :)

Paul_Hossler
02-17-2012, 06:30 AM
Frosty --



There is no RTB available from VBA within a userform. C'est impossible.


http://www.tek-tips.com/viewthread.cfm?qid=1544286



So we are talking about VBA rather than VB ... OK, this is probably caused - would you believe - by Internet Explorer security.

Basically the RTB is not trusted as safe by IE (the control itself is erroneously marked as safe but IE is configured not to believe it). Later service packs of Forms2 (also known as the Microsoft Office Forms Library, i.e. VBA's userforms) check with IE on whether they are allowed to load and run a control. The policy for the RTB is to NOT load or run it, because the 'killbit' (http://en.wikipedia.org/wiki/Killbit) is set in the registry. This killbit setting is not accessible through the IE interface, so there are two primary workarounds

1) Use genuine VB6 to wrap the RTB in a usercontrol which you can mark as safe

2) Edit the relevant killbit setting in the registry (normal warnings about modifying the registry apply):
HKEY_LOCAL_MACHINE\Software\Microsoft\Internet Explorer\ActiveX Compatibility\{3B7C8860-D78F-101B-B9B5-04021C009402}
and change the 'Compatibility Flags' value from 0x400 to 0


Possible solution. Not great, but possible

Paul

deedii
02-17-2012, 07:52 PM
Hmmm, this is more complicated than I thought. The main reason I wanted to integrate that code instead of using it inside the document is that it doesn't really compare the two texts, the (source) and the (article); it only looks for duplicates inside the document, even if I separate the source page from the article page. There are some instances where sentences are repeated within the source, so they get highlighted as well (please see the attached file to see what I mean). What I actually need is to compare two articles for duplication, meaning that sentences repeated only within the (source) will not be highlighted; they will only be highlighted if they also appear in the (article). Would it be possible to compare two documents, instead of comparing page 1 to page 2, and get the same result? Or to compare two pages so that the highlights on page 1 are all yellow and the highlights on page 2 are all red using that code, meaning that a sentence duplicated within the source page is only highlighted yellow, and if it also appears in the article page it is highlighted red there?

fumei
02-17-2012, 08:45 PM
Too bad there are not many copy editors anymore because it would be far cheaper to hire a copy editor to do what you seem to be asking. The development cost to come up with a robust solution is far higher. I for one can not afford to put more time into this.

deedii
02-17-2012, 09:46 PM
@Frosty I was doing some tests on the concept you gave and I'm really impressed. I have one question: would it be possible to display only the duplicated sentences in the "compared document" pane?

@Fumei Thank you so much for participating. I appreciate it a lot :)

bigJD
02-17-2012, 10:26 PM
This thread caught my attention, as I was looking for the same kind of macro to compare two documents for duplication. After reading all your posts, I think the solution for you, deedii, can be found at

h t tp://tinyurl.com/72sl6xs

and it's open source, so you can edit it as you like, if you know how to program.

I was looking at the code and you're right, it doesn't actually compare; it just looks for duplication inside the document. If someone is able to compare two documents and highlight the result in another document, like the app I gave you does, then I guess your problem is solved. I noticed Frosty has a very good concept, only it will not give you the result you want, as it uses the compare feature of Office itself. I don't know if it's possible to tweak it to display just the duplication in the compared document pane as you stated. Hopefully there is a check for duplication in the comparison settings of Word's review features. Another problem is setting the number of phrases/words to match, your NMin.

Now the challenge for us is can we replicate

h t tp://tinyurl.com/72sl6xs

application in vba?

PS: Sorry to put it in a quote; I'm still not allowed to post links.

deedii
02-25-2012, 10:23 AM
Exactly, it's something like that, bigJD.

One last question before I mark this as solved.
With the code posted on the first page, I'm having a problem with the minimum string to be matched, which is the NMin variable. Whenever I set it to any number, say 5 or 7, the code still does not follow it: I can still see two or three consecutive words highlighted, when it is supposed to take 5 or 7 consecutive words (depending on the value of NMin) before anything is highlighted. Is there any solution to make it accurate? Thanks

Frosty
02-29-2012, 05:46 PM
The short answer is you can't make that NMin value perfectly accurate in the results. This is because your entire code structure is based on the Words collection.

The "Words" collection (which is a collection of ranges of what Microsoft thinks are words) is not always accurate and is fundamentally broken. There are a number of flaws, which are probably not that interesting to list. But I've been trying to think of a better approach for you, and I've come up empty.

I was thinking there might be a way to use the document comparison function, and then simply discard the deleted text and inserted text, and simply highlight what is perceived as either a "Move" or exactly the same text... but even that would be limited, since you clearly want to be able to use the NMin value and have it be useful.
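
For what it's worth, here is one way the minimum-length idea could be sketched on plain strings, without the Words collection and without any highlighting. This is not the code from the thread and not a full answer: it splits both texts on whitespace and reports every run of at least minWords consecutive words in the article that also appears in the source. SplitWords and FindSharedPhrases are made-up names; punctuation stays attached to words (so "dog." and "dog" differ), and the search is brute force, so it will be slow on long texts.

```vba
Option Explicit

' Split s into words on spaces, tabs and line breaks.
Function SplitWords(ByVal s As String) As String()
    s = Replace(s, vbCr, " ")
    s = Replace(s, vbLf, " ")
    s = Replace(s, vbTab, " ")
    Do While InStr(s, "  ") > 0
        s = Replace(s, "  ", " ")
    Loop
    SplitWords = Split(Trim$(s), " ")
End Function

' Return the runs of at least minWords consecutive words in articleText
' that also occur in sourceText (case-insensitive).
Function FindSharedPhrases(ByVal sourceText As String, _
                           ByVal articleText As String, _
                           ByVal minWords As Long) As Collection
    Dim src() As String, art() As String
    Dim result As New Collection
    Dim i As Long, j As Long, k As Long
    Dim runLen As Long, bestLen As Long
    Dim phrase As String

    src = SplitWords(sourceText)
    art = SplitWords(articleText)

    i = 0
    Do While i <= UBound(art)
        ' Longest run starting at article word i that matches anywhere in the source
        bestLen = 0
        For j = 0 To UBound(src)
            runLen = 0
            Do While i + runLen <= UBound(art) And j + runLen <= UBound(src)
                If StrComp(art(i + runLen), src(j + runLen), vbTextCompare) <> 0 Then Exit Do
                runLen = runLen + 1
            Loop
            If runLen > bestLen Then bestLen = runLen
        Next j

        If bestLen > 0 And bestLen >= minWords Then
            phrase = art(i)
            For k = 1 To bestLen - 1
                phrase = phrase & " " & art(i + k)
            Next k
            result.Add phrase
            i = i + bestLen      ' skip past the matched run
        Else
            i = i + 1
        End If
    Loop

    Set FindSharedPhrases = result
End Function
```

Because the source is only ever searched, never reported, a sentence repeated inside the source alone produces no hit; only runs that appear in the article are returned.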

I would recommend purchasing that 3rd party app which bigJD posted the pseudo-link to.

Couple of notes on that: I'm not sure why he had to use tinyurl-- that seems shady to me. The actual link is to an external website which is selling a product which would seem to do exactly what you want.

But generally when someone with a single post count gives a link to a product which is being sold, I begin to be cynical that the entire thread is about marketing.

However, giving you the benefit of the doubt that this thread hasn't wasted our time as a marketing ploy... I think what you have currently is about as good as you're going to get. It would be possible to program it better, but I think you've guessed at this point that it isn't easy... and I simply don't have the time to work on this at the moment, when I have people wanting to pay me to program for them.

Good luck!

deedii
02-29-2012, 08:25 PM
Thanks frosty I got the point :)
BTW this is the link, if I got it right by removing the spaces in h t t p:
http://plagiarism.bloomfieldmedia.com/z-wordpress/2011/10/24/new-release-wcopyfind-4-1-0/

I don't see any tool for sale there, and it's even an open source application. I just don't get what you mean in your notes.

Again thanks so much people :)