Consulting

Page 1 of 2 1 2 LastLast
Results 1 to 20 of 22

Thread: Solved: UDF for mann-whitney statistical test?

  1. #1
    VBAX Regular
    Joined
    Oct 2009
    Posts
    7
    Location

    Solved: UDF for mann-whitney statistical test?

    Hello all,

    I am finding myself in need of performing a statistical test that is not inherently available within Excel, and so I was hoping that I could create a UDF to implement the needed function. Unfortunately, I believe I am a bit beyond my VBA capabilities here, so I thought I'd turn to experts for help/advice.

    The statistical function I desire is the nonparametric Mann-Whitney U test (also known as the unsigned Wilcoxon rank sum test). A good description of how it is calculated is provided at: europe.isixsigma.com / library / content / c080806a.asp.

    Can anyone offer suggestions for how best to implement this solution? Thanks in advance for any advice anyone can provide.

    ==Dygarth

  2. #2
    VBAX Expert
    Joined
    Feb 2005
    Posts
    929
    Location
    Quote Originally Posted by dygarth
    Hello all,

    I am finding myself in need of performing a statistical test that is not inherently available within Excel, and so I was hoping that I could create a UDF to implement the needed function. Unfortunately, I believe I am a bit beyond my VBA capabilities here, so I thought I'd turn to experts for help/advice.

    The statistical function I desire is the nonparametric Mann-Whitney U test (also known as the unsigned Wilcoxon rank sum test). A good description of how it is calculated is provided at: europe.isixsigma.com / library / content / c080806a.asp.

    Can anyone offer suggestions for how best to implement this solution? Thanks in advance for any advice anyone can provide.

    ==Dygarth
    Welcome to VBA-X.
    I do a lot of this kind of stuff. I can develop a procedure to compute the relevant statistic(s) and get back to you.
    Last edited by MWE; 10-20-2009 at 03:11 PM.
    "It's not just the due date that's important, it's also the do date" [MWE]

    When your problem has been resolved, mark the thread SOLVED by clicking on the Thread Tools dropdown menu at the top of the thread.

  3. #3
    VBAX Regular
    Joined
    Oct 2009
    Posts
    7
    Location
    Hi MWE! Thanks so much for your reply and for your assistance. Is there anything I can do/offer to help with the effort?

  4. #4
    VBAX Expert
    Joined
    Feb 2005
    Posts
    929
    Location
    Quote Originally Posted by dygarth
    Hi MWE! Thanks so much for your reply and for your assistance. Is there anything I can do/offer to help with the effort?
    No problem, I enjoy this kind of thing. I have written dozens of stat-oriented procedures in VBA including lots of SixSigma and DFSS stuff and a complete Monte Carlo simulation package that works in any application. I have probably the only version of MSProject that has a complete MonteCarlo simulation capability built-in.

    I have written the Mann-Whitney procedure but need to do a little testing to ensure that it is generating correct results. Do you want just the U statistic or do you want the "next step" which is acceptance or rejection of Ho for some confidence interval and DOFs?
    "It's not just the due date that's important, it's also the do date" [MWE]

    When your problem has been resolved, mark the thread SOLVED by clicking on the Thread Tools dropdown menu at the top of the thread.

  5. #5
    VBAX Regular
    Joined
    Oct 2009
    Posts
    7
    Location
    Thanks; I admire your passion. I guess I envision a Mann-Whitney UDF that is somewhat analogous to Excel's TTEST, in that it accepts two arrays of values as inputs, perhaps an option for one-tailed or two-tailed, and then returns a p-value (ideally, adjusted for ties) of the resultant statistical test. In my mind, any subsequent acceptance or rejection of Ho should be performed by the user (me!) given their desired alpha level.

    If I can provide any additional input, please don't hesitate to ask. Once again, I really appreciate your help!
    Last edited by dygarth; 10-21-2009 at 07:19 PM.

  6. #6
    VBAX Mentor anandbohra's Avatar
    Joined
    May 2007
    Location
    Mumbai
    Posts
    313
    Location
    Quote Originally Posted by MWE
    No problem, I enjoy this kind of thing. I have written dozens of stat-oriented procedures in VBA including lots of SixSigma and DFSS stuff and a complete Monte Carlo simulation package that works in any application. I have probably the only version of MSProject that has a complete MonteCarlo simulation capability built-in.

    I have written the Mann-Whitney procedure but need to do a little testing to ensure that it is generating correct results. Do you want just the U statistic or do you want the "next step" which is acceptance or rejection of Ho for some confidence interval and DOFs?
    we will be glad if you share your work with us & provide access to your efforts.
    Knowledge is the only thing which increase is you spread it.
    Always Mark your Thread as Solved the moment u got acceptable reply (located under Thread tools)
    Practice this & save time of others in thinking for unsolved thread

  7. #7
    VBAX Expert
    Joined
    Feb 2005
    Posts
    929
    Location
    Quote Originally Posted by anandbohra
    we will be glad if you share your work with us & provide access to your efforts.
    Knowledge is the only thing which increase is you spread it.
    What I write for forum needs and general purpose stuff is shared. I have 30 or so items in the KB as well. But some of my code is the result of significant work over many years, contains non-trivial engineering knowledge and I use it in my consulting business. I am retired, but do some consulting in systems engineering, SixSigma, Lean, DFSS, project management, etc. I know what I am doing, but some/most of my code is not as bullet-proof as I would like it to be for others to use. It is as good as most commercials applications, but not good enough for me to sleep well nights. It is all too simple for folks to grab some code and generate conclusions that are bogus.

    So, if you have specific questions, ask me.
    "It's not just the due date that's important, it's also the do date" [MWE]

    When your problem has been resolved, mark the thread SOLVED by clicking on the Thread Tools dropdown menu at the top of the thread.

  8. #8
    Distinguished Lord of VBAX VBAX Grand Master Bob Phillips's Avatar
    Joined
    Apr 2005
    Posts
    25,453
    Location
    I fail to see how 'develop a procedure to compute the relevant statistic(s)' and getting back to the OP would be any different from posting it here. If anyone uses it, it is in the knowledge that it was given free, with no warranties of any kind.
    ____________________________________________
    Nihil simul inventum est et perfectum

    Abusus non tollit usum

    Last night I dreamed of a small consolation enjoyed only by the blind: Nobody knows the trouble I've not seen!
    James Thurber

  9. #9
    VBAX Expert
    Joined
    Feb 2005
    Posts
    929
    Location
    Quote Originally Posted by xld
    I fail to see how 'develop a procedure to compute the relevant statistic(s)' and getting back to the OP would be any different from posting it here. If anyone uses it, it is in the knowledge that it was given free, with no warranties of any kind.
    I do not follow you. This thread is about a specific request for a Mann-Whitney procedure which I will post "here" as part of an xls file when I am done ... as I have posted lots of stuff in the past

    I think you are confusing the specific request from dygarth for the M-W procedure with what I interpreted to be a much more general request from anandbohra to share my stat/math/engr libraries and applications which I use in my consulting business.
    "It's not just the due date that's important, it's also the do date" [MWE]

    When your problem has been resolved, mark the thread SOLVED by clicking on the Thread Tools dropdown menu at the top of the thread.

  10. #10
    Distinguished Lord of VBAX VBAX Grand Master Bob Phillips's Avatar
    Joined
    Apr 2005
    Posts
    25,453
    Location
    I interpreted it very differently. I think he had the same thoughts as me, that you were going to handle it off-line.

    If you post it here, great, I too would like to see it.
    ____________________________________________
    Nihil simul inventum est et perfectum

    Abusus non tollit usum

    Last night I dreamed of a small consolation enjoyed only by the blind: Nobody knows the trouble I've not seen!
    James Thurber

  11. #11
    Site Admin
    Urban Myth
    VBAX Guru
    Joined
    May 2004
    Location
    Oregon, United States
    Posts
    4,940
    Location
    You're the best MWE.

  12. #12
    VBAX Expert
    Joined
    Feb 2005
    Posts
    929
    Location
    dygarth: Attached is a spreadsheet with the Mann-Whitney function and a short test. The method I used is a combination of what was originally proposed and what MiniTab does. In particular:
    • the two arrays are merged and sorted
    • ranks are determined and adjusted for any ties
    • the Mann-Whitney statistic W is computed
    • the normal approx statistic Zw is computed based on Ha (the alternate hypothesis) and adjusted for any ties
    • a t dist is used to extract the p value for a given Zw, DOFs and one or two tail test (driven by the Ha)
    It all seems to makes sense and generates W and Zw values identical to what I have found in examples, but the p values are slightly off (they seem to be high). MiniTab talks about the use of a "continuity correction factor" in their computation of p values, but I do not know what that is or how/if it applies.

    Take a look and let me know what you think. For general use, remove the lines in the M-W sub that write values to the activesheet; they are there only for the test proc.
    "It's not just the due date that's important, it's also the do date" [MWE]

    When your problem has been resolved, mark the thread SOLVED by clicking on the Thread Tools dropdown menu at the top of the thread.

  13. #13
    Moderator VBAX Wizard Aussiebear's Avatar
    Joined
    Dec 2005
    Location
    Queensland
    Posts
    5,064
    Location
    Very well done MWE.
    Remember To Do the Following....
    Use [Code].... [/Code] tags when posting code to the thread.
    Mark your thread as Solved if satisfied by using the Thread Tools options.
    If posting the same issue to another forum please show the link

  14. #14
    VBAX Regular
    Joined
    Oct 2009
    Posts
    7
    Location
    Hi MWE,
    Thanks so much for posting this Mann-Whitney implementation! I will review it, test it with my own reference data, and let you know how it goes asap. Thanks again for your assistance with this statistic.
    ==Dygarth

  15. #15
    VBAX Regular
    Joined
    Oct 2009
    Posts
    7
    Location
    Hello again MWE,

    I have been reviewing and testing your Mann-Whitney implementation. First, thanks for making it so easy to understand. It has made the testing and debugging process much easier. Second, great job!

    During my review, I believe that I have found two subtle bugs in the code:
    (1) In the SortTies subroutine, the routine did not properly handle tie runs that extended to the end of the input array X;
    (2) In the MannWhitney subroutine, the computation of Zw within the 'SomeTies = True' Case had a slight error: the last occurrence of "(Nz + 1)" should actually be "(Nz - 1)".

    In the revised Excel file attached, I have made these two corrections as well as a few other tweaks. I have been testing this version of the routine against some datasets I had that were previously run in StatGraphics. As with your experience, the p-values were close but not exact. Still not sure why.

    One other thing...I would actually like to use this routine as a UDF within Excel, but I can't seem to convert it so that it works properly. Ideally, I'd like to use it in a manner similar to TTEST, with a calling structure something like: =MWTEST(X,Y,nTails). Any suggestions?

    Thanks again for all your help.
    ==Dygarth

  16. #16
    VBAX Regular
    Joined
    Oct 2009
    Posts
    7
    Location
    I just ran across another subtle issue that causes the MannWhitney routine to throw a runtime error: The routine fails if there are *no* ties within the combined arrays. To resolve this issue, I converted SortTies from a sub to a function that returns a boolean True if any ties were found, or False if no ties were found. I then altered the MannWhitney routine to accommodate this change accordingly. See attached. Enjoy!
    ==Dygarth

  17. #17
    VBAX Expert
    Joined
    Feb 2005
    Posts
    929
    Location
    Well, 3 mistakes is not so bad. Or, rather, you correctly identified the 3 bugs I left in for you to fix

    So, we have both gained something from this.

    Regarding your last question about converting the MW sub to a function and calling it like any other Excel function. The rules for calling VBA functions from an Excel cell are approx:
    1. the function can only return a single value
    2. the function can not alter any cells other than the calling cell
    BUT, most VBA subs that work with arrays have to be rewritten internally because arrays and indices for passed arguements do not work (well, they have never worked for me). You have to use an "items in objects" approach, for example to figure out the length of X and Y to redim Z:
    [vba] Nx = 0
    For Each Item In X
    Nx = Nx + 1
    Next Item
    Ny = 0
    For Each Item In Y
    Ny = Ny + 1
    Next Item
    Nz = Nx + Ny
    ReDim Z(1 To Nz)
    ReDim Ztag(1 To Nz)
    ReDim Rank(1 To Nz)[/vba]
    I tweaked the MW sub into a function with MWP(X,Y,Ha) that returns the p value using the items in object approach and commented out the lines that wrote back to Sheet1 so as not to violate rule #2 above. I added a call to MWP in the test sub and MWP works fine when called from another VBA proc. I then added a call from cell J45: =MWP(A2:A40,B2:B43,"X<>Y") and got the correct p value. See attached.
    "It's not just the due date that's important, it's also the do date" [MWE]

    When your problem has been resolved, mark the thread SOLVED by clicking on the Thread Tools dropdown menu at the top of the thread.

  18. #18
    VBAX Regular
    Joined
    Oct 2009
    Posts
    7
    Location
    Perfect! Thanks so much! It works very nicely. I truly appreciate your help!
    ==Dygarth

  19. #19
    VBAX Expert
    Joined
    Feb 2005
    Posts
    929
    Location
    Quote Originally Posted by dygarth
    Perfect! Thanks so much! It works very nicely. I truly appreciate your help!
    ==Dygarth
    Glad to help. Let me know if you ever figure out why the p-values extracted from the Excel's tdist are not quite right.
    "It's not just the due date that's important, it's also the do date" [MWE]

    When your problem has been resolved, mark the thread SOLVED by clicking on the Thread Tools dropdown menu at the top of the thread.

  20. #20
    VBAX Regular
    Joined
    May 2017
    Posts
    19
    Location
    Hi

    This sounds very interesting. I would like to test the excel implementation results against R.

    How can I download the final code/workbook please?

    Thanks

    W

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •