PDA

View Full Version : Splitting, Sorting and Sampling



rangudu_2008
12-14-2008, 05:11 AM
Hi all,
I'm trying to implement a randomized way of sampling of alerts (generated by from a transaction monitoring system) processed by an user.

The sampling process is based on certain conditions and i want to automate it. It should be implemented in such a way that it is flexible enough to over-rule the conditions at certain times.
This entire process is done on a fortnightly/monthly basis (as it is a quality check being done on alerts & cases processed.

The sampling is being done based on certain sampling rates and is classified based on a certain value called the risk weightage (in short called the RWT value). The end user who performs the quality check process should also have the option for choosing sample alerts & cases based on different sampling rate if the sampling conditions do not apply for the users.

My scenario is explained below:
There are many columns of data in my workbook, which are used for the sampling process. They are as below:

User Name
Rules (the Rule Name)
Record ID - [Unique Value]
RWT (the Risk Weight of the Rule) - A one or two digit number
Level - A single digit number - will be either 1 or 2 or 3Sampling is done for all users based on the risk weight of the alert & level.

Inorder to implement the randomized way of sampling alerts based on the conditions, for all levels (1, 2 & 3) and for all users/rules, frequent sorting/filtering of alerts have to be done. In simple words, for a certain user, samples are chosen based on the RWT of the alert for all the 3 levels.

For example, inorder to randomly choose sample alerts for USER A, the RWT value needs to be sorted in descending order first - followed by checking of the level. Both taken in combination determines the sampling rate (from the chart) based on which samples have to be chosen.

By a sample, i mean the entire row of data which needs to be exported to another sheet in the same workbook.

I already have done some basic coding for randomly sampling certain no. of alerts (specified by the end user) from an excel sheet & exporting it to another sheet based on the first column - whatever it may contain - Now i need to modify it accordingly inorder to the conditions as well and determine the sample rate at runtime. The option of choosing certain no. of alerts (say X) will be retained.

I've formulated a rule chart (whose range can be changed as new rules can also be included) that contains the RWT values & the sampling rate for each level, but i need to sort the entire sheet based on RWT value (in descending order) frequently and take sample for that particular rule with the highest risk weight.

The thing that baffles me here is that since the rules can trigger alerts in various combinations - I also need to take care of not choosing the same record twice for an user randomly.

Can anybody help me out to combine the code modules that i've developed?

Ranga

rangudu_2008
12-18-2008, 07:54 AM
In the attached workbook, the main columns that will be used are:

RWT column
Level ColumnTo randomly choose samples, it is required to sort the RWT column and apply filter to the Level column (in the RAW sheet) and export the rows chosen to another sheet.
With those sampling rates, the alerts with highest RWT value will form the sample space of records from which samples have to be picked.
How can this sorting and filtering be implemented effectively?

Ranga

Benzadeus
12-18-2008, 09:12 AM
Could you enumerate what you want for the macro do?
Like:
1 - Sort RWT column applying filter which criteria is *** in the Level column.
2 - Export data in new sheet.
3 - ...

rangudu_2008
12-18-2008, 10:47 AM
There are a set of rules with threshold limits for which alerts are generated in the monitoring system. An alert can either be generated in the system for breach of one rule or a combination of rules. Each rule has an RWT value associated with itself. These alerts are processed by various users in the system. Alerts can also be processed at a higher level in the system (by creating cases). These processed alerts (by the users) and cases are subjected to a quality check process in the system on a fortnightly basis.

I need to randomize the way of choosing samples (if processed alerts and cases) for the quality check process that is being performed every fortnight. The criteria for choosing sample records for the quality check process needs to be automated. I've mentioned the sampling rates and the criteria for the quality check process in my earlier post. Sampling for quality check is done mainly based on the RWT value.

The RWT value associated with an alert is nothing but the RWT value of the rule.
If an alert is generated for a combination of different rules, then the RWT value of the alert is the sum of RWT value of the all the rules together.
For eg., if rule1 has an RWT value of 6, rule2 has an RWT value of 7, then an alert generated for the combination of rule1 and rule2 will have an RWT value of 13.

I've some 50 columns in an excel workbook, from which i need to randomly choose rows of data and export them to another sheet. These exported rows of data will be used as samples for the quality check process.

There are a few columns which will be useful for implementing the random sampling methodology. They are:

Alert ID - [Unique Value] - which is processed by the users in the system
Rules (along with its description) - Rule/Rules for which an alert is generated in the system
RWT (Risk Weight of an Alert) - A one or two digit number - obtained by adding risk weight of the rule/rules for which the alert is generated in the system
User Name - The name of the user who processes the alert
Level - A single digit number - will be either 1 or 2 or 3 - denotes the hierarchy level for which alerts are generated in the system.
Level Hierarchy - 3 - high, 2 - medium, 1 - lowFor the quality check process, sampling is done based on the following conditions:

All new users of the system with <= 6 months vintage for upto 6 months;
For all new rules activated in the system (to generate alerts), all users are subjected to the quality check process for upto 3 months.For each level, sampling rates differ - for alerts it differs between 6%, 8% and 10%; for cases it is 10% for all the 3 levels. To choose samples i need to sort & filter out rows from the workbook for each level.

I also need to take care of choosing sample records when the sampling conditions fail - either when there are no new users in the system or there are no new rules activated in the system. In such a scenario, samples may need to be chosen based on a sample rate specified by the user who carries out the random sampling process.

How can this sorting, splitting and sampling process be implemented effectively? Can anybody help me out?

Ranga

rangudu_2008
12-23-2008, 10:02 AM
My basic idea for randomly choosing samples, is by sort the entire sheet of data based on the RWT column in descending order first; this will bring together the rows of data with the highest RWT value at the top.

Now having formed the sample space of records from which samples have to be chosen randomly for all three levels and for all users, samples can be chosen level-wise for each user. This requires applying filter to the Names column (inorder to select one person at a time) and then filtering by the level column (to pick samples for each level). Each randomly chosen row of data should be consolidated to one sheet.

Feeding data to the macro according to the conditions (which i had mentioned earlier) will be taken care by the user while feeding data before running the macro.
While sampling for all new users of the system, those those records that are processed by new users will be taken. For new rules activated in the system, records pertaining to those new rule/rules processed by any user of the system will be taken up for sampling.

How can this way of sampling be implemented? Can anybody help me?

Ranga

rangudu_2008
12-28-2008, 11:11 PM
Listed below the enumerated steps what the macro should do:

Sort RWT column in descending order first.
Now apply filter to the Names column and by choosing an user once at a time, randomly choose sample rows for each Level by applying a secondary filter on the Level column which contains any one of the 3 values (1, 2 or 3).
All users should be sampled randomly in this way as mentioned in the above step level-wise and all the randomly chosen rows of data should be copied and listed in another sheet in the same workbook.The quality check process is based on rules and users, as mentioned below:

All new users of the system with <= 6 months vintage for upto 6 months;
For all new rules activated in the system (to generate alerts), all users are subjected to the quality check process for upto 3 months.To sample out rows of data based on the above conditions, i decided to split the data that is fed into this macro for processing in two ways:

Alerts processed by all new users is fed directly;
Alerts generated and processed by all users for all new rules is also fed later.I wrote code for randomly picking out rows of data, but the logic i used was different and it needs to be modified as per the above requirement.

Since alerts are generated for multiple rules (different combinations), i tried to add another extra column that contains only the rule codes in the data that is fed into this macro (in order to get the sampling rate for alerts generated for multiple rules). I wrote the code to split the rule codes alone.

I tried implementing this randomly sampling methodology by formulating a chart that lists out the set of sampling percentages for each level for any alert that is processed for the rules. I'm trying to integrate all the codes together and i?ve got some few other issues to fix.

Can anybody help me out with this macro as i've some timelines to follow?
In case of any more explanations kindly drop me a mail at rangamails@yahoo.com


Ranga

rangudu_2008
12-29-2008, 05:37 AM
Hi VBAX mates,
I've coded for picking out sample rows of data from an Excel sheet and to split out codes from a specific column. Just wanna modify it further to suit my requirements.

I've five different columns in a sheet as follows:

Names - Contains the list of User Names
Record ID - Contains unique IDs
Notes - Description that is related to the alert ID
RWT - Weightage of the alert that's based on codes specified in Notes column
Level - Level for which the alert ID was generated (Values - 1, 2 or 3)I want to pick out whole rows of data (alert IDs) for different users based on RWT value for all levels.

The manual process done is as follows:

Sort the whole sheet based on RWT column (in descending order)
Apply filter based on Names (for each user) and for each level (1, 2 and 3)
Pick out specific rows for each level and for each user based on the rule.After sorting the RWT column in descending order, apply filters to sort by Names first and then by level. Can this be done automatically at the click of button via a VBA module in Excel 2003? I'm not able to accomplish this multiple filtering via VBA.

More generally my question is,
Is it possible in Excel to apply filters to multiple columns via VBA at runtime?

Kenneth Hobs
12-29-2008, 08:49 AM
Data > Filter > Advanced Filter

For a VBA example, see: http://vbaexpress.com/forum/showthread.php?t=24005

lucas
12-29-2008, 10:38 AM
I'm lost, why are there two workbooks. The one Rule Splitting v3 has columns similar to what you list but no ID column.

lucas
12-29-2008, 10:48 AM
I just had a look at this thread also started by you:
Splitting, Sorting and Sampling

You appear to be asking the exact same question....only you have started a new thread.

I am merging the two threads and will ask you politley not to start mulltiple threads asking the same question. It is confusing and annoying and will eventually cause all of your questions to be ignored.

rangudu_2008
12-29-2008, 07:13 PM
Hey Lucas,
I was looking for attachments while posting for the same. Had attached the wrong one. Right one attached.

Actually my macro has about 55 columns of data which i'll use. I've included only the main columns that are used for my random sampling. The other file Random Extract.xls has the general coding for picking out random samples from a sheet. I need to make modifications to it to suit my requirements more.

Ranga

rangudu_2008
01-15-2009, 09:22 PM
Attaching a document and few files which i've coded..