PDA

View Full Version : How to code a boolean expression parser?

rlv
04-27-2013, 09:36 AM
I'm looking for guidance in implementing a VBA boolean expression parser. I have to read a range of cells which contain boolean expressions of the form:

Condition_A AND ((Condition_B OR Condition_C) AND NOT (Condition_E OR Condition_F))

And resolve them to a boolean result. The valid operators are AND, OR, and NOT along with "()" for nesting/grouping. The operands are all booleans, i.e. Condition_A can only be True or False.
Is there a standard approach to this?

SamT
04-27-2013, 12:52 PM
I'm not sure what you're asking. A parser is something that can break a complex statement into individual clauses for ease of understanding the entirety.

mikerickson
04-27-2013, 01:34 PM
One way to approach this is to create some formation rules for boolean expressions.

One set of formation rules might be:

If a and b are well formed formulas (wffs) then

(a AND b) is a wff
(a OR b) is a wff
(NOT a) is a wff

Notice that this treats AND and OR as binary operators,

(a AND b AND c) is not a wff
written properly it would be (a AND (b AND c)) or ((a AND b) AND c)

This makes things easier, because we can quickly add in non-associative operators (like IMP, EQV or XOR) without having to write special rules for them.

The parser would be given a string like
(((x<10) OR (y<100)) AND (x+y<>100)))
break it down into its components
((x<10) OR (y<100)) and (x+y<>100)
and then break each of those down into sub-components and sub-sub-components until we reach the atomic conditions
(x<10), (y<100) and (x+y<>100)

The advantge of requiring full parenthisation is that to find the first level components of an expression all one needs to do is remove the leading ( and trailing ) and then start counting ('s and )'s. As soon as there are as many ( as ), we know that that is the first component and the remainder is the other component.

rlv
04-27-2013, 02:10 PM
I like the idea of the wff as it sounds like it would make life easier. Unfortunately I'm dealing with a legacy database that has nearly 4000 decidedly un-wff equations. The existing expressions rely on associativity and order of precedence. It's the code to break the equation down into groups that I'm not sure how to go about tackling. You know, reading though the string and identifying operators operands and in which order to evaluate them

I could write a simple program

Sub MyBooleanEvaluation()

Dim A As Boolean, B As Boolean, C As Boolean, D As Boolean
Dim MyResult As Boolean

A = True
B = False
C = False
D = True

MyResult = (A And Not B) And Not C And D
Debug.Print MyResult ' True

MyResult = A And (Not B) And (Not C) And D
Debug.Print MyResult ' Also True
End Sub

And VBA parses the equation and properly evaluates it, even if not wff. It knows about operators and what to do with expressions nested in brackets. It's that same kind of under the hood equation processing that I want to implement so that I can evaluate the result of an expression string read from a spreadsheet cell.

A Conceptual example:

Function IsExpressionTrue(BooleanEquationString As String) As Boolean
' a bunch of code here
End Function

Sub TestTheExpression()
Dim ExpressionString1 As String, ExpressionString2 As String

ExpressionString1 = "Condition1 AND Condition2 OR (Condition3 AND (Condition4 OR Condition6))"
ExpressionString2 = "(Condition1 OR Condition2) AND (Condition3 AND NOT Condition4)"

If IsExpressionTrue(ExpressionString1) And IsExpressionTrue(ExpressionString1) Then
MsgBox "Yay, the criteria is satisfied"
End If
End Sub

Where:

Condition1, Condition2, Condition3, Condition4, & Condition6 are Boolean variables I will be evaluating per whatever expression is defined by the expression string in the cell. I already have code that will go out and get the current value for these variables. The part I need help with is evaluating the Boolean expression string, particularly how to deal with the parts of the equation within nested brackets. I figure this must be a common programming problem and I would like to follow the standard approach and avoid re-inventing the wheel (unless I have to).

SamT
04-27-2013, 02:43 PM
How about some examples of the Condition expressions? Are they themselves booleans or...?

=If(AND(A3>3,B2<5),If(OR((C6+D7<5),(F9-G2>14),"True","False"),42)
Conditions:
A3>3
B2<5
C6+D7<5
F9-G2>14

rlv
04-27-2013, 05:17 PM
How about some examples of the Condition expressions? Are they themselves booleans or...?

Here is an exact representation of the kind of string in the database:

"Condition1 AND Condition2 OR (Condition3 AND (Condition4 OR Condition6))"

The actual tag names in the database are not "Condition1" etc. They all have other names, but I'm trying to keep this simple to illustrate the underlying concept.

Condition1, Condition2, Condition3, Condition4, etc are all Boolean variables . They have an assigned value, either "True" or "False" so if the string were, for example,

"Condition1 OR Condition2"

and Condition1 has an assigned boolean value of "True" , while Condition2 has an assigned boolean value of "False", then the equation should evaluate to "True".

SamT
04-27-2013, 07:50 PM
And here is an exact representation of the solution:

If String = True Then True Else False

My friend, getting information from you is like dealing with a ghost in a heavy fog at twilight. One just knows something is there, but it's so vague and undefined and keeps shifting around so much that one has no clue about its true shape. :(

First it was a range, then it's a DB, and now it's tags

rlv
04-28-2013, 07:31 AM
SamT, I too am frustrated since from my end it appears we are talking past each other. With respect to your comment "First it was a range, then it's a DB, and now it's tags", sarcasm is not helpful. Does it really matter whether the boolean expressions are in a excel range or a in database? Is it so hard to imagine that an Excel VBA application might open an Access DB recordset and pull a table of boolean expressions into a range of cells for processing? I suppose the details of how that happens would have some bearing if I were asking how to read data from an Access DB, but that's not the case here.

I appreciate that you took the time to look at my post. Thank you. Your first response came the closest to grasping the issue.

I'm not sure what you're asking. A parser is something that can break a complex statement into individual clauses for ease of understanding the entirety.

That's essentially what I am going to do - parse/tokenize a complex statement. In this case the complex statements are a series of strings, each in the form of a Boolean equation (not a VBA boolean equation, but a boolean equation none the less) and correctly & reliably evaluate the boolean result irrespective of the number of variables, the number of sub-groupings as defined by parenthesis, or their location within the string. My hope was that someone here might have experience in coding an expression parser and could offer advice. I have some ideas about how to proceed, but I wanted to check in with some other programmers, and scan the KB here in case there was a standard approach. Again, thanks for making the effort to respond.

SamT
04-28-2013, 08:28 AM
To evaluate the entire string: Assuming Excel knows the conditions.
Sub test()
Dim Cel As Range
Dim CelStr As String
Const ClrTrue As Long = 4
Const ClrFalse As Long = 3
For Each Cel In UsedRange
CelStr = Cel.Text
If Evaluate("=" & CelStr) Then
Cel.Interior.ColorIndex = ClrTrue
Else
Cel.Interior.ColorIndex = ClrFalse
End If
Next Cel
End Sub

snb
04-28-2013, 09:45 AM
Why not posting a sample workbook ??

SamT
04-28-2013, 11:01 AM
I still don't know for a fact that he's even using Excel ranges, I am assuming so because he said in his last post

Is it so hard to imagine that an Excel VBA application might open an Access DB recordset and pull a table of boolean expressions into a range of cells for processing?