Excel

Parsing a chemical formula and determining the atomic counts

Ease of Use

Intermediate

Version tested with

97, 2000, 2002,2003 

Submitted by:

byundt

Description:

A traditional formula is simplified and subscripted, with atomic counts listed in a table. 

Discussion:

Chemical formulas are normally written to convey structure, and frequently include radicals or repeating groups inside parentheses. But gas chromatograph results are often stated using atomic counts (each element listed once). So it is desirable to do the following: 1) Simplify a traditional formula 2) List the number of atoms in the molecule in a table 3) Convert the numbers in a formula into subscripts The first two tasks are performed by user-defined functions; the third is done by a macro sub. The code allows elements to be listed more than once in the traditional formula, as well as multiple levels of parentheses (for radicals or repeating groups). 

Code:

instructions for use

			

Option Explicit Function GetFormula(ChemFormula As String, Elements As Variant, Optional Multiplier As Long, Optional Atoms As Variant) As String 'Returns a simplified formula (each element listed only once) given a traditional chemical formula Dim i As Long, First As Long, Last As Long Dim frmla As String, s As String If Multiplier = 0 Then Multiplier = 1 On Error Resume Next s = Elements.Address On Error GoTo 0 If s = "" Then 'Test whether function called from worksheet or VBA First = LBound(Elements) Last = UBound(Elements) Else First = 1 Last = Elements.Cells.Count End If If IsMissing(Atoms) Then ReDim Atoms(First To Last) Atoms = GetElements(ChemFormula, Elements, Multiplier, Atoms) For i = LBound(Atoms) To UBound(Atoms) 'If Atoms(i) = 1 Then frmla = frmla & Elements(i) & Atoms(i) & " " 'Show a 1 after elements with just one atom in ChemFormula If Atoms(i) = 1 Then frmla = frmla & Elements(i) & " " 'Don't show a 1 after elements with just one atom in ChemFormula If Atoms(i) > 1 Then frmla = frmla & Elements(i) & Atoms(i) & " " Next GetFormula = frmla End Function Function GetElements(ChemFormula As String, Elements As Variant, Optional Multiplier As Long, Optional Atoms As Variant) As Variant Dim RgExp As Object, MatchCollection As Object, oMatch As Object, RegSub As Object Dim First As Long, i As Long, iLeft(1 To 7) As Long, iRight(1 To 7) As Long, _ j As Long, Last As Long, LenFrmla As Long, n As Long, nest As Long Dim frmla As String, s As String, sTemp As String If Multiplier = 0 Then Multiplier = 1 LenFrmla = Len(ChemFormula) For i = 1 To Len(ChemFormula) If Mid(ChemFormula, i, 1) = "(" Then nest = nest + 1 iLeft(nest) = i End If If Mid(ChemFormula, i, 1) = ")" Then iRight(nest) = i nest = nest - 1 If nest = 0 Then If IsNumeric(Mid(ChemFormula, iRight(1) + 1, 1)) Then n = Multiplier * Val(Mid(ChemFormula, iRight(1) + 1)) 'Multiplier for radical inside parentheses j = Len(n & "") Else n = 1 End If frmla = GetFormula(Mid(ChemFormula, iLeft(1) + 1, iRight(1) - iLeft(1) - 1), Elements, n, Atoms) ChemFormula = Left(ChemFormula, iLeft(1) - 1) & Mid(ChemFormula, iRight(1) + 1 + j) i = iLeft(1) - 1 LenFrmla = Len(ChemFormula) End If End If Next On Error Resume Next sTemp = Elements.Address On Error GoTo 0 If sTemp = "" Then 'Test whether function called from worksheet or VBA First = LBound(Elements) Last = UBound(Elements) Else First = 1 Last = Elements.Cells.Count End If If IsMissing(Atoms) Then ReDim Atoms(First To Last) Set RgExp = CreateObject("VBScript.RegExp") For i = First To Last If Len(Elements(i)) > 1 Then s = s & Elements(i) & "|" Next For i = First To Last If Len(Elements(i)) = 1 Then s = s & Elements(i) & "|" Next RgExp.Pattern = "(" & Mid(s, 1, Len(s) - 1) & ")(\d+|)" RgExp.Global = True Set MatchCollection = RgExp.Execute(ChemFormula) For Each oMatch In MatchCollection Set RegSub = oMatch.submatches i = Application.Match(RegSub(0), Elements, 0) + First - 1 If RegSub(1) = "" Then Atoms(i) = Atoms(i) + Multiplier Else Atoms(i) = Atoms(i) + CLng(RegSub(1)) * Multiplier End If Next GetElements = Atoms Set oMatch = Nothing Set MatchCollection = Nothing Set RgExp = Nothing Set RegSub = Nothing End Function Sub FormulaSubscripting() Dim cel As Range Dim RgExp As Object, oMatch As Object, oMatches As Object Dim i As Long, n As Long Set RgExp = CreateObject("VBScript.RegExp") RgExp.Pattern = "\d+" RgExp.Global = True For Each cel In Selection If cel <> "" Then n = 1 'Single cell array formulas can be converted, but not multi-cell formulas If cel.HasArray Then n = cel.CurrentArray.Cells.Count If n = 1 Then i = 0 cel.Formula = cel.Value Set oMatches = RgExp.Execute(cel) If oMatches.Count > 0 Then For Each oMatch In oMatches i = InStr(i + 1, cel, oMatch) cel.Characters(i, Len(oMatch)).Font.Subscript = True Next End If End If End If Next Set oMatch = Nothing Set oMatches = Nothing Set RgExp = Nothing End Sub

How to use:

  1. Select and copy above code (Ctrl + C).
  2. From Excel, open the Visual Basic Editor (VBE) with ALT + F11.
  3. Select any file on left, choose Insert-Module from the menu.
  4. Paste code in right pane (Ctrl + V).
  5. Close the VBE with ALT + Q.
  6. Note that the GetFormula and GetElements functions call each other. If you just want to use one of them, you must install the other as well. The SubscriptFormula sub is completely independent.
  7. The GetElements function requires the RegExp object, which is normally installed with Internet Explorer 5 or later.
  8. The SubscriptFormula sub won't process cells belonging to multi-cell array formulas.
 

Test the code:

  1. To use the SubscriptFormula sub, select the cells to be processed.
  2. ALT + F8 to open the macro selector, select the SubscriptFormula sub, then click the "Run" button. Note that any existing formulas in these cells will be eliminated!
  3. To use the GetFormula function, you will need a list of atomic symbols in the desired order for the final formula. You then create a formula like this: =GetFormula(A1,Elements) where A1 contains the traditional formula and Elements is a named range containing the list of elements.
  4. To use the GetElements function, you need a table whose captions are the atomic symbols. Select the cells in a row of that table, then enter the array formula =GetElements(A1, Elements) where A1 contains the traditional formula and Elements is a named range containing the list of elements. Because it is an array formula, you must hold the Control and Shift keys down while pressing Enter!
 

Sample File:

ChemicalParsing.zip 17.24KB 

Approved by mdmackillop


This entry has been viewed 127 times.

Please read our Legal Information and Privacy Policy
Copyright @2004 - 2020 VBA Express