3
\$\begingroup\$

I have written a VBA function that will do parsing / replacement according to tokens / placeholders.

Example:
Input string: Username %u, date is %d.
The function would replace %d with the current date and %u with the current username; the output would look like:
Username Andy, date is 20170820.

Sounds simple to implement, but here's a twist: Regexps and Replace() are not safe. In the example above, if %u after replacement would contain another token (let's say Andy%d), there will be a recursive replacement and the output will mangled: Username Andy20170820, date is 20170820.

I could write this efficiently in C++ or some other "proper" language but I'm relegated to the VBA. I don't want to work on Chars inside a string as that doesn't look very efficient (and I might be using this formula to parse 10000 lines in an Excel sheet).

Here's my solution; it works, looks good and stable and allows for easy extension / customisation, but I'm not sure I have written it in the most efficient way (and in the best style). Your input would be very much appreciated :)


Function AI_ParseMagicSymbols(ByVal TextToParse As String) As String

'''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''
'' Replace magic symbols (placeholders) with dynamic data.
''
'' Arguments: a string full of magic.
''
'' Placeholders consist of one symbol prepended with a %:
''    %d - current date
''    %t - current time
''    %u - username (user ID)
''    %n - full user name (usually name and surname)
''    %% - literal % (placeholder escape)
''    Using an unsupported magic symbol will treat the % literally, as if it had been escaped.
''    A single placeholder terminating the string will also be treated literally.
''    Magic symbols are case-sensitive.
''
'' Returns:   A string with no magic but with lots of beauty.
''
'' Examples:
'' "Today is %d" becomes "Today is 2018-01-26"
'' "Beautiful time: %%%t%%" yields "Beautiful time: %16:10:51%"
'' "There are %zero% magic symbols %here%.", true to its message, outputs "There are %zero% magic symbols %here%."
'' "%%% looks lovely %%%" would show "%% looks lovely %%" - one % for the escaped "%%" and the second one for the unused "%"!
''
'' Alexander Ivashkin, 26 January 2018
'''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''

Dim sFinalResult As String
Dim aTokenizedString() As String
Dim sTempString As String
Dim sPlaceholder As String
Dim sCurrentString As String
Dim iIterator As Integer
Dim iTokenizedStringSize As Integer
Dim bThisStringHasPlaceholder As Boolean

' Default placeholder is "%"
Const cPlaceholderSymbol As String = "%"

aTokenizedString = Split(Expression:=TextToParse, Delimiter:=cPlaceholderSymbol)
iTokenizedStringSize = UBound(aTokenizedString())
bThisStringHasPlaceholder = False
sFinalResult = ""

For iIterator = 0 To iTokenizedStringSize
    sCurrentString = aTokenizedString(iIterator)

    If bThisStringHasPlaceholder Then
        If sCurrentString <> "" Then
            sPlaceholder = Left(sCurrentString, 1)
            sTempString = Right(sCurrentString, Len(sCurrentString) - 1)

            ' This is the place where the MAGIC happens
            Select Case sPlaceholder
                Case "d":
                    sCurrentString = Date & sTempString
                Case "t":
                    sCurrentString = Time & sTempString
                Case "u":
                    sCurrentString = Environ$("Username") & sTempString
                Case "n":
                    sCurrentString = Environ$("fullname") & sTempString
                Case Else:
                    sCurrentString = cPlaceholderSymbol & sCurrentString
            End Select
        Else
            ' We had two placeholders in a row, meaning that somebody tried to escape!
        sCurrentString = cPlaceholderSymbol
        bThisStringHasPlaceholder = False
    End If
End If

sFinalResult = sFinalResult & sCurrentString

If sCurrentString = "" Or (iIterator + 1 <= iTokenizedStringSize And sCurrentString <> cPlaceholderSymbol) Then
    ' Each string in the array has been split at the placeholders. If we do have a next string, then it must contain a magic symbol.

    bThisStringHasPlaceholder = True
    ' Even though it is called "...ThisString...", it concerns the NEXT string.
    ' The logic is correct as we will check this variable on the next iteration, when the next string will become ThisString.
Else
    bThisStringHasPlaceholder = False
End If

Next iIterator

AI_ParseMagicSymbols = sFinalResult

End Function
\$\endgroup\$
5
  • \$\begingroup\$ Welcome to CR! I hope you get good reviews! In the meantime I would suggest you read Making Wrong Code Look Wrong, an excellent article by Joel Spolsky about Hungarian Notation, which I'm sure is going to be brought up in the reviews you'll be getting. \$\endgroup\$ Commented Jan 26, 2018 at 21:12
  • \$\begingroup\$ Perhaps remove all '%' from the inputs (e.g. VBA's Replace(<InputString.>,"%",""). Perhaps not an elegant solution but that depends on what is considered 'safe' and how far surgical removal will go. \$\endgroup\$
    – AJD
    Commented Jan 27, 2018 at 6:00
  • \$\begingroup\$ Hi @AJD, thanks for you comment. Unfortunately I can't see how removing tokens would help :) What would be the next step after I remove all the "%"'s? \$\endgroup\$
    – Alexander
    Commented Jan 27, 2018 at 8:45
  • \$\begingroup\$ Removing tokens (only on inputs to the right of sCurrentString=[...] the would help because your example would be Andyd instead of Andy%d. Thus, the input is now safe from the problem you correctly identified. In terms of safety, There are other ways to address this but my draft (now discarded) answer just kept getting more and more complicated. \$\endgroup\$
    – AJD
    Commented Jan 27, 2018 at 19:49
  • \$\begingroup\$ As for the sanitizing data that is returned by functions - yes, I had thought about this, too; however, there is a flaw: you change the output! The % symbol could be an integral part of the data; even in my example with a username - if we make Andy%d look like Andyd, then we're basically talking about a different username! Another example: a recursive manual for that very function that would take: Use %z placeholder for Date. The function would replace %z with the actual placeholder: Use %d placeholder for Date. \$\endgroup\$
    – Alexander
    Commented Jan 29, 2018 at 9:03

1 Answer 1

1
\$\begingroup\$

My immediate idea, without looking at your code, was to use split. I tried to optimize your code by using some tricks:

  • To get rid of the problem with the double %, I replace this by a dummy-string (that will not appear in your input, hope you can accept this).
  • I leave the first token alone. Either the input starts with a % - in that case it's empty anyhow, or it doesn't, so we don't have to look at it at all.
  • At the end, the dummy-string is replaced back with a single %

I ended up with:

Function AI_ParseMagicSymbols(ByVal s As String) As String
    Const PlaceholderSymbol As String = "%"
    Const ReplaceEscapedChar As String = "<ESCAPEPERCENT>"

    s = Replace(s, PlaceholderSymbol & PlaceholderSymbol, ReplaceEscapedChar)
    Dim tokens() As String, i As Long
    tokens = Split(s, PlaceholderSymbol)

    AI_ParseMagicSymbols = tokens(0)         ' We don't treat the first token
    For i = 1 To UBound(tokens)
        Dim token As String, placeHolderChar As String, replaceStr As String
        token = tokens(i)
        placeHolderChar = Left(token, 1)

        ' This is the place where the MAGIC happens
        Select Case placeHolderChar
            Case "d":
                replaceStr = Date
            Case "t":
                replaceStr = Time
            Case "u":
                replaceStr = Environ$("Username")
            Case "n":
                replaceStr = Environ$("fullname")
            Case Else:
               replaceStr = "%" & placeHolderChar ' No magic here, keep % and the first char of token
        End Select

        AI_ParseMagicSymbols = AI_ParseMagicSymbols & replaceStr & Mid(token, 2)
    Next i

    AI_ParseMagicSymbols = Replace(AI_ParseMagicSymbols, ReplaceEscapedChar, PlaceholderSymbol)

End Function
\$\endgroup\$
1
  • \$\begingroup\$ Thanks @FunThomas, I like your (slightly) shorter version :) Although using any escape characters (or sequences) is not really "pure" ;) \$\endgroup\$
    – Alexander
    Commented Feb 12, 2018 at 14:06

Not the answer you're looking for? Browse other questions tagged or ask your own question.