Hi TBQuerier,
I think the ThinBASIC Tokenizer module can do what you ask for. Have a look at this basic example (derived from code by Eros):
Uses "Console", "Tokenizer"

Function TBMain()
  String MyBuffer          ' -- Will contain string buffer to be parsed
  Long   CurrentPosition   ' -- Current buffer pointer position
  Long   TokenMainType     ' -- Will contain current token main type
  String Token             ' -- Will contain current string token
  Long   TokenSubType      ' -- Will contain current token sub type

  ' -- Parser tuning: register custom keywords and treat ";" as a line separator
  %CustomKeywords       = 100
  %CustomKeyword_Var    = 1
  %CustomKeyword_String = 2
  Tokenizer_KeyAdd("VAR"   , %CustomKeywords, %CustomKeyword_Var)
  Tokenizer_KeyAdd("STRING", %CustomKeywords, %CustomKeyword_String)
  Tokenizer_Default_Set(";", %TOKENIZER_DEFAULT_NEWLINE)

  ' -- Prepare text for parsing
  MyBuffer = "var sVar : string;" + _
             "sVar := ""whats happening"""

  ' -- Init current buffer position. THIS IS IMPORTANT
  CurrentPosition = 1

  ' -- Loop until the tokenizer reports the end of the buffer
  While TokenMainType <> %TOKENIZER_FINISHED
    ' -- The most important point here is that every parameter passed must be
    '    a single variable and not an expression. This is necessary because
    '    parameters are passed by reference in order to return information
    '    about the token.
    ' --
    '    MyBuffer         must contain the string you want to parse
    '    CurrentPosition  must be initialized to 1; after execution it contains
    '                     the position just after the current token
    '    TokenMainType    on exit, contains the main type of the token found
    '    Token            on exit, contains the string representation of the token found
    '    TokenSubType     on exit, contains the sub type of the token found (if relevant)
    ' --
    Tokenizer_GetNextToken(MyBuffer, CurrentPosition, TokenMainType, Token, TokenSubType)

    ' -- Write some info
    PrintL LSet$(Token, 32) + DecodeType_ToString(TokenMainType, TokenSubType)
  Wend

  PrintL "Press any key to quit..."
  WaitKey
End Function
Function DecodeType_ToString( nType As Long, nSubType As Long ) As String
  String sResult

  Select Case nType
    Case %TOKENIZER_FINISHED
      Return "Tokenizer finished..."
    Case %TOKENIZER_ERROR
      sResult = "Error"
    Case %TOKENIZER_UNDEFTOK
      sResult = "Undefined token"
    Case %TOKENIZER_EOL
      sResult = "End of line"
    Case %TOKENIZER_DELIMITER
      sResult = "Delimiter"
    Case %TOKENIZER_NUMBER
      sResult = "Number"
    Case %TOKENIZER_STRING
      sResult = "String"
    Case %TOKENIZER_QUOTE
      sResult = "Quoted"
    Case %CustomKeywords
      sResult = "Custom keyword / " + Choose$(nSubType, "%CustomKeyword_Var", "%CustomKeyword_String")
  End Select

  Return sResult
End Function
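
If the goal is to act on the custom keywords rather than only print every token, the same loop can branch on the main type and sub type. The following is a minimal, untested sketch that reuses only the Tokenizer calls from the example above; the KeywordCount variable and the printed messages are made up for illustration.

```thinbasic
Uses "Console", "Tokenizer"

Function TBMain()
  String MyBuffer
  Long   CurrentPosition
  Long   TokenMainType
  String Token
  Long   TokenSubType
  Long   KeywordCount   ' -- hypothetical counter, not part of the original example

  ' -- Same custom keyword setup as in the example above
  %CustomKeywords       = 100
  %CustomKeyword_Var    = 1
  %CustomKeyword_String = 2
  Tokenizer_KeyAdd("VAR"   , %CustomKeywords, %CustomKeyword_Var)
  Tokenizer_KeyAdd("STRING", %CustomKeywords, %CustomKeyword_String)

  MyBuffer        = "var sVar : string"
  CurrentPosition = 1

  While TokenMainType <> %TOKENIZER_FINISHED
    Tokenizer_GetNextToken(MyBuffer, CurrentPosition, TokenMainType, Token, TokenSubType)

    ' -- Only react to tokens registered through Tokenizer_KeyAdd
    If TokenMainType = %CustomKeywords Then
      KeywordCount = KeywordCount + 1
      Select Case TokenSubType
        Case %CustomKeyword_Var
          PrintL "Declaration keyword: " + Token
        Case %CustomKeyword_String
          PrintL "Type keyword: " + Token
      End Select
    End If
  Wend

  PrintL "Custom keywords found: " + Str$(KeywordCount)
  WaitKey
End Function
```

All other token types are simply skipped here; in a real parser you would handle them in further branches of the same loop.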
You can find two more examples in ThinBasic/SampleScripts/Tokenizer.
Petr