Python Sets
14-10-2010, 07:39
Here is a Python program that uses the built-in data type, set, in a function
that determines whether a passed string qualifies as a valid thinBasic variable name.
Sets make certain operations very easy.
# file = ""
# fixed: 2010-10-15
# This is a Python script that determines which in a list of identifiers (strings) are valid thinBasic variable names.
# If you download it, change its name from, "", to, "".
# I made it using Python version 3.1.2.
# To run it, you need Python installed on your computer.
# To run it using versions before 3, remove the parentheses from the print statements;
# i.e., use, "print 'hello'", not, "print('hello')".
# To run it, you can open a command window in the folder that contains the file
# (shift + right click on the folder, "Open command window here", from the menu).
# Then, execute the command, "python".
# global variables
# thinBasic variable names are case insensitive.
# thinBasic variable names can be any length greater than 0.
# Currently, the last character in a variable name cannot be, '_'.
# The first character in a thinBasic variable name must be either, '_', 'a-z', or, 'A-Z'.
firstchar = None
# A middle character in a thinBasic variable name must be either, '_', 'a-z', 'A-Z', or, '0-9'.
midchars = None
# The last character in a thinBasic variable name must be either, 'a-z', 'A-Z', or, '0-9'.
lastchar = None
# functions
def setsets():
    global firstchar, midchars, lastchar
    s = '_ABCDEFGHIJKLMNOPQRSTUVWXYZ'
    l = list(s)
    firstchar = set(l)
    s = '0123456789'
    l = list(s)
    midchars = set(l)
    midchars = midchars.union(firstchar)
    lastchar = set()
    lastchar = lastchar.union(midchars)
    lastchar.remove('_')
def validvariablename(s):
    s = s.upper()
    l = list(s)
    ll = len(l)
    if ll == 0:
        return False
    i = 1
    for c in l:
        if i == 1:
            if not c in firstchar:
                return False
        if i == ll:
            if not c in lastchar:
                return False
        if i > 1 and i < ll:
            if not c in midchars:
                return False
        i += 1
    return True
# program
setsets()  # build the character sets before they are used
n = list(range(13))
n[0] = ''
n[1] = '_'
n[2] = 'a_'
n[3] = '6'
n[4] = 'd'
n[5] = '_q'
n[6] = '_____________________________________________________6'
n[7] = '____________________________________a___________________________________________b'
n[8] = 'gjtuhjy897058#bnguy87'
n[9] = 'TMGIUKJ_fhrydhe_6978576_3867thgy_HTY6UY87IU'
n[11] = '$'
n[12] = 'biykh9708ou968turhfyt758395867fhvnbmgkhiukjotlgpedhtyfhcnghtyr856903euthdrut8679whdbcnvjgktoy978463tergdhfncmvjfgur86otiyughdjnvqyrhgnvmbjguy8679486849586uyjhkfmvhgnbutkfieldotiykhiukfjhmbnv78yi0tuy76urhfjhmvnbhgyturjdhfyrht78590uijkhmnljouitehdfgvnbmhjyito7980fhrudjnvmbkhiykhoedj5utjguyjhiukjmnjhnbgjhuy7uryehdgfbvhguy87958374ythfnvgjbmhkuiykdcnvhfngytu6759123dgrtdgvnghbjhmnkjiuolpdrujgkyihknmbjguhyjtoedjgnvmbjhuy48tuyjhgmbnkhiukjoedplazmcnvhgythfjguyjhitkgolrgjhmbjhuyjgutyikolpedjgmbjhmnkhiykhiu8975tuyjhu68793edjgnbmgjhmnkjiuf'
print('Valid thinBasic variable name?')
i = 0
for s in n:
    print(i, validvariablename(s))
    i += 1
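For comparison, here is a more compact sketch of the same check. It builds the three character sets with set operators instead of a helper function. The names FIRST_CHARS, MIDDLE_CHARS, LAST_CHARS and is_valid_name are made up for this illustration, and the rules are the ones stated in the script's comments (in particular, that '_' cannot be the last character).

```python
import string

# Character classes taken from the rules in the script's comments.
# All names below are illustrative, not part of the original script.
# thinBasic names are case insensitive, so both cases are accepted.
FIRST_CHARS = set(string.ascii_letters) | {'_'}
MIDDLE_CHARS = FIRST_CHARS | set(string.digits)
LAST_CHARS = MIDDLE_CHARS - {'_'}


def is_valid_name(name):
    """Return True if name satisfies the first/middle/last character rules."""
    if not name:
        return False
    return (name[0] in FIRST_CHARS
            and name[-1] in LAST_CHARS
            and set(name[1:-1]) <= MIDDLE_CHARS)


if __name__ == '__main__':
    for candidate in ['', '_', 'a_', '6', 'd', '_q', 'x#y', '$']:
        print(repr(candidate), is_valid_name(candidate))
```

The subset test, set(name[1:-1]) <= MIDDLE_CHARS, replaces the loop over the middle characters; an empty middle section is a subset of anything, so one- and two-character names need no special case.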
15-10-2010, 06:26
There was a mistake in the code above. Now, I have fixed it.
The function, "validvariablename()", was,
def validvariablename(s):
    s = s.upper()
    l = list(s)
    ll = len(l)
    if ll == 0:
        return False
    i = 1
    for c in l:
        if i == 1:
            if not c in firstchar:
                return False
        if i == ll:
            if not c in lastchar:
                return False
        if not c in midchars:
            return False
        i += 1
    return True
It should have been,
def validvariablename(s):
    s = s.upper()
    l = list(s)
    ll = len(l)
    if ll == 0:
        return False
    i = 1
    for c in l:
        if i == 1:
            if not c in firstchar:
                return False
        if i == ll:
            if not c in lastchar:
                return False
        if i > 1 and i < ll:
            if not c in midchars:
                return False
        i += 1
    return True
It was only by luck that it gave the correct answers.
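One way to see why both versions agree: if the first-character and last-character sets are both subsets of the middle-character set, the extra middle test can never reject a character that already passed its own test. The sketch below checks this by brute force over a small alphabet. It assumes the sets described in the script's comments; the names FIRST, MIDDLE, LAST and check are invented for the illustration.

```python
from itertools import product

FIRST = set('_ABCDEFGHIJKLMNOPQRSTUVWXYZ')
MIDDLE = FIRST | set('0123456789')
LAST = MIDDLE - {'_'}   # assumption: follows the "last character" comment


def check(s, test_every_char_as_middle):
    """Position-based check; the flag selects the earlier (unfixed) behaviour."""
    s = s.upper()
    if not s:
        return False
    for i, c in enumerate(s, start=1):
        if i == 1 and c not in FIRST:
            return False
        if i == len(s) and c not in LAST:
            return False
        if (test_every_char_as_middle or 1 < i < len(s)) and c not in MIDDLE:
            return False
    return True


# Both helper sets are subsets of MIDDLE, so the extra test can never fail
# for a character that already passed its own test.
print(FIRST <= MIDDLE, LAST <= MIDDLE)

# Exhaustive comparison over a small alphabet, lengths 0 to 4.
alphabet = '_A7#'
mismatches = [
    ''.join(p)
    for n in range(5)
    for p in product(alphabet, repeat=n)
    if check(''.join(p), True) != check(''.join(p), False)
]
print(mismatches)
```

If the sets were ever changed so that a first or last character was allowed outside the middle set, the two versions would start to disagree, which is why the fixed version is the safer one.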
Petr Schreiber
16-10-2010, 00:06
Hi Dan,
Thanks for your journeys into other languages; they are very interesting.
A rough translation of your code to TB could look like this:
' Variation of Python code by Dan
' global variables
' thinBasic variable names are case insensitive.
' thinBasic variable names can be any length greater than 0.
' Currently, the last character in a variable name cannot be, "_".
' The first character in a thinBasic variable name must be either, "_", "a-z", or, "A-Z".
' A middle character in a thinBasic variable name must be either, "_", "a-z", "A-Z", or, "0-9".
' The last character in a thinBasic variable name must be either, "a-z", "A-Z", or, "0-9".
Uses "Console"
' program
Global firstchar, midchars, lastchar As String
Function TBMain()
  Dim n(13) As String = "",
  setsets()
  PrintL "Valid thinBasic variable name?"
  Dim i As Long, s As String
  For i = 1 To UBound(n)
    s = n(i)
    PrintL i, validvariablename(s)
  Next
End Function
' functions
Function setsets()
  Dim s As String
  s = "_ABCDEFGHIJKLMNOPQRSTUVWXYZ"
  firstchar = s
  s = "0123456789"
  midchars = s
  midchars = CharacterUnion(midchars, firstchar) ' OR just midchars += firstchar for lazy union with duplications
  lastchar = CharacterUnion(lastchar, midchars) ' OR just lastchar += midchars for lazy union with duplications
  lastchar = Remove$(lastchar, "_")
End Function
Function validvariablename(s As String) As Long
  Dim i As Long
  s = Ucase$(s)
  Dim ll As Long = Len(s)
  If ll = 0 Then Return %FALSE
  Dim characters(ll) As String * 1 At StrPtr(s)
  Dim c As String
  For i = 1 To ll
    c = characters(i)
    If i = 1 Then
      If InStr(firstchar, c) = 0 Then Return %FALSE
    ElseIf i = ll Then
      If InStr(lastchar, c) = 0 Then Return %FALSE
    Else
      If InStr(midchars, c) = 0 Then Return %FALSE
    End If
  Next
  Return %TRUE
End Function
Function CharacterUnion(firstSet As String, secondSet As String) As String
  Dim i As Long
  Dim result As String
  Dim character As String
  result = firstSet
  For i = 1 To Len(secondSet)
    character = Chr$(Asc(secondSet, i))
    If InStr(result, character) = 0 Then result += character
  Next
  Return result
End Function
When we look at the task as such, it could be coded much more simply, like this:
' Testing if string has valid name
' global variables
' thinBasic variable names are case insensitive.
' thinBasic variable names can be any length greater than 0.
' Currently, the last character in a variable name cannot be, "_".
' The first character in a thinBasic variable name must be either, "_", "a-z", or, "A-Z".
' A middle character in a thinBasic variable name must be either, "_", "a-z", "A-Z", or, "0-9".
' The last character in a thinBasic variable name must be either, "a-z", "A-Z", or, "0-9".
Uses "Console"
Function TBMain()
  Dim n(13) As String = "",
  PrintL "Valid thinBasic variable name?"
  Dim i As Long, s As String
  For i = 1 To UBound(n)
    s = n(i)
    PrintL i, validvariablename(s)
  Next
End Function
Function validvariablename(s As String) As Long
  ' -- Assigned just once, yet we avoid globals
  Static firstchar As String = "_ABCDEFGHIJKLMNOPQRSTUVWXYZ"
  Static midchars As String = "0123456789" + firstchar
  Static lastchar As String = Remove$(midchars, "_")
  s = Ucase$(s)
  Select Case Len(s)
    Case 0
      Return %FALSE
    Case 1
      ' -- First char test
      Return IIf(Verify(s, firstchar) = 0, %TRUE, %FALSE)
    Case Else
      ' -- First char test
      If Verify(LEFT$(s, 1), firstchar) > 0 Then Return %FALSE
      ' -- Last char test
      If Verify(RIGHT$(s, 1), lastchar) > 0 Then Return %FALSE
      ' -- Middle char test
      If Verify(Mid$(s, 2, SelectExpression-2), midchars) > 0 Then Return %FALSE
  End Select
  Return %TRUE
End Function
But I must admit the idea of sets and the ability to perform operations on them is quite interesting to me!
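For readers following along in Python, the Verify-based tests in the second program can be mimicked roughly as follows. This is only a sketch: the verify helper is hypothetical and assumes that Verify returns 0 when every character of the first string occurs in the second, and otherwise the 1-based position of the first character that does not. Like the thinBasic version, a one-character name is tested only against the first-character set, so "_" on its own is accepted here.

```python
def verify(text, allowed):
    """Return the 1-based position of the first character of text that is
    not in allowed, or 0 if every character is allowed.

    Assumption: this mirrors how Verify is used in the thinBasic code
    (a result of 0 means "all characters matched").
    """
    allowed_set = set(allowed)
    for position, char in enumerate(text, start=1):
        if char not in allowed_set:
            return position
    return 0


FIRSTCHAR = '_ABCDEFGHIJKLMNOPQRSTUVWXYZ'
MIDCHARS = '0123456789' + FIRSTCHAR
LASTCHAR = MIDCHARS.replace('_', '')


def valid_variable_name(s):
    s = s.upper()
    if len(s) == 0:
        return False
    if verify(s[0], FIRSTCHAR) > 0:        # first character
        return False
    if len(s) == 1:
        return True
    if verify(s[-1], LASTCHAR) > 0:        # last character
        return False
    return verify(s[1:-1], MIDCHARS) == 0  # middle characters
```

Whole substrings are handed to verify in one call, so no per-character loop or position counter is needed in the caller.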
16-10-2010, 07:27
Hi Petr.
At first I got runtime errors when I ran your two programs. thinBasic didn't like the declaration of n(13) being spread over separate lines. I put all of the strings on the same very long line, and then the two programs ran OK. (Maybe my version of thinBasic is not new enough; I haven't updated it for a while.)
Now, I see that an underscore can be the last character in a variable name. The problem I had was when I used Console_WriteLine.
For this code, I get a runtime error.
Uses "Console"
Dim abc_ As Integer
abc_ = 1
Console_WriteLine abc_
But, this code runs OK.
Uses "Console"
Dim abc_ As Integer
abc_ = 1
Console_WriteLine(abc_)
So, it seems the trouble can be avoided by using parentheses.
I didn't realize that thinBasic could do all of those things with strings.
You have demonstrated that thinBasic can easily replicate sets of characters.
There are other things which full implementations of the set type do, as you know.
I guess that implementing a set of strings in thinBasic, would be more work. I guess you could do it with an array of strings. You would have to make it be able to shrink and grow. And, you would have to make sure there were no duplicates.
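Just to make that idea concrete, here is a rough Python sketch of a set of strings kept in a plain list, which stands in for a resizable array. The class name StringSet and its methods are invented for the illustration; it only shows the bookkeeping (grow, shrink, no duplicates) and is not a statement about how thinBasic would do it.

```python
class StringSet:
    """A set of strings kept in a plain list, with no duplicates.

    Hypothetical sketch: the list stands in for a resizable array of strings.
    """

    def __init__(self, items=()):
        self._items = []
        for item in items:
            self.add(item)

    def __contains__(self, item):
        return item in self._items      # linear search, like scanning an array

    def __len__(self):
        return len(self._items)

    def add(self, item):
        """Grow the array only if the string is not already present."""
        if item not in self._items:
            self._items.append(item)

    def discard(self, item):
        """Shrink the array if the string is present; otherwise do nothing."""
        if item in self._items:
            self._items.remove(item)

    def union(self, other):
        """Return a new StringSet holding the strings of both sets."""
        result = StringSet(self._items)
        for item in other._items:
            result.add(item)
        return result

    def items(self):
        return list(self._items)


names = StringSet(['abc', 'def', 'abc'])
names.add('ghi')
names.discard('def')
print(names.items())
```

Every membership test scans the whole list, so this is slower than a real set for large collections, but it shows that nothing beyond an array and a search is needed.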
I was thinking about trying to make a code checker for thinBasic programs, something like "PC-lint".
So, I knew that I would need to be able to determine what constitutes a valid thinBasic variable name. And, I thought I would use either Python or Ruby. That's what got me interested in sets. I don't know anything about parsers. But, intuitively, it seems to me that sets would be very helpful. I realize that making something useful would be a big job, so, most likely I am only dreaming. Life always seems to intrude on my plans.
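As a small illustration of why sets might help in such a checker, suppose the checker had already collected the names a script declares and the names it uses. Set differences then give the suspicious names directly. Everything here is hypothetical: the function names and the sample data are made up, and collecting the names from real thinBasic source is the hard part that this sketch skips.

```python
def normalise(names):
    """thinBasic names are case insensitive, so compare them in upper case."""
    return {name.upper() for name in names}


def check_names(declared, used):
    """Return (undeclared, unused) as sorted lists of upper-case names."""
    declared = normalise(declared)
    used = normalise(used)
    undeclared = sorted(used - declared)   # used but never declared
    unused = sorted(declared - used)       # declared but never used
    return undeclared, unused


# Hypothetical data a checker might have collected from one script.
declared = ['firstchar', 'midchars', 'lastchar', 'i']
used = ['firstchar', 'midchars', 'lastchr', 'I']

print(check_names(declared, used))
```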
( Here is something interesting. I got a copy of "PyCharm". The address on my receipt is,
JetBrains s.r.o.
Natalie Yaremych
Na Lysinach 443/57
14700 Praha 4
Czech Republic
You can visit her today - :P )
Petr Schreiber
16-10-2010, 09:31
Hi Dan,
the run-time error occurred because I used a technique that is available only in the latest ThinBASIC. Which feature is it? The newly introduced "implicit line continuation". Previously (and in most BASICs today), when you needed to split command parameters over multiple lines, you had to use the line continuation character, which is the underscore:
Dim n(13) As String = "", _
"_", _
"a_", _
"6", _
"d", _
"_q", _
"_____________________________________________________6", _
"____________________________________a___________________________________________b", _
"gjtuhjy897058'bnguy87", _
"TMGIUKJ_fhrydhe_6978576_3867thgy_HTY6UY87IU", _
"$", _
With the latest TB you don't have to use it. It was my fault that I did not warn you about the requirement for the latest version.
To do such a thing, you can use the following in the code:
... and ThinBASIC will first warn you automagically that your version is too old. I have updated the code with it now.
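Since a code checker was mentioned earlier in the thread, here is a rough Python sketch of how a tool could fold the explicit form (each continued line ending in an underscore) back into one logical line before looking at it. The function name join_continued_lines is invented, and the rule it uses is an assumption: a line continues when its last non-blank text is an underscore preceded by a space.

```python
def join_continued_lines(lines):
    """Join physical lines that end with a ' _' continuation marker.

    Assumption: the marker is an underscore preceded by whitespace at the
    end of the line; leading and trailing whitespace is not preserved.
    """
    logical_lines = []
    parts = []
    for line in lines:
        text = line.strip()
        if text == '_' or text.endswith(' _'):
            parts.append(text[:-1].rstrip())   # drop the marker
        else:
            parts.append(text)
            logical_lines.append(' '.join(p for p in parts if p))
            parts = []
    if parts:  # input ended on a continuation marker
        logical_lines.append(' '.join(p for p in parts if p))
    return logical_lines


source = [
    'Dim n(3) As String = "", _',
    '  "_", _',
    '  "a_"',
    'PrintL "done"',
]
for logical in join_continued_lines(source):
    print(logical)
```

Note that a string literal such as "a_" is left alone, because the line then ends with a quote rather than with the marker.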
Regarding sets - I think they could be easily implemented using the new module approach for OOP. You could directly declare a variable of type CharacterSet, which is more intuitive than using various workarounds. I have not checked whether the OOP interface for modules is fully ready in current releases yet.
You can visit her today :P
Well, I am afraid the garden around her house will be guarded by a pack of pythons, is it safe? :)
John Spikowski
24-10-2010, 06:10
Well, I am afraid the garden around her house will be guarded by pack of pythons, is it safe?
I installed Ubuntu 10.10 on my laptop recently and noticed that the most frequently referenced packages were Python and GNOME (Gtk). I'm beginning to put more effort into using Python and Gtk on Linux as a standard. PHP is a parsing scripting engine and too susceptible to injection of unwanted code (even though there is a ton of code and applications available that use it).
Peter Verhas once made the comment,
If I would have known Python was going to be so popular, I would have never written ScriptBasic.
It's a toss-up whether I tackle C or Python first.