Idea about "pseudo-variable" target functions [Archive] - thinBasic: Basic Programming Language Community Forum

Robert Hodge

12-07-2013, 16:40

Anyone who has used PL/1 is familiar with what it calls "pseudo-variables", which in modern terms we would call objects with assignment-operator semantics. PL/1 calls them pseudo-variables because they can be assigned to in the same way as conventional variables. For example, Basic has a MID$ function, which corresponds to PL/1's SUBSTR. Examples:

PL/1:

DECLARE (ONE,TWO) CHAR(4);
SUBSTR(ONE,1,2) = SUBSTR(TWO,3,2);

Basic:

DIM ONE, TWO AS STRING
MID$ (ONE,1,2) = MID$(TWO,3,2)

ThinBasic calls the MID$ on the left side a "MID$ Statement", since there is no other grammatical role that could be given to it. This is all well and good as far as MID$ is concerned, but in language terms, it is "cheating". That is, this technique works great for MID$, and for the few other functions that have an alternative "statement" form to them, but it doesn't help me if I want to make my OWN "statement-like" functions. I just can't do it. ThinBasic creates this capability, but reserves it only for itself, not allowing users to write their own such syntax.

I believe it would be of benefit to allow this technique to be applied on a general basis.

In order to do that, a new procedural construct would have to be introduced. Along with SUB procedures and FUNCTION procedures, there would be TARGET procedures.

In some ways, a TARGET is like a "setter" function, a concept sometimes used in languages that support "accessors" or "attribute functions".

One possible use of a FUNCTION/TARGET pair is to define access to an array with bounds that do not start with 1. Another use might be to define "mapping" or "associative storage" functions which store data based on keys. The Dictionary module implements such logic, but targets would allow the use of Dictionaries in a much more natural notation.

A TARGET would have the following characteristics:

A target is a SUB-like block of code, begun with TARGET and ending with END TARGET
A target accepts a parameter list like a SUB procedure does
A target has an AS type-name clause, like a FUNCTION does. However, for a TARGET, the AS type-name specifies the expected type of value to be assigned to the target. For example, a SUBSTR-like target would have an AS STRING clause, since it would be expected that only strings would be assigned to the target.
A target may have the same name as a FUNCTION. Thus, the same program may have both a FUNCTION ABC and a TARGET ABC.
When both a FUNCTION and a TARGET exist with the same name, the implementation code for the two procedures is completely separate and independent. It is up to the developer to decide how the two procedures relate to each other.
A target is only recognized as being referenced when it appears on the left side of an = sign.
Inside the code of a target, a built-in function would be required to access the value on the right-hand side of the = sign. This built-in function could have any appropriate name, such as RHS, VALUE, SOURCE, etc. The exact name isn't as critical as the fact that this function is needed. For the sake of discussion, let's say this function is called RHS.
In a TARGET, the RHS function is polymorphic, in the sense that its type is always the same as the AS type-name clause on the TARGET header statement. For a given TARGET procedure, RHS has only one type, and so is not the same thing as a VARIANT. There is no need to define multiple spellings, such as RHS, RHS$, etc. because the type of RHS is known by the AS type-name clause.
If a TARGET had knowledge of compound assignment operators, such as +=, it would open up a number of intriguing possibilities. However, it would also make the semantics more complex, perhaps prohibitively so. For that reason, this proposal does not define semantics for compound assignment operators, except in the simplest way. That is, if ABC is both a function and a target taking one numeric parameter, then ABC(5) += 1 would be interpreted as ABC(5) = ABC(5) + 1, where the ABC on the left side of the = (and on the left of the original +=) would be a reference to the TARGET, and the right-hand ABC would be a reference to the FUNCTION of the same name.

Simple (though not terribly useful) example: Define access to a zero-based array:

DIM __MYHEX(16) AS STRING = "0", "1", "2", "3", "4", "5", "6", "7", "8", "9", "A", "B", "C", "D", "E", "F"

FUNCTION MYHEX (N AS LONG) AS STRING
IF N >= 0 AND N <= 15 THEN RETURN __MYHEX(N+1)
RETURN ""
END FUNCTION

TARGET MYHEX (N AS LONG) AS STRING
IF N >= 0 AND N <= 15 THEN __MYHEX(N+1) = RHS
' error handler could be placed here for N out of range
END TARGET

DIM I AS LONG
FOR I = &H0A TO &H0F
' in next statement, left-hand MYHEX is a target, while right-hand MYHEX is a function:
MYHEX(I) = LCASE$ (MYHEX (I))
NEXT
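
For comparison, some existing languages already pair a "getter" and a "setter" under one name. The following is only a rough analogue in Python, not thinBasic: Python has no assignable function call, so square-bracket indexing stands in for MYHEX(I). The class name ZeroBasedHex and the parameter name rhs are made up for this sketch.

```python
class ZeroBasedHex:
    """Zero-based view over a store that is indexed from 1, as in the BASIC example."""

    def __init__(self):
        # Slot 0 is unused padding, so the real data lives in slots 1..16.
        self._store = [None] + list("0123456789ABCDEF")

    def __getitem__(self, n):
        # Plays the role of FUNCTION MYHEX: out-of-range reads yield "".
        if 0 <= n <= 15:
            return self._store[n + 1]
        return ""

    def __setitem__(self, n, rhs):
        # Plays the role of TARGET MYHEX: rhs is the value right of the = sign.
        if 0 <= n <= 15:
            self._store[n + 1] = rhs
        # Out-of-range writes are ignored here; an error handler could go here.


myhex = ZeroBasedHex()
for i in range(0x0A, 0x10):
    # Left-hand myhex[i] calls __setitem__, right-hand myhex[i] calls __getitem__.
    myhex[i] = myhex[i].lower()

digits = "".join(myhex[i] for i in range(16))
print(digits)
```

As in the proposal, the two methods are independent pieces of code, and it is up to the developer to keep them consistent with each other.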

Charles Pegge

13-07-2013, 08:12

If function overloading is supported, then pseudo-variables can be expressed quite easily by loosening the call syntax: essentially, replacing a "," with ")=".

For instance these two expressions do the same thing:

f(1,2*f(2))

f(1)=2*f(2)

code:

dim as int aa[4]={10,20,30,40}

function f(byval i as int , byval a as int)
aa[i]=a
end function

function f(byval i as int ) as int
return aa[i]
end function

f(1,2*f(2))

print f(1) 'result: 40

f(1)=2*f(2)

print f(1) 'result 40

To enforce pseudo-variable syntax, '=' could be placed in the prototype like this

function f(byval i as int ,= byval a as int)
...

Robert Hodge

13-07-2013, 15:47

Your comment about putting an = sign in the function prototype gives me an idea.

Right now, a function can set a return value by assigning it to the keyword FUNCTION. Let's rewrite your (conventional) function that way:

DIM AA(4) AS INT = 10,20,30,40

FUNCTION F (BYVAL I AS INT) AS INT
FUNCTION = AA (I)
END FUNCTION

Note that in a (conventional) function, the AS clause on the FUNCTION level corresponds to a RETURNS keyword in languages like PL/1, where it describes the data type of the returned value.

Now, the "target" function can be rewritten as the reverse of the (conventional) function, by using FUNCTION to represent the 'source' of the data that appears on the right hand side of the = sign. That way, it's not necessary to invent a new keyword like RHS, VALUE or SOURCE.

The only thing left is to define some kind of syntax to show this is a 'target' function instead of a conventional one. We can do that also without a new keyword, by replacing the AS keyword on the FUNCTION level with an = sign, like this:

FUNCTION F (BYVAL I AS INT) = INT ' the = means this is a target function
AA (I) = FUNCTION ' FUNCTION means the R.H.S. of the = sign
END FUNCTION

' use the target and conventional functions:

F(1) = 2 * F(2)

PRINT F(1) 'result: 40

One final point: I originally wrote about using target functions in combination with compound operators like +=. Because a target function and a conventional one of the same name have different implementation code, there's nothing to prevent the two things from having different function signatures. However, if you used a target with a compound assignment, the two signatures would have to be compatible.

So, you could have a function
FUNCTION ABC (A AS INT) AS INT
and a target
FUNCTION ABC (A AS INT, B AS INT) = INT

but if you tried to say ABC(1,2) += 1 it wouldn't work, because this would imply
ABC(1,2) = ABC(1,2) + 1
and while there is a target compatible with ABC(1,2), there's no conventional function that could be called as ABC(1,2), so this would be a syntax error.

That means, in most cases, when there is both a target and a conventional function defined, they would pretty much have to have compatible signatures.
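
The "read, then write" expansion of a compound assignment can be observed in a language that already works this way. This is a minimal sketch in Python, used only as an analogy: the class names Abc and WriteOnly are hypothetical, and Python reports the missing getter as a run-time TypeError rather than a syntax error.

```python
class Abc:
    """Has both a getter and a setter, like a FUNCTION/TARGET pair."""

    def __init__(self):
        self._data = {}
        self.log = []  # records which accessor ran, in order

    def __getitem__(self, key):
        self.log.append(("get", key))
        return self._data.get(key, 0)

    def __setitem__(self, key, rhs):
        self.log.append(("set", key))
        self._data[key] = rhs


class WriteOnly:
    """Has only a setter, like a TARGET with no matching FUNCTION."""

    def __setitem__(self, key, rhs):
        pass


abc = Abc()
abc[5] += 1            # expands to: abc[5] = abc[5] + 1
calls = list(abc.log)  # the getter runs first, then the setter

w = WriteOnly()
try:
    w[1, 2] += 1       # needs a getter for (1, 2), which does not exist
    failed = False
except TypeError:
    failed = True

print(calls, failed)
```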

ReneMiner

18-07-2013, 10:00

Reading this brought me to another idea...

It's very often the case that one calls a (user-defined) function and needs the same result a few times in a row. To avoid calling the same function again, one has to store the result locally somehow. I don't know how this works internally: whether a function's result lies on a stack or in a cache until it is passed back and is then destroyed, or whether it stays unchanged in some static function variable until the function is called again and the value is reinitialized.

It would be nice to "recall" the last function's result without running the function again.

small example:

If myFunc(1,2,3) < 0 Then
'...
ElseIf Function_RecallResult("myFunc") > 0 Then
'...
Else
'...
Endif

Robert Hodge

18-07-2013, 15:57

The trick would be in deciding how to 'refer' to the prior function results.

Let's take your example:

' example 1
If myFunc(1,2,3) < 0 Then
'...
ElseIf Function_RecallResult("myFunc") > 0 Then
'...
Else
'...
Endif

This is essentially the same as:

' example 2
DIM RESULT AS NUMBER = myFunc(1,2,3)
If RESULT < 0 Then
'...
ElseIf RESULT > 0 Then
'...
Else
'...
Endif

But, of course, the point is that you don't WANT to declare the result; you just want to be able to use it.

There are a lot of ways you could go about doing this, but to get this done with the least impact to the syntax, I believe the easiest way to do it would be with an "in-line assignment declaration". The idea is that you 'name' the result in a very concise way, and then you can quickly reuse it. Suppose we use inline syntax of AS within an expression to mean (a) declare the variable on the left side of AS to be of the same type as the expression on the right side of the AS and then (b) assign the expression to the variable, and finally (c) use the value of the expression in the remainder of the statement.

In C, you can do something *like* this:

/* C example */
int F;
if ((F = myFunc(1,2,3)) < 0) {
/* ... */
}
else if (F > 0) {
/* ... */
}
else {
/* ... */
}

But, depending on the C/C++ version you have, you may or may not be able to *declare* F, and sometimes the 'scope' of F gets 'cut short' before you are done using it.

Let's rewrite your example using this "in-line assignment declaration" idea:

' example 3
If F AS myFunc(1,2,3) < 0 Then
'...
ElseIf F > 0 Then
'...
Else
'...
Endif

The main issue is deciding on a syntax. Using AS is just a suggestion.

Other possible syntax could perhaps use :: or == for this:

If F :: myFunc(1,2,3) < 0 Then
If F == myFunc(1,2,3) < 0 Then

The exact syntax isn't as critical as getting a correct design.

ReneMiner

18-07-2013, 21:11

Maybe user-defined functions (probably not RGB() or other already built-in stuff) could work like this:

PrintL f ' prints 123
PrintL Function_RecallResult("f") ' prints 123
PrintL f ' prints 246...

Function Static f() as Long

Function += 123

End Function

... could offer a way to carry some value from one local scope to another also...

Robert Hodge

18-07-2013, 22:51

I think it's important not to get too carried away with the syntax. After all, the main idea is that you are trying to accomplish two things here: first, you are trying to save some typing, and second, you hope to avoid calling the same function twice.

Let's go back to the example. Instead of naming the function, let's just say that if a function call is prefixed with @, it means you want to refer to its result again later on by just the @ sign alone. That would make the example look like this:

If @myFunc(1,2,3) < 0 Then
'...
ElseIf @ > 0 Then
'...
Else
'...
Endif

If you had more than one function you wanted to reference as a 'shorthand' you might need more syntax than this - maybe a named reference like F@myFunc:

If F@myFunc(1,2,3) < 0 Then
'...
ElseIf F@ > 0 Then
'...
Else
'...
Endif

As for the basic question of saving execution time, you are talking about an optimization effort. For an interpreter, it's not likely that optimizing it would be worth the effort, when you could simply assign the function result to a local variable and then reuse it.

In PL/1, they have a concept called REDUCIBLE and IRREDUCIBLE. If you have a function that is REDUCIBLE, and it's used with the same parameters, the results of the prior call are saved and then reused. In your example, the call to myFunc(1,2,3) is 'remembered' and then if the function is called with 1,2,3 again, the prior returned value is used again. When a function modifies BYREF parameters, or uses STATIC data, or perhaps works with transient data like timer values, or deals with values created in other threads, then the same passed parameters won't guarantee the same results, which means the function is IRREDUCIBLE.

In the case of PL/1, it took IBM a long time to implement the IRREDUCIBLE feature. It is probably too much to ask for thinBasic to do something like that, when its primary function is to be an interpreter. Trying to detect and optimize 'reducible' functions would probably take more time than it would take to just run them. And if a function were complex enough that making it 'reducible' might help, the developer would recognize that too, and would simply save the value if it needed to be reused.

But, suppose you really needed to optimize the function, and there wasn't anything in thinBasic to help you? You could do it yourself, by writing a "front-end" function that cached prior operands and results. If any parameters were repeated, you could pass back the cached results.

Now, let's say you had some complex function you wanted cached this way, and you wanted thinBasic to help. That would be a relatively straight-forward change to the syntax.

Example: Suppose you wanted the last 9 passed parameters and the result saved internally, so that if they were repeated you'd get the same results. You could do this by putting a CACHE option on the FUNCTION header:

FUNCTION myFunc (A AS LONG, B AS LONG, C AS LONG) WITH CACHE(9) AS LONG
' ...
END FUNCTION

Then, each time that myFunc(1,2,3) is called, for instance, code in the beginning of the function would do a table lookup of the parameters, and if a match was found, the same result calculated before would be returned again.

In this case, the contents of the 'CACHE' would essentially look like this internally:

TYPE CACHE_TYPE
RESULT AS LONG
A AS LONG
B AS LONG
C AS LONG
END TYPE

STATIC CACHE(9) AS CACHE_TYPE ' must persist between calls
STATIC LAST AS LONG = 0 ' keeps number of saved entries in cache
STATIC CURR AS LONG = 0 ' slot most recently filled

' on entry to function, determine if there is a cache hit:

DIM I AS LONG

FOR I = 1 TO LAST
IF A = CACHE(I).A AND B = CACHE(I).B AND C = CACHE(I).C THEN
RETURN CACHE(I).RESULT
END IF
NEXT

' cache miss: claim the next slot, wrapping around after slot 9

CURR += 1
IF CURR > 9 THEN CURR = 1
IF CURR > LAST THEN LAST = CURR

CACHE(CURR).A = A
CACHE(CURR).B = B
CACHE(CURR).C = C

' ... remainder of function, which must also store its result in CACHE(CURR).RESULT before returning
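
For comparison, Python's standard library already offers this kind of opt-in cache as a decorator, functools.lru_cache. The sketch below is an analogy, not a thinBasic feature: my_func and its body are hypothetical, and lru_cache evicts the least recently used entry, whereas the table above overwrites its slots in round-robin order.

```python
from functools import lru_cache

call_count = 0  # counts how often the function body actually runs


@lru_cache(maxsize=9)  # keep results for up to 9 distinct argument tuples
def my_func(a, b, c):
    """Hypothetical stand-in for an expensive myFunc."""
    global call_count
    call_count += 1
    return a * b + c


first = my_func(1, 2, 3)   # cache miss: the body runs
second = my_func(1, 2, 3)  # cache hit: the stored result is returned

info = my_func.cache_info()
print(first, second, call_count, info.hits, info.misses)
```

As with the proposed WITH CACHE(9) option, this is only safe when the same arguments always produce the same result.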

ReneMiner

18-07-2013, 23:05

Robert Hodge

19-07-2013, 01:15

In most cases such a function runs a few hundred or a thousand times, so in the end it would use up more time to check whether those parameters were already passed; also, the function could be calculating with other globals that have changed by then. Such a cache is not secure, I think.

Yes, and that would be the deciding factor. Only the developer would really know if the same arguments would ever produce different results. Suppose the function were to fetch data from a read-only file, but locating it was time-consuming. A cache might make sense then.

The issue is no different than trying to save the results locally from a function and reusing them.

Example:

DIM RESULT AS NUMBER = MYFUNCTION(1,2,3)

IF RESULT > 0 THEN PERFORM_SOME_ACTION ()

' ... several statements later ...

IF RESULT > 0 THEN PERFORM_SOME_ACTION () ' ... is this still valid ?

So, even without any change to thinBasic, you can cache results yourself, and you still have the same question to answer: Are the cached results still valid?

This is where it goes beyond grammar and language and syntax; it's a matter of proper system design.

Robert Hodge

19-07-2013, 01:18

ReneMiner

19-07-2013, 10:05

OT

@Rene & Robert - Have you guys thought about collaborating and writing a BASIC of your own? You could call it LikeBasic and roll all those cool ideas into a BASIC (like) language without any restrictions (or direction). I would suggest writing it in OxygenBasic for the most flexibility and least risk (open source with a bright author).

It would not occur to me to create a whole BASIC language myself (and we would call it "R&R-Basic" or maybe "DoubleRCode"), but maybe one day I'm going to create some tB module, "LazyFun(ctions)", using just tB itself.

This is a forum section just for growing ideas out of our cerebrospinal fluids. These are not requests to build them into tB; for that we use the Support section (http://www.thinbasic.com/community/project.php?do=issuelist&projectid=1&issuetypeid=feature). If one of us has an idea that he thinks might be a good one, we can discuss it here until the idea becomes something solid, ready to post in Support, or until it's off the table. So anyone who is looking for ideas to freshen up his own BASIC project, or who wants to know about end users' needs, opinions, and leanings, can snoop in here and even participate and throw in his own ideas and experience, or just silently read, pick up the best ideas, and develop them to make his own BASIC more attractive to new users. Sadly, and that's the case with almost any software that was designed mostly by one person, the designer/developer does not come upon the idea of using his work in any other way than he designed it, because he thinks and knows "programming has to work this way!"

If thinBasic weren't as good as it is, "the ultimate All-In-One-Package", I would not bother using it or trying to improve it, and would just use something else.

As you see in the title, this thread was meant to go in a totally different direction, so my idea grew just from reading it yesterday, because I thought to myself:
"How often do I need some function result just two or three times, and how often do I struggle to find some currently unused local variable for it, and how often have I had to accept, in a galled mood, that there's no other way than to create some additional local var?" And I had to tell the world about it.
Today I would suggest this syntax:

Function f() as Static Long

and I still think if the programmer is sure the conditions are the same he can re-use the result. But if one knows already that parameters might have changed one would not re-use it anyway.
The "cache" could work if the function itself were an Array...

Your comment about putting an = sign in the function prototype gives me an idea.

Right now, a function can set a return value by assigning it to the keyword FUNCTION...

There's also "Return"

Function f()

If A Then Return B ' assign B to the function and exit

End Function

Edit:
John, I saw your post just after I posted mine. The "End" idea won't work in general, especially when using RawText to set up some Oxygen script within, which holds a lot of "End"s...

Robert Hodge

19-07-2013, 16:45

Nice project Robert!

You really should look at ScriptBasic. By design it's an embeddable scripting engine. It would take little effort to incorporate SB into your editor. Another idea is to use JavaScript like UltraEdit has done.

I believe the quote you cited is from the FAQ page of SPFLite as of version 6. We are now on version 7, which uses thinBasic as its script engine. There have been a few snags now and then, but overall it's worked quite well.

I can mention ScriptBasic to George, but since he's already done quite a bit of work integrating thinBasic into SPFLite, he probably would not change to another engine without a strong case for it.