Most efficient integer data type for small numbers?
What is the most efficient integer data type (for speed) for small numbers? For example, if I need a variable that will only contain a value between 1 and 10, is a byte data type the fastest?
I seem to recall reading that on a 32-bit operating system, a 32-bit variable (which would be a DWORD or LONG in thinBasic) is handled natively and is the fastest. Integer types of other sizes have to be converted to 32 bits and are therefore slower.
Any truth to this, or am I all wet?
Randall
ErosOlmi
12-09-2007, 03:23
Randall,
because of thinBasic's interpreted nature, this kind of consideration is hard to make: most of the execution time is spent interpreting the script, not reading, writing, or converting data types or handling memory.
But back to your question: it mostly depends on the compiler used and the quality of the machine code it generates.
Personally, I have done many tests on this in the past using the Power Basic compiler, and I found that LONG is by far the best integer type in terms of speed. Much, much faster than DWORD. The reason is related to sign bit handling, but this may be specific to Power Basic rather than a general rule.
Comparison example: http://www.powerbasic.com/support/pbforums/showthread.php?t=8818&highlight=LONG+DWORD+speed
From the thread you can see ...
######################################################################
DWORD
DIM d AS DWORD
d = 2
d = d + 5
compiles to ..
004010AB |. C745 80 02000000 MOV [LOCAL.32], 2
004010B2 |. B8 05000000 MOV EAX, 5
004010B7 |. 8B4D 80 MOV ECX, [LOCAL.32]
004010BA |. 03C1 ADD EAX, ECX
004010BC |. 8945 80 MOV [LOCAL.32], EAX
######################################################################
LONG
DIM l AS LONG
l = 2
l = l + 5
Compiles to ...
004010AB |. C745 80 02000000 MOV [LOCAL.32], 2
004010B2 |. B8 05000000 MOV EAX, 5
004010B7 |. 0145 80 ADD [LOCAL.32], EAX
As you can see, the DWORD version needs two more instructions (five instead of three), and this takes time.
Another important thing to remember is that the DWORD data type is intended for storing handles or pointers to memory areas, so it is not optimized for math, while LONG seems more suited to numeric expressions.
So my advice is to always use LONG when math is involved (even a simple + or - operation) and DWORD when memory handling or pointer storage is the main purpose. But again, inside an interpreter it will not make any difference: 99% of the time is spent parsing :-[
Ciao
Eros
ErosOlmi
12-09-2007, 03:34
Forgot to mention BYTE, INTEGER and LONG.
Use INTEGER only when passing parameters BYREF to a function that explicitly asks for an INTEGER (16 bit), or in a UDT (TYPE/END TYPE) structure that requires an INTEGER element. Similar considerations apply to BYTE, though BYTEs may have other uses worth taking into account.
In all other cases LONG is the better choice, and you will not have any value overflow problems. Consider the possible value ranges too: http://www.thinbasic.com/public/products/thinBasic/help/html/numericvariables.htm
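To make the value ranges concrete, here is a small Python sketch (not thinBasic code) that computes the range of each type from its bit width. It assumes BYTE is unsigned 8 bit, INTEGER is signed 16 bit, LONG is signed 32 bit and DWORD is unsigned 32 bit, as in the usual Power Basic style definitions; the helper names `value_range` and `fits` are made up for this illustration.

```python
# Assumed widths: BYTE = unsigned 8 bit, INTEGER = signed 16 bit,
# LONG = signed 32 bit, DWORD = unsigned 32 bit.
# TYPES, value_range and fits are hypothetical names for this sketch.
TYPES = {
    "BYTE": (8, False),
    "INTEGER": (16, True),
    "LONG": (32, True),
    "DWORD": (32, False),
}


def value_range(type_name):
    """Return (lowest, highest) value the given type can hold."""
    bits, signed = TYPES[type_name]
    if signed:
        return -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    return 0, (1 << bits) - 1


def fits(type_name, value):
    """Return True if value can be stored in the type without overflow."""
    lowest, highest = value_range(type_name)
    return lowest <= value <= highest


for name in TYPES:
    lowest, highest = value_range(name)
    print(f"{name:8} {lowest} .. {highest}")
```

A value between 1 and 10 fits in every one of these types, so the choice comes down to speed and to what the called functions or UDTs require, not to range.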
Very interesting, thank you for the detailed explanation, Eros!
But now you have triggered my interest in how an interpreter works, which you have probably already answered and I should just search. ;)
Is the short answer, it takes the source code, parses it, and converts into assembler?
Thanks,
Randall
ErosOlmi
12-09-2007, 03:57
the short answer, it takes the source code, parses it, and converts into assembler?
NO unfortunately.
At that point I would have a ... 90% of a compiler :D
Regarding interpreters, there is not one way to do the job (as usual).
What most interpreters do is take the source code, parse it, transform it into an intermediate code that is easier to manage (sometimes called pCode), and then start execution using the pCode. So consider pCode an intermediate step between source code and compiled code. The first versions of Visual Basic worked this way; later versions produced compiled code. Imagine the pCode as a sequence of blocks made of 3 or 4 numbers (depending on the technique used) that tell what the current instruction is (command, variable, jump, ...) plus some additional info specific to the current operation. The interpreter engine follows the pCode sequence to understand what to do.
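A minimal Python sketch of the pCode idea, assuming blocks of exactly 4 numbers (opcode plus three operands). The opcode numbers, the `run` function and the instruction set are invented for this illustration and do not come from any real interpreter.

```python
# Hypothetical opcodes for this sketch; each pCode block is (op, a, b, c).
OP_HALT = 0         # stop execution
OP_LOAD_CONST = 1   # variables[a] = b
OP_ADD_CONST = 2    # variables[a] = variables[b] + c
OP_JUMP_IF_LT = 3   # if variables[a] < b: continue at block c


def run(pcode, num_vars):
    """Follow the pCode sequence block by block and return the variables."""
    variables = [0] * num_vars
    pc = 0  # index of the current block
    while True:
        op, a, b, c = pcode[pc]
        if op == OP_HALT:
            return variables
        if op == OP_LOAD_CONST:
            variables[a] = b
        elif op == OP_ADD_CONST:
            variables[a] = variables[b] + c
        elif op == OP_JUMP_IF_LT:
            if variables[a] < b:
                pc = c
                continue
        else:
            raise ValueError(f"unknown opcode {op}")
        pc += 1


# pCode for:  d = 2 : d = d + 5   (d lives in variable slot 0)
program = [
    (OP_LOAD_CONST, 0, 2, 0),
    (OP_ADD_CONST, 0, 0, 5),
    (OP_HALT, 0, 0, 0),
]
print(run(program, 1)[0])  # prints 7
```

The engine never looks at source text while running: all parsing happened when the blocks were produced, which is the main attraction of this approach.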
thinBasic does not use any intermediate pCode, just pure continuous parsing of the source code. Of course, the parser performs a lot of optimization at every step; otherwise execution would be hundreds of times slower. Thanks to this continuous optimization, we think thinBasic is quite fast in many circumstances. I have to say that this technique is not widely used: almost every other interpreter I know uses pCode (or bytecode).
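For contrast, here is a toy Python sketch of interpreting straight from source text with no intermediate code. It only illustrates the general idea and is not how thinCore.dll works internally; the mini language (statements of the form `name = a` or `name = a + b`) and the function names are assumptions made for the example, and it does none of the optimization mentioned above.

```python
def execute_line(line, variables):
    """Parse and execute one 'name = a' or 'name = a + b + ...' statement."""
    target, expr = (part.strip() for part in line.split("=", 1))
    total = 0
    for token in expr.split("+"):
        token = token.strip()
        if token.lstrip("-").isdigit():
            total += int(token)        # numeric literal
        else:
            total += variables[token]  # variable lookup by name
    variables[target] = total


def run_source(source):
    """Walk the source text, parsing each statement as it is executed."""
    variables = {}
    for line in source.splitlines():
        if line.strip():
            execute_line(line, variables)
    return variables


script = """
d = 2
d = d + 5
"""
print(run_source(script)["d"])  # prints 7
```

Every statement is tokenized again each time it runs, which is why parsing dominates the execution time in a design like this unless the parser is heavily optimized.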
Who is in charge of all this work in thinBasic? thinCore.dll is the engine, and 90% of the job is done there.
All memory handling, internal data dictionaries, script execution, script flow, and module handling are done there.
More info may follow if you are interested in the details, but right now it is 03:56 in the morning here in Italy and I really need to rest a bit ;)
Ciao
Eros