Benchmarking against functional languages



RobbeK
17-04-2014, 12:11
. hi ..

Still experimenting with "other" languages. Here is a rather unfair series of benchmarks (functional languages have their strong points, which are not shown here).

Some calculus -- the partial sum of Zeta(1) (the harmonic series) after one million iterations.

OK..

Racket Scheme -- bytecode -- using for/sum loops : 3 sec
CLisp -- script -- 2.5 sec
Racket Scheme -- bytecode -- using apply / map : 2 sec (2/3 of the time spent on garbage collection )
CLisp -- bytecode (GNU Lightning) -- 0.6 sec
NewLisp -- script -- 0.45 sec using apply/map
ThinBasic -- script -- 0.42 sec
Clojure -- Java bytecode -- 0.4 sec (used a roughly 200 MB IDE to generate the JAR file ;-)
NewLisp -- script -- 0.35 sec using (for i j step)
Corman Lisp -- native X86 -- 0.25 sec
GFA -- native X86 -- 0.09 sec
ThinBasic + O2 pre-JIT ------> 0.03 sec (compiling outside the timer loop )

(For data manipulation (filtering, cleaning, etc.), NewLisp is around 10x faster than ThinBasic without O2.)
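
For readers who want to try a comparable test, here is a minimal sketch in Python (which is not one of the languages timed above). The benchmark source is not shown in this thread, so the loop below is an assumption: it simply adds 1/k for k from 1 to one million, which is the partial sum of Zeta(1). The function name zeta1_partial_sum is made up for this illustration.

```python
import time


def zeta1_partial_sum(n):
    """Return 1/1 + 1/2 + ... + 1/n, the n-th partial sum of Zeta(1)."""
    total = 0.0
    for k in range(1, n + 1):
        total += 1.0 / k
    return total


# Time one run with the high-resolution performance counter.
start = time.perf_counter()
result = zeta1_partial_sum(1_000_000)
elapsed = time.perf_counter() - start

print(f"partial sum = {result:.6f}, elapsed = {elapsed:.3f} s")
```

The elapsed time depends entirely on the machine and the interpreter, so it is not comparable with the figures in the list above.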

Did not time Haskell.

Speaking of Haskell: some time ago there was a competition where bots written in different programming languages had to fight each other (until only one was left). Some statistics on how they survived:

C# : good ones, bad ones, fewer mediocre ones, a few very bad, a few very good
http://i2.wp.com/1.bp.blogspot.com/_FsLa1cMTCWU/TPgy6h6N5EI/AAAAAAAAAj0/8PXgOSzN8ck/s1600/csharp_density_plot.png

Java : the most used language , tends towards average
http://i2.wp.com/1.bp.blogspot.com/_FsLa1cMTCWU/TPgybPKcEbI/AAAAAAAAAjs/2THlgOgE-mo/s1600/java_density_plot.png

Haskell : a few refused to die it seems
http://i0.wp.com/1.bp.blogspot.com/_FsLa1cMTCWU/TPg0CBK3W5I/AAAAAAAAAj4/q3iEbQvyMhc/s1600/haskell_density_plot.png

C : they were called the hippies -- tough ones (there were no Basic-users, but "if" there could be similarities )
http://i2.wp.com/1.bp.blogspot.com/_FsLa1cMTCWU/TPgyKSc-rvI/AAAAAAAAAjk/5Si6vyFPEvQ/s1600/c_density_plot.png

The winner : when the going gets tough, the tough .....
http://i1.wp.com/3.bp.blogspot.com/_FsLa1cMTCWU/TPgyBXF3PhI/AAAAAAAAAjg/M6v-8WEvv98/s1600/lisp_density_plot.png

best Rob

ReneMiner
17-04-2014, 16:36
Times seem to be machine-dependent. I'm far below 0.03 seconds and need only 0 :D

For benchmarks I would use the built-in high-resolution timer instead of GetTickCount, since it's much more precise.


Uses "console"

Dim startingTime As Quad
Dim neededTime As Quad
Dim tick As Long ' we compare GetTickCount and the hi-res timer now...

HiResTimer_Init ' always call this before using the HiResTimer

tick = GetTickCount

' wait for the next tick boundary, keeping the last hi-res reading taken before it
Do
  startingTime = HiResTimer_Get
Loop While tick = GetTickCount

tick = GetTickCount

' idle until GetTickCount changes again, i.e. for one full tick
While tick = GetTickCount: Wend

neededTime = HiResTimer_Get - startingTime ' hi-res reading in microseconds

' dividing by 1000000 converts microseconds to seconds
PrintL "Time for one tickcount-change: " + Format$(neededTime / 1000000, "#.0000") + " seconds"

PrintL $CRLF & Repeat$(42, "_")
PrintL $CRLF & "Press the ANY-key to end"

WaitKey


The problem with GetTickCount is that it only delivers a new value every 0.015 to 0.025 seconds, depending on the system. So I'm pretty sure the test needs less than 0.016 seconds on mine: 0.0059 to be precise, see screenshot.
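
The same idea (spin until a clock changes and see how big the jump is) can be sketched in Python for comparison. This is only an illustration, not a port of the thinBasic code above: observed_step is a made-up helper, and time.monotonic and time.perf_counter stand in for a coarse and a fine clock. Which one is actually coarser depends on the platform.

```python
import time


def observed_step(clock):
    """Spin until clock() returns a new value and return the size of that jump."""
    first = clock()
    while True:
        now = clock()
        if now != first:
            return now - first


coarse = observed_step(time.monotonic)
fine = observed_step(time.perf_counter)

print(f"monotonic    step: {coarse:.9f} s")
print(f"perf_counter step: {fine:.9f} s")

# What the platform itself reports as the resolution of each clock.
print("reported:", time.get_clock_info("monotonic").resolution,
      time.get_clock_info("perf_counter").resolution)
```

For a clock that already ticks very finely, the measured step is mostly the cost of the loop itself rather than the clock's true granularity.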

RobbeK
17-04-2014, 19:30
Hi Reneminer,

Ah, ok -- but , wait till after December 6th ;-)
I get around 32 mSec this way.
(I think the other languages are not capable of measuring micro seconds directly ).

best, Rob

ReneMiner
17-04-2014, 20:27
What will happen on december 6th?

You get 32 mSec for what? The test?



Or is 32 mSec the tick interval on your PC? Do you use some 486?

Petr Schreiber
17-04-2014, 21:27
Hi,

GetTickCount/Timer is very imprecise; thinBASIC offers HiResTimer and cTimer, which work better for benchmarking purposes.


Petr

EDIT: Rene was faster

RobbeK
17-04-2014, 23:41
Well, it was to have a rough idea ..

Yes, my computer is 5x slower :-c "sniff" (but I can think of a formula 10000x faster ;-)
(that's also nonsense compared with 50000x faster)

6 december :
http://frenchfinest.files.wordpress.com/2011/12/saint-nicholas1.jpg

best, Rob

mike lobanovsky
18-04-2014, 01:07
Hello colleagues,

Strictly speaking, any attempt to accurately benchmark the throughput of a computer process as a function of computer time is useless, because it fails to satisfy the fundamental requirement that the timing instrument be independent of the object being timed. Remember "relativistic time dilation" from Einstein's Special Relativity theory?

Time-based looped benchmarks can give only a very rough idea of the efficiency of one particular aspect of a language, e.g. integer performance, FPU performance and transcendental mathematics, string handling, array indexing and many, many others. Hundreds of different tests are required, and their results must then be statistically processed, to make them relevant to realtime deployment of these features in practical use cases. Only then can the given languages be roughly ranked as to their suitability for a particular applied task.
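
As a small illustration of "many runs plus statistics", here is a hedged Python sketch. It repeats one workload several times and reports the minimum, median and standard deviation instead of a single reading. The helpers time_runs and summarize and the workload itself are made up for this example; a real comparison would need many different workloads, as argued above.

```python
import statistics
import time


def time_runs(func, runs=30):
    """Call func() `runs` times and return the elapsed time of each call in seconds."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        func()
        samples.append(time.perf_counter() - start)
    return samples


def summarize(samples):
    """Reduce raw timings to a few robust statistics."""
    return {
        "min": min(samples),
        "median": statistics.median(samples),
        "stdev": statistics.stdev(samples),
    }


def workload():
    # Hypothetical test body: a short harmonic sum.
    return sum(1.0 / k for k in range(1, 100_001))


samples = time_runs(workload, runs=20)
summary = summarize(samples)
print(summary)
```

The minimum is usually the least disturbed reading, while the spread (stdev) shows how noisy the machine was during the test.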

1. GetTickCount() and timeGetTime() are the poorest instruments of all. Their granularity is extremely platform-dependent and can be considered a system-wide constant similar to the real world's speed of light. The time quantum of these counters is approx. 10 msecs on Windows 95 thru Windows Millennium, and 15 to 16 msecs on Windows 2K+.

There's nothing you can do to improve their resolution or accuracy. Contrary to a common but ungrounded belief, Winmm.dll's timeBeginPeriod(1)/timeEndPeriod(1) has no effect whatsoever on the resolution or accuracy of multimedia timeGetTime() counter; its resolution is exactly as poor as that of GetTickCount().

The only useful thing timeBeginPeriod(1) can do is improve the resolution of Sleep() (or Wait(), or any other derivative) and of the standard Windows timer (the SetTimer()/KillTimer() WinAPI calls) and its derivatives to 1 msec, against their usual default resolution of 15/16 msecs on an NT-based Windows kernel. In other words, unless you call Winmm.dll's timeBeginPeriod(1), calls such as Sleep(1) through Sleep(16) and SetTimer(hWnd, ID_Timer, 1, NULL) through SetTimer(hWnd, ID_Timer, 16, NULL) will yield a time interval of approx. 16 msecs; Sleep(17) through Sleep(32) and SetTimer(hWnd, ID_Timer, 17, NULL) through SetTimer(hWnd, ID_Timer, 32, NULL) will yield a 32-msec time interval, and so on.
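
The rounding described above can be written down as a tiny model. This is arithmetic only, not a measurement and not a call into Windows: effective_interval_ms is a made-up name, the 16 msec default tick is taken from the text, and treating a request of 0 as "no wait" is an assumption of this sketch.

```python
def effective_interval_ms(requested_ms, tick_ms=16):
    """Round a requested wait up to the next multiple of the timer tick."""
    if requested_ms <= 0:
        return 0  # assumption: a zero-length request does not wait for a tick
    # Integer ceiling division, then back to milliseconds.
    return ((requested_ms + tick_ms - 1) // tick_ms) * tick_ms


for requested in (1, 16, 17, 32, 33):
    print(requested, "->", effective_interval_ms(requested))
```

With tick_ms=1, which models the state after timeBeginPeriod(1) as described above, every request comes back unchanged.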

2. QueryPerformanceFrequency()/QueryPerformanceCounter() and their derivatives are very susceptible to modern CPU's SpeedStep (Intel) and Cool'n'Quiet (AMD) technologies that reduce the processor's power consumption and die (a.k.a. silicon chip) temperature by clocking the processor down with respect to its designated clock speed.

Look how my CPU speed may fluctuate while I'm typing a forum message (top green line):

http://www.oxygenbasic.org/forum/index.php?action=dlattach;topic=1040.0;attach=2488;image

The only ways to avoid miscalculation are:
-- switch off the SpeedStep (or Cool'n'Quiet) option in your BIOS (or UEFI); or
-- take a QueryPerformanceFrequency() reading every time immediately before a matching call to QueryPerformanceCounter() and pray to the Lord that your system doesn't decide to step the CPU down while your interpreter is still reading its VM code between the two calls. :)

Calling QueryPerformanceCounter() will however add extra overall inaccuracy to the measurement, therefore it would be reasonable to have a few empty-loop calibration runs beforehand to see how much time it actually takes your computer to perform the QueryPerformanceXXX calls themselves.
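
A calibration run of this kind might look as follows in Python. This is a sketch under assumptions: time.perf_counter_ns stands in for the QueryPerformanceXXX calls, timer_overhead_ns is a made-up helper, and the figure it returns includes the cost of the loop itself, so it is an upper bound on the cost of one timer call.

```python
import time


def timer_overhead_ns(calls=100_000):
    """Average cost in nanoseconds of one timer call, loop overhead included."""
    start = time.perf_counter_ns()
    for _ in range(calls):
        time.perf_counter_ns()  # the call whose cost we want to estimate
    end = time.perf_counter_ns()
    return (end - start) / calls


# Several calibration runs; the smallest value is the least disturbed one.
overhead = min(timer_overhead_ns() for _ in range(5))
print(f"about {overhead:.1f} ns per timer call")
```

The result can then be subtracted from short measurements, or simply used to judge whether a measured interval is long enough to be trusted at all.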

Also, it would be reasonable to confine the affinity of the measured process to one core only on a multi-core CPU, because the system moves the process between cores unpredictably and a lot of time is lost in these context switches, spoiling the accuracy of benchmarks still further.
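
Pinning a process to one core can be sketched like this in Python. Note the assumptions: os.sched_setaffinity only exists on Linux, so on other systems (Windows included, where the Win32 call SetProcessAffinityMask would be the place to look) the made-up helper pin_to_one_core just returns None and does nothing.

```python
import os


def pin_to_one_core():
    """Restrict the current process to a single allowed core; return its index or None."""
    if not hasattr(os, "sched_setaffinity"):
        return None  # not available on this platform
    allowed = os.sched_getaffinity(0)  # 0 means "the current process"
    core = min(allowed)
    try:
        os.sched_setaffinity(0, {core})
    except OSError:
        return None  # the system refused the request
    return core


core = pin_to_one_core()
print("pinned to core:", core)
```

Any benchmark started after this call runs on that one core, which removes migration between cores as a source of noise but not frequency scaling.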

3. The most reasonable thing would be to abandon the seconds/milliseconds/ticks idea altogether and measure performance directly in CPU clocks instead, having switched off the SpeedStep/Cool'n'Quiet options and confined process affinity to just one core well in advance.

CPU clocks can be measured with an RDTSC assembler instruction either in thinBasic's core engine via a PowerBasic assembly inline or in an external OxygenBasic procedure using its native assembly capability. This is however high-class aerobatics which is probably a topic for another discussion.


Regards,

ReneMiner
18-04-2014, 07:18
But what if we measure all languages using the same wrong method on the same worn-out old PC? Doesn't that even things out?

Rob, you still get presents by Saint Nicholas? So have you been goody-goody then?

Charles Pegge
18-04-2014, 07:42
Perhaps the most reliable time measure is the 60Hz frame refresh signal, though how to lock into this event directly remains a mystery to me.

mike lobanovsky
18-04-2014, 09:09
@ReneMiner:

Measure what? Volumes expressed in pounds with distances expressed in pennies? Let us find a common foothold first and then try to actually turn over the world.

I said we can fall back on a very rough estimation based on a large number of tests with adequate statistical analysis. That was actually the possibility that I was trying to investigate in my items 1 thru 3. But these are only speculations in the long run because they don't resolve the contradiction which I outlined in the introductory paragraph of my message.

@Charles Pegge

1. Here are my workstation's monitor frequencies:

http://storage7.static.itmages.ru/i/14/0418/s_1397804137_5623803_c11b942492.jpg (http://itmages.ru/image/view/1621928/c11b9424)

1.1. Which one should I choose as a reference?

1.2. Here's a VSYNC'ed OpenGL benchmark test with my central monitor set to 59Hz:

http://storage6.static.itmages.ru/i/14/0418/s_1397802877_9582132_3ce9bb3063.png (http://itmages.ru/image/view/1621892/3ce9bb30)

Yet Fraps (http://www.fraps.com/) (yellow figures in the lower right corner) shows that my OpenGL window still renders at an FPS rate of 60Hz. How can I believe that such a setup can be used for any serious benchmarking?

1.3. What should the unfortunate notebook owners do whose frame rate is always set to only 30Hz?

2. Lock into VSYNC signal? Easily. A Ring 0 kernel driver that bypasses Windows abstraction layers and reads from/writes into the HW I/O ports directly. I don't have the code but I have a precompiled DLL somewhere. Can send you one if you want. :)

ReneMiner
18-04-2014, 09:34
Hi Mike,

I thought it's about measuring the time different programming languages need to achieve the same result or to process the same routines, to find out which is the fastest.
It probably won't make a difference whether the PC is a slow or a fast one if all tests are run on the same machine. And if we compare results from different machines expressed as relative values (in %, 0.0..1.0 or similar), in the end it will probably turn out the same: we have some very fast stuff such as tB and O2 installed on our computers
:)
René
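
The relative scale suggested in this post can be sketched in Python using the timings from the first post of this thread. The helper relative_to_slowest and the shortened labels are made up for the illustration; the numbers are copied as posted and say nothing about other machines.

```python
# Timings in seconds, as listed in the first post.
timings_s = {
    "Racket (for/sum)": 3.0,
    "CLisp (script)": 2.5,
    "Racket (apply/map)": 2.0,
    "CLisp (bytecode)": 0.6,
    "NewLisp (apply/map)": 0.45,
    "ThinBasic": 0.42,
    "Clojure": 0.4,
    "NewLisp (for)": 0.35,
    "Corman Lisp": 0.25,
    "GFA": 0.09,
    "ThinBasic + O2": 0.03,
}


def relative_to_slowest(timings):
    """Scale every timing into 0.0..1.0, where 1.0 is the slowest entry."""
    slowest = max(timings.values())
    return {name: t / slowest for name, t in timings.items()}


relative = relative_to_slowest(timings_s)
for name, value in sorted(relative.items(), key=lambda item: item[1]):
    print(f"{name:<20} {value:6.1%}")
```

Such unit-free values only compare entries measured on the same machine with the same test.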

mike lobanovsky
18-04-2014, 10:12
Hello there René,

Glad to meet you!

Yup, it's about measuring relative efficiency of various programming languages for very narrow, specialized tasks under unnatural loop benchmark conditions. But naw, it will make a difference if a PC is "slow" because a "slow" PC these days means a completely different - obsolete - microelectronics technology and architectural design. There's no direct timing proportion between a "slow" PC and a "fast" one (actually, some pieces of machine code may work faster, i.e. more efficiently, on an older, "slower" PC) just as there is no sense in relative benchmarks being measured in units of time. They are simply not applicable in the world of computers as long as there are no objective timing instruments in it either. If there were, GetTickCount() would've been called GetElapsedMilliseconds(). :)

Ticks are OK, clocks are OK, your proposed relative per cent are OK, even simple comparative bar graphs without any measurement units at all are OK - anything but nano/micro/milli/seconds. And only together with an exact statement of what particular hardware the benchmark results were obtained on and for what particular code snippets. Any other data is uninformative and senseless. We can hardly beat Einstein's genius and his Special Relativity here.

Regards,

mike lobanovsky
18-04-2014, 17:45
Hello Charles,

In the meantime, can you please delete your last message from this thread and copy-paste it into a new topic, say, "FPS Control in 3D Camera" somewhere on Petr Schreiber's wonderful TBGL board?

It's a separate and specialized but very interesting topic that needs its own technical discussion. Let's keep it all in one place, OK?

Thank you,

Charles Pegge
19-04-2014, 05:31
Mike, I'll move it to the OxygenBasic forum since TBGL has its own sync system.

Topic Branch: OpenGL Performance / measurements for adjusting sleep-time between frames

http://www.oxygenbasic.org/forum/index.php?topic=1049.0#new

mike lobanovsky
20-04-2014, 07:14
Thanks a lot, Charles, please meet me there today.

And a Happy Easter to the thinBasic community! :)