Trouble in (my) Paradise
Greets,
After taking way longer than I first envisioned, I've got my most important program functioning and spitting out the results that I had hoped for. But, and there's always a but, I'm running into random crashes of thinBasic and the thinCore.dll. So I come looking for any guidance on the issue before I fully consider rewriting a bunch of code.
This thread gives some insight into what I work with, as far as data and files: http://www.thinbasic.com/community/showthread.php?t=11464
Some of the files are rather large, such as for the U.S. of A., where the data in the files is a grid of 10801x10801. I subsequently split these files into quarters, each a grid of 5401x5401. Using some sample code given by Eros in the linked thread, I load the file into a string using FILE_Load and then parse it into two separate strings. The first string holds the first six lines, which give the geographic location, and the second string holds the rest of the file. The second string then gets parsed into MyMatrix(), giving me my 5401 rows and 5401 columns. After that, I get into the data manipulation and then write out the data to a file.
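For readers following along outside thinBasic, the load-and-split step described above can be sketched roughly like this in Python. This is only an illustration, not the script from the thread: the function name `split_header_and_grid` and the placeholder header lines `h1` to `h6` are made up, and the only facts taken from the posts are the six header lines and the space-delimited data rows.

```python
def split_header_and_grid(text, header_lines=6):
    """Split a grid file's text into header lines and a 2-D list of cell strings."""
    lines = text.splitlines()          # handles both \n and \r\n line endings
    header = lines[:header_lines]      # the first six lines: positioning data
    # Every remaining non-empty line is one row of space-delimited cells.
    grid = [line.split() for line in lines[header_lines:] if line.strip()]
    return header, grid


# Hypothetical miniature file: six placeholder header lines, then a 2x3 grid.
sample = "\r\n".join(["h1", "h2", "h3", "h4", "h5", "h6", "1 2 3", "4 5 6"])
header, grid = split_header_and_grid(sample)
print(len(header), len(grid), len(grid[0]))  # prints: 6 2 3
```

Cells stay as strings here, which mirrors parsing into a string matrix; converting them to numbers would be a separate step.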
Before the manipulation loop ends, I go and Redim MyMatrix() and then the program starts all over reading the next data file, parsing into two strings and parsing the data string into MyMatrix(). This is when I get random crashes from thinBasic, listing thinCore.dll as the faulting module.
Memory usage is not great enough to blow through the Win32 limit. At a certain point parsing the data string thinBasic may get up to around 1.2gb of memory use before settling back down to around 1gb of memory use until MyMatrix() gets Redim'd. I have inserted a Sleep(10000) to allow a little time for the memory drain to occur before the loop starts all over again. I tried taking out the Redim MyMatrix() command and instead leaving MyMatrix at (5401,5401) and using nested loops to reset the elements back to a null string of "". That crashed also.
The crashes can come at any time, but they appear after the first process and always back at the top of the code, where I am trying to Parse another file. The error messages are strange, as in one from yesterday where it said the variable SourceFileName should be of range 1 to 1, but was 2372. What is strange is SourceFileName is a string variable.
So are there any thoughts as to what I can do to make the processing of the large matrix more secure? Could I programmatically generate a new name for MyMatrix(), so that subsequent loops might create MyMatrix1(), MyMatrix2() and so on, in the hopes that any repetitive use of the matrix might be a contributing factor to the crashes? Otherwise I fear that I am going to have to consider stepping through the source file line by line and incurring what I believe would be a much longer processing time.
Thanks for any assistance,
Lance
P.S. for Eros: I'm more than happy to share thinBasic source and my data files. I can upload to my web site if you wanted to take a look. But you MUST promise not to laugh at my spaghetti code! :D
ErosOlmi
05-06-2013, 21:29
P.S. for Eros: I'm more than happy to share thinBasic source and my data files. I can upload to my web site if you wanted to take a look. But you MUST promise not to laugh at my spaghetti code! :D
As John said, I too will be happy to have a look at the data and at enough code to show the crash.
If I can help solve the problem or fix any bug inside thinBasic, I will be happy.
Because you mentioned more than 1 GB, my mind started to think it could be an internal thinBasic limit.
For example, moving 1 GB of string buffer into another string buffer in some internal function can require 3 GB of memory, causing a crash.
PS: it is impossible you created spaghetti code: thinBasic has no GOTO and no GOSUB :)
ErosOlmi
05-06-2013, 22:30
It is already written everywhere in the thinBasic documentation:
"... thinBasic is a Basic like language interpreter ..."
Greets to all,
I've just about finished uploading the file to my site, but Eros' first comment set off a bell inside my head and I want to try something first.
The largest file that thinBasic has to load is 220mb, with 99.9% of that size being read into MyMatrix(). But there is one thing I haven't done and am going to try out. I read all the data into a string and then Parse the string into MyMatrix(). That initial string is still populated from the file read, so I'm going to null it out once MyMatrix() is filled and see if the random crashes still appear. That should reduce the memory load by a considerable amount. I hope?
And Eros, I taught myself how to program back on Quick Basic. Me and GOTO / GOSUB are old friends! I'm positive a "modern day programmer" will look at my code and say "Huh?", as Object Orientation and inheritance and the rest of those weird things were just being developed.
Lance
Greets to all,
The edits weren't successful, as the program crashed after 5 of 30 files. But the memory usage did go down by a couple hundred MB, down into the 800-850mb range.
And I forgot to mention that I'm on v1.9.0.0, dated December 13, 2011.
For John: You wrote "Are these single index element arrays you're working with? Would a combination of element and associative based arrays that can be redefined at will with any type of any value be helpful. (system memory is the only limitation. number of dimensions are unlimited)". And best I can understand I'd respond with No, they are space delimited files after the first six lines. The first six lines are for positioning of the data and rows after that are for the data. The amount of data that this type of GIS file can hold can range from very small to very large and they don't have to be square as mine are. It's just how I deal with the material, which is in one degree increment of latitude and longitude. Bottom line I guess would be, if there's a better way of working with the data it's beyond my capabilities.
I'm hoping to find the "limit" of thinBasic, so to speak, as while these files appear large, I have others that are of greater resolution and will need to understand how to slice and dice them to a workable size for thinBasic to handle.
Thanks to all for any help along the way,
Lance
ErosOlmi
06-06-2013, 21:47
Hi Lance,
I'm checking your code and script execution.
I have to admit it is not easy: I do not know the subject matter, your code is quite complex, and there are too many global variables, which increases complexity when debugging.
In any case I can see many points where things can be improved, but first I need to understand what the program is doing.
I will add some console output data in order to understand where the program is crashing. So far no crash.
I will get back to you as soon as I discover something.
Ciao
Eros
Greets Eros,
Too many global variables? I've tried making variables local to the sub-routine, but then it tends to make me even more confused as I work through the code.
At the top of the folder structure are a few screen grabs of when thinBasic crashes. You will note it's saying a variable is out of range, but references the MyMatrix() fill process.
I appreciate your taking the time and I've figured that if I have to, I can decrease the size of the files being read. But then I'd need some guidance as to what an acceptable size for the MyMatrix() data might be. As I mentioned at the top, the original file size is 10801x10801 and I've quartered that. So the next step is to go and make 16 files at 2701x2701. My Canada source files are 4801x4801, so would those need to be quartered? Other areas of the world might be 1201x1201, so where is the "sweet spot" on file size?
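The grid sizes quoted above follow a simple pattern, and a small Python sketch makes the cell counts easy to compare. It assumes (the thread does not say so explicitly, but the numbers imply it) that neighbouring tiles share one boundary row and column, which is why half of 10801 comes out as 5401 rather than 5400. The helper name `tile_edge` is made up for illustration, and the sketch says nothing about which size thinBasic handles safely.

```python
def tile_edge(full_edge, splits_per_side):
    """Edge length of one tile when a grid is cut into splits_per_side pieces
    per side and adjacent tiles share their boundary row/column (assumption)."""
    return (full_edge - 1) // splits_per_side + 1


# Sizes mentioned in the thread: the full U.S. grid, its quarters, its
# sixteenths, the Canada files, and the smaller files from other areas.
for edge in (tile_edge(10801, 1), tile_edge(10801, 2), tile_edge(10801, 4), 4801, 1201):
    print(f"{edge} x {edge} = {edge * edge:,} cells")
```

Going from quarters to sixteenths cuts the cell count per file from 29,170,801 to 7,295,401, roughly a factor of four.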
I think I mentioned that I've run 27 files through and have thinBasic crash on #28. Then when I restart I might get to #5 before it crashes.
I've earned all the gray hair on my head, thank you! :D
Best regards,
Lance
ErosOlmi
06-06-2013, 23:06
At the moment I'm trying to optimize Work1 function.
Looping through the 29 million cells of MyMatrix just to rewrite them into a new file takes too long (almost 2 minutes here).
Maybe I will develop new dedicated thinBasic native functions to optimize such loops.
ErosOlmi
07-06-2013, 01:15
Hi Lance,
so far I was just able to identify where the problem occurs: the WORK2W function.
I've modified your source to add some debug info to each function, so that I could understand where the problem was occurring.
I will send the modified source code by mail (with some optimization in the append step) plus the error log.
I hope it helps you understand something more.
I think one file is copied twice or erased just before it is loaded and parsed.
I will go on testing tomorrow.
Ciao
Eros
PS: please download and install current thinBasic 1.9.6
It is much better than 1.9.0
Greets Eros,
I will get 1.9.6.0 installed once I finish this post and will check out the file with your email.
I appreciate the time and effort. And the fact that you didn't say my code was horrible. Cryptic? Yes, it confuses me if I don't work on it for a short time! Now to get it to work all the time and make its way through about 15,000 files. That's the goal! Hit the Run button and let it churn through a whole lot of files.
Best regards,
Lance
Greets all,
After looking at the great effects of the debug function that Eros put in, I can clearly see that I need to do a better job with my variables. The .log.txt file shows me that the code is branching into places where it shouldn't be going. The test area and its files are all single or dual files, and none of the dual files should need to go into the Work2W sub-routine. None of the files should need to go into any of the Work4?? sub-routines either, but the code is looking there, and that tells me I've done a poor job of resetting variables back to null or zero, whichever the case may be.
It's a convoluted process for what is (at the moment) a simple task of replacing some data with another data set, all in an attempt to work with a varying set of data, in different sizes.
John, I've downloaded Script Basic to look at down the road. But as I'm in a continual learning process with thinBasic and still have to dabble some more in the 500lb gorilla known as Visual Basic, my poor head can only handle so much.
I'm hoping in a couple, three days to come back and say "It works!". Keep your fingers crossed.
Lance
ErosOlmi
07-06-2013, 22:18
Lance,
in the source code I sent you, please change the append process from this:
For RowNum = 1 To RowTotal
  sLine = ""
  For ColNum = 1 To ColTotal
    sLine += MyMatrix(RowNum, ColNum) & IIf$(ColNum < ColTotal, $SPC, $CRLF)
  Next
  FILE_Append(FileWriteName, sLine)
Next
to this:
Dim sLine(ColTotal) As String
For RowNum = 1 To RowTotal
  For ColNum = 1 To ColTotal
    sLine(ColNum) = MyMatrix(RowNum, ColNum) '& IIf$(ColNum < ColTotal, $SPC, $CRLF)
  Next
  FILE_Append(FileWriteName, Join$(sLine, $SPC) + $CRLF)
Next
It will run 50% faster with your matrix of 5401x5401=29170801 cells
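As a rough cross-check of the change above, here is a Python sketch of the same two strategies: building each line by repeated concatenation versus collecting the cells and joining them once. It is an illustration only, with made-up function names, and it makes no claim about thinBasic internals or timings; it simply shows that both approaches yield identical lines for non-empty rows, so the swap does not alter the output file.

```python
def rows_by_concat(matrix):
    """Build each output line by appending one cell plus its delimiter at a time."""
    lines = []
    for row in matrix:
        line = ""
        for i, cell in enumerate(row):
            # A space after every cell except the last, which gets CRLF.
            line += cell + (" " if i < len(row) - 1 else "\r\n")
        lines.append(line)
    return lines


def rows_by_join(matrix):
    """Build each output line with a single join over the row, then add CRLF."""
    return [" ".join(row) + "\r\n" for row in matrix]


matrix = [["1", "2", "3"], ["4", "5", "6"]]  # tiny stand-in for the 5401x5401 grid
print(rows_by_concat(matrix) == rows_by_join(matrix))  # prints: True
```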
I'm working on speeding up this part, which is the slowest, even more.
Ciao
Eros
Greets all,
First to Eros, thanks for the tweaked code. It will be implemented as I go through the code. In general testing I found it took about four minutes to process an area and since I have separate systems that can run all night, processing time isn't a great concern. But if I can speed things up, that's always a good thing and it will help on the electric bill.
For John, I am going to be honest and say that I found/find your earlier comments insulting and also, I find them highly inappropriate within the thinBasic forums. I have never intended to say that there is a problem or limitation that hinders my work within thinBasic, only that I had reached a point in my efforts that was giving me difficulties. The "debug code" put in by Eros serves only to write to a log file when a sub-routine is entered and when it is through and exiting the sub-routine. It took about one second to look at the text file and for me to realize that I needed to do a better job of writing the code, as I know, not think, which sub-routine would apply for a particular area. Eros was smart enough to put that in there and for that, I am thankful, as it gives me the ability to correct some code that should have been there in the first place.
When I come here with a thinBasic question, I hope to receive thinBasic assistance. Not to be told that I should look at another offering and then afforded a condescending attitude when I don't want to. I've chosen the tool that I wish to use at this time. If I felt there was a problem with thinBasic being able to handle my needs I would quietly start up Visual Basic and begin work on transferring the code. But no product would spit out the results I am seeing if there was an underlying GIGO effect.
The data sample you ask about can be found here: www.lcsims.com/storage3/N20W156HIa.rar (http://www.lcsims.com/storage3/N20W156HIa.rar) This is a part of The Big Island in Hawai'i. My efforts are to replace parts of a geographic area with different data, better suited to fix visual anomalies within a computer game. This is a commercial endeavor for me and I won't release it until it meets my standards. Once it does, I'll be able to stick my tongue out at the people who have done this type of work in the past and say "See, it can be done and over a wide area! Nanner, nanner, nanner!" While the premise behind my work is now a simplistic data replacement, it involves knowing whether I need to work with one file, two files or four files. Then it has to determine in which direction the secondary file(s) are and what their name(s) are. A highly trained programmer might shudder at my code. I don't care about pretty, only if it works.
I hope the data sample helps you with your efforts.
Lance
Greets all,
Small update on my efforts. Incorporated some changes suggested by Eros and did a little house cleaning on the variables, which was to reset them back to 0 in the primary While...Wend loop and darn'd if that puppy didn't run through on all my single call files. Made similar changes to the two-calls files and will re-run everything in the morning.
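The housekeeping fix described above (resetting the state variables at the top of the main loop) can be illustrated with a small Python sketch. Everything in it is hypothetical: the flag names, the routine labels, and the idea of encoding each area as a count of extra files are stand-ins, not the structure of the real script. It only shows how a flag left over from an earlier pass can send a later file into the wrong routine.

```python
def route_files(extra_files_per_area, reset_each_pass=True):
    """Return which routine ("single", "two" or "four") handles each area.

    extra_files_per_area: hypothetical encoding, 0, 1 or 3 extra files per area.
    """
    chosen = []
    use_two = use_four = False
    for extra in extra_files_per_area:
        if reset_each_pass:
            # Clear per-area state so nothing leaks in from the previous pass.
            use_two = use_four = False
        if extra == 1:
            use_two = True
        elif extra == 3:
            use_four = True
        if use_four:
            chosen.append("four")
        elif use_two:
            chosen.append("two")
        else:
            chosen.append("single")
    return chosen


areas = [3, 0, 1]
print(route_files(areas))                         # ['four', 'single', 'two']
print(route_files(areas, reset_each_pass=False))  # ['four', 'four', 'four']
```

Without the reset, the first four-file area poisons every later pass, which matches the symptom of single and dual files wandering into the four-file routines.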
Eros, the changes for writing out to disk are almost shocking. Whereas before an area would process each of its file(s) in around 4 minutes, the single files (24 of them) processed in 23 minutes on my three year old i7 930. Trying to project that out in my head, where before I anticipated 2-3 days to process all areas in Canada, now it will take less than one day. Bravo Zulu, as the term is used within the flight simulation community.
Now all I anticipate having to work through is when four files are required, but a lot of that is cut and paste. :D
Lance
ErosOlmi
08-06-2013, 10:37
Thanks a lot Lance for the update.
I will try to give you some more ... power in the next few days. I still have a couple of ideas for new native thinBasic functions that should help others too.
Ciao
Eros
Yesterday there were 3 pages. Where did page 3 go?
Bill
ErosOlmi
09-06-2013, 13:13
Lance,
I'm happy to say that I was able to significantly reduce time needed to append to file your 5401x5401 matrix (29 million cells) from 1 minute to just 7 seconds on my machine that is very similar to yours.
I modified the JOIN$ function so that it can JOIN$ a two-dimensional matrix in a more clever way.
The append part of your script will look like this:
'---Append header data
FILE_Append(FileWriteName, FileWriteData)
'---Append 5401x5401 matrix using $SPC for element delimiter and $CRLF for line delimiter
FILE_Append(FileWriteName, Join$(MyMatrix, $SPC, $CRLF))
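For comparison, the idea of joining a whole two-dimensional matrix with one delimiter between cells and another between rows can be sketched in Python as below. This is an illustration under stated assumptions, not a description of how the modified Join$ is implemented: the function name `join_matrix` is made up, and the thread does not say whether Join$ emits a trailing line delimiter after the last row (this sketch does not).

```python
def join_matrix(matrix, cell_sep=" ", row_sep="\r\n"):
    """Join every row's cells with cell_sep, then join the rows with row_sep."""
    return row_sep.join(cell_sep.join(row) for row in matrix)


matrix = [["1", "2"], ["3", "4"]]  # tiny stand-in for the 5401x5401 grid
text = join_matrix(matrix)
print(repr(text))  # prints: '1 2\r\n3 4'
```

Producing the whole text in one pass means a single write per file instead of one write per row.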
I think this would reduce the time to handle your 15000 files to just a few hours.
I will release this feature in the next thinBasic update, 1.9.7, due in the next few days.
In the meantime, if you want to test it, just let me know and I will attach here new thinCore.dll that is thinBasic Core Engine.
Ciao
Eros
ErosOlmi
09-06-2013, 15:35
Lance,
I could not resist publishing the new thinBasic Core engine so you can test it.
Mainly because I need your confirmation that output results are correct.
Attached to this post you will find thinCore.dll
You need to have thinBasic 1.9.6 installed, then replace your \thinBasic\thinCore.dll with the attached one.
Then test the Append process I suggested in previous posts with this one single line:
Debug 3, "Append file: " & FileWriteName
n1 = FILE_Append(FileWriteName, FileWriteData)
FILE_Append(FileWriteName, Join$(MyMatrix, $SPC, $CRLF))
' Dim sLine(ColTotal) As String
' For RowNum = 1 To RowTotal
' For ColNum = 1 To ColTotal
' sLine(ColNum) = MyMatrix(RowNum, ColNum)
' Next
' FILE_Append(FileWriteName, Join$(sLine, $SPC) + $CRLF)
' Next
Debug 2, "End Append"
Waiting for your comments about speed and data (it must be the one you expect :) ).
Ciao
Eros
ErosOlmi
09-06-2013, 20:53
You always consider people as potential users of this or that application.
I consider people for what they ask, trying to help them for the problem they post and trying to improve thinBasic when I can.
And just in one day I got suggestions for keeping me busy for one month: http://www.thinbasic.com/community/project.php
I didn't lose anyone: Lance can do whatever he prefers and whatever is best for him. I'm always happy when someone finds his way.
ErosOlmi
09-06-2013, 21:31
Anyone that cares about his/her time would have looked at the 6 lines of SB code and the few seconds it took and would have quit wasting your time.
John,
you just tested loading an 8 MB file into a matrix. Full stop.
Something thinBasic can do in 2 lines of code and in a few milliseconds:
Dim MyMatrix() As String
Parse(File "MyDataFile", MyMatrix(), $CRLF, $SPC)
Lance and I were talking about a pair of 256MB files, and the times referred to:
- loading 2 big files into matrices
- meshing matrices of 29 million cells
- saving 29 million cells back into a 250 MB file
And this repeated for 15000 files.
Something quite different from your example.
Eros
Waiting for your comments about speed and data (it must be the one you expect :) ).
Greets all,
I'm not a frequent visitor to the forums. Just when I need someone to nudge me with a "do better code" idea or a "kick in the ..." to get me headed in the right direction. Today I spent a warm spring day and evening spec'ing parts for a new computer build. That bundle of joy will hit my doorstep by next weekend, maybe? And I'm hoping to have a fast system, without the shortcomings of the system I'm typing on. Since I built it three years ago, it's had difficulties recognizing the 6gb of ram (tri-channel on an i7 930). So I worked with 4gb recognized. Way too many posts at the EVGA forums about this problem with the ram and I was one of the people posting. I relocated almost two years ago and had packed the box up and put it into storage until I found my new place to live. One of the first things I did upon moving was unpack my computer. Isn't that the normal thing to do? When I turned it on, lo and behold, the BIOS now said I had 6gb of working ram! Life was good!
So today I decide to give the system its spring cleaning with a can of compressed air and... I had bought some different ram modules that the mobo should fully recognize and give the rated speed. Popped out the old and punched in the new ram. Turned the system on and got 4gb recognized at 2/3 the rated speed. Restarted and went into the bios, made a change or two and rebooted..... Then after getting nothing and not wanting to spend any time debugging things, I went back and reinstalled the old ram and restarted, only to find I now have 4gb recognized, not the 6gb from earlier today.
So here's hoping the new parts go together well, as the research says they should. Until the new parts show up, I will get the new DLL loaded and make the code changes for you, Eros. Should have a first run through for you tomorrow evening, your time. I mentioned before about the work for Canada having an original estimate of two to three days, which then went down to less than one day. So I'm figuring by the time I get my work done, you will already have found a way to have it compiled before I hit the Green button! In amongst the thoughts of doing a project like mine, to think I can help make thinBasic even just a little better makes me feel good. So far the parsing of the data has been improved in thinBasic and now the writing out, that's a nice side benefit for hacking endeavors.
And John, I have looked at Flight Gear and commend the efforts of those involved, but the main name in Flight Simulation belongs to Microsoft, even still on a six year old product. Airplanes and systems utilized by actual airlines, flying into accurately modeled airports filled with other airplanes on the ground and in the skies. It might seem like a game to the uninitiated, but there is some heavy duty stuff done with the game. And there are plenty of people who want to whip out their credit cards to make the game look better. That's what I'm aiming for. I can make the world's terrain in about ummm, a week. But it's been done already and there are people who don't like the concessions that have to be made with higher resolution terrain. I'm hoping to show them that some of those concessions aren't necessary any more.
Best regards,
Lance
Greets Eros,
Ran through the "single files" and compared to the results of a couple of days ago. For 24 files to process, the old (revised) code took 22:33 from first line in the debug.txt file to the last. Latest code took 9:52, which rough math in my head says about 40% of the time for the old code. The new code is also hindered by the fact that the computer is no longer running in tri-channel memory. Well done!
A quick cursory look inside my GIS tool of choice didn't show any visual anomalies. But I will do a conversion into the game's required format and compile the files and check in-game. Maybe I'll post a pic or two of what I'm aiming for that will make me the world famous... :D
Time for a bowl of ice cream and then off to bed for this old man!
Lance
Greets All,
I wanted to come back and touch on the speed issue, as I was getting some supporting files ready earlier for some production work.
Into my large matrixes (5401x5401~) goes a bunch of smaller matrixes, in the average range of 400x400, though they vary. I had 1,144 of the small files to modify and save out to new files. I wondered "how long will this take?", as I had only done a much smaller set a while back. Before starting the task I went in and reviewed the code and saw that I could write the matrix to file using the revised File_Append(FileName, Join$(MyMatrix, $SPC, $CRLF)). Hit the compile button and went to make my dinner. When I came back the job had finished all 1,145 files in 20 minutes. Just for fun, I went back to the earlier code and started the process a second time. On average with the old code I got about 11 files processed in TWO minutes, compared to the revised code doing 60 to 70 files in ONE minute. Impressive tweak, Eros!
Slowly my little project comes together and I appreciate thinBasic making that happen.
Lance
ErosOlmi
01-07-2013, 17:56
Thanks for the project update.
I didn't expect that speed, but I'm glad you got it.
John, John, John... it's NOT a game, it's a simulator! :rolleyes: Or so the "hardcore simmers" would lead you to believe.
My efforts deal with terrain enhancement. This particular project has been floating around in my head for a couple of years and would have been finished long ago, but some health issues popped up (pun intended for me) and it's taken longer than expected. I hope to have the initial product done by the end of July, which may or may not garner much attention from the Microsoft-based side of the FS community. I think I have a couple of pictures that I can share shortly. But it's mostly minimizing the effects of adding higher resolution terrain data into the game. For me, the terms "hardcore" or "serious" simmer are divisive and meant with a level of disdain. That's why I like to mimic those who are unable to admit they like to sit at a computer, play a game and be entertained.
Lance