Binary Data



Michael Clease
09-06-2008, 10:18
I am trying to produce some byte values in a string, but it seems to only create the string version.

Say I want the number 234 in a string: it creates the bytes 32 33 34 (hex, the characters "2", "3", "4") rather than the single byte 234.

I am trying to output this data to a file.


USES "FILE"

%DataSize = 10

DIM n AS LONG
DIM Data as STrING
DIM NewData AS STRING
dim FileName AS STRING
Dim FileHandle As DWORD
Dim Status As DWORD

FileName = APP_SCRIPTPATH + "TestData"

For n = 1 to %DataSize
Data += VAL(rnd(1,255))
NEXT

For n = 1 to %DataSize
NewData += HEX$(MID$(Data,N,1))
NewData += " "
NEXT

FileHandle = FILE_Open (FileName+ "1.Dat", "BINARY") ' Open file for writing
Status = FILE_PUT (FileHandle, Data) ' Write String to File
Status = FILE_Close(FileHandle) ' Release File

'FILE_SAVE(FileName+"1.Dat", Data)
FILE_SAVE(FileName+"2.TXT", NewData)
msgBOX 0, NewData

ErosOlmi
09-06-2008, 11:12
Abraxas,

when you add a number to a string, thinBasic automatically transforms that number into its "human string representation", so the following are equivalent:

Data += VAL(rnd(1,255))
Data += rnd(1,255)
and are like:

Data += trim$(STR$(rnd(1,255)))

I think you want to store the character whose ASCII code is that number, so do something like:

Data += CHR$(rnd(1,255))

Or if you want to store full binary numbers, just use the MK_$ functions: http://www.thinbasic.com/public/products/thinBasic/help/html/mkx.htm
Those functions depend on the magnitude of the numeric expression and return one or more bytes depending on the numeric magnitude you want to store. Example using BYTE magnitude:

Data += MKBYT$(rnd(1,255))

Hope to have interpreted correctly what you need.

Ciao
Eros
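The difference Eros describes can be seen in a few lines. This is a minimal sketch in Python rather than thinBasic, used only so it can be run anywhere. The 4-byte little-endian layout in the last case is an assumption chosen for illustration, not a statement about what any particular MK_$ function emits.

```python
import struct

value = 234

# Appending the number as text stores one byte per decimal digit.
as_text = str(value).encode("ascii")
# Storing the number as a single byte keeps the value itself (the CHR$ / byte case).
as_byte = bytes([value])
# A wider fixed-size binary encoding: 4 bytes, little-endian (assumed layout).
as_dword = struct.pack("<I", value)


def hex_dump(data: bytes) -> str:
    """Return the bytes as space-separated two-digit hex values."""
    return " ".join(f"{b:02X}" for b in data)


print(hex_dump(as_text))   # 32 33 34
print(hex_dump(as_byte))   # EA
print(hex_dump(as_dword))  # EA 00 00 00
```

The first line of output is exactly the "32 33 34" from the original question: three digit characters instead of one byte with the value 234 (hex EA).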

Petr Schreiber
09-06-2008, 11:23
Then the complete code goes as follows:


USES "FILE"

%DataSize = 10

DIM n        AS LONG
DIM Data     AS STRING
DIM NewData  AS STRING
DIM FileName AS STRING

FileName = APP_SCRIPTPATH + "TestData"

' -- Build a string of random bytes, one character per value
FOR n = 1 TO %DataSize
  Data += CHR$(RND(1,255))
NEXT

' -- Build a readable hex listing of those bytes
FOR n = 1 TO %DataSize
  NewData += HEX$(ASC(Data, n))
  NewData += " "
NEXT

' -- No need for binary mode to write binary data
FILE_SAVE(FileName+"1.Dat", Data)
FILE_SAVE(FileName+"2.TXT", NewData)
MSGBOX 0, NewData



Bye,
Petr

Michael Clease
09-06-2008, 20:51
Thanks guys, you understood what I wanted.

It is for an idea I had about compression. Everyone knows that ASCII crunches really well, so I thought: why not turn the data into ASCII hex values and then compress it? I know it makes the files a lot bigger before they are crunched, but they do seem to get a lot smaller.

I will do some more testing and come back to you with details.

ErosOlmi
09-06-2008, 20:55
Abraxas,

thinBasic has 2 functions taken from ZLib that are useful when you want to zip/unzip strings:
STRZIP$ (http://www.thinbasic.com/public/products/thinBasic/help/html/strzip$.htm) and STRUNZIP$ (http://www.thinbasic.com/public/products/thinBasic/help/html/strunzip$.htm)

Maybe you can use them. There is an example in \thinBasic\SampleScripts\ZLib\ directory.

Ciao
Eros
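For readers unfamiliar with string compression, a compress/decompress round trip looks roughly like this. The sketch uses Python's standard `zlib` module as a stand-in. It does not show STRZIP$ or STRUNZIP$ themselves, and no claim is made that their output is byte-compatible with what this produces.

```python
import zlib

# Highly repetitive sample data, so the saving is easy to see.
original = b"thinBasic " * 1000

packed = zlib.compress(original)
unpacked = zlib.decompress(packed)

# Sizes before and after, then a check that nothing was lost.
print(len(original), len(packed))
print(unpacked == original)  # True
```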

Michael Clease
09-06-2008, 20:58
Thanks Eros, but ZLib doesn't support the zip/pkzip 2.5 file format, which is what I was looking for.

ErosOlmi
09-06-2008, 21:02
Ok, sorry.

Michael Clease
09-06-2008, 21:37
@Eros what's happening with this: http://www.jose.it-berater.org/smfforum/index.php?topic=83.0 ?

ErosOlmi
09-06-2008, 21:50
Here are my old tests so far.
I was testing some time ago in order to create a module, but later I transformed it into an include file.

Maybe we can get something usable for TBGL or other matters where size counts.

Eros

Michael Clease
09-06-2008, 22:43
That's just what I am looking for.

I think it would be a good addition to TB. Why not add it as a module?

Michael Clease
11-06-2008, 01:28
I found some time and this is what I found, you might be quite shocked ;)

The script is set to make a data buffer of about 100K.


Update: it didn't work, so I removed the bad script. See my next post for the update.

Petr Schreiber
11-06-2008, 01:40
Night of nice surprises,

thanks for sharing, really great stuff!
The compression ratio for TestData2 is very nice :)


Petr

kryton9
11-06-2008, 02:25
Why the difference between the two?

On mine, the original num bytes = 102400
zipped = 102620
So slight increase.

But Hex, wow
hex = 204800
zipped = 7295

Thanks for conversion and example!

Michael Clease
11-06-2008, 02:43
The increase is because of the header for the zip format.

ASCII crunches really well anyway. ASCII only fills 7 bits, so you have a spare bit; if you shift all characters left 1 bit, then for each 8 characters you save a byte.
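The 7-bit packing idea can be sketched as follows, in Python for illustration. The function names `pack7` and `unpack7` are hypothetical, and the bit order (most significant bit first, zero padding at the end) is an assumption; other layouts work equally well.

```python
def pack7(text: str) -> bytes:
    """Pack 7-bit ASCII characters back to back, dropping the unused top bit."""
    data = text.encode("ascii")  # raises UnicodeEncodeError for non-ASCII input
    bits = 0    # bit accumulator
    nbits = 0   # number of valid bits currently in the accumulator
    out = bytearray()
    for b in data:
        bits = (bits << 7) | b
        nbits += 7
        while nbits >= 8:
            nbits -= 8
            out.append((bits >> nbits) & 0xFF)
        bits &= (1 << nbits) - 1  # keep only the bits not yet written
    if nbits:
        out.append((bits << (8 - nbits)) & 0xFF)  # pad the last byte with zeros
    return bytes(out)


def unpack7(packed: bytes, count: int) -> str:
    """Reverse pack7; count is the original number of characters."""
    bits = 0
    nbits = 0
    chars = []
    for b in packed:
        bits = (bits << 8) | b
        nbits += 8
        while nbits >= 7 and len(chars) < count:
            nbits -= 7
            chars.append(chr((bits >> nbits) & 0x7F))
        bits &= (1 << nbits) - 1
    return "".join(chars)


print(len("ABCDEFGH"), len(pack7("ABCDEFGH")))  # 8 7
```

The character count has to be stored alongside the packed bytes, because the padding bits in the final byte could otherwise be mistaken for one more character.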

The zip format uses Huffman compression, which I believe is a binary weighted tree system. I think it works like this:

You find the most frequent byte, then the next, and so on.

Now take that first byte and represent it from an index as 1; the next would be 11, and so on.

This way a byte (8 bits) can be represented by 1 bit, and because it is the most frequent, it reduces the data size in most cases.

It's been about 17 years since I did work on Huffman compression back in my Amiga days, so I may be getting some of that wrong.
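A minimal Huffman sketch, in Python for illustration, shows the idea in practice. It is not the code used by any zip library (deflate combines LZ77 matching with Huffman coding), and the helper names are hypothetical. One detail differs from the description above: the codes must be prefix-free, so if the most frequent byte gets "1", no other code may start with "1".

```python
import heapq
from collections import Counter
from typing import Dict


def huffman_codes(data: bytes) -> Dict[int, str]:
    """Build a prefix-free bit string for every byte value that occurs in data."""
    freq = Counter(data)
    if not freq:
        return {}
    if len(freq) == 1:
        return {next(iter(freq)): "0"}
    # Heap entries: (weight, unique tie-breaker, {symbol: code built so far}).
    heap = [(w, i, {sym: ""}) for i, (sym, w) in enumerate(freq.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        # Merge the two lightest subtrees; one side gets a 0 bit, the other a 1 bit.
        w1, _, left = heapq.heappop(heap)
        w2, _, right = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in left.items()}
        merged.update({s: "1" + c for s, c in right.items()})
        heapq.heappush(heap, (w1 + w2, tie, merged))
        tie += 1
    return heap[0][2]


def encode(data: bytes, codes: Dict[int, str]) -> str:
    """Return the encoded data as a string of '0' and '1' characters."""
    return "".join(codes[b] for b in data)


sample = b"aaaaaaaabbbbccd"  # 'a' x8, 'b' x4, 'c' x2, 'd' x1
codes = huffman_codes(sample)
bits = encode(sample, codes)
print(len(bits), len(sample) * 8)  # 25 120
```

In this sample the most frequent byte ends up with a 1-bit code and the rarest ones with 3-bit codes, so 15 bytes (120 bits) shrink to 25 bits before counting the space needed to store the code table.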

Petr Schreiber
11-06-2008, 08:58
Hi,

I am just learning Huffman compression for examination, so I created a little example to help me understand it, hope it might come handy for you to understand too ( or reveal I got it wrong ). See the PDF for colourful pictures :)

I also found an algorithm in PB (http://www.pbcrypto.com/view.php?algorithm=huffman) that does the Huffman coding. It could be translated to thinBASIC for learning purposes; for the rest we can use the DLL.


Petr

Michael Clease
11-06-2008, 18:04
I am sad to say I didn't check the hex data before compressing, and it was wrong: mainly the same character repeated :-[

I have fixed it, and my idea was incorrect, probably because of the increase in data size before crunching.

Try this version, which works. I will delete the other version.
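The corrected finding can be reproduced with a small experiment. This is a sketch under assumptions: it uses Python's `zlib` rather than the zip library from the thread, pseudo-random bytes as the test buffer, and a repeated "41" to imitate the faulty conversion, so the exact sizes will differ from those posted earlier.

```python
import random
import zlib

rng = random.Random(2008)  # fixed seed so the run is repeatable
raw = bytes(rng.randrange(256) for _ in range(100 * 1024))

hex_text = raw.hex().encode("ascii")  # two hex digits per byte, no separators
repeated = b"41" * len(raw)           # imitates a conversion that repeats one value

sizes = {}
for label, blob in (("raw", raw), ("hex", hex_text), ("repeated", repeated)):
    sizes[label] = (len(blob), len(zlib.compress(blob, 9)))
    print(label, *sizes[label])
```

Random bytes do not shrink at all. The hex text does compress to roughly half its own size, but that only undoes the doubling caused by the conversion, so the result is no smaller than the original buffer. Only the repeated-character text collapses to almost nothing, which matches the suspiciously good ratio seen before the fix.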

Petr Schreiber
11-06-2008, 19:10
Hi,

Works nicely! The library is a dream; adding multiple files is a piece of cake.
Just note that your samples add the full path, which is not always desirable.


Thanks,
Petr

Michael Clease
11-06-2008, 22:07
If anyone wants to work on a module from the original source code, here it is, from the Code Project site.

kryton9
12-06-2008, 02:14
Thanks, Abraxas.

Petr Schreiber
12-06-2008, 08:19
Abraxas,

that is a very good idea, to build a module from the ground up without dependency on the 2 DLLs.
More compact is better :). I am just not sure if the license (does it have any?) allows such a use.


Petr

Michael Clease
12-06-2008, 09:02
This is the Code Project page: http://www.codeproject.com/KB/library/LiteZip.aspx

Quote from the writer, Jeff Glatt (http://www.codeproject.com/script/Forums/View.aspx?fid=278047&msg=2389458):
The licensing details are the same as the code it is based upon, zlib at http://www.gzip.org/zlib by Jean-Loup Gailly and Mark Adler.

TB already uses zlib, so it should be OK. I suppose whoever wants to write the module could email him to confirm, but my "C" is a bit weak, so I don't think it will be me.


@Eros can you split this thread from my original question, so it makes more sense? Thanks.