Does a wrapper (or bindings) for llama.cpp coded in thinBasic exist? (module suggestion)



JohnClaw
14-04-2024, 00:55
Hi. I'm searching for bindings or a wrapper for llama.cpp coded in some dialect of Basic or a similarly easy-to-learn language. Does anybody know of something like that? If there is no such module in thinBasic, why not create one? This is a popular AI technology that could promote thinBasic. Thanks for your attention.

ErosOlmi
17-04-2024, 07:12
Ciao,

are you referring to this: https://github.com/ggerganov/llama.cpp ?

Very interesting, and a lot to study.

Personally, I'm just a standard OpenAI user, using it to study AI possibilities and get some ideas.

At work we used some AI features from Microsoft Azure Cognitive Services to automate some boring processes.
We used Sentiment Analysis and Text Classification to verify and classify user feedback on product purchases, integrating our e-commerce web site (hosted on AWS) with company backend databases, using Azure Functions as the glue between the two worlds.
And it is working pretty well ... so far, more than 200k customer reviews have been analyzed and integrated.

We'll see.
I've downloaded and installed LM Studio from https://lmstudio.ai/ to get some practice in my spare time.

Any suggestions on how to proceed?

Thanks
Eros

JohnClaw
18-04-2024, 16:17
Ciao,

Any suggestions on how to proceed?

Thanks
Eros

As far as I know, Pascal is easier to understand than C++, so I would recommend studying the Pascal wrapper for llama.cpp. This file, https://github.com/ortegaalfredo/fpc-llama/blob/main/llama.pas, is a Pascal port of the C++ header https://github.com/ggerganov/llama.cpp/blob/master/llama.h. And this is the code of a simple console app that can chat offline with AI models in .gguf format: https://github.com/ortegaalfredo/fpc-llama/blob/main/llama_cli.lpr It uses the llama.pas bindings to call functions from llama.dll.

To build it yourself:

1. Get llama.dll from the Neurochat installer: https://github.com/ortegaalfredo/neurochat/releases/download/0.6-dev/neurochat-0.6-setup-win64.exe After installing Neurochat, open its folder and you will find llama.dll there.
2. Copy llama.dll to the folder where you have put llama.pas, llama_cli.lpr and llama_cli.lpi (all three files can be found in this repo: https://github.com/ortegaalfredo/fpc-llama).
3. Download and install the free Lazarus IDE: https://sourceforge.net/projects/lazarus/files/Lazarus%20Windows%2064%20bits/Lazarus%203.2/lazarus-3.2-fpc-3.2.2-win64.exe/download
4. Open llama_cli.lpr in Lazarus and compile it. You will now have llama_cli.exe, which uses llama.dll to chat with AI model files.

An example of a local AI model: https://huggingface.co/TheBloke/Llama-2-7B-Chat-GGUF/resolve/main/llama-2-7b-chat.Q8_0.gguf?download=true To run it with llama_cli.exe you will need more than 8 GB of RAM. If you have less, try this version instead: https://huggingface.co/TheBloke/Llama-2-7B-Chat-GGUF/blob/main/llama-2-7b-chat.Q4_0.gguf Other AI models can be found here: https://huggingface.co/TheBloke
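Once you have llama.dll, a natural experiment is to see whether thinBasic can call it directly. Below is a minimal, untested sketch. It assumes a thinBasic build and a llama.dll of matching bitness (a 32-bit process cannot load a 64-bit DLL), that thinBasic can call the DLL's C calling convention, and that the exports match the llama.h of your build; llama_backend_init and llama_max_devices are real llama.h functions chosen because they have simple signatures, while anything that passes structs by value (like llama_load_model_from_file) would first need a flat C wrapper:

' Untested sketch: declare two simple llama.dll exports.
' Check llama.h / llama.pas for the exact signatures of your DLL build.
DECLARE SUB llama_backend_init LIB "llama.dll" ALIAS "llama_backend_init" ()
DECLARE FUNCTION llama_max_devices LIB "llama.dll" ALIAS "llama_max_devices" () AS DWORD

llama_backend_init
MSGBOX 0, "llama.cpp supports " & STR$(llama_max_devices()) & " device(s)"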

If you don't like Pascal, you can study other bindings/wrappers: C#/.NET - https://github.com/SciSharp/LLamaSharp, JavaScript/Wasm (works in the browser) - https://github.com/tangledgroup/llama-cpp-wasm

You can also study a one-file C implementation of an offline AI chatbot: https://github.com/karpathy/llama2.c/blob/master/run.c This project has bindings/wrappers too: C# - https://github.com/trrahul/llama2.cs, JavaScript - https://github.com/dmarcos/llama2.c-web Other bindings: https://github.com/karpathy/llama2.c?tab=readme-ov-file#notable-forks

Recently I discovered a Harbour wrapper for llama.cpp: https://gitflic.ru/project/alkresin/llama_prg The code of its minimal console AI chatbot looks rather simple: https://gitflic.ru/project/alkresin/llama_prg/blob?file=test1.prg&branch=master A little more complicated than the code of an average Basic dialect, but still understandable. The problem is that this console chatbot calls functions from hllama.cpp: https://gitflic.ru/project/alkresin/llama_prg/blob?file=source%2Fhllama.cpp&branch=master So the Harbour wrapper is built on C++, and that is quite hard to study if you are a newbie in coding.

However, I compiled llama.lib from the Harbour repo and now have some questions for you and the other experienced coders on this forum:

- Can thinBasic call functions from .lib files compiled with MSVC? Or does thinBasic have its own format of .lib files? If so, is there any way to convert an MSVC .lib to a thinBasic .lib?
- Can an MSVC .lib be converted to a .dll file? If yes, the AI chat functions could be called from that .dll and so would become accessible from thinBasic code (using some link command), right?
- I also compiled llama.obj, hllama.obj and other .obj files, and the same questions apply: Can thinBasic use MSVC .obj files? Do they need to be converted to a native thinBasic format? Is there a way to make a .dll from .obj files? (My untested guess at this route is sketched below.)

I can share llama.lib, hllama.obj and the other files, so just notify me if you need them.
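To make that guess concrete, here is an untested thinBasic sketch of the DLL route. Everything in it is hypothetical: it assumes the .obj files were linked into an hllama.dll exporting an invented flat helper function, hllama_ask, that writes the model's reply into a caller-supplied buffer:

' Hypothetical: assumes hllama.obj and friends were linked into hllama.dll,
' which exports an invented C function "hllama_ask" (stdcall for easy use
' from Basic) returning 0 on success.
DECLARE FUNCTION hllama_ask LIB "hllama.dll" ALIAS "hllama_ask" (sPrompt AS ASCIIZ, sReply AS ASCIIZ, BYVAL nMaxLen AS LONG) AS LONG

DIM sReply AS ASCIIZ * 4096

IF hllama_ask("Who are you?", sReply, SIZEOF(sReply)) = 0 THEN
  MSGBOX 0, sReply
END IF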

One last thing: I talked to the coder Cory Smith (founder of gotBasic.com). He converted the code of run.c (from the llama2.c project mentioned earlier) to VB.NET. I'm not a fan of VB.NET; I prefer simpler and more lightweight Basic dialects. But this converted code may be useful for you: https://jmp.sh/s/vIwmomlOkJrswE9ww33K

JohnClaw
25-04-2024, 18:39
I may have found a simplified version of llama.cpp: https://github.com/tinyBigGAMES/Dllama

Minimal code to chat with LLMs:

uses
  System.SysUtils,
  Dllama,
  Dllama.Ext;

var
  LResponse: string;
  LTokenInputSpeed: Single;
  LTokenOutputSpeed: Single;
  LInputTokens: Integer;
  LOutputTokens: Integer;
  LTotalTokens: Integer;

begin
  // init config
  Dllama_InitConfig('C:\LLM\gguf', -1, False, VK_ESCAPE);

  // add model
  Dllama_AddModel('Meta-Llama-3-8B-Instruct-Q6_K', 'llama3', 1024*8,
    '<|start_header_id|>%s %s<|end_header_id|>',
    '\n assistant:\n', ['<|eot_id|>', 'assistant']);

  // add messages
  Dllama_AddMessage(ROLE_SYSTEM, 'you are Dllama, a helpful AI assistant.');
  Dllama_AddMessage(ROLE_USER, 'who are you?');

  // display the user prompt
  Dllama_Console_PrintLn(Dllama_GetLastUserMessage(), [], DARKGREEN);

  // do inference
  if Dllama_Inference('llama3', LResponse) then
  begin
    // display usage
    Dllama_Console_PrintLn(CRLF, [], WHITE);
    Dllama_GetInferenceUsage(@LTokenInputSpeed, @LTokenOutputSpeed,
      @LInputTokens, @LOutputTokens, @LTotalTokens);
    Dllama_Console_PrintLn('Tokens :: Input: %d, Output: %d, Total: %d, Speed: %3.1f t/s',
      [LInputTokens, LOutputTokens, LTotalTokens, LTokenOutputSpeed], BRIGHTYELLOW);
  end
  else
  begin
    Dllama_Console_PrintLn('Error: %s', [Dllama_GetError()], RED);
  end;
  Dllama_UnloadModel();
end.

I talked to the author of Dllama. He will help me create bindings for BCX, or may even make them himself. If you are interested in creating bindings for thinBasic, join the Dllama Discord: https://discord.gg/tPWjMwK
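For anyone who wants a head start on such bindings, here is an untested thinBasic sketch of what binding Dllama might look like. The DLL name, signature and return convention below are inferred from the Pascal example above, not taken from the Dllama sources, so they must be checked against the actual exports first:

' Untested sketch: assumes a Dllama DLL exports Dllama_InitConfig with a
' signature matching the Pascal example above (model path, GPU layers,
' show-info flag, cancel key), returning nonzero on success.
DECLARE FUNCTION Dllama_InitConfig LIB "Dllama.dll" ALIAS "Dllama_InitConfig" (sModelPath AS ASCIIZ, BYVAL nGPULayers AS LONG, BYVAL bShowInfo AS LONG, BYVAL nCancelKey AS LONG) AS LONG

IF Dllama_InitConfig("C:\LLM\gguf", -1, 0, 27) <> 0 THEN   ' 27 = VK_ESCAPE
  MSGBOX 0, "Dllama config initialized"
END IF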