Bug 2255828

Summary: Review Request: llama-cpp - Port of Facebook's LLaMA model in C/C++
Product: [Fedora] Fedora
Reporter: Tom Rix <trix>
Component: Package Review
Assignee: Tomas Tomecek <ttomecek>
Status: CLOSED RAWHIDE
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: medium
Priority: unspecified
Version: rawhide
CC: boeroboy, fweimer, jin, package-review, pbrobinson, petersen, thunderbirdtr, ttomecek, xavier
Target Milestone: ---
Keywords: AutomationTriaged
Target Release: ---
Flags: ttomecek: fedora-review+
Hardware: Unspecified
OS: Linux
URL: https://github.com/ggerganov/llama.cpp
Fixed In Version: llama-cpp-b2417-2.fc41
Last Closed: 2024-06-16 08:06:27 UTC
Bug Blocks: 1011110
Attachments:
  The .spec file difference from Copr build 6854918 to 7160798 (flags: none)

Description Tom Rix 2023-12-25 16:07:21 UTC
Spec URL: https://trix.fedorapeople.org/llama-cpp.spec
SRPM URL: https://trix.fedorapeople.org/llama-cpp-b1695-1.fc40.src.rpm

Description:
The main goal of llama.cpp is to run the LLaMA model using 4-bit
integer quantization on a MacBook

* Plain C/C++ implementation without dependencies
* Apple silicon first-class citizen - optimized via ARM NEON, Accelerate
  and Metal frameworks
* AVX, AVX2 and AVX512 support for x86 architectures
* Mixed F16 / F32 precision
* 2-bit, 3-bit, 4-bit, 5-bit, 6-bit and 8-bit integer quantization support
* CUDA, Metal and OpenCL GPU backend support

The original implementation of llama.cpp was hacked in an evening.
Since then, the project has improved significantly thanks to many
contributions. This project is mainly for educational purposes and
serves as the main playground for developing new features for the
ggml library.

Reproducible: Always

Comment 1 Onuralp Sezer 2024-01-03 14:04:18 UTC
[fedora-review-service-build]

Comment 2 Onuralp Sezer 2024-01-03 14:07:01 UTC
I am taking this review

Comment 3 Fedora Review Service 2024-01-03 14:17:58 UTC
Copr build:
https://copr.fedorainfracloud.org/coprs/build/6854918
(succeeded)

Review template:
https://download.copr.fedorainfracloud.org/results/@fedora-review/fedora-review-2255828-llama-cpp/fedora-rawhide-x86_64/06854918-llama-cpp/fedora-review/review.txt

Please take a look if any issues were found.


---
This comment was created by the fedora-review-service
https://github.com/FrostyX/fedora-review-service

If you want to trigger a new Copr build, add a comment containing new
Spec and SRPM URLs or [fedora-review-service-build] string.

Comment 4 Sergey Bostandzhyan 2024-03-13 18:26:17 UTC
Hi Tom,

I compiled the SRPM on Rawhide just fine, but I was surprised that no command-line utilities were supplied. Could we also get a `-tools` or `-cli` package containing the main executable, which allows running prompts from the command line? I know there are a bunch of other tools there as well, but I am not yet familiar with them; perhaps some of those would be nice to have too.

Another question: could we get a ROCm-enabled version, for example with a hipBLAS backend (https://github.com/ggerganov/llama.cpp#hipblas) or any other backend that would allow running inference on the GPU?

Comment 5 Tom Rix 2024-03-14 20:54:33 UTC
Spec URL: https://trix.fedorapeople.org/llama-cpp.spec
SRPM URL: https://trix.fedorapeople.org/llama-cpp-b2417-1.fc41.src.rpm

This is the most recent llama-cpp.
I looked at packaging the examples, but fedora-review flagged many issues.
I have every interest in enabling the HIP backend and any others that people want, but I want to do that later or when someone sends me a PR.

Comment 6 Fedora Review Service 2024-03-14 21:09:44 UTC
Created attachment 2021676 [details]
The .spec file difference from Copr build 6854918 to 7160798

Comment 7 Fedora Review Service 2024-03-14 21:09:47 UTC
Copr build:
https://copr.fedorainfracloud.org/coprs/build/7160798
(succeeded)

Review template:
https://download.copr.fedorainfracloud.org/results/@fedora-review/fedora-review-2255828-llama-cpp/fedora-rawhide-x86_64/07160798-llama-cpp/fedora-review/review.txt

Please take a look if any issues were found.



Comment 8 Jens Petersen 2024-03-15 07:33:20 UTC
(In reply to Tom Rix from comment #5)
> Spec URL: https://trix.fedorapeople.org/llama-cpp.spec
> SRPM URL: https://trix.fedorapeople.org/llama-cpp-b2417-1.fc41.src.rpm

> I looked at packaging the examples, there were many issues flagged in
> fedora-review.

Hmm, could you elaborate or list them here? Obviously they won't count against the review if the examples are not enabled.

Not having any executable available for quick experimentation seems a shame, though;
maybe some other package could provide that functionality?
Admittedly, having /usr/bin/main doesn't seem great either:
that seems worth bringing up and discussing with upstream.

Comment 9 Tom Rix 2024-03-15 12:26:27 UTC
I am trying to get to the llama-cpp-python package.
I think there will be plenty to play with when the python package hits.

Comment 10 Jens Petersen 2024-03-18 06:13:17 UTC
(In reply to Tom Rix from comment #9)
> I am trying to get to the llama-cpp-python package.
> I think there will be plenty to play with when the python package hits.

That's probably reasonable for real use, but for some of us
a big attraction of llama-cpp is that there are no python deps. ;o)

I believe the README is littered with 'main' examples.
Perhaps /usr/bin/main could be renamed as llama-cpp even?

Comment 11 Jens Petersen 2024-03-18 06:16:36 UTC
Onuralp, are you still planning to complete this review?

Comment 12 Tomas Tomecek 2024-03-18 10:31:46 UTC
I agree with Sergey about the CLI tools: there are many of them, and "main", "quantize" and "server" in particular would be really helpful to have. These tools have generic names, which is unfortunate in the context of the whole operating system. We should prefix them, with something like llama.cpp-main, llama.cpp-quantize, etc.
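
The prefixing idea above could be sketched roughly as follows. This is only an illustration, not what the spec file does: the helper name `prefix_binaries` and the throwaway demo directory are hypothetical, and the `llama.cpp-` prefix is simply the one floated in this comment.

```shell
#!/bin/sh
# Hypothetical helper: rename generically named example binaries in a
# staging directory so that they carry a package-specific prefix.
# Usage: prefix_binaries DIR PREFIX NAME...
prefix_binaries() {
    dir=$1
    prefix=$2
    shift 2
    for name in "$@"; do
        # Skip names that were not built; rename the rest in place.
        if [ -f "$dir/$name" ]; then
            mv -- "$dir/$name" "$dir/$prefix$name"
        fi
    done
}

# Demo on a throwaway directory (a stand-in for the buildroot's bindir).
demo_dir=$(mktemp -d)
touch "$demo_dir/main" "$demo_dir/quantize" "$demo_dir/server"
prefix_binaries "$demo_dir" "llama.cpp-" main quantize server
ls "$demo_dir"
```

In a real package a step like this would presumably run in %install against the buildroot, after the examples had been built and installed.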

I wonder how supported these binaries are because all their sources live in examples/ dir: https://github.com/ggerganov/llama.cpp/tree/master/examples

Comment 13 Tomas Tomecek 2024-03-18 10:39:17 UTC
Relevant upstream issue re binaries and downstream packaging: https://github.com/ggerganov/llama.cpp/issues/5106

Comment 14 Tom Rix 2024-03-19 00:38:46 UTC
(In reply to Tomas Tomecek from comment #12)
> Agreed with Sergey about the CLI tools: there are many of them, especially
> "main", "quantize" and "server" that would be really helpful to have. These
> tools have generic names which is unfortunate in the context of the whole
> operating system. We should prefix them with something like llama.cpp-main,
> llama.cpp-quantize, etc.
> 
> I wonder how supported these binaries are because all their sources live in
> examples/ dir: https://github.com/ggerganov/llama.cpp/tree/master/examples

These are examples; they are not part of the normal build, and you have to enable LLAMA_BUILD_EXAMPLES to get them to build.
If upstream has not committed to making them part of the normal build and supporting them, we should not either.

Comment 15 Sergey Bostandzhyan 2024-03-19 20:26:51 UTC
> These are examples, they are not part of the normal build, you have to
> enable LLAMA_BUILD_EXAMPLES to get them to build.
> If the upstream has not committed to them being part of the normal build and
> supporting them, we should not either.

Hmm, I did not have to set any LLAMA_BUILD_EXAMPLES env vars or options. As I remember, I followed the cmake build instructions for a hipBLAS-enabled build and it compiled the binary utilities (examples) automatically?
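
One way to settle whether the examples build by default is to look at how the option is declared in the project's CMakeLists.txt. The sketch below assumes the option is declared on a single line in the common `option(NAME "help text" DEFAULT)` form; the helper name `cmake_option_default` is hypothetical, and nothing here asserts what upstream's default actually is.

```shell
#!/bin/sh
# Hypothetical helper: print the default given to a CMake option()
# declaration written on one line as: option(NAME "help text" DEFAULT)
# Prints nothing if the option is not found or has no explicit default.
# Usage: cmake_option_default CMakeLists.txt LLAMA_BUILD_EXAMPLES
cmake_option_default() {
    file=$1
    name=$2
    # Match the declaration line and keep only the token before ')'.
    sed -n "s/^[[:space:]]*option([[:space:]]*$name[[:space:]]\{1,\}\".*\"[[:space:]]\{1,\}\([^[:space:])]*\)[[:space:]]*).*/\1/p" "$file"
}

# Whatever the default is, it can be overridden when configuring, e.g.:
#   cmake -B build -DLLAMA_BUILD_EXAMPLES=ON
```

Note that the printed default may itself be a variable reference rather than a literal ON or OFF, in which case the effective value depends on how the project is being built.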

Comment 16 Tomas Tomecek 2024-03-22 13:15:21 UTC
I agree with Tom. If upstream doesn't officially support those binaries, then building them in Fedora would not reflect the upstream decision, even though leaving them out reduces the usability of our downstream build.

I suggest we move on with this review and open a tracking bug for the request to include the binaries.

Comment 17 Tomas Tomecek 2024-03-22 14:32:38 UTC
I am assigning this to myself given Onuralp's inactivity.


Package Review
==============

Legend:
[x] = Pass, [!] = Fail, [-] = Not applicable, [?] = Not evaluated
[ ] = Manual review needed



===== MUST items =====

C/C++:
[x]: Provides: bundled(gnulib) in place as required.
     Note: Sources not installed
[x]: Package does not contain kernel modules.
[x]: If your application is a C or C++ application you must list a
     BuildRequires against gcc, gcc-c++ or clang.
     rpm -q --requires ./results/llama-cpp-b2417-1.fc41.src.rpm
     cmake
     gcc-c++
[x]: Header files in -devel subpackage, if present.
[x]: Package does not contain any libtool archives (.la)
[x]: Package contains no static executables.
[x]: Rpath absent or only used for internal libs.
[x]: Development (unversioned) .so files in -devel subpackage, if present.

Generic:
[x]: Package is licensed with an open-source compatible license and meets
     other legal requirements as defined in the legal section of Packaging
     Guidelines.
[x]: License field in the package spec file matches the actual license.
     Note: Checking patched sources after %prep for licenses. Licenses
     found: "Unknown or generated", "MIT License", "*No copyright* MIT
     License", "Apache License 2.0 and/or MIT License", "*No copyright* The
     Unlicense", "MIT License and/or The Unlicense", "Apache License 2.0".
     456 files have unknown license.
[x]: License file installed when any subpackage combination is installed.
[x]: If the package is under multiple licenses, the licensing breakdown
     must be documented in the spec.
[x]: Package requires other packages for directories it uses.
     Note: No known owner of /usr/lib64/cmake, /usr/lib64/cmake/Llama
[!]: Package must own all directories that it creates.
     Note: Directories without known owners: /usr/lib64/cmake/Llama
[x]: %build honors applicable compiler flags or justifies otherwise.
[x]: Package contains no bundled libraries without FPC exception.
[x]: Changelog in prescribed format.
[x]: Sources contain only permissible code or content.
[x]: Package contains desktop file if it is a GUI application.
[x]: Development files must be in a -devel package
[x]: Package uses nothing in %doc for runtime.
[x]: Package consistently uses macros (instead of hard-coded directory
     names).
[x]: Package is named according to the Package Naming Guidelines.
[x]: Package does not generate any conflict.
[x]: Package obeys FHS, except libexecdir and /usr/target.
[x]: If the package is a rename of another package, proper Obsoletes and
     Provides are present.
[x]: Requires correct, justified where necessary.
[x]: Spec file is legible and written in American English.
[x]: Package contains systemd file(s) if in need.
[x]: Useful -debuginfo package or justification otherwise.
[x]: Package is not known to require an ExcludeArch tag.
[x]: Large documentation must go in a -doc subpackage. Large could be size
     (~1MB) or number of files.
     Note: Documentation size is 58243 bytes in 1 files.
[x]: Package complies to the Packaging Guidelines
[x]: Package successfully compiles and builds into binary rpms on at least
     one supported primary architecture.
[x]: Package installs properly.
[x]: Rpmlint is run on all rpms the build produces.
     Note: There are rpmlint messages (see attachment).
[x]: If (and only if) the source package includes the text of the
     license(s) in its own file, then that file, containing the text of the
     license(s) for the package is included in %license.
[x]: The License field must be a valid SPDX expression.
[x]: Package does not own files or directories owned by other packages.
[x]: Package uses either %{buildroot} or $RPM_BUILD_ROOT
[x]: Package does not run rm -rf %{buildroot} (or $RPM_BUILD_ROOT) at the
     beginning of %install.
[x]: Macros in Summary, %description expandable at SRPM build time.
[x]: Dist tag is present.
[x]: Package does not contain duplicates in %files.
[x]: Permissions on files are set properly.
[x]: Package must not depend on deprecated() packages.
[x]: Package use %makeinstall only when make install DESTDIR=... doesn't
     work.
[x]: Package is named using only allowed ASCII characters.
[x]: Package does not use a name that already exists.
[x]: Package is not relocatable.
[x]: Sources used to build the package match the upstream source, as
     provided in the spec URL.
[x]: Spec file name must match the spec package %{name}, in the format
     %{name}.spec.
[x]: File names are valid UTF-8.
[x]: Packages must not store files under /srv, /opt or /usr/local

===== SHOULD items =====

Generic:
[x]: If the source package does not include license text(s) as a separate
     file from upstream, the packager SHOULD query upstream to include it.
[x]: Final provides and requires are sane (see attachments).
[x]: Package functions as described.
[x]: Latest version is packaged.
[x]: Package does not include license text files separate from upstream.
[x]: Sources are verified with gpgverify first in %prep if upstream
     publishes signatures.
     Note: gpgverify is not used.
[x]: %check is present and all tests pass.
[x]: Packages should try to preserve timestamps of original installed
     files.
[x]: Reviewer should test that the package builds in mock.
[x]: Buildroot is not present
[x]: Package has no %clean section with rm -rf %{buildroot} (or
     $RPM_BUILD_ROOT)
[x]: No file requires outside of /etc, /bin, /sbin, /usr/bin, /usr/sbin.
[x]: Fully versioned dependency in subpackages if applicable.
[x]: Packager, Vendor, PreReq, Copyright tags should not be in spec file
[x]: Sources can be downloaded from URI in Source: tag
[x]: SourceX is a working URL.
[x]: Package should compile and build into binary rpms on all supported
     architectures.
[x]: Spec use %global instead of %define unless justified.

===== EXTRA items =====

Generic:
[x]: Rpmlint is run on debuginfo package(s).
     Note: No rpmlint messages.
[x]: Rpmlint is run on all installed packages.
     Note: There are rpmlint messages (see attachment).
[x]: Large data in /usr/share should live in a noarch subpackage if package
     is arched.
[x]: Spec file according to URL is the same as in SRPM.


Rpmlint
-------
Checking: llama-cpp-b2417-1.fc41.x86_64.rpm
          llama-cpp-devel-b2417-1.fc41.x86_64.rpm
          llama-cpp-debuginfo-b2417-1.fc41.x86_64.rpm
          llama-cpp-debugsource-b2417-1.fc41.x86_64.rpm
          llama-cpp-b2417-1.fc41.src.rpm
============================================================================================================================= rpmlint session starts =============================================================================================================================
rpmlint: 2.5.0
configuration:
    /usr/lib/python3.12/site-packages/rpmlint/configdefaults.toml
    /etc/xdg/rpmlint/fedora-legacy-licenses.toml
    /etc/xdg/rpmlint/fedora-spdx-licenses.toml
    /etc/xdg/rpmlint/fedora.toml
    /etc/xdg/rpmlint/scoring.toml
    /etc/xdg/rpmlint/users-groups.toml
    /etc/xdg/rpmlint/warn-on-functions.toml
rpmlintrc: [PosixPath('/tmp/tmp0gd62j_4')]
checks: 32, packages: 5

llama-cpp.src: E: spelling-error ('ggml', '%description -l en_US ggml -> SGML')
llama-cpp.x86_64: E: spelling-error ('ggml', '%description -l en_US ggml -> SGML')
llama-cpp-devel.x86_64: E: spelling-error ('ggml', '%description -l en_US ggml -> SGML')
llama-cpp.x86_64: W: no-documentation
======================================================================================= 5 packages and 0 specfiles checked; 3 errors, 1 warnings, 28 filtered, 3 badness; has taken 1.0 s ========================================================================================




Rpmlint (debuginfo)
-------------------
Checking: llama-cpp-debuginfo-b2417-1.fc41.x86_64.rpm
============================================================================================================================= rpmlint session starts =============================================================================================================================
rpmlint: 2.5.0
configuration:
    /usr/lib/python3.12/site-packages/rpmlint/configdefaults.toml
    /etc/xdg/rpmlint/fedora-legacy-licenses.toml
    /etc/xdg/rpmlint/fedora-spdx-licenses.toml
    /etc/xdg/rpmlint/fedora.toml
    /etc/xdg/rpmlint/scoring.toml
    /etc/xdg/rpmlint/users-groups.toml
    /etc/xdg/rpmlint/warn-on-functions.toml
rpmlintrc: [PosixPath('/tmp/tmp84egxzuu')]
checks: 32, packages: 1

======================================================================================== 1 packages and 0 specfiles checked; 0 errors, 0 warnings, 5 filtered, 0 badness; has taken 0.2 s ========================================================================================





Rpmlint (installed packages)
----------------------------
============================ rpmlint session starts ============================
rpmlint: 2.5.0
configuration:
    /usr/lib/python3.12/site-packages/rpmlint/configdefaults.toml
    /etc/xdg/rpmlint/fedora-legacy-licenses.toml
    /etc/xdg/rpmlint/fedora-spdx-licenses.toml
    /etc/xdg/rpmlint/fedora.toml
    /etc/xdg/rpmlint/scoring.toml
    /etc/xdg/rpmlint/users-groups.toml
    /etc/xdg/rpmlint/warn-on-functions.toml
checks: 32, packages: 4

llama-cpp.x86_64: E: spelling-error ('ggml', '%description -l en_US ggml -> SGML')
llama-cpp-devel.x86_64: E: spelling-error ('ggml', '%description -l en_US ggml -> SGML')
llama-cpp.x86_64: W: no-documentation
 4 packages and 0 specfiles checked; 2 errors, 1 warnings, 23 filtered, 2 badness; has taken 0.6 s 



Source checksums
----------------
https://github.com/ggerganov/llama.cpp/archive/b2417.tar.gz#/llama.cpp-b2417.tar.gz :
  CHECKSUM(SHA256) this package     : 92337a672d9236266080a57647988a4e8c82f72afc6f4af55a1e33b609ac4715
  CHECKSUM(SHA256) upstream package : 92337a672d9236266080a57647988a4e8c82f72afc6f4af55a1e33b609ac4715
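
The comparison above can be repeated by hand. The following is a minimal sketch, assuming `sha256sum` from coreutils is available; the helper name `same_sha256` is hypothetical and is not part of fedora-review.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if two files have the same SHA-256 digest.
# Usage: same_sha256 FILE_A FILE_B
same_sha256() {
    # Read each file on stdin so that only the digest is compared,
    # not the file name that sha256sum would otherwise print.
    a=$(sha256sum < "$1" | cut -d' ' -f1)
    b=$(sha256sum < "$2" | cut -d' ' -f1)
    [ -n "$a" ] && [ "$a" = "$b" ]
}
```

For example, `same_sha256 llama.cpp-b2417.tar.gz upstream.tar.gz && echo match` would compare the tarball from the SRPM with a freshly downloaded copy; both file names here are placeholders.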


Requires
--------
llama-cpp (rpmlib, GLIBC filtered):
    libc.so.6()(64bit)
    libgcc_s.so.1()(64bit)
    libgcc_s.so.1(GCC_3.0)(64bit)
    libgcc_s.so.1(GCC_3.3.1)(64bit)
    libm.so.6()(64bit)
    libstdc++.so.6()(64bit)
    libstdc++.so.6(CXXABI_1.3)(64bit)
    libstdc++.so.6(CXXABI_1.3.5)(64bit)
    rtld(GNU_HASH)

llama-cpp-devel (rpmlib, GLIBC filtered):
    cmake-filesystem(x86-64)
    libllama.so.b2417()(64bit)
    llama-cpp(x86-64)

llama-cpp-debuginfo (rpmlib, GLIBC filtered):

llama-cpp-debugsource (rpmlib, GLIBC filtered):



Provides
--------
llama-cpp:
    libllama.so.b2417()(64bit)
    llama-cpp
    llama-cpp(x86-64)

llama-cpp-devel:
    cmake(Llama)
    cmake(llama)
    llama-cpp-devel
    llama-cpp-devel(x86-64)

llama-cpp-debuginfo:
    debuginfo(build-id)
    libllama.so.b2417-b2417-1.fc41.x86_64.debug()(64bit)
    llama-cpp-debuginfo
    llama-cpp-debuginfo(x86-64)

llama-cpp-debugsource:
    llama-cpp-debugsource
    llama-cpp-debugsource(x86-64)



Generated by fedora-review 0.10.0 (e79b66b) last change: 2023-07-24
Command line :/usr/bin/fedora-review -b 2255828
Buildroot used: fedora-rawhide-x86_64
Active plugins: C/C++, Generic, Shell-api
Disabled plugins: R, Python, fonts, Haskell, PHP, Java, Ocaml, Perl, SugarActivity
Disabled flags: EXARCH, EPEL6, EPEL7, DISTTAG, BATCH



Please add ownership of `/usr/lib64/cmake/Llama` in dist-git, otherwise all good, very nice work!
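
Directory ownership is usually fixed with a `%dir` entry in the relevant `%files` section. Whether a built package then lists the directory can be checked from its file list. The sketch below is an assumption-laden illustration: the helper name `lists_path` is hypothetical, and it only inspects a list of paths supplied on stdin.

```shell
#!/bin/sh
# Hypothetical helper: read a package file list on stdin (one path per
# line, for example from `rpm -qlp`) and succeed only if the given path
# appears in it as an exact, whole-line match.
# Usage: rpm -qlp PACKAGE.rpm | lists_path /usr/lib64/cmake/Llama
lists_path() {
    grep -qxF -- "$1"
}
```

A package that ships files under the directory but does not own it would list only the files, so the exact-line match fails for the directory itself.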

Comment 18 Jens Petersen 2024-03-23 05:55:04 UTC
Thanks Tomas for proceeding 👍

Is it possible to add the examples sources to a doc or examples subpackage? Dunno how easy it is for users to build them themselves?

Comment 19 Fedora Admin user for bugzilla script actions 2024-03-23 11:47:26 UTC
The Pagure repository was created at https://src.fedoraproject.org/rpms/llama-cpp

Comment 20 Jens Petersen 2024-06-16 08:16:13 UTC
I think it would still be nice to enable the examples subpackage.