{"id":2725,"date":"2026-06-06T03:34:23","date_gmt":"2026-06-06T03:34:23","guid":{"rendered":"https:\/\/tucumandevelopers.com\/index.php\/2026\/06\/06\/run-gemma-4-12b-on-wsl2-with-llama-cpp\/"},"modified":"2026-06-06T03:34:23","modified_gmt":"2026-06-06T03:34:23","slug":"run-gemma-4-12b-on-wsl2-with-llama-cpp","status":"publish","type":"post","link":"https:\/\/tucumandevelopers.com\/index.php\/2026\/06\/06\/run-gemma-4-12b-on-wsl2-with-llama-cpp\/","title":{"rendered":"Run Gemma-4 12B on WSL2 with llama.cpp"},"content":{"rendered":"<div>\n<div><\/header>\n<div data-article-id=\"3831352\" id=\"article-body\">\n<h2> <a name=\"1-update-wsl-environment\" href=\"#1-update-wsl-environment\"> <\/a> 1. update WSL environment <\/h2>\n<div>\n<pre><code><span>sudo <\/span>apt update <span>&amp;&amp;<\/span> <span>sudo <\/span>apt upgrade <span>-y<\/span> <\/code><\/pre>\n<div>\n<\/p><\/div>\n<\/p><\/div>\n<h2> <a name=\"2-install-dependencies\" href=\"#2-install-dependencies\"> <\/a> 2. install dependencies <\/h2>\n<p>If you don&#8217;t use the <code>-hf<\/code> option, you don&#8217;t need to install libssl-dev in this step. <\/p>\n<div>\n<pre><code><span>sudo <\/span>apt <span>install <\/span>build-essential cmake git libssl-dev <span>-y<\/span> <\/code><\/pre>\n<div>\n<\/p><\/div>\n<\/p><\/div>\n<p>If <code>nvidia-smi<\/code> shows one or more GPUs in your terminal, you will need to install the CUDA toolkit. This will take some time. <\/p>\n<div>\n<pre><code><span>sudo <\/span>apt <span>install <\/span>nvidia-cuda-toolkit <span>-y<\/span> <\/code><\/pre>\n<div>\n<\/p><\/div>\n<\/p><\/div>\n<h2> <a name=\"3-clone-the-repo\" href=\"#3-clone-the-repo\"> <\/a> 3. clone the repo <\/h2>\n<p>Clone the repository, then build llama-cli and llama-server. This step will also take some time.<br \/> If you don&#8217;t plan to use the <code>-hf<\/code> option, you don&#8217;t need to pass <code>-DLLAMA_OPENSSL=ON<\/code>. 
<\/p>\n<div>\n<pre><code>git clone https:\/\/github.com\/ggerganov\/llama.cpp\n<span>cd <\/span>llama.cpp\ncmake <span>-B<\/span> build <span>-DGGML_CUDA<\/span><span>=<\/span>ON <span>-DLLAMA_OPENSSL<\/span><span>=<\/span>ON\ncmake <span>--build<\/span> build <span>--config<\/span> Release\n\n<span># no GPU<\/span>\ngit clone https:\/\/github.com\/ggerganov\/llama.cpp\n<span>cd <\/span>llama.cpp\ncmake <span>-B<\/span> build\ncmake <span>--build<\/span> build <span>--config<\/span> Release\n<\/code><\/pre>\n<div>\n<\/p><\/div>\n<\/p><\/div>\n<h2> <a name=\"4-run-the-model\" href=\"#4-run-the-model\"> <\/a> 4. run the model <\/h2>\n<p>Run <code>gemma-4-12b-it<\/code> with either the CLI or the server.<\/p>\n<div>\n<div>\n<h2> <a href=\"https:\/\/huggingface.co\/unsloth\/gemma-4-12b-it-GGUF\" target=\"_blank\" rel=\"noopener noreferrer\"> unsloth\/gemma-4-12b-it-GGUF \u00b7 Hugging Face <\/a> <\/h2>\n<\/p><\/div>\n<\/p><\/div>\n<div>\n<pre><code>.\/build\/bin\/llama-cli <span>-hf<\/span> unsloth\/gemma-4-12b-it-GGUF:UD-Q4_K_XL <\/code><\/pre>\n<div>\n<\/p><\/div>\n<\/p><\/div>\n<div>\n<pre><code><span>&gt;<\/span> hello\n\n<span>[<\/span>Start thinking]\nThe user said <span>\"hello\"<\/span><span>.<\/span> The user is initiating a conversation. Respond politely and offer assistance.\n<span>*<\/span> <span>\"Hello! How can I help you today?\"<\/span>\n<span>*<\/span> <span>\"Hi there! What's on your mind?\"<\/span>\n<span>*<\/span> <span>\"Hello! Is there anything I can assist you with?\"<\/span>\n<span>[<\/span>End thinking]\n\nHello! How can I <span>help <\/span>you today?\n
<span>[<\/span> Prompt: 19.5 t\/s | Generation: 11.8 t\/s <span>]<\/span> <\/code><\/pre>\n<div>\n<\/p><\/div>\n<\/p><\/div>\n<p>Or start <code>llama-server<\/code> and open <code>http:\/\/localhost:8080<\/code> in a browser to use the web UI. <\/p>\n<div>\n<pre><code>.\/build\/bin\/llama-server <span>-hf<\/span> unsloth\/gemma-4-12b-it-GGUF:UD-Q4_K_XL <span>--port<\/span> 8080 <\/code><\/pre>\n<div>\n<\/p><\/div>\n<\/p><\/div>\n<h3> <a name=\"optional-download-model-from-huggingface\" href=\"#optional-download-model-from-huggingface\"> <\/a> optional: download the model from Hugging Face <\/h3>\n<p>Instead of <code>-hf<\/code>, you can download the GGUF file yourself and load it with <code>-m<\/code>. <\/p>\n<div>\n<pre><code><span>mkdir<\/span> <span>-p<\/span> models\nwget <span>-O<\/span> models\/gemma-4-12b-it-UD-Q4_K_XL.gguf https:\/\/huggingface.co\/unsloth\/gemma-4-12b-it-GGUF\/resolve\/main\/gemma-4-12b-it-UD-Q4_K_XL.gguf <\/code><\/pre>\n<div>\n<\/p><\/div>\n<\/p><\/div>\n<\/p><\/div>\n<\/article>\n<p> <\/div>\n<p> <\/main> <\/div>\n<\/div>\n<\/div>\n<\/div>\n<p>Source: <a href=\"https:\/\/dev.to\/0xkoji\/run-gemma-4-12b-on-wsl2-with-llamacpp-1o2m\">Original article<\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p>1. update WSL environment sudo apt update &amp;&amp; sudo apt upgrade -y 2. install dependencies If you don&#8217;t use the -hf option, you don&#8217;t need to install libssl-dev in this step. sudo apt install build-essential cmake git libssl-dev -y If nvidia-smi shows one or more GPUs in your terminal, you will need to install the CUDA toolkit. 
This will [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":2724,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":"","jetpack_publicize_message":"","jetpack_publicize_feature_enabled":true,"jetpack_social_post_already_shared":true,"jetpack_social_options":{"image_generator_settings":{"template":"highway","default_image_id":0,"font":"","enabled":false},"version":2}},"categories":[41],"tags":[],"class_list":["post-2725","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-devto"],"jetpack_publicize_connections":[],"_links":{"self":[{"href":"https:\/\/tucumandevelopers.com\/index.php\/wp-json\/wp\/v2\/posts\/2725","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/tucumandevelopers.com\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/tucumandevelopers.com\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/tucumandevelopers.com\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/tucumandevelopers.com\/index.php\/wp-json\/wp\/v2\/comments?post=2725"}],"version-history":[{"count":0,"href":"https:\/\/tucumandevelopers.com\/index.php\/wp-json\/wp\/v2\/posts\/2725\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/tucumandevelopers.com\/index.php\/wp-json\/wp\/v2\/media\/2724"}],"wp:attachment":[{"href":"https:\/\/tucumandevelopers.com\/index.php\/wp-json\/wp\/v2\/media?parent=2725"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/tucumandevelopers.com\/index.php\/wp-json\/wp\/v2\/categories?post=2725"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/tucumandevelopers.com\/index.php\/wp-json\/wp\/v2\/tags?post=2725"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}