How do I download other people's models onto my computer?

Can I please get extremely simple step-by-step instructions on how to download the models of other people from huggingface.com onto my personal desktop computer? Visual aides and videos would be very helpful.


Doesn’t get much more straightforward than this… It also depends a little on the model itself.
I’m not sure what background you are coming from, but you will obviously need an environment to run Python in to do this.

It also really depends on the type of model you are looking at. I assume you are interested in large language models. On the model card on the Hugging Face website you can click ‘Use this model’, pick the Python library you want to use it with (usually the transformers library), and you will be given code that you can copy, paste, and execute to download and use the model.
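One thing the generated snippet does not tell you is where the files end up: the transformers/huggingface_hub libraries download into a local cache, not your working folder. A stdlib-only sketch of the default cache location (the `HF_HOME` override and the `~/.cache/huggingface` default are my reading of the huggingface_hub docs; verify for your version):

```python
# Where do downloaded model files go? By default huggingface_hub caches them
# under ~/.cache/huggingface/hub; the HF_HOME environment variable can move
# that base directory. This sketch only computes the path, it downloads nothing.
import os
from pathlib import Path

def default_hf_cache() -> Path:
    # HF_HOME, if set, replaces the ~/.cache/huggingface base directory.
    base = os.environ.get("HF_HOME", str(Path.home() / ".cache" / "huggingface"))
    return Path(base) / "hub"

print(default_hf_cache())
```

Handy when you want to find (or delete) models a script has already pulled down.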

Other than that… I hate saying “just use AI”, but this is a comparatively simple process that has been documented hundreds of times on the Internet; ChatGPT and other models will be great at helping you out.


I honestly just want to know how to properly download models like SicariusSicariiStuff/Impish_Bloodmoon_12B_LoRA onto my computer and use them with KoboldAI and SillyTavern. Are you at least reasonably sure the link you gave me will work for someone who knows nothing about SillyTavern, KoboldAI, or whatever Hugging Face is for?


SicariusSicariiStuff/Impish_Bloodmoon_12B_LoRA is a LoRA (not a full model itself, but more like a diff applied on top of one), so it’s tricky to use on its own. Since there seems to be a GGUF build with the LoRA already applied, why not start with that?


What you’re trying to do

  • Hugging Face = a website where people upload model files. (Hugging Face)
  • KoboldCpp = the program that actually runs a local model (mostly .gguf files) and exposes a local API. (SillyTavern Docs)
  • SillyTavern = the chat UI that connects to that local API. (SillyTavern Docs)

Important: SicariusSicariiStuff/Impish_Bloodmoon_12B_LoRA is a LoRA adapter, not a ready-to-run model file. The repo contains adapter_model.safetensors (and config files). (Hugging Face)
For the simplest beginner setup with KoboldCpp + SillyTavern, you usually want a GGUF model file (example below). (SillyTavern Docs)


Extremely simple step-by-step (Windows) — the easiest path

Step 1) Install SillyTavern (easy method: “SillyTavern Launcher”)

  1. Press Windows + R

  2. Paste this and press Enter:

    cmd /c winget install -e --id Git.Git
    
  3. Open File Explorer, go to where you want SillyTavern installed (example: C:\AI\)

  4. Click the address bar, type cmd, press Enter

  5. Paste this and press Enter:

    git clone https://github.com/SillyTavern/SillyTavern-Launcher.git && cd SillyTavern-Launcher && start installer.bat
    
  6. Follow the launcher’s prompts. (SillyTavern Docs)


Step 2) Download and run KoboldCpp

  1. Go to the KoboldCpp Releases page and download koboldcpp.exe (Windows). (SillyTavern Docs)

  2. Double-click it to run.

    • If Windows warns you, allow it to run (the SillyTavern docs mention the Defender popup). (SillyTavern Docs)

Step 3) Download a GGUF model from Hugging Face (recommended for KoboldCpp)

For your Impish Bloodmoon example, a ready-to-run GGUF option is:

  • mradermacher/Impish_Bloodmoon_12B-i1-GGUF (GGUF quants) (Hugging Face)

Do this:

  1. On that model page, click “Files and versions”

  2. Download one GGUF file (beginner-friendly picks listed on the page):

    • i1-Q4_K_M (often a good default), or
    • i1-Q4_K_S (a bit smaller) (Hugging Face)
  3. Save it somewhere simple, e.g.:

    • C:\AI\Models\Impish_Bloodmoon_12B\
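To see why the quant choice matters, you can ballpark the download size from the parameter count and the bits-per-weight of each quantization. The bpw figures below are rough community estimates, not exact values for this repo; treat them only as sizing guidance:

```python
# Rough GGUF file-size estimate: size ≈ parameters × bits-per-weight / 8.
# The bpw numbers are approximate (assumed, not measured from this repo).
APPROX_BPW = {"Q4_K_S": 4.6, "Q4_K_M": 4.85, "Q6_K": 6.6, "Q8_0": 8.5}

def approx_gguf_size_gb(n_params: float, quant: str) -> float:
    """Estimated file size in gigabytes for a given quantization."""
    return n_params * APPROX_BPW[quant] / 8 / 1e9

for q in ("Q4_K_S", "Q4_K_M"):
    print(f"12B model at {q}: ~{approx_gguf_size_gb(12e9, q):.1f} GB")
```

So for a 12B model, Q4_K_M lands around 7 GB on disk (and needs at least that much free RAM/VRAM to run), while Q4_K_S shaves off a few hundred megabytes.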

Step 4) Load the model in KoboldCpp

  1. Open koboldcpp.exe
  2. Click Browse next to Model and select your .gguf file
  3. Click Launch
  4. A browser page (KoboldAI Lite) should open to test it. (SillyTavern Docs)

If it’s slow or crashes: reduce Context in KoboldCpp before launching, or pick a smaller GGUF quant. (SillyTavern Docs)


Step 5) Connect SillyTavern to KoboldCpp

  1. Start SillyTavern

  2. Open API Connections

  3. Set:

    • API = Text Completion

    • API Type = KoboldCpp

    • Server URL =

      http://127.0.0.1:5001/
      
  4. Click Connect (SillyTavern Docs)
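If Connect fails, it helps to know whether KoboldCpp is actually listening before blaming SillyTavern. A stdlib-only check of the default port (5001 is KoboldCpp’s usual default; adjust if you changed it):

```python
# Sanity check: is anything answering on KoboldCpp's default port?
# Returns False instead of raising when nothing is running, so it is
# always safe to run.
import urllib.request

def server_up(url: str = "http://127.0.0.1:5001/", timeout: float = 2.0) -> bool:
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 400
    except OSError:  # connection refused, timeout, DNS failure, etc.
        return False

print("KoboldCpp reachable:", server_up())
```

If this prints False while KoboldCpp is running, check that you launched the model (Step 4) and that no firewall is blocking localhost.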


How to download from Hugging Face “properly” (more reliable than browser downloads)

This is useful for huge files and resumable downloads.

  1. Install Python (if you don’t have it).

  2. Open Command Prompt and run:

    pip install -U huggingface_hub
    
  3. Download a specific file into a folder you choose:

    hf download mradermacher/Impish_Bloodmoon_12B-i1-GGUF Impish_Bloodmoon_12B.i1-Q4_K_M.gguf --local-dir C:\AI\Models\Impish_Bloodmoon_12B
    

The hf download command and --local-dir option are documented by Hugging Face. (Hugging Face)
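Under the hood, each file in a repo is served from a predictable “resolve” URL (the same link the website’s download button points at), which is what makes scripted downloads possible. A minimal stdlib sketch of that URL pattern (the pattern matches the Hugging Face Hub docs as I understand them; double-check for private or gated repos):

```python
# Build the direct download URL for a file in a Hugging Face repo:
#   https://huggingface.co/<repo_id>/resolve/<revision>/<filename>
from urllib.parse import quote

def hf_file_url(repo_id: str, filename: str, revision: str = "main") -> str:
    # quote() handles spaces or unusual characters in revision/filename.
    return (f"https://huggingface.co/{repo_id}/resolve/"
            f"{quote(revision)}/{quote(filename)}")

print(hf_file_url("mradermacher/Impish_Bloodmoon_12B-i1-GGUF",
                  "Impish_Bloodmoon_12B.i1-Q4_K_M.gguf"))
```

The `hf download` CLI is still the better tool for multi-gigabyte files, since it resumes interrupted transfers; this URL is mainly useful with `curl -L -C -` or a download manager.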


About the LoRA you mentioned (why it’s not beginner-friendly with KoboldCpp)

SicariusSicariiStuff/Impish_Bloodmoon_12B_LoRA contains a LoRA adapter (adapter_model.safetensors) and configs. (Hugging Face)

KoboldCpp can apply a LoRA, but its text LoRA support is for GGUF models and it applies a LoRA file on top of a GGUF base model (via --lora). (GitHub)
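To make the “differential” idea concrete: a LoRA is a low-rank update W′ = W + scale · (B @ A) added to a frozen base weight matrix, which is why an adapter file is useless without its base model. A tiny pure-Python sketch (real adapters store the A and B factors in adapter_model.safetensors; the 2×2 numbers here are made up for illustration):

```python
# Miniature LoRA: apply a rank-r update (B @ A) on top of a frozen weight W.
def matmul(B, A):
    # Plain list-of-lists matrix multiply: (m x r) @ (r x n) -> (m x n).
    return [[sum(B[i][k] * A[k][j] for k in range(len(A)))
             for j in range(len(A[0]))] for i in range(len(B))]

def apply_lora(W, A, B, scale=1.0):
    delta = matmul(B, A)                      # low-rank update, rank = len(A)
    return [[W[i][j] + scale * delta[i][j]
             for j in range(len(W[0]))] for i in range(len(W))]

W = [[1.0, 0.0], [0.0, 1.0]]                  # frozen base weight (2x2)
A = [[1.0, 2.0]]                              # rank-1 factors: A is 1x2 ...
B = [[0.5], [0.25]]                           # ... and B is 2x1
print(apply_lora(W, A, B))                    # -> [[1.5, 1.0], [0.25, 1.5]]
```

This is also why “merging” a LoRA (option 3 below) is possible at all: once W′ is computed, it can be saved as an ordinary full model.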

Beginner-friendly options:

  1. Use a ready-to-run GGUF build (like the mradermacher/...-GGUF repo above). (Hugging Face)

  2. Use a backend that loads base+LoRA in the common PEFT safetensors format (often text-generation-webui / “Oobabooga”), then connect SillyTavern to it at:

    http://127.0.0.1:5000/
    

    (SillyTavern Docs)

  3. Advanced: convert/merge the LoRA into a GGUF workflow (not recommended as your first attempt). KoboldCpp’s wiki explains what LoRA is and the relevant flags. (GitHub)


Videos (step-by-step)

  • “Installing Silly Tavern with KoboldCPP (KCPP)” (youtube.com)
  • “Installing KoboldCPP on Windows” (youtube.com)
  • “Local Character.ai Alternative! - KoboldCpp & SillyTavern …” (youtube.com)
  • “SillyTavern Launcher (automatic installation …)” (youtube.com)