Instructions to use zai-org/glm-4-9b with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use zai-org/glm-4-9b with Transformers:

    # Use a pipeline as a high-level helper
    from transformers import pipeline

    pipe = pipeline("text-generation", model="zai-org/glm-4-9b", trust_remote_code=True)

    # Load model directly
    from transformers import AutoModel

    model = AutoModel.from_pretrained("zai-org/glm-4-9b", trust_remote_code=True, dtype="auto")

- Notebooks
- Google Colab
- Kaggle
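
The Transformers snippet above only loads the model. A minimal sketch of actually generating text is shown below. It assumes the repository's remote code works with the installed `transformers` release (several discussions on this model report `AttributeError` or `TypeError` failures from the custom tokenizer and model classes on some versions) and that the machine has enough memory for a 9B-parameter model. The helper names `build_generation_kwargs` and `generate_text` are illustrative, not part of any library.

```python
MODEL_ID = "zai-org/glm-4-9b"


def build_generation_kwargs(max_new_tokens=64, temperature=0.5):
    """Translate simple sampling settings into generate() keyword arguments."""
    kwargs = {"max_new_tokens": max_new_tokens}
    if temperature > 0:
        # Temperature only takes effect when sampling is enabled.
        kwargs.update(do_sample=True, temperature=temperature)
    else:
        # A temperature of 0 is treated here as a request for greedy decoding.
        kwargs["do_sample"] = False
    return kwargs


def generate_text(prompt, **settings):
    """Load the model and continue `prompt` (downloads weights on first call)."""
    # Imported lazily so the helper above can be used without torch installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID, trust_remote_code=True)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID,
        trust_remote_code=True,
        torch_dtype="auto",  # newer releases also accept dtype="auto"
    )
    # Tokenize the prompt and move the tensors to wherever the model lives.
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output_ids = model.generate(**inputs, **build_generation_kwargs(**settings))
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `generate_text("Once upon a time,", max_new_tokens=32)` would then return the prompt followed by the model's continuation. Since zai-org/glm-4-9b is used here through the plain completions-style prompt, no chat template is applied.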
- Local Apps
- vLLM
How to use zai-org/glm-4-9b with vLLM:
Install from pip and serve model

    # Install vLLM from pip:
    pip install vllm

    # Start the vLLM server:
    vllm serve "zai-org/glm-4-9b"

    # Call the server using curl (OpenAI-compatible API):
    curl -X POST "http://localhost:8000/v1/completions" \
        -H "Content-Type: application/json" \
        --data '{
            "model": "zai-org/glm-4-9b",
            "prompt": "Once upon a time,",
            "max_tokens": 512,
            "temperature": 0.5
        }'

Use Docker
docker model run hf.co/zai-org/glm-4-9b
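
The curl call to the vLLM server can also be issued from Python. The sketch below uses only the standard library and mirrors the same URL, headers, and JSON body. The function names `build_completion_request` and `send` are hypothetical helpers, and the default `base_url` assumes the server is running locally on port 8000 as in the command above.

```python
import json
import urllib.request

MODEL_ID = "zai-org/glm-4-9b"


def build_completion_request(prompt, base_url="http://localhost:8000",
                             max_tokens=512, temperature=0.5):
    """Build a POST request for the OpenAI-compatible /v1/completions route."""
    payload = {
        "model": MODEL_ID,
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": temperature,
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request, timeout=60):
    """Send the request and decode the JSON response (needs a running server)."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With a server running, `send(build_completion_request("Once upon a time,"))` returns the decoded response as a dictionary. Passing `base_url="http://localhost:30000"` would point the same helper at a server listening on port 30000 instead.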
- SGLang
How to use zai-org/glm-4-9b with SGLang:
Install from pip and serve model

    # Install SGLang from pip:
    pip install sglang

    # Start the SGLang server:
    python3 -m sglang.launch_server \
        --model-path "zai-org/glm-4-9b" \
        --host 0.0.0.0 \
        --port 30000

    # Call the server using curl (OpenAI-compatible API):
    curl -X POST "http://localhost:30000/v1/completions" \
        -H "Content-Type: application/json" \
        --data '{
            "model": "zai-org/glm-4-9b",
            "prompt": "Once upon a time,",
            "max_tokens": 512,
            "temperature": 0.5
        }'

Use Docker images

    docker run --gpus all \
        --shm-size 32g \
        -p 30000:30000 \
        -v ~/.cache/huggingface:/root/.cache/huggingface \
        --env "HF_TOKEN=<secret>" \
        --ipc=host \
        lmsysorg/sglang:latest \
        python3 -m sglang.launch_server \
            --model-path "zai-org/glm-4-9b" \
            --host 0.0.0.0 \
            --port 30000

    # Call the server using curl (OpenAI-compatible API):
    curl -X POST "http://localhost:30000/v1/completions" \
        -H "Content-Type: application/json" \
        --data '{
            "model": "zai-org/glm-4-9b",
            "prompt": "Once upon a time,",
            "max_tokens": 512,
            "temperature": 0.5
        }'

- Docker Model Runner
How to use zai-org/glm-4-9b with Docker Model Runner:
docker model run hf.co/zai-org/glm-4-9b
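
The vLLM and SGLang commands above both call an OpenAI-compatible `/v1/completions` route, whose JSON response carries the generated text under `choices[i].text`. The sketch below shows one way to pull that text out. The helper name `extract_completions` is illustrative, and `SAMPLE_RESPONSE` is a made-up payload for demonstration only, not real model output; actual responses contain additional fields.

```python
import json

# Hypothetical response body, shaped like an OpenAI-style completions reply.
SAMPLE_RESPONSE = """
{
  "object": "text_completion",
  "model": "zai-org/glm-4-9b",
  "choices": [
    {"index": 0, "text": " there was a model.", "finish_reason": "length"}
  ]
}
"""


def extract_completions(response_body):
    """Return the generated text of every choice in a completions response."""
    # Accept either raw JSON (str/bytes) or an already-decoded dictionary.
    if isinstance(response_body, (str, bytes)):
        data = json.loads(response_body)
    else:
        data = response_body
    if "choices" not in data:
        # Error replies have no "choices"; surface the body for debugging.
        raise ValueError(f"unexpected response: {data!r}")
    return [choice["text"] for choice in data["choices"]]
```

Here `extract_completions(SAMPLE_RESPONSE)` yields a one-element list with the sample continuation. A reply without a `choices` key, such as an error message from the server, raises `ValueError` instead of failing later with a `KeyError`.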
- how do i delete my account? (#20, opened 7 months ago by Liliana1)
- AttributeError: 'ChatGLMForConditionalGeneration' object has no attribute '_extract_past_from_model_output' (#19, opened about 1 year ago by hotyouth)
- Fix TypeError in _pad method by adding missing padding_side field (#18, opened about 1 year ago by TrueName)
- TypeError: ChatGLM4Tokenizer._pad() got an unexpected keyword argument 'padding_side' (#16, opened over 1 year ago by TracyMc; 2 comments)
- Converting to native Transformers (#15, opened over 1 year ago by cyrilvallez; 2 comments)
- Adding the Open Portuguese LLM Leaderboard Evaluation Results (#12, opened over 1 year ago by leaderboard-pt-pr-bot)
- del gradient_checkpointing_enable() (#11, opened almost 2 years ago by chandler88)
- UnicodeDecodeError: 'utf-8' codec can't decode bytes in position 0-1: unexpected end of data (#10, opened almost 2 years ago by ixioxoixi; 4 comments)
- AttributeError: module 'transformers_modules.GLM-4-9B.tokenization_chatglm' has no attribute 'ChatGLM4Tokenizer' (#8, opened almost 2 years ago by wangrenzhong; 3 comments)
- How to use glm-4-9b to do classification? (#5, opened almost 2 years ago by JustTryTry; 1 comment)
- FIX autogptq compat (#4, opened almost 2 years ago by Qubitium)
- Full list of languages? (#2, opened almost 2 years ago by RASMUS; 🔥 1, 1 comment)
- Error when testing with lm_eval (#1, opened almost 2 years ago by xianf; 2 comments)