How to use SparseLLM/prosparse-llama-2-7b-predictor with Transformers:

```python
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("feature-extraction", model="SparseLLM/prosparse-llama-2-7b-predictor", trust_remote_code=True)
```

```python
# Load model directly
from transformers import AutoModel

model = AutoModel.from_pretrained("SparseLLM/prosparse-llama-2-7b-predictor", trust_remote_code=True, dtype="auto")
```
license: mit
language:
- en

# ProSparse-LLaMA-2-7B-Predictor

- Model Creator: [THUNLP](https://nlp.csai.tsinghua.edu.cn/), [ModelBest](https://modelbest.cn), and [PowerInfer](https://huggingface.co/PowerInfer)

This repository provides a group of sparsity predictors for [SparseLLM/ProSparse-LLaMA-2-7B](https://huggingface.co/SparseLLM/prosparse-llama-2-7b).

Note: The folder `predictors_clip15` contains the activation predictors trained on data that keeps only the top 15% of activation values.
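The repository does not document how the clipped training data was built. As a rough illustration only, one plausible reading of "top 15%" is that, for each token, the largest 15% of feed-forward activation values are kept and the rest are zeroed, with the kept positions serving as the predictor's targets. The sketch below shows that idea on toy data; the function name `top_fraction_mask`, the vector size, and the random activations are all hypothetical and are not taken from the ProSparse code.

```python
import random


def top_fraction_mask(activations, keep=0.15):
    """Return a list of booleans marking the largest `keep` fraction of values.

    Hypothetical helper for illustration; not part of the ProSparse release.
    """
    n = len(activations)
    k = max(1, round(n * keep))  # number of positions to keep, at least one
    # Indices sorted by activation value, largest first; take the first k.
    top = sorted(range(n), key=lambda i: activations[i], reverse=True)[:k]
    mask = [False] * n
    for i in top:
        mask[i] = True
    return mask


random.seed(0)
# Toy stand-in for the post-ReLU activations of one token (20 neurons).
acts = [max(random.gauss(0.0, 1.0), 0.0) for _ in range(20)]

mask = top_fraction_mask(acts, keep=0.15)
# Zero out everything outside the kept positions.
clipped = [a if m else 0.0 for a, m in zip(acts, mask)]

print(sum(mask), "of", len(acts), "positions kept")
```

Under this assumed reading, `mask` would play the role of the binary label a predictor learns to reproduce, and `clipped` is the activation vector after clipping.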
### Citation

Please kindly cite using the following BibTeX:

```bibtex
@article{song2024prosparse,
  title={{ProSparse}: Introducing and Enhancing Intrinsic Activation Sparsity within Large Language Models},
  author={Song, Chenyang and Han, Xu and Zhang, Zhengyan and Hu, Shengding and Shi, Xiyu and Li, Kuai and Chen, Chen and Liu, Zhiyuan and Li, Guangli and Yang, Tao and Sun, Maosong},
  year={2024},
  journal={arXiv preprint arXiv:2402.13516},
  url={https://arxiv.org/pdf/2402.13516.pdf}
}
```