ButterM40 committed on
Commit 7eb6066 · 1 Parent(s): 103e08e

Minimal Space config for testing

Files changed (1)
  1. README.md +6 -151
README.md CHANGED
@@ -1,158 +1,13 @@
  ---
- title: Local Inference
- emoji: 👀
- colorFrom: pink
  colorTo: gray
  sdk: docker
  pinned: false
  ---

- # AI Chat & Summarization Web App 🤖

- A beautiful web-based AI application featuring **Chat Generation** and **Text Summarization** powered by Hugging Face models.
-
- ## Features ✨
-
- 💬 **Chat Generation**: Interactive AI chat using Qwen/Qwen1.5-0.5B-Chat
- 📝 **Text Summarization**: Summarize long texts using the DistilBART model
- 🎨 **Beautiful UI**: Modern gradient design with smooth animations
- 🌐 **Accessible**: Publicly deployable and accessible to everyone
- ⚡ **Fast**: Lightweight models optimized for quick responses
-
- ## Models Used
-
- **Chat**: `Qwen/Qwen1.5-0.5B-Chat` - Lightweight conversational AI
- **Summarization**: `sshleifer/distilbart-cnn-6-6` - Efficient text summarization
-
- ## Local Development
-
- ### Prerequisites
- Python 3.12+
- pip
-
- ### Installation
-
- 1. Clone the repository:
- ```bash
- git clone https://github.com/DiegoAdame13322/LocalInference.git
- cd LocalInference
- ```
-
- 2. Install dependencies:
- ```bash
- pip install -r requirements.txt
- ```
-
- 3. Run the server:
- ```bash
- python server.py
- ```
-
- 4. Open your browser to `http://localhost:8000`
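
Once the server is up, a quick smoke test against the health endpoint (documented under API Endpoints below) confirms the app is serving:

```bash
# Should respond once the FastAPI app is up; model downloads may delay the first run
curl http://localhost:8000/api/health
```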
-
- ## Deploy Options
-
- ### Option 1: Hugging Face Spaces (Docker)
-
- See [DEPLOY_TO_SPACES.md](DEPLOY_TO_SPACES.md) for detailed instructions.
-
- ### Option 2: Render Manual Deploy
-
- 1. Go to the [Render Dashboard](https://dashboard.render.com/)
- 2. Click "New +" → "Web Service"
- 3. Connect your repository
- 4. Configure:
- - **Name**: `ai-chat-summarization`
- - **Environment**: `Python`
- - **Build Command**: `pip install -r requirements.txt`
- - **Start Command**: `python server.py`
- - **Instance Type**: Free or Starter (Starter recommended for better performance)
-
- 5. Click "Create Web Service"
-
- ### Important Notes for Deployment
-
- ⚠️ **First startup takes 5-10 minutes** as models download (1.5GB+); a pre-download sketch follows this list
- 💾 **Disk space**: the free tier has 512MB but the models need ~1.5GB, so use the **Starter plan** or higher
- 🔄 **Auto-sleep**: the free tier sleeps after 15 minutes of inactivity and takes ~30s to wake up
- 🎯 **Recommendation**: use the **Starter plan** for:
- - More disk space
- - Better performance
- - No auto-sleep
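
If your platform allows a custom build step, one way to shorten that first startup is to warm the Hugging Face cache ahead of time. A sketch using the `huggingface-cli` tool from the `huggingface_hub` package, which is not part of this repo's documented setup:

```bash
# Pre-fetch both model repos into the local HF cache
# so the first request doesn't block on a 1.5GB+ download
huggingface-cli download Qwen/Qwen1.5-0.5B-Chat
huggingface-cli download sshleifer/distilbart-cnn-6-6
```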
-
- ## API Endpoints
-
- ### Chat Generation
- ```bash
- POST /api/chat
- Content-Type: application/json
-
- {
-   "message": "What is machine learning?",
-   "max_new_tokens": 150,
-   "temperature": 0.7
- }
- ```
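
For instance, against a locally running instance (the exact response shape depends on `server.py`, which this diff doesn't show):

```bash
# POST the documented payload fields to the chat endpoint
curl -X POST http://localhost:8000/api/chat \
  -H "Content-Type: application/json" \
  -d '{"message": "What is machine learning?", "max_new_tokens": 150, "temperature": 0.7}'
```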
-
- ### Text Summarization
- ```bash
- POST /api/summarize
- Content-Type: application/json
-
- {
-   "text": "Your long text here...",
-   "max_length": 130,
-   "min_length": 30
- }
- ```
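
One way to send a longer document, sketched here with `jq` (assumed installed) building the JSON body from a hypothetical `article.txt`:

```bash
# jq escapes quotes/newlines in the raw text, so the payload stays valid JSON;
# curl reads the finished body from stdin via -d @-
jq -n --rawfile text article.txt \
  '{text: $text, max_length: 130, min_length: 30}' |
  curl -X POST http://localhost:8000/api/summarize \
    -H "Content-Type: application/json" -d @-
```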
-
- ### Health Check
- ```bash
- GET /api/health
- ```
-
- ## Project Structure
-
- ```
- LocalInference/
- ├── server.py          # FastAPI backend with model loading
- ├── static/
- │   └── index.html     # Frontend web interface
- ├── requirements.txt   # Python dependencies
- ├── render.yaml        # Render deployment config
- ├── runtime.txt        # Python version specification
- └── README.md          # This file
- ```
-
- ## Tech Stack
-
- **Backend**: FastAPI, PyTorch, Transformers
- **Frontend**: HTML5, CSS3, JavaScript (Vanilla)
- **Models**: Hugging Face Transformers
- **Deployment**: Hugging Face Spaces, Render
-
- ## Troubleshooting
-
- ### Models not loading
- Check disk space on the deployment platform
- Check the logs in the platform dashboard
-
- ### Slow first response
- Models load on the first request; subsequent requests are faster
- Consider keeping the service warm with periodic requests, as sketched below
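
For example, a cron entry on any always-on machine can do the pinging; `your-app.onrender.com` is a placeholder for your real deployment URL:

```bash
# Add via `crontab -e`: hit /api/health every 10 minutes to prevent auto-sleep
*/10 * * * * curl -fsS https://your-app.onrender.com/api/health > /dev/null
```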
-
- ### Out of memory errors
- Reduce `max_new_tokens` in chat requests
- Use a plan with more RAM
-
- ## License
-
- MIT License - feel free to use and modify!
-
- ## Contributing
-
- Pull requests are welcome! For major changes, please open an issue first.
-
- ---
-
- Made with ❤️ using Hugging Face Transformers

  ---
+ title: Local Inference Hub
+ emoji: 🤖
+ colorFrom: purple
  colorTo: gray
  sdk: docker
  pinned: false
+ app_port: 7860
  ---

+ # Local Inference Hub

+ A simple FastAPI application for AI chat and summarization.
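
Since the new header declares `sdk: docker` and `app_port: 7860`, the image can be exercised locally before pushing. A sketch that assumes the repo's Dockerfile (not shown in this diff) serves the FastAPI app on that port:

```bash
# Build and run the Space container, then probe the declared port
docker build -t local-inference .
docker run --rm -p 7860:7860 local-inference
# In another shell: curl http://localhost:7860/
```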