# HuggingFace Space Deployment Instructions

This directory contains all the files needed to deploy NanoChat as a HuggingFace Space.

## 🚀 Quick Start (Automated Deployment)

### One-Command Deployment

The easiest way to deploy is using our automated script:

```bash
# From the project root
./deploy_inference.sh

# Or with a custom space name
./deploy_inference.sh my-nanochat-demo
```

This script will:

- ✓ Check and install dependencies
- ✓ Verify you're logged into HuggingFace
- ✓ Create and configure the Space
- ✓ Upload all necessary files
- ✓ Provide you with the Space URL

**First time setup:**

```bash
# 1. Install HuggingFace CLI
pip install huggingface_hub

# 2. Login to HuggingFace
huggingface-cli login
# Paste your token from: https://huggingface.co/settings/tokens

# 3. Deploy!
./deploy_inference.sh
```

### Advanced Deployment Options

For more control, use the Python script directly:

```bash
# Basic deployment
python scripts/deploy_hf_space.py --space-name my-nanochat-demo

# Deploy to organization
python scripts/deploy_hf_space.py --space-name nanochat --org my-org

# Private space with GPU
python scripts/deploy_hf_space.py --space-name my-nanochat --private --hardware t4-small

# Use different model
python scripts/deploy_hf_space.py --space-name my-nanochat --model-id username/my-model
```

**Hardware options:**

- `cpu-basic` - Free (slower, default)
- `cpu-upgrade` - ~$0.03/hr (faster CPU)
- `t4-small` - ~$0.60/hr (GPU, recommended for production)
- `t4-medium` - ~$1.20/hr (larger GPU)
- `a10g-small` - ~$3.15/hr (fastest)

## Manual Deployment

### Option 1: Deploy via HuggingFace Web UI (Recommended)

1. **Create a new Space**:
   - Go to https://huggingface.co/new-space
   - Choose a name (e.g., `your-username/nanochat-demo`)
   - Select **Gradio** as the SDK
   - Choose **Public** or **Private**
   - Click **Create Space**

2. **Upload files**:
   - Click the **Files** tab in your new Space
   - Click **Add file** > **Upload files**
   - Upload all files from this directory:
     - `app.py`
     - `requirements.txt`
     - `README.md`
     - `configuration_nanochat.py`
     - `modeling_nanochat.py`

3. **Wait for build**:
   - The Space will automatically start building
   - Building takes 5-10 minutes
   - Check the **Logs** tab for progress

4. **Test your Space**:
   - Once building is complete, the **App** tab will show the chat interface
   - Try asking it some questions!

### Option 2: Deploy via Git

1. **Clone your Space repository**:

   ```bash
   git clone https://huggingface.co/spaces/your-username/nanochat-demo
   cd nanochat-demo
   ```

2. **Copy files**:

   ```bash
   cp /path/to/nanochat561/deploy/hf_space/* .
   ```

3. **Commit and push**:

   ```bash
   git add .
   git commit -m "Initial commit: NanoChat Space"
   git push
   ```

4. **Monitor deployment**:
   - Visit https://huggingface.co/spaces/your-username/nanochat-demo
   - Check the **Logs** tab for build progress

### Option 3: Deploy via HuggingFace CLI

1. **Install HuggingFace CLI**:

   ```bash
   pip install huggingface_hub
   huggingface-cli login
   ```

2. **Create and upload**:

   ```bash
   huggingface-cli repo create nanochat-demo --type space --space_sdk gradio
   cd deploy/hf_space
   huggingface-cli upload your-username/nanochat-demo . --repo-type space
   ```

## Files Overview

- **app.py**: Main Gradio application code
- **requirements.txt**: Python dependencies
- **README.md**: Space description and metadata (appears on Space page)
- **configuration_nanochat.py**: Model configuration for transformers
- **modeling_nanochat.py**: Custom model implementation
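To give a sense of what the Space actually runs, here is a rough sketch of the kind of code `app.py` contains, assuming the standard `transformers` + Gradio chat pattern with `trust_remote_code=True` (the `respond` helper and the generation defaults shown here are illustrative; see `app.py` in this directory for the real implementation):

```python
# Illustrative sketch only -- see app.py in this directory for the actual code.
import torch
import gradio as gr
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "HarleyCooper/nanochat561"

print("Loading nanochat model...")  # the log line to look for when debugging startup
tokenizer = AutoTokenizer.from_pretrained(MODEL_ID, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(MODEL_ID, trust_remote_code=True)
model.eval()

def respond(message, history):
    # history is a list of (user, assistant) pairs in Gradio's default tuple format;
    # flatten it into a single prompt (the real app may use a chat template instead).
    prompt = ""
    for user_msg, bot_msg in history:
        prompt += f"User: {user_msg}\nAssistant: {bot_msg}\n"
    prompt += f"User: {message}\nAssistant:"

    inputs = tokenizer(prompt, return_tensors="pt")
    with torch.no_grad():
        output_ids = model.generate(
            **inputs,
            max_new_tokens=256,   # lowering this speeds up CPU responses
            do_sample=True,
            temperature=0.8,
        )
    # Return only the newly generated tokens, not the echoed prompt.
    return tokenizer.decode(output_ids[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)

demo = gr.ChatInterface(respond, title="NanoChat")
demo.launch()
```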
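The Hardware Settings and Cost Estimates sections below cover the web UI; if you prefer to script hardware changes or pause a Space between demos, the `huggingface_hub` client also exposes Space-management helpers. A minimal sketch, assuming a recent `huggingface_hub` and that you are already logged in (the Space id is a placeholder):

```python
from huggingface_hub import HfApi

api = HfApi()  # reuses the token saved by `huggingface-cli login`
space_id = "your-username/nanochat-demo"  # placeholder: use your own Space id

# Move the Space onto a T4 GPU (same hardware names as the deploy script above).
api.request_space_hardware(repo_id=space_id, hardware="t4-small")

# Pause the Space when you're done to stop paying for GPU time; restart it later.
api.pause_space(repo_id=space_id)
api.restart_space(repo_id=space_id)
```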
## Configuration Options

### Hardware Settings

By default, Spaces run on CPU. For better performance:

1. Go to your Space **Settings**
2. Under **Hardware**, select:
   - **CPU basic** (free, slower)
   - **CPU upgrade** (small fee, faster)
   - **T4 small** (GPU, fastest but costs more)
3. Click **Save**

### Environment Variables (Optional)

You can add these in Space Settings > Variables:

- `HF_TOKEN`: Only needed if you want to use private models (not needed for this public model)

## Customization

### Change Model

Edit `app.py` line 10:

```python
MODEL_ID = "HarleyCooper/nanochat561"  # Change to your model
```

### Adjust UI

Edit `app.py` lines 80-95 to modify:

- Title and description
- Default parameter values
- Parameter ranges

### Update Dependencies

Edit `requirements.txt` to add or remove packages.

## Troubleshooting

### Space Fails to Build

**Check the Logs tab** for error messages. Common issues:

1. **Missing dependencies**: Add them to `requirements.txt`
2. **Import errors**: Ensure `configuration_nanochat.py` and `modeling_nanochat.py` are uploaded
3. **Model not found**: Verify that `MODEL_ID` in `app.py` is correct

### Space Runs but Chat Doesn't Work

1. **Check model loading**: Look for "Loading nanochat model" in the logs
2. **Memory issues**: Upgrade to a larger CPU or use GPU hardware
3. **Trust remote code**: The model loads with `trust_remote_code=True` by default

### Slow Responses

1. **Upgrade hardware**: Use a GPU for faster inference
2. **Reduce max_new_tokens**: Lower the default value in `app.py`
3. **Use a smaller model**: If one is available

## Space URL

Once deployed, your Space will be available at:

```
https://huggingface.co/spaces/your-username/nanochat-demo
```

You can embed it anywhere with an iframe (replace the subdomain with your own username and Space name):

```html
<iframe
  src="https://your-username-nanochat-demo.hf.space"
  frameborder="0"
  width="850"
  height="450"
></iframe>
```

## Cost Estimates

- **CPU basic**: Free (with rate limits)
- **CPU upgrade**: ~$0.03/hour
- **T4 small GPU**: ~$0.60/hour
- **A10G GPU**: ~$3.15/hour

Spaces can be paused when not in use to save costs.

## Support

- HuggingFace Spaces Docs: https://huggingface.co/docs/hub/spaces
- Model Repository: https://huggingface.co/HarleyCooper/nanochat561
- Issues: https://github.com/HarleyCoops/nanochat561/issues

## License

MIT - See the LICENSE file in the main repository.