---
language: en
tags:
- vulnerability-detection
- code-analysis
- autoencoder
- anomaly-detection
library_name: pytorch
metrics:
- mse
---

# CATastrophe - Code Vulnerability Detector

This model is an autoencoder-based vulnerability detector for Python code. It uses TF-IDF vectorization and an autoencoder architecture to detect anomalies in code that may indicate vulnerabilities.

## Model Details

- **Architecture**: Autoencoder (Input → 512 → 128 → 512 → Input)
- **Input Features**: 2000 (TF-IDF)
- **Training Loss**: 0.0005
- **Framework**: PyTorch

## Usage

```python
import torch
import pickle
from model import Autoencoder

# Load model
model = Autoencoder(input_dim=2000)
model.load_state_dict(torch.load('catastrophe_model.pth'))
model.eval()

# Load vectorizer
with open('vectorizer.pkl', 'rb') as f:
    vectorizer = pickle.load(f)

# Analyze code
code_text = "your code here"
features = vectorizer.transform([code_text]).toarray()
features_tensor = torch.tensor(features, dtype=torch.float32)

with torch.no_grad():
    reconstructed = model(features_tensor)
    anomaly_score = torch.mean((features_tensor - reconstructed) ** 2, dim=1)
```

## Training Configuration

- Batch Size: 256
- Epochs: 50
- Learning Rate: 0.001
- Optimizer: Adam

## Limitations

This model is trained on vulnerable commits only and uses reconstruction error as an anomaly score. High scores indicate potential vulnerabilities, but manual review is recommended.
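
## Model Definition Sketch

The usage snippet imports `Autoencoder` from a `model` module that this card does not include. The block below is a minimal sketch of what such a class might look like, based only on the layer sizes listed under Model Details (Input → 512 → 128 → 512 → Input). The ReLU activations, the `encoder`/`decoder` attribute names, and the `anomaly_scores` helper are assumptions made for illustration. If the real module uses different layer names or activations, the published checkpoint's state dict will not load into this class.

```python
import torch
from torch import nn


class Autoencoder(nn.Module):
    """Hypothetical reconstruction: input -> 512 -> 128 -> 512 -> input."""

    def __init__(self, input_dim: int = 2000):
        super().__init__()
        # Assumption: ReLU activations; the card does not say which are used.
        self.encoder = nn.Sequential(
            nn.Linear(input_dim, 512),
            nn.ReLU(),
            nn.Linear(512, 128),
            nn.ReLU(),
        )
        # Assumption: no activation on the output layer.
        self.decoder = nn.Sequential(
            nn.Linear(128, 512),
            nn.ReLU(),
            nn.Linear(512, input_dim),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.decoder(self.encoder(x))


def anomaly_scores(model: nn.Module, features: torch.Tensor) -> torch.Tensor:
    """Per-sample mean squared reconstruction error, as in the usage snippet."""
    model.eval()
    with torch.no_grad():
        reconstructed = model(features)
    return torch.mean((features - reconstructed) ** 2, dim=1)


if __name__ == "__main__":
    model = Autoencoder(input_dim=2000)  # untrained weights, shape check only
    batch = torch.rand(4, 2000)          # stand-in for TF-IDF feature rows
    print(anomaly_scores(model, batch).shape)  # one score per input row
```

With untrained weights the scores carry no meaning; the sketch only shows how the listed dimensions fit together and how one score per input row is produced.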