PrazNeuro committed on
Commit 7988407 · verified · 1 Parent(s): b950dbe

Update README.md

Files changed (1)
  1. README.md +96 -69
README.md CHANGED
@@ -1,69 +1,96 @@
1
- Project: PRECISE-GBM - Model training & retraining helpers
2
-
3
- Overview
4
-
5
- This repository contains code to train models (Gaussian Mixture labelling + SVM and ensemble classifiers) and to persist all artifacts required to reproduce or retrain models on new data. It includes:
6
-
7
- - `Scenario_heldout_final_PRECISE.py` — training pipeline producing `.joblib` models and metadata JSONs (selected features, best params, CV results).
8
- - `retrain_helper.py` — CLI utility to rebuild pipelines, set best params and retrain using saved selected-features and params JSONs. Supports JSON/YAML config files and auto-detection of model type.
9
- - `README_RETRAIN.md` — detailed retrain examples and a notebook cell.
10
-
11
- This repo also includes helper files to make it ready for GitHub:
12
- - `requirements.txt` — Python dependencies
13
- - `.gitignore` — recommended ignores (models, caches, logs)
14
- - `LICENSE` — MIT license
15
- - GitHub Actions workflow for CI (pytest smoke test)
16
-
17
- Getting started (Windows PowerShell)
18
-
19
- 1) Create and activate a virtual environment
20
-
21
- ```powershell
22
- python -m venv .venv
23
- .\.venv\Scripts\Activate.ps1
24
- ```
25
-
26
- 2) Install dependencies
27
-
28
- ```powershell
29
- pip install --upgrade pip
30
- pip install -r requirements.txt
31
- ```
32
-
33
- 3) Run training (note: the training script reads data from absolute paths configured in the script — adjust them or run from an environment where those files are present)
34
-
35
- ```powershell
36
- python Scenario_heldout_final_PRECISE.py
37
- ```
38
-
39
- The training script will create model files under `models_LM22/` and `models_GBM/` and write metadata JSONs next to each joblib model (selected features, params, cv results) as well as group-level JSON summaries.
40
-
41
- Retraining
42
-
43
- See `README_RETRAIN.md` for detailed CLI and notebook examples. Short example:
44
-
45
- ```powershell
46
- python retrain_helper.py `
47
- --model-prefix "models_GBM/scenario_1/GBM_scen1_Tcell" `
48
- --train-csv "data\new_train.csv" `
49
- --label-col "label"
50
- ```
51
-
52
- Notes
53
-
54
- - The training script contains hard-coded absolute paths to data files. Before running on another machine, update the `scenarios_*` file paths or place the datasets in the same paths.
55
- - Retrain helper auto-detects model type when `--model-type` is omitted by looking for `{prefix}_svm_params.json` or `{prefix}_ens_params.json`.
56
- - YAML config support for retrain requires PyYAML (`pip install pyyaml`).
57
-
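The auto-detection described in the removed notes above can be sketched as follows. This is a minimal illustration, not the code in `retrain_helper.py`: the function name `detect_model_type`, the returned strings `"svm"` and `"ens"`, and the choice to check the SVM file first are all assumptions. Only the two file-name patterns come from the README text.

```python
import tempfile
from pathlib import Path
from typing import Optional


def detect_model_type(prefix: str) -> Optional[str]:
    """Guess the model type from which params JSON exists for the prefix.

    Hypothetical helper: return values and precedence are assumptions.
    """
    if Path(f"{prefix}_svm_params.json").exists():
        return "svm"
    if Path(f"{prefix}_ens_params.json").exists():
        return "ens"
    return None  # neither params file found


# Demonstrate with a throwaway directory instead of real model files.
with tempfile.TemporaryDirectory() as tmp:
    prefix = str(Path(tmp) / "GBM_scen1_Tcell")
    before = detect_model_type(prefix)  # no params file yet
    Path(f"{prefix}_ens_params.json").write_text("{}")
    after = detect_model_type(prefix)  # ensemble params file now exists

print(before, after)
```

When neither file exists the sketch returns `None`, so a caller could fall back to requiring an explicit `--model-type`.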
58
- CI
59
-
60
- A basic GitHub Actions workflow runs a smoke pytest to ensure the retrain helper imports and basic pipeline construction works. It does not run heavy training.
61
-
62
- Contributing
63
-
64
- See `CONTRIBUTING.md` for guidance on opening issues and PRs.
65
-
66
- License
67
-
68
- This project is released under the MIT License — see `LICENSE`.
69
-
1
+ ---
2
+ license: apache-2.0
3
+ language:
4
+ - en
5
+ tags:
6
+ - api
7
+ - brain
8
+ - cancer
9
+ - trials
10
+ - finder
11
+ - app
12
+ ---
13
+ <p align="center"> <b>Brain Cancer Trials Finder</b> </p>
14
+
15
+ <p align="center">
16
+ <img src="logo_precise.png" alt="PRECISE-GBM Logo">
17
+ </p>
18
+
19
+
20
+ Desktop Application, Streamlit app and code
21
+
22
+ # Brain Cancer Trials Finder (Desktop Application)
23
+
24
+ Download link: coming soon.
25
+
26
+ This application queries the ClinicalTrials.gov registry (NIH, USA) through its API and scores the results according to the type of brain cancer entered in the GUI.
27
+ It also links directly to UK cancer registries (CRUK, NIHR, ISRCTN) and automatically runs a search for the selected cancer type.
28
+
29
+
30
+ # Brain Cancer Trials Finder (Streamlit)
31
+
32
+ A minimal Streamlit app that discovers actively recruiting neuro-oncology clinical trials (e.g., glioblastoma, brain tumors) from the ClinicalTrials.gov v2 API and ranks them with simple, explainable heuristics.
33
+
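As a rough illustration of the discovery step described above, the sketch below builds a request URL for actively recruiting trials. It is not taken from `streamlit_app.py`: `build_studies_url` is a hypothetical helper, and the endpoint and parameter names (`query.cond`, `query.locn`, `filter.overallStatus`, `pageSize`) are assumptions based on the public ClinicalTrials.gov v2 API, so check them against the API documentation before relying on them.

```python
from urllib.parse import urlencode

# Assumed base endpoint of the ClinicalTrials.gov v2 API.
API_URL = "https://clinicaltrials.gov/api/v2/studies"


def build_studies_url(condition: str, country: str = "", page_size: int = 50) -> str:
    """Return a v2 API URL for recruiting studies matching a condition.

    Hypothetical helper; the parameter names are assumptions and may
    differ from what the app actually sends.
    """
    params = {
        "query.cond": condition,
        "filter.overallStatus": "RECRUITING",
        "pageSize": page_size,
    }
    if country:
        params["query.locn"] = country  # optional location filter
    return f"{API_URL}?{urlencode(params)}"


url = build_studies_url("glioblastoma", country="United Kingdom")
print(url)
```

The sketch only constructs the URL and makes no network request, so it can be run offline; fetching and paging through results is left out.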
34
+ Demo features:
35
+ - Filters by country and basic patient factors (age, KPS, setting).
36
+ - Pulls live data from ClinicalTrials.gov v2.
37
+ - Scores trials with transparent reasons for the score.
38
+
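The idea of scoring trials "with transparent reasons" can be illustrated with a small sketch. Everything here is hypothetical: the field names, the point weights, the trial IDs, and the functions `score_trial` and `rank_trials` are invented for illustration and do not reflect the heuristics the app actually uses.

```python
from typing import Dict, List, Tuple


def score_trial(trial: Dict, patient: Dict) -> Tuple[int, List[str]]:
    """Score one trial for a patient and list the reasons behind the score."""
    score, reasons = 0, []

    # Age: reward trials whose age range includes the patient.
    if trial["min_age"] <= patient["age"] <= trial["max_age"]:
        score += 2
        reasons.append("age within eligibility range (+2)")

    # KPS: reward trials whose minimum Karnofsky score the patient meets.
    if patient["kps"] >= trial["min_kps"]:
        score += 2
        reasons.append("KPS meets minimum (+2)")

    # Setting: e.g. "newly diagnosed" vs "recurrent".
    if patient["setting"] in trial["settings"]:
        score += 1
        reasons.append("disease setting matches (+1)")

    return score, reasons


def rank_trials(trials: List[Dict], patient: Dict) -> List[Tuple[str, int, List[str]]]:
    """Return (trial id, score, reasons) tuples, highest score first."""
    scored = [(t["id"], *score_trial(t, patient)) for t in trials]
    return sorted(scored, key=lambda row: row[1], reverse=True)


# Made-up trials and patient, for illustration only.
trials = [
    {"id": "TRIAL-A", "min_age": 18, "max_age": 70, "min_kps": 70, "settings": ["recurrent"]},
    {"id": "TRIAL-B", "min_age": 18, "max_age": 50, "min_kps": 80, "settings": ["newly diagnosed"]},
]
patient = {"age": 62, "kps": 70, "setting": "recurrent"}
ranked = rank_trials(trials, patient)
for trial_id, score, reasons in ranked:
    print(trial_id, score, reasons)
```

Returning the reasons alongside the number is what keeps the ranking explainable: each point in the score can be traced back to a rule the user can read.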
39
+ ## Local run
40
+
41
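+ Prereqs: Python 3.9+. The activation command below is for Windows; on macOS/Linux use `source .venv/bin/activate` instead.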
42
+
43
+ ```bash
44
+ python -m venv .venv
45
+ .venv\Scripts\activate
46
+ pip install -r requirements.txt
47
+ streamlit run streamlit_app.py
48
+ ```
49
+
50
+ ## Deploy to Streamlit Community Cloud
51
+
52
+ 1) Push this folder to a new GitHub repository (see steps below).
53
+ 2) Go to https://share.streamlit.io/ and connect your GitHub account.
54
+ 3) Create a new app, select your repo, branch (e.g., `main`), and set the main file path to `streamlit_app.py`.
55
+ 4) Click Deploy. Streamlit will build from `requirements.txt` automatically.
56
+
57
+ ## GitHub repository setup (Windows, cmd.exe)
58
+
59
+ 1) Create an empty repo on GitHub (e.g., `brain-trials-finder`). Do not add any files there.
60
+ 2) In this project folder, run:
61
+
62
+ ```bat
63
+ git init
64
+ git add .
65
+ git commit -m "Initial commit: Streamlit Brain Cancer Trials Finder"
66
+ git branch -M main
67
+ REM Replace <YOUR-USERNAME> and repo name below
68
+ git remote add origin https://github.com/<YOUR-USERNAME>/brain-trials-finder.git
69
+ git push -u origin main
70
+ ```
71
+
72
+ If you prefer SSH:
73
+
74
+ ```bat
75
+ git remote remove origin 2> NUL
76
+ git remote add origin git@github.com:<YOUR-USERNAME>/brain-trials-finder.git
77
+ git push -u origin main
78
+ ```
79
+
80
+ ## Configuration
81
+
82
+ - No secrets are required. All data is pulled from the public ClinicalTrials.gov API.
83
+ - To pin the Python version on Streamlit Cloud, you can optionally add a `runtime.txt` (e.g., `python-3.10`).
84
+
85
+ ## Files
86
+
87
+ - `streamlit_app.py` — Streamlit entrypoint used by Streamlit Cloud.
88
+ - `GUI_CLinicalTrial.py` — Original app file; kept for development, but `streamlit_app.py` mirrors it for deployment.
89
+ - `requirements.txt` — Python dependencies.
90
+ - `.gitignore` — Ignores typical Python/venv artifacts.
91
+
92
+ ## Disclaimer
93
+
94
+ This tool provides assistive information only and is not a substitute for professional medical advice. Always discuss clinical trials with your clinician.
95
+
96
+ Copyright: Prajwal Ghimire