ashtonteng/tahoe-kepler

ashtonteng committed on May 11

Commit 91ff058 (verified) · 1 parent: 638a44c

Upload 2 files

Files changed (3)

.gitattributes +1 -0
README.md +42 -0
Tahoe-100M.pdf +3 -0

.gitattributes CHANGED

@@ -33,3 +33,4 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
 *.zip filter=lfs diff=lfs merge=lfs -text
 *.zst filter=lfs diff=lfs merge=lfs -text
 *tfevents* filter=lfs diff=lfs merge=lfs -text
+Tahoe-100M.pdf filter=lfs diff=lfs merge=lfs -text

README.md ADDED

	@@ -0,0 +1,42 @@

+---
+license: mit
+datasets:
+- tahoebio/Tahoe-100M
+tags:
+- tahoe-deepdive
+- hackathon
+- tahoe-100M
+---
+# Team Name
+**Kepler**
+## Members
+- Ashton Teng @ashtonteng
+- Quinn Leng
+- [Affiliation, GitHub handles if applicable]
+# Project
+## Title
+Kepler: Natural Language AI Agent for Tahoe-100M Exploration
+## Overview
+Kepler lets biologists query the Tahoe-100M dataset in plain English, automating data access, analysis, and visualization without coding.
+## Motivation
+High-dimensional datasets like Tahoe-100M require heavy compute setup, tool expertise, and programming skill—barriers that slow scientific insight.
+We demonstrate that an AI agent can let users run simple analyses in natural language.
+## Methods
+- Extracted a pseudobulked subset with Vision differential expression scores.
+- Loaded metadata tables for cell lines, drugs, and gene sets.
+- Built an AI agent to translate natural-language queries into analysis code and visual outputs.
+## Results
+Demo query: “Which pathways are upregulated in BRAF.V600E mutant models after inhibitor treatment?”
+The agent automatically filtered the data, ran the analysis, and generated plots with interpretations.
+## Discussion
+- **Scalability:** Move initial subsetting to DuckDB or Databricks for larger subsets.
+- **Knowledge alignment:** Enhance the agent’s scientific context for broader, valid analyses.
+- **Next steps:** Expand to full Tahoe-100M and optimize compute pipeline.
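
The pseudobulking step listed under Methods can be illustrated with a minimal sketch. This is not the team's code and is not part of the commit: the record layout, the field names `cell_line`, `drug`, and `counts`, and the toy values are all assumptions made for illustration. It shows the general idea of pseudobulking, which is summing per-cell gene counts within each group of cells that share a condition.

```python
from collections import defaultdict


def pseudobulk(cells):
    """Sum per-cell gene counts within each (cell_line, drug) group.

    `cells` is an iterable of dicts with hypothetical keys
    'cell_line', 'drug', and 'counts' (gene -> integer count).
    """
    totals = defaultdict(lambda: defaultdict(int))
    for cell in cells:
        key = (cell["cell_line"], cell["drug"])
        for gene, count in cell["counts"].items():
            totals[key][gene] += count
    # Convert nested defaultdicts to plain dicts for easier inspection.
    return {key: dict(genes) for key, genes in totals.items()}


# Toy, made-up records for illustration only (not Tahoe-100M data).
cells = [
    {"cell_line": "A", "drug": "drug_x", "counts": {"GENE1": 2, "GENE2": 0}},
    {"cell_line": "A", "drug": "drug_x", "counts": {"GENE1": 3, "GENE2": 1}},
    {"cell_line": "B", "drug": "DMSO", "counts": {"GENE1": 1}},
]

profiles = pseudobulk(cells)
```

Each key of `profiles` is one (cell line, drug) condition, and each value is the summed expression profile for that condition. A real pipeline would likely do this aggregation in a dataframe or SQL engine rather than in plain Python.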

Tahoe-100M.pdf ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:2fe23a77dd0c8186edbad785da55dba15e11e3bb9227fa7d3452573c86d9f478
+size 528392