---
license: apache-2.0
datasets:
- GainEnergy/SMoE-Training
- GainEnergy/reasoner
- GainEnergy/ogai-8x7B
- GainEnergy/oilandgas-engineering-dataset
- GainEnergy/ogdataset
- GainEnergy/upstrimacentral
- open-r1/OpenR1-Math-220k
- unsloth/LaTeX_OCR
base_model: mistralai/Mathstral-7B-v0.1
tags:
- oil-gas
- drilling-engineering
- mathstral-7b
- lora
- fine-tuned
- energy-ai
- pragmatic-ai
- gguf
- text-generation-inference
- text-generation
model-index:
- name: OGAI-STEM-7B
  results:
  - task:
      type: text-generation
      name: Engineering AI for Oil & Gas
    dataset:
      name: GainEnergy Oil & Gas Corpus
      type: custom
    metrics:
    - name: Engineering Calculations Accuracy
      type: accuracy
      value: 94.5
    - name: Scientific Computation Precision
      type: precision
      value: 92.3
    - name: Context Retention
      type: contextual-coherence
      value: High
  variants:
  - name: OGAI-STEM-7B-GGUF
    pipeline_tag: text-generation
    repo_name: GainEnergy/OGAI-STEM-7B-GGUF
library_name: transformers
language:
- en
widget:
- text: >-
    User: What is the pressure drop in a horizontal pipeline for crude oil transport?

    AI:
  example_title: Pipeline Pressure Drop Calculation
- text: >-
    User: Explain the differences between gas lift and electric submersible pumps in artificial lift.

    AI:
  example_title: Artificial Lift Methods
- text: >-
    User: How do you calculate mud weight for deepwater drilling?

    AI:
  example_title: Mud Weight Calculation
- text: >-
    User: Describe the steps to optimize wellbore stability in unconventional reservoirs.

    AI:
  example_title: Wellbore Stability Optimization
pipeline_tag: text-generation

---

# OGAI-STEM-7B: AI-Powered Engineering Model for Oil & Gas Calculations

![Hugging Face](https://img.shields.io/badge/HuggingFace-OGAI--STEM--7B-blue)  
[![License](https://img.shields.io/badge/License-Apache_2.0-blue.svg)](LICENSE)

## Model Description

**OGAI-STEM-7B** is a **LoRA fine-tuned Mathstral-7B model**, designed specifically for **oil and gas engineering, scientific computing, and technical problem-solving**. It is optimized for numerical accuracy, complex engineering calculations, and technical document understanding.

The model is an integral part of **GainEnergy's Upstrima AI Platform**, enhancing workflows with **pragmatic AI agents, scientific computing tools, and retrieval-augmented generation (RAG)-based document analysis**.

## Technical Architecture

### Base Model Specifications
- **Architecture**: Mathstral-7B (Mistral fine-tuned for advanced math reasoning)
- **Parameters**: 7B
- **Context Length**: 32,768 tokens for long-form scientific queries
- **Mathematical Precision**: Enhanced for oil & gas engineering computations

### Fine-tuning Approach
- **Method**: Low-Rank Adaptation (LoRA) with rank 64
- **Training Dataset**: 3.2M datapoints from specialized oil & gas engineering sources
- **Hardware**: Trained on 8x NVIDIA A100 80GB GPUs
- **Training Time**: 2,200 GPU hours
- **Special Features**: Improved accuracy in fluid mechanics, pressure drop, and geomechanics calculations
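
To give a sense of scale for the rank-64 adapter, the sketch below counts the parameters LoRA would add to the attention projections. A LoRA adapter on a weight of shape `(d_out, d_in)` adds two matrices, `A` (`r x d_in`) and `B` (`d_out x r`), so `r * (d_in + d_out)` trainable parameters. The layer shapes are those of the public Mistral-7B architecture. Which modules were actually adapted for OGAI-STEM-7B is not stated in this card, so the four attention projections used here are an assumption.

```python
# Rough count of trainable parameters for a rank-64 LoRA adapter.
# Shapes follow the public Mistral-7B architecture (hidden size 4096,
# 32 layers, 8 key/value heads of dimension 128). The set of adapted
# modules is an ASSUMPTION for illustration, not the documented recipe.

RANK = 64
NUM_LAYERS = 32
HIDDEN = 4096
KV_DIM = 8 * 128  # grouped-query attention: 8 KV heads x 128 dims

# (d_in, d_out) for each attention projection in one decoder layer
ASSUMED_TARGETS = {
    "q_proj": (HIDDEN, HIDDEN),
    "k_proj": (HIDDEN, KV_DIM),
    "v_proj": (HIDDEN, KV_DIM),
    "o_proj": (HIDDEN, HIDDEN),
}


def lora_params(d_in: int, d_out: int, rank: int) -> int:
    """Parameters in the low-rank pair A (rank x d_in) and B (d_out x rank)."""
    return rank * (d_in + d_out)


def total_lora_params(rank: int = RANK) -> int:
    """Sum the adapter parameters over all assumed targets and all layers."""
    per_layer = sum(
        lora_params(d_in, d_out, rank)
        for d_in, d_out in ASSUMED_TARGETS.values()
    )
    return per_layer * NUM_LAYERS


if __name__ == "__main__":
    print(f"Trainable LoRA parameters: {total_lora_params():,}")
```

Under these assumptions the adapter holds about 54.5M trainable parameters, which is under 1% of the 7B base model.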

### Performance Optimizations
- **Quantization**: 4-bit and 8-bit versions optimized for low-memory inference
- **Inference Speed**: Tuned KV cache management for real-time engineering computations
- **Memory Footprint**: Runs efficiently on **12GB VRAM** with 4-bit quantization
- **Reduced Hallucinations**: Domain-specific fine-tuning reduces incorrect scientific results
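
The memory figures above can be sanity-checked with a back-of-the-envelope estimate of weight and KV cache size. This is only a sketch: the layer count, KV head count, and head dimension come from the public Mistral-7B configuration and are assumed to carry over unchanged, and the estimate ignores quantization metadata, activations, and runtime overhead.

```python
# Back-of-the-envelope memory estimate for weights and KV cache.
# Architecture numbers are taken from the public Mistral-7B configuration
# and are ASSUMED to carry over unchanged; check the model's config.json.
# The KV estimate also assumes the full context is cached (no sliding window).

GIB = 1024 ** 3

NUM_PARAMS = 7e9  # nominal "7B" as stated in this card
NUM_LAYERS = 32
NUM_KV_HEADS = 8
HEAD_DIM = 128


def weight_bytes(bits_per_param: int, num_params: float = NUM_PARAMS) -> float:
    """Memory for the weights alone, ignoring quantization metadata."""
    return num_params * bits_per_param / 8


def kv_cache_bytes(num_tokens: int, bytes_per_value: int = 2) -> int:
    """Keys and values cached for every layer and KV head (fp16 by default)."""
    per_token = 2 * NUM_LAYERS * NUM_KV_HEADS * HEAD_DIM * bytes_per_value
    return per_token * num_tokens


if __name__ == "__main__":
    for bits in (16, 8, 4):
        print(f"{bits:>2}-bit weights: {weight_bytes(bits) / GIB:.1f} GiB")
    print(f"KV cache at 32,768 tokens: {kv_cache_bytes(32_768) / GIB:.1f} GiB")
```

Under these assumptions, 4-bit weights take roughly 3.3 GiB and a full 32,768-token fp16 KV cache takes 4 GiB, which appears consistent with the 12GB VRAM figure once activations and runtime overhead are added.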

## Deployment-Optimized Versions

| **Version** | **Target Hardware** | **Intended Use** |
|------------|----------------------|----------------|
| [OGAI-STEM-7B-GGUF](https://huggingface.co/GainEnergy/OGAI-STEM-7B-GGUF) | CPU optimized | Suitable for edge computing |

### Local Deployment with vLLM
```bash
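# Start an OpenAI-compatible API server.
# --tensor-parallel-size 2 shards the model across two GPUs; use 1 on a single-GPU machine.
python -m vllm.entrypoints.openai.api_server \
  --model GainEnergy/ogai-stem-7b \
  --tensor-parallel-size 2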
```
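
Once the server is up, it can be queried over HTTP. The following is a minimal client sketch using only the Python standard library. It assumes the server listens on vLLM's default address, `http://localhost:8000`, and uses the `/v1/completions` endpoint. The helper names `build_payload` and `query_server` are hypothetical, as are the default `max_tokens` and `temperature` values.

```python
# Minimal client for an OpenAI-compatible vLLM server.
# ASSUMPTIONS: the server listens on vLLM's default http://localhost:8000
# and was started with --model GainEnergy/ogai-stem-7b.
import json
import urllib.request

BASE_URL = "http://localhost:8000/v1"  # assumed default host and port
MODEL_NAME = "GainEnergy/ogai-stem-7b"


def build_payload(prompt: str, max_tokens: int = 256, temperature: float = 0.0) -> dict:
    """Body for POST /v1/completions; temperature 0 requests greedy decoding."""
    return {
        "model": MODEL_NAME,
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": temperature,
    }


def query_server(prompt: str, base_url: str = BASE_URL) -> str:
    """Send the prompt to the running server and return the generated text."""
    request = urllib.request.Request(
        f"{base_url}/completions",
        data=json.dumps(build_payload(prompt)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    return body["choices"][0]["text"]
```

With the server running, `query_server("How do you calculate mud weight for deepwater drilling?")` returns the completion text.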

## How to Use

### Run Inference in Python
```python
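from transformers import AutoTokenizer, AutoModelForCausalLM

model_name = "GainEnergy/ogai-stem-7b"
tokenizer = AutoTokenizer.from_pretrained(model_name)
# device_map="auto" (requires the `accelerate` package) places the weights
# on the available GPU(s) and falls back to CPU when none is present
model = AutoModelForCausalLM.from_pretrained(model_name, device_map="auto")

prompt = "Calculate the pressure drop in a 500 m pipeline with a 10,000 BPD flow rate."
# Move the inputs to wherever the model was placed instead of hard-coding "cuda"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
# Generate up to 100 new tokens and decode the full sequence (prompt included)
outputs = model.generate(**inputs, max_new_tokens=100)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))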
```
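
The widget examples in this card's metadata frame each query as a `User:` turn followed by an `AI:` cue. The sketch below wraps a question in that layout and strips the echoed prompt from the decoded output. Both helpers are hypothetical, and whether the model expects this exact template or separator is an assumption.

```python
# Hypothetical helpers built around the "User: ... AI:" layout of the
# widget examples. Whether the model expects this exact template
# (and this separator) is an ASSUMPTION.

def build_prompt(question: str, separator: str = "\n") -> str:
    """Format a single-turn prompt ending with the 'AI:' cue."""
    return f"User: {question.strip()}{separator}AI:"


def extract_answer(generated_text: str) -> str:
    """Return the text after the first 'AI:' cue, since decoding echoes the prompt."""
    return generated_text.split("AI:", 1)[-1].strip()
```

The result of `build_prompt(...)` can be passed to the tokenizer in place of a raw question, and `extract_answer(...)` applied to the decoded string.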

## Citing OGAI-STEM-7B
```bibtex
@misc{ogai_stem_7b_2025,
  title={OGAI-STEM-7B: AI Model for Oil \& Gas Scientific Computing},
  author={GainEnergy AI Team},
  year={2025},
  publisher={Hugging Face Models}
}
```