---
license: apache-2.0
datasets:
- GainEnergy/SMoE-Training
- GainEnergy/reasoner
- GainEnergy/ogai-8x7B
- GainEnergy/oilandgas-engineering-dataset
- GainEnergy/ogdataset
- GainEnergy/upstrimacentral
- open-r1/OpenR1-Math-220k
- unsloth/LaTeX_OCR
base_model: mistralai/Mathstral-7B-v0.1
tags:
- oil-gas
- drilling-engineering
- mathstral-7b
- lora
- fine-tuned
- energy-ai
- pragmatic-ai
- gguf
- text-generation-inference
- text-generation
model-index:
- name: OGAI-STEM-7B
  results:
  - task:
      type: text-generation
      name: Engineering AI for Oil & Gas
    dataset:
      name: GainEnergy Oil & Gas Corpus
      type: custom
    metrics:
    - name: Engineering Calculations Accuracy
      type: accuracy
      value: 94.5
    - name: Scientific Computation Precision
      type: precision
      value: 92.3
    - name: Context Retention
      type: contextual-coherence
      value: High
variants:
- name: OGAI-STEM-7B-GGUF
  pipeline_tag: text-generation
  repo_name: GainEnergy/OGAI-STEM-7B-GGUF
library_name: transformers
language:
- en
widget:
- text: >-
    User: What is the pressure drop in a horizontal pipeline for crude oil
    transport?
    AI:
  example_title: Pipeline Pressure Drop Calculation
- text: >-
    User: Explain the differences between gas lift and electric submersible
    pumps in artificial lift.
    AI:
  example_title: Artificial Lift Methods
- text: >-
    User: How do you calculate mud weight for deepwater drilling?
    AI:
  example_title: Mud Weight Calculation
- text: >-
    User: Describe the steps to optimize wellbore stability in unconventional
    reservoirs.
    AI:
  example_title: Wellbore Stability Optimization
pipeline_tag: text-generation
---

# OGAI-STEM-7B: AI-Powered Engineering Model for Oil & Gas Calculations

![Hugging Face](https://img.shields.io/badge/HuggingFace-OGAI--STEM--7B-blue)
[![License](https://img.shields.io/github/license/huggingface/transformers.svg)](LICENSE)

## Model Description

**OGAI-STEM-7B** is a **LoRA fine-tuned Mathstral-7B model** designed for **oil and gas engineering, scientific computing, and technical problem-solving**. It is optimized for numerical accuracy, complex engineering calculations, and technical document understanding.

The model is part of **GainEnergy's Upstrima AI Platform**, where it supports workflows built on **pragmatic AI agents, scientific computing tools, and retrieval-augmented generation (RAG)-based document analysis**.

## Technical Architecture

### Base Model Specifications
- **Architecture**: Mathstral-7B (Mistral fine-tuned for advanced math reasoning)
- **Parameters**: 7B
- **Context Length**: 32,768 tokens for long-form scientific queries
- **Mathematical Precision**: Enhanced for oil & gas engineering computations

### Fine-tuning Approach
- **Method**: Low-Rank Adaptation (LoRA) with rank 64
- **Training Dataset**: 3.2M datapoints from specialized oil & gas engineering sources
- **Hardware**: Trained on 8x NVIDIA A100 80GB GPUs
- **Training Time**: 2,200 GPU hours
- **Special Features**: Improved accuracy in fluid mechanics, pressure drop, and geomechanics calculations

### Performance Optimizations
- **Quantization**: 4-bit and 8-bit versions optimized for low-memory inference
- **Inference Speed**: Tuned KV cache management for real-time engineering computations
- **Memory Footprint**: Runs efficiently on **12GB VRAM** with 4-bit quantization
- **Reduced Hallucinations**: Domain-specific fine-tuning reduces incorrect scientific results

## Deployment-Optimized Versions

| **Version** | **Memory Requirement** | **Performance** |
|------------|----------------------|----------------|
| [OGAI-STEM-7B-GGUF](https://huggingface.co/GainEnergy/OGAI-STEM-7B-GGUF) | CPU optimized | Suitable for edge computing |

### Local Deployment with vLLM
```bash
python -m vllm.entrypoints.openai.api_server \
    --model GainEnergy/ogai-stem-7b \
    --tensor-parallel-size 2
```

## How to Use

### Run Inference in Python
```python
from transformers import AutoTokenizer, AutoModelForCausalLM

model_name = "GainEnergy/ogai-stem-7b"

# Load the tokenizer and model; device_map="auto" places weights on the available device(s)
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, device_map="auto")

prompt = "Calculate the pressure drop in a 500m pipeline with a 10,000 BPD flow rate."

# Move inputs to wherever the model was placed instead of hard-coding "cuda"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=100)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

## Citing OGAI-STEM-7B

```
@article{ogai_stem_7b_2025,
  title={OGAI-STEM-7B: AI Model for Oil \& Gas Scientific Computing},
  author={GainEnergy AI Team},
  year={2025},
  publisher={Hugging Face Models}
}
```
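
The memory figures listed under Performance Optimizations can be sanity-checked with back-of-the-envelope arithmetic. The sketch below is an illustration, not a measurement from the GainEnergy team. It assumes the layer count, KV-head count, and head dimension of the public Mistral-7B configuration (32, 8, and 128), which Mathstral-7B is assumed to share, rounds the parameter count to 7.0e9, stores the KV cache in 16-bit values, and ignores activations and runtime overhead. The function names are hypothetical.

```python
# Rough VRAM estimate: quantized weights plus a 16-bit KV cache.
# Architecture constants are ASSUMPTIONS taken from the public Mistral-7B config.
PARAMS = 7.0e9        # "7B" from the model card, rounded
N_LAYERS = 32         # assumed transformer layer count
N_KV_HEADS = 8        # assumed key/value heads (grouped-query attention)
HEAD_DIM = 128        # assumed per-head dimension
CONTEXT_LEN = 32_768  # context length stated in the model card
GIB = 1024 ** 3


def weight_bytes(params: float, bits_per_weight: int) -> float:
    """Bytes needed to store the weights at the given bit width."""
    return params * bits_per_weight / 8


def kv_cache_bytes(tokens: int, bytes_per_value: int = 2) -> int:
    """Bytes for the KV cache; the leading 2 covers keys and values."""
    return 2 * N_LAYERS * N_KV_HEADS * HEAD_DIM * bytes_per_value * tokens


def estimate_gib(bits_per_weight: int, tokens: int) -> float:
    """Weights plus KV cache, in GiB, excluding activations and overhead."""
    return (weight_bytes(PARAMS, bits_per_weight) + kv_cache_bytes(tokens)) / GIB


if __name__ == "__main__":
    for bits in (4, 8, 16):
        total = estimate_gib(bits, CONTEXT_LEN)
        print(f"{bits}-bit weights, {CONTEXT_LEN} tokens: {total:.1f} GiB")
```

Under these assumptions the 4-bit case comes to a little over 7 GiB at full context, which leaves headroom within the 12GB VRAM figure for runtime overhead. Actual usage depends on the quantization format and the inference engine.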