Neurogebra

The Executable Mathematical Formula Companion for AI and Data Science

285 symbolic expressions | 100+ datasets | Real training with full math transparency




Documentation  ·  PyPI  ·  GitHub  ·  Changelog  ·  Report Bug



What is Neurogebra?

Neurogebra is a unified Python library that bridges symbolic mathematics, numerical computation, and machine learning into a single, searchable, executable toolkit. It ships with 285 pre-built, tested, and documented mathematical expressions spanning activations, losses, statistics, optimization, linear algebra, and more -- each one symbolic, numerically evaluable, trainable, and accompanied by educational metadata. It also includes 100+ curated datasets for immediate experimentation.

Unlike traditional ML frameworks, Neurogebra is a mathematical formula companion: a searchable, executable encyclopedia of the formulas that power modern AI, complete with built-in explanations, gradient computation, composition tools, and ready-to-use datasets.

v2.5.3 -- Includes Observatory Pro with adaptive logging, automated health warnings, epoch summarization, tiered storage, visual dashboards, training fingerprinting, and full reproducibility support.



Table of Contents

| Section | Description |
|---|---|
| Installation | Get started in seconds |
| Quick Start | Your first 5 lines of Neurogebra |
| Unique Features | What makes Neurogebra different |
| Training Observatory | Real-time math transparency for training |
| Observatory Pro | Adaptive diagnostics, dashboards, fingerprinting |
| End-to-End Example: Training with Observatory Logs | Complete step-by-step walkthrough |
| Who Should Use Neurogebra? | Students, researchers, engineers |
| Expression Library | 285 verified mathematical expressions |
| Datasets | 100+ curated datasets |
| Building and Training Models | ModelBuilder and EducationalTrainer |
| Autograd Engine | From-scratch automatic differentiation |
| Framework Bridges | Export to PyTorch, TensorFlow, JAX |
| Architecture | Project structure |
| Documentation | Full docs and tutorials |
| Contributing | How to help |
| What Neurogebra Is and Is Not | Positioning and philosophy |
| License | MIT |


Installation

pip install neurogebra

Optional extras for extended functionality:

pip install neurogebra[viz]          # Interactive visualization (Plotly, Seaborn)
pip install neurogebra[frameworks]   # PyTorch and TensorFlow bridges
pip install neurogebra[logging]      # TensorBoard and W&B integration
pip install neurogebra[datasets]     # scikit-learn-backed real-world datasets
pip install neurogebra[docs]         # Documentation tools (MkDocs Material)
pip install neurogebra[dev]          # Development and testing tools
pip install neurogebra[all]          # Everything above

Requirements: Python 3.9+ | NumPy | SymPy | Matplotlib | SciPy | Rich | Colorama



Quick Start

from neurogebra import MathForge

forge = MathForge()

# Retrieve any of the 285 pre-built expressions by name
relu = forge.get("relu")
print(relu.eval(x=5))         # 5
print(relu.eval(x=-3))        # 0
print(relu.formula)            # LaTeX symbolic representation
print(relu.explain())          # Plain-language explanation

# Access optimizers, losses, metrics, distributions -- all the same way
adam   = forge.get("adam_step")
gauss  = forge.get("gaussian")
f1     = forge.get("f1_score_formula")

# Search and discover
results = forge.search("classification")
forge.list_all(category="activation")
forge.compare(["relu", "sigmoid", "tanh"])


Unique Features

Neurogebra provides a combination of capabilities that does not exist in any single tool today.

1. Searchable Executable Formula Repository

A curated library of 285 verified mathematical expressions organized across 10 domains. Every formula is simultaneously a symbolic object, a numerical function, and an educational resource. Access any formula by name, category, or keyword.

forge = MathForge()
cross_entropy = forge.get("cross_entropy")
print(cross_entropy.formula)       # Symbolic LaTeX
print(cross_entropy.eval(y=1, y_pred=0.9))  # Numerical result
print(cross_entropy.explain())     # What it is and when to use it

2. Symbolic + Numerical + Trainable in One Object

Every expression in Neurogebra is three things at once:

  • Symbolic -- Full SymPy integration with LaTeX rendering and analytical gradients
  • Numerical -- NumPy-backed lambdify for production-speed evaluation
  • Trainable -- Attach learnable parameters and optimize them with SGD or Adam

from neurogebra import Expression
from neurogebra.core.trainer import Trainer

# X, y: 1-D NumPy arrays of inputs and targets
# (full example in "Training Symbolic Expressions" below)
expr = Expression("fit_line", "m*x + b", params={"m": 0.0, "b": 0.0}, trainable_params=["m", "b"])
trainer = Trainer(expr, learning_rate=0.01, optimizer="adam")
history = trainer.fit(X, y, epochs=200, verbose=True)

3. Training Observatory -- Full Math Transparency

The first training logging system that shows the complete mathematical picture of what happens inside a neural network during training -- layer-by-layer forward/backward formulas, gradient norms, weight distributions, activation statistics -- all colour-coded in your terminal.

4. Observatory Pro -- Active Diagnostic Engine

Six intelligent systems that go beyond passive logging:

  • Adaptive Logging -- Stays quiet until anomalies appear, reducing logs by 80-90%
  • Automated Health Warnings -- 10 rules with structured diagnoses and recommendations
  • Epoch Summarization -- Statistical rollups (mean, std, min, max) per epoch
  • Tiered Storage -- Separate files for basic, health, and debug logs
  • Visual Dashboard -- Self-contained interactive HTML with Chart.js
  • Training Fingerprint -- Full environment capture for reproducibility

5. Real Forward and Backward Computation

Unlike simulators, Neurogebra performs actual matrix multiplications and gradient computation through every layer. Weights are initialized with He initialization, activations are computed with real NumPy operations, and gradients flow through analytical backpropagation.
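
As a rough illustration of what "real computation" means here, a plain-NumPy sketch of one Dense layer's forward and backward pass (this mirrors the math, not Neurogebra's internal code):

import numpy as np

rng = np.random.default_rng(0)
n_in, n_out = 64, 32

# He initialization: weights scaled by sqrt(2 / fan_in), suited to ReLU
W = rng.normal(0.0, np.sqrt(2.0 / n_in), size=(n_in, n_out))
b = np.zeros(n_out)

x = rng.normal(size=(16, n_in))   # a batch of 16 inputs
z = x @ W + b                     # real matrix multiplication
a = np.maximum(z, 0.0)            # ReLU activation

# Analytical backprop through the layer for an upstream gradient dL/da
dL_da = rng.normal(size=a.shape)
dL_dz = dL_da * (z > 0)           # ReLU derivative
dL_dW = x.T @ dL_dz               # gradient w.r.t. weights
dL_dx = dL_dz @ W.T               # gradient flowing to earlier layers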

6. From-Scratch Autograd Engine

A fully functional automatic differentiation engine built from first principles. Build computation graphs, inspect every operation, and watch gradients propagate through Value and Tensor objects -- designed for education and transparency.

7. Expression Composition and Arithmetic

Combine any expressions using standard arithmetic operators (+, -, *, /) and function composition. Build custom losses, hybrid activations, or complex metrics from the existing library.

mse = forge.get("mse")
mae = forge.get("mae")
custom_loss = 0.7 * mse + 0.3 * mae

8. 100+ Curated Datasets with Educational Metadata

Every dataset comes with difficulty level, use cases, sample count, descriptions, and consistent (X, y) numpy output. Classification, regression, synthetic patterns, time series, image recognition, and text/NLP datasets are included.

9. Framework Bridges -- Prototype Then Deploy

Convert Neurogebra models to PyTorch, TensorFlow, or JAX modules. Prototype and verify in Neurogebra, then export to production frameworks with one call.

10. Educational Layer with explain() Everywhere

Every expression, layer, and operation has an .explain() method. The educational trainer provides real-time tips and debugging advice. Interactive tutorials walk through tensors, gradients, training, and more step by step.

11. Smart Health Diagnostics

Automatic detection of 10+ training problems (NaN/Inf, overfitting, underfitting, vanishing/exploding gradients, dead neurons, activation saturation, loss divergence, weight stagnation) with actionable, human-readable recommendations.

12. Multi-Format Export

Export training logs and reports to JSON, CSV, HTML (interactive charts via Chart.js), and Markdown. Optional TensorBoard and Weights & Biases integration.


Comparison Matrix

| Capability | Neurogebra | SymPy | NumPy | Mathematica | PyTorch / TF |
|---|---|---|---|---|---|
| Symbolic math (LaTeX, calculus) | Yes | Yes | -- | Yes | -- |
| Fast numerical evaluation | Yes | Slow | Yes | Yes | Yes |
| 285 curated ML/statistics formulas | Yes | -- | -- | -- | -- |
| Educational metadata per formula | Yes | -- | -- | -- | -- |
| Trainable symbolic parameters | Yes | -- | -- | -- | N/A |
| Searchable formula repository | Yes | -- | -- | -- | -- |
| Real-time training math transparency | Yes | -- | -- | -- | -- |
| Adaptive diagnostic logging | Yes | -- | -- | -- | -- |
| Training fingerprint / reproducibility | Yes | -- | -- | -- | -- |
| Interactive HTML dashboards | Yes | -- | -- | -- | -- |
| Free and open source | Yes | Yes | Yes | -- | Yes |
| Python native | Yes | Yes | Yes | -- | Yes |


Training Observatory

The Training Observatory shows the complete mathematical picture of what happens inside your neural network during training -- in real time, in colour, in your terminal.

One-Line Activation

from neurogebra.builders.model_builder import ModelBuilder

builder = ModelBuilder()
model = builder.Sequential([
    builder.Dense(64, activation="relu"),
    builder.Dense(32, activation="tanh"),
    builder.Dense(1, activation="sigmoid"),
], name="classifier")

model.compile(
    loss="binary_crossentropy",
    optimizer="adam",
    learning_rate=0.01,
    log_level="expert",          # <-- this is all you need
)

model.fit(X_train, y_train, epochs=20, batch_size=32)

Colour-Coded Terminal Output

| Colour | Meaning |
|---|---|
| Green | Healthy metrics -- loss decreasing, gradients stable |
| Yellow | Warnings -- high gradient variance, saturation starting |
| Red | Danger -- vanishing/exploding gradients, NaN detected |
| Purple / Magenta | Mathematical formulas -- forward/backward equations |
| Blue | Informational -- epoch and batch progress |

Log Levels

| Level | What You See |
|---|---|
| "basic" | Epoch-level loss and accuracy, start/end messages |
| "detailed" | + Batch-level progress, timing information |
| "expert" | + Layer-by-layer formulas, gradient norms, weight stats |
| "debug" | + Every tensor shape, raw statistics, full computation trace |

Preset Configurations

from neurogebra.logging.config import LogConfig

config = LogConfig.minimal()     # Just epoch progress
config = LogConfig.standard()    # Layer info + timing + health checks
config = LogConfig.verbose()     # Full math depth -- every formula, every gradient
config = LogConfig.research()    # Everything + export to files

model.compile(loss="mse", optimizer="adam", log_config=config)

Layer-by-Layer Mathematical Display

At expert level, the Observatory renders the exact computation:

Forward:  a1 = relu(W1 * x + b1)      | shape: (32, 64) -> (32, 32)
Forward:  a2 = tanh(W2 * a1 + b2)     | shape: (32, 32) -> (32, 16)
Forward:  y_hat = sigma(W3 * a2 + b3) | shape: (32, 16) -> (32, 1)

Backward: dL/dW3 = dL/dy_hat . sigma'(z3) * a2^T
Backward: dL/dW2 = dL/da2 . tanh'(z2) * a1^T
Backward: dL/dW1 = dL/da1 . relu'(z1) * x^T

Smart Health Diagnostics

The Observatory automatically detects problems and provides actionable recommendations:

[CRITICAL] NaN/Inf Detected
   NaN values found in training loss!
   -> Check for division by zero in your data
   -> Reduce learning rate (try 1e-4)
   -> Add gradient clipping

[WARNING] Overfitting Detected
   Validation loss increasing while training loss decreases (ratio: 1.8x)
   -> Add dropout layers (rate 0.2-0.5)
   -> Reduce model complexity
   -> Increase training data or use data augmentation
   -> Try early stopping

[DANGER] Vanishing Gradients
   Layer 'dense_3' gradient L2 norm = 2.1e-09
   -> Use ReLU/LeakyReLU instead of sigmoid/tanh
   -> Add batch normalization
   -> Use skip connections

Export Training Logs

config = LogConfig.research()
config.export_formats = ["json", "csv", "html", "markdown"]
config.export_dir = "./my_training_logs"

model.compile(loss="mse", optimizer="adam", log_config=config)
model.fit(X, y, epochs=50)

# After training, find in ./my_training_logs/:
#   training_log.json   -- Full structured event log
#   metrics.csv         -- Epoch-level metrics table
#   report.html         -- Interactive HTML report with Chart.js graphs
#   report.md           -- Human-readable Markdown report


Observatory Pro

Six intelligent systems that turn the Training Observatory from a passive logger into an active diagnostic engine.

| Feature | Problem It Solves | Impact |
|---|---|---|
| Adaptive Logging | Expert mode generates 77k+ log entries | 80-90% log reduction |
| Health Warnings | Silent failures (e.g., 58% dead neurons going unnoticed) | Catches problems automatically |
| Epoch Summaries | No statistical aggregation per epoch | Mean, std, min, max per metric |
| Tiered Storage | One massive JSON file for everything | 3 focused files: basic / health / debug |
| Visual Dashboard | Raw JSON with no visualization | Interactive HTML charts with Chart.js |
| Training Fingerprint | Cannot reproduce training runs | Full environment and state capture |

Adaptive Logging

from neurogebra.logging.adaptive import AdaptiveLogger, AnomalyConfig
from neurogebra.logging.logger import TrainingLogger, LogLevel

base_logger = TrainingLogger(level=LogLevel.EXPERT)
adaptive = AdaptiveLogger(base_logger, config=AnomalyConfig(
    zeros_pct_threshold=50.0,
    gradient_spike_factor=5.0,
    escalation_cooldown=10,
))
# Stays at BASIC until anomaly detected, then escalates automatically

Health Warnings

from neurogebra.logging.health_warnings import AutoHealthWarnings, WarningConfig

warnings_engine = AutoHealthWarnings(config=WarningConfig(
    dead_relu_zeros_pct=50.0,
    overfit_patience=3,
    overfit_ratio=1.3,
))
# 10 built-in rules: dead_relu, vanishing/exploding gradient, NaN/Inf,
# overfitting, loss stagnation, weight stagnation, loss divergence, activation saturation

Visual Dashboard

from neurogebra.logging.dashboard import DashboardExporter

dashboard = DashboardExporter(path="training_logs/dashboard.html")
logger.add_backend(dashboard)
# Generates self-contained HTML with loss curves, accuracy charts,
# timing bars, batch-level metrics, and health diagnostics

Training Fingerprint

from neurogebra.logging.fingerprint import TrainingFingerprint

fp = TrainingFingerprint.capture(
    model_info={"name": "my_model", "layers": 3},
    hyperparameters={"lr": 0.01, "batch_size": 32, "epochs": 50},
    dataset=X_train,
    random_seed=42,
)
print(fp.format_text())
# Captures: seeds, dataset SHA-256, library versions, CPU/RAM/GPU, OS, git state


End-to-End Example: Training with Observatory Logs

A complete, step-by-step walkthrough -- from data loading to training with full diagnostic logging, health monitoring, epoch summarization, dashboard export, and reproducibility fingerprinting.

Step 1 -- Import Everything

import numpy as np

# Core
from neurogebra.builders.model_builder import ModelBuilder
from neurogebra.datasets.loaders import Datasets

# Observatory (base logging)
from neurogebra.logging.logger import TrainingLogger, LogLevel
from neurogebra.logging.config import LogConfig

# Observatory Pro
from neurogebra.logging.adaptive import AdaptiveLogger, AnomalyConfig
from neurogebra.logging.health_warnings import AutoHealthWarnings, WarningConfig
from neurogebra.logging.epoch_summary import EpochSummarizer
from neurogebra.logging.tiered_storage import TieredStorage
from neurogebra.logging.dashboard import DashboardExporter
from neurogebra.logging.fingerprint import TrainingFingerprint

Step 2 -- Load and Prepare Data

# Load a synthetic dataset
X, y = Datasets.load_moons(n_samples=1000, noise=0.2)

# Train/test split
split = int(0.8 * len(X))
X_train, X_test = X[:split], X[split:]
y_train, y_test = y[:split], y[split:]

print(f"Training samples: {X_train.shape[0]}")
print(f"Test samples:     {X_test.shape[0]}")
print(f"Features:         {X_train.shape[1]}")

Step 3 -- Build the Model

builder = ModelBuilder()
model = builder.Sequential([
    builder.Dense(64, activation="relu", input_shape=(2,)),
    builder.Dense(32, activation="tanh"),
    builder.Dense(1, activation="sigmoid"),
], name="moon_classifier")

# Inspect before training
model.summary()
model.explain_architecture()

Step 4 -- Set Up the Logging Pipeline

# Base logger at EXPERT level
base_logger = TrainingLogger(level=LogLevel.EXPERT)

# Adaptive wrapper -- stays quiet until something goes wrong
adaptive = AdaptiveLogger(base_logger, config=AnomalyConfig(
    zeros_pct_threshold=50.0,      # Dead neuron threshold
    gradient_spike_factor=5.0,     # Gradient spike sensitivity
    escalation_cooldown=10,        # Stay escalated for 10 events
))

# Tiered storage -- separates logs into basic/health/debug files
storage = TieredStorage(
    base_dir="./training_logs",
    write_debug=True,
    buffer_size=50,
)
base_logger.add_backend(storage)

# Interactive HTML dashboard
dashboard = DashboardExporter(path="./training_logs/dashboard.html")
base_logger.add_backend(dashboard)

# Health warnings engine
warnings_engine = AutoHealthWarnings(config=WarningConfig(
    dead_relu_zeros_pct=50.0,
    overfit_patience=3,
    overfit_ratio=1.3,
))

# Epoch summarizer
summarizer = EpochSummarizer()

Step 5 -- Capture the Training Fingerprint

fingerprint = TrainingFingerprint.capture(
    model_info={"name": "moon_classifier", "layers": 3},
    hyperparameters={
        "learning_rate": 0.01,
        "batch_size": 32,
        "epochs": 20,
        "optimizer": "adam",
        "loss": "binary_crossentropy",
    },
    dataset=X_train,
    random_seed=42,
)

print(fingerprint.format_text())
# Output:
# +-- Training Fingerprint --+
#   Run ID:       a1b2c3d4e5f6
#   Timestamp:    2026-03-01 12:00:00
#   Seed:         42
#   Dataset Hash: 8f14e45fceea167a
#   Neurogebra:   2.5.3
#   Python:       3.11.5
#   NumPy:        1.26.0
#   ...
# +-------------------------+

Step 6 -- Compile and Train

# Compile with Observatory logging enabled
model.compile(
    loss="binary_crossentropy",
    optimizer="adam",
    learning_rate=0.01,
    log_level="expert",
)

# Run training with a manual batch loop so each Observatory Pro component
# (adaptive logger, summarizer, warnings engine) is exercised explicitly
epochs = 20
batch_size = 32
num_batches = len(X_train) // batch_size

adaptive.on_train_start(total_epochs=epochs, model_info=fingerprint.model_info)

for epoch in range(epochs):
    adaptive.on_epoch_start(epoch)
    epoch_loss = 0.0

    for batch_idx in range(num_batches):
        start = batch_idx * batch_size
        end = start + batch_size
        X_batch = X_train[start:end]
        y_batch = y_train[start:end]

        # Forward pass (real computation), then the per-batch binary
        # cross-entropy matching the compiled loss (clipped to avoid log(0))
        predictions = model.predict(X_batch)
        p = np.clip(predictions.flatten(), 1e-7, 1 - 1e-7)
        loss = -np.mean(y_batch * np.log(p) + (1 - y_batch) * np.log(1 - p))
        epoch_loss += loss

        # Record batch metrics
        summarizer.record_batch(
            epoch=epoch,
            metrics={"loss": loss, "accuracy": np.mean((predictions.flatten() > 0.5) == y_batch)},
        )

        # Check batch-level health
        batch_alerts = warnings_engine.check_batch(
            epoch=epoch,
            batch=batch_idx,
            loss=loss,
        )
        for alert in batch_alerts:
            print(f"  [{alert.severity.upper()}] {alert.message}")

    # Epoch summary
    avg_loss = epoch_loss / num_batches
    summary = summarizer.finalize_epoch(epoch)
    print(summary.format_text())

    # Epoch-level health check
    epoch_alerts = warnings_engine.check_epoch(
        epoch=epoch,
        train_loss=avg_loss,
    )

    adaptive.on_epoch_end(epoch, metrics={"loss": avg_loss})

adaptive.on_train_end()

Step 7 -- Save Everything and Review

# Flush and close storage
storage.flush()
storage.close()

# Save dashboard
dashboard.save()

# Save fingerprint
import json
with open("./training_logs/fingerprint.json", "w") as f:
    json.dump(fingerprint.to_dict(), f, indent=2)

# Print summary
print(f"\nAnomalies detected: {adaptive.get_anomaly_summary()['total_anomalies']}")
print(f"Health warnings:    {warnings_engine.get_summary()['total_warnings']}")
print(f"\nFiles saved to ./training_logs/:")
print(f"  basic.log        -- Epoch-level metrics (NDJSON)")
print(f"  health.log       -- Health warnings and anomalies (NDJSON)")
print(f"  debug.log        -- Full expert-level detail (NDJSON)")
print(f"  dashboard.html   -- Interactive HTML dashboard")
print(f"  fingerprint.json -- Full reproducibility snapshot")

What the Output Looks Like

Terminal (colour-coded):

[INFO]  Training started: 20 epochs, 3 layers
[INFO]  Epoch 1/20
  Forward:  a1 = relu(W1 * x + b1)      | shape: (32, 64)
  Forward:  a2 = tanh(W2 * a1 + b2)     | shape: (32, 32)
  Forward:  y_hat = sigma(W3 * a2 + b3) | shape: (32, 1)
  Backward: dL/dW3 = dL/dy_hat . sigma'(z3) * a2^T
  Backward: dL/dW2 = dL/da2 . tanh'(z2) * a1^T
  Backward: dL/dW1 = dL/da1 . relu'(z1) * x^T

== Epoch 1 Summary (25 batches) ==
  loss      mean=0.4821  std=0.0312  min=0.4100  max=0.5500
  accuracy  mean=0.7640  std=0.0180  min=0.7200  max=0.8000

[INFO]  Epoch 2/20
  ...

[WARNING] Possible dying ReLU in 'dense_0' (52.0% zeros)
  -> Use LeakyReLU(negative_slope=0.01) instead of ReLU
  -> Lower the learning rate
  -> Use He initialisation

== Epoch 20 Summary (25 batches) ==
  loss      mean=0.0891  std=0.0045  min=0.0810  max=0.0990
  accuracy  mean=0.9620  std=0.0060  min=0.9500  max=0.9750

[INFO]  Training complete. Total anomalies: 2. Health warnings: 3.

Dashboard (HTML): Interactive loss curves, accuracy curves, epoch timing bars, batch-level metrics, and health diagnostics timeline -- all in a single self-contained HTML file you can open in any browser.



Who Should Use Neurogebra?

For Students

It is an executable reference library, not a framework to master.

Students do not need to "learn Neurogebra" the way they learn PyTorch. They use it to look up, verify, and experiment with mathematical formulas. The command forge.get("adam_step") immediately returns the Adam optimizer update rule as a symbolic expression with documentation attached -- no textbook lookup required.

It bridges the gap between mathematical theory and code. In most curricula, students learn formulas on a whiteboard and then separately implement them in code. Neurogebra collapses that gap: every formula is simultaneously a symbolic object (inspect the math), a numerical function (run it on data), and an educational resource (read what it does and when to use it).

It eliminates transcription errors. Students routinely introduce bugs when translating formulas from papers or textbooks into code. Neurogebra's 285 expressions are verified with 470+ automated tests. Using forge.get("cross_entropy") is faster and more reliable than re-deriving it from scratch.

It teaches through transparency. The autograd engine, the educational trainer with real-time debugging advice, the layer explanation system, and the interactive tutorials are designed to make invisible processes visible. Students do not just see numbers -- they see why their loss is diverging, what each layer does, and how gradients flow through a computation graph.

It complements existing tools. Prototype and verify formulas in Neurogebra, then implement production models in PyTorch or TensorFlow. The framework bridges explicitly support this workflow.


For Researchers

Rapid prototyping of custom formulas. Define a new loss function, activation, or metric as a symbolic expression and immediately evaluate it, differentiate it, compose it with others, and train it -- all without writing boilerplate code.

Reproducibility built in. The Training Fingerprint captures everything needed to reproduce a run: seeds, dataset hash, library versions, hardware info, OS, git state, model architecture hash, and hyperparameters. Save it as JSON alongside your results.

Transparent training diagnostics. The Observatory and Observatory Pro give unprecedented visibility into what happens during training. Instead of treating the model as a black box, researchers can inspect layer-by-layer forward/backward formulas, gradient distributions, weight dynamics, and activation statistics at every step.

Structured experiment logging. Tiered storage separates basic metrics from health alerts from debug-level detail. The dashboard exporter generates interactive HTML reports. TensorBoard and Weights & Biases bridges integrate with existing research workflows.

Symbolic gradient verification. Compare analytical gradients from SymPy with numerical gradients from autograd. Verify that your custom formula's gradient is correct before training.
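
A minimal sketch of that check, using a central finite difference as the numerical reference (the get/gradient/eval calls follow the API shown elsewhere in this README):

from neurogebra import MathForge

forge = MathForge()
sigmoid = forge.get("sigmoid")
sigmoid_grad = sigmoid.gradient("x")   # analytical derivative via SymPy

x0, eps = 0.3, 1e-5
numerical = (sigmoid.eval(x=x0 + eps) - sigmoid.eval(x=x0 - eps)) / (2 * eps)
analytical = sigmoid_grad.eval(x=x0)

assert abs(numerical - analytical) < 1e-6   # gradients agree before training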


For Engineers

Verified formula library. The 285 expressions are tested with 470+ automated tests. Use them as a reference implementation or as building blocks for custom metrics and loss functions -- without re-deriving from papers.

Production logging presets. LogConfig.production() provides lean, structured logging suitable for deployment. Tiered storage separates operational metrics from diagnostic detail. Adaptive logging reduces noise by 80-90%.
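
For example, assuming the same compile pattern as the Observatory examples above:

from neurogebra.logging.config import LogConfig

config = LogConfig.production()   # lean, structured logging for deployment
model.compile(loss="mse", optimizer="adam", log_config=config)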

Framework bridge workflow. Prototype quickly in Neurogebra, verify correctness with symbolic gradients and the Observatory, then export to PyTorch or TensorFlow for production training with GPU support.

Self-contained diagnostics. The HTML dashboard export creates a single file with interactive charts that can be shared with team members, attached to tickets, or archived alongside model artifacts -- no external services required.

Health monitoring. The automated health warning system provides structured alerts with severity levels, diagnoses, and recommendations. Integrate these into CI/CD pipelines or monitoring dashboards.
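
A minimal CI-style gate, reusing the warnings API from the end-to-end example (the total_warnings key appears in Step 7 there):

# After training with AutoHealthWarnings wired in (see the end-to-end example)
summary = warnings_engine.get_summary()
if summary["total_warnings"] > 0:
    raise SystemExit(f"Training health check failed: {summary['total_warnings']} warning(s)")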



Expression Library

285 verified mathematical expressions organized into 10 domain modules:

| Module | Count | Scope |
|---|---|---|
| Activations | 15 | ReLU, Sigmoid, Tanh, Swish, GELU, Mish, ELU, SELU, Softmax, LeakyReLU |
| Losses | 8 | MSE, MAE, Cross-Entropy, Huber, Hinge, Log-Cosh, Quantile, Focal |
| Regularizers | 20 | L1, L2, Elastic Net, Dropout, SCAD, MCP, Group Lasso, Tikhonov |
| Algebra | 48 | Polynomials, kernels, probability distributions, special functions |
| Calculus | 48 | Elementary, trigonometric, hyperbolic, Taylor series, integral transforms |
| Statistics | 35 | PDFs, CDFs, information theory, Bayesian inference, regression |
| Linear Algebra | 24 | Norms, distances, projections, matrix operations, attention mechanisms |
| Optimization | 27 | SGD, Adam, AdamW, learning rate schedules, loss landscapes |
| Metrics | 27 | Precision, Recall, F1, R-squared, AIC, BIC, NDCG, Matthews correlation |
| Transforms | 33 | Normalization, encoding, weight initialization, signal processing |

Every expression includes (each capability is exercised in the sketch after this list):

  • Symbolic representation (SymPy) with LaTeX rendering
  • Fast numerical evaluation (NumPy-backed via lambdify)
  • Gradient computation (analytical, symbolic)
  • Educational metadata (description, category, use cases, pros/cons)
  • Composability (arithmetic operations, function composition)
  • Optional trainable parameters
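
A compact sketch that touches each capability on real expressions; every call mirrors examples shown elsewhere in this README:

import numpy as np
from neurogebra import MathForge, Expression
from neurogebra.core.trainer import Trainer

forge = MathForge()
relu = forge.get("relu")

print(relu.formula)                 # symbolic (SymPy/LaTeX)
print(relu.eval(x=-2.5))            # numerical evaluation -> 0
print(relu.gradient("x").formula)   # analytical gradient
print(relu.explain())               # educational metadata

# Composability: blend two losses into a custom objective
blended = 0.5 * forge.get("mse") + 0.5 * forge.get("mae")
print(blended.eval(y=1.0, y_pred=0.8))

# Trainable parameters: recover m and b from synthetic data
line = Expression("line", "m*x + b", params={"m": 0.0, "b": 0.0},
                  trainable_params=["m", "b"])
X = np.linspace(0, 1, 50)
y = 3 * X + 0.5
Trainer(line, learning_rate=0.05, optimizer="adam").fit(X, y, epochs=100)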


Datasets

100+ curated datasets for immediate experimentation and learning.

from neurogebra.datasets import Datasets

Datasets.list_all()                                           # Browse all
(X_train, y_train), (X_test, y_test) = Datasets.load_iris()  # Load
Datasets.search("classification")                             # Search
Datasets.get_info("california_housing")                       # Details

| Category | Count | Examples |
|---|---|---|
| Classification | 25+ | Iris, Wine, Breast Cancer, MNIST, Fashion-MNIST, Spam, Titanic, Adult Income |
| Regression | 25+ | California Housing, Diabetes, Auto MPG, Bike Sharing, Energy Efficiency |
| Synthetic Patterns | 20+ | XOR, Moons, Circles, Spirals, Checkerboard, Blobs, Swiss Roll |
| Time Series | 15+ | Sine Waves, Random Walks, Stock Prices, Seasonal Data, AR Processes |
| Image Recognition | 10+ | MNIST, Fashion-MNIST, Digits (8x8), CIFAR-style |
| Text / NLP | 5+ | Spam Detection, Sentiment Analysis |

Every dataset includes educational metadata (difficulty, use cases, sample count), pre-split train/test sets, verbose mode, and consistent numpy array output.



Building and Training Models

Neural Network Builder

from neurogebra.builders.model_builder import ModelBuilder
from neurogebra.training.educational_trainer import EducationalTrainer
from neurogebra.datasets.loaders import Datasets
import numpy as np

X, y = Datasets.load_moons(n_samples=500, noise=0.2)

builder = ModelBuilder()
model = builder.Sequential([
    builder.Dense(64, activation="relu", input_shape=(2,)),
    builder.Dropout(0.2),
    builder.Dense(32, activation="relu"),
    builder.Dense(1, activation="sigmoid")
], name="moon_classifier")

model.summary()
model.explain_architecture()
model.compile(optimizer="adam", loss="binary_crossentropy", learning_rate=0.01)

trainer = EducationalTrainer(model, verbose=True, explain_steps=True)
history = trainer.train(X, y, epochs=20, batch_size=32, validation_split=0.2)

Training Symbolic Expressions

from neurogebra import Expression
from neurogebra.core.trainer import Trainer
import numpy as np

expr = Expression(
    "fit_line", "m*x + b",
    params={"m": 0.0, "b": 0.0},
    trainable_params=["m", "b"]
)

X = np.linspace(0, 10, 100)
y = 2 * X + 1 + np.random.normal(0, 0.5, 100)

trainer = Trainer(expr, learning_rate=0.01, optimizer="adam")
history = trainer.fit(X, y, epochs=200, verbose=True)
print(f"Learned: m={expr.params['m']:.2f}, b={expr.params['b']:.2f}")
# Output: Learned: m=2.00, b=1.01

Symbolic Gradients and Composition

from neurogebra import MathForge

forge = MathForge()

sigmoid = forge.get("sigmoid")
sigmoid_grad = sigmoid.gradient("x")
print(sigmoid_grad.formula)   # Analytical derivative in LaTeX

mse = forge.get("mse")
mae = forge.get("mae")
custom_loss = 0.7 * mse + 0.3 * mae
print(custom_loss.eval(y=1.0, y_pred=0.8))


Autograd Engine

A from-scratch automatic differentiation engine for understanding backpropagation.

from neurogebra.core.autograd import Value

x = Value(2.0)
w = Value(-3.0)
b = Value(7.0)   # chosen so y = w*x + b = 1 > 0, keeping the ReLU active

y = w * x + b    # y = (-3.0)(2.0) + 7.0 = 1.0
z = y.relu()
z.backward()

print(f"dz/dw = {w.grad}")   # 2.0  (= x)
print(f"dz/dx = {x.grad}")   # -3.0 (= w)


Framework Bridges

Prototype in Neurogebra, deploy in production frameworks.

from neurogebra.bridges import to_pytorch, to_tensorflow

# Export to PyTorch
pytorch_model = to_pytorch(model)

# Export to TensorFlow/Keras
tf_model = to_tensorflow(model)
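
A quick parity check after export can catch conversion issues early -- a sketch assuming the exported object is a callable torch.nn.Module and X is a NumPy feature matrix:

import numpy as np
import torch

pt_out = pytorch_model(torch.tensor(X, dtype=torch.float32)).detach().numpy()
ng_out = model.predict(X)
assert np.allclose(ng_out, pt_out, atol=1e-5), "PyTorch export diverges from Neurogebra"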


Architecture

neurogebra/
  core/
    expression.py          # Unified Expression class (symbolic + numerical + trainable)
    forge.py               # MathForge: central expression hub and search engine
    neurocraft.py          # NeuroCraft: educational interface with tutorials
    autograd.py            # Micro autograd engine (Value, Tensor)
    trainer.py             # Parameter optimization (SGD, Adam)
  repository/              # 10 domain modules, 285 expressions
  builders/                # ModelBuilder: architecture templates and layer definitions
  training/                # EducationalTrainer: training with explanations
  logging/                 # Training Observatory + Observatory Pro
    logger.py              #   Event-driven multi-level logger
    config.py              #   Preset configurations
    monitors.py            #   Gradient, weight, activation, performance monitors
    health_checks.py       #   Smart diagnostics with recommendations
    health_warnings.py     #   Automated threshold-based health rules
    adaptive.py            #   Adaptive logging with anomaly detection
    epoch_summary.py       #   Per-epoch statistical summarization
    tiered_storage.py      #   NDJSON tiered log files
    dashboard.py           #   HTML dashboard + TensorBoard + W&B bridges
    fingerprint.py         #   Training reproducibility capture
    terminal_display.py    #   Rich colour-coded terminal renderer
    formula_renderer.py    #   Unicode/LaTeX math formula display
    image_logger.py        #   ASCII rendering of images and activations
    exporters.py           #   JSON, CSV, HTML, Markdown exporters
    computation_graph.py   #   Full DAG tracker
  tutorials/               # Interactive step-by-step tutorial system
  datasets/                # 100+ built-in dataset loaders
  bridges/                 # Framework converters (PyTorch, TensorFlow, JAX)
  viz/                     # Visualization tools (Matplotlib, Plotly)
  utils/                   # Helpers and explanation engine


Documentation

Full documentation is available at neurogebra.readthedocs.io.

| Section | Description |
|---|---|
| Getting Started | Installation, first program, how it works |
| Python Refresher | Python basics, NumPy, data handling |
| ML Fundamentals | What is ML, types, workflow, math |
| Tutorials | MathForge, expressions, activations, losses, training, autograd |
| Advanced Topics | Custom expressions, framework bridges, Observatory Pro |
| Projects | Linear regression, image classifier, neural network from scratch |
| API Reference | Complete API documentation |


Contributing

Contributions are welcome. See CONTRIBUTING.md for guidelines.

git clone https://github.com/fahiiim/NeuroGebra.git
cd NeuroGebra
python -m venv venv
venv\Scripts\activate          # Windows
# source venv/bin/activate     # macOS/Linux
pip install -e ".[dev]"
pytest tests/ -v


What Neurogebra Is and Is Not

Neurogebra is not a competitor to TensorFlow or PyTorch.

TensorFlow and PyTorch are production-grade deep learning frameworks built for training large-scale neural networks on GPUs and TPUs. They are industry standards for model development, deployment, and research at scale. Neurogebra does not attempt to replace, replicate, or compete with them.

What Neurogebra is:

A mathematical formula library with executable, symbolic, and educational capabilities. The closest analogues are Wolfram Mathematica (proprietary, expensive, not Python-native) or manually assembling formulas from SymPy and Wikipedia (no curation, no ML focus, no educational layer). Neurogebra provides a unique combination that does not exist in any single tool today -- a searchable, executable, trainable encyclopedia of the math that powers modern AI, with built-in transparency, diagnostics, and educational features.



License

MIT License. See LICENSE for details.



Author: Fahim Sarker

GitHub  ·  PyPI  ·  Documentation  ·  Changelog

Built with precision. Designed for understanding.
