285 symbolic expressions | 100+ datasets | Real training with full math transparency
Documentation · PyPI · GitHub · Changelog · Report Bug
Neurogebra is a unified Python library that bridges symbolic mathematics, numerical computation, and machine learning into a single, searchable, executable toolkit. It ships with 285 pre-built, tested, and documented mathematical expressions spanning activations, losses, statistics, optimization, linear algebra, and more -- each one symbolic, numerically evaluable, trainable, and accompanied by educational metadata. It also includes 100+ curated datasets for immediate experimentation.
Unlike traditional ML frameworks, Neurogebra is a mathematical formula companion: a searchable, executable encyclopedia of the formulas that power modern AI, complete with built-in explanations, gradient computation, composition tools, and ready-to-use datasets.
v2.5.3 -- Includes Observatory Pro with adaptive logging, automated health warnings, epoch summarization, tiered storage, visual dashboards, training fingerprinting, and full reproducibility support.
| Section | Description |
|---|---|
| Installation | Get started in seconds |
| Quick Start | Your first 5 lines of Neurogebra |
| Unique Features | What makes Neurogebra different |
| Training Observatory | Real-time math transparency for training |
| Observatory Pro | Adaptive diagnostics, dashboards, fingerprinting |
| End-to-End Example: Training with Observatory Logs | Complete step-by-step walkthrough |
| Who Should Use Neurogebra? | Students, Researchers, Engineers |
| Expression Library | 285 verified mathematical expressions |
| Datasets | 100+ curated datasets |
| Building and Training Models | ModelBuilder and EducationalTrainer |
| Autograd Engine | From-scratch automatic differentiation |
| Framework Bridges | Export to PyTorch, TensorFlow, JAX |
| Architecture | Project structure |
| Documentation | Full docs and tutorials |
| Contributing | How to help |
| What Neurogebra Is and Is Not | Positioning and philosophy |
| License | MIT |
pip install neurogebra

Optional extras for extended functionality:
pip install neurogebra[viz] # Interactive visualization (Plotly, Seaborn)
pip install neurogebra[frameworks] # PyTorch and TensorFlow bridges
pip install neurogebra[logging] # TensorBoard and W&B integration
pip install neurogebra[datasets] # scikit-learn-backed real-world datasets
pip install neurogebra[docs] # Documentation tools (MkDocs Material)
pip install neurogebra[dev] # Development and testing tools
pip install neurogebra[all]        # Everything above

Requirements: Python 3.9+ | NumPy | SymPy | Matplotlib | SciPy | Rich | Colorama
from neurogebra import MathForge
forge = MathForge()
# Retrieve any of the 285 pre-built expressions by name
relu = forge.get("relu")
print(relu.eval(x=5)) # 5
print(relu.eval(x=-3)) # 0
print(relu.formula) # LaTeX symbolic representation
print(relu.explain()) # Plain-language explanation
# Access optimizers, losses, metrics, distributions -- all the same way
adam = forge.get("adam_step")
gauss = forge.get("gaussian")
f1 = forge.get("f1_score_formula")
# Search and discover
results = forge.search("classification")
forge.list_all(category="activation")
forge.compare(["relu", "sigmoid", "tanh"])

Neurogebra provides a combination of capabilities that does not exist in any single tool today.
A curated library of 285 verified mathematical expressions organized across 10 domains. Every formula is simultaneously a symbolic object, a numerical function, and an educational resource. Access any formula by name, category, or keyword.
forge = MathForge()
cross_entropy = forge.get("cross_entropy")
print(cross_entropy.formula) # Symbolic LaTeX
print(cross_entropy.eval(y=1, y_pred=0.9)) # Numerical result
print(cross_entropy.explain())              # What it is and when to use it

Every expression in Neurogebra is three things at once:
- Symbolic -- Full SymPy integration with LaTeX rendering and analytical gradients
- Numerical -- NumPy-backed lambdify for production-speed evaluation
- Trainable -- Attach learnable parameters and optimize them with SGD or Adam
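The symbolic/numerical duality is standard SymPy machinery under the hood. A standalone sketch using sympy.lambdify directly (illustrating the mechanism, not Neurogebra's internal API):

```python
import numpy as np
import sympy as sp

x = sp.Symbol("x")
sigmoid = 1 / (1 + sp.exp(-x))        # symbolic object: inspectable, differentiable
sigmoid_prime = sp.diff(sigmoid, x)   # analytical gradient
f = sp.lambdify(x, sigmoid, "numpy")  # compiled to a fast, vectorized NumPy function

print(sp.latex(sigmoid))                  # LaTeX rendering
print(f(np.array([-3.0, 0.0, 3.0])))      # numerical evaluation over an array
```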
from neurogebra import Expression
from neurogebra.core.trainer import Trainer
expr = Expression("fit_line", "m*x + b", params={"m": 0.0, "b": 0.0}, trainable_params=["m", "b"])
trainer = Trainer(expr, learning_rate=0.01, optimizer="adam")
history = trainer.fit(X, y, epochs=200, verbose=True)

The first training logging system that shows the complete mathematical picture of what happens inside a neural network during training -- layer-by-layer forward/backward formulas, gradient norms, weight distributions, activation statistics -- all colour-coded in your terminal.
Six intelligent systems that go beyond passive logging:
- Adaptive Logging -- Stays quiet until anomalies appear, reducing logs by 80-90%
- Automated Health Warnings -- 10 rules with structured diagnoses and recommendations
- Epoch Summarization -- Statistical rollups (mean, std, min, max) per epoch
- Tiered Storage -- Separate files for basic, health, and debug logs
- Visual Dashboard -- Self-contained interactive HTML with Chart.js
- Training Fingerprint -- Full environment capture for reproducibility
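The per-epoch rollup is plain descriptive statistics over batch-level values. As a standalone sketch of the idea (not the EpochSummarizer API itself):

```python
import numpy as np

def summarize_epoch(batch_losses):
    """Roll batch-level values up into the mean/std/min/max summary
    the Observatory prints per epoch."""
    a = np.asarray(batch_losses, dtype=float)
    return {"mean": a.mean(), "std": a.std(), "min": a.min(), "max": a.max()}

stats = summarize_epoch([0.55, 0.48, 0.41, 0.50])
print(f"loss mean={stats['mean']:.4f} std={stats['std']:.4f} "
      f"min={stats['min']:.4f} max={stats['max']:.4f}")
```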
Unlike simulators, Neurogebra performs actual matrix multiplications and gradient computation through every layer. Weights are initialized with He initialization, activations are computed with real NumPy operations, and gradients flow through analytical backpropagation.
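The He scheme mentioned above scales weight variance to a layer's fan-in so activation magnitudes stay roughly constant through ReLU layers. A minimal NumPy sketch of the scheme (not Neurogebra's internal initializer):

```python
import numpy as np

def he_init(fan_in, fan_out, rng=None):
    """He (Kaiming) normal initialization: W ~ N(0, 2/fan_in)."""
    if rng is None:
        rng = np.random.default_rng(0)
    std = np.sqrt(2.0 / fan_in)
    return rng.normal(0.0, std, size=(fan_in, fan_out))

W1 = he_init(2, 64)   # e.g. a Dense(64) layer on 2 input features
print(W1.std())       # sample std, close to sqrt(2/2) = 1.0
```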
A fully functional automatic differentiation engine built from first principles. Build computation graphs, inspect every operation, and watch gradients propagate through Value and Tensor objects -- designed for education and transparency.
Combine any expressions using standard arithmetic operators (+, -, *, /) and function composition. Build custom losses, hybrid activations, or complex metrics from the existing library.
mse = forge.get("mse")
mae = forge.get("mae")
custom_loss = 0.7 * mse + 0.3 * mae

Every dataset comes with difficulty level, use cases, sample count, descriptions, and consistent (X, y) numpy output. Classification, regression, synthetic patterns, time series, image recognition, and text/NLP datasets are included.
Convert Neurogebra models to PyTorch, TensorFlow, or JAX modules. Prototype and verify in Neurogebra, then export to production frameworks with one call.
Every expression, layer, and operation has an .explain() method. The educational trainer provides real-time tips and debugging advice. Interactive tutorials walk through tensors, gradients, training, and more step by step.
Automatic detection of 10+ training problems (NaN/Inf, overfitting, underfitting, vanishing/exploding gradients, dead neurons, activation saturation, loss divergence, weight stagnation) with actionable, human-readable recommendations.
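The overfitting check is representative of how these rules work: compare validation to training loss and fire once the gap persists. A simplified standalone version (thresholds mirror the WarningConfig defaults shown later; this is a sketch, not the library's rule code):

```python
def check_overfitting(train_losses, val_losses, ratio=1.3, patience=3):
    """Warn when val_loss / train_loss exceeds `ratio` for `patience`
    consecutive epochs -- the shape of the overfitting rule."""
    streak = 0
    for t, v in zip(train_losses, val_losses):
        streak = streak + 1 if v / t > ratio else 0
        if streak >= patience:
            return f"Overfitting: val/train ratio {v / t:.1f}x for {streak} epochs"
    return None

print(check_overfitting([0.40, 0.30, 0.22, 0.16], [0.45, 0.42, 0.41, 0.40]))
```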
Export training logs and reports to JSON, CSV, HTML (interactive charts via Chart.js), and Markdown. Optional TensorBoard and Weights & Biases integration.
| Capability | Neurogebra | SymPy | NumPy | Mathematica | PyTorch / TF |
|---|---|---|---|---|---|
| Symbolic math (LaTeX, calculus) | Yes | Yes | -- | Yes | -- |
| Fast numerical evaluation | Yes | Slow | Yes | Yes | Yes |
| 285 curated ML/statistics formulas | Yes | -- | -- | -- | -- |
| Educational metadata per formula | Yes | -- | -- | -- | -- |
| Trainable symbolic parameters | Yes | -- | -- | -- | N/A |
| Searchable formula repository | Yes | -- | -- | -- | -- |
| Real-time training math transparency | Yes | -- | -- | -- | -- |
| Adaptive diagnostic logging | Yes | -- | -- | -- | -- |
| Training fingerprint / reproducibility | Yes | -- | -- | -- | -- |
| Interactive HTML dashboards | Yes | -- | -- | -- | -- |
| Free and open source | Yes | Yes | Yes | -- | Yes |
| Python native | Yes | Yes | Yes | -- | Yes |
The Training Observatory shows the complete mathematical picture of what happens inside your neural network during training -- in real time, in colour, in your terminal.
from neurogebra.builders.model_builder import ModelBuilder
builder = ModelBuilder()
model = builder.Sequential([
builder.Dense(64, activation="relu"),
builder.Dense(32, activation="tanh"),
builder.Dense(1, activation="sigmoid"),
], name="classifier")
model.compile(
loss="binary_crossentropy",
optimizer="adam",
learning_rate=0.01,
log_level="expert", # <-- this is all you need
)
model.fit(X_train, y_train, epochs=20, batch_size=32)

| Colour | Meaning |
|---|---|
| Green | Healthy metrics -- loss decreasing, gradients stable |
| Yellow | Warnings -- high gradient variance, saturation starting |
| Red | Danger -- vanishing/exploding gradients, NaN detected |
| Purple / Magenta | Mathematical formulas -- forward/backward equations |
| Blue | Informational -- epoch and batch progress |
| Level | What You See |
|---|---|
| "basic" | Epoch-level loss and accuracy, start/end messages |
| "detailed" | + Batch-level progress, timing information |
| "expert" | + Layer-by-layer formulas, gradient norms, weight stats |
| "debug" | + Every tensor shape, raw statistics, full computation trace |
from neurogebra.logging.config import LogConfig
config = LogConfig.minimal() # Just epoch progress
config = LogConfig.standard() # Layer info + timing + health checks
config = LogConfig.verbose() # Full math depth -- every formula, every gradient
config = LogConfig.research() # Everything + export to files
model.compile(loss="mse", optimizer="adam", log_config=config)

At expert level, the Observatory renders the exact computation:
Forward: a1 = relu(W1 * x + b1) | shape: (32, 64) -> (32, 32)
Forward: a2 = tanh(W2 * a1 + b2) | shape: (32, 32) -> (32, 16)
Forward: y_hat = sigma(W3 * a2 + b3) | shape: (32, 16) -> (32, 1)
Backward: dL/dW3 = dL/dy_hat . sigma'(z3) * a2^T
Backward: dL/dW2 = dL/da2 . tanh'(z2) * a1^T
Backward: dL/dW1 = dL/da1 . relu'(z1) * x^T
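In conventional notation, the last-layer backward line above is the chain rule applied to y_hat = sigma(W3 a2 + b3):

```latex
\frac{\partial L}{\partial W_3}
  = \underbrace{\frac{\partial L}{\partial \hat{y}} \odot \sigma'(z_3)}_{\delta_3}\, a_2^{\top},
\qquad z_3 = W_3\, a_2 + b_3
```

The earlier layers repeat the same pattern, with the delta term propagated backwards through each activation derivative.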
The Observatory automatically detects problems and provides actionable recommendations:
[CRITICAL] NaN/Inf Detected
NaN values found in training loss!
-> Check for division by zero in your data
-> Reduce learning rate (try 1e-4)
-> Add gradient clipping
[WARNING] Overfitting Detected
Validation loss increasing while training loss decreases (ratio: 1.8x)
-> Add dropout layers (rate 0.2-0.5)
-> Reduce model complexity
-> Increase training data or use data augmentation
-> Try early stopping
[DANGER] Vanishing Gradients
Layer 'dense_3' gradient L2 norm = 2.1e-09
-> Use ReLU/LeakyReLU instead of sigmoid/tanh
-> Add batch normalization
-> Use skip connections
config = LogConfig.research()
config.export_formats = ["json", "csv", "html", "markdown"]
config.export_dir = "./my_training_logs"
model.compile(loss="mse", optimizer="adam", log_config=config)
model.fit(X, y, epochs=50)
# After training, find in ./my_training_logs/:
# training_log.json -- Full structured event log
# metrics.csv -- Epoch-level metrics table
# report.html -- Interactive HTML report with Chart.js graphs
#   report.md          -- Human-readable Markdown report

Six intelligent systems that turn the Training Observatory from a passive logger into an active diagnostic engine.
| Feature | Problem It Solves | Impact |
|---|---|---|
| Adaptive Logging | Expert mode generates 77k+ log entries | 80-90% log reduction |
| Health Warnings | Silent failures (e.g. a layer with 58% dead neurons can go unnoticed) | Catches problems automatically |
| Epoch Summaries | No statistical aggregation per epoch | Mean, std, min, max per metric |
| Tiered Storage | One massive JSON file for everything | 3 focused files: basic / health / debug |
| Visual Dashboard | Raw JSON with no visualization | Interactive HTML charts with Chart.js |
| Training Fingerprint | Cannot reproduce training runs | Full environment and state capture |
from neurogebra.logging.adaptive import AdaptiveLogger, AnomalyConfig
from neurogebra.logging.logger import TrainingLogger, LogLevel
base_logger = TrainingLogger(level=LogLevel.EXPERT)
adaptive = AdaptiveLogger(base_logger, config=AnomalyConfig(
zeros_pct_threshold=50.0,
gradient_spike_factor=5.0,
escalation_cooldown=10,
))
# Stays at BASIC until anomaly detected, then escalates automatically

from neurogebra.logging.health_warnings import AutoHealthWarnings, WarningConfig
warnings_engine = AutoHealthWarnings(config=WarningConfig(
dead_relu_zeros_pct=50.0,
overfit_patience=3,
overfit_ratio=1.3,
))
# 10 built-in rules: dead_relu, vanishing/exploding gradient, NaN/Inf,
# overfitting, loss stagnation, weight stagnation, loss divergence, activation saturation

from neurogebra.logging.dashboard import DashboardExporter
dashboard = DashboardExporter(path="training_logs/dashboard.html")
logger.add_backend(dashboard)
# Generates self-contained HTML with loss curves, accuracy charts,
# timing bars, batch-level metrics, and health diagnostics

from neurogebra.logging.fingerprint import TrainingFingerprint
fp = TrainingFingerprint.capture(
model_info={"name": "my_model", "layers": 3},
hyperparameters={"lr": 0.01, "batch_size": 32, "epochs": 50},
dataset=X_train,
random_seed=42,
)
print(fp.format_text())
# Captures: seeds, dataset SHA-256, library versions, CPU/RAM/GPU, OS, git state

A complete, step-by-step walkthrough -- from data loading to training with full diagnostic logging, health monitoring, epoch summarization, dashboard export, and reproducibility fingerprinting.
import numpy as np
# Core
from neurogebra.builders.model_builder import ModelBuilder
from neurogebra.datasets.loaders import Datasets
# Observatory (base logging)
from neurogebra.logging.logger import TrainingLogger, LogLevel
from neurogebra.logging.config import LogConfig
# Observatory Pro
from neurogebra.logging.adaptive import AdaptiveLogger, AnomalyConfig
from neurogebra.logging.health_warnings import AutoHealthWarnings, WarningConfig
from neurogebra.logging.epoch_summary import EpochSummarizer
from neurogebra.logging.tiered_storage import TieredStorage
from neurogebra.logging.dashboard import DashboardExporter
from neurogebra.logging.fingerprint import TrainingFingerprint

# Load a synthetic dataset
X, y = Datasets.load_moons(n_samples=1000, noise=0.2)
# Train/test split
split = int(0.8 * len(X))
X_train, X_test = X[:split], X[split:]
y_train, y_test = y[:split], y[split:]
print(f"Training samples: {X_train.shape[0]}")
print(f"Test samples: {X_test.shape[0]}")
print(f"Features: {X_train.shape[1]}")

builder = ModelBuilder()
model = builder.Sequential([
builder.Dense(64, activation="relu", input_shape=(2,)),
builder.Dense(32, activation="tanh"),
builder.Dense(1, activation="sigmoid"),
], name="moon_classifier")
# Inspect before training
model.summary()
model.explain_architecture()

# Base logger at EXPERT level
base_logger = TrainingLogger(level=LogLevel.EXPERT)
# Adaptive wrapper -- stays quiet until something goes wrong
adaptive = AdaptiveLogger(base_logger, config=AnomalyConfig(
zeros_pct_threshold=50.0, # Dead neuron threshold
gradient_spike_factor=5.0, # Gradient spike sensitivity
escalation_cooldown=10, # Stay escalated for 10 events
))
# Tiered storage -- separates logs into basic/health/debug files
storage = TieredStorage(
base_dir="./training_logs",
write_debug=True,
buffer_size=50,
)
base_logger.add_backend(storage)
# Interactive HTML dashboard
dashboard = DashboardExporter(path="./training_logs/dashboard.html")
base_logger.add_backend(dashboard)
# Health warnings engine
warnings_engine = AutoHealthWarnings(config=WarningConfig(
dead_relu_zeros_pct=50.0,
overfit_patience=3,
overfit_ratio=1.3,
))
# Epoch summarizer
summarizer = EpochSummarizer()

fingerprint = TrainingFingerprint.capture(
model_info={"name": "moon_classifier", "layers": 3},
hyperparameters={
"learning_rate": 0.01,
"batch_size": 32,
"epochs": 20,
"optimizer": "adam",
"loss": "binary_crossentropy",
},
dataset=X_train,
random_seed=42,
)
print(fingerprint.format_text())
# Output:
# +-- Training Fingerprint --+
# Run ID: a1b2c3d4e5f6
# Timestamp: 2026-03-01 12:00:00
# Seed: 42
# Dataset Hash: 8f14e45fceea167a
# Neurogebra: 2.5.3
# Python: 3.11.5
# NumPy: 1.26.0
# ...
# +-------------------------+

# Compile with Observatory logging enabled
model.compile(
loss="binary_crossentropy",
optimizer="adam",
learning_rate=0.01,
log_level="expert",
)
# Run training
epochs = 20
batch_size = 32
num_batches = len(X_train) // batch_size
adaptive.on_train_start(total_epochs=epochs, model_info=fingerprint.model_info)
for epoch in range(epochs):
    adaptive.on_epoch_start(epoch)
    epoch_loss = 0.0

    for batch_idx in range(num_batches):
        start = batch_idx * batch_size
        end = start + batch_size
        X_batch = X_train[start:end]
        y_batch = y_train[start:end]

        # Forward pass (real computation); the model's own weight update would
        # run here -- this loop focuses on the Observatory's logging hooks
        predictions = model.predict(X_batch)
        # Squared error used as a simple monitoring metric for the summarizer
        loss = np.mean((predictions.flatten() - y_batch) ** 2)
        epoch_loss += loss

        # Record batch metrics
        summarizer.record_batch(
            epoch=epoch,
            metrics={"loss": loss, "accuracy": np.mean((predictions.flatten() > 0.5) == y_batch)},
        )

        # Check batch-level health
        batch_alerts = warnings_engine.check_batch(
            epoch=epoch,
            batch=batch_idx,
            loss=loss,
        )
        for alert in batch_alerts:
            print(f" [{alert.severity.upper()}] {alert.message}")

    # Epoch summary
    avg_loss = epoch_loss / num_batches
    summary = summarizer.finalize_epoch(epoch)
    print(summary.format_text())

    # Epoch-level health check
    epoch_alerts = warnings_engine.check_epoch(
        epoch=epoch,
        train_loss=avg_loss,
    )

    adaptive.on_epoch_end(epoch, metrics={"loss": avg_loss})

adaptive.on_train_end()

# Flush and close storage
storage.flush()
storage.close()
# Save dashboard
dashboard.save()
# Save fingerprint
import json
with open("./training_logs/fingerprint.json", "w") as f:
    json.dump(fingerprint.to_dict(), f, indent=2)
# Print summary
print(f"\nAnomalies detected: {adaptive.get_anomaly_summary()['total_anomalies']}")
print(f"Health warnings: {warnings_engine.get_summary()['total_warnings']}")
print(f"\nFiles saved to ./training_logs/:")
print(f" basic.log -- Epoch-level metrics (NDJSON)")
print(f" health.log -- Health warnings and anomalies (NDJSON)")
print(f" debug.log -- Full expert-level detail (NDJSON)")
print(f" dashboard.html -- Interactive HTML dashboard")
print(f" fingerprint.json -- Full reproducibility snapshot")

Terminal (colour-coded):
[INFO] Training started: 20 epochs, 3 layers
[INFO] Epoch 1/20
Forward: a1 = relu(W1 * x + b1) | shape: (32, 64)
Forward: a2 = tanh(W2 * a1 + b2) | shape: (32, 32)
Forward: y_hat = sigma(W3 * a2 + b3) | shape: (32, 1)
Backward: dL/dW3 = dL/dy_hat . sigma'(z3) * a2^T
Backward: dL/dW2 = dL/da2 . tanh'(z2) * a1^T
Backward: dL/dW1 = dL/da1 . relu'(z1) * x^T
== Epoch 1 Summary (25 batches) ==
loss mean=0.4821 std=0.0312 min=0.4100 max=0.5500
accuracy mean=0.7640 std=0.0180 min=0.7200 max=0.8000
[INFO] Epoch 2/20
...
[WARNING] Possible dying ReLU in 'dense_0' (52.0% zeros)
-> Use LeakyReLU(negative_slope=0.01) instead of ReLU
-> Lower the learning rate
-> Use He initialisation
== Epoch 20 Summary (25 batches) ==
loss mean=0.0891 std=0.0045 min=0.0810 max=0.0990
accuracy mean=0.9620 std=0.0060 min=0.9500 max=0.9750
[INFO] Training complete. Total anomalies: 2. Health warnings: 3.
Dashboard (HTML): Interactive loss curves, accuracy curves, epoch timing bars, batch-level metrics, and health diagnostics timeline -- all in a single self-contained HTML file you can open in any browser.
It is an executable reference library, not a framework to master.
Students do not need to "learn Neurogebra" the way they learn PyTorch. They use it to look up, verify, and experiment with mathematical formulas. The command forge.get("adam_step") immediately returns the Adam optimizer update rule as a symbolic expression with documentation attached -- no textbook lookup required.
It bridges the gap between mathematical theory and code. In most curricula, students learn formulas on a whiteboard and then separately implement them in code. Neurogebra collapses that gap: every formula is simultaneously a symbolic object (inspect the math), a numerical function (run it on data), and an educational resource (read what it does and when to use it).
It eliminates transcription errors. Students routinely introduce bugs when translating formulas from papers or textbooks into code. Neurogebra's 285 expressions are verified with 470+ automated tests. Using forge.get("cross_entropy") is faster and more reliable than re-deriving it from scratch.
It teaches through transparency. The autograd engine, the educational trainer with real-time debugging advice, the layer explanation system, and the interactive tutorials are designed to make invisible processes visible. Students do not just see numbers -- they see why their loss is diverging, what each layer does, and how gradients flow through a computation graph.
It complements existing tools. Prototype and verify formulas in Neurogebra, then implement production models in PyTorch or TensorFlow. The framework bridges explicitly support this workflow.
Rapid prototyping of custom formulas. Define a new loss function, activation, or metric as a symbolic expression and immediately evaluate it, differentiate it, compose it with others, and train it -- all without writing boilerplate code.
Reproducibility built in. The Training Fingerprint captures everything needed to reproduce a run: seeds, dataset hash, library versions, hardware info, OS, git state, model architecture hash, and hyperparameters. Save it as JSON alongside your results.
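The dataset-hash component of the fingerprint can be reproduced with nothing but hashlib. A sketch of the idea (the library's exact canonicalization may differ):

```python
import hashlib
import numpy as np

def dataset_hash(X: np.ndarray) -> str:
    """SHA-256 over shape, dtype, and raw array bytes, so any change
    to the training data changes the fingerprint."""
    h = hashlib.sha256()
    h.update(str(X.shape).encode())
    h.update(str(X.dtype).encode())
    h.update(np.ascontiguousarray(X).tobytes())
    return h.hexdigest()

X = np.linspace(0, 1, 100).reshape(50, 2)
print(dataset_hash(X)[:16])  # short prefix, as shown in the fingerprint output
```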
Transparent training diagnostics. The Observatory and Observatory Pro give unprecedented visibility into what happens during training. Instead of treating the model as a black box, researchers can inspect layer-by-layer forward/backward formulas, gradient distributions, weight dynamics, and activation statistics at every step.
Structured experiment logging. Tiered storage separates basic metrics from health alerts from debug-level detail. The dashboard exporter generates interactive HTML reports. TensorBoard and Weights & Biases bridges integrate with existing research workflows.
Symbolic gradient verification. Compare analytical gradients from SymPy with numerical gradients from autograd. Verify that your custom formula's gradient is correct before training.
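The same check can be done with plain SymPy and NumPy: differentiate symbolically, then compare against a central finite difference. A sketch of the technique (not Neurogebra's verifier itself):

```python
import sympy as sp

x = sp.Symbol("x")
f_sym = sp.tanh(x)            # stand-in for a custom formula
g_sym = sp.diff(f_sym, x)     # analytical gradient: 1 - tanh(x)^2

f = sp.lambdify(x, f_sym, "numpy")
g = sp.lambdify(x, g_sym, "numpy")

def numerical_grad(f, x0, eps=1e-6):
    """Central finite difference -- the classic gradient check."""
    return (f(x0 + eps) - f(x0 - eps)) / (2 * eps)

x0 = 0.7
print(abs(g(x0) - numerical_grad(f, x0)))  # agreement to ~1e-10 or better
```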
Verified formula library. The 285 expressions are tested with 470+ automated tests. Use them as a reference implementation or as building blocks for custom metrics and loss functions -- without re-deriving from papers.
Production logging presets. LogConfig.production() provides lean, structured logging suitable for deployment. Tiered storage separates operational metrics from diagnostic detail. Adaptive logging reduces noise by 80-90%.
Framework bridge workflow. Prototype quickly in Neurogebra, verify correctness with symbolic gradients and the Observatory, then export to PyTorch or TensorFlow for production training with GPU support.
Self-contained diagnostics. The HTML dashboard export creates a single file with interactive charts that can be shared with team members, attached to tickets, or archived alongside model artifacts -- no external services required.
Health monitoring. The automated health warning system provides structured alerts with severity levels, diagnoses, and recommendations. Integrate these into CI/CD pipelines or monitoring dashboards.
285 verified mathematical expressions organized into 10 domain modules:
| Module | Count | Scope |
|---|---|---|
| Activations | 15 | ReLU, Sigmoid, Tanh, Swish, GELU, Mish, ELU, SELU, Softmax, LeakyReLU |
| Losses | 8 | MSE, MAE, Cross-Entropy, Huber, Hinge, Log-Cosh, Quantile, Focal |
| Regularizers | 20 | L1, L2, Elastic Net, Dropout, SCAD, MCP, Group Lasso, Tikhonov |
| Algebra | 48 | Polynomials, kernels, probability distributions, special functions |
| Calculus | 48 | Elementary, trigonometric, hyperbolic, Taylor series, integral transforms |
| Statistics | 35 | PDFs, CDFs, information theory, Bayesian inference, regression |
| Linear Algebra | 24 | Norms, distances, projections, matrix operations, attention mechanisms |
| Optimization | 27 | SGD, Adam, AdamW, learning rate schedules, loss landscapes |
| Metrics | 27 | Precision, Recall, F1, R-squared, AIC, BIC, NDCG, Matthews correlation |
| Transforms | 33 | Normalization, encoding, weight initialization, signal processing |
Every expression includes:
- Symbolic representation (SymPy) with LaTeX rendering
- Fast numerical evaluation (NumPy-backed via lambdify)
- Gradient computation (analytical, symbolic)
- Educational metadata (description, category, use cases, pros/cons)
- Composability (arithmetic operations, function composition)
- Optional trainable parameters
100+ curated datasets for immediate experimentation and learning.
from neurogebra.datasets import Datasets
Datasets.list_all() # Browse all
(X_train, y_train), (X_test, y_test) = Datasets.load_iris() # Load
Datasets.search("classification") # Search
Datasets.get_info("california_housing")       # Details

| Category | Count | Examples |
|---|---|---|
| Classification | 25+ | Iris, Wine, Breast Cancer, MNIST, Fashion-MNIST, Spam, Titanic, Adult Income |
| Regression | 25+ | California Housing, Diabetes, Auto MPG, Bike Sharing, Energy Efficiency |
| Synthetic Patterns | 20+ | XOR, Moons, Circles, Spirals, Checkerboard, Blobs, Swiss Roll |
| Time Series | 15+ | Sine Waves, Random Walks, Stock Prices, Seasonal Data, AR Processes |
| Image Recognition | 10+ | MNIST, Fashion-MNIST, Digits (8x8), CIFAR-style |
| Text / NLP | 5+ | Spam Detection, Sentiment Analysis |
Every dataset includes educational metadata (difficulty, use cases, sample count), pre-split train/test sets, verbose mode, and consistent numpy array output.
from neurogebra.builders.model_builder import ModelBuilder
from neurogebra.training.educational_trainer import EducationalTrainer
from neurogebra.datasets.loaders import Datasets
import numpy as np
X, y = Datasets.load_moons(n_samples=500, noise=0.2)
builder = ModelBuilder()
model = builder.Sequential([
builder.Dense(64, activation="relu", input_shape=(2,)),
builder.Dropout(0.2),
builder.Dense(32, activation="relu"),
builder.Dense(1, activation="sigmoid")
], name="moon_classifier")
model.summary()
model.explain_architecture()
model.compile(optimizer="adam", loss="binary_crossentropy", learning_rate=0.01)
trainer = EducationalTrainer(model, verbose=True, explain_steps=True)
history = trainer.train(X, y, epochs=20, batch_size=32, validation_split=0.2)

from neurogebra import Expression
from neurogebra.core.trainer import Trainer
import numpy as np
expr = Expression(
"fit_line", "m*x + b",
params={"m": 0.0, "b": 0.0},
trainable_params=["m", "b"]
)
X = np.linspace(0, 10, 100)
y = 2 * X + 1 + np.random.normal(0, 0.5, 100)
trainer = Trainer(expr, learning_rate=0.01, optimizer="adam")
history = trainer.fit(X, y, epochs=200, verbose=True)
print(f"Learned: m={expr.params['m']:.2f}, b={expr.params['b']:.2f}")
# Output: Learned: m=2.00, b=1.01

from neurogebra import MathForge
forge = MathForge()
sigmoid = forge.get("sigmoid")
sigmoid_grad = sigmoid.gradient("x")
print(sigmoid_grad.formula) # Analytical derivative in LaTeX
mse = forge.get("mse")
mae = forge.get("mae")
custom_loss = 0.7 * mse + 0.3 * mae
print(custom_loss.eval(y=1.0, y_pred=0.8))

A from-scratch automatic differentiation engine for understanding backpropagation.
from neurogebra.core.autograd import Value
x = Value(2.0)
w = Value(-3.0)
b = Value(10.0)  # keeps y = w*x + b positive so ReLU passes gradients through
y = w * x + b
z = y.relu()
z.backward()
print(f"dz/dw = {w.grad}") # 2.0
print(f"dz/dx = {x.grad}")  # -3.0

Prototype in Neurogebra, deploy in production frameworks.
from neurogebra.bridges import to_pytorch, to_tensorflow
# Export to PyTorch
pytorch_model = to_pytorch(model)
# Export to TensorFlow/Keras
tf_model = to_tensorflow(model)

neurogebra/
core/
expression.py # Unified Expression class (symbolic + numerical + trainable)
forge.py # MathForge: central expression hub and search engine
neurocraft.py # NeuroCraft: educational interface with tutorials
autograd.py # Micro autograd engine (Value, Tensor)
trainer.py # Parameter optimization (SGD, Adam)
repository/ # 10 domain modules, 285 expressions
builders/ # ModelBuilder: architecture templates and layer definitions
training/ # EducationalTrainer: training with explanations
logging/ # Training Observatory + Observatory Pro
logger.py # Event-driven multi-level logger
config.py # Preset configurations
monitors.py # Gradient, weight, activation, performance monitors
health_checks.py # Smart diagnostics with recommendations
health_warnings.py # Automated threshold-based health rules
adaptive.py # Adaptive logging with anomaly detection
epoch_summary.py # Per-epoch statistical summarization
tiered_storage.py # NDJSON tiered log files
dashboard.py # HTML dashboard + TensorBoard + W&B bridges
fingerprint.py # Training reproducibility capture
terminal_display.py # Rich colour-coded terminal renderer
formula_renderer.py # Unicode/LaTeX math formula display
image_logger.py # ASCII rendering of images and activations
exporters.py # JSON, CSV, HTML, Markdown exporters
computation_graph.py # Full DAG tracker
tutorials/ # Interactive step-by-step tutorial system
datasets/ # 100+ built-in dataset loaders
bridges/ # Framework converters (PyTorch, TensorFlow, JAX)
viz/ # Visualization tools (Matplotlib, Plotly)
utils/ # Helpers and explanation engine
Full documentation is available at neurogebra.readthedocs.io.
| Section | Description |
|---|---|
| Getting Started | Installation, first program, how it works |
| Python Refresher | Python basics, NumPy, data handling |
| ML Fundamentals | What is ML, types, workflow, math |
| Tutorials | MathForge, expressions, activations, losses, training, autograd |
| Advanced Topics | Custom expressions, framework bridges, Observatory Pro |
| Projects | Linear regression, image classifier, neural network from scratch |
| API Reference | Complete API documentation |
Contributions are welcome. See CONTRIBUTING.md for guidelines.
git clone https://github.com/fahiiim/NeuroGebra.git
cd NeuroGebra
python -m venv venv
venv\Scripts\activate # Windows
# source venv/bin/activate # macOS/Linux
pip install -e ".[dev]"
pytest tests/ -v

Neurogebra is not a competitor to TensorFlow or PyTorch.
TensorFlow and PyTorch are production-grade deep learning frameworks built for training large-scale neural networks on GPUs and TPUs. They are industry standards for model development, deployment, and research at scale. Neurogebra does not attempt to replace, replicate, or compete with them.
What Neurogebra is:
A mathematical formula library with executable, symbolic, and educational capabilities. The closest analogues are Wolfram Mathematica (proprietary, expensive, not Python-native) or manually assembling formulas from SymPy and Wikipedia (no curation, no ML focus, no educational layer). Neurogebra provides a unique combination that does not exist in any single tool today -- a searchable, executable, trainable encyclopedia of the math that powers modern AI, with built-in transparency, diagnostics, and educational features.
MIT License. See LICENSE for details.
Author: Fahim Sarker
GitHub · PyPI · Documentation · Changelog
Built with precision. Designed for understanding.
