285 symbolic expressions | 100+ datasets | Real training with full math transparency
Documentation · PyPI · GitHub · Changelog · Report Bug
Neurogebra is a unified Python library that bridges symbolic mathematics, numerical computation, and machine learning into a single, searchable, executable toolkit. It ships with 285 pre-built, tested, and documented mathematical expressions spanning activations, losses, statistics, optimization, linear algebra, and more -- each one symbolic, numerically evaluable, trainable, and accompanied by educational metadata. It also includes 100+ curated datasets for immediate experimentation.
Unlike traditional ML frameworks, Neurogebra is a mathematical formula companion: a searchable, executable encyclopedia of the formulas that power modern AI, complete with built-in explanations, gradient computation, composition tools, and ready-to-use datasets.
v2.5.3 -- Includes Observatory Pro with adaptive logging, automated health warnings, epoch summarization, tiered storage, visual dashboards, training fingerprinting, and full reproducibility support.
| Section | Description |
|---|---|
| Installation | Get started in seconds |
| Quick Start | Your first 5 lines of Neurogebra |
| Unique Features | What makes Neurogebra different |
| Training Observatory | Real-time math transparency for training |
| Observatory Pro | Adaptive diagnostics, dashboards, fingerprinting |
| End-to-End Example: Training with Observatory Logs | Complete step-by-step walkthrough |
| Who Should Use Neurogebra? | Students, Researchers, Engineers |
| Expression Library | 285 verified mathematical expressions |
| Datasets | 100+ curated datasets |
| Building and Training Models | ModelBuilder and EducationalTrainer |
| Autograd Engine | From-scratch automatic differentiation |
| Framework Bridges | Export to PyTorch, TensorFlow, JAX |
| Architecture | Project structure |
| Documentation | Full docs and tutorials |
| Contributing | How to help |
| What Neurogebra Is and Is Not | Positioning and philosophy |
| License | MIT |
pip install neurogebra

Optional extras for extended functionality:
pip install neurogebra[viz] # Interactive visualization (Plotly, Seaborn)
pip install neurogebra[frameworks] # PyTorch and TensorFlow bridges
pip install neurogebra[logging] # TensorBoard and W&B integration
pip install neurogebra[datasets] # scikit-learn-backed real-world datasets
pip install neurogebra[docs] # Documentation tools (MkDocs Material)
pip install neurogebra[dev] # Development and testing tools
pip install neurogebra[all]        # Everything above

Requirements: Python 3.9+ | NumPy | SymPy | Matplotlib | SciPy | Rich | Colorama
from neurogebra import MathForge
forge = MathForge()
# Retrieve any of the 285 pre-built expressions by name
relu = forge.get("relu")
print(relu.eval(x=5)) # 5
print(relu.eval(x=-3)) # 0
print(relu.formula) # LaTeX symbolic representation
print(relu.explain()) # Plain-language explanation
# Access optimizers, losses, metrics, distributions -- all the same way
adam = forge.get("adam_step")
gauss = forge.get("gaussian")
f1 = forge.get("f1_score_formula")
# Search and discover
results = forge.search("classification")
forge.list_all(category="activation")
forge.compare(["relu", "sigmoid", "tanh"])

Neurogebra provides a combination of capabilities that does not exist in any single tool today.
A curated library of 285 verified mathematical expressions organized across 10 domains. Every formula is simultaneously a symbolic object, a numerical function, and an educational resource. Access any formula by name, category, or keyword.
forge = MathForge()
cross_entropy = forge.get("cross_entropy")
print(cross_entropy.formula) # Symbolic LaTeX
print(cross_entropy.eval(y=1, y_pred=0.9)) # Numerical result
print(cross_entropy.explain())              # What it is and when to use it

Every expression in Neurogebra is three things at once:
- Symbolic -- Full SymPy integration with LaTeX rendering and analytical gradients
- Numerical -- NumPy-backed lambdify for production-speed evaluation
- Trainable -- Attach learnable parameters and optimize them with SGD or Adam
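The symbolic/numerical duality is standard SymPy machinery under the hood. A standalone sketch using sympy.lambdify directly (illustrating the mechanism, not Neurogebra's internal API):

```python
import numpy as np
import sympy as sp

x = sp.Symbol("x")
sigmoid = 1 / (1 + sp.exp(-x))        # symbolic object: inspectable, differentiable
sigmoid_prime = sp.diff(sigmoid, x)   # analytical gradient
f = sp.lambdify(x, sigmoid, "numpy")  # compiled to a fast, vectorized NumPy function

print(sp.latex(sigmoid))                  # LaTeX rendering
print(f(np.array([-3.0, 0.0, 3.0])))      # numerical evaluation over an array
```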
from neurogebra import Expression
from neurogebra.core.trainer import Trainer
expr = Expression("fit_line", "m*x + b", params={"m": 0.0, "b": 0.0}, trainable_params=["m", "b"])
trainer = Trainer(expr, learning_rate=0.01, optimizer="adam")
history = trainer.fit(X, y, epochs=200, verbose=True)

The first training logging system that shows the complete mathematical picture of what happens inside a neural network during training -- layer-by-layer forward/backward formulas, gradient norms, weight distributions, activation statistics -- all colour-coded in your terminal.
Six intelligent systems that go beyond passive logging:
- Adaptive Logging -- Stays quiet until anomalies appear, reducing logs by 80-90%
- Automated Health Warnings -- 10 rules with structured diagnoses and recommendations
- Epoch Summarization -- Statistical rollups (mean, std, min, max) per epoch
- Tiered Storage -- Separate files for basic, health, and debug logs
- Visual Dashboard -- Self-contained interactive HTML with Chart.js
- Training Fingerprint -- Full environment capture for reproducibility
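The per-epoch rollup is plain descriptive statistics over batch-level values. As a standalone sketch of the idea (not the EpochSummarizer API itself):

```python
import numpy as np

def summarize_epoch(batch_losses):
    """Roll batch-level values up into the mean/std/min/max summary
    the Observatory prints per epoch."""
    a = np.asarray(batch_losses, dtype=float)
    return {"mean": a.mean(), "std": a.std(), "min": a.min(), "max": a.max()}

stats = summarize_epoch([0.55, 0.48, 0.41, 0.50])
print(f"loss mean={stats['mean']:.4f} std={stats['std']:.4f} "
      f"min={stats['min']:.4f} max={stats['max']:.4f}")
```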
Unlike simulators, Neurogebra performs actual matrix multiplications and gradient computation through every layer. Weights are initialized with He initialization, activations are computed with real NumPy operations, and gradients flow through analytical backpropagation.
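The He scheme mentioned above scales weight variance to a layer's fan-in so activation magnitudes stay roughly constant through ReLU layers. A minimal NumPy sketch of the scheme (not Neurogebra's internal initializer):

```python
import numpy as np

def he_init(fan_in, fan_out, rng=None):
    """He (Kaiming) normal initialization: W ~ N(0, 2/fan_in)."""
    if rng is None:
        rng = np.random.default_rng(0)
    std = np.sqrt(2.0 / fan_in)
    return rng.normal(0.0, std, size=(fan_in, fan_out))

W1 = he_init(2, 64)   # e.g. a Dense(64) layer on 2 input features
print(W1.std())       # sample std, close to sqrt(2/2) = 1.0
```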
A fully functional automatic differentiation engine built from first principles. Build computation graphs, inspect every operation, and watch gradients propagate through Value and Tensor objects -- designed for education and transparency.
Combine any expressions using standard arithmetic operators (+, -, *, /) and function composition. Build custom losses, hybrid activations, or complex metrics from the existing library.
mse = forge.get("mse")
mae = forge.get("mae")
custom_loss = 0.7 * mse + 0.3 * mae

Every dataset comes with difficulty level, use cases, sample count, descriptions, and consistent (X, y) numpy output. Classification, regression, synthetic patterns, time series, image recognition, and text/NLP datasets are included.
Convert Neurogebra models to PyTorch, TensorFlow, or JAX modules. Prototype and verify in Neurogebra, then export to production frameworks with one call.
Every expression, layer, and operation has an .explain() method. The educational trainer provides real-time tips and debugging advice. Interactive tutorials walk through tensors, gradients, training, and more step by step.
Automatic detection of 10+ training problems (NaN/Inf, overfitting, underfitting, vanishing/exploding gradients, dead neurons, activation saturation, loss divergence, weight stagnation) with actionable, human-readable recommendations.
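The overfitting check is representative of how these rules work: compare validation to training loss and fire once the gap persists. A simplified standalone version (thresholds mirror the WarningConfig defaults shown later; this is a sketch, not the library's rule code):

```python
def check_overfitting(train_losses, val_losses, ratio=1.3, patience=3):
    """Warn when val_loss / train_loss exceeds `ratio` for `patience`
    consecutive epochs -- the shape of the overfitting rule."""
    streak = 0
    for t, v in zip(train_losses, val_losses):
        streak = streak + 1 if v / t > ratio else 0
        if streak >= patience:
            return f"Overfitting: val/train ratio {v / t:.1f}x for {streak} epochs"
    return None

print(check_overfitting([0.40, 0.30, 0.22, 0.16], [0.45, 0.42, 0.41, 0.40]))
```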
Export training logs and reports to JSON, CSV, HTML (interactive charts via Chart.js), and Markdown. Optional TensorBoard and Weights & Biases integration.
| Capability | Neurogebra | SymPy | NumPy | Mathematica | PyTorch / TF |
|---|---|---|---|---|---|
| Symbolic math (LaTeX, calculus) | Yes | Yes | -- | Yes | -- |
| Fast numerical evaluation | Yes | Slow | Yes | Yes | Yes |
| 285 curated ML/statistics formulas | Yes | -- | -- | -- | -- |
| Educational metadata per formula | Yes | -- | -- | -- | -- |
| Trainable symbolic parameters | Yes | -- | -- | -- | N/A |
| Searchable formula repository | Yes | -- | -- | -- | -- |
| Real-time training math transparency | Yes | -- | -- | -- | -- |
| Adaptive diagnostic logging | Yes | -- | -- | -- | -- |
| Training fingerprint / reproducibility | Yes | -- | -- | -- | -- |
| Interactive HTML dashboards | Yes | -- | -- | -- | -- |
| Free and open source | Yes | Yes | Yes | -- | Yes |
| Python native | Yes | Yes | Yes | -- | Yes |
The Training Observatory shows the complete mathematical picture of what happens inside your neural network during training -- in real time, in colour, in your terminal.
from neurogebra.builders.model_builder import ModelBuilder
builder = ModelBuilder()
model = builder.Sequential([
builder.Dense(64, activation="relu"),
builder.Dense(32, activation="tanh"),
builder.Dense(1, activation="sigmoid"),
], name="classifier")
model.compile(
loss="binary_crossentropy",
optimizer="adam",
learning_rate=0.01,
log_level="expert", # <-- this is all you need
)
model.fit(X_train, y_train, epochs=20, batch_size=32)

| Colour | Meaning |
|---|---|
| Green | Healthy metrics -- loss decreasing, gradients stable |
| Yellow | Warnings -- high gradient variance, saturation starting |
| Red | Danger -- vanishing/exploding gradients, NaN detected |
| Purple / Magenta | Mathematical formulas -- forward/backward equations |
| Blue | Informational -- epoch and batch progress |
| Level | What You See |
|---|---|
| "basic" | Epoch-level loss and accuracy, start/end messages |
| "detailed" | + Batch-level progress, timing information |
| "expert" | + Layer-by-layer formulas, gradient norms, weight stats |
| "debug" | + Every tensor shape, raw statistics, full computation trace |
from neurogebra.logging.config import LogConfig
config = LogConfig.minimal() # Just epoch progress
config = LogConfig.standard() # Layer info + timing + health checks
config = LogConfig.verbose() # Full math depth -- every formula, every gradient
config = LogConfig.research() # Everything + export to files
model.compile(loss="mse", optimizer="adam", log_config=config)

At expert level, the Observatory renders the exact computation:
Forward: a1 = relu(W1 * x + b1) | shape: (32, 64) -> (32, 32)
Forward: a2 = tanh(W2 * a1 + b2) | shape: (32, 32) -> (32, 16)
Forward: y_hat = sigma(W3 * a2 + b3) | shape: (32, 16) -> (32, 1)
Backward: dL/dW3 = dL/dy_hat . sigma'(z3) * a2^T
Backward: dL/dW2 = dL/da2 . tanh'(z2) * a1^T
Backward: dL/dW1 = dL/da1 . relu'(z1) * x^T
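In conventional notation, the last-layer backward line above is the chain rule applied to y_hat = sigma(W3 a2 + b3):

```latex
\frac{\partial L}{\partial W_3}
  = \underbrace{\frac{\partial L}{\partial \hat{y}} \odot \sigma'(z_3)}_{\delta_3}\, a_2^{\top},
\qquad z_3 = W_3\, a_2 + b_3
```

The earlier layers repeat the same pattern, with the delta term propagated backwards through each activation derivative.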
The Observatory automatically detects problems and provides actionable recommendations:
[CRITICAL] NaN/Inf Detected
NaN values found in training loss!
-> Check for division by zero in your data
-> Reduce learning rate (try 1e-4)
-> Add gradient clipping
[WARNING] Overfitting Detected
Validation loss increasing while training loss decreases (ratio: 1.8x)
-> Add dropout layers (rate 0.2-0.5)
-> Reduce model complexity
-> Increase training data or use data augmentation
-> Try early stopping
[DANGER] Vanishing Gradients
Layer 'dense_3' gradient L2 norm = 2.1e-09
-> Use ReLU/LeakyReLU instead of sigmoid/tanh
-> Add batch normalization
-> Use skip connections
config = LogConfig.research()
config.export_formats = ["json", "csv", "html", "markdown"]
config.export_dir = "./my_training_logs"
model.compile(loss="mse", optimizer="adam", log_config=config)
model.fit(X, y, epochs=50)
# After training, find in ./my_training_logs/:
# training_log.json -- Full structured event log
# metrics.csv -- Epoch-level metrics table
# report.html -- Interactive HTML report with Chart.js graphs
#   report.md          -- Human-readable Markdown report

Six intelligent systems that turn the Training Observatory from a passive logger into an active diagnostic engine.
| Feature | Problem It Solves | Impact |
|---|---|---|
| Adaptive Logging | Expert mode generates 77k+ log entries | 80-90% log reduction |
| Health Warnings | Silent failures (e.g. a layer with 58% dead neurons can go unnoticed) | Catches problems automatically |
| Epoch Summaries | No statistical aggregation per epoch | Mean, std, min, max per metric |
| Tiered Storage | One massive JSON file for everything | 3 focused files: basic / health / debug |
| Visual Dashboard | Raw JSON with no visualization | Interactive HTML charts with Chart.js |
| Training Fingerprint | Cannot reproduce training runs | Full environment and state capture |
from neurogebra.logging.adaptive import AdaptiveLogger, AnomalyConfig
from neurogebra.logging.logger import TrainingLogger, LogLevel
base_logger = TrainingLogger(level=LogLevel.EXPERT)
adaptive = AdaptiveLogger(base_logger, config=AnomalyConfig(
zeros_pct_threshold=50.0,
gradient_spike_factor=5.0,
escalation_cooldown=10,
))
# Stays at BASIC until anomaly detected, then escalates automatically

from neurogebra.logging.health_warnings import AutoHealthWarnings, WarningConfig
warnings_engine = AutoHealthWarnings(config=WarningConfig(
dead_relu_zeros_pct=50.0,
overfit_patience=3,
overfit_ratio=1.3,
))
# 10 built-in rules: dead_relu, vanishing/exploding gradient, NaN/Inf,
# overfitting, loss stagnation, weight stagnation, loss divergence, activation saturation

from neurogebra.logging.dashboard import DashboardExporter
dashboard = DashboardExporter(path="training_logs/dashboard.html")
logger.add_backend(dashboard)
# Generates self-contained HTML with loss curves, accuracy charts,
# timing bars, batch-level metrics, and health diagnostics

from neurogebra.logging.fingerprint import TrainingFingerprint
fp = TrainingFingerprint.capture(
model_info={"name": "my_model", "layers": 3},
hyperparameters={"lr": 0.01, "batch_size": 32, "epochs": 50},
dataset=X_train,
random_seed=42,
)
print(fp.format_text())
# Captures: seeds, dataset SHA-256, library versions, CPU/RAM/GPU, OS, git state

A complete, step-by-step walkthrough -- from data loading to training with full diagnostic logging, health monitoring, epoch summarization, dashboard export, and reproducibility fingerprinting.
import numpy as np
# Core
from neurogebra.builders.model_builder import ModelBuilder
from neurogebra.datasets.loaders import Datasets
# Observatory (base logging)
from neurogebra.logging.logger import TrainingLogger, LogLevel
from neurogebra.logging.config import LogConfig
# Observatory Pro
from neurogebra.logging.adaptive import AdaptiveLogger, AnomalyConfig
from neurogebra.logging.health_warnings import AutoHealthWarnings, WarningConfig
from neurogebra.logging.epoch_summary import EpochSummarizer
from neurogebra.logging.tiered_storage import TieredStorage
from neurogebra.logging.dashboard import DashboardExporter
from neurogebra.logging.fingerprint import TrainingFingerprint

# Load a synthetic dataset
X, y = Datasets.load_moons(n_samples=1000, noise=0.2)
# Train/test split
split = int(0.8 * len(X))
X_train, X_test = X[:split], X[split:]
y_train, y_test = y[:split], y[split:]
print(f"Training samples: {X_train.shape[0]}")
print(f"Test samples: {X_test.shape[0]}")
print(f"Features: {X_train.shape[1]}")

builder = ModelBuilder()
model = builder.Sequential([
builder.Dense(64, activation="relu", input_shape=(2,)),
builder.Dense(32, activation="tanh"),
builder.Dense(1, activation="sigmoid"),
], name="moon_classifier")
# Inspect before training
model.summary()
model.explain_architecture()

# Base logger at EXPERT level
base_logger = TrainingLogger(level=LogLevel.EXPERT)
# Adaptive wrapper -- stays quiet until something goes wrong
adaptive = AdaptiveLogger(base_logger, config=AnomalyConfig(
zeros_pct_threshold=50.0, # Dead neuron threshold
gradient_spike_factor=5.0, # Gradient spike sensitivity
escalation_cooldown=10, # Stay escalated for 10 events
))
# Tiered storage -- separates logs into basic/health/debug files
storage = TieredStorage(
base_dir="./training_logs",
write_debug=True,
buffer_size=50,
)
base_logger.add_backend(storage)
# Interactive HTML dashboard
dashboard = DashboardExporter(path="./training_logs/dashboard.html")
base_logger.add_backend(dashboard)
# Health warnings engine
warnings_engine = AutoHealthWarnings(config=WarningConfig(
dead_relu_zeros_pct=50.0,
overfit_patience=3,
overfit_ratio=1.3,
))
# Epoch summarizer
summarizer = EpochSummarizer()

fingerprint = TrainingFingerprint.capture(
model_info={"name": "moon_classifier", "layers": 3},
hyperparameters={
"learning_rate": 0.01,
"batch_size": 32,
"epochs": 20,
"optimizer": "adam",
"loss": "binary_crossentropy",
},
dataset=X_train,
random_seed=42,
)
print(fingerprint.format_text())
# Output:
# +-- Training Fingerprint --+
# Run ID: a1b2c3d4e5f6
# Timestamp: 2026-03-01 12:00:00
# Seed: 42
# Dataset Hash: 8f14e45fceea167a
# Neurogebra: 2.5.3
# Python: 3.11.5
# NumPy: 1.26.0
# ...
# +-------------------------+

# Compile with Observatory logging enabled
model.compile(
loss="binary_crossentropy",
optimizer="adam",
learning_rate=0.01,
log_level="expert",
)
# Run training
epochs = 20
batch_size = 32
num_batches = len(X_train) // batch_size
adaptive.on_train_start(total_epochs=epochs, model_info=fingerprint.model_info)
for epoch in range(epochs):
    adaptive.on_epoch_start(epoch)
    epoch_loss = 0.0

    for batch_idx in range(num_batches):
        start = batch_idx * batch_size
        end = start + batch_size
        X_batch = X_train[start:end]
        y_batch = y_train[start:end]

        # Forward pass (real computation); the model's own weight update would
        # run here -- this loop focuses on the Observatory's logging hooks
        predictions = model.predict(X_batch)
        # Squared error used as a simple monitoring metric for the summarizer
        loss = np.mean((predictions.flatten() - y_batch) ** 2)
        epoch_loss += loss

        # Record batch metrics
        summarizer.record_batch(
            epoch=epoch,
            metrics={"loss": loss, "accuracy": np.mean((predictions.flatten() > 0.5) == y_batch)},
        )

        # Check batch-level health
        batch_alerts = warnings_engine.check_batch(
            epoch=epoch,
            batch=batch_idx,
            loss=loss,
        )
        for alert in batch_alerts:
            print(f" [{alert.severity.upper()}] {alert.message}")

    # Epoch summary
    avg_loss = epoch_loss / num_batches
    summary = summarizer.finalize_epoch(epoch)
    print(summary.format_text())

    # Epoch-level health check
    epoch_alerts = warnings_engine.check_epoch(
        epoch=epoch,
        train_loss=avg_loss,
    )

    adaptive.on_epoch_end(epoch, metrics={"loss": avg_loss})

adaptive.on_train_end()

# Flush and close storage
storage.flush()
storage.close()
# Save dashboard
dashboard.save()
# Save fingerprint
import json
with open("./training_logs/fingerprint.json", "w") as f:
    json.dump(fingerprint.to_dict(), f, indent=2)
# Print summary
print(f"\nAnomalies detected: {adaptive.get_anomaly_summary()['total_anomalies']}")
print(f"Health warnings: {warnings_engine.get_summary()['total_warnings']}")
print(f"\nFiles saved to ./training_logs/:")
print(f" basic.log -- Epoch-level metrics (NDJSON)")
print(f" health.log -- Health warnings and anomalies (NDJSON)")
print(f" debug.log -- Full expert-level detail (NDJSON)")
print(f" dashboard.html -- Interactive HTML dashboard")
print(f" fingerprint.json -- Full reproducibility snapshot")

Terminal (colour-coded):
[INFO] Training started: 20 epochs, 3 layers
[INFO] Epoch 1/20
Forward: a1 = relu(W1 * x + b1) | shape: (32, 64)
Forward: a2 = tanh(W2 * a1 + b2) | shape: (32, 32)
Forward: y_hat = sigma(W3 * a2 + b3) | shape: (32, 1)
Backward: dL/dW3 = dL/dy_hat . sigma'(z3) * a2^T
Backward: dL/dW2 = dL/da2 . tanh'(z2) * a1^T
Backward: dL/dW1 = dL/da1 . relu'(z1) * x^T
== Epoch 1 Summary (25 batches) ==
loss mean=0.4821 std=0.0312 min=0.4100 max=0.5500
accuracy mean=0.7640 std=0.0180 min=0.7200 max=0.8000
[INFO] Epoch 2/20
...
[WARNING] Possible dying ReLU in 'dense_0' (52.0% zeros)
-> Use LeakyReLU(negative_slope=0.01) instead of ReLU
-> Lower the learning rate
-> Use He initialisation
== Epoch 20 Summary (25 batches) ==
loss mean=0.0891 std=0.0045 min=0.0810 max=0.0990
accuracy mean=0.9620 std=0.0060 min=0.9500 max=0.9750
[INFO] Training complete. Total anomalies: 2. Health warnings: 3.
Dashboard (HTML): Interactive loss curves, accuracy curves, epoch timing bars, batch-level metrics, and health diagnostics timeline -- all in a single self-contained HTML file you can open in any browser.
It is an executable reference library, not a framework to master.
Students do not need to "learn Neurogebra" the way they learn PyTorch. They use it to look up, verify, and experiment with mathematical formulas. The command forge.get("adam_step") immediately returns the Adam optimizer update rule as a symbolic expression with documentation attached -- no textbook lookup required.
It bridges the gap between mathematical theory and code. In most curricula, students learn formulas on a whiteboard and then separately implement them in code. Neurogebra collapses that gap: every formula is simultaneously a symbolic object (inspect the math), a numerical function (run it on data), and an educational resource (read what it does and when to use it).
It eliminates transcription errors. Students routinely introduce bugs when translating formulas from papers or textbooks into code. Neurogebra's 285 expressions are verified with 470+ automated tests. Using forge.get("cross_entropy") is faster and more reliable than re-deriving it from scratch.
It teaches through transparency. The autograd engine, the educational trainer with real-time debugging advice, the layer explanation system, and the interactive tutorials are designed to make invisible processes visible. Students do not just see numbers -- they see why their loss is diverging, what each layer does, and how gradients flow through a computation graph.
It complements existing tools. Prototype and verify formulas in Neurogebra, then implement production models in PyTorch or TensorFlow. The framework bridges explicitly support this workflow.
Rapid prototyping of custom formulas. Define a new loss function, activation, or metric as a symbolic expression and immediately evaluate it, differentiate it, compose it with others, and train it -- all without writing boilerplate code.
Reproducibility built in. The Training Fingerprint captures everything needed to reproduce a run: seeds, dataset hash, library versions, hardware info, OS, git state, model architecture hash, and hyperparameters. Save it as JSON alongside your results.
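The dataset-hash component of the fingerprint can be reproduced with nothing but hashlib. A sketch of the idea (the library's exact canonicalization may differ):

```python
import hashlib
import numpy as np

def dataset_hash(X: np.ndarray) -> str:
    """SHA-256 over shape, dtype, and raw array bytes, so any change
    to the training data changes the fingerprint."""
    h = hashlib.sha256()
    h.update(str(X.shape).encode())
    h.update(str(X.dtype).encode())
    h.update(np.ascontiguousarray(X).tobytes())
    return h.hexdigest()

X = np.linspace(0, 1, 100).reshape(50, 2)
print(dataset_hash(X)[:16])  # short prefix, as shown in the fingerprint output
```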
Transparent training diagnostics. The Observatory and Observatory Pro give unprecedented visibility into what happens during training. Instead of treating the model as a black box, researchers can inspect layer-by-layer forward/backward formulas, gradient distributions, weight dynamics, and activation statistics at every step.
Structured experiment logging. Tiered storage separates basic metrics from health alerts from debug-level detail. The dashboard exporter generates interactive HTML reports. TensorBoard and Weights & Biases bridges integrate with existing research workflows.
Symbolic gradient verification. Compare analytical gradients from SymPy with numerical gradients from autograd. Verify that your custom formula's gradient is correct before training.
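The same check can be done with plain SymPy and NumPy: differentiate symbolically, then compare against a central finite difference. A sketch of the technique (not Neurogebra's verifier itself):

```python
import sympy as sp

x = sp.Symbol("x")
f_sym = sp.tanh(x)            # stand-in for a custom formula
g_sym = sp.diff(f_sym, x)     # analytical gradient: 1 - tanh(x)^2

f = sp.lambdify(x, f_sym, "numpy")
g = sp.lambdify(x, g_sym, "numpy")

def numerical_grad(f, x0, eps=1e-6):
    """Central finite difference -- the classic gradient check."""
    return (f(x0 + eps) - f(x0 - eps)) / (2 * eps)

x0 = 0.7
print(abs(g(x0) - numerical_grad(f, x0)))  # agreement to ~1e-10 or better
```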
Verified formula library. The 285 expressions are tested with 470+ automated tests. Use them as a reference implementation or as building blocks for custom metrics and loss functions -- without re-deriving from papers.
Production logging presets. LogConfig.production() provides lean, structured logging suitable for deployment. Tiered storage separates operational metrics from diagnostic detail. Adaptive logging reduces noise by 80-90%.
Framework bridge workflow. Prototype quickly in Neurogebra, verify correctness with symbolic gradients and the Observatory, then export to PyTorch or TensorFlow for production training with GPU support.
Self-contained diagnostics. The HTML dashboard export creates a single file with interactive charts that can be shared with team members, attached to tickets, or archived alongside model artifacts -- no external services required.
Health monitoring. The automated health warning system provides structured alerts with severity levels, diagnoses, and recommendations. Integrate these into CI/CD pipelines or monitoring dashboards.
285 verified mathematical expressions organized into 10 domain modules:
| Module | Count | Scope |
|---|---|---|
| Activations | 15 | ReLU, Sigmoid, Tanh, Swish, GELU, Mish, ELU, SELU, Softmax, LeakyReLU |
| Losses | 8 | MSE, MAE, Cross-Entropy, Huber, Hinge, Log-Cosh, Quantile, Focal |
| Regularizers | 20 | L1, L2, Elastic Net, Dropout, SCAD, MCP, Group Lasso, Tikhonov |
| Algebra | 48 | Polynomials, kernels, probability distributions, special functions |
| Calculus | 48 | Elementary, trigonometric, hyperbolic, Taylor series, integral transforms |
| Statistics | 35 | PDFs, CDFs, information theory, Bayesian inference, regression |
| Linear Algebra | 24 | Norms, distances, projections, matrix operations, attention mechanisms |
| Optimization | 27 | SGD, Adam, AdamW, learning rate schedules, loss landscapes |
| Metrics | 27 | Precision, Recall, F1, R-squared, AIC, BIC, NDCG, Matthews correlation |
| Transforms | 33 | Normalization, encoding, weight initialization, signal processing |
Every expression includes:
- Symbolic representation (SymPy) with LaTeX rendering
- Fast numerical evaluation (NumPy-backed via lambdify)
- Gradient computation (analytical, symbolic)
- Educational metadata (description, category, use cases, pros/cons)
- Composability (arithmetic operations, function composition)
- Optional trainable parameters
100+ curated datasets for immediate experimentation and learning.
from neurogebra.datasets import Datasets
Datasets.list_all() # Browse all
(X_train, y_train), (X_test, y_test) = Datasets.load_iris() # Load
Datasets.search("classification") # Search
Datasets.get_info("california_housing")       # Details

| Category | Count | Examples |
|---|---|---|
| Classification | 25+ | Iris, Wine, Breast Cancer, MNIST, Fashion-MNIST, Spam, Titanic, Adult Income |
| Regression | 25+ | California Housing, Diabetes, Auto MPG, Bike Sharing, Energy Efficiency |
| Synthetic Patterns | 20+ | XOR, Moons, Circles, Spirals, Checkerboard, Blobs, Swiss Roll |
| Time Series | 15+ | Sine Waves, Random Walks, Stock Prices, Seasonal Data, AR Processes |
| Image Recognition | 10+ | MNIST, Fashion-MNIST, Digits (8x8), CIFAR-style |
| Text / NLP | 5+ | Spam Detection, Sentiment Analysis |
Every dataset includes educational metadata (difficulty, use cases, sample count), pre-split train/test sets, verbose mode, and consistent numpy array output.
from neurogebra.builders.model_builder import ModelBuilder
from neurogebra.training.educational_trainer import EducationalTrainer
from neurogebra.datasets.loaders import Datasets
import numpy as np
X, y = Datasets.load_moons(n_samples=500, noise=0.2)
builder = ModelBuilder()
model = builder.Sequential([
builder.Dense(64, activation="relu", input_shape=(2,)),
builder.Dropout(0.2),
builder.Dense(32, activation="relu"),
builder.Dense(1, activation="sigmoid")
], name="moon_classifier")
model.summary()
model.explain_architecture()
model.compile(optimizer="adam", loss="binary_crossentropy", learning_rate=0.01)
trainer = EducationalTrainer(model, verbose=True, explain_steps=True)
history = trainer.train(X, y, epochs=20, batch_size=32, validation_split=0.2)

from neurogebra import Expression
from neurogebra.core.trainer import Trainer
import numpy as np
expr = Expression(
"fit_line", "m*x + b",
params={"m": 0.0, "b": 0.0},
trainable_params=["m", "b"]
)
X = np.linspace(0, 10, 100)
y = 2 * X + 1 + np.random.normal(0, 0.5, 100)
trainer = Trainer(expr, learning_rate=0.01, optimizer="adam")
history = trainer.fit(X, y, epochs=200, verbose=True)
print(f"Learned: m={expr.params['m']:.2f}, b={expr.params['b']:.2f}")
# Output: Learned: m=2.00, b=1.01

from neurogebra import MathForge
forge = MathForge()
sigmoid = forge.get("sigmoid")
sigmoid_grad = sigmoid.gradient("x")
print(sigmoid_grad.formula) # Analytical derivative in LaTeX
mse = forge.get("mse")
mae = forge.get("mae")
custom_loss = 0.7 * mse + 0.3 * mae
print(custom_loss.eval(y=1.0, y_pred=0.8))

A from-scratch automatic differentiation engine for understanding backpropagation.
from neurogebra.core.autograd import Value
x = Value(2.0)
w = Value(-3.0)
b = Value(10.0)  # keeps y = w*x + b positive so ReLU passes gradients through
y = w * x + b
z = y.relu()
z.backward()
print(f"dz/dw = {w.grad}") # 2.0
print(f"dz/dx = {x.grad}")  # -3.0

Prototype in Neurogebra, deploy in production frameworks.
from neurogebra.bridges import to_pytorch, to_tensorflow
# Export to PyTorch
pytorch_model = to_pytorch(model)
# Export to TensorFlow/Keras
tf_model = to_tensorflow(model)

neurogebra/
core/
expression.py # Unified Expression class (symbolic + numerical + trainable)
forge.py # MathForge: central expression hub and search engine
neurocraft.py # NeuroCraft: educational interface with tutorials
autograd.py # Micro autograd engine (Value, Tensor)
trainer.py # Parameter optimization (SGD, Adam)
repository/ # 10 domain modules, 285 expressions
builders/ # ModelBuilder: architecture templates and layer definitions
training/ # EducationalTrainer: training with explanations
logging/ # Training Observatory + Observatory Pro
logger.py # Event-driven multi-level logger
config.py # Preset configurations
monitors.py # Gradient, weight, activation, performance monitors
health_checks.py # Smart diagnostics with recommendations
health_warnings.py # Automated threshold-based health rules
adaptive.py # Adaptive logging with anomaly detection
epoch_summary.py # Per-epoch statistical summarization
tiered_storage.py # NDJSON tiered log files
dashboard.py # HTML dashboard + TensorBoard + W&B bridges
fingerprint.py # Training reproducibility capture
terminal_display.py # Rich colour-coded terminal renderer
formula_renderer.py # Unicode/LaTeX math formula display
image_logger.py # ASCII rendering of images and activations
exporters.py # JSON, CSV, HTML, Markdown exporters
computation_graph.py # Full DAG tracker
tutorials/ # Interactive step-by-step tutorial system
datasets/ # 100+ built-in dataset loaders
bridges/ # Framework converters (PyTorch, TensorFlow, JAX)
viz/ # Visualization tools (Matplotlib, Plotly)
utils/ # Helpers and explanation engine
Full documentation is available at neurogebra.readthedocs.io.
| Section | Description |
|---|---|
| Getting Started | Installation, first program, how it works |
| Python Refresher | Python basics, NumPy, data handling |
| ML Fundamentals | What is ML, types, workflow, math |
| Tutorials | MathForge, expressions, activations, losses, training, autograd |
| Advanced Topics | Custom expressions, framework bridges, Observatory Pro |
| Projects | Linear regression, image classifier, neural network from scratch |
| API Reference | Complete API documentation |
Contributions are welcome. See CONTRIBUTING.md for guidelines.
git clone https://github.com/fahiiim/NeuroGebra.git
cd NeuroGebra
python -m venv venv
venv\Scripts\activate # Windows
# source venv/bin/activate # macOS/Linux
pip install -e ".[dev]"
pytest tests/ -v

Neurogebra is not a competitor to TensorFlow or PyTorch.
TensorFlow and PyTorch are production-grade deep learning frameworks built for training large-scale neural networks on GPUs and TPUs. They are industry standards for model development, deployment, and research at scale. Neurogebra does not attempt to replace, replicate, or compete with them.
What Neurogebra is:
A mathematical formula library with executable, symbolic, and educational capabilities. The closest analogues are Wolfram Mathematica (proprietary, expensive, not Python-native) or manually assembling formulas from SymPy and Wikipedia (no curation, no ML focus, no educational layer). Neurogebra provides a unique combination that does not exist in any single tool today -- a searchable, executable, trainable encyclopedia of the math that powers modern AI, with built-in transparency, diagnostics, and educational features.
MIT License. See LICENSE for details.
Author: Fahim Sarker
GitHub · PyPI · Documentation · Changelog
Built with precision. Designed for understanding.
