
Complete Step-by-Step Guide: Building Your First Machine Learning Project in VS Code with PARAM Utkarsh Supercomputer

This comprehensive guide will walk you through building a simple machine learning project from scratch using VS Code
and running it on the PARAM Utkarsh supercomputer.

Prerequisites
Before starting, ensure you have:

Access to PARAM Utkarsh supercomputer
VS Code installed on your local machine
Basic knowledge of Python programming
SSH client (MobaXterm or PuTTY)

Step 1: Setting Up Your Local Development Environment

Install Required Software


1. Install VS Code: Download from https://code.visualstudio.com
2. Install Python Extension: In VS Code, install the Python extension by Microsoft
3. Install SSH Client: Download MobaXterm (recommended) or PuTTY

Create Project Directory

mkdir my_first_ml_project
cd my_first_ml_project

Step 2: Connecting to PARAM Utkarsh Supercomputer

SSH Connection

ssh -X username@<login-node-address>

Replace username with your actual username and <login-node-address> with the PARAM Utkarsh login node address
The -X flag enables X11 forwarding for graphical applications
Enter the captcha (case-sensitive) and your password when prompted
Check Available Resources
Once logged in, check the system information displayed in the terminal. You'll see:

Total compute nodes: 156
CPU nodes: 107
High Memory nodes: 39
GPU accelerated nodes: 10

Step 3: Setting Up Python Environment on PARAM Utkarsh

Load Required Modules

# Check available modules
module avail

# Load Python with TensorFlow (recommended for ML projects)
module load anaconda3/tensorflow

# Alternative options:
# module load anaconda3/anaconda3
# module load anaconda3/pytorch

Verify Python Installation

python3 -V
pip list

Create Virtual Environment

# Create virtual environment in your project directory
python -m venv ml_project_env

# Activate virtual environment
source ml_project_env/bin/activate

# Verify virtual environment is active
which python
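Beyond checking `which python`, you can confirm from inside the interpreter that the virtual environment is active. A minimal sketch using only the standard library (the helper name `in_virtualenv` is illustrative):

```python
# In a virtual environment, sys.prefix points at the venv directory while
# sys.base_prefix still points at the base interpreter installation.
import sys

def in_virtualenv():
    """Return True if running inside a virtual environment."""
    return sys.prefix != sys.base_prefix

if __name__ == "__main__":
    print("Virtual environment active:", in_virtualenv())
```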

Step 4: Install Required Python Packages


Create requirements.txt file

cat > requirements.txt << EOF
numpy>=1.19.0
pandas>=1.1.0
matplotlib>=3.3.0
scikit-learn>=0.23.0
seaborn>=0.11.0
EOF

Install Packages

pip install -r requirements.txt
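After installation finishes, it is worth confirming programmatically that each dependency resolved. A small sketch using the standard-library importlib.metadata (the helper name `installed_version` is illustrative):

```python
# Report which of the project's dependencies are installed and at what
# version; returns None for packages pip could not find.
from importlib import metadata

def installed_version(package_name):
    """Return the installed version string for a package, or None if missing."""
    try:
        return metadata.version(package_name)
    except metadata.PackageNotFoundError:
        return None

if __name__ == "__main__":
    for pkg in ["numpy", "pandas", "matplotlib", "scikit-learn", "seaborn"]:
        print(f"{pkg}: {installed_version(pkg) or 'NOT INSTALLED'}")
```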

Step 5: Create Your First ML Project - Iris Classification

Create the main Python script

# iris_classification.py - Iris Flower Classification Project

# Step 1: Import Required Libraries
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.model_selection import train_test_split, cross_val_score, StratifiedKFold
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.svm import SVC
from sklearn.neighbors import KNeighborsClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix
from sklearn.datasets import load_iris
import warnings
warnings.filterwarnings('ignore')

print("Starting Iris Classification Project...")
print("=" * 50)

# Step 2: Load and Explore the Dataset
def load_and_explore_data():
    """Load the Iris dataset and perform initial exploration"""
    print("Loading Iris dataset...")

    # Load the dataset
    iris = load_iris()
    df = pd.DataFrame(iris.data, columns=iris.feature_names)
    df['target'] = iris.target
    df['species'] = df['target'].map({0: 'setosa', 1: 'versicolor', 2: 'virginica'})

    print(f"Dataset shape: {df.shape}")
    print(f"Features: {list(iris.feature_names)}")
    print(f"Target classes: {list(iris.target_names)}")
    print("\nFirst 5 rows:")
    print(df.head())

    print("\nDataset Info:")
    df.info()

    print("\nStatistical Summary:")
    print(df.describe())

    print("\nClass Distribution:")
    print(df['species'].value_counts())

    return df, iris

# Step 3: Data Visualization
def visualize_data(df):
    """Create visualizations to understand the data better"""
    print("\nCreating visualizations...")

    # Set up the plotting style
    plt.style.use('default')
    fig, axes = plt.subplots(2, 2, figsize=(12, 10))

    # Box plots for each feature
    features = df.columns[:-2]  # Exclude target and species columns
    for i, feature in enumerate(features):
        row, col = i // 2, i % 2
        sns.boxplot(data=df, x='species', y=feature, ax=axes[row, col])
        axes[row, col].set_title(f'{feature} by Species')

    plt.tight_layout()
    plt.savefig('iris_boxplots.png', dpi=150, bbox_inches='tight')
    print("Box plots saved as 'iris_boxplots.png'")

    # Correlation heatmap
    plt.figure(figsize=(10, 8))
    correlation_matrix = df.iloc[:, :-2].corr()
    sns.heatmap(correlation_matrix, annot=True, cmap='coolwarm', center=0)
    plt.title('Feature Correlation Heatmap')
    plt.tight_layout()
    plt.savefig('correlation_heatmap.png', dpi=150, bbox_inches='tight')
    print("Correlation heatmap saved as 'correlation_heatmap.png'")

    # Pair plot (sns.pairplot creates its own figure)
    pair_grid = sns.pairplot(df, hue='species', diag_kind='hist')
    pair_grid.savefig('iris_pairplot.png', dpi=150, bbox_inches='tight')
    print("Pair plot saved as 'iris_pairplot.png'")

# Step 4: Prepare Data for Machine Learning
def prepare_data(df):
    """Prepare features and target variables"""
    print("\nPreparing data for machine learning...")

    # Separate features and target
    X = df.iloc[:, :-2]  # All columns except target and species
    y = df['target']

    # Split the data
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=42, stratify=y
    )

    # Scale the features
    scaler = StandardScaler()
    X_train_scaled = scaler.fit_transform(X_train)
    X_test_scaled = scaler.transform(X_test)

    print(f"Training set size: {X_train.shape[0]}")
    print(f"Test set size: {X_test.shape[0]}")

    return X_train_scaled, X_test_scaled, y_train, y_test, scaler

# Step 5: Train Multiple Models
def train_models(X_train, y_train):
    """Train multiple machine learning models"""
    print("\nTraining multiple machine learning models...")

    # Define models
    models = {
        'Logistic Regression': LogisticRegression(random_state=42),
        'Decision Tree': DecisionTreeClassifier(random_state=42),
        'Random Forest': RandomForestClassifier(n_estimators=100, random_state=42),
        'SVM': SVC(random_state=42),
        'KNN': KNeighborsClassifier(n_neighbors=5),
        'Naive Bayes': GaussianNB()
    }

    # Train and evaluate models using cross-validation
    results = {}
    cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)

    for name, model in models.items():
        cv_scores = cross_val_score(model, X_train, y_train, cv=cv, scoring='accuracy')
        results[name] = {
            'model': model,
            'cv_mean': cv_scores.mean(),
            'cv_std': cv_scores.std(),
            'cv_scores': cv_scores
        }
        print(f"{name}: {cv_scores.mean():.4f} (+/- {cv_scores.std() * 2:.4f})")

    return results

# Step 6: Select Best Model and Make Predictions
def evaluate_best_model(results, X_train, X_test, y_train, y_test):
    """Select the best model and evaluate on test set"""
    print("\nSelecting best model and making predictions...")

    # Find the best model
    best_model_name = max(results.keys(), key=lambda name: results[name]['cv_mean'])
    best_model = results[best_model_name]['model']

    print(f"Best model: {best_model_name}")
    print(f"Cross-validation accuracy: {results[best_model_name]['cv_mean']:.4f}")

    # Train the best model on the full training set
    best_model.fit(X_train, y_train)

    # Make predictions
    y_pred = best_model.predict(X_test)

    # Evaluate on test set
    test_accuracy = accuracy_score(y_test, y_pred)
    print(f"Test set accuracy: {test_accuracy:.4f}")

    print("\nClassification Report:")
    print(classification_report(y_test, y_pred))

    print("\nConfusion Matrix:")
    cm = confusion_matrix(y_test, y_pred)
    print(cm)

    # Plot confusion matrix
    plt.figure(figsize=(8, 6))
    sns.heatmap(cm, annot=True, fmt='d', cmap='Blues')
    plt.title(f'Confusion Matrix - {best_model_name}')
    plt.ylabel('True Label')
    plt.xlabel('Predicted Label')
    plt.savefig('confusion_matrix.png', dpi=150, bbox_inches='tight')
    print("Confusion matrix saved as 'confusion_matrix.png'")

    return best_model, best_model_name, test_accuracy

# Step 7: Save Results
def save_results(results, best_model_name, test_accuracy):
    """Save model results to a file"""
    print("\nSaving results...")

    with open('model_results.txt', 'w') as f:
        f.write("Iris Classification Project Results\n")
        f.write("=" * 40 + "\n\n")

        f.write("Cross-validation Results:\n")
        for name, result in results.items():
            f.write(f"{name}: {result['cv_mean']:.4f} (+/- {result['cv_std'] * 2:.4f})\n")

        f.write(f"\nBest Model: {best_model_name}\n")
        f.write(f"Test Accuracy: {test_accuracy:.4f}\n")

    print("Results saved to 'model_results.txt'")

# Main execution function
def main():
    """Main function to run the entire ML pipeline"""
    print("Starting Machine Learning Pipeline...")

    # Step 1: Load and explore data
    df, iris = load_and_explore_data()

    # Step 2: Create visualizations
    visualize_data(df)

    # Step 3: Prepare data
    X_train, X_test, y_train, y_test, scaler = prepare_data(df)

    # Step 4: Train models
    results = train_models(X_train, y_train)

    # Step 5: Evaluate best model
    best_model, best_model_name, test_accuracy = evaluate_best_model(
        results, X_train, X_test, y_train, y_test
    )

    # Step 6: Save results
    save_results(results, best_model_name, test_accuracy)

    print("\n" + "=" * 50)
    print("Machine Learning Project Completed Successfully!")
    print("=" * 50)

# Run the project
if __name__ == "__main__":
    main()

Step 6: Create SLURM Job Script

Create slurm_cpu.sh file for CPU execution

cat > slurm_cpu.sh << 'EOF'
#!/bin/sh
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=2
# Walltime in HH:MM:SS (adjust as needed)
#SBATCH --time=01:00:00
#SBATCH --job-name=iris_ml_cpu
#SBATCH --error=job.%J.err
#SBATCH --output=job.%J.out
#SBATCH --partition=standard

# Load required modules
module load anaconda3/tensorflow

# Change to project directory
cd $HOME/my_first_ml_project

# Activate virtual environment
source ml_project_env/bin/activate

# Run the Python program
python iris_classification.py

# Deactivate virtual environment
deactivate
EOF

Create slurm_gpu.sh file for GPU execution (if needed)

cat > slurm_gpu.sh << 'EOF'
#!/bin/sh
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=2
# Walltime in HH:MM:SS (adjust as needed)
#SBATCH --time=01:00:00
#SBATCH --job-name=iris_ml_gpu
#SBATCH --error=job.%J.err
#SBATCH --output=job.%J.out
#SBATCH --partition=gpu
#SBATCH --gres=gpu:1

# Load CUDA and required modules
module load cuda/11.0
module load anaconda3/tensorflow
export CUDA_VISIBLE_DEVICES=0

# Change to project directory
cd $HOME/my_first_ml_project

# Activate virtual environment
source ml_project_env/bin/activate

# Run the Python program
python iris_classification.py

# Deactivate virtual environment
deactivate
EOF

Step 7: Submit and Monitor Your Job

Submit the job

# For CPU execution
sbatch slurm_cpu.sh

# For GPU execution (if applicable)
sbatch slurm_gpu.sh

Monitor job status

# Check your running jobs
squeue --me

# Check job details
scontrol show job <job_id>

# View output files
tail -f job.<job_id>.out
tail -f job.<job_id>.err
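If you submit jobs often, the polling loop above can be automated from Python. A hedged sketch: the status command is passed in as a parameter (on PARAM Utkarsh it would be `["squeue", "--me"]`), and the helper names are illustrative:

```python
# Poll a queue-status command until the given job id no longer appears
# in its output, i.e. the job has left the queue.
import subprocess
import time

def job_listed(status_cmd, job_id):
    """Run the status command and check whether job_id appears in its output."""
    out = subprocess.run(status_cmd, capture_output=True, text=True).stdout
    return str(job_id) in out

def wait_for_job(status_cmd, job_id, poll_seconds=30, max_polls=1000):
    """Block until the job disappears from the queue listing."""
    for _ in range(max_polls):
        if not job_listed(status_cmd, job_id):
            return True
        time.sleep(poll_seconds)
    return False
```

For example, `wait_for_job(["squeue", "--me"], 12345)` would return once job 12345 finishes or is cancelled.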

Step 8: Retrieve and View Results

Check output files

# List generated files
ls -la

# View results
cat model_results.txt

# Transfer files to local machine using scp
scp username@<login-node-address>:~/my_first_ml_project/*.png ./local_directory/
scp username@<login-node-address>:~/my_first_ml_project/model_results.txt ./local_directory/

Step 9: Advanced Project Enhancements

Create a more complex project structure

# Create organized directory structure
mkdir -p data src notebooks results models

# Move files to appropriate directories
mv iris_classification.py src/
mv *.png results/
mv model_results.txt results/

Add configuration file

# config.py

class Config:
    # Data settings
    TEST_SIZE = 0.2
    RANDOM_STATE = 42

    # Model settings
    CV_FOLDS = 5

    # File paths
    RESULTS_DIR = 'results'
    MODELS_DIR = 'models'

    # Plotting settings
    FIGURE_SIZE = (12, 10)
    DPI = 150
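With a models/ directory in place, the trained best model can be persisted so it does not need retraining each run. A minimal sketch using the standard-library pickle module (scikit-learn estimators are picklable; the helper names `save_model` and `load_model` are illustrative):

```python
# Save and reload a trained model under the models/ directory.
import os
import pickle

def save_model(model, name, models_dir="models"):
    """Pickle a model to models_dir/<name>.pkl and return the path."""
    os.makedirs(models_dir, exist_ok=True)
    path = os.path.join(models_dir, f"{name}.pkl")
    with open(path, "wb") as f:
        pickle.dump(model, f)
    return path

def load_model(name, models_dir="models"):
    """Reload a previously pickled model."""
    path = os.path.join(models_dir, f"{name}.pkl")
    with open(path, "rb") as f:
        return pickle.load(f)
```

In practice many projects use joblib for scikit-learn models (it handles large NumPy arrays more efficiently), but pickle needs no extra dependency.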

Step 10: Best Practices and Tips

Version Control

# Initialize git repository
git init
git add .
git commit -m "Initial ML project setup"

Error Handling and Logging

# Add to your iris_classification.py
import logging

# Set up logging
logging.basicConfig(
    level=logging.INFO,
    format='%(asctime)s - %(levelname)s - %(message)s',
    handlers=[
        logging.FileHandler('ml_project.log'),
        logging.StreamHandler()
    ]
)

logger = logging.getLogger(__name__)

Performance Monitoring

# Add timing to your functions


import time
from functools import wraps

def timing_decorator(func):
    @wraps(func)
    def wrapper(*args, **kwargs):
        start_time = time.time()
        result = func(*args, **kwargs)
        end_time = time.time()
        print(f"{func.__name__} took {end_time - start_time:.2f} seconds")
        return result
    return wrapper

# Apply to your functions
@timing_decorator
def train_models(X_train, y_train):
    ...  # existing code

Troubleshooting Common Issues

Connection Issues
Ensure you have the correct username and password
Check if the cluster is under maintenance
Verify network connectivity

Module Loading Issues

# Clear all modules and reload
module purge
module load anaconda3/tensorflow

Virtual Environment Issues

# If virtual environment creation fails
rm -rf ml_project_env
python -m venv ml_project_env --system-site-packages

Job Submission Issues
Check partition availability: sinfo
Verify resource requests don't exceed limits
Ensure scripts have execute permissions: chmod +x slurm_cpu.sh

Memory Issues
Monitor memory usage in your script
Use batch processing for large datasets
Request more memory in the SLURM script: #SBATCH --mem=8G
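Monitoring memory from inside the script can be sketched with the standard-library tracemalloc module, which tracks peak Python allocations around a block of work (the wrapper name `measure_peak_memory` is illustrative):

```python
# Measure the peak memory allocated by Python while running a function.
import tracemalloc

def measure_peak_memory(func, *args, **kwargs):
    """Run func and return (result, peak bytes allocated during the call)."""
    tracemalloc.start()
    try:
        result = func(*args, **kwargs)
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return result, peak

if __name__ == "__main__":
    data, peak = measure_peak_memory(lambda: [0.0] * 1_000_000)
    print(f"Peak allocation: {peak / 1e6:.1f} MB")
```

Note that tracemalloc only sees Python-level allocations; memory held inside NumPy or TensorFlow buffers is better checked with system tools such as `sstat` or `top`.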

Next Steps
1. Expand the project: Try different datasets, algorithms, or preprocessing techniques
2. Hyperparameter tuning: Use GridSearchCV or RandomizedSearchCV
3. Feature engineering: Create new features or select the most important ones
4. Deploy the model: Create a simple web service or API
5. Experiment with deep learning: Use TensorFlow or PyTorch for neural networks
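To make the hyperparameter-tuning step concrete, here is a dependency-free sketch of what GridSearchCV automates: trying every combination in a parameter grid and keeping the best-scoring one. The `score_fn` stands in for a cross-validation score, and all names are illustrative:

```python
# Exhaustive grid search: evaluate a scoring function at every combination
# of hyperparameter values and return the best combination found.
from itertools import product

def grid_search(param_grid, score_fn):
    """param_grid maps parameter names to candidate values;
    score_fn(**params) returns a score to maximize."""
    names = list(param_grid)
    best_params, best_score = None, float("-inf")
    for values in product(*(param_grid[n] for n in names)):
        params = dict(zip(names, values))
        score = score_fn(**params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score
```

With scikit-learn, `GridSearchCV(RandomForestClassifier(), param_grid, cv=5)` performs the same search with proper cross-validation and refitting built in.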

Conclusion
You have successfully created and run your first machine learning project on the PARAM Utkarsh supercomputer! This
project covers the essential steps of a typical ML workflow:

Data loading and exploration
Data visualization
Model training and comparison
Model evaluation and selection
Result analysis and saving

The combination of VS Code for development and PARAM Utkarsh for execution provides a powerful environment for
machine learning experimentation and research.