# Machine Learning & AI Examples

Natural language prompts for ML/AI training and experimentation.
## Language Models (LLM MCP)

These examples use the LLM MCP server with GPT and Mamba architectures:

- Create a GPT-2 small model and train it on WikiText for 1000 steps
- Build a Mamba state-space model and compare training speed to GPT
- Train a character-level model on Shakespeare and generate sonnets
- Create a custom GPT with 6 layers, 8 heads, and train on TinyStories
- Fine-tune a language model on code completion with a low learning rate
- Analyze attention patterns in a trained transformer to find head specialization
- Compare perplexity between GPT and Mamba on the same validation set
- Train a tokenizer using BPE and analyze the vocabulary coverage
- Generate text with different temperature settings to show diversity vs. coherence (see the sketch after this list)
- Compute memory requirements for GPT-2 XL with gradient checkpointing
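
For intuition about the temperature prompt above, here is a minimal self-contained PyTorch sketch. The logits are invented for illustration (a real model, not hand-written numbers, would produce them); it shows how low temperatures concentrate sampling on the top token while high temperatures flatten the distribution:

```python
import torch

# Hypothetical next-token logits; a real model would produce these.
logits = torch.tensor([2.0, 1.0, 0.5, 0.1])

for temperature in (0.2, 1.0, 2.0):
    probs = torch.softmax(logits / temperature, dim=-1)
    samples = torch.multinomial(probs, num_samples=1000, replacement=True)
    counts = torch.bincount(samples, minlength=len(logits))
    # Low T concentrates mass on the top token; high T spreads it out.
    print(f"T={temperature}: {counts.tolist()}")
```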
## Language Models (General)

- Train a character-level LSTM on Shakespeare and generate sonnets (see the sketch after this list)
- Build a GPT-2 style model with 4 layers and train on code
- Create word embeddings using skip-gram on Wikipedia abstracts
- Train a BERT-tiny for sentiment classification
- Build a seq2seq model for simple translation (numbers to words)
- Train an autoregressive model to complete Python functions
- Create sentence embeddings using contrastive learning
- Build a small T5 model for text summarization
- Train a tokenizer using BPE on a custom corpus
- Fine-tune embeddings for semantic similarity
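
As a starting point for the character-level LSTM prompt, a minimal sketch in plain PyTorch; the single Shakespeare line is a toy stand-in for a real corpus:

```python
import torch
import torch.nn as nn

# Toy corpus: one line stands in for the full Shakespeare text.
text = "To be, or not to be, that is the question."
chars = sorted(set(text))
stoi = {c: i for i, c in enumerate(chars)}

class CharLSTM(nn.Module):
    def __init__(self, vocab, hidden=128):
        super().__init__()
        self.embed = nn.Embedding(vocab, 32)
        self.lstm = nn.LSTM(32, hidden, batch_first=True)
        self.head = nn.Linear(hidden, vocab)

    def forward(self, x):
        h, _ = self.lstm(self.embed(x))
        return self.head(h)

model = CharLSTM(len(chars))
opt = torch.optim.Adam(model.parameters(), lr=3e-3)
ids = torch.tensor([stoi[c] for c in text]).unsqueeze(0)
x, y = ids[:, :-1], ids[:, 1:]  # predict the next character
for step in range(200):
    loss = nn.functional.cross_entropy(model(x).transpose(1, 2), y)
    opt.zero_grad(); loss.backward(); opt.step()
```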
## Transformers from Scratch

- Implement multi-head attention and verify against PyTorch (see the sketch after this list)
- Build a transformer encoder and train on text classification
- Create positional encodings (sinusoidal and learned)
- Implement the transformer decoder with causal masking
- Train a vision transformer (ViT) on CIFAR-10
- Build a BERT-style masked language model
- Implement rotary position embeddings (RoPE)
- Create a mixture-of-experts transformer layer
- Train a sparse attention transformer
- Implement FlashAttention and compare memory usage
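
The sketch below shows one way to verify hand-rolled attention against PyTorch, assuming PyTorch 2.x (where `torch.nn.functional.scaled_dot_product_attention` is available as a reference):

```python
import torch
import torch.nn.functional as F

def manual_attention(q, k, v):
    # Scaled dot-product attention written out by hand.
    scores = q @ k.transpose(-2, -1) / (q.size(-1) ** 0.5)
    return torch.softmax(scores, dim=-1) @ v

# (batch, heads, seq_len, head_dim)
q, k, v = (torch.randn(2, 8, 16, 64) for _ in range(3))
ours = manual_attention(q, k, v)
ref = F.scaled_dot_product_attention(q, k, v)
print(torch.allclose(ours, ref, atol=1e-5))  # expect True
```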
## Computer Vision

- Train ResNet-18 from scratch on CIFAR-10 (see the sketch after this list)
- Build a U-Net for image segmentation
- Train an autoencoder to reconstruct MNIST digits
- Create a GAN to generate faces
- Train YOLO-style object detection on a custom dataset
- Build a Siamese network for one-shot learning
- Train a neural style transfer model
- Create a depth estimation network from single images
- Train a pose estimation model for human keypoints
- Build an image captioning model with attention
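
A minimal sketch of the ResNet-18/CIFAR-10 prompt using torchvision. The dataset downloads to `./data` on first run; real training would add augmentation, a learning-rate schedule, and evaluation:

```python
import torch
import torchvision
import torchvision.transforms as T

# CIFAR-10 downloads to ./data on first run.
train = torchvision.datasets.CIFAR10(
    "./data", train=True, download=True, transform=T.ToTensor()
)
loader = torch.utils.data.DataLoader(train, batch_size=128, shuffle=True)

model = torchvision.models.resnet18(num_classes=10)
opt = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9, weight_decay=5e-4)

model.train()
for x, y in loader:  # one epoch; repeat for real training
    loss = torch.nn.functional.cross_entropy(model(x), y)
    opt.zero_grad(); loss.backward(); opt.step()
```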
## Generative Models

- Train a VAE on MNIST and interpolate in latent space (see the sketch after this list)
- Build a diffusion model for image generation
- Create a flow-based generative model (RealNVP style)
- Train a GAN with spectral normalization
- Build a VQ-VAE for discrete latent codes
- Train an autoregressive image model (PixelCNN)
- Create a conditional GAN for image-to-image translation
- Build a neural ODE for continuous normalizing flows
- Train a score-based generative model
- Create a latent diffusion model for high-res images
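
For the VAE prompt, a compact fully-connected sketch; the random batch stands in for flattened MNIST digits, and the architecture sizes are arbitrary:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class VAE(nn.Module):
    def __init__(self, latent=16):
        super().__init__()
        self.enc = nn.Linear(784, 400)
        self.mu = nn.Linear(400, latent)
        self.logvar = nn.Linear(400, latent)
        self.dec = nn.Sequential(nn.Linear(latent, 400), nn.ReLU(), nn.Linear(400, 784))

    def forward(self, x):
        h = F.relu(self.enc(x))
        mu, logvar = self.mu(h), self.logvar(h)
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()  # reparameterization trick
        return self.dec(z), mu, logvar

def vae_loss(recon_logits, x, mu, logvar):
    # Reconstruction term + KL divergence to the standard normal prior.
    bce = F.binary_cross_entropy_with_logits(recon_logits, x, reduction="sum")
    kld = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
    return bce + kld

model = VAE()
x = torch.rand(64, 784)  # stand-in for a flattened MNIST batch
recon, mu, logvar = model(x)
loss = vae_loss(recon, x, mu, logvar)
```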
## Reinforcement Learning

- Train DQN to play Atari Breakout
- Implement policy gradient (REINFORCE) for CartPole (see the sketch after this list)
- Build an actor-critic agent for continuous control
- Train PPO on MuJoCo environments
- Implement curiosity-driven exploration
- Build a model-based RL agent with world models
- Train multi-agent RL for competitive games
- Implement hindsight experience replay
- Build an offline RL agent from logged data
- Train an agent using human feedback (RLHF style)
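
A minimal REINFORCE sketch for CartPole, assuming the `gymnasium` package provides the environment; the episode count and learning rate are arbitrary:

```python
import torch
import torch.nn as nn
import gymnasium as gym  # assumed dependency

env = gym.make("CartPole-v1")
policy = nn.Sequential(nn.Linear(4, 128), nn.ReLU(), nn.Linear(128, 2))
opt = torch.optim.Adam(policy.parameters(), lr=1e-2)

for episode in range(500):
    obs, _ = env.reset()
    log_probs, rewards, done = [], [], False
    while not done:
        dist = torch.distributions.Categorical(logits=policy(torch.as_tensor(obs)))
        action = dist.sample()
        obs, reward, terminated, truncated, _ = env.step(action.item())
        log_probs.append(dist.log_prob(action))
        rewards.append(reward)
        done = terminated or truncated

    # Discounted returns, normalized to reduce gradient variance.
    G, returns = 0.0, []
    for r in reversed(rewards):
        G = r + 0.99 * G
        returns.insert(0, G)
    returns = torch.tensor(returns)
    returns = (returns - returns.mean()) / (returns.std() + 1e-8)

    loss = -(torch.stack(log_probs) * returns).sum()
    opt.zero_grad(); loss.backward(); opt.step()
```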
## Neural Network Fundamentals

- Visualize what each layer learns in a CNN
- Compute and visualize attention weights in a transformer
- Show gradient flow through a deep network (see the sketch after this list)
- Demonstrate vanishing gradients in RNNs vs. LSTMs
- Visualize the loss landscape around optima
- Compare batch norm, layer norm, and group norm
- Show the effect of dropout at different rates
- Visualize weight initialization strategies
- Demonstrate mode collapse in GAN training
- Show the lottery ticket hypothesis with pruning
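
To see gradient flow concretely, a small sketch that prints per-layer gradient norms in a deep MLP; sigmoid activations are chosen deliberately so the vanishing-gradient effect is visible:

```python
import torch
import torch.nn as nn

# A deliberately deep sigmoid MLP so vanishing gradients are easy to see.
layers = []
for _ in range(20):
    layers += [nn.Linear(64, 64), nn.Sigmoid()]
net = nn.Sequential(*layers, nn.Linear(64, 1))

net(torch.randn(32, 64)).sum().backward()
for i, m in enumerate(net):
    if isinstance(m, nn.Linear):
        # Gradient norms shrink sharply toward the early layers.
        print(f"layer {i:2d} grad norm: {m.weight.grad.norm():.2e}")
```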
## Optimization & Training

- Compare Adam, SGD, and AdamW on the same model
- Implement learning rate warmup and cosine annealing (see the sketch after this list)
- Show the effect of batch size on convergence
- Implement gradient clipping and show its effect
- Compare different weight initialization methods
- Implement mixed precision training
- Show the effect of label smoothing
- Implement early stopping with patience
- Compare different data augmentation strategies
- Implement gradient accumulation for large batches
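
A sketch of linear warmup followed by cosine annealing, implemented with `LambdaLR`; the model, warmup length, and total step count are placeholders:

```python
import math
import torch

model = torch.nn.Linear(10, 1)  # placeholder model
opt = torch.optim.AdamW(model.parameters(), lr=3e-4)

warmup_steps, total_steps = 100, 1000  # arbitrary example values

def lr_lambda(step):
    if step < warmup_steps:
        return step / max(1, warmup_steps)  # linear warmup from 0
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return 0.5 * (1 + math.cos(math.pi * progress))  # cosine decay to 0

sched = torch.optim.lr_scheduler.LambdaLR(opt, lr_lambda)
lrs = []
for step in range(total_steps):
    opt.step()        # backward() would normally happen before this
    sched.step()
    lrs.append(sched.get_last_lr()[0])
```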
## Graph Neural Networks

- Build a GCN for node classification on Cora (see the sketch after this list)
- Train a graph attention network (GAT)
- Implement message passing neural networks
- Build a model for molecular property prediction
- Train a GNN for link prediction
- Implement graph pooling for graph classification
- Build a temporal graph network
- Train a heterogeneous graph neural network
- Implement over-smoothing analysis in deep GNNs
- Build a knowledge graph embedding model
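
The GCN prompt can start from a single layer written in plain PyTorch; a dense adjacency matrix on a 4-node toy ring stands in for Cora:

```python
import torch
import torch.nn as nn

class GCNLayer(nn.Module):
    """One graph convolution: H' = D^{-1/2} (A + I) D^{-1/2} H W."""
    def __init__(self, din, dout):
        super().__init__()
        self.lin = nn.Linear(din, dout, bias=False)

    def forward(self, x, adj):
        a = adj + torch.eye(adj.size(0))  # add self-loops
        d = a.sum(dim=1).pow(-0.5)        # inverse-sqrt degrees
        a_norm = d.unsqueeze(1) * a * d.unsqueeze(0)
        return a_norm @ self.lin(x)

# Toy 4-node ring graph stands in for Cora.
adj = torch.tensor([[0, 1, 0, 1],
                    [1, 0, 1, 0],
                    [0, 1, 0, 1],
                    [1, 0, 1, 0]], dtype=torch.float)
out = GCNLayer(8, 16)(torch.randn(4, 8), adj)
```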
## Time Series & Sequences

- Train an LSTM for stock price prediction
- Build a temporal fusion transformer
- Implement WaveNet-style dilated convolutions
- Train a neural ODE for irregular time series
- Build an attention-based anomaly detector
- Implement N-BEATS for time series forecasting
- Train a transformer for multi-step prediction
- Build a variational RNN for uncertainty estimation
- Implement temporal convolutional networks (see the sketch after this list)
- Train a model for multivariate time series classification
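
For the temporal convolutional network prompt, a sketch of a causal dilated 1-D convolution stack, the building block shared by WaveNet-style models and TCNs; channel count and depth are arbitrary:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CausalConv1d(nn.Module):
    """Dilated 1-D convolution, left-padded so outputs never see the future."""
    def __init__(self, channels, kernel=3, dilation=1):
        super().__init__()
        self.pad = (kernel - 1) * dilation
        self.conv = nn.Conv1d(channels, channels, kernel, dilation=dilation)

    def forward(self, x):  # x: (batch, channels, time)
        return self.conv(F.pad(x, (self.pad, 0)))

# Doubling the dilation each layer grows the receptive field exponentially.
tcn = nn.Sequential(*[CausalConv1d(16, dilation=2 ** i) for i in range(4)])
y = tcn(torch.randn(8, 16, 100))  # output keeps the (8, 16, 100) shape
```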
## Self-Supervised Learning

- Implement SimCLR for image representation learning (see the sketch after this list)
- Train a BYOL model without negative samples
- Build a masked autoencoder (MAE) for vision
- Implement contrastive predictive coding (CPC)
- Train a CLIP-style vision-language model
- Build a self-supervised model for audio
- Implement DINO for self-distillation
- Train a VICReg model with variance regularization
- Build a Barlow Twins model
- Implement SwAV with online clustering
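
A sketch of SimCLR's NT-Xent contrastive loss; the two random batches stand in for projector outputs of two augmented views of the same images:

```python
import torch
import torch.nn.functional as F

def nt_xent(z1, z2, tau=0.5):
    """SimCLR's contrastive loss over two augmented views of the same batch."""
    n = z1.size(0)
    z = F.normalize(torch.cat([z1, z2]), dim=1)  # (2n, d)
    sim = z @ z.t() / tau                        # cosine similarities
    sim = sim.masked_fill(torch.eye(2 * n, dtype=torch.bool), float("-inf"))
    # Row i's positive is the other view of the same image.
    targets = torch.cat([torch.arange(n, 2 * n), torch.arange(n)])
    return F.cross_entropy(sim, targets)

# Random projector outputs stand in for two augmented views.
loss = nt_xent(torch.randn(32, 128), torch.randn(32, 128))
```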
## Model Compression

- Prune a neural network to 90% sparsity (see the sketch after this list)
- Quantize a model to 8-bit integers
- Implement knowledge distillation
- Build a lottery ticket subnetwork
- Apply low-rank factorization to weight matrices
- Implement dynamic neural networks with early exit
- Apply quantization-aware training for deployment
- Perform structured pruning by removing entire filters
- Run neural architecture search for efficient models
- Implement weight sharing for compression
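
For the 90%-sparsity prompt, a sketch using `torch.nn.utils.prune` for global magnitude pruning; the three-layer MLP is a placeholder model:

```python
import torch
import torch.nn as nn
import torch.nn.utils.prune as prune

model = nn.Sequential(nn.Linear(256, 256), nn.ReLU(), nn.Linear(256, 10))

# Global magnitude pruning: zero the 90% smallest weights across all layers.
params = [(m, "weight") for m in model if isinstance(m, nn.Linear)]
prune.global_unstructured(params, pruning_method=prune.L1Unstructured, amount=0.9)

total = sum(m.weight.numel() for m, _ in params)
zeros = sum((m.weight == 0).sum().item() for m, _ in params)
print(f"sparsity: {zeros / total:.1%}")  # ~90.0%
```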
## Interpretability

- Generate saliency maps for image classification (see the sketch after this list)
- Implement integrated gradients attribution
- Build attention visualization for transformers
- Compute SHAP values for a tabular model
- Generate counterfactual explanations
- Implement concept activation vectors (CAVs)
- Build a prototype-based interpretable model
- Analyze neuron activations across layers
- Implement layer-wise relevance propagation
- Generate natural language explanations
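
A sketch of vanilla gradient saliency for the first prompt in this list; the random input stands in for a preprocessed image, and an untrained ResNet-18 stands in for a real classifier:

```python
import torch
import torchvision

# Untrained ResNet-18 and random input stand in for a real model and image.
model = torchvision.models.resnet18(weights=None).eval()
x = torch.randn(1, 3, 224, 224, requires_grad=True)

score = model(x).max()  # logit of the predicted class
score.backward()
saliency = x.grad.abs().max(dim=1).values  # (1, 224, 224) per-pixel heatmap
```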