diff --git a/README.md b/README.md index b0bd795f..0e8e9595 100644 --- a/README.md +++ b/README.md @@ -2,53 +2,58 @@ Node and channel metrics for neural network interpretability, importance, and interventions. -[![Tests](https://github.com/KempnerInstitute/nodelens/actions/workflows/test.yml/badge.svg)](https://github.com/KempnerInstitute/nodelens/actions/workflows/test.yml) -[![Lint](https://github.com/KempnerInstitute/nodelens/actions/workflows/lint.yml/badge.svg)](https://github.com/KempnerInstitute/nodelens/actions/workflows/lint.yml) -[![Documentation](https://github.com/KempnerInstitute/nodelens/actions/workflows/docs.yml/badge.svg)](https://github.com/KempnerInstitute/nodelens/actions/workflows/docs.yml) -[![Release](https://github.com/KempnerInstitute/nodelens/actions/workflows/release.yml/badge.svg)](https://github.com/KempnerInstitute/nodelens/actions/workflows/release.yml) +[![Tests](https://github.com/KempnerInstitute/NodeLens/actions/workflows/test.yml/badge.svg)](https://github.com/KempnerInstitute/NodeLens/actions/workflows/test.yml) +[![Lint](https://github.com/KempnerInstitute/NodeLens/actions/workflows/lint.yml/badge.svg)](https://github.com/KempnerInstitute/NodeLens/actions/workflows/lint.yml) +[![Documentation](https://github.com/KempnerInstitute/NodeLens/actions/workflows/docs.yml/badge.svg)](https://github.com/KempnerInstitute/NodeLens/actions/workflows/docs.yml) [![Python](https://img.shields.io/badge/python-%3E%3D3.8-3776AB?logo=python&logoColor=white)](pyproject.toml) -[![Artifacts](https://img.shields.io/badge/Hugging%20Face-artifacts-ffcc33)](https://huggingface.co/datasets/hsafaai/supernodes-scar-artifacts) [![License](https://img.shields.io/badge/license-MIT-blue.svg)](LICENSE) NodeLens is a research codebase for studying which channels, neurons, and -features matter most for model behavior. The Python package is imported as -`nodelens`. 
- -The repository supports two related workflows: - -- General metric analysis for vision models, transformers, and LLMs. -- Paper-specific releases under `projects/`, including the Supernodes and SCAR - artifact workflow. +features matter most for model behavior. It combines activation capture, +importance metrics, redundancy and information measures, structured +interventions, and report generation in one configuration-driven workflow. The +Python package is imported as `nodelens`. ## What The Code Does -```mermaid -flowchart LR - A[Model + calibration data] --> B[Capture activations and gradients] - B --> C[Compute channel metrics] - C --> D[Identify loss-critical cores] - C --> E[Estimate redundancy and halo structure] - D --> F[Structured pruning and ablation probes] - E --> F - F --> G[Figures, tables, manifests, HF artifacts] +```text +Model + data + | + v +Activation and gradient capture + | + v +Channel and node metrics + |-- activation statistics + |-- Rayleigh quotient and spectral alignment + |-- mutual information, redundancy, and synergy + |-- gradients, curvature, Taylor scores, and loss proxies + | + v +Analysis and interventions + |-- identify outliers or loss-critical cores + |-- cluster channels by metric profile + |-- test ablations, pruning, and sensitivity probes + |-- generate figures, tables, summaries, and manifests ``` Core capabilities: -- Loss-sensitive channel scoring, including SCAR loss-proxy metrics. -- Activation, curvature, Taylor, Rayleigh quotient, and information-theoretic metrics. -- Structured pruning strategies for channel-level model analysis. -- Cluster and halo-style analyses for local redundancy structure. -- Reproducible project folders for paper artifacts and public releases. - -Supported model families include MLPs, CNNs, transformer language models, and -LLM backends through Hugging Face causal language models. +- Metric analysis for MLPs, CNNs, transformers, and Hugging Face causal LMs. 
+- Node and channel scoring with activation, alignment, information, + redundancy, gradient, curvature, and loss-sensitive metrics. +- Structured pruning and ablation tools for testing whether high-scoring + channels are functionally important. +- Clustering and cross-layer analyses for studying local organization, + redundancy, and downstream dependence. +- Project workflows under `projects/` that show how to reproduce concrete + analyses with the shared library. ## Installation ```bash -git clone https://github.com/KempnerInstitute/nodelens.git -cd nodelens +git clone https://github.com/KempnerInstitute/NodeLens.git +cd NodeLens conda env create -f environment.yml conda activate nodelens pip install -e . @@ -62,40 +67,42 @@ pip install -e .[all] ## Quick Start +Run experiments from YAML configs: + ```bash -# Vision model analysis +# Small vision smoke test python scripts/run_experiment.py --config configs/examples/mnist_basic.yaml -# CNN pruning +# CNN pruning and clustering python scripts/run_experiment.py --config configs/vision_prune/resnet18_cifar10_full.yaml -# LLM supernode and SCAR analysis +# LLM channel analysis and structured FFN pruning python scripts/run_experiment.py --config configs/prune_llm/llama3_8b_unified.yaml ``` -Package the public Supernodes and SCAR artifacts: +Use metrics directly from Python: -```bash -python projects/supernodes_scar/scripts/prepare_hf_artifacts.py \ - --output-dir outputs/supernodes_scar_hf \ - --clean +```python +from nodelens.metrics import get_metric, list_metrics + +print(list_metrics()) -python projects/supernodes_scar/scripts/verify_hf_artifacts.py \ - outputs/supernodes_scar_hf +metric = get_metric("rayleigh_quotient") +scores = metric.compute(inputs=layer_inputs, weights=layer_weights) ``` -## Paper Releases +## Project Workflows -Paper-specific release material lives under `projects/`. 
Reusable library code -stays in `src/nodelens`, while each project folder records the exact configs, -artifact layout, reproducibility notes, and release checklist for a paper. +Reusable library code lives in `src/nodelens`. Project folders contain the +configs, small helper scripts, and artifact descriptions needed to reproduce a +specific analysis with the shared package. Current project: -- `projects/supernodes_scar/`: release material for "Supernodes and Halos: - Loss-Critical Hubs in LLM Feed-Forward Layers". +- `projects/supernodes_scar/`: workflow for the Supernodes and SCAR study of + loss-sensitive FFN channels in LLMs. -Derived artifacts for this project are staged on Hugging Face: +The Supernodes and SCAR project also has a public derived-artifact dataset: - `https://huggingface.co/datasets/hsafaai/supernodes-scar-artifacts` @@ -106,28 +113,29 @@ Derived artifacts for this project are staged on Hugging Face: | Activation metrics | `activation_l2_norm`, `activation_variance`, `activation_outlier_index` | | Alignment metrics | `rayleigh_quotient`, `delta_alignment` | | Information metrics | `mutual_information_gaussian`, `pairwise_redundancy_gaussian`, `gaussian_pid_synergy_mmi` | -| SCAR metrics | `scar_activation_power`, `scar_taylor`, `scar_curvature`, `scar_loss_proxy` | +| Loss-sensitive metrics | `scar_activation_power`, `scar_taylor`, `scar_curvature`, `scar_loss_proxy` | | Pruning strategies | `magnitude`, `alignment`, `composite`, `cluster_aware`, `random` | ## Repository Layout ```text -nodelens/ +NodeLens/ |-- configs/ -| |-- prune_llm/ # LLM and SCAR configs -| |-- vision_prune/ # Vision pruning configs -| `-- examples/ # Small example configs -|-- projects/ # Paper-specific release material +| |-- examples/ # Small runnable configs +| |-- prune_llm/ # LLM channel-analysis and pruning configs +| `-- vision_prune/ # Vision pruning and clustering configs +|-- projects/ # Reproducible project workflows |-- scripts/ | |-- run_experiment.py # Main 
experiment entry point -| `-- run_analysis.py # Post-hoc analysis +| `-- run_analysis.py # Post-hoc analysis entry point |-- src/nodelens/ | |-- analysis/ # Visualization, clustering, cascade analysis | |-- experiments/ # Experiment classes -| |-- metrics/ # Importance metrics +| |-- metrics/ # Importance and information metrics | |-- models/ # Model wrappers -| `-- pruning/ # Pruning strategies -|-- tests/ # Unit tests +| |-- pruning/ # Pruning strategies +| `-- services/ # Activation capture, scoring, and mask utilities +|-- tests/ # Unit and integration tests `-- docs/ # Documentation ``` @@ -137,7 +145,8 @@ nodelens/ - [API Reference](docs/api_reference.md) - [LLM Guide](docs/llm_guide.md) - [Metric Consistency](docs/METRIC_CONSISTENCY.md) -- [Supernodes and SCAR Release Notes](projects/supernodes_scar/README.md) +- [Architecture](docs/ARCHITECTURE.md) +- [Supernodes and SCAR Workflow](projects/supernodes_scar/README.md) Build the Sphinx docs locally: @@ -155,8 +164,9 @@ pytest tests/unit/ -v ## Citation -If you use the Supernodes and SCAR release, please cite the paper and the -archived code/artifact versions listed in `CITATION.cff`. +If you use NodeLens, cite the repository metadata in `CITATION.cff`. If you use +a project workflow or public artifact dataset, also cite the associated paper +and artifact record. ## License diff --git a/docs/ARCHITECTURE.md b/docs/ARCHITECTURE.md index 93845cc6..12be76db 100644 --- a/docs/ARCHITECTURE.md +++ b/docs/ARCHITECTURE.md @@ -1,65 +1,101 @@ # Architecture -NodeLens is organized as a reusable library plus paper-specific project -folders. The library code should remain general; each paper folder should only -contain release notes, configs, and artifact packaging scripts for that paper. 
- -```mermaid -flowchart TB - subgraph Library[src/nodelens] - M[metrics] - P[pruning] - E[experiments] - A[analysis] - S[services] - end - - subgraph Inputs[Inputs] - C[configs] - D[calibration data] - N[model checkpoints] - end - - subgraph Projects[projects] - R[supernodes_scar] - end - - C --> E - D --> S - N --> S - S --> M - M --> P - M --> A - P --> E - A --> E - E --> R - R --> H[Hugging Face artifact bundle] +NodeLens is organized around a reusable Python package and a small set of +configuration-driven workflows. The library code stays general; project folders +show how the same components are combined for a concrete study. + +## Data Flow + +```text +YAML config + | + v +Experiment runner + | + |-- loads model and dataset + |-- selects tracked layers + |-- captures activations, gradients, weights, and masks + | + v +Metric and scoring layer + | + |-- activation and norm statistics + |-- Rayleigh quotient and spectral metrics + |-- mutual information, redundancy, and synergy + |-- gradient, Taylor, curvature, and loss-proxy scores + | + v +Analysis and intervention layer + | + |-- clustering and cross-layer analyses + |-- ablation and sensitivity probes + |-- structured pruning strategies + |-- plots, tables, JSON summaries, and reports +``` + +## Package Layout + +```text +src/nodelens/ +|-- analysis/ # Aggregation, clustering, visualization, reports +|-- configs/ # Config loading and validation +|-- core/ # Registries, protocols, base abstractions +|-- dataops/ # Dataset loading and tensor preprocessing +|-- experiments/ # Config-driven experiment classes +|-- infrastructure/ # Logging, distributed helpers, storage utilities +|-- metrics/ # Node and channel metrics +|-- models/ # Model wrappers and model factory helpers +|-- pruning/ # Pruning configs, masks, and strategies +|-- services/ # Activation capture, scoring, and mask operations +`-- training/ # Training and evaluation helpers ``` ## Design Rules -- Keep reusable metrics, services, pruning 
code, and experiment classes in - `src/nodelens/`. -- Keep paper release instructions and packaging scripts in `projects/`. +- Keep reusable metrics, model wrappers, pruning code, and experiment classes + in `src/nodelens/`. +- Keep runnable experiment settings in `configs/`. - Keep generated outputs in `outputs/`, which is ignored by git. -- Do not store model weights, raw datasets, cluster logs, or private paths in - the repository. -- Use project manifests and checksums for anything uploaded as an artifact. - -## Supernodes and SCAR Flow - -```mermaid -sequenceDiagram - participant Config as YAML config - participant Runner as run_experiment.py - participant Capture as activation and gradient capture - participant Metrics as SCAR metrics - participant Prune as structured pruning - participant Artifacts as artifact bundle - - Config->>Runner: choose model, calibration data, sparsity, metrics - Runner->>Capture: collect layer-wise activations and gradients - Capture->>Metrics: compute LP, activation, curvature, and Taylor scores - Metrics->>Prune: protect supernode core and rank remaining channels - Prune->>Artifacts: write results, figures, tables, and manifests +- Keep project folders focused on reproducible usage: configs, helper scripts, + artifact descriptions, and notes that connect a study to the shared library. +- Do not store model weights, raw datasets, checkpoints, scheduler logs, access + tokens, or private absolute paths in the repository. + +## Common Workflows + +### Metric Analysis + +```text +model + dataloader + -> activation capture + -> metric computation + -> per-layer channel scores + -> plots or JSON summaries ``` + +Use this path for activation outliers, Rayleigh quotient scores, information +metrics, redundancy estimates, or loss-proxy ranking. 
+ +### Intervention Analysis + +```text +channel scores + -> masks or ablation sets + -> model evaluation + -> sensitivity curves +``` + +Use this path to test whether a metric identifies channels that matter for +accuracy, perplexity, robustness, pruning, or other downstream behavior. + +### Project Workflow + +```text +shared package + configs + -> experiment outputs + -> aggregation scripts + -> figures, tables, and artifact manifests +``` + +Project folders under `projects/` should make a study easy to inspect without +turning project-specific scripts into core library code. diff --git a/docs/METRIC_CONSISTENCY.md b/docs/METRIC_CONSISTENCY.md index 3396fd47..b2f68a2a 100644 --- a/docs/METRIC_CONSISTENCY.md +++ b/docs/METRIC_CONSISTENCY.md @@ -6,7 +6,7 @@ It exists to prevent subtle drift in: - **Keys** (how values are named/stored), - **Sign conventions** (what "high" means when used for pruning/scoring). -It intentionally avoids referencing any paper draft; the canonical sources are the implementations under `src/nodelens/metrics/` and the experiment pipeline that stores per-layer metric arrays. +It intentionally avoids relying on paper-specific wording; the canonical sources are the implementations under `src/nodelens/metrics/` and the experiment pipeline that stores per-layer metric arrays. --- diff --git a/docs/README.md b/docs/README.md index dc45638e..e3d4e314 100644 --- a/docs/README.md +++ b/docs/README.md @@ -9,7 +9,7 @@ NodeLens is the public project name. The Python package is imported as - [API Reference](api_reference.md) - Core classes and functions - [LLM Guide](llm_guide.md) - LLM-specific analysis and pruning - [Metric Consistency](METRIC_CONSISTENCY.md) - Theory-code verification -- [Architecture](ARCHITECTURE.md) - Library and project-release layout +- [Architecture](ARCHITECTURE.md) - Library layout and data flow ## Configuration @@ -25,7 +25,7 @@ NodeLens is the public project name. 
The Python package is imported as | Type | Description | |------|-------------| | `alignment_analysis` | General alignment metrics for vision models | -| `llm_alignment` | LLM supernode and SCAR analysis | +| `llm_alignment` | LLM channel metrics and structured FFN pruning | | `cluster_analysis` | Metric-space clustering with halo analysis | ### Key Classes diff --git a/docs/api_reference.md b/docs/api_reference.md index a38c99d6..8b49d000 100644 --- a/docs/api_reference.md +++ b/docs/api_reference.md @@ -1,321 +1,138 @@ # API Reference -## Core Classes +This page summarizes the main public APIs. For full experiments, the preferred +entry point is still the YAML runner: -### ModelWrapper +```bash +python scripts/run_experiment.py --config configs/examples/mnist_basic.yaml +``` + +## Model Wrapping -Wraps PyTorch models for activation capture and analysis. +`ModelWrapper` wraps a PyTorch model and captures activations from selected +layers. ```python from nodelens import ModelWrapper -wrapper = ModelWrapper( - model, # PyTorch model - tracked_layers=None, # Layer names or None (auto-detect) - track_inputs=True, - track_outputs=True -) - +wrapper = ModelWrapper(model, tracked_layers=["layer1.0.conv1"]) outputs, activations = wrapper.forward_with_activations(inputs) -weights = wrapper.get_layer_weights(layers=None) -``` - -### BaseMetric - -All metrics inherit from `BaseMetric`: - -```python -metric.requires_inputs # bool -metric.requires_weights # bool -metric.requires_outputs # bool -metric.compute(inputs, weights, outputs, **kwargs) # Returns scores +weights = wrapper.get_layer_weights(layers=["layer1.0.conv1"]) ``` ---- - ## Metrics -### Rayleigh Quotient +Metrics are created through the registry. 
```python -from nodelens.metrics import get_metric +from nodelens.metrics import get_metric, list_metrics -rq = get_metric('rayleigh_quotient', - relative=True, - regularization=1e-6 -) -scores = rq.compute(inputs, weights) -``` +print(list_metrics()) -### Redundancy - -```python -redundancy = get_metric('pairwise_redundancy_gaussian', - mode='output_based', - num_pairs=10, - aggregation='mean' -) -scores = redundancy.compute(outputs=layer_outputs) +rq = get_metric("rayleigh_quotient", relative=True, regularization=1e-6) +scores = rq.compute(inputs=layer_inputs, weights=layer_weights) ``` -### Synergy (Continuous Target) +Common metric families: -```python -from nodelens.metrics.information import SynergyContinuousTarget +| Family | Examples | +|--------|----------| +| Activation | `activation_l2_norm`, `activation_variance`, `activation_outlier_index` | +| Alignment | `rayleigh_quotient`, `delta_alignment` | +| Information | `mutual_information_gaussian`, `pairwise_redundancy_gaussian`, `average_redundancy` | +| Synergy | `gaussian_pid_synergy_mmi`, `synergy_gaussian_mmi` | +| Gradient | `taylor_saliency`, `gradient_alignment` | -synergy = SynergyContinuousTarget( - target_type='logit_margin', # or 'correct_logit', 'logit_pc1' - num_pairs=10, - sampling_strategy='top_k' -) -scores = synergy.compute(outputs=activations, logits=logits, labels=labels) -``` +LLM experiments can also write SCAR-specific score keys such as +`scar_activation_power`, `scar_taylor`, `scar_curvature`, and +`scar_loss_proxy`. Those are produced by the LLM experiment pipeline rather +than by the generic metric registry. ---- +## Clustering And Halo Analysis -## Clustering Analysis - -### MetricSpaceClustering - -Clusters channels in (RQ, Redundancy, Synergy) space. +Metric-space clustering groups channels by score profile. 
```python -from nodelens.analysis.clustering import MetricSpaceClustering, ClusterResult +from nodelens.analysis.clustering import MetricSpaceClustering clusterer = MetricSpaceClustering(n_clusters=4, seed=42) -result = clusterer.fit(rq_scores, redundancy_scores, synergy_scores, layer_name="conv1") - -# Result attributes -result.labels # Cluster assignments [n_channels] -result.centroids # Cluster centers [n_clusters, 3] -result.silhouette # Silhouette score -result.type_mapping # {cluster_id: 'critical'|'redundant'|'synergistic'|'background'} -result.type_counts # {'critical': N, ...} +result = clusterer.fit( + rq_scores, + redundancy_scores, + synergy_scores, + layer_name="conv1", +) ``` -### CrossLayerHaloAnalysis - -Analyzes downstream dependencies via halos. +`CrossLayerHaloAnalysis` estimates downstream influence and local dependency +structure. ```python -from nodelens.analysis.clustering import CrossLayerHaloAnalysis, HaloResult - -halo_analyzer = CrossLayerHaloAnalysis(percentile=90.0, use_activation_weight=True) +from nodelens.analysis.clustering import CrossLayerHaloAnalysis -# Compute influence matrix -influence = halo_analyzer.compute_influence(weights, activations) - -# Find halo for a cluster -halo_indices, rel_influence = halo_analyzer.find_halo(influence, cluster_indices) - -# Analyze halo properties -halo_result = halo_analyzer.analyze_halo( - halo_indices, next_layer_redundancy, next_layer_synergy, - layer_name="layer2", cluster_name="critical" -) +halo = CrossLayerHaloAnalysis(percentile=90.0, use_activation_weight=True) +influence = halo.compute_influence(weights, activations) +halo_indices, rel_influence = halo.find_halo(influence, cluster_indices) ``` -### CascadeAnalysis +## Pruning -Validates importance via channel ablation. +Use the pruning registry for direct scripts. 
```python -from nodelens.analysis import CascadeAnalysis, DamagePrediction - -cascade = CascadeAnalysis(model, test_loader, device="cuda") -baseline = cascade.baseline() +from nodelens.pruning import PruningConfig, get_pruning_strategy, list_pruning_strategies -# Ablate specific channels -result = cascade.ablate(layer_name="conv1", indices=[0, 5, 10]) -# result.accuracy_drop, result.loss_increase +print(list_pruning_strategies()) -# Test by cluster type -results = cascade.by_cluster(layer_name, labels, type_mapping, n_rm=5) +config = PruningConfig(amount=0.5, pruning_mode="low") +strategy = get_pruning_strategy("magnitude", config=config) +mask = strategy.prune(layer, amount=0.5) ``` ---- +For full-model pruning, prefer config-driven experiments under +`configs/vision_prune/` and `configs/prune_llm/`, because they handle layer +selection, dependency constraints, evaluation, and logging. ## Experiments -### ClusterAnalysisExperiment - -General cluster-based analysis for any architecture. +Load configs with `load_config` and instantiate the matching experiment family +when writing custom scripts. ```python -from nodelens.experiments import ClusterAnalysisExperiment, ClusterAnalysisConfig - -config = ClusterAnalysisConfig( - name="resnet18_cifar10_cluster_analysis", - model_name="resnet18", - dataset_name="cifar10", - n_clusters=4, - synergy_target="logit_margin", - halo_percentile=90.0, - device="cuda" +from nodelens.configs.config_loader import load_config +from nodelens.experiments import ( + ClusterAnalysisExperiment, + GeneralAlignmentExperiment, + LLMAlignmentExperiment, ) -experiment = ClusterAnalysisExperiment(config, model, train_loader, test_loader) -results = experiment.run() -experiment.generate_figures() -``` - -### LLMAlignmentExperiment - -LLM-specific analysis with SCAR metrics. 
- -```python -from nodelens.experiments import LLMAlignmentExperiment - -experiment = LLMAlignmentExperiment(config) -experiment.setup() - -scores = experiment.compute_importance_scores(num_samples=100) -scar_scores = experiment.compute_scar_supernode_metrics() -masks = experiment.apply_pruning(sparsity=0.3, metric="scar_loss_proxy", mode="low") -perplexity = experiment.evaluate_perplexity("wikitext", "test", num_samples=100) -``` - -### GeneralAlignmentExperiment +config = load_config("configs/examples/mnist_basic.yaml") -Vision model alignment analysis. +if config.experiment_type == "llm_alignment": + experiment = LLMAlignmentExperiment(config) +elif config.experiment_type == "cluster_analysis": + experiment = ClusterAnalysisExperiment(config) +else: + experiment = GeneralAlignmentExperiment(config) -```python -from nodelens.experiments import GeneralAlignmentExperiment - -experiment = GeneralAlignmentExperiment.from_yaml("config.yaml") results = experiment.run() ``` ---- - -## Visualization - -### Cluster Plots - -```python -from nodelens.analysis.visualization import ( - plot_metric_scatter, - plot_cluster_evolution, - plot_influence_matrix, - plot_cascade_test, - plot_halo_properties -) - -# Metric space scatter (RQ vs Red, RQ vs Syn, Red vs Syn) -plot_metric_scatter(rq, redundancy, synergy, labels, type_mapping, - layer_name, save_path) - -# Cluster composition across depth -plot_cluster_evolution(layer_results, save_path) - -# Cross-cluster influence heatmap -plot_influence_matrix(flow_dict, layer_name, save_path) - -# Cascade damage by cluster type -plot_cascade_test(cascade_results, save_path) -``` - -### UnifiedVisualizer - -```python -from nodelens.analysis.visualization import UnifiedVisualizer - -viz = UnifiedVisualizer() -viz.plot_layer_scores(scores, metric_name, plot_type='violin', save_path='plot.png') -viz.plot_importance_histogram(scores, layer_name, metric_name, plots_dir) -viz.plot_scatter_2d(x, y, xlabel, ylabel, title, save_path) 
-viz.plot_heatmap(data, title, cmap, save_path) -``` +## Output Analysis ---- +Most experiments write: -## Pruning +- `experiment_config.yaml` +- `logs/` +- `results/` +- `figures/` +- `analysis/` -### Quick Pruning +Use `scripts/run_analysis.py` for post-hoc analysis when an experiment has +already produced a results directory. -```python -from nodelens.pruning.orchestrator import prune_with_all_options - -result = prune_with_all_options( - model, - target_sparsity=0.7, - distribution='adaptive_sensitivity', - scoring='composite', - direction='low', - val_loader=val_loader, - eval_fn=evaluate -) +```bash +python scripts/run_analysis.py \ + --results-dir outputs/my_run \ + --output-dir outputs/my_run/analysis_extra ``` - -### Dependency-Aware Pruning - -```python -from nodelens.pruning.dependency_aware import DependencyAwarePruning - -pruner = DependencyAwarePruning(model) -result = pruner.prune(layer_scores={'conv1': scores1}, amount=0.5, mode='low') -``` - ---- - -## Services - -### ActivationCaptureService - -```python -from nodelens.services import ActivationCaptureService - -capture = ActivationCaptureService(model_wrapper) -data = capture.capture(input_batch, layers=['conv1'], include_weights=True) -``` - -### NodeScoringService - -```python -from nodelens.services import NodeScoringService - -scorer = NodeScoringService( - metrics={'rq': rq_metric, 'redundancy': redundancy_metric}, - gamma_redundancy=0.4, - delta_rq=0.3 -) -scores = scorer.compute_composite_scores(inputs, weights, targets) -``` - ---- - -## Configuration Parameters - -### Metric Parameters - -**RayleighQuotient** -- `relative` (bool): Normalize by trace -- `regularization` (float): Diagonal regularization - -**PairwiseRedundancyGaussian** -- `mode` (str): 'output_based' or 'covariance_based' -- `num_pairs` (int): Partners to sample -- `aggregation` (str): 'mean', 'median', 'max', 'sum' - -**SynergyContinuousTarget** -- `target_type` (str): 'logit_margin', 'correct_logit', 'logit_pc1' -- 
`num_pairs` (int): Partner neurons per channel -- `sampling_strategy` (str): 'random', 'top_k', 'all' - -### Clustering Parameters - -**MetricSpaceClustering** -- `n_clusters` (int): Number of clusters (default: 4) -- `seed` (int): Random seed - -**CrossLayerHaloAnalysis** -- `percentile` (float): Halo membership threshold (default: 90.0) -- `use_activation_weight` (bool): Weight influence by activation std - -### Pruning Parameters - -**Strategy**: 'magnitude', 'alignment', 'composite', 'cluster_aware', 'random' - -**Distribution**: 'uniform', 'global_threshold', 'adaptive_sensitivity' - -**Direction**: 'low' (prune unimportant), 'high' (ablation) diff --git a/docs/llm_guide.md b/docs/llm_guide.md index 18607504..298e791b 100644 --- a/docs/llm_guide.md +++ b/docs/llm_guide.md @@ -1,231 +1,131 @@ # LLM Analysis Guide -Guide for analyzing and pruning large language models. - -## Overview - -The `LLMAlignmentExperiment` class provides tools for: - -- Computing per-neuron importance scores -- SCAR-style second-order metrics -- Structured MLP and attention head pruning -- Supernode detection and protection -- Perplexity evaluation +NodeLens can analyze Hugging Face causal language models at the channel level. +The LLM workflow is designed for activation and gradient capture, FFN channel +metrics, ablation probes, and structured pruning. ## Quick Start ```bash -python scripts/run_experiment.py --config configs/examples/llm_alignment.yaml +python scripts/run_experiment.py --config configs/examples/gpt2_fast_test.yaml +python scripts/run_experiment.py --config configs/prune_llm/llama3_8b_unified.yaml ``` -## Configuration +The GPT-2 config is a small smoke test. The Llama, Mistral, Qwen, and OLMo +configs under `configs/prune_llm/` are larger workflows and may require model +access, GPU memory planning, and local cache setup. + +## What The LLM Workflow Computes + +- FFN activation statistics, including activation magnitude and outlier scores. 
+- Gradient-informed scores such as Taylor, curvature, and SCAR loss proxy. +- Supernode-style protected cores when a config asks for top-scoring channels. +- Halo and cross-layer diagnostics when enabled. +- Structured FFN channel pruning and perplexity evaluation. + +## Example Config Structure ```yaml experiment: - name: "llm_analysis" + name: "llama3_8b_analysis" type: "llm_alignment" + device: "cuda" -model_name: "hf_causal_lm" -model_config: +model: + name: "hf_causal_lm" model_id: "meta-llama/Llama-3.1-8B" - model_backend: "hf" - torch_dtype: "bfloat16" - -alignment_methods: - - "activation_l2_norm" - - "rayleigh_quotient" - -tracked_layers: - - "model.layers.*.mlp.up_proj" - - "model.layers.*.mlp.down_proj" - -do_scar_metrics: true -scar_num_samples: 100 -scar_max_length: 512 + dtype: "bfloat16" + device_map: "auto" + tracked_layers: + - "model.model.layers.*.mlp.up_proj" + - "model.model.layers.*.mlp.gate_proj" + - "model.model.layers.*.mlp.down_proj" + +dataset: + name: "wikitext" + subset: "wikitext-2-raw-v1" + split: "train" + batch_size: 1 + +calibration: + num_samples: 128 + max_length: 2048 + batch_size: 4 + +metrics: + scar: + enabled: true + num_samples: 64 + max_length: 512 supernode: enabled: true - core_fraction: 0.12 + score_metric: "scar_loss_proxy" + core_fraction: 0.01 + halo_fraction: 0.10 protect_core: true -``` - -## Available Metrics - -### Activation Metrics - -| Metric | Description | -|--------|-------------| -| `activation_l2_norm` | L2 norm of activations | -| `activation_variance` | Activation variance | -| `activation_outlier_index` | Outlier detection | - -### SCAR Metrics - -Computed via `compute_scar_supernode_metrics()`: -| Metric | Description | -|--------|-------------| -| `scar_activation_power` | Mean squared activation E[u_i^2] | -| `scar_taylor` | First-order Taylor saliency | -| `scar_curvature` | Rayleigh-style curvature | -| `scar_loss_proxy` | 0.5 x activation_power x curvature | - -## Pruning - -### MLP Pruning - 
-Prunes gate_proj, up_proj (output dims) and down_proj (input dims) together: - -```yaml pruning: enabled: true - algorithms: ["alignment"] - sparsity_levels: [0.1, 0.2, 0.3] - alignment_metric: "scar_loss_proxy" - selection_mode: "low" + ratios: [0.1, 0.3, 0.5] structured: true + dependency_aware: true + algorithms: + - "magnitude" + - "wanda" + - "sparsegpt" + - "scar_loss_proxy" + - "supernode_protection_score" ``` -### Attention Head Pruning +## Metric Families -Prunes entire attention heads by applying shared masks to Q/K/V/O projections. +| Family | Examples | +|--------|----------| +| Activation | `activation_l2_norm`, `activation_variance`, `activation_outlier_index` | +| SCAR | `scar_activation_power`, `scar_taylor`, `scar_curvature`, `scar_loss_proxy` | +| Alignment | `rayleigh_quotient`, `delta_alignment` | +| Information | `mutual_information_gaussian`, `average_redundancy`, `pairwise_redundancy_gaussian` | +| Baselines | `magnitude`, `weight_magnitude`, `wanda`, `sparsegpt` | -### Supernode Protection +## Structured FFN Pruning -Protects high-importance neurons from pruning: +For Llama-style FFNs, structured channel pruning masks the corresponding +intermediate channel across `gate_proj`, `up_proj`, and `down_proj`. This asks a +channel-level question: which full FFN units can be removed while preserving +model quality? -```yaml -supernode: - enabled: true - core_fraction: 0.12 - protect_core: true -``` +Unstructured weight pruning is a different setting. It can be useful as a +compression baseline, but it should be labeled separately from structured +channel pruning. -## Supernode Analysis +## Supernode And Halo Diagnostics -The framework analyzes supernode connections across transformer layers. +When `supernode.enabled` is true, NodeLens ranks channels by the configured +`score_metric` and marks the top `core_fraction` as a protected or analyzed +core. 
The same outputs can be used for ablation, pruning protection, or overlap +analysis with activation-defined outliers. -### Architecture Context (LLaMA FFN) +Halo diagnostics are optional. They measure local write-overlap and redundancy +around the high-scoring core and are useful when the question is whether +neighboring non-core channels behave differently from other channels. -``` -input(4096) -> gate_proj/up_proj(14336) -> down_proj -> output(4096) -> next layer - up up - INTERMEDIATE neurons OUTPUT to residual stream - (supernodes identified) (cross-layer analysis) -``` +## Memory Notes -### Analysis Workflow +- Use `batch_size: 1` for very large models. +- Use `device_map: "auto"` when model parallelism is available. +- Use `torch_dtype: "bfloat16"` or `"float16"` when supported. +- Reduce `calibration.num_samples`, `calibration.max_length`, or SCAR sample + counts for smoke tests. -1. **Compute metrics** on intermediate neurons (14336 dim) using the selected `score_metric` -2. **Identify supernodes** as top neurons by the metric (e.g., top 1%) -3. **Trace outgoing weights** from supernodes through `down_proj` -4. 
**Cross-layer analysis** (optional): Analyze next layer's input neurons +## Outputs -### Configuration +LLM runs usually write: -```yaml -supernode: - enabled: true - - # Supernode identification (in intermediate dimension) - score_metric: "scar_activation_power" # Options: scar_activation_power, scar_taylor, - # scar_loss_proxy, rayleigh_quotient, - # mutual_information, activation_l2_norm - core_fraction: 0.01 # Top 1% as supernodes - protect_core: true # Protect during pruning - - # Cross-layer analysis - cross_layer_analysis: true # Enable next-layer analysis - follower_fraction: 0.10 # Top 10% by weight from supernodes - - compute_metrics: - - "activation" - - "rayleigh_quotient" - - "mutual_information" - - "redundancy" - - compare_by_connection: true # Compare high vs low connected neurons - - # Target layers (optional) - # - If not specified: uses tracked_layers from main config - # - If empty list []: analyzes ALL layers with SCAR scores - # target_layers: - # - "model.layers.10.mlp.down_proj" - # - "model.layers.15" # Pattern matching -``` - -### Generated Plots - -| Plot | Description | -|------|-------------| -| `supernode_score_dist_*.png` | Distribution of supernode scores with threshold | -| `supernode_outgoing_weights_*.png` | Histogram of weights from supernodes | -| `supernode_influence_*.png` | Influence of supernodes on output neurons | -| `next_layer_correlation_*.png` | Correlation matrix of high-connection neurons | -| `next_layer_redundancy_hist_*.png` | Redundancy distribution (next layer input) | -| `next_layer_rq_hist_*.png` | RQ distribution (next layer input) | -| `next_layer_mi_hist_*.png` | MI distribution (next layer input) | -| `next_layer_rq_vs_mi_*.png` | RQ vs MI scatter (next layer input) | -| `redundancy_comparison_*.png` | High vs low connected neuron comparison | - -### Understanding Cross-Layer Analysis - -The cross-layer analysis traces how supernodes in layer N influence layer N+1: - -1. 
**Supernodes** are identified in the intermediate dimension (14336 neurons inside the FFN) -2. **Outgoing weights** from supernodes are traced through `down_proj` to the hidden dimension (4096) -3. **High-connection neurons** are positions in the hidden dimension that receive large weights from supernodes -4. These positions become **inputs to the next transformer block** -5. Metrics (RQ, MI, redundancy) are computed for these high-connection positions - -## Programmatic Usage - -```python -from nodelens.experiments import LLMAlignmentExperiment - -experiment = LLMAlignmentExperiment(config) -experiment.setup() - -# Compute importance scores -scores = experiment.compute_importance_scores(num_samples=100) - -# Compute SCAR metrics -scar_scores = experiment.compute_scar_supernode_metrics() - -# Apply pruning -masks = experiment.apply_pruning(sparsity=0.3, metric="scar_loss_proxy", mode="low") +- per-layer metric arrays and score summaries +- pruning and ablation results +- perplexity or downstream-task evaluations +- plots, tables, and JSON summaries when enabled -# Evaluate -perplexity = experiment.evaluate_perplexity("wikitext", "test", num_samples=100) -``` - -## Visualization - -```python -from nodelens.analysis.visualization import UnifiedVisualizer - -viz = UnifiedVisualizer() - -# SCAR metrics -viz.plot_scar_layer_scores(scar_scores, metric_name="scar_loss_proxy") -viz.plot_scar_heatmap(scar_scores, metrics=["scar_activation_power", "scar_loss_proxy"]) - -# Importance histograms -viz.plot_importance_histogram(scores, layer_name, metric_name, plots_dir) -``` - -## Memory Considerations - -- Use `batch_size: 1` for large models -- Use `device_map: "auto"` for multi-GPU -- Use `torch_dtype: "bfloat16"` to reduce memory - -## Example Workflow - -```bash -# 1. Compute importance scores -python scripts/run_experiment.py --config configs/examples/llm_alignment.yaml - -# 2. Results saved to results/experiment_YYYYMMDD_HHMMSS/ -# 3. 
Plots generated in results/.../plots/ -``` +Use the copied `experiment_config.yaml` in each output directory to audit the +exact settings for a run. diff --git a/docs/source/api/experiments.rst b/docs/source/api/experiments.rst index 2d21b14d..9fe8ca66 100644 --- a/docs/source/api/experiments.rst +++ b/docs/source/api/experiments.rst @@ -1,11 +1,9 @@ Experiments API Reference ========================= -This section provides detailed documentation for all experiment types available in NodeLens. - -.. contents:: Table of Contents - :local: - :depth: 2 +NodeLens experiments are configuration-driven. The public API centers on the +base experiment classes and the three main experiment families used by the +runner. Base Experiment Classes ----------------------- @@ -15,352 +13,38 @@ Base Experiment Classes :undoc-members: :show-inheritance: -ExperimentConfig -~~~~~~~~~~~~~~~~ +General Alignment Experiments +----------------------------- -.. autoclass:: nodelens.experiments.base.ExperimentConfig +.. automodule:: nodelens.experiments.general_alignment :members: :undoc-members: + :show-inheritance: - **Core Configuration Options:** - - .. attribute:: name - :type: str - - Unique identifier for the experiment - - .. attribute:: model_name - :type: str - - Name of the model architecture to use (e.g., "resnet18", "mlp", "cnn2p2") - - .. attribute:: dataset_name - :type: str - - Dataset to use (e.g., "cifar10", "mnist", "imagenet") - - .. attribute:: metrics - :type: List[str] - - List of metrics to compute. 
Available metrics: - - - ``"rayleigh_quotient"``: Neuron alignment with input variance - - ``"mutual_information"``: Information shared between layers - - ``"pid_shared"``: Shared information (PID) - - ``"pid_unique"``: Unique information per neuron - - ``"pid_synergy"``: Synergistic information - - ``"cka"``: Centered Kernel Alignment - - ``"cca"``: Canonical Correlation Analysis - - ``"weight_cosine_similarity"``: Cosine similarity between weights - - ``"node_redundancy"``: Redundancy between neurons - - .. attribute:: device - :type: str - :default: "cuda" if available else "cpu" - - Device to run experiments on - - .. attribute:: seed - :type: int - :default: 42 - - Random seed for reproducibility - -Progressive Dropout Experiment ------------------------------- +Cluster Analysis Experiments +---------------------------- -.. automodule:: nodelens.experiments.progressive_dropout +.. automodule:: nodelens.experiments.cluster_experiments :members: :undoc-members: :show-inheritance: -.. autoclass:: nodelens.experiments.progressive_dropout.ProgressiveDropoutExperiment - :members: - :undoc-members: - - **Description:** - - This experiment gradually increases dropout rates during evaluation to study how networks - degrade as neurons are progressively removed. It's useful for understanding network - robustness and identifying critical neurons. - - **Key Configuration Options:** - - .. attribute:: dropout_rates - :type: List[float] - :default: [0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9] - - List of dropout rates to evaluate - - .. attribute:: dropout_mode - :type: str - :default: "scaled" - - How to apply dropout: - - - ``"scaled"``: Scale remaining activations by 1/(1-p) - - ``"unscaled"``: No scaling (true dropout) - - .. 
attribute:: pruning_mode - :type: str - :default: "global_joint" - - Pruning strategy: - - - ``"global_joint"``: Prune globally across all layers - - ``"layer_wise"``: Prune each layer independently - - ``"structured"``: Remove entire channels/filters - - .. attribute:: pruning_strategy - :type: str - :default: "low" - - Which neurons to prune: - - - ``"low"``: Remove low-scoring neurons - - ``"high"``: Remove high-scoring neurons - - ``"random"``: Random pruning - - .. attribute:: pruning_metric - :type: str - :default: "rayleigh_quotient" - - Metric to use for importance scoring - - .. attribute:: exclude_classification_layer - :type: bool - :default: True - - Whether to exclude the final classification layer from pruning - - **Example Usage:** - - .. code-block:: python - - from nodelens.experiments import ProgressiveDropoutExperiment - from nodelens.experiments.base import ExperimentConfig - - config = ExperimentConfig( - name="progressive_dropout_resnet", - model_name="resnet18", - dataset_name="cifar10", - metrics=["rayleigh_quotient", "mutual_information"], - dropout_rates=[0.0, 0.2, 0.4, 0.6, 0.8], - dropout_mode="scaled", - pruning_mode="layer_wise", - pruning_strategy="low", - pruning_metric="rayleigh_quotient" - ) - - experiment = ProgressiveDropoutExperiment(config) - results = experiment.run() - - # Results contain: - # - Accuracy at each dropout rate - # - Metric values for remaining neurons - # - Layer-wise statistics - -Experiment Runner ------------------ +LLM Experiments +--------------- -.. automodule:: nodelens.experiments.runner +.. automodule:: nodelens.experiments.llm_experiments :members: :undoc-members: :show-inheritance: -.. autoclass:: nodelens.experiments.runner.ExperimentRunner - :members: - :undoc-members: - - **Description:** - - The ExperimentRunner manages multiple experiments, handling parallel execution, - result aggregation, and resource management. 
- - **Key Features:** - - - Parallel experiment execution - - Automatic result saving and loading - - Progress tracking and logging - - Resource management (GPU allocation) - - Experiment resumption on failure - - **Example Usage:** - - .. code-block:: python - - from nodelens.experiments import ExperimentRunner - from nodelens.experiments.base import ExperimentConfig - - # Define multiple experiments - configs = [] - - # Progressive dropout with different strategies - for strategy in ["low", "high", "random"]: - configs.append(ExperimentConfig( - name=f"progressive_{strategy}", - model_name="resnet18", - dataset_name="cifar10", - metrics=["rayleigh_quotient"], - pruning_strategy=strategy - )) - - # Different dropout rates - for rate in [0.3, 0.5, 0.7]: - configs.append(ExperimentConfig( - name=f"dropout_rate_{rate}", - model_name="resnet18", - dataset_name="cifar10", - dropout_rates=[0.0, rate] - )) - - # Run all experiments - runner = ExperimentRunner( - configs=configs, - results_dir="./results", - parallel=True, - max_workers=4, - gpu_per_worker=0.25 # Share GPUs - ) - - all_results = runner.run() - - # Analyze results - runner.generate_report(output_path="./report.html") - -Advanced Configuration Options ------------------------------- - -Training Configuration -~~~~~~~~~~~~~~~~~~~~~~ - -.. attribute:: train_before_dropout - :type: bool - - Whether to train the model before applying dropout (default: True) - -.. attribute:: training_epochs - :type: int - - Number of epochs to train (default: 100) - -.. attribute:: learning_rate - :type: float - - Initial learning rate (default: 0.1) - -.. attribute:: optimizer - :type: str - - Optimizer to use: "sgd", "adam", "adamw" (default: "sgd") - -.. attribute:: lr_schedule - :type: str - - Learning rate schedule: "cosine", "step", "exponential", "none" (default: "cosine") - -Metric Computation Options -~~~~~~~~~~~~~~~~~~~~~~~~~~ - -.. 
attribute:: metric_configs - :type: Dict[str, Dict] - - Per-metric configuration options (default: {}) - -.. attribute:: scale_by_norm - :type: bool - - Whether to scale metrics by weight norm (default: False) - -.. attribute:: force_cpu_for_large_metric_ops - :type: bool - - Move large operations to CPU to save GPU memory (default: False) - -.. attribute:: metric_batch_size - :type: int - - Batch size for metric computation (default: 1000) - -Logging and Checkpointing -~~~~~~~~~~~~~~~~~~~~~~~~~ - -.. attribute:: checkpoint_dir - :type: str - - Directory for saving checkpoints (default: "./checkpoints") - -.. attribute:: checkpoint_interval - :type: int - - Steps between checkpoints (default: 1000) - -.. attribute:: save_best - :type: bool - - Save best model based on validation accuracy (default: True) - -.. attribute:: wandb_project - :type: Optional[str] - - Weights & Biases project name (default: None) - -.. attribute:: tensorboard_dir - :type: Optional[str] - - TensorBoard logging directory (default: None) - -Distributed Training -~~~~~~~~~~~~~~~~~~~~ - -.. attribute:: distributed - :type: bool - - Enable distributed training (default: False) - -.. attribute:: world_size - :type: int - - Number of distributed processes (default: 1) - -.. attribute:: backend - :type: str - - Distributed backend: "nccl", "gloo" (default: "nccl") - -Result Analysis ---------------- - -All experiments return a standardized results dictionary containing: +Runner +------ -.. code-block:: python +The standard command-line entry point is: - { - "config": ExperimentConfig, # Full configuration - "metrics": { - "metric_name": { - "layer_name": { - "dropout_rate": values - } - } - }, - "accuracy": { - "dropout_rate": accuracy_value - }, - "timing": { - "total_time": seconds, - "metric_computation_time": seconds - }, - "metadata": { - "hostname": str, - "gpu_info": dict, - "timestamp": str - } - } +.. 
code-block:: bash -See Also --------- + python scripts/run_experiment.py --config configs/examples/mnist_basic.yaml -- :doc:`/user_guide/experiments` - User guide for experiments -- :doc:`/api/metrics` - Available metrics documentation -- :doc:`/api/pruning` - Pruning strategies documentation +For Python workflows, load a config and instantiate the matching experiment +class from ``nodelens.experiments``. diff --git a/docs/source/api/index.rst b/docs/source/api/index.rst index 7cbbf12e..470876cb 100644 --- a/docs/source/api/index.rst +++ b/docs/source/api/index.rst @@ -73,14 +73,14 @@ Most Common Classes - :class:`nodelens.experiments.base.ExperimentConfig` - Configure experiments - :class:`nodelens.metrics.RayleighQuotient` - Primary alignment metric - :class:`nodelens.models.ModelWrapper` - Wrap models for analysis -- :class:`nodelens.experiments.ProgressiveDropoutExperiment` - Main pruning experiment +- :class:`nodelens.experiments.GeneralAlignmentExperiment` - General metric experiment +- :class:`nodelens.experiments.ClusterAnalysisExperiment` - Vision clustering and pruning experiment - :class:`nodelens.pruning.strategies.MagnitudePruning` - Standard pruning method Key Functions ~~~~~~~~~~~~~ -- :func:`nodelens.core.get_metric` - Get metric by name -- :func:`nodelens.core.get_experiment` - Get experiment by type -- :func:`nodelens.core.list_metrics` - List available metrics -- :func:`nodelens.infrastructure.configuration.load_config` - Load YAML config -- :func:`nodelens.analysis.load_results` - Load experiment results +- :func:`nodelens.metrics.get_metric` - Get metric by name +- :func:`nodelens.metrics.list_metrics` - List available metrics +- :func:`nodelens.configs.config_loader.load_config` - Load YAML config +- :func:`nodelens.pruning.get_pruning_strategy` - Get pruning strategy by name diff --git a/docs/source/api/pruning.rst b/docs/source/api/pruning.rst index 8fcca18a..b8444995 100644 --- a/docs/source/api/pruning.rst +++ b/docs/source/api/pruning.rst 
@@ -115,25 +115,16 @@ Parallel Strategies :undoc-members: :show-inheritance: -Pruning Experiments -------------------- +Pipeline Helpers +---------------- -.. autoclass:: nodelens.pruning.experiments.ProgressiveDropoutExperiment +.. autoclass:: nodelens.pruning.pipeline.PruningPipelineOptions :members: :undoc-members: :show-inheritance: -.. autoclass:: nodelens.pruning.experiments.CascadingLayerPruningExperiment - :members: - :undoc-members: - :show-inheritance: +.. autofunction:: nodelens.pruning.pipeline.run_pruning_pipeline -.. autoclass:: nodelens.pruning.experiments.LayerIsolatedPruningExperiment - :members: - :undoc-members: - :show-inheritance: - -.. autoclass:: nodelens.pruning.experiments.EigenvectorDropoutExperiment - :members: - :undoc-members: - :show-inheritance: +Experiment-level pruning workflows are run through +``scripts/run_experiment.py`` with configs under ``configs/vision_prune/`` or +``configs/prune_llm/``. diff --git a/docs/source/contributing.rst b/docs/source/contributing.rst index 7745d59a..38b2a4a5 100644 --- a/docs/source/contributing.rst +++ b/docs/source/contributing.rst @@ -104,7 +104,7 @@ Pull Request Guidelines 2. **Description**: Explain what changes you made and why 3. **Tests**: Ensure all tests pass 4. **Documentation**: Update docs if needed -5. **Release notes**: For paper-facing changes, update the relevant file under ``projects/`` +5. **Project docs**: For project-specific behavior, update the relevant file under ``projects/`` Example PR description: diff --git a/docs/source/developer_guide/index.rst b/docs/source/developer_guide/index.rst index 70cdc431..14f58fa3 100644 --- a/docs/source/developer_guide/index.rst +++ b/docs/source/developer_guide/index.rst @@ -1,30 +1,22 @@ Developer Guide =============== -This section contains documentation for developers who want to extend or contribute -to NodeLens. +This section is for developers who want to extend NodeLens. .. 
toctree:: :maxdepth: 2 extensibility - internal/index Overview -------- -NodeLens is designed to be highly extensible. You can add: +NodeLens is designed to be extensible. You can add: -- **Custom Metrics**: Define new per-neuron alignment metrics -- **Custom Analyzers**: Create new analysis pipelines (clustering, halo, etc.) +- **Custom Metrics**: Define new node or channel metrics +- **Custom Analyzers**: Create new analysis pipelines - **Custom Pruners**: Implement new pruning strategies - **Custom Visualizers**: Add new plot types - **Custom Evaluators**: Define new evaluation methods See :doc:`extensibility` for detailed instructions and examples. - -Internal Documentation ----------------------- - -The :doc:`internal/index` section contains documentation for maintainers about -codebase organization and documentation structure. diff --git a/docs/source/developer_guide/internal/index.rst b/docs/source/developer_guide/internal/index.rst index 6fdfe957..bf3a2165 100644 --- a/docs/source/developer_guide/internal/index.rst +++ b/docs/source/developer_guide/internal/index.rst @@ -1,24 +1,17 @@ -Internal Developer Documentation -================================ +Codebase Notes +============== -This section contains internal documentation about the codebase organization and development processes. +This page summarizes the parts of NodeLens that contributors usually need to +understand before adding metrics, experiment types, or pruning strategies. -.. toctree:: - :maxdepth: 1 +Core extension points: - CODEBASE_ORGANIZATION - DOCUMENTATION_OVERVIEW - DOCUMENTATION_STRUCTURE - setup_github_pages +- ``src/nodelens/core/registry.py`` registers metrics, models, and experiments. +- ``src/nodelens/metrics/`` contains metric implementations. +- ``src/nodelens/models/`` wraps PyTorch and Hugging Face models for activation capture. +- ``src/nodelens/pruning/`` contains masks, pruning configs, and strategies. 
+- ``src/nodelens/experiments/`` connects configs, data, models, metrics, and evaluation. +- ``configs/`` contains runnable YAML examples. -Overview --------- - -These documents provide guidance for maintainers and contributors: - -- **Codebase Organization**: Current directory structure and module organization -- **Documentation Overview**: Summary of all documentation created for the framework -- **Documentation Structure**: How documentation is organized across the project -- **GitHub Pages Setup**: Instructions for setting up and maintaining the documentation site - -These are primarily for maintainers and contributors who need to understand the codebase structure and documentation system. +Keep reusable code in ``src/nodelens`` and keep project-specific workflows under +``projects/``. diff --git a/docs/source/examples/index.rst b/docs/source/examples/index.rst index 00397b33..dc9693ed 100644 --- a/docs/source/examples/index.rst +++ b/docs/source/examples/index.rst @@ -1,141 +1,69 @@ Examples and Tutorials ====================== -This section contains examples and tutorials for using NodeLens. +NodeLens examples are primarily configuration-driven. The same entry point, +``scripts/run_experiment.py``, can run small smoke tests, vision pruning jobs, +and LLM channel analyses. -Quick Start Examples --------------------- +Runnable Configs +---------------- -.. toctree:: - :maxdepth: 1 +Small examples: - basic_usage - comprehensive_experiment +.. code-block:: bash -Available Example Scripts -------------------------- + python scripts/run_experiment.py --config configs/examples/mnist_basic.yaml + python scripts/run_experiment.py --config configs/examples/resnet_pruning.yaml + python scripts/run_experiment.py --config configs/examples/gpt2_fast_test.yaml -The ``examples/`` directory contains several demonstration scripts: +Vision pruning and clustering: -1. **quick_demo.py** - Minimal Introduction +.. 
code-block:: bash - - Basic model wrapping and metric computation - - Simple pruning demonstration - - No configuration needed - - Runtime: ~1 minute + python scripts/run_experiment.py --config configs/vision_prune/resnet18_cifar10_full.yaml + python scripts/run_experiment.py --config configs/vision_prune/vgg16_cifar10_full.yaml + python scripts/run_experiment.py --config configs/vision_prune/mobilenetv2_cifar10_full.yaml - .. code-block:: bash +LLM channel analysis and structured FFN pruning: - python examples/quick_demo.py +.. code-block:: bash -2. **standard_alignment_experiment.py** - Complete Workflow + python scripts/run_experiment.py --config configs/prune_llm/llama3_8b_unified.yaml + python scripts/run_experiment.py --config configs/prune_llm/mistral_7b_unified.yaml + python scripts/run_experiment.py --config configs/prune_llm/qwen2_7b_unified.yaml - - Train model on MNIST - - Compute alignment metrics - - Compare pruning strategies - - Generate visualizations - - Runtime: ~5-10 minutes +Common Pattern +-------------- - .. code-block:: bash +Most workflows follow this structure: - python examples/standard_alignment_experiment.py +.. code-block:: text -3. **pruning_strategies_demo.py** - Advanced Pruning + choose a YAML config + -> run scripts/run_experiment.py + -> inspect the timestamped output directory + -> run optional aggregation or plotting scripts - - All pruning modes (low/high/random) - - Parallel pruning execution - - Tensorized GPU operations - - Performance comparisons - - Runtime: ~2-3 minutes - - .. code-block:: bash - - python examples/pruning_strategies_demo.py - -4. **pruning_visualization_demo.py** - Visualization Features - - - Performance plots - - Multi-seed analysis - - Comprehensive comparison grids - - Real pruning demonstrations - - Runtime: ~2 minutes - - .. code-block:: bash - - python examples/pruning_visualization_demo.py - -5. 
**comprehensive_alignment_experiment.py** - Full Framework Demo - - - YAML configuration system - - All models and datasets - - 36+ alignment metrics - - Advanced training options - - Automatic reporting - - Runtime: Varies by configuration - - .. code-block:: bash - - # Quick test - python examples/comprehensive_alignment_experiment.py \ - --config configs/quick_test_config.yaml - - # Full experiment - python examples/comprehensive_alignment_experiment.py \ - --config configs/comprehensive_alignment_config.yaml - -Example Notebooks +Direct Metric Use ----------------- -Interactive Jupyter notebooks are coming soon: - -- **Getting Started Tutorial** - Step-by-step introduction -- **Metrics Deep Dive** - Exploring all available metrics -- **Custom Experiments** - Building your own experiments -- **Analysis Workshop** - Using the analysis tools - -Configuration Examples ----------------------- - -The ``configs/`` directory contains example configurations: - -- **comprehensive_alignment_config.yaml** - Full configuration with all options documented -- **quick_test_config.yaml** - Minimal configuration for testing - -Common Patterns ---------------- - -Loading and Running Experiments -^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ - -.. code-block:: python - - from nodelens.experiments import GeneralAlignmentExperiment - - # From configuration file - experiment = GeneralAlignmentExperiment.from_yaml("config.yaml") - results = experiment.run() - -Computing Metrics on a Model -^^^^^^^^^^^^^^^^^^^^^^^^^^^^ +Metrics can also be used directly when a script already has layer inputs, +weights, outputs, or gradients. .. 
code-block:: python - from nodelens import ModelWrapper, get_metric + from nodelens.metrics import get_metric, list_metrics - wrapped_model = ModelWrapper(model) - metric = get_metric("rayleigh_quotient")() + print(list_metrics()) - # Forward pass - outputs, activations = wrapped_model.forward_with_activations(inputs) + metric = get_metric("rayleigh_quotient") + scores = metric.compute(inputs=layer_inputs, weights=layer_weights) - # Compute metric - scores = metric.compute( - inputs=activations["layer_name_input"], - weights=model.layer.weight - ) +Batch Processing +---------------- -Batch Processing Multiple Metrics -^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ +For workflows that need several metrics over the same captured tensors, use the +batch processor from ``nodelens.dataops.processing``. .. code-block:: python @@ -143,16 +71,28 @@ Batch Processing Multiple Metrics processor = BatchMetricProcessor( metrics=["rayleigh_quotient", "mutual_information_gaussian"], - device="cuda" + device="cuda", ) results = processor.process_dataset(dataloader, model) +Project Workflows +----------------- + +The ``projects/`` directory contains applied workflows that combine configs, +helper scripts, artifact descriptions, and reproduction notes. These folders +are useful when a paper or larger analysis needs more context than a single +YAML file can provide. + +Current project: + +- ``projects/supernodes_scar/``: loss-sensitive FFN channel analysis and + structured pruning for LLMs. + Next Steps ---------- -1. Start with ``quick_demo.py`` to understand the basics -2. Run ``standard_alignment_experiment.py`` for a complete workflow -3. Explore advanced features with the other demos -4. Create your own experiments using ``comprehensive_alignment_experiment.py`` -5. Refer to the :doc:`../user_guide/index` for detailed documentation +- Read the top-level ``README.md`` for the repository overview. +- Read ``docs/usage.md`` for the config-driven workflow. 
+- Browse ``configs/`` to find the closest starting point for a new experiment. +- Use ``projects/`` when reproducing a specific applied study. diff --git a/docs/source/index.rst b/docs/source/index.rst index 1ea6e57a..8c4cb2d3 100644 --- a/docs/source/index.rst +++ b/docs/source/index.rst @@ -10,32 +10,30 @@ Overview The codebase provides tools for: -- Computing alignment metrics between neural representations and task structure -- Implementing and testing pruning strategies on neural networks -- Estimating channel-level loss sensitivity in LLM feed-forward layers -- Evaluating information-theoretic properties of learned representations -- Packaging paper artifacts for public release +- Computing alignment, information, redundancy, activation, and loss-sensitive metrics +- Capturing activations and gradients from vision models, transformers, and LLMs +- Testing metric-defined channels with ablation, pruning, and sensitivity probes +- Running reproducible experiments from YAML configuration files +- Generating plots, tables, JSON summaries, and manifest files Key Features ------------ -- Alignment metrics including Rayleigh quotient, mutual information, and spectral methods -- Multiple pruning strategies: magnitude-based, gradient-based, and alignment-based -- Support for vision models (ResNet, VGG, EfficientNet, ViT) and language models -- Flexible experiment framework with YAML configuration -- Paper-specific release folders under ``projects/`` +- Metrics including Rayleigh quotient, mutual information, redundancy, synergy, + activation statistics, gradient scores, curvature scores, and SCAR loss proxies +- Structured pruning strategies for channel-level model analysis +- Support for vision models and Hugging Face causal language models +- Project workflows under ``projects/`` that show complete applied analyses +- Config-driven entry points for both small smoke tests and large LLM studies Quick Start ----------- -.. code-block:: python +.. 
code-block:: bash - from nodelens.experiments import GeneralAlignmentExperiment - from nodelens.configs.config_loader import load_config - - config = load_config('configs/examples/mnist_basic.yaml') - experiment = GeneralAlignmentExperiment(config) - results = experiment.run() + python scripts/run_experiment.py --config configs/examples/mnist_basic.yaml + python scripts/run_experiment.py --config configs/vision_prune/resnet18_cifar10_full.yaml + python scripts/run_experiment.py --config configs/prune_llm/llama3_8b_unified.yaml .. toctree:: :maxdepth: 2 diff --git a/docs/source/user_guide/experiments.rst b/docs/source/user_guide/experiments.rst index 07d46fa9..afb92b7d 100644 --- a/docs/source/user_guide/experiments.rst +++ b/docs/source/user_guide/experiments.rst @@ -1,299 +1,116 @@ Experiments Guide ================= -This guide covers the different types of experiments available in NodeLens. +NodeLens experiments are usually launched from YAML configs. The runner chooses +the experiment class from the config, loads the model and dataset, computes the +requested metrics, and writes a structured output directory. -Overview --------- +Experiment Types +---------------- -The framework provides several experiment types for analyzing neural network alignment and pruning: +``alignment_analysis`` + General metric analysis for smaller models and vision workflows. -1. **General Alignment Experiment** - Comprehensive alignment analysis with multi-network support -2. **Layer-wise Pruning Experiments** - Analyze pruning effects on individual layers -3. **Global Pruning Experiments** - Apply uniform pruning across all layers -4. **Cascading Pruning Experiments** - Progressive pruning through network layers -5. **Eigenvector-based Pruning** - Use spectral properties for pruning decisions +``cluster_analysis`` + Metric-space clustering, pruning, and halo-style redundancy analysis for + vision models. 
-General Alignment Experiment ----------------------------- +``llm_alignment`` + LLM activation/gradient capture, channel metrics, ablation probes, and + structured FFN pruning. -The main experiment class that supports: - -- Training single or multiple networks -- Computing alignment metrics during and after training -- Applying various pruning strategies -- Comprehensive analysis and visualization - -.. code-block:: python - - from nodelens.experiments import GeneralAlignmentExperiment, GeneralAlignmentConfig - - config = GeneralAlignmentConfig( - experiment_name="mnist_alignment", - dataset_name="mnist", - model_name="mlp", - hidden_sizes=[128, 64], - num_epochs=10, - compute_alignment=True, - alignment_metrics=["rayleigh_quotient", "mutual_information_gaussian"] - ) - - experiment = GeneralAlignmentExperiment(config) - results = experiment.run() - -Multi-Network Analysis -^^^^^^^^^^^^^^^^^^^^^^ - -Train and analyze multiple networks in parallel: - -.. code-block:: python - - config = GeneralAlignmentConfig( - experiment_name="multi_network_study", - num_networks=5, # Train 5 networks - dataset_name="mnist", - model_name="cnn", - num_epochs=20, - compute_alignment=True - ) - - experiment = GeneralAlignmentExperiment(config) - results = experiment.run() - - # Results include statistics across all networks - print(f"Mean accuracy: {results['mean_accuracy']}") - print(f"Std accuracy: {results['std_accuracy']}") - -Pruning Experiments -------------------- - -Layer-wise Pruning -^^^^^^^^^^^^^^^^^^ - -Analyze the effect of pruning individual layers: - -.. 
code-block:: python - - from nodelens.pruning.experiments import LayerIsolatedPruningExperiment, LayerIsolatedConfig - - config = LayerIsolatedConfig( - experiment_name="layer_analysis", - dataset_name="mnist", - model_name="mlp", - hidden_sizes=[128, 64], - pruning_ratios=[0.1, 0.3, 0.5, 0.7, 0.9], - pruning_strategy="magnitude" - ) - - experiment = LayerIsolatedPruningExperiment(config) - results = experiment.run() - -Global Pruning -^^^^^^^^^^^^^^ - -Apply the same pruning rate across all layers: - -.. code-block:: python - - from nodelens.pruning.experiments import GlobalDropoutExperiment, GlobalDropoutConfig - - config = GlobalDropoutConfig( - experiment_name="global_pruning", - dataset_name="cifar10", - model_name="resnet18", - dropout_rates=[0.0, 0.1, 0.3, 0.5, 0.7, 0.9], - dropout_structure="magnitude" # or "random", "gradient" - ) - - experiment = GlobalDropoutExperiment(config) - results = experiment.run() - -Cascading Layer Pruning -^^^^^^^^^^^^^^^^^^^^^^^ - -Progressive pruning that cascades through the network: - -.. code-block:: python - - from nodelens.pruning.experiments import CascadingLayerPruningExperiment, CascadingConfig - - config = CascadingConfig( - experiment_name="cascading_analysis", - dataset_name="mnist", - model_name="mlp", - cascade_direction="forward", # or "backward" - pruning_ratios=[0.1, 0.2, 0.3, 0.4, 0.5] - ) - - experiment = CascadingLayerPruningExperiment(config) - results = experiment.run() - -Eigenvector-based Pruning -^^^^^^^^^^^^^^^^^^^^^^^^^ - -Use eigendecomposition for pruning decisions: - -.. 
code-block:: python - - from nodelens.pruning.experiments import EigenvectorDropoutExperiment, EigenvectorConfig - - config = EigenvectorConfig( - experiment_name="eigenvector_pruning", - dataset_name="mnist", - model_name="mlp", - num_components=10, # Number of eigenvectors to keep - pruning_ratios=[0.1, 0.3, 0.5, 0.7] - ) - - experiment = EigenvectorDropoutExperiment(config) - results = experiment.run() - -Configuration Options ---------------------- - -Common configuration parameters across experiments: - -**Model Configuration:** - -- ``model_name``: "mlp", "cnn", "resnet18", etc. -- ``hidden_sizes``: List of hidden layer sizes (for MLP) -- ``activation``: Activation function ("relu", "tanh", etc.) - -**Training Configuration:** - -- ``num_epochs``: Number of training epochs -- ``batch_size``: Batch size for training -- ``learning_rate``: Learning rate -- ``optimizer``: Optimizer type ("adam", "sgd", etc.) - -**Alignment Configuration:** - -- ``compute_alignment``: Whether to compute alignment metrics -- ``alignment_metrics``: List of metrics to compute -- ``alignment_layers``: Which layers to analyze - -**Pruning Configuration:** - -- ``pruning_strategy``: "magnitude", "gradient", "random", "alignment" -- ``pruning_ratios``: List of pruning ratios to test -- ``structured_pruning``: Whether to use structured pruning - -Running Experiments -------------------- - -From Configuration Files -^^^^^^^^^^^^^^^^^^^^^^^^ +Run From The Command Line +------------------------- .. code-block:: bash - python scripts/run_experiment.py --config configs/my_experiment.yaml - -From Python -^^^^^^^^^^^ - -.. 
code-block:: python + python scripts/run_experiment.py --config configs/examples/mnist_basic.yaml - from nodelens.experiments import create_experiment_from_config - import yaml +Use ``--base-output-dir`` to choose where job directories are written: - # Load configuration - with open("configs/my_experiment.yaml", "r") as f: - config_dict = yaml.safe_load(f) - - # Create and run experiment - experiment = create_experiment_from_config(config_dict) - results = experiment.run() +.. code-block:: bash -Analyzing Results ------------------ + python scripts/run_experiment.py \ + --config configs/vision_prune/resnet18_cifar10_full.yaml \ + --base-output-dir outputs/resnet18_cifar10 -All experiments return a results dictionary containing: +Run From Python +--------------- -- Training metrics (loss, accuracy over time) -- Final model performance -- Alignment metrics (if computed) -- Pruning analysis (for pruning experiments) -- Visualizations and plots +For custom scripts, load a config and instantiate the matching experiment +class directly. .. 
code-block:: python - # Access results - results = experiment.run() - - # Training history - train_loss = results['training_history']['train_loss'] - val_accuracy = results['training_history']['val_accuracy'] + from nodelens.configs.config_loader import load_config + from nodelens.experiments import ( + ClusterAnalysisExperiment, + GeneralAlignmentExperiment, + LLMAlignmentExperiment, + ) - # Alignment metrics - if 'alignment_metrics' in results: - rq_scores = results['alignment_metrics']['rayleigh_quotient'] - mi_scores = results['alignment_metrics']['mutual_information'] + config = load_config("configs/examples/mnist_basic.yaml") - # Pruning results - if 'pruning_results' in results: - for ratio, metrics in results['pruning_results'].items(): - print(f"Pruning {ratio}: Accuracy = {metrics['accuracy']}") + if config.experiment_type == "llm_alignment": + experiment = LLMAlignmentExperiment(config) + elif config.experiment_type == "cluster_analysis": + experiment = ClusterAnalysisExperiment(config) + else: + experiment = GeneralAlignmentExperiment(config) -Visualization -------------- + results = experiment.run() -The framework automatically generates visualizations: +Output Structure +---------------- -- Training curves -- Alignment metric evolution -- Pruning performance plots -- Layer-wise analysis +A typical experiment directory contains: -Plots are saved to the experiment output directory and can be customized through configuration. +.. code-block:: text -Best Practices --------------- + experiment_config.yaml + logs/ + results/ + figures/ + analysis/ -1. **Start Small**: Test with small models and datasets first -2. **Use Checkpointing**: Enable model checkpointing for long experiments -3. **Monitor Memory**: Some alignment metrics are memory-intensive -4. **Reproducibility**: Always set seeds for reproducible results -5. **Incremental Analysis**: Start with few pruning ratios, then refine +The exact files depend on the experiment type. 
LLM runs usually include +per-layer metric scores, pruning summaries, calibration metadata, and +evaluation outputs. Vision workflows often include pruning curves, clustering +diagnostics, and metric visualizations. -Advanced Features +Choosing A Config ----------------- -Custom Metrics -^^^^^^^^^^^^^^ +Start from the closest existing config: -Add custom alignment metrics: +.. list-table:: + :header-rows: 1 -.. code-block:: python - - from nodelens.metrics import register_metric - - @register_metric("my_custom_metric") - def my_metric(model, dataloader, device): - # Implement your metric - return metric_value - -Custom Pruning Strategies -^^^^^^^^^^^^^^^^^^^^^^^^^ + * - Use case + - Configs + * - Small smoke tests + - ``configs/examples/*.yaml`` + * - Vision pruning and clustering + - ``configs/vision_prune/*.yaml`` + * - LLM channel metrics and SCAR runs + - ``configs/prune_llm/*.yaml`` -Implement custom pruning strategies: +When creating a new experiment, change one axis at a time: model, dataset, +tracked layers, metrics, pruning strategy, or evaluation settings. This makes +result differences easier to interpret. -.. code-block:: python - - from nodelens.pruning.strategies import BasePruningStrategy +Result Analysis +--------------- - class MyPruningStrategy(BasePruningStrategy): - def compute_importance_scores(self, model, dataloader): - # Implement importance scoring - return scores +Use ``scripts/run_analysis.py`` for post-hoc analysis when an experiment has +already written results: -Parallel Execution -^^^^^^^^^^^^^^^^^^ - -For multi-network experiments, parallel execution is automatic when ``num_networks > 1``. +.. 
code-block:: bash -See Also --------- + python scripts/run_analysis.py \ + --results-dir outputs/my_run \ + --output-dir outputs/my_run/analysis_extra -- :doc:`configuration` - Detailed configuration options -- :doc:`metrics` - Available alignment metrics -- :doc:`pruning` - Pruning strategies and concepts +Project-specific aggregation scripts live under ``projects/`` or in the +project's public artifact bundle when a study needs extra figure and table +generation logic. diff --git a/docs/source/user_guide/installation.rst b/docs/source/user_guide/installation.rst index 31b2476a..1eb7b050 100644 --- a/docs/source/user_guide/installation.rst +++ b/docs/source/user_guide/installation.rst @@ -18,8 +18,8 @@ Create and activate the conda environment: .. code-block:: bash - git clone - cd nodelens + git clone https://github.com/KempnerInstitute/NodeLens.git + cd NodeLens conda env create -f environment.yml conda activate nodelens @@ -33,8 +33,8 @@ Install directly from source: .. code-block:: bash - git clone - cd nodelens + git clone https://github.com/KempnerInstitute/NodeLens.git + cd NodeLens pip install -e . Verification @@ -70,5 +70,5 @@ Next Steps ---------- - See :doc:`quickstart` for basic usage -- Check the repository ``examples/`` folder for runnable examples +- Browse ``configs/examples/`` for runnable example configs - Read the top-level README for current API entry points diff --git a/docs/source/user_guide/pruning.rst b/docs/source/user_guide/pruning.rst index af3a17fa..fb566c44 100644 --- a/docs/source/user_guide/pruning.rst +++ b/docs/source/user_guide/pruning.rst @@ -1,301 +1,105 @@ Pruning Guide ============= -This guide covers the pruning capabilities in NodeLens, including different strategies and experiment types. +NodeLens uses pruning as both an intervention tool and a compression baseline. 
+The same metric scores used for interpretability can be turned into masks, then +the pruned model can be evaluated to test whether those scores identify +functionally important channels or weights. -Overview --------- +Available Strategies +-------------------- -The framework provides comprehensive pruning capabilities: +Registered strategies include: -- **Multiple Pruning Strategies**: Magnitude, gradient, random, and alignment-based -- **Structured and Unstructured Pruning**: Support for both approaches -- **Various Experiment Types**: Global, layer-wise, cascading, and eigenvector-based +- ``magnitude`` and ``global_magnitude`` +- ``gradient``, ``fisher``, and ``momentum`` +- ``alignment`` and ``global_alignment`` +- ``eigenvector`` +- ``movement`` and ``adaptive_movement`` +- ``random`` and ``bernoulli`` +- LLM baselines such as ``wanda`` and ``sparsegpt`` -Pruning Strategies ------------------- - -The framework includes several pruning strategies in ``nodelens.pruning.strategies``: - -Magnitude-based Pruning -^^^^^^^^^^^^^^^^^^^^^^^ - -Prunes weights or neurons based on their magnitude: - -.. code-block:: python - - from nodelens.pruning.strategies import MagnitudePruning - - strategy = MagnitudePruning() - masks = strategy.compute_masks(model, pruning_ratio=0.5) - -Gradient-based Pruning -^^^^^^^^^^^^^^^^^^^^^^ - -Uses gradient information to determine importance: +List strategies from Python: .. code-block:: python - from nodelens.pruning.strategies import GradientPruning + from nodelens.pruning import list_pruning_strategies - strategy = GradientPruning() - masks = strategy.compute_masks(model, dataloader, pruning_ratio=0.5) + print(list_pruning_strategies()) -Random Pruning -^^^^^^^^^^^^^^ +Use A Strategy Directly +----------------------- -Baseline strategy that randomly prunes connections: +For low-level scripts, create a strategy from the registry: .. 
code-block:: python - from nodelens.pruning.strategies import RandomPruning + from nodelens.pruning import PruningConfig, get_pruning_strategy - strategy = RandomPruning(seed=42) - masks = strategy.compute_masks(model, pruning_ratio=0.5) + config = PruningConfig(amount=0.5, pruning_mode="low") + strategy = get_pruning_strategy("magnitude", config=config) + mask = strategy.prune(layer, amount=0.5) -Alignment-based Pruning -^^^^^^^^^^^^^^^^^^^^^^^ +The exact method signature depends on the strategy. Config-driven experiments +are the safer entry point for full-model pruning because they handle layer +selection, dependency constraints, evaluation, and result logging. -Uses alignment metrics to guide pruning decisions: +Run A Pruning Config +-------------------- -.. code-block:: python +Vision example: - from nodelens.pruning.strategies import AlignmentPruning +.. code-block:: bash - strategy = AlignmentPruning(metric="rayleigh_quotient") - masks = strategy.compute_masks(model, dataloader, pruning_ratio=0.5) + python scripts/run_experiment.py \ + --config configs/vision_prune/resnet18_cifar10_full.yaml -Pruning Experiments -------------------- +LLM example: -Global Pruning -^^^^^^^^^^^^^^ +.. code-block:: bash -Applies the same pruning rate across all layers: - -.. code-block:: python - - from nodelens.pruning.experiments import GlobalDropoutExperiment, GlobalDropoutConfig - - config = GlobalDropoutConfig( - experiment_name="global_pruning_mnist", - dataset_name="mnist", - model_name="mlp", - hidden_sizes=[128, 64], - dropout_rates=[0.0, 0.1, 0.3, 0.5, 0.7, 0.9], - dropout_structure="magnitude" - ) - - experiment = GlobalDropoutExperiment(config) - results = experiment.run() - -Layer-wise Pruning -^^^^^^^^^^^^^^^^^^ - -Analyzes the effect of pruning individual layers: - -.. 
code-block:: python + python scripts/run_experiment.py \ + --config configs/prune_llm/llama3_8b_unified.yaml - from nodelens.pruning.experiments import LayerIsolatedPruningExperiment, LayerIsolatedConfig +Structured And Unstructured Masks +--------------------------------- - config = LayerIsolatedConfig( - experiment_name="layer_analysis", - dataset_name="mnist", - model_name="mlp", - pruning_ratios=[0.1, 0.3, 0.5, 0.7, 0.9], - pruning_strategy="magnitude", - layers_to_prune=["fc1", "fc2"] # Specific layers - ) +Unstructured pruning removes individual weights. It is useful for sparsity +experiments, but it may need sparse kernels to produce wall-clock speedups. - experiment = LayerIsolatedPruningExperiment(config) - results = experiment.run() +Structured pruning removes complete channels, filters, or FFN units. It is +coarser, but it is easier to connect to architecture-level interventions and +hardware-friendly compression. -Cascading Pruning -^^^^^^^^^^^^^^^^^ - -Progressive pruning that cascades through network layers: - -.. code-block:: python - - from nodelens.pruning.experiments import CascadingLayerPruningExperiment, CascadingConfig - - config = CascadingConfig( - experiment_name="cascading_analysis", - dataset_name="mnist", - model_name="mlp", - cascade_direction="forward", - base_pruning_ratio=0.2, - cascade_factor=1.5 # Increase pruning by 50% each layer - ) - - experiment = CascadingLayerPruningExperiment(config) - results = experiment.run() - -Eigenvector-based Pruning -^^^^^^^^^^^^^^^^^^^^^^^^^ - -Uses spectral analysis for pruning: - -.. 
code-block:: python - - from nodelens.pruning.experiments import EigenvectorDropoutExperiment, EigenvectorConfig - - config = EigenvectorConfig( - experiment_name="eigenvector_pruning", - dataset_name="mnist", - model_name="mlp", - num_components=10, - component_selection="top", # or "bottom" - pruning_ratios=[0.1, 0.3, 0.5] - ) - - experiment = EigenvectorDropoutExperiment(config) - results = experiment.run() - -Structured vs Unstructured Pruning ------------------------------------ - -Unstructured Pruning -^^^^^^^^^^^^^^^^^^^^ - -Removes individual weights/connections: - -.. code-block:: python +NodeLens supports both settings through strategy-specific options and YAML +configs. For LLM FFN studies, structured channel pruning is often the relevant +setting because the goal is to ask which whole channels are functionally +important. - config = GlobalDropoutConfig( - structured_pruning=False, # Default - pruning_strategy="magnitude" - ) +Interpreting Results +-------------------- -- **Pros**: Fine-grained control, potentially higher accuracy retention -- **Cons**: Requires sparse matrix support for speedup +Pruning experiments typically report: -Structured Pruning -^^^^^^^^^^^^^^^^^^ +- the requested sparsity or pruning fraction +- the layer or channel groups that were masked +- model performance after applying the mask +- metric summaries for protected, pruned, or retained channels +- optional figures and JSON summaries for downstream analysis -Removes entire neurons/channels/filters: - -.. code-block:: python - - config = GlobalDropoutConfig( - structured_pruning=True, - pruning_strategy="magnitude", - structure_type="neuron" # or "channel", "filter" - ) - -- **Pros**: Direct speedup, hardware-friendly -- **Cons**: Coarser granularity, potentially more accuracy loss - -Analyzing Pruning Results -------------------------- - -The experiments return comprehensive results: - -.. 
code-block:: python - - results = experiment.run() - - # Pruning performance - for ratio in results['pruning_ratios']: - metrics = results['pruning_results'][ratio] - print(f"Pruning {ratio*100}%:") - print(f" Accuracy: {metrics['accuracy']:.2f}%") - print(f" Remaining params: {metrics['remaining_params']}") - - # Layer-wise analysis (for layer-wise experiments) - if 'layer_results' in results: - for layer, data in results['layer_results'].items(): - print(f"\nLayer {layer}:") - print(f" Sensitivity: {data['sensitivity']}") - print(f" Optimal pruning: {data['optimal_ratio']}") - -Visualization -------------- - -The framework automatically generates pruning analysis plots: - -- Accuracy vs pruning ratio curves -- Layer sensitivity heatmaps -- Parameter reduction charts -- Alignment metric evolution - -Custom Pruning Strategies -------------------------- - -Implement custom strategies by extending the base class: - -.. code-block:: python - - from nodelens.pruning.strategies import BasePruningStrategy - - class MyCustomPruning(BasePruningStrategy): - def compute_importance_scores(self, model, dataloader=None): - """Compute importance scores for each parameter.""" - scores = {} - for name, param in model.named_parameters(): - if 'weight' in name: - # Your custom importance calculation - scores[name] = custom_importance(param) - return scores - - def create_masks(self, scores, pruning_ratio): - """Create binary masks from scores.""" - masks = {} - for name, score in scores.items(): - threshold = torch.quantile(score.flatten(), pruning_ratio) - masks[name] = score > threshold - return masks +When comparing pruning methods, keep the pruning granularity fixed. A +structured channel-pruning result should not be treated as directly comparable +to an unstructured weight-pruning result unless the goal is explicitly to +compare different deployment regimes. Best Practices -------------- -1. **Start Conservative**: Begin with small pruning ratios (10-30%) -2. 
**Fine-tune After Pruning**: Allow the model to adapt after pruning -3. **Compare Strategies**: Test multiple strategies on your specific task -4. **Monitor Multiple Metrics**: Don't just track accuracy -5. **Consider Hardware**: Choose structured/unstructured based on deployment - -Common Pitfalls ---------------- - -1. **Pruning Too Aggressively**: Gradual pruning often works better -2. **Ignoring Layer Sensitivity**: Some layers are more critical -3. **Not Fine-tuning**: Models often recover performance with fine-tuning -4. **Wrong Granularity**: Match pruning type to hardware constraints - -Advanced Topics ---------------- - -Iterative Pruning -^^^^^^^^^^^^^^^^^ - -Prune in multiple rounds: - -.. code-block:: python - - config = GlobalDropoutConfig( - iterative_pruning=True, - pruning_schedule=[0.2, 0.4, 0.6], # Cumulative - fine_tune_epochs=5 # Between rounds - ) - -Dynamic Pruning -^^^^^^^^^^^^^^^ - -Adjust pruning during training: - -.. code-block:: python - - config = GlobalDropoutConfig( - dynamic_pruning=True, - initial_sparsity=0.0, - final_sparsity=0.9, - pruning_frequency=100 # Steps - ) - -See Also --------- - -- :doc:`experiments` - Overview of experiment types -- :doc:`metrics` - Alignment metrics for pruning -- :doc:`configuration` - Detailed configuration options +- Start with a small config and verify the output layout before launching a + large run. +- Compare each informed strategy against random and magnitude baselines. +- Record the calibration dataset, evaluation dataset, sparsity, and mask + granularity. +- For LLM runs, keep structured and unstructured baselines clearly labeled. +- Use ablation probes when the goal is scientific interpretation rather than + only compression quality. 
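As a rough illustration of the structured versus unstructured distinction in the revised pruning guide, the sketch below builds both kinds of mask from magnitude scores at the same pruning fraction. It is plain Python and does not use the NodeLens API. The helper names `unstructured_mask` and `structured_mask` are hypothetical, and the weight matrix is made-up data.

```python
from typing import List

Matrix = List[List[float]]
Mask = List[List[int]]


def unstructured_mask(weights: Matrix, amount: float) -> Mask:
    """Mask the `amount` fraction of individual weights with the smallest |w|."""
    # Rank every (row, column) position by absolute weight.
    positions = sorted(
        (abs(w), r, c) for r, row in enumerate(weights) for c, w in enumerate(row)
    )
    n_prune = int(len(positions) * amount)
    mask = [[1] * len(row) for row in weights]
    for _, r, c in positions[:n_prune]:
        mask[r][c] = 0
    return mask


def structured_mask(weights: Matrix, amount: float) -> Mask:
    """Mask whole rows (treated here as output channels) with the smallest L1 norm."""
    norms = sorted((sum(abs(w) for w in row), r) for r, row in enumerate(weights))
    n_prune = int(len(weights) * amount)
    pruned_rows = {r for _, r in norms[:n_prune]}
    return [[0 if r in pruned_rows else 1 for _ in row] for r, row in enumerate(weights)]


# Made-up 4x4 weight matrix: rows stand in for output channels.
weights = [
    [0.9, -0.1, 0.4, 0.05],
    [0.02, 0.03, -0.01, 0.04],
    [-0.7, 0.6, 0.2, -0.3],
    [0.15, -0.25, 0.5, 0.01],
]

u_mask = unstructured_mask(weights, amount=0.5)
s_mask = structured_mask(weights, amount=0.5)

for name, mask in (("unstructured", u_mask), ("structured", s_mask)):
    print(name)
    for row in mask:
        print("  ", row)
```

Both masks remove the same number of weights, but only the structured mask removes complete rows. That difference in granularity is why the guide advises against comparing the two settings directly.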
diff --git a/docs/source/user_guide/pruning_strategies.rst b/docs/source/user_guide/pruning_strategies.rst index 10afd852..c523514c 100644 --- a/docs/source/user_guide/pruning_strategies.rst +++ b/docs/source/user_guide/pruning_strategies.rst @@ -1,298 +1,76 @@ Pruning Strategies Guide ======================== -This guide documents all pruning strategies available in NodeLens and their use cases. +This page summarizes the pruning strategy registry. For full experiments, use a +YAML config and ``scripts/run_experiment.py``. Use the direct Python API when a +custom script already owns the model, layer selection, and evaluation loop. -Overview +Registry -------- -Pruning is a technique for reducing neural network size by removing parameters while maintaining performance. NodeLens provides several pruning strategies to analyze how network sparsity affects alignment metrics. - -Available Pruning Strategies ----------------------------- - -1. Magnitude-Based Pruning -~~~~~~~~~~~~~~~~~~~~~~~~~~ - -**Module**: :mod:`nodelens.pruning.strategies.magnitude` - -**Classes**: - -- :class:`MagnitudePruning`: Basic magnitude pruning -- :class:`GlobalMagnitudePruning`: Global magnitude pruning across all layers -- :class:`IterativeMagnitudePruning`: Gradual pruning with fine-tuning - -**Description**: Removes weights with the smallest absolute values. - -**Theory**: Small magnitude weights contribute less to the network's output and can be removed with minimal impact. - -**Usage**: - .. 
code-block:: python - from nodelens.pruning import get_pruning_strategy + from nodelens.pruning import get_pruning_strategy, list_pruning_strategies - # Basic magnitude pruning + print(list_pruning_strategies()) strategy = get_pruning_strategy("magnitude") - mask = strategy.compute_mask(layer.weight, amount=0.5) - strategy.apply_mask(layer, mask) - - # Global magnitude pruning - strategy = get_pruning_strategy("global_magnitude") - masks = strategy.compute_masks_for_model(model, amount=0.5) - -**Parameters**: - -- ``amount``: Fraction of weights to prune (0-1) -- ``structured``: If True, prunes entire channels/filters -- ``dim``: Dimension for structured pruning (0=output, 1=input) - -2. Random Pruning -~~~~~~~~~~~~~~~~~ - -**Module**: :mod:`nodelens.pruning.strategies.random` - -**Classes**: - -- :class:`RandomPruning`: Uniform random pruning -- :class:`LayerwiseRandomPruning`: Random pruning with per-layer control -- :class:`BernoulliPruning`: Probabilistic pruning with Bernoulli sampling - -**Description**: Randomly removes weights regardless of their values. - -**Theory**: Used as a baseline to compare against informed pruning strategies. - -**Usage**: - -.. code-block:: python - - strategy = get_pruning_strategy("random") - mask = strategy.compute_mask(layer.weight, amount=0.5) - -3. Gradient-Based Pruning -~~~~~~~~~~~~~~~~~~~~~~~~~ -**Module**: :mod:`nodelens.pruning.strategies.gradient` +Main Strategy Families +---------------------- -**Classes**: +Magnitude-based + ``magnitude``, ``global_magnitude``, ``iterative_magnitude``. These remove + low-magnitude weights or channels and are useful default baselines. -- :class:`GradientPruning`: Basic gradient magnitude pruning -- :class:`FisherPruning`: Fisher information-based pruning -- :class:`MomentumPruning`: Momentum-aware gradient pruning +Gradient-based + ``gradient``, ``fisher``, ``momentum``. These use gradients or + gradient-derived saliency and require a backward pass or stored gradients. 
-**Description**: Prunes weights based on gradient information. +Alignment-based + ``alignment``, ``global_alignment``, ``cascading_alignment``. These use + NodeLens metric scores, such as Rayleigh quotient or activation statistics, + as pruning signals. -**Theory**: Weights with small gradients have less impact on the loss function. +Random baselines + ``random`` and ``bernoulli``. These are useful controls for separating + metric value from sparsity effects. -**Usage**: +LLM baselines + ``wanda``, ``sparsegpt``, ``owl``, ``llm_pruner``, ``flap``, ``ria``, and + ``slimllm``. These are used by LLM configs when comparing channel or weight + pruning methods. -.. code-block:: python - - strategy = get_pruning_strategy("gradient") - - # Requires gradient computation - loss.backward() - mask = strategy.compute_mask( - layer.weight, - amount=0.5, - gradient=layer.weight.grad - ) - -**Requirements**: Requires gradient computation via backpropagation. - -4. Structured Pruning -~~~~~~~~~~~~~~~~~~~~~ - -All strategies support structured pruning by setting ``structured=True``: +Parallel and adaptive strategies + ``parallel_mode``, ``tensorized``, ``async_parallel``, + ``adaptive_movement``, and ``adaptive_sensitivity``. These support larger + sweeps or adaptive pruning behavior. -.. code-block:: python - - from nodelens.pruning import PruningConfig - - config = PruningConfig( - strategy="magnitude", - amount=0.3, - structured=True, - dim=0 # Prune output channels - ) +Structured Pruning +------------------ - strategy = get_pruning_strategy(config.strategy) - mask = strategy.compute_mask(layer.weight, **config.to_dict()) +Structured pruning removes complete channels or filters. It is the right +choice when the question is about channel-level importance, architecture-level +compression, or hardware-friendly intervention. -5. Iterative Pruning -~~~~~~~~~~~~~~~~~~~~ - -The framework provides iterative pruning through dedicated strategies: - -.. 
code-block:: python +Unstructured pruning removes individual weights. It can preserve quality at +higher sparsity, but it answers a different question and usually needs sparse +runtime support for speedups. - from nodelens.pruning.strategies.magnitude import IterativeMagnitudePruning +Example Configs +--------------- - strategy = IterativeMagnitudePruning( - iterations=10, - final_sparsity=0.9 - ) +.. code-block:: bash - for step in range(strategy.iterations): - mask = strategy.compute_mask_for_iteration( - layer.weight, - iteration=step - ) - strategy.apply_mask(layer, mask) - - # Fine-tune between iterations - fine_tune(model, epochs=5) - -Pruning Schedules ------------------ - -Create pruning schedules for gradual sparsification: - -.. code-block:: python - - from nodelens.pruning.schedules import PolynomialSchedule, LinearSchedule - - # Polynomial schedule (recommended) - schedule = PolynomialSchedule( - initial_sparsity=0.0, - final_sparsity=0.9, - begin_step=1000, - end_step=10000, - power=3 - ) - - # Get sparsity for current step - current_sparsity = schedule(step=5000) - -**Schedule Types**: - -- ``LinearSchedule``: Linear interpolation -- ``PolynomialSchedule``: Smooth polynomial interpolation -- ``ExponentialSchedule``: Exponential decay -- ``CosineSchedule``: Cosine annealing + python scripts/run_experiment.py --config configs/examples/resnet_pruning.yaml + python scripts/run_experiment.py --config configs/vision_prune/resnet18_cifar10_full.yaml + python scripts/run_experiment.py --config configs/prune_llm/llama3_8b_unified.yaml Best Practices -------------- -1. Choosing a Strategy -~~~~~~~~~~~~~~~~~~~~~~ - -- **Magnitude pruning**: Good default choice, simple and effective -- **Gradient-based**: When task-specific importance is crucial -- **Fisher pruning**: For second-order importance estimation -- **Structured**: When hardware efficiency is important - -2. 
Pruning Amount -~~~~~~~~~~~~~~~~~ - -- Start with small amounts (10-30%) for initial experiments -- Most networks can handle 50-70% sparsity with minimal accuracy loss -- 90%+ sparsity is possible but requires careful tuning - -3. Iterative vs One-Shot -~~~~~~~~~~~~~~~~~~~~~~~~ - -- **One-shot**: Fast, good for analysis -- **Iterative**: Better performance, allows adaptation - -4. Fine-Tuning -~~~~~~~~~~~~~~ - -Always fine-tune after pruning for best results: - -.. code-block:: python - - # Prune - strategy = get_pruning_strategy("magnitude") - mask = strategy.compute_mask(layer.weight, amount=0.5) - strategy.apply_mask(layer, mask) - - # Fine-tune - for epoch in range(fine_tune_epochs): - train(model, train_loader, optimizer, criterion) - -Utility Functions ------------------ - -Check Sparsity -~~~~~~~~~~~~~~~ - -.. code-block:: python - - from nodelens.pruning.utils import get_sparsity, get_model_sparsity - - # Layer sparsity - sparsity = get_sparsity(layer) - - # Model sparsity - model_sparsity = get_model_sparsity(model) - -Remove Pruning -~~~~~~~~~~~~~~ - -.. code-block:: python - - from nodelens.pruning.utils import remove_pruning - - # Makes pruning permanent and removes masks - remove_pruning(layer) - -Integration with Alignment Metrics ----------------------------------- - -Pruning affects alignment metrics in various ways: - -1. **Rayleigh Quotient**: May increase as unimportant directions are removed -2. **Mutual Information**: Can decrease if information pathways are disrupted -3. **Spectral Properties**: Eigenvalue distribution changes with sparsity - -Example analysis: - -.. 
code-block:: python - - from nodelens.experiments import GeneralAlignmentExperiment - - # Track how metrics change with pruning - config = { - "model_name": "resnet18", - "dataset": "cifar10", - "metrics": ["rayleigh_quotient", "mutual_information_gaussian", "spectral_gap"], - "pruning_amounts": [0.0, 0.3, 0.5, 0.7, 0.9], - "pruning_strategy": "magnitude" - } - - experiment = GeneralAlignmentExperiment(config) - results = experiment.run() - - # Visualize metric changes vs sparsity - experiment.visualize_results() - -Common Issues and Solutions ---------------------------- - -Issue: Performance Degrades Significantly -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - -**Solution**: Use iterative pruning with fine-tuning between steps - -Issue: Structured Pruning Removes Important Channels -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - -**Solution**: Use custom importance scores based on your task - -Issue: Pruning Masks Not Persisting -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - -**Solution**: Ensure hooks are properly registered, or make pruning permanent - -Issue: Memory Not Reduced After Pruning -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - -**Solution**: Use structured pruning or sparse tensor formats - -See Also --------- - -- ``nodelens.pruning`` - Pruning API entry point -- :doc:`experiments` - Pruning experiments guide -- Repository examples and configs - Example code +- Compare strategies at the same pruning granularity. +- Include random and magnitude controls. +- Keep calibration and evaluation data explicit in the config. +- Treat pruning as an intervention when the goal is interpretability, not only + as a compression benchmark. +- Store configs with results so runs can be audited later. 
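One way to follow the advice about storing configs with results is to write both into a single manifest file next to the run outputs. The sketch below uses only the standard library. The `write_manifest` helper, the `manifest.json` filename, and every config key shown are assumptions for illustration, not part of NodeLens, which writes its own `experiment_config.yaml`.

```python
import hashlib
import json
import tempfile
from pathlib import Path


def write_manifest(run_dir: Path, config: dict, results: dict) -> Path:
    """Write config, a hash of the config, and results into one JSON file."""
    # Sorting keys makes the hash independent of dictionary ordering.
    config_text = json.dumps(config, sort_keys=True)
    manifest = {
        "config": config,
        "config_sha256": hashlib.sha256(config_text.encode("utf-8")).hexdigest(),
        "results": results,
    }
    run_dir.mkdir(parents=True, exist_ok=True)
    path = run_dir / "manifest.json"
    path.write_text(json.dumps(manifest, indent=2, sort_keys=True))
    return path


# Hypothetical keys and placeholder values; not real NodeLens settings or results.
config = {
    "strategy": "magnitude",
    "granularity": "channel",
    "sparsity": 0.5,
    "calibration_data": "example-calibration-split",
    "evaluation_data": "example-evaluation-split",
}
results = {"status": "placeholder"}

with tempfile.TemporaryDirectory() as tmp:
    manifest_path = write_manifest(Path(tmp) / "run", config, results)
    loaded = json.loads(manifest_path.read_text())
    print(sorted(loaded))
```

Recording the calibration data, evaluation data, sparsity, and mask granularity in one place makes it easier to check later that two runs were compared under the same conditions.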
diff --git a/docs/source/user_guide/quickstart.rst b/docs/source/user_guide/quickstart.rst
index 7dab6635..aa604016 100644
--- a/docs/source/user_guide/quickstart.rst
+++ b/docs/source/user_guide/quickstart.rst
@@ -1,7 +1,9 @@
 Quickstart Guide
 ================

-This guide will get you up and running with NodeLens in minutes.
+This guide shows the shortest path from installation to a working NodeLens
+experiment. Most users should start with a YAML config and the shared runner,
+then move to direct Python APIs only when they need a custom workflow.

 .. contents:: Table of Contents
    :local:
@@ -10,429 +12,136 @@ This guide will get you up and running with NodeLens in minutes.
 Installation
 ------------

-Basic Installation
-~~~~~~~~~~~~~~~~~~
+Basic installation:

 .. code-block:: bash

-    # Clone the repository
-    git clone https://github.com/KempnerInstitute/nodelens.git
-    cd nodelens
-
-    # Install the package
+    git clone https://github.com/KempnerInstitute/NodeLens.git
+    cd NodeLens
     pip install -e .

-Full Installation
-~~~~~~~~~~~~~~~~~
+Full installation with optional dependencies:

 .. code-block:: bash

-    # Install with all optional dependencies
     pip install -e .[all]

-    # Or install specific extras
-    pip install -e .[train]  # Training and large-model utilities
-    pip install -e .[all]  # Development and training extras
-    pip install -e .[docs]  # Documentation building
-
-Your First Experiment
----------------------
-
-1. Basic Metric Computation
-~~~~~~~~~~~~~~~~~~~~~~~~~~~
-
-.. code-block:: python
-
-    import torch
-    from nodelens.metrics import RayleighQuotient
-
-    # Create some sample data
-    inputs = torch.randn(100, 512)  # 100 samples, 512 features
-    weights = torch.randn(256, 512)  # 256 neurons, 512 input features
-
-    # Compute Rayleigh Quotient
-    rq = RayleighQuotient()
-    scores = rq.compute(inputs=inputs, weights=weights)
-
-    print(f"RQ scores shape: {scores.shape}")  # (256,)
-    print(f"Mean RQ: {scores.mean():.4f}")
-
-2. Using a Pre-trained Model
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-
-.. code-block:: python
-
-    from nodelens.models import ModelWrapper
-    from nodelens.metrics import get_metric
-    import torchvision.models as models
+Run A Config
+------------

-    # Load a pre-trained ResNet
-    model = models.resnet18(pretrained=True)
+The main entry point is ``scripts/run_experiment.py``. It loads a YAML config,
+creates the requested model and dataset, computes metrics, and writes results to
+a timestamped output directory.

-    # Wrap it for metric computation
-    wrapped_model = ModelWrapper(
-        model,
-        tracked_layers=["layer1.0.conv1", "layer2.0.conv1", "layer3.0.conv1"]
-    )
+.. code-block:: bash

-    # Create sample input
-    x = torch.randn(32, 3, 224, 224)
+    python scripts/run_experiment.py --config configs/examples/mnist_basic.yaml

-    # Forward pass and collect activations
-    output, activations = wrapped_model.forward_with_activations(x)
+For a larger vision pruning workflow:

-    # Compute metrics on specific layer
-    metric = get_metric("rayleigh_quotient")
-    layer_name = "layer1.0.conv1"
-    scores = metric.compute(
-        inputs=activations[layer_name],
-        weights=wrapped_model.get_layer_weights()[layer_name]
-    )
+.. code-block:: bash

-3. Running a Pruning Experiment
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+    python scripts/run_experiment.py \
+        --config configs/vision_prune/resnet18_cifar10_full.yaml

-.. code-block:: python
+For an LLM channel-analysis workflow:

-    from nodelens.experiments import ProgressiveDropoutExperiment
-    from nodelens.experiments.base import ExperimentConfig
+.. code-block:: bash

-    # Configure experiment
-    config = ExperimentConfig(
-        name="resnet_pruning_demo",
-        model_name="resnet18",
-        dataset_name="cifar10",
+    python scripts/run_experiment.py \
+        --config configs/prune_llm/llama3_8b_unified.yaml

-        # Metrics to track
-        metrics=["rayleigh_quotient", "mutual_information"],
+Output Layout
+-------------

-        # Pruning settings
-        dropout_rates=[0.0, 0.3, 0.5, 0.7, 0.9],
-        pruning_strategy="low",  # Prune low RQ neurons
+Experiment outputs are written under the configured output directory or the
+``--base-output-dir`` argument. A typical job directory contains:

-        # Training settings
-        epochs=10,
-        batch_size=128,
-        learning_rate=0.1
-    )
+.. code-block:: text

-    # Run experiment
-    experiment = ProgressiveDropoutExperiment(config)
-    results = experiment.run()
+    experiment_config.yaml
+    logs/
+    results/
+    figures/
+    analysis/

-    # Analyze results
-    print("Accuracy at different sparsity levels:")
-    for rate, acc in results["accuracy"].items():
-        print(f" Dropout {rate}: {acc:.2%}")
+Use ``results/`` for numeric outputs, ``figures/`` for generated plots, and
+``experiment_config.yaml`` to confirm the exact settings used by the run.

-Common Use Cases
-----------------
+Use Metrics Directly
+--------------------

-Comparing Network Architectures
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+If you already have layer inputs and weights, call metrics directly:

 .. code-block:: python

-    from nodelens.experiments import ExperimentRunner
-    from nodelens.experiments.base import ExperimentConfig
-
-    # Define experiments for different architectures
-    architectures = ["resnet18", "vgg16", "efficientnet_b0"]
-    configs = []
-
-    for arch in architectures:
-        configs.append(ExperimentConfig(
-            name=f"compare_{arch}",
-            model_name=arch,
-            dataset_name="cifar10",
-            metrics=["rayleigh_quotient", "cka", "spectral_analysis"],
-            epochs=50
-        ))
-
-    # Run all experiments
-    runner = ExperimentRunner(configs, parallel=True)
-    results = runner.run()
-
-    # Generate comparison report
-    runner.generate_report("architecture_comparison.html")
+    from nodelens.metrics import get_metric, list_metrics

-Analyzing Layer Importance
-~~~~~~~~~~~~~~~~~~~~~~~~~~
+    print(list_metrics())

-.. code-block:: python
-
-    from nodelens.experiments import LayerIsolatedPruningExperiment
-    from nodelens.analysis import LayerImportanceAnalyzer
-
-    config = ExperimentConfig(
-        name="layer_importance_analysis",
-        model_name="resnet50",
-        dataset_name="imagenet",
-        metrics=["rayleigh_quotient"],
-        dropout_rates=[0.0, 0.5, 0.9],
-        isolation_mode="sequential"
-    )
+    metric = get_metric("rayleigh_quotient")
+    scores = metric.compute(inputs=layer_inputs, weights=layer_weights)

-    # Run layer-isolated pruning
-    experiment = LayerIsolatedPruningExperiment(config)
-    results = experiment.run()
+Activation statistics, information metrics, redundancy metrics, gradient-based
+scores, and SCAR loss-proxy metrics are all available through the same
+``get_metric`` registry.

-    # Analyze layer importance
-    analyzer = LayerImportanceAnalyzer(results)
-    importance_scores = analyzer.compute_importance()
-    analyzer.plot_importance_heatmap()
+Wrap A Model
+------------

-Custom Metric Implementation
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+``ModelWrapper`` can track activations for selected layers of a PyTorch model.

 .. code-block:: python

-    from nodelens.metrics.base import BaseMetric
-    from nodelens.core import register_metric
     import torch
+    import torchvision.models as models
+    from nodelens.models import ModelWrapper

-    @register_metric("gradient_alignment")
-    class GradientAlignment(BaseMetric):
-        """Measures alignment between weights and gradients."""
-
-        def __init__(self):
-            super().__init__(name="gradient_alignment")
-
-        @property
-        def requires_inputs(self) -> bool:
-            return False
-
-        @property
-        def requires_weights(self) -> bool:
-            return True
-
-        @property
-        def requires_outputs(self) -> bool:
-            return False
-
-        def compute(self, weights=None, gradients=None, **kwargs):
-            # Compute cosine similarity between weights and gradients
-            w_flat = weights.view(weights.size(0), -1)
-            g_flat = gradients.view(gradients.size(0), -1)
-
-            # Normalize
-            w_norm = torch.nn.functional.normalize(w_flat, dim=1)
-            g_norm = torch.nn.functional.normalize(g_flat, dim=1)
-
-            # Cosine similarity
-            alignment = (w_norm * g_norm).sum(dim=1)
-
-            return alignment
-
-    # Use the custom metric
-    metric = get_metric("gradient_alignment")
-    scores = metric.compute(weights=weights, gradients=gradients)
-
-Working with Configuration Files
---------------------------------
-
-Creating a Configuration File
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-
-.. code-block:: yaml
-
-    # experiment_config.yaml
-    name: my_experiment
-    description: Testing different pruning strategies
-
-    # Model settings
-    model_name: resnet18
-    pretrained: true
-
-    # Dataset settings
-    dataset_name: cifar10
-    batch_size: 128
-    num_workers: 4
-
-    # Training settings
-    epochs: 100
-    learning_rate: 0.1
-    optimizer: sgd
-    lr_schedule: cosine
-
-    # Metrics to compute
-    metrics:
-      - rayleigh_quotient
-      - mutual_information
-      - cka
-
-    metric_configs:
-      rayleigh_quotient:
-        scale_by_norm: true
-      cka:
-        kernel: rbf
-        sigma: 1.0
-
-    # Pruning settings
-    dropout_rates: [0.0, 0.2, 0.4, 0.6, 0.8]
-    pruning_strategy: magnitude
-    pruning_mode: global_joint
-
-Loading and Running
-~~~~~~~~~~~~~~~~~~~
-
-.. code-block:: python
-
-    from nodelens.infrastructure.configuration import load_config
-    from nodelens.experiments import create_experiment
-
-    # Load configuration
-    config = load_config("experiment_config.yaml")
-
-    # Create and run experiment
-    experiment = create_experiment(config)
-    results = experiment.run()
-
-    # Save results
-    experiment.save_results("results/my_experiment/")
-
-Visualization and Analysis
---------------------------
-
-Plotting Metrics
-~~~~~~~~~~~~~~~~
-
-.. code-block:: python
-
-    from nodelens.analysis import MetricVisualizer
-
-    # Load results
-    results = load_results("results/my_experiment/")
-
-    # Create visualizer
-    viz = MetricVisualizer(results)
-
-    # Plot metric evolution
-    viz.plot_metric_vs_sparsity(
-        metric="rayleigh_quotient",
-        layers=["layer1.0.conv1", "layer2.0.conv1"],
-        save_path="rq_vs_sparsity.png"
-    )
-
-    # Plot layer-wise comparison
-    viz.plot_layer_comparison(
-        metrics=["rayleigh_quotient", "mutual_information"],
-        sparsity=0.5,
-        save_path="layer_comparison.png"
-    )
-
-Generating Reports
-~~~~~~~~~~~~~~~~~~
-
-.. code-block:: python
-
-    from nodelens.analysis import ReportGenerator
-
-    # Generate comprehensive report
-    generator = ReportGenerator(results)
-    generator.generate_html_report(
-        output_path="report.html",
-        include_plots=True,
-        include_tables=True,
-        include_config=True
-    )
-
-Tips and Tricks
----------------
-
-1. **Memory Management**
-
-   .. code-block:: python
-
-      # Use CPU for large matrix operations
-      config = ExperimentConfig(
-          force_cpu_for_large_metric_ops=True,
-          cpu_threshold=1e7
-      )
-
-2. **Faster Experiments**
-
-   .. code-block:: python
-
-      # Reduce computation for quick tests
-      config = ExperimentConfig(
-          epochs=5,  # Fewer epochs
-          eval_batches=10,  # Evaluate on subset
-          metrics=["rayleigh_quotient"],  # Fewer metrics
-          dropout_rates=[0.0, 0.5, 0.9]  # Fewer rates
-      )
-
-3. **Distributed Training**
-
-   .. code-block:: python
-
-      # Enable distributed training
-      config = ExperimentConfig(
-          distributed=True,
-          backend="nccl",
-          world_size=4
-      )
-
-4. **Debugging**
-
-   .. code-block:: python
-
-      # Enable debug logging
-      import logging
-      logging.basicConfig(level=logging.DEBUG)
-
-      # Use smaller dataset
-      config = ExperimentConfig(
-          dataset_name="mnist",  # Smaller than CIFAR
-          batch_size=32,
-          debug_mode=True
-      )
+    model = models.resnet18(weights=None)
+    wrapper = ModelWrapper(model, tracked_layers=["layer1.0.conv1"])
+
+    x = torch.randn(4, 3, 224, 224)
+    output, activations = wrapper.forward_with_activations(x)
+    weights = wrapper.get_layer_weights(layers=["layer1.0.conv1"])
+
+    print(activations["layer1.0.conv1"].shape)
+    print(weights["layer1.0.conv1"].shape)
+
+Choose A Starting Config
+------------------------
+
+.. list-table::
+   :header-rows: 1
+
+   * - Goal
+     - Starting point
+   * - Fast smoke test
+     - ``configs/examples/mnist_basic.yaml``
+   * - Small vision pruning run
+     - ``configs/examples/resnet_pruning.yaml``
+   * - Full ResNet/CIFAR-10 pruning workflow
+     - ``configs/vision_prune/resnet18_cifar10_full.yaml``
+   * - Fast LLM smoke test
+     - ``configs/examples/gpt2_fast_test.yaml``
+   * - LLM FFN channel analysis
+     - ``configs/prune_llm/llama3_8b_unified.yaml``
+
+Common Adjustments
+------------------
+
+- Use ``--base-output-dir outputs/my_run`` to keep results in a predictable
+  location.
+- Reduce batch size or the number of evaluation batches when debugging memory
+  issues.
+- Start from an existing config and change one part at a time: model, dataset,
+  tracked layers, metrics, or pruning settings.
+- For LLM runs, confirm model access and cache location before launching long
+  jobs.

 Next Steps
 ----------

-- :doc:`/user_guide/experiments` - Detailed experiment guide
-- :doc:`/user_guide/metrics` - All available metrics
-- :doc:`/user_guide/configuration` - Configuration options
-- Repository examples and configs - Advanced examples
-- Top-level README - Current API entry points
-
-Common Issues
--------------
-
-**CUDA Out of Memory**
-
-.. code-block:: python
-
-    # Reduce batch size
-    config.batch_size = 32
-
-    # Force CPU for metrics
-    config.force_cpu_for_large_metric_ops = True
-
-    # Use gradient accumulation
-    config.gradient_accumulation = 4
-
-**Slow Metric Computation**
-
-.. code-block:: python
-
-    # Use faster metrics
-    config.metrics = ["rayleigh_quotient"]  # Fast
-    # Avoid: ["pid", "knn_mi"]  # Slow
-
-    # Compute on subset
-    config.metric_sample_size = 1000
-
-**Import Errors**
-
-.. code-block:: bash
-
-    # Ensure you're in the right directory
-    cd nodelens
-
-    # Reinstall in development mode
-    pip install -e .
-
-    # Check installation
-    python -c "import nodelens; print(nodelens.__version__)"
+- :doc:`/user_guide/experiments` - Experiment workflow details
+- :doc:`/user_guide/metrics` - Available metrics and inputs
+- :doc:`/user_guide/pruning` - Pruning strategies and mask behavior
+- :doc:`/user_guide/configuration` - YAML configuration options
diff --git a/docs/usage.md b/docs/usage.md
index 29a58606..3eb3abd4 100644
--- a/docs/usage.md
+++ b/docs/usage.md
@@ -39,7 +39,7 @@ See [the config catalog](../configs/README.md) for a fuller list.
 | `alignment_analysis` | General activation/alignment metrics and small-model pruning | `configs/examples/*.yaml` |
 | `cluster_analysis` | Vision channel clustering, halo analysis, cascade tests, and structured pruning | `configs/vision_prune/*.yaml` |
 | `llm_alignment` | Hugging Face LLM metrics, supernodes, SCAR, and structured FFN pruning | `configs/prune_llm/*.yaml` |
-| `vision_synergy` | Older focused vision synergy experiments | `configs/examples/vision_synergy.yaml` |
+| `vision_synergy` | Focused vision synergy experiments | `configs/examples/vision_synergy.yaml` |

 ## Configuration Structure

diff --git a/projects/README.md b/projects/README.md
index 3d28b967..717f2d4a 100644
--- a/projects/README.md
+++ b/projects/README.md
@@ -1,10 +1,16 @@
-# Paper Projects
+# Project Workflows

-This directory contains paper-specific release material layered on top of the
-general `nodelens` package. Keep reusable code in `src/nodelens/`; keep
-paper-specific commands, artifact manifests, and reproduction notes here.
+This directory contains applied workflows built on top of the reusable
+`nodelens` package. Each project folder should explain what analysis it runs,
+which configs are relevant, what outputs it produces, and how to inspect or
+regenerate those outputs.
+
+Reusable code belongs in `src/nodelens/`. Project folders should stay focused
+on reproducible usage: configs, small helper scripts, artifact descriptions,
+and project-specific notes that help readers connect the public results to the
+shared library.

 ## Projects

-- `supernodes_scar/`: release material for "Supernodes and Halos:
-  Loss-Critical Hubs in LLM Feed-Forward Layers".
+- `supernodes_scar/`: workflow for the Supernodes and SCAR analysis of
+  loss-sensitive FFN channels in LLMs.
diff --git a/projects/supernodes_scar/ARTIFACTS.md b/projects/supernodes_scar/ARTIFACTS.md
index 0da9ba37..35682490 100644
--- a/projects/supernodes_scar/ARTIFACTS.md
+++ b/projects/supernodes_scar/ARTIFACTS.md
@@ -1,67 +1,82 @@
-# Artifact Plan
+# Artifact Contents

-This document defines what should be shared alongside the paper and why.
+The Supernodes and SCAR artifact dataset contains derived outputs that help
+readers inspect the reported results without rerunning every large-model job.
+It does not contain model weights, raw benchmark datasets, checkpoints, or
+cluster logs.

-## Recommended Public Artifacts
+## Directory Layout

-`paper_artifacts/figures/`
-: PNG figures used in the arXiv paper. These are useful for quick inspection
-and for checking that regenerated figures match the submitted version.
+```text
+README.md
+MANIFEST.json
+MANIFEST.sha256
+metadata/
+configs/
+paper_artifacts/
+paper_scripts/
+raw_results/
+```

-`paper_artifacts/tables/`
-: LaTeX table fragments used by the paper.
+`MANIFEST.json`
+: Machine-readable inventory. Each entry records the relative path, size,
+SHA256 checksum, and artifact group.

-`paper_artifacts/experiments/`
-: Compact JSON summaries used for figure/table generation. These are derived
-statistics, not raw datasets.
+`MANIFEST.sha256`
+: Checksum file that can be verified with `sha256sum -c MANIFEST.sha256`.

-`raw_results/`
-: Locked result JSON files copied from the runs used by the paper, sanitized
-and compressed as `.json.gz`. The public paths are stable; internal cluster
-paths are not included.
+`metadata/`
+: Dataset-level metadata, source-result mapping, and bundle-generation
+information.

 `configs/`
-: Paper experiment configs needed to rerun metric estimation, pruning, and
-evaluation.
+: Experiment configs for metric estimation, pruning, ablation, and 70B
+validation runs.

-`paper_scripts/`
-: Active figure/table aggregation scripts used by the current draft.
+`paper_artifacts/figures/`
+: PNG figures used for quick visual inspection.

-`metadata/`
-: Release metadata, checksums, git commit, and manifest files.
+`paper_artifacts/tables/`
+: LaTeX table fragments generated from the locked results.

-## Large Or Restricted Items
+`paper_artifacts/experiments/`
+: Compact JSON summaries used by figure and table scripts.

-The public artifact repository should not contain model weights. Users should
-download models through their original providers and accept the relevant model
-licenses. The artifact repository should also not duplicate raw public
-benchmarks; instead, document dataset names and versions in the dataset card.
+`paper_scripts/`
+: Figure and table aggregation scripts that operate on the included summaries
+or on compatible local result folders.

-## Hugging Face vs Zenodo
+`raw_results/`
+: Selected locked result JSON files, sanitized and compressed as `.json.gz`.
+These are derived statistics from completed runs, not raw calibration data.

-Hugging Face Datasets is a good fit for browsable, versioned ML artifacts that
-users may download programmatically. Zenodo is better for a citable archival
-snapshot with a DOI. The strongest release pattern is:
+## How To Use The Bundle

-1. GitHub release tag for code.
-2. Hugging Face dataset repo for result artifacts.
-3. Zenodo archive of the GitHub release, plus optionally the artifact bundle,
-   for DOI-based citation.
+After downloading the dataset, verify the checksums:

-## Minimal Artifact Schema
+```bash
+sha256sum -c MANIFEST.sha256
+```

-Each generated bundle should include:
+Inspect the source mapping:

-```text
-README.md
-MANIFEST.json
-MANIFEST.sha256
-metadata/release_metadata.json
-configs/
-paper_artifacts/
-paper_scripts/
-raw_results/
+```bash
+python -m json.tool metadata/result_sources.json | less
 ```

-`MANIFEST.json` should record relative path, size, SHA256, and artifact group
-for every file. It should not record private absolute paths.
+Use `raw_results/` for exact numeric values, `paper_artifacts/experiments/` for
+compact figure inputs, and `configs/` to rerun matching experiments with the
+current NodeLens code.
+
+## Excluded Files
+
+The artifact dataset intentionally excludes:
+
+- model weights and tokenizer files
+- raw public benchmark datasets
+- checkpoints and optimizer states
+- Python caches and compiled bytecode
+- LaTeX build products
+- local absolute paths, scheduler logs, and access tokens
+
+Models and datasets should be downloaded from their original providers.
diff --git a/projects/supernodes_scar/README.md b/projects/supernodes_scar/README.md
index a7ee797b..93dc1a19 100644
--- a/projects/supernodes_scar/README.md
+++ b/projects/supernodes_scar/README.md
@@ -1,28 +1,49 @@
-# Supernodes and Halos Release
+# Supernodes and SCAR Project Workflow

-This folder is the public-release entry point for the paper:
+This folder documents the NodeLens workflow used for:

 > Supernodes and Halos: Loss-Critical Hubs in LLM Feed-Forward Layers

-The reusable implementation lives in the main `nodelens` package. This project
-folder records the paper-specific configs, artifact layout, and release process.
+The reusable implementation lives in `src/nodelens`. This project folder points
+to the configs, helper scripts, and derived artifacts used to reproduce the
+paper's LLM channel-analysis and structured-pruning results.

-## What To Release
+## What This Workflow Does

-The public release should have two parts:
+The workflow studies feed-forward network channels in causal language models.
+It uses NodeLens to:

-1. A GitHub release/tag for code, configs, and reproduction scripts.
-2. A Hugging Face dataset repository for derived artifacts: result JSON files,
-   paper figures, LaTeX tables, checksums, and a dataset card.
+- capture FFN activations and gradients on a calibration set
+- compute channel metrics such as activation power, Taylor scores, curvature,
+  and the SCAR loss proxy
+- identify small loss-sensitive channel cores and compare them with
+  activation-defined outliers
+- run structured FFN pruning and ablation probes
+- aggregate numeric summaries into paper figures, tables, and manifest files

-This split is intentional. Code belongs in GitHub; generated experiment outputs
-and larger derived artifacts are easier to consume and version through the
-Hugging Face Hub. A Zenodo DOI can additionally archive the GitHub release for
-citation stability.
+The halo analysis is a secondary diagnostic layer on top of the same metric
+outputs. It estimates local write-overlap and redundancy structure around the
+loss-sensitive core.

-## Reproduce The Main Runs
+## Main Configs

-Install the package:
+```text
+configs/prune_llm/llama3_8b_unified.yaml
+configs/prune_llm/mistral_7b_unified.yaml
+configs/prune_llm/llama2_7b_unified.yaml
+configs/prune_llm/qwen2_7b_unified.yaml
+configs/prune_llm/llama3_70b_scale_pruning_curves.yaml
+configs/prune_llm/llama3_70b_scale_mechanism.yaml
+configs/prune_llm/llama3_70b_scale_benchmarks_50_papersafe.yaml
+```
+
+The 7B/8B runs are intended for one A100/H100-class GPU. The 70B configs are
+targeted validation runs and need substantially more memory or model
+parallelism, depending on the environment.
+
+## Run A Config
+
+Install the package from the repository root:

 ```bash
 conda env create -f environment.yml
@@ -30,34 +51,43 @@ conda activate nodelens
 pip install -e .
 ```

-Run a paper config:
+Run a project config:

 ```bash
 python scripts/run_experiment.py \
   --config configs/prune_llm/llama3_8b_unified.yaml \
-  --base-output-dir /path/to/results
+  --base-output-dir outputs/supernodes_scar_runs
 ```

-Important paper configs include:
+Each run writes a timestamped job directory containing the config copy, logs,
+result JSON files, and generated figures.

-```text
-configs/prune_llm/llama3_8b_unified.yaml
-configs/prune_llm/mistral_7b_unified.yaml
-configs/prune_llm/llama2_7b_unified.yaml
-configs/prune_llm/qwen2_7b_unified.yaml
-configs/prune_llm/llama3_70b_scale_pruning_curves.yaml
-configs/prune_llm/llama3_70b_scale_mechanism.yaml
-configs/prune_llm/llama3_70b_scale_benchmarks_50_papersafe.yaml
+## Inspect Existing Artifacts
+
+The public artifact dataset contains derived outputs rather than model weights
+or raw datasets. It includes compact result JSON files, selected figure/table
+inputs, checksums, and metadata describing which public artifact path
+corresponds to each paper result.
+
+Download and inspect it with:
+
+```bash
+huggingface-cli download hsafaai/supernodes-scar-artifacts \
+  --repo-type dataset \
+  --local-dir supernodes_scar_artifacts
+
+cd supernodes_scar_artifacts
+python -m json.tool MANIFEST.json | head
+sha256sum -c MANIFEST.sha256
 ```

-The 7B/8B runs are feasible on one A100/H100-class GPU. The 70B validation is a
-targeted large-model check and needs substantially more memory or model
-parallelism depending on the environment.
+See `ARTIFACTS.md` for the artifact layout and `REPRODUCIBILITY.md` for the
+local rerun workflow.

-## Build The Artifact Bundle
+## Build A Local Artifact Bundle

-The artifact bundle is prepared locally under `outputs/`, which is ignored by
-git:
+If the expected result folders are present locally, the helper script can build
+a clean derived-artifact directory under `outputs/`:

 ```bash
 python projects/supernodes_scar/scripts/prepare_hf_artifacts.py \
@@ -68,55 +98,19 @@ python projects/supernodes_scar/scripts/verify_hf_artifacts.py \
   outputs/supernodes_scar_hf
 ```

-The script copies only releaseable material into a clean directory:
-
-- paper figures and LaTeX tables
-- generated numeric summaries and JSON diagnostics
-- selected locked result JSON files, sanitized and compressed as `.json.gz`
-- experiment configs used by the paper
-- active paper-side figure/table scripts
-- checksums and a machine-readable manifest
-- a Hugging Face dataset-card README
-
-It intentionally excludes model weights, raw calibration datasets, logs,
-checkpoints, Python caches, LaTeX build files, and internal absolute paths.
-
-## Upload To Hugging Face
-
-After inspecting `outputs/supernodes_scar_hf`, upload it as a dataset repo:
-
-```bash
-huggingface-cli login
-huggingface-cli repo create supernodes-scar-artifacts --type dataset
-huggingface-cli upload hsafaai/supernodes-scar-artifacts \
-  outputs/supernodes_scar_hf \
-  --repo-type dataset
-```
-
-For very large bundles, use the `huggingface_hub` large-folder upload workflow
-instead of the simple CLI upload.
-
-## What Not To Upload
-
-Do not upload:
+The verifier checks checksums and scans the staged bundle for files that should
+not be included in a public derived-artifact dataset, such as Python caches,
+LaTeX build files, checkpoints, raw datasets, model weights, and local absolute
+paths.

-- Llama, Mistral, Qwen, or OLMo model weights.
-- Raw WikiText-2, C4, MMLU, or LM Evaluation Harness datasets.
-- Cluster logs, SLURM stdout/stderr, checkpoints, caches, or private paths.
-- Any file containing access tokens, usernames beyond public author metadata,
-  or absolute Harvard cluster paths.
+## What Is Not Included

-## Release Checklist
+This repository and the public artifact dataset do not include:

-- `python -m pip install -e . --no-deps --dry-run` succeeds.
-- `PYTHONPATH=src python -c "import nodelens; print(nodelens.__version__)"`
-  succeeds.
-- The artifact bundle has no `.pyc`, `__pycache__`, `.aux`, `.log`, `.out`,
-  model checkpoint, or raw dataset files.
-- A private-path scan over both plain text files and compressed `.json.gz`
-  files returns no internal cluster paths.
-- `MANIFEST.sha256` verifies all staged artifacts.
-- GitHub release tag, Hugging Face dataset revision, and arXiv version are
-  recorded together in the dataset card.
+- Llama, Mistral, Qwen, or OLMo model weights
+- raw WikiText-2, C4, MMLU, or LM Evaluation Harness datasets
+- cluster logs, SLURM stdout/stderr, checkpoints, or cache directories
+- private local paths or access tokens

-See `REPRODUCIBILITY.md` for the local rerun and figure-regeneration workflow.
+Users should obtain model weights and benchmark datasets from their original
+providers and follow the relevant licenses.
diff --git a/projects/supernodes_scar/REPRODUCIBILITY.md b/projects/supernodes_scar/REPRODUCIBILITY.md
index da1bedbd..1cad753e 100644
--- a/projects/supernodes_scar/REPRODUCIBILITY.md
+++ b/projects/supernodes_scar/REPRODUCIBILITY.md
@@ -1,17 +1,13 @@
 # Reproducibility Notes

-This page describes the release workflow for the paper. It separates three
-tasks: rerunning experiments, rebuilding figures and tables from locked outputs,
-and rebuilding the arXiv PDF.
+This page describes how to rerun the Supernodes and SCAR workflow with
+NodeLens and how to inspect the derived artifacts. It focuses on public inputs:
+repository configs, public model identifiers, public datasets, and the artifact
+bundle.

-The public GitHub repository contains the reusable code, configs, project
-metadata, and artifact-packaging scripts. The private paper-source checkout may
-also contain draft-only LaTeX files and maintainer scripts; those paths are
-called out below when they are needed.
+## 1. Install NodeLens

-## 1. Rerun Experiments
-
-Install the code:
+From the repository root:

 ```bash
 conda env create -f environment.yml
@@ -19,66 +15,68 @@ conda activate nodelens
 pip install -e .
 ```

-Run the main 8B config:
+Install optional dependencies when building documentation, running large LLM
+experiments, or using all plotting utilities:
+
+```bash
+pip install -e .[all]
+```
+
+## 2. Run A Paper Config
+
+Run the main Llama-3.1-8B workflow:

 ```bash
 python scripts/run_experiment.py \
   --config configs/prune_llm/llama3_8b_unified.yaml \
-  --base-output-dir /path/to/results/Prune_LLM
+  --base-output-dir outputs/supernodes_scar_runs
 ```

-The main paper configs are listed in `projects/supernodes_scar/README.md`.
-Large runs, especially the 70B validation, require substantial GPU memory and
-should usually be launched through the local cluster workflow.
+Useful related configs are listed in `projects/supernodes_scar/README.md`.
+The 70B configs are targeted validation runs and require a large-memory or
+parallel model-loading setup.

-## 2. Rebuild Figures And Tables From Locked Outputs
+## 3. Inspect Outputs

-The paper figures and tables are regenerated from locked result JSON files. The
-release bundle stores those JSON files under `raw_results/` as sanitized
-`.json.gz` files and records their public names in:
+Each experiment writes a timestamped job directory under the selected
+`--base-output-dir`. A typical run contains:

 ```text
-metadata/result_sources.json
+experiment_config.yaml
+logs/
+results/
+figures/
+analysis/
 ```

-For maintainers with the private paper-source checkout, the active paper scripts
-can be rerun against the original locked output folders:
-
-```bash
-python drafts/LLM_prune/paper/scripts/regenerate_fig1_overview.py
-python drafts/LLM_prune/paper/scripts/regenerate_fig2_halo.py
-python drafts/LLM_prune/paper/scripts/generate_70b_scale_figures.py
-python drafts/LLM_prune/paper/scripts/generate_lp_vs_activation_overlap_figure.py
-python drafts/LLM_prune/paper/scripts/generate_lp_vs_activation_supernode_figure.py
-python drafts/LLM_prune/paper/scripts/collect_paper_artifacts.py \
-  --results-base /path/to/results/Prune_LLM/PAPER \
-  --draft-dir drafts/LLM_prune
-```
+The most important outputs are the per-layer metric arrays, pruning summaries,
+ablation results, and generated figure inputs under `results/` and `analysis/`.

-The public artifact bundle also includes the active paper scripts under
-`paper_scripts/`, plus compact derived summaries under
-`paper_artifacts/experiments/`. Some scripts use path constants because they
-were designed for the locked local paper tree; update those constants or run the
-script from a checkout that has the original output folders available.
+## 4. Use The Public Artifact Dataset

-## 3. Rebuild The Paper
+The artifact dataset provides derived outputs from the runs used in the paper.
+Download it with:

-For maintainers with the private paper-source checkout, the paper has one shared
-body file:
-
-```text
-drafts/LLM_prune/paper_body.tex
+```bash
+huggingface-cli download hsafaai/supernodes-scar-artifacts \
+  --repo-type dataset \
+  --local-dir supernodes_scar_artifacts
 ```

-Build the arXiv and anonymous versions:
+Verify the downloaded files:

 ```bash
-cd drafts/LLM_prune
-./compile_pdf.sh paper_arxiv.tex
-./compile_pdf.sh paper_icml.tex
+cd supernodes_scar_artifacts
+sha256sum -c MANIFEST.sha256
+python -m json.tool MANIFEST.json | head
 ```

-## 4. Build And Verify The Hugging Face Bundle
+`metadata/result_sources.json` maps paper-facing result names to public
+artifact paths. See `ARTIFACTS.md` for the full layout.
+
+## 5. Build A Local Derived-Artifact Bundle
+
+If compatible result folders are available locally, build a clean bundle with:

 ```bash
 python projects/supernodes_scar/scripts/prepare_hf_artifacts.py \
@@ -89,22 +87,12 @@ python projects/supernodes_scar/scripts/verify_hf_artifacts.py \
   outputs/supernodes_scar_hf
 ```

-The verifier checks:
-
-- `MANIFEST.sha256`
-- absence of Python caches, LaTeX build files, PDFs, checkpoints, model weights,
-  and raw datasets
-- absence of private local paths in plain text and compressed `.json.gz` files
-
-## 5. Local Storage Policy
-
-Uploading to Hugging Face is not a replacement for local retention. Maintainers
-should keep:
+The verification step checks the manifest, checksums, and exclusion rules for
+public derived artifacts.

-- the frozen HF bundle under `outputs/supernodes_scar_hf`
-- the original locked result folders used to regenerate paper figures
-- the arXiv source bundle under `drafts/LLM_prune/arxiv_bundle.tar.gz`
-- the Git commit or release tag associated with the upload
+## Notes On External Inputs

-This lets future work continue from the exact paper state while the public HF
-repo remains a clean, portable snapshot.
+Model weights and benchmark datasets are not redistributed here. Users should
+download them from their original providers and follow the relevant licenses.
+Calibration and evaluation choices are encoded in the YAML configs whenever
+they are needed for reproduction.
diff --git a/projects/supernodes_scar/hf_dataset_card.md b/projects/supernodes_scar/hf_dataset_card.md
index 2b11a992..af4e95dd 100644
--- a/projects/supernodes_scar/hf_dataset_card.md
+++ b/projects/supernodes_scar/hf_dataset_card.md
@@ -29,7 +29,7 @@ paper scripts, and checksums.
 ```text
 MANIFEST.json
 MANIFEST.sha256
-metadata/release_metadata.json
+metadata/
 configs/
 paper_artifacts/
 paper_scripts/
@@ -50,10 +50,10 @@ cd supernodes_scar_artifacts
 sha256sum -c MANIFEST.sha256
 ```

-The corresponding code release is available at:
+The corresponding code repository is available at:

 ```text
-https://github.com/KempnerInstitute/nodelens
+https://github.com/KempnerInstitute/NodeLens
 ```

 Use the configs in `configs/` with `scripts/run_experiment.py` from the code