7.6. Random Projection

The sklearn.random_projection module implements a simple and computationally efficient way to reduce the dimensionality of the data by trading a controlled amount of accuracy (as additional variance) for faster processing times and smaller model sizes.
This module implements two types of unstructured random matrices: the Gaussian random matrix and the sparse random matrix.

The dimensions and distribution of random projection matrices are controlled so as to preserve the pairwise distances between any two samples of the dataset. Random projection is therefore a suitable approximation technique for distance-based methods.
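
As a minimal illustration of this property (using SparseRandomProjection, introduced below, with an arbitrary fixed random seed), one can compare pairwise Euclidean distances before and after the projection; with the default eps=0.1 they typically change only by a few percent:

>>> import numpy as np
>>> from sklearn.metrics import euclidean_distances
>>> from sklearn.random_projection import SparseRandomProjection
>>> rng = np.random.RandomState(42)
>>> X = rng.rand(100, 10000)
>>> X_new = SparseRandomProjection(random_state=42).fit_transform(X)
>>> dist_original = euclidean_distances(X)
>>> dist_projected = euclidean_distances(X_new)
>>> # Ratio of projected to original distance for every distinct pair of samples.
>>> mask = dist_original > 0
>>> ratios = dist_projected[mask] / dist_original[mask]
>>> bool(ratios.min() > 0.9) and bool(ratios.max() < 1.1)
True
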
References

- Sanjoy Dasgupta. 2000. Experiments with random projection (https://cseweb.ucsd.edu/~dasgupta/papers/randomf.pdf). In Proceedings of the Sixteenth conference on Uncertainty in artificial intelligence (UAI'00), Craig Boutilier and Moisés Goldszmidt (Eds.). Morgan Kaufmann Publishers Inc., San Francisco, CA, USA, 143-151.
- Ella Bingham and Heikki Mannila. 2001. Random projection in dimensionality reduction: applications to image and text data (https://cs-people.bu.edu/evimaria/cs565/kdd-rp.pdf). In Proceedings of the seventh ACM SIGKDD international conference on Knowledge discovery and data mining (KDD '01). ACM, New York, NY, USA, 245-250.

7.6.1. The Johnson-Lindenstrauss lemma

The main theoretical result behind the efficiency of random projection is the Johnson-Lindenstrauss lemma (quoting Wikipedia, https://en.wikipedia.org/wiki/Johnson%E2%80%93Lindenstrauss_lemma):

    In mathematics, the Johnson-Lindenstrauss lemma is a result concerning low-distortion embeddings of points from high-dimensional into low-dimensional Euclidean space. The lemma states that a small set of points in a high-dimensional space can be embedded into a space of much lower dimension in such a way that distances between the points are nearly preserved. The map used for the embedding is at least Lipschitz, and can even be taken to be an orthogonal projection.

Knowing only the number of samples, johnson_lindenstrauss_min_dim conservatively estimates the minimal size of the random subspace needed to guarantee a bounded distortion introduced by the random projection:

>>> from sklearn.random_projection import johnson_lindenstrauss_min_dim
>>> johnson_lindenstrauss_min_dim(n_samples=1e6, eps=0.5)
np.int64(663)
>>> johnson_lindenstrauss_min_dim(n_samples=1e6, eps=[0.5, 0.1, 0.01])
array([ 663, 11841, 1112658])
>>> johnson_lindenstrauss_min_dim(n_samples=[1e4, 1e5, 1e6], eps=0.1)
array([ 7894, 9868, 11841])

[Figures: the minimal number of components given by the Johnson-Lindenstrauss bound, plotted as a function of the number of samples and of eps in the example "The Johnson-Lindenstrauss bound for embedding with random projections".]
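
This bound is also what the transformers use by default: when n_components is left to 'auto', the target dimensionality is derived from the number of samples seen during fit and from the eps parameter. A minimal sketch of that relationship (the value 3947 assumes 100 samples and the default eps=0.1, the same setup as the transformer examples below):

>>> import numpy as np
>>> from sklearn.random_projection import GaussianRandomProjection
>>> from sklearn.random_projection import johnson_lindenstrauss_min_dim
>>> X = np.random.rand(100, 10000)
>>> int(johnson_lindenstrauss_min_dim(n_samples=100, eps=0.1))
3947
>>> # With n_components='auto' (the default), fitting on 100 samples picks the
>>> # same dimensionality.
>>> int(GaussianRandomProjection(eps=0.1).fit(X).n_components_)
3947
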
class="rubric">References</p> <ul class="simple"> <li><p>Sanjoy Dasgupta and Anupam Gupta, 1999. <a class="reference external" href="https://cseweb.ucsd.edu/~dasgupta/papers/jl.pdf">An elementary proof of the Johnson-Lindenstrauss Lemma.</a></p></li> </ul> </section> <section id="gaussian-random-projection"> <span id="gaussian-random-matrix"></span><h2><span class="section-number">7.6.2. </span>Gaussian random projection<a class="headerlink" href="#gaussian-random-projection" title="Link to this heading">#</a></h2> <p>The <a class="reference internal" href="generated/sklearn.random_projection.GaussianRandomProjection.html#sklearn.random_projection.GaussianRandomProjection" title="sklearn.random_projection.GaussianRandomProjection"><code class="xref py py-class docutils literal notranslate"><span class="pre">GaussianRandomProjection</span></code></a> reduces the dimensionality by projecting the original input space on a randomly generated matrix where components are drawn from the following distribution <span class="math notranslate nohighlight">\(N(0, \frac{1}{n_{components}})\)</span>.</p> <p>Here is a small excerpt which illustrates how to use the Gaussian random projection transformer:</p> <div class="highlight-default notranslate"><div class="highlight"><pre><span></span><span class="gp">>>> </span><span class="kn">import</span><span class="w"> </span><span class="nn">numpy</span><span class="w"> </span><span class="k">as</span><span class="w"> </span><span class="nn">np</span> <span class="gp">>>> </span><span class="kn">from</span><span class="w"> </span><span class="nn">sklearn</span><span class="w"> </span><span class="kn">import</span> <span class="n">random_projection</span> <span class="gp">>>> </span><span class="n">X</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">random</span><span class="o">.</span><span class="n">rand</span><span class="p">(</span><span class="mi">100</span><span class="p">,</span> <span class="mi">10000</span><span class="p">)</span> <span class="gp">>>> </span><span class="n">transformer</span> <span class="o">=</span> <span class="n">random_projection</span><span class="o">.</span><span class="n">GaussianRandomProjection</span><span class="p">()</span> <span class="gp">>>> </span><span class="n">X_new</span> <span class="o">=</span> <span class="n">transformer</span><span class="o">.</span><span class="n">fit_transform</span><span class="p">(</span><span class="n">X</span><span class="p">)</span> <span class="gp">>>> </span><span class="n">X_new</span><span class="o">.</span><span class="n">shape</span> <span class="go">(100, 3947)</span> </pre></div> </div> </section> <section id="sparse-random-projection"> <span id="sparse-random-matrix"></span><h2><span class="section-number">7.6.3. 
7.6.3. Sparse random projection

SparseRandomProjection reduces the dimensionality by projecting the original input space using a sparse random matrix.

Sparse random matrices are an alternative to dense Gaussian random projection matrices; they guarantee similar embedding quality while being much more memory efficient and allowing faster computation of the projected data.

If we define s = 1 / density, the elements of the random matrix are drawn from

\[
\left\{
\begin{array}{c c l}
-\sqrt{\frac{s}{n_{\text{components}}}} & & 1 / 2s\\
0 & \text{with probability} & 1 - 1 / s \\
+\sqrt{\frac{s}{n_{\text{components}}}} & & 1 / 2s\\
\end{array}
\right.
\]

where \(n_{\text{components}}\) is the size of the projected subspace. By default, the density of non-zero elements is set to the minimum density recommended by Ping Li et al.: \(1 / \sqrt{n_{\text{features}}}\).

Here is a small excerpt which illustrates how to use the sparse random projection transformer:

>>> import numpy as np
>>> from sklearn import random_projection
>>> X = np.random.rand(100, 10000)
>>> transformer = random_projection.SparseRandomProjection()
>>> X_new = transformer.fit_transform(X)
>>> X_new.shape
(100, 3947)
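
The effect of the density setting can be inspected on the fitted transformer: the concrete density used with density='auto' is exposed as density_, and the projection matrix itself is stored as a scipy sparse matrix in components_. A small sketch, assuming the same 100 x 10000 input as above (so 'auto' picks 3947 components):

>>> import numpy as np
>>> from sklearn.random_projection import SparseRandomProjection
>>> X = np.random.rand(100, 10000)
>>> transformer = SparseRandomProjection(random_state=0)
>>> X_new = transformer.fit_transform(X)
>>> # density='auto' (the default) uses 1 / sqrt(n_features) = 1 / 100.
>>> float(transformer.density_)
0.01
>>> # Only the non-zero entries of the projection matrix are stored.
>>> transformer.components_.shape
(3947, 10000)
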
References

- D. Achlioptas. 2003. Database-friendly random projections: Johnson-Lindenstrauss with binary coins (https://www.sciencedirect.com/science/article/pii/S0022000003000254). Journal of Computer and System Sciences 66 (2003) 671-687.
- Ping Li, Trevor J. Hastie, and Kenneth W. Church. 2006. Very sparse random projections (https://web.stanford.edu/~hastie/Papers/Ping/KDD06_rp.pdf). In Proceedings of the 12th ACM SIGKDD international conference on Knowledge discovery and data mining (KDD '06). ACM, New York, NY, USA, 287-296.

7.6.4. Inverse Transform

The random projection transformers have a compute_inverse_components parameter. When it is set to True, the transformer computes the pseudo-inverse of the random components_ matrix created during fitting and stores it as inverse_components_. The inverse_components_ matrix has shape \(n_{features} \times n_{components}\) and is always dense, regardless of whether the components matrix is sparse or dense. Depending on the number of features and components, it may therefore use a lot of memory.

When the inverse_transform method is called, it computes the product of the input X and the transpose of the inverse components. If the inverse components have been computed during fit, they are reused at each call to inverse_transform; otherwise they are recomputed each time, which can be costly. The result is always dense, even if X is sparse.

Here is a small code example which illustrates how to use the inverse transform feature:
</span><span class="p">)</span> <span class="gp">...</span> <span class="gp">>>> </span><span class="n">X_new</span> <span class="o">=</span> <span class="n">transformer</span><span class="o">.</span><span class="n">fit_transform</span><span class="p">(</span><span class="n">X</span><span class="p">)</span> <span class="gp">>>> </span><span class="n">X_new</span><span class="o">.</span><span class="n">shape</span> <span class="go">(100, 3947)</span> <span class="gp">>>> </span><span class="n">X_new_inversed</span> <span class="o">=</span> <span class="n">transformer</span><span class="o">.</span><span class="n">inverse_transform</span><span class="p">(</span><span class="n">X_new</span><span class="p">)</span> <span class="gp">>>> </span><span class="n">X_new_inversed</span><span class="o">.</span><span class="n">shape</span> <span class="go">(100, 10000)</span> <span class="gp">>>> </span><span class="n">X_new_again</span> <span class="o">=</span> <span class="n">transformer</span><span class="o">.</span><span class="n">transform</span><span class="p">(</span><span class="n">X_new_inversed</span><span class="p">)</span> <span class="gp">>>> </span><span class="n">np</span><span class="o">.</span><span class="n">allclose</span><span class="p">(</span><span class="n">X_new</span><span class="p">,</span> <span class="n">X_new_again</span><span class="p">)</span> <span class="go">True</span> </pre></div> </div> </section> </section> </article> <footer class="bd-footer-article"> <div class="footer-article-items footer-article__inner"> <div class="footer-article-item"> <div class="prev-next-area"> <a class="left-prev" href="unsupervised_reduction.html" title="previous page"> <i class="fa-solid fa-angle-left"></i> <div class="prev-next-info"> <p class="prev-next-subtitle">previous</p> <p class="prev-next-title"><span class="section-number">7.5. </span>Unsupervised dimensionality reduction</p> </div> </a> <a class="right-next" href="kernel_approximation.html" title="next page"> <div class="prev-next-info"> <p class="prev-next-subtitle">next</p> <p class="prev-next-title"><span class="section-number">7.7. </span>Kernel Approximation</p> </div> <i class="fa-solid fa-angle-right"></i> </a> </div></div> </div> </footer> </div> <dialog id="pst-secondary-sidebar-modal"></dialog> <div id="pst-secondary-sidebar" class="bd-sidebar-secondary bd-toc"><div class="sidebar-secondary-items sidebar-secondary__inner"> <div class="sidebar-secondary-item"> <div id="pst-page-navigation-heading-2" class="page-toc tocsection onthispage"> <i class="fa-solid fa-list"></i> On this page </div> <nav class="bd-toc-nav page-toc" aria-labelledby="pst-page-navigation-heading-2"> <ul class="visible nav section-nav flex-column"> <li class="toc-h2 nav-item toc-entry"><a class="reference internal nav-link" href="#the-johnson-lindenstrauss-lemma">7.6.1. The Johnson-Lindenstrauss lemma</a></li> <li class="toc-h2 nav-item toc-entry"><a class="reference internal nav-link" href="#gaussian-random-projection">7.6.2. Gaussian random projection</a></li> <li class="toc-h2 nav-item toc-entry"><a class="reference internal nav-link" href="#sparse-random-projection">7.6.3. Sparse random projection</a></li> <li class="toc-h2 nav-item toc-entry"><a class="reference internal nav-link" href="#inverse-transform">7.6.4. 