<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"> <html xmlns="http://www.w3.org/1999/xhtml"> <head> <meta http-equiv="Content-Type" content="text/html; charset=utf-8" /> <title>Pipelining: chaining a PCA and a logistic regression — scikit-learn 0.16.1 documentation</title> <!-- htmltitle is before nature.css - we use this hack to load bootstrap first --> <meta name="viewport" content="width=device-width, initial-scale=1.0" /> <link rel="stylesheet" href="../_static/css/bootstrap.min.css" media="screen" /> <link rel="stylesheet" href="../_static/css/bootstrap-responsive.css"/> <link rel="stylesheet" href="../_static/nature.css" type="text/css" /> <link rel="stylesheet" href="../_static/pygments.css" type="text/css" /> <link rel="stylesheet" href="../_static/gallery.css" type="text/css" /> <script type="text/javascript"> var DOCUMENTATION_OPTIONS = { URL_ROOT: '../', VERSION: '0.16.1', COLLAPSE_INDEX: false, FILE_SUFFIX: '.html', HAS_SOURCE: true }; </script> <script type="text/javascript" src="../_static/jquery.js"></script> <script type="text/javascript" src="../_static/underscore.js"></script> <script type="text/javascript" src="../_static/doctools.js"></script> <script type="text/javascript" src="../_static/js/copybutton.js"></script> <link rel="shortcut icon" href="../_static/favicon.ico"/> <link rel="author" title="About these documents" href="../about.html" /> <link rel="top" title="scikit-learn 0.16.1 documentation" href="../index.html" /> <link rel="up" title="Examples" href="index.html" /> <link rel="next" title="Multilabel classification" href="plot_multilabel.html" /> <link rel="prev" title="Imputing missing values before building an estimator" href="missing_values.html" /> <script type="text/javascript" src="../_static/sidebar.js"></script> <meta name="viewport" content="width=device-width, initial-scale=1.0" /> <script 
src="../_static/js/bootstrap.min.js" type="text/javascript"></script> <link rel="canonical" href="https://scikit-learn.org/stable/auto_examples/plot_digits_pipe.html" /> <script type="text/javascript"> $("div.buttonNext, div.buttonPrevious").hover( function () { $(this).css('background-color', '#FF9C34'); }, function () { $(this).css('background-color', '#A7D6E2'); } ); var bodywrapper = $('.bodywrapper'); var sidebarbutton = $('#sidebarbutton'); sidebarbutton.css({'height': '900px'}); </script> </head> <body> <div class="header-wrapper"> <div class="header"> <p class="logo"><a href="../index.html"> <img src="../_static/scikit-learn-logo-small.png" alt="Logo"/> </a> </p><div class="navbar"> <ul> <li><a href="../../stable/index.html">Home</a></li> <li><a href="../../stable/install.html">Installation</a></li> <li class="btn-li"><div class="btn-group"> <a href="../documentation.html">Documentation</a> <a class="btn dropdown-toggle" data-toggle="dropdown"> <span class="caret"></span> </a> <ul class="dropdown-menu"> <li class="link-title">Scikit-learn 0.16 (Stable)</li> <li><a href="../tutorial/index.html">Tutorials</a></li> <li><a href="../user_guide.html">User guide</a></li> <li><a href="../modules/classes.html">API</a></li> <li><a href="../faq.html">FAQ</a></li> <li class="divider"></li> <li><a href="http://scikit-learn.org/dev/documentation.html">Development</a></li> <li><a href="http://scikit-learn.org/0.15/">Scikit-learn 0.15</a></li> </ul> </div> </li> <li><a href="index.html">Examples</a></li> </ul> <div class="search_form"> <div id="cse" style="width: 100%;"></div> </div> </div> <!-- end navbar --></div> </div> <!-- Github "fork me" ribbon --> <a href="https://github.com/scikit-learn/scikit-learn"> <img class="fork-me" style="position: absolute; top: 0; right: 0; border: 0;" src="../_static/img/forkme.png" alt="Fork me on GitHub" /> </a> 
<div class="content-wrapper"> <div class="sphinxsidebar"> <div class="sphinxsidebarwrapper"> <div class="rel"> <!-- rellinks[1:] is an ugly hack to avoid link to module index --> <div class="rellink"> <a href="missing_values.html" accesskey="P">Previous <br/> <span class="smallrellink"> Imputing missing... </span> <span class="hiddenrellink"> Imputing missing values before building an estimator </span> </a> </div> <div class="spacer"> </div> <div class="rellink"> <a href="plot_multilabel.html" accesskey="N">Next <br/> <span class="smallrellink"> Multilabel class... </span> <span class="hiddenrellink"> Multilabel classification </span> </a> </div> <!-- Add a link to the 'up' page --> <div class="spacer"> </div> <div class="rellink"> <a href="index.html"> Up <br/> <span class="smallrellink"> Examples </span> <span class="hiddenrellink"> Examples </span> </a> </div> </div> <p class="doc-version">This documentation is for scikit-learn <strong>version 0.16.1</strong> - <a href="http://scikit-learn.org/stable/support.html#documentation-resources">Other versions</a></p> <p class="citing">If you use the software, please consider <a href="../about.html#citing-scikit-learn">citing scikit-learn</a>.</p> <ul> <li><a class="reference internal" href="#">Pipelining: chaining a PCA and a logistic regression</a></li> </ul> </div> </div> <div class="content"> <div class="documentwrapper"> <div class="bodywrapper"> <div class="body"> <div class="section" id="pipelining-chaining-a-pca-and-a-logistic-regression"> <span id="example-plot-digits-pipe-py"></span><h1>Pipelining: chaining a PCA and a logistic regression<a class="headerlink" href="#pipelining-chaining-a-pca-and-a-logistic-regression" title="Permalink to this headline">¶</a></h1> <p>PCA performs unsupervised dimensionality reduction, while logistic regression performs the prediction.</p> <p>We use GridSearchCV to select the number of PCA components and the regularization parameter C of the logistic regression by cross-validation; the selected number of components is marked on the PCA spectrum plot.</p> <img 
alt="../_images/plot_digits_pipe_0011.png" class="align-center" src="../_images/plot_digits_pipe_0011.png" />
<p><strong>Python source code:</strong> <a class="reference download internal" href="../_downloads/plot_digits_pipe.py"><tt class="xref download docutils literal"><span class="pre">plot_digits_pipe.py</span></tt></a></p>
<div class="highlight-python"><div class="highlight"><pre><span class="k">print</span><span class="p">(</span><span class="n">__doc__</span><span class="p">)</span>

<span class="c"># Code source: Gaël Varoquaux</span>
<span class="c"># Modified for documentation by Jaques Grobler</span>
<span class="c"># License: BSD 3 clause</span>

<span class="kn">import</span> <span class="nn">numpy</span> <span class="kn">as</span> <span class="nn">np</span>
<span class="kn">import</span> <span class="nn">matplotlib.pyplot</span> <span class="kn">as</span> <span class="nn">plt</span>

<span class="kn">from</span> <span class="nn">sklearn</span> <span class="kn">import</span> <span class="n">linear_model</span><span class="p">,</span> <span class="n">decomposition</span><span class="p">,</span> <span class="n">datasets</span>
<span class="kn">from</span> <span class="nn">sklearn.pipeline</span> <span class="kn">import</span> <a href="../modules/generated/sklearn.pipeline.Pipeline.html#sklearn.pipeline.Pipeline"><span class="n">Pipeline</span></a>
<span class="kn">from</span> <span class="nn">sklearn.grid_search</span> <span class="kn">import</span> <a href="../modules/generated/sklearn.grid_search.GridSearchCV.html#sklearn.grid_search.GridSearchCV"><span class="n">GridSearchCV</span></a>

<span class="n">logistic</span> <span class="o">=</span> <span class="n">linear_model</span><span class="o">.</span><span class="n">LogisticRegression</span><span class="p">()</span>

<span class="n">pca</span> <span class="o">=</span> <a href="../modules/generated/sklearn.decomposition.PCA.html#sklearn.decomposition.PCA"><span class="n">decomposition</span><span class="o">.</span><span class="n">PCA</span></a><span class="p">()</span>
<span class="n">pipe</span> <span class="o">=</span> <a href="../modules/generated/sklearn.pipeline.Pipeline.html#sklearn.pipeline.Pipeline"><span class="n">Pipeline</span></a><span class="p">(</span><span class="n">steps</span><span class="o">=</span><span class="p">[(</span><span class="s">'pca'</span><span class="p">,</span> <span class="n">pca</span><span class="p">),</span> <span class="p">(</span><span class="s">'logistic'</span><span class="p">,</span> <span class="n">logistic</span><span class="p">)])</span>

<span class="n">digits</span> <span class="o">=</span> <a href="../modules/generated/sklearn.datasets.load_digits.html#sklearn.datasets.load_digits"><span class="n">datasets</span><span class="o">.</span><span class="n">load_digits</span></a><span class="p">()</span>
<span class="n">X_digits</span> <span class="o">=</span> <span class="n">digits</span><span class="o">.</span><span class="n">data</span>
<span class="n">y_digits</span> <span class="o">=</span> <span class="n">digits</span><span class="o">.</span><span class="n">target</span>

<span class="c">###############################################################################</span>
<span class="c"># Plot the PCA spectrum</span>
<span class="n">pca</span><span class="o">.</span><span class="n">fit</span><span class="p">(</span><span class="n">X_digits</span><span class="p">)</span>

<a href="http://matplotlib.org/api/figure_api.html#matplotlib.figure"><span class="n">plt</span><span class="o">.</span><span class="n">figure</span></a><span class="p">(</span><span class="mi">1</span><span class="p">,</span> <span class="n">figsize</span><span class="o">=</span><span class="p">(</span><span class="mi">4</span><span class="p">,</span> <span class="mi">3</span><span class="p">))</span>
<a href="http://matplotlib.org/api/pyplot_api.html#matplotlib.pyplot.clf"><span class="n">plt</span><span class="o">.</span><span class="n">clf</span></a><span class="p">()</span>
<a href="http://matplotlib.org/api/axes_api.html#matplotlib.axes"><span class="n">plt</span><span class="o">.</span><span class="n">axes</span></a><span class="p">([</span><span class="o">.</span><span class="mi">2</span><span class="p">,</span> <span class="o">.</span><span class="mi">2</span><span class="p">,</span> <span class="o">.</span><span class="mi">7</span><span class="p">,</span> <span class="o">.</span><span class="mi">7</span><span class="p">])</span>
<a href="http://matplotlib.org/api/pyplot_api.html#matplotlib.pyplot.plot"><span class="n">plt</span><span class="o">.</span><span class="n">plot</span></a><span class="p">(</span><span class="n">pca</span><span class="o">.</span><span class="n">explained_variance_</span><span class="p">,</span> <span class="n">linewidth</span><span class="o">=</span><span class="mi">2</span><span class="p">)</span>
<a href="http://matplotlib.org/api/axis_api.html#matplotlib.axis"><span class="n">plt</span><span class="o">.</span><span class="n">axis</span></a><span class="p">(</span><span class="s">'tight'</span><span class="p">)</span>
<a href="http://matplotlib.org/api/pyplot_api.html#matplotlib.pyplot.xlabel"><span class="n">plt</span><span class="o">.</span><span class="n">xlabel</span></a><span class="p">(</span><span class="s">'n_components'</span><span class="p">)</span>
<a href="http://matplotlib.org/api/pyplot_api.html#matplotlib.pyplot.ylabel"><span class="n">plt</span><span class="o">.</span><span class="n">ylabel</span></a><span class="p">(</span><span class="s">'explained_variance_'</span><span class="p">)</span>

<span class="c">###############################################################################</span>
<span class="c"># Prediction</span>

<span class="n">n_components</span> <span class="o">=</span> <span class="p">[</span><span class="mi">20</span><span class="p">,</span> <span class="mi">40</span><span class="p">,</span> <span class="mi">64</span><span class="p">]</span>
<span class="n">Cs</span> <span class="o">=</span> <a href="http://docs.scipy.org/doc/numpy-1.6.0/reference/generated/numpy.logspace.html#numpy.logspace"><span class="n">np</span><span class="o">.</span><span class="n">logspace</span></a><span class="p">(</span><span class="o">-</span><span class="mi">4</span><span class="p">,</span> <span class="mi">4</span><span class="p">,</span> <span class="mi">3</span><span class="p">)</span>

<span class="c"># Parameters of pipelines can be set using ‘__’ separated parameter names:</span>

<span class="n">estimator</span> <span class="o">=</span> <a href="../modules/generated/sklearn.grid_search.GridSearchCV.html#sklearn.grid_search.GridSearchCV"><span class="n">GridSearchCV</span></a><span class="p">(</span><span class="n">pipe</span><span class="p">,</span>
                         <span class="nb">dict</span><span class="p">(</span><span class="n">pca__n_components</span><span class="o">=</span><span class="n">n_components</span><span class="p">,</span>
                              <span class="n">logistic__C</span><span class="o">=</span><span class="n">Cs</span><span class="p">))</span>
<span class="n">estimator</span><span class="o">.</span><span class="n">fit</span><span class="p">(</span><span class="n">X_digits</span><span class="p">,</span> <span class="n">y_digits</span><span class="p">)</span>

<a href="http://matplotlib.org/api/pyplot_api.html#matplotlib.pyplot.axvline"><span class="n">plt</span><span class="o">.</span><span class="n">axvline</span></a><span class="p">(</span><span class="n">estimator</span><span class="o">.</span><span class="n">best_estimator_</span><span class="o">.</span><span class="n">named_steps</span><span class="p">[</span><span class="s">'pca'</span><span class="p">]</span><span class="o">.</span><span class="n">n_components</span><span class="p">,</span>
            <span class="n">linestyle</span><span class="o">=</span><span class="s">':'</span><span class="p">,</span> <span class="n">label</span><span class="o">=</span><span class="s">'n_components chosen'</span><span class="p">)</span>
<a href="http://matplotlib.org/api/legend_api.html#matplotlib.legend"><span class="n">plt</span><span class="o">.</span><span class="n">legend</span></a><span class="p">(</span><span class="n">prop</span><span class="o">=</span><span class="nb">dict</span><span class="p">(</span><span class="n">size</span><span class="o">=</span><span class="mi">12</span><span class="p">))</span>
<a href="http://matplotlib.org/api/pyplot_api.html#matplotlib.pyplot.show"><span class="n">plt</span><span class="o">.</span><span class="n">show</span></a><span class="p">()</span>
</pre></div> </div>
<p><strong>Total running time of the example:</strong> 8.43 seconds (0 minutes 8.43 seconds)</p>
</div> </div> </div> </div> <div class="clearer"></div> </div> </div> <div class="footer"> © 2010 - 2014, scikit-learn developers (BSD License). <a href="../_sources/auto_examples/plot_digits_pipe.txt" rel="nofollow">Show this page source</a> </div> <div class="rel"> <div class="buttonPrevious"> <a href="missing_values.html">Previous </a> </div> <div class="buttonNext"> <a href="plot_multilabel.html">Next </a> </div> </div> <script type="text/javascript"> var _gaq = _gaq || []; _gaq.push(['_setAccount', 'UA-22606712-2']); _gaq.push(['_trackPageview']); (function() { var ga = document.createElement('script'); ga.type = 'text/javascript'; ga.async = true; ga.src = ('https:' == document.location.protocol ? 
'https://ssl' : 'http://www') + '.google-analytics.com/ga.js'; var s = document.getElementsByTagName('script')[0]; s.parentNode.insertBefore(ga, s); })(); </script> <script src="http://www.google.com/jsapi" type="text/javascript"></script> <script type="text/javascript"> google.load('search', '1', {language : 'en'}); google.setOnLoadCallback(function() { var customSearchControl = new google.search.CustomSearchControl('016639176250731907682:tjtqbvtvij0'); customSearchControl.setResultSetSize(google.search.Search.FILTERED_CSE_RESULTSET); var options = new google.search.DrawOptions(); options.setAutoComplete(true); customSearchControl.draw('cse', options); }, true); </script> <script src="https://scikit-learn.org/versionwarning.js"></script> </body> </html>