<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
  "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">


<html xmlns="http://www.w3.org/1999/xhtml">
  <head>
    <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
    
  
    <title>Underfitting vs. Overfitting &mdash; scikit-learn 0.16.1 documentation</title>
  <!-- htmltitle is before nature.css - we use this hack to load bootstrap first -->
  <meta name="viewport" content="width=device-width, initial-scale=1.0" />
  <link rel="stylesheet" href="../_static/css/bootstrap.min.css" media="screen" />
  <link rel="stylesheet" href="../_static/css/bootstrap-responsive.css"/>

    
    <link rel="stylesheet" href="../_static/nature.css" type="text/css" />
    <link rel="stylesheet" href="../_static/pygments.css" type="text/css" />
    <link rel="stylesheet" href="../_static/gallery.css" type="text/css" />
    
    <script type="text/javascript">
      var DOCUMENTATION_OPTIONS = {
        URL_ROOT:    '../',
        VERSION:     '0.16.1',
        COLLAPSE_INDEX: false,
        FILE_SUFFIX: '.html',
        HAS_SOURCE:  true
      };
    </script>
    <script type="text/javascript" src="../_static/jquery.js"></script>
    <script type="text/javascript" src="../_static/underscore.js"></script>
    <script type="text/javascript" src="../_static/doctools.js"></script>
    <script type="text/javascript" src="../_static/js/copybutton.js"></script>
    <link rel="shortcut icon" href="../_static/favicon.ico"/>
    <link rel="author" title="About these documents" href="../about.html" />
    <link rel="top" title="scikit-learn 0.16.1 documentation" href="../index.html" />
  
   
       <script type="text/javascript" src="../_static/sidebar.js"></script>
   
  
  <meta name="viewport" content="width=device-width, initial-scale=1.0" />
  <script src="../_static/js/bootstrap.min.js" type="text/javascript"></script>
  <link rel="canonical" href="https://scikit-learn.org/stable/auto_examples/plot_underfitting_overfitting.html" />

  <script type="text/javascript">
    $("div.buttonNext, div.buttonPrevious").hover(
       function () {
           $(this).css('background-color', '#FF9C34');
       },
       function () {
           $(this).css('background-color', '#A7D6E2');
       }
    );
    var bodywrapper = $('.bodywrapper');
    var sidebarbutton = $('#sidebarbutton');
    sidebarbutton.css({'height': '900px'});
  </script>

  </head>
  <body>

<div class="header-wrapper">
    <div class="header">
        <p class="logo"><a href="../index.html">
            <img src="../_static/scikit-learn-logo-small.png" alt="Logo"/>
        </a>
        </p><div class="navbar">
            <ul>
                <li><a href="../../stable/index.html">Home</a></li>
                <li><a href="../../stable/install.html">Installation</a></li>
                <li class="btn-li"><div class="btn-group">
		      <a href="../documentation.html">Documentation</a>
		      <a class="btn dropdown-toggle" data-toggle="dropdown">
			     <span class="caret"></span>
		      </a>
		      <ul class="dropdown-menu">
			<li class="link-title">Scikit-learn 0.16 (Stable)</li>
			<li><a href="../tutorial/index.html">Tutorials</a></li>
			<li><a href="../user_guide.html">User guide</a></li>
			<li><a href="../modules/classes.html">API</a></li>
			<li><a href="../faq.html">FAQ</a></li>
			<li class="divider"></li>
		        <li><a href="http://scikit-learn.org/dev/documentation.html">Development</a></li>
			<li><a href="http://scikit-learn.org/0.15/">Scikit-learn 0.15</a></li>
		      </ul>
		    </div>
		</li>
            <li><a href="index.html">Examples</a></li>
            </ul>

            <div class="search_form">
                <div id="cse" style="width: 100%;"></div>
            </div>
        </div> <!-- end navbar --></div>
</div>


<!-- Github "fork me" ribbon -->
<a href="https://github.com/scikit-learn/scikit-learn">
  <img class="fork-me"
       style="position: absolute; top: 0; right: 0; border: 0;"
       src="../_static/img/forkme.png"
       alt="Fork me on GitHub" />
</a>

<div class="content-wrapper">
    <div class="sphinxsidebar">
    <div class="sphinxsidebarwrapper">
      <p class="doc-version">This documentation is for scikit-learn <strong>version 0.16.1</strong> &mdash; <a href="http://scikit-learn.org/stable/support.html#documentation-resources">Other versions</a></p>
    <p class="citing">If you use the software, please consider <a href="../about.html#citing-scikit-learn">citing scikit-learn</a>.</p>
    <ul>
<li><a class="reference internal" href="#">Underfitting vs. Overfitting</a></li>
</ul>

    </div>
</div>



      <div class="content">
            
      <div class="documentwrapper">
        <div class="bodywrapper">
          <div class="body">
            
  <div class="section" id="underfitting-vs-overfitting">
<span id="example-plot-underfitting-overfitting-py"></span><h1>Underfitting vs. Overfitting<a class="headerlink" href="#underfitting-vs-overfitting" title="Permalink to this headline">ΒΆ</a></h1>
<p>This example demonstrates the problems of underfitting and overfitting and
how we can use linear regression with polynomial features to approximate
nonlinear functions. The plot shows the function that we want to approximate,
which is a part of the cosine function. In addition, the samples from the
real function and the approximations of different models are displayed. The
models have polynomial features of different degrees. We can see that a
linear function (polynomial with degree 1) is not sufficient to fit the
training samples. This is called <strong>underfitting</strong>. A polynomial of degree 4
approximates the true function almost perfectly. However, for higher degrees
the model will <strong>overfit</strong> the training data, i.e. it learns the noise of the
training data.</p>
<img alt="../_images/plot_underfitting_overfitting_0011.png" class="align-center" src="../_images/plot_underfitting_overfitting_0011.png" />
<p><strong>Python source code:</strong> <a class="reference download internal" href="../_downloads/plot_underfitting_overfitting1.py"><tt class="xref download docutils literal"><span class="pre">plot_underfitting_overfitting.py</span></tt></a></p>
<div class="highlight-python"><div class="highlight"><pre><span class="k">print</span><span class="p">(</span><span class="n">__doc__</span><span class="p">)</span>

<span class="kn">import</span> <span class="nn">numpy</span> <span class="kn">as</span> <span class="nn">np</span>
<span class="kn">import</span> <span class="nn">matplotlib.pyplot</span> <span class="kn">as</span> <span class="nn">plt</span>
<span class="kn">from</span> <span class="nn">sklearn.pipeline</span> <span class="kn">import</span> <a href="../modules/generated/sklearn.pipeline.Pipeline.html#sklearn.pipeline.Pipeline"><span class="n">Pipeline</span></a>
<span class="kn">from</span> <span class="nn">sklearn.preprocessing</span> <span class="kn">import</span> <a href="../modules/generated/sklearn.preprocessing.PolynomialFeatures.html#sklearn.preprocessing.PolynomialFeatures"><span class="n">PolynomialFeatures</span></a>
<span class="kn">from</span> <span class="nn">sklearn.linear_model</span> <span class="kn">import</span> <a href="../modules/generated/sklearn.linear_model.LinearRegression.html#sklearn.linear_model.LinearRegression"><span class="n">LinearRegression</span></a>


<a href="http://docs.scipy.org/doc/numpy-1.6.0/reference/generated/numpy.random.seed.html#numpy.random.seed"><span class="n">np</span><span class="o">.</span><span class="n">random</span><span class="o">.</span><span class="n">seed</span></a><span class="p">(</span><span class="mi">0</span><span class="p">)</span>

<span class="n">n_samples</span> <span class="o">=</span> <span class="mi">30</span>
<span class="n">degrees</span> <span class="o">=</span> <span class="p">[</span><span class="mi">1</span><span class="p">,</span> <span class="mi">4</span><span class="p">,</span> <span class="mi">15</span><span class="p">]</span>

<span class="n">true_fun</span> <span class="o">=</span> <span class="k">lambda</span> <span class="n">X</span><span class="p">:</span> <a href="http://docs.scipy.org/doc/numpy-1.6.0/reference/generated/numpy.cos.html#numpy.cos"><span class="n">np</span><span class="o">.</span><span class="n">cos</span></a><span class="p">(</span><span class="mf">1.5</span> <span class="o">*</span> <span class="n">np</span><span class="o">.</span><span class="n">pi</span> <span class="o">*</span> <span class="n">X</span><span class="p">)</span>
<span class="n">X</span> <span class="o">=</span> <a href="http://docs.scipy.org/doc/numpy-1.6.0/reference/generated/numpy.sort.html#numpy.sort"><span class="n">np</span><span class="o">.</span><span class="n">sort</span></a><span class="p">(</span><a href="http://docs.scipy.org/doc/numpy-1.6.0/reference/generated/numpy.random.rand.html#numpy.random.rand"><span class="n">np</span><span class="o">.</span><span class="n">random</span><span class="o">.</span><span class="n">rand</span></a><span class="p">(</span><span class="n">n_samples</span><span class="p">))</span>
<span class="n">y</span> <span class="o">=</span> <span class="n">true_fun</span><span class="p">(</span><span class="n">X</span><span class="p">)</span> <span class="o">+</span> <span class="n">np</span><span class="o">.</span><span class="n">random</span><span class="o">.</span><span class="n">randn</span><span class="p">(</span><span class="n">n_samples</span><span class="p">)</span> <span class="o">*</span> <span class="mf">0.1</span>

<a href="http://matplotlib.org/api/figure_api.html#matplotlib.figure"><span class="n">plt</span><span class="o">.</span><span class="n">figure</span></a><span class="p">(</span><span class="n">figsize</span><span class="o">=</span><span class="p">(</span><span class="mi">14</span><span class="p">,</span> <span class="mi">4</span><span class="p">))</span>
<span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="nb">len</span><span class="p">(</span><span class="n">degrees</span><span class="p">)):</span>
    <span class="n">ax</span> <span class="o">=</span> <a href="http://matplotlib.org/api/pyplot_api.html#matplotlib.pyplot.subplot"><span class="n">plt</span><span class="o">.</span><span class="n">subplot</span></a><span class="p">(</span><span class="mi">1</span><span class="p">,</span> <span class="nb">len</span><span class="p">(</span><span class="n">degrees</span><span class="p">),</span> <span class="n">i</span><span class="o">+</span><span class="mi">1</span><span class="p">)</span>
    <a href="http://matplotlib.org/api/pyplot_api.html#matplotlib.pyplot.setp"><span class="n">plt</span><span class="o">.</span><span class="n">setp</span></a><span class="p">(</span><span class="n">ax</span><span class="p">,</span> <span class="n">xticks</span><span class="o">=</span><span class="p">(),</span> <span class="n">yticks</span><span class="o">=</span><span class="p">())</span>

    <span class="n">polynomial_features</span> <span class="o">=</span> <a href="../modules/generated/sklearn.preprocessing.PolynomialFeatures.html#sklearn.preprocessing.PolynomialFeatures"><span class="n">PolynomialFeatures</span></a><span class="p">(</span><span class="n">degree</span><span class="o">=</span><span class="n">degrees</span><span class="p">[</span><span class="n">i</span><span class="p">],</span>
                                             <span class="n">include_bias</span><span class="o">=</span><span class="bp">False</span><span class="p">)</span>
    <span class="n">linear_regression</span> <span class="o">=</span> <a href="../modules/generated/sklearn.linear_model.LinearRegression.html#sklearn.linear_model.LinearRegression"><span class="n">LinearRegression</span></a><span class="p">()</span>
    <span class="n">pipeline</span> <span class="o">=</span> <a href="../modules/generated/sklearn.pipeline.Pipeline.html#sklearn.pipeline.Pipeline"><span class="n">Pipeline</span></a><span class="p">([(</span><span class="s">&quot;polynomial_features&quot;</span><span class="p">,</span> <span class="n">polynomial_features</span><span class="p">),</span>
                         <span class="p">(</span><span class="s">&quot;linear_regression&quot;</span><span class="p">,</span> <span class="n">linear_regression</span><span class="p">)])</span>
    <span class="n">pipeline</span><span class="o">.</span><span class="n">fit</span><span class="p">(</span><span class="n">X</span><span class="p">[:,</span> <a href="http://docs.scipy.org/doc/numpy-1.6.0/reference/arrays.indexing.html#numpy.newaxis"><span class="n">np</span><span class="o">.</span><span class="n">newaxis</span></a><span class="p">],</span> <span class="n">y</span><span class="p">)</span>

    <span class="n">X_test</span> <span class="o">=</span> <a href="http://docs.scipy.org/doc/numpy-1.6.0/reference/generated/numpy.linspace.html#numpy.linspace"><span class="n">np</span><span class="o">.</span><span class="n">linspace</span></a><span class="p">(</span><span class="mi">0</span><span class="p">,</span> <span class="mi">1</span><span class="p">,</span> <span class="mi">100</span><span class="p">)</span>
    <a href="http://matplotlib.org/api/pyplot_api.html#matplotlib.pyplot.plot"><span class="n">plt</span><span class="o">.</span><span class="n">plot</span></a><span class="p">(</span><span class="n">X_test</span><span class="p">,</span> <span class="n">pipeline</span><span class="o">.</span><span class="n">predict</span><span class="p">(</span><span class="n">X_test</span><span class="p">[:,</span> <a href="http://docs.scipy.org/doc/numpy-1.6.0/reference/arrays.indexing.html#numpy.newaxis"><span class="n">np</span><span class="o">.</span><span class="n">newaxis</span></a><span class="p">]),</span> <span class="n">label</span><span class="o">=</span><span class="s">&quot;Model&quot;</span><span class="p">)</span>
    <a href="http://matplotlib.org/api/pyplot_api.html#matplotlib.pyplot.plot"><span class="n">plt</span><span class="o">.</span><span class="n">plot</span></a><span class="p">(</span><span class="n">X_test</span><span class="p">,</span> <span class="n">true_fun</span><span class="p">(</span><span class="n">X_test</span><span class="p">),</span> <span class="n">label</span><span class="o">=</span><span class="s">&quot;True function&quot;</span><span class="p">)</span>
    <a href="http://matplotlib.org/api/pyplot_api.html#matplotlib.pyplot.scatter"><span class="n">plt</span><span class="o">.</span><span class="n">scatter</span></a><span class="p">(</span><span class="n">X</span><span class="p">,</span> <span class="n">y</span><span class="p">,</span> <span class="n">label</span><span class="o">=</span><span class="s">&quot;Samples&quot;</span><span class="p">)</span>
    <a href="http://matplotlib.org/api/pyplot_api.html#matplotlib.pyplot.xlabel"><span class="n">plt</span><span class="o">.</span><span class="n">xlabel</span></a><span class="p">(</span><span class="s">&quot;x&quot;</span><span class="p">)</span>
    <a href="http://matplotlib.org/api/pyplot_api.html#matplotlib.pyplot.ylabel"><span class="n">plt</span><span class="o">.</span><span class="n">ylabel</span></a><span class="p">(</span><span class="s">&quot;y&quot;</span><span class="p">)</span>
    <a href="http://matplotlib.org/api/pyplot_api.html#matplotlib.pyplot.xlim"><span class="n">plt</span><span class="o">.</span><span class="n">xlim</span></a><span class="p">((</span><span class="mi">0</span><span class="p">,</span> <span class="mi">1</span><span class="p">))</span>
    <a href="http://matplotlib.org/api/pyplot_api.html#matplotlib.pyplot.ylim"><span class="n">plt</span><span class="o">.</span><span class="n">ylim</span></a><span class="p">((</span><span class="o">-</span><span class="mi">2</span><span class="p">,</span> <span class="mi">2</span><span class="p">))</span>
    <a href="http://matplotlib.org/api/legend_api.html#matplotlib.legend"><span class="n">plt</span><span class="o">.</span><span class="n">legend</span></a><span class="p">(</span><span class="n">loc</span><span class="o">=</span><span class="s">&quot;best&quot;</span><span class="p">)</span>
    <a href="http://matplotlib.org/api/pyplot_api.html#matplotlib.pyplot.title"><span class="n">plt</span><span class="o">.</span><span class="n">title</span></a><span class="p">(</span><span class="s">&quot;Degree </span><span class="si">%d</span><span class="s">&quot;</span> <span class="o">%</span> <span class="n">degrees</span><span class="p">[</span><span class="n">i</span><span class="p">])</span>
<a href="http://matplotlib.org/api/pyplot_api.html#matplotlib.pyplot.show"><span class="n">plt</span><span class="o">.</span><span class="n">show</span></a><span class="p">()</span>
</pre></div>
</div>
<p><strong>Total running time of the example:</strong>  0.20 seconds
( 0 minutes  0.20 seconds)</p>
</div>


          </div>
        </div>
      </div>
        <div class="clearer"></div>
      </div>
    </div>

    <div class="footer">
        &copy; 2010 - 2014, scikit-learn developers (BSD License).
      <a href="../_sources/auto_examples/plot_underfitting_overfitting.txt" rel="nofollow">Show this page source</a>
    </div>
     <div class="rel rellarge">
    
    
     </div>

    
    <script type="text/javascript">
      var _gaq = _gaq || [];
      _gaq.push(['_setAccount', 'UA-22606712-2']);
      _gaq.push(['_trackPageview']);

      (function() {
        var ga = document.createElement('script'); ga.type = 'text/javascript'; ga.async = true;
        ga.src = ('https:' == document.location.protocol ? 'https://ssl' : 'http://www') + '.google-analytics.com/ga.js';
        var s = document.getElementsByTagName('script')[0]; s.parentNode.insertBefore(ga, s);
      })();
    </script>
    

    <script src="http://www.google.com/jsapi" type="text/javascript"></script>
    <script type="text/javascript"> google.load('search', '1',
        {language : 'en'}); google.setOnLoadCallback(function() {
            var customSearchControl = new
            google.search.CustomSearchControl('016639176250731907682:tjtqbvtvij0');
            customSearchControl.setResultSetSize(google.search.Search.FILTERED_CSE_RESULTSET);
            var options = new google.search.DrawOptions();
            options.setAutoComplete(true);
            customSearchControl.draw('cse', options); }, true);
    </script>
    <script src="https://scikit-learn.org/versionwarning.js"></script>
  </body>
</html>