
Commit be39edb

Pushing the docs to dev/ for branch: main, commit ce8f23df3de9659efe146308b0639f5bc681b244
1 parent 7169dea commit be39edb

File tree

1,540 files changed

+6174
-6014
lines changed

Binary file not shown.

dev/_downloads/10bb40e21b74618cdeed618ff1eae595/plot_det.ipynb

+5 -5
@@ -40,7 +40,7 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"## Define the classifiers\n\nHere we define two different classifiers. The goal is to visually compare their\nstatistical performance across thresholds using the ROC and DET curves. There\nis no particular reason why these classifiers are chosen other classifiers\navailable in scikit-learn.\n\n"
+"## Define the classifiers\n\nHere we define two different classifiers. The goal is to visually compare their\nstatistical performance across thresholds using the ROC and DET curves.\n\n"
 ]
 },
 {
@@ -51,14 +51,14 @@
 },
 "outputs": [],
 "source": [
-"from sklearn.ensemble import RandomForestClassifier\nfrom sklearn.pipeline import make_pipeline\nfrom sklearn.svm import LinearSVC\n\nclassifiers = {\n    \"Linear SVM\": make_pipeline(StandardScaler(), LinearSVC(C=0.025)),\n    \"Random Forest\": RandomForestClassifier(\n        max_depth=5, n_estimators=10, max_features=1\n    ),\n}"
+"from sklearn.dummy import DummyClassifier\nfrom sklearn.ensemble import RandomForestClassifier\nfrom sklearn.pipeline import make_pipeline\nfrom sklearn.svm import LinearSVC\n\nclassifiers = {\n    \"Linear SVM\": make_pipeline(StandardScaler(), LinearSVC(C=0.025)),\n    \"Random Forest\": RandomForestClassifier(\n        max_depth=5, n_estimators=10, max_features=1, random_state=0\n    ),\n    \"Non-informative baseline\": DummyClassifier(),\n}"
 ]
 },
 {
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"## Plot ROC and DET curves\n\nDET curves are commonly plotted in normal deviate scale. To achieve this the\nDET display transforms the error rates as returned by the\n:func:`~sklearn.metrics.det_curve` and the axis scale using\n`scipy.stats.norm`.\n\n"
+"## Compare ROC and DET curves\n\nDET curves are commonly plotted in normal deviate scale. To achieve this the\nDET display transforms the error rates as returned by the\n:func:`~sklearn.metrics.det_curve` and the axis scale using\n`scipy.stats.norm`.\n\n"
 ]
 },
 {
@@ -69,14 +69,14 @@
 },
 "outputs": [],
 "source": [
-"import matplotlib.pyplot as plt\n\nfrom sklearn.metrics import DetCurveDisplay, RocCurveDisplay\n\nfig, [ax_roc, ax_det] = plt.subplots(1, 2, figsize=(11, 5))\n\nfor name, clf in classifiers.items():\n    clf.fit(X_train, y_train)\n\n    RocCurveDisplay.from_estimator(clf, X_test, y_test, ax=ax_roc, name=name)\n    DetCurveDisplay.from_estimator(clf, X_test, y_test, ax=ax_det, name=name)\n\nax_roc.set_title(\"Receiver Operating Characteristic (ROC) curves\")\nax_det.set_title(\"Detection Error Tradeoff (DET) curves\")\n\nax_roc.grid(linestyle=\"--\")\nax_det.grid(linestyle=\"--\")\n\nplt.legend()\nplt.show()"
+"import matplotlib.pyplot as plt\n\nfrom sklearn.dummy import DummyClassifier\nfrom sklearn.metrics import DetCurveDisplay, RocCurveDisplay\n\nfig, [ax_roc, ax_det] = plt.subplots(1, 2, figsize=(11, 5))\n\nax_roc.set_title(\"Receiver Operating Characteristic (ROC) curves\")\nax_det.set_title(\"Detection Error Tradeoff (DET) curves\")\n\nax_roc.grid(linestyle=\"--\")\nax_det.grid(linestyle=\"--\")\n\nfor name, clf in classifiers.items():\n    (color, linestyle) = (\n        (\"black\", \"--\") if name == \"Non-informative baseline\" else (None, None)\n    )\n    clf.fit(X_train, y_train)\n    RocCurveDisplay.from_estimator(\n        clf, X_test, y_test, ax=ax_roc, name=name, color=color, linestyle=linestyle\n    )\n    DetCurveDisplay.from_estimator(\n        clf, X_test, y_test, ax=ax_det, name=name, color=color, linestyle=linestyle\n    )\n\nplt.legend()\nplt.show()"
 ]
 },
 {
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"Notice that it is easier to visually assess the overall performance of\ndifferent classification algorithms using DET curves than using ROC curves. As\nROC curves are plot in a linear scale, different classifiers usually appear\nsimilar for a large part of the plot and differ the most in the top left\ncorner of the graph. On the other hand, because DET curves represent straight\nlines in normal deviate scale, they tend to be distinguishable as a whole and\nthe area of interest spans a large part of the plot.\n\nDET curves give direct feedback of the detection error tradeoff to aid in\noperating point analysis. The user can then decide the FNR they are willing to\naccept at the expense of the FPR (or vice-versa).\n\n"
+"Notice that it is easier to visually assess the overall performance of\ndifferent classification algorithms using DET curves than using ROC curves. As\nROC curves are plot in a linear scale, different classifiers usually appear\nsimilar for a large part of the plot and differ the most in the top left\ncorner of the graph. On the other hand, because DET curves represent straight\nlines in normal deviate scale, they tend to be distinguishable as a whole and\nthe area of interest spans a large part of the plot.\n\nDET curves give direct feedback of the detection error tradeoff to aid in\noperating point analysis. The user can then decide the FNR they are willing to\naccept at the expense of the FPR (or vice-versa).\n\n## Non-informative classifier baseline for the ROC and DET curves\n\nThe diagonal black-dotted lines in the plots above correspond to a\n:class:`~sklearn.dummy.DummyClassifier` using the default \"prior\" strategy, to\nserve as baseline for comparison with other classifiers. This classifier makes\nconstant predictions, independent of the input features in `X`, making it a\nnon-informative classifier.\n\nTo further understand the non-informative baseline of the ROC and DET curves,\nwe recall the following mathematical definitions:\n\n$\\text{FPR} = \\frac{\\text{FP}}{\\text{FP} + \\text{TN}}$\n\n$\\text{FNR} = \\frac{\\text{FN}}{\\text{TP} + \\text{FN}}$\n\n$\\text{TPR} = \\frac{\\text{TP}}{\\text{TP} + \\text{FN}}$\n\nA classifier that always predict the positive class would have no true\nnegatives nor false negatives, giving $\\text{FPR} = \\text{TPR} = 1$ and\n$\\text{FNR} = 0$, i.e.:\n\n- a single point in the upper right corner of the ROC plane,\n- a single point in the lower right corner of the DET plane.\n\nSimilarly, a classifier that always predict the negative class would have no\ntrue positives nor false positives, thus $\\text{FPR} = \\text{TPR} = 0$\nand $\\text{FNR} = 1$, i.e.:\n\n- a single point in the lower left corner of the ROC plane,\n- a single point in the upper left corner of the DET plane.\n\n"
 ]
 }
 ],
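
Editorial sketch (not part of the commit): the notebook text says DET curves are drawn in normal deviate scale via `scipy.stats.norm`. The snippet below illustrates that idea with the standard library's `statistics.NormalDist().inv_cdf`, which should compute the same inverse normal CDF as `scipy.stats.norm.ppf`. The helper name `to_normal_deviate` and the clipping constant `eps` are assumptions made for illustration, not names taken from scikit-learn.

```python
from statistics import NormalDist


def to_normal_deviate(rate, eps=1e-6):
    """Map an error rate in [0, 1] to the normal deviate (probit) scale.

    The rate is clipped into (0, 1) first, because the inverse normal CDF
    is undefined at exactly 0 and 1. The clipping value is an assumption.
    """
    clipped = min(max(rate, eps), 1 - eps)
    return NormalDist().inv_cdf(clipped)


# Hypothetical error rates, chosen only to show how the scale stretches
# the region near 0 compared with a linear axis.
error_rates = [0.001, 0.01, 0.1, 0.5, 0.9]
deviates = [to_normal_deviate(r) for r in error_rates]

for rate, dev in zip(error_rates, deviates):
    print(f"rate={rate:<6} deviate={dev:+.3f}")
```

On this scale small error rates are spread far apart, which is consistent with the text's remark that DET curves stay distinguishable over a large part of the plot.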
Binary file not shown.

dev/_downloads/67703ae8c65716668dd87c31a24a069b/plot_det.py

+51 -12
@@ -60,24 +60,24 @@
 # ----------------------
 #
 # Here we define two different classifiers. The goal is to visually compare their
-# statistical performance across thresholds using the ROC and DET curves. There
-# is no particular reason why these classifiers are chosen other classifiers
-# available in scikit-learn.
+# statistical performance across thresholds using the ROC and DET curves.
 
+from sklearn.dummy import DummyClassifier
 from sklearn.ensemble import RandomForestClassifier
 from sklearn.pipeline import make_pipeline
 from sklearn.svm import LinearSVC
 
 classifiers = {
     "Linear SVM": make_pipeline(StandardScaler(), LinearSVC(C=0.025)),
     "Random Forest": RandomForestClassifier(
-        max_depth=5, n_estimators=10, max_features=1
+        max_depth=5, n_estimators=10, max_features=1, random_state=0
     ),
+    "Non-informative baseline": DummyClassifier(),
 }
 
 # %%
-# Plot ROC and DET curves
-# -----------------------
+# Compare ROC and DET curves
+# --------------------------
 #
 # DET curves are commonly plotted in normal deviate scale. To achieve this the
 # DET display transforms the error rates as returned by the
@@ -86,22 +86,29 @@
 
 import matplotlib.pyplot as plt
 
+from sklearn.dummy import DummyClassifier
 from sklearn.metrics import DetCurveDisplay, RocCurveDisplay
 
 fig, [ax_roc, ax_det] = plt.subplots(1, 2, figsize=(11, 5))
 
-for name, clf in classifiers.items():
-    clf.fit(X_train, y_train)
-
-    RocCurveDisplay.from_estimator(clf, X_test, y_test, ax=ax_roc, name=name)
-    DetCurveDisplay.from_estimator(clf, X_test, y_test, ax=ax_det, name=name)
-
 ax_roc.set_title("Receiver Operating Characteristic (ROC) curves")
 ax_det.set_title("Detection Error Tradeoff (DET) curves")
 
 ax_roc.grid(linestyle="--")
 ax_det.grid(linestyle="--")
 
+for name, clf in classifiers.items():
+    (color, linestyle) = (
+        ("black", "--") if name == "Non-informative baseline" else (None, None)
+    )
+    clf.fit(X_train, y_train)
+    RocCurveDisplay.from_estimator(
+        clf, X_test, y_test, ax=ax_roc, name=name, color=color, linestyle=linestyle
+    )
+    DetCurveDisplay.from_estimator(
+        clf, X_test, y_test, ax=ax_det, name=name, color=color, linestyle=linestyle
+    )
+
 plt.legend()
 plt.show()
 
@@ -117,3 +124,35 @@
 # DET curves give direct feedback of the detection error tradeoff to aid in
 # operating point analysis. The user can then decide the FNR they are willing to
 # accept at the expense of the FPR (or vice-versa).
+#
+# Non-informative classifier baseline for the ROC and DET curves
+# --------------------------------------------------------------
+#
+# The diagonal black-dotted lines in the plots above correspond to a
+# :class:`~sklearn.dummy.DummyClassifier` using the default "prior" strategy, to
+# serve as baseline for comparison with other classifiers. This classifier makes
+# constant predictions, independent of the input features in `X`, making it a
+# non-informative classifier.
+#
+# To further understand the non-informative baseline of the ROC and DET curves,
+# we recall the following mathematical definitions:
+#
+# :math:`\text{FPR} = \frac{\text{FP}}{\text{FP} + \text{TN}}`
+#
+# :math:`\text{FNR} = \frac{\text{FN}}{\text{TP} + \text{FN}}`
+#
+# :math:`\text{TPR} = \frac{\text{TP}}{\text{TP} + \text{FN}}`
+#
+# A classifier that always predict the positive class would have no true
+# negatives nor false negatives, giving :math:`\text{FPR} = \text{TPR} = 1` and
+# :math:`\text{FNR} = 0`, i.e.:
+#
+# - a single point in the upper right corner of the ROC plane,
+# - a single point in the lower right corner of the DET plane.
+#
+# Similarly, a classifier that always predict the negative class would have no
+# true positives nor false positives, thus :math:`\text{FPR} = \text{TPR} = 0`
+# and :math:`\text{FNR} = 1`, i.e.:
+#
+# - a single point in the lower left corner of the ROC plane,
+# - a single point in the upper left corner of the DET plane.
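
Editorial sketch (not part of the commit): the FPR, FNR and TPR definitions added above can be checked by hand for the two constant classifiers the text describes. The helper `rates` and the class counts `n_pos` and `n_neg` are hypothetical and chosen only for illustration.

```python
def rates(tp, fp, tn, fn):
    """Return (FPR, FNR, TPR) from confusion-matrix counts."""
    fpr = fp / (fp + tn)
    fnr = fn / (tp + fn)
    tpr = tp / (tp + fn)
    return fpr, fnr, tpr


# Hypothetical test set: 40 positive and 60 negative samples.
n_pos, n_neg = 40, 60

# Always predicting the positive class: no true negatives, no false negatives.
always_positive = rates(tp=n_pos, fp=n_neg, tn=0, fn=0)

# Always predicting the negative class: no true positives, no false positives.
always_negative = rates(tp=0, fp=0, tn=n_neg, fn=n_pos)

print("always positive (FPR, FNR, TPR):", always_positive)  # (1.0, 0.0, 1.0)
print("always negative (FPR, FNR, TPR):", always_negative)  # (0.0, 1.0, 0.0)
```

With FPR on the horizontal axis, the first result corresponds to the upper right corner of the ROC plane (vertical axis TPR) and the lower right corner of the DET plane (vertical axis FNR). The second corresponds to the lower left and upper left corners respectively, matching the bullet lists in the added text.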
Binary file not shown.

dev/_downloads/scikit-learn-docs.zip

25 KB
Binary file not shown.
29 Bytes
28 Bytes
