Field Notebook · Vol. I · Iridaceae

A small machine,
trained to recognise
three irises.

Four measurements — sepal length, sepal width, petal length, petal width. Four classical algorithms, each trained on Fisher's 1936 dataset of 150 specimens. Adjust the dials, watch every model agree (or quietly disagree) on the species.

Specimens · 150
Features · 4
Models · 4
Species · 3
Pressed plants in an open botanical book
Plate · 1936

“The use of multiple measurements in taxonomic problems.”

— R. A. Fisher

§ I · Examination

Measure a specimen.

Adjust the dials below — every value is in centimetres. The four trained models will each cast a vote.

Sepal length · 4 to 8 cm
Sepal width · 2 to 4.5 cm
Petal length · 1 to 7 cm
Petal width · 0.1 to 2.5 cm
Awaiting Specimen
“The plate remains blank until measurements are entered.”
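
The page does not say how the four votes are tallied. As a minimal sketch, assuming a simple plurality count, the tally could look like the following. The predictions are hypothetical values for a single specimen, and the name `tally_votes` is illustrative rather than taken from the site.

```python
from collections import Counter


def tally_votes(votes):
    """Summarise a dict of model name -> predicted species.

    Returns (winning species, votes for it, whether all models agree).
    Plurality voting is an assumption; the page does not state its rule.
    """
    counts = Counter(votes.values())
    # most_common(1) returns the species with the highest count;
    # ties resolve to whichever species was seen first.
    winner, n = counts.most_common(1)[0]
    return winner, n, len(counts) == 1


# Hypothetical predictions for one specimen (not output from the real models).
votes = {
    "Logistic Regression": "Iris versicolor",
    "K-Nearest Neighbors": "Iris versicolor",
    "Support Vector Machine": "Iris virginica",
    "Decision Tree": "Iris versicolor",
}

winner, n, unanimous = tally_votes(votes)
print(f"{winner}: {n} of {len(votes)} votes, unanimous: {unanimous}")
```

A disagreement like the one above is the "quiet" kind: three models agree and one dissents, so the plurality still names a species.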
§ II · The Specimens

150 flowers, plotted.

Each dot is a real iris from Fisher's herbarium. Choose any two measurements to see how the species cluster — or overlap.

Bivariate Scatter

Plot any two features

  • Iris setosa
  • Iris versicolor
  • Iris virginica
x: Petal Length (cm) · y: Petal Width (cm)
Distribution

Sepal Length

x: Sepal Length (cm) · y: count
Distribution

Sepal Width

x: Sepal Width (cm) · y: count
Distribution

Petal Length

x: Petal Length (cm) · y: count
Distribution

Petal Width

x: Petal Width (cm) · y: count
§ III · The Classifiers

Four models. One verdict.

Each algorithm was trained on 75% of the dataset and evaluated on the held-out 25%. Precision, recall and F1 are macro-averaged across the three species: each is computed per species, then the three values are averaged with equal weight.
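
To make the macro-averaging concrete, here is a small sketch that derives all four metrics from a confusion matrix. The matrix is hypothetical, since the page does not publish one. It was chosen because it reproduces the figures reported for Logistic Regression (acc 92.1, prec 92.5, rec 92.3, f1 92.3) on an assumed 38-specimen test set. The test-set size is itself an inference: 92.11% and 94.74% are what 35 and 36 correct answers out of 38 would give.

```python
def macro_metrics(cm):
    """Compute accuracy and macro-averaged precision, recall and F1.

    cm[i][j] is the number of specimens of true class i predicted as class j.
    """
    k = len(cm)
    total = sum(sum(row) for row in cm)
    correct = sum(cm[i][i] for i in range(k))

    precisions, recalls, f1s = [], [], []
    for i in range(k):
        tp = cm[i][i]
        predicted = sum(cm[r][i] for r in range(k))  # column sum
        actual = sum(cm[i])                          # row sum
        p = tp / predicted if predicted else 0.0
        r = tp / actual if actual else 0.0
        f = 2 * p * r / (p + r) if (p + r) else 0.0
        precisions.append(p)
        recalls.append(r)
        f1s.append(f)

    # Macro average: every species counts equally, whatever its sample size.
    return {
        "accuracy": correct / total,
        "precision": sum(precisions) / k,
        "recall": sum(recalls) / k,
        "f1": sum(f1s) / k,
    }


# Hypothetical confusion matrix (rows: true setosa, versicolor, virginica).
# Not taken from the page; it merely matches the reported percentages.
cm = [
    [12, 0, 0],
    [0, 12, 1],
    [0, 2, 11],
]

metrics = macro_metrics(cm)
for name, value in metrics.items():
    print(f"{name}: {value:.1%}")
```

In this sketch setosa is classified perfectly and all three errors fall between versicolor and virginica, the two species that overlap in the scatter plot.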

Accuracy comparison

Best · Support Vector Machine · 94.7%
Bar chart · accuracy, 0% to 100% · Logistic Regression, K-Nearest Neighbors, Support Vector Machine, Decision Tree

Logistic Regression

92.11%

A linear probabilistic classifier — the lab notebook baseline.

acc 92.1 · prec 92.5 · rec 92.3 · f1 92.3

K-Nearest Neighbors

92.11%

Classifies by majority vote of the 5 closest specimens.

acc 92.1 · prec 93.8 · rec 92.3 · f1 92.2
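
The five-neighbour vote can be sketched in a few lines of plain Python. This assumes Euclidean distance, which the page does not state. The reference specimens are made-up toy values, not rows from Fisher's dataset, and `knn_predict` is an illustrative name.

```python
import math
from collections import Counter


def knn_predict(train, query, k=5):
    """Label a query by majority vote among its k nearest training specimens.

    train: list of (features, label) pairs; query: a feature tuple.
    Euclidean distance is an assumption, not confirmed by the page.
    """
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Toy specimens: (sepal length, sepal width, petal length, petal width) in cm.
# Invented for illustration only.
train = [
    ((5.0, 3.4, 1.5, 0.2), "Iris setosa"),
    ((4.8, 3.1, 1.4, 0.2), "Iris setosa"),
    ((5.2, 3.6, 1.5, 0.3), "Iris setosa"),
    ((6.0, 2.8, 4.3, 1.3), "Iris versicolor"),
    ((5.7, 2.7, 4.1, 1.2), "Iris versicolor"),
    ((6.2, 2.9, 4.5, 1.4), "Iris versicolor"),
    ((6.6, 3.0, 5.6, 2.1), "Iris virginica"),
    ((6.8, 3.1, 5.8, 2.2), "Iris virginica"),
    ((6.4, 2.9, 5.4, 1.9), "Iris virginica"),
]

prediction = knn_predict(train, (6.1, 2.8, 4.4, 1.3))
print(prediction)
```

With this toy data the five nearest neighbours of the query are three versicolor and two virginica specimens, so the vote goes to versicolor.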

Support Vector Machine

94.74%

Finds the maximum-margin hyperplane (RBF kernel).

acc 94.7 · prec 94.9 · rec 94.9 · f1 94.9
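
With an RBF kernel, the margin is not drawn in the four measured dimensions directly. The model compares specimens through a similarity score that decays with squared distance. A minimal sketch of that kernel function follows. The value of `gamma` is an arbitrary illustrative choice, not the setting used by the page's model, and the two specimens are made-up values.

```python
import math


def rbf_kernel(x, y, gamma=0.5):
    """RBF similarity: exp(-gamma * squared Euclidean distance between x and y).

    gamma=0.5 is an arbitrary value chosen for illustration.
    """
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


# Made-up specimens: (sepal length, sepal width, petal length, petal width) in cm.
small_flower = (5.0, 3.4, 1.5, 0.2)
large_flower = (6.6, 3.0, 5.6, 2.1)

same = rbf_kernel(small_flower, small_flower)  # identical points
far = rbf_kernel(small_flower, large_flower)   # distant points
print(same, far)
```

Identical specimens score exactly 1, and the score falls toward 0 as specimens move apart, which is what lets the classifier bend its boundary around clusters.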

Decision Tree

92.11%

Recursive feature splits — interpretable like a field key.

acc 92.1 · prec 92.5 · rec 92.3 · f1 92.3
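
The field-key comparison can be made literal: a fitted tree reads as a short chain of yes/no questions. The sketch below is hand-written to show that shape. Its thresholds and its choice of features are illustrative assumptions, since the page does not show the splits of the trained tree.

```python
def field_key(petal_length, petal_width):
    """Identify a species by two successive splits, like a dichotomous key.

    Measurements are in cm. The thresholds are hypothetical, picked for
    illustration; they are not the trained model's splits.
    """
    # Couplet 1: petals short -> setosa
    if petal_length < 2.5:
        return "Iris setosa"
    # Couplet 2: petals narrow -> versicolor, otherwise virginica
    if petal_width < 1.7:
        return "Iris versicolor"
    return "Iris virginica"


print(field_key(1.4, 0.2))
print(field_key(4.3, 1.3))
print(field_key(5.6, 2.1))
```

Each `if` corresponds to one internal node of a tree, and each `return` to a leaf, which is why a tree's decisions can be audited by reading them top to bottom.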
Iris Lab

A small, well-lit place for studying Fisher's 1936 dataset. Built with scikit-learn and a great deal of care.

Models
  • Logistic Regression
  • K-Nearest Neighbors
  • Support Vector Machine
  • Decision Tree
Reference

Fisher, R. A. (1936). The use of multiple measurements in taxonomic problems. Annals of Eugenics, 7(2), 179–188.

iris-lab · vol. i · 2026 · made for the curious