How each detection is classified. Adjust thresholds to watch TP / FP / FN shift in real time.
Confidence threshold: 0.50
IoU threshold: 0.50
TP: IoU ≥ thr & conf ≥ thr
FP: conf ≥ thr but IoU < thr
FN: GT exists but not detected
TN: Not defined in detection
Precision = TP / (TP + FP)
Recall = TP / (TP + FN)
F1 = 2 × P × R / (P + R)
Key rules
• Multiple detections on same GT → only the 1st match (highest confidence) is TP, rest are FP
• Unmatched GTs count as FN
• No background class → TN is undefined in detection
• This is why mAP is used instead of Accuracy
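The rules above can be sketched in a few lines of Python. This is a minimal illustration, not the page's own implementation: the `Detection` record, the `classify` and `precision_recall_f1` helpers, and the idea that each detection already carries the index and IoU of its best-overlapping GT box are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Detection:
    conf: float            # model confidence score
    gt_id: Optional[int]   # index of the best-overlapping GT box, None if no overlap
    iou: float             # IoU with that GT box (0.0 if gt_id is None)


def classify(detections, n_gt, conf_thr=0.5, iou_thr=0.5):
    """Return (tp, fp, fn) for one class, following the key rules above."""
    claimed = set()  # GT boxes already matched by a higher-confidence detection
    tp = fp = 0
    kept = [d for d in detections if d.conf >= conf_thr]
    for d in sorted(kept, key=lambda d: d.conf, reverse=True):
        if d.gt_id is not None and d.iou >= iou_thr and d.gt_id not in claimed:
            claimed.add(d.gt_id)   # first match on this GT -> TP
            tp += 1
        else:
            fp += 1                # low IoU, duplicate, or no GT at all -> FP
    fn = n_gt - len(claimed)       # unmatched GTs count as FN
    return tp, fp, fn


def precision_recall_f1(tp, fp, fn):
    """Precision, recall and F1, returning 0.0 where a denominator is zero."""
    p = tp / (tp + fp) if tp + fp else 0.0
    r = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f1


# Made-up sample: one TP, one duplicate, one IoU miss, one below conf threshold.
sample = [
    Detection(0.9, 0, 0.8),
    Detection(0.8, 0, 0.7),
    Detection(0.7, 1, 0.3),
    Detection(0.3, 2, 0.9),
]
counts = classify(sample, n_gt=3)   # (1, 2, 2)
```

Note that no TN is ever counted: there is no finite set of "background boxes" to tally, which is why Accuracy cannot be computed here.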
Each chip = one detection sample img · conf=X · IoU=Y
Step 03
Precision–Recall Curve — Live Sample Walkthrough
[Interactive panel: confidence threshold 0.50 · IoU threshold 0.50 · live readouts for AP, Precision, Recall, TP / FP / FN]
All detections, sorted by confidence ↓ (table columns: # · Image · Conf · IoU · Verdict · P · R)
Samples above threshold — classified as:
TP: IoU ≥ thr & conf ≥ thr
FP: conf ≥ thr but IoU < thr
FN: GT not matched by any detection
Step 03 — How it works
Building the PR Curve step by step
We lower the confidence threshold one detection at a time. Each new detection adds one (Recall, Precision) point to the curve. Press Play or step through manually.
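The accumulation described above can be sketched as follows. It is an illustrative sketch under stated assumptions: `pr_points` is a hypothetical helper, and the TP/FP verdicts are assumed to be already computed and sorted by confidence, highest first.

```python
def pr_points(verdicts, n_gt):
    """Trace (recall, precision) points, one per detection.

    verdicts: list of booleans (True = TP, False = FP), sorted by
              confidence descending.
    n_gt:     total number of ground-truth boxes for this class.
    """
    points = []
    cum_tp = cum_fp = 0
    for is_tp in verdicts:
        if is_tp:
            cum_tp += 1
        else:
            cum_fp += 1
        recall = cum_tp / n_gt
        precision = cum_tp / (cum_tp + cum_fp)
        points.append((recall, precision))
    return points


# Made-up sequence: TP, FP, TP with 4 ground-truth boxes.
points = pr_points([True, False, True], n_gt=4)
```

Each step through the list corresponds to lowering the confidence threshold just enough to admit one more detection.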
[Interactive panel: step counter (0 / 20) · current detection · Cum TP · Cum FP · Precision · Recall · table columns: # · Conf · V · P · R]
Step 03 — Intuition
Why the PR Curve looks like stairs
Three key patterns explain the characteristic shape of every PR Curve — and what they tell you about the model.
① TP hit → step right (Recall ↑)
When a new detection is a TP, the cumulative TP count goes up. Recall = TP / GT_total, so Recall increases and the curve steps right. Precision cannot drop on a TP: it rises, or stays at 1 if there are no FPs yet.
Recall = TP↑ / GT_total → moves right
② FP hit → step down (Precision ↓)
When a new detection is a FP, TP stays the same but TP+FP grows. Precision = TP / (TP+FP) drops. The curve steps down.
Precision = TP / (TP + FP↑) → moves down
③ High-conf detections are added first
Because we sort by confidence descending, the model's most certain predictions come first — these tend to be correct, so the curve starts high-precision. It sags as we add lower-confidence (noisier) detections.
Sort: conf 0.97 → 0.93 → … → 0.06
Takeaway: A curve that stays top-right means the model finds many TPs early (high precision) and continues finding them (high recall). AP is the area under this curve — bigger = better.
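To make "area under this curve" concrete, here is a small sketch that integrates a staircase PR curve. It assumes one common convention (all-point interpolation: precision is first replaced by its running maximum from the right, then rectangles are summed wherever recall changes). The page does not say which interpolation its demo uses, so `average_precision` is an illustrative helper rather than the page's method.

```python
def average_precision(recalls, precisions):
    """Area under a PR curve given points sorted by confidence descending.

    Assumption: all-point interpolation with a monotone precision envelope.
    """
    # Pad so the curve starts at recall 0 and ends at recall 1.
    r = [0.0] + list(recalls) + [1.0]
    p = [0.0] + list(precisions) + [0.0]

    # Envelope: make precision non-increasing as recall grows.
    for i in range(len(p) - 2, -1, -1):
        p[i] = max(p[i], p[i + 1])

    # Sum rectangle areas; steps with no recall change contribute zero.
    return sum((r[i + 1] - r[i]) * p[i + 1] for i in range(len(r) - 1))


# Made-up curve: TP, FP, TP with 4 ground-truth boxes.
ap = average_precision([0.25, 0.25, 0.5], [1.0, 0.5, 2 / 3])
```

In this made-up example only half of the ground truths are ever found, so the area stops accumulating at recall 0.5 and AP stays well below 1 even though early precision is perfect.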
Step 05
PASCAL VOC — 20-class mAP
Compute per-class AP, then take the arithmetic mean. The values shown are sample YOLOv3-style per-class APs on the VOC 2007 test set.
mAP = (1/20) × Σ AP_c
mAP@0.5 : VOC standard
mAP@0.5:0.95 : COCO standard
[Live readouts: mAP (avg) · best class · worst class · # classes: 20]
VOC 20 classes
aeroplane · bicycle · bird · boat · bottle · bus · car · cat · chair · cow · diningtable · dog · horse · motorbike · person · pottedplant · sheep · sofa · train · tvmonitor
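The averaging step is simple enough to show directly. In the sketch below, `mean_average_precision` is a hypothetical helper and the three AP values are made-up placeholders, not the YOLOv3 figures from the page; a real VOC run would supply all 20 classes.

```python
def mean_average_precision(ap_per_class):
    """Arithmetic mean of per-class AP: mAP = (1/C) * sum(AP_c)."""
    if not ap_per_class:
        raise ValueError("need at least one class")
    return sum(ap_per_class.values()) / len(ap_per_class)


# Placeholder values for illustration only.
ap_per_class = {"cat": 0.9, "dog": 0.8, "chair": 0.4}

m_ap = mean_average_precision(ap_per_class)
best_class = max(ap_per_class, key=ap_per_class.get)
worst_class = min(ap_per_class, key=ap_per_class.get)
```

Because the mean is unweighted, a rare class with poor AP pulls mAP down just as much as a frequent one.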
Summary
Key Concepts at a Glance
The full 5-step pipeline on one slide.
01
IoU
Ratio of intersection to union area between two boxes. PASCAL VOC uses 0.5 as the threshold to decide TP vs FP.
IoU = |A ∩ B| / |A ∪ B|
02
TP / FP / FN
IoU ≥ thr → TP. Second match on same GT → FP. Undetected GT → FN. TN is undefined — no background class.
Precision = TP/(TP+FP)
Recall = TP/(TP+FN)
03 → 04
PR Curve & AP
Sort by confidence descending, accumulate (P, R) per row to trace the PR Curve. Area under that curve = AP.
AP = ∫₀¹ P(r) dr
05
mAP
Arithmetic mean of per-class AP over all C classes (C = 20 for VOC). VOC uses mAP@0.5; COCO uses mAP@0.5:0.95 (10 IoU levels averaged).
mAP = (1/C) × Σ AP_c
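Everything in the pipeline rests on the IoU definition from card 01, so a short sketch may help. It assumes axis-aligned boxes given as `(x1, y1, x2, y2)` in continuous coordinates with `x1 < x2` and `y1 < y2`; the `iou` function name and this box layout are choices made for the example, and some evaluation toolkits use a slightly different pixel convention.

```python
def iou(box_a, box_b):
    """IoU = |A ∩ B| / |A ∪ B| for two (x1, y1, x2, y2) boxes."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Width and height of the overlap rectangle, clamped at zero.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h

    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - inter

    return inter / union if union > 0 else 0.0


# Two 2x2 boxes overlapping in a 1x1 square: 1 / (4 + 4 - 1).
overlap = iou((0, 0, 2, 2), (1, 1, 3, 3))
```

With the VOC threshold of 0.5, this pair (IoU of about 0.14) would be judged an FP despite the visible overlap.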
FP Deep Dive
The 3 Types of False Positives
Any detection labelled Positive that does not count as TP falls into one of three categories.
Case 1 — IoU Miss
Box was predicted but barely overlaps GT. The location is simply wrong.
IoU < threshold → FP
Case 2 — Duplicate Detection
IoU is fine, but another prediction already claimed this GT. NMS is meant to suppress these duplicates before evaluation.
1st match → TP, rest → FP
Case 3 — Hallucination
A box drawn in empty background with no GT nearby. A model hallucination.
No GT in region → FP
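The three cases can be told apart with two pieces of information per detection: its best IoU against any GT box, and whether that GT was already claimed by a higher-confidence detection. The sketch below is illustrative only. `detection_verdict` is a hypothetical helper, and treating "no GT nearby" as zero overlap with every GT is an assumption; the page does not define a numeric cutoff for Case 3.

```python
def detection_verdict(best_iou, gt_already_claimed, iou_thr=0.5):
    """Label one detection as 'tp' or one of the three FP types.

    best_iou:           highest IoU between this detection and any GT box.
    gt_already_claimed: True if that GT was matched by an earlier,
                        higher-confidence detection.
    """
    if best_iou >= iou_thr:
        # Good localisation: TP unless the GT is already taken (Case 2).
        return "duplicate" if gt_already_claimed else "tp"
    if best_iou > 0.0:
        # Some overlap, but below threshold (Case 1).
        return "iou_miss"
    # Assumption: zero overlap with every GT stands in for "no GT nearby" (Case 3).
    return "hallucination"
```

For example, `detection_verdict(0.7, True)` returns `"duplicate"`, while `detection_verdict(0.2, False)` returns `"iou_miss"`.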
Dataset Comparison
PASCAL VOC vs MS COCO
Both are called "mAP" — but they measure different things. Understanding the gap is essential when comparing published results.
PASCAL VOC
MS COCO
VOC: mAP@0.5 · Single IoU → 1 AP value · Relatively lenient evaluation