10  Output Quality

How to pick a model?

Authors
Affiliations

Morgan Schwartz

HHMI Janelia Research Campus

Diane Adjavon

Starting prompt for this chapter: Chapter 10 addresses how to assess the quality of a model’s output, mentioning Metrics Reloaded. This chapter should address the question: how do I know my model is good enough? It should frame this discussion using the example of a segmentation model and discuss how tools can identify uncertain decisions from a model.

10.1 Metrics and Losses

  • difference between “metric” and “loss”

  • loss:

    • has to be “differentiable”
    • used to train the network
    • should already be close to the metric you want to use
    • can be used to assess model quality on validation data (but a custom metric might be more insightful)
  • metric:

    • an application specific measure of how close you are to the ground truth
    • used to select a model and to measure progress
    • does not need to be “differentiable”
    • but if it is, that’s great, you can use it as a loss
    • otherwise, find a loss that is a good proxy, or choose your model based on the metric on the validation dataset
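
To make the loss/metric distinction concrete, here is a minimal NumPy sketch, not a prescribed implementation. It contrasts a soft Dice loss, which works on probabilities and is therefore smooth, with a hard Dice metric, which thresholds first. The function names, the `eps` smoothing term, and the 0.5 threshold are assumptions chosen for illustration.

```python
import numpy as np


def soft_dice_loss(probs, target, eps=1e-6):
    """Differentiable proxy: computed directly on probabilities in [0, 1]."""
    intersection = np.sum(probs * target)
    denom = np.sum(probs) + np.sum(target)
    return 1.0 - (2.0 * intersection + eps) / (denom + eps)


def dice_metric(probs, target, threshold=0.5):
    """Hard Dice: thresholding makes it piecewise constant, so it gives
    no useful gradient, but it reflects the final binary mask."""
    pred = probs > threshold
    gt = target.astype(bool)
    denom = pred.sum() + gt.sum()
    if denom == 0:
        return 1.0  # both masks empty: treat as perfect agreement
    return 2.0 * np.logical_and(pred, gt).sum() / denom


# Toy ground truth and two predictions that threshold to the same mask.
target = np.array([[0, 1, 1], [0, 1, 0]], dtype=float)
confident = np.array([[0.1, 0.9, 0.9], [0.1, 0.9, 0.1]])
hesitant = np.array([[0.4, 0.6, 0.6], [0.4, 0.6, 0.4]])

metric_confident = dice_metric(confident, target)
metric_hesitant = dice_metric(hesitant, target)
loss_confident = soft_dice_loss(confident, target)
loss_hesitant = soft_dice_loss(hesitant, target)

print(metric_confident, metric_hesitant)
print(loss_confident, loss_hesitant)
```

Both predictions get the same metric value because they produce the same mask after thresholding, while the loss still separates the confident prediction from the hesitant one. This is why a loss can keep improving during training while the metric on the validation data stays flat.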

10.2 What Metric to Pick?

  • ideally: metric reflects time/cost needed to clean up for a particular application
  • this can mean different things, and sometimes there is no single number (see Cell Tracking below)
  • Discuss trade-offs in metric choice
  • Interpretation of different metrics with guidance on how to balance/trade-off priorities
    • E.g. segmentation for counting (segmentation as object detection) vs. segmentation for size estimation
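
The counting versus size-estimation trade-off can be illustrated with a small synthetic sketch. It assumes instance label images where 0 is background and each object has a unique integer ID. The helper names `object_count` and `total_area` are hypothetical, and the two predictions are made up to isolate one kind of error each.

```python
import numpy as np


def object_count(labels):
    """Number of distinct non-background labels (background is 0)."""
    return int(np.count_nonzero(np.unique(labels)))


def total_area(labels):
    """Total number of foreground pixels."""
    return int(np.count_nonzero(labels))


# Ground truth: two touching cells of 16 pixels each.
gt = np.zeros((6, 10), dtype=int)
gt[1:5, 1:5] = 1
gt[1:5, 5:9] = 2

# Prediction A: the two cells are merged into a single object.
merged = np.where(gt > 0, 1, 0)

# Prediction B: both cells are found, but each is strongly under-segmented.
shrunk = np.zeros_like(gt)
shrunk[2:4, 2:4] = 1
shrunk[2:4, 6:8] = 2

for name, pred in [("merged", merged), ("shrunk", shrunk)]:
    count_error = abs(object_count(pred) - object_count(gt))
    area_error = abs(total_area(pred) - total_area(gt)) / total_area(gt)
    print(f"{name}: count error = {count_error}, relative area error = {area_error:.2f}")
```

The merged prediction is perfect for total area but wrong for counting, while the shrunk prediction counts correctly but badly underestimates size. Which one is "better" depends entirely on the downstream application.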

10.3 Examples

Note: We are considering picking one example to place in callout boxes throughout the chapter to facilitate an ongoing discussion around a concrete case. For the examples, we will use real data and ground truth (GT), but generate the predictions synthetically in order to better highlight specific issues.

10.3.1 Segmentation Metrics

  • show on examples:
    • Dice
    • Hausdorff
    • AP_x (average precision at IoU threshold x, and the object matching it requires)
  • mention MetricsReloaded
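
As a rough sketch of the three metrics listed above, the following NumPy code computes them on a synthetic pair of label images. Several choices here are assumptions rather than the chapter's definitions: the Hausdorff distance is a brute-force version over all foreground pixels, the matching is a simple greedy one-to-one assignment by IoU, and AP at a threshold is taken as TP / (TP + FP + FN), a convention common in cell segmentation (other communities define AP as the area under a precision-recall curve). All function names are hypothetical.

```python
import numpy as np


def dice(pred, gt):
    """Dice coefficient between two binary masks."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    denom = pred.sum() + gt.sum()
    return 1.0 if denom == 0 else 2.0 * np.logical_and(pred, gt).sum() / denom


def hausdorff(pred, gt):
    """Symmetric Hausdorff distance in pixels between foreground pixel sets.
    Brute force over all pixel pairs; assumes both masks are non-empty."""
    a, b = np.argwhere(pred), np.argwhere(gt)
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)
    return float(max(d.min(axis=1).max(), d.min(axis=0).max()))


def iou_matrix(pred_labels, gt_labels):
    """IoU between every ground truth object (rows) and predicted object (columns)."""
    gt_ids = [i for i in np.unique(gt_labels) if i != 0]
    pred_ids = [i for i in np.unique(pred_labels) if i != 0]
    iou = np.zeros((len(gt_ids), len(pred_ids)))
    for r, g in enumerate(gt_ids):
        g_mask = gt_labels == g
        for c, p in enumerate(pred_ids):
            p_mask = pred_labels == p
            union = np.logical_or(g_mask, p_mask).sum()
            iou[r, c] = np.logical_and(g_mask, p_mask).sum() / union
    return iou


def ap_at(pred_labels, gt_labels, threshold=0.5):
    """TP / (TP + FP + FN) after greedy one-to-one matching, best IoU first."""
    iou = iou_matrix(pred_labels, gt_labels)
    n_gt, n_pred = iou.shape
    pairs = sorted(
        ((iou[r, c], r, c) for r in range(n_gt) for c in range(n_pred)),
        reverse=True,
    )
    used_gt, used_pred, tp = set(), set(), 0
    for score, r, c in pairs:
        if score < threshold:
            break
        if r in used_gt or c in used_pred:
            continue
        used_gt.add(r)
        used_pred.add(c)
        tp += 1
    total = tp + (n_pred - tp) + (n_gt - tp)
    return 1.0 if total == 0 else tp / total


# Synthetic example: cell 1 is shifted by one pixel, cell 2 is under-segmented.
gt_labels = np.zeros((6, 12), dtype=int)
gt_labels[1:5, 1:5] = 1
gt_labels[1:5, 6:10] = 2
pred_labels = np.zeros_like(gt_labels)
pred_labels[1:5, 2:6] = 1
pred_labels[2:4, 7:9] = 2

print("Dice:", dice(pred_labels > 0, gt_labels > 0))
print("Hausdorff:", hausdorff(pred_labels > 0, gt_labels > 0))
for t in (0.2, 0.5, 0.75):
    print(f"AP_{t}:", ap_at(pred_labels, gt_labels, t))
```

The same prediction scores very differently depending on the IoU threshold, and the foreground Dice and Hausdorff values say nothing about whether individual objects were found. For thresholds above 0.5 a match is unambiguous; for lower thresholds an optimal assignment (for example the Hungarian algorithm) may differ from this greedy one.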

10.3.2 Cell Tracking Metrics

  • usually not a single number: topological correctness vs. positional correctness
  • show traccuracy and discuss some of their metrics
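
To illustrate why tracking quality rarely reduces to a single number, here is a toy sketch in plain Python. It does not use the traccuracy API. It assumes that predicted and ground truth nodes have already been matched and share the same keys, and the graph representation, function names, and example tracks are all hypothetical.

```python
import math


def edge_errors(pred_edges, gt_edges):
    """Topological errors: (false positive edges, false negative edges)."""
    return len(pred_edges - gt_edges), len(gt_edges - pred_edges)


def mean_position_error(pred_nodes, gt_nodes):
    """Mean Euclidean distance over nodes present in both tracking graphs."""
    shared = pred_nodes.keys() & gt_nodes.keys()
    return sum(math.dist(pred_nodes[n], gt_nodes[n]) for n in shared) / len(shared)


# Ground truth: cell "a" at t=0 divides into "b" and "c" at t=1.
# Nodes map (time, id) to a (y, x) position; edges link nodes across time.
gt_nodes = {(0, "a"): (10.0, 10.0), (1, "b"): (8.0, 10.0), (1, "c"): (12.0, 10.0)}
gt_edges = {((0, "a"), (1, "b")), ((0, "a"), (1, "c"))}

# Prediction 1: positions are exact, but the division is missed.
topo_nodes = dict(gt_nodes)
topo_edges = {((0, "a"), (1, "b"))}

# Prediction 2: the lineage is correct, but every position is offset.
shift_nodes = {n: (y + 3.0, x + 4.0) for n, (y, x) in gt_nodes.items()}
shift_edges = set(gt_edges)

for name, nodes, edges in [
    ("missed division", topo_nodes, topo_edges),
    ("shifted positions", shift_nodes, shift_edges),
]:
    fp, fn = edge_errors(edges, gt_edges)
    pos = mean_position_error(nodes, gt_nodes)
    print(f"{name}: FP edges = {fp}, FN edges = {fn}, mean position error = {pos:.1f}")
```

Each prediction is perfect on one axis and wrong on the other, so ranking them requires deciding whether lineage correctness or localization matters more for the application.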