Technical Specification

SA TS ISO/IEC 4213:2023

[Current]

Information technology — Artificial intelligence — Assessment of machine learning classification performance

SA TS ISO/IEC 4213:2023 identically adopts ISO/IEC TS 4213:2022, which specifies methodologies for measuring classification performance of machine learning models, systems and algorithms.

Published: 16/06/2023

Pages: 33

Table of contents

Header

About this publication

Preface

Foreword

Introduction

1 Scope

2 Normative references

3 Terms and definitions

3.1 Classification and related terms

3.2 Metrics and related terms

4 Abbreviated terms

5 General principles

5.1 Generalized process for machine learning classification performance assessment

5.2 Purpose of machine learning classification performance assessment

5.3 Control criteria in machine learning classification performance assessment

5.3.1 General

5.3.2 Data representativeness and bias

5.3.3 Preprocessing

5.3.4 Training data

5.3.5 Test and validation data

5.3.6 Cross-validation

5.3.7 Limiting information leakage

5.3.8 Limiting channel effects

5.3.9 Ground truth

5.3.10 Machine learning algorithms, hyperparameters and parameters

5.3.11 Evaluation environment

5.3.12 Acceleration

5.3.13 Appropriate baselines

5.3.14 Machine learning classification performance context

6 Statistical measures of performance

6.1 General

6.2 Base elements for metric computation

6.2.1 General

6.2.2 Confusion matrix

6.2.3 Accuracy

6.2.4 Precision, recall and specificity

6.2.5 F1 score

6.2.6 Fβ

6.2.7 Kullback-Leibler divergence

6.3 Binary classification

6.3.1 General

6.3.2 Confusion matrix for binary classification

6.3.3 Accuracy for binary classification

6.3.4 Precision, recall, specificity, F1 score and Fβ for binary classification

6.3.5 Kullback-Leibler divergence for binary classification

6.3.6 Receiver operating characteristic curve and area under the receiver operating characteristic curve

6.3.7 Precision recall curve and area under the precision recall curve

6.3.8 Cumulative response curve

6.3.9 Lift curve

6.4 Multi-class classification

6.4.1 General

6.4.2 Accuracy for multi-class classification

6.4.3 Macro-average, weighted-average and micro-average

6.4.4 Distribution difference or distance metrics

6.5 Multi-label classification

6.5.1 General

6.5.2 Hamming loss

6.5.3 Exact match ratio

6.5.4 Jaccard index

6.5.5 Distribution difference or distance metrics

6.6 Computational complexity

6.6.1 General

6.6.2 Classification latency

6.6.3 Classification throughput

6.6.4 Classification efficiency

6.6.5 Energy consumption

7 Statistical tests of significance

7.1 General

7.2 Paired Student’s t-test

7.3 Analysis of variance

7.4 Kruskal-Wallis test

7.5 Chi-squared test

7.6 Wilcoxon signed-ranks test

7.7 Fisher’s exact test

7.8 Central limit theorem

7.9 McNemar test

7.10 Accommodating multiple comparisons

7.10.1 General

7.10.2 Bonferroni correction

7.10.3 False discovery rate

8 Reporting

Annex A

A.1 Progression from raw classification outputs to multi-class results

Annex B

B.1 Progression from raw binary classification output to ROC curve

Annex C

C.1 Examples of machine learning classification benchmark tests

Annex D

D.1 Calculating chance-corrected cause-specific mortality fraction accuracy

Bibliography

Cited references in this standard

ISO/IEC 22989:2022

[Current]

Information technology - Artificial intelligence - Artificial intelligence concepts and terminology

ISO/IEC 23053:2022

[Current]

Framework for Artificial Intelligence (AI) Systems Using Machine Learning (ML)

Content history

ISO/IEC TS 4213:2022

[Current]
