Technical Specification

SA TS ISO/IEC 4213:2023

[Current]

Information technology — Artificial intelligence — Assessment of machine learning classification performance

SA TS ISO/IEC 4213:2023 identically adopts ISO/IEC TS 4213:2022, which specifies methodologies for measuring classification performance of machine learning models, systems and algorithms.
Published: 16/06/2023
Pages: 33
Table of contents
Header
About this publication
Preface
Foreword
Introduction
1 Scope
2 Normative references
3 Terms and definitions
3.1 Classification and related terms
3.2 Metrics and related terms
4 Abbreviated terms
5 General principles
5.1 Generalized process for machine learning classification performance assessment
5.2 Purpose of machine learning classification performance assessment
5.3 Control criteria in machine learning classification performance assessment
5.3.1 General
5.3.2 Data representativeness and bias
5.3.3 Preprocessing
5.3.4 Training data
5.3.5 Test and validation data
5.3.6 Cross-validation
5.3.7 Limiting information leakage
5.3.8 Limiting channel effects
5.3.9 Ground truth
5.3.10 Machine learning algorithms, hyperparameters and parameters
5.3.11 Evaluation environment
5.3.12 Acceleration
5.3.13 Appropriate baselines
5.3.14 Machine learning classification performance context
6 Statistical measures of performance
6.1 General
6.2 Base elements for metric computation
6.2.1 General
6.2.2 Confusion matrix
6.2.3 Accuracy
6.2.4 Precision, recall and specificity
6.2.5 F1 score
6.2.6 Fβ
6.2.7 Kullback-Leibler divergence
6.3 Binary classification
6.3.1 General
6.3.2 Confusion matrix for binary classification
6.3.3 Accuracy for binary classification
6.3.4 Precision, recall, specificity, F1 score and Fβ for binary classification
6.3.5 Kullback-Leibler divergence for binary classification
6.3.6 Receiver operating characteristic curve and area under the receiver operating characteristic curve
6.3.7 Precision recall curve and area under the precision recall curve
6.3.8 Cumulative response curve
6.3.9 Lift curve
6.4 Multi-class classification
6.4.1 General
6.4.2 Accuracy for multi-class classification
6.4.3 Macro-average, weighted-average and micro-average
6.4.4 Distribution difference or distance metrics
6.5 Multi-label classification
6.5.1 General
6.5.2 Hamming loss
6.5.3 Exact match ratio
6.5.4 Jaccard index
6.5.5 Distribution difference or distance metrics
6.6 Computational complexity
6.6.1 General
6.6.2 Classification latency
6.6.3 Classification throughput
6.6.4 Classification efficiency
6.6.5 Energy consumption
7 Statistical tests of significance
7.1 General
7.2 Paired Student’s t-test
7.3 Analysis of variance
7.4 Kruskal-Wallis test
7.5 Chi-squared test
7.6 Wilcoxon signed-ranks test
7.7 Fisher’s exact test
7.8 Central limit theorem
7.9 McNemar test
7.10 Accommodating multiple comparisons
7.10.1 General
7.10.2 Bonferroni correction
7.10.3 False discovery rate
8 Reporting
Annex A
A.1 Progression from raw classification outputs to multi-class results
Annex B
B.1 Progression from raw binary classification output to ROC curve
Annex C
C.1 Examples of machine learning classification benchmark tests
Annex D
D.1 Calculating chance-corrected cause-specific mortality fraction accuracy
Bibliography
Cited references in this standard
[Current]
Information technology - Artificial intelligence - Artificial intelligence concepts and terminology
[Current]
Framework for Artificial Intelligence (AI) Systems Using Machine Learning (ML)
