TensorFlow Lite 8-bit Quantization Spec

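The TensorFlow Lite quantization spec is adjusted slightly over time. This note records the data types and value ranges that quantization currently uses, along with how compatibility with historical models is handled.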

The key points of the TFLite int8 quantization described above can be summarized as follows:

Historically, by contrast, asymmetric per-tensor quantization was represented with uint8 (range [0, 255]).
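The relationship between the historical uint8 representation and an int8 one can be illustrated with a small sketch. It assumes the usual affine mapping real = scale * (q - zero_point); the helper names `dequantize` and `uint8_to_int8` and the sample scale and zero point are hypothetical and are not TFLite APIs. Shifting both the stored values and the zero point by 128 moves a uint8 tensor into the int8 range without changing the real values it represents.

```python
def dequantize(q, scale, zero_point):
    """Affine dequantization: real = scale * (q - zero_point)."""
    return [scale * (v - zero_point) for v in q]


def uint8_to_int8(q_uint8, zero_point_uint8):
    """Shift uint8 values and their zero point by 128 into the int8 range."""
    q_int8 = [v - 128 for v in q_uint8]
    return q_int8, zero_point_uint8 - 128


# Hypothetical sample parameters, chosen only for illustration.
scale = 0.5
zp_u8 = 100
q_u8 = [0, 100, 128, 255]      # covers both ends of the uint8 range

q_i8, zp_i8 = uint8_to_int8(q_u8, zp_u8)

# Both representations decode to the same real values.
real_u8 = dequantize(q_u8, scale, zp_u8)
real_i8 = dequantize(q_i8, scale, zp_i8)
print(q_i8, zp_i8)
print(real_u8 == real_i8)
```

Because the shift cancels inside (q - zero_point), the scale is left untouched; only the integer storage type changes.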

The newer supporting tools and kernel implementations (including the reference and optimized kernels in TFLite) are all based on the spec definition above.

Therefore,

References

TensorFlow Lite 8-bit quantization spec https://tensorflow.google.cn/lite/performance/quantization_spec
Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference https://arxiv.org/pdf/1712.05877.pdf