This repository was archived by the owner on Sep 18, 2024. It is now read-only.

Add batch normalization folding to QAT quantizer #3911

Merged
QuanluZhang merged 9 commits into microsoft:master from AlibabaPAI:bn_fold_qat on Jul 26, 2021

Conversation

@chenbohua3

Contributor

This PR adds batch normalization folding to the QAT quantizer. The core ideas are described in #3890.

Comment thread nni/algorithms/compression/pytorch/quantization/quantizers.py Outdated
  """

- def __init__(self, model, config_list, optimizer=None):
+ def __init__(self, model, config_list, optimizer=None, model_inputs=None):

Contributor

Is model_inputs the same concept as dummy_input in pruning speedup and quantization speedup? If so, I recommend using dummy_input instead of model_inputs for consistency.

Contributor Author

Done.

@linbinskn

linbinskn commented Jul 11, 2021

Contributor

Looks good. I only have one question right now. Would there be any problem if we want to export the simulated model with the new BN folding feature to a backend execution engine such as TensorRT? For instance, during inference, conv+bn+relu will be fused into a single op by updating the conv's weight/bias parameters with the bn parameters. However, our conv's weights are currently already equal to the fused weights while the bn layer still exists. If the problem actually exists, maybe we can discuss an appropriate way to resolve it.
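For readers unfamiliar with the fusion mentioned in this comment, here is a minimal, framework-free sketch of standard inference-time BN folding. It is not the code from this PR: the helper names (`fold_bn_params`, `linear`, `batch_norm_eval`) and the toy numbers are illustrative assumptions, and a small fully connected layer stands in for a conv, since the per-output-channel arithmetic is the same.

```python
import math

def fold_bn_params(weight, bias, gamma, beta, mean, var, eps=1e-5):
    """Fold per-channel BN statistics into a layer's weight and bias.

    weight is a list of rows (one row of input weights per output channel);
    every other argument is a per-output-channel list.
    """
    folded_w, folded_b = [], []
    for w_row, b, g, bt, mu, v in zip(weight, bias, gamma, beta, mean, var):
        scale = g / math.sqrt(v + eps)           # gamma / sqrt(var + eps)
        folded_w.append([w * scale for w in w_row])
        folded_b.append(bt + (b - mu) * scale)   # beta + (bias - mean) * scale
    return folded_w, folded_b

def linear(x, weight, bias):
    """y[c] = sum_i weight[c][i] * x[i] + bias[c]."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weight, bias)]

def batch_norm_eval(y, gamma, beta, mean, var, eps=1e-5):
    """BN in inference mode, using fixed running statistics."""
    return [g * (yi - mu) / math.sqrt(v + eps) + bt
            for yi, g, bt, mu, v in zip(y, gamma, beta, mean, var)]

# Toy parameters (made up for illustration only).
weight = [[0.5, -1.0], [2.0, 0.25]]
bias = [0.1, -0.2]
gamma, beta = [1.5, 0.8], [0.0, 0.3]
mean, var = [0.2, -0.1], [4.0, 0.25]
x = [1.0, 2.0]

# Layer followed by BN, versus a single layer with folded parameters.
reference = batch_norm_eval(linear(x, weight, bias), gamma, beta, mean, var)
folded_w, folded_b = fold_bn_params(weight, bias, gamma, beta, mean, var)
fused = linear(x, folded_w, folded_b)
```

The two outputs agree up to floating-point rounding. This is why a backend can drop the bn layer after fusion, and also why applying BN again on top of already folded weights would scale the output twice.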

@chenbohua3

Contributor Author

You are right. I have added logic to restore the folded weight/bias in export_model.
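One way to picture the restore step is sketched below. It assumes the unfolded parameters are stashed before folding and put back at export time. The `FoldedLayer` class, its attribute names, and the single-channel scalar setup are hypothetical and are not taken from NNI's `export_model`.

```python
import math

class FoldedLayer:
    """Hypothetical stand-in for a single-channel layer followed by BN."""

    def __init__(self, weight, bias, gamma, beta, mean, var, eps=1e-5):
        self.weight, self.bias = weight, bias
        # BN in eval mode is an affine map: y -> scale * y + shift.
        self.scale = gamma / math.sqrt(var + eps)
        self.shift = beta - mean * self.scale
        self._saved = None  # unfolded (weight, bias), kept while folded

    def fold(self):
        """Replace weight/bias with their BN-folded values (used for simulation)."""
        if self._saved is None:
            self._saved = (self.weight, self.bias)
            self.weight = self.weight * self.scale
            self.bias = self.bias * self.scale + self.shift

    def restore(self):
        """Put back the unfolded parameters so the BN layer is not applied twice."""
        if self._saved is not None:
            self.weight, self.bias = self._saved
            self._saved = None

    def export(self):
        """Export unfolded parameters; the backend can then fuse conv+bn itself."""
        self.restore()
        return {"weight": self.weight, "bias": self.bias}

layer = FoldedLayer(weight=2.0, bias=0.5, gamma=1.5, beta=0.1, mean=0.2, var=4.0)
layer.fold()
folded_weight = layer.weight
exported = layer.export()
```

After `export()`, the layer holds its original parameters again, so a backend such as TensorRT that fuses conv+bn on its own would see unfused weights plus a separate bn layer.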

Comment thread nni/compression/pytorch/compressor.py
Comment thread nni/algorithms/compression/pytorch/quantization/quantizers.py Outdated
@linbinskn

Contributor

Please add the BN folding content to the doc "Supported Quantization Algorithms on NNI".

@chenbohua3

Contributor Author

The BN folding content has been added.

@QuanluZhang QuanluZhang reopened this Jul 19, 2021
- def fold_bn(self, config, **kwargs):
-     # TODO simulate folded weight
-     pass
+ def fold_bn(self, *inputs, wrapper):

Contributor

Is this function specific to QAT_Quantizer? Might other quantizers need a different fold_bn function?

Contributor Author

This function should also work well for other quantizers (at least for the LSQ quantizer, I think :) ). I will make it a common utility function in the PR that enables batch normalization folding for other quantizers.

@QuanluZhang QuanluZhang merged commit 7fc5af0 into microsoft:master Jul 26, 2021
4 participants