Intel Neural Compressor Quantization with SigOpt

Luis Bermudez
AI at Scale, Hyperparameter Optimization, Intelligent Experimentation

We are pleased to share that Intel Neural Compressor (INC) now offers an easy-to-use integration with SigOpt.

Intel Neural Compressor (formerly known as the Intel Low Precision Optimization Tool) is an open-source Python library that runs on Intel CPUs and GPUs. It delivers unified interfaces across multiple deep learning frameworks for popular network compression techniques such as quantization, pruning, and knowledge distillation. The tool supports automatic, accuracy-driven tuning strategies that help users quickly find the best quantized model. It also implements several weight pruning algorithms to generate pruned models that meet a predefined sparsity goal, and it supports knowledge distillation to transfer knowledge from a teacher model to a student model.
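As a toy illustration of what quantization does (not INC's actual implementation, which is far more involved), the affine mapping from FP32 values to int8 can be sketched as:

```python
# Toy affine (asymmetric) quantization of FP32 values to int8.
# Illustration only; INC's real quantizers handle per-channel scales,
# calibration, operator fusion, and much more.

def quantize(values, num_bits=8):
    """Map a list of floats to signed ints via a scale and zero-point."""
    qmin, qmax = -(2 ** (num_bits - 1)), 2 ** (num_bits - 1) - 1
    lo, hi = min(values), max(values)
    scale = (hi - lo) / (qmax - qmin) or 1.0  # avoid zero scale for constant inputs
    zero_point = round(qmin - lo / scale)
    q = [max(qmin, min(qmax, round(v / scale) + zero_point)) for v in values]
    return q, scale, zero_point

def dequantize(q, scale, zero_point):
    """Recover approximate floats from the quantized integers."""
    return [(qi - zero_point) * scale for qi in q]

vals = [-1.0, 0.0, 0.5, 2.0]
q, scale, zp = quantize(vals)
approx = dequantize(q, scale, zp)
# Round-trip error is at most about one quantization step (the scale).
```

Shrinking weights and activations to 8 bits this way is what buys the latency gains that the tuning strategies then trade off against accuracy.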

SigOpt is an Intelligent Experimentation platform that accelerates model development. In this case, SigOpt increases the performance gains from INC quantization.


Before using the SigOpt strategy, sign up for or log in to your SigOpt account.

  • Each account has its own API token. Find your API token and set it in the configuration item sigopt_api_token.
  • Create a new project and set its name in the configuration item sigopt_project_id.
  • Set a name for this experiment in the configuration item sigopt_experiment_name. The default is nc-tune.

SigOpt Optimization Setup

After logging in, you can use your API token to connect your local code to the online platform. The token corresponds to the configuration item sigopt_api_token and can be found on your SigOpt account page.

Beyond the optimization loop itself, SigOpt has two important concepts: project and experiment. Create a project before experimenting; it corresponds to sigopt_project_id, and the experiment to sigopt_experiment_name. Each project can contain multiple experiments. After creating an experiment, run through three simple steps in a loop:

  • Receive a suggestion from SigOpt
  • Evaluate your metric
  • Report an observation to SigOpt
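The three steps above can be sketched as the loop below. To keep the example self-contained and runnable offline, SigOpt's suggest/observe cycle is mocked with local random sampling; with the real service, suggestions and observations would go through the `sigopt` client's API instead.

```python
import random

# Offline mock of SigOpt's suggest -> evaluate -> observe loop.
# In production the suggestions come from SigOpt's optimizer over the API;
# here random search stands in so the example runs anywhere.

random.seed(0)
search_space = {"lr": (1e-4, 1e-1), "batch_size": (16, 256)}

def receive_suggestion():
    """Step 1: receive a suggestion (mocked as random sampling)."""
    return {name: random.uniform(lo, hi) for name, (lo, hi) in search_space.items()}

def evaluate_metric(params):
    """Step 2: evaluate your metric (a stand-in objective to maximize)."""
    return -(params["lr"] - 0.01) ** 2 - (params["batch_size"] - 64) ** 2 / 1e4

observations = []
for _ in range(10):
    suggestion = receive_suggestion()
    value = evaluate_metric(suggestion)
    # Step 3: report an observation (mocked as appending to a local list).
    observations.append({"assignments": suggestion, "value": value})

best = max(observations, key=lambda o: o["value"])
```

The strategy repeats this cycle until its exit policy is met, keeping the best-performing assignments seen so far.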

With INC’s SigOpt strategy, accuracy is added as a metric constraint while the optimization targets latency.
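Constrained selection of this kind can be illustrated with a small sketch: among hypothetical candidate configurations (the names and numbers below are made up), pick the fastest one whose accuracy stays within a relative tolerance of the FP32 baseline, mirroring the `relative: 0.01` accuracy criterion.

```python
# Selecting a configuration under a metric constraint: minimize latency
# subject to accuracy staying within 1% (relative) of the FP32 baseline.
# Candidate results are hypothetical, for illustration only.

baseline_accuracy = 0.750
candidates = [
    {"name": "cfg-a", "accuracy": 0.748, "latency_ms": 12.0},
    {"name": "cfg-b", "accuracy": 0.730, "latency_ms": 9.0},   # fast but too inaccurate
    {"name": "cfg-c", "accuracy": 0.745, "latency_ms": 10.5},
]

def meets_constraint(acc, baseline, relative=0.01):
    """Accuracy may drop at most a `relative` fraction below the baseline."""
    return (baseline - acc) / baseline <= relative

feasible = [c for c in candidates if meets_constraint(c["accuracy"], baseline_accuracy)]
best = min(feasible, key=lambda c: c["latency_ms"])
# best is cfg-c: the fastest config that still satisfies the accuracy constraint.
```

Note that cfg-b, although fastest overall, is excluded because it violates the accuracy constraint; this is exactly the trade-off the strategy navigates.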

Neural Compressor Configuration

The following INC configuration will help get you started quickly. Note that the sigopt_api_token is necessary to use the SigOpt strategy, whereas the Basic strategy does not need the API token. Also, be sure to create the corresponding project name sigopt_project_id in your SigOpt account before using the SigOpt strategy.

    tuning:
      strategy:
        name: sigopt
        sigopt_api_token: YOUR-ACCOUNT-API-TOKEN
        sigopt_project_id: PROJECT-ID
        sigopt_experiment_name: nc-tune
      accuracy_criterion:
        relative: 0.01
      exit_policy:
        timeout: 0
      random_seed: 9527
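With the configuration saved to a file (say, conf.yaml), quantization can be driven from Python. The sketch below follows INC's v1 `experimental` interface; the config and model paths are placeholders, and the import is deferred so the snippet can be read without INC installed.

```python
def run_quantization(config_path="conf.yaml", model_path="model.pb"):
    """Run accuracy-driven quantization with the YAML config above.

    Sketch only: `conf.yaml` and `model.pb` are placeholder paths.
    """
    # Deferred import; requires `pip install neural-compressor`.
    from neural_compressor.experimental import Quantization

    quantizer = Quantizer = Quantization(config_path)  # loads the sigopt strategy from YAML
    quantizer.model = model_path                       # framework model to quantize
    return quantizer.fit()                             # best int8 model found by the strategy
```

Calling `run_quantization()` starts the tuning loop; with the SigOpt strategy, each tuning trial is reported to the configured SigOpt experiment.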

Performance Benefits

SigOpt's metric constraints make it easy to define your own metrics and search for desirable outcomes. The results of each experiment are also recorded in your account, so you can use SigOpt's data analysis features to explore them, for example by drawing charts or computing an F1 score.

With INC’s integrated SigOpt strategy, you can achieve faster compression while maintaining accuracy. The following results show how SigOpt sped up quantization for MobileNet_v1 and ResNet50_v1 with TensorFlow, whereas the basic strategy results in longer compression times.

MobileNet_v1 TensorFlow

Strategy | FP32 Baseline | int8 Accuracy | int8 Duration (s)

ResNet50_v1 TensorFlow

Strategy | FP32 Baseline | int8 Accuracy | int8 Duration (s)

If you’d like to dive deeper into how SigOpt works with Intel Neural Compressor, then read on. If you want to get started addressing similar problems in your workflow, use SigOpt free today by signing up at

Luis Bermudez, AI Developer Advocate
