Classical Neural Network Module

The following classical neural network modules support automatic backpropagation. After running the forward function, you can calculate the gradients by calling the backward function. A simple example with a convolution layer is as follows:

from pyvqnet.tensor import arange
from pyvqnet import kfloat32
from pyvqnet.nn import Conv2D

# feed an image batch into a two-dimensional convolution layer
b = 2        # batch size
ic = 2       # input channels
oc = 2       # output channels
hw = 4       # input height and width

# two-dimensional convolution layer
test_conv = Conv2D(ic,oc,(2,2),(2,2),"same")

# input of shape [b,ic,hw,hw]
x0 = arange(1,b*ic*hw*hw+1,requires_grad=True,dtype=kfloat32).reshape([b,ic,hw,hw])

# forward function
x = test_conv(x0)

# backward function with autograd
x.backward()
print(x0.grad)

Module Class

abstract calculation module


class pyvqnet.nn.module.Module

Base class for all neural network modules, both quantum and classical. Your models should also subclass this class so that autograd can be applied to them.

Modules can also contain other Modules, allowing you to nest them in a tree structure. You can assign the submodules as regular attributes:

import pyvqnet
from pyvqnet.nn import Module

class Model(Module):
    def __init__(self):
        super(Model, self).__init__()
        self.conv1 = pyvqnet.nn.Conv2D(1, 20, (5,5))
        self.conv2 = pyvqnet.nn.Conv2D(20, 20, (5,5))
        self.relu = pyvqnet.nn.ReLu()

    def forward(self, x):
        x = self.relu(self.conv1(x))
        return self.relu(self.conv2(x))

Submodules assigned in this way will be registered.


pyvqnet.nn.module.Module.forward(x, *args, **kwargs)

Abstract method which performs forward pass.

  • x – input QTensor

  • *args – A non-keyword variable parameter

  • **kwargs – A keyword variable parameter


module output


import numpy as np
from pyvqnet.tensor import QTensor
import pyvqnet as vq
from pyvqnet.nn import Conv2D
b = 2
ic = 3
oc = 2
test_conv = Conv2D(ic, oc, (3, 3), (2, 2), "same")
x0 = QTensor(np.arange(1, b * ic * 5 * 5 + 1).reshape([b, ic, 5, 5]),
             requires_grad=True, dtype=vq.kfloat32)
x = test_conv.forward(x0)

pyvqnet.nn.module.Module.state_dict(destination=None, prefix='')

Return a dictionary containing a whole state of the module.

Both parameters and persistent buffers (e.g. running averages) are included. Keys are corresponding parameter and buffer names.

  • destination – a dict where state will be stored

  • prefix – the prefix for parameters and buffers used in this module


a dictionary containing a whole state of the module


from pyvqnet.nn import Conv2D
test_conv = Conv2D(2,3,(3,3),(2,2),"same")
print(test_conv.state_dict().keys())
#odict_keys(['weights', 'bias'])


pyvqnet.nn.module.Module.toGPU(device: int = DEV_GPU_0)

Move the parameters and buffer data of a module and its submodules to the specified GPU device.

device specifies the device on which the internal data is stored. When device >= DEV_GPU_0, the data is stored on the GPU. If your computer has multiple GPUs, you can specify different devices to store data. For example, device = DEV_GPU_1, DEV_GPU_2, DEV_GPU_3, … means the data is stored on GPUs with different serial numbers.


A module cannot be computed across different GPUs. A CUDA error will be raised if you try to create a QTensor on a GPU whose ID exceeds the maximum number of verified GPUs.


device – The device on which the QTensor data is stored, default=DEV_GPU_0. device = pyvqnet.DEV_GPU_0 stores the data on the first GPU, device = DEV_GPU_1 on the second GPU, and so on.


Module moved to GPU device.


from pyvqnet.nn.conv import ConvT2D
test_conv = ConvT2D(3, 2, [4,4], [2, 2], "same")
test_conv = test_conv.toGPU()



pyvqnet.nn.module.Module.toCPU()

Move the parameters and buffer data of a module and its submodules to the CPU device.


Module moved to CPU device.


from pyvqnet.nn.conv import ConvT2D
test_conv = ConvT2D(3, 2, [4,4], [2, 2], "same")
test_conv = test_conv.toCPU()

pyvqnet.utils.storage.save_parameters(obj, f)

Saves model parameters to a disk file.

  • obj – saved OrderedDict from state_dict()

  • f – a string or os.PathLike object containing a file name




from pyvqnet.nn import Module,Conv2D
import pyvqnet
class Net(Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = Conv2D(input_channels=1, output_channels=6, kernel_size=(5, 5), stride=(1, 1), padding="valid")

    def forward(self, x):
        return super().forward(x)

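model = Net()
# save_parameters is assumed to live in pyvqnet.utils.storage
pyvqnet.utils.storage.save_parameters(model.state_dict(),"tmp.model")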


pyvqnet.utils.storage.load_parameters(f)

Loads model parameters from a disk file.

The model instance should be created first.


f – a string or os.PathLike object containing a file name


saved OrderedDict for load_state_dict()


from pyvqnet.nn import Module,Conv2D
import pyvqnet

class Net(Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = Conv2D(input_channels=1, output_channels=6, kernel_size=(5, 5), stride=(1, 1), padding="valid")

    def forward(self, x):
        return super().forward(x)

model = Net()
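model1 = Net()  # another Module object
# save_parameters and load_parameters are assumed to live in pyvqnet.utils.storage
pyvqnet.utils.storage.save_parameters(model.state_dict(),"tmp.model")
model_para = pyvqnet.utils.storage.load_parameters("tmp.model")
model1.load_state_dict(model_para)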


class pyvqnet.nn.module.ModuleList([pyvqnet.nn.module.Module])

Stores submodules in a list. A ModuleList can be indexed like a normal Python list, and the internal parameters of the Modules it contains can be saved.


modules – list of nn.Modules


a list of modules


from pyvqnet.tensor import *
from pyvqnet.nn import Module,Linear,ModuleList
from pyvqnet.qnn import ProbsMeasure,QuantumLayer
import pyqpanda as pq
def pqctest (input,param,qubits,cubits,m_machine):
    circuit = pq.QCircuit()





    prog = pq.QProg()

    rlt_prob = ProbsMeasure([0,2],prog,m_machine,qubits)
    return rlt_prob

class M(Module):
    def __init__(self):
        super(M, self).__init__()
        self.pqc2 = ModuleList([QuantumLayer(pqctest,3,"cpu",4,1), Linear(4,1)])

    def forward(self, x, *args, **kwargs):
        y = self.pqc2[0](x)  + self.pqc2[1](x)
        return y

mm = M()
print(mm.state_dict().keys())
#odict_keys(['pqc2.0.m_para', 'pqc2.1.weights', 'pqc2.1.bias'])


class pyvqnet.nn.module.ParameterList([pyvqnet.nn.module.Module])

Stores parameters in a list. A ParameterList can be indexed like a normal Python list, and the Parameters it contains can be saved.


modules – list of nn.Parameter objects.


a Parameter list.


from pyvqnet import nn
class MyModule(nn.Module):
    def __init__(self):
        super(MyModule, self).__init__()
        self.params = nn.ParameterList([nn.Parameter((10, 10)) for i in range(10)])
    def forward(self, x):

        # ParameterList can act as an iterable, or be indexed using ints
        for i, p in enumerate(self.params):
            x = self.params[i // 2] * x + p * x
        return x

model = MyModule()


class pyvqnet.nn.module.Sequential([pyvqnet.nn.module.Module])

Modules will be added in the order they are passed in. Alternatively, an OrderedDict of modules can be passed in. The forward() method of Sequential takes any input and forwards it to its first module. It then chains the output of each module to the input of the next one in turn, and finally returns the output of the last module.


modules – module to append.




from pyvqnet import nn
from collections import OrderedDict

# Using Sequential to create a small model.
model = nn.Sequential(
          nn.Conv2D(1,20,(5, 5)),
          nn.ReLu(),
          nn.Conv2D(20,64,(5, 5)),
          nn.ReLu()
        )

# Using Sequential with OrderedDict. This is functionally the same as the above code

model = nn.Sequential(OrderedDict([
          ('conv1', nn.Conv2D(1,20,(5, 5))),
          ('relu1', nn.ReLu()),
          ('conv2', nn.Conv2D(20,64,(5, 5))),
          ('relu2', nn.ReLu())
        ]))
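
The chaining behaviour of Sequential can be illustrated without VQNet. The following is a minimal pure-Python sketch; TinySequential is a hypothetical stand-in written for this illustration and is not part of pyvqnet:

```python
from collections import OrderedDict


class TinySequential:
    """Hypothetical stand-in that mimics how Sequential chains callables."""

    def __init__(self, *layers):
        # Accept either positional layers or a single OrderedDict of named layers.
        if len(layers) == 1 and isinstance(layers[0], OrderedDict):
            self.layers = list(layers[0].values())
        else:
            self.layers = list(layers)

    def forward(self, x):
        # Feed the output of each layer into the next one, in insertion order.
        for layer in self.layers:
            x = layer(x)
        return x


def double(v):
    return 2 * v


def add_one(v):
    return v + 1


positional = TinySequential(double, add_one)
named = TinySequential(OrderedDict([("double", double), ("add_one", add_one)]))

print(positional.forward(3))  # (3 * 2) + 1 = 7
print(named.forward(3))       # same layers in the same order, so also 7
```

Both construction styles hold the same layers in the same order, which is why they produce the same result.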

Classical Neural Network Layer


class pyvqnet.nn.Conv1D(input_channels: int, output_channels: int, kernel_size: int, stride: int = 1, padding='valid', use_bias: str = True, kernel_initializer=None, bias_initializer=None, dilation_rate: int = 1, group: int = 1, dtype=None, name='')

Apply a 1-dimensional convolution kernel over an input. Inputs to the conv module are of shape (batch_size, input_channels, height).

  • input_channels – int - Number of input channels

  • output_channels – int - Number of kernels

  • kernel_size – int - Size of a single kernel. kernel shape = [output_channels,input_channels/group,kernel_size,1]

  • stride – int - Stride, defaults to 1

  • padding – str|int - padding option, which can be a string {‘valid’, ‘same’} or an integer giving the amount of implicit padding to apply. Default “valid”.

  • use_bias – bool - whether to use a bias, defaults to True

  • kernel_initializer – callable - Defaults to None

  • bias_initializer – callable - Defaults to None

  • dilation_rate – int - dilation rate, defaults to 1

  • group – int - number of groups of grouped convolutions. Default: 1

  • dtype – The data type of the parameter, defaults: None, use the default data type kfloat32, which represents a 32-bit floating point number.

  • name – The name of the module, default: “”.


a Conv1D class


padding='valid' is the same as no padding.

padding='same' pads the input so that the output has the same shape as the input when the stride is 1.


import numpy as np
from pyvqnet.tensor import QTensor
from pyvqnet.nn import Conv1D
import pyvqnet
b= 2
ic =3
oc = 2
test_conv = Conv1D(ic,oc,3,2,"same")
x0 = QTensor(np.arange(1,b*ic*5*5 +1).reshape([b,ic,25]),requires_grad=True,dtype=pyvqnet.kfloat32)
x = test_conv.forward(x0)

class pyvqnet.nn.Conv2D(input_channels: int, output_channels: int, kernel_size: tuple, stride: tuple = (1, 1), padding='valid', use_bias=True, kernel_initializer=None, bias_initializer=None, dilation_rate: int = 1, group: int = 1, dtype=None, name='')

Apply a two-dimensional convolution kernel over an input. Inputs to the conv module are of shape (batch_size, input_channels, height, width).

  • input_channels – int - Number of input channels

  • output_channels – int - Number of kernels

  • kernel_size – tuple|list - Size of a single kernel. kernel shape = [output_channels,input_channels/group,kernel_size,kernel_size]

  • stride – tuple|list - Stride, defaults to (1, 1)|[1,1]

  • padding – str|tuple - padding option, which can be a string {‘valid’, ‘same’} or a tuple of integers giving the amount of implicit padding to apply on both sides. Default “valid”.

  • use_bias – bool - whether to use a bias, defaults to True

  • kernel_initializer – callable - Defaults to None

  • bias_initializer – callable - Defaults to None

  • dilation_rate – int - dilation rate, defaults to 1

  • group – int - number of groups of grouped convolutions. Default: 1.

  • dtype – The data type of the parameter, defaults: None, use the default data type kfloat32, which represents a 32-bit floating point number.

  • name – The name of the module, default: “”.


a Conv2D class


padding='valid' is the same as no padding.

padding='same' pads the input so that the output has the same shape as the input when the stride is 1.


import numpy as np
from pyvqnet.tensor import QTensor
from pyvqnet.nn import Conv2D
import pyvqnet
b= 2
ic =3
oc = 2
test_conv = Conv2D(ic,oc,(3,3),(2,2),"same")
x0 = QTensor(np.arange(1,b*ic*5*5+1).reshape([b,ic,5,5]),requires_grad=True,dtype=pyvqnet.kfloat32)
x = test_conv.forward(x0)
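
The effect of kernel size, stride and padding on the output size can be estimated with the usual convolution arithmetic. The sketch below is pure Python and rests on two assumptions: dilation is 1, and "same" follows the common convention of producing ceil(n / stride) outputs. The helper conv_output_size is hypothetical, and VQNet's own padding rule may differ in detail:

```python
import math


def conv_output_size(n, kernel, stride, padding):
    """Output length along one spatial dimension (dilation 1 assumed)."""
    if padding == "valid":
        # No padding: count the positions where the kernel fits entirely.
        return (n - kernel) // stride + 1
    if padding == "same":
        # Assumed convention: pad just enough to get ceil(n / stride) outputs.
        return math.ceil(n / stride)
    # Integer padding applied on both sides of the input.
    return (n + 2 * padding - kernel) // stride + 1


print(conv_output_size(5, 3, 2, "valid"))  # 2
print(conv_output_size(5, 3, 2, "same"))   # 3
print(conv_output_size(5, 3, 1, "same"))   # 5, unchanged because stride is 1
print(conv_output_size(5, 3, 2, 1))        # 3
```

For a 2D layer the same function is applied to height and width separately.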

class pyvqnet.nn.ConvT2D(input_channels, output_channels, kernel_size, stride=[1, 1], padding='valid', use_bias='True', kernel_initializer=None, bias_initializer=None, dilation_rate: int = 1, group: int = 1, dtype=None, name='')

Apply a two-dimensional transposed convolution kernel over an input. Inputs to the convT module are of shape (batch_size, input_channels, height, width).

  • input_channels – int - Number of input channels

  • output_channels – int - Number of kernels

  • kernel_size – tuple|list - Size of a single kernel. kernel shape = [input_channels,output_channels/group,kernel_size,kernel_size]

  • stride – tuple|list - Stride, defaults to (1, 1)|[1,1]

  • padding – str|tuple - padding option, which can be a string {‘valid’, ‘same’} or a tuple of integers giving the amount of implicit padding to apply on both sides. Default “valid”.

  • use_bias – bool - whether to use a bias term, defaults to True

  • kernel_initializer – callable - Defaults to None

  • bias_initializer – callable - Defaults to None

  • dilation_rate – int - dilation rate, defaults to 1

  • group – int - number of groups of grouped convolutions. Default: 1.

  • dtype – The data type of the parameter, defaults: None, use the default data type kfloat32, which represents a 32-bit floating point number.

  • name – The name of the module, default: “”.


a ConvT2D class


padding='valid' is the same as no padding.

padding='same' pads the input so that the output has the same shape as the input when the stride is 1.


import numpy as np
from pyvqnet.tensor import QTensor
from pyvqnet.nn import ConvT2D
import pyvqnet
test_conv = ConvT2D(3, 2, (3, 3), (1, 1), "valid")
x = QTensor(np.arange(1, 1 * 3 * 5 * 5+1).reshape([1, 3, 5, 5]), requires_grad=True,dtype=pyvqnet.kfloat32)
y = test_conv.forward(x)

class pyvqnet.nn.AvgPool1D(kernel, stride, padding='valid', name='')

This operation applies a 1D average pooling over an input signal composed of several input planes.

  • kernel – size of the average pooling window

  • stride – factor by which to downscale

  • padding – one of “valid”, “same”, or an integer specifying the padding value; defaults to “valid”

  • name – name of the output layer.


AvgPool1D layer


padding='valid' is the same as no padding.

padding='same' pads the input so that the output has the same shape as the input when the stride is 1.


import numpy as np
from pyvqnet.tensor import QTensor
from pyvqnet.nn import AvgPool1D
test_mp = AvgPool1D([3],[2],"same")
x= QTensor(np.array([0, 1, 0, 4, 5,
                            2, 3, 2, 1, 3,
                            4, 4, 0, 4, 3,
                            2, 5, 2, 6, 4,
                            1, 0, 0, 5, 7],dtype=float).reshape([1,5,5]),requires_grad=True)

y= test_mp.forward(x)
# [
# [[0.3333333, 1.6666666, 3],
#  [1.6666666, 2, 1.3333334],
#  [2.6666667, 2.6666667, 2.3333333],
#  [2.3333333, 4.3333335, 3.3333333],
#  [0.3333333, 1.6666666, 4]]
# ]
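
The printed values can be reproduced by hand. The pure-Python sketch below assumes that "same" padding adds zeros symmetrically and that the padded zeros are counted in the divisor. This is inferred from the numbers shown above rather than from the implementation, and avg_pool1d_same is a hypothetical helper:

```python
import math


def avg_pool1d_same(row, kernel, stride):
    """1D average pooling with zero 'same' padding over a single row."""
    out_len = math.ceil(len(row) / stride)
    # Total zeros needed so that out_len windows fit over the padded row.
    total_pad = max((out_len - 1) * stride + kernel - len(row), 0)
    left = total_pad // 2
    padded = [0.0] * left + [float(v) for v in row] + [0.0] * (total_pad - left)
    # Each window is divided by the full kernel size, padding included.
    return [sum(padded[i * stride:i * stride + kernel]) / kernel
            for i in range(out_len)]


first_row = avg_pool1d_same([0, 1, 0, 4, 5], 3, 2)
last_row = avg_pool1d_same([1, 0, 0, 5, 7], 3, 2)
print(first_row)  # approximately [0.3333, 1.6667, 3.0]
print(last_row)   # approximately [0.3333, 1.6667, 4.0]
```

With a stride of 2 the five input values become three outputs, so the output is shorter than the input even with "same" padding.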


class pyvqnet.nn.MaxPool1D(kernel, stride, padding='valid', dtype=None, name='')

This operation applies a 1D max pooling over an input signal composed of several input planes.

  • kernel – size of the max pooling window

  • stride – factor by which to downscale

  • padding – one of “valid”, “same”, or an integer specifying the padding value; defaults to “valid”

  • name – The name of the module, default: “”.


MaxPool1D layer


padding='valid' is the same as no padding.

padding='same' pads the input so that the output has the same shape as the input when the stride is 1.


import numpy as np
from pyvqnet.tensor import QTensor
from pyvqnet.nn import MaxPool1D
test_mp = MaxPool1D([3],[2],"same")
x= QTensor(np.array([0, 1, 0, 4, 5,
                            2, 3, 2, 1, 3,
                            4, 4, 0, 4, 3,
                            2, 5, 2, 6, 4,
                            1, 0, 0, 5, 7],dtype=float).reshape([1,5,5]),requires_grad=True)

y= test_mp.forward(x)

class pyvqnet.nn.AvgPool2D(kernel, stride, padding='valid', name='')

This operation applies 2D average pooling over input features.

  • kernel – size of the average pooling window

  • stride – factors by which to downscale

  • padding – one of “valid”, “same”, or a tuple of integers specifying the padding for columns and rows; defaults to “valid”

  • name – name of the output layer


AvgPool2D layer


padding='valid' is the same as no padding.

padding='same' pads the input so that the output has the same shape as the input when the stride is 1.


import numpy as np
from pyvqnet.tensor import QTensor
from pyvqnet.nn import AvgPool2D
test_mp = AvgPool2D([2,2],[2,2],"valid")
x= QTensor(np.array([0, 1, 0, 4, 5,
                            2, 3, 2, 1, 3,
                            4, 4, 0, 4, 3,
                            2, 5, 2, 6, 4,
                            1, 0, 0, 5, 7],dtype=float).reshape([1,1,5,5]),requires_grad=True)

y= test_mp.forward(x)

class pyvqnet.nn.MaxPool2D(kernel, stride, padding='valid', name='')

This operation applies 2D max pooling over input features.

  • kernel – size of the max pooling window

  • stride – factor by which to downscale

  • padding – one of “valid”, “same”, or a tuple of integers specifying the padding for columns and rows; defaults to “valid”

  • name – name of the output layer


MaxPool2D layer


padding='valid' is the same as no padding.

padding='same' pads the input so that the output has the same shape as the input when the stride is 1.


import numpy as np
from pyvqnet.tensor import QTensor
from pyvqnet.nn import MaxPool2D
test_mp = MaxPool2D([2,2],[2,2],"valid")
x= QTensor(np.array([0, 1, 0, 4, 5,
                            2, 3, 2, 1, 3,
                            4, 4, 0, 4, 3,
                            2, 5, 2, 6, 4,
                            1, 0, 0, 5, 7],dtype=float).reshape([1,1,5,5]),requires_grad=True)

y= test_mp.forward(x)

class pyvqnet.nn.embedding.Embedding(num_embeddings, embedding_dim, weight_initializer=<function xavier_normal>, dtype=None, name: str = '')

This module is often used to store word embeddings and retrieve them using indices. The input to the module is a list of indices, and the output is the corresponding word embeddings.

  • num_embeddings – int - size of the dictionary of embeddings.

  • embedding_dim – int - the size of each embedding vector.

  • weight_initializer – callable - defaults to xavier_normal.

  • dtype – The data type of the parameter, defaults: None, use the default data type kfloat32, which represents a 32-bit floating point number.

  • name – name of the output layer.


an Embedding class


import numpy as np
from pyvqnet.tensor import QTensor
from pyvqnet.nn.embedding import Embedding
import pyvqnet
vlayer = Embedding(30,3)
x = QTensor(np.arange(1,25).reshape([2,3,2,2]),dtype= pyvqnet.kint64)
y = vlayer(x)
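
Conceptually, an embedding layer is a table lookup: each integer index is replaced by one row of a weight table. The pure-Python sketch below uses a randomly filled table; the names weights and embed are hypothetical and only illustrate the idea, not VQNet's implementation:

```python
import random

random.seed(0)
num_embeddings, embedding_dim = 30, 3

# Hypothetical weight table: one row of length embedding_dim per index.
weights = [[random.gauss(0.0, 1.0) for _ in range(embedding_dim)]
           for _ in range(num_embeddings)]


def embed(indices):
    """Recursively replace every integer index with its embedding row."""
    if isinstance(indices, int):
        return weights[indices]
    return [embed(item) for item in indices]


batch = [[1, 2], [3, 4]]
out = embed(batch)

# The output keeps the index shape and appends one axis of size embedding_dim.
print(len(out), len(out[0]), len(out[0][0]))  # 2 2 3
```

This is why every index in the input must be smaller than num_embeddings.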

class pyvqnet.nn.BatchNorm2d(channel_num: int, momentum: float = 0.1, epsilon: float = 1e-05, beta_initializer=zeros, gamma_initializer=ones, dtype=None, name='')

Applies Batch Normalization over a 4D input (B,C,H,W) as described in the paper Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift.

\[y = \frac{x - \mathrm{E}[x]}{\sqrt{\mathrm{Var}[x] + \epsilon}} * \gamma + \beta\]

where \(\gamma\) and \(\beta\) are learnable parameters. Also, by default, during training this layer keeps running estimates of its computed mean and variance, which are then used for normalization during evaluation. The running estimates are kept with a default momentum of 0.1.

  • channel_num – int - the number of input feature channels.

  • momentum – float - momentum used when calculating the exponentially weighted average, defaults to 0.1.

  • beta_initializer – callable - defaults to zeros.

  • gamma_initializer – callable - defaults to ones.

  • epsilon – float - numerical stability constant, defaults to 1e-5.

  • dtype – The data type of the parameter, defaults: None, use the default data type kfloat32, which represents a 32-bit floating point number.

  • name – name of the output layer


a BatchNorm2d class


import numpy as np
from pyvqnet.tensor import QTensor
from pyvqnet.nn import BatchNorm2d
import pyvqnet
b = 2
ic = 2
test_conv = BatchNorm2d(ic)

x = QTensor(np.arange(1, 17).reshape([b, ic, 4, 1]),
            requires_grad=True, dtype=pyvqnet.kfloat32)
y = test_conv.forward(x)

class pyvqnet.nn.BatchNorm1d(channel_num: int, momentum: float = 0.1, epsilon: float = 1e-05, beta_initializer=zeros, gamma_initializer=ones, dtype=None, name='')

Applies Batch Normalization over a 2D input (B,C) as described in the paper Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift.

\[y = \frac{x - \mathrm{E}[x]}{\sqrt{\mathrm{Var}[x] + \epsilon}} * \gamma + \beta\]

where \(\gamma\) and \(\beta\) are learnable parameters. Also, by default, during training this layer keeps running estimates of its computed mean and variance, which are then used for normalization during evaluation. The running estimates are kept with a default momentum of 0.1.

  • channel_num – int - the number of input feature channels.

  • momentum – float - momentum used when calculating the exponentially weighted average, defaults to 0.1

  • beta_initializer – callable - defaults to zeros.

  • gamma_initializer – callable - defaults to ones.

  • epsilon – float - numerical stability constant, defaults to 1e-5.

  • dtype – The data type of the parameter, defaults: None, use the default data type kfloat32, which represents a 32-bit floating point number.

  • name – name of the output layer


a BatchNorm1d class


import numpy as np
from pyvqnet.tensor import QTensor
from pyvqnet.nn import BatchNorm1d
import pyvqnet
test_conv = BatchNorm1d(4)

x = QTensor(np.arange(1, 17).reshape([4, 4]),
            requires_grad=True, dtype=pyvqnet.kfloat32)
y = test_conv.forward(x)
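
The normalization formula can be checked by hand on the same 4x4 input. The pure-Python sketch below covers training-mode statistics only. It assumes the biased (divide by B) variance and the default gamma = 1, beta = 0, and it omits the running estimates; batch_norm_1d is a hypothetical helper:

```python
import math


def batch_norm_1d(batch, gamma=1.0, beta=0.0, eps=1e-5):
    """Normalize each feature (column) over the batch dimension (rows)."""
    n = len(batch)
    out_cols = []
    for col in zip(*batch):
        mean = sum(col) / n
        # Biased variance over the batch, assumed here for normalization.
        var = sum((v - mean) ** 2 for v in col) / n
        out_cols.append([(v - mean) / math.sqrt(var + eps) * gamma + beta
                         for v in col])
    # Transpose back to [batch, features].
    return [list(row) for row in zip(*out_cols)]


x = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
y = batch_norm_1d(x)
print([round(v, 4) for v in y[0]])
```

Every column of x has the same spread, so all four features are normalized to the same values, and each column of y has a mean of approximately zero.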

class pyvqnet.nn.layer_norm.LayerNormNd(normalized_shape: list, epsilon: float = 1e-05, affine: bool = True, dtype=None, name='')

Layer normalization is performed on the last several dimensions of any input, as described in the paper Layer Normalization.

\[y = \frac{x - \mathrm{E}[x]}{ \sqrt{\mathrm{Var}[x] + \epsilon}} * \gamma + \beta\]

For inputs like (B,C,H,W,D), normalized_shape can be [C,H,W,D], [H,W,D], [W,D] or [D].

  • normalized_shape – list - the shape of the trailing dimensions to normalize over.

  • epsilon – float - numerical stability constant, defaults to 1e-5.

  • affine – bool - whether to apply the learnable affine transformation, the default is True.

  • name – name of the output layer.

  • dtype – The data type of the parameter, defaults: None, use the default data type kfloat32, which represents a 32-bit floating point number.


a LayerNormNd class.


import numpy as np
from pyvqnet.tensor import QTensor,kfloat32
from pyvqnet.nn.layer_norm import LayerNormNd
ic = 4
test_conv = LayerNormNd([2,2])
x = QTensor(np.arange(1,17).reshape([2,2,2,2]),requires_grad=True,dtype=kfloat32)
y = test_conv.forward(x)

class pyvqnet.nn.layer_norm.LayerNorm2d(norm_size: int, epsilon: float = 1e-05, affine: bool = True, dtype=None, name='')

Applies Layer Normalization over a mini-batch of 4D inputs as described in the paper Layer Normalization.

\[y = \frac{x - \mathrm{E}[x]}{ \sqrt{\mathrm{Var}[x] + \epsilon}} * \gamma + \beta\]

The mean and standard deviation are calculated over the last D dimensions.

For input like (B,C,H,W), norm_size should equal C * H * W.

  • norm_size – int - normalization size, equal to C * H * W

  • epsilon – float - numerical stability constant, defaults to 1e-5

  • affine – bool - whether to apply the learnable affine transformation, the default is True

  • name – name of the output layer

  • dtype – The data type of the parameter, defaults: None, use the default data type kfloat32, which represents a 32-bit floating point number.


a LayerNorm2d class


import numpy as np
import pyvqnet
from pyvqnet.tensor import QTensor
from pyvqnet.nn.layer_norm import LayerNorm2d
ic = 4
test_conv = LayerNorm2d(8)
x = QTensor(np.arange(1,17).reshape([2,2,4,1]),requires_grad=True,dtype=pyvqnet.kfloat32)
y = test_conv.forward(x)

class pyvqnet.nn.layer_norm.LayerNorm1d(norm_size: int, epsilon: float = 1e-05, affine: bool = True, dtype=None, name='')

Applies Layer Normalization over a mini-batch of 2D inputs as described in the paper Layer Normalization.

\[y = \frac{x - \mathrm{E}[x]}{ \sqrt{\mathrm{Var}[x] + \epsilon}} * \gamma + \beta\]

The mean and standard deviation are calculated over the last dimension, where norm_size is the size of that dimension.

  • norm_size – int - normalization size, equal to the size of the last dimension

  • epsilon – float - numerical stability constant, defaults to 1e-5

  • affine – bool - whether to apply the learnable affine transformation, the default is True

  • dtype – The data type of the parameter, defaults: None, use the default data type kfloat32, which represents a 32-bit floating point number.

  • name – name of the output layer


a LayerNorm1d class


import numpy as np
import pyvqnet
from pyvqnet.tensor import QTensor
from pyvqnet.nn.layer_norm import LayerNorm1d
test_conv = LayerNorm1d(4)
x = QTensor(np.arange(1,17).reshape([4,4]),requires_grad=True,dtype=pyvqnet.kfloat32)
y = test_conv.forward(x)
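
Unlike batch normalization, layer normalization computes its statistics within each sample. The pure-Python sketch below normalizes every row of a 2D input over its last dimension. It assumes the biased variance and an identity affine transform (gamma = 1, beta = 0); layer_norm_1d is a hypothetical helper:

```python
import math


def layer_norm_1d(rows, eps=1e-5):
    """Normalize each row over its last dimension (no affine transform)."""
    out = []
    for row in rows:
        mean = sum(row) / len(row)
        # Biased variance over the features of this single sample.
        var = sum((v - mean) ** 2 for v in row) / len(row)
        out.append([(v - mean) / math.sqrt(var + eps) for v in row])
    return out


x = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
y = layer_norm_1d(x)
print([round(v, 4) for v in y[0]])
```

Every row of x is an evenly spaced run of four numbers, so all rows normalize to the same values regardless of their offset.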

class pyvqnet.nn.Linear(input_channels, output_channels, weight_initializer=None, bias_initializer=None, use_bias=True, dtype=None, name: str = '')

Linear module (fully-connected layer). \(y = Ax + b\)

  • input_channels – int - number of input features

  • output_channels – int - number of output features

  • weight_initializer – callable - defaults to normal

  • bias_initializer – callable - defaults to zeros

  • use_bias – bool - defaults to True

  • dtype – The data type of the parameter, defaults: None, use the default data type kfloat32, which represents a 32-bit floating point number.

  • name – name of the output layer


a Linear class


import numpy as np
import pyvqnet
from pyvqnet.tensor import QTensor
from pyvqnet.nn import Linear
c1 =2
c2 = 3
cin = 7
cout = 5
n = Linear(cin,cout)
input = QTensor(np.arange(1,c1*c2*cin+1).reshape((c1,c2,cin)),requires_grad=True,dtype=pyvqnet.kfloat32)
y = n.forward(input)

class pyvqnet.nn.dropout.Dropout(dropout_rate=0.5)

Dropout module. The dropout module randomly sets the outputs of some units to zero, while scaling up the others according to the given dropout probability.


dropout_rate – float - probability that a neuron will be set to zero


a Dropout class


from pyvqnet.nn.dropout import Dropout
import numpy as np
from pyvqnet.tensor import QTensor
b = 2
ic = 2
x = QTensor(np.arange(-1*ic*2*2,(b-1)*ic*2*2).reshape([b,ic,2,2]),requires_grad=True)
droplayer = Dropout(0.5)
y = droplayer(x)
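
The zero-and-rescale behaviour can be sketched in pure Python. The version below assumes the common "inverted dropout" convention, in which surviving values are divided by (1 - dropout_rate); the exact scaling used by VQNet is not stated here, and dropout is a hypothetical helper:

```python
import random


def dropout(values, rate, training=True):
    """Zero each value with probability `rate` and rescale the survivors."""
    if not training or rate == 0.0:
        # At evaluation time the input passes through unchanged.
        return list(values)
    keep = 1.0 - rate
    return [v / keep if random.random() >= rate else 0.0 for v in values]


random.seed(0)
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
y = dropout(x, 0.5)
print(y)                              # each entry is either 0.0 or 2 * x[i]
print(dropout(x, 0.5, training=False))  # unchanged copy of x
```

Rescaling keeps the expected value of each unit the same during training and evaluation.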

class pyvqnet.nn.pixel_shuffle.Pixel_Shuffle(upscale_factors)

Rearranges a tensor of shape (*, C * r^2, H, W) into a tensor of shape (*, C, H * r, W * r), where r is the scaling factor.


upscale_factors – factor by which to increase the spatial resolution


Pixel_Shuffle module


from pyvqnet.nn import Pixel_Shuffle
from pyvqnet.tensor import tensor
ps = Pixel_Shuffle(3)
inx = tensor.ones([5,2,3,18,4,4])
inx.requires_grad=  True
y = ps(inx)
#[5, 2, 3, 2, 12, 12]
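
The shape rule and the element rearrangement can be sketched in pure Python. The channel ordering below follows the convention commonly used for pixel shuffle (channel c * r^2 + i * r + j goes to spatial offset (i, j)); that VQNet uses the same ordering is an assumption, and both helpers are hypothetical:

```python
def pixel_shuffle_shape(shape, r):
    """(*, C * r^2, H, W) -> (*, C, H * r, W * r)."""
    *lead, c, h, w = shape
    assert c % (r * r) == 0, "channel count must be divisible by r^2"
    return lead + [c // (r * r), h * r, w * r]


def pixel_shuffle(image, r):
    """Rearrange a nested list [C * r^2][H][W] into [C][H * r][W * r]."""
    c_out = len(image) // (r * r)
    h, w = len(image[0]), len(image[0][0])
    out = [[[None] * (w * r) for _ in range(h * r)] for _ in range(c_out)]
    for c in range(c_out):
        for i in range(r):
            for j in range(r):
                src = image[c * r * r + i * r + j]
                for y in range(h):
                    for x in range(w):
                        # Each input channel fills one offset of every r x r cell.
                        out[c][y * r + i][x * r + j] = src[y][x]
    return out


print(pixel_shuffle_shape([5, 2, 3, 18, 4, 4], 3))      # [5, 2, 3, 2, 12, 12]
print(pixel_shuffle([[[0]], [[1]], [[2]], [[3]]], 2))   # [[[0, 1], [2, 3]]]
```

The first call reproduces the shape printed in the example above; the second shows four 1x1 channels becoming one 2x2 channel.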


class pyvqnet.nn.pixel_shuffle.Pixel_Unshuffle(downscale_factors)

Reverses the Pixel_Shuffle operation by rearranging the elements. Rearranges a tensor of shape (*, C, H * r, W * r) into a tensor of shape (*, C * r^2, H, W), where r is the shrink factor.


downscale_factors – factor by which to decrease the spatial resolution


Pixel_Unshuffle module


from pyvqnet.nn import Pixel_Unshuffle
from pyvqnet.tensor import tensor
ps = Pixel_Unshuffle(3)
inx = tensor.ones([5, 2, 3, 2, 12, 12])
inx.requires_grad = True
y = ps(inx)
#[5, 2, 3, 18, 4, 4]


class pyvqnet.nn.gru.GRU(input_size, hidden_size, num_layers=1, nonlinearity='tanh', batch_first=True, use_bias=True, bidirectional=False, dtype=None, name: str = '')

Gated Recurrent Unit (GRU) module. Supports multi-layer stacking and bidirectional configuration. The calculation formula of the single-layer unidirectional GRU is as follows:

\[\begin{split}\begin{array}{ll} r_t = \sigma(W_{ir} x_t + b_{ir} + W_{hr} h_{(t-1)} + b_{hr}) \\ z_t = \sigma(W_{iz} x_t + b_{iz} + W_{hz} h_{(t-1)} + b_{hz}) \\ n_t = \tanh(W_{in} x_t + b_{in} + r_t * (W_{hn} h_{(t-1)}+ b_{hn})) \\ h_t = (1 - z_t) * n_t + z_t * h_{(t-1)} \end{array}\end{split}\]

  • input_size – Input feature dimensions.

  • hidden_size – Hidden feature dimensions.

  • num_layers – Stack layer numbers. default: 1.

  • batch_first – If batch_first is True, input shape should be [batch_size,seq_len,feature_dim], if batch_first is False, the input shape should be [seq_len,batch_size,feature_dim], default: True.

  • use_bias – If use_bias is False, this module will not contain bias. default: True.

  • bidirectional – If bidirectional is True, the module will be bidirectional GRU. default: False.

  • dtype – The data type of the parameter, defaults: None, use the default data type kfloat32, which represents a 32-bit floating point number.

  • name – name of the output layer


A GRU module instance.


from pyvqnet.nn import GRU
from pyvqnet.tensor import tensor

rnn2 = GRU(4, 6, 2, batch_first=False, bidirectional=True)

input = tensor.ones([5, 3, 4])
h0 = tensor.ones([4, 3, 6])

output, hn = rnn2(input, h0)
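
The gate equations can be traced with a single time step written in pure Python. This is a sketch of the formulas only, not of VQNet's implementation; the parameter dictionary layout and the helper names are hypothetical. All weights are set to zero so that the result is easy to verify by hand:

```python
import math


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def affine(W, x, b):
    """Return W @ x + b for a list-of-lists matrix and list vectors."""
    return [sum(w * xi for w, xi in zip(row, x)) + bi for row, bi in zip(W, b)]


def gru_step(x, h, p):
    """One GRU time step following the r_t, z_t, n_t, h_t formulas."""
    r = [sigmoid(a + b) for a, b in zip(affine(p["W_ir"], x, p["b_ir"]),
                                        affine(p["W_hr"], h, p["b_hr"]))]
    z = [sigmoid(a + b) for a, b in zip(affine(p["W_iz"], x, p["b_iz"]),
                                        affine(p["W_hz"], h, p["b_hz"]))]
    # The reset gate scales only the hidden-state contribution of n_t.
    n = [math.tanh(a + ri * b)
         for a, ri, b in zip(affine(p["W_in"], x, p["b_in"]), r,
                             affine(p["W_hn"], h, p["b_hn"]))]
    return [(1.0 - zi) * ni + zi * hi for zi, ni, hi in zip(z, n, h)]


input_size, hidden_size = 3, 2
params = {}
for gate in ("r", "z", "n"):
    params["W_i" + gate] = [[0.0] * input_size for _ in range(hidden_size)]
    params["W_h" + gate] = [[0.0] * hidden_size for _ in range(hidden_size)]
    params["b_i" + gate] = [0.0] * hidden_size
    params["b_h" + gate] = [0.0] * hidden_size

h1 = gru_step([1.0, 2.0, 3.0], [1.0, -1.0], params)
print(h1)  # [0.5, -0.5]
```

With zero weights, r_t = z_t = 0.5 and n_t = 0, so the new hidden state is half of the previous one.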

class pyvqnet.nn.rnn.RNN(input_size, hidden_size, num_layers=1, nonlinearity='tanh', batch_first=True, use_bias=True, bidirectional=False, dtype=None, name: str = '')

Recurrent Neural Network (RNN) module, which uses \(\tanh\) or \(\text{ReLU}\) as the activation function. Bidirectional and multi-layer RNNs are supported. The calculation formula of the single-layer unidirectional RNN is as follows:

\[h_t = \tanh(W_{ih} x_t + b_{ih} + W_{hh} h_{(t-1)} + b_{hh})\]

If nonlinearity is 'relu', then \(\text{ReLU}\) will replace \(\tanh\).

  • input_size – Input feature dimensions.

  • hidden_size – Hidden feature dimensions.

  • num_layers – Stack layer numbers. default: 1.

  • nonlinearity – non-linear activation function, default: 'tanh' .

  • batch_first – If batch_first is True, input shape should be [batch_size,seq_len,feature_dim], if batch_first is False, the input shape should be [seq_len,batch_size,feature_dim], default: True.

  • use_bias – If use_bias is False, this module will not contain bias. default: True.

  • bidirectional – If bidirectional is True, the module will be bidirectional RNN. default: False.

  • dtype – The data type of the parameter, defaults: None, use the default data type kfloat32, which represents a 32-bit floating point number.

  • name – name of the output layer


An RNN module instance.


from pyvqnet.nn import RNN
from pyvqnet.tensor import tensor

rnn2 = RNN(4, 6, 2, batch_first=False, bidirectional = True)

input = tensor.ones([5, 3, 4])
h0 = tensor.ones([4, 3, 6])
output, hn = rnn2(input, h0)
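
The shapes in this example follow from num_layers, bidirectional and batch_first. The pure-Python sketch below spells out the bookkeeping. The hidden-state shape matches the h0 used above; the output shape assumes that the two directions are concatenated along the feature axis, which is the usual convention but is not confirmed here. The helper rnn_shapes is hypothetical:

```python
def rnn_shapes(seq_len, batch_size, hidden_size, num_layers,
               bidirectional, batch_first):
    """Return the expected (output shape, hidden-state shape)."""
    directions = 2 if bidirectional else 1
    features = directions * hidden_size
    if batch_first:
        output = [batch_size, seq_len, features]
    else:
        output = [seq_len, batch_size, features]
    # One hidden state per layer and per direction.
    hidden = [directions * num_layers, batch_size, hidden_size]
    return output, hidden


out_shape, h_shape = rnn_shapes(seq_len=5, batch_size=3, hidden_size=6,
                                num_layers=2, bidirectional=True,
                                batch_first=False)
print(out_shape)  # [5, 3, 12]
print(h_shape)    # [4, 3, 6]
```

The same bookkeeping applies to the GRU and LSTM modules, where the LSTM cell state has the same shape as the hidden state.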

class pyvqnet.nn.lstm.LSTM(input_size, hidden_size, num_layers=1, batch_first=True, use_bias=True, bidirectional=False, dtype=None, name: str = '')

Long Short-Term Memory (LSTM) module. Supports bidirectional LSTM, stacked multi-layer LSTM and other configurations. The calculation formula of the single-layer unidirectional LSTM is as follows:

\[\begin{split}\begin{array}{ll} \\ i_t = \sigma(W_{ii} x_t + b_{ii} + W_{hi} h_{t-1} + b_{hi}) \\ f_t = \sigma(W_{if} x_t + b_{if} + W_{hf} h_{t-1} + b_{hf}) \\ g_t = \tanh(W_{ig} x_t + b_{ig} + W_{hg} h_{t-1} + b_{hg}) \\ o_t = \sigma(W_{io} x_t + b_{io} + W_{ho} h_{t-1} + b_{ho}) \\ c_t = f_t \odot c_{t-1} + i_t \odot g_t \\ h_t = o_t \odot \tanh(c_t) \\ \end{array}\end{split}\]

  • input_size – Input feature dimensions.

  • hidden_size – Hidden feature dimensions.

  • num_layers – Stack layer numbers. default: 1.

  • batch_first – If batch_first is True, input shape should be [batch_size,seq_len,feature_dim], if batch_first is False, the input shape should be [seq_len,batch_size,feature_dim], default: True.

  • use_bias – If use_bias is False, this module will not contain bias. default: True.

  • bidirectional – If bidirectional is True, the module will be bidirectional LSTM. default: False.

  • dtype – The data type of the parameter, defaults: None, use the default data type kfloat32, which represents a 32-bit floating point number.

  • name – name of the output layer


An LSTM module instance.


from pyvqnet.nn import LSTM
from pyvqnet.tensor import tensor

rnn2 = LSTM(4, 6, 2, batch_first=False, bidirectional = True)

input = tensor.ones([5, 3, 4])
h0 = tensor.ones([4, 3, 6])
c0 = tensor.ones([4, 3, 6])
output, (hn, cn) = rnn2(input, (h0, c0))
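
A single LSTM time step can be written out in pure Python to make the roles of the four gates concrete. This is a sketch of the formulas only; the parameter dictionary layout and helper names are hypothetical, and the weights are set to zero so that the result can be checked by hand:

```python
import math


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def affine(W, x, b):
    """Return W @ x + b for a list-of-lists matrix and list vectors."""
    return [sum(w * xi for w, xi in zip(row, x)) + bi for row, bi in zip(W, b)]


def lstm_step(x, h, c, p):
    """One LSTM time step following the i_t, f_t, g_t, o_t, c_t, h_t formulas."""
    def gate(name, activation):
        pre = [a + b for a, b in zip(affine(p["W_i" + name], x, p["b_i" + name]),
                                     affine(p["W_h" + name], h, p["b_h" + name]))]
        return [activation(v) for v in pre]

    i = gate("i", sigmoid)
    f = gate("f", sigmoid)
    g = gate("g", math.tanh)
    o = gate("o", sigmoid)
    # The cell state mixes the old state (forget gate) with the candidate (input gate).
    c_new = [fi * ci + ii * gi for fi, ci, ii, gi in zip(f, c, i, g)]
    h_new = [oi * math.tanh(ci) for oi, ci in zip(o, c_new)]
    return h_new, c_new


input_size, hidden_size = 2, 1
params = {}
for name in ("i", "f", "g", "o"):
    params["W_i" + name] = [[0.0] * input_size for _ in range(hidden_size)]
    params["W_h" + name] = [[0.0] * hidden_size for _ in range(hidden_size)]
    params["b_i" + name] = [0.0] * hidden_size
    params["b_h" + name] = [0.0] * hidden_size

h1, c1 = lstm_step([1.0, 2.0], [0.0], [2.0], params)
print(c1)  # [1.0]
print(h1)  # [0.5 * tanh(1.0)]
```

With zero weights all sigmoid gates equal 0.5 and g_t = 0, so the cell state is halved and the hidden state is 0.5 * tanh(c_t).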


class pyvqnet.nn.gru.Dynamic_GRU(input_size, hidden_size, num_layers=1, batch_first=True, use_bias=True, bidirectional=False, dtype=None, name: str = '')

Apply a multilayer gated recurrent unit (GRU) RNN to a dynamic-length input sequence.

The first input should be a variable-length batch sequence defined through the tensor.PackedSequence class. A tensor.PackedSequence can be constructed by calling the following functions in succession: pad_sequence, pack_pad_sequence.

The first output of Dynamic_GRU is also a tensor.PackedSequence, which can be unpacked into a normal QTensor using tensor.pad_packed_sequence.

For each element in the input sequence, each layer computes the following formula:

\[\begin{split}\begin{array}{ll} r_t = \sigma(W_{ir} x_t + b_{ir} + W_{hr} h_{(t-1)} + b_{hr}) \\ z_t = \sigma(W_{iz} x_t + b_{iz} + W_{hz} h_{(t-1)} + b_{hz}) \\ n_t = \tanh(W_{in} x_t + b_{in} + r_t * (W_{hn} h_{(t-1)}+ b_{hn})) \\ h_t = (1 - z_t) * n_t + z_t * h_{(t-1)} \end{array}\end{split}\]
  • input_size – Input feature dimension.

  • hidden_size – Hidden feature dimension.

  • num_layers – Number of stacked recurrent layers. Default: 1

  • batch_first – If True, the input shape is provided as [batch size, sequence length, feature dimension]. If False, input shape is provided as [sequence length, batch size, feature dimension], default True.

  • use_bias – If False, the layer does not use bias weights b_ih and b_hh. Default: true.

  • bidirectional – If true, becomes a bidirectional GRU. Default: false.

  • dtype – The data type of the parameter, defaults: None, use the default data type kfloat32, which represents a 32-bit floating point number.

  • name – name of the output layer


A Dynamic_GRU class


from pyvqnet.nn import Dynamic_GRU
from pyvqnet.tensor import tensor
seq_len = [4,1,2]
input_size = 4
batch_size =3
hidden_size = 2
ml = 2
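# 2 layers, bidirectional: matches h0 of shape [ml * 2, batch_size, hidden_size]
rnn2 = Dynamic_GRU(input_size,
                   hidden_size=hidden_size,
                   num_layers=ml,
                   batch_first=False,
                   bidirectional=True)

a = tensor.arange(1, seq_len[0] * input_size + 1).reshape(
    [seq_len[0], input_size])
b = tensor.arange(1, seq_len[1] * input_size + 1).reshape(
    [seq_len[1], input_size])
c = tensor.arange(1, seq_len[2] * input_size + 1).reshape(
    [seq_len[2], input_size])

y = tensor.pad_sequence([a, b, c], False)

# seq_len is not sorted by length, hence enforce_sorted=False
input = tensor.pack_pad_sequence(y,
                                 seq_len,
                                 batch_first=False,
                                 enforce_sorted=False)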

h0 = tensor.ones([ml * 2, batch_size, hidden_size])

output, hn = rnn2(input, h0)

seq_unpacked, lens_unpacked = \
tensor.pad_packed_sequence(output, batch_first=False)
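The GRU gate equations given for Dynamic_GRU can be traced by hand with a scalar sketch. This is a minimal illustration in plain Python, assuming input and hidden size 1; the function `gru_step` and the parameter dictionary are hypothetical and are not part of pyvqnet.

```python
import math

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def gru_step(x, h, p):
    """One scalar GRU step; p maps names such as "W_ir" or "b_hn" to floats."""
    r = sigmoid(p["W_ir"] * x + p["b_ir"] + p["W_hr"] * h + p["b_hr"])  # reset gate
    z = sigmoid(p["W_iz"] * x + p["b_iz"] + p["W_hz"] * h + p["b_hz"])  # update gate
    n = math.tanh(p["W_in"] * x + p["b_in"] + r * (p["W_hn"] * h + p["b_hn"]))
    return (1.0 - z) * n + z * h

names = ["W_ir", "b_ir", "W_hr", "b_hr", "W_iz", "b_iz", "W_hz", "b_hz",
         "W_in", "b_in", "W_hn", "b_hn"]
zero_params = {k: 0.0 for k in names}

# with all-zero parameters: r = z = 0.5 and n = 0, so h_t = 0.5 * h_{t-1}
h1 = gru_step(x=1.0, h=0.8, p=zero_params)
```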

class pyvqnet.nn.rnn.Dynamic_RNN(input_size, hidden_size, num_layers=1, nonlinearity='tanh', batch_first=True, use_bias=True, bidirectional=False, dtype=None, name: str = '')

Applies recurrent neural networks (RNNs) to dynamic-length input sequences.

The first input should be a variable-length batch sequence defined through the tensor.PackedSequence class. A tensor.PackedSequence can be constructed by calling the following functions in succession: pad_sequence, pack_pad_sequence.

The first output of Dynamic_RNN is also a tensor.PackedSequence; it can be unpacked into a normal QTensor using tensor.pad_packed_sequence.

Recurrent Neural Network (RNN) module, using \(\tanh\) or \(\text{ReLU}\) as the activation function. Supports bidirectional, multi-layer configurations. The formula for a single-layer, unidirectional RNN is as follows:

\[h_t = \tanh(W_{ih} x_t + b_{ih} + W_{hh} h_{(t-1)} + b_{hh})\]

If nonlinearity is 'relu', then \(\text{ReLU}\) will replace \(\tanh\).
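As a rough illustration of the recurrence above, the sketch below performs one step on plain Python lists. The function `rnn_step` and its argument layout are hypothetical and are not the pyvqnet implementation.

```python
import math

def rnn_step(x, h, W_ih, b_ih, W_hh, b_hh, nonlinearity="tanh"):
    """One step h_t = act(W_ih x_t + b_ih + W_hh h_{t-1} + b_hh)."""
    act = math.tanh if nonlinearity == "tanh" else (lambda v: max(0.0, v))
    h_new = []
    for i in range(len(b_ih)):
        pre = b_ih[i] + b_hh[i]
        pre += sum(W_ih[i][j] * x[j] for j in range(len(x)))
        pre += sum(W_hh[i][j] * h[j] for j in range(len(h)))
        h_new.append(act(pre))
    return h_new

W_ih = [[0.5, -0.5], [1.0, 0.0]]
W_hh = [[0.0, 0.0], [0.0, 0.0]]
b = [0.0, 0.0]

h_tanh = rnn_step([1.0, 1.0], [0.0, 0.0], W_ih, b, W_hh, b)
h_relu = rnn_step([1.0, -2.0], [0.0, 0.0], W_ih, b, W_hh, b, nonlinearity="relu")
```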

  • input_size – Input feature dimension.

  • hidden_size – Hidden feature dimension.

  • num_layers – Number of stacked RNN layers, default: 1.

  • nonlinearity – Non-linear activation function, default is 'tanh'.

  • batch_first – If True, the input shape is [batch size, sequence length, feature dimension], If False, the input shape is [sequence length, batch size, feature dimension], default True.

  • use_bias – If False, the module does not apply bias items, default: True.

  • bidirectional – If True, it becomes bidirectional RNN, default: False.

  • dtype – The data type of the parameter, defaults: None, use the default data type kfloat32, which represents a 32-bit floating point number.

  • name – name of the output layer


Dynamic_RNN instance


from pyvqnet.nn import Dynamic_RNN
from pyvqnet.tensor import tensor
seq_len = [4,1,2]
input_size = 4
batch_size =3
hidden_size = 2
ml = 2
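# 2 layers, bidirectional: matches h0 of shape [ml * 2, batch_size, hidden_size]
rnn2 = Dynamic_RNN(input_size,
                   hidden_size=hidden_size,
                   num_layers=ml,
                   batch_first=False,
                   bidirectional=True)

a = tensor.arange(1, seq_len[0] * input_size + 1).reshape(
    [seq_len[0], input_size])
b = tensor.arange(1, seq_len[1] * input_size + 1).reshape(
    [seq_len[1], input_size])
c = tensor.arange(1, seq_len[2] * input_size + 1).reshape(
    [seq_len[2], input_size])

y = tensor.pad_sequence([a, b, c], False)

# seq_len is not sorted by length, hence enforce_sorted=False
input = tensor.pack_pad_sequence(y,
                                 seq_len,
                                 batch_first=False,
                                 enforce_sorted=False)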

h0 = tensor.ones([ml * 2, batch_size, hidden_size])

output, hn = rnn2(input, h0)

seq_unpacked, lens_unpacked = \
tensor.pad_packed_sequence(output, batch_first=False)

class pyvqnet.nn.lstm.Dynamic_LSTM(input_size, hidden_size, num_layers=1, batch_first=True, use_bias=True, bidirectional=False, dtype=None, name: str = '')

Apply Long Short-Term Memory (LSTM) RNNs to dynamic-length input sequences.

The first input should be a variable-length batch sequence defined through the tensor.PackedSequence class. A tensor.PackedSequence can be constructed by calling the following functions in succession: pad_sequence, pack_pad_sequence.

The first output of Dynamic_LSTM is also a tensor.PackedSequence; it can be unpacked into a normal QTensor using tensor.pad_packed_sequence.

Long Short-Term Memory (LSTM) module. Supports bidirectional, multi-layer configurations. The formula for a single-layer, unidirectional LSTM is as follows:

\[\begin{split}\begin{array}{ll} i_t = \sigma(W_{ii} x_t + b_{ii} + W_{hi} h_{t-1} + b_{hi}) \\ f_t = \sigma(W_{if} x_t + b_{if} + W_{hf} h_{t-1} + b_{hf}) \\ g_t = \tanh(W_{ig} x_t + b_{ig} + W_{hg} h_{t-1} + b_{hg}) \\ o_t = \sigma(W_{io} x_t + b_{io} + W_{ho} h_{t-1} + b_{ho}) \\ c_t = f_t \odot c_{t-1} + i_t \odot g_t \\ h_t = o_t \odot \tanh(c_t) \end{array}\end{split}\]
  • input_size – Input feature dimension.

  • hidden_size – Hidden feature dimension.

  • num_layers – Number of stacked LSTM layers, default: 1.

  • batch_first – If True, the input shape is [batch size, sequence length, feature dimension], If False, the input shape is [sequence length, batch size, feature dimension], default True.

  • use_bias – If False, the module does not apply bias items, default: True.

  • bidirectional – If True, it becomes a bidirectional LSTM, default: False.

  • dtype – The data type of the parameter, defaults: None, use the default data type kfloat32, which represents a 32-bit floating point number.

  • name – name of the output layer


Dynamic_LSTM instance


from pyvqnet.nn import Dynamic_LSTM
from pyvqnet.tensor import tensor

input_size = 2
hidden_size = 2
ml = 2
seq_len = [3, 4, 1]
batch_size = 3
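# 2 layers, bidirectional: matches h0/c0 of shape [ml * 2, batch_size, hidden_size]
rnn2 = Dynamic_LSTM(input_size,
                    hidden_size=hidden_size,
                    num_layers=ml,
                    batch_first=False,
                    bidirectional=True)

a = tensor.arange(1, seq_len[0] * input_size + 1).reshape(
    [seq_len[0], input_size])
b = tensor.arange(1, seq_len[1] * input_size + 1).reshape(
    [seq_len[1], input_size])
c = tensor.arange(1, seq_len[2] * input_size + 1).reshape(
    [seq_len[2], input_size])
a.requires_grad = True
b.requires_grad = True
c.requires_grad = True
y = tensor.pad_sequence([a, b, c], False)

# seq_len is not sorted by length, hence enforce_sorted=False
input = tensor.pack_pad_sequence(y,
                                 seq_len,
                                 batch_first=False,
                                 enforce_sorted=False)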

h0 = tensor.ones([ml * 2, batch_size, hidden_size])
c0 = tensor.ones([ml * 2, batch_size, hidden_size])

output, (hn, cn) = rnn2(input, (h0, c0))

seq_unpacked, lens_unpacked = \
tensor.pad_packed_sequence(output, batch_first=False)
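The LSTM gate equations given for Dynamic_LSTM can be checked with a scalar sketch. This is an illustration only, assuming input and hidden size 1; `lstm_step` and the parameter dictionary are hypothetical names, not pyvqnet API.

```python
import math

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def lstm_step(x, h, c, p):
    """One scalar LSTM step; returns the new hidden and cell state."""
    def pre(gate):
        return (p[f"W_i{gate}"] * x + p[f"b_i{gate}"]
                + p[f"W_h{gate}"] * h + p[f"b_h{gate}"])
    i = sigmoid(pre("i"))        # input gate
    f = sigmoid(pre("f"))        # forget gate
    g = math.tanh(pre("g"))      # candidate cell state
    o = sigmoid(pre("o"))        # output gate
    c_new = f * c + i * g
    h_new = o * math.tanh(c_new)
    return h_new, c_new

zero_params = {}
for gate in "ifgo":
    for name in (f"W_i{gate}", f"b_i{gate}", f"W_h{gate}", f"b_h{gate}"):
        zero_params[name] = 0.0

# with all-zero parameters: i = f = o = 0.5 and g = 0, so c_t = 0.5 * c_{t-1}
h1, c1 = lstm_step(x=1.0, h=0.0, c=2.0, p=zero_params)
```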


Loss Function Layer


Please note that, unlike PyTorch and other frameworks, the forward function of the following loss functions takes the label as the first parameter and the predicted value as the second parameter.


class pyvqnet.nn.MeanSquaredError

Creates a criterion that measures the mean squared error (squared L2 norm) between each element in the input \(x\) and target \(y\).

The unreduced loss can be described as:

\[\ell(x, y) = L = \{l_1,\dots,l_N\}^\top, \quad l_n = \left( x_n - y_n \right)^2,\]

where \(N\) is the batch size. The reduced loss is then:

\[\ell(x, y) = \operatorname{mean}(L)\]

\(x\) and \(y\) are QTensors of arbitrary shapes with a total of \(n\) elements each.

The mean operation still operates over all the elements, and divides by \(n\).
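The reduction can be verified by hand. The sketch below recomputes the mean squared error on plain Python lists, using the same values as the MeanSquaredError example in this section; the helper `mean_squared_error` is hypothetical and only mirrors the label-first argument order.

```python
def mean_squared_error(y, x):
    """Mean of squared differences over all elements (label first, prediction second)."""
    diffs = [(xi - yi) ** 2 for xi, yi in zip(x, y)]
    return sum(diffs) / len(diffs)

y = [0, 0, 1, 0, 0, 0, 0, 0, 0, 0]
x = [0.1, 0.05, 0.7, 0, 0.05, 0.1, 0, 0, 0, 0]
loss = mean_squared_error(y, x)   # approximately 0.0115
```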


name – name of the output layer


a MeanSquaredError class

Parameters for loss forward function:

x: \((N, *)\) where \(*\) means, any number of additional dimensions

y: \((N, *)\), same shape as the input


from pyvqnet.tensor import QTensor, kfloat64
from pyvqnet.nn import MeanSquaredError
y = QTensor([[0, 0, 1, 0, 0, 0, 0, 0, 0, 0]],
            dtype=kfloat64)
x = QTensor([[0.1, 0.05, 0.7, 0, 0.05, 0.1, 0, 0, 0, 0]],
            dtype=kfloat64)

loss_result = MeanSquaredError()
result = loss_result(y, x)

# [0.0115000]


class pyvqnet.nn.BinaryCrossEntropy

Measures the Binary Cross Entropy between the target and the output:

The unreduced loss can be described as:

\[\ell(x, y) = L = \{l_1,\dots,l_N\}^\top, \quad l_n = - w_n \left[ y_n \cdot \log x_n + (1 - y_n) \cdot \log (1 - x_n) \right],\]

where \(N\) is the batch size. The reduced loss is then:

\[\ell(x, y) = \operatorname{mean}(L)\]
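As a check of the formula, the sketch below evaluates the binary cross entropy on flattened Python lists with the values from the BinaryCrossEntropy example in this section. It assumes unit weights \(w_n = 1\); the helper name is hypothetical.

```python
import math

def binary_cross_entropy(y, x):
    """Mean of -[y*log(x) + (1-y)*log(1-x)], assuming unit weights w_n = 1."""
    terms = [-(yi * math.log(xi) + (1 - yi) * math.log(1 - xi))
             for yi, xi in zip(y, x)]
    return sum(terms) / len(terms)

x = [0.3, 0.7, 0.2, 0.2, 0.3, 0.1]   # predictions, flattened
y = [0, 1, 0, 0, 0, 1]               # labels, flattened
loss = binary_cross_entropy(y, x)    # approximately 0.63648
```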

a BinaryCrossEntropy class

Parameters for loss forward function:

x: \((N, *)\) where \(*\) means, any number of additional dimensions

y: \((N, *)\), same shape as the input


import pyvqnet
from pyvqnet.tensor import QTensor
x = QTensor([[0.3, 0.7, 0.2], [0.2, 0.3, 0.1]], requires_grad=True)
y = QTensor([[0, 1.0, 0], [0, 0.0, 1]], requires_grad=True)

loss_result = pyvqnet.nn.BinaryCrossEntropy()
result = loss_result(y, x)

# [0.6364825]


class pyvqnet.nn.CategoricalCrossEntropy

This criterion combines LogSoftmax and NLLLoss in one single class.

The loss can be described as below, where class is index of target’s class:

\[\text{loss}(x, class) = -\log\left(\frac{\exp(x[class])}{\sum_j \exp(x[j])}\right) = -x[class] + \log\left(\sum_j \exp(x[j])\right)\]
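The right-hand side of the formula can be evaluated directly. The following sketch, on plain Python lists, assumes one-hot labels and a mean over the batch (an assumption that agrees with the result documented for this class); `softmax_cross_entropy` is a hypothetical helper.

```python
import math

def softmax_cross_entropy(y_onehot, logits):
    """Mean over the batch of -x[class] + log(sum_j exp(x[j]))."""
    total = 0.0
    for label_row, row in zip(y_onehot, logits):
        cls = label_row.index(1)          # index of the target class
        m = max(row)                      # subtract the max for numerical stability
        log_sum_exp = m + math.log(sum(math.exp(v - m) for v in row))
        total += -row[cls] + log_sum_exp
    return total / len(logits)

x = [[1, 2, 3, 4, 5]] * 3
y = [[0, 1, 0, 0, 0], [0, 1, 0, 0, 0], [1, 0, 0, 0, 0]]
loss = softmax_cross_entropy(y, x)    # approximately 3.78525
```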

a CategoricalCrossEntropy class

Parameters for loss forward function:

x: \((N, *)\) where \(*\) means, any number of additional dimensions

y: \((N, *)\), same shape as the input, should have data type of the 64-bit integer.


from pyvqnet.tensor import QTensor,kfloat32,kint64
from pyvqnet.nn import CategoricalCrossEntropy
x = QTensor([[1, 2, 3, 4, 5],
[1, 2, 3, 4, 5],
[1, 2, 3, 4, 5]], requires_grad=True,dtype=kfloat32)
y = QTensor([[0, 1, 0, 0, 0], [0, 1, 0, 0, 0], [1, 0, 0, 0, 0]], requires_grad=False,dtype=kint64)
loss_result = CategoricalCrossEntropy()
result = loss_result(y, x)

# [3.7852428]


class pyvqnet.nn.SoftmaxCrossEntropy

This criterion combines LogSoftmax and NLLLoss in one single class with better numerical stability.

The loss can be described as below, where class is index of target’s class:

\[\text{loss}(x, class) = -\log\left(\frac{\exp(x[class])}{\sum_j \exp(x[j])}\right) = -x[class] + \log\left(\sum_j \exp(x[j])\right)\]

a SoftmaxCrossEntropy class

Parameters for loss forward function:

x: \((N, *)\) where \(*\) means, any number of additional dimensions

y: \((N, *)\), same shape as the input, should have data type of the 64-bit integer.


from pyvqnet.tensor import QTensor, kfloat32, kint64
from pyvqnet.nn import SoftmaxCrossEntropy
x = QTensor([[1, 2, 3, 4, 5], [1, 2, 3, 4, 5], [1, 2, 3, 4, 5]],
            requires_grad=True,
            dtype=kfloat32)
y = QTensor([[0, 1, 0, 0, 0], [0, 1, 0, 0, 0], [1, 0, 0, 0, 0]],
            requires_grad=False,
            dtype=kint64)
loss_result = SoftmaxCrossEntropy()
result = loss_result(y, x)

# [3.7852478]


class pyvqnet.nn.NLL_Loss

The average negative log likelihood loss. It is useful for training a classification problem with C classes.

The x given through a forward call is expected to contain log-probabilities of each class. x has to be a QTensor of size either \((N, C)\) or \((N, C, d_1, d_2, ..., d_K)\) with \(K \geq 1\) for the K-dimensional case. The y that this loss expects should be a class index in the range \([0, C-1]\), where C is the number of classes.

\[\ell(x, y) = \frac{1}{N} \sum_{n=1}^N l_n, \quad l_n = - x_{n,y_n}\]
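For the \((N, C)\) case, the averaged negative log likelihood amounts to picking the log-probability of the target class in each row. A minimal sketch on plain lists follows; `nll_loss` is a hypothetical helper and the input values are made up for illustration.

```python
import math

def nll_loss(y, log_probs):
    """Average of -log_probs[n][y[n]] over the batch, for input shape (N, C)."""
    picked = [-row[cls] for row, cls in zip(log_probs, y)]
    return sum(picked) / len(picked)

# made-up log-probabilities for N = 2 samples and C = 3 classes
log_probs = [[math.log(0.7), math.log(0.2), math.log(0.1)],
             [math.log(0.1), math.log(0.8), math.log(0.1)]]
y = [0, 1]                         # class indices in [0, C-1]
loss = nll_loss(y, log_probs)
```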

a NLL_Loss class

Parameters for loss forward function:

x: \((N, *)\), the predicted values passed to the loss function, which can be multidimensional.

y: \((N, *)\), the true value expected by the loss function, should have data type of the 64-bit integer.


from pyvqnet.tensor import QTensor, kint64
from pyvqnet.nn import NLL_Loss

x = QTensor([
    0.9476322568516703, 0.226547421131723, 0.5944201443911326,
    0.42830868492969476, 0.76414068655387, 0.00286059168094277,
    0.3574236812873617, 0.9096948856639084, 0.4560809854582528,
    0.9818027091583286, 0.8673569904602182, 0.9860275114020933,
    0.9232667066664217, 0.303693313961628, 0.8461034903175555
])
x.reshape_([1, 3, 1, 5])
x.requires_grad = True
y = QTensor([[[2, 1, 0, 0, 2]]], dtype=kint64)

loss_result = NLL_Loss()
result = loss_result(y, x)


class pyvqnet.nn.CrossEntropyLoss

This criterion combines LogSoftmax and NLLLoss in one single class.

x is expected to contain raw, unnormalized scores for each class. x has to be a Tensor of size \((C)\) for unbatched input, \((N, C)\) or \((N, C, d_1, d_2, ..., d_K)\) with \(K \geq 1\) for the K-dimensional case.

The loss can be described as below, where class is index of target’s class:

\[\text{loss}(x, class) = -\log\left(\frac{\exp(x[class])}{\sum_j \exp(x[j])}\right) = -x[class] + \log\left(\sum_j \exp(x[j])\right)\]

a CrossEntropyLoss class

Parameters for loss forward function:

x: \((N, *)\), the predicted values passed to the loss function, which can be multidimensional.

y: \((N, *)\), the true value expected by the loss function, should have data type of the 64-bit integer.


from pyvqnet.tensor import QTensor, kint64
from pyvqnet.nn import CrossEntropyLoss
x = QTensor([
    0.9476322568516703, 0.226547421131723, 0.5944201443911326,
    0.42830868492969476, 0.76414068655387, 0.00286059168094277,
    0.3574236812873617, 0.9096948856639084, 0.4560809854582528,
    0.9818027091583286, 0.8673569904602182, 0.9860275114020933,
    0.9232667066664217, 0.303693313961628, 0.8461034903175555
])
x.reshape_([1, 3, 1, 5])
x.requires_grad = True
y = QTensor([[[2, 1, 0, 0, 2]]], dtype=kint64)

loss_result = CrossEntropyLoss()
result = loss_result(y, x)


Activation Function


class pyvqnet.nn.activation.Activation

Base class of activation functions. Specific activation functions inherit from this class.


class pyvqnet.nn.Sigmoid(name: str = '')

Applies a sigmoid activation function to the given layer.

\[\text{Sigmoid}(x) = \frac{1}{1 + \exp(-x)}\]

name – name of the output layer


sigmoid Activation layer


from pyvqnet.nn import Sigmoid
from pyvqnet.tensor import QTensor
layer = Sigmoid()
y = layer(QTensor([1.0, 2.0, 3.0, 4.0]))

# [0.7310586, 0.8807970, 0.9525741, 0.9820138]


class pyvqnet.nn.Softplus(name: str = '')

Applies the softplus activation function to the given layer.

\[\text{Softplus}(x) = \log(1 + \exp(x))\]

name – name of the output layer


softplus Activation layer


from pyvqnet.nn import Softplus
from pyvqnet.tensor import QTensor
layer = Softplus()
y = layer(QTensor([1.0, 2.0, 3.0, 4.0]))

# [1.3132616, 2.1269281, 3.0485873, 4.0181499]


class pyvqnet.nn.Softsign(name: str = '')

Applies the softsign activation function to the given layer.

\[\text{SoftSign}(x) = \frac{x}{ 1 + |x|}\]

name – name of the output layer


softsign Activation layer


from pyvqnet.nn import Softsign
from pyvqnet.tensor import QTensor
layer = Softsign()
y = layer(QTensor([1.0, 2.0, 3.0, 4.0]))

# [0.5000000, 0.6666667, 0.7500000, 0.8000000]


class pyvqnet.nn.Softmax(axis: int = -1, name: str = '')

Applies a softmax activation function to the given layer.

\[\text{Softmax}(x_{i}) = \frac{\exp(x_i)}{\sum_j \exp(x_j)}\]
  • axis – dimension on which to operate (-1 for last axis),default = -1

  • name – name of the output layer


softmax Activation layer


from pyvqnet.nn import Softmax
from pyvqnet.tensor import QTensor
layer = Softmax()
y = layer(QTensor([1.0, 2.0, 3.0, 4.0]))

# [0.0320586, 0.0871443, 0.2368828, 0.6439142]
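The softmax values above can be reproduced with a few lines of plain Python. This sketch works on a flat list and shifts by the maximum before exponentiating, a common stabilisation trick; whether pyvqnet does the same internally is not stated here.

```python
import math

def softmax(values):
    """Softmax over a flat list, shifted by the max for numerical stability."""
    m = max(values)
    exps = [math.exp(v - m) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

probs = softmax([1.0, 2.0, 3.0, 4.0])
```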


class pyvqnet.nn.HardSigmoid(name: str = '')

Applies a hard sigmoid activation function to the given layer.

\[\begin{split}\text{Hardsigmoid}(x) = \begin{cases} 0 & \text{ if } x \le -3, \\ 1 & \text{ if } x \ge +3, \\ x / 6 + 1 / 2 & \text{otherwise} \end{cases}\end{split}\]

name – name of the output layer


hard sigmoid Activation layer


from pyvqnet.nn import HardSigmoid
from pyvqnet.tensor import QTensor
layer = HardSigmoid()
y = layer(QTensor([1.0, 2.0, 3.0, 4.0]))

# [0.6666667, 0.8333334, 1, 1]


class pyvqnet.nn.ReLu(name: str = '')

Applies a rectified linear unit activation function to the given layer.

\[\begin{split}\text{ReLu}(x) = \begin{cases} x, & \text{ if } x > 0\\ 0, & \text{ if } x \leq 0 \end{cases}\end{split}\]

name – name of the output layer


ReLu Activation layer


from pyvqnet.nn import ReLu
from pyvqnet.tensor import QTensor
layer = ReLu()
y = layer(QTensor([-1, 2.0, -3, 4.0]))

# [0, 2, 0, 4]


class pyvqnet.nn.LeakyReLu(alpha: float = 0.01, name: str = '')

Applies the leaky version of a rectified linear unit activation function to the given layer.

\[\begin{split}\text{LeakyRelu}(x) = \begin{cases} x, & \text{ if } x \geq 0 \\ \alpha * x, & \text{ otherwise } \end{cases}\end{split}\]
  • alpha – LeakyRelu coefficient, default: 0.01

  • name – name of the output layer


leaky ReLu Activation layer


from pyvqnet.nn import LeakyReLu
from pyvqnet.tensor import QTensor
layer = LeakyReLu()
y = layer(QTensor([-1, 2.0, -3, 4.0]))

# [-0.0100000, 2, -0.0300000, 4]


class pyvqnet.nn.ELU(alpha: float = 1.0, name: str = '')

Applies the exponential linear unit activation function to the given layer.

\[\begin{split}\text{ELU}(x) = \begin{cases} x, & \text{ if } x > 0\\ \alpha * (\exp(x) - 1), & \text{ if } x \leq 0 \end{cases}\end{split}\]
  • alpha – Elu coefficient, default: 1.0

  • name – name of the output layer


Elu Activation layer


from pyvqnet.nn import ELU
from pyvqnet.tensor import QTensor
layer = ELU()
y = layer(QTensor([-1, 2.0, -3, 4.0]))

# [-0.6321205, 2, -0.9502130, 4]
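The piecewise definitions of LeakyReLu and ELU translate directly into scalar functions. The sketch below is a plain-Python illustration with hypothetical helper names, using the same inputs as the examples for these two layers.

```python
import math

def leaky_relu(x, alpha=0.01):
    """x for x >= 0, alpha * x otherwise."""
    return x if x >= 0 else alpha * x

def elu(x, alpha=1.0):
    """x for x > 0, alpha * (exp(x) - 1) otherwise."""
    return x if x > 0 else alpha * (math.exp(x) - 1.0)

inputs = [-1.0, 2.0, -3.0, 4.0]
leaky = [leaky_relu(v) for v in inputs]
elus = [elu(v) for v in inputs]
```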


class pyvqnet.nn.Tanh(name: str = '')

Applies the hyperbolic tangent activation function to the given layer.

\[\text{Tanh}(x) = \frac{\exp(x) - \exp(-x)} {\exp(x) + \exp(-x)}\]

name – name of the output layer


hyperbolic tangent Activation layer


from pyvqnet.nn import Tanh
from pyvqnet.tensor import QTensor
layer = Tanh()
y = layer(QTensor([-1, 2.0, -3, 4.0]))

# [-0.7615942, 0.9640276, -0.9950548, 0.9993293]

Optimizer Module


class pyvqnet.optim.optimizer.Optimizer(params, lr=0.01)

Base class for all optimizers.

  • params – params of model which need to be optimized

  • lr – learning_rate of model (default: 0.01)


class pyvqnet.optim.adadelta.Adadelta(params, lr=0.01, beta=0.99, epsilon=1e-08)

ADADELTA: An Adaptive Learning Rate Method.

\[\begin{split}E(g_t^2) &= \beta * E(g_{t-1}^2) + (1-\beta) * g^2\\ Square\_avg &= \sqrt{ ( E(dx_{t-1}^2) + \epsilon ) / ( E(g_t^2) + \epsilon ) }\\ E(dx_t^2) &= \beta * E(dx_{t-1}^2) + (1-\beta) * (-g*square\_avg)^2 \\ param\_new &= param - lr * Square\_avg\end{split}\]
  • params – params of model which need to be optimized

  • lr – learning_rate of model (default: 0.01)

  • beta – for computing a running average of squared gradients (default: 0.99)

  • epsilon – term added to the denominator to improve numerical stability (default: 1e-8)


An Adadelta optimizer


import numpy as np
from pyvqnet.optim import adadelta
from pyvqnet.tensor import QTensor
w = np.arange(24).reshape(1,2,3,4).astype(np.float64)
param = QTensor(w)
param.grad = QTensor(np.arange(24).reshape(1, 2, 3, 4).astype(np.float64))
params = [param]
opti = adadelta.Adadelta(params)

for i in range(1,3):
    opti._step()
    print(param)

class pyvqnet.optim.adagrad.Adagrad(params, lr=0.01, epsilon=1e-08)

Implements the Adagrad algorithm.

\[\begin{split}\begin{align} moment\_new &= moment + g * g\\param\_new &= param - \frac{lr * g}{\sqrt{moment\_new} + \epsilon} \end{align}\end{split}\]
  • params – params of model which need to be optimized

  • lr – learning_rate of model (default: 0.01)

  • epsilon – term added to the denominator to improve numerical stability (default: 1e-8)


An Adagrad optimizer


import numpy as np
from pyvqnet.optim import adagrad
from pyvqnet.tensor import QTensor
w = np.arange(24).reshape(1,2,3,4).astype(np.float64)
param = QTensor(w)
param.grad = QTensor(np.arange(24).reshape(1, 2, 3, 4).astype(np.float64))
params = [param]
opti = adagrad.Adagrad(params)

for i in range(1,3):
    opti._step()
    print(param)

# [
# [[[0, 0.9900000, 1.9900000, 2.9900000],
#  [3.9900000, 4.9899998, 5.9899998, 6.9899998],
#  [7.9899998, 8.9899998, 9.9899998, 10.9899998]],
# [[11.9899998, 12.9899998, 13.9899998, 14.9899998],
#  [15.9899998, 16.9899998, 17.9899998, 18.9899998],
#  [19.9899998, 20.9899998, 21.9899998, 22.9899998]]]
# ]

# [
# [[[0, 0.9829289, 1.9829290, 2.9829290],
#  [3.9829290, 4.9829288, 5.9829288, 6.9829288],
#  [7.9829288, 8.9829283, 9.9829283, 10.9829283]],
# [[11.9829283, 12.9829283, 13.9829283, 14.9829283],
#  [15.9829283, 16.9829292, 17.9829292, 18.9829292],
#  [19.9829292, 20.9829292, 21.9829292, 22.9829292]]]
# ]
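The Adagrad update rule can be traced for a single scalar parameter. The sketch below is plain Python, not the pyvqnet optimizer; it holds the gradient fixed at 1, which corresponds to the element with value 1 in the example above.

```python
import math

def adagrad_steps(param, grad, n_steps, lr=0.01, eps=1e-8):
    """Apply n_steps Adagrad updates to one scalar with a fixed gradient."""
    moment = 0.0
    history = []
    for _ in range(n_steps):
        moment += grad * grad                          # accumulate squared gradients
        param -= lr * grad / (math.sqrt(moment) + eps)
        history.append(param)
    return history

history = adagrad_steps(param=1.0, grad=1.0, n_steps=2)
# approximately [0.99, 0.9829289]
```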


class pyvqnet.optim.adam.Adam(params, lr=0.01, beta1=0.9, beta2=0.999, epsilon=1e-08, amsgrad: bool = False)

Adam: A Method for Stochastic Optimization. It dynamically adjusts the learning rate of each parameter using the first-moment and second-moment estimates of the gradient.

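\[t = t + 1\]
\[moment\_1\_new = \beta1 * moment\_1 + (1 - \beta1) * g\]
\[moment\_2\_new = \beta2 * moment\_2 + (1 - \beta2) * g * g\]
\[lr = lr*\frac{\sqrt{1-\beta2^t}}{1-\beta1^t}\]

if amsgrad = True

\[moment\_2\_max = max(moment\_2\_max,moment\_2)\]
\[param\_new = param - lr * \frac{moment\_1}{\sqrt{moment\_2\_max} + \epsilon}\]

else

\[param\_new = param - lr * \frac{moment\_1}{\sqrt{moment\_2} + \epsilon}\]
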

  • params – params of model which need to be optimized

  • lr – learning_rate of model (default: 0.01)

  • beta1 – coefficients used for computing running averages of gradient and its square (default: 0.9)

  • beta2 – coefficients used for computing running averages of gradient and its square (default: 0.999)

  • epsilon – term added to the denominator to improve numerical stability (default: 1e-8)

  • amsgrad – whether to use the AMSGrad variant of this algorithm (default: False)


An Adam optimizer


import numpy as np
from pyvqnet.optim import adam
from pyvqnet.tensor import QTensor
w = np.arange(24).reshape(1,2,3,4).astype(np.float64)
param = QTensor(w)
param.grad = QTensor(np.arange(24).reshape(1, 2, 3, 4).astype(np.float64))
params = [param]
opti = adam.Adam(params)

for i in range(1,3):
    opti._step()
    print(param)

# [
# [[[0, 0.9900000, 1.9900000, 2.9900000],
#  [3.9900000, 4.9899998, 5.9899998, 6.9899998],
#  [7.9899998, 8.9899998, 9.9899998, 10.9899998]],
# [[11.9899998, 12.9899998, 13.9899998, 14.9899998],
#  [15.9899998, 16.9899998, 17.9899998, 18.9899998],
#  [19.9899998, 20.9899998, 21.9899998, 22.9899998]]]
# ]

# [
# [[[0, 0.9800000, 1.9800000, 2.9800000],
#  [3.9800000, 4.9799995, 5.9799995, 6.9799995],
#  [7.9799995, 8.9799995, 9.9799995, 10.9799995]],
# [[11.9799995, 12.9799995, 13.9799995, 14.9799995],
#  [15.9799995, 16.9799995, 17.9799995, 18.9799995],
#  [19.9799995, 20.9799995, 21.9799995, 22.9799995]]]
# ]
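The printed values can be reproduced with a scalar sketch of the standard Adam update (without the AMSGrad variant). This is plain Python under the assumption that the bias-corrected step size is recomputed from the base learning rate at every step; it is not the pyvqnet implementation.

```python
import math

def adam_steps(param, grad, n_steps, lr=0.01, beta1=0.9, beta2=0.999, eps=1e-8):
    """Apply n_steps Adam updates to one scalar with a fixed gradient."""
    m = v = 0.0
    history = []
    for t in range(1, n_steps + 1):
        m = beta1 * m + (1 - beta1) * grad             # first-moment estimate
        v = beta2 * v + (1 - beta2) * grad * grad      # second-moment estimate
        step = lr * math.sqrt(1 - beta2 ** t) / (1 - beta1 ** t)
        param -= step * m / (math.sqrt(v) + eps)
        history.append(param)
    return history

history = adam_steps(param=1.0, grad=1.0, n_steps=2)
# approximately [0.99, 0.98]
```

With a constant gradient the bias correction cancels the moment scaling, so each step moves the parameter by about lr, which is why the example output decreases by 0.01 per step.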


class pyvqnet.optim.adamax.Adamax(params, lr=0.01, beta1=0.9, beta2=0.999, epsilon=1e-08)

Implements the Adamax algorithm (a variant of Adam based on the infinity norm).

\[t = t + 1\]
\[moment\_new = \beta1 * moment + (1 - \beta1) * g\]
\[norm\_new = \max{(\beta1 * norm + \epsilon, \left|g\right|)}\]
\[lr = \frac{lr}{1-\beta1^t}\]
\[param\_new = param - lr*\frac{moment\_new}{norm\_new}\]
  • params – params of model which need to be optimized

  • lr – learning_rate of model (default: 0.01)

  • beta1 – coefficients used for computing running averages of gradient and its square (default: 0.9)

  • beta2 – coefficients used for computing running averages of gradient and its square (default: 0.999)

  • epsilon – term added to the denominator to improve numerical stability (default: 1e-8)


An Adamax optimizer


import numpy as np
from pyvqnet.optim import adamax
from pyvqnet.tensor import QTensor
w = np.arange(24).reshape(1,2,3,4).astype(np.float64)
param = QTensor(w)
param.grad = QTensor(np.arange(24).reshape(1,2,3,4).astype(np.float64))
params = [param]
opti = adamax.Adamax(params)

for i in range(1,3):
    opti._step()
    print(param)

# [
# [[[0, 0.9900000, 1.9900000, 2.9900000],
#  [3.9900000, 4.9899998, 5.9899998, 6.9899998],
#  [7.9899998, 8.9899998, 9.9899998, 10.9899998]],
# [[11.9899998, 12.9899998, 13.9899998, 14.9899998],
#  [15.9899998, 16.9899998, 17.9899998, 18.9899998],
#  [19.9899998, 20.9899998, 21.9899998, 22.9899998]]]
# ]

# [
# [[[0, 0.9800000, 1.9800000, 2.9800000],
#  [3.9800000, 4.9799995, 5.9799995, 6.9799995],
#  [7.9799995, 8.9799995, 9.9799995, 10.9799995]],
# [[11.9799995, 12.9799995, 13.9799995, 14.9799995],
#  [15.9799995, 16.9799995, 17.9799995, 18.9799995],
#  [19.9799995, 20.9799995, 21.9799995, 22.9799995]]]
# ]


class pyvqnet.optim.rmsprop.RMSProp(params, lr=0.01, beta=0.99, epsilon=1e-08)

Implements the RMSprop algorithm.

\[s_{t+1} = \beta * s_{t} + (1 - \beta)*(g)^2\]
\[param\_new = param - \frac{lr * g}{\sqrt{s_{t+1}} + \epsilon}\]
  • params – params of model which need to be optimized

  • lr – learning_rate of model (default: 0.01)

  • beta – coefficients used for computing running averages of gradient and its square (default: 0.99)

  • epsilon – term added to the denominator to improve numerical stability (default: 1e-8)


An RMSProp optimizer


import numpy as np
from pyvqnet.optim import rmsprop
from pyvqnet.tensor import QTensor
w = np.arange(24).reshape(1,2,3,4).astype(np.float64)
param = QTensor(w)
param.grad = QTensor(np.arange(24).reshape(1,2,3,4).astype(np.float64))
params = [param]
opti = rmsprop.RMSProp(params)

for i in range(1,3):
    opti._step()
    print(param)

# [
# [[[0, 0.9000000, 1.9000000, 2.8999999],
#  [3.8999999, 4.9000001, 5.9000001, 6.9000001],
#  [7.9000001, 8.8999996, 9.8999996, 10.8999996]],
# [[11.8999996, 12.8999996, 13.8999996, 14.8999996],
#  [15.8999996, 16.8999996, 17.8999996, 18.8999996],
#  [19.8999996, 20.8999996, 21.8999996, 22.8999996]]]
# ]

# [
# [[[0, 0.8291118, 1.8291118, 2.8291118],
#  [3.8291118, 4.8291121, 5.8291121, 6.8291121],
#  [7.8291121, 8.8291111, 9.8291111, 10.8291111]],
# [[11.8291111, 12.8291111, 13.8291111, 14.8291111],
#  [15.8291111, 16.8291111, 17.8291111, 18.8291111],
#  [19.8291111, 20.8291111, 21.8291111, 22.8291111]]]
# ]
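The two printed results are consistent with an exponentially decayed average of squared gradients scaled by the learning rate. The scalar sketch below, in plain Python and not taken from the pyvqnet source, reproduces the value of the element that starts at 1 with gradient 1.

```python
import math

def rmsprop_steps(param, grad, n_steps, lr=0.01, beta=0.99, eps=1e-8):
    """Apply n_steps RMSprop updates to one scalar with a fixed gradient."""
    s = 0.0
    history = []
    for _ in range(n_steps):
        s = beta * s + (1 - beta) * grad * grad        # running average of g^2
        param -= lr * grad / (math.sqrt(s) + eps)
        history.append(param)
    return history

history = rmsprop_steps(param=1.0, grad=1.0, n_steps=2)
# approximately [0.9, 0.8291118]
```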


class pyvqnet.optim.sgd.SGD(params, lr=0.01, momentum=0, nesterov=False)

Implements the SGD algorithm.

  • params – params of model which need to be optimized

  • lr – learning_rate of model (default: 0.01)

  • momentum – momentum factor (default: 0)

  • nesterov – enables Nesterov momentum (default: False)


An SGD optimizer


import numpy as np
from pyvqnet.optim import sgd
from pyvqnet.tensor import QTensor
w = np.arange(24).reshape(1,2,3,4).astype(np.float64)
param = QTensor(w)
param.grad = QTensor(np.arange(24).reshape(1,2,3,4).astype(np.float64))
params = [param]
opti = sgd.SGD(params)

for i in range(1,3):
    opti._step()
    print(param)

# [
# [[[0, 0.9900000, 1.9800000, 2.9700000],
#  [3.9600000, 4.9499998, 5.9400001, 6.9299998],
#  [7.9200001, 8.9099998, 9.8999996, 10.8900003]],
# [[11.8800001, 12.8699999, 13.8599997, 14.8500004],
#  [15.8400002, 16.8299999, 17.8199997, 18.8099995],
#  [19.7999992, 20.7900009, 21.7800007, 22.7700005]]]
# ]

# [
# [[[0, 0.9800000, 1.9600000, 2.9400001],
#  [3.9200001, 4.8999996, 5.8800001, 6.8599997],
#  [7.8400002, 8.8199997, 9.7999992, 10.7800007]],
# [[11.7600002, 12.7399998, 13.7199993, 14.7000008],
#  [15.6800003, 16.6599998, 17.6399994, 18.6199989],
#  [19.5999985, 20.5800018, 21.5600014, 22.5400009]]]
# ]


The Rotosolve algorithm jumps directly to the optimal value of a single parameter while the other parameters are held fixed, so it can directly find the optimal parameters of a quantum circuit.

class pyvqnet.optim.rotosolve.Rotosolve(max_iter=50)

Rotosolve: The Rotosolve algorithm can be used to minimize a linear combination of quantum measurement expectation values. See the papers by Ken M. Nakanishi et al. and Mateusz Ostaszewski et al.


max_iter – max number of iterations of the rotosolve update


a Rotosolve optimizer


from pyvqnet.optim.rotosolve import Rotosolve
import pyqpanda as pq
from pyvqnet.tensor import QTensor,kfloat64
from pyvqnet.qnn.measure import expval
machine = pq.CPUQVM()
machine.init_qvm()
nqbits = machine.qAlloc_many(2)

def gen(param, generators, qbits, circuit):
    # append the rotation gate selected by the generator name
    if generators == "X":
        circuit.insert(pq.RX(qbits, param))
    elif generators == "Y":
        circuit.insert(pq.RY(qbits, param))
    else:
        circuit.insert(pq.RZ(qbits, param))

def circuits(params, generators, circuit):
    gen(params[0], generators[0], nqbits[0], circuit)
    gen(params[1], generators[1], nqbits[1], circuit)
    circuit.insert(pq.CNOT(nqbits[0], nqbits[1]))
    prog = pq.QProg()
    prog.insert(circuit)
    return prog

def ansatz1(params: QTensor, generators):
    circuit = pq.QCircuit()
    params = params.getdata()
    prog = circuits(params, generators, circuit)
    return expval(machine, prog, {"Z0": 1},
                nqbits), expval(machine, prog, {"Y1": 1}, nqbits)

def ansatz2(params: QTensor, generators):
    circuit = pq.QCircuit()
    params = params.getdata()
    prog = circuits(params, generators, circuit)
    return expval(machine, prog, {"X0": 1}, nqbits)

def loss(params):
    Z, Y = ansatz1(params, ["X", "Y"])
    X = ansatz2(params, ["X", "Y"])
    return 0.5 * Y + 0.8 * Z - 0.2 * X

t = QTensor([0.3, 0.25],dtype=kfloat64)
opt = Rotosolve(max_iter=5)

costs_rotosolve = opt.minimize(t, loss)
# [0.7642691884821847, -0.799999999999997, -0.799999999999997, -0.799999999999997, -0.799999999999997]
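The direct jump to the optimum of one parameter relies on the loss being sinusoidal in that parameter. The sketch below applies the closed-form update from the Rotosolve literature to a toy function; `rotosolve_update` and `toy_loss` are hypothetical names, and whether pyvqnet uses exactly this expression internally is an assumption.

```python
import math

def rotosolve_update(loss, theta):
    """Closed-form minimiser of a loss of the form A*sin(theta + B) + C."""
    l0 = loss(theta)
    l_plus = loss(theta + math.pi / 2)
    l_minus = loss(theta - math.pi / 2)
    return theta - math.pi / 2 - math.atan2(2 * l0 - l_plus - l_minus,
                                            l_plus - l_minus)

def toy_loss(theta):
    # hypothetical stand-in for the expectation value of a rotation-gate circuit
    return 0.8 * math.sin(theta + 0.3) + 0.1

theta_opt = rotosolve_update(toy_loss, theta=0.25)
minimum = toy_loss(theta_opt)     # equals C - |A| = -0.7 up to rounding
```

Three loss evaluations are enough to locate the minimum of one parameter, which matches the fast convergence seen in the cost list above.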



class pyvqnet.utils.metrics.MSE(y_true_Qtensor, y_pred_Qtensor)

MSE: Mean Squared Error.

  • y_true_Qtensor – A QTensor of shape like (n_samples,) or (n_samples, n_outputs), true target value.

  • y_pred_Qtensor – A QTensor of shape like (n_samples,) or (n_samples, n_outputs), estimated target values.


Returns a float result.


import numpy as np
from pyvqnet.tensor import tensor
from pyvqnet.utils import metrics as vqnet_metrics
from pyvqnet import _core
_vqnet = _core.vqnet

y_true_Qtensor = tensor.arange(1, 12)
y_pred_Qtensor = tensor.arange(4, 15)
result = vqnet_metrics.MSE(y_true_Qtensor, y_pred_Qtensor)
# 9.0

y_true_Qtensor = tensor.arange(1, 13).reshape([3, 4])
y_pred_Qtensor = tensor.arange(4, 16).reshape([3, 4])
result = vqnet_metrics.MSE(y_true_Qtensor, y_pred_Qtensor)
# 9.0


class pyvqnet.utils.metrics.RMSE(y_true_Qtensor, y_pred_Qtensor)

RMSE: Root Mean Squared Error.

  • y_true_Qtensor – A QTensor of shape like (n_samples,) or (n_samples, n_outputs), true target value.

  • y_pred_Qtensor – A QTensor of shape like (n_samples,) or (n_samples, n_outputs), estimated target values.


Returns a float result.


import numpy as np
from pyvqnet.tensor import tensor
from pyvqnet.utils import metrics as vqnet_metrics
from pyvqnet import _core
_vqnet = _core.vqnet

y_true_Qtensor = tensor.arange(1, 12)
y_pred_Qtensor = tensor.arange(4, 15)
result = vqnet_metrics.RMSE(y_true_Qtensor, y_pred_Qtensor)
# 3.0

y_true_Qtensor = tensor.arange(1, 13).reshape([3, 4])
y_pred_Qtensor = tensor.arange(4, 16).reshape([3, 4])
result = vqnet_metrics.RMSE(y_true_Qtensor, y_pred_Qtensor)
# 3.0


class pyvqnet.utils.metrics.MAE(y_true_Qtensor, y_pred_Qtensor)

MAE: Mean Absolute Error.

  • y_true_Qtensor – A QTensor of shape like (n_samples,) or (n_samples, n_outputs), true target value.

  • y_pred_Qtensor – A QTensor of shape like (n_samples,) or (n_samples, n_outputs), estimated target values.


Returns a float result.


import numpy as np
from pyvqnet.tensor import tensor
from pyvqnet.utils import metrics as vqnet_metrics
from pyvqnet import _core
_vqnet = _core.vqnet

y_true_Qtensor = tensor.arange(1, 12)
y_pred_Qtensor = tensor.arange(4, 15)
result = vqnet_metrics.MAE(y_true_Qtensor, y_pred_Qtensor)
# 3.0

y_true_Qtensor = tensor.arange(1, 13).reshape([3, 4])
y_pred_Qtensor = tensor.arange(4, 16).reshape([3, 4])
result = vqnet_metrics.MAE(y_true_Qtensor, y_pred_Qtensor)
# 3.0
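The three regression metrics MSE, RMSE and MAE differ only in how the per-sample error is aggregated. A plain-Python sketch with hypothetical helper names, using the same 1D inputs as the examples:

```python
import math

def mse(y_true, y_pred):
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true, y_pred):
    return math.sqrt(mse(y_true, y_pred))

def mae(y_true, y_pred):
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

y_true = list(range(1, 12))    # 1 .. 11
y_pred = list(range(4, 15))    # 4 .. 14, every prediction is off by 3
results = (mse(y_true, y_pred), rmse(y_true, y_pred), mae(y_true, y_pred))
# (9.0, 3.0, 3.0)
```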


class pyvqnet.utils.metrics.R_Square(y_true_Qtensor, y_pred_Qtensor, sample_weight=None)

R_Square: R^2 (coefficient of determination) regression score function. The best possible score is 1.0, and the score can be negative (because a model can be arbitrarily worse). A model that always predicts the expected value of y, ignoring the input features, gets an R^2 score of 0.0.

  • y_true_Qtensor – A QTensor of shape like (n_samples,) or (n_samples, n_outputs), true target value.

  • y_pred_Qtensor – A QTensor of shape like (n_samples,) or (n_samples, n_outputs), estimated target values.

  • sample_weight – Array of shape like (n_samples,), optional sample weight, default:None.


Returns a float result.


import numpy as np
from pyvqnet.tensor import tensor
from pyvqnet.utils import metrics as vqnet_metrics
from pyvqnet import _core
_vqnet = _core.vqnet

y_true_Qtensor = tensor.arange(1, 12)
y_pred_Qtensor = tensor.arange(4, 15)
result = vqnet_metrics.R_Square(y_true_Qtensor, y_pred_Qtensor)
# 0.09999999999999998

y_true_Qtensor = tensor.arange(1, 13).reshape([3, 4])
y_pred_Qtensor = tensor.arange(4, 16).reshape([3, 4])
result = vqnet_metrics.R_Square(y_true_Qtensor, y_pred_Qtensor)
# 0.15625
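The two results above can be reproduced from the definition R^2 = 1 - SS_res / SS_tot. The sketch below is plain Python with a hypothetical helper `r_square`; for the (3, 4) case it assumes that each output column is scored separately and the scores are averaged uniformly, which is consistent with the documented value of 0.15625 but is not confirmed from the pyvqnet source.

```python
def r_square(y_true, y_pred):
    """R^2 = 1 - SS_res / SS_tot for flat sequences."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

# 1D case: SS_res = 11 * 9 = 99, SS_tot = 110, so R^2 = 1 - 99/110
y_true = list(range(1, 12))
y_pred = list(range(4, 15))
score = r_square(y_true, y_pred)

# (3, 4) case: score each output column, then average uniformly (assumption)
true_2d = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]]
pred_2d = [[v + 3 for v in row] for row in true_2d]
cols_true = list(zip(*true_2d))
cols_pred = list(zip(*pred_2d))
score_2d = sum(r_square(t, p) for t, p in zip(cols_true, cols_pred)) / len(cols_true)
print(score, score_2d)
```

Each column of the (3, 4) case has SS_res = 27 and SS_tot = 32, which gives 1 - 27/32 = 0.15625 per column.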


class pyvqnet.utils.metrics.precision_recall_f1_2_score(y_true_Qtensor, y_pred_Qtensor)

Calculate the precision, recall and F1 score of the predicted values for a binary classification task. The predicted and true values must be QTensors of the same shape (n_samples, ), with values of 0 or 1 representing the labels of the two classes.

  • y_true_Qtensor – A 1D QTensor, true target value.

  • y_pred_Qtensor – A 1D QTensor, estimated target value.


  • precision - precision result

  • recall - recall result

  • f1 - f1 score


import numpy as np
from pyvqnet.tensor import tensor
from pyvqnet.utils import metrics as vqnet_metrics
from pyvqnet import _core
_vqnet = _core.vqnet

y_true_Qtensor = tensor.QTensor([0, 0, 0, 0, 0, 1, 1, 1, 1, 1])
y_pred_Qtensor = tensor.QTensor([0, 0, 1, 1, 1, 0, 0, 1, 1, 1])

precision, recall, f1 = vqnet_metrics.precision_recall_f1_2_score(
    y_true_Qtensor, y_pred_Qtensor)
print(precision, recall, f1)
# 0.5 0.6 0.5454545454545454
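The numbers above follow directly from the confusion counts. A minimal plain-Python sketch is shown below; `precision_recall_f1` is a hypothetical helper, and returning 0.0 when a denominator is zero is an assumption of this sketch rather than documented pyvqnet behavior.

```python
def precision_recall_f1(y_true, y_pred, positive=1):
    """Binary precision, recall and F1 from true/false positive counts."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1

y_true = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
y_pred = [0, 0, 1, 1, 1, 0, 0, 1, 1, 1]
precision, recall, f1 = precision_recall_f1(y_true, y_pred)
print(precision, recall, f1)
```

Here TP = 3, FP = 3 and FN = 2, so precision is 3/6, recall is 3/5, and F1 is their harmonic mean.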


class pyvqnet.utils.metrics.precision_recall_f1_N_score(y_true_Qtensor, y_pred_Qtensor, N, average)

Precision, recall, and F1 score calculations for multi-class classification tasks. The predicted and true values must be QTensors of the same shape (n_samples, ), with integer values from 0 to N-1 representing the labels of the N classes.

  • y_true_Qtensor – A 1D QTensor, true target value.

  • y_pred_Qtensor – A 1D QTensor, estimated target value.

  • N – N classes (number of classes).

  • average

    string, [‘micro’, ‘macro’, ‘weighted’]. This parameter is required for multi-class/multi-label targets.

    'micro': Compute metrics globally by counting the total true positives, false negatives and false positives.

    'macro': Calculate the metric for each label and take their unweighted mean. Label imbalance is not taken into account.

    'weighted': Calculate the metric for each label and take their average weighted by support (the number of true instances of each label). This alters 'macro' to account for label imbalance; it may result in an F-score that is not between precision and recall.


  • precision - precision result

  • recall - recall result

  • f1 - f1 score


import numpy as np
from pyvqnet.tensor import tensor
from pyvqnet.utils import metrics as vqnet_metrics
from pyvqnet import _core
_vqnet = _core.vqnet

reference_list = [1, 1, 2, 2, 2, 3, 3, 3, 3, 3]
prediction_list = [1, 2, 2, 2, 3, 1, 2, 3, 3, 3]
y_true_Qtensor = tensor.QTensor(reference_list)
y_pred_Qtensor = tensor.QTensor(prediction_list)

precision_micro, recall_micro, f1_micro = vqnet_metrics.precision_recall_f1_N_score(
    y_true_Qtensor, y_pred_Qtensor, 3, average='micro')
print(precision_micro, recall_micro, f1_micro)
# 0.6 0.6 0.6

precision_macro, recall_macro, f1_macro = vqnet_metrics.precision_recall_f1_N_score(
    y_true_Qtensor, y_pred_Qtensor, 3, average='macro')
print(precision_macro, recall_macro, f1_macro)
# 0.5833333333333334 0.5888888888888889 0.5793650793650794

precision_weighted, recall_weighted, f1_weighted = vqnet_metrics.precision_recall_f1_N_score(
    y_true_Qtensor, y_pred_Qtensor, 3, average='weighted')
print(precision_weighted, recall_weighted, f1_weighted)
# 0.625 0.6 0.6047619047619047
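To make the three averaging modes concrete, the sketch below recomputes the precision values of the example in plain Python. The helpers `per_class_counts` and `averaged_precision` are hypothetical names introduced for illustration and are not part of pyvqnet; treating an empty denominator as 0.0 is an assumption.

```python
def per_class_counts(y_true, y_pred, labels):
    """Return {label: (tp, fp, fn)} computed one-vs-rest."""
    pairs = list(zip(y_true, y_pred))
    counts = {}
    for c in labels:
        tp = sum(1 for t, p in pairs if t == c and p == c)
        fp = sum(1 for t, p in pairs if t != c and p == c)
        fn = sum(1 for t, p in pairs if t == c and p != c)
        counts[c] = (tp, fp, fn)
    return counts

def averaged_precision(y_true, y_pred, labels, average):
    counts = per_class_counts(y_true, y_pred, labels)
    if average == "micro":
        # Pool the counts of all classes before dividing
        tp = sum(c[0] for c in counts.values())
        fp = sum(c[1] for c in counts.values())
        return tp / (tp + fp) if tp + fp else 0.0
    per_class = {c: (tp / (tp + fp) if tp + fp else 0.0)
                 for c, (tp, fp, fn) in counts.items()}
    if average == "macro":
        # Every class counts equally
        return sum(per_class.values()) / len(labels)
    if average == "weighted":
        # Every class counts in proportion to its number of true instances
        support = {c: tp + fn for c, (tp, fp, fn) in counts.items()}
        return sum(per_class[c] * support[c] for c in labels) / sum(support.values())
    raise ValueError("average must be 'micro', 'macro' or 'weighted'")

y_true = [1, 1, 2, 2, 2, 3, 3, 3, 3, 3]
y_pred = [1, 2, 2, 2, 3, 1, 2, 3, 3, 3]
results = {mode: averaged_precision(y_true, y_pred, [1, 2, 3], mode)
           for mode in ("micro", "macro", "weighted")}
print(results)
```

The per-class precisions are 0.5, 0.5 and 0.75 with supports 2, 3 and 5, which is why the weighted value (0.625) is higher than the macro value (about 0.583).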


class pyvqnet.utils.metrics.precision_recall_f1_Multi_score(y_true_Qtensor, y_pred_Qtensor, N, average)

Precision, recall, and F1 score calculations for multi-class classification tasks. The predicted and true values must be QTensors of the same shape (n_samples, N), where each row is an N-dimensional one-hot encoded label.

  • y_true_Qtensor – A 2D QTensor of shape (n_samples, N), true target value.

  • y_pred_Qtensor – A 2D QTensor of shape (n_samples, N), estimated target value.

  • N – N classes (number of classes).

  • average

    string, [‘micro’, ‘macro’, ‘weighted’]. This parameter is required for multi-class/multi-label targets.

    'micro': Compute metrics globally by counting the total true positives, false negatives and false positives.

    'macro': Calculate the metric for each label and take their unweighted mean. Label imbalance is not taken into account.

    'weighted': Calculate the metric for each label and take their average weighted by support (the number of true instances of each label). This alters 'macro' to account for label imbalance; it may result in an F-score that is not between precision and recall.


  • precision - precision result

  • recall - recall result

  • f1 - f1 score


import numpy as np
from pyvqnet.tensor import tensor
from pyvqnet.utils import metrics as vqnet_metrics
from pyvqnet import _core
_vqnet = _core.vqnet

reference_list = [[1, 0], [0, 1], [0, 0], [1, 1], [1, 0]]
prediction_list = [[1, 0], [0, 0], [1, 0], [0, 0], [0, 0]]
y_true_Qtensor = tensor.QTensor(reference_list)
y_pred_Qtensor = tensor.QTensor(prediction_list)

micro_precision, micro_recall, micro_f1 = vqnet_metrics.precision_recall_f1_Multi_score(y_true_Qtensor,
            y_pred_Qtensor, 2, average='micro')
print(micro_precision, micro_recall, micro_f1)
# 0.5 0.2 0.28571428571428575

macro_precision, macro_recall, macro_f1 = vqnet_metrics.precision_recall_f1_Multi_score(y_true_Qtensor,
            y_pred_Qtensor, 2, average='macro')
print(macro_precision, macro_recall, macro_f1)
# 0.25 0.16666666666666666 0.2

weighted_precision, weighted_recall, weighted_f1 = vqnet_metrics.precision_recall_f1_Multi_score(y_true_Qtensor,
            y_pred_Qtensor, 2, average='weighted')
print(weighted_precision, weighted_recall, weighted_f1)
# 0.3 0.19999999999999998 0.24

reference_list = [[1, 0, 0], [0, 1, 0], [0, 0, 1], [1, 1, 0], [1, 0, 1]]
prediction_list = [[1, 0, 0], [1, 0, 0], [1, 1, 1], [1, 0, 0], [0, 1, 1]]
y_true_Qtensor = tensor.QTensor(reference_list)
y_pred_Qtensor = tensor.QTensor(prediction_list)

micro_precision, micro_recall, micro_f1 = vqnet_metrics.precision_recall_f1_Multi_score(y_true_Qtensor,
            y_pred_Qtensor, 3, average='micro')
print(micro_precision, micro_recall, micro_f1) # 0.5 0.5714285714285714 0.5333333333333333

macro_precision, macro_recall, macro_f1 = vqnet_metrics.precision_recall_f1_Multi_score(y_true_Qtensor,
            y_pred_Qtensor, 3, average='macro')
print(macro_precision, macro_recall, macro_f1)
# 0.5 0.5555555555555555 0.5238095238095238

weighted_precision, weighted_recall, weighted_f1 = vqnet_metrics.precision_recall_f1_Multi_score(y_true_Qtensor,
            y_pred_Qtensor, 3, average='weighted')
print(weighted_precision, weighted_recall, weighted_f1)
# 0.5 0.5714285714285714 0.5306122448979592


class pyvqnet.utils.metrics.auc_calculate(y_true_Qtensor, y_pred_Qtensor, pos_label=None, sample_weight=None, drop_intermediate=True)

Compute the area under the ROC curve (AUC) for a binary classification task.

  • y_true_Qtensor – A QTensor of shape [n_samples]. The true binary labels. If the labels are not {-1,1} or {0,1}, pos_label should be given explicitly.

  • y_pred_Qtensor – A QTensor of shape [n_samples]. Target scores, which can be probability estimates of the positive class, confidence values, or non-thresholded decision values (as returned by “decision_function” on some classifiers).

  • pos_label – int or str. The label of the positive class. default=None. When pos_label is None, if y_true_Qtensor is in {-1,1} or {0,1}, pos_label is set to 1, otherwise an error will be raised.

  • sample_weight – array of shape (n_samples,), default=None.

  • drop_intermediate – boolean, optional (default=True). Whether to drop some suboptimal thresholds that would not appear on a plotted ROC curve.


Returns a float result.


import numpy as np
from pyvqnet.tensor import tensor
from pyvqnet.utils import metrics as vqnet_metrics
from pyvqnet import _core
_vqnet = _core.vqnet

y = np.array([1, 1, 1, 1, 0, 1, 0, 0, 0, 0])
pred = np.array([0.9, 0.8, 0.7, 0.6, 0.6, 0.4, 0.4, 0.3, 0.2, 0.1])
y_Qtensor = tensor.QTensor(y)
pred_Qtensor = tensor.QTensor(pred)
result = vqnet_metrics.auc_calculate(y_Qtensor, pred_Qtensor)
print("auc:", result)
# 0.92

y = np.array([1, 1, 1, 1, 1, 0, 0, 1, 1, 1])
pred = np.array([1, 0, 1, 1, 1, 1, 0, 1, 1, 0])
y_Qtensor = tensor.QTensor(y)
pred_Qtensor = tensor.QTensor(pred)
result = vqnet_metrics.auc_calculate(y_Qtensor, pred_Qtensor)
print("auc:", result)
# 0.625

y = [1, 2, 1, 1, 1, 0, 0, 1, 1, 1]
pred = [1, 0, 2, 1, 1, 1, 0, 1, 1, 0]
y_Qtensor = tensor.QTensor(y)
pred_Qtensor = tensor.QTensor(pred)
result = vqnet_metrics.auc_calculate(y_Qtensor, pred_Qtensor, pos_label=2)
print("auc:", result)
# 0.1111111111111111
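The AUC values above can be cross-checked with the rank interpretation of AUC: the fraction of (positive, negative) pairs in which the positive sample receives the higher score, with ties counted as one half. The sketch below is an O(n²) illustration in plain Python; `auc_pairwise` is a hypothetical helper and is not how pyvqnet is known to compute the value.

```python
def auc_pairwise(y_true, scores, pos_label=1):
    """AUC as the probability that a positive outranks a negative."""
    pos = [s for t, s in zip(y_true, scores) if t == pos_label]
    neg = [s for t, s in zip(y_true, scores) if t != pos_label]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5   # a tie counts as half a win
    return wins / (len(pos) * len(neg))

auc1 = auc_pairwise([1, 1, 1, 1, 0, 1, 0, 0, 0, 0],
                    [0.9, 0.8, 0.7, 0.6, 0.6, 0.4, 0.4, 0.3, 0.2, 0.1])
auc2 = auc_pairwise([1, 1, 1, 1, 1, 0, 0, 1, 1, 1],
                    [1, 0, 1, 1, 1, 1, 0, 1, 1, 0])
auc3 = auc_pairwise([1, 2, 1, 1, 1, 0, 0, 1, 1, 1],
                    [1, 0, 2, 1, 1, 1, 0, 1, 1, 0], pos_label=2)
print(auc1, auc2, auc3)
```

In the third call only one sample carries the positive label 2, and its score of 0 ties with two of the nine negatives, giving (0.5 + 0.5) / 9.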

Distributed Computing Module

Environment deployment

The following describes the deployment of the environment under the Linux system based on CPU and GPU distributed computing, respectively.

MPI Installation

MPI is a common library for communication between CPU processes, and the CPU distributed computing function in VQNet is built on it. The following describes how to install MPI on Linux (CPU-based distributed computing is currently supported only on Linux).

Detect if gcc, gfortran compilers are installed.

which gcc
which gfortran

If the paths to gcc and gfortran are shown, proceed to the next step. If either compiler is missing, install it first. Once the compilers are in place, download the MPICH source package with wget, then unpack, configure, and install it.

tar -zxvf mpich-3.3.2.tar.gz
cd mpich-3.3.2
./configure --prefix=/usr/local/mpich
make install

After compiling and installing mpich, configure its environment variables.

vim ~/.bashrc

# At the bottom of the document, add
export PATH="/usr/local/mpich/bin:$PATH"

After saving and exiting, reload the file with source

source ~/.bashrc

Use which (for example, which mpirun) to verify that the environment variables are configured correctly. If the path is displayed, the installation has completed successfully.
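The same checks can be scripted. The sketch below uses Python's standard shutil.which to look up the tools mentioned in this section; the tool list and the helper name `find_tools` are assumptions made for this example.

```python
import shutil

def find_tools(names=("gcc", "gfortran", "mpirun")):
    """Map each tool name to its resolved path, or None if it is not on PATH."""
    return {name: shutil.which(name) for name in names}

for name, path in find_tools().items():
    print(f"{name}: {path or 'NOT FOUND'}")
```

A tool reported as NOT FOUND either is not installed or its directory (for example /usr/local/mpich/bin) has not yet been added to PATH in the current shell.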

In addition, you can install mpi4py with pip install. If the installation fails with an error caused by an incompatibility between mpi4py and the Python environment, you can work around it as follows

# Temporarily move aside the linker (ld) shipped with the current python environment
pushd /root/anaconda3/envs/mpi39/compiler_compat && mv ld ld.bak && popd

# Re-install
pip install mpi4py

# Restore the linker
pushd /root/anaconda3/envs/mpi39/compiler_compat && mv ld.bak ld && popd

NCCL Installation

NCCL is a common library for communication between GPUs, and the GPU distributed computing function in VQNet is built on it. The following describes how to install NCCL on Linux (GPU-based distributed computing is currently supported only on Linux). This function also requires MPI, so the MPI environment must be deployed as well.

Clone the NCCL repository from GitHub to the local machine

git clone

Go to the nccl root directory and compile

cd nccl
make -j

If CUDA is not installed in the default path /usr/local/cuda, specify the CUDA path when compiling, using the following command

make CUDA_HOME=<path to cuda install>

You can also specify the installation directory with BUILDDIR, as follows

make CUDA_HOME=<path to cuda install> BUILDDIR=/usr/local/nccl

Add configuration to the .bashrc file after installation is complete

vim ~/.bashrc

# Add at the bottom
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/nccl/lib
export PATH=$PATH:/usr/local/nccl/bin

After saving, execute

source ~/.bashrc

The installation can be verified with nccl-tests

git clone
cd nccl-tests
make -j12 CUDA_HOME=/usr/local/cuda
./build/all_reduce_perf -b 8 -e 256M -f 2 -g 1

Inter-node communication environment deployment

To run distributed computing on multiple nodes, first make sure that the mpich environment and the python environment are consistent across the nodes, and then set up password-free (SSH key-based) communication between the nodes. Assume that three nodes need to be set up for password-free communication: node0 (master node), node1, and node2.

# Execute on each node

# After that, keep pressing Enter to generate a public key and a private key (id_rsa) in the .ssh folder
# Add the public keys of the other two nodes to the authorized_keys file of the first node.
# Then copy the authorized_keys file from the first node to the other two nodes to achieve password-free communication between the nodes.
# Execute on child node node1
cat ~/.ssh/ >> node0:~/.ssh/authorized_keys

# Execute on child node node2
cat ~/.ssh/ >> node0:~/.ssh/authorized_keys

# After deleting the authorized_keys files on node1 and node2, copy the authorized_keys file on node0 to the other two nodes.
scp ~/.ssh/authorized_keys  node1:~/.ssh/authorized_keys
scp ~/.ssh/authorized_keys  node2:~/.ssh/authorized_keys


It is also a good idea to set up a shared directory, so that a change to a file in the shared directory is visible on every node. This prevents files from getting out of sync across nodes when the model runs on multiple nodes. The shared directory is implemented with nfs-utils and rpcbind.

# Installation of software packages
yum -y install nfs* rpcbind

# Edit the configuration file on the master node
vim /etc/exports
/data/mpi *(rw,sync,no_all_squash,no_subtree_check)

# Start the service on the master node
systemctl start rpcbind
systemctl start nfs

# Mount the directory to be shared on all child nodes node1,node2.
mount node1:/data/mpi/ /data/mpi
mount node2:/data/mpi/ /data/mpi

CPU Distributed Computing Interface and Samples

This section describes how to use the VQNet distributed computing interface to train a model with data parallelism on CPU hardware (currently supported only on Linux).


Initialize distributed computing parameters using init_process.

pyvqnet.distributed.init.init_process(size, path, hostpath=None, train_size=None, test_size=None, shuffle=False)

Setting Distributed Computing Parameters.

  • size – Number of processes.

  • path – absolute path to the current runtime file.

  • hostpath – absolute path to the multi-node configuration file.

  • train_size – The size of the training set.

  • test_size – The size of the test set.

  • shuffle – Whether to sample the data randomly.


import argparse
import os
from pyvqnet.distributed import *

parser = argparse.ArgumentParser(description='parser example')
parser.add_argument('--init', default=False, type=bool, help='whether to use multiprocessing')
parser.add_argument('--np', default=1, type=int, help='number of processes')
parser.add_argument('--hostpath', default=None, type=str, help='multi node configuration files')
parser.add_argument('--shuffle', default=False, type=bool, help='shuffle')
parser.add_argument('--train_size', default=120, type=int, help='train_size')
parser.add_argument('--test_size', default=50, type=int, help='test_size')
args = parser.parse_args()

if args.init:
    # args.np is the number of processes parsed from --np above
    init_process(args.np, os.path.realpath(__file__))

# python --init true --np 2
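One caveat about the parser above: with type=bool, argparse calls bool() on the raw string, and every non-empty string is truthy, so --init false would still enable the option. A minimal sketch of a stricter converter follows; `str2bool` is a hypothetical helper introduced here, not part of pyvqnet or argparse.

```python
import argparse

def str2bool(value):
    """Convert common textual booleans; reject anything else."""
    text = value.strip().lower()
    if text in ("true", "1", "yes", "y"):
        return True
    if text in ("false", "0", "no", "n"):
        return False
    raise argparse.ArgumentTypeError(f"expected a boolean, got {value!r}")

parser = argparse.ArgumentParser(description="parser example")
parser.add_argument("--init", default=False, type=str2bool,
                    help="whether to use multiprocessing")
parser.add_argument("--np", default=1, type=int, help="number of processes")

# Parse an explicit argument list so the sketch runs without a command line
args = parser.parse_args(["--init", "false", "--np", "2"])
print(args.init, args.np)   # False 2
```

With the original type=bool, the same arguments would yield args.init == True, because bool("false") is True.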


Use average_parameters_allreduce to exchange model parameters between processes with allreduce and replace them with their average on every process.


Average the model parameters across all processes.


  • model – Module, the model being trained.


Returns the model with updated parameters.


from pyvqnet.distributed import average_parameters_allreduce
import numpy as np
from pyvqnet.nn.module import Module
from pyvqnet.nn.linear import Linear
from pyvqnet.nn import activation as F
from pyvqnet.distributed import *

class Net(Module):
    def __init__(self):
        super(Net, self).__init__()
        self.fc = Linear(input_channels=5, output_channels=1)

    def forward(self, x):
        x = F.ReLu()(self.fc(x))
        return x

model = Net()
print(f"rank {get_rank()} parameters is {model.parameters()}")
model = average_parameters_allreduce(model)

if get_rank() == 0:
    # Show the averaged parameters once, on rank 0
    print(model.parameters())

# mpirun -n 2 python
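Conceptually, an allreduce with averaging leaves every process holding the element-wise mean of all processes' parameters. The sketch below simulates this in plain Python without MPI; each "rank" is just a list, and `allreduce_average` is a hypothetical helper, not a pyvqnet function.

```python
def allreduce_average(per_rank_params):
    """Return the parameters each rank holds after averaging (one copy per rank)."""
    world_size = len(per_rank_params)
    n = len(per_rank_params[0])
    mean = [sum(rank[i] for rank in per_rank_params) / world_size
            for i in range(n)]
    # Every rank receives its own copy of the same averaged values
    return [list(mean) for _ in range(world_size)]

rank0 = [1.0, 2.0, 3.0]
rank1 = [3.0, 4.0, 5.0]
after = allreduce_average([rank0, rank1])
print(after)   # [[2.0, 3.0, 4.0], [2.0, 3.0, 4.0]]
```

After the call all replicas are identical, which is what keeps data-parallel workers in sync.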


Use average_grad_allreduce to exchange the model's parameter gradients between processes with allreduce and replace them with their average on every process.


Average the parameter gradients across all processes.


  • optimizer – the optimizer.


Returns the optimizer with updated gradients.


from pyvqnet.distributed import average_grad_allreduce
import numpy as np
from pyvqnet.nn.module import Module
from pyvqnet.nn.linear import Linear
from pyvqnet.nn import activation as F
from pyvqnet.distributed import *
from pyvqnet.nn.loss import MeanSquaredError
from pyvqnet.optim import Adam
from pyvqnet.tensor import tensor

class Net(Module):
    def __init__(self):
        super(Net, self).__init__()
        self.fc = Linear(input_channels=5, output_channels=1)

    def forward(self, x):
        x = F.ReLu()(self.fc(x))
        return x
model = Net()
opti = Adam(model.parameters(), lr=0.01)
actual = tensor.QTensor([1,1,1,1,1,0,0,0,0,0],dtype=6).reshape((10,1))

x = tensor.randn((10, 5))
for i in range(10):

    result = model(x)
    loss = MeanSquaredError()(actual, result)
    # Compute the gradients so that there is something to average
    loss.backward()

    print(f"rank {get_rank()} grad is {model.parameters()[0].grad}")
    opti = average_grad_allreduce(opti)
    # if get_rank() == 0 :
    print(f"rank {get_rank()} grad is {model.parameters()[0].grad}")

# mpirun -n 2 python
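Why averaging gradients is useful can be seen with a small worked example: when every process holds an equally sized shard of the data, the average of the per-shard gradients of a mean loss equals the gradient over the full batch. The sketch below checks this in plain Python for a one-parameter model y = w * x with mean squared error; the data and the helper `mse_grad` are made up for illustration.

```python
def mse_grad(w, xs, ys):
    """d/dw of mean((w*x - y)^2) over one shard."""
    return sum(2.0 * x * (w * x - y) for x, y in zip(xs, ys)) / len(xs)

w = 1.0
shard_a = ([1.0, 2.0], [2.0, 4.0])   # data held by rank 0 (illustrative)
shard_b = ([3.0, 4.0], [6.0, 9.0])   # data held by rank 1 (illustrative)

grad_a = mse_grad(w, *shard_a)
grad_b = mse_grad(w, *shard_b)
averaged = (grad_a + grad_b) / 2     # what an averaging allreduce would produce

full = mse_grad(w, shard_a[0] + shard_b[0], shard_a[1] + shard_b[1])
print(grad_a, grad_b, averaged, full)   # -5.0 -29.0 -17.0 -17.0
```

If the shards have different sizes the plain average no longer matches the full-batch gradient; a size-weighted average would be needed in that case.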


Use average_parameters_reduce to combine the model parameters of all processes with reduce and update the parameters on the specified process with their average.

pyvqnet.distributed.comm.average_parameters_reduce(model, root=0)

Average the model parameters and store the result on the specified process.

  • model – Module, the model being trained.

  • root – Rank of the process that receives the result.


Returns the model with updated parameters.


from pyvqnet.distributed import average_parameters_reduce
import numpy as np
from pyvqnet.nn.module import Module
from pyvqnet.nn.linear import Linear
from pyvqnet.nn import activation as F
from pyvqnet.distributed import *

class Net(Module):
    def __init__(self):
        super(Net, self).__init__()
        self.fc = Linear(input_channels=5, output_channels=1)

    def forward(self, x):
        x = F.ReLu()(self.fc(x))
        return x

model = Net()
print(f"rank {get_rank()} parameters is {model.parameters()}")
model = average_parameters_reduce(model)

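if get_rank() == 0:
    # Only the root process (rank 0 by default) holds the averaged parameters
    print(model.parameters())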

# mpirun -n 2 python
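The difference from allreduce is where the result ends up. In the usual reduce semantics only the root process receives the combined value, while the other processes keep what they had. The plain-Python simulation below illustrates this under that assumption; `reduce_average` is a hypothetical helper and the behavior of non-root ranks in pyvqnet is not confirmed here.

```python
def reduce_average(per_rank_params, root=0):
    """Simulate reduce: only the root rank ends up with the averaged values."""
    world_size = len(per_rank_params)
    n = len(per_rank_params[0])
    mean = [sum(rank[i] for rank in per_rank_params) / world_size
            for i in range(n)]
    result = [list(rank) for rank in per_rank_params]   # non-root ranks unchanged
    result[root] = mean
    return result

before = [[1.0, 2.0], [3.0, 6.0]]
after_reduce = reduce_average(before, root=0)
print(after_reduce)   # [[2.0, 4.0], [3.0, 6.0]]
```

Reduce is a reasonable fit when a single process collects or saves the model; allreduce is needed when every replica must continue training from the same parameters.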


Use average_grad_reduce to combine the parameter gradients of all processes with reduce and update the gradients on the specified process with their average.

pyvqnet.distributed.comm.average_grad_reduce(optimizer, root=0)

Average the parameter gradients and store the result on the specified process.

  • optimizer – the optimizer.

  • root – Rank of the process that receives the result.


Returns the optimizer with updated gradients.


from pyvqnet.distributed import average_grad_reduce
import numpy as np
from pyvqnet.nn.module import Module
from pyvqnet.nn.linear import Linear
from pyvqnet.nn import activation as F
from pyvqnet.distributed import *
from pyvqnet.nn.loss import MeanSquaredError
from pyvqnet.optim import Adam
from pyvqnet.tensor import tensor

class Net(Module):
    def __init__(self):
        super(Net, self).__init__()
        self.fc = Linear(input_channels=5, output_channels=1)

    def forward(self, x):
        x = F.ReLu()(self.fc(x))
        return x
model = Net()
opti = Adam(model.parameters(), lr=0.01)
actual = tensor.QTensor([1,1,1,1,1,0,0,0,0,0],dtype=6).reshape((10,1))

x = tensor.randn((10, 5))
for i in range(10):

    result = model(x)
    loss = MeanSquaredError()(actual, result)
    # Compute the gradients so that there is something to reduce
    loss.backward()

    print(f"rank {get_rank()} grad is {model.parameters()[0].grad}")
    opti = average_grad_reduce(opti)
    # if get_rank() == 0 :
    print(f"rank {get_rank()} grad is {model.parameters()[0].grad}")

# mpirun -n 2 python


Importing related libraries

import sys
import time
import os
import struct
import gzip
from pyvqnet.nn.module import Module
from pyvqnet.nn.linear import Linear
from pyvqnet.nn.conv import Conv2D

from pyvqnet.nn import activation as F
from pyvqnet.nn.pooling import MaxPool2D
from pyvqnet.nn.loss import CategoricalCrossEntropy
from pyvqnet.optim.adam import Adam
from pyvqnet.data import data_generator
from pyvqnet.tensor import tensor
from pyvqnet.tensor.tensor import QTensor
import pyqpanda as pq
import urllib.request  # used by _download below
import numpy as np
import matplotlib
from pyvqnet.distributed import *
import argparse

Data Acquisition

url_base = ""
key_file = {
    "train_img": "train-images-idx3-ubyte.gz",
    "train_label": "train-labels-idx1-ubyte.gz",
    "test_img": "t10k-images-idx3-ubyte.gz",
    "test_label": "t10k-labels-idx1-ubyte.gz"
}
if_show_sample = 0
grad_time = []
forward_time = []
forward_time_sum = []

def _download(dataset_dir, file_name):
    """Download mnist data if needed."""
    file_path = dataset_dir + "/" + file_name

    if os.path.exists(file_path):
        # Archive already present: only make sure it has been decompressed
        with gzip.GzipFile(file_path) as file:
            file_path_ungz = file_path[:-3].replace("\\", "/")
            # Use the same ".idx" file name that load_mnist expects
            file_path_ungz = file_path_ungz.replace("-idx", ".idx")
            if not os.path.exists(file_path_ungz):
                open(file_path_ungz, "wb").write(file.read())
        return

    print("Downloading " + file_name + " ... ")
    urllib.request.urlretrieve(url_base + file_name, file_path)
    if os.path.exists(file_path):
        with gzip.GzipFile(file_path) as file:
            file_path_ungz = file_path[:-3].replace("\\", "/")
            file_path_ungz = file_path_ungz.replace("-idx", ".idx")
            if not os.path.exists(file_path_ungz):
                open(file_path_ungz, "wb").write(file.read())

def download_mnist(dataset_dir):
    for v in key_file.values():
        _download(dataset_dir, v)

def load_mnist(dataset="training_data", digits=np.arange(2), path="./"):
    """load mnist data"""
    from array import array as pyarray
    if dataset == "training_data":
        fname_image = os.path.join(path, "train-images.idx3-ubyte").replace(
            "\\", "/")
        fname_label = os.path.join(path, "train-labels.idx1-ubyte").replace(
            "\\", "/")
    elif dataset == "testing_data":
        fname_image = os.path.join(path, "t10k-images.idx3-ubyte").replace(
            "\\", "/")
        fname_label = os.path.join(path, "t10k-labels.idx1-ubyte").replace(
            "\\", "/")
    else:
        raise ValueError("dataset must be 'training_data' or 'testing_data'")

    # Label file: 8-byte header (magic number, item count), then one byte per label
    with open(fname_label, "rb") as flbl:
        _, size = struct.unpack(">II", flbl.read(8))
        lbl = pyarray("b", flbl.read())

    # Image file: 16-byte header (magic number, count, rows, cols), then the pixels
    with open(fname_image, "rb") as fimg:
        _, size, rows, cols = struct.unpack(">IIII", fimg.read(16))
        img = pyarray("B", fimg.read())

    ind = [k for k in range(size) if lbl[k] in digits]
    num = len(ind)
    images = np.zeros((num, rows, cols))
    labels = np.zeros((num, 1), dtype=int)
    for i in range(len(ind)):
        images[i] = np.array(img[ind[i] * rows * cols:(ind[i] + 1) * rows *
                                 cols]).reshape((rows, cols))
        labels[i] = lbl[ind[i]]

    return images, labels

def data_select(train_num, test_num):
    """Select data from mnist dataset."""

    x_train, y_train = load_mnist("training_data")
    x_test, y_test = load_mnist("testing_data")
    idx_train = np.append(
            np.where(y_train == 0)[0][0:train_num],
            np.where(y_train == 1)[0][0:train_num])
    x_train = x_train[idx_train]
    y_train = y_train[idx_train]
    x_train = x_train / 255
    y_train = np.eye(2)[y_train].reshape(-1, 2)

    idx_test = np.append(
            np.where(y_test == 0)[0][:test_num],
            np.where(y_test == 1)[0][:test_num])
    x_test = x_test[idx_test]
    y_test = y_test[idx_test]
    x_test = x_test / 255
    y_test = np.eye(2)[y_test].reshape(-1, 2)

    return x_train, y_train, x_test, y_test
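The IDX layout that load_mnist relies on can be checked without downloading anything. The sketch below builds a tiny synthetic image buffer and parses it with the same big-endian ">IIII" header format; the buffer contents and the helper `parse_idx_images` are made up for this demonstration, and 2051 is the magic number of MNIST image files.

```python
import struct

def parse_idx_images(raw):
    """Parse an IDX3 image buffer into (count, rows, cols, pixels)."""
    magic, count, rows, cols = struct.unpack(">IIII", raw[:16])
    if magic != 2051:
        raise ValueError(f"unexpected magic number {magic}")
    pixels = raw[16:]
    if len(pixels) != count * rows * cols:
        raise ValueError("pixel data does not match the header")
    return count, rows, cols, pixels

# Synthetic buffer: two 2x2 "images", built only for this demonstration
raw = struct.pack(">IIII", 2051, 2, 2, 2) + bytes([0, 255, 128, 64, 1, 2, 3, 4])
count, rows, cols, pixels = parse_idx_images(raw)

# Image k occupies the slice [k*rows*cols, (k+1)*rows*cols), as in load_mnist
second = list(pixels[1 * rows * cols:2 * rows * cols])
print(count, rows, cols, second)   # 2 2 2 [1, 2, 3, 4]
```

The same slicing arithmetic appears in load_mnist, where each selected image is cut out of the flat pixel array and reshaped to (rows, cols).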

Model Definition

def circuit_func(weights):
    """A function using QPanda to create quantum circuits and run."""
    num_qubits = 1
    shots = 1000
    machine = pq.CPUQVM()
    machine.init_qvm()
    qubits = machine.qAlloc_many(num_qubits)
    cbits = machine.cAlloc_many(num_qubits)
    circuit = pq.QCircuit()
    circuit.insert(pq.RY(qubits[0], weights[0]))
    prog = pq.QProg()
    # Add the circuit to the program before measuring
    prog.insert(circuit)
    prog << pq.measure_all(qubits, cbits)  #pylint:disable=expression-not-assigned

    result = machine.run_with_configuration(prog, cbits, shots)

    counts = np.array(list(result.values()))
    states = np.array(list(result.keys())).astype(float)
    # Compute probabilities for each state (counts divided by the number of shots)
    probabilities = counts / shots
    # Get state expectation
    expectation = np.sum(states * probabilities)
    return expectation

class Hybrid(Module):
    """ Hybrid quantum - Quantum layer definition """
    def __init__(self, shift):
        super(Hybrid, self).__init__()
        self.shift = shift
        self.input = None

    def forward(self, x):
        self.input = x
        expectation_z = circuit_func(np.array(x.to_numpy()))
        result = [[expectation_z]]
        # requires_grad = x.requires_grad and not QTensor.NO_GRAD
        requires_grad = x.requires_grad
        def _backward_mnist(g, x):
            """ Backward pass computation """
            start_grad_time = time.time()
            input_list = np.array(x.to_numpy())
            shift_right = input_list + np.ones(input_list.shape) * self.shift
            shift_left = input_list - np.ones(input_list.shape) * self.shift

            gradients = []
            for i in range(len(input_list)):
                expectation_right = circuit_func(shift_right[i])
                expectation_left = circuit_func(shift_left[i])
                gradient = expectation_right - expectation_left
                gradients.append(gradient)
            gradients = np.array([gradients]).T

            end_grad_time = time.time()
            grad_time.append(end_grad_time - start_grad_time)
            in_g = gradients * np.array(g)
            return in_g

        nodes = []
        if x.requires_grad:
            nodes.append(
                QTensor.GraphNode(tensor=x,
                                  df=lambda g: _backward_mnist(g, x)))
        return QTensor(data=result, requires_grad=requires_grad, nodes=nodes)

class Net(Module):
    """Hybrid Quantum Classic Neural Network Module"""
    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = Conv2D(input_channels=1,
                            output_channels=6,
                            kernel_size=(5, 5),
                            stride=(1, 1),
                            padding="valid")
        self.maxpool1 = MaxPool2D([2, 2], [2, 2], padding="valid")
        # 16 channels on the final 4x4 feature map give the 256 inputs of fc1
        self.conv2 = Conv2D(input_channels=6,
                            output_channels=16,
                            kernel_size=(5, 5),
                            stride=(1, 1),
                            padding="valid")
        self.maxpool2 = MaxPool2D([2, 2], [2, 2], padding="valid")

        self.fc1 = Linear(input_channels=256, output_channels=64)
        self.fc2 = Linear(input_channels=64, output_channels=1)

        self.hybrid = Hybrid(np.pi / 2)
        self.fc3 = Linear(input_channels=1, output_channels=2)

    def forward(self, x):
        start_time_forward = time.time()
        x = F.ReLu()(self.conv1(x))

        x = self.maxpool1(x)
        x = F.ReLu()(self.conv2(x))

        x = self.maxpool2(x)
        x = tensor.flatten(x, 1)

        x = F.ReLu()(self.fc1(x))
        x = self.fc2(x)

        start_time_hybrid = time.time()
        x = self.hybrid(x)

        end_time_hybrid = time.time()

        forward_time.append(end_time_hybrid - start_time_hybrid)

        x = self.fc3(x)
        end_time_forward = time.time()
        forward_time_sum.append(end_time_forward - start_time_forward)
        return x
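The backward pass of the Hybrid layer evaluates the circuit at shifted parameter values, which is the idea behind the parameter-shift rule. The sketch below checks the rule analytically for an ideal, noiseless RY(theta) applied to |0>, where the probability of measuring 1 is sin²(theta/2); it uses only the math module and does not call pyqpanda. The function names are hypothetical.

```python
import math

def expectation(theta):
    """Ideal probability of measuring |1> after RY(theta) on |0>."""
    return math.sin(theta / 2.0) ** 2

def parameter_shift(f, theta, shift=math.pi / 2):
    """Half the difference of two shifted evaluations gives df/dtheta."""
    return (f(theta + shift) - f(theta - shift)) / 2.0

theta = 0.7
estimate = parameter_shift(expectation, theta)
exact = math.sin(theta) / 2.0          # analytic derivative of sin^2(theta/2)
print(abs(estimate - exact) < 1e-12)   # True
```

Note that _backward_mnist above uses the plain difference of the two shifted expectations without the factor 1/2, so under these idealized assumptions its gradient is twice the exact derivative; a constant factor like this only rescales the update.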

split_data, average_parameters_allreduce, and init_process are used during training to implement distributed computation based on CPU data parallelism.

They are used as follows

def run(args):
    """Run mnist train function"""
    x_train, y_train, x_test, y_test = data_select(args.train_size, args.test_size)

    x_train, y_train= split_data(x_train, y_train)
    model = Net()
    optimizer = Adam(model.parameters(), lr=0.001)
    loss_func = CategoricalCrossEntropy()

    epochs = 10
    train_loss_list = []
    val_loss_list = []
    train_acc_list = []
    val_acc_list = []

    for epoch in range(1, epochs):
        total_loss = []
        batch_size = 1
        correct = 0
        n_train = 0

        for x, y in data_generator(x_train,
                                   y_train,
                                   batch_size=batch_size,
                                   shuffle=True):

            x = x.reshape(-1, 1, 28, 28)
            optimizer.zero_grad()

            output = model(x)
            loss = loss_func(y, output)
            loss_np = loss.to_numpy()

            np_output = output.to_numpy()
            mask = (np_output.argmax(1) == y.argmax(1))
            correct += np.sum(np.array(mask))
            n_train += batch_size

            loss.backward()
            # Alternative: average the parameter gradients across processes with allreduce before each step
            # optimizer = average_grad_allreduce(optimizer)
            optimizer._step()
            total_loss.append(loss_np)

        # Average the model parameters across processes once per epoch
        model = average_parameters_allreduce(model)

        train_loss_list.append(np.sum(total_loss) / len(total_loss))
        train_acc_list.append(np.sum(correct) / n_train)
        print("{:.0f} loss is : {:.10f}".format(epoch, train_loss_list[-1]))

        correct = 0
        n_eval = 0

        for x, y in data_generator(x_test, y_test, batch_size=1, shuffle=True):
            x = x.reshape(-1, 1, 28, 28)
            output = model(x)
            loss = loss_func(y, output)
            loss_np = loss.to_numpy()
            np_output = output.to_numpy()
            mask = (np_output.argmax(1) == y.argmax(1))
            correct += np.sum(np.array(mask))
            n_eval += 1

        print(f"Eval Accuracy: {correct / n_eval}")
        val_loss_list.append(np.sum(total_loss) / len(total_loss))
        val_acc_list.append(np.sum(correct) / n_eval)

if __name__ == "__main__":

    parser = argparse.ArgumentParser(description='parser example')
    parser.add_argument('--init', default=False, type=bool, help='whether to use multiprocessing')
    parser.add_argument('--np', default=1, type=int, help='number of processes')
    parser.add_argument('--hostpath', default=None, type=str, help='hosts absolute path')
    parser.add_argument('--shuffle', default=False, type=bool, help='shuffle')
    parser.add_argument('--train_size', default=120, type=int, help='train_size')
    parser.add_argument('--test_size', default=50, type=int, help='test_size')
    args = parser.parse_args()
    # p_path = os.path.realpath (__file__)

    if args.init:
        init_process(args.np, os.path.realpath(__file__), args.hostpath, args.train_size, args.test_size, args.shuffle)
    else:
        a = time.time()
        run(args)
        b = time.time()
        if get_rank() == 0:
            print("time: {}",format(b-a))

Here, init indicates whether the model is trained in distributed mode and np is the number of processes. hostpath is the absolute path of the configuration file used when the model runs on multiple nodes. The configuration file contains the IP addresses of the nodes and the number of processes allocated to each, as follows


At the command line

python --init true

1 loss is : 0.8230862300
Eval Accuracy: 0.5
9 loss is : 0.5660219193
Eval Accuracy: 0.46
time: {} 15.132369756698608

python --init true --np 2


1 loss is : 0.0316730281
Eval Accuracy: 0.5
9 loss is : 0.0006756162
Eval Accuracy: 0.5

1 loss is : 0.0072183679
Eval Accuracy: 0.85
9 loss is : 0.0001979264
Eval Accuracy: 0.82
time: {} 9.132536888122559

The above is multi-process model training on a single node. The training time is clearly shorter: about 9.1 s with two processes, compared with about 15.1 s for the single-process run.

To train on multiple nodes, the command is as follows

python3 --init true --np 4 --hostpath ~/example/host.txt

1 loss is : 0.8609524409
Eval Accuracy: 0.5
9 loss is : 0.4251357079
Eval Accuracy: 0.5
time: {} 6.5950517654418945

1 loss is : 0.0034498004
Eval Accuracy: 0.5
9 loss is : 0.0001483827
Eval Accuracy: 0.5

1 loss is : 0.0990966797
Eval Accuracy: 0.5
9 loss is : 0.0037492002
Eval Accuracy: 0.5

1 loss is : 0.8468652089
Eval Accuracy: 0.5
Eval Accuracy: 0.53
9 loss is : 0.4186156909
Eval Accuracy: 0.52
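The timings printed in the logs above can be turned into speedup and parallel efficiency figures. The sketch below uses only the three "time" values shown in this section; these are single measurements, so the result is indicative rather than a benchmark, and the helper name is hypothetical.

```python
def speedup_and_efficiency(t_serial, t_parallel, workers):
    """Speedup relative to the serial run, and speedup per worker."""
    speedup = t_serial / t_parallel
    return speedup, speedup / workers

baseline = 15.132369756698608    # run without --np (one process)
two_proc = 9.132536888122559     # run with --np 2
four_proc = 6.5950517654418945   # run with --np 4 and --hostpath

for workers, elapsed in ((2, two_proc), (4, four_proc)):
    s, e = speedup_and_efficiency(baseline, elapsed, workers)
    print(f"{workers} processes: speedup {s:.2f}x, efficiency {e:.0%}")
```

Efficiency below 100% is expected, since time is spent on process startup and on communicating parameters between processes and nodes.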

GPU Distributed Computing Interface and Samples


Use nccl_average_parameters_allreduce to pass and update model parameters on different processes in an allreduce manner.

pyvqnet.distributed.nccl_api.nccl_average_parameters_allreduce(model, Ncclop: NCCL_api, c_op='avg')

Average the model parameters across GPU processes with allreduce.

  • model – Module, the model for training.

  • Ncclop – NCCL_api, the NCCL communication object.

  • c_op – Calculation method, default 'avg'.


import numpy as np
from pyvqnet.nn.module import Module
from pyvqnet.nn.linear import Linear
from pyvqnet.nn import activation as F
from pyvqnet.distributed.nccl_api import *

nccl_op = NCCL_api()

class Net(Module):
    def __init__(self):
        super(Net, self).__init__()
        self.fc = Linear(input_channels=5, output_channels=1)
    def forward(self, x):
        x = F.ReLu()(self.fc(x))
        return x
model = Net().toGPU(1000 + get_rank())
print(f"rank {get_rank()} parameters is {model.parameters()}")
nccl_average_parameters_allreduce(model, nccl_op)

# After the allreduce, print the averaged parameters once, on rank 0.
if get_rank() == 0:
    print(model.parameters())

# mpirun -n 2 python

# rank 1 parameters is [[[ 0.8647987],
#  [ 0.8910748],
#  [-0.3896213],
#  [-0.871486 ],
#  [-0.8997867]], [0.4014191]]
# rank 0 parameters is [[[-0.6880538],
#  [ 0.0963508],
#  [-0.3776291],
#  [ 0.1773794],
#  [ 0.6670241]], [-0.1019871]]
# [[[ 0.0883724],
#  [ 0.4937128],
#  [-0.3836252],
#  [-0.3470533],
#  [-0.1163813]], [0.149716]]
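
The third block of printed values is the element-wise mean of the two ranks' weights. The following sketch reproduces that with plain Python lists. It is a simulation of the 'avg' allreduce idea only, not the NCCL implementation, and allreduce_avg is a hypothetical helper name.

```python
def allreduce_avg(per_rank_values):
    """Simulate an 'avg' allreduce: every rank ends up with the element-wise mean."""
    world_size = len(per_rank_values)
    mean = [sum(column) / world_size for column in zip(*per_rank_values)]
    # Each rank receives its own copy of the same averaged values.
    return [list(mean) for _ in range(world_size)]


# Weights of the Linear layer as printed by rank 0 and rank 1 in the output above.
rank0_w = [-0.6880538, 0.0963508, -0.3776291, 0.1773794, 0.6670241]
rank1_w = [0.8647987, 0.8910748, -0.3896213, -0.871486, -0.8997867]

after = allreduce_avg([rank0_w, rank1_w])
print([round(v, 7) for v in after[0]])
```

Up to rounding in the last digit, the result matches the averaged weights printed by rank 0.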


Use nccl_average_parameters_reduce to pass and update model parameters on different processes in a reduce manner.

pyvqnet.distributed.nccl_api.nccl_average_parameters_reduce(model, Ncclop: NCCL_api, root=0, c_op='avg')

Set parameters for distributed computation.


param model:

Module - the model for training.

param Ncclop:

NCCL_api - the NCCL communication object.

param root:

Specifies the process number.

param c_op:

Calculation method.


import numpy as np
from pyvqnet.nn.module import Module
from pyvqnet.nn.linear import Linear
from pyvqnet.nn import activation as F
from pyvqnet.distributed.nccl_api import *

nccl_op = NCCL_api()

class Net(Module):
    def __init__(self):
        super(Net, self).__init__()
        self.fc = Linear(input_channels=5, output_channels=1)
    def forward(self, x):
        x = F.ReLu()(self.fc(x))
        return x
model = Net().toGPU(1000 + get_rank())
print(f"rank {get_rank()} parameters is {model.parameters()}")

nccl_average_parameters_reduce(model, nccl_op)

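# After the reduce, print the averaged parameters held by the root process.
if get_rank() == 0:
    print(model.parameters())
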
# mpirun -n 2 python

# rank 1 parameters is [[[-0.7666817],
#  [ 0.3023796],
#  [-0.6021696],
#  [ 0.5293468],
#  [-0.1318247]], [0.4162451]]
# rank 0 parameters is [[[ 0.1145883],
#  [-0.3539237],
#  [ 0.8672745],
#  [ 0.5483069],
#  [-0.5038487]], [0.4179307]]
# [[[-0.3260467],
#  [-0.025772 ],
#  [ 0.1325525],
#  [ 0.5388269],
#  [-0.3178367]], [0.4170879]]
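
The difference from allreduce is that a reduce delivers the result to one process only. The sketch below simulates this with plain lists. It assumes that the non-root ranks keep their original values, which the source does not confirm, and reduce_avg is a hypothetical helper name rather than part of the library.

```python
def reduce_avg(per_rank_values, root=0):
    """Simulate an 'avg' reduce: only `root` receives the element-wise mean."""
    world_size = len(per_rank_values)
    mean = [sum(column) / world_size for column in zip(*per_rank_values)]
    # Assumption: ranks other than root keep the values they started with.
    result = [list(values) for values in per_rank_values]
    result[root] = mean
    return result


# Weights of the Linear layer as printed by rank 0 and rank 1 in the output above.
rank0_w = [0.1145883, -0.3539237, 0.8672745, 0.5483069, -0.5038487]
rank1_w = [-0.7666817, 0.3023796, -0.6021696, 0.5293468, -0.1318247]

reduced = reduce_avg([rank0_w, rank1_w], root=0)
print("root:", [round(v, 7) for v in reduced[0]])
print("rank 1 unchanged:", reduced[1] == rank1_w)
```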


Use nccl_average_grad_allreduce to pass and update parameter gradients on different processes in an allreduce fashion.

pyvqnet.distributed.nccl_api.nccl_average_grad_allreduce(optimizer, Ncclop: NCCL_api, c_op='avg')

Sets parameters for distributed computation.

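param optimizer:

The optimizer that holds the parameters whose gradients are averaged.

param Ncclop:

NCCL_api - the NCCL communication object.

param c_op:

Calculation method.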

import numpy as np
from pyvqnet.nn.module import Module
from pyvqnet.nn.linear import Linear
from pyvqnet.nn import activation as F
from pyvqnet.distributed.nccl_api import *
from pyvqnet.nn.loss import MeanSquaredError
from pyvqnet.optim import Adam
from pyvqnet.tensor import tensor

nccl_op = NCCL_api()

class Net(Module):
    def __init__(self):
        super(Net, self).__init__()
        self.fc = Linear(input_channels=5, output_channels=1)
    def forward(self, x):
        x = F.ReLu()(self.fc(x))
        return x

model = Net().toGPU(1000+ get_rank())
opti = Adam(model.parameters(), lr=0.01)

actual = tensor.QTensor([1,1,1,1,1,0,0,0,0,0],dtype=6).reshape((10,1)).toGPU(1000+get_rank())

x = tensor.randn((10, 5)).toGPU(1000+get_rank())

for i in range(10):
    opti.zero_grad()
    result = model(x)
    loss = MeanSquaredError()(actual, result)
    # Backpropagate so that each parameter has a gradient to average.
    loss.backward()

    print(f"rank {get_rank()} grad is {model.parameters()[0].grad}")

    nccl_average_grad_allreduce(opti, nccl_op)
    if get_rank() == 0 :
        print(f"rank {get_rank()} grad is {model.parameters()[0].grad}")


# mpirun -n 2 python
# rank 1 grad is [[-0.2537998],
#  [-0.0411504],
#  [-0.3565139],
#  [ 0.5702319],
#  [ 0.0177623]]
# rank 0 grad is [[-0.1322807],
#  [ 0.481559 ],
#  [-0.8823745],
#  [ 0.211081 ],
#  [-0.0234532]]
# rank 0 grad is [[-0.1930403],
#  [ 0.2202043],
#  [-0.6194442],
#  [ 0.3906564],
#  [-0.0028455]]
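
Averaging gradients across processes is what makes data-parallel training behave like training on the whole batch. The sketch below checks this for a one-parameter linear model with a mean-squared-error loss. It assumes equal-sized shards and a loss averaged over samples; mse_grad is a hypothetical helper, and nothing here calls the library.

```python
def mse_grad(w, xs, ys):
    """Gradient of mean((w * x - y) ** 2) with respect to the scalar weight w."""
    n = len(xs)
    return 2.0 / n * sum((w * x - y) * x for x, y in zip(xs, ys))


w = 0.5
xs = [1.0, 2.0, 3.0, 4.0]
ys = [1.0, 0.0, 1.0, 0.0]

# Gradient computed on the full batch by a single process.
full = mse_grad(w, xs, ys)

# Each of two simulated ranks computes the gradient on its own half of the data.
shard_grads = [mse_grad(w, xs[:2], ys[:2]), mse_grad(w, xs[2:], ys[2:])]

# An 'avg' allreduce leaves every rank holding the mean of the shard gradients.
averaged = sum(shard_grads) / len(shard_grads)

print(full, averaged)
```

With unequal shard sizes the plain mean would no longer equal the full-batch gradient, which is one reason to split the data evenly across processes.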


Use nccl_average_grad_reduce to pass and update parameter gradients on different processes in a reduce fashion.

pyvqnet.distributed.nccl_api.nccl_average_grad_reduce(optimizer, Ncclop: NCCL_api, root=0, c_op='avg')

Set parameters for distributed computation.

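param optimizer:

The optimizer that holds the parameters whose gradients are averaged.

param Ncclop:

NCCL_api - the NCCL communication object.
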
param root:

Update parameter gradient on specified node.

param c_op:

Calculation method.


import numpy as np
from pyvqnet.nn.module import Module
from pyvqnet.nn.linear import Linear
from pyvqnet.nn import activation as F
from pyvqnet.distributed.nccl_api import *
from pyvqnet.nn.loss import MeanSquaredError
from pyvqnet.optim import Adam
from pyvqnet.tensor import tensor

nccl_op = NCCL_api()

class Net(Module):
    def __init__(self):
        super(Net, self).__init__()
        self.fc = Linear(input_channels=5, output_channels=1)
    def forward(self, x):
        x = F.ReLu()(self.fc(x))
        return x

model = Net().toGPU(1000+ get_rank())
opti = Adam(model.parameters(), lr=0.01)

actual = tensor.QTensor([1,1,1,1,1,0,0,0,0,0],dtype=6).reshape((10,1)).toGPU(1000+get_rank())

x = tensor.randn((10, 5)).toGPU(1000+get_rank())

for i in range(10):
    opti.zero_grad()
    result = model(x)
    loss = MeanSquaredError()(actual, result)
    # Backpropagate so that each parameter has a gradient to average.
    loss.backward()

    print(f"rank {get_rank()} grad is {model.parameters()[0].grad}")

    nccl_average_grad_reduce(opti, nccl_op)
    if get_rank() == 0 :
        print(f"rank {get_rank()} grad is {model.parameters()[0].grad}")


# mpirun -n 2 python

# rank 1 grad is [[ 0.2536973],
#  [ 0.1971456],
#  [ 0.2229966],
#  [-0.1126524],
#  [-0.4308025]]
# rank 0 grad is [[-0.7967089],
#  [ 0.3266841],
#  [ 0.087491 ],
#  [-2.0684564],
#  [ 1.0999191]]
# rank 0 grad is [[-0.2715058],
#  [ 0.2619148],
#  [ 0.1552438],
#  [-1.0905544],
#  [ 0.3345583]]


from pyvqnet.qnn.vqc import *
from pyvqnet.optim import Adam
from pyvqnet.nn import Module, BinaryCrossEntropy, Sigmoid
from pyvqnet.data import data_generator
import numpy as np
from sklearn import datasets
from sklearn.model_selection import train_test_split
# tensor is imported as well because get_accuary below calls tensor.sums.
from pyvqnet.tensor import QTensor, tensor

from pyvqnet.distributed.nccl_api import *
from pyvqnet.distributed import split_data, broadcast_model_params

from time import time

# NCCL init
nccl_op = NCCL_api()

iris_dataset = datasets.load_iris()

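# Keep the first 100 samples, which cover the first two iris classes.
X1 = iris_dataset.data[:100, :].astype(np.float32)
X_feature_names = iris_dataset.feature_names
y = iris_dataset.target[:100].astype(int)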
y_target_names = iris_dataset.target_names[:2]

alpha = X1[:, :3] * X1[:,1:]
X1 = np.append(X1, alpha, axis=1)
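# The remaining train_test_split arguments are cut off in the source;
# the scikit-learn defaults are used here as an assumption.
X_train, X_test, y_train, y_test = train_test_split(X1, y)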

class Q_model(Module):
    def __init__(self):
        super(Q_model, self).__init__()

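        # NOTE: the constructor arguments of VQC_HardwareEfficientAnsatz are
        # truncated in the source and are left elided here.
        self.hardward = VQC_HardwareEfficientAnsatz(...)
        obs_list = [{
            'wires': [2, 3],
            'observables': ['Z', 'Z'],
            'coefficient': [1, 1]
        }]
        # print(obs_list)
        # The attribute names ma and ac are placeholders; the original names
        # are missing from the source.
        self.ma = MeasureAll(obs=obs_list)
        self.ac = Sigmoid()
        self.qm = QMachine(4)

    def forward(self, input):
        qm = self.qm

        def cir(qm, x):
            for i in range(4):
                hadamard(qm, i)

            for i in range(4):
                rz(qm, i, x[:, [i]])

            for i in range(3):
                cnot(qm, [i, i + 1])
                rz(qm, i + 1, x[:, [4 + i]])
            return qm

        qm = cir(qm, input)
        # Measure the observables, then squash the result into (0, 1).
        y = self.ma(qm)
        y = self.ac(y)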

        return y

def run():
    """Main run function"""

    model = Q_model()
    model = broadcast_model_params(model)
    model = model.toGPU(1000 + get_rank())
    # print(model.parameters())
    optimizer = Adam(model.parameters(), lr=0.1)
    batch_size = 20
    epoch = 20
    loss = BinaryCrossEntropy()
    print("start training..............")

    datas, labels= split_data(X_train, y_train)

    def get_accuary(result, label):
        result = (result > 0.5).astype(4)
        score = tensor.sums(result == label)
        return score

    time2 = time()
    runtime = 0
    for i in range(epoch):
        count = 0
        sum_loss = 0
        accuary = 0
        t = 0
        for data, label in data_generator(datas, labels, batch_size, False):
            time3 = time()
            data, label = QTensor(data,requires_grad=True).toGPU(1000 + get_rank()), QTensor(label,
                                                 requires_grad=False).toGPU(1000 + get_rank())

            result = model(data)

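            optimizer.zero_grad()
            loss_b = loss(label.reshape([-1, 1]), result)
            # Backpropagate, average the gradients across processes, then update.
            # The backward and update calls are missing from the source;
            # _step() is assumed to be the optimizer's update method.
            loss_b.backward()
            nccl_average_grad_allreduce(optimizer, nccl_op)
            optimizer._step()
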
            sum_loss += loss_b.item()
            count += batch_size
            accuary += get_accuary(result, label.reshape([-1,1]))
            t = t + 1
            runtime += time() - time3

        # nccl_average_parameters_reduce(model, nccl_op)
        if get_rank()==0:
            print(
                f"epoch:{i}, #### loss:{sum_loss/count} #####accuray:{accuary/count}"
            )

    print("start testing..............")
    count = 0
    if get_rank() == 0:
        print(time() - time2)
    test_data, test_label = X_test, y_test
    test_batch_size = 5
    accuary = 0
    sum_loss = 0
    for testd, testl in data_generator(test_data, test_label, test_batch_size):
        testd = QTensor(testd).toGPU(1000+get_rank())
        testl = QTensor(testl, dtype=6).toGPU(1000+get_rank())
        test_result = model(testd)
        test_loss = loss(testl.reshape([-1, 1]), test_result)
        sum_loss += test_loss
        count += test_batch_size
        accuary += get_accuary(test_result, testl.reshape([-1, 1]))
    if get_rank()==0:
        print(
            f"test:--------------->loss:{sum_loss/count} #####accuray:{accuary/count}"
        )


if __name__ == "__main__":
    run()


In multi-process, use split_data to slice the data according to the number of processes and return the data on the corresponding process.

pyvqnet.distributed.datasplit.split_data(x_train, y_train, shuffle=False)

Set parameters for distributed computation.

param x_train:

np.array - training data.

param y_train:

np.array - Training data labels.

param shuffle:

bool - Whether to shuffle and then slice, default is False.


sliced training data and labels.


from pyvqnet.distributed import split_data
import numpy as np

x_train = np.random.randint(255, size = (100, 5))
y_train = np.random.randint(2, size = (100, 1))

x_train, y_train= split_data(x_train, y_train)
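
The idea of slicing by process count can be illustrated without launching multiple processes. This is a minimal sketch assuming contiguous, equal-sized slices with any remainder dropped; the actual split_data may treat remainders and shuffling differently, and split_for_rank is a hypothetical helper rather than a library function.

```python
def split_for_rank(xs, ys, rank, world_size):
    """Hypothetical helper: return the contiguous slice of (xs, ys) owned by `rank`."""
    per_rank = len(xs) // world_size  # samples per process; remainder is dropped
    start = rank * per_rank
    return xs[start:start + per_rank], ys[start:start + per_rank]


xs = list(range(100))          # stand-in for 100 training samples
ys = [v % 2 for v in xs]       # stand-in for binary labels

# Simulate two processes by calling the helper once per rank.
shards = [split_for_rank(xs, ys, rank, 2) for rank in range(2)]
for rank, (shard_x, shard_y) in enumerate(shards):
    print(f"rank {rank}: {len(shard_x)} samples, first sample {shard_x[0]}")
```

In a real run each process would call the function once with its own rank, so every process trains on a different part of the data.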


Use broadcast_model_params to broadcast the model parameters on the specified process to other processes before model training to keep the parameters consistent before model training.

pyvqnet.distributed.comm.broadcast_model_params(model, root=0)

Set parameters for distributed computation.


param model:

Module - the model for training.

param root:

Specified process number.


from pyvqnet.nn.module import Module
from pyvqnet.nn.linear import Linear
from pyvqnet.nn import activation as F
from pyvqnet.distributed import broadcast_model_params, get_rank

class Net(Module):
    def __init__(self):
        super(Net, self).__init__()
        self.fc = Linear(input_channels=5, output_channels=1)
    def forward(self, x):
        x = F.ReLu()(self.fc(x))
        return x

model = Net()
print(f"bcast before rank {get_rank()}:{model.parameters()}")
model = broadcast_model_params(model)
model = model.toGPU(1000+ get_rank())
print(f"bcast after rank {get_rank()}: {model.parameters()}")

# mpirun -n 2 python
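
The effect of a broadcast can be sketched with plain dictionaries: before the call each rank has its own randomly initialised parameters, and afterwards every rank holds a copy of the root's values. This is a simulation of the concept only; broadcast below is a hypothetical helper and the parameter values are made up for illustration.

```python
import copy


def broadcast(per_rank_params, root=0):
    """Simulate a broadcast: every rank receives a copy of the root's parameters."""
    return [copy.deepcopy(per_rank_params[root]) for _ in per_rank_params]


# Illustrative values only; each entry stands for the parameters on one rank.
params = [
    {"weight": [0.1, 0.2], "bias": [0.0]},    # rank 0
    {"weight": [0.9, -0.4], "bias": [0.3]},   # rank 1
]

synced = broadcast(params, root=0)
for rank, p in enumerate(synced):
    print(f"bcast after rank {rank}: {p}")
```

Starting every process from identical parameters matters because averaged gradients only keep the replicas in step if they began in the same state.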