where :math:`\star` is the valid 1D cross-correlation operator,
:math:`N` is the batch size, :math:`C` denotes the number of channels, and
:math:`H` is the length of the 1D data element.
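As a minimal sketch of what "valid" cross-correlation means for a single channel, the plain-Python helper below slides the kernel over the input without flipping it and keeps only the positions where the kernel fully overlaps the input. The name ``cross_correlate_valid`` is hypothetical and is not part of this module.

```python
def cross_correlate_valid(x, w):
    """Valid 1D cross-correlation of sequence x with kernel w (no flip, no padding)."""
    n_out = len(x) - len(w) + 1
    if n_out <= 0:
        # The kernel is longer than the input: no fully overlapping position.
        return []
    return [sum(x[i + j] * w[j] for j in range(len(w))) for i in range(n_out)]


out = cross_correlate_valid([1, 2, 3, 4, 5], [1, 0, -1])
print(out)  # [-2, -2, -2]
```

The output length is ``len(x) - len(w) + 1``, which is the stride 1, no padding, no dilation case.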
When ``groups == in_channels`` and ``out_channels == K * in_channels``,
where ``K`` is a positive integer, this operation is also known as depthwise
convolution.
In other words, for an input of size :math:`(N, C_{in}, H_{in})`,
a depthwise convolution with a depthwise multiplier ``K`` can be constructed
with the arguments :math:`(in\_channels=C_{in}, out\_channels=C_{in} \times K, \ldots, groups=C_{in})`.
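The depthwise case can be illustrated with a small reference implementation in plain Python. This is only a sketch under stated assumptions: stride 1, no padding, no bias, a single sample, and a weight laid out as ``[groups][out_channels // groups][in_channels // groups][kernel_size]``. Both that layout and the helper name ``grouped_conv1d`` are choices made for this example, not this module's implementation.

```python
def grouped_conv1d(x, weight, groups):
    """Reference grouped 1D cross-correlation (stride 1, no padding, no bias).

    x:      nested lists of shape [C_in][H_in] (one sample)
    weight: nested lists of shape
            [groups][C_out // groups][C_in // groups][kernel_size]  (assumed layout)
    """
    c_in = len(x)
    in_per_group = c_in // groups
    out = []
    for g in range(groups):
        # Input channels that belong to group g.
        x_g = x[g * in_per_group:(g + 1) * in_per_group]
        for filt in weight[g]:  # one filter per output channel of this group
            k = len(filt[0])
            h_out = len(x_g[0]) - k + 1
            out.append([
                sum(x_g[c][i + j] * filt[c][j]
                    for c in range(in_per_group) for j in range(k))
                for i in range(h_out)
            ])
    return out


# Depthwise case: groups == C_in and out_channels == K * C_in, with K = 2.
c_in, K = 3, 2
x = [[1, 2, 3, 4], [0, 1, 0, 1], [2, 2, 2, 2]]         # shape (C_in=3, H_in=4)
weight = [[[[1, 1]], [[1, -1]]] for _ in range(c_in)]  # shape (3, K, 1, 2)
y = grouped_conv1d(x, weight, groups=c_in)
print(len(y), len(y[0]))  # 6 3
```

Each input channel is filtered independently by its own ``K`` kernels, so the output has ``K * C_in`` channels and no information is mixed across input channels.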
:param in_channels: number of input channels.
:param out_channels: number of output channels.
:param kernel_size: size of the weight along the single spatial dimension,
given as an :class:`int`. Default: 1
:param stride: stride of the 1D convolution operation. Default: 1
:param padding: size of the padding added to both sides of the input along
its spatial dimension. Only zero-padding is supported. Default: 0
:param dilation: dilation of the 1D convolution operation. Default: 1
:param groups: number of groups into which the input and output channels are
divided, so as to perform a "grouped convolution". When ``groups`` is not 1,
``in_channels`` and ``out_channels`` must be divisible by ``groups``,
and there is an extra dimension at the beginning of the weight's
shape. Specifically, the shape of weight would be `(groups,