9 Jan 2026
Mateo Lafalce - Blog
In CNNs, the kernel size is a critical hyperparameter that determines how the model interprets images. While 3x3 kernels are the standard, increasing the size to 5x5 or 7x7 drastically changes how the network learns.
The most immediate benefit of a larger kernel is an increased receptive field: each output activation is computed from a bigger section of the image at once, so a single layer captures broader spatial context.
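As a rough illustration (a sketch assuming PyTorch, with an arbitrary single-channel 32x32 input and arbitrary channel counts), the layer below with a 7x7 kernel computes each output value from a much larger patch of the input than the 3x3 version does:

```python
import torch
import torch.nn as nn

# Hypothetical input: one single-channel 32x32 image.
x = torch.randn(1, 1, 32, 32)

# 3x3 kernel: each output value is computed from a 3x3 patch of the input.
conv_small = nn.Conv2d(in_channels=1, out_channels=8, kernel_size=3, padding=1)

# 7x7 kernel: each output value is computed from a 7x7 patch,
# so one layer already "sees" a much larger region of the image.
conv_large = nn.Conv2d(in_channels=1, out_channels=8, kernel_size=7, padding=3)

print(conv_small(x).shape)  # torch.Size([1, 8, 32, 32])
print(conv_large(x).shape)  # torch.Size([1, 8, 32, 32])
```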
The primary cost of increasing kernel size is computational. The number of parameters grows quadratically with the kernel size, not linearly: a k x k kernel has k^2 weights per input channel.
Moving from a 3x3 kernel (9 weights per channel) to a 5x5 kernel (25 weights per channel) nearly triples the parameter count per filter. Consequently, training becomes slower due to higher FLOPs, and the model becomes more prone to overfitting because it has to learn significantly more parameters from the same amount of data.
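A quick back-of-the-envelope check of that quadratic growth (a sketch assuming PyTorch; the 64-channel layer sizes are chosen arbitrarily here):

```python
import torch.nn as nn

def conv_params(kernel_size, in_channels=64, out_channels=64):
    """Count the learnable parameters of a single Conv2d layer."""
    layer = nn.Conv2d(in_channels, out_channels, kernel_size, bias=False)
    return sum(p.numel() for p in layer.parameters())

# Parameters grow with the square of the kernel size.
print(conv_params(3))  # 64 * 64 * 3 * 3 = 36,864
print(conv_params(5))  # 64 * 64 * 5 * 5 = 102,400  (~2.8x the 3x3 layer)
print(conv_params(7))  # 64 * 64 * 7 * 7 = 200,704  (~5.4x the 3x3 layer)
```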
Modern deep learning architectures generally prefer stacking multiple small kernels rather than using one large kernel.
Two stacked 3x3 layers provide the same effective 5x5 receptive field as a single 5x5 layer, but with two major advantages: fewer parameters (18 weights per channel instead of 25) and an extra non-linearity between the layers, which makes the network more expressive.
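A minimal sketch of that comparison, again assuming PyTorch and an arbitrary channel count of 64:

```python
import torch.nn as nn

channels = 64

# Option A: a single 5x5 convolution.
single_5x5 = nn.Conv2d(channels, channels, kernel_size=5, padding=2, bias=False)

# Option B: two stacked 3x3 convolutions with a ReLU in between.
# The effective receptive field is also 5x5, plus an extra non-linearity.
stacked_3x3 = nn.Sequential(
    nn.Conv2d(channels, channels, kernel_size=3, padding=1, bias=False),
    nn.ReLU(inplace=True),
    nn.Conv2d(channels, channels, kernel_size=3, padding=1, bias=False),
)

count = lambda m: sum(p.numel() for p in m.parameters())
print(count(single_5x5))   # 64 * 64 * 25 = 102,400 weights
print(count(stacked_3x3))  # 2 * 64 * 64 * 9 = 73,728 weights (~28% fewer)
```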
This blog is open source. See an error? Go ahead and propose a change.