Vision Transformers vs Convolutional Neural Networks