Introduction to ViTs and Joint Training with Embeddings