15+ papers to understand foundation models

Familiarize yourself with transformers, diffusion-based models, and generative AI in general

Before moving to the list, we recommend reading our explainer of transformer- and diffusion-based foundation model architectures. Also, if you want to understand where large language models (LLMs) come from, check out our historical series. Now, to the resources!

Transformers and large language models

  1. Efficient Transformers: A Survey

    This paper traces the evolution of efficiency-focused Transformer models in natural language processing and other domains. It aims to guide researchers through the plethora of Transformer variants (X-formers) with a single comprehensive, cross-domain overview. → Read here

  2. A Survey of Transformers

    The survey provides a comprehensive review of Transformer variants (X-formers) in AI fields. It introduces the vanilla Transformer and a new taxonomy of X-formers, covering architectural modifications, pre-training, and applications. → Read here

  3. Vision Language Transformers: A Survey

    This paper discusses the adaptation of transformers to vision-language modeling, their performance, transfer-learning approaches, and implications for vision-language tasks. → Read here

  4. Recent Advances in Vision Transformer: A Survey and Outlook of Recent Work

    This paper offers a comprehensive overview of Vision Transformers (ViTs) in computer vision, comparing them to CNNs. A GitHub repository compiles related papers. → Read here

  5. A Survey on Visual Transformer

    This review categorizes and analyzes vision transformer models across various tasks in computer vision. It covers applications such as backbone networks, high/mid-level vision, and video processing. → Read here

  6. Transformers in Vision: A Survey

    The survey provides a comprehensive overview of Transformer models in computer vision. It covers applications like image classification, object detection, multi-modal tasks, and more. → Read here

  7. Transformers in Time Series: A Survey

    This paper systematically reviews Transformer applications in time series modeling. It discusses network structure adaptations, applications in forecasting, anomaly detection, and classification. → Read here

  8. A Survey of Large Language Models

    This survey reviews the evolution and impact of Large Language Models (LLMs) in language understanding and generation. It discusses the scaling effect and the emergent abilities of enlarged models, focusing on pre-training, adaptation tuning, and utilization. → Read here

  9. Efficient Large Language Models: A Survey

    This survey reviews efficient Large Language Models (LLMs), essential for tasks like language understanding, generation, and reasoning. The literature is organized into model-centric, data-centric, and framework-centric perspectives. A GitHub repository compiles and maintains the surveyed papers, serving as a resource for understanding efficient LLM developments. → Read here

  10. Formal Algorithms for Transformers

    This document provides a mathematically precise overview of transformer architectures and algorithms, focusing on their design and training rather than on specific results. It is intended for readers familiar with basic ML concepts and neural network architectures. The paper covers the fundamentals of transformers and their key components, and previews some prominent models; the formula below gives a taste of this formal style. → Read here
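
    To give a flavor of the kind of formal statement the paper works with, here is the standard scaled dot-product attention at the core of every transformer. The formula is the familiar one from the original Transformer paper, reproduced as an illustration rather than quoted from this survey; $Q$, $K$, $V$ denote the query, key, and value matrices, and $d_k$ the key dimension:

    $$\mathrm{Attention}(Q, K, V) = \mathrm{softmax}\!\left(\frac{Q K^{\top}}{\sqrt{d_k}}\right) V$$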

  11. Multimodal Learning with Transformers: A Survey

    This survey presents a comprehensive overview of Transformer techniques in multimodal learning. It covers theoretical aspects, applications, challenges, and future research directions in multimodal Transformer models. → Read here

Diffusion Models

  1. Diffusion Models: A Comprehensive Survey of Methods and Applications

    This overview situates diffusion models within the family of deep generative models, categorizing research into efficient sampling, improved likelihood estimation, and handling data with special structures. It reviews applications across various fields and suggests areas for future exploration. A GitHub repository compiles and maintains the surveyed papers. → Read here

  2. Diffusion Models in Vision: A Survey

    This paper presents a comprehensive survey of denoising diffusion models in computer vision. It covers the two-stage process behind these models: a forward diffusion stage that gradually adds Gaussian noise to the input data, and a reverse diffusion stage that learns to recover the original data (sketched in code below). Despite their computational demands, diffusion models are praised for the quality and diversity of the samples they generate. The paper reviews three diffusion modeling frameworks: denoising diffusion probabilistic models, noise-conditioned score networks, and stochastic differential equations. It compares diffusion models with other deep generative models, categorizes diffusion models used in computer vision, and discusses current limitations and future research directions. → Read here
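
    To make the forward stage concrete, here is a minimal NumPy sketch of DDPM-style noising under a linear variance schedule. The step count, schedule endpoints, and the helper name `forward_diffuse` are illustrative assumptions, not code from the paper:

    ```python
    import numpy as np

    T = 1000                              # number of diffusion steps (assumed)
    betas = np.linspace(1e-4, 0.02, T)    # linear variance schedule (assumed values)
    alpha_bars = np.cumprod(1.0 - betas)  # cumulative signal-retention factors

    def forward_diffuse(x0, t, rng=np.random.default_rng(0)):
        """Sample x_t ~ q(x_t | x_0) in closed form:
        x_t = sqrt(alpha_bar_t) * x_0 + sqrt(1 - alpha_bar_t) * eps,  eps ~ N(0, I).
        """
        eps = rng.standard_normal(x0.shape)
        xt = np.sqrt(alpha_bars[t]) * x0 + np.sqrt(1.0 - alpha_bars[t]) * eps
        return xt, eps  # the reverse stage trains a network to predict eps from xt

    x0 = np.ones((8, 8))                  # toy "image"
    xt, eps = forward_diffuse(x0, t=500)  # sample halfway through the schedule
    ```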

  3. A Survey on Generative Diffusion Model

    This survey explores advancements in diffusion models, a subset of deep generative models known for high-quality data generation. The paper highlights challenges such as the iterative generation process and the models' restriction to high-dimensional Euclidean spaces. It presents advanced techniques for enhancing diffusion models, including faster sampling and new diffusion processes, along with strategies for applying them on manifolds and in discrete spaces. The paper also discusses maximum likelihood training and methods for bridging arbitrary distributions. It reviews applications in various fields and concludes with a summary of the field's limitations and future directions. A GitHub repository compiles related papers. → Read here

  4. Text-to-image Diffusion Models in Generative AI: A Survey

    The paper reviews the application of text-to-image diffusion models in generative AI. Starting with a basic introduction to diffusion models for image synthesis, it progresses to discuss how text conditioning improves these models. The survey includes a review of state-of-the-art text-to-image synthesis methods, applications in creative generation and image editing, and the challenges faced in this domain. It concludes with a discussion on potential future developments in text-to-image diffusion models. → Read here

Catalogs of models

  1. A Comprehensive Survey on Pretrained Foundation Models: A History from BERT to ChatGPT

    This survey reviews Pretrained Foundation Models (PFMs) like BERT and ChatGPT. It covers basics, pretraining methods, and applications across data modalities. The review discusses model efficiency, security, privacy, and future research challenges in PFMs. → Read here

  2. Transformer models: an introduction and catalog

    This paper aims to catalog and classify the most popular Transformer models, known for their wide-ranging applications and often unique names. The catalog includes models trained through self-supervised learning and those further trained with human-in-the-loop approaches, covering both foundational models like BERT or GPT-3 and interactive models like InstructGPT. → Read here

Every day we post helpful lists and bite-sized explanations on our Twitter. Please join us there!

 
