FOD#27: "Now And Then"
We want to pay tribute to the beauty of what technology can (re)create.
While governments compete over who can impose the best AI restrictions, all while hoping to stimulate innovation and win the race,
…and companies vie to outdo GPT-4 and top the benchmarks,
…and AI researchers, psychologists, effective altruism activists, and casual bystanders argue over AGI,
let's pause for 4 minutes and 35 seconds to appreciate what AI has helped achieve after more than 28 years of human effort falling short:
Over 20 million views and an immeasurable volume of tears shed in joyful recognition. It was wonderful.
Additional info: A short film, including Peter Jackson’s remarks on how it was made possible.
News from The Usual Suspects ©
Google DeepMind delivers more positive AI news
Google DeepMind and Isomorphic Labs introduced an update to their groundbreaking AlphaFold model. The latest iteration marks a transformative advance for scientific discovery, particularly drug development: the model can now predict the structures of biomolecules beyond proteins, including nucleic acids and ligands, with near-atomic accuracy. This expansion could significantly cut drug discovery times and inform early trials. AlphaFold exemplifies the growing influence of AI in accelerating scientific research, potentially speeding breakthroughs in domains from therapeutics to environmental sustainability. This is truly remarkable.
AI Summit in the UK – a new pact signed
This pact, known as the Bletchley Declaration, saw signatories from the US, EU, China, and 25 other nations agreeing on a unified strategy for preempting and curtailing AI risks. While the declaration acknowledges the perilous implications of AI and suggests frameworks for risk reduction, it stops short of enacting specific regulations. The declaration, spanning about 1,300 words, emphasizes the urgency of international collaboration to address the challenges posed by cutting-edge AI technologies. All very important words are properly used.
Additional reading: The case for a little AI regulation by Platformer and AI Summit a start but global agreement a distant hope by Reuters
Cohere’s update
Cohere introduced Embed v3, an advanced model for generating document embeddings that boasts top performance on several benchmarks. It excels at matching documents to queries by both topic relevance and content quality, improving search applications and retrieval-augmented generation (RAG) systems. The new version offers models with 1024 or 384 dimensions, supports over 100 languages, and is optimized for cost-effective scalability. The models handle noisy datasets and multi-hop queries well, which is vital for real-world search and RAG systems.
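To make the retrieval use case concrete, here is a minimal sketch of embedding-based search with the Cohere Python SDK. The model name, the input_type values, and the toy documents are assumptions based on the v3 launch materials, not verified production code:

```python
import numpy as np
import cohere

co = cohere.Client("YOUR_API_KEY")  # placeholder key

docs = ["AlphaFold predicts protein structures.",
        "Embed v3 supports over 100 languages."]

# Embed the corpus once; v3 models take an input_type hint (assumed here).
doc_emb = np.array(co.embed(texts=docs,
                            model="embed-english-v3.0",
                            input_type="search_document").embeddings)

# Embed the query with the matching query-side input_type.
query_emb = np.array(co.embed(texts=["which model is multilingual?"],
                              model="embed-english-v3.0",
                              input_type="search_query").embeddings)[0]

# Rank documents by cosine similarity to the query.
scores = doc_emb @ query_emb / (
    np.linalg.norm(doc_emb, axis=1) * np.linalg.norm(query_emb))
print(docs[int(scores.argmax())])
```

In a real RAG system, the top-ranked documents would then be stuffed into the LLM's prompt as context.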
Microsoft develops smaller models
Microsoft introduced Phi 1.5 – a compact AI model with multimodal capabilities, meaning it can process images as well as text. Despite being significantly smaller than OpenAI's GPT-4, with only 1.3 billion parameters, it demonstrates advanced features like those found in larger models. Phi 1.5 is open-source, emphasizing the trend towards efficient AI that’s accessible and less demanding on computational resources. This innovation not only offers a glimpse into economical AI deployment but also contributes to fundamental AI research, exploring how AI models learn and the potential for democratizing AI technology. Microsoft's progress with Phi 1.5 underscores a broader movement in AI research towards creating powerful yet smaller models that could proliferate outside of big tech companies, transforming industries with their efficiency and capability. Meta set a good example of open-sourcing the models!
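For the curious, here is a minimal sketch of what running a model of this size locally looks like with Hugging Face Transformers; the "microsoft/phi-1_5" checkpoint ID and the trust_remote_code flag reflect the release-time setup and should be treated as assumptions to verify:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumed checkpoint ID from the Phi-1.5 release; verify before use.
model_id = "microsoft/phi-1_5"
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)

# A 1.3B-parameter model is small enough for a single consumer GPU or even CPU.
inputs = tokenizer("def fibonacci(n):", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```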
Kai-fu Lee’s 01.AI
01.AI, a Chinese startup, soared to unicorn status with a valuation of over $1 billion, buoyed by a fundraising round that included Alibaba's cloud unit. It recently launched Yi-34B, an open-source AI model that outperforms existing models like Meta's Llama 2 on key benchmarks. The model caters to both English- and Chinese-speaking developers and marks a significant milestone in China's AI landscape, amid growing competition and political tension with the US over AI.
Speaking of open source…
Elon Musk and his ‘snarky’ Grok
Elon Musk’s xAI announced Grok, an AI model with a few interesting features:
it's not open-source; it's not yet available (it will be offered only to Twitter's Premium+ subscribers at $16/month); it's not state-of-the-art (it can't compete with GPT-4); and it doesn't address the AI risks Musk was talking about earlier. But it is “snarky” and is built on Twitter data.
What does it remind me of? Microsoft's chatbot Tay and this lovely article title from 2016: “Twitter taught Microsoft's AI chatbot to be a racist asshole in less than a day.”
Additional reading: The problem with Elon Musk’s ‘first principles thinking’ by Untangled
The question is: how many models do we really need?
Other news, categorized for your convenience
Video and Vision Understanding
MM-VID - Enhances video understanding by using GPT-4V to transcribe videos into detailed scripts, facilitating comprehension of long-form content and character identification →the paper
Battle of the Backbones - Evaluates different pre-trained models in computer vision tasks to guide selection for practitioners, comparing CNNs, ViTs, and others →the paper
LLaVA-Interactive - Demonstrates a multimodal interaction system capable of dialogues and image-related tasks, leveraging pre-built AI models →the paper
Large Language Models (LLMs) Enhancements and Evaluation
ChipNeMo - Applies domain-adapted LLMs to chip design, utilizing techniques like custom tokenizers and fine-tuning for specific tasks such as script generation and bug analysis →the paper
Evaluating LLMs - A survey categorizing LLM evaluations into knowledge, alignment, safety, and more, serving as a comprehensive overview of LLM performance →the paper
The Alignment Ceiling - Addresses the objective mismatch in reinforcement learning from human feedback (RLHF), proposing solutions for aligning LLMs with user expectations for safety and performance →the paper
Learning From Mistakes (LeMa) - Improves reasoning in LLMs by fine-tuning them on mistake-correction data pairs, enhancing their math problem-solving abilities →the paper
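To illustrate the LeMa idea, here is a toy sketch of how a mistake-correction training pair might be assembled; the prompt template and field names are our illustrative assumptions, not the paper's exact format:

```python
def build_lema_example(question: str, wrong_solution: str,
                       error_explanation: str, corrected_solution: str) -> dict:
    """Format one (mistake, correction) pair for fine-tuning.

    Per the LeMa recipe, the model learns to spot the error in a wrong
    solution and produce a corrected one (template is illustrative).
    """
    prompt = (
        f"Question: {question}\n"
        f"Incorrect solution: {wrong_solution}\n"
        "Identify the error and give a corrected solution."
    )
    target = (f"Error: {error_explanation}\n"
              f"Corrected solution: {corrected_solution}")
    return {"prompt": prompt, "completion": target}

example = build_lema_example(
    question="What is 15% of 80?",
    wrong_solution="15% of 80 = 15 / 80 = 0.1875",
    error_explanation="15% means 0.15, so multiply 0.15 by 80 rather than dividing.",
    corrected_solution="0.15 * 80 = 12",
)
print(example["prompt"])
print(example["completion"])
```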
Model Efficiency and Distillation
Distil-Whisper - Focuses on distilling large speech recognition models into smaller, faster, and more robust versions without significant performance loss →the paper
FlashDecoding++ - Presents a method for faster inference of LLMs on GPUs, aiming to maintain performance while increasing efficiency →the paper
Innovative Prompting and Cross-Modal Interfaces
Zero-shot Adaptive Prompting - Improves zero-shot performance of LLMs by introducing self-adaptive prompting methods that generate pseudo-demonstrations for better task handling →the paper
De-Diffusion - Transforms images into text representations, enabling off-the-shelf LLMs and text-to-image tools to handle cross-modal image tasks →the paper
Coding
Phind has introduced its 7th-generation model, surpassing GPT-4 in coding proficiency while offering a 5x speed increase. The model, fine-tuned on over 70 billion tokens, achieved a 74.7% HumanEval pass rate. Despite HumanEval's limited real-world applicability, user feedback suggests Phind's model is as helpful as or more helpful than GPT-4 for practical queries. It also boasts a 16k-token context window and uses NVIDIA's TensorRT-LLM for rapid processing, although some consistency issues remain to be ironed out →read more
DeepSeek Coder is a suite of open-source code language models, varying in size from 1B to 33B parameters, trained on a mix of code and natural language in both English and Chinese. The training utilized a large 2T token dataset and specialized tasks to create base and instruction-tuned versions. These models, with their project-level code understanding, offer state-of-the-art code completion and are free for both research and commercial use →read more
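As a quick illustration of how such open checkpoints can be used, here is a minimal completion sketch with Hugging Face Transformers; the model ID below is an assumption based on the release naming, so verify it before running:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumed Hugging Face ID for the smallest base model; verify before use.
model_id = "deepseek-ai/deepseek-coder-1.3b-base"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Ask the model to continue a code comment into an implementation.
prompt = "# Write a Python function that checks whether a number is prime\n"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=80)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```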
3D Content Generation Innovations – what a rich 3D week!
Stable 3D by Stability AI streamlines 3D creation, allowing rapid generation of textured 3D models from images or text, simplifying the design process for professionals and amateurs alike, with output ready for further refinement in standard tools →read more
Genie by Luma AI is a Discord-integrated text-to-3D model generator, now available as a research preview →read more
DreamCraft3D unveils a hierarchical approach to creating detailed and consistent 3D objects, using a view-dependent diffusion model and Bootstrapped Score Distillation for geometric precision and texture refinement →read more
Rodin Gen-1 by Deemos offers a generative model capable of producing complex 3D shapes and photorealistic PBR textures from textual inputs. It represents a significant leap in text-to-3D synthesis, particularly in creating textures that mimic real-world lighting and reflection properties →read more
In other newsletters
Boosting LLMs with External Knowledge: The Case for Knowledge Graphs by Gradient Flow
Open LLM company playbook by Interconnects
Unbundling AI by Benedict Evans
Thank you for reading! Please feel free to share with your friends and colleagues. In the next couple of weeks, we will be announcing our referral program 🤍
Another week with fascinating innovations! We call this overview “Froth on the Daydream” - or simply, FOD. It's a reference to the surrealistic and experimental novel by Boris Vian – after all, AI is experimental and feels quite surrealistic, and a lot of writing on this topic is just froth on the daydream.
How was today's FOD? Please give us some constructive feedback.