Turing Post
FOD#72: A Meta Reflection: How AI Connects Us
"Meta" news, free courses on RAG, and a curated list of research papers to supercharge your ML projects
This Week in Turing Post:
Wednesday, AI 101: What is Whiteboard-of-Thought?
Friday, Guest post: The Elusive Definition of AGI
The main topic
I live in a town with fewer than 1,500 residents – rural Connecticut, beautiful foliage, bears in our backyard. Imagine my surprise when, at an event celebrating volunteers of our local fire department, I – in my slippers with a baby hanging off me! – found myself in the company of two of my readers. Both women, both with young children, both top-notch professionals – one, the head of data science at a leading newspaper, the other, an AI creator working on producing AI-empowered art videos. I mean, how much more meta could that be? (And if you’re wondering, 'meta' comes from the Greek word meaning 'beyond' or 'after.' It’s when life feels like it’s stepping outside itself, almost winking back at you – like me, an AI educator, running into fellow AI professionals at a local event in this tiny, unexpected corner of the world. It’s life saying, ‘You’re living the script you write.’) And I just want to highlight how humbled I feel and how proud I was to talk AI over hot dogs with these knowledgeable ladies.
So today, I want to offer you two more "meta" things from last week. Completely unrelated to each other, but both fascinating.
Meta is embracing the very idea of “meta” itself – building an ecosystem that is not only advancing AI but also reflecting on how AI research can be collaborative, open, and self-reinforcing. Meta’s latest announcements centered around advanced machine intelligence (AMI) while embracing open science and reproducibility.
The highlights include the launch of Meta Segment Anything 2.1 (SAM 2.1), an updated version of their image and video segmentation model. This version comes with a new developer suite featuring training code and a web demo, reflecting Meta’s emphasis on community collaboration and accessibility.
Meta also introduced Spirit LM, their first open-source language model that integrates text and speech seamlessly, enhancing expressiveness across modalities. Additionally, they unveiled the Layer Skip framework, designed to boost large language model (LLM) efficiency without specialized hardware, enabling faster and more resource-efficient deployments.
On the cryptography front, Meta released SALSA, a tool for validating post-quantum cryptographic standards, showcasing their focus on securing future technologies. They also launched Meta Lingua, a lightweight codebase for efficient language model training, and Meta Open Materials 2024, an open-source dataset aimed at accelerating inorganic materials discovery.
Lastly, Meta released MEXMA, a cross-lingual sentence encoder, and the Self-Taught Evaluator (the original paper was published in August 2024), which generates synthetic preference data for reward-model training – demonstrating Meta's commitment to advancing research capabilities and AI evaluation methods.
The paper Neural Metamorphosis (NeuMeta) introduces a revolutionary approach to neural networks, proposing self-morphable architectures that dynamically adapt their structure without retraining. This meta-thinking goes beyond traditional static models, exploring continuous weight manifolds that allow networks to flexibly resize and adjust based on hardware or task demands, essentially reconfiguring their own identity in response to external conditions.
This shift represents a significant evolution, as NeuMeta treats neural networks as entities capable of self-reflection and change – perfectly aligning with the meta concept of re-evaluating and adapting systems. By leveraging Implicit Neural Representations (INR) as hypernetworks, it enables these dynamic transformations while ensuring smooth performance across configurations. With results maintaining performance at a 75% compression rate, it outperforms existing pruning techniques and sets the stage for a new era of flexible, scalable AI.
NeuMeta’s approach can be a powerful enabler in developing agents capable of fluidly adjusting their capabilities and resources, a key feature of advanced agentic workflows. Integrating such flexible neural architectures could enhance the efficiency and adaptability of AI agents in real-world, dynamic contexts.
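To make the "continuous weight manifold" idea concrete, here is a minimal sketch (illustrative only, not the paper's code): an implicit neural representation acts as a hypernetwork that maps a weight's normalized coordinate – its row, column, and the target layer width – to a weight value, so a single INR can materialize layers of many different sizes without retraining. All names here are my own for illustration.

```python
import torch
import torch.nn as nn

class WeightINR(nn.Module):
    """Toy INR hypernetwork: coordinate on the weight manifold -> weight value."""
    def __init__(self, hidden=64):
        super().__init__()
        # Input: normalized (row, col, width) coordinate of a weight entry.
        self.net = nn.Sequential(
            nn.Linear(3, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def materialize(self, rows, cols):
        # Build a weight matrix of any requested shape by querying the INR
        # at every normalized coordinate; the width signal lets the INR
        # adapt its output to the target layer size.
        r = torch.linspace(0, 1, rows)
        c = torch.linspace(0, 1, cols)
        rr, cc = torch.meshgrid(r, c, indexing="ij")
        width = torch.full_like(rr, cols / 1024.0)  # crude size encoding
        coords = torch.stack([rr, cc, width], dim=-1).reshape(-1, 3)
        return self.net(coords).reshape(rows, cols)

inr = WeightINR()
w_small = inr.materialize(16, 32)   # weights for a narrow layer
w_large = inr.materialize(64, 128)  # weights for a wider layer, same INR
```

In the actual paper the INR is trained so that every sampled configuration performs well; the sketch only shows the mechanism that makes resizing-without-retraining possible.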
Circling back to the beginning: while we explore AI and its complexities here, it’s always with an eye on how it serves us, the people behind the technology. AI might seem abstract, but moments like these remind me it's about building tools that enrich our lives, connect our communities, and even make unexpected connections over hot dogs in a small-town celebration.
❓Question of the day: Is generative AI actually a good investment for a business in 2024?
Join 15k+ business leaders and AI experts from leading companies like Moderna and S&P Global on Nov. 14 at Section’s AI:ROI Conference — a free, virtual event for leaders looking to achieve tangible results with AI.
What you’ll learn at the event:
Strategies to prioritize AI initiatives that deliver real returns
Lessons from real AI success stories and case studies
How to achieve ROI from productivity gains to secure investor support
New research on the state of AI proficiency in today’s workforce, and how to apply it to drive more ROI from your own team
Twitter Library
Weekly recommendation from an AI practitioner 👍🏼
Unsloth AI – open-source toolkit to fine-tune LLMs like Llama and Mistral faster, using less memory. A practical, efficient boost for developers optimizing model performance.
News from The Usual Suspects ©
Turing Post is powered by readers like you. Show your support and enjoy exclusive content by becoming a Premium subscriber →
Mistral AI on the Edge
Mistral AI celebrates a year of Mistral 7B with its new les Ministraux models: Ministral 3B and 8B. These models target edge applications like smart assistants and robotics with low-latency, privacy-focused AI for local use cases. Efficiency meets elegance with up to 128k context lengths. Read about Mistral AI’s Bold Journey here.
NVIDIA’s Nemotron: A Helpfulness Upgrade
NVIDIA quietly launched Nemotron. Optimized for NVIDIA hardware, the 70B model fine-tunes output quality, setting a new standard in efficient and effective language models for practical AI assistance. But there are some other opinions on that. The problem, IMO: with so many benchmarks available, anyone can pick whichever one shows their model in the best light.
The Nvidia Nemotron Fine-Tune Isn’t A Very Good 70b Model!
While it improves on the base 70b Llama model on reasoning, it underperforms across several categories
It’s worse than 405b and isn’t as good as the other SOTA models
Detailed numbers coming soon on Livebench AI
— Bindu Reddy (@bindureddy)
8:04 AM • Oct 17, 2024
Meta: Horror movies and coding magic
Meta teams up with horror giant Blumhouse to launch Movie Gen, AI models generating HD video and sound. Directors Aneesh Chaganty and Casey Affleck are experimenting with the tech, showing a glimpse of AI-driven filmmaking. A wider release is set for 2025; until then, it's all about perfecting the scares.
CodeGPT, powered by Meta’s latest Llama update, claims to boost coding productivity by 30%. Offering code suggestions, debugging help, and onboarding automation, it's a new developer's best friend. With Llama 3.2, Meta positions itself as the bridge between code and creativity.
Moonshot AI: Kimi Levels Up
Moonshot AI’s Kimi Chat Explore now rivals OpenAI, with expanded search and problem-solving skills. Backed by Tencent and Alibaba, the Chinese startup aims to automate complex tasks like investment analysis. It’s a strategic move in the AI arms race, and they’re not holding back. Read how Moonshot is Revolutionizing Long-Context AI here.
Claude: AI’s Sabotage Scenario
Anthropic explores AI sabotage risks, evaluating threats like code tampering with models like Claude 3 Opus. While current sabotage capabilities are limited, the research underscores the need for proactive defenses to keep AI on the straight and narrow. Developers are called to refine and innovate safeguards.
Lightmatter’s Photonic Boom
Lightmatter gets $400 million in Series D, boosting its valuation to $4.4B. The photonics leader aims to expand its Passage engine, optimizing AI data centers with ultra-low latency and lightning speed. With heavyweights like Google Ventures backing, Lightmatter's redefining the future of AI infrastructure.
Recently, the unicorn family has seen a few valuable additions. Vote on who we should cover next →
Microsoft’s BitNet Breakthrough
Microsoft’s bitnet.cpp is a game-changer for 1-bit LLMs, offering over 6x speed improvements and 80% energy savings on x86 CPUs. Capable of running 100B models on a single CPU, it's designed for scalability, making local AI as efficient as it gets – keeping performance and energy efficiency hand in hand.
Read more
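Why do 1-bit LLMs save so much energy? A toy sketch of the idea (in the spirit of BitNet b1.58, not the bitnet.cpp API – function names here are my own): weights are rounded to {-1, 0, +1} with a single scale factor, so a matmul reduces to additions and subtractions instead of multiplications.

```python
import numpy as np

def quantize_ternary(w):
    # Per-tensor absmean scale, then round weights to -1/0/+1.
    scale = np.abs(w).mean() + 1e-8
    q = np.clip(np.round(w / scale), -1, 1).astype(np.int8)
    return q, scale

def ternary_matmul(x, q, scale):
    # With ternary weights a dot product needs no multiplies:
    # add the columns where q=+1, subtract where q=-1.
    # (Shown here as a float matmul for clarity.)
    return (x @ q.astype(np.float32)) * scale

rng = np.random.default_rng(0)
w = rng.normal(size=(8, 4)).astype(np.float32)  # full-precision weights
x = rng.normal(size=(2, 8)).astype(np.float32)  # activations
q, s = quantize_ternary(w)
y = ternary_matmul(x, q, s)  # approximates x @ w with ternary weights
```

The real kernels pack those ternary values into bit arrays and use lookup tables, which is where the CPU speedups come from; the sketch only shows the quantization scheme.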
Google’s NotebookLM Gets Smarter
NotebookLM adds customizable audio summaries, blending advanced visuals and insights, available in 200 countries. It leverages Gemini AI, now piloting team collaboration for businesses and universities. Google’s upgraded document analysis tool inches closer to knowledge synthesis dominance.
Read more
Google Shuffles the Search Deck
With Nick Fox now leading search, Google repositions Prabhakar Raghavan as chief technologist. Gemini AI shifts under DeepMind, reinforcing product-research synergy. The reshuffle comes amid rising antitrust heat and revenue concerns – will a new team reverse Google’s fortunes?
Boston Dynamics and Toyota Join Forces
Boston Dynamics and Toyota Research Institute team up to refine humanoid robotics. Combining Toyota’s AI and Large Behavior Models with the Atlas platform, the goal is to revolutionize automation and human-robot interaction. The future of dexterous, multi-tasking robots looks promising.
We are watching/reading:
A super interesting piece on geoengineering from Andrew Ng
What would you do with an abundance of computing power? by Exponential View
The freshest research papers, categorized for your convenience
Our top
Efficient Diffusion Models: A Comprehensive Survey from Principles to Practices →read the paper
A Survey on Deep Tabular Learning →read the paper
TapeAgents: A Holistic Framework for Agent Development and Optimization
Researchers from ServiceNow present TapeAgents, a framework using detailed logs ("tapes") for LLM agent sessions, enabling session resumability and optimization. It integrates features from other frameworks like AutoGen (multi-agent support) and LangGraph (fine-grained control) but uniquely combines them. TapeAgents supports debugging, fine-tuning, and prompt-tuning, demonstrated through various agent setups and optimizing a Llama-3.1-8B model to match GPT-4's performance cost-effectively →read the paper
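The "tape" abstraction is simple enough to sketch in a few lines (illustrative only, not the TapeAgents API – the class and field names here are my own): every agent step is appended to a structured, serializable log, which is what makes sessions resumable and mineable for fine-tuning data.

```python
import json
from dataclasses import dataclass, asdict, field

@dataclass
class Step:
    kind: str      # e.g. "user", "thought", "action", "observation"
    content: str

@dataclass
class Tape:
    """Append-only log of agent steps; the session state IS the tape."""
    steps: list = field(default_factory=list)

    def append(self, kind, content):
        self.steps.append(Step(kind, content))

    def to_json(self):
        return json.dumps([asdict(s) for s in self.steps])

    @classmethod
    def from_json(cls, data):
        return cls([Step(**s) for s in json.loads(data)])

tape = Tape()
tape.append("user", "Summarize yesterday's incidents")
tape.append("thought", "I should query the incident log first")
saved = tape.to_json()           # persist mid-session
resumed = Tape.from_json(saved)  # resume exactly where the agent stopped
resumed.append("action", "search(incidents)")
```

Because the whole trajectory is data, the same tape can be replayed for debugging, scored for prompt-tuning, or converted into training examples – the three uses the paper demonstrates.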
Agent-as-a-Judge: Evaluate Agents with Agents
Researchers from Meta AI and KAUST propose the "Agent-as-a-Judge" framework for evaluating agentic systems using other agentic systems. They introduce DevAI, a benchmark with 55 realistic AI development tasks. Agent-as-a-Judge delivers intermediate feedback, outperforming LLM-based evaluations and aligning closely (90%) with human judges, while reducing costs and time by over 97%. This method shows potential for scalable, dynamic self-improvement in AI systems →read the paper
Multimodal Systems & Visual Understanding
Janus: Decoupling Visual Encoding for Unified Multimodal Understanding and Generation explores enhancing multimodal models by decoupling visual encoders for different tasks, improving flexibility and performance.
OMCAT: Omni Context Aware Transformer enhances cross-modal temporal understanding using a new dataset, improving multimodal model performance.
γ-MoD: Exploring Mixture-of-Depth Adaptation for Multimodal Large Language Models focuses on optimizing multimodal LLMs by adapting layers dynamically, reducing computation time significantly.
Self-Improvement and Learning
Web Agents with World Models develops a web agent that uses a world-model approach to improve long-horizon web tasks and decision-making efficiency.
Retrospective Learning from Interactions proposes a method for LLMs to learn from user interactions without external annotation, enhancing performance over time.
Looking Inward: Language Models Can Learn About Themselves by Introspection explores the ability of LLMs to introspect and predict their behavior, suggesting models can have privileged internal access.
Insights from the Inverse: Reconstructing LLM Training Goals Through Inverse RL applies IRL to uncover reward models guiding LLM behavior, improving alignment and interpretability.
Language Model Optimization & Alignment
A Common Pitfall of Margin-based Language Model Alignment: Gradient Entanglement identifies gradient entanglement issues in RLHF and proposes strategies for improved safety and alignment.
Toward General Instruction-Following Alignment for Retrieval-Augmented Generation proposes a pipeline to enhance instruction-following capabilities in retrieval-augmented generation models.
Controllable Safety Alignment: Inference-Time Adaptation to Diverse Safety Requirements introduces a framework for adapting LLM safety requirements without retraining, enhancing flexibility.
PopAlign: Diversifying Contrasting Patterns for a More Comprehensive Alignment proposes a new framework for aligning LLMs with diverse strategies, enhancing safety and model alignment comprehensively.
Transformer Optimization
DuoAttention: Efficient Long-Context LLM Inference with Retrieval and Streaming Heads optimizes attention mechanisms in LLMs, improving efficiency and reducing memory usage.
What Matters in Transformers? Not All Attention is Needed explores redundancy in transformer models, showing the efficiency benefits of pruning attention layers.
Thinking LLMs: General Instruction Following with Thought Generation proposes a training method for generating internal thoughts, improving LLM instruction-following capabilities.
Model Adaptation & Embedding Strategies
Your Mixture-of-Expert LLM is Secretly an Embedding for Free demonstrates how MoE models can function as effective embedding models without additional training.
Model Swarms: Collaborative Search to Adapt LLM Experts via Swarm Intelligence proposes using swarm intelligence for collaborative model adaptation, optimizing LLMs across tasks.
Safety & Calibration in Reinforcement Learning
Taming Overconfidence in LLMs: Reward Calibration in RLHF introduces new methods for calibrating LLM confidence during RLHF, reducing errors while maintaining accuracy.
Leave a review!
Please send this newsletter to your colleagues if it can help them enhance their understanding of AI and stay ahead of the curve. You will get a 1-month subscription!
Reply