FOD#36: Navigating the Global AI Landscape

Recent AI developments across the globe, main news from AI leaders, and a curated list of research papers

Ksenia Se
January 08, 2024

Next Week in Turing Post:

Wednesday, Token 1.16: Understanding RLHF: we will discuss its evolution, applications, human role, synergies, challenges, and future prospects.
Friday, Foreign AI Affairs: In-depth piece on South Korea's AI sector.

As we prepare our in-depth piece on South Korea's AI sector for the 'Foreign AI Affairs' rubric, let's briefly tour recent AI developments across the globe. Last November was especially rich in AI-related events.

The UK: All about safety
In November 2023, the UK made a significant leap in AI safety with Prime Minister Rishi Sunak announcing the establishment of the UK AI Safety Institute at the AI Safety Summit. The initiative aims to become a global hub for AI safety, extending beyond testing to addressing risks ranging from social impacts to loss of control over AI technologies. The same month, Microsoft announced a £2.5 billion investment in the UK's AI sector, focusing on infrastructure, training, and, again, safety.

France: The rising open-source AI hub
Also in November, we spotlighted France's emergence as an open-source AI powerhouse, challenging Silicon Valley. This trend was recently exemplified by Mistral AI's skyrocketing valuation, underscoring France's blend of corporate and academic strengths in driving AI innovation.

European Union: Towards a harmonized AI regulation
Germany, France, and Italy have reached a consensus on AI regulation, aiming to balance AI's potential against its risks and emphasizing self-regulation over strict norms.

Russia: Against the West
President Putin's November strategy for AI development aims to counter Western dominance and emphasizes the use of Russian AI solutions. It is worth noting that nothing has been heard about next steps since.

Italy: Societal effects
This January, on the threshold of Italy's G7 presidency, Prime Minister Giorgia Meloni is emphasizing AI's impact on labor markets, making AI's societal and economic implications a focal point of global discussions.

China: Facing challenges
As ChinAI reports, the country's computing centers suffer from high idle rates due to overinvestment and low demand, highlighting an "implementation gap" in large model applications. Additionally, Kevin Xu's article discusses China's cautious approach to generative AI: there is a fear that GenAI might aggravate existing social issues. Such ambivalence could lead to missed economic opportunities and a brain drain, affecting China's long-term growth in the tech sector. The AI picture in China is further complicated by military applications. A ChinaTalk article sheds light on widespread corruption within the People's Liberation Army (PLA) and its potential impact on military actions, especially concerning Taiwan. Compromised military equipment and operational inefficiencies have led U.S. intelligence to assess a lower likelihood of major military actions by President Xi Jinping in the near term.

India: Embracing Generative AI
The Indian IT industry is pivoting towards GenAI, with significant expectations for its contribution to the industry's revenue in 2024. This strategic shift, led by key players like Infosys and TCS, signals more impactful AI projects ahead. India has a deep pool of software development talent eager to retrain for GenAI.

News from The Usual Suspects ©

Microsoft is adding a new button to keyboards to call up Copilot, its first change to the keyboard layout in nearly three decades. We would recommend adding two more buttons: red pill and blue pill.
OpenAI and NYT lawsuit. The New York Times is suing OpenAI over the alleged unauthorized use of its content. There are many arguments for and against the case.
Important to read: 1) the original complaint 2) OpenAI's response to the lawsuit
Might be interesting to read: a detailed commentary on the issue and a great, detailed post from Stratechery
Stanford researchers published “Mobile ALOHA,” a study of bimanual mobile manipulation that uses behavior cloning from human demonstrations to improve robot performance on mobile manipulation tasks.
Google DeepMind introduced AutoRT, SARA-RT, and RT-Trajectory, which aim to improve data collection, speed, and generalization in advanced robotics, envisioning robots capable of performing complex tasks like housekeeping or cooking.
Interestingly, robotics companies are increasingly adopting imitation learning, similar to how language models like ChatGPT learn from human language patterns.

Twitter Library

10 open-source tools for LLM application development

Development Frameworks and Platforms. Data Management and Integration. User Interface and Interaction.

www.turingpost.com/p/llm-applications-tools

Research papers, categorized for your convenience

Large Language Model Scaling and Optimization

Beyond Chinchilla-Optimal: Accounting for Inference in Language Model Scaling Laws: Researchers from MosaicML modified the Chinchilla scaling laws to include inference costs in determining the optimal size and training data for LLMs. Their analysis suggests adjustments to LLM size and training duration for handling high inference demand →read the paper
DeepSeek LLM: Scaling Open-Source Language Models with Longtermism: The paper focuses on scaling open-source LLMs for long-term growth. It explores scaling laws for LLMs, particularly in configurations of 7B and 67B parameters, developing a dataset of 2 trillion tokens →read the paper
Improving Text Embeddings with LLMs: Researchers from Microsoft introduce a novel method for generating high-quality text embeddings using synthetic data and fewer than 1k training steps, diverging from complex multi-stage pre-training methods →read the paper
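
To make the idea of "accounting for inference" in scaling laws concrete, here is a minimal back-of-the-envelope sketch. It is not the method from the MosaicML paper. It assumes the common rules of thumb of roughly 6 FLOPs per parameter per training token and roughly 2 FLOPs per parameter per inference token. The function names and the inference-demand figures are hypothetical, and the 7B/67B sizes and the 2-trillion-token dataset from the DeepSeek item serve only as illustrative inputs.

```python
"""Rough lifetime compute for an LLM: training plus inference (illustrative only)."""


def training_flops(n_params: float, n_train_tokens: float) -> float:
    # Assumption: ~6 FLOPs per parameter per training token (forward + backward pass).
    return 6.0 * n_params * n_train_tokens


def inference_flops(n_params: float, n_inference_tokens: float) -> float:
    # Assumption: ~2 FLOPs per parameter per processed token (forward pass only).
    return 2.0 * n_params * n_inference_tokens


def lifetime_flops(n_params: float, n_train_tokens: float,
                   n_inference_tokens: float) -> float:
    # Total compute once the model has been trained and has served its traffic.
    return (training_flops(n_params, n_train_tokens)
            + inference_flops(n_params, n_inference_tokens))


if __name__ == "__main__":
    train_tokens = 2e12  # illustrative: a 2-trillion-token training set
    for n_params in (7e9, 67e9):  # illustrative model sizes
        for demand in (0.0, 1e12, 1e13):  # hypothetical lifetime inference tokens
            total = lifetime_flops(n_params, train_tokens, demand)
            share = inference_flops(n_params, demand) / total
            print(f"{n_params:.0e} params, {demand:.0e} inference tokens: "
                  f"{total:.2e} FLOPs total, {share:.0%} spent on inference")
```

Under these assumptions, inference compute matches training compute once a model has served about three times as many tokens as it was trained on. That is why heavy expected inference demand pushes the compute-optimal choice toward smaller models trained for longer.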

Large Language Model Fine-Tuning and Applications

ASTRAIOS: Parameter-Efficient Instruction Tuning Code LLMs: The paper investigates parameter-efficient fine-tuning (PEFT) methods for LLMs in code comprehension and generation tasks. It introduces ASTRAIOS, a suite of 28 instruction-tuned models built with 7 PEFT methods across 4 model sizes →read the paper
COSMO: COntrastive Streamlined Multimodal Model with Interleaved Pre-Training: Researchers from the National University of Singapore and Microsoft Azure AI developed CosMo, a framework that combines unimodal and multimodal components in language models, enhancing performance in tasks involving text and visual data →read the paper
GeoGalactica: A 30 Billion Parameter LLM for Geoscience Applications: The paper presents GeoGalactica, a 30-billion-parameter LLM built specifically for geoscience applications →read the paper
DOCLLM: A Layout-Aware Generative Language Model for Multimodal Document Understanding: Researchers from JPMorgan AI Research developed DocLLM, a novel extension to LLMs, focusing on visual documents like forms and invoices →read the paper

Multilingual Language Models and Transfer Learning

LLaMA Beyond English: An Empirical Study on Language Capability Transfer: Researchers from Fudan University examined how the language generation and instruction-following capabilities of LLMs like LLaMA transfer to non-English languages →read the paper
Self-Play Fine-Tuning Converts Weak Language Models to Strong Language Models: The paper introduces a novel fine-tuning method called Self-Play Fine-Tuning (SPIN) for LLMs →read the paper

Lightweight and Efficient Language Models

AMUSED: AN OPEN MUSE REPRODUCTION: Researchers from Hugging Face and Stability AI present aMUSEd, an open-source, lightweight model for text-to-image generation based on MUSE →read the paper
TinyLlama: An Open-Source Small Language Model: The StatNLP Research Group presents a compact 1.1-billion-parameter language model trained on around 1 trillion tokens for approximately 3 epochs →read the paper

Expanding Large Language Models

LLAMA PRO: Progressive LLaMA with Block Expansion: Researchers propose LLAMA PRO, a method for expanding pre-trained LLMs with additional Transformer blocks →read the paper
LLM Augmented LLMs: Expanding Capabilities Through Composition: Researchers introduce CALM (Composition to Augment Language Models), a method for enhancing LLMs by composing them with specialized models →read the paper
Exploring Knowledge Editing for LLMs: This comprehensive study examines knowledge editing for LLMs, covering performance, usability, mechanisms, and broad impacts →read the paper

In other newsletters

A Risk Expert's Analysis of What We Get Wrong about AI Risks by Artificial Intelligence Made Simple

If you decide to become a Premium subscriber, remember that in most cases you can expense this subscription through your company! Join our community of forward-thinking professionals. Please also send this newsletter to your colleagues if it can help them enhance their understanding of AI and stay ahead of the curve. 🤍 Thank you for reading
