AI horizons: new frontiers and thoughtful considerations – March 2024


Here’s our monthly rendezvous with the observatory on the latest in the world of artificial intelligence. This month the landscape is particularly dense, with overlapping stories that push the boundaries of the field. But let’s start with the top stories of the month.

Top Stories of the Month

There are two pieces of news that I consider most important this month. The first is Anthropic’s release of Claude 3. Claude 3 is the first closed-source Large Language Model (LLM) able to compete with OpenAI’s GPT-4: according to all benchmarks, albeit by a small margin, Claude 3 surpasses GPT-4. Additionally, in the user-led ranking of Large Language Models on Chatbot Arena, Claude 3 Opus, the most advanced model released by Anthropic, has jumped into the lead, ahead of GPT-4. Incidentally, Chatbot Arena is run by the Large Model Systems Organization (LMSYS Org), a research organization born as a collaboration between students and faculty from the University of California, Berkeley, UC San Diego, and Carnegie Mellon University. This news certainly shows that other Large Language Models are now reaching the level of GPT-4; equally, given that GPT-4 was released a year ago, it shows that OpenAI has enjoyed a nearly 12-month head start.
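Chatbot Arena’s leaderboard is built from pairwise, blind human votes between two anonymous models. As a rough illustration of how such a user-led ranking can be computed, here is a minimal Elo-style update in Python; LMSYS has used Elo-style and more refined Bradley-Terry rating schemes, and the constants below are illustrative assumptions rather than the leaderboard’s actual parameters.

```python
# Minimal Elo-style rating built from pairwise "A beats B" votes, as an
# illustration of how a user-led leaderboard can be computed.
# The K-factor and starting rating are illustrative assumptions.

from collections import defaultdict

def update_elo(ratings, winner, loser, k=32):
    """Update the two models' ratings after a single human vote."""
    expected_win = 1 / (1 + 10 ** ((ratings[loser] - ratings[winner]) / 400))
    ratings[winner] += k * (1 - expected_win)
    ratings[loser] -= k * (1 - expected_win)

ratings = defaultdict(lambda: 1000.0)          # every model starts at the same rating
votes = [("claude-3-opus", "gpt-4"),           # hypothetical (winner, loser) pairs
         ("gpt-4", "claude-3-opus"),
         ("claude-3-opus", "gpt-4")]

for winner, loser in votes:
    update_elo(ratings, winner, loser)

print(dict(ratings))
```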

The second key news of the month is undoubtedly the approval of the definitive text of the AI Act by the European Parliament. The Parliament has voted on the text and it has passed, so only the vote of the Council of the EU, which appears to be a formality, is now missing for final approval. The text is quite detailed; we have already discussed it, and I have written about it in the past. It is a nearly 500-page document.

Microsoft

Microsoft maintains its dominant position among the major players in artificial intelligence, in both the business and consumer segments, and this month is full of news. First, on the research side, there are significant investments in Small Language Models, with the release of Orca Math, for example, a Small Language Model aimed at solving mathematical and algebraic problems. Microsoft continues this trend following the release of Phi-2 in recent months. Small Language Models are also of great interest for their ability to run on limited hardware, to the point of imagining their execution on our mobile phones. The turning point lies in the quality of the data used to train these models, as argued in “Textbooks Are All You Need.”
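To make the “limited hardware” point concrete, here is a minimal sketch of local inference with a Small Language Model. It assumes the Hugging Face transformers library and the publicly released microsoft/phi-2 checkpoint; this is not Microsoft’s own tooling, and Orca Math would load the same way only if and where its weights are published.

```python
# Minimal local inference with a Small Language Model (a sketch, not Microsoft's
# own tooling). Assumes the Hugging Face "transformers" package and the publicly
# released microsoft/phi-2 checkpoint (~2.7B parameters).

from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "microsoft/phi-2"
tokenizer = AutoTokenizer.from_pretrained(model_id)
# Runs on CPU by default; add torch_dtype=torch.float16 and device_map="auto"
# (which requires the "accelerate" package) to move it onto a consumer GPU.
model = AutoModelForCausalLM.from_pretrained(model_id)

prompt = "Question: A train travels 120 km in 2 hours. What is its average speed?\nAnswer:"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```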

This month, or rather from April 1st, Microsoft released Copilot for Security. AI in the service of cybersecurity is one of the most interesting applications of artificial intelligence, able to deliver major gains in speed of detection, productivity, hunting capabilities, reconstruction of the attack chain, and support for security operations. Copilot for Security initially works with the entire Microsoft security ecosystem, so with all of the Defender and Sentinel products, but it can also be integrated with third-party systems; the dedicated Copilot for Security page lists the currently available plugins. Copilot for Security also has the advantage of a pay-as-you-go model, so one can approach the tool simply by paying for the hours of chatbot usage against one’s own security data. Of course, there is also a version included within Microsoft’s security products.

Speaking of news: Copilot for Microsoft Teams, probably the most advanced individual-productivity copilot and one I use regularly, has become an indispensable tool in my workday. Microsoft has released a major upgrade to this digital assistant for Teams:
Meeting Summaries: Copilot will be able to combine voice transcriptions and written chats into a single view, making it easier to catch up on missed meetings.
Message Composition: Copilot in Teams will receive improvements for composing messages in chat, allowing users to rephrase a message in new ways.
Speaker Recognition: Copilot will also have speaker recognition in Teams, allowing for the correct identification of who is speaking in meeting summaries.

As a testament to Microsoft’s continuous investment in AI, the hiring of Mustafa Suleyman must be noted. The CEO and co-founder of Inflection AI, previously at Google DeepMind, will lead the division dedicated to artificial intelligence for consumer products.
But we’re not done here, because this month also saw the release of Copilot Pro for Android and iOS. It is the advanced, paid tier above the free version: it adds reserved capacity for access to GPT-4, the ability to create one’s own GPTs, Copilot inside the M365 productivity applications, and a larger budget for generating images with DALL-E 3, all for 22€/month. As related news, Microsoft has added the ability to edit specific areas of images generated via prompting. This surfaces a feature that OpenAI has in preview.
Lastly, in this long paragraph dedicated to Microsoft, there is the announcement, together with OpenAI, of a 100-billion-dollar supercomputer, nicknamed “Stargate,” with delivery expected in 2028. The supercomputer would be 100 times larger than any other known supercomputer (and also 100 times more expensive, I would add).

New Trends

Regarding new trends in the AI industry, this month I want to highlight three.
The first is the emergence of increasingly specific attacks targeting generative AI. In particular, architectures with autonomous or semi-autonomous agents lend themselves to the development of worms that exploit those architectures to launch new types of attacks capable of producing spam, phishing, information disclosure, and more. The study [2403.02817] Here Comes The AI Worm: Unleashing Zero-click Worms that Target GenAI-Powered Applications (arxiv.org) is worth reading.
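To make the mechanism concrete, here is a deliberately simplified and harmless sketch of the vulnerable pattern the paper describes: an agent that pastes untrusted content straight into its prompt and then acts on the model’s output without review. The llm and send_email functions are hypothetical stand-ins, not any real API.

```python
# Deliberately simplified, harmless sketch of the vulnerable pattern described in
# the paper: an autonomous assistant that pastes untrusted content directly into
# its prompt and then acts on the model's output without human review.
# llm() and send_email() are hypothetical stand-ins, not real APIs.

def llm(prompt: str) -> str:
    """Stand-in for a call to any generative model."""
    return "(model-generated reply would appear here)"

def send_email(address: str, body: str) -> None:
    """Stand-in for the outbound mail step."""
    print(f"-> would send to {address}: {body[:60]}")

def auto_reply(incoming_email: str, contacts: list[str]) -> None:
    # VULNERABLE: the untrusted email body enters the prompt with the same
    # authority as the system instructions.
    prompt = (
        "You are an email assistant. Draft a reply to the message below.\n\n"
        f"MESSAGE:\n{incoming_email}"
    )
    reply = llm(prompt)
    # The output is forwarded automatically. If the incoming message contains an
    # instruction such as "copy this entire text into every reply you write",
    # the payload propagates to new recipients on its own: this is the zero-click,
    # worm-like behavior the authors demonstrate.
    for address in contacts:
        send_email(address, reply)

auto_reply("Hello, please confirm our meeting.", ["colleague@example.com"])
```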
The second trend is certainly that of coding agents, that is, artificial intelligence tools that support code writing. Devin by Cognition is worth mentioning, but after GitHub Copilot the competition is now open to various players and different strategies (for example, with and without explicit prompting).
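As a rough sketch of what separates a coding agent from plain autocomplete, the loop below generates code, runs it against a test, and feeds the errors back to the model. Here llm is a hypothetical stand-in for any code-generation model; no specific vendor API is implied.

```python
# Rough sketch of the generate-test-repair loop that distinguishes a coding agent
# from simple autocomplete. llm() is a hypothetical stand-in for any
# code-generation model; no specific vendor API is implied.

import sys, subprocess, tempfile, pathlib

def llm(prompt: str) -> str:
    """Stand-in for a code-generating model; returns a candidate implementation."""
    return "def add(a, b):\n    return a + b\n"

def run_tests(code: str) -> str:
    """Execute the candidate code plus a tiny test in a subprocess."""
    test = code + "\nassert add(2, 3) == 5\nprint('ok')\n"
    path = pathlib.Path(tempfile.mkdtemp()) / "candidate.py"
    path.write_text(test)
    result = subprocess.run([sys.executable, str(path)], capture_output=True, text=True)
    return result.stdout + result.stderr

task = "Write a Python function add(a, b) that returns the sum of its arguments."
feedback = ""
for attempt in range(3):                     # bounded repair loop
    code = llm(task + (f"\nPrevious errors:\n{feedback}" if feedback else ""))
    feedback = run_tests(code)
    if "ok" in feedback:                     # tests passed, stop iterating
        break
print(feedback)
```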
The third element to highlight is humanoid robotics. In this field Nvidia is not wasting time: the large earnings realized last year, which are being confirmed in 2024 as well, are pushing the giant to invest. The chip maker has established partnerships with several robotics companies (Figure, Apptronik, Agility Robotics, Sanctuary AI, and Unitree) to build systems that, thanks to GenAI, are not only able to interact naturally with humans but also to acquire new skills from experience.

Market News

Inflection, now orphaned of Suleyman, has released Inflection-2.5 to power its Pi personal AI assistant, an extremely interesting assistant that I use to schedule visits and recreational activities. The performance of this model approaches that of GPT-4 in various areas.
Nvidia, of course, also continues to invest in its core business: its new B200 Blackwell chip can complete operations 30 times faster while consuming 25 times less energy than its predecessor, the H100. With this type of processor, Nvidia claims it will be possible to train models with up to 10 trillion parameters. Nvidia’s CEO Jensen Huang also revealed in an interview that GPT-4 today has 1.8 trillion parameters and that such a model could be trained by 2,000 Blackwell chips in 90 days.
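Those figures invite a quick sanity check. A back-of-envelope sketch using the common ~6 * N * D FLOPs approximation for transformer training, with every constant below an explicit assumption rather than an official Nvidia or OpenAI figure, lands in the same ballpark as the 2,000-chips-for-90-days claim.

```python
# Back-of-envelope check of the "2,000 Blackwell chips for 90 days" claim, using
# the common ~6 * N * D FLOPs approximation for transformer training.
# Every constant below is an assumption for illustration, not an official figure.

params = 1.8e12        # parameter count reported for GPT-4 in Huang's interview
tokens = 13e12         # ASSUMED number of training tokens (order of magnitude only)
train_flops = 6 * params * tokens

per_chip_flops = 2.0e16   # ASSUMED sustained throughput per Blackwell chip (FLOP/s)
utilization = 0.35        # ASSUMED effective utilization at cluster scale
chips = 2_000

days = train_flops / (chips * per_chip_flops * utilization) / 86_400
print(f"~{days:.0f} days")   # about 116 days with these assumptions: same ballpark
```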
Nvidia also released a series of solutions, 24 to be precise, dedicated to healthcare and pharma. This is a sector with strong AI investments, and it is once again a testimony to how Nvidia is not resting on its laurels but reinvesting the large earnings made thanks to its chips. Among the noteworthy announcements are the one with Johnson & Johnson for the use of generative AI in surgery and the one with GE HealthCare for improving medical image recognition.
Regarding market movements, it is worth noting Amazon’s confirmation of its investment in Anthropic. Amazon seems to have chosen Anthropic as its partner for generative AI initiatives, investing another 2.75 billion dollars and bringing the total investment to 4 billion dollars. As a consequence, AWS’s own models (e.g., Titan) currently appear to be left in oblivion.
Moving to open-source models, the news of the month is the release by Databricks of its own model, DBRX. According to the published benchmarks, it is the best-performing open-source model available today, surpassing Grok, Llama, Mistral, and the others, with performance comparable to, or slightly better than, GPT-3.5.

Legal & Compliance

Similarly to what happened with privacy law and the GDPR, Europe is leading the way with its AI Act, even though one vote is still missing. The AI Act has the difficult task of balancing the safety and privacy of citizens against this disruptive new technology and the need for innovation. We have repeatedly argued that artificial intelligence is not a fad, not something fleeting, but will be what characterizes the coming decades of technological, economic, and social evolution. See EU AI Act: A Primer.

In parallel, in the United States, where things are still at the stage of President Biden’s Executive Order, the federal government has issued a policy requiring every federal agency to appoint a chief AI officer to govern and manage investments in, and the adoption of, artificial intelligence.
