With Five New Multimodal Models Across the 3B, 4B, and 9B Scales, the OpenFlamingo Team Releases OpenFlamingo v2 which Outperforms the Previous Model

A group of researchers from the University of Washington, Stanford, AI2, UCSB, and Google recently developed the OpenFlamingo project, which aims to build models similar to those DeepMind’s Flamingo team. OpenFlamingo models can handle any mixed text and image sequences and produce text as an output. Captioning, visual question answering, and image classification are just…

Read More

Contextual AI Introduces LENS: An AI Framework for Vision-Augmented Language Models that Outperforms Flamingo by 9% (56->65%) on VQAv2

Large Language Models (LLMs) have transformed natural language understanding in recent years, demonstrating remarkable aptitudes in semantic comprehension, query resolution, and text production, particularly in zero-shot and few-shot environments. As seen in Fig. 1(a), several methods have been put forth for using LLMs on tasks involving vision. An optical encoder may be trained to represent…

Read More

Google Ads With Larger Images Again

Google Ads has been showing larger images on and off for a while now. But here is an ad for a law firm that has really large images on the desktop search interface. This was spotted by Anthony Higman, who posted this example on Twitter – I should note that I cannot replicate this: To…

Read More

Unity Announce the Release of Muse: A Text-to-Video Games Platform that lets you Create Textures, Sprites, and Animations with Natural Language

AI has been making waves in various industries, revolutionizing how we approach art and many other fields. Artificial intelligence has opened up new possibilities for creative expression and efficiency with its ability to analyze data, learn patterns, and generate content. One area where AI has mainly made its mark is in the realm of game…

Read More