With Five New Multimodal Models Across the 3B, 4B, and 9B Scales, the OpenFlamingo Team Releases OpenFlamingo v2, Which Outperforms the Previous Model

A group of researchers from the University of Washington, Stanford, AI2, UCSB, and Google recently developed the OpenFlamingo project, which aims to build models similar to those of DeepMind’s Flamingo team. OpenFlamingo models can handle any mixed sequence of text and images and produce text as output. Captioning, visual question answering, and image classification are just…

Contextual AI Introduces LENS: An AI Framework for Vision-Augmented Language Models that Outperforms Flamingo by 9% (56->65%) on VQAv2

Large Language Models (LLMs) have transformed natural language understanding in recent years, demonstrating remarkable aptitudes in semantic comprehension, query resolution, and text production, particularly in zero-shot and few-shot environments. As seen in Fig. 1(a), several methods have been put forth for using LLMs on tasks involving vision. An optical encoder may be trained to represent…

Google Ads With Larger Images Again

Google Ads has been showing larger images on and off for a while now. But here is an ad for a law firm that has really large images on the desktop search interface. This was spotted by Anthony Higman, who posted this example on Twitter – I should note that I cannot replicate this: To…

Unity Announces the Release of Muse: A Text-to-Video Games Platform that lets you Create Textures, Sprites, and Animations with Natural Language

AI has been making waves in various industries, revolutionizing how we approach art and many other fields. With its ability to analyze data, learn patterns, and generate content, artificial intelligence has opened up new possibilities for creative expression and efficiency. One area where AI has particularly made its mark is in the realm of game…

Bing Gains AI-Powered Shopping Features

Bing announced a number of new features it says are powered by AI across Bing, Bing Chat, and the Edge sidebar. The features include AI-generated buying guides, AI-generated review summaries, and new price match monitors. Let’s go through each, most of which should be live for many of you. Buying Guides: Bing uses AI…

Meet FastSAM: The Breakthrough Real-Time Solution Achieving High-Performance Segmentation with Minimal Computational Load

The Segment Anything Model (SAM) is a recent proposal in the field. It is a vision foundation model that has been hailed as a breakthrough. It can use a variety of user interaction prompts to accurately segment any object in an image. Using a Transformer model that has been extensively trained on the SA-1B dataset, SAM can easily…
