Weekly Series #3
Key highlights from OpenAI, Google and Microsoft events, Inflation cools down in April CPI report
Disclaimer: This content is NOT generated by AI
Last week has been nothing less than spectacular for AI product announcements. Google, OpenAI and Microsoft held events within days of each other to announce their latest in AI. It was a lot to process, so I’ve tried to remove noise and create a summary of key items below chronologically:
OpenAI:
They decided to make an impromptu announcement on May 13th, just a day prior to Google I/O event (which was announced many weeks in advance). This was no coincidence. Here’s the link to official video that was live-streamed on YouTube.
New model announced - ChatGPT-4o (‘o’ for omni)
This is the first multi-modal Generative AI model released by OpenAI - which means that it can take inputs in all 3 formats at the same time - text, vision and audio in a very human-like manner. They released videos through their YouTube channel that demonstrates these multi-modal capabilities.ChatGPT-4o currently available to users for free (limited access & features), ie. through their website chatgpt.com. ‘Plus’ subscribers to get higher limits for tokens, in addition to DALL-E image generation and advanced features. Mobile app will soon be upgraded to this new model (date unknown).
New desktop app to be launched soon for both Mac and Windows (date unknown)
Google I/O 2024:
After a not-so-great Gemini launch early this year, Google makes a comeback by showing off its entire AI arsenal, although it mostly seemed to be a catch up to OpenAI’s products. Here are the highlights:
Project Astra: This was Google’s ‘ChatGPT-4o equivalent’ of sorts, as it demonstrated multi-modal capabilities (ability to take text, audio and vision inputs from real world all at once). There is still no clarity on when this will be generally available.
Gemini updates: They announced Gemini 1.5 Flash, a lightweight version of Gemini 1.5 Pro for narrow, low-latency and high frequency tasks and is expected to be cheapest of its kind. They also announced upgraded version of Gemini 1.5 Pro, with doubled context window length, ie. 2 million tokens. This is the largest context window length for any commercial model out there as of today.
IMAGEN 3 and VEO: Google’s equivalent to DALL-E(image model) and Sora(video model) respectively. IMAGEN 3 is an upgrade from 2, while VEO is their fresh new video AI model. Currently only available for Trusted Testers, general availability is unknown.
Google Labs: They announced multiple experimental AI-powered tools including MusicLM (AI too that can create music from user prompts), NotebookLM (AI assisted notebook), FoodMood (AI recipe collaborator), TextFX (AI powered text/language tool) and more.
AI Overview in Search: In a controversial move, Google rolled out the ‘AI Overview’ feature on Google Search to all users across multiple geographies, which seemed to mimic the Perplexity AI concept of summarizing search results from top search links. It is becoming increasingly clear that Google is trying to play catch-up with advancements in AI owing to risk of losing web traffic on its biggest product.
Microsoft Build:
Microsoft goes all-in on AI by making announcements in their developer conference. Here are the key highlights:
Microsoft announced Copilot + PCs, claiming them to be the fastest and most intelligent Windows PCs ever created. They will be equipped with CPU, GPU and ‘NPU’ (Neural Processing Unit) optimized for AI workloads that will not only run LLMs on Azure Cloud, but also Microsoft family of SLMs locally for AI applications. As the name suggests, Microsoft Copilot will be the focal point of these products that will leverage ChatGPT-4o multimodal capabilities to enhance applications and use-cases.
Microsoft announced partnership with Khan Academy to expand AI capabilities in learning and tutoring. This clearly seems to be an increasingly important use-case of AI, as both OpenAI and Google demonstrated the importance it can bring in learning experience.
Additionally, Microsoft announced many new AI developer tools for the developer community to build next generation application on Windows products.
Inflation cools down in April CPI report:
For the first time this year, inflation decelerated to 3.5% YoY against 3.6% YoY in March, which happened to be softer than expected. CPI is a leading indicator for Fed’s monetary policy and is still far from their target of 2% YoY, but April CPI was received well by the market as it can now expect Fed to consider rate cuts later this year. S&P500, Dow and NASDAQ all reached new highs thanks to the CPI report.
Mortgage rates dip below 7% for the first time in over a month.