Gemini AI

12 Dec 2023

What exactly is Google Gemini, and how can it be utilized? Explore its features and limitations.

Google has taken a significant leap forward by introducing Gemini AI, marking a new era in artificial intelligence. Gemini, the latest large language model (LLM) from Google, has been publicly released following a teaser in June. This noteworthy advancement in AI is anticipated to reverberate across all of Google's products.

What is Google Gemini?

Google Gemini is Google's latest large language model (LLM), more powerful and capable than its predecessor. It is a multimodal model, designed to handle text, images, video, audio, and code.

Gemini Ultra, the most capable variant, is reported to be the first model to outperform human experts on Massive Multitask Language Understanding (MMLU), a widely used benchmark for assessing the knowledge and problem-solving capabilities of AI models.
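To make the benchmark claim concrete, here is a minimal sketch of how an MMLU-style score can be computed: each item is a multiple-choice question tagged with a subject, and the score is the fraction answered correctly. The item layout, the function names, and the toy "model" below are illustrative assumptions, not Google's evaluation code or the official MMLU harness.

```python
from collections import defaultdict


def mmlu_style_accuracy(items, predict):
    """Return (overall accuracy, per-subject accuracy) for multiple-choice items.

    `items` is a list of dicts with keys "subject", "question", "choices"
    and "answer" (index of the correct choice). This layout is hypothetical.
    `predict` is any callable that returns the index of the chosen answer.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for item in items:
        total[item["subject"]] += 1
        if predict(item["question"], item["choices"]) == item["answer"]:
            correct[item["subject"]] += 1
    per_subject = {subject: correct[subject] / total[subject] for subject in total}
    overall = sum(correct.values()) / sum(total.values())
    return overall, per_subject


# Toy data for illustration only; real MMLU covers many more subjects and questions.
SAMPLE_ITEMS = [
    {"subject": "astronomy", "question": "Which planet is closest to the Sun?",
     "choices": ["Venus", "Mercury", "Mars", "Earth"], "answer": 1},
    {"subject": "astronomy", "question": "Which planet is known as the Red Planet?",
     "choices": ["Venus", "Mercury", "Mars", "Earth"], "answer": 2},
    {"subject": "arithmetic", "question": "What is 7 * 8?",
     "choices": ["54", "56", "58", "64"], "answer": 1},
    {"subject": "arithmetic", "question": "What is 9 + 6?",
     "choices": ["15", "14", "16", "13"], "answer": 0},
]


def always_second_choice(question, choices):
    """Stand-in for a model: always picks the second option."""
    return 1


if __name__ == "__main__":
    overall, per_subject = mmlu_style_accuracy(SAMPLE_ITEMS, always_second_choice)
    print(f"overall accuracy: {overall:.2f}")
    for subject, accuracy in per_subject.items():
        print(f"{subject}: {accuracy:.2f}")
```

In a real evaluation, `predict` would wrap a call to the model under test, and the resulting accuracy would be compared against a human-expert baseline measured on the same questions.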

Gemini AI excels in several domains, including:

Computer Vision: Encompassing object detection, scene understanding, and anomaly detection.

Geospatial Science: Involving multisource data fusion, planning and intelligence, and continuous monitoring.

Human Health: Addressing personalized healthcare, biosensor integration, and preventative medicine.

Integrated Technologies: Covering domain knowledge transfer, data fusion, enhanced decision-making, and large language models (LLMs).

Google is placing particular emphasis on Gemini's application in coding, showcasing AlphaCode 2, a new code-generating system. In a coding competition, AlphaCode 2 performed better than an estimated 85% of participants, up from roughly 50% for the original AlphaCode. Users can expect enhancements across the various domains where Gemini is applied.

Trained on Google's Tensor Processing Units (TPUs), Gemini is faster and more cost-effective to run than its predecessor, PaLM. Google is also introducing TPU v5p, a newer version aimed at data centres that train and run large-scale models.

Gemini is available in three variants—Nano, Pro, and Ultra—each catering to different user needs. Nano is designed for swift on-device tasks, Pro serves as a versatile middle-tier option, and Ultra, the most powerful variant, is slated for release next year pending safety checks.

Users can try Gemini Nano on the Pixel 8 Pro, where it powers features such as summarization in the Recorder app and Smart Reply in Gboard, starting with WhatsApp. Gemini Pro's advanced text-based features are available for free in Google Bard.