I'm really excited about this post, as it covers one of the most powerful changes I've seen to Google's Gemini APIs in quite some time. For a while now it's been really easy to perform searches against a document, or a group of documents. You would upload the file (or files), ask your questions, and that was all you needed.
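If you've never tried that flow, it looks roughly like this with the Python google-genai SDK. Treat this as a minimal sketch: the model name and file path are just placeholders.

```python
from google import genai

# Assumes GEMINI_API_KEY is set in the environment.
client = genai.Client()

# Upload the document once...
doc = client.files.upload(file="report.pdf")

# ...then ask questions against it.
response = client.models.generate_content(
    model="gemini-2.5-flash",
    contents=[doc, "What are the three main conclusions of this report?"],
)
print(response.text)
```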
Introduced August 26 and also known as "Nano Banana," Gemini 2.5 Flash Image lets developers maintain character consistency, make targeted transformations using natural language, and draw on Gemini's world knowledge to generate and edit images. The model is available via the Gemini API and Google AI Studio for developers, and via Vertex AI for enterprise. To help developers build with Gemini 2.5 Flash Image, Google has also updated Google AI Studio's build mode.
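To give a sense of what that looks like in code, here's a minimal sketch using the Python google-genai SDK. I'm assuming the preview model ID `gemini-2.5-flash-image-preview` here, so double-check the docs for the current name:

```python
from io import BytesIO

from google import genai
from PIL import Image

client = genai.Client()

response = client.models.generate_content(
    model="gemini-2.5-flash-image-preview",
    contents=["A photorealistic banana wearing a tiny sombrero, studio lighting"],
)

# The response can mix text and image parts; save any images returned.
for part in response.candidates[0].content.parts:
    if part.inline_data is not None:
        Image.open(BytesIO(part.inline_data.data)).save("banana.png")
    elif part.text:
        print(part.text)
```

Editing works the same way: pass an existing image in `contents` alongside a natural language instruction, and the model returns the transformed image.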
As a reminder, these typically fall into two categories:

* cbr - a RAR file of scanned images
* cbz - a ZIP file of scanned images

This week I was wondering: given that GenAI tools are pretty good at understanding images, how well could a GenAI system take a set of images, in order, and understand the story they tell? I decided to give it a shot and honestly, I'm pretty impressed by the results.
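Before digging in, here's a rough sketch of one way to handle the cbz case (a plain ZIP), using Python's zipfile module and the google-genai SDK. The model name and prompt here are assumptions on my part, and a cbr file would need a RAR library such as rarfile instead:

```python
import mimetypes
import zipfile

from google import genai
from google.genai import types

client = genai.Client()

# Pull the scanned pages out of the cbz (a regular zip), keeping them in order.
pages = []
with zipfile.ZipFile("issue1.cbz") as cbz:
    for name in sorted(cbz.namelist()):
        if name.lower().endswith((".jpg", ".jpeg", ".png", ".webp")):
            mime, _ = mimetypes.guess_type(name)
            pages.append(types.Part.from_bytes(
                data=cbz.read(name),
                mime_type=mime or "image/jpeg",
            ))

# Send the pages, in order, with a prompt asking about the story they tell.
response = client.models.generate_content(
    model="gemini-2.5-flash",
    contents=pages + ["These are the pages of a comic book, in order. "
                      "Describe the story they tell."],
)
print(response.text)
```

For a longer book, inlining every page like this will likely bump into request-size limits, in which case uploading the pages via the Files API first would be the safer route.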