
Gemini Adapter

Use the Gemini adapter to connect to Google Gemini Pro and Gemini Pro Vision models

The Gemini adapter provides integration with the Google Gemini API, supporting text generation, vision understanding, and ultra-long context (1M tokens).

Installation

pnpm add @amux.ai/llm-bridge @amux.ai/adapter-google

Basic Usage

import { createBridge } from '@amux.ai/llm-bridge'
import { googleAdapter } from '@amux.ai/adapter-google'

const bridge = createBridge({
  inbound: googleAdapter,
  outbound: googleAdapter,
  config: {
    apiKey: process.env.GEMINI_API_KEY
  }
})

const response = await bridge.chat({
  model: 'gemini-pro',
  contents: [{
    role: 'user',
    parts: [{ text: 'What is Amux?' }]
  }]
})

console.log(response.candidates[0].content.parts[0].text)

The Gemini API uses a different request format (contents instead of messages); Amux handles the conversion automatically.
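To give a sense of what that conversion involves, here is a minimal sketch of mapping OpenAI-style messages onto Gemini contents. This is an illustrative helper, not part of the Amux API — the bridge performs the real conversion internally:

```typescript
// Illustrative only: the shape conversion Amux performs internally.
type OpenAIMessage = { role: 'user' | 'assistant' | 'system'; content: string }
type GeminiContent = { role: 'user' | 'model'; parts: { text: string }[] }

function messagesToContents(messages: OpenAIMessage[]): GeminiContent[] {
  return messages
    .filter((m) => m.role !== 'system') // system prompts map to a separate field, not contents
    .map((m) => ({
      role: m.role === 'assistant' ? 'model' : 'user', // Gemini says 'model', not 'assistant'
      parts: [{ text: m.content }],
    }))
}
```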

Supported Models

Model              Context Length  Description
gemini-pro         1M              Text generation model
gemini-pro-vision  1M              Vision-enabled model
gemini-1.5-pro     2M              Latest version, longer context
gemini-1.5-flash   1M              Fast response version

Gemini's key feature is ultra-long context (up to 2M tokens), far exceeding most other models.

Key Features

Text Generation

const response = await bridge.chat({
  model: 'gemini-pro',
  contents: [{
    role: 'user',
    parts: [{ text: 'Write a poem about AI' }]
  }],
  generationConfig: {
    temperature: 0.9,
    topK: 40,
    topP: 0.95,
    maxOutputTokens: 1024
  }
})

Vision Understanding

Analyze images with Gemini Pro Vision:

const response = await bridge.chat({
  model: 'gemini-pro-vision',
  contents: [{
    role: 'user',
    parts: [
      { text: 'What is in this image?' },
      {
        inlineData: {
          mimeType: 'image/jpeg',
          data: base64ImageData
        }
      }
    ]
  }]
})

Supported Image Formats:

  • Base64 encoded (recommended)
  • Requires mimeType (image/jpeg, image/png, etc.)
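Since every image part needs base64 data plus a mimeType, a small helper keeps the wrapping in one place. This helper is hypothetical (not part of the adapter); the part shape matches the inlineData example above:

```typescript
type InlineDataPart = { inlineData: { mimeType: string; data: string } }

// Wrap raw image bytes as the base64 inlineData part Gemini expects.
function toInlineDataPart(bytes: Buffer, mimeType: string): InlineDataPart {
  return { inlineData: { mimeType, data: bytes.toString('base64') } }
}

// Usage: toInlineDataPart(fs.readFileSync('image.jpg'), 'image/jpeg')
```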

Ultra-Long Context

Gemini supports up to 2M tokens context:

const response = await bridge.chat({
  model: 'gemini-1.5-pro',
  contents: [{
    role: 'user',
    parts: [{
      text: `Analyze this very long document:\n\n${veryLongDocument}`
    }]
  }]
})

Use Cases:

  • Entire book analysis
  • Large codebase understanding
  • Long video transcript analysis
  • Multi-document comparison
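Before sending a very large document, it can help to sanity-check that it plausibly fits the model's window. The sketch below uses the rough ~4-characters-per-token heuristic for English text — an approximation, not Gemini's real tokenizer:

```typescript
// Rough estimate: ~4 characters per token for English text.
// A heuristic only, not Gemini's actual tokenizer.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4)
}

// Check that input plus reserved output tokens fits the context limit.
function fitsInContext(text: string, contextLimit: number, reservedForOutput = 4096): boolean {
  return estimateTokens(text) + reservedForOutput <= contextLimit
}
```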

Streaming

const stream = bridge.chatStream({
  model: 'gemini-pro',
  contents: [{
    role: 'user',
    parts: [{ text: 'Tell me a story' }]
  }],
  stream: true
})

for await (const chunk of stream) {
  const text = chunk.candidates[0]?.content?.parts[0]?.text
  if (text) {
    process.stdout.write(text)
  }
}
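If you need the full response text as well as the incremental output, you can accumulate chunks as they arrive. A sketch, assuming the same chunk shape as the loop above:

```typescript
type GeminiChunk = { candidates?: { content?: { parts?: { text?: string }[] } }[] }

// Pull the text (if any) out of one streamed chunk.
function chunkText(chunk: GeminiChunk): string {
  return chunk.candidates?.[0]?.content?.parts?.[0]?.text ?? ''
}

// Echo each chunk to stdout while accumulating the full text.
async function collectStream(stream: AsyncIterable<GeminiChunk>): Promise<string> {
  let full = ''
  for await (const chunk of stream) {
    const text = chunkText(chunk)
    process.stdout.write(text)
    full += text
  }
  return full
}
```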

Configuration Options

const bridge = createBridge({
  inbound: googleAdapter,
  outbound: googleAdapter,
  config: {
    apiKey: process.env.GEMINI_API_KEY,
    baseURL: 'https://generativelanguage.googleapis.com', // Default
    timeout: 60000
  }
})

Feature Support

Feature             Supported  Notes
Text Generation     ✅          Fully supported
Streaming           ✅          Fully supported
Vision              ✅          Gemini Pro Vision
Ultra-Long Context  ✅          Up to 2M tokens
Function Calling    ⚠️          Partial support
System Prompt       ✅          Fully supported
JSON Mode           ✅          Structured output

Gemini's function calling format differs from OpenAI and may require additional format conversion.
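To illustrate the difference, here is a sketch of mapping an OpenAI-style tool definition onto Gemini's functionDeclarations shape. This is illustrative only; the fields the real adapter covers may differ:

```typescript
type OpenAITool = {
  type: 'function'
  function: { name: string; description?: string; parameters?: object }
}

type GeminiFunctionDeclaration = {
  name: string
  description?: string
  parameters?: object
}

// OpenAI nests the definition under `function`; Gemini expects flat
// declarations under tools[].functionDeclarations.
function toFunctionDeclarations(
  tools: OpenAITool[]
): { functionDeclarations: GeminiFunctionDeclaration[] } {
  return {
    functionDeclarations: tools.map((t) => ({
      name: t.function.name,
      description: t.function.description,
      parameters: t.function.parameters,
    })),
  }
}
```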

Best Practices

1. Choose the Right Model

// Use gemini-pro for text tasks
const textResponse = await bridge.chat({
  model: 'gemini-pro',
  contents: [{ role: 'user', parts: [{ text: '...' }] }]
})

// Use gemini-pro-vision for image tasks
const visionResponse = await bridge.chat({
  model: 'gemini-pro-vision',
  contents: [{ role: 'user', parts: [...] }]
})

// Use the 1.5 models when longer context is needed
const longContextResponse = await bridge.chat({
  model: 'gemini-1.5-pro',
  contents: [{ role: 'user', parts: [{ text: longText }] }]
})
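The choice above can be captured in a tiny helper. The helper itself is hypothetical; the model names come from the table earlier on this page:

```typescript
type Task = 'text' | 'vision' | 'long-context'

// Map a task type to a reasonable default model.
function pickModel(task: Task): string {
  switch (task) {
    case 'vision':
      return 'gemini-pro-vision'
    case 'long-context':
      return 'gemini-1.5-pro'
    default:
      return 'gemini-pro'
  }
}
```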

2. Optimize Generation Parameters

const response = await bridge.chat({
  model: 'gemini-pro',
  contents: [{
    role: 'user',
    parts: [{ text: 'Write a technical article' }]
  }],
  generationConfig: {
    temperature: 0.7, // Control randomness
    topK: 40, // Top-K sampling
    topP: 0.95, // Top-P sampling
    maxOutputTokens: 2048, // Maximum output length
    stopSequences: ['\n\n'] // Stop sequences
  }
})

3. Handle Multimodal Input

// Read image file
import fs from 'fs'

const imageData = fs.readFileSync('image.jpg')
const base64Image = imageData.toString('base64')

const response = await bridge.chat({
  model: 'gemini-pro-vision',
  contents: [{
    role: 'user',
    parts: [
      { text: 'Analyze this image' },
      {
        inlineData: {
          mimeType: 'image/jpeg',
          data: base64Image
        }
      }
    ]
  }]
})

4. Leverage Ultra-Long Context

// Gemini can process entire books
const bookAnalysis = await bridge.chat({
  model: 'gemini-1.5-pro',
  contents: [{
    role: 'user',
    parts: [{
      text: `Please summarize the main content and key points of this book:\n\n${entireBook}`
    }]
  }],
  generationConfig: {
    maxOutputTokens: 4096 // Generate detailed summary
  }
})

Converting from the OpenAI Format

import { openaiAdapter } from '@amux.ai/adapter-openai'
import { googleAdapter } from '@amux.ai/adapter-google'

const bridge = createBridge({
  inbound: openaiAdapter,
  outbound: googleAdapter,
  config: {
    apiKey: process.env.GEMINI_API_KEY
  }
})

// Send request in OpenAI format
// Amux automatically converts to Gemini format
const response = await bridge.chat({
  model: 'gpt-4',
  messages: [{ role: 'user', content: 'Hello' }]
})

Safety Settings

Gemini supports content safety filtering:

const response = await bridge.chat({
  model: 'gemini-pro',
  contents: [{
    role: 'user',
    parts: [{ text: '...' }]
  }],
  safetySettings: [
    {
      category: 'HARM_CATEGORY_HARASSMENT',
      threshold: 'BLOCK_MEDIUM_AND_ABOVE'
    },
    {
      category: 'HARM_CATEGORY_HATE_SPEECH',
      threshold: 'BLOCK_MEDIUM_AND_ABOVE'
    }
  ]
})
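If you want the same threshold across all harm categories, a small builder avoids repeating the entries. The builder is a sketch; the category names are the ones defined by the Gemini API:

```typescript
type SafetySetting = { category: string; threshold: string }

// The four harm categories the Gemini API defines.
const HARM_CATEGORIES = [
  'HARM_CATEGORY_HARASSMENT',
  'HARM_CATEGORY_HATE_SPEECH',
  'HARM_CATEGORY_SEXUALLY_EXPLICIT',
  'HARM_CATEGORY_DANGEROUS_CONTENT',
] as const

// Apply one threshold to every category.
function safetySettingsFor(threshold: string): SafetySetting[] {
  return HARM_CATEGORIES.map((category) => ({ category, threshold }))
}
```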
