feat(vertexai): Gemini multimodal output #8922

Open: dlarocque wants to merge 7 commits into main from dl/gemini-image-out

dlarocque commented Apr 10, 2025

Adds a new ResponseModality enum that lets users specify which modalities should be included in a response.
Since we already provide a text() accessor, this also adds a similar inlineDataParts() accessor that returns every InlineDataPart in the first candidate.
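
For readers skimming the thread, here is a minimal sketch of how an accessor like this could behave. The type shapes (`ResponseSketch`, `Candidate`, the layout of `inlineData`) and the empty-array fallback are assumptions for illustration, not the SDK's actual definitions.

```typescript
// Hypothetical, simplified shapes; the SDK's real types carry more fields.
interface TextPart {
  text: string;
}
interface InlineDataPart {
  inlineData: { mimeType: string; data: string };
}
type Part = TextPart | InlineDataPart;

interface Candidate {
  content: { parts: Part[] };
}
interface ResponseSketch {
  candidates?: Candidate[];
}

// Type guard that separates inline data parts from text parts.
function isInlineDataPart(part: Part): part is InlineDataPart {
  return 'inlineData' in part;
}

// Like a text() accessor, this only looks at the first candidate.
// Returning [] when there is no candidate is an assumption of this sketch.
function inlineDataParts(response: ResponseSketch): InlineDataPart[] {
  const first = response.candidates?.[0];
  if (!first) {
    return [];
  }
  return first.content.parts.filter(isInlineDataPart);
}

// Two candidates: only the parts of the first one are considered.
const example: ResponseSketch = {
  candidates: [
    {
      content: {
        parts: [
          { text: 'Here is your image:' },
          { inlineData: { mimeType: 'image/png', data: 'aGVsbG8=' } }
        ]
      }
    },
    {
      content: {
        parts: [{ inlineData: { mimeType: 'image/jpeg', data: 'd29ybGQ=' } }]
      }
    }
  ]
};

const images = inlineDataParts(example);
console.log(images.map(part => part.inlineData.mimeType));
```

Restricting the accessor to the first candidate keeps it symmetric with text(), so callers get the text and the image data from the same candidate.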

API Proposal: https://goto.google.com/vinf-multimodal-output-api (internal)

changeset-bot bot commented Apr 10, 2025

🦋 Changeset detected

Latest commit: b898f35

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 2 packages
| Name | Type |
| --- | --- |
| firebase | Minor |
| @firebase/vertexai | Minor |

Vertex AI Mock Responses Check ⚠️

A newer major version of the mock responses for Vertex AI unit tests is available. update_vertexai_responses.sh should be updated to clone the latest version of the responses: v10.0

google-oss-bot commented Apr 10, 2025

Size Report 1

Affected Products

  • @firebase/auth

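    | Type | Base (ea1f913) | Merge (2a36e4a) | Diff |
    | --- | --- | --- | --- |
    | browser | 193 kB | 193 kB | +209 B (+0.1%) |
    | cordova | 166 kB | 166 kB | +209 B (+0.1%) |
    | main | 147 kB | 147 kB | +194 B (+0.1%) |
    | module | 193 kB | 193 kB | +209 B (+0.1%) |
    | react-native | 165 kB | 165 kB | +194 B (+0.1%) |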
  • @firebase/auth-cordova

    | Type | Base (ea1f913) | Merge (2a36e4a) | Diff |
    | --- | --- | --- | --- |
    | browser | 166 kB | 166 kB | +209 B (+0.1%) |
    | module | 166 kB | 166 kB | +209 B (+0.1%) |
  • @firebase/auth-web-extension

    | Type | Base (ea1f913) | Merge (2a36e4a) | Diff |
    | --- | --- | --- | --- |
    | browser | 142 kB | 142 kB | +209 B (+0.1%) |
    | main | 159 kB | 159 kB | +200 B (+0.1%) |
    | module | 142 kB | 142 kB | +209 B (+0.1%) |
  • @firebase/auth/internal

    | Type | Base (ea1f913) | Merge (2a36e4a) | Diff |
    | --- | --- | --- | --- |
    | browser | 204 kB | 204 kB | +209 B (+0.1%) |
    | main | 173 kB | 174 kB | +200 B (+0.1%) |
    | module | 204 kB | 204 kB | +209 B (+0.1%) |
  • @firebase/data-connect

    | Type | Base (ea1f913) | Merge (2a36e4a) | Diff |
    | --- | --- | --- | --- |
    | browser | 21.4 kB | 21.7 kB | +281 B (+1.3%) |
    | main | 23.7 kB | 23.9 kB | +266 B (+1.1%) |
    | module | 21.4 kB | 21.7 kB | +281 B (+1.3%) |
  • @firebase/database

    | Type | Base (ea1f913) | Merge (2a36e4a) | Diff |
    | --- | --- | --- | --- |
    | browser | 249 kB | 249 kB | +62 B (+0.0%) |
    | main | 254 kB | 254 kB | +61 B (+0.0%) |
    | module | 249 kB | 249 kB | +62 B (+0.0%) |
  • @firebase/database-compat/standalone

    | Type | Base (ea1f913) | Merge (2a36e4a) | Diff |
    | --- | --- | --- | --- |
    | main | 366 kB | 366 kB | +61 B (+0.0%) |
  • @firebase/firestore

    | Type | Base (ea1f913) | Merge (2a36e4a) | Diff |
    | --- | --- | --- | --- |
    | browser | 384 kB | 385 kB | +243 B (+0.1%) |
    | main | 594 kB | 595 kB | +291 B (+0.0%) |
    | module | 384 kB | 385 kB | +243 B (+0.1%) |
    | react-native | 384 kB | 385 kB | +243 B (+0.1%) |
  • @firebase/firestore-lite

    | Type | Base (ea1f913) | Merge (2a36e4a) | Diff |
    | --- | --- | --- | --- |
    | browser | 114 kB | 114 kB | +226 B (+0.2%) |
    | main | 157 kB | 157 kB | +401 B (+0.3%) |
    | module | 114 kB | 114 kB | +226 B (+0.2%) |
    | react-native | 114 kB | 114 kB | +227 B (+0.2%) |
  • @firebase/functions

    | Type | Base (ea1f913) | Merge (2a36e4a) | Diff |
    | --- | --- | --- | --- |
    | browser | 14.0 kB | 14.1 kB | +73 B (+0.5%) |
    | main | 14.6 kB | 14.7 kB | +67 B (+0.5%) |
    | module | 14.0 kB | 14.1 kB | +73 B (+0.5%) |
  • @firebase/storage

    | Type | Base (ea1f913) | Merge (2a36e4a) | Diff |
    | --- | --- | --- | --- |
    | browser | 58.0 kB | 58.4 kB | +389 B (+0.7%) |
    | main | 59.4 kB | 60.0 kB | +540 B (+0.9%) |
    | module | 58.0 kB | 58.4 kB | +389 B (+0.7%) |
  • @firebase/util

    | Type | Base (ea1f913) | Merge (2a36e4a) | Diff |
    | --- | --- | --- | --- |
    | browser | 23.4 kB | 23.5 kB | +123 B (+0.5%) |
    | main | 29.5 kB | 29.7 kB | +193 B (+0.7%) |
    | module | 23.4 kB | 23.5 kB | +123 B (+0.5%) |
  • @firebase/vertexai

    | Type | Base (ea1f913) | Merge (2a36e4a) | Diff |
    | --- | --- | --- | --- |
    | browser | 34.7 kB | 35.9 kB | +1.18 kB (+3.4%) |
    | main | 35.7 kB | 36.9 kB | +1.21 kB (+3.4%) |
    | module | 34.7 kB | 35.9 kB | +1.18 kB (+3.4%) |
  • bundle

    39 size changes

    | Type | Base (ea1f913) | Merge (2a36e4a) | Diff |
    | --- | --- | --- | --- |
    | auth (Anonymous) | 77.7 kB | 77.8 kB | +132 B (+0.2%) |
    | auth (EmailAndPassword) | 87.8 kB | 87.9 kB | +131 B (+0.1%) |
    | auth (GoogleFBTwitterGitHubPopup) | 105 kB | 105 kB | +249 B (+0.2%) |
    | auth (GooglePopup) | 102 kB | 102 kB | +131 B (+0.1%) |
    | auth (GoogleRedirect) | 102 kB | 102 kB | +131 B (+0.1%) |
    | auth (Phone) | 95.2 kB | 95.3 kB | +132 B (+0.1%) |
    | database (Append to a list of data) | 150 kB | 150 kB | +104 B (+0.1%) |
    | database (Filtering data) | 149 kB | 149 kB | +104 B (+0.1%) |
    | database (Listen for child events) | 165 kB | 165 kB | +104 B (+0.1%) |
    | database (Listen for value events + Detach listeners) | 165 kB | 165 kB | +104 B (+0.1%) |
    | database (Listen for value events) | 165 kB | 165 kB | +104 B (+0.1%) |
    | database (Read data once) | 165 kB | 165 kB | +104 B (+0.1%) |
    | database (Save data as transactions) | 167 kB | 167 kB | +104 B (+0.1%) |
    | database (Sort data) | 150 kB | 151 kB | +104 B (+0.1%) |
    | database (Write data) | 149 kB | 149 kB | +104 B (+0.1%) |
    | firestore (CSI Auto Indexing Disable and Delete) | 274 kB | 275 kB | +229 B (+0.1%) |
    | firestore (CSI Auto Indexing Enable) | 274 kB | 275 kB | +229 B (+0.1%) |
    | firestore (Persistence) | 306 kB | 306 kB | +229 B (+0.1%) |
    | firestore (Query Cursors) | 251 kB | 252 kB | +231 B (+0.1%) |
    | firestore (Query) | 249 kB | 249 kB | +231 B (+0.1%) |
    | firestore (Read data once) | 237 kB | 237 kB | +231 B (+0.1%) |
    | firestore (Read Write w Persistence) | 330 kB | 331 kB | +231 B (+0.1%) |
    | firestore (Realtime updates) | 239 kB | 239 kB | +231 B (+0.1%) |
    | firestore (Transaction) | 216 kB | 216 kB | +231 B (+0.1%) |
    | firestore (Write data) | 216 kB | 216 kB | +231 B (+0.1%) |
    | firestore-lite (Query Cursors) | 105 kB | 105 kB | +270 B (+0.3%) |
    | firestore-lite (Query) | 101 kB | 101 kB | +270 B (+0.3%) |
    | firestore-lite (Read data once) | 76.0 kB | 76.3 kB | +270 B (+0.4%) |
    | firestore-lite (Transaction) | 101 kB | 102 kB | +270 B (+0.3%) |
    | firestore-lite (Write data) | 85.6 kB | 85.9 kB | +270 B (+0.3%) |
    | functions (call) | 34.9 kB | 35.0 kB | +113 B (+0.3%) |
    | storage (getBytes) | 42.5 kB | 42.8 kB | +302 B (+0.7%) |
    | storage (getDownloadURL) | 44.6 kB | 44.9 kB | +302 B (+0.7%) |
    | storage (getMetadata) | 44.0 kB | 44.3 kB | +302 B (+0.7%) |
    | storage (list + listAll) | 43.5 kB | 43.8 kB | +302 B (+0.7%) |
    | storage (updateMetadata) | 44.3 kB | 44.6 kB | +302 B (+0.7%) |
    | storage (uploadBytes) | 49.2 kB | 49.5 kB | +302 B (+0.6%) |
    | storage (uploadBytesResumable) | 59.1 kB | 59.4 kB | +302 B (+0.5%) |
    | storage (uploadString) | 49.4 kB | 49.7 kB | +302 B (+0.6%) |

  • firebase

    16 size changes

    | Type | Base (ea1f913) | Merge (2a36e4a) | Diff |
    | --- | --- | --- | --- |
    | firebase-auth-compat.js | 141 kB | 141 kB | +207 B (+0.1%) |
    | firebase-auth-cordova.js | 138 kB | 138 kB | +284 B (+0.2%) |
    | firebase-auth-web-extension.js | 120 kB | 121 kB | +284 B (+0.2%) |
    | firebase-auth.js | 158 kB | 158 kB | +284 B (+0.2%) |
    | firebase-compat.js | 797 kB | 797 kB | +546 B (+0.1%) |
    | firebase-data-connect.js | 17.9 kB | 18.2 kB | +302 B (+1.7%) |
    | firebase-database-compat.js | 164 kB | 164 kB | +93 B (+0.1%) |
    | firebase-database.js | 187 kB | 187 kB | +123 B (+0.1%) |
    | firebase-firestore-compat.js | 342 kB | 342 kB | +223 B (+0.1%) |
    | firebase-firestore-lite.js | 132 kB | 133 kB | +300 B (+0.2%) |
    | firebase-firestore.js | 443 kB | 443 kB | +326 B (+0.1%) |
    | firebase-functions-compat.js | 10.5 kB | 10.5 kB | +89 B (+0.9%) |
    | firebase-functions.js | 14.9 kB | 15.0 kB | +126 B (+0.8%) |
    | firebase-storage-compat.js | 39.8 kB | 40.1 kB | +279 B (+0.7%) |
    | firebase-storage.js | 46.4 kB | 46.7 kB | +310 B (+0.7%) |
    | firebase-vertexai.js | 28.3 kB | 29.2 kB | +955 B (+3.4%) |

Test Logs

  1. https://storage.googleapis.com/firebase-sdk-metric-reports/dIhHPMPItz.html

google-oss-bot commented Apr 10, 2025

Size Analysis Report 1

This report is too large (443,341 characters) to be displayed here in a GitHub comment. Please use the below link to see the full report on Google Cloud Storage.

Test Logs

  1. https://storage.googleapis.com/firebase-sdk-metric-reports/uc142TVveY.html

dlarocque force-pushed the dl/gemini-image-out branch 2 times, most recently from 9a05c5b to 4f7f1ec, on April 16, 2025 13:25
dlarocque force-pushed the dl/gemini-image-out branch from 4f7f1ec to 1d58a06 on April 30, 2025 14:38
dlarocque changed the title from "[WIP] feat(vertexai): Gemini multimodal output" to "feat(vertexai): Gemini multimodal output" on Apr 30, 2025
dlarocque requested a review from hsubox76 on April 30, 2025 14:57
dlarocque marked this pull request as ready for review on April 30, 2025 14:57
dlarocque requested review from a team as code owners on April 30, 2025 14:57

github-actions bot commented Apr 30, 2025

Changeset File Check ✅

  • No modified packages are missing from the changeset file.
  • No changeset formatting errors detected.

*
* @beta
*/
export const ResponseModality = {

This is the code object we agreed we should export instead of TS enums, so I get that, but we've had build issues in the past mixing JS code in types files, so we should probably put these in a separate file. It looks like it's not causing build issues now, so maybe we can move it along with the others whenever we convert all our enums to JS objects.
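
For context, a rough sketch of the const-object pattern this comment refers to. The `TEXT` member, the derived type alias, and the `isResponseModality` helper are assumptions for illustration; only `ResponseModality.IMAGE` is named in this PR's docs, and a library would also `export` these declarations.

```typescript
// A plain JS object frozen at the type level with `as const`,
// used in place of a TypeScript `enum`.
// NOTE: the TEXT member is an assumption of this sketch.
const ResponseModality = {
  TEXT: 'TEXT',
  IMAGE: 'IMAGE'
} as const;

// Type with the same name as the value: the union of the object's values,
// i.e. 'TEXT' | 'IMAGE'. Types and values live in separate namespaces.
type ResponseModality =
  (typeof ResponseModality)[keyof typeof ResponseModality];

// Hypothetical runtime check, possible because the object exists at runtime.
function isResponseModality(value: string): value is ResponseModality {
  return value === ResponseModality.TEXT || value === ResponseModality.IMAGE;
}

// Call sites read the same as with an enum.
const requested: ResponseModality[] = [
  ResponseModality.TEXT,
  ResponseModality.IMAGE
];
console.log(requested, isResponseModality('AUDIO'));
```

Because the object is runtime code rather than a pure type, it is emitted into the bundle, which is why keeping it out of types-only files is a reasonable precaution.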

dlarocque requested a review from rachelsaunders on May 2, 2025 20:41
dlarocque requested a review from rachelsaunders on May 5, 2025 14:26

Generation modalities to be returned in generation responses.

- Multimodal response generation is only supported in some Gemini models and versions; see [model versions](https://firebase.google.com/docs/vertex-ai/models)<!-- -->. - Only image generation (`ResponseModality.IMAGE`<!-- -->) is supported.

Suggested change
- Multimodal response generation is only supported in some Gemini models and versions; see [model versions](https://firebase.google.com/docs/vertex-ai/models)<!-- -->. - Only image generation (`ResponseModality.IMAGE`<!-- -->) is supported.
- Multimodal response generation is only supported by some Gemini models and versions; see [model versions](https://firebase.google.com/docs/vertex-ai/models)<!-- -->. - Only image generation (`ResponseModality.IMAGE`<!-- -->) is supported.

* Generation modalities to be returned in generation responses.
*
* @remarks
* - Multimodal response generation is only supported in some Gemini models and versions; see {@link https://firebase.google.com/docs/vertex-ai/models | model versions}.

Suggested change
* - Multimodal response generation is only supported in some Gemini models and versions; see {@link https://firebase.google.com/docs/vertex-ai/models | model versions}.
* - Multimodal response generation is only supported by some Gemini models and versions; see {@link https://firebase.google.com/docs/vertex-ai/models | model versions}.
