speech-to-text/extension.yaml

name: speech-to-text
version: 0.1.6
specVersion: v1beta

displayName: Transcribe Speech to Text
description: Transcribes audio files in Cloud Storage to .txt files using Cloud Speech To Text.

icon: icon.png

tags: [ai, speech-to-text, speech-transcription, google-ai]

license: Apache-2.0 # The license you want for the extension

author:
  authorName: Google Cloud
  url: https://cloud.google.com/

sourceUrl: https://github.com/GoogleCloudPlatform/firebase-extensions/tree/main/speech-to-text
releaseNotesUrl: https://github.com/GoogleCloudPlatform/firebase-extensions/tree/main/speech-to-text/CHANGELOG.md

apis:
  - apiName: speech.googleapis.com
    reason: Used for transcribing the audio of sound files.

resources:
  - name: transcribeAudio
    type: firebaseextensions.v1beta.function
    description: >-
      Listens for new audio files uploaded to a specified Cloud Storage bucket,
      transcribes the speech in those files, then stores the transcription in
      storage, or in firestore, or in both.
    properties:
      location: ${param:LOCATION}
      runtime: nodejs20
      availableMemoryMb: 1024
      eventTrigger:
        eventType: google.storage.object.finalize
        resource: projects/_/buckets/${param:EXTENSION_BUCKET}

params:
  - param: LOCATION
    label: Cloud Functions location
    description: >-
      Where do you want to deploy the functions created for this extension? You
      usually want a location close to your database. Realtime Database
      instances are located in `us-central1`. For help selecting a location,
      refer to the [location selection
      guide](https://firebase.google.com/docs/functions/locations).
    type: select
    options:
      - label: Iowa (us-central1)
        value: us-central1
      - label: South Carolina (us-east1)
        value: us-east1
      - label: Northern Virginia (us-east4)
        value: us-east4
      - label: Belgium (europe-west1)
        value: europe-west1
      - label: London (europe-west2)
        value: europe-west2
      - label: Frankfurt (europe-west3)
        value: europe-west3
      - label: Hong Kong (asia-east2)
        value: asia-east2
      - label: Tokyo (asia-northeast1)
        value: asia-northeast1
    default: us-central1
    required: true
    immutable: true

  - param: EXTENSION_BUCKET
    label: Cloud Storage bucket for input and output
    description: >
      The Cloud Storage bucket that the extension should be listening to. Files
      uploaded to this bucket will be transcribed by the extension. If cloud
      storage output is enabled, transcriptions will be written to this bucket.
    type: selectResource
    resourceType: storage.googleapis.com/Bucket
    example: my-project-12345.appspot.com
    validationRegex: ^([0-9a-z_.-]*)$
    validationErrorMessage: Invalid storage bucket
    default: ${STORAGE_BUCKET}
    required: true

  - param: OUTPUT_STORAGE_PATH
    label: Storage path for transcriptions
    description: >
      The storage path in which to output transcriptions. If this is not set,
      the extension will output to the root of the bucket.
    example: transcriptions
    required: false

  - param: COLLECTION_PATH
    label: Firestore collection for storing transcribed audio
    description: >
      The firestore collection in which to output transcriptions. If this is
      not set, the extension will not output the data to Firestore.
    validationRegex: '^[^/]+(/[^/]+/[^/]+)*$'
    validationErrorMessage: Must be a valid Cloud Firestore Collection
    example: transcriptions
    required: false

  - param: LANGUAGE_CODE
    label: BCP-47 code of the transcription language
    description: >
      The BCP-47 code of the transcription language, as shown in the [Language
      support
      documentation](https://cloud.google.com/speech-to-text/docs/languages)
    example: e.g. en-US
    # TODO(reao): Here I assume that language codes need to end in two
    # uppercase characters -- is that correct? It is a pattern in the language
    # support documentation.
    validationRegex: '^([a-zA-Z-])*[A-Z][A-Z]$'
    validationErrorMessage: Must be a valid code from https://cloud.google.com/speech-to-text/docs/languages
    required: true

  - param: MODEL
    label: Language model used for transcription
    description: >
      Which kind of use-case should the speech-to-text transcription algorithm
      be honed for? For details, see [the model field in the
      documentation](https://cloud.google.com/speech-to-text/docs/reference/rest/v1/RecognitionConfig)
      If you're not sure, just use the default.
    example: default
    default: default
    required: true

  - param: ENABLE_AUTOMATIC_PUNCTUATION
    label: Enable automatic punctuation
    description: >
      Should the transcription algorithm attempt to add punctuation to the
      transcription? For details, see [the
      documentation](https://cloud.google.com/speech-to-text/docs/automatic-punctuation)
    type: select
    options:
      - label: Enabled
        value: true
      - label: Disabled
        value: false
    default: true

roles:
  - role: storage.objectAdmin
    reason: Allows the extension to write to your Cloud Storage.
  - role: datastore.user
    reason: Allows the extension to write to your Firestore Database instance.

events:
  - type: firebase.extensions.storage-transcribe-audio.v1.complete
    description: Occurs when audio transcription completes. The event will contain the full transcription.
  - type: firebase.extensions.storage-transcribe-audio.v1.fail
    description: Occurs when audio transcription fails. The event will contain information about the failure.
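
The three `validationRegex` patterns in the manifest above are easier to reason about when run against concrete values. The following is a minimal sketch, not part of the extension: it copies the patterns verbatim and checks them with Python's `re` module, on the assumption that these simple patterns behave the same way in whichever regex engine Firebase uses at install time. The `is_valid` helper and every sample value except the manifest's own `example` entries are hypothetical.

```python
import re

# Patterns copied verbatim from the validationRegex fields in the manifest.
VALIDATORS = {
    "EXTENSION_BUCKET": r"^([0-9a-z_.-]*)$",
    "COLLECTION_PATH": r"^[^/]+(/[^/]+/[^/]+)*$",
    "LANGUAGE_CODE": r"^([a-zA-Z-])*[A-Z][A-Z]$",
}


def is_valid(param: str, value: str) -> bool:
    """Return True if value matches the pattern declared for param.

    Hypothetical helper for illustration; Firebase performs its own
    validation when the extension is configured.
    """
    return re.search(VALIDATORS[param], value) is not None


# Sample values chosen only to illustrate what each pattern accepts.
samples = [
    ("EXTENSION_BUCKET", "my-project-12345.appspot.com"),  # manifest example
    ("EXTENSION_BUCKET", "My-Bucket"),                     # uppercase letters
    ("COLLECTION_PATH", "transcriptions"),                 # top-level collection
    ("COLLECTION_PATH", "users/alice/transcriptions"),     # subcollection
    ("COLLECTION_PATH", "users/alice"),                    # a document path
    ("LANGUAGE_CODE", "en-US"),
    ("LANGUAGE_CODE", "cmn-Hans-CN"),
    ("LANGUAGE_CODE", "en-us"),                            # lowercase region
]

for param, value in samples:
    print(f"{param:17} {value!r:32} {is_valid(param, value)}")
```

Under these patterns, a collection path must have an odd number of segments, so `users/alice` (a document path) is rejected, and a language code must end in two uppercase letters, which is the assumption the TODO comment in the manifest questions.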
