assets/large_language_models/rag/components/update_pinecone_index/spec.yaml (42 lines of code) (raw):

$schema: https://azuremlschemas.azureedge.net/latest/commandComponent.schema.json type: command tags: Preview: "" version: 0.0.45 name: llm_rag_update_pinecone_index display_name: LLM - Update Pinecone Index is_deterministic: true description: | Uploads `embeddings` into Pinecone index specified in `pinecone_config`. The Index will be created if it doesn't exist. Each record in the Index will have the following metadata populated: - "id", String - "content", String - "url", String - "filepath", String - "title", String - "metadata_json_string", String "metadata_json_string" contains all metadata for a document/record serialized as a JSON string. inputs: embeddings: type: uri_folder mode: direct description: "Embeddings output produced from parallel_create_embeddings." pinecone_config: type: string description: 'JSON string containing the Pinecone index configuration. e.g. {"index_name": "my-index"}' connection_id: type: string optional: true description: "The id of the connection to the Pinecone project where the index lives." outputs: index: type: uri_folder description: "Uri folder containing the MLIndex yaml describing the newly created/updated Pinecone index." environment: azureml:llm-rag-embeddings@latest code: '../src/' command: >- python -m azureml.rag.tasks.update_pinecone --embeddings '${{inputs.embeddings}}' --pinecone_config '${{inputs.pinecone_config}}' --output ${{outputs.index}} $[[--connection_id '${{inputs.connection_id}}']]