We start by loading the JARs used in this sample from the Maven repository.  Note that the artifact we load is named "contentwarehouse", which is simply another name for Document AI Warehouse.

In [1]:
%%loadFromPOM
<dependency>
  <groupId>com.google.cloud</groupId>
  <artifactId>google-cloud-contentwarehouse</artifactId>
  <version>0.3.0</version>
</dependency>

Next we import the classes that we are going to use.

In [41]:
import com.google.cloud.contentwarehouse.v1.CreateDocumentRequest;
import com.google.cloud.contentwarehouse.v1.CreateDocumentResponse;
import com.google.cloud.contentwarehouse.v1.DateTimeTypeOptions;
import com.google.cloud.contentwarehouse.v1.DeleteDocumentRequest;
import com.google.cloud.contentwarehouse.v1.Document;
import com.google.cloud.contentwarehouse.v1.DocumentQuery;
import com.google.cloud.contentwarehouse.v1.DocumentSchema;
import com.google.cloud.contentwarehouse.v1.DocumentSchemaServiceClient;
import com.google.cloud.contentwarehouse.v1.DocumentServiceClient;
import com.google.cloud.contentwarehouse.v1.FloatTypeOptions;
import com.google.cloud.contentwarehouse.v1.LocationName;
import com.google.cloud.contentwarehouse.v1.Property;
import com.google.cloud.contentwarehouse.v1.PropertyDefinition;
import com.google.cloud.contentwarehouse.v1.RawDocumentFileType;
import com.google.cloud.contentwarehouse.v1.RequestMetadata;
import com.google.cloud.contentwarehouse.v1.SearchDocumentsRequest;
import com.google.cloud.contentwarehouse.v1.SearchDocumentsResponse;
import com.google.cloud.contentwarehouse.v1.TextArray;
import com.google.cloud.contentwarehouse.v1.TextTypeOptions;
import com.google.cloud.contentwarehouse.v1.UserInfo;

import com.google.cloud.documentai.v1.DocumentProcessorServiceClient;
import com.google.cloud.documentai.v1.ProcessRequest;
import com.google.cloud.documentai.v1.ProcessResponse;
import com.google.cloud.documentai.v1.RawDocument;

import com.google.protobuf.ByteString;

### Change Here
In the following cell, be sure to change the values to reflect your own environment.  In particular, you
must supply your own value for `PROJECT_NUMBER`.

In [25]:
// Change the following variables
final String PROJECT_NUMBER = "41208676560";
final String LOCATION = "us";
final String USERID = "user:kolban@kolban.altostrat.com";

// End of change area ...
final RequestMetadata requestMetadata = RequestMetadata.newBuilder()
  .setUserInfo(UserInfo.newBuilder()
    .setId(USERID)
    .build())
  .build();
final LocationName parent = LocationName.of(PROJECT_NUMBER, LOCATION);
DocumentSchemaServiceClient documentSchemaServiceClient = DocumentSchemaServiceClient.create();
DocumentServiceClient documentServiceClient = DocumentServiceClient.create();

## Create Schema
In this example we will be creating a new schema.  While it looks like a large amount of code, don't let that fool you.  A schema can have zero or more properties and in this example we are setting quite a few.  As such, most of the code is merely repetitions of `addPropertyDefinitions` where we add new properties to the description of the schema we wish to create.

At the highest level, our fragment populates a `DocumentSchema` object that describes what we want the resulting schema to contain.  Next we invoke the client method `createDocumentSchema`, which takes our schema description as input and creates a new schema from it.  On completion, the new schema has been created and assigned a unique name (its identity).  The value of that name is then logged.

In [17]:
public void createSchema() {
  DocumentSchema documentSchema = DocumentSchema.newBuilder()
    .setDisplayName("Invoice")
    .setDescription("Invoice Schema")
    .setDocumentIsFolder(false)
    .addPropertyDefinitions(PropertyDefinition.newBuilder()
      .setName("payee")
      .setDisplayName("Payee")
      .setIsFilterable(true).setIsSearchable(true).setIsMetadata(true).setIsRequired(true)
      .setTextTypeOptions(TextTypeOptions.newBuilder().build())
      .build())
    .addPropertyDefinitions(PropertyDefinition.newBuilder()
      .setName("payer")
      .setDisplayName("Payer")
      .setIsFilterable(true).setIsSearchable(false).setIsMetadata(true).setIsRequired(true)
      .setTextTypeOptions(TextTypeOptions.newBuilder().build())
      .build())
    .addPropertyDefinitions(PropertyDefinition.newBuilder()
      .setName("amount")
      .setDisplayName("Amount")
      .setIsFilterable(true).setIsSearchable(false).setIsMetadata(true).setIsRequired(false)
      .setFloatTypeOptions(FloatTypeOptions.newBuilder().build())
      .build())
    .addPropertyDefinitions(PropertyDefinition.newBuilder()
      .setName("id")
      .setDisplayName("Invoice ID")
      .setIsFilterable(true).setIsSearchable(false).setIsMetadata(true).setIsRequired(false)
      .setTextTypeOptions(TextTypeOptions.newBuilder().build())
      .build())
    .addPropertyDefinitions(PropertyDefinition.newBuilder()
      .setName("date")
      .setDisplayName("Date")
      .setIsFilterable(true).setIsSearchable(false).setIsMetadata(true).setIsRequired(false)
      .setDateTimeTypeOptions(DateTimeTypeOptions.newBuilder().build())
      .build())
    .addPropertyDefinitions(PropertyDefinition.newBuilder()
      .setName("notes")
      .setDisplayName("Notes")
      .setIsFilterable(true).setIsSearchable(false).setIsMetadata(true).setIsRequired(false)
      .setTextTypeOptions(TextTypeOptions.newBuilder().build())
      .build())
    .build();
  
  DocumentSchema newDocumentSchema = documentSchemaServiceClient.createDocumentSchema(parent, documentSchema);
  
  System.out.println("name");
  System.out.println("-------------------------------------------------------------------------");
  System.out.println(newDocumentSchema.getName());
} // createSchema

createSchema();

name
-------------------------------------------------------------------------
projects/41208676560/locations/us/documentSchemas/3loccu79n5t88
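
The name printed above is a slash-separated path of alternating collection and identifier segments.  As a rough illustration of that structure, the following sketch splits such a name into its parts using only the JDK.  The class `ResourceNames` and its `parse` method are hypothetical helpers invented for this example; they are not part of the Document AI Warehouse client library.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of the client library.
class ResourceNames {
  // Splits "collection/id/collection/id/..." into an ordered collection -> id map.
  static Map<String, String> parse(String name) {
    // The -1 limit keeps trailing empty segments so that "a/b/" is rejected.
    String[] parts = name.split("/", -1);
    if (parts.length % 2 != 0) {
      throw new IllegalArgumentException("Expected collection/id pairs: " + name);
    }
    Map<String, String> result = new LinkedHashMap<>();
    for (int i = 0; i < parts.length; i += 2) {
      if (parts[i].isEmpty() || parts[i + 1].isEmpty()) {
        throw new IllegalArgumentException("Empty segment in: " + name);
      }
      result.put(parts[i], parts[i + 1]);
    }
    return result;
  }

  public static void main(String[] args) {
    Map<String, String> parsed =
      parse("projects/41208676560/locations/us/documentSchemas/3loccu79n5t88");
    System.out.println(parsed.get("documentSchemas")); // prints 3loccu79n5t88
  }
}
```

The last segment is the schema's identifier; the full name is what later cells pass as `schemaName`.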


## List Schemas
Having just created a new schema, we should be able to list all our schemas and see the one we just created.  There isn't much to explain here.  We invoke the `listDocumentSchemas` method of the client, which returns an iterable over all the schemas, and we log each one.

In [18]:
public void listSchema() {
  DocumentSchemaServiceClient.ListDocumentSchemasPagedResponse response
    = documentSchemaServiceClient.listDocumentSchemas(parent);
  System.out.println("display name    name");
  System.out.println("--------------- ---------------------------------------------------------------------");
  for (DocumentSchema currentSchema: response.iterateAll()) {
    System.out.printf("%-15.15s %s\n", currentSchema.getDisplayName(), currentSchema.getName());
  }
} // listSchema

listSchema();

display name    name
--------------- ---------------------------------------------------------------------
Invoice         projects/41208676560/locations/us/documentSchemas/06l7hah2jjqqo
Invoice         projects/41208676560/locations/us/documentSchemas/13smg321hoo1o
S1              projects/41208676560/locations/us/documentSchemas/1cp01ej1hk798
Invoice         projects/41208676560/locations/us/documentSchemas/3loccu79n5t88
Invoice         projects/41208676560/locations/us/documentSchemas/3qth4jbn7n4jo
Invoice         projects/41208676560/locations/us/documentSchemas/42gj9il1an3d8
Invoice         projects/41208676560/locations/us/documentSchemas/4449qkpphffdo
Invoice         projects/41208676560/locations/us/documentSchemas/6gap487vkc1a8
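
The `%-15.15s` conversion in the listing code is what keeps the two columns aligned: the `-` flag left-justifies, the width of 15 pads short values, and the precision of 15 truncates long ones.  The sketch below isolates that behavior with plain `String.format`.  The class name `ColumnFormat` and the sample strings are made up for illustration.

```java
// Hypothetical helper that isolates the column formatting used in the listing.
class ColumnFormat {
  // Left-justifies the value in a 15-character column and truncates anything longer.
  static String column(String value) {
    return String.format("%-15.15s", value);
  }

  public static void main(String[] args) {
    // Short value: "Invoice" followed by padding up to 15 characters.
    System.out.println("[" + column("Invoice") + "]");
    // Long value: cut to its first 15 characters, prints [A very long dis]
    System.out.println("[" + column("A very long display name") + "]");
  }
}
```

Note that several schemas in the output share the display name "Invoice"; only the `name` column distinguishes them.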


## Create a document
In this fragment we ingest a document into Document AI Warehouse.  We take the content of the document from a local file.  We must also specify the schema we want to associate with our document.

In [33]:
public void createDocument(String schemaName, ByteString fileData) {
  Document document = Document.newBuilder()
    .setDisplayName("Invoice 1")
    .setTitle("My Invoice 1")
    .setDocumentSchemaName(schemaName)
    .setInlineRawDocument(fileData)
    .setRawDocumentFileType(RawDocumentFileType.RAW_DOCUMENT_FILE_TYPE_PDF)
    .setTextExtractionDisabled(true)
    .addProperties(Property.newBuilder()
      .setName("payee")
      .setTextValues(TextArray.newBuilder().addValues("Developer Company").build())
      .build())
    .addProperties(Property.newBuilder()
      .setName("payer")
      .setTextValues(TextArray.newBuilder().addValues("Buyer Company").build())
      .build())
    .build();

  CreateDocumentRequest createDocumentRequest = CreateDocumentRequest.newBuilder()
    .setDocument(document)
    .setParent(parent.toString())
    .setRequestMetadata(requestMetadata)
    .build();

  CreateDocumentResponse createDocumentResponse = documentServiceClient.createDocument(createDocumentRequest);

  System.out.println("name");
  System.out.println("-------------------------------------------------------------------------");
  System.out.println(createDocumentResponse.getDocument().getName());
} // createDocument

String schemaName = "projects/41208676560/locations/us/documentSchemas/4449qkpphffdo";
String fileName = "data/SampleInvoice1.pdf";

ByteString fileData = ByteString.readFrom(new FileInputStream(fileName));
createDocument(schemaName, fileData);

name
-------------------------------------------------------------------------
projects/41208676560/locations/us/documents/6fd1gpd2f51e0
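
Because the request declares the content as `RAW_DOCUMENT_FILE_TYPE_PDF`, it can be useful to confirm that the local file really looks like a PDF before uploading it.  The following is a minimal sketch of such a check using only the JDK, assuming the file begins with the usual `%PDF-` header.  `PdfCheck`, `looksLikePdf`, and `readPdf` are hypothetical names introduced here, and the check is a quick sanity test rather than a full validation.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical pre-upload check, not part of any Google client library.
class PdfCheck {
  private static final byte[] MAGIC = "%PDF-".getBytes(StandardCharsets.US_ASCII);

  // Returns true when the data begins with the "%PDF-" header bytes.
  static boolean looksLikePdf(byte[] data) {
    if (data.length < MAGIC.length) {
      return false;
    }
    for (int i = 0; i < MAGIC.length; i++) {
      if (data[i] != MAGIC[i]) {
        return false;
      }
    }
    return true;
  }

  // Reads the whole file and rejects it if the header check fails.
  static byte[] readPdf(Path path) throws IOException {
    byte[] data = Files.readAllBytes(path);
    if (!looksLikePdf(data)) {
      throw new IllegalArgumentException(path + " does not start with the %PDF- header");
    }
    return data;
  }

  public static void main(String[] args) {
    byte[] sample = "%PDF-1.7\n".getBytes(StandardCharsets.US_ASCII);
    System.out.println(looksLikePdf(sample)); // prints true
    System.out.println(looksLikePdf("plain text".getBytes(StandardCharsets.US_ASCII))); // prints false
  }
}
```

The bytes returned by `readPdf` could then be wrapped with `ByteString.copyFrom(...)` before being passed to `createDocument`.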


## Document Deletion
Having looked at how we can create a document, we now look at how to delete a document.

In [34]:
public void deleteDocument(String documentName) {
  DeleteDocumentRequest deleteDocumentRequest = DeleteDocumentRequest.newBuilder()
    .setName(documentName)
    .setRequestMetadata(requestMetadata)
    .build();
  documentServiceClient.deleteDocument(deleteDocumentRequest);
} // deleteDocument

String documentName = "projects/41208676560/locations/us/documents/6fd1gpd2f51e0";

deleteDocument(documentName);

## Document Search
One of the most important features of Document AI Warehouse is the ability to search for documents.  In this fragment we perform a search and show the documents that matched.  The result of a search is an object that contains an iterable that walks us over the matching documents.

In [37]:
public void searchDocuments(String query) {
  DocumentQuery documentQuery = DocumentQuery.newBuilder()
    .setQuery(query)
    .build();

  SearchDocumentsRequest searchDocumentsRequest = SearchDocumentsRequest.newBuilder()
    .setDocumentQuery(documentQuery)
    .setParent(parent.toString())
    .setRequestMetadata(requestMetadata)
    .build();

  DocumentServiceClient.SearchDocumentsPagedResponse response
    = documentServiceClient.searchDocuments(searchDocumentsRequest);
    
  System.out.println("display name    name");
  System.out.println("--------------- ------------------------------------------------------------------------");
  for (SearchDocumentsResponse.MatchingDocument matchingDocument: response.iterateAll()) {
    System.out.printf("%-15.15s %s\n",
      matchingDocument.getDocument().getDisplayName(), matchingDocument.getDocument().getName());
  }
} // searchDocuments
searchDocuments("12-345678");

display name    name
--------------- ------------------------------------------------------------------------
Invoice GCS 1   projects/41208676560/locations/us/documents/5uve1gj0vtrk8
Invoice GCS 1   projects/41208676560/locations/us/documents/4tba7qmdk0mag
Invoice GCS 1   projects/41208676560/locations/us/documents/4i3tjqjqj95og
Invoice GCS 1   projects/41208676560/locations/us/documents/3j8fs62gl2d0o
Invoice GCS 1   projects/41208676560/locations/us/documents/2rjtn7u6sp6s8
Invoice GCS 1   projects/41208676560/locations/us/documents/2nc617rhono70
Invoice GCS 1   projects/41208676560/locations/us/documents/1oljii2hnfdio
Invoice GCS 1   projects/41208676560/locations/us/documents/145ln3mgq2h18
Invoice GCS 1   projects/41208676560/locations/us/documents/0sjgojp1jomp0
Invoice GCS 1   projects/41208676560/locations/us/documents/0enlnenumms4o


## Document Creation with Doc AI
Next our example gets a little richer. This time we invoke Doc AI to process (parse) a document and pass the Doc AI Document results returned into Document AI Warehouse to store both the file and the parsed data.

In [44]:
public void createDocAIDocument(
  String schemaName,
  com.google.cloud.documentai.v1.Document docAiDocument,
  ByteString fileData) {

  Document document = Document.newBuilder()
    .setDisplayName("Invoice 1")
    .setTitle("My Invoice 1")
    .setDocumentSchemaName(schemaName)
    .setCloudAiDocument(docAiDocument)
    .setInlineRawDocument(fileData)
    .setRawDocumentFileType(RawDocumentFileType.RAW_DOCUMENT_FILE_TYPE_PDF)
    .setTextExtractionDisabled(true)
    .addProperties(Property.newBuilder()
      .setName("payee")
      .setTextValues(TextArray.newBuilder().addValues("Developer Company").build())
      .build())
    .addProperties(Property.newBuilder()
      .setName("payer")
      .setTextValues(TextArray.newBuilder().addValues("Buyer Company").build())
      .build())
    .build();

  // The requestMetadata built in the setup cell is reused here.
  CreateDocumentRequest createDocumentRequest = CreateDocumentRequest.newBuilder()
    .setDocument(document)
    .setParent(parent.toString())
    .setRequestMetadata(requestMetadata)
    .build();

  CreateDocumentResponse createDocumentResponse = documentServiceClient.createDocument(createDocumentRequest);

  System.out.println("name");
  System.out.println("-------------------------------------------------------------------------");
  System.out.println(createDocumentResponse.getDocument().getName());

} // createDocAIDocument

public com.google.cloud.documentai.v1.Document processDocAI(String processorName, ByteString fileData) {
  // try-with-resources closes the Doc AI client once processing has finished.
  try (DocumentProcessorServiceClient documentProcessorServiceClient = DocumentProcessorServiceClient.create()) {
    RawDocument rawDocument = RawDocument.newBuilder()
      .setContent(fileData)
      .setMimeType("application/pdf")
      .build();
    ProcessRequest processRequest = ProcessRequest.newBuilder()
      .setName(processorName)
      .setRawDocument(rawDocument)
      .build();
    ProcessResponse response = documentProcessorServiceClient.processDocument(processRequest);
    return response.getDocument();
  } catch (Exception e) {
    // On failure the stack trace is printed and null is returned, so callers must check the result.
    e.printStackTrace();
    return null;
  }
} // processDocAI


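String processorName = "projects/41208676560/locations/us/processors/7bc4dc0dfcc7e040";
com.google.cloud.documentai.v1.Document docAiDocument = processDocAI(processorName, fileData);
// processDocAI returns null on failure, so only store the document when parsing succeeded.
if (docAiDocument != null) {
  createDocAIDocument(schemaName, docAiDocument, fileData);
}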

name
-------------------------------------------------------------------------
projects/41208676560/locations/us/documents/6s9fc7l3re52g
