
{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Analyzing Artistic Styles with Multimodal Embeddings" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "*Authored by: [Jacob Marks](https://huggingface.co/jamarks)*" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "![Art Analysis Cover Image](https://huggingface.co/datasets/huggingface/cookbook-images/resolve/main/art_analysis_cover_image.jpg)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Visual data like images is incredibly information-rich, but the unstructured nature of that data makes it difficult to analyze. \n", "\n", "In this notebook, we'll explore how to use multimodal embeddings and computed attributes to analyze artistic styles in images. We'll use the [WikiArt dataset](https://huggingface.co/datasets/huggan/wikiart) from 🤗 Hub, which we will load into FiftyOne for data analysis and visualization. We'll dive into the data in a variety of ways:\n", "\n", "- **Image Similarity Search and Semantic Search**: We'll generate multimodal embeddings for the images in the dataset using a pre-trained [CLIP](https://huggingface.co/openai/clip-vit-base-patch32) model from 🤗 Transformers and index the data to allow for unstructured searches.\n", "\n", "- **Clustering and Visualization**: We'll cluster the images based on their artistic style using the embeddings and visualize the results using UMAP dimensionality reduction.\n", "\n", "- **Uniqueness Analysis**: We'll use our embeddings to assign a uniqueness score to each image based on how similar it is to other images in the dataset.\n", "\n", "- **Image Quality Analysis**: We'll compute image quality metrics like brightness, contrast, and saturation for each image and see how these metrics correlate with the artistic style of the images." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Let's get started! 🚀" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To run this notebook, you'll need to install the following libraries:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "!pip install -U transformers huggingface_hub fiftyone umap-learn" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To make downloads lightning-fast, install [HF Transfer](https://pypi.org/project/hf-transfer/):\n", "\n", "```bash\n", "pip install hf-transfer\n", "```\n", "\n", "And enable by setting the environment variable `HF_HUB_ENABLE_HF_TRANSFER`:\n", "\n", "```bash\n", "import os\n", "os.environ[\"HF_HUB_ENABLE_HF_TRANSFER\"] = \"1\"\n", "```" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "<div class=\"alert alert-block alert-info\">\n", "<b>Note:</b> This notebook was tested with <code>transformers==4.40.0</code>, <code>huggingface_hub==0.22.2</code>, and <code>fiftyone==0.23.8</code>.\n", "</div>" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now let's import the modules that we'll need for this notebook:" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "import fiftyone as fo # base library and app\n", "import fiftyone.zoo as foz # zoo datasets and models\n", "import fiftyone.brain as fob # ML routines\n", "from fiftyone import ViewField as F # for defining custom views\n", "import fiftyone.utils.huggingface as fouh # for loading datasets from Hugging Face" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We'll start by loading the WikiArt dataset from 🤗 Hub into FiftyOne. 
This dataset can also be loaded through Hugging Face's `datasets` library, but we'll use [FiftyOne's 🤗 Hub integration](https://docs.voxel51.com/integrations/huggingface.html#huggingface-hub) to get the data directly from the Datasets server. To make the computations fast, we'll just download the first $1,000$ samples." ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "dataset = fouh.load_from_hub(\n", " \"huggan/wikiart\", ## repo_id\n", " format=\"parquet\", ## for Parquet format\n", " classification_fields=[\"artist\", \"style\", \"genre\"], # columns to store as classification fields\n", " max_samples=1000, # number of samples to load\n", " name=\"wikiart\", # name of the dataset in FiftyOne\n", ")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Print out a summary of the dataset to see what it contains:" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Name: wikiart\n", "Media type: image\n", "Num samples: 1000\n", "Persistent: False\n", "Tags: []\n", "Sample fields:\n", " id: fiftyone.core.fields.ObjectIdField\n", " filepath: fiftyone.core.fields.StringField\n", " tags: fiftyone.core.fields.ListField(fiftyone.core.fields.StringField)\n", " metadata: fiftyone.core.fields.EmbeddedDocumentField(fiftyone.core.metadata.ImageMetadata)\n", " artist: fiftyone.core.fields.EmbeddedDocumentField(fiftyone.core.labels.Classification)\n", " style: fiftyone.core.fields.EmbeddedDocumentField(fiftyone.core.labels.Classification)\n", " genre: fiftyone.core.fields.EmbeddedDocumentField(fiftyone.core.labels.Classification)\n", " row_idx: fiftyone.core.fields.IntField\n" ] } ], "source": [ "print(dataset)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Visualize the dataset in the [FiftyOne App](https://docs.voxel51.com/user_guide/app.html):" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "session = fo.launch_app(dataset)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "![WikiArt Dataset](https://huggingface.co/datasets/huggingface/cookbook-images/resolve/main/art_analysis_wikiart_dataset.jpg)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's list out the names of the artists whose styles we'll be analyzing:" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "['Unknown Artist', 'albrecht-durer', 'boris-kustodiev', 'camille-pissarro', 'childe-hassam', 'claude-monet', 'edgar-degas', 'eugene-boudin', 'gustave-dore', 'ilya-repin', 'ivan-aivazovsky', 'ivan-shishkin', 'john-singer-sargent', 'marc-chagall', 'martiros-saryan', 'nicholas-roerich', 'pablo-picasso', 'paul-cezanne', 'pierre-auguste-renoir', 'pyotr-konchalovsky', 'raphael-kirchner', 'rembrandt', 'salvador-dali', 'vincent-van-gogh']\n" ] } ], "source": [ "artists = dataset.distinct(\"artist.label\")\n", "print(artists)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Finding Similar Artwork" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "When you find a piece of art that you like, it's natural to want to find similar pieces. We can do this with vector embeddings! 
What's more, by using multimodal embeddings, we will unlock the ability to find paintings that closely resemble a given text query, which could be a description of a painting or even a poem.\n", "\n", "Let's generate multimodal embeddings for the images using a pre-trained CLIP Vision Transformer (ViT) model from 🤗 Transformers. Running `compute_similarity()` from the [FiftyOne Brain](https://docs.voxel51.com/user_guide/brain.html) will compute these embeddings and use them to generate a similarity index on the dataset." ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Computing embeddings...\n", " 100% |███████████████| 1000/1000 [5.0m elapsed, 0s remaining, 3.3 samples/s] \n" ] }, { "data": { "text/plain": [ "<fiftyone.brain.internal.core.sklearn.SklearnSimilarityIndex at 0x2ad67ecd0>" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "fob.compute_similarity(\n", " dataset, \n", " model=\"zero-shot-classification-transformer-torch\", ## type of model to load from model zoo\n", " name_or_path=\"openai/clip-vit-base-patch32\", ## repo_id of checkpoint\n", " embeddings=\"clip_embeddings\", ## name of the field to store embeddings\n", " brain_key=\"clip_sim\", ## key to store similarity index info\n", " batch_size=32, ## batch size for inference\n", " )" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "<div style=\"padding: 10px; border-left: 5px solid #0078d4; font-family: Arial, sans-serif; margin: 10px 0;\">\n", "\n", "Alternatively, you could load the model directly from the 🤗 Transformers library and pass the model in directly:\n", "\n", "```python\n", "from transformers import CLIPModel\n", "model = CLIPModel.from_pretrained(\"openai/clip-vit-base-patch32\")\n", "fob.compute_similarity(\n", " dataset, \n", " model=model,\n", " embeddings=\"clip_embeddings\", ## name of the field to store embeddings\n", " brain_key=\"clip_sim\" ## key to store similarity index info\n", ")\n", "```\n", "\n", "For a comprehensive guide to this and more, check out <a href=\"https://docs.voxel51.com/integrations/huggingface.html#transformers-library\">FiftyOne's 🤗 Transformers integration</a>.\n", "</div>\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Refresh the FiftyOne App, select the checkbox for an image in the sample grid, and click the photo icon to see the most similar images in the dataset. On the backend, clicking this button triggers a query to the similarity index to find the most similar images to the selected image, based on the pre-computed embeddings, and displays them in the App." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "![Image Similarity Search](https://huggingface.co/datasets/huggingface/cookbook-images/resolve/main/art_analysis_image_search.gif)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can use this to see what art pieces are most similar to a given art piece. This can be useful for finding similar art pieces (to recommend to users or add to a collection) or getting inspiration for a new piece." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "But there's more! Because CLIP is multimodal, we can also use it to perform semantic searches. This means we can search for images based on text queries. For example, we can search for \"pastel trees\" and see all the images in the dataset that are similar to that query. 
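You can also run the same kind of text query programmatically against the similarity index we just built. Here is a minimal sketch (the `k=25` view size is an arbitrary choice for illustration):" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Sort the dataset by similarity to a text prompt, using the clip_sim index\n", "pastel_view = dataset.sort_by_similarity(\n", "    \"pastel trees\",  # text query, embedded with CLIP's text encoder\n", "    k=25,  # keep the 25 most similar samples\n", "    brain_key=\"clip_sim\",  # similarity index computed above\n", ")\n", "session.view = pastel_view" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The same search can also be run interactively. 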
To do this, click on the search icon in the FiftyOne App and enter a text query:" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "![Semantic Search](https://huggingface.co/datasets/huggingface/cookbook-images/resolve/main/art_analysis_semantic_search.gif)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Behind the scenes, the text is tokenized, embedded with CLIP's text encoder, and then used to query the similarity index to find the most similar images in the dataset. This is a powerful way to search for images based on text queries and can be useful for finding images that match a particular theme or style. And this is not limited to CLIP; you can use any CLIP-like model from 🤗 Transformers that can generate embeddings for images and text!" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "<div class=\"alert alert-block alert-info\">\n", "💡 For efficient vector search and indexing over large datasets, FiftyOne has native <a href=\"https://voxel51.com/vector-search\">integrations with open source vector databases</a>.\n", "</div>\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Uncovering Artistic Motifs with Clustering and Visualization" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "By performing similarity and semantic searches, we can begin to interact with the data more effectively. But we can also take this a step further and add some unsupervised learning into the mix. This will help us identify artistic patterns in the WikiArt dataset, from stylistic, to topical, and even motifs that are hard to put into words. \n", "\n", "We will do this in two ways:\n", "\n", "1. **Dimensionality Reduction**: We'll use UMAP to reduce the dimensionality of the embeddings to 2D and visualize the data in a scatter plot. This will allow us to see how the images cluster based on their style, genre, and artist.\n", "2. **Clustering**: We'll use K-Means clustering to cluster the images based on their embeddings and see what groups emerge." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "For dimensionality reduction, we will run `compute_visualization()` from the FiftyOne Brain, passing in the previously computed embeddings. We specify `method=\"umap\"` to use UMAP for dimensionality reduction, but we could also use PCA or t-SNE:" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Generating visualization...\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "/opt/homebrew/Caskroom/miniforge/base/envs/fdev/lib/python3.9/site-packages/numba/cpython/hashing.py:482: UserWarning: FNV hashing is not implemented in Numba. See PEP 456 https://www.python.org/dev/peps/pep-0456/ for rationale over not using FNV. Numba will continue to work, but hashes for built in types will be computed using siphash24. This will permit e.g. 
dictionaries to continue to behave as expected, however anything relying on the value of the hash opposed to hash as a derived property is likely to not work as expected.\n", " warnings.warn(msg)\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "UMAP( verbose=True)\n", "Tue Apr 30 11:51:45 2024 Construct fuzzy simplicial set\n", "Tue Apr 30 11:51:46 2024 Finding Nearest Neighbors\n", "Tue Apr 30 11:51:47 2024 Finished Nearest Neighbor Search\n", "Tue Apr 30 11:51:48 2024 Construct embedding\n" ] }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "98dde3df324249df91f3336c913b409a", "version_major": 2, "version_minor": 0 }, "text/plain": [ "Epochs completed: 0%| 0/500 [00:00]" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "\tcompleted 0 / 500 epochs\n", "\tcompleted 50 / 500 epochs\n", "\tcompleted 100 / 500 epochs\n", "\tcompleted 150 / 500 epochs\n", "\tcompleted 200 / 500 epochs\n", "\tcompleted 250 / 500 epochs\n", "\tcompleted 300 / 500 epochs\n", "\tcompleted 350 / 500 epochs\n", "\tcompleted 400 / 500 epochs\n", "\tcompleted 450 / 500 epochs\n", "Tue Apr 30 11:51:49 2024 Finished embedding\n" ] }, { "data": { "text/plain": [ "<fiftyone.brain.visualization.VisualizationResults at 0x29f468760>" ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "fob.compute_visualization(dataset, embeddings=\"clip_embeddings\", method=\"umap\", brain_key=\"clip_vis\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now we can open a panel in the FiftyOne App, where we will see one 2D point for each image in the dataset. We can color the points by any field in the dataset, such as the artist or genre, to see how strongly these attributes are captured by our image features:" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "![UMAP Visualization](https://huggingface.co/datasets/huggingface/cookbook-images/resolve/main/art_analysis_visualize_embeddings.gif)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can also run clustering on the embeddings to group similar images together — perhaps the dominant features of these works of art are not captured by the existing labels, or maybe there are distinct sub-genres that we want to identify. To cluster our data, we will need to download the [FiftyOne Clustering Plugin](https://github.com/jacobmarks/clustering-plugin):" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "!fiftyone plugins download https://github.com/jacobmarks/clustering-plugin" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Refreshing the app again, we can then access the clustering functionality via an operator in the app. Hit the backtick key to open the operator list, type \"cluster\" and select the operator from the dropdown. This will open an interactive panel where we can specify the clustering algorithm, hyperparameters, and the field to cluster on. To keep it simple, we'll use K-Means clustering with $10$ clusters." 
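] }, { "cell_type": "markdown", "metadata": {}, "source": [ "If you prefer to stay in code, a rough equivalent of that K-Means step can be run directly with scikit-learn. This is a sketch, not the plugin's exact implementation, and the `kmeans_cluster` field name is just an illustrative choice:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import numpy as np\n", "from sklearn.cluster import KMeans\n", "\n", "# Pull the stored CLIP embeddings into a (num_samples, dim) array\n", "embeddings = np.array(dataset.values(\"clip_embeddings\"))\n", "\n", "# Cluster into 10 groups, mirroring the plugin settings above\n", "kmeans = KMeans(n_clusters=10, random_state=51)\n", "cluster_labels = kmeans.fit_predict(embeddings)\n", "\n", "# Store each sample's cluster assignment so we can color and filter by it in the App\n", "dataset.set_values(\"kmeans_cluster\", [str(label) for label in cluster_labels])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Either way, each sample ends up with a cluster label that we can explore alongside the artist, style, and genre fields."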
] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can then visualize the clusters in the app and see how the images group together based on their embeddings:" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "![K-means Clustering](https://huggingface.co/datasets/huggingface/cookbook-images/resolve/main/art_analysis_clustering.gif)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can see that some of the clusters select for artist; others select for genre or style. Others are more abstract and may represent sub-genres or other groupings that are not immediately obvious from the data." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Identifying the Most Unique Works of Art" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "One interesting question we can ask about our dataset is how *unique* each image is. This question is important for many applications, such as recommending similar images, detecting duplicates, or identifying outliers. In the context of art, how unique a painting is could be an important factor in determining its value.\n", "\n", "While there are a million ways to characterize uniqueness, our image embeddings allow us to quantitatively assign each sample a uniqueness score based on how similar it is to other samples in the dataset. Explicitly, the FiftyOne Brain's `compute_uniqueness()` function looks at the distance between each sample's embedding and its nearest neighbors, and computes a score between $0$ and $1$ based on this distance. A score of $0$ means the sample is nondescript or very similar to others, while a score of $1$ means the sample is very unique." ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Computing uniqueness...\n", "Uniqueness computation complete\n" ] } ], "source": [ "fob.compute_uniqueness(dataset, embeddings=\"clip_embeddings\") # compute uniqueness using CLIP embeddings" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can then color by this in the embeddings panel, filter by uniqueness score, or even sort by it to see the most unique images in the dataset:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "most_unique_view = dataset.sort_by(\"uniqueness\", reverse=True)\n", "session.view = most_unique_view.view() # Most unique images" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "![Most Unique Images](https://huggingface.co/datasets/huggingface/cookbook-images/resolve/main/art_analysis_most_unique.jpg)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "least_unique_view = dataset.sort_by(\"uniqueness\", reverse=False)\n", "session.view = least_unique_view.view() # Least unique images" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "![Least Unique Images](https://huggingface.co/datasets/huggingface/cookbook-images/resolve/main/art_analysis_least_unique.jpg)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Going a step further, we can also answer the question of which artist tends to produce the most unique works. 
We can compute the average uniqueness score for each artist across all of their works of art:" ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Unknown Artist: 0.7932221632002723\n", "boris-kustodiev: 0.7480731948424676\n", "salvador-dali: 0.7368807620414014\n", "raphael-kirchner: 0.7315448102204755\n", "ilya-repin: 0.7204744626806383\n", "marc-chagall: 0.7169373812321908\n", "rembrandt: 0.715205220292227\n", "martiros-saryan: 0.708560775790436\n", "childe-hassam: 0.7018343391132756\n", "edgar-degas: 0.699912746806587\n", "albrecht-durer: 0.6969358680800216\n", "john-singer-sargent: 0.6839955708720844\n", "pablo-picasso: 0.6835137858302969\n", "pyotr-konchalovsky: 0.6780653000855895\n", "nicholas-roerich: 0.6676504687452387\n", "ivan-aivazovsky: 0.6484361530090199\n", "vincent-van-gogh: 0.6472004520699081\n", "gustave-dore: 0.6307283287457358\n", "pierre-auguste-renoir: 0.6271467146993583\n", "paul-cezanne: 0.6251076007168186\n", "eugene-boudin: 0.6103397516167454\n", "camille-pissarro: 0.6046182609119615\n", "claude-monet: 0.5998234558947573\n", "ivan-shishkin: 0.589796389836674\n" ] } ], "source": [ "artist_unique_scores = {\n", " artist: dataset.match(F(\"artist.label\") == artist).mean(\"uniqueness\")\n", " for artist in artists\n", "}\n", "\n", "sorted_artists = sorted(\n", " artist_unique_scores, key=artist_unique_scores.get, reverse=True\n", ")\n", "\n", "for artist in sorted_artists:\n", " print(f\"{artist}: {artist_unique_scores[artist]}\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "It would seem that the artist with the most unique works in our dataset is Boris Kustodiev! Let's take a look at some of his works:" ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [], "source": [ "kustodiev_view = dataset.match(F(\"artist.label\") == \"boris-kustodiev\")\n", "session.view = kustodiev_view.view()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "![Boris Kustodiev Artwork](https://huggingface.co/datasets/huggingface/cookbook-images/resolve/main/art_analysis_kustodiev_view.jpg)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Characterizing Art with Visual Qualities" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To round things out, let's go back to the basics and analyze some core qualities of the images in our dataset. We'll compute standard metrics like brightness, contrast, and saturation for each image and see how these metrics correlate with the artistic style and genre of the art pieces." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To run these analyses, we will need to download the [FiftyOne Image Quality Plugin](https://github.com/jacobmarks/image-quality-issues):" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "!fiftyone plugins download https://github.com/jacobmarks/image-quality-issues/" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Refresh the app and open the operators list again. This time type `compute` and select one of the image quality operators. 
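If you're curious what such an operator computes under the hood, here's a rough hand-rolled sketch of a brightness score (an illustration only, not the plugin's exact formula; `manual_brightness` is a hypothetical field name):" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import numpy as np\n", "from PIL import Image\n", "\n", "\n", "def brightness(filepath):\n", "    # Mean grayscale pixel intensity, scaled to [0, 1]\n", "    pixels = np.asarray(Image.open(filepath).convert(\"L\"), dtype=float)\n", "    return float(pixels.mean() / 255.0)\n", "\n", "\n", "dataset.set_values(\n", "    \"manual_brightness\",\n", "    [brightness(fp) for fp in dataset.values(\"filepath\")],\n", ")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In practice, the plugin's operators handle this for us. 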
We'll start with brightness:" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "![Compute Brightness](https://huggingface.co/datasets/huggingface/cookbook-images/resolve/main/art_analysis_compute_brightness.gif)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "When the operator finishes running, we will have a new field in our dataset that contains the brightness score for each image. We can then visualize this data in the app:" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "![Brightness](https://huggingface.co/datasets/huggingface/cookbook-images/resolve/main/art_analysis_brightness.gif)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can also color by brightness, and even see how it correlates with other fields in the dataset like style:" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "![Style by Brightness](https://huggingface.co/datasets/huggingface/cookbook-images/resolve/main/art_analysis_style_by_brightness.gif)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now do the same for contrast and saturation. Here are the results for saturation:" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "![Filter by Saturation](https://huggingface.co/datasets/huggingface/cookbook-images/resolve/main/art_analysis_filter_by_saturation.jpg)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Hopefully this illustrates how not everything boils down to applying deep neural networks to your data. Sometimes, simple metrics can be just as informative and can provide a different perspective on your data 🤓!" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "<div class=\"alert alert-block alert-info\">\n", "📚 For larger datasets, you may want to <a href=\"https://docs.voxel51.com/plugins/using_plugins.html#delegated-operations\">delegate the operations</a> for later execution.\n", "</div>" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## What's Next?" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In this notebook, we've explored how to use multimodal embeddings, unsupervised learning, and traditional image processing techniques to analyze artistic styles in images. We've seen how to perform image similarity and semantic searches, cluster images based on their style, analyze the uniqueness of images, and compute image quality metrics. These techniques can be applied to a wide range of visual datasets, from art collections to medical images to satellite imagery. Try [loading a different dataset from the Hugging Face Hub](https://docs.voxel51.com/integrations/huggingface.html#loading-datasets-from-the-hub) and see what insights you can uncover!\n", "\n", "If you want to go even further, here are some additional analyses you could try:\n", "\n", "- **Zero-Shot Classification**: Use a pre-trained vision-language model from 🤗 Transformers to categorize images in the dataset by topic or subject, without any training data. Check out this [Zero-Shot Classification tutorial](https://docs.voxel51.com/tutorials/zero_shot_classification.html) for more info.\n", "- **Image Captioning**: Use a pre-trained vision-language model from 🤗 Transformers to generate captions for the images in the dataset. Then use this for topic modeling or cluster artwork based on embeddings for these captions. Check out FiftyOne's [Image Captioning Plugin](https://github.com/jacobmarks/fiftyone-image-captioning-plugin) for more info." 
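] }, { "cell_type": "markdown", "metadata": {}, "source": [ "As a starting point for the zero-shot idea above, here is a minimal sketch that reuses the same zoo model wrapper we used for embeddings. The candidate class list and the `zero_shot_topic` field name are just illustrative choices:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Sketch: zero-shot topic labels with CLIP via the FiftyOne model zoo\n", "model = foz.load_zoo_model(\n", "    \"zero-shot-classification-transformer-torch\",\n", "    name_or_path=\"openai/clip-vit-base-patch32\",\n", "    classes=[\"portrait\", \"landscape\", \"still life\", \"religious scene\", \"abstract\"],\n", ")\n", "dataset.apply_model(model, label_field=\"zero_shot_topic\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "From there, you can filter, sort, and visualize by the predicted topic just like any other classification field."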
] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 📚 Resources" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "- [FiftyOne 🤝 🤗 Hub Integration](https://docs.voxel51.com/integrations/huggingface.html#huggingface-hub)\n", "- [FiftyOne 🤝 🤗 Transformers Integration](https://docs.voxel51.com/integrations/huggingface.html#transformers-library)\n", "- [FiftyOne Vector Search Integrations](https://voxel51.com/vector-search/)\n", "- [Visualizing Data with Dimensionality Reduction Techniques](https://docs.voxel51.com/tutorials/dimension_reduction.html)\n", "- [Clustering Images with Embeddings](https://docs.voxel51.com/tutorials/clustering.html)\n", "- [Exploring Image Uniqueness with FiftyOne](https://docs.voxel51.com/tutorials/uniqueness.html)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## FiftyOne Open Source Project" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "[FiftyOne](https://github.com/voxel51/fiftyone/) is the leading open source toolkit for building high-quality datasets and computer vision models. With over 2M downloads, FiftyOne is trusted by developers and researchers across the globe.\n", "\n", "💪 The FiftyOne team welcomes contributions from the open source community! If you're interested in contributing to FiftyOne, check out the [contributing guide](https://github.com/voxel51/fiftyone/blob/develop/CONTRIBUTING.md)." ] } ], "metadata": { "kernelspec": { "display_name": "fdev", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.9.0" } }, "nbformat": 4, "nbformat_minor": 2 }