huggingface / cosmopedia
Components

An overview of source code logical components.

Intro

A logical decomposition represents the organization of the main source code, with each file placed in exactly one logical component.

Logical Decompositions Overview

The analyzed system has 1 logical decomposition:

Logical Decomposition #1: PRIMARY

The decomposition is based on the folder structure at level 1 (relative to the source code root).
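A minimal sketch of how a level-1 folder decomposition could be computed, assuming a list of repository-relative file paths. The names `component_of` and `decompose`, the "ROOT" catch-all bucket, and the sample paths are hypothetical; they are not taken from the analyzed repository or from the tool that produced this report.

```python
from collections import defaultdict
from pathlib import PurePosixPath


def component_of(path: str, level: int = 1) -> str:
    """Return the component for a repo-relative path: its first `level` folders."""
    parts = PurePosixPath(path).parts
    # Files that sit above the requested folder depth go to a catch-all bucket
    # (an assumption of this sketch, not something the report describes).
    if len(parts) <= level:
        return "ROOT"
    return "/".join(parts[:level])


def decompose(paths):
    """Group paths so that each file lands in exactly one component."""
    components = defaultdict(list)
    for p in paths:
        components[component_of(p)].append(p)
    return dict(components)


# Hypothetical file paths, for illustration only.
sample = ["prompts/a/build.py", "prompts/b/build.py", "generation/run.py", "README.md"]
groups = decompose(sample)
```

Because every path maps to a single key, no file can be counted in two components, which is the property the intro describes.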

Component Sizes (Lines of Code)
The "primary" logical decomposition has 6 components.
  • 20 files, 2,536 LOC (100.0% vs. main code).
  • "prompts" is biggest, containing 67.55% of LOC.
  • "deduplication" is smallest, containing 3.67% of LOC.


prompts: 1,713 LOC (67%), 12 files
generation: 220 LOC (8%), 2 files
fulltext_search: 219 LOC (8%), 2 files
classification: 192 LOC (7%), 2 files
decontamination: 99 LOC (3%), 1 file
deduplication: 93 LOC (3%), 1 file
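The shares quoted in the summary can be checked with a few lines of arithmetic. This is a sketch that uses only the LOC figures listed above; the variable names are illustrative.

```python
# LOC per component, copied from the component size listing above.
loc = {
    "prompts": 1713,
    "generation": 220,
    "fulltext_search": 219,
    "classification": 192,
    "decontamination": 99,
    "deduplication": 93,
}

total = sum(loc.values())
# Share of total LOC per component, as a percentage rounded to two decimals.
shares = {name: round(100 * n / total, 2) for name, n in loc.items()}
biggest = max(loc, key=loc.get)
smallest = min(loc, key=loc.get)
```

The total comes to 2,536 LOC, and the largest and smallest shares match the 67.55% and 3.67% reported in the summary.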
Component Commits
Components ordered by number of commits
Total Commits per Component
All commits are counted; some commits may include files from multiple components.
generation: 3 commits (5%)
decontamination: 3 commits (5%)
classification: 3 commits (5%)
deduplication: 2 commits (3%)
prompts: 2 commits (3%)
fulltext_search: 1 commit (1%)
Yearly File Updates Trend per Component
The number of file changes in commits
2025 2024
prompts: 24
classification: 5
generation: 3
decontamination: 3
fulltext_search: 2
deduplication: 2


Dependencies between components in same commits (past 180 days)
The number on the lines shows the number of shared commits.

No temporal dependencies found.
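For illustration, shared commits between components could be counted as in the sketch below. It assumes each commit is given as a list of changed file paths and that a component is the first folder of a path. The function name and sample commits are hypothetical, the 180-day window filter is omitted, and the tool behind this report may define temporal dependencies differently.

```python
from collections import Counter
from itertools import combinations


def shared_commit_counts(commits):
    """Count, for each pair of components, the commits that touch both."""
    counts = Counter()
    for files in commits:
        # Component = first folder of the path; root-level files are skipped.
        touched = sorted({f.split("/", 1)[0] for f in files if "/" in f})
        for pair in combinations(touched, 2):
            counts[pair] += 1
    return counts


# Hypothetical commits (lists of changed file paths), for illustration only.
commits = [
    ["prompts/x.py", "generation/y.py"],
    ["prompts/z.py"],
    ["generation/y.py", "prompts/x.py", "deduplication/d.py"],
]
counts = shared_commit_counts(commits)
```

An empty result, as reported here, would mean that no commit in the window changed files from more than one component.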



2025-06-30 09:07