azure / gpt-rag-ingestion
File Change Frequency

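File change frequency (churn) measures how often each file is updated: a file's change count is the number of days on which it received at least one commit. The charts and tables below show how those counts are distributed.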
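A minimal sketch of how such a count can be computed, assuming commit records are available as (file path, commit date) pairs. The function name `churn_days` and the sample records are hypothetical, and this is not necessarily how the report generator implements the metric.

```python
from collections import defaultdict
from datetime import date


def churn_days(commits):
    """Count, per file, the number of distinct days with at least one commit.

    `commits` is an iterable of (file_path, commit_date) pairs (an assumed
    input shape); several commits to a file on one day count once.
    """
    days_per_file = defaultdict(set)
    for path, day in commits:
        days_per_file[path].add(day)
    return {path: len(days) for path, days in days_per_file.items()}


# Hypothetical sample: two commits to setup.py on the same day count once.
sample = [
    ("setup.py", date(2025, 3, 11)),
    ("setup.py", date(2025, 3, 11)),
    ("setup.py", date(2025, 3, 12)),
    ("azure.yaml", date(2025, 1, 7)),
]
print(churn_days(sample))
```

Counting days rather than individual commits keeps a burst of small commits on one day from inflating a file's change count.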

Overview
File Change Frequency Overall
  • There are 31 files with 4,143 lines of code.
    • 0 files changed more than 100 times (0 lines of code)
    • 1 file changed 51-100 times (1,187 lines of code)
    • 3 files changed 21-50 times (327 lines of code)
    • 16 files changed 6-20 times (1,514 lines of code)
    • 11 files changed 1-5 times (1,115 lines of code)
Share of lines of code by number of changes:
101+: 0% | 51-100: 28% | 21-50: 7% | 6-20: 36% | 1-5: 26%
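The percentages above match each bucket's lines of code divided by the 4,143-line total, rounded down. A short sketch of that arithmetic follows; the rounding-down rule is inferred from the figures, not documented by the report, and the function name `loc_share` is hypothetical.

```python
# Lines of code per change-count bucket, taken from the overview list.
bucket_loc = {
    "101+": 0,
    "51-100": 1187,
    "21-50": 327,
    "6-20": 1514,
    "1-5": 1115,
}


def loc_share(buckets):
    """Return each bucket's share of total lines of code as a whole percent.

    Shares are truncated (floor), which is an assumption that happens to
    reproduce the reported figures.
    """
    total = sum(buckets.values())
    return {name: loc * 100 // total for name, loc in buckets.items()}


shares = loc_share(bucket_loc)
print(" | ".join(f"{pct}%" for pct in shares.values()))
# prints: 0% | 28% | 7% | 36% | 26%
```

Because every share is rounded down, the five percentages sum to 97 rather than 100.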

Contributors Count Frequency Overall
  • There are 31 files with 4,143 lines of code.
    • 0 files changed by more than 25 contributors (0 lines of code)
    • 0 files changed by 11-25 contributors (0 lines of code)
    • 0 files changed by 6-10 contributors (0 lines of code)
    • 30 files changed by 2-5 contributors (4,045 lines of code)
    • 1 file changed by 1 contributor (98 lines of code)
Share of lines of code by number of contributors:
26+: 0% | 11-25: 0% | 6-10: 0% | 2-5: 97% | 1: 2%

File Change Frequency per File Extension
json, py, md, ps1, txt, sh, vtt, gitignore, yaml
File Change Frequency per Extension
The number of recorded file updates
Extension | 101+ | 51-100 | 21-50 | 6-20 | 1-5
py | 0% | 28% | 7% | 36% | 26%
yaml | 0% | 0% | 0% | 100% | 0%
ps1 | 0% | 0% | 0% | 0% | 100%
File Change Frequency per Logical Decomposition
primary
primary (file change frequency)
The number of recorded file updates
Component | 101+ | 51-100 | 21-50 | 6-20 | 1-5
ROOT | 0% | 83% | 12% | 1% | 3%
chunking | 0% | 0% | 15% | 49% | 35%
tools | 0% | 0% | 0% | 94% | 5%
connectors | 0% | 0% | 0% | 23% | 76%
utils | 0% | 0% | 0% | 86% | 13%
scripts | 0% | 0% | 0% | 0% | 100%
Most Frequently Changed Files (Top 31)

File | # lines | # units | created | last modified | # changes (days) | # contributors | first contributor | latest contributor
setup.py | 1187 | 6 | 2023-08-16 | 2025-03-12 | 58 | 4 | pclacerda@gmail.com | pclacerda@gmail.com
function_app.py | 172 | 3 | 2023-08-15 | 2025-03-15 | 37 | 5 | pclacerda@gmail.com | pclacerda@gmail.com
chunking/chunker_factory.py | 54 | 3 | 2024-08-27 | 2025-03-02 | 21 | 2 | pclacerda@gmail.com | pclacerda@gmail.com
chunking/chunkers/base_chunker.py | 101 | 7 | 2024-08-27 | 2025-03-28 | 21 | 2 | pclacerda@gmail.com | pclacerda@gmail.com
chunking/chunkers/langchain_chunker.py | 70 | 3 | 2024-08-27 | 2025-03-02 | 16 | 2 | pclacerda@gmail.com | pclacerda@gmail.com
chunking/chunkers/spreadsheet_chunker.py | 149 | 5 | 2024-08-27 | 2025-01-07 | 16 | 2 | pclacerda@gmail.com | pclacerda@gmail.com
chunking/chunkers/transcription_chunker.py | 57 | 4 | 2024-08-27 | 2025-02-09 | 15 | 2 | pclacerda@gmail.com | pclacerda@gmail.com
chunking/chunkers/doc_analysis_chunker.py | 135 | 12 | 2024-08-27 | 2025-02-09 | 15 | 2 | pclacerda@gmail.com | pclacerda@gmail.com
tools/aoai.py | 158 | 5 | 2024-08-27 | 2025-01-15 | 15 | 2 | pclacerda@gmail.com | pclacerda@gmail.com
tools/doc_intelligence.py | 305 | 6 | 2024-08-27 | 2025-03-04 | 15 | 2 | pclacerda@gmail.com | pclacerda@gmail.com
tools/blob.py | 160 | 9 | 2024-08-27 | 2025-01-15 | 14 | 2 | pclacerda@gmail.com | pclacerda@gmail.com
utils/file_utils.py | 19 | 4 | 2023-09-21 | 2025-03-28 | 12 | 3 | ortizdezarate.joaquin@gmail... | pclacerda@gmail.com
chunking/document_chunking.py | 52 | 5 | 2024-08-27 | 2025-03-02 | 12 | 2 | pclacerda@gmail.com | pclacerda@gmail.com
connectors/sharepoint/sharepoint_files_indexer.py | 195 | 1 | 2024-12-09 | 2025-03-04 | 11 | 2 | pclacerda@gmail.com | pclacerda@gmail.com
tools/__init__.py | 7 | - | 2024-08-27 | 2025-01-15 | 10 | 2 | pclacerda@gmail.com | pclacerda@gmail.com
chunking/chunkers/nl2sql_chunker.py | 36 | 2 | 2024-09-30 | 2025-02-09 | 10 | 2 | pclacerda@gmail.com | pclacerda@gmail.com
connectors/__init__.py | 4 | - | 2024-11-19 | 2025-01-15 | 7 | 2 | pclacerda@gmail.com | pclacerda@gmail.com
chunking/__init__.py | 2 | - | 2024-08-27 | 2025-01-07 | 6 | 2 | pclacerda@gmail.com | pclacerda@gmail.com
azure.yaml | 24 | - | 2023-11-26 | 2025-01-07 | 6 | 3 | gbecerra@hotmail.com.ar | pclacerda@gmail.com
tools/aisearch.py | 141 | 1 | 2024-12-09 | 2025-01-15 | 6 | 2 | pclacerda@gmail.com | pclacerda@gmail.com
utils/__init__.py | 3 | - | 2024-12-09 | 2025-03-02 | 5 | 2 | pclacerda@gmail.com | pclacerda@gmail.com
tools/keyvault.py | 44 | 1 | 2024-11-19 | 2025-01-07 | 5 | 2 | pclacerda@gmail.com | pclacerda@gmail.com
run_sharepoint.py | 47 | 1 | 2024-12-09 | 2025-03-02 | 5 | 2 | pclacerda@gmail.com | pclacerda@gmail.com
connectors/sharepoint/sharepoint_deleted_files_purger.py | 216 | 1 | 2024-12-09 | 2025-03-02 | 5 | 2 | pclacerda@gmail.com | pclacerda@gmail.com
chunking/chunkers/multimodal_chunker.py | 263 | 13 | 2025-01-06 | 2025-03-28 | 5 | 2 | pclacerda@gmail.com | pclacerda@gmail.com
connectors/sharepoint/sharepoint_data_reader.py | 375 | 20 | 2024-12-09 | 2025-03-04 | 5 | 2 | pclacerda@gmail.com | pclacerda@gmail.com
chunking/exceptions.py | 2 | - | 2024-08-27 | 2025-01-07 | 4 | 2 | pclacerda@gmail.com | pclacerda@gmail.com
scripts/postdeploy.ps1 | 5 | - | 2023-11-26 | 2023-12-27 | 3 | 3 | gbecerra@hotmail.com.ar | pclacerda@gmail.com
connectors/images_deleted_files_purger.py | 55 | 1 | 2025-01-06 | 2025-01-15 | 3 | 2 | pclacerda@gmail.com | pclacerda@gmail.com
scripts/preprovision.ps1 | 7 | - | 2023-11-26 | 2023-11-26 | 1 | 2 | gbecerra@hotmail.com.ar | vhvb1989@gmail.com
chunking/chunkers/json_chunker.py | 98 | 3 | 2025-03-02 | 2025-03-02 | 1 | 1 | pclacerda@gmail.com | pclacerda@gmail.com
Files With Most Contributors (Top 31)
Based on the number of unique email addresses found in commits.
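As a rough illustration of this counting rule, the sketch below tallies distinct author email addresses per file. The input shape, the function name `contributors_per_file`, and the sample addresses are hypothetical; the report does not describe its implementation.

```python
from collections import defaultdict


def contributors_per_file(commits):
    """Count distinct author email addresses per file.

    `commits` is an iterable of (file_path, author_email) pairs, an assumed
    input shape; repeated commits by the same address count once.
    """
    emails = defaultdict(set)
    for path, email in commits:
        emails[path].add(email)
    return {path: len(addresses) for path, addresses in emails.items()}


# Hypothetical commit log with placeholder addresses.
sample = [
    ("azure.yaml", "a@example.com"),
    ("azure.yaml", "b@example.com"),
    ("azure.yaml", "a@example.com"),
    ("setup.py", "a@example.com"),
]
print(contributors_per_file(sample))
```

One consequence of counting addresses rather than people is that a person who commits under two addresses is counted twice.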

File | # lines | # units | created | last modified | # changes (days) | # contributors | first contributor | latest contributor
function_app.py | 172 | 3 | 2023-08-15 | 2025-03-15 | 37 | 5 | pclacerda@gmail.com | pclacerda@gmail.com
setup.py | 1187 | 6 | 2023-08-16 | 2025-03-12 | 58 | 4 | pclacerda@gmail.com | pclacerda@gmail.com
utils/file_utils.py | 19 | 4 | 2023-09-21 | 2025-03-28 | 12 | 3 | ortizdezarate.joaquin@gmail... | pclacerda@gmail.com
azure.yaml | 24 | - | 2023-11-26 | 2025-01-07 | 6 | 3 | gbecerra@hotmail.com.ar | pclacerda@gmail.com
scripts/postdeploy.ps1 | 5 | - | 2023-11-26 | 2023-12-27 | 3 | 3 | gbecerra@hotmail.com.ar | pclacerda@gmail.com
chunking/chunkers/base_chunker.py | 101 | 7 | 2024-08-27 | 2025-03-28 | 21 | 2 | pclacerda@gmail.com | pclacerda@gmail.com
chunking/chunker_factory.py | 54 | 3 | 2024-08-27 | 2025-03-02 | 21 | 2 | pclacerda@gmail.com | pclacerda@gmail.com
chunking/chunkers/spreadsheet_chunker.py | 149 | 5 | 2024-08-27 | 2025-01-07 | 16 | 2 | pclacerda@gmail.com | pclacerda@gmail.com
chunking/chunkers/langchain_chunker.py | 70 | 3 | 2024-08-27 | 2025-03-02 | 16 | 2 | pclacerda@gmail.com | pclacerda@gmail.com
tools/doc_intelligence.py | 305 | 6 | 2024-08-27 | 2025-03-04 | 15 | 2 | pclacerda@gmail.com | pclacerda@gmail.com
tools/aoai.py | 158 | 5 | 2024-08-27 | 2025-01-15 | 15 | 2 | pclacerda@gmail.com | pclacerda@gmail.com
chunking/chunkers/doc_analysis_chunker.py | 135 | 12 | 2024-08-27 | 2025-02-09 | 15 | 2 | pclacerda@gmail.com | pclacerda@gmail.com
chunking/chunkers/transcription_chunker.py | 57 | 4 | 2024-08-27 | 2025-02-09 | 15 | 2 | pclacerda@gmail.com | pclacerda@gmail.com
tools/blob.py | 160 | 9 | 2024-08-27 | 2025-01-15 | 14 | 2 | pclacerda@gmail.com | pclacerda@gmail.com
chunking/document_chunking.py | 52 | 5 | 2024-08-27 | 2025-03-02 | 12 | 2 | pclacerda@gmail.com | pclacerda@gmail.com
connectors/sharepoint/sharepoint_files_indexer.py | 195 | 1 | 2024-12-09 | 2025-03-04 | 11 | 2 | pclacerda@gmail.com | pclacerda@gmail.com
tools/__init__.py | 7 | - | 2024-08-27 | 2025-01-15 | 10 | 2 | pclacerda@gmail.com | pclacerda@gmail.com
chunking/chunkers/nl2sql_chunker.py | 36 | 2 | 2024-09-30 | 2025-02-09 | 10 | 2 | pclacerda@gmail.com | pclacerda@gmail.com
connectors/__init__.py | 4 | - | 2024-11-19 | 2025-01-15 | 7 | 2 | pclacerda@gmail.com | pclacerda@gmail.com
tools/aisearch.py | 141 | 1 | 2024-12-09 | 2025-01-15 | 6 | 2 | pclacerda@gmail.com | pclacerda@gmail.com
chunking/__init__.py | 2 | - | 2024-08-27 | 2025-01-07 | 6 | 2 | pclacerda@gmail.com | pclacerda@gmail.com
utils/__init__.py | 3 | - | 2024-12-09 | 2025-03-02 | 5 | 2 | pclacerda@gmail.com | pclacerda@gmail.com
tools/keyvault.py | 44 | 1 | 2024-11-19 | 2025-01-07 | 5 | 2 | pclacerda@gmail.com | pclacerda@gmail.com
connectors/sharepoint/sharepoint_deleted_files_purger.py | 216 | 1 | 2024-12-09 | 2025-03-02 | 5 | 2 | pclacerda@gmail.com | pclacerda@gmail.com
connectors/sharepoint/sharepoint_data_reader.py | 375 | 20 | 2024-12-09 | 2025-03-04 | 5 | 2 | pclacerda@gmail.com | pclacerda@gmail.com
run_sharepoint.py | 47 | 1 | 2024-12-09 | 2025-03-02 | 5 | 2 | pclacerda@gmail.com | pclacerda@gmail.com
chunking/chunkers/multimodal_chunker.py | 263 | 13 | 2025-01-06 | 2025-03-28 | 5 | 2 | pclacerda@gmail.com | pclacerda@gmail.com
chunking/exceptions.py | 2 | - | 2024-08-27 | 2025-01-07 | 4 | 2 | pclacerda@gmail.com | pclacerda@gmail.com
connectors/images_deleted_files_purger.py | 55 | 1 | 2025-01-06 | 2025-01-15 | 3 | 2 | pclacerda@gmail.com | pclacerda@gmail.com
scripts/preprovision.ps1 | 7 | - | 2023-11-26 | 2023-11-26 | 1 | 2 | gbecerra@hotmail.com.ar | vhvb1989@gmail.com
chunking/chunkers/json_chunker.py | 98 | 3 | 2025-03-02 | 2025-03-02 | 1 | 1 | pclacerda@gmail.com | pclacerda@gmail.com
Files With Fewest Contributors (Top 31)
Based on the number of unique email addresses found in commits.

File | # lines | # units | created | last modified | # changes (days) | # contributors | first contributor | latest contributor
chunking/chunkers/json_chunker.py | 98 | 3 | 2025-03-02 | 2025-03-02 | 1 | 1 | pclacerda@gmail.com | pclacerda@gmail.com
connectors/sharepoint/sharepoint_data_reader.py | 375 | 20 | 2024-12-09 | 2025-03-04 | 5 | 2 | pclacerda@gmail.com | pclacerda@gmail.com
tools/doc_intelligence.py | 305 | 6 | 2024-08-27 | 2025-03-04 | 15 | 2 | pclacerda@gmail.com | pclacerda@gmail.com
chunking/chunkers/multimodal_chunker.py | 263 | 13 | 2025-01-06 | 2025-03-28 | 5 | 2 | pclacerda@gmail.com | pclacerda@gmail.com
connectors/sharepoint/sharepoint_deleted_files_purger.py | 216 | 1 | 2024-12-09 | 2025-03-02 | 5 | 2 | pclacerda@gmail.com | pclacerda@gmail.com
connectors/sharepoint/sharepoint_files_indexer.py | 195 | 1 | 2024-12-09 | 2025-03-04 | 11 | 2 | pclacerda@gmail.com | pclacerda@gmail.com
tools/blob.py | 160 | 9 | 2024-08-27 | 2025-01-15 | 14 | 2 | pclacerda@gmail.com | pclacerda@gmail.com
tools/aoai.py | 158 | 5 | 2024-08-27 | 2025-01-15 | 15 | 2 | pclacerda@gmail.com | pclacerda@gmail.com
chunking/chunkers/spreadsheet_chunker.py | 149 | 5 | 2024-08-27 | 2025-01-07 | 16 | 2 | pclacerda@gmail.com | pclacerda@gmail.com
tools/aisearch.py | 141 | 1 | 2024-12-09 | 2025-01-15 | 6 | 2 | pclacerda@gmail.com | pclacerda@gmail.com
chunking/chunkers/doc_analysis_chunker.py | 135 | 12 | 2024-08-27 | 2025-02-09 | 15 | 2 | pclacerda@gmail.com | pclacerda@gmail.com
chunking/chunkers/base_chunker.py | 101 | 7 | 2024-08-27 | 2025-03-28 | 21 | 2 | pclacerda@gmail.com | pclacerda@gmail.com
chunking/chunkers/langchain_chunker.py | 70 | 3 | 2024-08-27 | 2025-03-02 | 16 | 2 | pclacerda@gmail.com | pclacerda@gmail.com
chunking/chunkers/transcription_chunker.py | 57 | 4 | 2024-08-27 | 2025-02-09 | 15 | 2 | pclacerda@gmail.com | pclacerda@gmail.com
connectors/images_deleted_files_purger.py | 55 | 1 | 2025-01-06 | 2025-01-15 | 3 | 2 | pclacerda@gmail.com | pclacerda@gmail.com
chunking/chunker_factory.py | 54 | 3 | 2024-08-27 | 2025-03-02 | 21 | 2 | pclacerda@gmail.com | pclacerda@gmail.com
chunking/document_chunking.py | 52 | 5 | 2024-08-27 | 2025-03-02 | 12 | 2 | pclacerda@gmail.com | pclacerda@gmail.com
run_sharepoint.py | 47 | 1 | 2024-12-09 | 2025-03-02 | 5 | 2 | pclacerda@gmail.com | pclacerda@gmail.com
tools/keyvault.py | 44 | 1 | 2024-11-19 | 2025-01-07 | 5 | 2 | pclacerda@gmail.com | pclacerda@gmail.com
chunking/chunkers/nl2sql_chunker.py | 36 | 2 | 2024-09-30 | 2025-02-09 | 10 | 2 | pclacerda@gmail.com | pclacerda@gmail.com
tools/__init__.py | 7 | - | 2024-08-27 | 2025-01-15 | 10 | 2 | pclacerda@gmail.com | pclacerda@gmail.com
scripts/preprovision.ps1 | 7 | - | 2023-11-26 | 2023-11-26 | 1 | 2 | gbecerra@hotmail.com.ar | vhvb1989@gmail.com
connectors/__init__.py | 4 | - | 2024-11-19 | 2025-01-15 | 7 | 2 | pclacerda@gmail.com | pclacerda@gmail.com
utils/__init__.py | 3 | - | 2024-12-09 | 2025-03-02 | 5 | 2 | pclacerda@gmail.com | pclacerda@gmail.com
chunking/__init__.py | 2 | - | 2024-08-27 | 2025-01-07 | 6 | 2 | pclacerda@gmail.com | pclacerda@gmail.com
chunking/exceptions.py | 2 | - | 2024-08-27 | 2025-01-07 | 4 | 2 | pclacerda@gmail.com | pclacerda@gmail.com
azure.yaml | 24 | - | 2023-11-26 | 2025-01-07 | 6 | 3 | gbecerra@hotmail.com.ar | pclacerda@gmail.com
utils/file_utils.py | 19 | 4 | 2023-09-21 | 2025-03-28 | 12 | 3 | ortizdezarate.joaquin@gmail... | pclacerda@gmail.com
scripts/postdeploy.ps1 | 5 | - | 2023-11-26 | 2023-12-27 | 3 | 3 | gbecerra@hotmail.com.ar | pclacerda@gmail.com
setup.py | 1187 | 6 | 2023-08-16 | 2025-03-12 | 58 | 4 | pclacerda@gmail.com | pclacerda@gmail.com
function_app.py | 172 | 3 | 2023-08-15 | 2025-03-15 | 37 | 5 | pclacerda@gmail.com | pclacerda@gmail.com
Correlations

File Size vs. Number of Changes: 31 points

File | lines of code (x) | # changes (y)
chunking/chunkers/base_chunker.py | 101 | 21
chunking/chunkers/multimodal_chunker.py | 263 | 5
utils/file_utils.py | 19 | 12
function_app.py | 172 | 37
setup.py | 1187 | 58
connectors/sharepoint/sharepoint_data_reader.py | 375 | 5
connectors/sharepoint/sharepoint_files_indexer.py | 195 | 11
tools/doc_intelligence.py | 305 | 15
chunking/chunker_factory.py | 54 | 21
chunking/chunkers/json_chunker.py | 98 | 1
chunking/chunkers/langchain_chunker.py | 70 | 16
chunking/document_chunking.py | 52 | 12
connectors/sharepoint/sharepoint_deleted_files_purger.py | 216 | 5
run_sharepoint.py | 47 | 5
utils/__init__.py | 3 | 5
chunking/chunkers/doc_analysis_chunker.py | 135 | 15
chunking/chunkers/nl2sql_chunker.py | 36 | 10
chunking/chunkers/transcription_chunker.py | 57 | 15
connectors/__init__.py | 4 | 7
connectors/images_deleted_files_purger.py | 55 | 3
tools/__init__.py | 7 | 10
tools/aisearch.py | 141 | 6
tools/aoai.py | 158 | 15
tools/blob.py | 160 | 14
azure.yaml | 24 | 6
chunking/__init__.py | 2 | 6
chunking/chunkers/spreadsheet_chunker.py | 149 | 16
chunking/exceptions.py | 2 | 4
tools/keyvault.py | 44 | 5
scripts/postdeploy.ps1 | 5 | 3
scripts/preprovision.ps1 | 7 | 1

# changes (y axis, up to 58.0)
min: 1.0 | average: 11.77 | 25th percentile: 5.0 | median: 10.0 | 75th percentile: 15.0 | max: 58.0
lines of code (x axis, 0 to 1187.0)
min: 2.0 | average: 133.65 | 25th percentile: 19.0 | median: 57.0 | 75th percentile: 160.0 | max: 1187.0
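The summary figures for # changes can be reproduced from the 31 per-file change counts listed in this report. The sketch below assumes a nearest-rank percentile, which happens to match the reported values; the report does not say which percentile method it uses, and the helper name `percentile` is hypothetical.

```python
import math

# Number of changes (days) for each of the 31 files, as listed in the report.
changes = [58, 37, 21, 21, 16, 16, 15, 15, 15, 15, 14, 12, 12, 11, 10, 10,
           7, 6, 6, 6, 5, 5, 5, 5, 5, 5, 4, 3, 3, 1, 1]


def percentile(values, pct):
    """Nearest-rank percentile: the value at rank ceil(pct / 100 * n)."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


summary = {
    "min": min(changes),
    "average": round(sum(changes) / len(changes), 2),
    "25th percentile": percentile(changes, 25),
    "median": percentile(changes, 50),
    "75th percentile": percentile(changes, 75),
    "max": max(changes),
}
print(summary)
```

An interpolating percentile method would give the same results for # changes but a different 25th percentile for lines of code (21.5 instead of 19.0), which is why nearest-rank is the assumption used here.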

Number of Contributors vs. Number of Changes: 31 points

File | # contributors (x) | # changes (y)
chunking/chunkers/base_chunker.py | 2 | 21
chunking/chunkers/multimodal_chunker.py | 2 | 5
utils/file_utils.py | 3 | 12
function_app.py | 5 | 37
setup.py | 4 | 58
connectors/sharepoint/sharepoint_files_indexer.py | 2 | 11
tools/doc_intelligence.py | 2 | 15
chunking/chunkers/json_chunker.py | 1 | 1
chunking/chunkers/langchain_chunker.py | 2 | 16
chunking/document_chunking.py | 2 | 12
chunking/chunkers/nl2sql_chunker.py | 2 | 10
connectors/__init__.py | 2 | 7
connectors/images_deleted_files_purger.py | 2 | 3
tools/aisearch.py | 2 | 6
tools/blob.py | 2 | 14
azure.yaml | 3 | 6
chunking/exceptions.py | 2 | 4
scripts/postdeploy.ps1 | 3 | 3
scripts/preprovision.ps1 | 2 | 1

# changes (y axis, up to 58.0)
min: 1.0 | average: 11.77 | 25th percentile: 5.0 | median: 10.0 | 75th percentile: 15.0 | max: 58.0
# contributors (x axis, 0 to 5.0)
min: 1.0 | average: 2.23 | 25th percentile: 2.0 | median: 2.0 | 75th percentile: 2.0 | max: 5.0

Number of Contributors vs. File Size: 31 points

File | # contributors (x) | lines of code (y)
chunking/chunkers/base_chunker.py | 2 | 101
chunking/chunkers/multimodal_chunker.py | 2 | 263
utils/file_utils.py | 3 | 19
function_app.py | 5 | 172
setup.py | 4 | 1187
connectors/sharepoint/sharepoint_data_reader.py | 2 | 375
connectors/sharepoint/sharepoint_files_indexer.py | 2 | 195
tools/doc_intelligence.py | 2 | 305
chunking/chunker_factory.py | 2 | 54
chunking/chunkers/json_chunker.py | 1 | 98
chunking/chunkers/langchain_chunker.py | 2 | 70
connectors/sharepoint/sharepoint_deleted_files_purger.py | 2 | 216
run_sharepoint.py | 2 | 47
utils/__init__.py | 2 | 3
chunking/chunkers/doc_analysis_chunker.py | 2 | 135
chunking/chunkers/nl2sql_chunker.py | 2 | 36
chunking/chunkers/transcription_chunker.py | 2 | 57
tools/__init__.py | 2 | 7
tools/aisearch.py | 2 | 141
tools/aoai.py | 2 | 158
tools/blob.py | 2 | 160
azure.yaml | 3 | 24
chunking/chunkers/spreadsheet_chunker.py | 2 | 149
tools/keyvault.py | 2 | 44
scripts/postdeploy.ps1 | 3 | 5

lines of code (y axis, up to 1187.0)
min: 2.0 | average: 133.65 | 25th percentile: 19.0 | median: 57.0 | 75th percentile: 160.0 | max: 1187.0
# contributors (x axis, 0 to 5.0)
min: 1.0 | average: 2.23 | 25th percentile: 2.0 | median: 2.0 | 75th percentile: 2.0 | max: 5.0
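The scatter plots in this section show how two per-file measures relate. One way to put a number on such a relationship is a Pearson correlation coefficient. The report itself does not compute one, so the sketch below is an illustration only, using the contributor counts and file sizes of the 31 files listed in this report; the helper name `pearson` is hypothetical.

```python
import math

# (number of contributors, lines of code) for each of the 31 files.
points = [
    (4, 1187), (5, 172), (2, 54), (2, 101), (2, 70), (2, 149), (2, 57),
    (2, 135), (2, 158), (2, 305), (2, 160), (3, 19), (2, 52), (2, 195),
    (2, 7), (2, 36), (2, 4), (2, 2), (3, 24), (2, 141), (2, 3), (2, 44),
    (2, 47), (2, 216), (2, 263), (2, 375), (2, 2), (3, 5), (2, 55), (2, 7),
    (1, 98),
]


def pearson(pairs):
    """Pearson correlation coefficient of a list of (x, y) pairs."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    var_x = sum((x - mean_x) ** 2 for x, _ in pairs)
    var_y = sum((y - mean_y) ** 2 for _, y in pairs)
    return cov / math.sqrt(var_x * var_y)


print(f"r = {pearson(points):.2f}")
```

With only 31 files and one very large file (setup.py, 1187 lines), a single outlier can dominate the coefficient, so it is best read alongside the plot rather than on its own.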