azure / azure-llm-fine-tuning
File Size

The distribution of file sizes, measured in lines of code.

File Size Overall (each bucket's share of the total lines of code):
  1001+ lines:      0%
  501-1000 lines:  65%
  201-500 lines:   26%
  101-200 lines:    3%
  1-100 lines:      4%


File Size per Extension

Extension    1001+   501-1000   201-500   101-200   1-100
ipynb           0%        86%        8%        4%      0%
py              0%         0%       84%        0%     15%
yaml            0%         0%        0%        0%    100%
jsonl           0%         0%        0%        0%    100%
File Size per Logical Decomposition (primary)

Component        1001+   501-1000   201-500   101-200   1-100
phi3                0%        69%       20%        5%      4%
florence2-VQA       0%        84%       11%        0%      3%
aoai                0%         0%       95%        0%      4%
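
The bucket percentages above can be reproduced by counting lines per file and grouping the totals into the same size bins. Below is a minimal sketch, assuming the bins from this report and that each bucket's percentage is its share of the repository's total line count; the directory to scan and the extension filter are illustrative assumptions, and counting raw lines in .ipynb JSON only approximates a real notebook LOC counter.

```python
import os

# Size bins used in this report (lines per file).
BINS = [(1, 100), (101, 200), (201, 500), (501, 1000), (1001, float("inf"))]

def count_lines(path):
    """Count physical lines in a file (a stand-in for a real LOC counter)."""
    with open(path, encoding="utf-8", errors="ignore") as f:
        return sum(1 for _ in f)

def size_distribution(root, extensions=(".py", ".ipynb", ".yaml", ".jsonl")):
    """Return each bin's share of the total line count, in percent."""
    totals = {b: 0 for b in BINS}
    for dirpath, _, filenames in os.walk(root):
        for name in filenames:
            if not name.endswith(extensions):
                continue
            loc = count_lines(os.path.join(dirpath, name))
            for lo, hi in BINS:
                if lo <= loc <= hi:
                    totals[(lo, hi)] += loc
                    break
    grand_total = sum(totals.values()) or 1
    return {b: 100.0 * n / grand_total for b, n in totals.items()}

if __name__ == "__main__":
    for (lo, hi), pct in size_distribution(".").items():
        label = f"{lo}+" if hi == float("inf") else f"{lo}-{hi}"
        print(f"{label:>9}: {pct:5.1f}%")
```
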
Longest Files (Top 22)
File                                                      # lines   # units
(unnamed notebook in phi3)                                    866         -
phi3/1_training_custom_phi3.ipynb                             863         -
florence2-VQA/2_serving_florence2.ipynb                       811         -
florence2-VQA/1_training_mlflow_florence2.ipynb               640         -
phi3/3_optimization_olive.ipynb                               572         -
aoai/aoai_finetune.ipynb                                      384         -
phi3/olive/phi3.py                                            260         5
aoai/token_count_utils.py                                     244        12
phi3/src_train/train_mlflow.py                                208         5
florence2-VQA/src_train/train_mlflow.py                       206        10
phi3/src_train/train.py                                       204         4
phi3/dataset-preparation/1_data-preparation-basic.ipynb       189         -
florence2-VQA/src_serve/score.py                               57         3
phi3/dataset-preparation/train_tokenizer.py                    51         2
phi3/dataset-preparation/combine_tokenizer.py                  39         2
phi3/src_serve/score.py                                        28         2
phi3/olive/conda.yaml                                          25         -
aoai/logger.py                                                 12         -
phi3/logger.py                                                 12         -
florence2-VQA/logger.py                                        12         -
aoai/dataset/training_set.jsonl                                10         -
aoai/dataset/validation_set.jsonl                              10         -
Files With Most Units (Top 9)
File                                                      # lines   # units
aoai/token_count_utils.py                                     244        12
florence2-VQA/src_train/train_mlflow.py                       206        10
phi3/src_train/train_mlflow.py                                208         5
phi3/olive/phi3.py                                            260         5
phi3/src_train/train.py                                       204         4
florence2-VQA/src_serve/score.py                               57         3
phi3/dataset-preparation/combine_tokenizer.py                  39         2
phi3/dataset-preparation/train_tokenizer.py                    51         2
phi3/src_serve/score.py                                        28         2
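
The "# units" column is this report's count of code units per file; its exact definition is not stated here, so the sketch below approximates it by counting function and method definitions in a Python file with the standard ast module. Treat the mapping to the reported numbers as an assumption.

```python
import ast

def count_units(path):
    """Approximate '# units' as the number of function/method definitions."""
    with open(path, encoding="utf-8") as f:
        tree = ast.parse(f.read(), filename=path)
    return sum(isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef))
               for node in ast.walk(tree))

# Paths from this report; it lists 5 and 10 units for these two files.
# count_units("phi3/olive/phi3.py")
# count_units("florence2-VQA/src_train/train_mlflow.py")
```
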
Files With Long Lines (Top 16)

There are 16 files with lines longer than 120 characters. In total, there are 173 long lines.
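
These counts follow directly from the 120-character threshold: a line is "long" if it exceeds 120 characters, and a file appears below if it contains at least one such line. A minimal sketch of the counting (the list of paths to scan is the caller's input):

```python
def count_long_lines(path, threshold=120):
    """Count lines longer than `threshold` characters in one file."""
    with open(path, encoding="utf-8", errors="ignore") as f:
        return sum(1 for line in f if len(line.rstrip("\n")) > threshold)

def long_line_summary(paths, threshold=120):
    """Return (number of files with long lines, total number of long lines)."""
    counts = {p: count_long_lines(p, threshold) for p in paths}
    offenders = {p: n for p, n in counts.items() if n > 0}
    return len(offenders), sum(offenders.values())

# For this repository the report finds 16 offending files and 173 long lines.
```
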

File                                                      # lines   # units   # long lines
aoai/aoai_finetune.ipynb                                      384         -             28
(unnamed notebook in phi3)                                    866         -             25
phi3/1_training_custom_phi3.ipynb                             863         -             25
florence2-VQA/1_training_mlflow_florence2.ipynb               640         -             21
phi3/3_optimization_olive.ipynb                               572         -             14
florence2-VQA/2_serving_florence2.ipynb                       811         -             12
aoai/dataset/training_set.jsonl                                10         -             10
aoai/dataset/validation_set.jsonl                              10         -             10
florence2-VQA/src_train/train_mlflow.py                       206        10             10
phi3/src_train/train.py                                       204         4              5
phi3/src_train/train_mlflow.py                                208         5              4
phi3/dataset-preparation/1_data-preparation-basic.ipynb       189         -              3
aoai/token_count_utils.py                                     244        12              2
phi3/dataset-preparation/combine_tokenizer.py                  39         2              2
phi3/src_serve/score.py                                        28         2              1
florence2-VQA/src_serve/score.py                               57         3              1
Correlations

File Size vs. Commits (all time): 22 points

File                                                      Commits (all time)   Lines of code
florence2-VQA/1_training_mlflow_florence2.ipynb                            2             640
florence2-VQA/2_serving_florence2.ipynb                                    2             811
phi3/1_training_custom_phi3.ipynb                                          2             863
phi3/src_serve/score.py                                                    4              28
phi3/src_train/train.py                                                    3             204
phi3/src_train/train_mlflow.py                                             3             208
phi3/3_optimization_olive.ipynb                                            3             572
phi3/dataset-preparation/train_tokenizer.py                                3              51
florence2-VQA/logger.py                                                    2              12
florence2-VQA/src_serve/score.py                                           2              57
phi3/dataset-preparation/1_data-preparation-basic.ipynb                    3             189
phi3/olive/conda.yaml                                                      2              25
phi3/olive/phi3.py                                                         2             260
aoai/aoai_finetune.ipynb                                                   2             384
aoai/dataset/training_set.jsonl                                            1              10
aoai/logger.py                                                             1              12
aoai/token_count_utils.py                                                  1             244
phi3/dataset-preparation/combine_tokenizer.py                              2              39
florence2-VQA/src_train/train_mlflow.py                                    1             206
Lines of code:       min 10.0 | average 259.23 | 25th percentile 21.75 | median 196.5 | 75th percentile 431.0 | max 866.0
Commits (all time):  min 1.0 | average 2.05 | 25th percentile 1.0 | median 2.0 | 75th percentile 3.0 | max 4.0
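
The per-axis summaries can be recomputed from the plotted values. Using the 22 line counts from the Longest Files table, NumPy's "weibull" (exclusive) percentile interpolation reproduces the quartiles shown above; whether the report generator uses exactly that convention is an assumption.

```python
import numpy as np  # the method= argument below needs NumPy >= 1.22

# Lines of code for the 22 files in "Longest Files (Top 22)".
loc = [866, 863, 811, 640, 572, 384, 260, 244, 208, 206, 204,
       189, 57, 51, 39, 28, 25, 12, 12, 12, 10, 10]

def summarize(values, method="weibull"):
    """Min / average / quartiles / max, in the report's layout."""
    v = np.asarray(values, dtype=float)
    q25, q50, q75 = np.percentile(v, [25, 50, 75], method=method)
    return {"min": v.min(), "average": round(v.mean(), 2),
            "25th percentile": q25, "median": q50,
            "75th percentile": q75, "max": v.max()}

print(summarize(loc))
# min 10.0, average 259.23, 25th percentile 21.75,
# median 196.5, 75th percentile 431.0, max 866.0
```
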

File Size vs. Contributors (all time): 22 points

Every file in this data set has exactly one contributor over the project's history, so the scatter collapses to a single column; the lines-of-code values match the commits plot above.

Lines of code:             min 10.0 | average 259.23 | 25th percentile 21.75 | median 196.5 | 75th percentile 431.0 | max 866.0
Contributors (all time):   min 1.0 | average 1.0 | 25th percentile 1.0 | median 1.0 | 75th percentile 1.0 | max 1.0

File Size vs. Commits (30 days): 0 points

No data for "commits (30d)" vs. "lines of code".

File Size vs. Contributors (30 days): 0 points

No data for "contributors (30d)" vs. "lines of code".


File Size vs. Commits (90 days): 9 points

File                                                      Commits (90d)   Lines of code
florence2-VQA/1_training_mlflow_florence2.ipynb                       2             640
florence2-VQA/2_serving_florence2.ipynb                               2             811
phi3/1_training_custom_phi3.ipynb                                     2             863
phi3/src_serve/score.py                                               2              28
phi3/src_train/train.py                                               1             204
phi3/src_train/train_mlflow.py                                        1             208
phi3/3_optimization_olive.ipynb                                       1             572
phi3/dataset-preparation/train_tokenizer.py                           1              51
Lines of code:   min 28.0 | average 471.44 | 25th percentile 127.5 | median 572.0 | 75th percentile 837.0 | max 866.0
Commits (90d):   min 1.0 | average 1.56 | 25th percentile 1.0 | median 2.0 | 75th percentile 2.0 | max 2.0

File Size vs. Contributors (90 days): 9 points

Every file touched in the last 90 days has exactly one contributor in that window, so the scatter again collapses to a single column; the lines-of-code values match the 90-day commits plot above.

Lines of code:        min 28.0 | average 471.44 | 25th percentile 127.5 | median 572.0 | 75th percentile 837.0 | max 866.0
Contributors (90d):   min 1.0 | average 1.0 | 25th percentile 1.0 | median 1.0 | 75th percentile 1.0 | max 1.0