twitter / communitynotes

Intro

File size measurements show how file sizes are distributed across the codebase.
Files are classified into five categories based on their size in lines of code: 1-100 (very small files), 101-200 (small files), 201-500 (medium size files), 501-1000 (long files), 1001+ (very long files).
It is good practice to keep files small. Long files may become "bloaters": code that has grown so large that it is hard to work with.
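The size thresholds above can be written as a small function. This is a minimal sketch, not the reporting tool's own code: the name `size_bucket` is hypothetical, and rejecting line counts below 1 is an assumption, since the report does not say how empty files are handled.

```python
def size_bucket(lines_of_code: int) -> str:
    """Return the size category for a file with the given number of lines of code."""
    if lines_of_code < 1:
        # Assumption: the report's smallest bucket starts at 1 line.
        raise ValueError("expected at least 1 line of code")
    if lines_of_code <= 100:
        return "very small"
    if lines_of_code <= 200:
        return "small"
    if lines_of_code <= 500:
        return "medium"
    if lines_of_code <= 1000:
        return "long"
    return "very long"


# Line counts taken from the Longest Files table.
for name, loc in [("run_scoring.py", 1579), ("mf_base_scorer.py", 973), ("enums.py", 25)]:
    print(name, loc, size_bucket(loc))
```

Both ends of each range are inclusive, so a 100-line file is still "very small" and a 1000-line file is still "long".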

File Size Overall

There are 44 files with 12,965 lines of code.

2 very long files (2,804 lines of code)
6 long files (4,434 lines of code)
11 medium size files (3,739 lines of code)
9 small files (1,319 lines of code)
16 very small files (669 lines of code)
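As a quick consistency check, the per-category figures can be summed and compared against the stated totals of 44 files and 12,965 lines of code. The sketch below only uses numbers from this list; the variable names are illustrative, and the percentage shares are derived here rather than taken from the report.

```python
# (number of files, lines of code) per category, copied from the list above.
buckets = {
    "very long": (2, 2804),
    "long": (6, 4434),
    "medium": (11, 3739),
    "small": (9, 1319),
    "very small": (16, 669),
}

total_files = sum(files for files, _ in buckets.values())
total_loc = sum(loc for _, loc in buckets.values())
print(total_files, total_loc)

# Share of all lines of code that sits in each category, in percent.
for name, (files, loc) in buckets.items():
    print(f"{name}: {files} files, {100 * loc / total_loc:.1f}% of lines")
```

The two very long files alone hold roughly a fifth of the code, which is the kind of concentration this breakdown is meant to surface.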

Legend (size buckets, in lines of code): 1001+, 501-1000, 201-500, 101-200, 1-100

File Size per Extension

[Chart: file size distribution per file extension, by size bucket (1001+, 501-1000, 201-500, 101-200, 1-100)]

File Size per Logical Decomposition

primary

[Chart: file size distribution per component, by size bucket (1001+, 501-1000, 201-500, 101-200, 1-100)]

Longest Files (Top 44)

File	# lines	# units
run_scoring.py in sourcecode/scoring	1579	27
pflip_plus_model.py in sourcecode/scoring	1225	36
mf_base_scorer.py in sourcecode/scoring	973	19
scoring_rules.py in sourcecode/scoring	880	44
constants.py in sourcecode/scoring	866	5
note_ratings.py in sourcecode/scoring	598	10
process_data.py in sourcecode/scoring	582	26
pflip_model.py in sourcecode/scoring	535	22
pandas_utils.py in sourcecode/scoring	481	21
matrix_factorization.py in sourcecode/scoring/matrix_factorization	470	17
contributor_state.py in sourcecode/scoring	442	17
reputation_matrix_factorization.py in sourcecode/scoring/reputation_matrix_factorization	439	16
post_selection_similarity_old.py in sourcecode/scoring	389	20
scorer.py in sourcecode/scoring	339	20
pseudo_raters.py in sourcecode/scoring/matrix_factorization	286	12
runner.py in sourcecode/scoring	248	3
topic_model.py in sourcecode/scoring	217	11
post_selection_similarity.py in sourcecode/scoring	215	9
note_status_history.py in sourcecode/scoring	213	7
mf_group_scorer.py in sourcecode/scoring	184	13
helpfulness_scores.py in sourcecode/scoring	173	4
mf_topic_scorer.py in sourcecode/scoring	173	10
diligence_model.py in sourcecode/scoring/reputation_matrix_factorization	167	4
reputation_scorer.py in sourcecode/scoring	136	12
helpfulness_model.py in sourcecode/scoring/reputation_matrix_factorization	129	3
incorrect_filter.py in sourcecode/scoring	120	4
normalized_loss.py in sourcecode/scoring/matrix_factorization	120	6
tag_consensus.py in sourcecode/scoring	117	3
mf_expansion_scorer.py in sourcecode/scoring	83	9
mf_core_with_topics_scorer.py in sourcecode/scoring	83	9
mf_expansion_plus_scorer.py in sourcecode/scoring	77	9
explanation_tags.py in sourcecode/scoring	76	3
mf_core_scorer.py in sourcecode/scoring	69	6
weighted_loss.py in sourcecode/scoring/reputation_matrix_factorization	61	4
tag_filter.py in sourcecode/scoring	59	5
model.py in sourcecode/scoring/matrix_factorization	55	5
dataset.py in sourcecode/scoring/reputation_matrix_factorization	41	1
mf_multi_group_scorer.py in sourcecode/scoring	31	4
enums.py in sourcecode/scoring	25	1
main.py in sourcecode	5	-
__init__.py in sourcecode	1	-
__init__.py in sourcecode/scoring/matrix_factorization	1	-
__init__.py in sourcecode/scoring	1	-
__init__.py in sourcecode/scoring/reputation_matrix_factorization	1	-

Files With Most Units (Top 39)

File	# lines	# units
scoring_rules.py in sourcecode/scoring	880	44
pflip_plus_model.py in sourcecode/scoring	1225	36
run_scoring.py in sourcecode/scoring	1579	27
process_data.py in sourcecode/scoring	582	26
pflip_model.py in sourcecode/scoring	535	22
pandas_utils.py in sourcecode/scoring	481	21
post_selection_similarity_old.py in sourcecode/scoring	389	20
scorer.py in sourcecode/scoring	339	20
mf_base_scorer.py in sourcecode/scoring	973	19
matrix_factorization.py in sourcecode/scoring/matrix_factorization	470	17
contributor_state.py in sourcecode/scoring	442	17
reputation_matrix_factorization.py in sourcecode/scoring/reputation_matrix_factorization	439	16
mf_group_scorer.py in sourcecode/scoring	184	13
pseudo_raters.py in sourcecode/scoring/matrix_factorization	286	12
reputation_scorer.py in sourcecode/scoring	136	12
topic_model.py in sourcecode/scoring	217	11
note_ratings.py in sourcecode/scoring	598	10
mf_topic_scorer.py in sourcecode/scoring	173	10
post_selection_similarity.py in sourcecode/scoring	215	9
mf_expansion_scorer.py in sourcecode/scoring	83	9
mf_expansion_plus_scorer.py in sourcecode/scoring	77	9
mf_core_with_topics_scorer.py in sourcecode/scoring	83	9
note_status_history.py in sourcecode/scoring	213	7
normalized_loss.py in sourcecode/scoring/matrix_factorization	120	6
mf_core_scorer.py in sourcecode/scoring	69	6
tag_filter.py in sourcecode/scoring	59	5
model.py in sourcecode/scoring/matrix_factorization	55	5
constants.py in sourcecode/scoring	866	5
incorrect_filter.py in sourcecode/scoring	120	4
mf_multi_group_scorer.py in sourcecode/scoring	31	4
helpfulness_scores.py in sourcecode/scoring	173	4
diligence_model.py in sourcecode/scoring/reputation_matrix_factorization	167	4
weighted_loss.py in sourcecode/scoring/reputation_matrix_factorization	61	4
tag_consensus.py in sourcecode/scoring	117	3
explanation_tags.py in sourcecode/scoring	76	3
helpfulness_model.py in sourcecode/scoring/reputation_matrix_factorization	129	3
runner.py in sourcecode/scoring	248	3
dataset.py in sourcecode/scoring/reputation_matrix_factorization	41	1
enums.py in sourcecode/scoring	25	1

Files With Long Lines (Top 7)

There are 7 files with lines longer than 120 characters. In total, there are 29 long lines.

File	# lines	# units	# long lines
process_data.py in sourcecode/scoring	582	26	15
contributor_state.py in sourcecode/scoring	442	17	4
note_status_history.py in sourcecode/scoring	213	7	3
run_scoring.py in sourcecode/scoring	1579	27	2
note_ratings.py in sourcecode/scoring	598	10	2
mf_base_scorer.py in sourcecode/scoring	973	19	2
helpfulness_scores.py in sourcecode/scoring	173	4	1
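A count like the one in this table can be approximated with a short script. This is a sketch under assumptions: the 120-character limit comes from the report, but the function names are hypothetical, and whether the reporting tool counts tabs, trailing whitespace, or comment lines the same way is not stated.

```python
from pathlib import Path

MAX_LINE_LENGTH = 120  # threshold stated in the report


def count_long_lines(text: str, limit: int = MAX_LINE_LENGTH) -> int:
    """Count lines whose length (excluding the newline) exceeds the limit."""
    return sum(1 for line in text.splitlines() if len(line) > limit)


def long_line_report(root: str, pattern: str = "*.py") -> dict[str, int]:
    """Map each matching file under root to its number of long lines (files with none are skipped)."""
    counts = {}
    for path in sorted(Path(root).rglob(pattern)):
        n = count_long_lines(path.read_text(encoding="utf-8", errors="replace"))
        if n:
            counts[str(path)] = n
    return counts
```

Calling `long_line_report("sourcecode")` from a checkout of the repository would list the affected files; the exact counts may differ from the table if the tool preprocesses files differently.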

Correlations

File Size vs. Commits (all time): 44 points

lines of code: min 1.0 | average 294.66 | 25th percentile 63.0 | median 170.0 | 75th percentile 441.25 | max 1579.0
commits (all time): min 1.0 | average 24.32 | 25th percentile 6.0 | median 17.5 | 75th percentile 31.75 | max 91.0

File Size vs. Contributors (all time): 44 points

lines of code: min 1.0 | average 294.66 | 25th percentile 63.0 | median 170.0 | 75th percentile 441.25 | max 1579.0
contributors (all time): min 1.0 | average 3.61 | 25th percentile 3.0 | median 3.0 | 75th percentile 5.0 | max 8.0
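The all-time lines-of-code summary can be reproduced from the 44 line counts in the Longest Files table. The sketch below uses Python's `statistics.quantiles` with its default "exclusive" method, which happens to match the reported quartiles; that the reporting tool uses this same interpolation is an inference from the numbers, not something the report states.

```python
import statistics

# Line counts of all 44 files, copied from the Longest Files table.
loc = [
    1579, 1225, 973, 880, 866, 598, 582, 535, 481, 470, 442,
    439, 389, 339, 286, 248, 217, 215, 213, 184, 173, 173,
    167, 136, 129, 120, 120, 117, 83, 83, 77, 76, 69,
    61, 59, 55, 41, 31, 25, 5, 1, 1, 1, 1,
]

# n=4 yields the three quartile cut points; the default method is "exclusive".
q1, median, q3 = statistics.quantiles(loc, n=4)
average = round(statistics.mean(loc), 2)

print(f"min: {min(loc)} average: {average} 25th percentile: {q1} "
      f"median: {median} 75th percentile: {q3} max: {max(loc)}")
```

Other interpolation rules give slightly different quartiles for the same data (for example, an inclusive method yields a different 25th percentile), so the method matters when comparing against the report.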

File Size vs. Commits (30 days): 16 points

lines of code: min 55.0 | average 494.44 | 25th percentile 83.0 | median 278.0 | 75th percentile 876.5 | max 1579.0
commits (30d): min 2.0 | average 2.5 | 25th percentile 2.0 | median 2.0 | 75th percentile 3.5 | max 4.0

File Size vs. Contributors (30 days): 16 points

lines of code: min 55.0 | average 494.44 | 25th percentile 83.0 | median 278.0 | 75th percentile 876.5 | max 1579.0
contributors (30d): min 1.0 | average 2.19 | 25th percentile 2.0 | median 2.0 | 75th percentile 2.75 | max 3.0

File Size vs. Commits (90 days): 23 points

lines of code: min 25.0 | average 437.91 | 25th percentile 83.0 | median 248.0 | 75th percentile 598.0 | max 1579.0
commits (90d): min 2.0 | average 7.43 | 25th percentile 3.0 | median 6.0 | 75th percentile 10.0 | max 18.0

File Size vs. Contributors (90 days): 23 points

lines of code: min 25.0 | average 437.91 | 25th percentile 83.0 | median 248.0 | 75th percentile 598.0 | max 1579.0
contributors (90d): min 1.0 | average 2.52 | 25th percentile 2.0 | median 3.0 | 75th percentile 3.0 | max 3.0