microsoft / graph-based-code-modelling
File Size

The distribution of file sizes, measured in lines of code.

Intro
  • File size measurements show how files are distributed across size categories.
  • Files are classified into five categories based on their size in lines of code: 1-100 (very small files), 101-200 (small files), 201-500 (medium size files), 501-1000 (long files), and 1001+ (very long files).
  • It is good practice to keep files small. Long files may become "bloaters": code that has grown to such proportions that it is hard to work with.
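The classification above can be sketched in a few lines of Python. This is only an illustration of the thresholds listed in the intro, not the report tool's own implementation; the names `BOUNDS`, `LABELS`, and `size_category` are hypothetical.

```python
from bisect import bisect_left

# Inclusive upper bounds of the first four buckets, taken from the intro list.
# Anything above the last bound falls into the "very long" bucket (1001+).
BOUNDS = [100, 200, 500, 1000]
LABELS = ["very small", "small", "medium size", "long", "very long"]


def size_category(line_count: int) -> str:
    """Return the size category for a file with the given lines of code."""
    if line_count < 1:
        raise ValueError("line_count must be at least 1")
    # bisect_left returns the index of the first bound >= line_count,
    # so a value equal to a bound stays in the lower bucket.
    return LABELS[bisect_left(BOUNDS, line_count)]
```

For instance, under these thresholds a 1238-line file is classified as very long, while a 99-line file is very small.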
File Size Overall
  • There are 42 files with 9,458 lines of code.
    • 2 very long files (2,376 lines of code)
    • 3 long files (1,977 lines of code)
    • 10 medium size files (3,174 lines of code)
    • 6 small files (901 lines of code)
    • 21 very small files (1,030 lines of code)
1001+: 25% | 501-1000: 20% | 201-500: 33% | 101-200: 9% | 1-100: 10%
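The percentages for the overall distribution can be reproduced from the lines-of-code counts listed above. The sketch below assumes the shares are truncated to whole percentages rather than rounded; that is an inference from the numbers (for example, 1,977 of 9,458 lines is about 20.9% but is shown as 20%), not something the report states. The variable names are hypothetical.

```python
# Lines of code per size bucket, copied from the overall summary list.
lines_per_bucket = {
    "1001+": 2376,
    "501-1000": 1977,
    "201-500": 3174,
    "101-200": 901,
    "1-100": 1030,
}

# The bucket totals add up to the reported 9,458 lines of code.
total = sum(lines_per_bucket.values())

# Assumption: shares are truncated, so integer division is used here.
shares = {bucket: 100 * loc // total for bucket, loc in lines_per_bucket.items()}

for bucket, share in shares.items():
    print(f"{bucket}: {share}%")
```

Because each share is truncated, the displayed percentages do not necessarily add up to 100%.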
File Size per Extension
Extension | 1001+ | 501-1000 | 201-500 | 101-200 | 1-100
py | 33% | 0% | 44% | 4% | 17%
cs | 19% | 34% | 26% | 12% | 6%
File Size per Logical Decomposition
primary | 1001+ | 501-1000 | 201-500 | 101-200 | 1-100
Models/exprsynth | 37% | 0% | 48% | 4% | 9%
DataExtraction/SourceGraphExtractionUtils | 49% | 36% | 0% | 12% | <1%
DataExtraction/SourceGraphExtractionUtils/Utils | 0% | 36% | 42% | 9% | 10%
DataExtraction/ExpressionDataExtractor | 0% | 0% | 63% | 36% | 0%
Models/utils | 0% | 0% | 0% | 0% | 100%
DataExtraction/TestProject | 0% | 0% | 0% | 0% | 100%
Models/exprsynth/metadata | 0% | 0% | 0% | 0% | 100%
Longest Files (Top 42)
File | # lines | # units
nagdecoder.py
in Models/exprsynth
1238 38
GraphDataExtractor.cs
in DataExtraction/SourceGraphExtractionUtils
1138 31
AstVisitor.cs
in DataExtraction/SourceGraphExtractionUtils
850 209
DataFlowGraph.cs
in DataExtraction/SourceGraphExtractionUtils/Utils
610 39
VariableUseGraph.cs
in DataExtraction/SourceGraphExtractionUtils/Utils
517 25
model.py
in Models/exprsynth
485 39
contextgraphmodel.py
in Models/exprsynth
424 18
SourceGraph.cs
in DataExtraction/SourceGraphExtractionUtils/Utils
403 31
ExecutionPathGraph.cs
in DataExtraction/SourceGraphExtractionUtils/Utils
326 30
RoslynUtils.cs
in DataExtraction/SourceGraphExtractionUtils/Utils
324 17
seqdecoder.py
in Models/exprsynth
280 25
DirectedGraph.cs
in DataExtraction/SourceGraphExtractionUtils/Utils
263 13
CSharpCorpusBuilder.cs
in DataExtraction/ExpressionDataExtractor
237 13
contexttokenmodel.py
in Models/exprsynth
218 14
seq2graphmodel.py
in Models/exprsynth
214 16
TypeHierarchy.cs
in DataExtraction/SourceGraphExtractionUtils
180 9
GuardAnnotationGraphExtractor.cs
in DataExtraction/SourceGraphExtractionUtils/Utils
180 17
nagmodel.py
in Models/exprsynth
163 15
Program.cs
in DataExtraction/ExpressionDataExtractor
139 4
ChunkedJsonWriter.cs
in DataExtraction/SourceGraphExtractionUtils/Utils
121 5
PhogExtractor.cs
in DataExtraction/SourceGraphExtractionUtils
118 7
Multimap.cs
in DataExtraction/SourceGraphExtractionUtils/Utils
99 10
test.py
in Models/utils
94 6
graph2seqmodel.py
in Models/exprsynth
93 14
dataset_split.py
in Models/utils
91 3
utils.py
in Models/exprsynth
85 4
seq2seqmodel.py
in Models/exprsynth
77 14
BidirectionalMap.cs
in DataExtraction/SourceGraphExtractionUtils/Utils
68 11
train.py
in Models/utils
67 3
Program.cs
in DataExtraction/TestProject
57 3
tensorise.py
in Models/utils
56 1
model_restore_helper.py
in Models/exprsynth
52 2
ThreadSafeTextLog.cs
in DataExtraction/SourceGraphExtractionUtils/Utils
45 3
IntVocabulary.cs
in DataExtraction/SourceGraphExtractionUtils/Utils
37 3
PersistentResumeList.cs
in DataExtraction/SourceGraphExtractionUtils/Utils
35 2
Utils.cs
in DataExtraction/SourceGraphExtractionUtils/Utils
21 2
latticejoin.py
in Models/utils
19 -
ExtractionLimits.cs
in DataExtraction/SourceGraphExtractionUtils
12 -
ExtensionUtils.cs
in DataExtraction/SourceGraphExtractionUtils/Utils
12 1
loader.py
in Models/exprsynth/metadata
8 1
__init__.py
in Models/exprsynth
1 -
__init__.py
in Models/exprsynth/metadata
1 -
Files With Most Units (Top 20)
File | # lines | # units
AstVisitor.cs
in DataExtraction/SourceGraphExtractionUtils
850 209
DataFlowGraph.cs
in DataExtraction/SourceGraphExtractionUtils/Utils
610 39
model.py
in Models/exprsynth
485 39
nagdecoder.py
in Models/exprsynth
1238 38
GraphDataExtractor.cs
in DataExtraction/SourceGraphExtractionUtils
1138 31
SourceGraph.cs
in DataExtraction/SourceGraphExtractionUtils/Utils
403 31
ExecutionPathGraph.cs
in DataExtraction/SourceGraphExtractionUtils/Utils
326 30
VariableUseGraph.cs
in DataExtraction/SourceGraphExtractionUtils/Utils
517 25
seqdecoder.py
in Models/exprsynth
280 25
contextgraphmodel.py
in Models/exprsynth
424 18
GuardAnnotationGraphExtractor.cs
in DataExtraction/SourceGraphExtractionUtils/Utils
180 17
RoslynUtils.cs
in DataExtraction/SourceGraphExtractionUtils/Utils
324 17
seq2graphmodel.py
in Models/exprsynth
214 16
nagmodel.py
in Models/exprsynth
163 15
contexttokenmodel.py
in Models/exprsynth
218 14
graph2seqmodel.py
in Models/exprsynth
93 14
seq2seqmodel.py
in Models/exprsynth
77 14
CSharpCorpusBuilder.cs
in DataExtraction/ExpressionDataExtractor
237 13
DirectedGraph.cs
in DataExtraction/SourceGraphExtractionUtils/Utils
263 13
BidirectionalMap.cs
in DataExtraction/SourceGraphExtractionUtils/Utils
68 11
Files With Long Lines (Top 20)

There are 25 files with lines longer than 120 characters. In total, there are 334 long lines.
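As a rough illustration of the metric, the sketch below counts lines longer than 120 characters in a piece of source text. How the report treats tabs, trailing whitespace, or line endings is not stated, so this version simply measures each line as split by `str.splitlines()`; the names `LONG_LINE_THRESHOLD` and `count_long_lines` are hypothetical.

```python
LONG_LINE_THRESHOLD = 120  # threshold quoted in the summary above


def count_long_lines(text: str, threshold: int = LONG_LINE_THRESHOLD) -> int:
    """Count the lines in `text` that are strictly longer than `threshold`."""
    # splitlines() drops the line terminators, so they do not count
    # towards the length of a line.
    return sum(1 for line in text.splitlines() if len(line) > threshold)


sample = "a" * 120 + "\n" + "b" * 121 + "\nshort line\n"
long_lines = count_long_lines(sample)
```

In this sample only the 121-character line exceeds the threshold; a line of exactly 120 characters is not counted as long.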

File | # lines | # units | # long lines
nagdecoder.py
in Models/exprsynth
1238 38 60
VariableUseGraph.cs
in DataExtraction/SourceGraphExtractionUtils/Utils
517 25 57
GraphDataExtractor.cs
in DataExtraction/SourceGraphExtractionUtils
1138 31 43
seq2graphmodel.py
in Models/exprsynth
214 16 29
contextgraphmodel.py
in Models/exprsynth
424 18 15
model.py
in Models/exprsynth
485 39 15
contexttokenmodel.py
in Models/exprsynth
218 14 14
nagmodel.py
in Models/exprsynth
163 15 14
DirectedGraph.cs
in DataExtraction/SourceGraphExtractionUtils/Utils
263 13 13
SourceGraph.cs
in DataExtraction/SourceGraphExtractionUtils/Utils
403 31 12
CSharpCorpusBuilder.cs
in DataExtraction/ExpressionDataExtractor
237 13 9
DataFlowGraph.cs
in DataExtraction/SourceGraphExtractionUtils/Utils
610 39 9
RoslynUtils.cs
in DataExtraction/SourceGraphExtractionUtils/Utils
324 17 9
ExecutionPathGraph.cs
in DataExtraction/SourceGraphExtractionUtils/Utils
326 30 8
seqdecoder.py
in Models/exprsynth
280 25 5
GuardAnnotationGraphExtractor.cs
in DataExtraction/SourceGraphExtractionUtils/Utils
180 17 4
graph2seqmodel.py
in Models/exprsynth
93 14 4
model_restore_helper.py
in Models/exprsynth
52 2 3
seq2seqmodel.py
in Models/exprsynth
77 14 3
Program.cs
in DataExtraction/ExpressionDataExtractor
139 4 2