awslabs / recurrent-intensity-model-experiments
Duplication

Places in code with 6 or more lines that are exactly the same.

Intro
  • Duplication is measured by finding places in the code where 6 or more lines are exactly the same.
  • Before duplication is calculated, the code is cleaned to remove empty lines, comments, and frequently duplicated constructs such as imports.
  • Aim for as little duplicated code as possible (<5%), since a high level of duplication can lead to maintenance difficulties, poor factoring, and logical contradictions.
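The two steps described above (clean the code, then look for runs of 6 or more identical lines) can be sketched roughly as follows. This is a minimal illustration, not the analysis tool's actual implementation: the function names `clean` and `find_duplicate_windows` are hypothetical, the cleaning rules are a crude approximation for Python sources, and comparing lines with surrounding whitespace stripped is an assumption.

```python
from collections import defaultdict

MIN_BLOCK = 6  # threshold stated in the report: 6 or more identical lines


def clean(source: str) -> list[str]:
    """Drop empty lines, '#' comments, and import statements.

    Assumption: a rough stand-in for the cleaning step; the real tool
    may use different rules.
    """
    kept = []
    for raw in source.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        if line.startswith("import ") or line.startswith("from "):
            continue
        kept.append(line)
    return kept


def find_duplicate_windows(files: dict[str, str], size: int = MIN_BLOCK):
    """Return every window of `size` cleaned lines that occurs more than once.

    The result maps each repeated window (a tuple of lines) to a list of
    (file name, index into that file's cleaned lines) locations.
    """
    seen = defaultdict(list)
    for name, source in files.items():
        lines = clean(source)
        for start in range(len(lines) - size + 1):
            window = tuple(lines[start:start + size])
            seen[window].append((name, start))
    return {w: locs for w, locs in seen.items() if len(locs) > 1}
```

A fixed-size sliding window only reports blocks of exactly `size` lines; longer duplicates show up as several overlapping windows, which a fuller implementation would merge into one block.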
Duplication Overall
  • 2% duplication:
    • 2,882 lines of cleaned code (without empty lines, comments, and frequently duplicated constructs such as imports)
    • 80 duplicated lines
  • 5 duplicates
system: 2% (80 lines)
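The overall figure can be checked with a short calculation from the numbers in this report. The sketch below is only a sanity check; the variable names are illustrative, and the remark about rounding is an inference from the numbers, not something the report states.

```python
cleaned_lines = 2882      # lines of cleaned code, from the report
duplicated_lines = 80     # duplicated lines, from the report

# Share of cleaned code that is duplicated, as a percentage.
ratio = duplicated_lines / cleaned_lines * 100
print(f"{ratio:.2f}%")

# The five duplicates listed in this report, as (block size, number of copies).
blocks = [(9, 2), (9, 2), (8, 2), (8, 2), (6, 2)]
total = sum(size * copies for size, copies in blocks)
print(total)
```

The ratio comes out a little under 3%, so the reported 2% appears to be rounded down (an assumption about how the tool displays percentages). The block sizes add up to the same 80 duplicated lines.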
Duplication per Extension
py: 2% (80 lines)
Duplication per Component (primary)
src/rime/models: 3% (32 lines)
scripts: 17% (18 lines)
src/rime/models/zero_shot: 11% (18 lines)
src/rime/dataset: 3% (12 lines)
src/rime: 0% (0 lines)
src/rime/util: 0% (0 lines)
src/rime/metrics: 0% (0 lines)
ROOT: 0% (0 lines)
data: 0% (0 lines)
Longest Duplicates
The list of 5 longest duplicates.
Size  #  Folder                     File                       Lines
9     2  scripts                    everything_ml_1m.py        85:93 (8%)
         scripts                    everything_ml_1m.py        117:125 (8%)
9     2  src/rime/models/zero_shot  bayes_lm.py                41:50 (10%)
         src/rime/models/zero_shot  item_knn.py                36:45 (12%)
8     2  src/rime/models            rnn.py                     35:43 (6%)
         src/rime/models            transformer.py             55:63 (17%)
8     2  src/rime/models            rnn.py                     21:28 (6%)
         src/rime/models            transformer.py             41:49 (17%)
6     2  src/rime/dataset           prepare_ml_1m_data.py      40:45 (18%)
         src/rime/dataset           prepare_yoochoose_data.py  33:38 (18%)