awslabs / unsupervised-qa
Duplication

Places in code with 6 or more lines that are exactly the same.

Intro
  • For duplication, we look at places in code where there are 6 or more lines of code that are exactly the same.
  • Before duplication is calculated, the code is cleaned to remove empty lines, comments, and frequently duplicated constructs such as imports.
  • Aim to keep duplicated code as low as possible (<5%), because a high level of duplication can lead to maintenance difficulties, poor factoring, and logical contradictions.
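
The procedure described above (clean the code, then look for runs of 6 or more identical lines) can be sketched roughly as follows. This is a minimal illustration, not the algorithm the report tool actually uses: the function names `clean`, `find_duplicate_windows`, and `duplication_stats` are hypothetical, and the cleaning step is assumed to drop only blank lines, `#` comments, and single-line import statements.

```python
from collections import defaultdict

MIN_BLOCK = 6  # threshold stated in the report: 6 or more identical lines


def clean(source):
    """Return cleaned lines: no blank lines, '#' comments, or imports.

    Assumption: a simplified cleaner; it ignores docstrings and
    multi-line import statements.
    """
    kept = []
    for raw in source.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        if line.startswith(("import ", "from ")):
            continue
        kept.append(line)
    return kept


def find_duplicate_windows(files, min_block=MIN_BLOCK):
    """Map each repeated window of cleaned lines to where it occurs.

    files: dict of file name -> source text.
    Returns {window: [(file name, start index in cleaned lines), ...]}
    for windows that occur more than once.
    """
    seen = defaultdict(list)
    for name, source in files.items():
        cleaned = clean(source)
        for start in range(len(cleaned) - min_block + 1):
            window = tuple(cleaned[start:start + min_block])
            seen[window].append((name, start))
    return {w: places for w, places in seen.items() if len(places) > 1}


def duplication_stats(files, min_block=MIN_BLOCK):
    """Return (cleaned line count, duplicated line count, percentage)."""
    total = sum(len(clean(source)) for source in files.values())
    duplicated = set()
    for places in find_duplicate_windows(files, min_block).values():
        for name, start in places:
            # Mark every cleaned line covered by a duplicated window.
            duplicated.update((name, i) for i in range(start, start + min_block))
    percent = 100.0 * len(duplicated) / total if total else 0.0
    return total, len(duplicated), percent
```

Calling `duplication_stats({"a.py": text_a, "b.py": text_b})` yields the three figures a report like this one summarizes. A real tool may count or round differently, so these numbers are not expected to match the report exactly.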
Duplication Overall
  • 1% duplication:
    • 1,574 lines of cleaned code (without empty lines, comments, and frequently duplicated constructs such as imports)
    • 30 duplicated lines
  • 4 duplicates
system: 1% (30 lines)
Duplication per Extension
py: 2% (30 lines)
Duplication per Component (primary)
distant_supervision: 1% (18 lines)
spark_scripts: 4% (12 lines)
resources: 0% (0 lines)
ROOT: 0% (0 lines)
Longest Duplicates
The list of 4 longest duplicates.
Size  #  Folder               File                                  Lines
6     2  distant_supervision  ds_es_client.py                       138:143 (3%)
         distant_supervision  ds_es_client.py                       150:155 (3%)
6     2  distant_supervision  ds_es_client.py                       127:132 (3%)
         distant_supervision  ds_es_client.py                       138:143 (3%)
6     2  distant_supervision  ds_es_client.py                       127:132 (3%)
         distant_supervision  ds_es_client.py                       150:155 (3%)
6     2  spark_scripts        create_squad_ner_dataset.py           36:42 (16%)
         spark_scripts        stat_for_ner_category_to_wh_words.py  118:124 (6%)