facebookresearch / DrQA
Duplication

Places in code with 6 or more lines that are exactly the same.

Intro
  • For duplication, we look at places in code where there are 6 or more lines of code that are exactly the same.
  • Before duplication is calculated, the code is cleaned to remove empty lines, comments, and frequently duplicated constructs such as imports.
  • Aim to keep duplicated code as low as possible (below 5%), since a high level of duplication can lead to maintenance difficulties, poor factoring, and logical contradictions.
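The cleaning and matching steps described above can be sketched roughly as follows. This is a minimal illustration and not the report tool's actual implementation: the function names `clean_lines` and `find_duplicate_windows` are hypothetical, the cleaning rules are simplified assumptions (only blank lines, `#` comments, and `import`/`from` statements are dropped; docstrings and multi-line strings are not handled), and overlapping windows are not merged into longer duplicates.

```python
from collections import defaultdict

MIN_BLOCK = 6  # threshold stated in the report: 6 or more identical lines


def clean_lines(source):
    """Return (original_line_number, stripped_text) pairs.

    Assumed cleaning rules: drop blank lines, '#' comment lines,
    and import statements.
    """
    cleaned = []
    for number, raw in enumerate(source.splitlines(), start=1):
        text = raw.strip()
        if not text or text.startswith("#"):
            continue
        if text.startswith("import ") or text.startswith("from "):
            continue
        cleaned.append((number, text))
    return cleaned


def find_duplicate_windows(files, min_block=MIN_BLOCK):
    """Find windows of min_block cleaned lines that occur more than once.

    files: dict mapping file name -> source text.
    Returns a dict mapping each repeated window (a tuple of lines) to a
    list of (file name, first line, last line) occurrences.
    """
    seen = defaultdict(list)
    for name, source in files.items():
        cleaned = clean_lines(source)
        for i in range(len(cleaned) - min_block + 1):
            window = cleaned[i:i + min_block]
            key = tuple(text for _, text in window)
            seen[key].append((name, window[0][0], window[-1][0]))
    return {key: places for key, places in seen.items() if len(places) > 1}
```

Line numbers in the result refer to the original files, so a reported range can span more lines than the duplicate's size when blank lines or comments fall inside it.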
Duplication Overall
  • 7% duplication:
    • 3,105 lines of cleaned code (without empty lines, comments, and frequently duplicated constructs such as imports)
    • 219 duplicated lines
  • 37 duplicates
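As a quick check of the headline figure, the percentage is the ratio of duplicated lines to cleaned lines. The helper name `duplication_percent` below is hypothetical, and the 5% target is the one quoted in the intro.

```python
def duplication_percent(duplicated_lines, cleaned_lines):
    """Share of cleaned lines that belong to a duplicate, in percent."""
    return 100.0 * duplicated_lines / cleaned_lines


overall = duplication_percent(219, 3105)
print(f"{overall:.1f}%")   # 7.1%, shown as 7% in the report
print(overall < 5.0)       # False: above the suggested 5% target
```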
system: 7% (219 lines)
Duplication per Extension
py: 7% (219 lines)
Duplication per Component (primary)
scripts/pipeline: 25% (53 lines)
drqa/tokenizers: 11% (36 lines)
scripts/reader: 6% (35 lines)
drqa/reader: 2% (26 lines)
scripts/retriever: 7% (25 lines)
scripts/convert: 46% (14 lines)
drqa/retriever: 5% (12 lines)
scripts/distant: 5% (12 lines)
drqa/pipeline: 2% (6 lines)
ROOT: 0% (0 lines)
drqa: 0% (0 lines)

Duplication Between Components (50+ lines)

scripts/pipeline -- scripts/reader: 58 lines

Longest Duplicates
The 20 longest of the 37 duplicates.
Size  #    Folder             File                 Lines
18    x 2  scripts/pipeline   interactive.py       45:64 (23%)
           scripts/pipeline   predict.py           66:85 (18%)
12    x 2  drqa/tokenizers    regexp_tokenizer.py  87:100 (17%)
           drqa/tokenizers    simple_tokenizer.py  44:57 (40%)
8     x 2  scripts/pipeline   predict.py           63:71 (8%)
           scripts/reader     predict.py           52:60 (10%)
7     x 2  drqa/reader        model.py             437:443 (2%)
           drqa/reader        model.py             451:457 (2%)
7     x 2  scripts/reader     predict.py           19:26 (9%)
           scripts/retriever  interactive.py       15:22 (30%)
7     x 2  scripts/reader     interactive.py       18:31 (17%)
           scripts/retriever  interactive.py       15:22 (30%)
7     x 2  scripts/pipeline   interactive.py       19:26 (9%)
           scripts/pipeline   predict.py           20:27 (7%)
7     x 2  scripts/pipeline   predict.py           20:27 (7%)
           scripts/reader     predict.py           19:26 (9%)
7     x 2  scripts/pipeline   predict.py           20:27 (7%)
           scripts/reader     interactive.py       18:31 (17%)
7     x 2  scripts/convert    squad.py             18:28 (50%)
           scripts/convert    webquestions.py      19:29 (43%)
7     x 2  scripts/pipeline   interactive.py       19:26 (9%)
           scripts/retriever  interactive.py       15:22 (30%)
7     x 2  scripts/reader     interactive.py       18:31 (17%)
           scripts/reader     predict.py           19:26 (9%)
7     x 2  scripts/pipeline   interactive.py       19:26 (9%)
           scripts/reader     predict.py           19:26 (9%)
7     x 2  scripts/pipeline   interactive.py       43:50 (9%)
           scripts/reader     interactive.py       43:50 (17%)
7     x 2  scripts/pipeline   predict.py           20:27 (7%)
           scripts/retriever  interactive.py       15:22 (30%)
7     x 2  scripts/pipeline   interactive.py       19:26 (9%)
           scripts/reader     interactive.py       18:31 (17%)
6     x 2  scripts/retriever  build_tfidf.py       24:29 (5%)
           scripts/retriever  interactive.py       15:20 (26%)
6     x 2  scripts/reader     predict.py           19:24 (8%)
           scripts/retriever  build_tfidf.py       24:29 (5%)
6     x 2  drqa/pipeline      drqa.py              46:53 (2%)
           scripts/distant    generate.py          57:64 (3%)
6     x 2  scripts/pipeline   interactive.py       45:50 (7%)
           scripts/reader     predict.py           55:60 (8%)
Duplicated Units
The single duplicated unit.
Size  #    Folder          File                   Lines
7     x 2  drqa/retriever  elastic_doc_ranker.py  0:0
           drqa/retriever  tfidf_doc_ranker.py    0:0