facebookresearch / KILT
Duplication

Places in code with 6 or more lines that are exactly the same.

Intro
  • For duplication, we look at places in code where there are 6 or more lines of code that are exactly the same.
  • Before duplication is calculated, the code is cleaned to remove empty lines, comments, and frequently duplicated constructs such as imports.
  • Aim for as little duplicated code as possible (under 5%), because a high level of duplication can lead to maintenance difficulties, poor factoring, and logical contradictions.
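The cleaning and matching steps described above can be sketched as follows. This is a minimal illustration and not the report tool's actual implementation: the function names `clean` and `find_duplicates` are hypothetical, comment and import detection is a simple prefix check for Python sources, and overlapping 6-line windows are not merged into maximal blocks the way the report's longer duplicates evidently are.

```python
from collections import defaultdict

MIN_BLOCK = 6  # threshold stated in the report: 6 or more identical lines


def clean(source: str) -> list[tuple[int, str]]:
    """Return (original_line_number, stripped_text) for lines kept after cleaning.

    Assumption: empty lines, '#' comment lines, and import statements are
    dropped; the real tool's cleaning rules may differ.
    """
    kept = []
    for number, raw in enumerate(source.splitlines(), start=1):
        text = raw.strip()
        if not text or text.startswith("#"):
            continue
        if text.startswith(("import ", "from ")):
            continue
        kept.append((number, text))
    return kept


def find_duplicates(files: dict[str, str], min_block: int = MIN_BLOCK):
    """Map each window of `min_block` cleaned lines to every place it occurs.

    Each place is (file_name, first_original_line, last_original_line).
    Only windows that occur more than once are returned.
    """
    windows = defaultdict(list)
    for name, source in files.items():
        lines = clean(source)
        for start in range(len(lines) - min_block + 1):
            chunk = lines[start:start + min_block]
            key = tuple(text for _, text in chunk)
            windows[key].append((name, chunk[0][0], chunk[-1][0]))
    return {key: places for key, places in windows.items() if len(places) > 1}
```

Because matching happens on cleaned lines, two blocks still count as duplicates when they differ only in blank lines, comments, or imports, which is why the original line ranges of a pair can have different lengths.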
Duplication Overall
  • 13% duplication:
    • 3,705 lines of cleaned code (without empty lines, comments, and frequently duplicated constructs such as imports)
    • 502 duplicated lines
  • 36 duplicates
system: 13% (502 lines)
Duplication per Extension
py: 13% (502 lines)
Duplication per Component (primary)
kilt/datasets: 37% (374 lines)
kilt/retrievers: 11% (42 lines)
kilt/readers/t5: 5% (42 lines)
kilt: 2% (26 lines)
scripts: 3% (18 lines)
kilt/configs: 0% (0 lines)
kilt/configs/retriever: 0% (0 lines)
kilt/readers/fid: 0% (0 lines)
ROOT: 0% (0 lines)
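As a quick consistency check, the overall percentage and the per-component figures above can be recomputed from the reported counts. The snippet uses only numbers that appear in this report. That the report shows 13% because it drops the fractional part is an inference, not something the report states.

```python
# Overall figures from the "Duplication Overall" section.
cleaned_lines = 3705
duplicated_lines = 502

# Duplicated lines per component, as listed in the report
# (components with 0 duplicated lines are omitted).
per_component = {
    "kilt/datasets": 374,
    "kilt/retrievers": 42,
    "kilt/readers/t5": 42,
    "kilt": 26,
    "scripts": 18,
}

share = duplicated_lines / cleaned_lines
component_total = sum(per_component.values())

print(f"overall duplication: {share:.1%}")
print(f"sum of per-component duplicated lines: {component_total}")
```

The per-component counts add up to the 502 duplicated lines reported for the whole system, and `kilt/datasets` alone accounts for roughly three quarters of them.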
Longest Duplicates
The list of 20 longest duplicates.
Size  #  Folder           File                  Lines
43    2  kilt/datasets    natural_questions.py  98:144 (24%)
         kilt/datasets    natural_questions.py  155:201 (24%)
21    2  kilt/readers/t5  finetune.py           101:124 (10%)
         kilt/readers/t5  finetune.py           151:174 (10%)
20    2  kilt/datasets    hotpotqa.py           175:197 (12%)
         kilt/datasets    triviaqa.py           131:153 (18%)
20    2  kilt/datasets    hotpotqa.py           175:197 (12%)
         kilt/datasets    natural_questions.py  213:235 (11%)
20    2  kilt/datasets    natural_questions.py  213:235 (11%)
         kilt/datasets    triviaqa.py           131:153 (18%)
19    2  kilt/datasets    fact_verification.py  134:153 (10%)
         kilt/datasets    hotpotqa.py           53:72 (12%)
14    2  kilt/datasets    fact_verification.py  191:205 (7%)
         kilt/datasets    natural_questions.py  111:125 (8%)
14    2  kilt/datasets    natural_questions.py  167:181 (8%)
         kilt/datasets    triviaqa.py           97:111 (12%)
14    2  kilt/datasets    fact_verification.py  208:224 (7%)
         kilt/datasets    natural_questions.py  189:205 (8%)
14    2  kilt/datasets    natural_questions.py  110:124 (8%)
         kilt/datasets    triviaqa.py           97:111 (12%)
14    2  kilt/datasets    fact_verification.py  191:205 (7%)
         kilt/datasets    natural_questions.py  168:182 (8%)
13    2  kilt/datasets    entity_linking.py     48:63 (7%)
         kilt/datasets    entity_linking.py     75:88 (7%)
13    2  kilt/datasets    fact_verification.py  191:204 (6%)
         kilt/datasets    triviaqa.py           98:111 (11%)
12    2  kilt/datasets    fact_verification.py  193:204 (6%)
         kilt/datasets    hotpotqa.py           137:148 (7%)
12    2  kilt/datasets    fact_verification.py  208:220 (6%)
         kilt/datasets    natural_questions.py  132:144 (6%)
12    2  kilt/datasets    base_dataset.py       26:39 (40%)
         kilt/retrievers  base_retriever.py     24:37 (46%)
12    2  kilt/datasets    hotpotqa.py           137:148 (7%)
         kilt/datasets    natural_questions.py  170:181 (6%)
12    2  kilt/datasets    hotpotqa.py           137:148 (7%)
         kilt/datasets    natural_questions.py  113:124 (6%)
12    2  kilt/datasets    hotpotqa.py           137:148 (7%)
         kilt/datasets    triviaqa.py           100:111 (11%)
11    2  kilt/datasets    natural_questions.py  53:63 (6%)
         kilt/datasets    triviaqa.py           45:55 (10%)
Duplicated Units
The list of top 2 duplicated units.
Size  #  Folder           File                        Lines
18    3  kilt/datasets    natural_questions.py        0:0
         kilt/datasets    hotpotqa.py                 0:0
         kilt/datasets    triviaqa.py                 0:0
6     2  kilt             kilt_utils.py               0:0
         scripts          map_TAC-KBP2010_to_KILT.py  0:0