facebookresearch / MetaICL
Duplication

Places in code with 6 or more lines that are exactly the same.

Intro
  • For duplication, we look at places in code where there are 6 or more lines of code that are exactly the same.
  • Before duplication is calculated, the code is cleaned to remove empty lines, comments, and frequently duplicated constructs such as imports.
  • Aim for as little duplicated code as possible (<5%): a high level of duplication can lead to maintenance difficulties, poor factoring, and logical contradictions.
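The rule described above (clean the code, then look for runs of 6 or more identical lines) can be sketched as follows. This is a minimal illustration, not the report tool's implementation: the function names `clean` and `find_duplicates` are hypothetical, and the cleaning rules (drop blank lines, `#` comment lines, and import statements) are an assumption based only on the description in the bullets.

```python
from collections import defaultdict

WINDOW = 6  # minimum block length, taken from the report's definition


def clean(source):
    """Return (original_line_number, stripped_text) for lines that survive cleaning.

    Assumption: dropping blank lines, '#' comment lines and import statements
    approximates the cleaning step; the real tool's rules may differ.
    """
    kept = []
    for number, line in enumerate(source.splitlines(), start=1):
        text = line.strip()
        if not text or text.startswith("#"):
            continue
        if text.startswith("import ") or text.startswith("from "):
            continue
        kept.append((number, text))
    return kept


def find_duplicates(files, window=WINDOW):
    """Map each block of `window` cleaned lines that occurs more than once
    to the list of (file, first_line, last_line) places where it occurs."""
    seen = defaultdict(list)
    for name, source in files.items():
        lines = clean(source)
        for start in range(len(lines) - window + 1):
            block = tuple(text for _, text in lines[start:start + window])
            first_line = lines[start][0]
            last_line = lines[start + window - 1][0]
            seen[block].append((name, first_line, last_line))
    return {block: places for block, places in seen.items() if len(places) > 1}


# Two made-up files that share six identical statements.
body = "\n".join("step_%d()" % i for i in range(6))
demo = {
    "a.py": "import os\n\n# setup\n" + body + "\n",
    "b.py": "import sys\n" + body + "\nextra()\n",
}
for places in find_duplicates(demo).values():
    print(places)
```

The sketch reports fixed-size windows only, so a longer duplicate shows up as several overlapping 6-line blocks; a real tool would merge those into one entry. A block repeated inside a single file is also reported, which matches rows where the same file appears twice.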
Duplication Overall
  • 12% duplication:
    • 7,739 lines of cleaned code (without empty lines, comments, and frequently duplicated constructs such as imports)
    • 932 duplicated lines
  • 391 duplicates
system: 12% (932 lines)
Duplication per Extension
py: 12% (932 lines)
Duplication per Component (primary)
preprocess: 13% (896 lines)
utils: 19% (20 lines)
ROOT: 6% (16 lines)
metaicl: 0% (0 lines)
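As a quick cross-check of the figures above, the short script below recomputes the overall percentage and confirms that the per-component counts add up to the system total. All numbers are copied from this report; the variable names are illustrative only.

```python
# Figures copied from the report above.
cleaned_lines = 7739
duplicated_lines = 932
target_pct = 5  # the "<5%" guideline from the intro
per_component = {"preprocess": 896, "utils": 20, "ROOT": 16, "metaicl": 0}

overall_pct = 100 * duplicated_lines / cleaned_lines
print(f"overall duplication: {overall_pct:.1f}%")
print(f"above the {target_pct}% guideline: {overall_pct > target_pct}")

# The component counts should add up to the system-wide total.
component_total = sum(per_component.values())
print(f"component total: {component_total} (system total: {duplicated_lines})")

# Share of all duplicated lines that sits in each component.
for name, lines in per_component.items():
    share = 100 * lines / duplicated_lines
    print(f"{name}: {share:.1f}% of duplicated lines")
```

The component counts sum to 932, and roughly 96% of all duplicated lines sit in `preprocess`, so that component is the natural place to start any cleanup.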
Longest Duplicates
The list of 20 longest duplicates.
Size | # | Folders | Files | Lines
17 | 2 | preprocess, preprocess | ai2_arc.py, qasc.py | 17:36 (62%), 17:36 (62%)
15 | 2 | preprocess, preprocess | financial_phrasebank.py, medical_questions_pairs.py | 25:46 (45%), 24:46 (45%)
15 | 2 | preprocess, preprocess | circa.py, medical_questions_pairs.py | 26:48 (38%), 24:46 (45%)
15 | 2 | preprocess, preprocess | circa.py, financial_phrasebank.py | 26:48 (38%), 25:46 (45%)
15 | 2 | preprocess, preprocess | fewshot_gym_dataset.py, fewshot_gym_dataset.py | 120:147 (10%), 177:204 (10%)
14 | 2 | preprocess, preprocess | medical_questions_pairs.py, proto_qa.py | 26:46 (42%), 22:41 (38%)
14 | 2 | preprocess, preprocess | circa.py, proto_qa.py | 28:48 (35%), 22:41 (38%)
14 | 2 | preprocess, preprocess | ai2_arc.py, commonsense_qa.py | 17:32 (51%), 17:32 (43%)
14 | 2 | preprocess, preprocess | financial_phrasebank.py, proto_qa.py | 27:46 (42%), 22:41 (38%)
14 | 2 | preprocess, preprocess | commonsense_qa.py, quartz.py | 17:32 (43%), 18:33 (40%)
14 | 2 | preprocess, preprocess | qasc.py, quartz.py | 17:32 (51%), 18:33 (40%)
14 | 2 | preprocess, preprocess | ai2_arc.py, quartz.py | 17:32 (51%), 18:33 (40%)
14 | 2 | preprocess, preprocess | commonsense_qa.py, qasc.py | 17:32 (43%), 17:32 (51%)
13 | 2 | preprocess, preprocess | ade_dosage.py, aslg_pc12.py | 17:36 (50%), 17:37 (50%)
13 | 2 | preprocess, preprocess | jeopardy.py, numer_sense.py | 17:37 (50%), 17:37 (50%)
13 | 2 | preprocess, preprocess | ade_effect.py, reddit_tifu.py | 17:36 (50%), 18:38 (43%)
13 | 2 | preprocess, preprocess | ade_effect.py, jeopardy.py | 17:36 (50%), 17:37 (50%)
13 | 2 | preprocess, preprocess | numer_sense.py, reddit_tifu.py | 17:37 (50%), 18:38 (43%)
13 | 2 | preprocess, preprocess | ade_dosage.py, numer_sense.py | 17:36 (50%), 17:37 (50%)
13 | 2 | preprocess, preprocess | ade_effect.py, aslg_pc12.py | 17:36 (50%), 17:37 (50%)
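Many of the rows above pair up the same handful of files, which suggests a few shared templates rather than twenty unrelated duplicates. One way to see this is to group files that are linked, directly or transitively, by a listed duplicate. The sketch below does so with a small union-find; the file pairs are copied from the rows above (with `.py` omitted), while the `cluster` helper is illustrative and not part of the report tool.

```python
# File pairs copied from the "Longest Duplicates" rows above (".py" omitted).
PAIRS = [
    ("ai2_arc", "qasc"),
    ("financial_phrasebank", "medical_questions_pairs"),
    ("circa", "medical_questions_pairs"),
    ("circa", "financial_phrasebank"),
    ("fewshot_gym_dataset", "fewshot_gym_dataset"),
    ("medical_questions_pairs", "proto_qa"),
    ("circa", "proto_qa"),
    ("ai2_arc", "commonsense_qa"),
    ("financial_phrasebank", "proto_qa"),
    ("commonsense_qa", "quartz"),
    ("qasc", "quartz"),
    ("ai2_arc", "quartz"),
    ("commonsense_qa", "qasc"),
    ("ade_dosage", "aslg_pc12"),
    ("jeopardy", "numer_sense"),
    ("ade_effect", "reddit_tifu"),
    ("ade_effect", "jeopardy"),
    ("numer_sense", "reddit_tifu"),
    ("ade_dosage", "numer_sense"),
    ("ade_effect", "aslg_pc12"),
]


def cluster(pairs):
    """Group files connected by at least one chain of duplicate pairs."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps chains short
            x = parent[x]
        return x

    for a, b in pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    groups = {}
    for name in parent:
        groups.setdefault(find(name), set()).add(name)
    # Largest clusters first; members sorted for stable display.
    return sorted((sorted(g) for g in groups.values()), key=len, reverse=True)


clusters = cluster(PAIRS)
for files in clusters:
    print(len(files), files)
```

On this data the 20 rows collapse into four groups: one of six files, two of four files, and `fewshot_gym_dataset` on its own (it duplicates a block within itself). The two four-file groups contain the same files as the 11 x 4 and 7 x 4 entries under Duplicated Units, so extracting one shared helper per group would likely remove several rows at once.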
Duplicated Units
The list of top 6 duplicated units.
Size | # | Folders | Files | Lines
11 | 4 | preprocess, preprocess, preprocess, preprocess | medical_questions_pairs.py, proto_qa.py, financial_phrasebank.py, circa.py | 0:0, 0:0, 0:0, 0:0
8 | 15 | preprocess, preprocess, preprocess, preprocess, preprocess, preprocess, preprocess, preprocess, preprocess, preprocess, ... | ade_classification.py, aslg_pc12.py, onestop_english.py, trec_finegrained.py, jeopardy.py, hate_speech_offensive.py, ade_effect.py, reddit_tifu.py, app_reviews.py, hate_speech18.py, ... | 0:0, 0:0, 0:0, 0:0, 0:0, 0:0, 0:0, 0:0, 0:0, 0:0, ...
7 | 4 | preprocess, preprocess, preprocess, preprocess | ai2_arc.py, commonsense_qa.py, qasc.py, quartz.py | 0:0, 0:0, 0:0, 0:0
7 | 2 | preprocess, preprocess | numer_sense.py, ethos.py | 0:0, 0:0
6 | 3 | preprocess, preprocess, preprocess | emo.py, imdb.py, agnews.py | 0:0, 0:0, 0:0
7 | 2 | preprocess, preprocess | yelp_review_full.py, yahoo_answers_topics.py | 0:0, 0:0