facebookresearch / PAQ
Duplication

Places in code with 6 or more lines that are exactly the same.

Intro
  • For duplication, we look at places in code where there are 6 or more lines of code that are exactly the same.
  • Before duplication is calculated, the code is cleaned to remove empty lines, comments, and frequently duplicated constructs such as imports.
  • Aim to keep duplicated code as low as possible (below 5%), because a high level of duplication can lead to maintenance difficulties, poor factoring, and logical contradictions.
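The procedure described above (clean the code, then look for runs of 6 or more identical lines) can be sketched in Python as follows. This is a minimal illustration, not the analysis tool's actual implementation: the function names `clean` and `find_duplicates` are hypothetical, the cleaning step only recognizes `#` comments and `import`/`from` lines, whitespace is stripped before comparison, and matching uses fixed-size windows that are not merged into longer duplicates.

```python
from collections import defaultdict

MIN_BLOCK = 6  # threshold stated in the report: 6 or more identical lines


def clean(source):
    """Return (original_line_number, stripped_text) pairs.

    Assumption: "cleaning" here means dropping blank lines, '#' comments,
    and import statements; the real tool may clean more than this.
    """
    kept = []
    for number, line in enumerate(source.splitlines(), start=1):
        text = line.strip()
        if not text or text.startswith("#"):
            continue
        if text.startswith(("import ", "from ")):
            continue
        kept.append((number, text))
    return kept


def find_duplicates(files, min_block=MIN_BLOCK):
    """Map each repeated window of `min_block` cleaned lines to its locations.

    `files` maps a file name to its source text. Each location is a
    (file_name, original_start_line) pair.
    """
    seen = defaultdict(list)
    for name, source in files.items():
        lines = clean(source)
        for i in range(len(lines) - min_block + 1):
            window = tuple(text for _, text in lines[i:i + min_block])
            seen[window].append((name, lines[i][0]))
    # Keep only windows that occur in more than one place.
    return {window: locs for window, locs in seen.items() if len(locs) > 1}


if __name__ == "__main__":
    shared = "\n".join(f"x{i} = {i}" for i in range(6))
    demo = {
        "a.py": "import os\n\n# comment\n" + shared + "\n",
        "b.py": "y = 0\n" + shared + "\nz = 1\n",
    }
    for window, locations in find_duplicates(demo).items():
        print(len(window), "duplicated lines at", locations)
```

Because windows have a fixed size, a 10-line duplicate would show up here as several overlapping 6-line matches; a fuller implementation would merge them into one maximal block.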
Duplication Overall
  • 5% duplication:
    • 2,107 lines of cleaned code (without empty lines, comments, and frequently duplicated constructs such as imports)
    • 112 duplicated lines
  • 15 duplicates
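The headline figure follows from the two line counts above. As a quick check, assuming the percentage is simply duplicated lines divided by cleaned lines (the variable names are illustrative only):

```python
cleaned_lines = 2107     # lines of cleaned code, from the report
duplicated_lines = 112   # duplicated lines, from the report

ratio = duplicated_lines / cleaned_lines
print(f"{ratio:.1%}")    # 5.3%, which the report rounds to 5%
```

At one decimal place the value is marginally above the 5% mark that the intro gives as the upper bound to aim for.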
system: 5% (112 lines)
Duplication per Extension
py: 5% (112 lines)
Duplication per Component (primary)
paq/generation/answer_extractor: 10% (30 lines)
paq/retrievers: 5% (22 lines)
paq/generation/passage_scorer: 13% (14 lines)
paq/generation/filtering: 5% (11 lines)
paq/generation/question_generator: 7% (11 lines)
paq/server: 11% (9 lines)
paq/generation: 6% (8 lines)
paq/rerankers: 5% (7 lines)
paq: 0% (0 lines)
paq/evaluation: 0% (0 lines)
Longest Duplicates
The list of 15 longest duplicates.
Size  #    Folders                            Files                  Lines
11    x 2  paq/generation/filtering           filter_questions.py    45:57 (29%)
           paq/generation/question_generator  generate_questions.py  33:45 (39%)
9     x 2  paq/retrievers                     retrieve.py            153:162 (7%)
           paq/server                         server.py              63:71 (11%)
8     x 2  paq/generation/answer_extractor    extract_answers.py     40:49 (26%)
           paq/generation/passage_scorer      score_passages.py      40:49 (26%)
8     x 2  paq/generation/answer_extractor    extractors.py          44:51 (7%)
           paq/generation/answer_extractor    extractors.py          57:64 (7%)
7     x 2  paq/rerankers                      rerank.py              67:75 (5%)
           paq/retrievers                     embed.py               33:41 (7%)
6     x 2  paq/generation/filtering           filter_questions.py    50:57 (16%)
           paq/generation                     generate_qa_pairs.py   146:153 (5%)
6     x 2  paq/generation                     generate_qa_pairs.py   144:150 (5%)
           paq/retrievers                     embed.py               83:89 (6%)
6     x 2  paq/generation/filtering           filter_questions.py    50:57 (16%)
           paq/generation/passage_scorer      score_passages.py      42:49 (20%)
6     x 2  paq/generation                     generate_qa_pairs.py   146:153 (5%)
           paq/generation/question_generator  generate_questions.py  38:45 (21%)
6     x 2  paq/generation                     generate_qa_pairs.py   146:153 (5%)
           paq/generation/passage_scorer      score_passages.py      42:49 (20%)
6     x 2  paq/generation/answer_extractor    extract_answers.py     42:49 (20%)
           paq/generation/filtering           filter_questions.py    50:57 (16%)
6     x 2  paq/generation/answer_extractor    extract_answers.py     42:49 (20%)
           paq/generation                     generate_qa_pairs.py   146:153 (5%)
6     x 2  paq/generation/passage_scorer      score_passages.py      42:49 (20%)
           paq/generation/question_generator  generate_questions.py  38:45 (21%)
6     x 2  paq/generation/answer_extractor    extract_answers.py     42:49 (20%)
           paq/generation/question_generator  generate_questions.py  38:45 (21%)
6     x 2  paq/generation/answer_extractor    extract_answers.py     13:20 (20%)
           paq/generation/passage_scorer      score_passages.py      13:20 (20%)