facebookresearch / cc_net
Duplication

Places in code with 6 or more lines that are exactly the same.

Intro
  • For duplication, we look at places in code where there are 6 or more lines of code that are exactly the same.
  • Before duplication is calculated, the code is cleaned to remove empty lines, comments, and frequently duplicated constructs such as imports.
  • Aim to keep duplicated code as low as possible (below 5%), because a high level of duplication can lead to maintenance difficulties, poor factoring, and logical contradictions.
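The approach described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the report tool's actual implementation: the function names `clean` and `find_duplicates` are hypothetical, the cleaning step handles only `#` comments and simple `import`/`from` lines, and stripping indentation before comparison is an assumption.

```python
from collections import defaultdict

MIN_LINES = 6  # threshold stated in the report


def clean(source: str) -> list[str]:
    """Simplified cleaning: drop empty lines, '#' comments, and imports."""
    cleaned = []
    for line in source.splitlines():
        stripped = line.strip()  # assumption: indentation is ignored
        if not stripped or stripped.startswith("#"):
            continue
        if stripped.startswith(("import ", "from ")):
            continue
        cleaned.append(stripped)
    return cleaned


def find_duplicates(lines: list[str], min_lines: int = MIN_LINES) -> list[list[int]]:
    """Return groups of start indices (into the cleaned lines) whose
    windows of `min_lines` consecutive lines are exactly the same."""
    windows = defaultdict(list)
    for i in range(len(lines) - min_lines + 1):
        windows[tuple(lines[i:i + min_lines])].append(i)
    return [starts for starts in windows.values() if len(starts) > 1]
```

Each returned group lists the positions where one identical 6-line window begins. A real tool would also merge overlapping windows into longer duplicates and map cleaned positions back to the original line numbers; both steps are omitted here.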
Duplication Overall
  • 2% duplication:
    • 3,607 lines of cleaned code (without empty lines, comments, and frequently duplicated constructs such as imports)
    • 94 duplicated lines
  • 5 duplicates
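As a quick sanity check of these figures, the ratio can be recomputed from the two line counts. The sketch below uses only numbers that appear in this report; that the displayed 2% is a truncated rather than rounded value is an assumption, since the report does not say how the percentage is derived.

```python
cleaned_lines = 3607     # lines of cleaned code, from the report
duplicated_lines = 94    # duplicated lines, from the report

ratio = duplicated_lines / cleaned_lines
print(f"{ratio:.1%}")    # about 2.6%; the report shows 2% (assumed truncation)

# Sizes of the 5 duplicates listed under "Longest Duplicates";
# each one occurs twice, so every size counts double.
sizes = [16, 10, 8, 7, 6]
total = sum(size * 2 for size in sizes)
print(total)             # matches the 94 duplicated lines
```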
system: 2% (94 lines)
Duplication per Extension
py: 2% (94 lines)
Duplication per Component (primary)
cc_net: 2% (94 lines)
ROOT: 0% (0 lines)
cc_net/tools: 0% (0 lines)
Longest Duplicates
The list of 5 longest duplicates.
Size  #    Folders         Files                         Lines
16    x 2  cc_net, cc_net  execution.py, execution.py    98:115 (10%), 182:199 (10%)
10    x 2  cc_net, cc_net  regroup.py, regroup.py        46:59 (13%), 91:104 (13%)
8     x 2  cc_net, cc_net  regroup.py, regroup.py        27:38 (10%), 63:74 (10%)
7     x 2  cc_net, cc_net  perplexity.py, perplexity.py  92:98 (2%), 158:164 (2%)
6     x 2  cc_net, cc_net  dedup.py, dedup.py            188:193 (1%), 219:224 (1%)