facebookresearch / FAMBench
Duplication

Places in code with 6 or more lines that are exactly the same.

Intro
  • For duplication, we look at places in code where there are 6 or more lines of code that are exactly the same.
  • Before duplication is calculated, the code is cleaned to remove empty lines, comments, and frequently duplicated constructs such as imports.
  • Aim to keep duplicated code as low as possible (<5%), since a high level of duplication can lead to maintenance difficulties, poor factoring, and logical contradictions.
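The rule described above (clean the code, then look for runs of 6 or more identical lines) can be sketched roughly as follows. This is a minimal illustration, not the tool's actual implementation: the `clean` and `find_duplicates` helpers are hypothetical names, and "cleaning" is approximated here by dropping blank lines, `#` comment lines, and import statements.

```python
from collections import defaultdict

MIN_LINES = 6  # threshold stated in the report


def clean(source: str) -> list[tuple[int, str]]:
    """Return (original line number, stripped text) for lines that survive cleaning.

    Assumption: cleaning drops blank lines, '#' comment lines, and imports.
    The real tool's cleaning rules are not specified in this report.
    """
    kept = []
    for number, raw in enumerate(source.splitlines(), start=1):
        text = raw.strip()
        if not text or text.startswith("#"):
            continue
        if text.startswith(("import ", "from ")):
            continue
        kept.append((number, text))
    return kept


def find_duplicates(files: dict[str, str], min_lines: int = MIN_LINES):
    """Map each repeated window of `min_lines` cleaned lines to its locations."""
    windows = defaultdict(list)
    for name, source in files.items():
        lines = clean(source)
        for start in range(len(lines) - min_lines + 1):
            block = tuple(text for _, text in lines[start:start + min_lines])
            first = lines[start][0]
            last = lines[start + min_lines - 1][0]
            windows[block].append((name, first, last))
    # Keep only windows that occur in more than one place.
    return {block: locs for block, locs in windows.items() if len(locs) > 1}


# Two tiny in-memory "files" sharing the same six statements.
body = "\n".join(f"x{i} = {i}" for i in range(6))
files = {
    "a.py": "import os\n\n" + body + "\n",
    "b.py": "# copy\n" + body + "\nprint(x0)\n",
}
dups = find_duplicates(files)
for locations in dups.values():
    print(locations)
```

The sketch reports fixed-size windows; a real analyzer would presumably merge overlapping windows into maximal duplicates, such as the 64-line block listed under Longest Duplicates.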
Duplication Overall
  • 17% duplication:
    • 16,768 lines of cleaned code (without empty lines, comments, and frequently duplicated constructs such as imports)
    • 2,937 duplicated lines
  • 169 duplicates
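As a quick check of the overall figure, the ratio can be recomputed from the two line counts above. The exact value is about 17.5%, so the reported 17% looks consistent with truncation rather than rounding; that reading is an assumption, since the report does not say how percentages are derived.

```python
cleaned_lines = 16_768     # lines of cleaned code, from the report
duplicated_lines = 2_937   # duplicated lines, from the report
target_percent = 5         # recommended upper bound from the intro

ratio = 100 * duplicated_lines / cleaned_lines

# Assumption: the report truncates the percentage to a whole number.
reported = int(ratio)

print(f"exact: {ratio:.1f}%, reported as: {reported}%")
print("above target" if ratio > target_percent else "within target")
```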
system: 17% (2,937 lines)
Duplication per Extension
py: 22% (2,822 lines)
cc: 2% (77 lines)
toml: 36% (24 lines)
h: 1% (14 lines)
Duplication per Component (primary)
benchmarks/dlrm/ootb: 25% (1,589 lines)
benchmarks/rnnt/ootb/inference: 11% (576 lines)
benchmarks/rnnt/ootb/train: 18% (507 lines)
fb5logging: 30% (66 lines)
bmlogging: 79% (66 lines)
benchmarks/dlrm/ootb/tools: 8% (60 lines)
benchmarks/dlrm/ubench: 10% (44 lines)
benchmarks/dlrm/ootb/cython: 61% (21 lines)
benchmarks/dlrm/ootb/optim: 10% (8 lines)
benchmarks/xlmr/ootb: 0% (0 lines)
benchmarks/dlrm/ootb/tricks: 0% (0 lines)
benchmarks/cudnn_multihead_attn: 0% (0 lines)

Duplication Between Components (50+ lines)

benchmarks/rnnt/ootb/inference -- benchmarks/rnnt/ootb/train: 866 lines
bmlogging -- fb5logging: 132 lines
benchmarks/dlrm/ootb/cython -- benchmarks/dlrm/ootb: 63 lines
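The between-component counts above can be thought of as an aggregation over individual duplicates whose two copies live in different components. The sketch below assumes each duplicate contributes its lines once per copy; this is only a guess, though it is consistent with bmlogging and fb5logging each showing 66 duplicated lines and a pair total of 132. The `between_components` helper is hypothetical, and the sample rows are three entries from the Longest Duplicates list, not the full data set.

```python
from collections import Counter

# (component of copy 1, component of copy 2, duplicate size in lines)
# Three rows taken from the Longest Duplicates list for illustration.
duplicates = [
    ("benchmarks/rnnt/ootb/inference", "benchmarks/rnnt/ootb/train", 64),
    ("bmlogging", "fb5logging", 50),
    ("benchmarks/dlrm/ootb", "benchmarks/dlrm/ootb", 31),
]


def between_components(dups):
    """Sum duplicated lines per unordered pair of distinct components."""
    totals = Counter()
    for comp_a, comp_b, size in dups:
        if comp_a == comp_b:
            continue  # duplication inside one component is not cross-component
        pair = tuple(sorted((comp_a, comp_b)))
        totals[pair] += 2 * size  # assumption: count the lines of both copies
    return totals


totals = between_components(duplicates)
for (comp_a, comp_b), lines in totals.most_common():
    print(f"{comp_a} -- {comp_b}: {lines}")
```

With only these three rows the totals are far below the reported 866 and 132, because the full counts include all 169 duplicates.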

Longest Duplicates
The 20 longest of the 169 duplicates.
Size | # | File 1 | Lines | File 2 | Lines
64 | 2 | benchmarks/rnnt/ootb/inference/pytorch/parts/text/numbers.py | 23:101 (100%) | benchmarks/rnnt/ootb/train/common/text/numbers.py | 23:99 (100%)
50 | 2 | benchmarks/rnnt/ootb/inference/pytorch/parts/text/cleaners.py | 20:89 (79%) | benchmarks/rnnt/ootb/train/common/text/cleaners.py | 20:83 (96%)
50 | 2 | bmlogging/bmlogger.py | 36:118 (74%) | fb5logging/fb5logger.py | 11:93 (98%)
44 | 2 | benchmarks/rnnt/ootb/inference/pytorch/parts/segment.py | 107:170 (44%) | benchmarks/rnnt/ootb/train/common/audio.py | 118:184 (36%)
37 | 2 | benchmarks/rnnt/ootb/inference/pytorch/parts/segment.py | 34:81 (37%) | benchmarks/rnnt/ootb/train/common/audio.py | 68:116 (30%)
31 | 2 | benchmarks/dlrm/ootb/dlrm_data_caffe2.py | 772:823 (5%) | benchmarks/dlrm/ootb/dlrm_data_pytorch.py | 1239:1289 (3%)
30 | 2 | benchmarks/rnnt/ootb/inference/pytorch/utils/download_librispeech.py | 25:60 (65%) | benchmarks/rnnt/ootb/train/utils/download_librispeech.py | 24:59 (71%)
24 | 2 | benchmarks/rnnt/ootb/inference/pytorch/utils/preprocessing_utils.py | 48:77 (57%) | benchmarks/rnnt/ootb/train/utils/preprocessing_utils.py | 47:76 (58%)
24 | 2 | benchmarks/dlrm/ootb/dlrm_data_caffe2.py | 639:670 (4%) | benchmarks/dlrm/ootb/dlrm_data_pytorch.py | 1106:1137 (2%)
23 | 2 | benchmarks/rnnt/ootb/inference/pytorch/utils/download_utils.py | 24:52 (62%) | benchmarks/rnnt/ootb/train/utils/download_utils.py | 23:51 (62%)
22 | 2 | benchmarks/dlrm/ootb/dlrm_s_caffe2.py | 1167:1189 (1%) | benchmarks/dlrm/ootb/dlrm_s_pytorch.py | 1384:1406 (1%)
21 | 2 | benchmarks/dlrm/ootb/dlrm_data_pytorch.py | 725:745 (2%) | benchmarks/dlrm/ootb/dlrm_data_pytorch.py | 748:768 (2%)
21 | 2 | benchmarks/rnnt/ootb/inference/pytorch/utils/convert_librispeech.py | 35:58 (44%) | benchmarks/rnnt/ootb/train/utils/convert_librispeech.py | 35:57 (44%)
20 | 2 | benchmarks/dlrm/ootb/dlrm_s_caffe2.py | 834:857 (1%) | benchmarks/dlrm/ootb/dlrm_s_caffe2.py | 1003:1026 (1%)
19 | 2 | benchmarks/dlrm/ootb/dlrm_data_caffe2.py | 598:624 (3%) | benchmarks/dlrm/ootb/dlrm_data_pytorch.py | 1063:1089 (1%)
19 | 2 | benchmarks/dlrm/ootb/dlrm_s_caffe2.py | 754:783 (1%) | benchmarks/dlrm/ootb/dlrm_s_caffe2.py | 863:892 (1%)
18 | 2 | benchmarks/dlrm/ootb/data_utils.py | 251:269 (1%) | benchmarks/dlrm/ootb/data_utils.py | 467:485 (1%)
17 | 2 | benchmarks/dlrm/ootb/dlrm_data_caffe2.py | 700:729 (3%) | benchmarks/dlrm/ootb/dlrm_data_pytorch.py | 1167:1196 (1%)
17 | 2 | benchmarks/dlrm/ootb/dlrm_data_caffe2.py | 620:637 (3%) | benchmarks/dlrm/ootb/dlrm_data_caffe2.py | 647:664 (3%)
17 | 2 | benchmarks/rnnt/ootb/inference/pytorch/utils/convert_librispeech.py | 60:82 (36%) | benchmarks/rnnt/ootb/train/utils/convert_librispeech.py | 59:81 (36%)
Duplicated Units
The top 20 of the 27 duplicated units.
Size | # | File 1 | Lines | File 2 | Lines
30 | 2 | benchmarks/rnnt/ootb/train/common/audio.py | 0:0 | benchmarks/rnnt/ootb/inference/pytorch/parts/segment.py | 0:0
19 | 2 | benchmarks/rnnt/ootb/train/utils/download_utils.py | 0:0 | benchmarks/rnnt/ootb/inference/pytorch/utils/download_utils.py | 0:0
18 | 2 | benchmarks/rnnt/ootb/train/common/text/numbers.py | 0:0 | benchmarks/rnnt/ootb/inference/pytorch/parts/text/numbers.py | 0:0
17 | 2 | benchmarks/rnnt/ootb/train/common/text/numbers.py | 0:0 | benchmarks/rnnt/ootb/inference/pytorch/parts/text/numbers.py | 0:0
20 | 2 | benchmarks/rnnt/ootb/train/rnnt/model.py | 0:0 | benchmarks/rnnt/ootb/inference/pytorch/model_separable_rnnt.py | 0:0
11 | 2 | benchmarks/rnnt/ootb/inference/loadgen/logging.h | 134:145 | benchmarks/rnnt/ootb/inference/loadgen/logging.h | 148:159
11 | 2 | benchmarks/rnnt/ootb/inference/loadgen/logging.cc | 156:167 | benchmarks/rnnt/ootb/inference/loadgen/logging.cc | 169:180
13 | 2 | benchmarks/dlrm/ootb/dlrm_data_caffe2.py | 0:0 | benchmarks/dlrm/ootb/dlrm_data_pytorch.py | 0:0
10 | 2 | benchmarks/rnnt/ootb/train/utils/download_utils.py | 0:0 | benchmarks/rnnt/ootb/inference/pytorch/utils/download_utils.py | 0:0
10 | 2 | benchmarks/rnnt/ootb/train/common/audio.py | 0:0 | benchmarks/rnnt/ootb/inference/pytorch/parts/segment.py | 0:0
13 | 2 | benchmarks/rnnt/ootb/train/common/audio.py | 0:0 | benchmarks/rnnt/ootb/inference/pytorch/parts/segment.py | 0:0
10 | 2 | benchmarks/dlrm/ootb/dlrm_data_caffe2.py | 0:0 | benchmarks/dlrm/ootb/dlrm_data_pytorch.py | 0:0
8 | 2 | benchmarks/rnnt/ootb/train/common/text/numbers.py | 0:0 | benchmarks/rnnt/ootb/inference/pytorch/parts/text/numbers.py | 0:0
8 | 2 | benchmarks/dlrm/ootb/dlrm_data_caffe2.py | 0:0 | benchmarks/dlrm/ootb/dlrm_data_pytorch.py | 0:0
10 | 2 | fb5logging/fb5logger.py | 0:0 | bmlogging/bmlogger.py | 0:0
11 | 2 | benchmarks/rnnt/ootb/train/common/data/dataset.py | 0:0 | benchmarks/rnnt/ootb/train/common/data/dali/iterator.py | 0:0
7 | 2 | benchmarks/dlrm/ootb/dlrm_data_caffe2.py | 0:0 | benchmarks/dlrm/ootb/dlrm_data_pytorch.py | 0:0
7 | 2 | benchmarks/dlrm/ootb/dlrm_data_caffe2.py | 0:0 | benchmarks/dlrm/ootb/dlrm_data_pytorch.py | 0:0
7 | 2 | benchmarks/dlrm/ootb/extend_distributed.py | 0:0 | benchmarks/dlrm/ootb/extend_distributed.py | 0:0
9 | 2 | fb5logging/fb5logger.py | 0:0 | bmlogging/bmlogger.py | 0:0