Intro

To measure duplication, we look for places in the code where 6 or more lines are exactly the same.
Before duplication is calculated, the code is cleaned to remove empty lines, comments, and frequently duplicated constructs such as imports.
Aim to keep duplicated code as low as possible (below 5%), because a high level of duplication can lead to maintenance difficulties, poor factoring, and logical contradictions.
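
The approach described above can be illustrated with a minimal sketch. It is not the tool's actual implementation. The function names (`clean_lines`, `find_duplicate_windows`, `duplication_ratio`), the Python-only cleaning rules, and the choice to ignore leading and trailing whitespace are all assumptions made for illustration; only the 6-line threshold comes from the text.

```python
import re
from collections import defaultdict

MIN_BLOCK = 6  # threshold quoted in the report text


def clean_lines(source: str) -> list[str]:
    """Simplified cleaning (assumption): drop blank lines, '#' comments, imports."""
    cleaned = []
    for raw in source.splitlines():
        line = raw.strip()  # assumption: indentation is ignored when comparing
        if not line or line.startswith("#"):
            continue
        if re.match(r"(import|from)\s", line):
            continue
        cleaned.append(line)
    return cleaned


def find_duplicate_windows(files: dict[str, str], min_block: int = MIN_BLOCK):
    """Return every window of `min_block` cleaned lines that occurs more than once."""
    seen = defaultdict(list)
    for name, source in files.items():
        lines = clean_lines(source)
        for start in range(len(lines) - min_block + 1):
            window = tuple(lines[start:start + min_block])
            seen[window].append((name, start))
    return {w: locs for w, locs in seen.items() if len(locs) > 1}


def duplication_ratio(files: dict[str, str], min_block: int = MIN_BLOCK) -> float:
    """Share of cleaned lines that fall inside at least one duplicated window."""
    marked = set()
    for locs in find_duplicate_windows(files, min_block).values():
        for name, start in locs:
            for i in range(start, start + min_block):
                marked.add((name, i))
    total = sum(len(clean_lines(src)) for src in files.values())
    return len(marked) / total if total else 0.0
```

Calling `duplication_ratio({"a.py": text_a, "b.py": text_b})` returns a fraction between 0 and 1. A real tool would also merge overlapping windows into the longer duplicates listed later in this report, which this sketch does not attempt.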

Duplication Overall

17% duplication:

16,768 lines of cleaned code (without empty lines, comments, and frequently duplicated constructs such as imports)
2,937 duplicated lines

169 duplicates
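
As a quick check of the figures above, the following sketch recomputes the overall percentage and compares it with the 5% target from the intro. It assumes the headline percentage is simply duplicated lines divided by cleaned lines, and that the displayed 17% is the result with the fraction dropped; the report does not state its rounding rule.

```python
import math

cleaned_lines = 16_768     # from the report
duplicated_lines = 2_937   # from the report

ratio = duplicated_lines / cleaned_lines
print(f"exact share: {ratio:.1%}")               # about 17.5%
print(f"shown as: {math.floor(ratio * 100)}%")   # 17% if the fraction is dropped

# Largest number of duplicated lines that still stays within the 5% target
# (integer arithmetic avoids floating-point surprises).
max_duplicated_at_target = cleaned_lines * 5 // 100
lines_to_remove = duplicated_lines - max_duplicated_at_target
print(f"duplicated lines allowed at 5%: {max_duplicated_at_target}")
print(f"duplicated lines above the target: {lines_to_remove}")
```

Under these assumptions, roughly 2,100 duplicated lines would have to be eliminated (with the total held constant) to reach the 5% target.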

Duplication per Extension

Duplication per Component (primary)

Duplication Between Components (50+ lines)

From Component --> To Component	Duplicated Lines	File Pairs	Details
benchmarks/rnnt/ootb/inference (8%) --> benchmarks/rnnt/ootb/train (16%)	866	13 file pairs	details...
bmlogging (79%) --> fb5logging (30%)	132	2 file pairs	details...
benchmarks/dlrm/ootb/cython (92%) --> benchmarks/dlrm/ootb (<1%)	63	2 file pairs	details...

Longest Duplicates

The list of 20 longest duplicates.

Size	#	Folders	Files	Lines	Code
64	x 2	benchmarks/rnnt/ootb/inference/pytorch/parts/text benchmarks/rnnt/ootb/train/common/text	numbers.py numbers.py	23:101 (100%) 23:99 (100%)	view
50	x 2	benchmarks/rnnt/ootb/inference/pytorch/parts/text benchmarks/rnnt/ootb/train/common/text	cleaners.py cleaners.py	20:89 (79%) 20:83 (96%)	view
50	x 2	bmlogging fb5logging	bmlogger.py fb5logger.py	36:118 (74%) 11:93 (98%)	view
44	x 2	benchmarks/rnnt/ootb/inference/pytorch/parts benchmarks/rnnt/ootb/train/common	segment.py audio.py	107:170 (44%) 118:184 (36%)	view
37	x 2	benchmarks/rnnt/ootb/inference/pytorch/parts benchmarks/rnnt/ootb/train/common	segment.py audio.py	34:81 (37%) 68:116 (30%)	view
31	x 2	benchmarks/dlrm/ootb benchmarks/dlrm/ootb	dlrm_data_caffe2.py dlrm_data_pytorch.py	772:823 (5%) 1239:1289 (3%)	view
30	x 2	benchmarks/rnnt/ootb/inference/pytorch/utils benchmarks/rnnt/ootb/train/utils	download_librispeech.py download_librispeech.py	25:60 (65%) 24:59 (71%)	view
24	x 2	benchmarks/rnnt/ootb/inference/pytorch/utils benchmarks/rnnt/ootb/train/utils	preprocessing_utils.py preprocessing_utils.py	48:77 (57%) 47:76 (58%)	view
24	x 2	benchmarks/dlrm/ootb benchmarks/dlrm/ootb	dlrm_data_caffe2.py dlrm_data_pytorch.py	639:670 (4%) 1106:1137 (2%)	view
23	x 2	benchmarks/rnnt/ootb/inference/pytorch/utils benchmarks/rnnt/ootb/train/utils	download_utils.py download_utils.py	24:52 (62%) 23:51 (62%)	view
22	x 2	benchmarks/dlrm/ootb benchmarks/dlrm/ootb	dlrm_s_caffe2.py dlrm_s_pytorch.py	1167:1189 (1%) 1384:1406 (1%)	view
21	x 2	benchmarks/dlrm/ootb benchmarks/dlrm/ootb	dlrm_data_pytorch.py dlrm_data_pytorch.py	725:745 (2%) 748:768 (2%)	view
21	x 2	benchmarks/rnnt/ootb/inference/pytorch/utils benchmarks/rnnt/ootb/train/utils	convert_librispeech.py convert_librispeech.py	35:58 (44%) 35:57 (44%)	view
20	x 2	benchmarks/dlrm/ootb benchmarks/dlrm/ootb	dlrm_s_caffe2.py dlrm_s_caffe2.py	834:857 (1%) 1003:1026 (1%)	view
19	x 2	benchmarks/dlrm/ootb benchmarks/dlrm/ootb	dlrm_data_caffe2.py dlrm_data_pytorch.py	598:624 (3%) 1063:1089 (1%)	view
19	x 2	benchmarks/dlrm/ootb benchmarks/dlrm/ootb	dlrm_s_caffe2.py dlrm_s_caffe2.py	754:783 (1%) 863:892 (1%)	view
18	x 2	benchmarks/dlrm/ootb benchmarks/dlrm/ootb	data_utils.py data_utils.py	251:269 (1%) 467:485 (1%)	view
17	x 2	benchmarks/dlrm/ootb benchmarks/dlrm/ootb	dlrm_data_caffe2.py dlrm_data_pytorch.py	700:729 (3%) 1167:1196 (1%)	view
17	x 2	benchmarks/dlrm/ootb benchmarks/dlrm/ootb	dlrm_data_caffe2.py dlrm_data_caffe2.py	620:637 (3%) 647:664 (3%)	view
17	x 2	benchmarks/rnnt/ootb/inference/pytorch/utils benchmarks/rnnt/ootb/train/utils	convert_librispeech.py convert_librispeech.py	60:82 (36%) 59:81 (36%)	view

Duplicated Units

The list of top 20 duplicated units, out of 27 unit duplicates in total.

Size	#	Folders	Files	Lines	Code
30	x 2	benchmarks/rnnt/ootb/train/common benchmarks/rnnt/ootb/inference/pytorch/parts	audio.py segment.py	0:0 0:0	view
19	x 2	benchmarks/rnnt/ootb/train/utils benchmarks/rnnt/ootb/inference/pytorch/utils	download_utils.py download_utils.py	0:0 0:0	view
18	x 2	benchmarks/rnnt/ootb/train/common/text benchmarks/rnnt/ootb/inference/pytorch/parts/text	numbers.py numbers.py	0:0 0:0	view
17	x 2	benchmarks/rnnt/ootb/train/common/text benchmarks/rnnt/ootb/inference/pytorch/parts/text	numbers.py numbers.py	0:0 0:0	view
20	x 2	benchmarks/rnnt/ootb/train/rnnt benchmarks/rnnt/ootb/inference/pytorch	model.py model_separable_rnnt.py	0:0 0:0	view
11	x 2	benchmarks/rnnt/ootb/inference/loadgen benchmarks/rnnt/ootb/inference/loadgen	logging.h logging.h	134:145 148:159	view
11	x 2	benchmarks/rnnt/ootb/inference/loadgen benchmarks/rnnt/ootb/inference/loadgen	logging.cc logging.cc	156:167 169:180	view
13	x 2	benchmarks/dlrm/ootb benchmarks/dlrm/ootb	dlrm_data_caffe2.py dlrm_data_pytorch.py	0:0 0:0	view
10	x 2	benchmarks/rnnt/ootb/train/utils benchmarks/rnnt/ootb/inference/pytorch/utils	download_utils.py download_utils.py	0:0 0:0	view
10	x 2	benchmarks/rnnt/ootb/train/common benchmarks/rnnt/ootb/inference/pytorch/parts	audio.py segment.py	0:0 0:0	view
13	x 2	benchmarks/rnnt/ootb/train/common benchmarks/rnnt/ootb/inference/pytorch/parts	audio.py segment.py	0:0 0:0	view
10	x 2	benchmarks/dlrm/ootb benchmarks/dlrm/ootb	dlrm_data_caffe2.py dlrm_data_pytorch.py	0:0 0:0	view
8	x 2	benchmarks/rnnt/ootb/train/common/text benchmarks/rnnt/ootb/inference/pytorch/parts/text	numbers.py numbers.py	0:0 0:0	view
8	x 2	benchmarks/dlrm/ootb benchmarks/dlrm/ootb	dlrm_data_caffe2.py dlrm_data_pytorch.py	0:0 0:0	view
10	x 2	fb5logging bmlogging	fb5logger.py bmlogger.py	0:0 0:0	view
11	x 2	benchmarks/rnnt/ootb/train/common/data benchmarks/rnnt/ootb/train/common/data/dali	dataset.py iterator.py	0:0 0:0	view
7	x 2	benchmarks/dlrm/ootb benchmarks/dlrm/ootb	dlrm_data_caffe2.py dlrm_data_pytorch.py	0:0 0:0	view
7	x 2	benchmarks/dlrm/ootb benchmarks/dlrm/ootb	dlrm_data_caffe2.py dlrm_data_pytorch.py	0:0 0:0	view
7	x 2	benchmarks/dlrm/ootb benchmarks/dlrm/ootb	extend_distributed.py extend_distributed.py	0:0 0:0	view
9	x 2	fb5logging bmlogging	fb5logger.py bmlogger.py	0:0 0:0	view