Intro

Duplication is measured by finding places in the code where 6 or more lines are exactly the same.
Before duplication is calculated, the code is cleaned to remove empty lines, comments, and frequently duplicated constructs such as imports.
Aim to keep duplicated code as low as possible (below 5%), because a high level of duplication can lead to maintenance difficulties, poor factoring, and logical contradictions.
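
The measurement described above can be illustrated with a minimal sketch. It is not the report tool's actual implementation: the function names, the simplified cleaning rules (only `#` comments and `import`/`from` lines are dropped), and the choice to strip indentation before comparing lines are all assumptions made for illustration.

```python
from collections import defaultdict

MIN_BLOCK = 6  # threshold stated in the report: 6 or more identical lines


def clean_lines(source: str) -> list[str]:
    """Drop blank lines, '#' comments, and import statements (simplified, assumed rules)."""
    cleaned = []
    for raw in source.splitlines():
        line = raw.strip()  # assumption: indentation is ignored when comparing
        if not line or line.startswith("#"):
            continue
        if line.startswith("import ") or line.startswith("from "):
            continue
        cleaned.append(line)
    return cleaned


def duplicated_line_indexes(files: dict[str, str], min_block: int = MIN_BLOCK) -> dict[str, set[int]]:
    """For each file, return indexes of cleaned lines covered by a repeated window."""
    cleaned = {name: clean_lines(text) for name, text in files.items()}

    # Index every window of `min_block` consecutive cleaned lines.
    windows = defaultdict(list)
    for name, lines in cleaned.items():
        for start in range(len(lines) - min_block + 1):
            windows[tuple(lines[start:start + min_block])].append((name, start))

    # A window seen more than once marks all of its lines as duplicated.
    duplicated = {name: set() for name in files}
    for occurrences in windows.values():
        if len(occurrences) > 1:
            for name, start in occurrences:
                duplicated[name].update(range(start, start + min_block))
    return duplicated


def duplication_percentage(files: dict[str, str]) -> float:
    """Duplicated cleaned lines as a percentage of all cleaned lines."""
    total = sum(len(clean_lines(text)) for text in files.values())
    duplicated = sum(len(idx) for idx in duplicated_line_indexes(files).values())
    return 100.0 * duplicated / total if total else 0.0
```

Calling `duplication_percentage` with a mapping of file names to source text returns a figure comparable in spirit to the percentage reported below, although a real tool may clean and match lines differently.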

Duplication Overall

24% duplication:

48,395 lines of cleaned code (without empty lines, comments, and frequently duplicated constructs such as imports)
11,664 duplicated lines

896 duplicates
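
The headline percentage follows directly from the two line counts above. The short calculation below reproduces it and, as an illustrative assumption, estimates how many duplicated lines would correspond to the 5% target if the number of cleaned lines stayed fixed (in practice, removing duplicates also shrinks the total).

```python
cleaned_lines = 48_395      # from the report
duplicated_lines = 11_664   # from the report

# Share of cleaned lines that are duplicated.
ratio = duplicated_lines / cleaned_lines
print(f"{ratio:.1%}")  # 24.1%, shown as 24% in the report

# Largest duplicated-line count that stays at or below 5%,
# assuming the cleaned-line total does not change.
target_lines = cleaned_lines * 5 // 100
print(target_lines)  # 2419
```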

Duplication per Extension

Duplication per Component (primary)

Duplication Between Components (50+ lines)

From Component --> To Component	Duplicated Lines	File Pairs
archive (5%) --> research (7%)	1352	8 file pairs
archive (<1%) --> data_management (<1%)	229	8 file pairs
api (1%) --> detection (6%)	176	1 file pair
data_management (<1%) --> research (<1%)	106	1 file pair
archive (<1%) --> detection (3%)	84	1 file pair
api (<1%) --> visualization (2%)	53	2 file pairs

Longest Duplicates

The 20 longest duplicates, out of 896 in total.

Size	#	Folders	Files	Lines	Code
245	x 2	research/active_learning/active_learning_methods research/active_learning/sampling_methods	hierarchical_clustering_AL.py hierarchical_clustering_AL.py	33:362 (100%) 33:362 (100%)	view
229	x 2	archive research/active_learning/labeling_tool	runapp.py runapp.py	1:336 (53%) 1:336 (53%)	view
190	x 2	archive research/active_learning/labeling_tool	runapp.py runapp.py	338:559 (44%) 340:561 (44%)	view
120	x 2	research/active_learning/active_learning_methods research/active_learning/sampling_methods	simulate_batch.py simulate_batch.py	30:261 (100%) 30:261 (100%)	view
92	x 2	research/active_learning/active_learning_methods research/active_learning/sampling_methods	constants.py constants.py	19:131 (100%) 19:131 (100%)	view
89	x 2	research/active_learning..._learning_methods/utils research/active_learning/sampling_methods/utils	tree.py tree.py	32:157 (100%) 32:157 (100%)	view
88	x 2	api/batch_processing/api_core/batch_service detection	score.py run_tf_detector.py	117:276 (37%) 117:283 (33%)	view
57	x 2	research/active_learning/active_learning_methods research/active_learning/sampling_methods	mixture_of_samplers.py mixture_of_samplers.py	28:110 (100%) 28:110 (100%)	view
53	x 2	research/active_learning/archive research/active_learning/archive	good_run.py run_bk.py	154:214 (27%) 133:193 (33%)	view
50	x 2	research/active_learning/active_learning_methods research/active_learning/sampling_methods	bandit_discrete.py bandit_discrete.py	34:124 (100%) 34:124 (100%)	view
50	x 2	research/active_learning/DL research/active_learning/deep_learning	utils.py utils.py	307:391 (16%) 191:275 (23%)	view
46	x 2	data_management/lila data_management/lila	create_lila_test_set.py download_lila_subset.py	37:120 (38%) 56:139 (27%)	view
46	x 2	research/active_learning/active_learning_methods research/active_learning/sampling_methods	kcenter_greedy.py kcenter_greedy.py	38:122 (100%) 38:122 (100%)	view
43	x 2	data_management/importers data_management/importers	save_the_elephants_survey_A.py save_the_elephants_survey_B.py	196:258 (24%) 208:270 (22%)	view
43	x 2	research/active_learning/active_learning_methods research/active_learning/sampling_methods	informative_diverse.py informative_diverse.py	33:101 (100%) 33:101 (100%)	view
40	x 2	research/active_learning/active_learning_methods research/active_learning/sampling_methods	graph_density.py graph_density.py	34:92 (100%) 34:92 (100%)	view
39	x 2	classification classification	prepare_classification_script.py prepare_classification_script_mc.py	91:146 (33%) 89:144 (28%)	view
39	x 2	research/active_learning/archive research/active_learning	filebased_main.py main.py	26:67 (29%) 27:68 (26%)	view
38	x 2	archive/classification_marcel/tf-slim/nets/nasnet archive/classification_marcel/tf-slim/nets/nasnet	nasnet.py nasnet.py	349:393 (10%) 404:448 (10%)	view
38	x 2	research/active_learning/DL research/active_learning/deep_learning	utils.py utils.py	15:60 (12%) 14:57 (18%)	view

Duplicated Units

The top 20 duplicated units, out of 85 unit duplicates in total.

Size	#	Folders	Files	Lines	Code
64	x 2	archive research/active_learning/labeling_tool	runapp.py runapp.py	0:0 0:0	view
92	x 2	research/active_learning/active_learning_methods research/active_learning/sampling_methods	simulate_batch.py simulate_batch.py	0:0 0:0	view
44	x 2	archive research/active_learning/labeling_tool	runapp.py runapp.py	0:0 0:0	view
61	x 2	research/active_learning/active_learning_methods research/active_learning/sampling_methods	hierarchical_clustering_AL.py hierarchical_clustering_AL.py	0:0 0:0	view
36	x 2	archive research/active_learning/labeling_tool	runapp.py runapp.py	0:0 0:0	view
44	x 2	research/active_learning/active_learning_methods research/active_learning/sampling_methods	hierarchical_clustering_AL.py hierarchical_clustering_AL.py	0:0 0:0	view
34	x 2	archive research/active_learning/labeling_tool	runapp.py runapp.py	0:0 0:0	view
44	x 9	archive/classification_marcel/tf-slim/datasets archive/classification_marcel/tf-slim/datasets archive/classification_marcel/tf-slim/datasets archive/classification_marcel/tf-slim/datasets archive/classification_marcel/tf-slim/datasets archive/classification_marcel/tf-slim/datasets archive/classification_marcel/tf-slim/datasets archive/classification_marcel/tf-slim/datasets archive/classification_marcel/tf-slim/datasets	cct.py idfg.py nacti.py obscured.py obscured_large.py rspb.py serengeti.py wellington.py wiitigers.py	0:0 0:0 0:0 0:0 0:0 0:0 0:0 0:0 0:0	view
31	x 2	research/active_learning/active_learning_methods research/active_learning/sampling_methods	hierarchical_clustering_AL.py hierarchical_clustering_AL.py	0:0 0:0	view
31	x 2	research/active_learning/active_learning_methods research/active_learning/sampling_methods	represent_cluster_centers.py represent_cluster_centers.py	0:0 0:0	view
40	x 2	api/batch_processing/api_core/batch_service detection	score.py run_tf_detector.py	0:0 0:0	view
25	x 2	archive research/active_learning/labeling_tool	runapp.py runapp.py	0:0 0:0	view
33	x 2	research/active_learning/active_learning_methods research/active_learning/sampling_methods	informative_diverse.py informative_diverse.py	0:0 0:0	view
37	x 2	research/active_learning/active_learning_methods research/active_learning/sampling_methods	bandit_discrete.py bandit_discrete.py	0:0 0:0	view
29	x 2	research/active_learning/active_learning_methods research/active_learning/sampling_methods	mixture_of_samplers.py mixture_of_samplers.py	0:0 0:0	view
40	x 2	research/active_learning/active_learning_methods research/active_learning/sampling_methods	simulate_batch.py simulate_batch.py	0:0 0:0	view
37	x 2	research/active_learning/active_learning_methods research/active_learning/sampling_methods	kcenter_greedy.py kcenter_greedy.py	0:0 0:0	view
33	x 2	research/active_learning/active_learning_methods research/active_learning/sampling_methods	simulate_batch.py simulate_batch.py	0:0 0:0	view
21	x 2	data_management/lila data_management/lila	create_lila_test_set.py download_lila_subset.py	0:0 0:0	view
24	x 2	research/active_learning/active_learning_methods research/active_learning/sampling_methods	mixture_of_samplers.py mixture_of_samplers.py	0:0 0:0	view