awslabs / ml-io
Duplication

Places in code with 6 or more lines that are exactly the same.

Intro
  • For duplication, we look at places in code where there are 6 or more lines of code that are exactly the same.
  • Before duplication is calculated, the code is cleaned to remove empty lines, comments, and frequently duplicated constructs such as imports.
  • Aim to keep duplicated code as low as possible (below 5%), because a high level of duplication can lead to maintenance difficulties, poor factoring, and logical contradictions.
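The cleaning and matching steps described above can be sketched as follows. This is a minimal illustration, not the analysis tool's actual implementation: the function names `clean` and `find_duplicate_windows`, the comment prefix (`//`), and the import prefixes (`#include`, `import `) are assumptions chosen for the example. It reports every matching 6-line window rather than merging overlapping windows into maximal blocks.

```python
from collections import defaultdict

MIN_BLOCK = 6  # threshold stated in the report: 6 or more identical lines


def clean(lines, comment_prefixes=("//",), import_prefixes=("#include", "import ")):
    """Drop empty lines, single-line comments, and import-like lines.

    Returns (original_line_number, stripped_text) pairs so that matches can
    still be reported against the original file. The prefixes are
    illustrative assumptions, not the tool's real rules.
    """
    kept = []
    for number, raw in enumerate(lines, start=1):
        text = raw.strip()
        if not text:
            continue
        if text.startswith(comment_prefixes) or text.startswith(import_prefixes):
            continue
        kept.append((number, text))
    return kept


def find_duplicate_windows(files, min_block=MIN_BLOCK):
    """Group identical windows of `min_block` cleaned lines.

    `files` maps a file name to its list of raw lines. Each returned group is
    a list of (file_name, first_line, last_line) locations sharing the same
    cleaned content.
    """
    windows = defaultdict(list)
    for name, lines in files.items():
        cleaned = clean(lines)
        for i in range(len(cleaned) - min_block + 1):
            chunk = cleaned[i:i + min_block]
            key = tuple(text for _, text in chunk)
            windows[key].append((name, chunk[0][0], chunk[-1][0]))
    return [locations for locations in windows.values() if len(locations) > 1]
```

Because matching happens on cleaned lines, a reported range in the original file can span more raw lines than the duplicate's size whenever blank lines or comments fall inside it.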
Duplication Overall
  • 4% duplication:
    • 10,982 lines of cleaned code (without empty lines, comments, and frequently duplicated constructs such as imports)
    • 481 duplicated lines
  • 40 duplicates
system: 4% (481 lines)
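The overall figure follows directly from the two line counts above. A small sketch of the arithmetic (the helper name `duplication_percentage` is hypothetical, and the assumption is that the report rounds the ratio to a whole percent):

```python
def duplication_percentage(duplicated_lines, cleaned_lines):
    """Share of cleaned lines that are part of at least one duplicate, in percent."""
    if cleaned_lines == 0:
        return 0.0
    return 100.0 * duplicated_lines / cleaned_lines


# Figures taken from the report: 481 duplicated lines out of 10,982 cleaned lines.
overall = duplication_percentage(481, 10_982)
print(f"{overall:.2f}%")  # roughly 4.38%, shown as 4% in the report
```

This puts the codebase under the 5% target mentioned in the intro.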
Duplication per Extension
cc: 4% (235 lines)
h: 4% (178 lines)
c: 6% (38 lines)
proto: 28% (30 lines)
Duplication per Component (primary)
src/mlio-py/mlio: 4% (186 lines)
src/mlio: 7% (159 lines)
src/mlio/record_readers: 5% (30 lines)
src/mlio/detail: 14% (30 lines)
include/mlio/memory: 8% (20 lines)
src/mlio/instance_readers: 4% (20 lines)
include/mlio: <1% (12 lines)
include/mlio/data_stores: 8% (12 lines)
src/mlio/streams: 2% (12 lines)
cmake: 0% (0 lines)
src/mlio-py: 0% (0 lines)
src: 0% (0 lines)
src/mlio/util: 0% (0 lines)
src/mlio/data_stores: 0% (0 lines)
src/mlio/integ: 0% (0 lines)
src/mlio/memory: 0% (0 lines)
packaging/conda/recipe: 0% (0 lines)
ROOT: 0% (0 lines)
include/mlio/record_readers: 0% (0 lines)
include/mlio/detail: 0% (0 lines)
include/mlio/util: 0% (0 lines)
include/mlio/integ: 0% (0 lines)
include/mlio/streams: 0% (0 lines)
include: 0% (0 lines)
Longest Duplicates
The list of 20 longest duplicates.
Size | # | Folders | Files | Lines
16 | 2 | src/mlio-py/mlio/contrib/insights/hll; src/mlio-py/mlio/contrib/insights/hll | xxh3.h; xxh3.h | 691:706 (1%); 867:882 (1%)
12 | 2 | src/mlio/record_readers; src/mlio/record_readers | csv_record_reader.cc; csv_record_reader.cc | 95:112 (8%); 152:169 (8%)
12 | 2 | src/mlio; src/mlio/instance_readers | instance.cc; core_instance_reader.cc | 132:145 (18%); 87:100 (10%)
11 | 2 | src/mlio; src/mlio | csv_reader.cc; recordio_protobuf_reader.cc | 594:615 (2%); 391:412 (2%)
10 | 2 | src/mlio/detail/protobuf; src/mlio/detail/protobuf; proto | recordio_protobuf.proto; recordio_protobuf.proto | 14:24 (9%); 32:42 (9%)
10 | 2 | src/mlio/detail/protobuf; src/mlio/detail/protobuf; proto | recordio_protobuf.proto; recordio_protobuf.proto | 14:24 (9%); 50:60 (9%)
10 | 2 | src/mlio/detail/protobuf; src/mlio/detail/protobuf; proto | recordio_protobuf.proto; recordio_protobuf.proto | 32:42 (9%); 50:60 (9%)
10 | 2 | src/mlio; src/mlio | csv_reader.cc; recordio_protobuf_reader.cc | 448:461 (2%); 299:312 (2%)
10 | 2 | include/mlio/memory; include/mlio/memory | file_backed_memory_block.h; heap_memory_block.h | 46:68 (33%); 42:64 (43%)
10 | 2 | src/mlio; src/mlio | csv_reader.cc; recordio_protobuf_reader.cc | 541:560 (2%); 351:370 (2%)
9 | 2 | src/mlio-py/mlio/contrib/insights/hll; src/mlio-py/mlio/contrib/insights/hll | xxh3.h; xxh3.h | 445:455 (<1%); 1452:1462 (<1%)
9 | 2 | src/mlio; src/mlio | image_reader.cc; image_reader.cc | 317:333 (4%); 373:389 (4%)
8 | 2 | src/mlio; src/mlio | recordio_protobuf_reader.cc; recordio_protobuf_reader.cc | 609:622 (1%); 836:849 (1%)
8 | 2 | src/mlio; src/mlio/instance_readers | instance.cc; core_instance_reader.cc | 43:54 (12%); 215:226 (7%)
8 | 2 | src/mlio; src/mlio | csv_reader.cc; csv_reader.cc | 260:270 (2%); 336:346 (2%)
8 | 2 | src/mlio; src/mlio | image_reader.cc; image_reader.cc | 216:229 (3%); 373:386 (3%)
8 | 2 | src/mlio; src/mlio | image_reader.cc; image_reader.cc | 216:229 (3%); 317:330 (3%)
8 | 2 | src/mlio-py/mlio/contrib/insights/hll; src/mlio-py/mlio/contrib/insights/hll | xxh3.h; xxh3.h | 483:492 (<1%); 1504:1513 (<1%)
8 | 2 | src/mlio-py/mlio/contrib/insights/hll; src/mlio-py/mlio/contrib/insights/hll | xxh3.h; xxh3.h | 465:474 (<1%); 1477:1486 (<1%)
7 | 2 | src/mlio-py/mlio/core; src/mlio-py/mlio/core | data_reader.cc; data_reader.cc | 659:665 (1%); 669:675 (1%)
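When reading the Lines column, note that a range can cover more raw lines than the Size value. A plausible explanation, inferred from the intro rather than stated by the report, is that Size counts cleaned lines while the ranges use original file line numbers. The helper `raw_span` below is a hypothetical name used only to make the comparison concrete:

```python
def raw_span(line_range):
    """Number of raw file lines covered by an inclusive 'start:end' range."""
    start, end = (int(part) for part in line_range.split(":"))
    return end - start + 1


# xxh3.h, listed with size 16: the range covers exactly 16 raw lines.
print(raw_span("691:706"))

# csv_record_reader.cc, listed with size 12: the range covers 18 raw lines,
# which would leave 6 lines removed by cleaning (blank lines or comments).
print(raw_span("95:112"))
```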
Duplicated Units
The top duplicated unit (1 in total).
Size | # | Folders | Files | Lines
8 | 2 | src/mlio-py/mlio/core; src/mlio-py/mlio/core | stream.cc; stream.cc | 113:122; 124:133