apache / datasketches-python
File Size

The distribution of file sizes, measured in lines of code.

File Size Overall
File size (lines of code) | 1001+ | 501-1000 | 201-500 | 101-200 | 1-100
Share | 0% | 0% | 47% | 28% | 23%

File Size per Extension
Extension | 1001+ | 501-1000 | 201-500 | 101-200 | 1-100
ipynb | 0% | 0% | 87% | 12% | 0%
cpp | 0% | 0% | 30% | 43% | 25%
py | 0% | 0% | 0% | 35% | 64%
hpp | 0% | 0% | 0% | 0% | 100%
toml | 0% | 0% | 0% | 0% | 100%
in | 0% | 0% | 0% | 0% | 100%
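
The `py` row is consistent with shares weighted by lines of code rather than by file count: the seven `.py` files named in this report total 321 lines, of which the single 114-line file falls in the 101-200 range (about 35%). The sketch below reproduces that calculation. The function names are hypothetical, and the weighting and the rounding down are assumptions inferred from the numbers, not documented behavior of the reporting tool.

```python
# Size ranges from the report legend: (label, lower bound, upper bound).
BUCKETS = [
    ("1-100", 1, 100),
    ("101-200", 101, 200),
    ("201-500", 201, 500),
    ("501-1000", 501, 1000),
    ("1001+", 1001, None),
]


def bucket_for(loc):
    """Return the legend label for a file with `loc` lines of code."""
    for label, low, high in BUCKETS:
        if loc >= low and (high is None or loc <= high):
            return label
    raise ValueError(f"no bucket for {loc} lines of code")


def loc_share_by_bucket(sizes):
    """Percentage of all lines of code that sits in each size range."""
    totals = {label: 0 for label, _, _ in BUCKETS}
    for loc in sizes:
        totals[bucket_for(loc)] += loc
    grand_total = sum(sizes)
    if grand_total == 0:
        return {label: 0.0 for label in totals}
    return {label: 100 * t / grand_total for label, t in totals.items()}


# Line counts of the .py files named in this report.
PY_FILE_SIZES = [114, 76, 61, 35, 17, 13, 5]
shares = loc_share_by_bucket(PY_FILE_SIZES)
for label, share in shares.items():
    print(f"{label}: {int(share)}%")
```

Truncating the shares to whole percentages is why the row adds up to 99% rather than 100%.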
File Size per Logical Decomposition
primary
Component | 1001+ | 501-1000 | 201-500 | 101-200 | 1-100
jupyter | 0% | 0% | 81% | 17% | <1%
src | 0% | 0% | 30% | 43% | 25%
include | 0% | 0% | 0% | 0% | 100%
ROOT | 0% | 0% | 0% | 0% | 100%
datasketches | 0% | 0% | 0% | 0% | 100%
Longest Files (Top 39)
File (folder) | # lines | # units
- | 454 | 24
- | 403 | -
- | 354 | -
HLLSketch.ipynb (in jupyter) | 346 | -
CPCSketch.ipynb (in jupyter) | 345 | -
- | 241 | 1
api-differences.ipynb (in jupyter/comparison-to-datasketch) | 200 | -
- | 168 | 1
- | 151 | 6
- | 148 | 2
- | 147 | 2
- | 146 | 2
cardinality_error_experiment.py (in jupyter/comparison-to-datasketch) | 114 | 7
- | 112 | 3
- | 111 | 1
- | 94 | 3
- | 87 | 4
- | 84 | 2
- | 82 | 2
setup.py (in root) | 76 | 3
- | 71 | 2
- | 65 | 4
PySerDe.py (in datasketches) | 61 | 15
- | 60 | 1
- | 59 | 8
- | 57 | 1
- | 55 | 4
- | 48 | -
py_serde.hpp (in include) | 39 | 3
TuplePolicy.py (in datasketches) | 35 | 12
- | 31 | -
- | 24 | -
utils.py (in jupyter/comparison-to-datasketch) | 17 | 1
KernelFunction.py (in datasketches) | 13 | 2
- | 12 | -
- | 9 | 1
__init__.py (in datasketches) | 5 | -
- | 1 | -
- | 1 | -
Files With Most Units (Top 27)
File (folder) | # lines | # units
- | 454 | 24
PySerDe.py (in datasketches) | 61 | 15
TuplePolicy.py (in datasketches) | 35 | 12
- | 59 | 8
cardinality_error_experiment.py (in jupyter/comparison-to-datasketch) | 114 | 7
- | 151 | 6
- | 55 | 4
- | 65 | 4
- | 87 | 4
py_serde.hpp (in include) | 39 | 3
- | 94 | 3
- | 112 | 3
setup.py (in root) | 76 | 3
- | 148 | 2
- | 82 | 2
- | 71 | 2
- | 147 | 2
- | 84 | 2
- | 146 | 2
KernelFunction.py (in datasketches) | 13 | 2
- | 9 | 1
- | 57 | 1
- | 111 | 1
- | 168 | 1
- | 241 | 1
- | 60 | 1
utils.py (in jupyter/comparison-to-datasketch) | 17 | 1
Files With Long Lines (Top 21)

There are 21 files with lines longer than 120 characters. In total, there are 103 long lines.
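
As a rough illustration of this metric, the sketch below counts the lines in a piece of text that exceed 120 characters. The function name is hypothetical, and details such as how the reporting tool treats tabs or trailing whitespace are not stated in this report, so the count here is simply based on `len()` of each line.

```python
LONG_LINE_LIMIT = 120  # threshold stated in the report


def count_long_lines(text, limit=LONG_LINE_LIMIT):
    """Count lines strictly longer than `limit` characters."""
    return sum(1 for line in text.splitlines() if len(line) > limit)


# One line of 121 characters, one of exactly 120, and one short line.
sample = "a" * 121 + "\n" + "b" * 120 + "\nshort line"
print(count_long_lines(sample))
```

Only the 121-character line counts as long, because the report describes lines "longer than" 120 characters.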

File (folder) | # lines | # units | # long lines
- | 151 | 6 | 13
- | 354 | - | 10
- | 82 | 2 | 9
- | 454 | 24 | 8
- | 148 | 2 | 7
- | 241 | 1 | 6
- | 146 | 2 | 6
- | 403 | - | 6
- | 112 | 3 | 5
- | 65 | 4 | 4
- | 168 | 1 | 4
- | 71 | 2 | 3
- | 147 | 2 | 3
HLLSketch.ipynb (in jupyter) | 346 | - | 3
cardinality_error_experiment.py (in jupyter/comparison-to-datasketch) | 114 | 7 | 3
api-differences.ipynb (in jupyter/comparison-to-datasketch) | 200 | - | 3
CPCSketch.ipynb (in jupyter) | 345 | - | 3
- | 57 | 1 | 2
- | 94 | 3 | 2
- | 60 | 1 | 2
- | 111 | 1 | 1
Correlations

File Size vs. Commits (all time): 37 points

File | Commits (all time) | Lines of code
version.cfg.in | 8 | 1
pyproject.toml | 28 | 31
setup.py | 16 | 76
src/tdigest_wrapper.cpp | 6 | 84
src/datasketches.cpp | 16 | 48
src/tuple_wrapper.cpp | 15 | 241
src/density_wrapper.cpp | 17 | 94
src/theta_wrapper.cpp | 15 | 168
include/kernel_function.hpp | 12 | 55
include/tuple_policy.hpp | 11 | 59
src/count_wrapper.cpp | 10 | 82
src/cpc_wrapper.cpp | 10 | 60
src/ebpps_wrapper.cpp | 13 | 71
src/fi_wrapper.cpp | 13 | 151
src/hll_wrapper.cpp | 12 | 111
src/kll_wrapper.cpp | 14 | 146
src/quantiles_wrapper.cpp | 12 | 147
src/vector_of_kll.cpp | 16 | 454
src/vo_wrapper.cpp | 11 | 112
datasketches/KernelFunction.py | 7 | 13
datasketches/PySerDe.py | 5 | 61
datasketches/TuplePolicy.py | 2 | 35
src/ks_wrapper.cpp | 9 | 57
src/py_serde.cpp | 13 | 87
include/quantile_conditional.hpp | 9 | 65
MANIFEST.in | 10 | 24
datasketches/__init__.py | 10 | 5
include/py_object_lt.hpp | 6 | 9
include/py_object_ostream.hpp | 6 | 12
include/py_serde.hpp | 7 | 39
jupyter/comparison-to-datasketch/api-differences.ipynb | 3 | 200
jupyter/comparison-to-datasketch/cardinality_error_experiment.py | 3 | 114
jupyter/comparison-to-datasketch/utils.py | 3 | 17
jupyter/CPCSketch.ipynb | 2 | 345
jupyter/FrequentItemsSketch.ipynb | 2 | 354
jupyter/ThetaSketchNotebook.ipynb | 2 | 403

lines of code
min: 1.0 | average: 112.95 | 25th percentile: 37.0 | median: 76.0 | 75th percentile: 147.5 | max: 454.0
commits (all time)
min: 2.0 | average: 9.89 | 25th percentile: 6.0 | median: 10.0 | 75th percentile: 13.0 | max: 28.0

File Size vs. Contributors (all time): 37 points

File | Contributors (all time) | Lines of code
version.cfg.in | 1 | 1
pyproject.toml | 4 | 31
setup.py | 3 | 76
src/tdigest_wrapper.cpp | 1 | 84
src/datasketches.cpp | 2 | 48
src/tuple_wrapper.cpp | 2 | 241
src/density_wrapper.cpp | 2 | 94
src/theta_wrapper.cpp | 2 | 168
include/kernel_function.hpp | 4 | 55
include/tuple_policy.hpp | 4 | 59
src/count_wrapper.cpp | 1 | 82
src/cpc_wrapper.cpp | 1 | 60
src/ebpps_wrapper.cpp | 2 | 71
src/fi_wrapper.cpp | 2 | 151
src/hll_wrapper.cpp | 2 | 111
src/kll_wrapper.cpp | 4 | 146
src/quantiles_wrapper.cpp | 4 | 147
src/vector_of_kll.cpp | 4 | 454
datasketches/KernelFunction.py | 2 | 13
datasketches/PySerDe.py | 1 | 61
datasketches/TuplePolicy.py | 1 | 35
src/ks_wrapper.cpp | 2 | 57
src/py_serde.cpp | 4 | 87
include/quantile_conditional.hpp | 2 | 65
MANIFEST.in | 3 | 24
datasketches/__init__.py | 4 | 5
include/py_object_lt.hpp | 2 | 9
include/py_object_ostream.hpp | 2 | 12
include/py_serde.hpp | 2 | 39
jupyter/comparison-to-datasketch/api-differences.ipynb | 2 | 200
jupyter/comparison-to-datasketch/cardinality_error_experiment.py | 2 | 114
jupyter/comparison-to-datasketch/utils.py | 2 | 17
jupyter/CPCSketch.ipynb | 2 | 345
jupyter/FrequentItemsSketch.ipynb | 2 | 354
jupyter/ThetaSketchNotebook.ipynb | 2 | 403

lines of code
min: 1.0 | average: 112.95 | 25th percentile: 37.0 | median: 76.0 | 75th percentile: 147.5 | max: 454.0
contributors (all time)
min: 1.0 | average: 2.38 | 25th percentile: 2.0 | median: 2.0 | 75th percentile: 3.5 | max: 4.0

File Size vs. Commits (30 days): 0 points

No data for "commits (30d)" vs. "lines of code".

File Size vs. Contributors (30 days): 0 points

No data for "contributors (30d)" vs. "lines of code".


File Size vs. Commits (90 days): 3 points

File | Commits (90d) | Lines of code
version.cfg.in | 2 | 1
pyproject.toml | 2 | 31
setup.py | 2 | 76

lines of code
min: 1.0 | average: 36.0 | 25th percentile: 1.0 | median: 31.0 | 75th percentile: 76.0 | max: 76.0
commits (90d)
min: 2.0 | average: 2.0 | 25th percentile: 2.0 | median: 2.0 | 75th percentile: 2.0 | max: 2.0
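
The lines-of-code summary for these three points can be checked by hand. The sketch below recomputes the minimum, average, median, and maximum from the three values listed above using Python's standard `statistics` module. The percentile figures are left out because the report does not say which interpolation method it uses.

```python
from statistics import mean, median

# Lines of code for version.cfg.in, pyproject.toml, and setup.py.
lines_of_code = [1, 31, 76]

summary = {
    "min": min(lines_of_code),
    "average": mean(lines_of_code),
    "median": median(lines_of_code),
    "max": max(lines_of_code),
}
for name, value in summary.items():
    print(f"{name}: {float(value)}")
```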

File Size vs. Contributors (90 days): 3 points

File | Contributors (90d) | Lines of code
version.cfg.in | 1 | 1
pyproject.toml | 2 | 31
setup.py | 2 | 76

lines of code
min: 1.0 | average: 36.0 | 25th percentile: 1.0 | median: 31.0 | 75th percentile: 76.0 | max: 76.0
contributors (90d)
min: 1.0 | average: 1.67 | 25th percentile: 1.0 | median: 2.0 | 75th percentile: 2.0 | max: 2.0