uber / gluten-fork
Source Code Overview

Analysis scope, overview of main, test, generated, deployment, build, and other code.

Source Code Analysis Scope
Files included in and excluded from the analyses
File extensions: txt, patch, proto, orc, properties, clang-format, in, clang-tidy
Overview of Analyzed Files
Basic stats on analyzed files
Intro
For analysis purposes we separate files in scope into several categories: main, test, generated, deployment and build, and other.

  • The main category contains all manually created source code files that are used in production.
  • Files in the main category are used as input for other analyses: logical decomposition, concerns, duplication, file size, unit size, and conditional complexity.
  • Test source code files are used only for testing of the product. These files are normally not deployed to production.
  • Build and deployment source code files are used to configure or support the build and deployment process.
  • Generated source code files are automatically generated files that have not been manually changed after generation.
  • While a source code folder may contain a number of files, we are primarily interested in the source code files that are being written and maintained by developers.
  • Files containing binaries, documentation, or third-party libraries, for instance, are excluded from analysis. The exceptions are third-party libraries that have been changed by developers.

main: 124,932 LOC (53%), 1,291 files
test: 82,415 LOC (35%), 1,056 files
generated: 0 LOC (0%), 0 files
build and deployment: 6,747 LOC (2%), 80 files
other: 19,732 LOC (8%), 365 files
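
The percentages in this summary can be reproduced from the LOC counts alone. The sketch below is a minimal illustration, not the report generator's code. It assumes each share is the category's LOC divided by the total LOC across all five categories, truncated to a whole number; truncation is an assumption inferred from the build and deployment figure, which would round to 3% but is reported as 2%.

```python
# LOC per category, copied from the summary above.
loc_by_category = {
    "main": 124_932,
    "test": 82_415,
    "generated": 0,
    "build and deployment": 6_747,
    "other": 19_732,
}

total_loc = sum(loc_by_category.values())


def share_percent(loc: int, total: int) -> int:
    """Whole-number share of the total.

    Assumption: the report truncates instead of rounding.
    """
    return loc * 100 // total


shares = {name: share_percent(loc, total_loc) for name, loc in loc_by_category.items()}

for name, pct in shares.items():
    print(f"{name}: {loc_by_category[name]:,} LOC ({pct}%)")
```

With these inputs the total is 233,826 LOC, and the computed shares match the reported 53%, 35%, 0%, 2%, and 8%.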
Main Code
All manually created or maintained source code that defines the logic of the product and runs in a production environment.
  • The following criteria are used to filter files:
    • files with paths like ".*".
  • 1291 files match defined criteria (124,932 LOC, 100.0% vs. main code):
    • 426 *.scala files (43,817 LOC)
    • 151 *.cpp files (21,948 LOC)
    • 80 *.cc files (21,198 LOC)
    • 217 *.h files (12,454 LOC)
    • 215 *.java files (12,383 LOC)
    • 125 *.sql files (5,198 LOC)
    • 22 *.proto files (3,915 LOC)
    • 35 *.cmake files (1,433 LOC)
    • 4 *.orc files (1,237 LOC)
    • 10 *.yaml files (652 LOC)
    • 4 *.py files (413 LOC)
    • 2 *.xml files (284 LOC)
  • "*.scala" is the largest group, containing 35.07% of LOC.
  • "*.xml" is the smallest group, containing 0.23% of LOC.


*.scala: 43,817 LOC (35%), 426 files
*.cpp: 21,948 LOC (17%), 151 files
*.cc: 21,198 LOC (16%), 80 files
*.h: 12,454 LOC (9%), 217 files
*.java: 12,383 LOC (9%), 215 files
*.sql: 5,198 LOC (4%), 125 files
*.proto: 3,915 LOC (3%), 22 files
*.cmake: 1,433 LOC (1%), 35 files
*.orc: 1,237 LOC (<1%), 4 files
*.yaml: 652 LOC (<1%), 10 files
*.py: 413 LOC (<1%), 4 files
*.xml: 284 LOC (<1%), 2 files
Test Code
Used only for testing of the product. Normally not deployed in a production environment.
  • The following criteria are used to filter files:
    • files with paths like ".*/[Tt]est/.*".
    • files with paths like ".*[-]tests/.*".
    • files with paths like ".*/test[-]data/.*".
    • files with paths like ".*/[Tt]ests/.*".
  • 1056 files match defined criteria (82,415 LOC, 66.0% vs. main code):
    • 701 *.scala files (64,107 LOC)
    • 298 *.sql files (12,033 LOC)
    • 17 *.cpp files (2,711 LOC)
    • 20 *.cc files (2,429 LOC)
    • 5 *.h files (536 LOC)
    • 10 *.java files (321 LOC)
    • 4 *.orc files (274 LOC)
    • 1 *.in file (4 LOC)
  • "*.scala" is the largest group, containing 77.79% of LOC.
  • "*.in" is the smallest group, containing less than 0.01% of LOC.


*.scala: 64,107 LOC (77%), 701 files
*.sql: 12,033 LOC (14%), 298 files
*.cpp: 2,711 LOC (3%), 17 files
*.cc: 2,429 LOC (2%), 20 files
*.h: 536 LOC (<1%), 5 files
*.java: 321 LOC (<1%), 10 files
*.orc: 274 LOC (<1%), 4 files
*.in: 4 LOC (<1%), 1 file
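
To make the test-code path criteria concrete, the sketch below applies the four listed patterns to a few paths. It is an illustration only: the example paths are hypothetical and not taken from the repository, and it assumes a pattern must match the entire path, which the report does not state explicitly.

```python
import re

# Path patterns copied from the test-code criteria above.
TEST_PATTERNS = [
    r".*/[Tt]est/.*",
    r".*[-]tests/.*",
    r".*/test[-]data/.*",
    r".*/[Tt]ests/.*",
]
_COMPILED = [re.compile(pattern) for pattern in TEST_PATTERNS]


def is_test_path(path: str) -> bool:
    """True if any test pattern matches the whole path (full match is an assumption)."""
    return any(rx.fullmatch(path) for rx in _COMPILED)


# Hypothetical paths, chosen only to exercise each pattern.
example_paths = [
    "module/src/test/scala/FooSuite.scala",  # contains "/test/"
    "cpp/unit-tests/bar.cc",                 # contains "-tests/"
    "cpp/core/tests/baz.cpp",                # contains "/tests/"
    "module/src/main/scala/Foo.scala",       # no test directory
    "test/Foo.scala",                        # no "/" before "test"
]

for path in example_paths:
    print(f"{path}: {is_test_path(path)}")
```

Under the full-match assumption, a top-level `test/` directory does not qualify, because every pattern that names a test directory requires a preceding `/` or `-`.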
Build and Deployment Code
Source code used to configure or support the build and deployment process.
  • The following criteria are used to filter files:
    • files with paths like ".*/pom[.]xml".
    • files with paths like ".*[.]sh".
    • files with paths like ".*[.]git[a-z]+".
    • files with paths like ".*/[.]gitignore".
    • files with paths like ".*/assembly[.]xml".
  • 80 files match defined criteria (6,747 LOC, 5.4% vs. main code):
    • 29 *.xml files (4,787 LOC)
    • 51 *.sh files (1,960 LOC)
  • "*.xml" is the largest group, containing 70.95% of LOC.
  • "*.sh" is the smallest group, containing 29.05% of LOC.


*.xml: 4,787 LOC (70%), 29 files
*.sh: 1,960 LOC (29%), 51 files
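
The "vs. main code" figure quoted for each category appears to be that category's LOC expressed as a percentage of main-code LOC rather than of the total. The sketch below checks this reading against the reported values. It is a minimal illustration under that assumption, with rounding to one decimal place also assumed.

```python
# LOC figures copied from the report.
MAIN_LOC = 124_932
category_loc = {
    "test": 82_415,
    "build and deployment": 6_747,
    "other": 19_732,
}


def percent_of_main(loc: int) -> float:
    """Category LOC as a percentage of main-code LOC, rounded to one decimal (assumed)."""
    return round(loc / MAIN_LOC * 100, 1)


ratios = {name: percent_of_main(loc) for name, loc in category_loc.items()}

for name, ratio in ratios.items():
    print(f"{name}: {ratio}% vs. main code")
```

The results agree with the 66.0%, 5.4%, and 15.8% stated in the test, build and deployment, and other sections.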
Other Code
  • The following criteria are used to filter files:
    • files with paths like ".*[.]properties".
    • files with paths like ".*[.]txt".
    • files with paths like ".*[.]md".
    • files with paths like ".*/README[.][a-z0-9]+".
    • files with paths like ".*/[Ee]xamples/.*".
    • files with paths like ".*[.]json".
    • files with paths like ".*/[.]gitignore".
    • files with paths like ".*[.]patch".
    • files with paths like ".*/checkstyle.*".
    • files with paths like ".*/[.]dockerignore".
    • files with paths like ".*/checkstyle[.]xml".
  • 365 files match defined criteria (19,732 LOC, 15.8% vs. main code):
    • 61 *.json files (10,061 LOC)
    • 251 *.txt files (7,517 LOC)
    • 30 *.patch files (1,040 LOC)
    • 9 *.md files (607 LOC)
    • 6 *.properties files (222 LOC)
    • 2 *.xml files (147 LOC)
    • 2 *.cpp files (83 LOC)
    • 3 *.sh files (35 LOC)
    • 1 *.cc file (20 LOC)
  • "*.json" is the largest group, containing 50.99% of LOC.
  • "*.cc" is the smallest group, containing 0.1% of LOC.


*.json: 10,061 LOC (50%), 61 files
*.txt: 7,517 LOC (38%), 251 files
*.patch: 1,040 LOC (5%), 30 files
*.md: 607 LOC (3%), 9 files
*.properties: 222 LOC (1%), 6 files
*.xml: 147 LOC (<1%), 2 files
*.cpp: 83 LOC (<1%), 2 files
*.sh: 35 LOC (<1%), 3 files
*.cc: 20 LOC (<1%), 1 file
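
Some path patterns appear under more than one category (for example, the `.gitignore` pattern is listed for both build and deployment code and other code). The sketch below reports every category whose criteria match a path. It is an illustration only: the example paths are hypothetical, whole-path matching is assumed, and the report does not say how the tool resolves a path that matches several categories, so no precedence is modeled here.

```python
import re

# Patterns copied from the criteria lists in this report.
CRITERIA = {
    "test": [
        r".*/[Tt]est/.*",
        r".*[-]tests/.*",
        r".*/test[-]data/.*",
        r".*/[Tt]ests/.*",
    ],
    "build and deployment": [
        r".*/pom[.]xml",
        r".*[.]sh",
        r".*[.]git[a-z]+",
        r".*/[.]gitignore",
        r".*/assembly[.]xml",
    ],
    "other": [
        r".*[.]properties",
        r".*[.]txt",
        r".*[.]md",
        r".*/README[.][a-z0-9]+",
        r".*/[Ee]xamples/.*",
        r".*[.]json",
        r".*/[.]gitignore",
        r".*[.]patch",
        r".*/checkstyle.*",
        r".*/[.]dockerignore",
        r".*/checkstyle[.]xml",
    ],
}


def matching_categories(path: str) -> list[str]:
    """Every category with at least one pattern matching the whole path."""
    return [
        category
        for category, patterns in CRITERIA.items()
        if any(re.fullmatch(pattern, path) for pattern in patterns)
    ]


# Hypothetical paths, not taken from the repository.
for path in [
    "docs/.gitignore",
    "tools/examples/run.sh",
    "dev/checkstyle.xml",
    "core/src/main/scala/Foo.scala",
]:
    print(f"{path}: {matching_categories(path)}")
```

A path that matches none of these lists falls through to the main category, whose only criterion is ".*".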
Analyzers
Info about analyzers used for source code examinations.


2026-04-18 13:03