apache / carbondata
Source Code Overview

Analysis scope and an overview of main, test, generated, build and deployment, and other code.

Source Code Analysis Scope
Files included in and excluded from the analyses
Overview of Analyzed Files
Basic stats on analyzed files
Intro
For analysis purposes we separate files in scope into several categories: main, test, generated, deployment and build, and other.

  • The main category contains all manually created source code files that are used in production.
  • Files in the main category are used as input for other analyses: logical decomposition, concerns, duplication, file size, unit size, and conditional complexity.
  • Test source code files are used only for testing of the product. These files are normally not deployed to production.
  • Build and deployment source code files are used to configure or support the build and deployment process.
  • Generated source code files are automatically generated files that have not been manually changed after generation.
  • While a source code folder may contain a number of files, we are primarily interested in the source code files that are being written and maintained by developers.
  • Files containing binaries, documentation, or third-party libraries, for instance, are excluded from analysis. The exception is third-party libraries that have been changed by developers.

main: 177,488 LOC (54%), 1,420 files
test: 122,527 LOC (37%), 557 files
generated: 12,344 LOC (3%), 31 files
build and deployment: 5,186 LOC (1%), 25 files
other: 6,858 LOC (2%), 103 files
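
The percentages in this summary appear to be each category's share of the total analyzed LOC, rounded down to a whole percent. That is an inference from the numbers, not something the report states. A minimal sketch of the arithmetic, using only the LOC figures listed above (the function name `share_percent` is made up for illustration):

```python
# LOC per category, copied from the summary above.
loc_by_category = {
    "main": 177_488,
    "test": 122_527,
    "generated": 12_344,
    "build and deployment": 5_186,
    "other": 6_858,
}

total_loc = sum(loc_by_category.values())


def share_percent(loc, total):
    """Share of `loc` in `total` as a whole percent, rounded down (assumed rounding rule)."""
    return loc * 100 // total


shares = {name: share_percent(loc, total_loc) for name, loc in loc_by_category.items()}

for name, pct in shares.items():
    print(f"{name}: {pct}%")
```

Because each share is rounded down, the displayed percentages add up to slightly less than 100%.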
Main Code
All manually created or maintained source code that defines logic of the product that is run in a production environment.
  • The following criteria are used to filter files:
    • files with paths like ".*".
  • 1,420 files match defined criteria (177,488 LOC, 100.0% vs. main code):
    • 997 *.java files (117,592 LOC)
    • 369 *.scala files (55,193 LOC)
    • 28 *.py files (2,041 LOC)
    • 7 *.cpp files (1,056 LOC)
    • 5 *.xml files (628 LOC)
    • 1 *.g4 file (554 LOC)
    • 5 *.thrift files (277 LOC)
    • 7 *.h files (145 LOC)
    • 1 *.cfg file (2 LOC)
  • "*.java" is the biggest, containing 66.25% of LOC.
  • "*.cfg" is the smallest, containing less than 0.01% of LOC.


*.java: 117,592 LOC (66%), 997 files
*.scala: 55,193 LOC (31%), 369 files
*.py: 2,041 LOC (1%), 28 files
*.cpp: 1,056 LOC (<1%), 7 files
*.xml: 628 LOC (<1%), 5 files
*.g4: 554 LOC (<1%), 1 file
*.thrift: 277 LOC (<1%), 5 files
*.h: 145 LOC (<1%), 7 files
*.cfg: 2 LOC (<1%), 1 file
Test Code
Used only for testing of the product. Normally not deployed in a production environment.
  • The following criteria are used to filter files:
    • files with paths like ".*/[Tt]est/.*".
    • files with paths like ".*/[Tt]ests/.*".
    • files with paths like ".*/test[.].*".
    • files with paths like ".*/test_.*".
  • 557 files match defined criteria (122,527 LOC, 69.0% vs. main code):
    • 362 *.scala files (94,701 LOC)
    • 128 *.java files (23,376 LOC)
    • 47 *.py files (2,877 LOC)
    • 1 *.cpp file (822 LOC)
    • 11 *.xml files (472 LOC)
    • 3 *.avsc files (178 LOC)
    • 4 *.orc files (97 LOC)
    • 1 *.cs file (4 LOC)
  • "*.scala" is the biggest, containing 77.29% of LOC.
  • "*.cs" is the smallest, containing less than 0.01% of LOC.
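
The path criteria for test code listed above are regular expressions. The sketch below shows how such patterns could be applied to file paths. It assumes the patterns must match the whole path (hence `re.fullmatch`), which the report does not state, and every example path is hypothetical rather than taken from the repository.

```python
import re

# Test-code path patterns, copied from the criteria list above.
TEST_PATTERNS = [
    r".*/[Tt]est/.*",
    r".*/[Tt]ests/.*",
    r".*/test[.].*",
    r".*/test_.*",
]


def is_test_path(path):
    """Return True if the whole path matches any test pattern (assumed semantics)."""
    return any(re.fullmatch(pattern, path) for pattern in TEST_PATTERNS)


# Hypothetical paths, for illustration only.
examples = [
    "core/src/test/java/FooTest.java",     # contains a /test/ directory
    "python/tests/test_reader.py",         # contains a /tests/ directory
    "pycarbon/test_util.py",               # file name starts with test_
    "core/src/main/java/TestHelper.java",  # "Test" in the name, but no pattern applies
]

for path in examples:
    print(path, is_test_path(path))
```

Every pattern requires a `/` before the test marker, so a file such as the hypothetical `TestHelper.java` under a `main` directory is not classified as test code by name alone.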


*.scala94701 LOC (77%) 362 files
*.java23376 LOC (19%) 128 files
*.py2877 LOC (2%) 47 files
*.cpp822 LOC (<1%) 1 file
*.xml472 LOC (<1%) 11 files
*.avsc178 LOC (<1%) 3 files
*.orc97 LOC (<1%) 4 files
*.cs4 LOC (<1%) 1 file
Generated Code
Automatically generated files, not manually changed after generation.
  • The following criteria are used to filter files:
    • files with paths like ".*/generated/.*".
  • 31 files match defined criteria (12,344 LOC, 7.0% vs. main code). All matches are in *.scala files.
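
The "vs. main code" figure reported for each category appears to be that category's LOC divided by the main-code LOC (177,488), rounded to one decimal place. This reading is inferred from the reported numbers. A minimal sketch, with the illustrative helper name `percent_vs_main`:

```python
MAIN_LOC = 177_488  # main-code LOC as reported in this overview

# LOC of the non-main categories, as reported in this overview.
loc_by_category = {
    "test": 122_527,
    "generated": 12_344,
    "build and deployment": 5_186,
    "other": 6_858,
}


def percent_vs_main(loc, main_loc=MAIN_LOC):
    """Category LOC as a percentage of main-code LOC, to one decimal place."""
    return round(100 * loc / main_loc, 1)


ratios = {name: percent_vs_main(loc) for name, loc in loc_by_category.items()}

for name, ratio in ratios.items():
    print(f"{name}: {ratio}% vs. main code")
```

Under this reading, the generated code's 12,344 LOC comes out at 7.0% of main code, matching the figure above.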


*.scala: 12,344 LOC (100%), 31 files
Build and Deployment Code
Source code used to configure or support the build and deployment process.
  • The following criteria are used to filter files:
    • files with paths like ".*/pom[.]xml".
    • files with paths like ".*[.]sh".
    • files with paths like ".*[.]git[a-z]+".
    • files with paths like ".*/[.]gitignore".
    • files with paths like ".*[.]bat".
    • files with paths like ".*/assembly[.]xml".
  • 25 files match defined criteria (5,186 LOC, 2.9% vs. main code):
    • 23 *.xml files (5,145 LOC)
    • 1 *.bat file (31 LOC)
    • 1 *.sh file (10 LOC)
  • "*.xml" is the biggest, containing 99.21% of LOC.
  • "*.sh" is the smallest, containing 0.19% of LOC.


*.xml: 5,145 LOC (99%), 23 files
*.bat: 31 LOC (<1%), 1 file
*.sh: 10 LOC (<1%), 1 file
Other Code
  • The following criteria are used to filter files:
    • files with paths like ".*/[Ee]xamples/.*".
    • files with paths like ".*[.]properties".
    • files with paths like ".*[.]md".
    • files with paths like ".*/README[.][a-z0-9]+".
    • files with paths like ".*[.]json".
    • files with paths like ".*[.]txt".
    • files with paths like ".*/[.]gitignore".
  • 103 files match defined criteria (6,858 LOC, 3.9% vs. main code):
    • 41 *.scala files (4,373 LOC)
    • 10 *.java files (881 LOC)
    • 10 *.md files (489 LOC)
    • 3 *.xml files (352 LOC)
    • 11 *.txt files (326 LOC)
    • 18 *.json files (297 LOC)
    • 10 *.properties files (140 LOC)
  • "*.scala" is the biggest, containing 63.76% of LOC.
  • "*.properties" is the smallest, containing 2.04% of LOC.


*.scala: 4,373 LOC (63%), 41 files
*.java: 881 LOC (12%), 10 files
*.md: 489 LOC (7%), 10 files
*.xml: 352 LOC (5%), 3 files
*.txt: 326 LOC (4%), 11 files
*.json: 297 LOC (4%), 18 files
*.properties: 140 LOC (2%), 10 files
Analyzers
Info about analyzers used for source code examinations.


2025-05-07 16:01