GoogleCloudPlatform / training-data-analyst
Source Code Overview

Analysis scope, overview of main, test, generated, deployment, build, and other code.

Source Code Analysis Scope
Files included in and excluded from analyses
txt
properties
cfg
cmd
proto
ini
pig
mod
pb
libsonnet
in
csproj
lookml
editorconfig
gitattributes
ddl
Overview of Analyzed Files
Basic stats on analyzed files
Intro
For analysis purposes we separate files in scope into several categories: main, test, generated, deployment and build, and other.

  • The main category contains all manually created source code files that are used in production.
  • Files in the main category are used as input for other analyses: logical decomposition, concerns, duplication, file size, unit size, and conditional complexity.
  • Test source code files are used only for testing of the product. These files are normally not deployed to production.
  • Build and deployment source code files are used to configure or support the build and deployment process.
  • Generated source code files are automatically generated files that have not been manually changed after generation.
  • While a source code folder may contain a number of files, we are primarily interested in the source code files that are being written and maintained by developers.
  • Files containing binaries, documentation, or third-party libraries, for instance, are excluded from analysis. The exceptions are third-party libraries that have been changed by developers.

main: 1,005,892 LOC (81%), 8,770 files
test: 7,254 LOC (<1%), 150 files
generated: 18,704 LOC (1%), 6 files
build and deployment: 23,704 LOC (1%), 870 files
other: 173,640 LOC (14%), 1,273 files
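
The percentages in this summary can be reproduced from the LOC counts alone. The following is a minimal sketch, not the analyzer's actual code: the helper names `share_label` and `vs_main` are hypothetical, and the truncation to whole percentages (with "<1%" below one percent) is an assumption inferred from the figures shown here. The "vs. main code" ratio reported in each category section appears to be rounded to one decimal place.

```python
import math

# LOC per category, copied from the summary above.
loc = {
    "main": 1_005_892,
    "test": 7_254,
    "generated": 18_704,
    "build and deployment": 23_704,
    "other": 173_640,
}

total = sum(loc.values())


def share_label(count, total):
    """Share of all analyzed LOC as a truncated whole percentage.

    Assumption: values below one percent are shown as "<1%".
    """
    pct = math.floor(100 * count / total)
    return f"{pct}%" if pct >= 1 else "<1%"


def vs_main(count, main=loc["main"]):
    """Size of a category relative to main code, in percent (one decimal)."""
    return round(100 * count / main, 1)


labels = {name: share_label(count, total) for name, count in loc.items()}

for name, count in loc.items():
    print(f"{name}: {count:,} LOC ({labels[name]}), {vs_main(count)}% vs. main code")
```

Note that the two percentages use different denominators: the summary share is relative to all analyzed LOC, while "vs. main code" is relative to the main category only.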
Main Code
All manually created or maintained source code that defines the logic of the product and runs in a production environment.
proto
lookml
pig
cfg
in
ddl
  • The following criteria are used to filter files:
    • files with paths like ".*".
  • 8770 files match defined criteria (1,005,892 LOC, 100.0% vs. main code):
    • 603 *.ipynb files (499,963 LOC)
    • 2,754 *.py files (272,482 LOC)
    • 2,383 *.js files (70,568 LOC)
    • 948 *.java files (47,903 LOC)
    • 968 *.html files (43,721 LOC)
    • 20 *.sql files (18,011 LOC)
    • 320 *.pug files (11,151 LOC)
    • 15 *.go files (10,309 LOC)
    • 416 *.yaml files (10,158 LOC)
    • 41 *.jinja files (6,917 LOC)
    • 17 *.proto files (3,015 LOC)
    • 69 *.xml files (2,742 LOC)
    • 78 *.tf files (1,754 LOC)
    • 21 *.css files (1,491 LOC)
    • 4 *.c files (1,380 LOC)
    • 29 *.ts files (1,304 LOC)
    • 2 *.lookml files (948 LOC)
    • 4 *.cc files (672 LOC)
    • 6 *.scala files (380 LOC)
    • 7 *.cs files (353 LOC)
    • 6 *.pig files (190 LOC)
    • 32 *.cfg files (128 LOC)
    • 6 *.sbt files (108 LOC)
    • 4 *.jsonnet files (106 LOC)
    • 8 *.tfvars files (51 LOC)
    • 2 *.bash files (37 LOC)
    • 3 *.in files (26 LOC)
    • 2 *.htm files (16 LOC)
    • 1 *.ddl files (6 LOC)
    • 1 *.r files (2 LOC)
  • "*.ipynb" is the biggest, containing 49.7% of LOC.
  • "*.r" is the smallest, containing 0% of LOC.


*.ipynb: 499,963 LOC (49%), 603 files
*.py: 272,482 LOC (27%), 2,754 files
*.js: 70,568 LOC (7%), 2,383 files
*.java: 47,903 LOC (4%), 948 files
*.html: 43,721 LOC (4%), 968 files
*.sql: 18,011 LOC (1%), 20 files
*.pug: 11,151 LOC (1%), 320 files
*.go: 10,309 LOC (1%), 15 files
*.yaml: 10,158 LOC (1%), 416 files
*.jinja: 6,917 LOC (<1%), 41 files
*.proto: 3,015 LOC (<1%), 17 files
*.xml: 2,742 LOC (<1%), 69 files
*.tf: 1,754 LOC (<1%), 78 files
*.css: 1,491 LOC (<1%), 21 files
*.c: 1,380 LOC (<1%), 4 files
*.ts: 1,304 LOC (<1%), 29 files
*.lookml: 948 LOC (<1%), 2 files
*.cc: 672 LOC (<1%), 4 files
*.scala: 380 LOC (<1%), 6 files
*.cs: 353 LOC (<1%), 7 files
*.pig: 190 LOC (<1%), 6 files
*.cfg: 128 LOC (<1%), 32 files
*.sbt: 108 LOC (<1%), 6 files
*.jsonnet: 106 LOC (<1%), 4 files
*.tfvars: 51 LOC (<1%), 8 files
*.bash: 37 LOC (<1%), 2 files
*.in: 26 LOC (<1%), 3 files
*.htm: 16 LOC (<1%), 2 files
*.ddl: 6 LOC (<1%), 1 file
*.r: 2 LOC (<1%), 1 file
Test Code
Used only for testing of the product. Normally not deployed in a production environment.
csproj
  • The following criteria are used to filter files:
    • files with paths like ".*_test[.].*".
    • files with paths like ".*/[Tt]ests/.*".
    • files with paths like ".*[.][Tt]ests[.].*".
    • files with paths like ".*[.]tests[.].*".
    • files with paths like ".*/[Tt]est/.*".
    • files with paths like ".*/test_.*".
    • files with paths like ".*/karma[.]conf[.]js".
    • files with paths like ".*/test[.].*".
    • files with paths like ".*[.]spec[.]ts".
    • files with paths like ".*/protractor[.]conf[.]js".
    • files with paths like ".*/e2e/.*".
    • files with paths like ".*[-]test[-].*".
    • files with paths like ".*_tests[.].*".
    • files with paths like ".*/testing[.].*".
    • files with paths like ".*/jest[.][a-zA-Z0-9\.]+".
    • files with paths like ".*/__mock[a-zA-Z0-9_\- ]+/.*".
    • files with paths like ".*[.]spec[.]js".
  • 150 files match defined criteria (7,254 LOC, 0.7% vs. main code):
    • 31 *.py files (5,022 LOC)
    • 26 *.java files (562 LOC)
    • 4 *.go files (547 LOC)
    • 14 *.tf files (359 LOC)
    • 15 *.ts files (307 LOC)
    • 45 *.sh files (185 LOC)
    • 1 *.cs files (110 LOC)
    • 5 *.js files (110 LOC)
    • 6 *.yaml files (30 LOC)
    • 1 *.csproj files (16 LOC)
    • 2 *.html files (6 LOC)
  • "*.py" is the biggest, containing 69.23% of LOC.
  • "*.html" is the smallest, containing 0.08% of LOC.
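
The path criteria above are regular expressions. As a rough illustration of how such a filter might be applied, the sketch below copies the patterns verbatim and tests a path against them. The function name `is_test_path` and the sample paths are hypothetical, and full-path matching (rather than substring search) is an assumption suggested by the leading and trailing `.*` in the patterns, not documented behavior of the analyzer.

```python
import re

# Path patterns copied from the test-code criteria listed above.
TEST_PATTERNS = [
    r".*_test[.].*",
    r".*/[Tt]ests/.*",
    r".*[.][Tt]ests[.].*",
    r".*[.]tests[.].*",
    r".*/[Tt]est/.*",
    r".*/test_.*",
    r".*/karma[.]conf[.]js",
    r".*/test[.].*",
    r".*[.]spec[.]ts",
    r".*/protractor[.]conf[.]js",
    r".*/e2e/.*",
    r".*[-]test[-].*",
    r".*_tests[.].*",
    r".*/testing[.].*",
    r".*/jest[.][a-zA-Z0-9\.]+",
    r".*/__mock[a-zA-Z0-9_\- ]+/.*",
    r".*[.]spec[.]js",
]
COMPILED = [re.compile(pattern) for pattern in TEST_PATTERNS]


def is_test_path(path: str) -> bool:
    """Return True if the whole path matches at least one test pattern.

    Assumption: patterns must match the full path, not just a substring.
    """
    return any(rx.fullmatch(path) for rx in COMPILED)


# Hypothetical sample paths, for illustration only.
for sample in ["pkg/model_test.py", "src/tests/helper.py", "src/main.py"]:
    print(sample, is_test_path(sample))
```

Under this reading, a directory named `latest` does not trigger the `.*/[Tt]est/.*` pattern, because the pattern requires a path segment that is exactly `test` or `Test`.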


*.py5022 LOC (69%) 31 files
*.java562 LOC (7%) 26 files
*.go547 LOC (7%) 4 files
*.tf359 LOC (4%) 14 files
*.ts307 LOC (4%) 15 files
*.sh185 LOC (2%) 45 files
*.cs110 LOC (1%) 1 file
*.js110 LOC (1%) 5 files
*.yaml30 LOC (<1%) 6 files
*.csproj16 LOC (<1%) 1 file
*.html6 LOC (<1%) 2 files
Generated Code
Automatically generated files, not manually changed after generation.
  • The following criteria are used to filter files:
    • files with paths like ".*/package[-]lock[.]json".
    • files with paths like ".*[.](py|java|h|cc|cpp|m|rb|php)" AND any line of content like ".*Generated by the protocol buffer compiler[.][ ]+DO NOT EDIT[!].*".
  • 6 files match defined criteria (18,704 LOC, 1.9% vs. main code):
    • 4 *.json files (18,236 LOC)
    • 2 *.py files (468 LOC)
  • "*.json" is the biggest, containing 97.5% of LOC.
  • "*.py" is the smallest, containing 2.5% of LOC.
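
The second criterion above combines a path pattern with a content pattern: a file counts as generated only if its extension matches and at least one of its lines carries the protocol buffer compiler marker. A minimal sketch of that combined check follows. The function name `is_generated` and the sample inputs are hypothetical, and full-match semantics for both the path and each line are assumed rather than confirmed.

```python
import re

# Patterns copied from the generated-code criteria listed above.
LOCKFILE_PATH = re.compile(r".*/package[-]lock[.]json")
PROTO_PATH = re.compile(r".*[.](py|java|h|cc|cpp|m|rb|php)")
PROTO_LINE = re.compile(
    r".*Generated by the protocol buffer compiler[.][ ]+DO NOT EDIT[!].*"
)


def is_generated(path, lines):
    """Return True if the file meets either generated-code criterion.

    `lines` is an iterable of the file's text lines. Line endings are
    stripped first because `.` does not match a newline.
    """
    if LOCKFILE_PATH.fullmatch(path):
        return True
    if not PROTO_PATH.fullmatch(path):
        return False
    return any(PROTO_LINE.fullmatch(line.rstrip("\r\n")) for line in lines)


# Hypothetical example: a protobuf-generated Python module.
header = ["# Generated by the protocol buffer compiler.  DO NOT EDIT!\n"]
print(is_generated("api/foo_pb2.py", header))
```

Checking the path before reading any content keeps the scan cheap, since most files can be rejected on extension alone.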


*.json: 18,236 LOC (97%), 4 files
*.py: 468 LOC (2%), 2 files
Build and Deployment Code
Source code used to configure or support the build and deployment process.
csproj
  • The following criteria are used to filter files:
    • files with paths like ".*[.]git[a-z]+".
    • files with paths like ".*/[.]gitignore".
    • files with paths like ".*[.]sh".
    • files with paths like ".*/pom[.]xml".
    • files with paths like ".*[.]gradle".
    • files with paths like ".*[.]bat".
    • files with paths like ".*/package[-]lock[.]json".
    • files with paths like ".*/package[.]json".
    • files with paths like ".*[.]csproj".
    • files with paths like ".*/[.]gitattributes".
    • files with paths like ".*/docker[-]compose[.]yaml".
    • files with paths like ".*([.]|/)webpack([.]|/).*".
  • 870 files match defined criteria (23,704 LOC, 2.4% vs. main code):
    • 123 *.xml files (13,426 LOC)
    • 739 *.sh files (9,993 LOC)
    • 2 *.gradle files (108 LOC)
    • 1 *.bat files (78 LOC)
    • 3 *.js files (75 LOC)
    • 1 *.csproj files (13 LOC)
    • 1 *.yaml files (11 LOC)
  • "*.xml" is the biggest, containing 56.64% of LOC.
  • "*.yaml" is the smallest, containing 0.05% of LOC.


*.xml: 13,426 LOC (56%), 123 files
*.sh: 9,993 LOC (42%), 739 files
*.gradle: 108 LOC (<1%), 2 files
*.bat: 78 LOC (<1%), 1 file
*.js: 75 LOC (<1%), 3 files
*.csproj: 13 LOC (<1%), 1 file
*.yaml: 11 LOC (<1%), 1 file
Other Code
pb
txt
properties
mod
ini
libsonnet
cfg
  • The following criteria are used to filter files:
    • files with paths like ".*/[.]gitignore".
    • files with paths like ".*[.]md".
    • files with paths like ".*/README[.][a-z0-9]+".
    • files with paths like ".*[.]txt".
    • files with paths like ".*/go[.]mod".
    • files with paths like ".*/[.]dockerignore".
    • files with paths like ".*[.]properties".
    • files with paths like ".*[.]json".
    • files with paths like ".*[.]svg".
    • files with paths like ".*[.]pb".
    • files with paths like ".*/[Dd]emos?/.*".
    • files with paths like ".*[.]libsonnet".
    • files with paths like ".*/[Ss]amples/.*".
    • files with paths like ".*[.]ini".
    • files with paths like ".*[.](rst|rest|resttxt|rsttxt)".
    • files with paths like ".*/LICENSE[.][a-z0-9]+".
    • files with paths like ".*[.]editorconfig".
  • 1273 files match defined criteria (173,640 LOC, 17.3% vs. main code):
    • 4 *.pb files (67,768 LOC)
    • 290 *.json files (30,729 LOC)
    • 144 *.py files (21,103 LOC)
    • 319 *.txt files (20,980 LOC)
    • 32 *.ipynb files (18,079 LOC)
    • 243 *.md files (10,391 LOC)
    • 40 *.rst files (1,064 LOC)
    • 17 *.sql files (982 LOC)
    • 22 *.yaml files (702 LOC)
    • 14 *.svg files (460 LOC)
    • 95 *.properties files (312 LOC)
    • 1 *.css files (221 LOC)
    • 20 *.sh files (185 LOC)
    • 4 *.mod files (156 LOC)
    • 3 *.html files (145 LOC)
    • 7 *.js files (142 LOC)
    • 7 *.ini files (68 LOC)
    • 4 *.libsonnet files (68 LOC)
    • 4 *.java files (64 LOC)
    • 1 *.bash files (9 LOC)
    • 1 *.htm files (8 LOC)
    • 1 *.cfg files (4 LOC)
  • "*.pb" is the biggest, containing 39.03% of LOC.
  • "*.cfg" is the smallest, containing 0% of LOC.


*.pb: 67,768 LOC (39%), 4 files
*.json: 30,729 LOC (17%), 290 files
*.py: 21,103 LOC (12%), 144 files
*.txt: 20,980 LOC (12%), 319 files
*.ipynb: 18,079 LOC (10%), 32 files
*.md: 10,391 LOC (5%), 243 files
*.rst: 1,064 LOC (<1%), 40 files
*.sql: 982 LOC (<1%), 17 files
*.yaml: 702 LOC (<1%), 22 files
*.svg: 460 LOC (<1%), 14 files
*.properties: 312 LOC (<1%), 95 files
*.css: 221 LOC (<1%), 1 file
*.sh: 185 LOC (<1%), 20 files
*.mod: 156 LOC (<1%), 4 files
*.html: 145 LOC (<1%), 3 files
*.js: 142 LOC (<1%), 7 files
*.ini: 68 LOC (<1%), 7 files
*.libsonnet: 68 LOC (<1%), 4 files
*.java: 64 LOC (<1%), 4 files
*.bash: 9 LOC (<1%), 1 file
*.htm: 8 LOC (<1%), 1 file
*.cfg: 4 LOC (<1%), 1 file
Analyzers
Info about analyzers used for source code examinations.


2025-05-04 14:27