<#
.SYNOPSIS
powershell script to collect service fabric node diagnostic data
To download and execute:
[net.servicePointManager]::Expect100Continue = $true;[net.servicePointManager]::SecurityProtocol = [net.securityProtocolType]::Tls12;
invoke-webRequest "https://raw.githubusercontent.com/Azure/Service-Fabric-Troubleshooting-Guides/master/Scripts/sf-collect-node-info.ps1" -outFile "$pwd\sf-collect-node-info.ps1";
.\sf-collect-node-info.ps1
optional download for event log conversion:
invoke-webRequest "https://raw.githubusercontent.com/Azure/Service-Fabric-Troubleshooting-Guides/master/Scripts/event-log-manager.ps1" -outFile "$pwd\event-log-manager.ps1";
if working with microsoft support, upload the generated sfColInfo-* directory or zip file to the workspace
Microsoft Privacy Statement: https://privacy.microsoft.com/en-US/privacystatement
MIT License
Copyright (c) Microsoft Corporation. All rights reserved.
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE
.DESCRIPTION
To enable script execution, you may need to Set-ExecutionPolicy Bypass -Force
script will collect event logs, hotfixes, services, processes, drive, firewall, and other OS information
Requirements:
- administrator powershell prompt
- administrative access to machine
- remote network ports:
- smb 445
- rpc endpoint mapper 135
- rpc ephemeral ports
- to test access from source machine to remote machine: dir \\%remote machine%\admin$
- winrm
- depending on configuration / security, it may be necessary to modify trustedhosts on
source machine for management of remote machines
- to query: winrm get winrm/config
- to enable sending credentials to remote machines: winrm set winrm/config/client '@{TrustedHosts="*"}'
- to disable sending credentials to remote machines: winrm set winrm/config/client '@{TrustedHosts=""}'
- firewall
- if firewall is preventing connectivity the following can be run to disable
- Set-NetFirewallProfile -Profile Domain,Public,Private -Enabled False
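- to re-enable the firewall once collection is finished (counterpart of the command above):
- Set-NetFirewallProfile -Profile Domain,Public,Private -Enabled True
- example pre-check before using -remoteMachines (a sketch only; 10.0.0.4 is a placeholder address, substitute your own node):
- Test-NetConnection -ComputerName 10.0.0.4 -Port 445
- Test-NetConnection -ComputerName 10.0.0.4 -Port 5985
- Test-WSMan -ComputerName 10.0.0.4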
.NOTES
File Name : sf-collect-node-info.ps1
Author : microsoft service fabric support
Version : 240422 add 'g' datetime format for timestamps to fix event log enumeration in different cultures
History :
.EXAMPLE
.\sf-collect-node-info.ps1
default command to collect event logs, process, service, os information for last 7 days.
.EXAMPLE
.\sf-collect-node-info.ps1 -certInfo
example command to query all diagnostic information, event logs, and certificate store information.
.EXAMPLE
.\sf-collect-node-info.ps1 -startTime 8/16/2018
example command to query all diagnostic information using start date of 08/16/2018.
dates are used for event log and rest queries
.EXAMPLE
.\sf-collect-node-info.ps1 -remoteMachines 10.0.0.4,10.0.0.5
example command to query diagnostic information remotely from two machines.
files will be copied back to machine where script is being executed.
.EXAMPLE
.\sf-collect-node-info.ps1 -runCommand "dir c:\windows -recurse"
example to run custom command on machine after data collection
output will be captured in runCommand.txt
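.EXAMPLE
.\sf-collect-node-info.ps1 -perfmonMin 5 -netmonMin 5
example command to additionally capture 5 minutes of perfmon counters and a network trace after the collection jobs complete.
with these values the default timeoutMinutes works out to 20 (the larger of the two values plus 15).
.EXAMPLE
.\sf-collect-node-info.ps1 -noEventLogs -noNet -workDir c:\temp\sfinfo
example command to skip event log and network collection and write output to c:\temp\sfinfo (illustrative path only).
any existing content in the work directory is removed before collection starts.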
.PARAMETER apiVersion
api version for testing fabricgateway endpoint with service fabric rest api calls.
.PARAMETER cacheCredentials
switch to enable storing credentials in $global:creds variable.
to clear, execute: $global:creds=$null
.PARAMETER certInfo
bool to enable collection of certificate store export to troubleshoot certificate issues.
thumbprints and serial numbers during export will be partially masked.
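default true.
.PARAMETER defaultFabricDataRoot
fallback service fabric data root used when the fabricdataroot registry value cannot be read.
default 'd:\svcfab'.
.PARAMETER defaultFabricLogRoot
fallback service fabric log root used when the log root cannot be determined from the registry.
default 'd:\svcfab\log'.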
.PARAMETER endTime
end time in normal dateTime formatting.
example "8/26/2018 22:00"
default is the current date and time.
.PARAMETER eventLogNames
regex list of eventlog names to export into csv format.
default list should be sufficient for most scenarios.
.PARAMETER externalUrl
url to use for network connectivity tests.
.PARAMETER logMin
if greater than 0, minutes of log files to collect based on last write time
default 30 minutes
.PARAMETER netmonMin
minutes to run network trace at end of collection after all jobs run.
.PARAMETER networkTestAddress
remote machine for service fabric tcp port test.
.PARAMETER noAdmin
switch to bypass admin powershell session check.
most jobs will work in a non-admin session, but not all; for example, some of the network tests require admin.
.PARAMETER noEventLogs
switch to prevent download of event-log-manager.ps1 script and collection of windows event log events.
.PARAMETER noNet
bypass network tests.
.PARAMETER noOs
bypass OS information collection.
.PARAMETER noSF
bypass service fabric information collection.
.PARAMETER perfmonMin
minutes to run basic perfmon at end of collection after all jobs run.
cpu, memory, disk, network.
.PARAMETER ports
comma separated list of tcp ports to test.
default ports include basic connectivity, rdp, and service fabric.
.PARAMETER quiet
disable display of folder / zip in shell at end of script.
.PARAMETER remoteMachines
comma separated list of machine names and / or ip addresses to run diagnostic script on remotely.
this will only work if proper connectivity, authentication, and OS health exists.
if there are errors connecting, run script instead individually on each node.
to resolve remote connectivity issues, verify tcp port connectivity for the required ports and review winrm, firewall, and nsg configurations.
.PARAMETER runCommand
command to run at end of collection.
command needs to be runnable from 'invoke-expression'.
.PARAMETER startTime
start time in normal dateTime formatting.
example "8/26/2018 22:00"
default -7 days.
.PARAMETER timeoutMinutes
script timeout in minutes.
script will cancel any running jobs and collect what is available if timeout is hit.
.PARAMETER workDir
output directory where all files will be created.
default is $env:temp\sfColInfo-<computername>, or $env:temp\sfColInfo-<yy-MM-dd-HH-mm> when -remoteMachines is used.
.LINK
https://raw.githubusercontent.com/Azure/Service-Fabric-Troubleshooting-Guides/master/Scripts/sf-collect-node-info.ps1
#>
[CmdletBinding()]
param(
[string]$workdir,
[bool]$certInfo = $true,
[string]$eventLogNames = "System$|Application$|wininet|dns|Fabric|http|Firewall|Azure|insight",
[string]$externalUrl = "bing.com",
[dateTime]$startTime = (get-date).AddDays(-7),
[dateTime]$endTime = (get-date),
[int]$netmonMin,
[string]$networkTestAddress = $env:computername,
[int]$perfmonMin,
[object[]]$ports = @(1025, 1026, 19000, 19080, 135, 445, 3389, 5985),
[int]$timeoutMinutes = [Math]::Max($perfmonMin, $netmonMin) + 15,
[string]$apiversion = "6.2-preview", #"6.0"
[string[]]$remoteMachines,
[switch]$noAdmin,
[switch]$noEventLogs,
[switch]$noOs,
[switch]$noNet,
[switch]$noSF,
[switch]$quiet,
[string]$runCommand,
[int]$logMin = 30,
[string]$defaultFabricLogRoot = 'd:\svcfab\log',
[string]$defaultFabricDataRoot = 'd:\svcfab',
[switch]$cacheCredentials
)
[net.servicePointManager]::Expect100Continue = $true;
[net.servicePointManager]::SecurityProtocol = [net.securityProtocolType]::Tls12;
$PSModuleAutoLoadingPreference = 2
$ErrorActionPreference = "Continue"
$creds = $null
$timer = get-date
$currentWorkDir = get-location
$osInfo = (get-wmiobject -Class Win32_OperatingSystem -Namespace root\cimv2)
$legacy = ([version]$osInfo.Version).major -lt 10
$workstation = $osInfo.ProductType -eq 1
$parentWorkDir = $null
$jobs = new-object collections.arraylist
$logFile = $null
$global:zipFile = $null
$trustedHosts = $null
$winrmClientInfo = $null
$eventScriptFile = $null
$wEvtUtilLogs = [collections.arraylist]@()
$sfCollectInfoDir = "sfColInfo-"
$restTimeoutSec = 15
$serviceFabricInstallReg = "HKLM:\software\microsoft\service fabric"
$warnonZoneCrossingReg = "HKCU:\Software\Microsoft\Windows\CurrentVersion\Internet Settings"
$disableWarnOnZoneCrossing = $false
$useBasicParsing = [bool](get-command invoke-webrequest).Parameters.UseBasicParsing
$global:allparams = @{ }
[string]$scriptUrl = 'https://raw.githubusercontent.com/Azure/Service-Fabric-Troubleshooting-Guides/master/Scripts/sf-collect-node-info.ps1'
# to bypass self-signed cert validation check
add-type @"
using System.Net;
using System.Security.Cryptography.X509Certificates;
public class IDontCarePolicy : ICertificatePolicy {
public IDontCarePolicy() {}
public bool CheckValidationResult(
ServicePoint sPoint, X509Certificate cert,
WebRequest wRequest, int certProb) {
return true;
}
}
"@
[System.Net.ServicePointManager]::CertificatePolicy = new-object IDontCarePolicy
function main() {
$error.Clear()
write-warning "to troubleshoot this issue, this script may collect sensitive information similar to other microsoft diagnostic tools."
write-warning "information may contain items such as ip addresses, process information, user names, or similar."
write-warning "information in directory / zip can be reviewed before uploading to workspace."
write-warning "see: https://github.com/Azure/Service-Fabric-Troubleshooting-Guides/blob/master/Cluster/SF%20collect%20node%20info.md"
if (!$workDir -and $remoteMachines) {
$workdir = "$($env:temp)\$($sfCollectInfoDir)$((get-date).ToString("yy-MM-dd-HH-mm"))"
}
elseif (!$workDir) {
$workdir = "$($env:temp)\$($sfCollectInfoDir)$($env:COMPUTERNAME)"
}
$parentWorkDir = [io.path]::GetDirectoryName($workDir)
$eventScriptFile = "$($parentWorkdir)\event-log-manager.ps1"
if ((test-path $workdir)) {
remove-item $workdir -Recurse -Force
}
new-item $workdir -ItemType Directory
Set-Location $parentworkdir
$logFile = "$($workdir)\sf-collect-node-info.log"
if (!$legacy) {
Start-Transcript -Path $logFile -Force
}
write-host "starting $(get-date)"
if (!([Security.Principal.WindowsPrincipal][Security.Principal.WindowsIdentity]::GetCurrent()).IsInRole([Security.Principal.WindowsBuiltInRole] "Administrator")) {
Write-Warning "please restart script in administrator powershell session"
if (!$noadmin) {
Write-Warning "if unable to run as admin, restart and use -noadmin switch. This will collect less data that may be needed. exiting..."
return $false
}
}
$disableSecuritySetting = (Get-ItemProperty -Path $warnonZoneCrossingReg -Name "WarnonZoneCrossing" -ErrorAction SilentlyContinue)
if (!$disableSecuritySetting -or $disableSecuritySetting.WarnonZoneCrossing -eq 1) {
New-ItemProperty -Path $warnonZoneCrossingReg -Name "WarnonZoneCrossing" -Value 0 -PropertyType DWORD -Force | Out-Null
$disableWarnOnZoneCrossing = $true
}
$error.Clear()
write-host "remove old jobs"
get-job | remove-job -Force
# stage event-log-manager script
if (!$noEventLogs -and !(test-path $eventScriptFile)) {
try {
invoke-webRequest "https://raw.githubusercontent.com/Azure/Service-Fabric-Troubleshooting-Guides/master/Scripts/event-log-manager.ps1" -outFile $eventScriptFile
}
catch {
write-warning ($error | Out-String)
write-warning "unable to download $eventScriptFile. using wEvtUtil.exe instead"
$error.Clear()
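# a failed download may leave no file behind, so do not raise a second error here
Remove-Item $eventScriptFile -Force -ErrorAction SilentlyContinue
# @() guarantees a collection for AddRange even when zero or one log file matches
$wEvtUtilLogs.AddRange(@(get-childItem -Path "$env:SystemRoot\system32\winevt\Logs" | Where-Object BaseName -imatch $eventLogNames | Select-Object FullName))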
}
}
if ($remoteMachines) {
if (!$global:creds) {
Write-Host "Enter your RDP Credentials"
#Get the RDP User Name and Password
$creds = Get-Credential
if ($cacheCredentials) {
$global:creds = $creds
}
}
else {
$creds = $global:creds
}
# setup local (source) machine for best chance of success
$winrmClientInfo = (winrm get winrm/config/client)
$trustedHostsPattern = "TrustedHosts = (.*)"
if ([regex]::IsMatch($winrmClientInfo, $trustedHostsPattern)) {
$trustedHosts = ([regex]::matches($winrmClientInfo , $trustedHostsPattern)).groups[1].value
}
winrm set winrm/config/client '@{TrustedHosts="*"}'
# switch to arraylist
$remoteMachines = new-object collections.arraylist(, $remoteMachines)
foreach ($machine in (new-object collections.arraylist(, $remoteMachines))) {
$adminPath = "\\$($machine)\admin$\temp"
Invoke-Command -Authentication Negotiate -ComputerName $machine -scriptBlock {
$logFile = "c:\windows\temp\fw.txt"
$displayGroup = 'File and Printer Sharing'
if ((Get-NetFirewallRule -DisplayGroup $displayGroup).Enabled -icontains 'false') {
Write-Warning "enabling firewall $displayGroup" | out-file -Append $logFile
Set-NetFirewallRule -DisplayGroup $displayGroup -Enabled True -PassThru |
Select-Object DisplayName, Enabled
}
} -Credential $creds
if (!(Test-path $adminPath)) {
Write-Warning "unable to connect to $($machine) to start diagnostics. skipping!"
$remoteMachines.Remove($machine)
continue
}
if (!$noEventLogs -and !$wEvtUtilLogs) {
copy-item -path $eventScriptFile -Destination $adminPath -force
}
copy-item -path ($MyInvocation.ScriptName) -Destination $adminPath -force
write-host "adding job for $($machine)"
[void]$jobs.Add((Invoke-Command -JobName $machine -AsJob -ComputerName $machine -Credential $creds -scriptblock {
param($scriptUrl = $args[0], $machine = $args[1], $sfCollectInfoDir = $args[2], $allParams = $args[3])
$parentWorkDir = "$($env:systemroot)\temp"
$workDir = "$($parentWorkDir)\$($sfCollectInfoDir)$($machine)"
$scriptPath = "$($parentWorkDir)\$($scriptUrl -replace `".*/`",`"`")"
if (!(test-path $scriptPath)) {
invoke-webRequest $scriptUrl -outFile $scriptPath
}
[text.stringbuilder]$sb = new-object text.stringbuilder
foreach ($item in $allParams.GetEnumerator()) {
if ($item.key -imatch "quiet" -or $item.key -imatch "noadmin" -or $item.key -imatch "workdir") {
continue
}
if (@($item.value).count -gt 1) {
$item.value = $item.value -join ','
}
[void]$sb.Append("-$($item.key) $($item.value) ")
}
write-host "executing: $($scriptPath) -quiet -noadmin -workdir $($workDir) $($sb.tostring())"
Invoke-Expression "$($scriptPath) -quiet -noadmin -workdir $($workDir) $($sb.tostring())"
write-host ($error | out-string)
} -ArgumentList @($scriptUrl, $machine, $sfCollectInfoDir, $global:allparams)))
}
monitor-jobs
foreach ($machine in $remoteMachines) {
$adminPath = "\\$($machine)\admin$\temp"
$foundZip = $false
if (!(Test-path $adminPath)) {
Write-Warning "unable to connect to $($machine) to copy zip. skipping!"
continue
}
$sourcePath = "$($adminPath)\$($sfCollectInfoDir)$($machine)"
$destPath = "$($workDir)\$($sfCollectInfoDir)$($machine)"
$sourcePathZip = "$($sourcePath).zip"
$destPathZip = "$($destPath).zip"
if ((test-path $sourcePathZip)) {
write-host "copying file $($sourcePathZip) to $($destPathZip)" -ForegroundColor Magenta
Copy-Item $sourcePathZip $destPathZip -Force
remove-item $sourcePathZip -Force
$foundZip = $true
}
if ((test-path $sourcePath)) {
if (!$foundZip) {
write-host "copying folder $($sourcePath) to $($destPath)" -ForegroundColor Magenta
Copy-Item $sourcePath $destPath -Force -Recurse
compress-file $destPath
remove-item $destPath -Recurse -Force
}
remove-item $sourcePath -Recurse -Force
}
else {
write-host "warning: unable to find diagnostic files in $($sourcePath)"
}
Invoke-Command -Authentication Negotiate -ComputerName $machine -scriptBlock {
$logFile = "c:\windows\temp\fw.txt"
$displayGroup = 'File and Printer Sharing'
if ((test-path $logFile)) {
Write-Warning "disabling firewall $displayGroup" | out-file -Append $logFile
Set-NetFirewallRule -DisplayGroup $displayGroup -Enabled False -PassThru |
Select-Object DisplayName, Enabled
Remove-Item $logFile
}
} -Credential $creds -ArgumentList $logFile
}
$global:zipFile = compress-file $workDir
}
else {
process-machine
}
if (!($quiet) -and (test-path "$($env:systemroot)\explorer.exe")) {
start-process "explorer.exe" -ArgumentList $parentWorkDir
}
}
function process-machine() {
write-host "processing machine"
if (!$noEventLogs) {
add-job -jobName "event logs" -scriptBlock {
param($workdir = $args[0], $parentWorkdir = $args[1], $eventLogNames = $args[2], $startTime = $args[3], $endTime = $args[4], $eventScriptFile = $args[5], $wEvtUtilLogs = $args[6])
$eventScriptFile = "$parentWorkdir\$([io.path]::GetFileName($eventScriptFile))"
$tempLocation = "$($workdir)\event-logs"
if (!(test-path $tempLocation)) {
New-Item -ItemType Directory -Path $tempLocation
}
if (!(test-path $eventScriptFile) -and $wEvtUtilLogs) {
foreach ($file in $wEvtUtilLogs) {
write-host "exporting file $($file.FullName)"
write-host "wEvtUtil.exe export-log `"$($file.FullName)`" `"$tempLocation\$([io.path]::GetFileName($file.FullName))`" /logfile:true"
wEvtUtil.exe export-log "$($file.FullName)" "$tempLocation\$([io.path]::GetFileName($file.FullName))" /logfile:true
}
}
elseif ((test-path $eventScriptFile)) {
$argList = "-File $($parentWorkdir)\event-log-manager.ps1 -eventLogNamePattern `"$($eventlognames)`" -eventStartTime `"$($startTime.ToString('g'))`" -eventStopTime `"$($endTime.ToString('g'))`" -eventDetails -merge -uploadDir `"$($tempLocation)`" -nodynamicpath"
write-host "event logs: starting command powershell.exe $($argList)"
start-process -filepath "powershell.exe" -ArgumentList $argList -Wait -WindowStyle Hidden -WorkingDirectory $tempLocation
}
else {
write-host "error:unable to collect event logs"
}
} -arguments @($workdir, $parentWorkdir, $eventLogNames, $startTime, $endTime, $eventScriptFile, $wEvtUtilLogs)
}
if (!$noOs) {
if (!$legacy) {
add-job -jobName "windows update" -scriptBlock {
param($workdir = $args[0])
Get-WindowsUpdateLog -LogPath "$($workdir)\windowsupdate.log.txt"
} -arguments $workdir
}
else {
copy-item "$env:systemroot\windowsupdate.log" "$($workdir)\windowsupdate.log.txt"
}
add-job -jobName "check machinekeys" -scriptBlock {
param($workdir = $args[0])
$machineKeys = "C:\ProgramData\Microsoft\Crypto\RSA\MachineKeys"
Get-ChildItem $machineKeys -Recurse | out-file "$($workDir)\dir-machinekeys.txt"
Invoke-Expression "icacls $($machineKeys) /C /T | out-file -Append $($workdir)\dir-machinekeys.txt"
} -arguments @($workdir)
add-job -jobName "check for docker" -scriptBlock {
param($workdir = $args[0])
$error.clear()
(docker version)
if ($error) {
$error.Clear()
write-host "docker not installed"
return
}
docker version | out-file "$($workdir)\docker-info.txt"
docker images | out-file -Append "$($workdir)\docker-info.txt"
docker network ls | out-file -Append "$($workdir)\docker-info.txt"
docker ps | out-file -Append "$($workdir)\docker-info.txt"
#docker inspect <containerid> | out-file -Append "$($workdir)\docker-info.txt"
} -arguments @($workdir)
add-job -jobName "check for dump file c" -scriptBlock {
param($workdir = $args[0])
get-childitem -Recurse -Path "c:\" -Filter "*.*dmp" | out-file "$($workdir)\dumplist-c.txt"
} -arguments @($workdir)
add-job -jobName "check for dump file d" -scriptBlock {
param($workdir = $args[0])
get-childitem -Recurse -Path "d:\" -Filter "*.*dmp" | out-file "$($workdir)\dumplist-d.txt"
} -arguments @($workdir)
add-job -jobName "drives" -scriptBlock {
param($workdir = $args[0])
Get-psdrive | out-file "$($workdir)\drives.txt"
} -arguments @($workdir)
add-job -jobName "os info" -scriptBlock {
param($workdir = $args[0])
get-wmiobject -Class Win32_OperatingSystem -Namespace root\cimv2 | format-list * | out-file "$($workdir)\os-info.txt"
Invoke-Expression "cmd.exe /c sc query type= driver > $($workdir)\drivers.txt"
get-hotfix | out-file "$($workdir)\hotfixes.txt"
Get-process | out-file "$($workdir)\process-summary.txt"
Get-process | format-list * | out-file "$($workdir)\processes.txt"
get-process | Where-Object ProcessName -imatch "fabric|FileStoreService|imagebuilder|docker" | out-file "$($workdir)\processes-fabric.txt"
Get-service | out-file "$($workdir)\service-summary.txt"
Get-Service | format-list * | out-file "$($workdir)\services.txt"
} -arguments @($workdir)
write-host "etw / logman sessions / traces"
logman -ets | out-file "$($workdir)\logman-ets.txt"
logman | out-file "$($workdir)\logman.txt"
write-host "installed applications"
Invoke-Expression "reg.exe query HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows\CurrentVersion\Uninstall /s /v DisplayName > $($workDir)\installed-apps.reg.txt"
write-host "features"
if ($workstation) {
Invoke-Expression "dism /online /get-features | out-file $($workdir)\windows-features.txt"
}
else {
Get-WindowsFeature | Where-Object "InstallState" -eq "Installed" | out-file "$($workdir)\windows-features.txt"
}
add-job -jobName ".net reg" -scriptBlock {
param($workdir = $args[0])
Get-ChildItem "HKLM:\SOFTWARE\Microsoft\NET Framework Setup\NDP" -Recurse | Get-ItemProperty -Name Version -ErrorAction SilentlyContinue | Select-Object PSChildName, Version | out-file "$($workDir)\dotnet.reg.txt"
Invoke-Expression "dotnet.exe --list-runtimes > $($workDir)\dotnet.core.txt"
} -arguments @($workdir)
write-host "policies"
Invoke-Expression "reg.exe query HKEY_LOCAL_MACHINE\SOFTWARE\Policies /s > $($workDir)\policies.reg.txt"
write-host "schannel"
Invoke-Expression "reg.exe query HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\SecurityProviders\SCHANNEL /s > $($workDir)\schannel.reg.txt"
add-job -jobName "azure config files" -scriptBlock {
param($workdir = $args[0])
function copy-files($sourceDir) {
if (!(test-path $sourceDir)) {
return
}
copy-item -path $sourceDir -Destination $workDir -Filter "*.json" -Recurse -ErrorAction SilentlyContinue
copy-item -path $sourceDir -Destination $workDir -Filter "*.txt" -Recurse -ErrorAction SilentlyContinue
copy-item -path $sourceDir -Destination $workDir -Filter "*.settings" -Recurse -ErrorAction SilentlyContinue
copy-item -path $sourceDir -Destination $workDir -Filter "*.config" -Recurse -ErrorAction SilentlyContinue
copy-item -path $sourceDir -Destination $workDir -Filter "*.xml" -Recurse -ErrorAction SilentlyContinue
copy-item -path $sourceDir -Destination $workDir -Filter "*.log" -Recurse -ErrorAction SilentlyContinue
}
copy-files "c:\packages"
copy-files "c:\windowsAzure"
} -arguments @($workdir)
}
if (!$noNet) {
add-job -jobName "network port tests" -scriptBlock {
param($workdir = $args[0], $networkTestAddress = $args[1], $ports = $args[2])
foreach ($port in $ports) {
$ProgressPreference = "silentlycontinue"
test-netconnection -port $port -ComputerName $networkTestAddress -InformationLevel Detailed | out-file -Append "$($workdir)\network-port-test.txt"
}
} -arguments @($workdir, $networkTestAddress, $ports)
add-job -jobName "check external connection" -scriptBlock {
param($workdir = $args[0], $externalUrl = $args[1], $useBasicParsing = $args[2])
if ($useBasicParsing) {
[net.httpWebResponse](Invoke-WebRequest $externalUrl -UseBasicParsing).BaseResponse | out-file "$($workdir)\network-external-test.txt"
}
else {
[net.httpWebResponse](Invoke-WebRequest $externalUrl).BaseResponse | out-file "$($workdir)\network-external-test.txt"
}
} -arguments @($workdir, $externalUrl, $useBasicParsing)
add-job -jobName "resolve-dnsname" -scriptBlock {
param($workdir = $args[0], $networkTestAddress = $args[1], $externalUrl = $args[2])
Resolve-DnsName -Name $networkTestAddress | out-file -Append "$($workdir)\resolve-dnsname.txt"
Resolve-DnsName -Name $externalUrl | out-file -Append "$($workdir)\resolve-dnsname.txt"
} -arguments @($workdir, $networkTestAddress, $externalUrl)
add-job -jobName "nslookup" -scriptBlock {
param($workdir = $args[0], $networkTestAddress = $args[1], $externalUrl = $args[2])
write-host "nslookup"
out-file -InputObject "querying nslookup for $($externalUrl)" -Append "$($workdir)\nslookup.txt"
Invoke-Expression "nslookup $($externalUrl) | out-file -Append $($workdir)\nslookup.txt"
out-file -InputObject "querying nslookup for $($networkTestAddress)" -Append "$($workdir)\nslookup.txt"
Invoke-Expression "nslookup $($networkTestAddress) | out-file -Append $($workdir)\nslookup.txt"
} -arguments @($workdir, $networkTestAddress, $externalUrl)
if ((test-path "C:\Windows\System32\LogFiles\HTTPERR")) {
write-host "http log files"
copy-item -path "C:\Windows\System32\LogFiles\HTTPERR\*" -Destination $workdir -Force -Filter "*.log"
}
add-job -jobName "firewall" -scriptBlock {
param($workdir = $args[0])
Invoke-Expression "reg.exe query HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\SharedAccess\Parameters\FirewallPolicy\FirewallRules /s > $($workDir)\firewallrules.reg.txt"
Get-NetFirewallRule | out-file "$($workdir)\firewall-config.txt"
} -arguments @($workdir)
add-job -jobName "get-nettcpconnetion" -scriptBlock {
param($workdir = $args[0])
Get-NetTCPConnection | format-list * | out-file "$($workdir)\netTcpConnection.txt"
Get-NetTCPConnection | Where-Object RemotePort -eq 1026 | out-file "$($workdir)\connected-nodes.txt"
} -arguments @($workdir)
add-job -jobName "get-netadapterchecksumoffload" -scriptBlock {
param($workdir = $args[0])
get-netadapterchecksumoffload | format-list * | out-file "$($workdir)\netadapterchecksumoffload.txt"
} -arguments @($workdir)
add-job -jobName "get-netnatstaticmapping" -scriptBlock {
param($workdir = $args[0])
get-netnatstaticmapping | format-list * | out-file "$($workdir)\netnatstaticmapping.txt"
} -arguments @($workdir)
write-host "netstat ports"
Invoke-Expression "netstat -bna > $($workdir)\netstat.txt"
write-host "netsh ssl"
Invoke-Expression "netsh http show sslcert > $($workdir)\netshssl.txt"
write-host "ip info"
Invoke-Expression "ipconfig /all > $($workdir)\ipconfig.txt"
write-host "winrm settings"
Invoke-Expression "winrm get winrm/config/client > $($workdir)\winrm-config.txt"
}
if ($certInfo) {
write-host "certs (output scrubbed)"
certutil -verifystore MY | out-file "$($workdir)\certs.txt"
[regex]::Replace((get-content -raw "$($workdir)\certs.txt"), "[0-9a-fA-F]{20}`r`n", "xxxxxxxxxxxxxxxxxxxx`r`n") | out-file "$($workdir)\certs.txt"
}
#
# service fabric information
#
if (!$noSF) {
write-host "service fabric reg"
Invoke-Expression "reg.exe query `"HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Service Fabric`" /s > $($workDir)\serviceFabric.reg.txt"
Invoke-Expression "reg.exe query HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\ServiceFabricNodeBootStrapAgent /s > $($workDir)\serviceFabricNodeBootStrapAgent.reg.txt"
if ((test-path $serviceFabricInstallReg)) {
enumerate-serviceFabric
}
else {
write-warning "service fabric is *not* installed on this machine!"
}
}
write-host "waiting for $($jobs.Count) jobs to complete"
monitor-jobs
if ($perfmonMin -gt 0) {
add-job -jobName "perfmon" -scriptBlock {
param($workdir = $args[0], $perfmonMin = $args[1])
$command = "Logman.exe create counter sfnodediag -o $($workdir)\PerfCounters.blg -f bincirc -v mmddhhmm -max 300 -c " `
+ '"\Memory\*" ' `
+ '"\.NET CLR Memory(*)\*" ' `
+ '"\Network Interface(*)\*" ' `
+ '"\Netlogon(*)\*" ' `
+ '"\Paging File(*)\*" ' `
+ '"\PhysicalDisk(*)\*" ' `
+ '"\Processor(*)\*" ' `
+ '"\Process(*)\*" ' `
+ '"\Server\*" ' `
+ '"\System\*" ' `
+ "-si 00:00:01"
invoke-expression $command
invoke-expression "logman.exe start sfnodediag"
start-sleep -seconds ($perfmonMin * 60)
invoke-expression "logman.exe stop sfnodediag"
invoke-expression "logman.exe delete sfnodediag"
} -arguments @($workdir, $perfmonMin)
}
if ($netmonMin -gt 0) {
add-job -jobName "netmon" -scriptBlock {
param($workdir = $args[0], $netmonMin = $args[1])
Invoke-Expression "netsh trace start capture=yes overwrite=yes maxsize=1024 tracefile=$($workdir)\net.etl filemode=circular > $($workdir)\netmon.txt"
start-sleep -seconds ($netmonMin * 60)
Invoke-Expression "netsh trace stop >> $($workdir)\netmon.txt"
} -arguments @($workdir, $netmonMin)
}
if ($runCommand) {
add-job -jobName "runCommand" -scriptBlock {
param($workdir = $args[0], $runCommand = $args[1])
Invoke-Expression "$($runCommand) > $($workdir)\runCommand.txt"
} -arguments @($workdir, $runCommand)
}
write-host "formatting xml files"
foreach ($file in (get-childitem -filter *.xml -Path "$($workdir)" -Recurse)) {
# format xml in output
read-xml -xmlFile $file.FullName -format
}
monitor-jobs
write-host "cleaning empty directories in output"
$allDirectories = (Get-ChildItem $workdir -Recurse -Directory).FullName | Sort-Object -Descending
foreach ($dir in $allDirectories) {
$dirItem = [io.directoryinfo]::new($dir)
$error.Clear()
try {
if ($dirItem.GetFiles() + $dirItem.GetDirectories()) {
write-host "skipping output directory removal: $dir"
continue
}
else {
write-host "removing empty output directory: $dir"
remove-item $dir -Force
}
}
catch {
write-host "error during dir check $($error | out-string)"
}
}
$global:zipFile = compress-file $workDir
}
function add-job($jobName, $scriptBlock, $arguments) {
write-host "adding job $($jobName)"
[void]$jobs.Add((Start-Job -Name $jobName -ScriptBlock $scriptBlock -ArgumentList $arguments))
}
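# usage sketch (not executed): the collection jobs in process-machine follow this pattern.
# the job name "uptime" and output file "uptime.txt" are hypothetical and shown only for illustration.
# values are passed positionally through -arguments because Start-Job runs the script block in a separate process,
# so variables from this script's scope are not visible inside the block.
# add-job -jobName "uptime" -scriptBlock {
#     param($workdir = $args[0])
#     (get-date) - (get-ciminstance Win32_OperatingSystem).LastBootUpTime | out-file "$($workdir)\uptime.txt"
# } -arguments @($workdir)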
function compress-file($dir) {
$zipFile = "$($dir).zip"
write-host "creating zip $($zipFile)"
write-debug "zip dir before: $(tree /a /f $dir | out-string)"
if ((test-path $zipFile )) {
remove-item $zipFile -Force
}
if (!$legacy) {
Stop-Transcript | out-null
Compress-archive -path $dir -destinationPath $zipFile -Force
Start-Transcript -Path $logFile -Force -Append | Out-Null
}
else {
Add-Type -Assembly System.IO.Compression.FileSystem
$compressionLevel = [System.IO.Compression.CompressionLevel]::Optimal
[void][System.IO.Compression.ZipFile]::CreateFromDirectory($dir, $zipFile, $compressionLevel, $false)
}
$global:zipFile = $zipFile
write-debug "zip dir after: $(tree /a /f $dir | out-string)"
return $zipFile
}
function enumerate-serviceFabric() {
$fabricDataRoot = (get-itemproperty -path $serviceFabricInstallReg).fabricdataroot
write-host "fabric data root:$($fabricDataRoot)"
if (!$fabricDataRoot) {
$fabricDataRoot = $defaultFabricDataRoot
}
$fabricLogRoot = (get-itemproperty -path $serviceFabricInstallReg).fabriclogroot
write-host "fabric log root:$($fabricLogRoot)"
if (!$fabricLogRoot) {
$fabricLogRoot = $defaultFabricLogRoot
}
if ($logMin -and (test-path $fabricLogRoot)) {
add-job -jobName "fabric log files" -scriptBlock {
param($workdir = $args[0], $fabricLogRoot = $args[1], $logMin = $args[2])
foreach ($file in (Get-ChildItem -Path $fabricLogRoot -Recurse).FullName) {
try {
if (![regex]::isMatch($file, ".+?(?:\.trace$|\.blg$|\.etl$|\.dtr$|\.zip$|\\sfcontainerlogs.+?\.(?:out|err)$)", [text.regularExpressions.regexOptions]::ignoreCase)) {
write-verbose "skipping file $file"
continue
}
$fileInfo = new-object io.fileInfo($file)
if ($fileInfo.LastWriteTime -gt (get-date).AddMinutes(-$logMin)) {
write-host "copying file $file $($fileInfo.LastWriteTime) to $workdir"
copy-item -Path $file -Destination $workdir
}
else {
write-verbose "skipping file $file $($fileInfo.LastWriteTime)"
}
}
catch {
write-host "error copying file $file to $workdir $error"
$error.clear()
continue
}
}
} -arguments @($workdir, $fabricLogRoot, $logMin)
}
add-job -jobName "fabric config files" -scriptBlock {
param($workdir = $args[0], $fabricDataRoot = $args[1])
Get-ChildItem $($fabricDataRoot) -Recurse | out-file "$($workDir)\dir-fabricdataroot.txt"
Copy-Item -Path $fabricDataRoot -Filter "*.xml" -Destination $workdir -Recurse
} -arguments @($workdir, $fabricDataRoot)
$clusterManifestFile = "$($fabricDataRoot)\clustermanifest.xml"
if ((test-path $clusterManifestFile)) {
write-host "reading $($clusterManifestFile)"
$xml = read-xml -xmlFile $clusterManifestFile
$xml.clustermanifest
try {
$seedNodes = $xml.ClusterManifest.Infrastructure.PaaS.Votes.Vote
write-host "azure service fabric cluster"
write-host "seed nodes: $($seedNodes | format-list * | out-string)"
$nodeCount = $xml.ClusterManifest.Infrastructure.PaaS.Roles.Role.RoleNodeCount
write-host "node count:$($nodeCount)"
$clusterId = (($xml.ClusterManifest.FabricSettings.Section | Where-Object Name -eq "Paas").childnodes | where-object Name -eq "ClusterId").value
write-host "cluster id:$($clusterId)"
$upgradeServiceParams = ($xml.ClusterManifest.FabricSettings.Section | Where-Object Name -eq "UpgradeService").parameter
$sfrpUrl = ($upgradeServiceParams | Where-Object Name -eq "BaseUrl").Value
$sfrpUrl = "$($sfrpUrl)$($clusterId)"
write-host "sfrp url:$($sfrpUrl)"
out-file -InputObject $sfrpUrl "$($workdir)\sfrp-response.txt"
$ucert = ($upgradeServiceParams | Where-Object Name -eq "X509FindValue").Value
add-job -jobName "sfrp check" -scriptBlock {
param($workdir = $args[0], $sfrpUrl = $args[1], $ucert = $args[2], $useBasicParsing = $args[3])
if ($useBasicParsing) {
$sfrpResponse = Invoke-WebRequest $sfrpUrl -UseBasicParsing -Certificate (Get-ChildItem -Path Cert:\LocalMachine\My -Recurse | Where-Object Thumbprint -eq $ucert)
}
else {
$sfrpResponse = Invoke-WebRequest $sfrpUrl -Certificate (Get-ChildItem -Path Cert:\LocalMachine\My -Recurse | Where-Object Thumbprint -eq $ucert)
}
write-host "sfrp response: $($sfrpresponse)"
out-file -Append -InputObject $sfrpResponse "$($workdir)\sfrp-response.txt"
} -arguments @($workdir, $sfrpUrl, $ucert, $useBasicParsing)
}
catch {
$seedNodes = $xml.ClusterManifest.Infrastructure.WindowsServer.NodeList.Node
write-host "seed nodes: $($seedNodes | format-list * | out-string)"
write-host "standalone service fabric cluster"
}
$httpGwEpt = $xml.ClusterManifest.NodeTypes.FirstChild.Endpoints.HttpGatewayEndpoint
$clusterCertThumb = $xml.ClusterManifest.NodeTypes.FirstChild.Certificates.ClientCertificate.X509FindValue
$clusterCert = (Get-ChildItem -Path Cert:\LocalMachine\My -Recurse | Where-Object Thumbprint -eq $clusterCertThumb)
write-host "cluster cert: $($clusterCert | format-list * | out-string)"
# todo: handle ContinuationToken for paged rest responses
$gwEpt = "$($httpGwEpt.Protocol)://localhost:$($httpGwEpt.Port)"
$urlArgs = "api-version=$($apiversion)&timeout=$($restTimeoutSec)&StartTimeUtc=$($startTime.ToString(`"yyyy-MM-ddTHH:mm:ssZ`"))&EndTimeUtc=$($endTime.ToString(`"yyyy-MM-ddTHH:mm:ssZ`"))"
<#
repair task state flags (bitmask):
1 - Created
2 - Claimed
4 - Preparing
8 - Approved
16 - Executing
32 - Restoring
64 - Completed
#>
# query every repair task state except 64 - Completed
$stateFilter = 1 -bor 2 -bor 4 -bor 8 -bor 16 -bor 32
rest-query -url "$($gwEpt)/$/GetRepairTaskList?api-version=$($apiversion)&timeout=$($restTimeoutSec)&StateFilter=$stateFilter" -cert $clusterCert | out-file "$($workdir)\repair-tasks.txt"
rest-query -url "$($gwEpt)/ImageStore?$($urlArgs)" -cert $clusterCert | out-file "$($workdir)\rest-imageStore.txt"
rest-query -url "$($gwEpt)/Nodes?$($urlArgs)" -cert $clusterCert | out-file "$($workdir)\rest-nodes.txt"
rest-query -url "$($gwEpt)/$/GetClusterHealth?$($urlArgs)" -cert $clusterCert | out-file "$($workdir)\rest-getClusterHealth.txt"
rest-query -url "$($gwEpt)/EventsStore/Cluster/Events?$($urlArgs)" -cert $clusterCert | out-file "$($workdir)\rest-eventsCluster.txt"
rest-query -url "$($gwEpt)/EventsStore/Nodes/Events?$($urlArgs)" -cert $clusterCert | out-file "$($workdir)\rest-eventsNodes.txt"
rest-query -url "$($gwEpt)/EventsStore/Applications/Events?$($urlArgs)" -cert $clusterCert | out-file "$($workdir)\rest-eventsApplications.txt"
rest-query -url "$($gwEpt)/EventsStore/Services/Events?$($urlArgs)" -cert $clusterCert | out-file "$($workdir)\rest-eventsServices.txt"
rest-query -url "$($gwEpt)/EventsStore/Partitions/Events?$($urlArgs)" -cert $clusterCert | out-file "$($workdir)\rest-eventsPartition.txt"
}
$fabricRoot = (get-itemproperty -path $serviceFabricInstallReg).fabricroot
write-host "fabric root:$($fabricRoot)"
Get-ChildItem $($fabricRoot) -Recurse | out-file "$($workDir)\dir-fabricroot.txt"
}
function monitor-jobs() {
# running job count at the time of the last status report
$previousRunningCount = 0
while (@(get-job).Count -gt 0) {
$runningCount = @(get-job | Where-Object State -eq "Running").Count
# report status when no jobs are running or the running count has changed
if ($runningCount -eq 0 -or $previousRunningCount -ne $runningCount) {
write-host "`n$((get-date).ToString("HH:mm:sszz")) waiting on $($runningCount) jobs..." -ForegroundColor Yellow
$previousRunningCount = $runningCount
foreach ($job in get-job) {
write-host ("name:$($job.Name) state:$($job.State) output:$((Receive-Job -job $job | format-list * | out-string))") -ForegroundColor Cyan
if ($job.State -imatch "Failed|Completed") {
remove-job $job -Force
}
}
}
# nothing left to wait on
if (@(get-job).Count -eq 0) {
break
}
start-sleep -seconds 1
write-host "." -NoNewline
# checked on every pass so jobs stuck in a non-running state cannot loop forever
if (((get-date) - $timer).TotalMinutes -ge $timeoutMinutes) {
write-error "script timed out waiting for jobs to complete. timeout: $($timeoutMinutes) minutes"
get-job | receive-job
get-job | remove-job -force
return
}
}
}
function read-xml($xmlFile, [switch]$format) {
try {
write-host "reading xml file $($xmlFile)"
[Xml.XmlDocument] $xdoc = New-Object System.Xml.XmlDocument
[void]$xdoc.Load($xmlFile)
if ($format) {
[IO.StringWriter] $sw = new-object IO.StringWriter
[Xml.XmlTextWriter] $xmlTextWriter = new-object Xml.XmlTextWriter ($sw)
$xmlTextWriter.Formatting = [Xml.Formatting]::Indented
$xdoc.PreserveWhitespace = $true
[void]$xdoc.WriteTo($xmlTextWriter)
#write-host ($sw.ToString())
out-file -FilePath $xmlFile -InputObject $sw.ToString()
}
return $xdoc
}
catch {
write-host "error reading xml file $($xmlFile): $($_ | out-string)"
return $null
}
}
function rest-query($cert, $url) {
$error.clear()
try {
$result = $null
write-host "rest query: $($url)" -foregroundcolor cyan
if ($useBasicParsing) {
$result = Invoke-RestMethod -Method Get -Certificate $cert -Uri $url -UseBasicParsing | format-list * | Out-String
}
else {
$result = Invoke-RestMethod -Method Get -Certificate $cert -Uri $url | format-list * | Out-String
}
write-host "rest result: `n$($result)"
return $result
}
catch {
write-host "rest exception $($error | out-string)"
$error.clear()
return $null
}
}
try {
# process command line arguments on recursive call
# arrays on command line easier to pass as strings
# create argument list with all values including defaults
foreach ($param in $MyInvocation.MyCommand.Parameters.GetEnumerator()) {
$error.clear()
Write-Debug "checking parameter $($param)"
Write-Debug "checking parameter type $($param.value.ParameterType)"
# remove remoteMachines
if ($param.key -imatch "remoteMachines") {
continue
}
$paramValue = get-variable -ValueOnly -Name $param.key -ErrorAction SilentlyContinue
if ($error) {
write-debug "error $($error | out-string)"
$error.Clear()
continue
}
if ($param.Value.ParameterType -imatch "bool") {
if ($paramValue -ieq 'true') {
$paramValue = 1
}
else {
$paramValue = 0
}
}
# remove switches unless true
if ($param.Value.ParameterType -imatch "switch" -and $paramValue.Ispresent -eq $false) {
continue
}
elseif ($param.Value.ParameterType -imatch "switch") {
$paramValue = $null
}
# remove empty strings for now
if ($param.Value.ParameterType -imatch "string" -and !$paramValue) {
continue
}
elseif ($param.Value.ParameterType -imatch "string") {
$paramValue = "`"$paramValue`""
}
# join arrays passed as string due to command line issues with arrays
if ($param.Value.ParameterType.IsArray -and $paramValue) {
if ($paramValue.count -le 1) {
[object[]]$paramValue = $paramValue.Replace(" ", ",").Split(",")
}
#$paramvalue = (get-variable -ValueOnly -Name $param.key -ErrorAction SilentlyContinue) -join ","
#$paramValue = "`'$($arr)`'"
}
if ($paramValue -and $paramValue.ToString().Contains(' ')) {
$global:allparams.Add($param.key, "`"$($paramValue)`"")
}
else {
$global:allparams.Add($param.key, $paramValue)
}
}
write-host "arguments:"
write-host ($global:allparams | out-string)
main
}
catch {
write-error "main exception: $($error | out-string)"
}
finally {
if ($winrmClientInfo) {
# restore the original winrm TrustedHosts value on the local machine
winrm set winrm/config/client "@{TrustedHosts="$($trustedHosts)"}"
}
if ($disableWarnOnZoneCrossing) {
New-ItemProperty -Path $warnonZoneCrossingReg -Name "WarnonZoneCrossing" -Value 0 -PropertyType DWORD -Force | Out-Null
}
set-location $currentWorkDir
get-job | remove-job -Force
write-debug "errors during script: $($error | out-string)"
if (!$legacy) {
Stop-Transcript
}
if ($global:zipFile) {
Set-Clipboard -Path $global:zipFile
write-host "zip path added to clipboard:$($global:zipFile)" -ForegroundColor Cyan
write-host "upload zip to workspace:$($global:zipFile)" -ForegroundColor Cyan
}
write-host "finished. total minutes: $(((get-date) - $timer).TotalMinutes.tostring("F2"))"
}