From 3e83b2e36c3c79620ca1662bab71223fdc00847a Mon Sep 17 00:00:00 2001
From: "Kyle J. McKay"
Date: Mon, 12 Feb 2018 15:18:20 -0800
Subject: [PATCH] project-fsck-status.sh: provide --no-full mode and options

The "git fsck" command provides a very exhaustive and complete check of
a repository. It examines all objects for corruption and validity as
well as their interconnectivity.

Unfortunately, the "git fsck" command does not run particularly quickly
and has a tendency to be overly chatty about things that do not impact
remote client cloning, fetching or pushing.

Running "git fsck" on all Girocco repositories on a regular basis
imposes an unwanted load on the server not to mention the amount of
time needed for the check to run on all repositories.

Introduce a new "--full" option that requests the full "git fsck" check
and make the default a "--no-full" option that does not use "git fsck"
but detects common repository corruption issues such as missing objects
or most corrupted objects.

This new default "--no-full" option makes use of the
"git rev-list --objects --all" command to provide this capability. As a
result it runs much, much, much faster and is now suitable for more
regular use. In addition it only reports truly fatal errors.

The only thing it does not detect is corrupted 'blob' objects (it will
detect missing 'blob' objects though). Since it's unlikely that any
'blob' objects became corrupted since they passed through
'transfer.fsckObjects=true' this provides a welcome speed up while
still finding all connectivity issues.

In addition to the "--full/--no-full" options, the ability to check an
explicitly given list of one or more projects has been added and the
ability to suppress the output (or email) if there are no errors (-E)
or no errors or warnings (-W).

Running weekly with the -m and -E options now becomes both reasonable
and desirable.
The load on the server will now be much, much less and no email at all
will be generated unless fatal problems are detected.

The previous default behavior can still be accessed via the "--full"
option.

To further assist with use of this utility, a "-h/--help" option has
also now been implemented.

Signed-off-by: Kyle J. McKay
---
 cron/girocco                           |   4 +-
 toolbox/reports/project-fsck-status.sh | 234 +++++++++++++++++++++++++++++----
 2 files changed, 212 insertions(+), 26 deletions(-)

diff --git a/cron/girocco b/cron/girocco
index e7ce679..36adb5a 100644
--- a/cron/girocco
+++ b/cron/girocco
@@ -57,8 +57,8 @@
 # Report project disk space usage once a week with an email to admin
 #23 7 * * 3 repo "$HOME/repomgr/toolbox/reports/project-disk-use.sh" -m 100
 
-# Report project fsck status once a month with an email to admin
-#37 19 17 * * repo "$HOME/repomgr/toolbox/reports/project-fsck-status.sh" -m
+# Report project fsck status once a week if any errors with an email to admin
+#23 7 * * 4 repo "$HOME/repomgr/toolbox/reports/project-fsck-status.sh" -mE
 
 # The job daemon and task daemon are run in a screen
 # The ../screen/screenrc file needs to be installed to
diff --git a/toolbox/reports/project-fsck-status.sh b/toolbox/reports/project-fsck-status.sh
index 5f3cfe1..77077b9 100755
--- a/toolbox/reports/project-fsck-status.sh
+++ b/toolbox/reports/project-fsck-status.sh
@@ -1,29 +1,177 @@
 #!/bin/sh
 
-# Report on project fsck status.
+# Report on project repository status.
 # Output can be sent as email to admin with -m
 # Automatically runs with nice and ionice (if available)
 
-# Usage: project-fsck-status [-m]
-
-# With -m mail the report to $cfg_admin instead of sending it to stdout
-
-# Shows fsck status for all projects with details for any issues
-
-# Note that only projects listed in $cfg_chroot/etc/group are checked
-
 set -e
 
 datefmt='%Y-%m-%d %H:%M:%S %z'
 startdate="$(date "+$datefmt")"
 
 . @basedir@/shlib.sh
+PATH="$cfg_basedir/bin:${PATH:-$(/usr/bin/getconf PATH)}"
+export PATH
+
+USAGE="${0##*/} [-hmEW] [--full] [...]"
+HELP="
+NAME
+    ${0##*/} - report on \"health\" of Girocco repositories
+
+SYNOPSIS
+    $USAGE
+
+DESCRIPTION
+    The ${0##*/} script checks all Girocco repositories for
+    consistency and includes a count of empty repositories as well.
+
+    The results can automatically be mailed to the Girocco admin or
+    simply displayed on STDOUT or optionally suppressed if there are
+    no errors or warnings.
+
+    Note that this utility currently provides absolutely no progress
+    reporting even when run on a terminal so there will be no output
+    whatsoever until all projects have been checked.
+
+    The \"checking\" runs using nice and, if available, ionice.
+
+    Projects not listed in \$chroot/etc/group will not be checked.
+
+OPTIONS
+    -h      show this help
+
+    -m      e-mail results to the \$Girocco::Config::admin address.
+            This option suppresses output to STDOUT.
+
+    -E      Suppress output (i.e. no e-mail will be sent if -m has been used)
+            unless at least one error has been detected. If this flag is
+            used WITHOUT -W and warnings occur, output will still be
+            suppressed.
+
+    -W      Suppress output unless at least one error or warning has been
+            detected. This flag overrides -E and will always produce output
+            when only warnings are detected.
+
+    --full  Run a full \"git fsck\" check on each repository. Without this
+            option a much, much, much (5x or more) faster check is done on
+            each repository using \"git rev-list --objects --all\" that will
+            verify all objects reachable from any ref are present but does
+            not perform all the validation checks nor does it verify the
+            integrity of 'blob' objects (just that they are present).
+
+            If the default Girocco 'transfer.fsckObjects=true' option has
+            been left intact, use of the \"--full\" option should not
+            normally be necessary.
+
+    --no-full
+            Disables --full. Present for completeness.
+
+    --list  Show a list of all projects checked in the order checked.
+            This is the default if any specific projects are given.
+
+    --no-list
+            Do NOT show a list of each project name checked.
+            This is the default when all projects are being checked.
+
+
+            Optionally one or more Girocco project names may be given to
+            check only those projects.
+
+TIPS
+    Leave Girocco's default 'transfer.fsckObjects=true' setting alone and
+    have a cron job run this utility in the default --no-full mode using the
+    -m and -E options once a week to detect repository corruption.
+
+    The default --no-full mode will detect any fatal errors with all
+    reachable 'tag', 'commit' and 'tree' objects. The --no-full mode will
+    detect missing 'blob' objects but not corrupt 'blob' objects.
+
+    The --no-full mode is an order of magnitude faster than --full mode and
+    does not complain about non-fatal object errors.
+"
+
+usage() {
+	if [ "${1:-1}" = "1" ]; then
+		printf 'usage: %s\n' "$USAGE" >&2
+		return 0
+	fi
+	printf '%s\n' "$HELP"
+}
 
 mailresult=
-if [ "$1" = "-m" ]; then
+minoutput=0 # 0 => always output, 1 => if warn or err, 2 => if err
+fullcheck=
+onlylist=
+shownames=
+
+while [ $# -gt 0 ]; do
+	if [ "${1#-[!-]?}" != "$1" ]; then
+		_rest="${1#-?}"
+		_next="${1%"$_rest"}"
+		shift
+		set -- "$_next" "-$_rest" "$@"
+	fi
+	case "$1" in
+		"-?")
+			usage 1 2>&1
+			exit 0
+			;;
+		-h|--help)
+			usage 2
+			exit 0
+			;;
+		-m|--mail)
+			mailresult=1
+			;;
+		-E)
+			[ "${minoutput:-0}" != "0" ] || minoutput=2
+			;;
+		-W)
+			minoutput=1
+			;;
+		--full)
+			fullcheck=1
+			;;
+		--no-full)
+			fullcheck=
+			;;
+		--list)
+			shownames=1
+			;;
+		--no-list)
+			shownames=0
+			;;
+		--)
+			shift
+			break
+			;;
+		-?*)
+			echo "${0##*/}: unknown option: $1" >&2
+			usage 1
+			exit 1
+			;;
+		*)
+			break
+			;;
+	esac
 	shift
-	mailresult=1
-fi
+done
+
+[ -n "$shownames" ] || [ $# -eq 0 ] || shownames=1
+
+while [ $# -gt 0 ]; do
+	aproj="${1%.git}"
+	shift
+	if ! [ -d "$cfg_reporoot/$aproj.git" ]; then
+		echo "${0##*/}: fatal: no such dir: $cfg_reporoot/$aproj.git" >&2
+		exit 1
+	fi
+	if ! is_git_dir "$cfg_reporoot/$aproj.git"; then
+		echo "${0##*/}: fatal: not a git dir: $cfg_reporoot/$aproj.git" >&2
+		exit 1
+	fi
+	onlylist="${onlylist:+$onlylist }$aproj.git"
+done
 
 hasnice=
 ! command -v nice >/dev/null || hasnice=1
@@ -51,22 +199,30 @@ is_empty_proj() {
 	test $(find -L "$_pd/refs" -type f -print 2>/dev/null | head -n 1 | LC_ALL=C wc -l) -eq 0
 }
 
-get_fsck_proj() (
+get_check_proj() (
 	cd "$cfg_reporoot/$1.git" || {
 		echo "no such directory: $cfg_reporoot/$1.git"
 		return 1
 	}
-	# using --strict changes "zero-padded file modes" from a warning into an error
-	# which we do NOT want so we do NOT use --strict
-	cmd="git fsck"
-	[ -z "$var_have_git_1710" ] || cmd="$cmd --no-dangling"
-	# no need for --no-progress (v1.7.9+) since stderr will not be a tty
-	cmd="$cmd 2>&1"
+	if [ -n "$fullcheck" ]; then
+		# use git fsck
+		# using --strict changes "zero-padded file modes" from a warning into an error
+		# which we do NOT want so we do NOT use --strict
+		cmd="git fsck"
+		[ -z "$var_have_git_1710" ] || cmd="$cmd --no-dangling"
+		# no need for --no-progress (v1.7.9+) since stderr will not be a tty
+		cmd="$cmd 2>&1"
+	else
+		# use git rev-list --objects --all
+		cmd="git rev-list --objects --all"
+		# but we only want stderr output not any of the objects list
+		cmd="$cmd 2>&1 >/dev/null"
+	fi
 	[ -z "$hasionice" ] || cmd="ionice -c 3 $cmd"
 	[ -z "$hasnice" ] || cmd="nice -n 19 $cmd"
 	fsckresult=0
 	fsckoutput="$(eval "$cmd")" || fsckresult=$?
-	if [ -z "$var_have_git_1710" ]; then
+	if [ -n "$fullcheck" ] && [ -z "$var_have_git_1710" ]; then
 		# strip lines starting with "dangling" since --no-dangling is not supported
 		# note that "dangling" is NOT translated
 		fsckoutput="$(printf '%s\n' "$fsckoutput" | LC_ALL=C sed -n '/^dangling/!p')"
@@ -89,6 +245,12 @@ warncount=0
 errresults=
 # warnresults are non-empty results from 0 status fsck runs
 warnresults=
+# if non-empty, warning results may be generated
+haswarn=1
+[ -n "$fullcheck" ] || haswarn=
+# list of all projects checked in order if --list is active
+allprojs=
+
 while IFS='' read -r proj; do
 	if [ -L "$proj" ]; then
 		# symlinks to elsewhere under $cfg_reporoot are ignored
@@ -103,13 +265,14 @@ while IFS='' read -r proj; do
 	proj="${proj%.git}"
 	is_listed_proj "$proj" && is_git_dir "$cfg_reporoot/$proj.git" || continue
 	[ -d "$proj.git/objects" ] || continue
+	[ "${shownames:-0}" = "0" ] || allprojs="${allprojs:+$allprojs }$proj"
 	howmany="$(( $howmany + 1 ))"
 	if is_empty_proj "$proj"; then
 		mtcount="$(( $mtcount + 1 ))"
 		continue
 	fi
 	ok=1
-	output="$(get_fsck_proj "$proj")" || ok=
+	output="$(get_check_proj "$proj")" || ok=
 	[ -z "$ok" ] || okcount="$(( $okcount + 1 ))"
 	[ -n "$ok" ] || [ -n "$output" ] || output="git fsck failed with no output"
 	if [ -n "$output" ]; then
@@ -125,10 +288,19 @@ while IFS='' read -r proj; do
 		fi
 	fi
 done </dev/null)
+$(
+	if [ -n "$onlylist" ]; then
+		printf '%s\n' $onlylist
+	else
+		find -L . -type d \( -path ./_recyclebin -o -path ./_global -o -name '*.git' -print \) -prune 2>/dev/null |
+		LC_ALL=C sort -f
+	fi
+)
 EOT
 
 enddate="$(date "+$datefmt")"
+[ "$minoutput" != "1" ] || [ "$howmany" != "$(( $okcount + $mtcount ))" ] || [ "$warncount" != "0" ] || exit 0
+[ "$minoutput" != "2" ] || [ "$howmany" != "$(( $okcount + $mtcount ))" ] || exit 0
 domail=cat
 [ -z "$mailresult" ] || domail='mailref "fsck@$cfg_gitweburl" -s "[$cfg_name] Project Fsck Status Report" "$cfg_admin"'
 {
@@ -140,14 +312,28 @@ Project Fsck Status Report
 End Time: $enddate
 Projects Checked: $howmany
-Projects Okay: $(( $okcount + $mtcount )) (passed + warned + empty)
+Projects Okay: $(( $okcount + $mtcount )) (passed + ${haswarn:+warned + }empty)
 Projects Passed: $(( $okcount - $warncount ))
+EOT
+	[ -z "$haswarn" ] || cat <