The main issue I had with my original script was that, with the sheer number of pictures we had, it didn’t finish in a reasonable time. What I needed was a way for the script to pick up where it left off after an interruption (like a reboot) instead of starting over. So I took the original script and whacked it with the ScriptHammer(TM) again.
Updated script after the jump with notes following!
    [CmdletBinding()]
    param (
        [parameter(Mandatory=$True)]
        [string]$Path
    )

    if (!(Test-Path -PathType Container $Path)) {
        Write-Error "Invalid path specified."
        Exit
    }

    $ProcessedFile="d:\temp\Processed.txt"
    $LogFile="d:\temp\comparisonresults.csv"

    $Processed=@()
    if (Test-Path $ProcessedFile) {
        $Processed=Get-Content $ProcessedFile
    }

    Write-Verbose "Scanning Path : $Path"
    $Files=gci -File -Recurse -path $Path | Select-Object -property FullName,Length
    $Count=1

    $MatchedSourceFiles=@()
    if (Test-Path $LogFile) {
        $MatchedSourceFiles=Import-csv $LogFile
    }

    $Directories=gci -Recurse -Directory -path $Path | ? {$Processed -notcontains $_.FullName}
    $Directories+=get-item $Path
    $TotalDirectories=$Directories.Count

    Foreach ($Directory in $Directories) {
        Write-Verbose "Directory : $($Directory)"
        $LocalFiles=gci -File -path $Directory.FullName | Select-Object -property FullName,Length
        Write-Progress -Activity "Processing Directories" -status "Processing Directory $Count / $TotalDirectories" -PercentComplete ($Count / $TotalDirectories * 100)
        ForEach ($SourceFile in $LocalFiles) {
            if (!(($MatchedSourceFiles.MatchingFiles | % {$_ -like "*" + $SourceFile.FullName+"*"}) -eq $True)) {
                $MatchingFiles=@()
                Foreach ($TargetFile in $Files) {
                    if (($SourceFile.FullName -ne $TargetFile.FullName)) {
                        #Write-Verbose "Matching $($SourceFile.FullName) and $($TargetFile.FullName)"
                        if ($SourceFile.Length -eq $TargetFile.Length) {
                            if ((c:\Windows\System32\fc.exe /A $SourceFile.FullName $TargetFile.FullName) -contains "FC: no differences encountered") {
                                Write-Verbose "Match found."
                                $MatchingFiles+=$TargetFile.FullName
                            }
                        }
                    }
                }
                if ($MatchingFiles.Count -gt 0) {
                    $NewObject=[pscustomobject][ordered]@{
                        File=$SourceFile.FullName
                        MatchingFiles=[string]$MatchingFiles
                    }
                    $MatchedSourceFiles+=$NewObject
                }
            }
        }
        $Count+=1
        Add-Content -Path $ProcessedFile $Directory.FullName
        $MatchedSourceFiles |Export-CSV $LogFile -NoTypeInformation
    }
    $MatchedSourceFiles
Most of the new work happens at the start:
    $ProcessedFile="d:\temp\Processed.txt"
    $LogFile="d:\temp\comparisonresults.csv"

    $Processed=@()
    if (Test-Path $ProcessedFile) {
        $Processed=Get-Content $ProcessedFile
    }

    if (Test-Path $LogFile) {
        $MatchedSourceFiles=Import-csv $LogFile
    }

    $Directories=gci -Recurse -Directory -path $Path | ? {$Processed -notcontains $_.FullName}
    $Directories+=get-item $Path
We need both a file with all the matches in it ($LogFile, which we update as we go) and a file that contains a list of all the folders already processed ($ProcessedFile).
If $ProcessedFile exists, we load it (into $Processed); then, when the list of folders to scan is calculated, we only include the folders that aren’t listed in $Processed.
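To make the filter concrete, here is a minimal sketch of the same -notcontains test on plain strings. The folder names are made up for illustration, and the @() wrapper is an addition of this sketch rather than something the script above does: it keeps the result an array even when only one folder is left.

```powershell
# Hypothetical folder names standing in for the output of gci -Directory.
$AllFolders = @('D:\Pics\2019', 'D:\Pics\2020', 'D:\Pics\2021')

# Stand-in for the contents of $ProcessedFile from an earlier, interrupted run.
$Processed = @('D:\Pics\2019', 'D:\Pics\2020')

# Keep only the folders that are not already listed in $Processed.
# @() forces an array result, even when a single folder remains.
$Remaining = @($AllFolders | Where-Object { $Processed -notcontains $_ })

$Remaining   # D:\Pics\2021
```

Like the other PowerShell comparison operators, -notcontains ignores case by default, so a folder is still skipped if its name was recorded with different capitalisation.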
We do a similar thing with $MatchedSourceFiles; if the $LogFile csv exists we load it into $MatchedSourceFiles.
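The reload works because each row stores its matches as a single string, which is what the wildcard test in the main loop searches. A small round-trip sketch, assuming a throwaway CSV in the temp folder and made-up file names (the variable names $DemoLog, $Found and $AlreadyMatched are hypothetical):

```powershell
# Hypothetical log location; the script itself uses d:\temp\comparisonresults.csv.
$DemoLog = Join-Path ([System.IO.Path]::GetTempPath()) 'comparisonresults-demo.csv'

# One result row in the same shape as the script's. The [string] cast
# joins the array of matches into one space-separated string.
$Found = @('D:\Pics\b.jpg', 'D:\Pics\c.jpg')
$Row = [pscustomobject]@{
    File          = 'D:\Pics\a.jpg'
    MatchingFiles = [string]$Found
}
$Row | Export-Csv -Path $DemoLog -NoTypeInformation

# On the next run the rows come back as objects with string properties.
$Loaded = @(Import-Csv -Path $DemoLog)
$Loaded[0].MatchingFiles   # D:\Pics\b.jpg D:\Pics\c.jpg

# "Has this file already been reported as a match?" as a wildcard search.
$AlreadyMatched = ($Loaded.MatchingFiles |
    ForEach-Object { $_ -like '*D:\Pics\b.jpg*' }) -contains $True
$AlreadyMatched            # True

Remove-Item $DemoLog
```

One caveat with this approach: -like treats square brackets as wildcard characters, so a path containing [ or ] may not match itself unless it is escaped first.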
The last change is to make sure we add any folders we’ve finished processing to the $ProcessedFile:
    Add-Content -Path $ProcessedFile $Directory.FullName
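Put together, the load, filter and append steps behave roughly like the sketch below. It uses made-up folder names and a throwaway checkpoint file in the temp folder, and the helper Invoke-DemoRun with its -StopAfter parameter is hypothetical, there only to simulate an interrupted run.

```powershell
# Hypothetical checkpoint file; the script itself uses d:\temp\Processed.txt.
$DemoProcessed = Join-Path ([System.IO.Path]::GetTempPath()) 'Processed-demo.txt'
Remove-Item $DemoProcessed -ErrorAction SilentlyContinue

# Made-up folder names; no real directories are touched.
$Folders = @('D:\Pics\2019', 'D:\Pics\2020', 'D:\Pics\2021')

function Invoke-DemoRun {
    param([int]$StopAfter)

    # Load the checkpoint, if any, and skip what is already listed.
    $Done = @()
    if (Test-Path $DemoProcessed) { $Done = @(Get-Content $DemoProcessed) }
    $Todo = @($Folders | Where-Object { $Done -notcontains $_ })

    $Handled = 0
    foreach ($Folder in $Todo) {
        if ($Handled -ge $StopAfter) { break }   # simulate an interrupt
        # ... the real comparison work would happen here ...
        Add-Content -Path $DemoProcessed -Value $Folder   # record completion
        $Handled++
    }
    $Handled   # number of folders handled in this run
}

$FirstRun  = Invoke-DemoRun -StopAfter 2    # "interrupted" after two folders
$SecondRun = Invoke-DemoRun -StopAfter 10   # picks up the one that is left
"First run: $FirstRun, second run: $SecondRun"

Remove-Item $DemoProcessed
```

Because the folder name is written only after its work is done, an interrupt in the middle of a folder means that folder is simply processed again next time. In the full script the top-level $Path is added to $Directories after the filter, so that folder is revisited on every run.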