I wrote this script to check for updates to some common software used in my organization. It uses PowerShell to crawl each publisher’s website for the current version number and download link, then compares the version it found online with the version of the file already stored in that program’s folder inside the script directory. If the online version is newer, it downloads the file and renames it to a consistent format that includes the version number. The software crawled for download (64-bit where available) is:
- 7-Zip
- Google Chrome
- Mozilla Firefox
- Mozilla Firefox ESR
- Adobe Flash Player ActiveX
- Adobe Flash Player NPAPI Plugin
- Java
- VLC Media Player
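The local-version lookup and comparison described above can be sketched roughly as follows. This is not the code from the script itself: the helper names `Get-VersionFromFileName` and `Test-NewerVersion` are made up for illustration, and the sketch assumes version numbers contain only digits and dots. It casts both sides to `[version]` so that, for example, 9.20 counts as older than 16.04, which a plain string comparison would get wrong.

```powershell
# Hypothetical helper: pull the version out of a name like "7zip64-9.20.msi".
function Get-VersionFromFileName ($fileName, $prefix, $extension) {
    # Escape the literal parts so the "." before the extension is not a regex wildcard.
    $pattern = '^' + [regex]::Escape("$prefix-") + '(.+)' + [regex]::Escape(".$extension") + '$'
    if ($fileName -match $pattern) { return $Matches[1] }
    return $null
}

# Hypothetical helper: numeric comparison, assuming digits and dots only.
function Test-NewerVersion ($localVersion, $webVersion) {
    # [version] needs at least two components, so pad bare numbers such as "0".
    $local = if ("$localVersion" -notmatch '\.') { "$localVersion.0" } else { "$localVersion" }
    $web   = if ("$webVersion"   -notmatch '\.') { "$webVersion.0" }   else { "$webVersion" }
    return ([version]$local -lt [version]$web)
}

$localVersion = Get-VersionFromFileName "7zip64-9.20.msi" "7zip64" "msi"
$isNewer      = Test-NewerVersion $localVersion "16.04"
Write-Host "Local: $localVersion  Newer online: $isNewer"
```

A version string with letters in it (a beta suffix, for instance) would make the `[version]` cast throw, so a real crawl function would need to decide how to handle that case.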
I figured this script would be a good example to share because it uses a few different methods to find the information it needs on each publisher’s website: filtering the links on a download page, parsing an XML feed, and reading the text of page elements. You may find these examples helpful if you want to modify the code to crawl for something else. Adobe Flash Player for enterprise distribution requires a login on the website, so for it the script only checks the version and then opens a browser window at the login screen so that the user can download the MSI files manually. Since the Adobe enterprise distribution license says the download link cannot be shared, I have edited it out of the script, so you will need to put it back yourself if you want to use this function. Just search for all instances of “https://INSERT-ADOBE-DISTRIBUTION-LINK-HERE” and replace them with the URL you received from Adobe.
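As a small illustration of one of those methods, here is a sketch of reading version parts out of XML by casting text to `[xml]`, which is the approach the Flash functions take with the response from Adobe’s server. The XML below is a made-up sample, not Adobe’s real file: the element name, attribute layout, and numbers are assumptions chosen only to show the property-style access.

```powershell
# Made-up sample XML; the real feed's layout and values may differ.
[xml]$sample = @'
<version>
  <plugin major="25" minor="0" buildMajor="0" buildMinor="127" />
</version>
'@

# PowerShell exposes child elements and attributes as properties.
$plugin = $sample.version.plugin
$flashVersion = "$($plugin.major).$($plugin.minor).$($plugin.buildMajor).$($plugin.buildMinor)"
Write-Host "Parsed version: $flashVersion"
```

The same dotted access works whether a value is stored as an attribute or as a child element, which is why this method tends to need less string cleanup than filtering links.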
I split this script into functions: one function to download the software, and a separate function to crawl the website for each piece of software. Inside each crawl function, I separated the variables and crawl methods that might need modification from the part that needs no changes. This also means you can remove the last part of the script after line 315 (the section that prints the banner and calls each function in turn) if you would rather call each function manually from the PowerShell console to check for just one piece of software.
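Calling a single function from the console relies on dot-sourcing, which loads a script’s functions into the current session. The sketch below shows the mechanism with a throwaway file so it can run on its own; the file name `demo-functions.ps1` and the function `Download-Demo` are placeholders, and in practice you would dot-source your trimmed copy of the update script and call something like `Download-7zip` instead.

```powershell
# Create a throwaway script file that only defines a function (placeholder names).
$demo = Join-Path ([System.IO.Path]::GetTempPath()) "demo-functions.ps1"
Set-Content -Path $demo -Value 'function Download-Demo { "checking demo" }'

# Dot-source it: the leading ". " loads the function into this session.
. $demo

# The function can now be called on its own, like any other command.
$result = Download-Demo
Write-Host $result

# Clean up the throwaway file.
Remove-Item $demo
```

If the calling section at the end of the script is left in place, dot-sourcing the file would run every check immediately, which is why it needs to be removed first.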
Again, this script is very application-specific, so unless you need to check for updates to and download this exact set of software, you will need to modify it for whatever software you need. Use it as a very broad example of crawling the web with PowerShell. There are many different ways to do this, and I’m sure some are better, but this way works exactly as I need it to for my use case. And of course, if the website for a particular piece of software changes, the crawl function for that software will break, but there’s no way around that with web crawling.
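One way to at least notice a broken crawl is to check that the extracted text still looks like a version number before acting on it. This is only a sketch and is not part of the script: the function name `Test-VersionString` is hypothetical, and the pattern assumes versions of one to four dot-separated numbers.

```powershell
# Hypothetical check: one to four groups of digits separated by dots.
function Test-VersionString ($candidate) {
    return ("$candidate" -match '^\d+(\.\d+){0,3}$')
}

# Example of what can be left over when the -replace patterns no longer match the page.
$crawled = "@{href=a/7z1604-x64.msi}"

if (Test-VersionString $crawled) {
    Write-Host "Version looks valid: $crawled"
} else {
    Write-Warning "Unexpected version string '$crawled'; the page layout may have changed."
}
```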
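Here is the required directory structure for this specific set of software (one subfolder per program, in the same folder as the script):
- 7zip
- chrome
- firefox
- firefoxESR
- flashActiveX
- flashNPAPI
- java
- vlc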
And here is the full PowerShell script:
    function downloadProgram ($readVersion, $version, $download, $name) {
        Write-Host "LOCAL VERSION: $readVersion"
        Write-Host "WEB__ VERSION: $version"
        Write-Host "LINK: $download"
        Write-Host "FILENAME: $name"
        Write-Host " "
        if ($readVersion -lt $version) {
            Write-Host "Newer Version Found Online!"
            Read-Host "Press Enter to Download"
            Import-Module BitsTransfer
            $start_time = Get-Date
            Start-BitsTransfer -Source "$download" -Destination "$name"
            Write-Output "Completed in: $((Get-Date).Subtract($start_time).Seconds) seconds"
        } else {
            Write-Host "No Newer Version Found."
        }
    }
    ####################################################################################
    ####################################################################################
    function Download-7zip {
        # SET VARIABLES
        $initialURL = "http://www.7-zip.org/download.html"
        $folderName = "7zip"
        $filenamePrefix = "7zip64"
        $filenameExtension = "msi"
        $defaultVersion = "0"
        ###############
        # MIGHT NEED CUSTOMIZATION DEPENDING ON CRAWL METHOD
        $program = (Invoke-WebRequest -Uri "$initialURL").Links | Where-Object {($_.href -like "*x64.msi")} | select href
        $programURL = $program[0]
        $programSTRING = "$programURL"
        $programVERSION = $programSTRING -replace("@{href=a/7z","") -replace("-x64.msi}","")
        $programDOWNLOAD = $programSTRING -replace("@{href=","http://www.7-zip.org/") -replace("}","")
        ####################################################
        # NO CHANGES NEEDED
        $programFILENAME = ".\$folderName\$filenamePrefix-$programVERSION.$filenameExtension"
        $programREAD = Get-ChildItem ".\$folderName\" -name | Sort-Object -Descending | Select-Object -First 1
        if ($programREAD.length -eq 0) {
            $programREADVERSION = "$defaultVersion"
        } else {
            $programREADVERSION = $programREAD -replace("$filenamePrefix-","") -replace(".$filenameExtension","")
        }
        downloadProgram $programREADVERSION $programVERSION $programDOWNLOAD $programFILENAME
        ###################
    }
    ####################################################################################
    function Download-Chrome {
        # SET VARIABLES
        $initialURL = "http://feeds.feedburner.com/GoogleChromeReleases"
        $folderName = "chrome"
        $filenamePrefix = "chrome64"
        $filenameExtension = "msi"
        $defaultVersion = "0.0.0.0"
        ###############
        # MIGHT NEED CUSTOMIZATION DEPENDING ON CRAWL METHOD
        [xml]$program = Invoke-webRequest "$initialURL"
        $programVersionLookup = ($program.feed.entry | Where-object{$_.title.'#text' -match 'Stable'}).content |
            Select-Object{$_.'#text'} |
            Where-Object{$_ -match 'Windows'} |
            ForEach{[version](($_ | Select-string -allmatches '(\d{1,4}\.){3}(\d{1,4})').matches | select-object -first 1 -expandProperty Value)} |
            Sort-Object -Descending |
            Select-Object -first 1
        $programVERSION = "$programVersionLookup"
        $programDOWNLOAD = "https://dl.google.com/dl/chrome/install/googlechromestandaloneenterprise64.msi"
        ####################################################
        # NO CHANGES NEEDED
        $programFILENAME = ".\$folderName\$filenamePrefix-$programVERSION.$filenameExtension"
        $programREAD = Get-ChildItem ".\$folderName\" -name | Sort-Object -Descending | Select-Object -First 1
        if ($programREAD.length -eq 0) {
            $programREADVERSION = "$defaultVersion"
        } else {
            $programREADVERSION = $programREAD -replace("$filenamePrefix-","") -replace(".$filenameExtension","")
        }
        downloadProgram $programREADVERSION $programVERSION $programDOWNLOAD $programFILENAME
        ###################
    }
    ####################################################################################
    function Download-Firefox {
        # SET VARIABLES
        $initialURL = "https://www.mozilla.org/en-US/firefox/all/?q=English%20(US)"
        $folderName = "firefox"
        $filenamePrefix = "firefox64"
        $filenameExtension = "exe"
        $defaultVersion = "0.0.0"
        ###############
        # MIGHT NEED CUSTOMIZATION DEPENDING ON CRAWL METHOD
        $program = (Invoke-WebRequest -Uri "$initialURL").Links | Where-Object {($_.href -like "*os=win64*")} | select href
        $programURL = $program[0]
        $programSTRING = "$programURL"
        $programVERSION = $programSTRING -replace("@{href=https://download.mozilla.org/\?product=firefox-","") -replace("-SSL&os=win64&lang=en-US}","")
        $programDOWNLOAD = $programSTRING -replace("@{href=","") -replace("}","")
        ####################################################
        # NO CHANGES NEEDED
        $programFILENAME = ".\$folderName\$filenamePrefix-$programVERSION.$filenameExtension"
        $programREAD = Get-ChildItem ".\$folderName\" -name | Sort-Object -Descending | Select-Object -First 1
        if ($programREAD.length -eq 0) {
            $programREADVERSION = "$defaultVersion"
        } else {
            $programREADVERSION = $programREAD -replace("$filenamePrefix-","") -replace(".$filenameExtension","")
        }
        downloadProgram $programREADVERSION $programVERSION $programDOWNLOAD $programFILENAME
        ###################
    }
    ####################################################################################
    function Download-FirefoxESR {
        # SET VARIABLES
        $initialURL = "https://www.mozilla.org/en-US/firefox/organizations/all/?q=English%20(US)"
        $folderName = "firefoxESR"
        $filenamePrefix = "firefox64ESR"
        $filenameExtension = "exe"
        $defaultVersion = "0.0.0"
        ###############
        # MIGHT NEED CUSTOMIZATION DEPENDING ON CRAWL METHOD
        $program = (Invoke-WebRequest -Uri "$initialURL").Links | Where-Object {($_.href -like "*os=win64*")} | select href
        $programURL = $program[0]
        $programSTRING = "$programURL"
        $programVERSION = $programSTRING -replace("@{href=https://download.mozilla.org/\?product=firefox-","") -replace("esr-SSL&os=win64&lang=en-US}","")
        $programDOWNLOAD = $programSTRING -replace("@{href=","") -replace("}","")
        ####################################################
        # NO CHANGES NEEDED
        $programFILENAME = ".\$folderName\$filenamePrefix-$programVERSION.$filenameExtension"
        $programREAD = Get-ChildItem ".\$folderName\" -name | Sort-Object -Descending | Select-Object -First 1
        if ($programREAD.length -eq 0) {
            $programREADVERSION = "$defaultVersion"
        } else {
            $programREADVERSION = $programREAD -replace("$filenamePrefix-","") -replace(".$filenameExtension","")
        }
        downloadProgram $programREADVERSION $programVERSION $programDOWNLOAD $programFILENAME
        ###################
    }
    ####################################################################################
    function Download-FlashActiveX {
        # CAN ONLY CHECK VERSION BUT NOT DOWNLOAD SINCE IT REQUIRES LOGIN TO ADOBE WEBSITE
        # SET VARIABLES
        $initialURL = "https://INSERT-ADOBE-DISTRIBUTION-LINK-HERE"
        $folderName = "flashActiveX"
        $filenamePrefix = "flashActiveX"
        $filenameExtension = "msi"
        $defaultVersion = "0.0.0.0"
        ###############
        # MIGHT NEED CUSTOMIZATION DEPENDING ON CRAWL METHOD
        #$programVERSION = ((Invoke-WebRequest -Uri "$initialURL").AllElements | where {$_.tagName -eq "h4"} | select -expand innerText) -replace ("Downloads","") -replace ("Flash Player ","") -replace ("`n|`r","") -replace (" \(Win, Mac \& Linux\)","")
        [xml]$FlashMajorVersion = Invoke-WebRequest -Uri "http://fpdownload2.macromedia.com/pub/flashplayer/update/current/sau/currentmajor.xml"
        $FlashMajorVersionResult = $FlashMajorVersion.version.player.major
        [xml]$FlashVersionDetails = Invoke-WebRequest -Uri "http://fpdownload2.macromedia.com/pub/flashplayer/update/current/sau/$FlashMajorVersionResult/xml/version.xml"
        $FlashMinorVersion = $FlashVersionDetails.version.activex.minor
        $FlashBuildMajorVersion = $FlashVersionDetails.version.activex.buildMajor
        $FlashBuildMinorVersion = $FlashVersionDetails.version.activex.buildMinor
        $programVERSION = "$FlashMajorVersionResult.$FlashMinorVersion.$FlashBuildMajorVersion.$FlashBuildMinorVersion"
        ####################################################
        # NO CHANGES NEEDED
        $programFILENAME = ".\$folderName\$filenamePrefix-$programVERSION.$filenameExtension"
        $programREAD = Get-ChildItem ".\$folderName\" -name | Sort-Object -Descending | Select-Object -First 1
        if ($programREAD.length -eq 0) {
            $programREADVERSION = "$defaultVersion"
        } else {
            $programREADVERSION = $programREAD -replace("$filenamePrefix-","") -replace(".$filenameExtension","")
        }
        ###################
        Write-Host "LOCAL VERSION: $programREADVERSION"
        Write-Host "WEB__ VERSION: $programVERSION"
        Write-Host "LINK: https://INSERT-ADOBE-DISTRIBUTION-LINK-HERE"
        Write-Host "FILENAME: $programFILENAME"
        Write-Host " "
        if ($programREADVERSION -lt $programVERSION) {
            Write-Host "Newer Version Found Online!"
            Write-Host " "
            Write-Host "Please login to Adobe website to manually download"
            Write-Host "and then rename to FILENAME indicated."
            Write-Host " "
            Read-Host "Press Enter to open browser and go to download page"
            Start-Process -FilePath https://INSERT-ADOBE-DISTRIBUTION-LINK-HERE
            Read-Host "Press Enter to continue after manual download"
        } else {
            Write-Host "No Newer Version Found."
        }
    }
    ####################################################################################
    function Download-FlashNPAPI {
        # CAN ONLY CHECK VERSION BUT NOT DOWNLOAD SINCE IT REQUIRES LOGIN TO ADOBE WEBSITE
        # SET VARIABLES
        $initialURL = "https://INSERT-ADOBE-DISTRIBUTION-LINK-HERE"
        $folderName = "flashNPAPI"
        $filenamePrefix = "flashNPAPI"
        $filenameExtension = "msi"
        $defaultVersion = "0.0.0.0"
        ###############
        # MIGHT NEED CUSTOMIZATION DEPENDING ON CRAWL METHOD
        #$programVERSION = ((Invoke-WebRequest -Uri "$initialURL").AllElements | where {$_.tagName -eq "h4"} | select -expand innerText) -replace ("Downloads","") -replace ("Flash Player ","") -replace ("`n|`r","") -replace (" \(Win, Mac \& Linux\)","")
        [xml]$FlashMajorVersion = Invoke-WebRequest -Uri "http://fpdownload2.macromedia.com/pub/flashplayer/update/current/sau/currentmajor.xml"
        $FlashMajorVersionResult = $FlashMajorVersion.version.player.major
        [xml]$FlashVersionDetails = Invoke-WebRequest -Uri "http://fpdownload2.macromedia.com/pub/flashplayer/update/current/sau/$FlashMajorVersionResult/xml/version.xml"
        $FlashMinorVersion = $FlashVersionDetails.version.plugin.minor
        $FlashBuildMajorVersion = $FlashVersionDetails.version.plugin.buildMajor
        $FlashBuildMinorVersion = $FlashVersionDetails.version.plugin.buildMinor
        $programVERSION = "$FlashMajorVersionResult.$FlashMinorVersion.$FlashBuildMajorVersion.$FlashBuildMinorVersion"
        ####################################################
        # NO CHANGES NEEDED
        $programFILENAME = ".\$folderName\$filenamePrefix-$programVERSION.$filenameExtension"
        $programREAD = Get-ChildItem ".\$folderName\" -name | Sort-Object -Descending | Select-Object -First 1
        if ($programREAD.length -eq 0) {
            $programREADVERSION = "$defaultVersion"
        } else {
            $programREADVERSION = $programREAD -replace("$filenamePrefix-","") -replace(".$filenameExtension","")
        }
        ###################
        Write-Host "LOCAL VERSION: $programREADVERSION"
        Write-Host "WEB__ VERSION: $programVERSION"
        Write-Host "LINK: https://INSERT-ADOBE-DISTRIBUTION-LINK-HERE"
        Write-Host "FILENAME: $programFILENAME"
        Write-Host " "
        if ($programREADVERSION -lt $programVERSION) {
            Write-Host "Newer Version Found Online!"
            Write-Host " "
            Write-Host "Please login to Adobe website to manually download"
            Write-Host "and then rename to FILENAME indicated."
            Write-Host " "
            Read-Host "Press Enter to open browser and go to download page"
            Start-Process -FilePath https://INSERT-ADOBE-DISTRIBUTION-LINK-HERE
            Read-Host "Press Enter to continue after manual download"
        } else {
            Write-Host "No Newer Version Found."
        }
    }
    ####################################################################################
    function Download-Java {
        # SET VARIABLES
        $initialURL = "https://java.com/en/download/manual.jsp"
        $folderName = "java"
        $filenamePrefix = "java64"
        $filenameExtension = "exe"
        $defaultVersion = "0.0"
        ###############
        # MIGHT NEED CUSTOMIZATION DEPENDING ON CRAWL METHOD
        $program = (Invoke-WebRequest -Uri "$initialURL").Links | Where-Object {($_.innerText -like "*Offline (64-bit)*")} | select href
        $programURL = $program[0]
        $programSTRING = "$programURL"
        $programVERSIONcrawl = (Invoke-WebRequest -uri "$initialURL").AllElements | where {$_.tagName -eq "h4"} | where {$_.outerHTML -like "*sub*"} | where {$_.innerText -like "*Recommended Version *"} | select -expand innerText
        $programVERSION = $programVERSIONcrawl -replace ("Recommended Version ","") -replace (" Update ",".")
        $programDOWNLOAD = $programSTRING -replace("@{href=","") -replace("}","")
        ####################################################
        # NO CHANGES NEEDED
        $programFILENAME = ".\$folderName\$filenamePrefix-$programVERSION.$filenameExtension"
        $programREAD = Get-ChildItem ".\$folderName\" -name | Sort-Object -Descending | Select-Object -First 1
        if ($programREAD.length -eq 0) {
            $programREADVERSION = "$defaultVersion"
        } else {
            $programREADVERSION = $programREAD -replace("$filenamePrefix-","") -replace(".$filenameExtension","")
        }
        downloadProgram $programREADVERSION $programVERSION $programDOWNLOAD $programFILENAME
        ###################
    }
    ####################################################################################
    function Download-VLC {
        # SET VARIABLES
        $initialURL = "http://www.videolan.org/vlc/download-windows.html"
        $folderName = "vlc"
        $filenamePrefix = "vlc64"
        $filenameExtension = "exe"
        $defaultVersion = "0.0.0"
        ###############
        # MIGHT NEED CUSTOMIZATION DEPENDING ON CRAWL METHOD
        $program = (Invoke-WebRequest -Uri "$initialURL").Links | Where-Object {($_.href -like "*-win64.exe")} | select href
        $programURL = $program[0]
        $programSTRING = "$programURL"
        $programVERSION = $programSTRING -replace("@{href=//get.videolan.org/vlc/\d{1}\.\d{1}\.\d{1}/win64/vlc-","") -replace("-win64.exe}","")
        $programDOWNLOAD = $programSTRING -replace("@{href=","http:") -replace("}","")
        ####################################################
        # NO CHANGES NEEDED
        $programFILENAME = ".\$folderName\$filenamePrefix-$programVERSION.$filenameExtension"
        $programREAD = Get-ChildItem ".\$folderName\" -name | Sort-Object -Descending | Select-Object -First 1
        if ($programREAD.length -eq 0) {
            $programREADVERSION = "$defaultVersion"
        } else {
            $programREADVERSION = $programREAD -replace("$filenamePrefix-","") -replace(".$filenameExtension","")
        }
        downloadProgram $programREADVERSION $programVERSION $programDOWNLOAD $programFILENAME
        ###################
    }
    ####################################################################################
    ####################################################################################
    Write-Host " "
    Write-Host "This script will check for updates to:"
    Write-Host " "
    Write-Host "- 7zip"
    Write-Host "- Chrome"
    Write-Host "- Firefox"
    Write-Host "- Firefox ESR"
    Write-Host "- Java"
    Write-Host "- VLC"
    Write-Host "- Flash Player ActiveX (manual download)"
    Write-Host "- Flash Player NPAPI (manual download)"
    Write-Host " "
    Read-Host "Press Enter to start"
    Write-Host "################################################################################"
    Write-Host "Checking: 7zip"
    Write-Host "########################################"
    Download-7zip
    Write-Host " "
    Write-Host "################################################################################"
    Write-Host "Checking: Chrome"
    Write-Host "########################################"
    Download-Chrome
    Write-Host " "
    Write-Host "################################################################################"
    Write-Host "Checking: Firefox"
    Write-Host "########################################"
    Download-Firefox
    Write-Host " "
    Write-Host "################################################################################"
    Write-Host "Checking: FirefoxESR"
    Write-Host "########################################"
    Download-FirefoxESR
    Write-Host " "
    Write-Host "################################################################################"
    Write-Host "Checking: Java"
    Write-Host "########################################"
    Download-Java
    Write-Host " "
    Write-Host "################################################################################"
    Write-Host "Checking: VLC"
    Write-Host "########################################"
    Download-VLC
    Write-Host " "
    Write-Host "################################################################################"
    Write-Host "Checking: FlashActiveX"
    Write-Host "########################################"
    Download-FlashActiveX
    Write-Host " "
    Write-Host "################################################################################"
    Write-Host "Checking: FlashNPAPI"
    Write-Host "########################################"
    Download-FlashNPAPI
    Write-Host " "
    Write-Host "################################################################################"
    Write-Host " "
    Write-Host "SCRIPT COMPLETE"
    Read-Host "Press Enter to exit"
Hello Boris
Subject: How to Crawl Websites to Download Software Updates with PowerShell.
How can I exclude the beta versions from the downloads in $programREAD?
$programREAD = Get-ChildItem ".\$folderName\" -name | Sort-Object -Descending | Where-Object {$_.Name -NotMatch "nls"} | Select-Object -First 1
Thanks
Hi Akram,
It depends on which website you’re trying to crawl and how its page is laid out in HTML. If the link contains the word “beta”, then you can exclude links that contain that string inside the <a> tags. That’s just an example; it’ll really depend on the web page.
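As a rough sketch of that idea, the filter below keeps 64-bit MSI links and drops any whose address contains “beta”. The link objects are made-up stand-ins for what `(Invoke-WebRequest ...).Links` returns, and the example.com addresses are placeholders, so the patterns would need adjusting to the real page.

```powershell
# Made-up link objects standing in for (Invoke-WebRequest -Uri $initialURL).Links
$links = @(
    [pscustomobject]@{ href = "https://example.com/files/tool-2.0-beta-x64.msi" },
    [pscustomobject]@{ href = "https://example.com/files/tool-1.9-x64.msi" },
    [pscustomobject]@{ href = "https://example.com/files/tool-1.9-x86.msi" }
)

# Keep 64-bit MSI links, skip anything with "beta" in the address, take the first match.
$stable = $links |
    Where-Object { ($_.href -like "*x64.msi") -and ($_.href -notlike "*beta*") } |
    Select-Object -First 1

Write-Host "Selected link: $($stable.href)"
```

This only helps when the beta link is actually labeled in its address; if the page marks betas only in the surrounding text, a different approach is needed.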
Hello Boris
Page: How to Crawl Websites to Download Software Updates with PowerShell
How do I change the script to exclude the beta version from the download?
$programREAD = Get-ChildItem ".\$folderName\" -name | Sort-Object -Descending | Select-Object -First 1
Thanx
Hi Boris,
I’m using the script to download the 7-zip updates from the URL: http://www.7-zip.org/download.html
At the top of the download page you can see the beta name and the version number. Would you be able to help me with this?
Thank you very much.
Akram
With the way the 7-Zip download page is built, there is no good, easy way to do this. However, if we assume the beta will always be the first link displayed, we can simply change $programURL = $program[0] to $programURL = $program[1]. That selects the second 64-bit MSI download displayed on the page instead of the first, which in this case is the non-beta version.
This is gold!!