I wrote this script to check for updates to some common software used in my organization. To do so, the script uses PowerShell to crawl each program's website and find the current version number and download link. It then compares the version found online with the version of the installer already saved in that program's folder inside the script directory. If the online version is newer, it downloads the file and saves it under a consistent name that includes the version number. The software crawled for download (64-bit where available) is:
- 7-Zip
- Google Chrome
- Mozilla Firefox
- Mozilla Firefox ESR
- Adobe Flash Player ActiveX
- Adobe Flash Player NPAPI Plugin
- Java
- VLC Media Player
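One caveat on the comparison step described above: when both values are strings, PowerShell's `-lt` compares them as text, so "9.20" sorts after "16.04". As a minimal sketch, assuming the versions are plain dotted numbers, both sides could be cast to `[version]` first. `Test-NewerVersion` is a hypothetical helper name and is not part of the script.

```powershell
# Sketch: compare two dotted version strings numerically rather than as text.
# Test-NewerVersion is a hypothetical helper name, not part of the original script.
function Test-NewerVersion {
    param(
        [string]$LocalVersion,
        [string]$WebVersion
    )
    # [version] needs at least "major.minor", so pad a bare number such as "0".
    if ($LocalVersion -notmatch '\.') { $LocalVersion = "$LocalVersion.0" }
    if ($WebVersion   -notmatch '\.') { $WebVersion   = "$WebVersion.0" }
    return ([version]$LocalVersion -lt [version]$WebVersion)
}

# Numeric comparison: 9.20 is older than 16.04.
Test-NewerVersion -LocalVersion '9.20' -WebVersion '16.04'

# Text comparison of the same two values, for contrast.
'9.20' -lt '16.04'
```

A cast like this throws on version strings that contain letters (for example a beta suffix), so those would need to be stripped or handled separately.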
I figured this script would be a good example to share because it uses a few different methods to find the information it needs on each publisher’s website. Some people may find these examples helpful when modifying the code to crawl for something else. Adobe Flash Player for enterprise distribution requires a login on the website, so for it the script only checks the version and then opens a browser window at the login screen so that the user can download the MSI files manually. Since the Adobe enterprise distribution license says the download link cannot be shared, I have edited it out of the script, so you will need to put it back yourself if you want to use this function. Just search for all instances of “https://INSERT-ADOBE-DISTRIBUTION-LINK-HERE” and replace them with the URL you received from Adobe.
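The search-and-replace for the Adobe placeholder can also be done from the console. This is only a sketch: `Set-AdobeLink`, the script file name, and the example URL are all hypothetical, and the real URL is the one Adobe sent you.

```powershell
# Sketch: swap the Adobe placeholder for a real URL on every line of a script file.
# Set-AdobeLink, the file name, and the example URL are hypothetical.
function Set-AdobeLink {
    param(
        [string]$Path,
        [string]$AdobeUrl
    )
    $placeholder = 'https://INSERT-ADOBE-DISTRIBUTION-LINK-HERE'
    # The parentheses make Get-Content finish reading before Set-Content rewrites the file.
    # .Replace() is a literal (non-regex) substitution, so dots and slashes need no escaping.
    (Get-Content -Path $Path) |
        ForEach-Object { $_.Replace($placeholder, $AdobeUrl) } |
        Set-Content -Path $Path
}

# Example call (file name and URL are placeholders):
# Set-AdobeLink -Path .\Check-Updates.ps1 -AdobeUrl 'https://example.com/your-adobe-link'
```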
I split this script into functions, with one function to download the software and a separate function to crawl the website for each piece of software. Inside each crawl function, I separated the variables and crawl methods that may need modification from the section that needs no changes. This also means that you can remove the last part of the script (everything after line 315, that is, the part that runs every check in sequence) if you would rather call each function manually from the PowerShell console to check for just one piece of software.
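The part of each crawl function that needs no changes reads the newest file name in the program's folder and strips the prefix and extension to get the local version. As a rough sketch, that step could be pulled into one helper. `Get-LocalVersion` is a hypothetical name, and the escaping and anchoring of the prefix and extension are an addition here, not how the script does it.

```powershell
# Sketch: read the newest local installer name and reduce it to its version string.
# Get-LocalVersion is a hypothetical helper; its parameters mirror the script's variables.
function Get-LocalVersion {
    param(
        [string]$FolderName,
        [string]$FilenamePrefix,
        [string]$FilenameExtension,
        [string]$DefaultVersion
    )
    # -Name returns plain file names; the text sort puts the highest name first.
    $newest = Get-ChildItem -Path $FolderName -Name |
        Sort-Object -Descending |
        Select-Object -First 1
    if (-not $newest) { return $DefaultVersion }
    # Escape the literal parts and anchor them so "." only matches a real dot.
    $prefix = '^' + [regex]::Escape("$FilenamePrefix-")
    $suffix = [regex]::Escape(".$FilenameExtension") + '$'
    return ($newest -replace $prefix, '' -replace $suffix, '')
}

# Example call, assuming a .\vlc folder exists next to the script:
# Get-LocalVersion -FolderName .\vlc -FilenamePrefix vlc64 -FilenameExtension exe -DefaultVersion 0.0.0
```

Because the file names are sorted as text, a name such as `vlc64-2.2.10.exe` would sort below `vlc64-2.2.9.exe`, which is worth keeping in mind if several old installers stay in the folder.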
Again, this script is very application-specific, so unless you need to check for updates and download this specific set of software, you will need to modify the script for whatever software you need. Use it as a very broad example of crawling the web with PowerShell. There are many different ways to do this, and I’m sure some might be better, but this way works exactly as I need it to for my specific use case. And of course if a website for a particular piece of software changes, the crawl function for that software will break, but there’s no way around that with web crawling.
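Since a changed page layout will break a crawl function, it can help to fail with a clear message instead of carrying on with an empty version. This is a minimal sketch of that idea: `Get-VersionFromHref` is a hypothetical helper, and the sample link is made up in the style of a VLC download link.

```powershell
# Sketch: pull a version out of a download link and fail loudly if the pattern no longer matches.
# Get-VersionFromHref and the sample link below are hypothetical.
function Get-VersionFromHref {
    param(
        [string]$Href,
        [string]$Pattern   # regex with one capture group around the version
    )
    if ($Href -match $Pattern) {
        # $Matches[1] holds the text captured by the first group.
        return $Matches[1]
    }
    throw "No version found in '$Href'; the page layout may have changed."
}

# Made-up link in the style of a VLC download link:
Get-VersionFromHref -Href '//get.videolan.org/vlc/2.2.4/win64/vlc-2.2.4-win64.exe' -Pattern 'vlc-([\d.]+)-win64\.exe$'
```

Wrapping a call like this in `try`/`catch` would let the remaining checks continue when one site changes.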
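Here is the required directory structure for this specific set of software: one subfolder per program inside the script directory, named 7zip, chrome, firefox, firefoxESR, flashActiveX, flashNPAPI, java, and vlc to match the $folderName value in each crawl function.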
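If the folders do not exist yet, they can be created in one go. The folder names below are the `$folderName` values set in the script's crawl functions; `New-UpdateFolders` itself is a hypothetical helper name used only for this sketch.

```powershell
# Sketch: create one subfolder per program next to the script.
# Folder names are the $folderName values used in the script; New-UpdateFolders is a hypothetical name.
function New-UpdateFolders {
    param([string]$Root = '.')
    $folders = '7zip', 'chrome', 'firefox', 'firefoxESR',
               'flashActiveX', 'flashNPAPI', 'java', 'vlc'
    foreach ($folder in $folders) {
        $path = Join-Path -Path $Root -ChildPath $folder
        if (-not (Test-Path -Path $path)) {
            # Out-Null hides the DirectoryInfo object that New-Item returns.
            New-Item -ItemType Directory -Path $path | Out-Null
        }
    }
}

# Run from the script directory:
# New-UpdateFolders
```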
And here is the full PowerShell script:
function downloadProgram ($readVersion, $version, $download, $name) {
    Write-Host "LOCAL VERSION: $readVersion"
    Write-Host "WEB__ VERSION: $version"
    Write-Host "LINK: $download"
    Write-Host "FILENAME: $name"
    Write-Host " "
    if ($readVersion -lt $version) {
        Write-Host "Newer Version Found Online!"
        Read-Host "Press Enter to Download"
        Import-Module BitsTransfer
        $start_time = Get-Date
        Start-BitsTransfer -Source "$download" -Destination "$name"
        Write-Output "Completed in: $((Get-Date).Subtract($start_time).Seconds) seconds"
    } else {
        Write-Host "No Newer Version Found."
    }
}

####################################################################################
####################################################################################

function Download-7zip {
    # SET VARIABLES
    $initialURL = "http://www.7-zip.org/download.html"
    $folderName = "7zip"
    $filenamePrefix = "7zip64"
    $filenameExtension = "msi"
    $defaultVersion = "0"
    ###############

    # MIGHT NEED CUSTOMIZATION DEPENDING ON CRAWL METHOD
    $program = (Invoke-WebRequest -Uri "$initialURL").Links | Where-Object {($_.href -like "*x64.msi")} | select href
    $programURL = $program[0]
    $programSTRING = "$programURL"
    $programVERSION = $programSTRING -replace("@{href=a/7z","") -replace("-x64.msi}","")
    $programDOWNLOAD = $programSTRING -replace("@{href=","http://www.7-zip.org/") -replace("}","")
    ####################################################

    # NO CHANGES NEEDED
    $programFILENAME = ".\$folderName\$filenamePrefix-$programVERSION.$filenameExtension"
    $programREAD = Get-ChildItem ".\$folderName\" -name | Sort-Object -Descending | Select-Object -First 1
    if ($programREAD.length -eq 0) {
        $programREADVERSION = "$defaultVersion"
    } else {
        $programREADVERSION = $programREAD -replace("$filenamePrefix-","") -replace(".$filenameExtension","")
    }
    downloadProgram $programREADVERSION $programVERSION $programDOWNLOAD $programFILENAME
    ###################
}

####################################################################################

function Download-Chrome {
    # SET VARIABLES
    $initialURL = "http://feeds.feedburner.com/GoogleChromeReleases"
    $folderName = "chrome"
    $filenamePrefix = "chrome64"
    $filenameExtension = "msi"
    $defaultVersion = "0.0.0.0"
    ###############

    # MIGHT NEED CUSTOMIZATION DEPENDING ON CRAWL METHOD
    [xml]$program = Invoke-webRequest "$initialURL"
    $programVersionLookup = ($program.feed.entry | Where-object{$_.title.'#text' -match 'Stable'}).content | Select-Object{$_.'#text'} | Where-Object{$_ -match 'Windows'} | ForEach{[version](($_ | Select-string -allmatches '(\d{1,4}\.){3}(\d{1,4})').matches | select-object -first 1 -expandProperty Value)} | Sort-Object -Descending | Select-Object -first 1
    $programVERSION = "$programVersionLookup"
    $programDOWNLOAD = "https://dl.google.com/dl/chrome/install/googlechromestandaloneenterprise64.msi"
    ####################################################

    # NO CHANGES NEEDED
    $programFILENAME = ".\$folderName\$filenamePrefix-$programVERSION.$filenameExtension"
    $programREAD = Get-ChildItem ".\$folderName\" -name | Sort-Object -Descending | Select-Object -First 1
    if ($programREAD.length -eq 0) {
        $programREADVERSION = "$defaultVersion"
    } else {
        $programREADVERSION = $programREAD -replace("$filenamePrefix-","") -replace(".$filenameExtension","")
    }
    downloadProgram $programREADVERSION $programVERSION $programDOWNLOAD $programFILENAME
    ###################
}

####################################################################################

function Download-Firefox {
    # SET VARIABLES
    $initialURL = "https://www.mozilla.org/en-US/firefox/all/?q=English%20(US)"
    $folderName = "firefox"
    $filenamePrefix = "firefox64"
    $filenameExtension = "exe"
    $defaultVersion = "0.0.0"
    ###############

    # MIGHT NEED CUSTOMIZATION DEPENDING ON CRAWL METHOD
    $program = (Invoke-WebRequest -Uri "$initialURL").Links | Where-Object {($_.href -like "*os=win64*")} | select href
    $programURL = $program[0]
    $programSTRING = "$programURL"
    $programVERSION = $programSTRING -replace("@{href=https://download.mozilla.org/\?product=firefox-","") -replace("-SSL&os=win64&lang=en-US}","")
    $programDOWNLOAD = $programSTRING -replace("@{href=","") -replace("}","")
    ####################################################

    # NO CHANGES NEEDED
    $programFILENAME = ".\$folderName\$filenamePrefix-$programVERSION.$filenameExtension"
    $programREAD = Get-ChildItem ".\$folderName\" -name | Sort-Object -Descending | Select-Object -First 1
    if ($programREAD.length -eq 0) {
        $programREADVERSION = "$defaultVersion"
    } else {
        $programREADVERSION = $programREAD -replace("$filenamePrefix-","") -replace(".$filenameExtension","")
    }
    downloadProgram $programREADVERSION $programVERSION $programDOWNLOAD $programFILENAME
    ###################
}

####################################################################################

function Download-FirefoxESR {
    # SET VARIABLES
    $initialURL = "https://www.mozilla.org/en-US/firefox/organizations/all/?q=English%20(US)"
    $folderName = "firefoxESR"
    $filenamePrefix = "firefox64ESR"
    $filenameExtension = "exe"
    $defaultVersion = "0.0.0"
    ###############

    # MIGHT NEED CUSTOMIZATION DEPENDING ON CRAWL METHOD
    $program = (Invoke-WebRequest -Uri "$initialURL").Links | Where-Object {($_.href -like "*os=win64*")} | select href
    $programURL = $program[0]
    $programSTRING = "$programURL"
    $programVERSION = $programSTRING -replace("@{href=https://download.mozilla.org/\?product=firefox-","") -replace("esr-SSL&os=win64&lang=en-US}","")
    $programDOWNLOAD = $programSTRING -replace("@{href=","") -replace("}","")
    ####################################################

    # NO CHANGES NEEDED
    $programFILENAME = ".\$folderName\$filenamePrefix-$programVERSION.$filenameExtension"
    $programREAD = Get-ChildItem ".\$folderName\" -name | Sort-Object -Descending | Select-Object -First 1
    if ($programREAD.length -eq 0) {
        $programREADVERSION = "$defaultVersion"
    } else {
        $programREADVERSION = $programREAD -replace("$filenamePrefix-","") -replace(".$filenameExtension","")
    }
    downloadProgram $programREADVERSION $programVERSION $programDOWNLOAD $programFILENAME
    ###################
}

####################################################################################

function Download-FlashActiveX {
    # CAN ONLY CHECK VERSION BUT NOT DOWNLOAD SINCE IT REQUIRES LOGIN TO ADOBE WEBSITE
    # SET VARIABLES
    $initialURL = "https://INSERT-ADOBE-DISTRIBUTION-LINK-HERE"
    $folderName = "flashActiveX"
    $filenamePrefix = "flashActiveX"
    $filenameExtension = "msi"
    $defaultVersion = "0.0.0.0"
    ###############

    # MIGHT NEED CUSTOMIZATION DEPENDING ON CRAWL METHOD
    #$programVERSION = ((Invoke-WebRequest -Uri "$initialURL").AllElements | where {$_.tagName -eq "h4"} | select -expand innerText) -replace ("Downloads","") -replace ("Flash Player ","") -replace ("`n|`r","") -replace (" \(Win, Mac \& Linux\)","")
    [xml]$FlashMajorVersion = Invoke-WebRequest -Uri "http://fpdownload2.macromedia.com/pub/flashplayer/update/current/sau/currentmajor.xml"
    $FlashMajorVersionResult = $FlashMajorVersion.version.player.major
    [xml]$FlashVersionDetails = Invoke-WebRequest -Uri "http://fpdownload2.macromedia.com/pub/flashplayer/update/current/sau/$FlashMajorVersionResult/xml/version.xml"
    $FlashMinorVersion = $FlashVersionDetails.version.activex.minor
    $FlashBuildMajorVersion = $FlashVersionDetails.version.activex.buildMajor
    $FlashBuildMinorVersion = $FlashVersionDetails.version.activex.buildMinor
    $programVERSION = "$FlashMajorVersionResult.$FlashMinorVersion.$FlashBuildMajorVersion.$FlashBuildMinorVersion"
    ####################################################

    # NO CHANGES NEEDED
    $programFILENAME = ".\$folderName\$filenamePrefix-$programVERSION.$filenameExtension"
    $programREAD = Get-ChildItem ".\$folderName\" -name | Sort-Object -Descending | Select-Object -First 1
    if ($programREAD.length -eq 0) {
        $programREADVERSION = "$defaultVersion"
    } else {
        $programREADVERSION = $programREAD -replace("$filenamePrefix-","") -replace(".$filenameExtension","")
    }
    ###################

    Write-Host "LOCAL VERSION: $programREADVERSION"
    Write-Host "WEB__ VERSION: $programVERSION"
    Write-Host "LINK: https://INSERT-ADOBE-DISTRIBUTION-LINK-HERE"
    Write-Host "FILENAME: $programFILENAME"
    Write-Host " "
    if ($programREADVERSION -lt $programVERSION) {
        Write-Host "Newer Version Found Online!"
        Write-Host " "
        Write-Host "Please login to Adobe website to manually download"
        Write-Host "and then rename to FILENAME indicated."
        Write-Host " "
        Read-Host "Press Enter to open browser and go to download page"
        Start-Process -FilePath https://INSERT-ADOBE-DISTRIBUTION-LINK-HERE
        Read-Host "Press Enter to continue after manual download"
    } else {
        Write-Host "No Newer Version Found."
    }
}

####################################################################################

function Download-FlashNPAPI {
    # CAN ONLY CHECK VERSION BUT NOT DOWNLOAD SINCE IT REQUIRES LOGIN TO ADOBE WEBSITE
    # SET VARIABLES
    $initialURL = "https://INSERT-ADOBE-DISTRIBUTION-LINK-HERE"
    $folderName = "flashNPAPI"
    $filenamePrefix = "flashNPAPI"
    $filenameExtension = "msi"
    $defaultVersion = "0.0.0.0"
    ###############

    # MIGHT NEED CUSTOMIZATION DEPENDING ON CRAWL METHOD
    #$programVERSION = ((Invoke-WebRequest -Uri "$initialURL").AllElements | where {$_.tagName -eq "h4"} | select -expand innerText) -replace ("Downloads","") -replace ("Flash Player ","") -replace ("`n|`r","") -replace (" \(Win, Mac \& Linux\)","")
    [xml]$FlashMajorVersion = Invoke-WebRequest -Uri "http://fpdownload2.macromedia.com/pub/flashplayer/update/current/sau/currentmajor.xml"
    $FlashMajorVersionResult = $FlashMajorVersion.version.player.major
    [xml]$FlashVersionDetails = Invoke-WebRequest -Uri "http://fpdownload2.macromedia.com/pub/flashplayer/update/current/sau/$FlashMajorVersionResult/xml/version.xml"
    $FlashMinorVersion = $FlashVersionDetails.version.plugin.minor
    $FlashBuildMajorVersion = $FlashVersionDetails.version.plugin.buildMajor
    $FlashBuildMinorVersion = $FlashVersionDetails.version.plugin.buildMinor
    $programVERSION = "$FlashMajorVersionResult.$FlashMinorVersion.$FlashBuildMajorVersion.$FlashBuildMinorVersion"
    ####################################################

    # NO CHANGES NEEDED
    $programFILENAME = ".\$folderName\$filenamePrefix-$programVERSION.$filenameExtension"
    $programREAD = Get-ChildItem ".\$folderName\" -name | Sort-Object -Descending | Select-Object -First 1
    if ($programREAD.length -eq 0) {
        $programREADVERSION = "$defaultVersion"
    } else {
        $programREADVERSION = $programREAD -replace("$filenamePrefix-","") -replace(".$filenameExtension","")
    }
    ###################

    Write-Host "LOCAL VERSION: $programREADVERSION"
    Write-Host "WEB__ VERSION: $programVERSION"
    Write-Host "LINK: https://INSERT-ADOBE-DISTRIBUTION-LINK-HERE"
    Write-Host "FILENAME: $programFILENAME"
    Write-Host " "
    if ($programREADVERSION -lt $programVERSION) {
        Write-Host "Newer Version Found Online!"
        Write-Host " "
        Write-Host "Please login to Adobe website to manually download"
        Write-Host "and then rename to FILENAME indicated."
        Write-Host " "
        Read-Host "Press Enter to open browser and go to download page"
        Start-Process -FilePath https://INSERT-ADOBE-DISTRIBUTION-LINK-HERE
        Read-Host "Press Enter to continue after manual download"
    } else {
        Write-Host "No Newer Version Found."
    }
}

####################################################################################

function Download-Java {
    # SET VARIABLES
    $initialURL = "https://java.com/en/download/manual.jsp"
    $folderName = "java"
    $filenamePrefix = "java64"
    $filenameExtension = "exe"
    $defaultVersion = "0.0"
    ###############

    # MIGHT NEED CUSTOMIZATION DEPENDING ON CRAWL METHOD
    $program = (Invoke-WebRequest -Uri "$initialURL").Links | Where-Object {($_.innerText -like "*Offline (64-bit)*")} | select href
    $programURL = $program[0]
    $programSTRING = "$programURL"
    $programVERSIONcrawl = (Invoke-WebRequest -uri "$initialURL").AllElements | where {$_.tagName -eq "h4"} | where {$_.outerHTML -like "*sub*"} | where {$_.innerText -like "*Recommended Version *"} | select -expand innerText
    $programVERSION = $programVERSIONcrawl -replace ("Recommended Version ","") -replace (" Update ",".")
    $programDOWNLOAD = $programSTRING -replace("@{href=","") -replace("}","")
    ####################################################

    # NO CHANGES NEEDED
    $programFILENAME = ".\$folderName\$filenamePrefix-$programVERSION.$filenameExtension"
    $programREAD = Get-ChildItem ".\$folderName\" -name | Sort-Object -Descending | Select-Object -First 1
    if ($programREAD.length -eq 0) {
        $programREADVERSION = "$defaultVersion"
    } else {
        $programREADVERSION = $programREAD -replace("$filenamePrefix-","") -replace(".$filenameExtension","")
    }
    downloadProgram $programREADVERSION $programVERSION $programDOWNLOAD $programFILENAME
    ###################
}

####################################################################################

function Download-VLC {
    # SET VARIABLES
    $initialURL = "http://www.videolan.org/vlc/download-windows.html"
    $folderName = "vlc"
    $filenamePrefix = "vlc64"
    $filenameExtension = "exe"
    $defaultVersion = "0.0.0"
    ###############

    # MIGHT NEED CUSTOMIZATION DEPENDING ON CRAWL METHOD
    $program = (Invoke-WebRequest -Uri "$initialURL").Links | Where-Object {($_.href -like "*-win64.exe")} | select href
    $programURL = $program[0]
    $programSTRING = "$programURL"
    $programVERSION = $programSTRING -replace("@{href=//get.videolan.org/vlc/\d{1}\.\d{1}\.\d{1}/win64/vlc-","") -replace("-win64.exe}","")
    $programDOWNLOAD = $programSTRING -replace("@{href=","http:") -replace("}","")
    ####################################################

    # NO CHANGES NEEDED
    $programFILENAME = ".\$folderName\$filenamePrefix-$programVERSION.$filenameExtension"
    $programREAD = Get-ChildItem ".\$folderName\" -name | Sort-Object -Descending | Select-Object -First 1
    if ($programREAD.length -eq 0) {
        $programREADVERSION = "$defaultVersion"
    } else {
        $programREADVERSION = $programREAD -replace("$filenamePrefix-","") -replace(".$filenameExtension","")
    }
    downloadProgram $programREADVERSION $programVERSION $programDOWNLOAD $programFILENAME
    ###################
}

####################################################################################
####################################################################################

Write-Host " "
Write-Host "This script will check for updates to:"
Write-Host " "
Write-Host "- 7zip"
Write-Host "- Chrome"
Write-Host "- Firefox"
Write-Host "- Firefox ESR"
Write-Host "- Java"
Write-Host "- VLC"
Write-Host "- Flash Player ActiveX (manual download)"
Write-Host "- Flash Player NPAPI (manual download)"
Write-Host " "
Read-Host "Press Enter to start"

Write-Host "################################################################################"
Write-Host "Checking: 7zip"
Write-Host "########################################"
Download-7zip
Write-Host " "

Write-Host "################################################################################"
Write-Host "Checking: Chrome"
Write-Host "########################################"
Download-Chrome
Write-Host " "

Write-Host "################################################################################"
Write-Host "Checking: Firefox"
Write-Host "########################################"
Download-Firefox
Write-Host " "

Write-Host "################################################################################"
Write-Host "Checking: FirefoxESR"
Write-Host "########################################"
Download-FirefoxESR
Write-Host " "

Write-Host "################################################################################"
Write-Host "Checking: Java"
Write-Host "########################################"
Download-Java
Write-Host " "

Write-Host "################################################################################"
Write-Host "Checking: VLC"
Write-Host "########################################"
Download-VLC
Write-Host " "

Write-Host "################################################################################"
Write-Host "Checking: FlashActiveX"
Write-Host "########################################"
Download-FlashActiveX
Write-Host " "

Write-Host "################################################################################"
Write-Host "Checking: FlashNPAPI"
Write-Host "########################################"
Download-FlashNPAPI
Write-Host " "

Write-Host "################################################################################"
Write-Host " "
Write-Host "SCRIPT COMPLETE"
Read-Host "Press Enter to exit"
Hello Boris
Subject: How to Crawl Websites to Download Software Updates with PowerShell.
How can I exclude the beta versions from the downloads in $programREAD?
$programREAD = Get-ChildItem ".\$folderName\" -name | Sort-Object -Descending | Where-Object {$_.Name -NotMatch "nls"} | Select-Object -First 1
Thanks
Hi Akram,
It depends on which website you’re trying to crawl and how its page is laid out in HTML. If the link contains the word “beta”, then you can exclude links whose <a> tags contain that string. That is just an example; it will really depend on the web page.
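As a sketch of that idea, the filter below drops any link whose href mentions "beta" before the first match is picked. The sample links are made up so the example runs offline; on a live page they would come from `(Invoke-WebRequest -Uri ...).Links`, and whether "beta" actually appears in the href depends on the site.

```powershell
# Sketch: drop any link whose href mentions "beta" before picking the first match.
# The sample links are made up; on a live page they would come from (Invoke-WebRequest ...).Links.
$links = @(
    [pscustomobject]@{ href = 'downloads/example-2.0-beta-x64.msi' },
    [pscustomobject]@{ href = 'downloads/example-1.9-x64.msi' }
)

$program = $links |
    Where-Object { $_.href -like '*x64.msi' } |
    Where-Object { $_.href -notlike '*beta*' } |
    Select-Object -First 1

# Show the link that survived the filter.
$program.href
```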
Hello Boris
Page: How to Crawl Websites to Download Software Updates with PowerShell
How can I change the script to exclude the beta version from the download?
$programREAD = Get-ChildItem ".\$folderName\" -name | Sort-Object -Descending | Select-Object -First 1
Thanx
Hi Boris,
I’m using the script to download the 7-zip updates from the URL: http://www.7-zip.org/download.html
At the top of the download page you can see the beta listed with its name and version number. Could you help me with this?
thank you very much.
Akram
With the way the 7zip download page is built, there is no good, easy way to do this. However, if we assume the beta will always be the first link displayed, we can simply change $programURL = $program[0] to $programURL = $program[1]. That selects the second 64-bit MSI download displayed on the page instead of the first, which in this case is the non-beta version.
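A slightly more defensive version of that change might fall back to the first link when only one 64-bit MSI is listed, which would presumably mean no beta is on the page. This is only a sketch under that assumption, and the hrefs are made-up samples in the "a/7z...-x64.msi" style that the script parses.

```powershell
# Sketch: take the second matching link, falling back to the first if only one exists.
# The hrefs are made-up samples in the "a/7z<version>-x64.msi" style the script parses.
$program = @(
    [pscustomobject]@{ href = 'a/7z1700-x64.msi' },   # assumed to be the beta, listed first
    [pscustomobject]@{ href = 'a/7z1604-x64.msi' }    # assumed to be the stable release
)

# @() keeps .Count meaningful even when only a single link was returned.
if (@($program).Count -gt 1) {
    $programURL = $program[1]
} else {
    $programURL = @($program)[0]
}

# Show which link was selected.
$programURL.href
```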
This is gold!!