Optimizing lookups in PowerShell
By Anatoly Mironov
Have you ever had a PowerShell script with two large arrays whose information you wanted to merge? It can become quite slow if, for every item in array A, you need to search through all items in array B. The solution is called a hashtable! It might not be an advanced tip for some, but I was really glad to see a huge improvement, so I decided to share it as a post.
My array A ($sites) is a list of SharePoint sites (over 10K of them). For every site I need to get information on the owner (such as UsageLocation). To minimize calls to the server, I want to reuse the information I already have in my array B: $users. This array of users also has thousands of entries.
Here is my main (simplified) setup:
    $users = @() # array, code omitted for brevity
    $sites = @() # array, code omitted for brevity
    $sitesAndOwners = $sites | ForEach-Object {
        [PSCustomObject]@{
            Site  = $_
            # GetUserInfo stands for one of the two lookup functions below
            Owner = GetUserInfo $_.Owner
        }
    }
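Searching array B with Where-Object for the matching item, once for every entry in array A, is slow:
    function GetUserInfoSlow($upn) {
        # Linear scan: Where-Object tests every user on every call
        $user = $users | Where-Object { $_.UserPrincipalName -eq $upn }
        if (-not $user) {
            # Not in the array yet: ask the server and remember the result
            $user = Get-AzureADUser -SearchString $upn
            # $script: is needed, a plain assignment would only create a function-local copy
            $script:users += $user
        }
        return $user
    }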
Using a hashtable is much faster:
    $usersHash = @{}
    function GetUserInfoFast($upn) {
        # we check if there is an entry even if value is null
        if ($usersHash.Contains($upn)) {
            $user = $usersHash[$upn]
        }
        else {
            $user = Get-AzureADUser -SearchString $upn
            $usersHash.Add($upn, $user)
        }
        $user
    }
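To get a feel for the difference without touching a tenant, here is a minimal sketch with synthetic data. The $fakeUsers array, the example.com addresses and the sample sizes are made up for illustration; only the two lookup techniques come from this post. Timings depend on the machine, so none are quoted here.

```powershell
# Hypothetical stand-in for $users: 2000 objects with a UserPrincipalName
$fakeUsers = 1..2000 | ForEach-Object {
    [PSCustomObject]@{
        UserPrincipalName = 'user{0}@example.com' -f $_
        UsageLocation     = 'SE'
    }
}

# Build the index once: a single pass over the array
$fakeIndex = @{}
foreach ($u in $fakeUsers) {
    $fakeIndex[$u.UserPrincipalName] = $u
}

# The UPNs to resolve: every 10th user, 200 lookups in total
$wanted = 1..200 | ForEach-Object { 'user{0}@example.com' -f ($_ * 10) }

# Linear scan: each lookup walks the whole array
$slow = Measure-Command {
    foreach ($upn in $wanted) {
        $null = $fakeUsers | Where-Object { $_.UserPrincipalName -eq $upn }
    }
}

# Hashtable: each lookup goes straight to the key
$fast = Measure-Command {
    foreach ($upn in $wanted) {
        $null = $fakeIndex[$upn]
    }
}

'Where-Object: {0:N0} ms, hashtable: {1:N0} ms' -f $slow.TotalMilliseconds, $fast.TotalMilliseconds
```

The scan does roughly (number of lookups) x (number of users) comparisons, while the hashtable does about one operation per lookup, so the gap widens as both arrays grow.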
In my case the run time went from hours to seconds. As a bonus, here is how you can convert an existing array to a hashtable:
    # bonus: convert array to a hash table (assumes $usersHash = @{} already exists)
    $users | ForEach-Object {
        $usersHash.Add($_.UserPrincipalName, $_)
    }
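One thing to watch in the conversion: Add throws when the key is already present, which happens if the source array contains the same UserPrincipalName twice. Below is a small sketch with made-up sample data (the $sampleUsers array and its values are hypothetical) where assignment by key lets the last entry win instead.

```powershell
# Hypothetical sample data with one duplicate UserPrincipalName
$sampleUsers = @(
    [PSCustomObject]@{ UserPrincipalName = 'anna@example.com'; UsageLocation = 'SE' }
    [PSCustomObject]@{ UserPrincipalName = 'bo@example.com';   UsageLocation = 'NO' }
    [PSCustomObject]@{ UserPrincipalName = 'anna@example.com'; UsageLocation = 'DK' }
)

$sampleHash = @{}
foreach ($u in $sampleUsers) {
    # Assignment by key overwrites an existing entry; .Add() would throw on the duplicate
    $sampleHash[$u.UserPrincipalName] = $u
}

$sampleHash.Count                              # 2
$sampleHash['ANNA@example.com'].UsageLocation  # DK
```

The second lookup also shows that a hashtable created with @{} compares string keys case-insensitively, which is convenient for UPNs that arrive in mixed case.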
That’s all I’ve got today. Sharing is caring… And of course, big thanks to my colleague Anton H. for his advice.