Tome's Land of IT

IT Notes from the Powertoe – Tome Tanasovski

ForEach-Parallel

I just came back from the PowerShell Deep Dive at TEC 2012.  A great experience, by the way, and I highly recommend it to everyone.  It was full of extremely smart and passionate people who could talk about PowerShell for days, and it came with direct access to the PowerShell product team!

During this summit, workflows were a topic of conversation.  If you have looked at workflows, there is one feature that generally catches the eye, and I know it caught mine the first time I saw it: ForEach-Parallel (written foreach -parallel inside a workflow).  Unfortunately, when you dig into what it's doing you come to learn that it is not a solution for multithreading in PowerShell.  Nope, it's extremely slowwwwww.  If you're like me, parallel processing is key to getting some enterprise-class scripts to run faster.  You may have played with jobs before, but each job starts its own process, and that overhead slows them down.  Running scripts side by side works, but requires you to engineer the scripts so that they can be called that way.  So what is the best way to run something like a loop of data across four threads?  The answer is runspaces and runspace pooling.

function ForEach-Parallel {
    param(
        [Parameter(Mandatory=$true,Position=0)]
        [System.Management.Automation.ScriptBlock] $ScriptBlock,
        [Parameter(Mandatory=$true,ValueFromPipeline=$true)]
        [PSObject]$InputObject,
        [Parameter(Mandatory=$false)]
        [int]$MaxThreads=5
    )
    BEGIN {
        # One pool of up to $MaxThreads runspaces, shared by every pipeline item
        $iss = [System.Management.Automation.Runspaces.InitialSessionState]::CreateDefault()
        $pool = [RunspaceFactory]::CreateRunspacePool(1, $MaxThreads, $iss, $Host)
        $pool.Open()
        $threads = @()
        # Prepend param($_) so the current item is available as $_ inside the script block
        $ScriptBlock = $ExecutionContext.InvokeCommand.NewScriptBlock("param(`$_)`r`n" + $ScriptBlock.ToString())
    }
    PROCESS {
        # Queue one asynchronous invocation per input object
        $powershell = [PowerShell]::Create().AddScript($ScriptBlock).AddArgument($InputObject)
        $powershell.RunspacePool = $pool
        $threads += @{
            Instance = $powershell
            Handle = $powershell.BeginInvoke()
        }
    }
    END {
        # Poll until every invocation has completed, emitting results as they finish
        $notdone = $true
        while ($notdone) {
            $notdone = $false
            for ($i=0; $i -lt $threads.Count; $i++) {
                $thread = $threads[$i]
                if ($thread) {
                    if ($thread.Handle.IsCompleted) {
                        # EndInvoke writes the script block's output to the pipeline
                        $thread.Instance.EndInvoke($thread.Handle)
                        $thread.Instance.Dispose()
                        $threads[$i] = $null
                    }
                    else {
                        $notdone = $true
                    }
                }
            }
            # Brief pause so the polling loop does not pin a CPU core
            if ($notdone) { Start-Sleep -Milliseconds 50 }
        }
        # Release the runspaces once all work is done
        $pool.Close()
        $pool.Dispose()
    }
}

With that function, you can do things like this:

(0..50) | ForEach-Parallel -MaxThreads 4 {
    $_
    sleep 3
}

You'll notice that the above runs in batches of four.  Because every item sleeps for the same three seconds, the results tend to come back in order, so it looks like the data is running serially even though four items really are running in parallel.  A better example is something like this, which simulates some items taking longer than others:

(0..50) | ForEach-Parallel -MaxThreads 4 {
    $_
    sleep (Get-Random -Minimum 0 -Maximum 5)
}

Mind you, parallel processing doesn't always make things faster.  For example, if your threads together need more CPU than your box can provide, they compete for the cores and the extra scheduling adds latency.  Another example: if the work you are performing in your loop is not long running, the overhead of starting up multiple threads could make your script slower.  Just use your head and play with it.  In the right place at the right time, this is an absolute lifesaver.
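
One way to find out whether a given loop benefits is simply to time both versions.  The following is a rough sketch, not a benchmark: the half-second sleep, the item count, and the names $work, $items, $serial and $pooled are all made up for illustration, and it drives a runspace pool directly so that it runs on its own.

```powershell
# Hypothetical workload: each item just sleeps for 500 ms and returns its input.
$work  = { param($n) Start-Sleep -Milliseconds 500; $n }
$items = 1..8

# Serial baseline: run the script block once per item, one after another.
$serial = Measure-Command {
    foreach ($item in $items) { & $work $item | Out-Null }
}

# Pooled run: at most four runspaces execute the same script block concurrently.
$pooled = Measure-Command {
    $pool = [runspacefactory]::CreateRunspacePool(1, 4)
    $pool.Open()
    $jobs = foreach ($item in $items) {
        $ps = [powershell]::Create().AddScript($work.ToString()).AddArgument($item)
        $ps.RunspacePool = $pool
        @{ Instance = $ps; Handle = $ps.BeginInvoke() }
    }
    foreach ($job in $jobs) {
        $job.Instance.EndInvoke($job.Handle) | Out-Null   # blocks until that item finishes
        $job.Instance.Dispose()
    }
    $pool.Close()
    $pool.Dispose()
}

'Serial: {0:N1} s   Pooled: {1:N1} s' -f $serial.TotalSeconds, $pooled.TotalSeconds
```

Swap the sleep for your real work and shrink it toward a few milliseconds, and you should see the gap narrow or even reverse as the startup overhead starts to dominate.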

Note: I learned this technique from Dr. Tobias Weltner, but for some reason I can’t find the link to the video where he discussed it.


7 responses to “ForEach-Parallel”

  1. brwilkinson May 3, 2012 at 10:37 pm

    Dr Tobias Weltner video on multi threading etc
    http://bits_video.s3.amazonaws.com/022012-SUPS01_archive.f4v
    >> download files: http://powershell.com/cs/media/p/14779.aspx

    Maybe the original links for the video from Powershell.com were removed?!

  2. David Erickson May 15, 2012 at 7:28 pm

    I think I am missing something. The script block populates the variables, but after the scriptblock is finished the variables are no longer populated with data. I need the results of the variables for the next part of my script. What am I missing? $fqdnlist and $deadservers are blank after the script block is done.
    I am very excited about this as this is much faster than start-job!
    # (ForEach-Parallel function pasted here, copied from the post above)

    $erroractionpreference = "SilentlyContinue"
    $colComputers = get-content C:\temp\listserver.txt
    $Fqdnlist = @()
    $deadservers = @()
    $code = {
        $results += $var = nltest /domain_trusts; $var = $var -split " "; $var = $var | ? {$_ -like "*.net" -or $_ -like "*.com" -or $_ -like "*.pvt"}
        foreach ($domain in $var)
        {
            $ping = new-object System.Net.NetworkInformation.Ping
            $fqdn = "$_" + "." + "$domain"
            $Reply = $ping.send($fqdn)
            if ($Reply.status -eq "Success")
            {
                Write-host -ForegroundColor Green ("$_" + "." + "$domain")
                $Fqdnlist += ("$_" + "." + "$domain" + ",")
            }
            else
            {
                $deadservers += ("$_")
            }
            $reply = ""
        }

    }
    ($colcomputers) | foreach-parallel -ScriptBlock $code -MaxThreads 20

    • Tome May 21, 2012 at 12:39 pm

      Yes, the variables within the scriptblock will only remain in scope for the duration of the scriptblock. You would need to send the objects you want to return within the scriptblock on a line of their own to return them. You would then need to figure out how to handle multiple objects returned.
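
A minimal sketch of that idea follows, assuming a pool driven directly rather than through the function in the post.  The names $names, $work and $results and the NameLength property are hypothetical.  Each script block emits one object on a line of its own, and the caller gathers whatever EndInvoke hands back.

```powershell
# Minimal pool-based sketch; $names and $work are hypothetical stand-ins.
$names = 'alpha', 'beta', 'gamma'
$work  = {
    param($name)
    # Anything written to the output stream is handed back by EndInvoke().
    [pscustomobject]@{ Name = $name; NameLength = $name.Length }
}

$pool = [runspacefactory]::CreateRunspacePool(1, 2)
$pool.Open()

$jobs = foreach ($name in $names) {
    $ps = [powershell]::Create().AddScript($work.ToString()).AddArgument($name)
    $ps.RunspacePool = $pool
    @{ Instance = $ps; Handle = $ps.BeginInvoke() }
}

# Collect everything the script blocks emitted into a variable in the caller's scope.
$results = foreach ($job in $jobs) {
    $job.Instance.EndInvoke($job.Handle)
    $job.Instance.Dispose()
}
$pool.Close()
$pool.Dispose()

$results | Sort-Object Name
```

Because the function in the post also writes the output of EndInvoke to the pipeline, the same pattern should work there by assigning the call to a variable, for example $results = $colComputers | ForEach-Parallel -ScriptBlock $code.  Emitting one object per item with a status property is one way to tell live servers from dead ones afterwards.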

      You can control this greatly, however, by managing the runspaces yourself. In other words, if you don’t use runspacepools, you can reuse your runspaces and maintain the state of variables. However, this will only be in the scope of your runspaces, but you have more access to them.
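
As a rough illustration of managing a runspace yourself, the sketch below reuses one dedicated runspace for two invocations and shows that a variable set by the first is still there for the second.  The variable names ($counter, $value, $fromProxy) are made up for the example, and this is only one way to do it.

```powershell
# One dedicated runspace (no pool), reused for two separate invocations.
$runspace = [runspacefactory]::CreateRunspace()
$runspace.Open()

# First invocation: set a variable inside the runspace.
$ps = [powershell]::Create()
$ps.Runspace = $runspace
$ps.AddScript('$counter = 41').Invoke() | Out-Null
$ps.Dispose()

# Second invocation on the same runspace: the variable is still there.
$ps = [powershell]::Create()
$ps.Runspace = $runspace
$value = $ps.AddScript('$counter + 1').Invoke()
$ps.Dispose()

# The caller can also read the runspace's variables directly.
$fromProxy = $runspace.SessionStateProxy.GetVariable('counter')

$runspace.Close()
$runspace.Dispose()
```

The state lives in that one runspace only, so with several runspaces you would have to read each one back and merge the values yourself.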

  3. Pingback: Parallel PowerShell | rambling cookie monster

  4. ramblingcookiemonster October 24, 2012 at 11:59 am

    Hello Tome,

    This is coming in quite handy, many thanks!

    Have you had to modify the initial session state to add a module, rather than loading the module for every single thread?

    I’m assuming we create the default, and then use ImportPSModule, but I’m having trouble translating the developer oriented details on this method to PowerShell!

    $sessionstate = [system.management.automation.runspaces.initialsessionstate]::CreateDefault()
    $sessionstate.ImportPSModule("ActiveDirectory") #doesn't seem to work

    Regards,

    CM
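
For what it's worth, here is a hedged sketch of the ImportPSModule approach described in this comment.  It does not explain why the ActiveDirectory call above failed.  To keep it runnable anywhere it uses a throwaway module written to a temp folder (DemoPoolModule and Get-DemoGreeting are invented for the demo), and it assumes your execution policy allows loading a local .psm1.

```powershell
# Hypothetical throwaway module, written to a temp folder just for this demo.
$moduleDir  = Join-Path ([IO.Path]::GetTempPath()) 'DemoPoolModule'
$modulePath = Join-Path $moduleDir 'DemoPoolModule.psm1'
New-Item -ItemType Directory -Path $moduleDir -Force | Out-Null
Set-Content -Path $modulePath -Value 'function Get-DemoGreeting { param($Name) "Hello, $Name" }'

# Register the module on the initial session state before the pool is created.
$iss = [System.Management.Automation.Runspaces.InitialSessionState]::CreateDefault()
$iss.ImportPSModule(@($modulePath))

$pool = [runspacefactory]::CreateRunspacePool(1, 2, $iss, $Host)
$pool.Open()

# The function is called here without any explicit Import-Module in the script.
$ps = [powershell]::Create().AddScript('Get-DemoGreeting -Name pool')
$ps.RunspacePool = $pool
$greeting = $ps.Invoke()
$ps.Dispose()

$pool.Close()
$pool.Dispose()
Remove-Item -Path $moduleDir -Recurse -Force

$greeting
```

If the function resolves, the module was loaded as part of opening the runspace rather than by the script itself.  With an installed module you would pass its name instead of a path.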

  5. Pingback: More on PowerShell multithreading via runspace pools | Dave Wyatt's Blog

  6. Peter Kriegel May 25, 2014 at 9:00 am

    Hi Tome!

    I'd like to invite you to my discussion: Invoke-Parallel need help to clone the current Runspace
    http://powershell.org/wp/forums/topic/invpke-parallel-need-help-to-clone-the-current-runspace/

    Greets Peter Kriegel
    Founder member of the European, German speaking, Windows PowerShell Community
    http://www.PowerShell-Group.eu
