Tome's Land of IT

IT Notes from the Powertoe – Tome Tanasovski

Linux folks meet piped objects, Microsoft folks meet sed!

The world is abuzz with the announcement that Microsoft has open-sourced PowerShell and released a working version of the language for Mac and Linux.


Personally, I’ve been looking forward to this for a long time.  While I believe that PowerShell is a great language for development, I’m NOT excited about that aspect of it on Linux.  I think it will be a long time before .NET Core is tuned to the point where PowerShell performs comparably to Python as an interpreted language, but I do have hope.  No, the reason I have wanted this is that it is a super booster for bash!

Bash Booster – Objects in the pipe

I cannot tell you how many times I’ve tinkered in a Linux shell over the last few years and cursed the fact that I didn’t simply have objects.  Sure, I can represent objects in CSV or JSON, but not being able to interact with them on a non-text level is frustrating in a shell after you’ve used PowerShell for so long.  In PowerShell, this is the way the world works:

Invoke-WebRequest http://blah.blah.blah |
ConvertFrom-Json |where {$_.size -gt 100} |select name, size |Export-Csv output.csv

It’s all about data manipulation.  Grab data, filter, select properties to create new data sets, and output that data.  Additionally, you often process that data inline, for example:

...| ConvertFrom-Json| select @{name='SizeMB';expression={$_.size/1MB}}| ...

And because everything is objects, you can easily write your own parsers, manipulators, or outputters that are duck-typed to work with the objects coming in.  It’s truly a game changer in the shell.

Native Linux Commands to PowerShell? – Hello sed!

For those unfamiliar with Linux, and for Linux users looking for the quickest way to convert text into PowerShell objects, I’m here to tell you that sed is your friend.  Basically, the process is to turn your Linux output into CSV with sed and then use ConvertFrom-Csv to turn the CSV into PowerShell objects.  Of course, this assumes the command has no built-in way to switch its output to CSV or JSON.  If such a switch exists, it is always the best way to go.  For this article we’re talking about turning pure text output into objects.

We’re going to use sed’s -r option, which enables extended regular expressions and makes the patterns easier to write.  We’re also going to use the form s/regex/replace/g, which says: (s)ubstitute each match of regex with replace, (g)lobally across every input line rather than only the first match on each line.
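
As a minimal sketch of what the -r option and the g flag do, the snippet below runs both forms against a single line shaped like ps -f output. It assumes GNU sed (other sed implementations may not understand -r or \s), and the variable names are only for illustration.

```shell
# One line in the shape of ps -f output, used as fixed sample input.
line='tome     22528  4516  0 09:42 pts/2    00:00:00 -bash'

# -r enables extended regular expressions in GNU sed, so + needs no backslash.
# s/\s+/,/g substitutes every run of whitespace with a single comma.
csv=$(printf '%s\n' "$line" | sed -r 's/\s+/,/g')
printf '%s\n' "$csv"
# prints: tome,22528,4516,0,09:42,pts/2,00:00:00,-bash

# Without the trailing g, only the first match on each line is replaced.
first_only=$(printf '%s\n' "$line" | sed -r 's/\s+/,/')
printf '%s\n' "$first_only"
# prints: tome,22528  4516  0 09:42 pts/2    00:00:00 -bash
```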

For this example, we’ll look at the output of ps -f:

10:06:04 PS /> ps -f
UID        PID  PPID  C STIME TTY          TIME CMD
tome     22528  4516  0 09:42 pts/2    00:00:00 -bash
tome     22689 22528  0 09:45 pts/2    00:00:04 powershell
tome     22760 22689  0 10:06 pts/2    00:00:00 /bin/ps -f

As you can see, the fields in each record are separated by runs of spaces. We can easily parse this by replacing each run of whitespace with a comma. To do this with sed, we do the following:

10:08:12 PS /> ps -f |sed -r 's/\s+/,/g'

For now, let’s just ignore the fact that the -f in the last line ends up separated from /bin/ps by a comma, as if it were an extra column. There is a fix for that, which I show further down, but first let’s convert this to PowerShell objects:

10:27:24 PS /> ps -f |sed -r 's/\s+/,/g' |ConvertFrom-Csv |select uid, cmd

UID  CMD
---  ---
tome -bash
tome powershell
tome /bin/ps

10:27:49 PS /> ps -f |sed -r 's/\s+/,/g' |ConvertFrom-Csv |Format-Table

UID  PID   PPID  C STIME TTY   TIME     CMD
---  ---   ----  - ----- ---   ----     ---
tome 22528 4516  0 09:42 pts/2 00:00:00 -bash
tome 22689 22528 0 09:45 pts/2 00:00:09 powershell
tome 23058 22689 0 10:28 pts/2 00:00:00 /bin/ps

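If you really need to preserve the spaces in cmd, there is a good Stack Overflow discussion about it here, but the shortcut is to do something like the following. It swaps the 8th run of whitespace on each line (the first one inside cmd) for a placeholder, converts the remaining runs to commas, and then turns the placeholder back into a space: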

10:29:55 PS /> ps -f |sed -r 's/\s+/XXX/8' |sed -r 's/\s+/,/g' |
sed -r 's/XXX/ /g' |ConvertFrom-Csv |Format-Table

UID  PID   PPID  C STIME TTY   TIME     CMD
---  ---   ----  - ----- ---   ----     ---
tome 22528 4516  0 09:42 pts/2 00:00:00 -bash
tome 22689 22528 0 09:45 pts/2 00:00:09 powershell
tome 23084 22689 0 10:30 pts/2 00:00:00 /bin/ps -f
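
Note that the placeholder protects only the first space inside cmd, so a command with several arguments would still be split at the later spaces. One possible variation, offered as a sketch rather than as the method from the discussion linked above, is to convert only the first seven whitespace runs and leave the rest of the line alone. It assumes GNU sed, the eight-column layout of ps -f, and a made-up sample line whose command has two arguments.

```shell
# Hypothetical ps -f style line whose command has two arguments.
line='tome     22760 22689  0 10:06 pts/2    00:00:00 /bin/ps -f --forest'

# The placeholder approach from the text: only the 8th whitespace run is protected.
placeholder=$(printf '%s\n' "$line" |
  sed -r 's/\s+/XXX/8' | sed -r 's/\s+/,/g' | sed -r 's/XXX/ /g')
printf '%s\n' "$placeholder"
# prints: tome,22760,22689,0,10:06,pts/2,00:00:00,/bin/ps -f,--forest

# Variation: replace only the first seven whitespace runs, one per expression.
# Each s command without g consumes the first remaining run of whitespace.
first_seven=$(printf '%s\n' "$line" |
  sed -r -e 's/\s+/,/' -e 's/\s+/,/' -e 's/\s+/,/' -e 's/\s+/,/' \
         -e 's/\s+/,/' -e 's/\s+/,/' -e 's/\s+/,/')
printf '%s\n' "$first_seven"
# prints: tome,22760,22689,0,10:06,pts/2,00:00:00,/bin/ps -f --forest
```

Neither form quotes the fields, so a command that itself contains a comma would still produce a malformed CSV row.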

One final note: if you are new to PowerShell from Linux, the Format-* commands always go at the end of an object chain. They are designed to change how output is displayed on the screen, and you should not use them for anything else. They replace your objects with formatting data, which is unlikely to be what you want if another command follows the format command in the pipe.

What about headerless commands such as ls -l?

If you look at the output of ls -l and the corresponding CSV that sed produces from it, it looks like this:

10:35:05 PS /home/tome> ls -l
total 4
-rw-rw-r--  1 tome tome    0 Aug 19 10:33 file1
-rw-rw-r--  1 tome tome    0 Aug 19 10:34 file2
drwxrwxr-x 11 tome tome 4096 Aug 17 10:36 PowerShell
10:35:08 PS /home/tome> ls -l |sed -r 's/\s+/,/g'

There are two problems with the above.  First, there is an extra line that has no relevant info.  Second, there is no header to tell PowerShell what the property names are for the objects.

Skipping a line with Select

Skipping the total line is easy using the -Skip parameter of Select-Object:

10:35:10 PS /home/tome> ls -l |sed -r 's/\s+/,/g' |select -Skip 1

Adding a custom header

ConvertFrom-Csv has a -Header parameter that lets you supply the list of column names to use when the CSV text has no header line of its own.  Here is how you can use it to convert the output of ls -l to actual PowerShell objects:

10:43:33 PS /home/tome> ls -l |sed -r 's/\s+/,/g' |select -Skip 1 |
ConvertFrom-Csv -Header @('mode','count','user','group','size','month','day','time','name') |Format-Table

mode       count user group size month day time  name
----       ----- ---- ----- ---- ----- --- ----  ----
-rw-rw-r-- 1     tome tome  0    Aug   19  10:33 file1
-rw-rw-r-- 1     tome tome  0    Aug   19  10:34 file2
drwxrwxr-x 11    tome tome  4096 Aug   17  10:36 PowerShell
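
If you would rather have the text arrive as complete CSV, the skip and the header can also be handled on the sed side. The following is a minimal sketch of that idea, not the approach used above: it assumes GNU sed, reuses the ls -l listing from this section as fixed sample input, and will break on file names that contain spaces.

```shell
# Sample text copied from the ls -l output shown in this section.
listing='total 4
-rw-rw-r--  1 tome tome    0 Aug 19 10:33 file1
-rw-rw-r--  1 tome tome    0 Aug 19 10:34 file2
drwxrwxr-x 11 tome tome 4096 Aug 17 10:36 PowerShell'

# 1d deletes the "total" line; the s command turns whitespace runs into commas.
rows=$(printf '%s\n' "$listing" | sed -r -e '1d' -e 's/\s+/,/g')

# Emit a header line first so the result is a complete CSV document.
csv=$(printf '%s\n%s\n' 'mode,count,user,group,size,month,day,time,name' "$rows")
printf '%s\n' "$csv"
```

Text in this shape can be piped to ConvertFrom-Csv without the -Header parameter, since the first line already names the columns.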


Alternative to sed

Alternatively, you can use PowerShell in place of sed for replacing string contents. The pattern is generally like this:

10:43:55 PS /home/tome> ls -l |select -Skip 1 |%{$_ -replace '\s+', ','}


PowerShell adds a lot more tools to your belt.  When it comes to data manipulation, the paradigm shift from text to objects is a game changer.  I’m personally really happy to have this flexibility and I can’t wait to take advantage of it!


