Out-File Encoding in PowerShell

I recently ran into a problem while setting up a build pipeline for a new project at work. The errors were completely unhelpful and I was struggling to understand what was going on. Several hours later, and having nearly pulled out all my hair, I finally came across a post on Stack Overflow that provided the answer.

It all came down to the way that the PowerShell Out-File command works. In Windows PowerShell 5 (the version my server is on) Out-File writes a BOM (byte order mark) by default: its default encoding is UTF-16 LE with BOM, and even -Encoding utf8 produces UTF-8 with BOM. This is a problem because a lot of tools (at least in my experience) do not handle a BOM gracefully. In my case I was trying to integrate with Atlassian tools and Docker, neither of which took the BOM well.

Thanks to the post I found on Stack Overflow (which, unfortunately, I can’t seem to find anymore) I found that one can dip into .NET to get things working.

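$content = @"
SomeData=blue
OtherVariable=42
"@
# This file will start with a BOM (UTF-16 LE by default in Windows PowerShell 5)
$content | Out-File "bad-encoding.txt"
# This file will just be UTF-8, with no BOM.
# .NET resolves relative paths against the process working directory,
# not PowerShell's current location, so build a full path first.
[System.IO.File]::WriteAllLines( (Join-Path $PWD "good-encoding.txt"), $content )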

This solution works fine. However, if you are fortunate enough to be working with a newer version of PowerShell (6 or later), this will hopefully not be an issue. According to the documentation, UTF-8 without the BOM has been made the default there, and it can also be requested explicitly like so.

$content | Out-File "good-encoding.txt" -Encoding utf8NoBOM

I haven’t tested it but this should produce the same output as the [System.IO.File] call above.
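For scripts that have to run on both Windows PowerShell 5 and newer versions, the .NET approach can be wrapped in a small function. The following is only a sketch: Write-Utf8NoBom is a name I made up for illustration, not a built-in cmdlet, and it assumes the target is an ordinary file system path.

```powershell
# Hypothetical helper (not a built-in cmdlet): writes text as UTF-8 without a BOM.
function Write-Utf8NoBom {
    param(
        [Parameter(Mandatory)][string]$Path,
        [Parameter(Mandatory)][string]$Text
    )

    # Turn a possibly relative path into a full path based on PowerShell's
    # current location. This also works when the file does not exist yet.
    $fullPath = $ExecutionContext.SessionState.Path.GetUnresolvedProviderPathFromPSPath($Path)

    # Passing $false to the constructor tells UTF8Encoding not to emit a BOM.
    $encoding = New-Object System.Text.UTF8Encoding $false

    [System.IO.File]::WriteAllText($fullPath, $Text, $encoding)
}
```

It could then be called as Write-Utf8NoBom -Path "good-encoding.txt" -Text $content. Unlike WriteAllLines, WriteAllText does not append a trailing newline, so add one to the text if the consuming tool expects it.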

What is BOM?

I took a quick look over at Wikipedia’s article on BOM. Essentially, the BOM, or byte order mark, is a short marker at the very start of a file that helps programs tell which Unicode encoding was used and, for UTF-16 and UTF-32, the endianness of the file. In UTF-8 it is the three bytes EF BB BF. According to the article the BOM is optional, but it causes problems for software that does not expect it.

In fact, the article specifically mentions:

BOM use is optional. Its presence interferes with the use of UTF-8 by software that does not expect non-ASCII bytes at the start of a file but that could otherwise handle the text stream.
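To see whether a particular file actually starts with a BOM, the first few bytes can be compared against the known marker values. Here is a minimal sketch, assuming a file on the local file system; Get-FileBom is a hypothetical helper name of my own, and it deliberately ignores UTF-32.

```powershell
# Hypothetical helper: reports which byte order mark, if any, a file starts with.
# UTF-32 is ignored in this sketch, so a UTF-32 LE file would be reported as UTF-16 LE.
function Get-FileBom {
    param([Parameter(Mandatory)][string]$Path)

    # Resolve against PowerShell's current location; .NET on its own would
    # use the process working directory instead.
    $fullPath = (Resolve-Path -LiteralPath $Path).ProviderPath
    $bytes = [System.IO.File]::ReadAllBytes($fullPath)

    # UTF-8 BOM: EF BB BF
    if ($bytes.Length -ge 3 -and $bytes[0] -eq 0xEF -and $bytes[1] -eq 0xBB -and $bytes[2] -eq 0xBF) {
        return 'UTF-8'
    }
    # UTF-16 little-endian BOM: FF FE
    if ($bytes.Length -ge 2 -and $bytes[0] -eq 0xFF -and $bytes[1] -eq 0xFE) {
        return 'UTF-16 LE'
    }
    # UTF-16 big-endian BOM: FE FF
    if ($bytes.Length -ge 2 -and $bytes[0] -eq 0xFE -and $bytes[1] -eq 0xFF) {
        return 'UTF-16 BE'
    }
    return 'none'
}
```

Running Get-FileBom against the two files from earlier should report a BOM for bad-encoding.txt and 'none' for good-encoding.txt.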

This makes me wonder if I could have solved the encoding problems I ran into by simply switching to ASCII. Hmm, I’ll have to give it a try.
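For what it is worth, here is a rough sketch of what that experiment might look like. ASCII has no BOM, so for plain key=value content like mine the output should start directly with the data. The catch is that any character outside the ASCII range is replaced with a question mark. The file name below is made up for the example.

```powershell
# Illustrative only: the file name and sample values are made up for this sketch.
$path = Join-Path ([System.IO.Path]::GetTempPath()) 'ascii-encoding.txt'

"SomeData=blue" | Out-File $path -Encoding ascii

# ASCII output has no BOM, so the first byte is the letter 'S' (hex 53).
$bytes = [System.IO.File]::ReadAllBytes($path)
'{0:X2}' -f $bytes[0]

# Caveat: characters outside the ASCII range cannot be represented.
# [char]0xE9 is an e with an acute accent; it is written out as '?'.
"Name=caf$([char]0xE9)" | Out-File $path -Encoding ascii
Get-Content $path
```

So ASCII looks like a workable escape hatch as long as the content never contains anything beyond plain English letters, digits and punctuation. UTF-8 without the BOM is the safer choice otherwise.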
