25 February 2010

Built-in .NET Utility Classes for encoding/decoding strings

Every now and again, you’ll come across a string that has been encoded (or you’ll need to encode one yourself) and wonder whether there is a class in .NET that can help. The answer is probably yes, but the relevant classes are tucked away and difficult to find. In this article I’ll show you how to encode and decode strings for the three most common forms of encoding.

HTML Encoding

HTML encoding replaces ampersands, angle brackets, quotes and other special characters with entities, and the same scheme is used in XML data. For example, the string “Ben & Jerry’s” will be encoded as “Ben &amp;amp; Jerry&amp;#39;s”.

To escape and unescape these strings, use System.Web.HttpUtility.HtmlEncode() and System.Web.HttpUtility.HtmlDecode().
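
As a minimal sketch of the round trip: the class name HtmlEncodingSketch and its Encode/Decode helpers are invented for illustration, and only the two HttpUtility calls come from the framework.

```csharp
using System;
using System.Web; // on .NET Framework this needs a project reference to System.Web

public static class HtmlEncodingSketch
{
    // Hypothetical helper: escape text so it can be placed inside HTML or XML.
    public static string Encode(string text)
    {
        return HttpUtility.HtmlEncode(text);
    }

    // Hypothetical helper: turn entities such as &amp; back into plain characters.
    public static string Decode(string html)
    {
        return HttpUtility.HtmlDecode(html);
    }

    public static void Main()
    {
        string original = "Ben & Jerry's";
        string encoded = Encode(original);

        Console.WriteLine(encoded);                     // the ampersand is written as &amp;
        Console.WriteLine(Decode(encoded) == original); // True
    }
}
```

Decoding the encoded text should always give back the original string, which makes the pair safe to use as a round trip.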

URL Encoding

URL encoded characters are the %-number elements you occasionally see in URLs. For example, “Ben & Jerry’s” this time will become “Ben+%26+Jerry’s”. Note that spaces can be represented as “+” or as “%20”. The encode method will always use +, but the decode method handles both.

To escape and unescape these strings, use System.Web.HttpUtility.UrlEncode() and System.Web.HttpUtility.UrlDecode().
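
A small sketch of both directions, including the two ways of writing a space. The class UrlEncodingSketch and its helper names are made up for this example; the framework methods are spelled UrlEncode and UrlDecode (C# is case-sensitive).

```csharp
using System;
using System.Web; // on .NET Framework this needs a project reference to System.Web

public static class UrlEncodingSketch
{
    // Hypothetical helper: make text safe to use as a URL query value.
    public static string Encode(string text)
    {
        return HttpUtility.UrlEncode(text);
    }

    // Hypothetical helper: reverse the %-escapes and '+' signs.
    public static string Decode(string encoded)
    {
        return HttpUtility.UrlDecode(encoded);
    }

    public static void Main()
    {
        string original = "Ben & Jerry's";
        string encoded = Encode(original);

        Console.WriteLine(encoded);                     // spaces become '+', the ampersand becomes %26
        Console.WriteLine(Decode(encoded) == original); // True

        // Both spellings of a space decode to the same text.
        Console.WriteLine(Decode("Ben+%26+Jerry"));     // Ben & Jerry
        Console.WriteLine(Decode("Ben%20%26%20Jerry")); // Ben & Jerry
    }
}
```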

Note that both of the above examples require your project to have a reference to System.Web.

Escaped Strings

These are the \r, \n and \t sequences that you use to specify special characters in string literals. Though it’s rare, sometimes you’ll come across these sequences in plain text and need to convert them to the actual characters (or vice versa).

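To handle these strings correctly, use the frustratingly obscure Regex.Escape and Regex.Unescape methods in System.Text.RegularExpressions. Escape will turn tabs, carriage returns, line feeds and form feeds into their plain-text escaped representation (e.g., a carriage return followed by a line feed will become the four characters \r\n) and Unescape will turn them back again. Be aware that Escape was written for regular expressions, so it also puts a backslash in front of spaces and regex metacharacters such as . * + ? ( [ and \ itself.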
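
Here is a short sketch of that behaviour. The class EscapedStringSketch and its helper names are hypothetical; Regex.Escape and Regex.Unescape are the real framework methods.

```csharp
using System;
using System.Text.RegularExpressions;

public static class EscapedStringSketch
{
    // Hypothetical helper: real tab/CR/LF characters -> visible backslash sequences.
    public static string ToEscapedText(string raw)
    {
        return Regex.Escape(raw);
    }

    // Hypothetical helper: visible backslash sequences -> real characters.
    public static string FromEscapedText(string escaped)
    {
        return Regex.Unescape(escaped);
    }

    public static void Main()
    {
        string raw = "col1\tcol2\r\n"; // contains a real tab, carriage return and line feed
        string escaped = ToEscapedText(raw);

        Console.WriteLine(escaped);                          // col1\tcol2\r\n
        Console.WriteLine(FromEscapedText(escaped) == raw);  // True

        // Side effect of using a regex helper: dots and spaces are escaped too.
        Console.WriteLine(ToEscapedText("a.b c"));           // a\.b\ c
    }
}
```

Because of that side effect, Escape fits best when the text is mostly letters and digits, or when Unescape will later be run over the result.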