Developer Blog
Articles about Using Microsoft Developer Tools

Abbreviating URLs

Thursday, April 29, 2010 7:35 AM by jonwood

NOTE: This article has been updated and moved to: Abbreviating URLs.

Recently, I had a case where an ASP.NET page displayed the user's URL in a side column. This worked fine except that I found some users had very long URLs, which didn't look right.

It occurred to me that I could simple truncate the visible URL while still keeping the underlying link the same. However, when I truncated the URL by trimming excess characters, I realized it could be done more intelligently.

For example, consider the URL http://www.domain.com/here/is/one/long/url/page.apsx. If I wanted to keep it within 40 characters, I could trim it to http://www.domain.com/here/is/one/long/u. The problem is that this abbreviation could be more informative. For example, is it a directory or a page? And, if it's a page, what kind? And what exactly does the "u" at the end stand for?

Wouldn't it be a little better if I instead abbreviated this URL to http://www.domain.com/.../url/page.apsx? We've lost a few characters due to the three dots that show information is missing. But we can still see the domain, and the page name and type.

The code is Listing 1 abbreviates a URL is this way. The UrlHelper class contains just a single, static method, LimitLength(). This method takes a URL string and a maximum length arguments, and attempts to abbreviate the URL so that it will fit within the specified number of characters as described above.

public class UrlHelper
{
  public static char[] Delimiters = { '/', '\\' };
  /// <summary>
  /// Attempts to intelligently short the length of a URL. No attempt is
  /// made to shorten less than 5 characters.
  /// </summary>
  /// <param name="url">The URL to be tested</param>
  /// <param name="maxLength">The maximum length of the result string</param>
  /// <returns></returns>
  public static string LimitLength(string url, int maxLength)
  {
    if (maxLength < 5)
      maxLength = 5;
    if (url.Length > maxLength)
    {
      // Remove protocol
      int i = url.IndexOfAny(new char[] { ':', '.' });
      if (i >= 0 && url[i] == ':')
        url = url.Remove(0, i + 1);
      // Remove leading delimiters
      i = 0;
      while (url.Length > 0 && (url[i] == Delimiters[0]
        || url[0] == Delimiters[1]))
        i++;
      if (i > 0)
        url = url.Remove(0, i);
      // Remove trailing delimiter
      if (url.Length > maxLength && (url.EndsWith("/") || url.EndsWith("\\")))
        url = url.Remove(url.Length - 1);
      // Remove path segments until url is short enough or no more segments:
      //
      // domain.com/abc/def/ghi/jkl.htm
      // domain.com/.../def/ghi/jkl.htm
      // domain.com/.../ghi/jkl.htm
      // domain.com/.../jkl.htm
      if (url.Length > maxLength)
      {
        i = url.IndexOfAny(Delimiters);
        if (i >= 0)
        {
          string first = url.Substring(0, i + 1);
          string last = url.Substring(i);
          bool trimmed = false;
          do
          {
            i = last.IndexOfAny(Delimiters, 1);
            if (i < 0 || i >= (last.Length - 1))
              break;
            last = last.Substring(i);
            trimmed = true;
          } while ((first.Length + 3 + last.Length) > maxLength);
          if (trimmed)
            url = String.Format("{0}...{1}", first, last);
        }
      }
    }
    return url;
  }
}

Listing 1: UrlHelper class.

If the specified maximum length is less than five, LimitLength() simply changes it to five as there is no point in attempting to shorten a URL to less than the length of the protocol (http://).

That's all there is to it. I hope some of you find this code helpful.

Tags:  
Categories:   C# .NET | ASP.NET
Actions:   E-mail | del.icio.us | Permalink | Comments (0) | Comment RSSRSS comment feed

Encoding Query Arguments

Thursday, April 29, 2010 7:06 AM by jonwood

NOTE: This article has been updated and moved to: Encrypting Query Arguments.

When passing variables between pages in ASP.NET, you have a few techniques you can choose from. One of the simplest is to use query arguments (e.g. http://www.domain.com/page.aspx?arg1=val1&arg2=val2). In ASP.NET, query arguments are easy to implement and use.

If you spend time browsing sites like Amazon.com, you'll see these query arguments causing the URLs to grow quite long. Long URLs don't generally cause a problem; however, there are some potential problems with query arguments.

For one thing, they are completely visible to the user. If you need to pass sensitive variables, then this could cause problems. For another thing, users can easily modify these values. For example, let's say you have a page that displays the current user's information. If a user ID is passed as a query argument, the user could easily edit that ID, possibly causing information for another user to be displayed. The potential security concerns here are pretty obvious.

Still, query arguments can be so convenient I decided to throw together a class that allows me to use them without the potential issues described above. In order to prevent the arguments from being seen by the user, the arguments are encrypted into a single argument. And in order to prevent the user from tampering with the values, the encrypted value includes a checksum that can detect if the data has been tampered with or corrupted.

Listing 1 shows my EncryptedQueryString class. By inheriting from Dictionary<string, string>, my class is a dictionary class. You can add any number of key/value items to the dictionary and then call ToString() to produce an encrypted string that contains all the values and a simple checksum. The string returned can then be passed to a page as a single query argument.

To restore the values, you can call the constructor that accepts an encrypted string. This constructor extracts the data from the encrypted string and adds it to the dictionary. Note that if this constructor finds an invalid or missing checksum, nothing is added to the dictionary. This prevents the calling code from working with questionable data.

using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Security.Cryptography;
using System.Text;
using System.Web;
public class EncryptedQueryString : Dictionary<string, string>
{
  // Change the following keys to ensure uniqueness
  // Must be 8 bytes
  protected byte[] _keyBytes =
    { 0x11, 0x12, 0x13, 0x14, 0x15, 0x16, 0x17, 0x18 };
  // Must be at least 8 characters
  protected string _keyString = "ABC12345";
  // Name for checksum value (unlikely to be used in user arguments)
  protected string _checksumKey = "__$$";
  /// <summary>
  /// Creates an empty dictionary
  /// </summary>
  public EncryptedQueryString()
  {
  }
  /// <summary>
  /// Creates a dictionary from the given, encrypted string
  /// </summary>
  /// <param name="encryptedData"></param>
  public EncryptedQueryString(string encryptedData)
  {
    // Descrypt string
    string data = Decrypt(encryptedData);
    // Parse out key/value pairs and add to dictionary
    string checksum = null;
    string[] args = data.Split('&');
    foreach (string arg in args)
    {
      int i = arg.IndexOf('=');
      if (i != -1)
      {
        string key = arg.Substring(0, i);
        string value = arg.Substring(i + 1);
        if (key == _checksumKey)
          checksum = value;
        else
          base.Add(HttpUtility.UrlDecode(key), HttpUtility.UrlDecode(value));
      }
    }
    // Clear contents if valid checksum not found
    if (checksum == null || checksum != ComputeChecksum())
      base.Clear();
  }
  /// <summary>
  /// Returns an encrypted string that contains the current dictionary
  /// </summary>
  /// <returns></returns>
  public override string ToString()
  {
    // Build query string from current contents
    StringBuilder content = new StringBuilder();
    foreach (string key in base.Keys)
    {
      if (content.Length > 0)
        content.Append('&');
      content.AppendFormat("{0}={1}",  HttpUtility.UrlEncode(key),
        HttpUtility.UrlEncode(base[key]));
    }
    // Add checksum
    if (content.Length > 0)
      content.Append('&');
    content.AppendFormat("{0}={1}", _checksumKey, ComputeChecksum());
    return Encrypt(content.ToString());
  }
  /// <summary>
  /// Returns a simple checksum for all keys and values in the collection
  /// </summary>
  /// <returns></returns>
  protected string ComputeChecksum()
  {
    int checksum = 0;
    foreach (KeyValuePair<string, string> pair in this)
    {
      checksum += pair.Key.Sum(c => c - '0');
      checksum += pair.Value.Sum(c => c - '0');
    }
    return checksum.ToString("X");
  }
  /// <summary>
  /// Encrypts the given text
  /// </summary>
  /// <param name="text">Text to be encrypted</param>
  /// <returns></returns>
  protected string Encrypt(string text)
  {
    try
    {
      byte[] keyData = Encoding.UTF8.GetBytes(_keyString.Substring(0, 8));
      DESCryptoServiceProvider des = new DESCryptoServiceProvider();
      byte[] textData = Encoding.UTF8.GetBytes(text);
      MemoryStream ms = new MemoryStream();
      CryptoStream cs = new CryptoStream(ms,
        des.CreateEncryptor(keyData, _keyBytes), CryptoStreamMode.Write);
      cs.Write(textData, 0, textData.Length);
      cs.FlushFinalBlock();
      return GetString(ms.ToArray());
    }
    catch (Exception)
    {
      return String.Empty;
    }
  }
  /// <summary>
  /// Decrypts the given encrypted text
  /// </summary>
  /// <param name="text">Text to be decrypted</param>
  /// <returns></returns>
  protected string Decrypt(string text)
  {
    try
    {
      byte[] keyData = Encoding.UTF8.GetBytes(_keyString.Substring(0, 8));
      DESCryptoServiceProvider des = new DESCryptoServiceProvider();
      byte[] textData = GetBytes(text);
      MemoryStream ms = new MemoryStream();
      CryptoStream cs = new CryptoStream(ms,
        des.CreateDecryptor(keyData, _keyBytes), CryptoStreamMode.Write);
      cs.Write(textData, 0, textData.Length);
      cs.FlushFinalBlock();
      return Encoding.UTF8.GetString(ms.ToArray());
    }
    catch (Exception)
    {
      return String.Empty;
    }
  }
  /// <summary>
  /// Converts a byte array to a string of hex characters
  /// </summary>
  /// <param name="data"></param>
  /// <returns></returns>
  protected string GetString(byte[] data)
  {
    StringBuilder results = new StringBuilder();
    foreach (byte b in data)
      results.Append(b.ToString("X2"));
    return results.ToString();
  }
  /// <summary>
  /// Converts a string of hex characters to a byte array
  /// </summary>
  /// <param name="data"></param>
  /// <returns></returns>
  protected byte[] GetBytes(string data)
  {
    // GetString() encodes the hex-numbers with two digits
    byte[] results = new byte[data.Length / 2];
    for (int i = 0; i < data.Length; i += 2)
      results[i / 2] = Convert.ToByte(data.Substring(i, 2), 16);
    return results;
  }
}

Listing 1: EncryptedQueryString class.

So, for example, a page that sends encrypted arguments to another page could contain code something like what is shown in Listing 2. This code constructs an empty EncryptedQueryString object, adds a couple of values to the dictionary, and then passes the resulting string as a single query argument to page.aspx.

protected void Button1_Click(object sender, EventArgs e)
{
  EncryptedQueryString args = new EncryptedQueryString();
  args["arg1"] = "val1";
  args["arg2"] = "val2";
  Response.Redirect(String.Format("page.aspx?args={0}", args.ToString()));
}

Listing 2: Code that passes encrypted query arguments.

Finally, Listing 3 shows code that could go in page.aspx to extract the encrypted values from the single argument.

protected void Page_Load(object sender, EventArgs e)
{
  EncryptedQueryString args =
    new EncryptedQueryString(Request.QueryString["args"]);
  Label1.Text = string.Format("arg1={0}, arg2={1}", args["arg1"], args["arg2"]);
}

Listing 3: Code to extract encrypted query arguments.

And that's all there is to it. Be sure to add error checking in case the dictionary objects are not there (either because they were not provided, or because an invalid checksum caused the EncryptedQueryString class to clear all items from the dictionary).

Also, be sure to customize the two keys near the top of Listing 1 so that people who read this article won't be able to decrypt your values.

Query arguments aren't always the best choice. As mentioned, you may choose to use Session variables or other techniques, depending on your requirements. But query arguments are straight forward and easy to implement. Using the class I've presented here, they can also be reasonably secure.

Tags:  
Categories:   C# .NET | ASP.NET
Actions:   E-mail | del.icio.us | Permalink | Comments (0) | Comment RSSRSS comment feed

Domains and the New .CO TLD

Thursday, April 01, 2010 11:38 AM by jonwood

Yesterday, I purchased my shortest domain yet: PADFILE.COM. It cost me $80.00, which seems kind of expensive but I have plans for it and there are domains selling for many, many times more than that.

As .COM domains become more and more scarce, it has me thinking about the industry that has come about as a result of domains and the fact that there are a limited number of them available. It doesn’t seem that long ago that I had no knowledge of domains. As someone who likes to build unique websites and make small business out of them, I now spend quite a bit of time searching, buying, and thinking about Internet domains.

Recently, I received a notice that a new top-level domain (TLD) is now available: domains of type .CO. This isn’t really a new domain; .CO has been used in the past to indicate a domain located in Columbia. But it is scheduled to become available worldwide, and registrars want you to instead understand .CO as an abbreviation for “Company,” “Corporation” and “Commerce” instead of “Columbia”.

I’m not sure this is a good move. The argument given is that we are running out of .COM domains names, which is true. But, if there is a domain XYZ.COM, what is the point of XYZ.CO. Well, many people would say both should be owned by the same company, XYZ. In that case, we don’t have a new domain available, we just have one more domain that the company needs to pay for. If, instead, you think a new company should be able to register XYZ.CO, then we really end up with results that are bound to confuse users.

Worse, one reason to own XYZ.CO is, if XYZ.COM is a popular website, people could easily mistype by leaving off the final letter and go to XYZ.CO instead. So cyber squatters will likely try and use .CO to gain traffic this way in addition to visits that could result from the confusion I described above. This issue is being addressed by allowing organizations with an existing global trademark to have first access to .CO domains exactly matching their trademarks. But, currently, .CO domains are running around $300/year instead of the $10/year charged for most domains.

It will be interesting to see how this plays out. The release of the new .CO domains has been delayed due, in part, to some controversy about the decision. I have no question .CO domains will be popular if it goes as planned—I’m sure I’ll be getting some of my own. But I really question the wisdom of this decision. I strongly suspect it is more about making more money than it is about addressing the fact that we are starting to run out of .COM domains.

Go Daddy has posted more information about this at http://www.godaddy.com/tlds/co-domain.aspx?ci=19152.

Tags:  
Categories:   General
Actions:   E-mail | del.icio.us | Permalink | Comments (0) | Comment RSSRSS comment feed