Todd's Tech Nonsense: NHibernate Performance (Ansi String vs. Unicode String)

I recently got pulled into an application that was encountering serious performance degradation on a set of queries run via NHibernate. After a lot of Googling and not finding what I would consider a decent write up I decided to post this. Turns out as we did our research we discovered that SQL Server was doing index scans on primary key lookups as opposed to the more efficient index seeks. When this occurs repeatedly over a high volume of data there can be a significant performance impact.

For more details on the difference between an index seek vs. index scan please see this article.

There is a very subtle difference in the NHibernate mapping configuration that causes this to happen when the database column is of type VARCHAR or CHAR and the NHibernate mapping type is represented as a string. This is more concerning when using Fluent NHibernate as the default reflected value on any .Net string will map to a NHibernate type of string.

For example take the following table design and mapping files:

Mapping via HBM:

<?xml version="1.0" encoding="utf-8" ?>
<hibernate-mapping xmlns="urn:nhibernate-mapping-2.2">
  <class name="ConsoleApplication2.Client, ConsoleApplication2" table="Client">
    <id name="AccountNumber" type="String" length="6" column="Account">
      <generator class="assigned" />
    </id>
    <property name="Name" type="String" length="50"/>
  </class>
</hibernate-mapping>

Mapping via Fluent NHibernate:

public class ClientMap : ClassMap<Client>

{
    public ClientMap()
    {
        Table("Todd..Client");
        Id(x => x.AccountNumber).Column("Account").GeneratedBy.Assigned();
        Map(x => x.Name);
    }        
}

A lookup on the Account column will result in a sql/plan as follows:

exec sp_executesql N'SELECT client0_.Account as Account0_0_, client0_.Name as Name0_0_ FROM Todd..Client client0_ WHERE client0_.Account=@p0',N'@p0 nvarchar(6)',@p0=N'222222'

The problem here is that @p0 is passed as NVARCHAR(6) (aka a Unicode string) which does not match the index of the column with is non-Unicode. This cases the index scan.

The solution is to update your mappings to use AnsiString. If you are using HBM files this means:

type="AnsiString"

and if you are using Fluent you would add:

.CustomType("AnsiString")

This changes the sql/plan to be as follows:

exec sp_executesql N'SELECT client0_.Account as Account0_0_, client0_.Name as Name0_0_ FROM Todd..Client client0_ WHERE client0_.Account=@p0','@p0 varchar(6)',@p0='222222'

Summary: If you have non-Unicode columns in your database that you do frequent index lookups on be sure they are mapped in NHibernate to type AnsiString.

4 comments:

UnknownNovember 10, 2010 at 7:17 AM
Great catch! This will definitely help us in a lot of areas.
Mark PerryApril 11, 2011 at 4:03 AM
Thanks we missed this as well.
AnonymousSeptember 28, 2013 at 11:39 AM
Still useful 3 years later. Curious how you got NHibernate to cast as varchar(6). No matter what I've tried I always get a varchar(8000). I don't know if it makes a difference for performance or not.
UnknownApril 8, 2022 at 3:58 AM
Where can I check the sql query generated by this fix

Tuesday, November 9, 2010

NHibernate Performance (Ansi String vs. Unicode String)

4 comments: