2011-02-20

EmitMapper: customizing mapping rules

Data mapping is often needed when developing enterprise applications with rich functionality, especially with Domain-driven design (DDD).
Martin Fowler says:
Sometimes you need to set up communications between two subsystems that still need to stay ignorant of each other. This may be because you can't modify them or you can but you don't want to create dependencies between the two or even between them and the isolating element.

Objects, relational databases, data transfer objects (DTO), etc. often have different mechanisms for structuring data. Many parts of an object, such as collections and inheritance, aren't present in relational databases, DTOs. When you build an object model with a lot of business logic it's valuable to use these mechanisms to better organize the data and the behavior that goes with it. Doing so leads to variant schemas; that is, the object schema and the relational schema don't match up.
You still need to transfer data between the two schemas, and this data transfer becomes a complexity in its own right. If the in-memory objects know about the relational database structure, changes in one tend to ripple to the other.
The Data Mapper is a layer of software that separates the in-memory objects from the database or other service. Its responsibility is to transfer data between the two and also to isolate them from each other.

So when you face the challenge of mapping data between two subsystems, it becomes a bone in the throat! On the one hand, you can write the mapping by hand, use adapters, etc.; that is fast and reliable. On the other hand, it takes a lot of time to maintain and is not flexible enough.
There is a bunch of tools and solutions for data mapping. Automatic: AutoMapper, BLToolkit, EmitMapper. Semi-automatic: MapForce, Stylus Studio, BTS. Semi-automatic mappers are very advanced, but they are usually complex and not free.
I have been using automatic mappers, in particular EmitMapper, which is one of the fastest because it uses Reflection.Emit (unsurprisingly, given its name :-)). Its default configuration is not very flexible, but a number of customizations are available out of the box, such as creating objects through a custom constructor, mapping rule definitions, etc. The most interesting one is IMappingOperation, which allows you to customize the mapping process completely.
Ok, let's look at a problem I had recently. On one side of the system there is a container with data stored as a key-value collection (just like a table).
public class DataContainer
{
  public Dictionary<string, object> Fields;
}
On the other side there is a strongly-typed entity (actually it can be a tree of objects, but let's simplify the task for this post):
public class Entity
{
  public Guid Id { get; set; }
  public string Name { get; set; }
  public int Number { get; set; }
  public decimal Price { get; set; }
  public string UserName { get; set; }
  public DateTime Time { get; set; }
}
And of course a data container definition is available (a list of keys and field types). Usually, if you are using an ORM, there are techniques for setting up mapping rules: XML, attributes, etc.
But if you are not using an ORM (either the object does not belong to the data-access layer or an ORM cannot be used), you have to find a way to formalize the mapping rules. The most obvious way for me was to use custom attributes, just like in some ORMs or in data serialization for services:
[AttributeUsage(AttributeTargets.Property | AttributeTargets.Field, AllowMultiple = true, Inherited = true)]
public class DataMemberAttribute : Attribute
{
  public virtual string FieldName { get; set; }
  public virtual Type FieldType { get; set; }
}
Ok, let's formalize the rules for our example:
public class Entity
{
  [DataMember(FieldName = "order_id", FieldType = typeof(string))]
  public Guid Id { get; set; }

  [DataMember(FieldName = "order_name")]
  public string Name { get; set; }

  [DataMember(FieldName = "order_number", FieldType = typeof(double))]
  [DataMember(FieldName = "order_number_2", FieldType = typeof(double))]
  public int Number { get; set; }

  [DataMember(FieldName = "order_price")]
  public decimal Price { get; set; }

  [DataMember]
  public string UserName { get; set; }

  [DataMember(FieldType = typeof(string))]
  public DateTime Time { get; set; }
}
This means the data container has these columns: order_id, order_name, order_number, order_number_2, order_price, UserName, Time. It is a simple example, but real situations can look just like this. If the field name or type is not set, the property name and type are used.
The task might be clearer after this test example:
[Test]
public void EntityToDataContainerMappingTest()
{
  var entity = new Entity
  {
    Id = Guid.NewGuid(),
    Name = "Entity Name",
    Number = 134567,
    Price = 100.500m,
    UserName = string.Empty,
    Time = DateTime.Now
  };

  var container = Mapper.Map<Entity, DataContainer>(entity);

  Assert.AreEqual(entity.Id.ToString(), container.Fields["order_id"]);
  Assert.AreEqual(entity.Number, container.Fields["order_number"]);
  Assert.AreEqual(entity.Number, container.Fields["order_number_2"]);
  Assert.AreEqual(entity.Name, container.Fields["order_name"]);
  Assert.AreEqual(entity.Price, container.Fields["order_price"]);
  Assert.AreEqual(entity.UserName, container.Fields["UserName"]);
}

The class diagram of the solution I want to create:
The mapper tool is wrapped in a static facade (the Mapper class) to get a convenient interface (and to be protected against variation, such as a change of tool). The facade provides both the mapping API and the mapping customizations, which implement EmitMapper's IMappingConfigurator interface.
The static facade holds an instance of the mapping core (the MapperCore class), which contains the mappers (EmitMapper's ObjectsMapper classes) and the configuration instances for EmitMapper. The lowest layer here is the domain model mapping configurations, which allow us to map a data container to an entity and back (the DataContainerToObjectConfigurator and ObjectToDataContainerConfigurator classes).

Ok, where should we start? First, we need logic to extract those field descriptions; it goes into the ReflectionUtils class. Ok, let's go:
public static ConcurrentDictionary<string, Tuple<MemberInfo, Type>> GetTypeDataContainerDescription(Type type)
{
  Assert.ArgumentNotNull(type, "type");

  var fieldsDescription = from member in EmitMapper.Utils.ReflectionUtils.GetPublicFieldsAndProperties(type)
      let membersDefinitions = GetDataMemberDefinition(member)
      from fieldDescription in membersDefinitions
      let fieldName = fieldDescription.Item1
      let fieldType = fieldDescription.Item2
      select new KeyValuePair<string, Tuple<MemberInfo, Type>>(fieldName, Tuple.Create(member, fieldType));

  return new ConcurrentDictionary<string, Tuple<MemberInfo, Type>>(fieldsDescription);
}

public static IEnumerable<Tuple<string, Type>> GetDataMemberDefinition(MemberInfo memberInfo)
{
  Assert.ArgumentNotNull(memberInfo, "memberInfo");

  return from attribute in Attribute.GetCustomAttributes(memberInfo, typeof(DataMemberAttribute), true).Cast<DataMemberAttribute>()
      let fieldName = string.IsNullOrEmpty(attribute.FieldName) ? memberInfo.Name : attribute.FieldName
      let memberType = EmitMapper.Utils.ReflectionUtils.GetMemberType(memberInfo)
      let fieldType = attribute.FieldType ?? memberType
      select Tuple.Create(fieldName, fieldType);
}

This code looks messy, but it can always be refactored :-) The purpose here is to get the container schema for a type.
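To see what such a schema looks like, here is a minimal self-contained sketch. It is not the code above: it uses plain System.Reflection instead of EmitMapper's ReflectionUtils so that it runs on its own, it looks at properties only, and the Order class and SchemaSketch helper are hypothetical names introduced for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Reflection;

[AttributeUsage(AttributeTargets.Property | AttributeTargets.Field, AllowMultiple = true, Inherited = true)]
public class DataMemberAttribute : Attribute
{
  public string FieldName { get; set; }
  public Type FieldType { get; set; }
}

// Hypothetical, trimmed-down stand-in for the Entity class from the post.
public class Order
{
  [DataMember(FieldName = "order_id", FieldType = typeof(string))]
  public Guid Id { get; set; }

  [DataMember(FieldName = "order_number", FieldType = typeof(double))]
  [DataMember(FieldName = "order_number_2", FieldType = typeof(double))]
  public int Number { get; set; }

  [DataMember]
  public string UserName { get; set; }
}

public static class SchemaSketch
{
  // Returns "container field name -> container field type" for the given type.
  public static Dictionary<string, Type> Describe(Type type)
  {
    var schema = new Dictionary<string, Type>();
    foreach (PropertyInfo property in type.GetProperties(BindingFlags.Public | BindingFlags.Instance))
    {
      var attributes = Attribute.GetCustomAttributes(property, typeof(DataMemberAttribute), true)
                                .Cast<DataMemberAttribute>();
      foreach (var attribute in attributes)
      {
        // Fall back to the property's own name and type when the attribute leaves them unset.
        var fieldName = string.IsNullOrEmpty(attribute.FieldName) ? property.Name : attribute.FieldName;
        schema[fieldName] = attribute.FieldType ?? property.PropertyType;
      }
    }
    return schema;
  }
}
```

Calling SchemaSketch.Describe(typeof(Order)) yields four entries: order_id (string), order_number and order_number_2 (double), and UserName (string, taken from the property itself).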
Let's create the mapper core class. It would be nice to be able to map objects, configure a mapper and cache mappers.
public class MapperCore
{
  private static readonly IMappingConfigurator DefaultConfigurator;
  private static readonly ConcurrentBag<object> Mappers;
  private static readonly ConcurrentBag<Tuple<Type, Type, IMappingConfigurator>> MappingConfigurations;

  static MapperCore()
  {
    DefaultConfigurator = new DefaultMapConfig();
    Mappers = new ConcurrentBag<object>();
    MappingConfigurations = new ConcurrentBag<Tuple<Type, Type, IMappingConfigurator>>();
  }

  public virtual Tuple<Type, Type, IMappingConfigurator>[] Configurations
  {
    get { return MappingConfigurations.ToArray(); }
  }

  public void Initialize(IMapperInitializator mapperInitializator)
  {
    mapperInitializator.ConfigureMapper(this);
  }

  public virtual void AddConfiguration<TFrom, TTo>(IMappingConfigurator configurator)
  {
    Assert.IsNotNull(configurator, "configurator");

    MappingConfigurations.Add(new Tuple<Type, Type, IMappingConfigurator>(typeof(TFrom), typeof(TTo), configurator));
  }

  public virtual TTo Map<TFrom, TTo>(TFrom @from)
  {
    Assert.ArgumentNotNull(@from, "@from");

    var mapper = GetMapper<TFrom, TTo>();
    return mapper.Map(@from);
  }

  public virtual TTo Map<TFrom, TTo>(TFrom @from, TTo @to)
  {
    Assert.ArgumentNotNull(@from, "@from");
    Assert.ArgumentNotNull(@to, "@to");

    var mapper = GetMapper<TFrom, TTo>();
    return mapper.Map(@from, @to);
  }

  public virtual IEnumerable<TTo> MapCollection<TFrom, TTo>(IEnumerable<TFrom> @from)
  {
    Assert.ArgumentNotNullOrEmpty(@from, "@from");

    var mapper = GetMapper<TFrom, TTo>();
    return mapper.MapEnum(@from);
  }

  protected virtual ObjectsMapper<TFrom, TTo> GetMapper<TFrom, TTo>()
  {
    var mapper = Mappers.FirstOrDefault(m => m is ObjectsMapper<TFrom, TTo>) as ObjectsMapper<TFrom, TTo>;

    if (mapper == null)
    {
      var configuration = MappingConfigurations.Where(mp => mp.Item1.IsAssignableFrom(typeof(TFrom)) && mp.Item2.IsAssignableFrom(typeof(TTo))).FirstOrDefault();
      var config = configuration == null ? DefaultConfigurator : configuration.Item3;

      mapper = ObjectMapperManager.DefaultInstance.GetMapper<TFrom, TTo>(config);

      Mappers.Add(mapper);
    }

    return mapper;
  }
}
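GetMapper above scans the ConcurrentBag linearly on every call. One possible alternative, sketched here rather than taken from the original sources, is to key the cache by the type pair. In this sketch a plain Func<TFrom, TTo> delegate stands in for EmitMapper's ObjectsMapper<TFrom, TTo>, and MapperCache is a hypothetical name.

```csharp
using System;
using System.Collections.Concurrent;

public class MapperCache
{
  // Key: (source type, destination type). Value: the cached mapper, stored as object.
  private readonly ConcurrentDictionary<Tuple<Type, Type>, object> mappers =
    new ConcurrentDictionary<Tuple<Type, Type>, object>();

  public Func<TFrom, TTo> GetOrCreate<TFrom, TTo>(Func<Func<TFrom, TTo>> factory)
  {
    var key = Tuple.Create(typeof(TFrom), typeof(TTo));

    // Under contention the factory may run more than once, but only one result is kept.
    return (Func<TFrom, TTo>)mappers.GetOrAdd(key, _ => factory());
  }
}
```

The lookup becomes a single dictionary access, and the exact type pair is matched instead of the first mapper that happens to pass the `is` check.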
Ok, the service code has been written, and now we can move on to the most magical part: the configurations for mapping data containers to strongly-typed entities and vice versa. Two custom EmitMapper configurators were written for this purpose. The idea is simple: read the field descriptions from the data member attributes and convert the values while walking through the entity's members:
public class DataContainerToObjectConfigurator : MapConfigBase<DataContainerToObjectConfigurator>
{
  public override IMappingOperation[] GetMappingOperations(Type from, Type to)
  {
    return this.FilterOperations(from, to, ReflectionUtils.GetTypeDataContainerDescription(to)
      .Select(fieldsDescription =>
      {
        var fieldName = fieldsDescription.Key;
        var destinationMember = fieldsDescription.Value.Item1;
        var fieldType = fieldsDescription.Value.Item2;
        return new DestWriteOperation
        {
          Destination = new MemberDescriptor(destinationMember),
          Getter = (ValueGetter<object>)((item, state) =>
          {
            if (item == null || !(item is DataContainer))
            {
              return ValueToWrite<object>.Skip();
            }
            var container = item as DataContainer;

            var destinationType = EmitMapper.Utils.ReflectionUtils.GetMemberType(destinationMember);
            var destinationMemberValue = ReflectionUtils.ConvertValue(container.Fields[fieldName], fieldType, destinationType);

            return ValueToWrite<object>.ReturnValue(destinationMemberValue);
          })
        };
      })).ToArray();
  }
}

public class ObjectToDataContainerConfigurator : MapConfigBase<ObjectToDataContainerConfigurator>
{
  public ObjectToDataContainerConfigurator()
  {
    ConstructBy(() => new DataContainer { Fields = new Dictionary<string, object>() });
  }

  public override IMappingOperation[] GetMappingOperations(Type from, Type to)
  {
    return this.FilterOperations(from, to, ReflectionUtils.GetTypeDataContainerDescription(from)
                .Select(fieldsDescription =>
                {
                  var fieldName = fieldsDescription.Key;
                  var sourceMember = fieldsDescription.Value.Item1;
                  var fieldType = fieldsDescription.Value.Item2;
                  return new SrcReadOperation
                    {
                      Source = new MemberDescriptor(sourceMember),
                      Setter = (destination, value, state) =>
                      {
                        if (destination == null || value == null || !(destination is DataContainer))
                        {
                          return;
                        }

                        var container = destination as DataContainer;

                        var sourceType = EmitMapper.Utils.ReflectionUtils.GetMemberType(sourceMember);
                        var destinationMemberValue = ReflectionUtils.ConvertValue(value, sourceType, fieldType);

                        container.Fields.Add(fieldName, destinationMemberValue);
                      }
                    };
                })).ToArray();
  }
}
EmitMapper's utilities provide a fluent API for extracting and saving data without using the System.Reflection API directly. But it is not that simple: we still need a way to convert types for the data container, because an entity's property type may differ from the data entry's type.
The .NET Framework provides several features that support type conversion. These include the following:
1. The Implicit operator, which defines the available widening conversions between types.
2. The Explicit operator, which defines the available narrowing conversions between types.
3. The IConvertible interface, which defines conversions to each of the base .NET Framework data types.
4. The Convert class, which provides a set of methods that implement the methods in the IConvertible interface.
5. The TypeConverter class, which is a base class that can be extended to support the conversion of a specified type to any other type.

To create a generic type converter we should consider these type conversion stages. The code below first checks whether the types are the same or assignable; if so, the value can be assigned directly. If the source value implements the IConvertible interface, it makes sense to use it, because it does not use reflection and so works pretty fast; for example, an Int32 can be converted to a Double this way. But if the conversion fails, we cache the pair in a list of non-convertible types to avoid the exception in the future. For example, converting a String to a Guid this way fails with an exception (which causes performance issues).
The last echelon is TypeConverter, which handles pairs such as String and Guid. Here we also need to cache the converters to reduce execution time.
private static readonly ConcurrentDictionary<Type, TypeConverter> TypeConverters = new ConcurrentDictionary<Type, TypeConverter>();
private static readonly List<Tuple<Type, Type>> TypesNotIConvertible = new List<Tuple<Type, Type>>();

public static object ConvertValue(object sourceValue, Type sourceType, Type destinationType)
{
  if (sourceValue == null || sourceType == null || destinationType == null)
  {
    return null;
  }

  if (sourceType == destinationType || destinationType.IsAssignableFrom(sourceType))
  {
    return sourceValue;
  }

  if (sourceValue is IConvertible && !TypesNotIConvertible.Contains(Tuple.Create(sourceType, destinationType)))
  {
    try
    {
      return Convert.ChangeType(sourceValue, destinationType);
    }
    catch
    {
      TypesNotIConvertible.Add(Tuple.Create(sourceType, destinationType));
    }      
  }

  object result = null;

  var typeConverter = TypeConverters.GetOrAdd(destinationType, TypeDescriptor.GetConverter);
  if (typeConverter != null && typeConverter.CanConvertFrom(sourceType))
  {
    result = typeConverter.ConvertFrom(sourceValue);
  }
  else
  {
    typeConverter = TypeConverters.GetOrAdd(sourceType, TypeDescriptor.GetConverter);
    if (typeConverter != null && typeConverter.CanConvertTo(destinationType))
    {
      result = typeConverter.ConvertTo(sourceValue, destinationType);
    }
  }

  return result;
}
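The two fallback stages can be seen in isolation in the following sketch. It relies only on standard .NET behaviour: Convert.ChangeType handles Int32 to Double, throws InvalidCastException for String to Guid, and the converter returned by TypeDescriptor.GetConverter covers the String/Guid pair in both directions. ConversionStagesSketch and its method names are hypothetical, and the invariant culture is an assumption made to keep the result predictable.

```csharp
using System;
using System.ComponentModel;
using System.Globalization;

public static class ConversionStagesSketch
{
  // IConvertible stage: returns false when the value or the type pair is not supported.
  public static bool TryChangeType(object value, Type destinationType, out object result)
  {
    result = null;
    if (!(value is IConvertible))
    {
      return false;
    }

    try
    {
      result = Convert.ChangeType(value, destinationType, CultureInfo.InvariantCulture);
      return true;
    }
    catch (InvalidCastException)
    {
      return false;
    }
  }

  // TypeConverter stage: try the destination type's converter first, then the source type's.
  public static object ConvertWithTypeConverter(object value, Type destinationType)
  {
    var converter = TypeDescriptor.GetConverter(destinationType);
    if (converter.CanConvertFrom(value.GetType()))
    {
      return converter.ConvertFrom(value);
    }

    converter = TypeDescriptor.GetConverter(value.GetType());
    if (converter.CanConvertTo(destinationType))
    {
      return converter.ConvertTo(value, destinationType);
    }

    return null;
  }
}
```

With these helpers, TryChangeType(42, typeof(double), ...) succeeds, TryChangeType on a GUID string with typeof(Guid) returns false, and ConvertWithTypeConverter turns that same string into a Guid.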
Ok, the mapping seems to work, but how does such a mapper perform?
Let's create a test to check it! We will play big and map 1 million objects:
[TestCase(1000000)]
public void EntityToDataContainerMappingCollectionTest(int capacity)
{
  var entities = Enumerable.Range(0, capacity).Select(i => new Entity
  {
    Id = Guid.NewGuid(),
    Number = i,
    Name = "Name_" + i,
    UserName = "UserName_" + i,
    Price = (decimal)Math.Sqrt(i),
    Time = DateTime.Now
  }).ToArray();

  Mapper.DataMapper.Initialize(new DomainMappingInitializator());

  // Cold mapping
  Mapper.MapCollection<Entity, DataContainer>(new List<Entity> { new Entity() });

  // Let the garbage collector do its work before the timed region starts
  GC.Collect();

  var stopWatch = new Stopwatch();
  stopWatch.Start();

  var containers = Mapper.MapCollection<Entity, DataContainer>(entities).ToArray();

  stopWatch.Stop();
  Console.WriteLine(string.Format("Mapping of the collection with {0} elements took: {1} ms. Approx. mapping time per object: {2} ms.", capacity, stopWatch.ElapsedMilliseconds, ((float)stopWatch.ElapsedMilliseconds) / capacity));
}
For correct results we need a cold run first, because the emitted code has to be compiled, and we need to let the garbage collector do its work before the measurement starts.
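The same measuring routine can be factored into a small helper. This is only a sketch under the assumptions stated in the comments; BenchmarkSketch is a hypothetical name and not part of the original sources.

```csharp
using System;
using System.Diagnostics;

public static class BenchmarkSketch
{
  // Runs the action once untimed, forces a collection, then times `iterations` runs.
  public static TimeSpan Measure(Action action, int iterations)
  {
    // Cold run: JIT compilation and code emission happen outside the timed region.
    action();

    // Clean up garbage left by the cold run before the stopwatch starts.
    GC.Collect();
    GC.WaitForPendingFinalizers();
    GC.Collect();

    var stopwatch = Stopwatch.StartNew();
    for (var i = 0; i < iterations; i++)
    {
      action();
    }
    stopwatch.Stop();

    return stopwatch.Elapsed;
  }
}
```

A call such as BenchmarkSketch.Measure(() => Mapper.MapCollection<Entity, DataContainer>(entities).ToArray(), 1) would then reproduce the measurement from the test above.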

The results: approx. 6.2 seconds for 1 million entities and containers with 7 fields of various types. For comparison, direct handwritten mapping of the same collection takes 5.5 seconds. Yep, I believe the results are very good and the solution is pretty scalable! Of course, I had to spend some time with dotTrace to optimize the performance, but it was worth it!
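The handwritten baseline itself is not shown in this post. As a rough illustration of what such a direct mapping might look like for the Entity and DataContainer classes above, here is a sketch; HandwrittenMapper is a hypothetical name, and formatting Time with the invariant culture is an assumption.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

public class DataContainer
{
  public Dictionary<string, object> Fields;
}

public class Entity
{
  public Guid Id { get; set; }
  public string Name { get; set; }
  public int Number { get; set; }
  public decimal Price { get; set; }
  public string UserName { get; set; }
  public DateTime Time { get; set; }
}

public static class HandwrittenMapper
{
  // Every field name and type conversion is spelled out by hand.
  public static DataContainer ToContainer(Entity entity)
  {
    return new DataContainer
    {
      Fields = new Dictionary<string, object>
      {
        { "order_id", entity.Id.ToString() },
        { "order_name", entity.Name },
        { "order_number", (double)entity.Number },
        { "order_number_2", (double)entity.Number },
        { "order_price", entity.Price },
        { "UserName", entity.UserName },
        // Assumption: invariant culture for the string form of the timestamp.
        { "Time", entity.Time.ToString(CultureInfo.InvariantCulture) }
      }
    };
  }
}
```

Code like this is fast, but every new field or type change has to be repeated by hand in each direction, which is the maintenance cost the attribute-driven configurators avoid.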

All sources of described EmitMapper customizations are available on GitHub.

10 comments:

  1. Very nice piece of code, but how would you approach mapping nested objects (e.g. a property of some other type, say Entity2, inside your example Entity class) from a dictionary?
