Monday, March 3, 2014

Entity Framework, repositories, unit of work


Recently we decided to make huge architectural changes in our core library, which started a discussion that led us to Entity Framework. Our core library will need a persistence mechanism, and one of the candidates is Entity Framework (EF). This blog post is about my evaluation of EF under our conditions.

Let me first describe our current architectural pattern with a concrete example. We would like to implement a new Audit mechanism, which updates the audit log on behalf of client applications.

With the current architecture, the implementation works as follows.
We have a simple POCO class, AuditEvent, which the client passes to an instance of AuditService; its Log method does the job. We have separated the business layer from the data access layer and use dependency injection (DI) to put the real instances together at runtime (the DAO implementation is injected by DI into the IAuditServiceDao parameter of the AuditService constructor). The DAO implementation is not provided here; our classic approach is to write simple ADO.NET data access code that saves the data to a database (handwritten inserts etc.).
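A minimal sketch of this arrangement could look like the following. The names AuditEvent, AuditService, Log and IAuditServiceDao come from the description above; the properties of AuditEvent, the DAO method name Save and the in-memory DAO are assumptions made only for illustration.

```csharp
using System;
using System.Collections.Generic;

// Simple POCO passed in by client applications.
// The properties are assumptions; the post does not list them.
public class AuditEvent
{
    public string UserName { get; set; }
    public string EventType { get; set; }
    public DateTime OccurredAt { get; set; }
}

// Data access contract. The method name Save is an assumption.
public interface IAuditServiceDao
{
    void Save(AuditEvent auditEvent);
}

// Business layer: receives its DAO through constructor injection.
public class AuditService
{
    private readonly IAuditServiceDao _dao;

    public AuditService(IAuditServiceDao dao)
    {
        if (dao == null) throw new ArgumentNullException("dao");
        _dao = dao;
    }

    public void Log(AuditEvent auditEvent)
    {
        if (auditEvent == null) throw new ArgumentNullException("auditEvent");
        _dao.Save(auditEvent);
    }
}

// Hypothetical stand-in for the ADO.NET DAO, handy in unit tests.
public class InMemoryAuditServiceDao : IAuditServiceDao
{
    public readonly List<AuditEvent> Saved = new List<AuditEvent>();

    public void Save(AuditEvent auditEvent)
    {
        Saved.Add(auditEvent);
    }
}
```

In production the DI container would supply the ADO.NET implementation of IAuditServiceDao instead of the in-memory one.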

My first thought was that it would be simple: just write a new DAO that uses EF instead of ADO.NET and change the DI configuration; everything else stays as is. Although this is of course possible, implementing EF (and OR mappers in general) this way wouldn't use its full potential. EF can do more for us than just free us from handwriting database inserts and other DML commands. But we need to change the architecture a little bit.

Our current ADO.NET DAO, which sits behind the IAuditServiceDao interface, is not transactional: every call in the interface opens a connection to the database, does its job and commits the changes. This could be something you can live with, or you can find it constraining. The interface could grow quickly, because every database access needs its own call in this interface (you can split the calls across more interfaces to feel better about this, but it can still be a lot of work to handwrite every call). Our ADO.NET code doesn't track whether the entities have changed; it saves them to the database again even if they are unchanged. Changes in the database structure are not tracked either (we write SQL change scripts by hand for every release). We use an Oracle database, which has no tooling support in Visual Studio (there was a Toad extension plugin that made database projects work for Oracle too, but it was discontinued).

On the other hand, with ADO.NET the developer has full control over the DML statements going to the database, which can sometimes be an advantage.

The issues listed above could be addressed by EF (or other OR mappers).
Entity Framework implements the unit of work pattern with its DbContext (or ObjectContext) class (this is the main class that represents the database). A unit of work encapsulates a business transaction, tracks all objects affected by it and coordinates the writing out of changes to the database [Ref-Fowler]. Tables from the database can be exposed using the DbSet<T> class, which implements the repository pattern. A repository is used for CRUD operations on a table (or entity). DbSet<T> also implements the IQueryable<T> interface, which will be useful later.
The simplest implementation of an EF DbContext is as follows:
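A minimal Code First sketch, assuming the EntityFramework NuGet package (EF 4.1 or later, where DbContext is available). The class name AuditContext and the properties of AuditEvent are assumptions; only the AuditEvents set is mentioned in this post.

```csharp
using System;
using System.Data.Entity; // from the EntityFramework NuGet package

// Business entity used directly by EF; no separate DAL object.
// The properties are assumptions for illustration.
public class AuditEvent
{
    public int Id { get; set; } // by Code First convention, "Id" becomes the primary key
    public string UserName { get; set; }
    public string EventType { get; set; }
    public DateTime OccurredAt { get; set; }
}

// The unit of work: tracks entities and writes changes on SaveChanges().
public class AuditContext : DbContext
{
    // The "repository" for audit events.
    public DbSet<AuditEvent> AuditEvents { get; set; }
}
```

With such a context, a new event would be stored by calling AuditEvents.Add(...) on a context instance and then SaveChanges().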


A couple of notes here:
  • Note that the AuditEvents repository directly returns an entity from the business layer. There is no mapping between the objects returned by the DAL and the business layer. This will be important later, when we want to use advanced querying through IQueryable.
  • DbContext and DbSet are concrete classes, not hidden behind any interface. This is by design in EF, unfortunately. As a result we cannot expose the DbContext directly to the business layer, because it is not mockable and testable. We need to hide it behind an interface.
  • This style of using EF is called the Code First approach. It fits well when you are on a greenfield project with no database, no business entities etc. There are two other EF strategies (Database First and Model First), where the approach is slightly different. I will write a short blog post about them in the future. The Code First approach adheres to the top-down design principle: you first design the business entities, and only after that do you start to care about their database representation (which EF takes care of for you).
As noted above, if we would like to have a testable implementation, we need to hide the DbContext from the business layer. This can be done using the following simple implementation:
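A sketch of such a wrapper, again assuming the EntityFramework NuGet package. IAuditUow, AuditUow, Add and SaveChanges are named in this post; the exact signatures, the AuditContext class and the AuditEvent properties are assumptions.

```csharp
using System;
using System.Data.Entity; // from the EntityFramework NuGet package
using System.Linq;

public class AuditEvent
{
    public int Id { get; set; }
    public string UserName { get; set; }
    public string EventType { get; set; }
    public DateTime OccurredAt { get; set; }
}

public class AuditContext : DbContext
{
    public DbSet<AuditEvent> AuditEvents { get; set; }
}

// The abstraction the business layer sees; no EF types leak through it.
public interface IAuditUow
{
    IQueryable<AuditEvent> AuditEvents { get; }
    void Add<T>(T entity) where T : class;
    void SaveChanges();
}

// Thin unit of work that only delegates to the wrapped DbContext.
public class AuditUow : IAuditUow
{
    private readonly AuditContext _context;

    public AuditUow(AuditContext context)
    {
        if (context == null) throw new ArgumentNullException("context");
        _context = context;
    }

    // DbSet<T> implements IQueryable<T>, so it can be returned as is.
    public IQueryable<AuditEvent> AuditEvents
    {
        get { return _context.AuditEvents; }
    }

    // One generic Add for every entity type known to the context.
    public void Add<T>(T entity) where T : class
    {
        _context.Set<T>().Add(entity);
    }

    public void SaveChanges()
    {
        _context.SaveChanges();
    }
}
```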

Actually this is an implementation of a new unit of work which only wraps the DbContext. For every business entity it returns a “repository” through the IQueryable<T> interface, and it has a generic implementation for adding new entities, which are stored in the DbContext. It also exposes a SaveChanges method for committing changes. We don't need other methods currently (Delete etc.); an update is done by changing the properties of business entities obtained from IQueryable (which are tracked by the DbContext) and calling SaveChanges to persist the change to the database.

We are returning an IQueryable for every business entity. Why is this good? There are a lot of discussions out there about whether it is good practice or not. IQueryable is a great interface for asking for data: you send expression trees, which are translated and executed inside an implementation of an IQueryable provider, and only the desired data comes back. This way the filtering is done at the level of the provider (the database), not on the client. Our AuditService, the consumer of AuditUow, can therefore compose intelligent queries using LINQ on top of the IQueryable properties effectively. We don't need to extend the IAuditUow interface with specific Get methods such as GetAuditEventsByLoggedUser or GetAuditEventsByType; using IQueryable we can query for whatever we want.

On the other hand, the interface is not exact. We don't know how the clients of IQueryable will use it, so if we change the implementation under IQueryable, we need to support all the queries that are possible through this interface. In fact, currently the only providers that “correctly” implement IQueryable are Entity Framework and LINQ to SQL. “Correctly” does not mean 100% here: it can happen, and it does happen, that a query executes correctly using LINQ to Objects, but the same query fails under LINQ to SQL or EF (due to the different implementations of the providers). This is why you need integration tests for the classes that use IQueryable. An integration test queries a real database and tests whether it can get the data back (it is not enough to query against LINQ to Objects).
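To make the idea of composing queries concrete, here is a small sketch that runs without EF. The helper class AuditQueries, its method ByUser and the AuditEvent properties are hypothetical. In the usage shown afterwards the query runs through LINQ to Objects via AsQueryable(), which is exactly the situation where a query can pass here and still fail against a real provider.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class AuditEvent
{
    public string UserName { get; set; }
    public string EventType { get; set; }
    public DateTime OccurredAt { get; set; }
}

// Hypothetical helper: composes a query on top of any IQueryable<AuditEvent>.
public static class AuditQueries
{
    public static IQueryable<AuditEvent> ByUser(IQueryable<AuditEvent> events, string userName)
    {
        // Nothing is executed here. Where and OrderByDescending only extend
        // the expression tree; the provider runs it when the result is enumerated.
        return events
            .Where(e => e.UserName == userName)
            .OrderByDescending(e => e.OccurredAt);
    }
}
```

For example, calling AuditQueries.ByUser(list.AsQueryable(), "alice").ToList() filters an in-memory list, while passing a DbSet<AuditEvent> lets EF translate the same expression into SQL. A predicate that calls an arbitrary C# method would work in the first case, but EF typically rejects it with a NotSupportedException because it cannot translate the call.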

In other words, if you use IQueryable you are in fact tightly coupled to Entity Framework (now, in 2014).
Mark Seemann has a nice post about this here: http://blog.ploeh.dk/2012/03/26/IQueryableTisTightCoupling/
 
Despite this, I decided to use IQueryable at the level of the unit of work (the data layer). I would not use IQueryable in the business layer (returning it from AuditService, for example). I think it gives me comfort at the level of the DAL with acceptable risks (I don't plan to quit EF).

The generic implementation of the Add method saves us from writing a separate method for adding a new entity to the context for every business object; using generics we can live with only one.

The SaveChanges method is part of the unit of work pattern: we let the client decide when the work is done by calling this method.

Great, we now have an implementation of a unit of work which hides EF from the outside world. The unit of work itself needs to be tested using integration tests, and it can be mocked when it is used from the business layer, so that the business layer can be tested in isolation.

The new implementation of the business layer looks like this:
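A sketch of what the service could look like when it depends on the unit of work abstraction. The signatures of IAuditUow, the query method GetEventsByUser and the in-memory FakeAuditUow are assumptions for illustration; the fake shows how the business layer can be tested without EF.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class AuditEvent
{
    public string UserName { get; set; }
    public string EventType { get; set; }
    public DateTime OccurredAt { get; set; }
}

public interface IAuditUow
{
    IQueryable<AuditEvent> AuditEvents { get; }
    void Add<T>(T entity) where T : class;
    void SaveChanges();
}

public class AuditService
{
    private readonly IAuditUow _uow;

    public AuditService(IAuditUow uow)
    {
        if (uow == null) throw new ArgumentNullException("uow");
        _uow = uow;
    }

    public void Log(AuditEvent auditEvent)
    {
        if (auditEvent == null) throw new ArgumentNullException("auditEvent");
        _uow.Add(auditEvent);
        _uow.SaveChanges(); // the service decides when the unit of work is done
    }

    // Hypothetical query method. It returns a materialized list so that
    // IQueryable does not leak out of the business layer.
    public IList<AuditEvent> GetEventsByUser(string userName)
    {
        return _uow.AuditEvents.Where(e => e.UserName == userName).ToList();
    }
}

// Hypothetical in-memory fake for unit tests of the business layer.
public class FakeAuditUow : IAuditUow
{
    private readonly List<AuditEvent> _events = new List<AuditEvent>();

    public int SaveChangesCalls { get; private set; }

    public IQueryable<AuditEvent> AuditEvents
    {
        get { return _events.AsQueryable(); }
    }

    public void Add<T>(T entity) where T : class
    {
        var auditEvent = entity as AuditEvent;
        if (auditEvent == null) throw new NotSupportedException("Only AuditEvent is supported by this fake.");
        _events.Add(auditEvent);
    }

    public void SaveChanges()
    {
        SaveChangesCalls++;
    }
}
```

Because the fake answers queries through LINQ to Objects, tests built on it verify only the business logic. Whether the same queries translate under EF still has to be checked by integration tests against a real database.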

Now another question arises: DbContext holds a connection to the database, so what about the lifecycle of this connection? More on this in the next blog post.
