Entity Framework Core – Creating Schemas With the FluentAPI


What’s the difference between storing data in Excel and SQL Server? Schema.

A good schema helps maintain the integrity of your data and improves the performance of your application.  Entity Framework makes it so easy to create a new database that it’s easy to forget what’s going on under the hood.  I’ll present some techniques to make EF create databases with indexes, foreign keys, constraints, and inheritance using the Fluent API.

Entity Framework has three ways to create a database schema.  The default is “convention”: built-in rules based on property names and entity composition determine the schema.  It’s the method most people use to get started, but it gives you little control over the result.  Another way to configure the schema is to use annotations on the entity model to dictate how the schema is created.  The last, and most powerful, is the Fluent API.  Configuration made with the Fluent API overrides both annotations and conventions. Below is a common data model that I’ll use in my examples.
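
Here is a minimal sketch of the entity classes. The property and navigation names (PartyID, DisplayName, Addresses, Party) are the ones the Fluent API calls below rely on; the remaining properties are placeholders for illustration.

using System.Collections.Generic;

// A minimal sketch of the entities used in the examples. Only the members
// referenced by the Fluent API configuration below are required; the rest
// are placeholders.
public class Party
{
    public int PartyID { get; set; }
    public string DisplayName { get; set; }
    public ICollection<Address> Addresses { get; set; }
}

public class Person : Party
{
    public string FirstName { get; set; }   // placeholder
    public string LastName { get; set; }    // placeholder
}

public class Organization : Party
{
    public string TaxID { get; set; }       // placeholder
}

public class Address
{
    public int AddressID { get; set; }
    public string Street { get; set; }      // placeholder
    public string City { get; set; }        // placeholder
    public int PartyID { get; set; }        // foreign key back to Party
    public Party Party { get; set; }        // navigation property
}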

Identities and Constraints

EF will make the best choice it can when configuring the schema.  For example, if the primary key is an integer, it will make the column an identity column by default.  For each row that is added, the PK is set to the next value in the identity sequence.  If you want to be explicit about this option, you can do so by overriding the OnModelCreating method of the DbContext.

public class CRMContext : DbContext
{
    public DbSet<Address> Addresses { get; set; }
    public DbSet<Party> Party { get; set; }
    public DbSet<Person> People { get; set; }
    public DbSet<Organization> Organizations { get; set; }

    protected override void OnModelCreating(ModelBuilder modelBuilder)
    {
        modelBuilder.Entity<Party>()
            .Property(p => p.PartyID)
            .UseSqlServerIdentityColumn();
    }
}

Another convention that EF uses is to create NVARCHAR(MAX) columns to store string data.  It’s a safe guess, because it’s flexible in the absence of knowledge about what kind of data will be stored.  If you know that a column will store the names of people, you can specify the maximum length of the column and improve performance.

modelBuilder.Entity<Party>()
    .Property(p => p.DisplayName)
    .HasMaxLength(128);

Foreign Keys

In this example, a person or organization inherits from a party.  Each has a collection of addresses that can be associated with it.  This relationship is defined by creating a foreign key property in the database schema.  Using the Fluent API, we begin by declaring the navigation properties of the relationship.  This can be defined as “HasOne” or “HasMany”, depending on the cardinality of the relationship.  The inverse navigation can be chained to the definition by using “WithOne” or “WithMany”.

 

protected override void OnModelCreating(ModelBuilder modelBuilder)
{
    modelBuilder.Entity<Party>()
        .HasMany(p => p.Addresses)
        .WithOne(p => p.Party)
        .HasForeignKey(p => p.PartyID)
        .IsRequired()
        .OnDelete(DeleteBehavior.Cascade);
}

After establishing the relationship and its cardinality, we need to tell Entity Framework what the foreign key field is.  This is done using HasForeignKey().  It doesn’t make sense to have an address that is not attached to a person or organization, so the foreign key should be required.  This is done with the IsRequired() method.  Because an address is tied to a party, when a party is deleted we want the related addresses to be deleted as well.  Cascading deletes in SQL Server can handle this for us, so we apply that behavior with the OnDelete() method.

Indexes

The next step is to create some indexes on the entities to improve performance.  The HasIndex method is used to create an index based on one or more columns.  By default, an index does not require the indexed values to be unique; that can be changed by using IsUnique().

protected override void OnModelCreating(ModelBuilder modelBuilder)
{
    modelBuilder.Entity<Party>()
        .HasIndex(p => new { p.PartyID, p.DisplayName })
        .IsUnique();
}

Inheritance

Inheritance in a database can take on two distinct flavors.  The table-per-hierarchy (TPH) pattern uses a single table to store the data for all the types in the hierarchy.  The table-per-type (TPT) pattern creates a table for each type in the hierarchy.  Entity Framework currently only supports TPH.  Because data for multiple types will be stored in the same table, an additional piece of information is needed to identify a row’s type.  In EF, this column is called the discriminator.  It can be added to the schema with the HasDiscriminator method.  The values that can be placed in the column are defined with the HasValue method.

protected override void OnModelCreating(ModelBuilder modelBuilder)
{
    modelBuilder.Entity<Party>()
        .HasDiscriminator<string>("PartyType")
        .HasValue<Person>("Person")
        .HasValue<Organization>("Organization");
}

Because EF is using TPH, the Person DbSet and the Organization DbSet are merged into a single Party table. This does limit our ability to use SQL constraints to enforce data integrity. For example, if we need to record the schools that a person has attended, there is no foreign key constraint that can limit this to people. To prevent an organization from attending a school, we would need to create an insert and update trigger on the School table. TPT modeling would let us avoid the trigger. At this point EF doesn’t support TPT modeling, but it is on Microsoft’s roadmap.  Another thing to note is that the extensions needed to set the discriminator are in the Microsoft.EntityFrameworkCore.Relational package on NuGet.
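
To make the limitation concrete, here is a sketch of a hypothetical SchoolAttendance entity (the entity and its properties are made up for illustration). Its foreign key can only point at the shared Party table, so the schema alone cannot restrict it to people.

// Hypothetical entity, shown only to illustrate the TPH limitation above.
public class SchoolAttendance
{
    public int SchoolAttendanceID { get; set; }
    public string SchoolName { get; set; }

    // Under TPH, Person rows live in the Party table, so this FK can only
    // reference Party. Nothing at the schema level prevents it from pointing
    // at an Organization row; enforcing that rule requires a trigger or an
    // application-level check.
    public int PartyID { get; set; }
    public Party Party { get; set; }
}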

What do you do when you run DBCC CHECKDB and you find corruption?

Eventually, data in a SQL Server database will become corrupted. Thankfully, there are several mechanisms in SQL Server to protect against corruption and to warn you when there is a problem. Database corruption is one of the most difficult things to learn about and to train for, because most of us don’t have a set of corrupted databases that we can practice repairing. Learning how to repair them in the heat of the moment is stressful, and poor decisions are often made. With that in mind, I’ve created some tutorials along with corrupted databases for you to practice on. Download Northwnd.mdf and Northwnd.ldf, attach them to a SQL Server instance, and run DBCC CHECKDB on the database. You will see error messages that look like this:
DBCC results for ‘NORTHWND’.
Msg 8939, Level 16, State 98, Line 1
Table error: Object ID 0, index ID -1, partition ID 0, alloc unit ID 1056009275703296 (type Unknown), page (32993:-2015636120). Test (IS_OFF (BUF_IOERR, pBUF->bstat)) failed. Values are 133129 and -1.

Msg 8928, Level 16, State 1, Line 1
Object ID 2137058649, index ID 3, partition ID 984479205752832, alloc unit ID 984479205752832 (type In-row data): Page (1:295) could not be processed. See other errors for details.

Msg 8980, Level 16, State 1, Line 1
Table error: Object ID 2137058649, index ID 3, partition ID 984479205752832, alloc unit ID 984479205752832 (type In-row data). Index node page (0:0), slot 0 refers to child page (1:295) and previous child (0:0), but they were not encountered.

CHECKDB found 0 allocation errors and 2 consistency errors in table ‘Suppliers’ (object ID 2137058649).
CHECKDB found 0 allocation errors and 3 consistency errors in database ‘NORTHWND’.
repair_allow_data_loss is the minimum repair level for the errors found by DBCC CHECKDB (NORTHWND).

At first glance, these look like horrible error messages. You might be tempted to recover the database from a backup, or to run DBCC CHECKDB(NORTHWND, REPAIR_REBUILD). Neither is necessary in this case. The key to solving database corruption problems is understanding what corruption is, and what the error messages are telling you.

What is corruption?
Database corruption is simply a sequence of ones and zeros that got written to a datafile where they should not have been. Sometimes this happens when a hard disk fails to physically perform as it should, a device driver encounters a bug, or some part of the IO subsystem faults. It doesn’t happen often, but when it does, the result can be a huge headache.

The Error Messages:
Let’s look at each of the error messages, and figure out what they mean.
Msg 8939 & 8928 – The job of physically reading the data is done by the buffer pool. When a page of data is read by the buffer pool, the page’s checksum is verified. If it doesn’t match the value computed from the data, an error is thrown. CHECKDB is telling us there is corruption, and to look at the other error messages to see how to deal with it.
Msg 8980 – This message is telling us that there is an error in object ID 2137058649. I can take a look at sys.indexes to see what that object is.

SELECT object_id, name, index_id, type, type_desc
FROM sys.indexes
WHERE object_id = 2137058649;

object_id   name          index_id  type  type_desc
----------- ------------- --------- ----- -------------
2137058649  PK_Suppliers  1         1     CLUSTERED
2137058649  CompanyName   2         2     NONCLUSTERED
2137058649  PostalCode    3         2     NONCLUSTERED

(3 row(s) affected)

So now we know that the problem lies in the PostalCode index on the Suppliers table. Let’s run a query to verify this.

SELECT PostalCode FROM Suppliers
Msg 824, Level 24, State 2, Line 19
SQL Server detected a logical consistency-based I/O error: torn page (expected signature: 0x00000000; actual signature: 0x55555554). It occurred during a read of page (1:295) in database ID 5 at offset 0x0000000024e000 in file ‘C:\Program Files\Microsoft SQL Server\MSSQL13.MSSQLSERVER\MSSQL\DATA\NORTHWND.MDF’. Additional messages in the SQL Server error log or system event log may provide more detail. This is a severe error condition that threatens database integrity and must be corrected immediately. Complete a full database consistency check (DBCC CHECKDB). This error can be caused by many factors; for more information, see SQL Server Books Online.

Note that the torn page error message refers to page (1:295). This matches the error we saw in DBCC CHECKDB. Let’s see what happens if we try to query the Suppliers table without using the PostalCode index.

SELECT * FROM Suppliers

No error messages are returned. A nonclustered index is just a redundant copy of data from the table, so rather than taking a drastic step like restoring from a backup, we can simply drop and recreate the index.

DROP INDEX [PostalCode] ON [dbo].[Suppliers]
GO

CREATE NONCLUSTERED INDEX [PostalCode] ON [dbo].[Suppliers]
(
[PostalCode] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
GO

After it completes, we can try querying the Suppliers table using the index.

SELECT PostalCode FROM Suppliers

You should now see a list of the postal codes from the table. Now let’s run DBCC CHECKDB again to see how the database looks.

CHECKDB found 0 allocation errors and 0 consistency errors in database ‘NORTHWND’.

The purpose of this exercise is to remind you that database corruption will occur, and that the first step in solving the problem is understanding exactly what is corrupted.

Northwnd.mdf
Northwnd.ldf

ASP.NET MVC Core Routing

Routing in ASP.NET MVC solves the problem of mapping a request to a route handler that can return a file, web page, or data. Out of the box, configuring routing is straightforward, but as an application becomes more complex, routing can be challenging. Knowing how routing works, and how to use the different types of routes, will help you solve the problems that arise.

What exactly are routes?

Routes define how requests should be handled in your application. Each route has three parts that you must configure for it to work.

• Name – Identifies the route.
• Template – A pattern that is used to match URLs.
• Target – A handler to specify where the request should go.

There are two different ways to define a route with these pieces of information. With conventional routing, you generally establish a standard pattern for mapping a URL to actions in controllers.

routes.MapRoute("Default",
                "{controller}/{action}/{id}",
                new { controller = "Home", action = "Index", id = "" });

In this example, the route is named “Default”, and the template is a pattern made up of the controller name, followed by the action method on that controller, followed by a variable passed into the method. Defining a pattern like this allows you to add more controllers and actions to your application without having to map a URL each time.
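
For context, a conventional route like this is usually registered when MVC is added to the request pipeline. Here is a minimal sketch, assuming an ASP.NET Core Startup class that uses the UseMvc middleware:

using Microsoft.AspNetCore.Builder;
using Microsoft.Extensions.DependencyInjection;

// A minimal sketch of where a conventional route is registered,
// assuming ASP.NET Core with the MVC middleware (UseMvc).
public class Startup
{
    public void ConfigureServices(IServiceCollection services)
    {
        services.AddMvc();
    }

    public void Configure(IApplicationBuilder app)
    {
        app.UseMvc(routes =>
        {
            // Name, template, and defaults for the conventional route.
            routes.MapRoute("Default",
                            "{controller}/{action}/{id}",
                            new { controller = "Home", action = "Index", id = "" });
        });
    }
}
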
Routes can also be defined using the [Route] attribute. You can simply decorate a controller action with the attribute.

[Route("products/{id}")]
public IActionResult ProductDetail(int id) { ... }

In this case, the name is generated for us, the template is “products” followed by a variable, and the target is the action that is decorated. If you are unfamiliar with routing, the “{controller}” or “{id}” syntax might throw you. Each is a route parameter: a segment that represents a piece of data that will be parsed from the URL.

How do the requests get applied to the routes?
Each application has a collection of routes. When an incoming request needs to be matched with a route handler, the RouteAsync() method is called sequentially on each route in the collection. If a route sets the handler for the request, iteration through the collection stops and the handler is invoked. If no handler is found, the request is passed on to the next piece of middleware in the request pipeline. But that’s a story for another day.
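
To make that flow concrete, here is an illustrative sketch of a custom route built on the IRouter abstraction from Microsoft.AspNetCore.Routing (the /health path and the response text are made up for the example). A route claims a request by setting the Handler on the RouteContext; leaving it null lets iteration continue.

using System.Threading.Tasks;
using Microsoft.AspNetCore.Http;
using Microsoft.AspNetCore.Routing;

// Illustrative custom route. RouteAsync is called on each route in the
// collection until one of them assigns a Handler.
public class HealthCheckRoute : IRouter
{
    public Task RouteAsync(RouteContext context)
    {
        if (context.HttpContext.Request.Path == "/health")
        {
            // Claim the request by assigning a handler; iteration stops here.
            context.Handler = async httpContext =>
            {
                await httpContext.Response.WriteAsync("OK");
            };
        }

        // Leaving Handler null lets the next route in the collection try.
        return Task.CompletedTask;
    }

    public VirtualPathData GetVirtualPath(VirtualPathContext context)
    {
        // This route does not participate in URL generation.
        return null;
    }
}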

Now let’s take a look at some interesting cases.
Let’s say we wanted to have a controller that could display a product when you gave the website the name of the product, or an id number.

GetProductByID(int id) // product/4
GetProductByName(string name) //product/silly-puddy

In this case, we need to see what type of data is being included in the URL. If it is an integer, we want the GetProductByID action to be invoked. If it is a string, we want the GetProductByName action to be invoked. One way to do that is to use a constrained route parameter.

[Route("product/{id:int}")]
public IActionResult GetProductByID(int id)

[Route("product/{name:alpha}")]
public IActionResult GetProductByName(string name)

Microsoft has created several types of route constraints that can be used to evaluate parameters in a URL. A route constraint matcher iterates through the list of constraints and calls a Match function to see if a given URL should be handled by the route. Constraints that check for integers or alphabetic strings are commonly used, as are regular expressions for more complex requirements. If you want to see all the available constraints, check out the ASP.NET Routing repo on GitHub.
https://github.com/aspnet/Routing/tree/dev/src/Microsoft.AspNetCore.Routing/Constraints
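
The same Match() pattern applies if you write your own constraint. Here is a hedged sketch of a hypothetical “evenint” constraint (the name and the even-number rule are made up) implementing IRouteConstraint, just to show the shape of the match function the constraint matcher calls:

using Microsoft.AspNetCore.Http;
using Microsoft.AspNetCore.Routing;

// Hypothetical constraint that only matches even integer values,
// shown to illustrate the Match() call described above.
public class EvenIntConstraint : IRouteConstraint
{
    public bool Match(HttpContext httpContext, IRouter route, string routeKey,
                      RouteValueDictionary values, RouteDirection routeDirection)
    {
        if (values.TryGetValue(routeKey, out var value) &&
            int.TryParse(value?.ToString(), out var number))
        {
            return number % 2 == 0;
        }
        return false;
    }
}

// Registered in ConfigureServices so it can be used as {id:evenint}:
// services.Configure<RouteOptions>(o => o.ConstraintMap.Add("evenint", typeof(EvenIntConstraint)));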

Sometimes you need a handler for URLs with many segments. One example: you need a specific style of web page for widgets with special features.

[Route("product/widgets/{*feature}")]
public IActionResult GetSpecialWidget(string feature)

A wildcard parameter acts as a “catch-all” for URLs that may contain multiple segments.
/product/widgets/jumbo/green
/product/widgets/tiny
/product/widgets/blue

The wildcard parameter can be constrained in the same way normal parameters are.

[Route("product/widgets/{*feature:maxlength(50)}")]

 

Don’t forget about order!

Remember how I mentioned that each application has a collection of routes that are checked one by one until a matching route is found? That means the order of the routes in the collection matters. If a route with a wildcard parameter were first in the collection, it could swallow up requests intended for other handlers. In ASP.NET there is a TreeRouteBuilder that is responsible for putting the routes in order. Here is the order, by route type.

1) Literal Routes /product/brand-new
2) Constrained Routes with Parameters /product/{productid:int}
3) Unconstrained Routes with Parameters /product/{productname}
4) Constrained Routes with Wildcards /product/{*widgetsize:int}
5) Unconstrained Routes with Wildcards /product/{*feature}

In the event you need a route placed higher in the list, you can add an Order parameter to the route definition. All routes have an Order with a default value of zero. Setting a route’s Order to a value less than zero causes it to be checked before those set to zero. Setting the value to 1 pushes the route toward the end of the collection, so it is checked later in the matching process.

[Route("product", Order = 1)]

Hope this helps shed some light on how routing works, and the ways you can use routing features to better control your web applications.