Using Elastic Scale inside your application

Now that you have configured Elastic Scale for your solution there are still some changes to be made in your application.

At the moment there are 2 sample applications available, one using plain old SQL-queries (ADO.NET) and another one using the Entity Framework. I would suggest checking out these samples before doing any serious Elastic Scale work.

https://jan-v.nl/files/2dd3b4d5-f687-46e3-9c1e-b4a5e1dc6f9b.png

You will probably notice there are some small differences between the ADO.NET sample and the Entity Framework sample. One of them is the Entity Framework not having a sample for doing multi sharded queries. This is because Entity Framework (or any other ORM for that matter) can’t use the multi shard connection objects which are used by the Elastic Scale libraries. I’ve developed a small work around (hack) for this which is shown at the end of this post.

First, let’s step through the basics on how the Elastic Scale library is supposed to be used.

As stated in my earlier posts about this subject, Elastic Scale should be used when you are creating different shards each with subsets of data. In order to connect to the a shard all you need to do is use the Elastic Scale library.

// Looks up the key in the shard map and opens a connection to the shard
using (SqlConnection conn = shardMap.OpenConnectionForKey(customerId, credentialsConnectionString))
{
	// Create a simple command that will insert a new order
	SqlCommand cmd = conn.CreateCommand();

	//Execute your commands, the way you would normally do.
}

As you can see, all you need is the shard map you want to use, the identifier to search for and (the shardlet) and a credentialsConnectionString. The shardlet is specified in the ShardMap of the ShardMapManager. The connection string is something you will have to create yourself. It should only contain the credentials to log in to a database as the Elastic Scale library will find out the server and database by itself. The sample application has a nice helper method for it which you can use of course.

For reference, this is the implementation of the method in the sample applications.

/// <summary>
/// Returns a connection string to use for Data-Dependent Routing and Multi-Shard Query,
/// which does not contain DataSource or InitialCatalog.
/// </summary>
public static string GetCredentialsConnectionString()
{
    // Get User name and password from the app.config file. If they don't exist, default to string.Empty.
    string userId = ConfigurationManager.AppSettings["UserName"] ?? string.Empty;
    string password = ConfigurationManager.AppSettings["Password"] ?? string.Empty;

    // Get Integrated Security from the app.config file.
    // If it exists, then parse it (throw exception on failure), otherwise default to false.
    string integratedSecurityString = ConfigurationManager.AppSettings["IntegratedSecurity"];
    bool integratedSecurity = integratedSecurityString != null &amp;&amp; bool.Parse(integratedSecurityString);

    SqlConnectionStringBuilder connStr = new SqlConnectionStringBuilder
    {
        // DDR and MSQ require credentials to be set
        UserID = userId,
        Password = password,
        IntegratedSecurity = integratedSecurity,

        // DataSource and InitialCatalog cannot be set for DDR and MSQ APIs, because these APIs will
        // determine the DataSource and InitialCatalog for you.
        //
        // DDR also does not support the ConnectRetryCount keyword introduced in .NET 4.5.1, because it
        // would prevent the API from being able to correctly kill connections when mappings are switched
        // offline.
        //
        // Other SqlClient ConnectionString keywords are supported.

        ApplicationName = "ESC_SKv1.0",
        ConnectTimeout = 30
    };
    return connStr.ToString();
}

Now, if you want to use Entity Framework (or some other ORM), you can use the same methods to open up a connection. Just pass this connection to your database context and you are set to go. It couldn’t be any easier!

The next thing, which is pretty awesome, is doing queries on all shards at once. The SQL team has created a new MultiShardConnection object for us to use in such a scenario.

When using this MultiShardConnection a query is executed on all shards at once so you don’t have to create all of these connections by yourself. The code sample below is another excerpt from the Elastic Scale sample applications.

// Create the multi-shard connection
using (MultiShardConnection conn = new MultiShardConnection(shards, credentialsConnectionString))
{
	// Create a simple command
	using (MultiShardCommand cmd = conn.CreateCommand())
	{
		// Because this query is grouped by CustomerID, which is sharded,
		// we will not get duplicate rows.
		cmd.CommandText = @"
            SELECT
                c.CustomerId,
                c.Name AS CustomerName,
                COUNT(o.OrderID) AS OrderCount
            FROM
                dbo.Customers AS c INNER JOIN
                dbo.Orders AS o
                ON c.CustomerID = o.CustomerID
            GROUP BY
                c.CustomerId,
                c.Name
            ORDER BY
                OrderCount";

		// Append a column with the shard name where the row came from
		cmd.ExecutionOptions = MultiShardExecutionOptions.IncludeShardNameColumn;

		// Allow for partial results in case some shards do not respond in time
		cmd.ExecutionPolicy = MultiShardExecutionPolicy.PartialResults;

		// Allow the entire command to take up to 30 seconds
		cmd.CommandTimeout = 30;

		// Execute the command.
		// We do not need to specify retry logic because MultiShardDataReader will internally retry until the CommandTimeout expires.
		using (MultiShardDataReader reader = cmd.ExecuteReader())
		{
			//Do some logic over here
		}
	}
}

As you can see, querying all of your shards is quite easy when using the MultiShardConnection. Just keep in mind the queries could perform a bit slow, depending on the latency between the shards.

There is one big problem when you are using an ORM and need to do these multi-sharded queries. The MultiShardConnection can not be used within a database context. So if you really need to do queries on all shards you’ll have to create connections to all shards by yourself, run the queries on the different shards and aggregate the results to your dataset. This is quite cumbersome and expensive, so don’t do it too often. I’ve had to create something which does this exact same thing in my project so I’ll just share the (very simplified) code below.

var resultSet = var resultSet = List<t>();
var shardMap = shardMapManager.GetListShardMap<guid>("MyShardMapName");
var allMappings = shardMap.GetMappings();
var shards = allMappings.Select(a => a.Shard).Distinct();
var credentialsConnectionString = this.databaseConnection.GetCredentialsConnectionString();
foreach (var shard in shards)
{
	shard.OpenConnection(credentialsConnectionString, ConnectionOptions.Validate);
	using (connection)
	{
		using (var context = new CrossPointStoreDatabaseEntities(connection))
		{
			var result = command.Invoke(context);
			if (result.Any())
			{
				resultSet.AddRange(result);
			}
		}
	}
}
return resultSet;

This isn’t the safest way to get results, but it’s the only one we’ve got at the moment. Hopefully this will get resolved in later releases of the Elastic Scale libraries. Be warned to use this ‘solution’ with caution.


Share

comments powered by Disqus