Enzo Shard 2.0

Rating: No reviews yet
Downloads: 491
Released: Mar 12, 2013
Updated: Mar 13, 2013 by hroggero
Dev status: Stable

Recommended Download

Source Code EnzoShardSourceCode
source code, 493K, uploaded Mar 13, 2013 - 293 downloads

Other Available Downloads

Documentation FederationStrategyDiagram
documentation, 116K, uploaded Sep 3, 2011 - 198 downloads

Release Notes

This version of Enzo Shard provides support for the upcoming release of SQL Azure Data Federation. In addition, this project is built on the Strategy pattern, which makes it possible to share common code between various implementations of sharding models.

The following changes were made since the previous release:
  • Provides a core set of classes that allow building multiple shard strategies
  • Early code to support SQL Azure Data Federation (the Enzo Data Federation strategy)
  • New Distributed T-SQL Semantics to use with the SQL Azure Data Federation class

About the Strategy Implementation

This project was reorganized by centralizing common sharding operations in specialized classes. In addition, the underlying core logic of performing a shard call was placed in a Core class. By inheriting from this class, any new strategy gains access to the shared sharding options and settings. This new model was used to adapt the original shard library, now called the Expanded Shard library. The Federation Strategy also uses the Core class and the additional capabilities provided by this release, such as the Distributed Query (see below). Going forward, the intent is to provide a set of sharding implementation strategies that use the same underlying plumbing. See the FederationStrategyDiagram (available as a download) for an overview of the classes of the new Data Federation Strategy.
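
The arrangement described above can be illustrated with a minimal sketch. It is written in Python for brevity and is not the library's code: the class names ShardCore, ExpandedShardStrategy, and FederationStrategy, the max_workers setting, and the execute method are all hypothetical stand-ins. The point is only that the parallel shard call lives in one base class and each strategy supplies its own list of targets.

```python
from abc import ABC, abstractmethod
from concurrent.futures import ThreadPoolExecutor


class ShardCore(ABC):
    """Hypothetical stand-in for the Core class: shared plumbing for a shard call."""

    def __init__(self, max_workers=4):
        # Shared setting that every strategy inherits.
        self.max_workers = max_workers

    @abstractmethod
    def members(self):
        """Return the targets (databases or federation members) to query."""

    def execute(self, query_fn):
        """Run query_fn against every member in parallel and concatenate the rows."""
        with ThreadPoolExecutor(max_workers=self.max_workers) as pool:
            per_member = list(pool.map(query_fn, self.members()))
        return [row for rows in per_member for row in rows]


class ExpandedShardStrategy(ShardCore):
    """Hypothetical strategy that targets an explicit list of databases."""

    def __init__(self, databases, **kwargs):
        super().__init__(**kwargs)
        self._databases = list(databases)

    def members(self):
        return self._databases


class FederationStrategy(ShardCore):
    """Hypothetical strategy that targets the members of a named federation."""

    def __init__(self, federation_map, federation_name, **kwargs):
        super().__init__(**kwargs)
        self._members = list(federation_map[federation_name])

    def members(self):
        return self._members
```

Both strategies reuse execute unchanged; only members differs, which is the property the Strategy pattern is meant to provide here.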

About the Federation Strategy

The Enzo Federation Strategy supports the following features:
  • Map Reduce: Fetches across federation members in parallel
  • Shard Reduce (Index): Creates a bitmap index of federation members
  • Fan Out: Supports parallel execution with the Task Parallel Library
  • New T-SQL Semantics: Supports an enhanced T-SQL syntax for querying across federation members
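
The release notes do not describe how the Shard Reduce index is built. One plausible reading, sketched below in Python purely as an assumption, is a bitmap per indexed value with one bit per federation member, so that a lookup can skip members that hold no matching rows. The function names and the sample data are hypothetical.

```python
def build_member_bitmaps(member_values):
    """Map each value to an int bitmap; bit i is set when member i holds that value.

    member_values is a list with one collection of values per federation member
    (an assumed input shape, not something the library defines).
    """
    bitmaps = {}
    for member_index, values in enumerate(member_values):
        for value in values:
            bitmaps[value] = bitmaps.get(value, 0) | (1 << member_index)
    return bitmaps


def members_for(bitmaps, value, member_count):
    """Return the indexes of the members whose bit is set for the given value."""
    bitmap = bitmaps.get(value, 0)
    return [i for i in range(member_count) if bitmap & (1 << i)]
```

With three members holding {"WA", "OR"}, {"CA"}, and {"WA"}, a lookup for "WA" would touch only members 0 and 2.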

About the Distributed Query

The distributed query lets you send a SELECT statement in the format shown below and get a data table back. The library analyzes the query and extracts enough information to determine how to fetch the data across federation members in SQL Azure.
The distributed query also supports the use of caching with the CACHED FOR clause, and allows exclusion of records from other federation members with the WHERE IN clause.
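
How the library implements CACHED FOR is not documented here. As a rough sketch of the idea, assuming results are keyed by the query text and expire after the requested number of seconds, a time-to-live cache could look like this in Python. The TtlCache class and its get_or_fetch method are hypothetical names.

```python
import time


class TtlCache:
    """Hypothetical time-to-live cache keyed by the distributed query text."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock      # injectable clock, handy for testing
        self._entries = {}       # query text -> (expiry time, cached result)

    def get_or_fetch(self, query_text, ttl_seconds, fetch):
        """Return the cached result if still fresh, otherwise call fetch() and store it."""
        now = self._clock()
        entry = self._entries.get(query_text)
        if entry is not None and entry[0] > now:
            return entry[1]
        result = fetch()
        self._entries[query_text] = (now + ttl_seconds, result)
        return result
```

A query issued again within its time-to-live is served from memory instead of fanning out to the federation members a second time.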

The statement currently supported is:

SELECT fields USING ( sql1 ) FEDERATED ON {ROOT, (fed_name [, member_key = member_value [, FILTERED]]) }
[WHERE field [NOT] IN ( sql2 ) FEDERATED ON {ROOT, (fed_name [, member_key = member_value [, FILTERED]]) }]
[ORDER BY fields3]
[CACHED FOR n seconds]
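
To make the shape of the statement concrete, here is a small Python sketch that splits a distributed query into its parts. It is an illustration only, not the library's parser, and it covers just a subset of the grammar above: the FEDERATED ON (fed_name) form with optional ORDER BY and CACHED FOR clauses. The ROOT target, the member_key filter, and the WHERE ... IN clause are deliberately left out. The function name parse_distributed_query is hypothetical.

```python
import re

# Subset of the grammar: SELECT ... USING ( ... ) FEDERATED ON ( name )
# followed by optional ORDER BY and CACHED FOR clauses.
_PATTERN = re.compile(
    r"""^\s*SELECT\s+(?P<fields>.+?)\s+
        USING\s*\(\s*(?P<inner_sql>.+?)\s*\)\s*
        FEDERATED\s+ON\s*\(\s*(?P<federation>\w+)\s*\)
        (?:\s+ORDER\s+BY\s+(?P<order_by>.+?))?
        (?:\s+CACHED\s+FOR\s+(?P<cache_seconds>\d+)\s+seconds)?
        \s*$""",
    re.IGNORECASE | re.VERBOSE | re.DOTALL,
)


def parse_distributed_query(text):
    """Return the parts of a distributed query as a dict, or raise ValueError."""
    match = _PATTERN.match(text)
    if match is None:
        raise ValueError("not a supported distributed query")
    parts = match.groupdict()
    if parts["cache_seconds"] is not None:
        parts["cache_seconds"] = int(parts["cache_seconds"])
    return parts
```

The outer fields describe what to compute over the combined rows, while inner_sql is what each federation member is asked to run.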

Example

For example, the following query returns the MAX customerId of a Customer table. It executes the inner SQL statement across the federation members, then performs the MAX aggregate calculation.

SELECT MAX(customerId) USING (select customerid from customer) FEDERATED ON (customerfederation)
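
The two-step evaluation described above (run the inner statement on every member, then aggregate the combined rows) can be sketched as follows. This Python example uses in-memory SQLite databases as stand-ins for SQL Azure federation members, so it shows the evaluation order rather than the library's actual implementation. The helper names make_member and federated_max are hypothetical.

```python
import sqlite3
from concurrent.futures import ThreadPoolExecutor


def make_member(customer_ids):
    """Create an in-memory SQLite database standing in for one federation member."""
    # check_same_thread=False lets a worker thread use the connection;
    # each connection is still used by only one thread at a time below.
    conn = sqlite3.connect(":memory:", check_same_thread=False)
    conn.execute("CREATE TABLE customer (customerid INTEGER)")
    conn.executemany("INSERT INTO customer VALUES (?)", [(i,) for i in customer_ids])
    conn.commit()
    return conn


def federated_max(members, inner_sql):
    """Run inner_sql on every member in parallel, then take MAX over the combined rows."""
    def fetch(conn):
        return [row[0] for row in conn.execute(inner_sql).fetchall()]

    with ThreadPoolExecutor(max_workers=max(1, len(members))) as pool:
        per_member = list(pool.map(fetch, members))
    combined = [value for rows in per_member for value in rows]
    return max(combined) if combined else None
```

For instance, with three members holding customer ids [1, 2, 3], [10, 42], and [7], calling federated_max(members, "select customerid from customer") yields 42.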

More advanced distributed statements are available in a test application.
