Data Elasticity Through Database Sharding

Around late 2008, I came across my first real exposure to a customer who wanted to scale their read-write RDBMS.  It was a very interesting problem.  At the time I didn’t evaluate if RDBMS was an appropriate data store, but rather approached it from “let’s make this work”.  A complex PoC was pulled together that relied on a Peer-to-Peer Replication topology, that was super fragile.  Since then I have had several opportunities to work with customers chasing this objective.

In most cases the sought after distributed data store is achieved by leveraging a NoSQL system.  Of course that introduces other considerations.  At one point the team I was working on came up with this sharding pattern, which I’ve blogged about on at least one occasion, however it was predominantly about app and web server scale.  The data tier was simply, how do I say this succinctly, let me try “atrocious”.  It lacked real depth, and real versatility in terms of how it was to be cared for by DBAs, nor did it really consider the SDLC aspects of developers that needed an ELASTIC scale solution for the RDBMS tier.

I spent the next 2 years working on a framework that took the lessons from multiple engagements and placed into a consumable package.  It took a great deal of work, and several failures plus a group of excellent colleagues to produce the package.  In particular Silvano Coriani and Jeff Odell, with some great input in the early days by Josh Handel.  The Channel9 video presentation that can be found at http://bit.ly/Tju0aB, is where you get to see some good features of the package.  The articles I’m listing below is the outcome of many months of all that goes into content creation.  It is a great supplement to the video presentation.

Successfully achieving elastic scale is no trivial task, don’t be led to believe otherwise.  It is without a doubt the most fun, challenging and exciting journey with HUGE rewards.

Enjoy It!!

Advertisement
Posted in Scalability | Tagged , , , , | 2 Comments

Migrating Azure services to new regions

(originally posted on azure.microsoft.com http://bit.ly/1sSC5mW)

We recently carried out an engagement where a customer, needed to move their services from US South Central region to US East region data centers. The engagement included the processing of capacity planning an appropriate sized Premium SKU and leveraging Active Geo-Replication for High-Availability and Rollback.

The customer was looking to migrate their Azure services that reside in US South Central to US East region. The primary area required for assistance is around their database, where they need predictability of any potential downtime. The downtime should be as low as possible but can be no longer than 6 hours.

Customer Overview

The customer is a developer of a geosocial mobile game. The game has awareness of towns and cities across the world, with a presence in all but 2 African countries. Players primarily participate in terms of their physical geographical location

Planning

From a planning perspective we considered multiple options. These weren’t particularly long discussions, but were ideas and are worth taking note of as they are inevitably typical of such conversations.

  • How big is the database?
  • What is the current SKU used for the database?
  • How much downtime can you tolerate?
  • How much data loss can you tolerate, if any, during the migration?
  • How much money are you willing to spend?
  • Can you tolerate the use of recently published services and/or services that are still in preview?
  • What are the current concerns related to moving the data?
  • What are you not worried about?

For this customer it was most important to move. So we had quite a fair amount of latitude in terms of what we could do.

The database was approximately 35GB. About 5GB of the data could be considered as needing to be online while the remainder could be added over time that is it would essentially be offline.

The database was a business edition SKU and didn’t leverage any specific Azure features, such as Federations.

They could tolerate up to 8 hours downtime, but managing expectations of business and consumers was much easier if the downtime was less than 2 hours. Additionally certain days of the week and hours of the day were more amenable to downtime, therefore control of the downtime was a desired requirement.

The customer was cost conscious but was open to all ideas provided they included awareness of cost considerations.

The customer was very cautious of the DBCopy process and felt that along with export/import for moving large sets of data took much longer than 10hours, which severely impacted the business.

The moving of the compute nodes was not a concern, and neither was there an immediate need to move the existing blob storage data. The solution is architected such that no cross DC logic is required for accessing the existing blobs

Some options included:

  1. Leverage Active Geo-Replication functionality
  2. Stop all connections, export data, copy files, import into US East region
  3. Create DBCopy, Export data, copy files, restore in US East region

The Active Geo-Replication option was the preferred choice because:

  • Lowest downtime
  • No data loss, during planned migration
  • Rollback option is available
    • A failure in execution is a simple step of re-enabling resource at US South Central data center
  • Lowest TCO
  • Aligned with future plans (Upgrade to Premium SKU)

The one concern was that the service was in Preview, but this was considered a tolerable risk. After reviewing their historical performance, it was decided that a P1 would be sufficient for their needs. Some concerns that we had were whether the replication would be too resource intensive but given that the peak was understood to be one time event and the foreseeable resource requirement was low, the decision was to remain with P1 and be ready to upgrade to P2 at a moment’s notice.

As part of the capacity planning process, the customer was pointed to article Basic, Standard and Premium Preview for Azure SQL Database (http://msdn.microsoft.com/en-us/library/azure/dn369873.aspx).

Execution

The customer had to sign up for the Premium SKU preview. This must be done through the portal, and it takes seconds for the process to complete.

Once signed up, the database can be upgraded to Premium. Refer to “Changing Database Service Tiers and Performance Levels” (http://msdn.microsoft.com/en-us/library/dn369872.aspx) for details of what was leveraged for planning. The below formula was actually quite accurate, in predicting about 12 hours of time spent upgrading.

3 * (5 minutes + database size / 150 MB/minute)

The upgrade occurred smoothly. At one point several connections failed to connect, but this was for a brief moment and expected, as mentioned in MSDN article.

The next step was to create a replica in the desired region, namely US East region. There was uncertainty about how long it would take, and uncertainty about the resources required to carry out the replica. The steps are well documented as per this link “Failover in an Active Geo-Replication Configuration” (http://msdn.microsoft.com/en-us/library/azure/dn741337.aspx). The customer said it best when responding to a question of how long did the initial synchronization of the replica take…

It [Initial Synchronization] was no more than 40 minutes*.  I’m not sure on actual copy time.  This is quite amazing considering making a local bacpac file of this database takes 4 hours.

*Completely online experience

It should be noted that this execution plan was undertaken in-between code deployments. Much of this could’ve occurred within a 24hour period if process was purely technical. The customer chose to wait a couple of days before moving to the next action of the execution plan. The reasoning was related to making a significant code change to rather occur on the existing topology than in a new data center. They were concerned about whether the DDL changes would replicate. All changes are replicated except for those that rely on updating the master database, such as logins. These must be done in each environment, whereas creating a user can be done at the source. Refer to “Security Configuration for Active Geo-Replication” (http://msdn.microsoft.com/en-us/library/azure/dn741335.aspx) which specifically relates to the topic of logins.

The last step was to switch to the new region. Below are the steps that the customer shared for the benefit of others that undertake such a task. For terminating the Active Geo-Replication topology, the instructions are well-described as per link “Terminate a Continuous Copy Relationship” (http://msdn.microsoft.com/en-us/library/azure/dn741323.aspx).

  1. A week in advance, upgrade database to premium and setup geo replication to the East data center.  This could have been done the same day.  It took less than 40 minutes for a 32GB database to fully replicate from US South Central to East.
  2. [Day of move] Create blob containers in East to match the names of the containers in US South Central.  (we did not elect to move blob data at this time, but you would do that here)
  3. Deploy new Web and Worker roles to East.  Point new deployment at DB and blob storage in East.
  4. Setup any needed SSL certs in new (East) environment
  5. (Downtime begins)Take old Web and new Web offline line.  We do this with a configuration switch in the application that shows the users a friendly message that the site is under maintenance.  Only admins can get into the site/API normally.
  6. Change DNS to point to new web roles IP addresses
  7. Deploy new code to old data center (US South Central ) with connection strings for DB and Blob pointing at East
  8. Configure East DB to allow connections from US South Central IP Address (may not be required)
  9. Set new deployment (in staging area) in US South Central to offline mode, swap with production area
  10. Stop DB Active Geo-Replication (Planned Termination of a Continuous Copy Relationship)
    • Steps carried out using the portal
  11. Set US South Central DB into Read-Only mode
  12. Set US East region DB into read/write mode (after replication is completed) (may not be required)
  13. (Downtime ended) Re-enable both web sites/API in both US South Central and US East region data centers
  14. Setup automatic AzureDB backups on new DB.
    • Using the automated export under the “configure” section on the DB in the management portal.
  15. After DNS replication is fully propagated around the world, delete application servers in US South Central.

The following quote from the customer does a great job in articulating the experience

The move went very smoothly.  We were down for about 1 hour, and most of that time was waiting for our new code to deploy, configuration updates and prod/staging swaps.  The actual process of shutting down the Azure DB replication and changing read/write modes was simple.

Posted in Azure Migration | Tagged | Leave a comment

SQL Database Sharding Patterns

(originally posted at channel9 – http://bit.ly/Tju0aB)

Join Scott Klein and myself, as we discuss patterns and practices for sharding Azure SQL Database. The video session is a culmination of many hours of work to produce a framework / pattern that most industries could use to tackle the task of elastically scaling their RDBMS tier. Scott and I talk about some of the complexities of database sharding but introduce and demo a library that help overcome those complexities and provide the elastic scale needed in SQL Database. I highlight the key components and concepts when working with the sharding and then dive right into demoing the library to illustrate how the library can be used to implement sharding patterns into SQL Database. The sample library is available for customers to modify/extend/use and is compiled from engaging with customers who have taken on the task of creating customer sharding solutions.  While it is targeted towards Azure SQL Databases, the pattern is possible against on-premise and VM instances too.

//channel9.msdn.com/Shows/Data-Exposed/SqlDbShardingIntro/player?format=html5

Posted in Scalability | Tagged , | 1 Comment

DAL – Sharding of RDBMS

(originally posted at azure.microsoft.com/blog – http://bit.ly/1vnv3bB)

The “Cloud Service Fundamentals” application, referred to as “CSFundamentals,” demonstrates how to build database-backed Azure services.  This includes a description of the scenario, implementation architecture and reusable components for logging, configuration and data access.  The code base is intended to be a tangible exploration of best practices for delivering scalable, available services on Azure based on production deployments by the Windows Azure Customer Advisory Team.

With the majority of companies these days looking to move their cloud initiatives forward, the business drivers for specific solutions vary greatly, from lowering costs to significantly increasing agility and scale.  When solutions are trying to attain “cloud scale”, the strategy of “vertical scalability”, increasing capacity by upgrading the hardware, is replaced with “horizontal scalability”, increasing the number of machines that share a specific task.  A great example of this tradeoff is the creation of web farms where many servers serve the same web site content versus having a single monolithic machine try to handle the load.

Most people that start down this path plan for horizontal scalability on the compute nodes but bypass the more complex and potentially more critical state tiers, such as the relational database management system (RDBMS) and caches.  These services are often IO-intensive and bound by a single instance.  One technique to implement horizontal scalability in the state tier is known as sharding.  Sharding is when you logically separate your RDBMS data into multiple databases, typically with the same schema. For example, an employee table could be split across three distinct databases where each database stores a different department’s employees.

The benefits of sharding assists in far more than just capacity related scenarios.  For the purposes of this post we will focus on sharding an RDBMS that is implemented in Azure SQL Database platform, and primarily serves OLTP scenarios.  Some example scenarios that could benefit from a sharded database structure include:

  • Throttling thresholds or throughput limits are hit too often.
  • Size of the database becomes unwieldy (index rebuilds, backups).
  • A single unavailable database affects all users (as opposed to a single shard).
  • A database that has difficulty scaling up and down gracefully in respond to demand.
  • Certain business models, such as multi-tenant or Software as a Service offerings.

When using a multi-tenant database as a service solution, such as Windows Azure SQL Database, there are typically Quality of Service (QOS) controls put in place that throttle clients under various conditions. Throttling typically occurs when resource pressure climbs.  Sharding is a key strategy to help reduce resource pressure by taking the load that would typically affect a single server and spreading it across as multiple servers that each contain a shard. For example, assuming an even distribution, creating five shards reduces the load to approximately twenty percent on each database.

As with anything that grants greater power, there are sacrifices that must be made.  Sharding increases the complexity of several key areas, requiring more planning. These include:

  • Identity columns should be globally unique across all shards in case future business needs necessitate the reduction in shard count.  If the identity is not unique across all shards, merging two shards can result in conflicts.
  • Referential integrity cannot reference or enforce relationships to rows in other shards as they are independent databases.
  • Queries that cross shards should be avoided if possible, because they require querying each shard and merging the results.  The need to do “fan out” queries across the shards is not only costly from a performance point of view but increases the complexity of the sharding framework that must support it.  If cross-shard queries are necessary, the typical strategy is to query each shard asynchronously.  However, there are times where a synchronous approach offers more control of the resultset size.

In most cases, sharding is a Data Access Layer (DAL) concept, abstracting most of the intricacies from the higher-level application logic.

How you define a “tenant” is one of the most crucial decisions that you can make when building a sharded architecture.  A tenant is the largest unique categorization of data that is guaranteed to be on the same shard.  Queries that are constrained to a single tenant typically perform better as they do not need to execute fan-out operations during normal operational states.  Some of the factors that influence the decision of the appropriate tenant definition include:

  • The level of awareness of the higher-level application code of the identifier.
  • The ability of most core business transactions to be performed at this level.
  • The ability to avoid throttling in general day-to-day operations at the tenant granularity.

In an effort to surface these concepts and considerations at a high level, the Windows Azure Customer Advisory Team has built out a basic sharding Data Access Layer (DAL) in the Cloud Services Fundamentals (CSF) package (http://code.msdn.microsoft.com/Cloud-Service-Fundamentals-4ca72649).

In CSF the tenant is defined as an individual user.  Some of the factors that led to selecting this tenant were:

  • Most of the core business requirements do not require queries across multiple users.
  • An unavailable shard only impacts a defined set of users, leaving all others to continue their typical usage of the system.  The quantity of users that are on a single shard can be controlled to a number that is tolerable to a business.

The tenant is defined and implemented such that cross-database transactions are not required.  In Figure 1, we refer to this set of data as a shardlet, a Data Model Transaction Boundary.

Figure 1 – Data Model Transaction Boundary

When the user connects to the database for the first time in the session they have a series of simple queries that they can execute to understand if any features are unavailable to them due to offline shards.

In an effort to simplify the sharding techniques being demonstrated in CSF, we decided to create a set of shards that have enough storage to fulfill capacity needs for the foreseeable future. By selecting this size, it eliminated the need to demonstrate increasing and reducing the number of shards, which would also include actions like tenant movement.  An integer is generated by hashing the tenant’s name and this id is used to lookup a matching range in a “shard map”.  CSF uses a range-based mechanism, where a range of these numbers are assigned to a specific shard (captured in the “shard map”).

In the event that shards need to be added or removed from the shard set, it requires that tenants become unavailable until they are transitioned into their new shards. Due to this significant limitation, it is expected that the shard set will be significantly over provisioned when it is first created to reduce or eliminate the need to do complex shard management.

This solution requires that the Data Access Layer (DAL) is aware of the tenant id to determine the placement of the tenants in the shard set.  If a query is executed that includes a shard that is unavailable the entire query will fail.  In the event that the DAL does not include the tenant id all shards will have to be queried, increasing the chance of failure and reducing performance.

There is currently some preliminary work underway to offer further code samples demonstrating more advanced sharding techniques.  The sample will offer improvements in the following areas:

  • Reactive and proactive shard management.
  • Global uniqueness and identity management.
  • Migration of tenants between shards within the set.
  • Expansion and contraction of the shard set.
  • Improvements in queries that are not tenant aware.

In conclusion, the Cloud Services Fundamentals code sample is a great way to start to explore the basic concepts of sharding, which is an important technique for creating “cloud scale” applications.

Posted in Scalability | Tagged , | Leave a comment

Brokerage Cuts Manual Data Workloads by 50 Percent and Strengthens Customer Trust

(originally posted on microsoft.com/casestudies – http://bit.ly/Ltlz2M)

TradeStation provides historical stock-market data so that traders can test investment strategies. To support a massive planned increase in data volumes for an important customer offering, TradeStation needed to update a data management solution that required excessive maintenance work and lacked sufficient scalability. Based on extensive experience, the brokerage deployed a Microsoft solution, which further strengthens customer relationships and cuts manual workloads and processing overhead by half.

Business Needs

TradeStation offers traders analysis tools that include providing historical data against which strategies can be tested. It offers about 30 years’ worth of data for all major stock exchanges, and customers have access to a development interface where they can build and test computer-based buy and sell strategies.

“The SQL Server 2012 solution has reduced the IT department’s manual workloads by at least half” – Jose Ruiz, Quality Assurance Director, TradeStation

To support growth, further strengthen customer relationships, and expand into a new line of business, TradeStation needed to update the solution it was using to manage its historical trading data. One problem with the existing solution was that it did not support having one centralized, authoritative version of the data, so customers had to access copies on different machines. Occasionally, such as after power interruptions or operating system failures, these duplicated files diverged.

Updating these files was also a challenge. “Adding each new day’s data required rebuilding our files every night, which took anywhere from one to six hours,” says Joe Messina, a Senior Engineer for TradeStation. This high processing overhead reduced the company’s ability to deliver data promptly to customers.

In addition to correcting these shortcomings, TradeStation had another goal. “Currently, our system stores only a stock’s best asking price, best bid, and any actual trades,” says Messina. “Our plan is to begin offering access to market depth data, or every single asking price and bid for each stock. That’s going to expand our data volume by about a hundredfold, and it just wasn’t going to be possible with our old, homegrown system.”

TradeStation wanted a solution that could more easily manage binary large objects (BLOBs) and deliver data to customers from one central database. The solution needed to be scalable, easy to update, and simple to maintain.

Solution

TradeStation considered KDB and Vertica solutions but ultimately decided to deploy Microsoft SQL Server 2012 data management software. “We already use a lot of Microsoft technology elsewhere in our IT environment, and we’re very comfortable with it,” says Andrew James, a Product Manager for TradeStation. “This solution is going to be the foundation for the long-term growth of our company, so we wanted to go with a technology that has large teams of people expending all of their effort and brainpower making it even faster and better. That’s the reason we chose Microsoft.”

With SQL Server 2012, TradeStation can use the FILESTREAM feature to store BLOBs in its SQL Server database but still be able to access them from Windows applications as if they were stored in the file system. The database holds 4 terabytes of market data, and TradeStation receives about 175 gigabytes of new data per day. The new solution is deployed on HP ProLiant DL360 G7 server computers.

To simplify maintenance, speed data loading, and optimize query performance, TradeStation is using dynamic partitioning logic in SQL Server 2012 to create a new partition every 100,000 new files. SQL Server 2012 supports as many as 15,000 partitions to support large sliding-window scenarios that can streamline maintenance of extremely large data sets.

In the next phase of the deployment project, TradeStation plans to implement SQL Server 2012 Remote Blob Store (RBS) to further reduce strains on its storage and processing resources. With RBS, the company has the option to use commodity hardware for dedicated storage solutions and make use of a scalable architecture to increase throughput and grow to accommodate thousands of terabytes.

Benefits

The Microsoft solution cut manual workloads for IT and other employees, improved the company’s relationship with its customers, and reduced processing overhead.

Reduces Manual Workloads by 50 Percent

The new solution so simplified the data management infrastructure that it sharply reduced the time that TradeStation IT workers needed to spend on maintenance. “With SQL Server 2012, we’ve reduced the number of physical servers from 50 to 4, or 46 fewer machines that we need to worry about upgrading, managing, and tracking,” says Jose Ruiz, the Quality Assurance Director for TradeStation. “And because we now have one centralized set of data, we’ve cut out all the time we used to spend duplicating data sets from one machine to another. When you combine these improvements, the SQL Server 2012 solution has reduced the IT department’s manual workloads by at least half.”

The new solution is saving time for other departments, too. James says, “The SQL Server 2012 solution is reducing the number of trouble calls to our data-integrity department.” Keith Black, Vice President of Product Development at TradeStation, agrees: “SQL Server 2012 has removed a lot of pain and wasted time. That’s a big win for our employees.”

Further Strengthens Customers’ Trust

The solution’s centralized, authoritative data eliminates the inconsistencies that customers sometimes encountered previously. “Before, when our data was duplicated across so many machines, sometimes customers noticed differences between data sets that should have been identical,” says Ruiz. “By centralizing our data with SQL Server 2012, every customer is going to access the same data, no matter what server they log on to and get served from. That improved consistency is a huge gain for us and for our customers.”

Cuts Processing Overhead

By centralizing data storage, the new solution drastically reduces the amount of processing time needed to update or correct data. “By moving all of our data to SQL Server 2012, we no longer need to create multiple copies and rebuild the files each night, so we’ve dramatically reduced the length of time needed to process our end-of-day files or make corrections,” says Messina. “SQL Server 2012 is also helping us serve requested data to customers more promptly, which makes them happier.”

 

Posted in Case studies | Tagged | Leave a comment

Migrating Data-Centric Applications to Windows Azure

(originally posted at download.microsoft.com – http://bit.ly/1zBhQtA)
migrating

This guide provides experienced developers and information technology (IT) professionals with detailed guidance on how to migrate their data-centric applications to Windows Azure Cloud Services, while also providing an introduction on how to migrate those same applications to Windows Azure Virtual Machines. By using this guide, you will have the planning process, migration considerations, and prescriptive how to’s needed for a positive migration experience.

 

 

Part 1: Planning and Implementing a Migration

Overview of the Migration Life Cycle

Provides step-by-step instructions for migrating your applications and data to Windows Azure.

Planning a Migration

Provides a walk through on several concerns and steps you should consider as you plan a migration to Windows Azure.

Implementing the Migration Plan

Provides guidance on the final steps of migration to Windows Azure.

Part 2: Migration Considerations, Best Practices, and How To

Migrating with Windows Azure Virtual Machines

Provides an overview on the Windows Azure Virtual Machines (VM) and guidance on how you can migrate your existing SQL Server databases to Windows Azure using SQL Server in a Windows Azure Virtual Machine.

Migrating with Windows Azure Cloud Services

Provides an overview on the Windows Azure Cloud Services and guidance on how to migrate your existing applications and databases to Windows Azure using the functionalities provided by the Windows Azure Cloud Services platform.

Posted in Azure Migration | Tagged | Leave a comment

Successfully execute an INSERT, UPDATE and DELETE against a Database Snapshot

(originally posted at blogs.msdn.com/b/sqlcat http://bit.ly/1BL788I)

Reviewers: Mike Ruthruff, Sanjay Mishra, Alexei Khalyako

Not too long ago an ISV that developed solutions using SQL Server as the RDBMS, asked me how they could query a database as at a point in time. This was a relatively easy answer, thanks to the Database Snapshot feature. I was however surprised at the next question “Can we update the database snapshot?”

A reactive response is “No. You cannot update a Database Snapshot”

Msg 3906, Level 16, State 1, Line 1
 Failed to update database "Orig_Snapshot" because the database is read-only.

A creative answer is a tentative “….well maybe…depending on what the objective is?”

This blog demonstrates that it is possible to run an INSERT, UPDATE or DELETE against a Database Snapshot.  This will not update the snapshot, but rather the database that has a “Database Snapshot” associated to it.

Imagine a scenario where a reconciliation of data at a point in time must be carried out. Database Snapshot provides the ability to present the data as at a point in time, however the common understanding is that any compensating modifications requires a second connection or a USE statement. The USE statement is not permitted in a database module:

Msg 154, Level 15, State 1, Procedure testSP, Line 4
a USE database statement is not allowed in a procedure, function or trigger.

Listing 1 creates the objects needed to prove it is possible to successfully execute DML statements against a Database Snapshot.

    IF DB_ID('Orig') IS NOT NULL
        DROP DATABASE [Orig]
    GO
    
    CREATE DATABASE [Orig]
     ON PRIMARY ( NAME = N'Orig', FILENAME = N'C:\Program Files\Microsoft SQL Server\MSSQL11.SQL02\MSSQL\DATA\Orig.mdf')
    GO
    USE [Orig]
    GO
  
    IF OBJECT_ID('dbo.TestTable', 'U') IS NOT NULL
        DROP TABLE dbo.TestTable
    GO
    
    CREATE TABLE dbo.TestTable (Col1 int)
    GO
    
    IF OBJECT_ID('dbo.UpdView', 'V') IS NOT NULL
        DROP VIEW dbo.UpdView
    GO
    
    CREATE VIEW dbo.UpdView
    AS
    SELECT Col1 FROM Orig.dbo.TestTable
    GO
    
    INSERT INTO dbo.TestTable (Col1) VALUES (1)
    GO
    
    IF DB_ID('Orig_Snapshot') IS NOT NULL
        DROP DATABASE [Orig_Snapshot]
    GO
    
    CREATE DATABASE [Orig_Snapshot]
     ON PRIMARY ( NAME = N'Orig', FILENAME = N'C:\Program Files\Microsoft SQL Server\MSSQL11.SQL02\MSSQL\DATA\Orig_Snapshot.ss')
     AS SNAPSHOT OF [Orig]
    GO
    
    USE [Orig_Snapshot]
    GO
    
    SELECT * FROM dbo.UpdView
    GO
After running the above code, you should have a database and an accompanying snapshot of that database. Within the database, you’ll have a table with at least 1 row in it and a view that simply returns the contents of that table.  This view is a fully qualified name of the original database, this creates 2 scenarios.  One is that it becomes possible to view the originating database from within the snapshot, and the second is the ability to run DML statements against the source database from within the database snapshot.

Listing 2

    INSERT INTO dbo.UpdView VALUES (2), (3);
    UPDATE dbo.UpdView SET Col1 = 99 WHERE Col1 = 3;
    DELETE FROM dbo.UpdView WHERE Col1 = 1;
    SELECT * FROM dbo.UpdView
    GO

Running the code in listing 2, you should have a result set as shown below:

In conclusion, the error message that informs us the database is read-only, while identical to the message returned when writing to a Read-Only database, is only partially accurate in the context of a Database Snapshot. Questions that come to mind are:

  • Is it a bug?
  • Would this be considered a good practice?
  • What does the code management look like?
  • What useful scenarios could leverage this insight?

Let me know if you think it’s a bug. From my perspective, this makes sense as the database snapshot is actually a read/write database, persisting older values as records change. Additionally, we are not actually updating the Database Snapshot, the changes are still made directly to the main database.

As far as been a good or recommended practice, my reservations about using this in a design are:

  • Views must be created for each table that can be updated from the database snapshot
  • Separate objects/statements must be created for DML operations that already exists for the main table
  • It isn’t intuitive in terms of troubleshooting code. In fact, if you start down this road you may end up with one of those horrendous applications that have layers upon layers of views ultimately leading to poor performance.

It does however reduce the cost of creating and managing a new connection, especially if the DML statements are a result of a query that originated from the snapshot. Additionally, the logic could reside in the same SP as the query, allowing for conditional logic.

Can you think of other scenarios, pitfalls or benefits of updating the main database via a database snapshot?

 

Posted in Thoughts | Tagged , , , , , , , | Leave a comment

“QUOTED_IDENTIFIER” causes Unexpected Query Plan for Persisted Computed Column query

Author: Shaun Tinline-Jones

Technical Reviewers: Thomas Kejser, Steve Howard, Jaime Alva Bravo, Kun Cheng, Jimmy May

Note: Validation for this post was performed in the SQL CAT Customer Lab on an HP Proliant DL580 G7, Intel Xeon Nehalem E7-4870 2.40 GHz 4 socket, 10 physical cores, 20 logical cores for a total of 40 physical cores, 80 logical cores; 1TB RAM.  SQL Server 2008 R2 was installed on a Fusion-io ioDrive Duo 1.28TB MCL using driver version 2.3.1

We recently engaged a Tier 1 Global ISV in our labs, where the objective was to achieve the highest Business Transactions for their application.

During the testing and optimizing we encountered an interesting behavior where we kept incurring a clustered index scan; irrespective of applying known techniques to generate an index seek. In this blog, I will share the reproducible steps that reveal the following observation:

“If the table has a persisted computed column*, the query optimizer will choose a clustered index scan over a clustered index seek.”

*This applies to the case, which is most common, where the compute expression doesn’t produce a constant.

Here is the code to create the table & SPs that we’ll refer to for describing this behavior:

IF OBJECT_ID('dbo.testComputedCol') IS NOT NULL
DROP TABLE dbo.TestComputedCol
GO

IF OBJECT_ID('dbo.GetResults01') IS NOT NULL
DROP PROC dbo.GetResults01
GO

IF OBJECT_ID('dbo.GetResults02') IS NOT NULL
DROP PROC dbo.GetResults02
GO

CREATE TABLE dbo.TestComputedCol (Col1 int PRIMARY KEY CLUSTERED, Col2 int)
GO

CREATE NONCLUSTERED INDEX NDX_TestComputedCol_Col2 ON dbo.TestComputedCol (Col2)
GO

SET NOCOUNT ON

DECLARE @Count int = 1;

WHILE @Count < 1000
BEGIN
INSERT INTO dbo.TestComputedCol (Col1, Col2) VALUES (@Count, @Count*10)

SET @Count += 1
END
GO

SET QUOTED_IDENTIFIER OFF
GO

CREATE PROC dbo.GetResults01
AS
SELECT * FROM dbo.TestComputedCol WHERE Col1 = 10
GO

SET QUOTED_IDENTIFIER ON
GO

CREATE PROC dbo.GetResults02
AS
SELECT * FROM dbo.TestComputedCol WHERE Col1 = 10
GO

To baseline the scenario we are describing, we run both SPs and confirm that we experience the index seek operation.

EXEC dbo.GetResults01

image

EXEC dbo.GetResults02

image

In our scenario we recognized the need to partition the data to avoid the last page insert contention issue and therefore implemented a computed column, similar to the below statement. While creating the column it is documented that the Quoted Identifier must be ON in order to create the computed column.

Side Note: For resolving the Last Page Insert Contention challenge, we are left with two choices, troubleshoot the reason for the differing behavior or change the insert behavior to include setting a value for the partition bucket. We chose to troubleshoot the issue. This blog is not striving to optimize the partition hash; however it’s important to recognize that resolving the Last Page Insert Contention through a computed column is not optimal.

In its literal sense QUOTED_IDENTIFIER is required to parse quoted strings as identifiers. A recommended practice is to rather use square brackets “[” as an identifier. Therefore it’s not intuitive that the QUOTED_IDENTIFIER setting influences queries against computed columns.

NOTE: When a table is created, the QUOTED_IDENTIFIER option is always stored as ON in the metadata for the table, irrespective of the current setting. However when creating a computed column, the setting must be set to ON.

SET QUOTED_IDENTIFIER ON
GO

ALTER TABLE dbo.TestComputedCol
ADD ComputedCol AS CAST(Col1%10 AS tinyint) PERSISTED NOT NULL

 
In our solution, prior to adding the computed column, a frequently called SP generated a query plan very similar to what you see below:

EXEC dbo.GetResults01

image
 

However after the computed column was added, executing the same SP generated the following query plan.

image

At first this may seem unexpected, and frustrating. However, some may infer that this is expected behavior, as described in article Creating Indexes on Computed Columns. Take note that this is an inference or a source of data points, as we did not add an index to the computed column. Does this happen if the SELECT and WHERE columns are not referencing the computed column? We modified the SP to follow the recommended practice of explicitly specifying only the required columns,

SELECT Col1, Col2 FROM dbo.TestComputedCol WHERE Col1 = 10

Unfortunately, you cannot shake the scan this way!

The only way to resolve the issue is to ensure the QUOTED_IDENTIFIER to ON during the execution of the batch. The challenge is finding the right place to ensure the effective setting at runtime.

Side Note: In this scenario a computed column will be the partitioning column, which requires a PERSISTED property. The above issue doesn’t occur if the computed column is not persisted.

The default value for QUOTED_IDENTIFIERS is ON; however it can be changed at a variety of points. Therefore you may not encounter the above issue, yet when it does occur it may not be intuitive to check the QUOTED_IDENTIFIER setting or even at which point along the path it has been changed.

The setting can be applied:

  1. As a connection attribute
  2. As a database option
  3. As an object attribute
  4. As a runtime SET option

In our case, we unexpectedly started incurring the index scan. We identified these index scans when the table partitioning strategy didn’t produce the benefits we sought. Troubleshooting revealed that when we ran the code (from the SP) as an isolated batch in a Query Analyzer window, we got the seek behavior. If we ran the SP in the same Query Analyzer window but in a separate batch, we got the scan behavior. When we ran the code using the osql utility, we got scan behavior for both batch executions. If we changed the computed column to a constant value, then we got the seek behavior.

The clue to finding a resolution is in the differences between the osql.exe and SSMS outcomes. These utilities have different defaults for connection attributes. The resolution lies in creating the stored procedure with QUOTED_IDENTIFIERS set to ON. An interesting observation is that it’s redundant to set the quoted identifier setting within the stored procedure; the setting at creation time will take precedence.

Let me share with you how ALM (Application Lifecycle Management) practices of the ISV caused this issue to slither its way into the solution.

NOTE: Keep in mind that there are many variations of what I’m about to describe

A developer creates a module, such as an SP, using a tool such as SSMS, VSTS, osql, sqlcmd, and others but most often SSMS. When the developer is done unit testing, a script file is checked into the source control system, below is an example of such a file.

CREATE PROC dbo.GetResults01
AS
SELECT * FROM dbo.TestComputedCol WHERE Col1 = 10
GO

The Build Process then generates a deployment script. There are a variety of ways that deployment scripts can be generated and run against the target SQL Server and database instance. The point in the above scenario is that the QUOTED_IDENTIFIER setting for this object is dependent on the current connection attribute of the deployment script generator.

If the deployment script does not specify QUOTED_IDENTIFIER setting for the objects, then the setting will be dependent on the connection settings when the script is run. The Build Process for the ISV leverages the osql.exe utility, which has a default setting of OFF. It’s not uncommon for build processes to use the SQLCMD.exe, which would cause the same behavior.

NOTE: If the connection does not set this attribute, then the database setting will take precedence. By default, ODBC and OLEDB set the option to ON. For DB-Library the default is OFF. Therefore it is very unlikely that the database default is ever enforced.

Tool

Default Quoted Identifier Setting

osql.exe

OFF

sqlcmd.exe

OFF

SQLCMD Scripts

ON

SSMS

ON

Below are T-SQL queries to identify current QUOTED_IDENTIFIER settings:

SELECT DATABASEPROPERTYEX(DB_NAME(), 'IsQuotedIdentifiersEnabled')
SELECT * FROM sys.databases WHERE is_quoted_identifier_on = 0
SELECT quoted_identifier, * FROM sys.dm_exec_sessions WHERE session_id = @@SPID

They each have cases where they are useful, though I particularly like:

SELECT * FROM sys.objects
WHERE OBJECTPROPERTYEX(object_id, 'IsQuotedIdentOn') = 0 AND TYPE <> 'S'

This last query reveals those objects that deviate from the recommended standard of having QUOTED_IDENTIFIER set to ON.

In conclusion…there are connection settings that can influence the behaviors of the queries and output. Therefore it’s important to standardize on a setting, the more aligned it is with the industry defaults the easier it is to adhere to the standard.

Historically, prior to computed columns, quoted identifiers were used merely distinguish strings from literals. It is recommended that literals are identified by square brackets, and strings are defined by single quotes. It is further recommended to keep quoted identifier setting to ON in all cases, even if it means explicitly adding this to script files before objects are created.

For our scenario, we conclude that it’s not intuitive that the reason behind the index scan is related to a quoted identifiers setting, therefore we standardize that the setting be ON and run the above queries to validate all appropriate objects have this setting to ON as part of validating a build deployment.

Posted in Thoughts | Tagged | Leave a comment

Creating sample Application that uses a RBS Provider

Summary

This is a simple application sample that leverages the RBS Provider. In this case, it’s using a Filestream provider but none of the code itself is aware what type of provider is in use.

This iteration does not attempt to use the more advanced, optimal writing and reading methods. These will come soon enough, for now a reader of this blog should be able to accomplish insights into creating a sample application to demonstrate RBS capability.

Objective/Context

I want to share the experiences of creating a sample application. My intent was simply to understand how a value can be written to the database, and then read from the database when the data must be accessed through RBS. Historically, even when I implemented a Filestream solution, I was able to meet all my programmatic needs through T-SQL. This allowed me to focus on the database design, and infrastructural requirements.

RBS is forcing me to recall very rusty memories about C# coding, go beyond simply reading code and deriving the intent. As I’ve said, my first iteration is to get a record into the database and read it again. I will follow this up with the recommended pattern for optimal performance, which I believe will require a deeper understanding of how buffering plays an important role in the streaming of blob data. All this is to ultimately get to a point where we can scale the solution to have a 200MB/sec insert throughput, and a 100MB/sec read throughput. I’m expecting numerous concurrent connections, so it must do that too. The solution must be able to maintain these throughputs even when there are 100’s of millions of records.

Ok, there is more to be said about what I want the solution to achieve but for now, let’s just remain focus on the sample application.

Planning

I highly recommend that you read the whitepaper “Remote BLOB Storage”. It is 56 pages so you may be interested in first reading this 9-page review on it “Review & musings of whitepaper “Remote BLOB Storage””. The whitepaper is very informative, and reasonably recent enough to be trustworthy.

I referred to the following sample code “How To: Write a BLOB (Remote BLOB Store)” and “How To: Read a BLOB (Remote BLOB Store)”.

I chose a Windows Form project. This helps me visually understand concepts, and I can easily provide input parameters.

Nothing too intellectually challenging

On the first iteration, the writing and reading of the blobs uses basic logic and constructs. As mentioned under the objectives/context section, I will enhance the solution and therefore you can expect additional fields will appear on the form as the application gets more advanced, such as:

    • Ability to read and/or write multiple files
    • Ability to have a multiple connections simultaneously work through the multiple files
    • Ability to evaluate various file sizes, for instance should we be writing 4MB files
    • Ability to evaluate various buffer sizes when writing files, for instance should we be writing 4MB files in 1MB buffer sizes
      • I must remember to investigate the value of implementing Elastic Buffering

NOTE: RBS is managed code; therefore if you are writing in C++ unmanaged code, then you will have to access the RBS API using C++ Managed code.

Writing the code

At the beginning of this article, I referred you to another blog on this topic. There you will have learnt that you have to create the application’s database and BLOB store database prior to running the RBS.msi. Now before you can run this code, you have to:

  1. Enable the database to use RBS (mssqlrbs.rbs_sp_enable_rbs)
  2. Create the table that holds the BlobId column
    1. The column must be varbinary(64)
    2. You must index this column
    3. I suggest that you add other indexes that align with the type of queries you would run to get the BlobID. For instance, if you collect Blobs based on date then index that column.
  3. Register the column of this table (mssqlrbs.rbs_sp_register_column)

These are the database objects I created. This will do for now, but I caution that more attention should be taken.

CREATE TABLE [dbo].[BlobUserTable]
(
    [id]        uniqueidentifier NOT NULL,
    [FileName]  varchar(50)      NOT NULL,
    [BlobId]    varbinary(64)    NOT NULL
)

CREATE UNIQUE NONCLUSTERED INDEX [IX_BlobUserTable_BlobId]
    ON [dbo].[BlobUserTable] ([BlobId] ASC)

In order to use the sample you will require the following references in the C# project:

using Microsoft.Data.SqlRemoteBlobs;
using System.Transactions;
using System.Globalization;
using System.Data.SqlClient;
using System.IO;

There will be several cases where I need to connect to the database:

private string GetSQLConnString()
 {
     string sSqlServer = txtSQLInst.Text;
     string sDatabase = txtDBName.Text;

     string sConnection = string.Format(CultureInfo.InvariantCulture,
      "Data Source={0};Initial Catalog={1};Integrated Security=SSPI",
      sSqlServer, sDatabase);

     return sConnection;
 }

For the list box I made a call to the database to return all the values from the “filename” column. The design could be better, specifically in that the list should include a Primary Key value (hidden) which can be used when selecting which Blob to read.

private void PopulateListBox()
{
    //Get Connection String
    string sConnection = GetSQLConnString().ToString();

    //Prepare the query string
    string sQuery = "SELECT [FileName] AS [BlobName] FROM dbo.BlobUserTable";

    //Call SP to return list of Filenames and fill data set
    using (SqlConnection connection = new SqlConnection(sConnection))
    {
     SqlCommand SQLCmd = new SqlCommand(sQuery, connection);

     connection.Open();

     SqlDataReader reader = SQLCmd.ExecuteReader();

     //Empty the list box
     lbxBlobInDB.Items.Clear();

     //Add them to the list
     while (reader.Read())
     {
         lbxBlobInDB.Items.Add(String.Format("{0}", reader[0]));
     }

     //Close when done reading
     reader.Close();

     connection.Close();
    }
}

This is called as soon as the form launches

public Form1()
{
    InitializeComponent();

    PopulateListBox();
}

You can type the path to the location of the file, and I added the ability to browse using the “openFiledialog” object to select a file. I’ve seen it fail with permission issues when I try to select from certain directories, therefore use files from an accessible location. I’m not running down that rat-hole for this sample.

private void lblBrowse_Click(object sender, EventArgs e)
{
    OpenFileDialog openFileDialog1 = new OpenFileDialog();

    openFileDialog1.InitialDirectory = "c:\\"";
    openFileDialog1.Filter = "txt files (*.txt)|*.txt|All files (*.*)|*.*";
    openFileDialog1.FilterIndex = 2;
    openFileDialog1.RestoreDirectory = true;

    if (openFileDialog1.ShowDialog() == DialogResult.OK)
    {
     try
     {
         txtFilePath.Text = openFileDialog1.FileName.ToString();
         txtFileNameInDB.Text = openFileDialog1.SafeFileName.ToString();
     }
     catch (Exception ex)
     {
         MessageBox.Show("Error: Could not read file from disk. Original error: "
                         + ex.Message);
     }
    }
}

There is an opportunity for the user to create a more identifiable filename for when the Blob is stored in the database. The default is derived from the file name.

When you click on the WriteBlob it will fetch the file that was selected, and write it to the SQL Server database, and of course send it to the RBS Filestream data store. For now I simply confirmed that there were no errors AND the Application’s database had a new record. Later on, I want to confirm that the small files are stored in the tables and large files in the filesystem. There should be no reason that this is not happening.

private void btnWriteBLOB_Click(object sender, EventArgs e)
{
    string sFilePath = txtFilePath.Text;
    string sFileNameInDB = txtFileNameInDB.Text;
    string sConnection = GetSQLConnString().ToString();
    byte[] blobId = WriteBlobPush(sFilePath, sFileNameInDB, sConnection);
}

rivate byte[] WriteBlobPush(string sFilePath, string sFileNameInDB, string sConn)
{
    byte[] blobId = null;

    //Open the connection to the database, Create and write the BLOB.
   using(TransactionScope ts = new TransactionScope(TransactionScopeOption.Required
        , TimeSpan.FromMinutes(10)))
   {
    using (SqlConnection conn = new SqlConnection(sConn))
    {
        conn.Open();

        SqlRemoteBlobContext blobContext = new SqlRemoteBlobContext(conn);
        using (SqlRemoteBlob blob = blobContext.CreateNewBlob())
        {
            using (FileStream fileToRead = new FileStream(sFilePath, FileMode.Open))
            {
                blob.WriteFromStream(fileToRead);
            }
            ts.Complete();

            //Update the application user table with the
            //returned BLOB id and the filename.
         using (SqlCommand cmd = new SqlCommand(string.Format(CultureInfo.InvariantCulture,
           "INSERT INTO BlobUserTable(id, FileName, BlobId) Values('{0}', '{1}', @blobId)",
                Guid.NewGuid(), sFileNameInDB)))
            {
                cmd.Parameters.Add(new SqlParameter("@blobId", blob.GetBlobId()));
                cmd.Connection = conn;
                cmd.ExecuteNonQuery();
            }
        }

        conn.Close();
    }

    lbxBlobInDB.Items.Add(sFileNameInDB);
  }
    return blobId;
}

You should see the listbox get one more record.

You can select an item from the listbox and then select “Read Blob”, you should see the content of your file appear in the box.  Be cautious because the code fixes the length at 256 bytes, which will be handled in my future enhancements. Additionally, pay careful attention to the following scenario (I’m still determining if this actually happens), the scenario is where you store the filename in the database such that it contains the path to the file. When the function refers to it, and the path doesn’t exist then an error will be generated.

private void ReadBlobPull(string sFileNameInDB, string sConn)
 {
     using (TransactionScope ts = new TransactionScope(TransactionScopeOption.Required
            , TimeSpan.FromMinutes(10)))
     {
         using (SqlConnection connection = new SqlConnection(sConn))
         {
           connection.Open();

           byte[] blobId;

           using (SqlCommand cmd = new SqlCommand(string.Format(CultureInfo.InvariantCulture,
               "SELECT BlobId FROM BlobUserTable WHERE FileName = '{0}'", sFileNameInDB)))
           {
               cmd.Connection = connection;
               blobId = (byte[])cmd.ExecuteScalar();
           }

           //Read the BLOB directly from the store to the FileStream.
           SqlRemoteBlobContext blobContext = new SqlRemoteBlobContext(connection);
           using (FileStream readStream = new FileStream(sFileNameInDB, FileMode.Create))
           {
               using (SqlRemoteBlob readBlob = blobContext.OpenBlob(blobId))
               {
                   byte[] content = new byte[256];
                   readBlob.Read(content, 0, content.Length);
                   txtCurrentBlob.Text = Encoding.Default.GetString(content);
               }
               lblCurrentBlob.Text = sFileNameInDB;
           }
       }
   }
 }
In summary, the experience has ensured that I could focus on the fundamentals, learn a few nuances and prepare an environment where the next step allows for more attention to be spent on optimizing RBS specific access patterns.  Most importantly this was fun, and I can’t wait to start working on the buffer logic.
Posted in Work | Tagged , , , | Leave a comment

Review & musings of whitepaper “Remote BLOB Storage”

This is a personal reflection of the 56-page whitepaper “Remote BLOB Storage

Right off the bat, I should let you know that when it comes to RBS I’m always looking for content that is helpful for implementations that don’t relate to Sharepoint.

This paper promised to appeal to people like me. I believe it did a good job in meeting that objective.

The leading paragraphs “The Case for RBS” is ok, and one could nitpick when you see sentences like “If a given SQL Server database would grow to 500GB without RBS enabled, then RBS would be a beneficial option”, but that’s not worth getting upset about because we simply assume that they mean that most of the data might qualify as RBS candidates. It’s often that you’ll hear about the benefits to “Backup & Restore” that RBS offers, such as the whitepaper “SQL Server RBS Performance with SharePoint Server 2010 and StorSimple Storage Solution” states, however it’s very important to understand that no matter where you move the data to, it will inevitably need to get backed up. So statements that say “Backup is now under 1min, whereas previously was 30min” are misleading (Not stated here, but to some degree is inferred).

I really enjoy the paragraph “When Storage Tiers Need to be implemented”, as I believe it can be useful. What is not been said is that if data qualifies to be on different storage tiers based on age, then at some point it needs to be moved. Unfortunately I have not seen any BLOB storage feature, filestream, RBS or FileTable that can leverage set-based mass data movement. They inevitably all require row-by-row mechanisms. Nonetheless, RBS definitely makes available the option to tier data storage far easier than the other BLOB storage options.

Now we get to the primary reason I was reading (this time round) the whitepaper, namely to get advanced knowledge of the installation steps, and write my first sample.

I initially didn’t pay attention to the key word “LOCAL”. This will impact your understanding, and interestingly it made me realize a potential design. It’s possible to install RBS Filestream in the same database as where the application’s data resides, essentially simply creating a filegroup. WOW! That means one could install RBS Filestream provider solution without requiring additional databases…interesting Thinking smile

The document then describes the different components that get installed. In short, you will have to install some components at the application servers, others at the meta-data server, and others at the BLOB store server.

This is how I think it is aligned:

  • RBS Library (Client) and Filestream Provider (Client) –> Application Server
  • RBS Server Component –> Database Server hosting the Application’s data
  • Filestream Provider (Server) –> Database Server hosting the BLOB store
  • Maintainer –> I could be wrong, but I think it’s best to find this on the Database Server hosting the application’s data

NOTE: I installed everything on a single laptop, and used the REMOTE configuration to have the BLOB stored in a dedicated database. I plan to have multiple servers in the new server, and will likely update this review then (or create a separate blog)

Throughout the document there are references made that I thought could create confusion, so here’s my take on the synonymous references:

  • Application’s database = Content database; Meta-data Database; RBS Enabled database
  • BLOB Store = Location where the BLOB data can be found, in our filestream case it’s the filestream enabled database.

I found the detail about the maintainer, as it pertains the Pool sizes and such, presented a little too early in the document. A pool refers to the number of records the maintainer will select per orphan cleanup iteration. I was perplexed and still don’t know why a data table is created per pool, however I know that you must not confuse this with the concept that manages how many objects are stored in the NTFS directory.

NOTE: NTFS will allow billions of documents to be stored on a volume, and within a directory. Customers and product teams have recognized that too many reduces the performance of traversing these files and a general guideline of ~300,000 objects (files) is offered up. I’ll refer to this later on in this blog.

The document goes on to describe HA options, which are fine but keep in mind that Denali offers Database Mirroring support and integration with the AlwaysON feature. Very Cool!!!

The process flow catches me every time. In simple terms I would conclude that an application makes a call to the RBS API to retrieve the BLOB, and several activities occur behind the scenes to return the BLOB. I’m focusing on the installation concepts, so I browse over this for now.

NOTE: I later learned quite a bit about this step, so expect a blog on this at a later date.

What’s important to note is that the application will make a call to the Application’s Database to first retrieve the BlobID, and then when reading or writing a call is directed to the Blob Store database.

BLOB deleting is interesting, though it doesn’t impact installation considerations. It will impact data administration.

Aaahhhh…..now we find those “Do before you run RBS setup” steps

  • Enable Filestream for the SQL Server Instance
  • Create the databases… yes you must create the RBS Blob Store database too
  • Create filegroups and files
    • You can define the max size limit for the filestream directories
    • Don’t get confused about the filestream references found in the DB Options. These are related to FileTable
  • Create a Master Key for each of the databases
    • I haven’t quite worked out when and if I will be using a certificate in a trusted domain where all SQL Server Instances will be connecting through trusted connections. Additionally there doesn’t appear to be any reason for cross-database T-SQL queries. Nonetheless, RBS requires a master key otherwise it will not install

NOTE: I imagine that at some point, a naming convention and more thought is required about the configuration, but I’m going to leave the crossing of that bridge for another day.

Ok, that’s it. Now where do we get the RBS setup file? Historically this was found as part of a separate download, namely the feature pack. Now you can find the rbs.msi file from the install directory, for English edition it is

%Installdirectory%\SQLFULL_x64_ENU\1033_ENU_LP\x64\redist\RemoteBlobStore

The document makes a few assumptions, so while I click “Next” most of the time, I’m very thoughtful about filegroup destinations. Default will imply “Primary” for the internal tables, and your first filestream filegroup for the BLOB data.

For the initial setup, “I simply want to install” approach, the document covers more than I need, namely all the command line stuff. With that said it is insightful to work through each option as it is easy to relate to the GUI and certainly improves understanding. At some point, you realize that you can have multiple provider instances sent to the same database…WOW!!…that would make for a complex design. I would probably avoid that design unless there was a very compelling reason to combine multiple filestream providers into a single database.

The advanced configuration settings are great, and one of the reasons that make RBS a viable alternative than a standard filestream implementation. I mentioned earlier that the general guideline is to not have more than ~300,000 files in a single directory.

NOTE: I was looking for a Windows Server 2008 NTFS filesystem limit to the number of files per directory, but couldn’t find any. So the guideline is from a Windows Server 2003 article – “How NTFS Works

The “Pool Capacity” is where RBS will manage this in the background. The default setting is 200,000. What the document fails to reveal is that this setting will not directly correlate to the number of files found in the directory, in some cases there could be many more files, this can ocucr because the update & delete modifications create a scenario where there are more files than the tables would indicate. An update in filestream, creates a new file, a delete is a deferred operation. After backups and garbage collection have taken place, then yes the files in the directory will reduce.

It’s great to see the RBS service handle the in-band and out-band concepts. There is an optimal size for when objects should be written using the Win32 API, and this occurs around the 1.2MB size. Until this point, TDS protocol is a more efficient for reading and writing BLOBs. The objects are still stored in the filesystem, but access to them is handled differently. This is configurable and occurs transparent to the application.

Another fantastic configuration is the inline BLOB store. This will make sure that those filesystem performance debilitating small files are stored directly in the database, without the application needing to worry about where they are been stored. The paper doesn’t expand on where they are stored other than “in the mdf file”, so we can only assume that they are stored in the rowdata filegroup defined during setup. If these files account for large amounts of data, then backup & restore may become unwieldy, of course at 61KB recommended value you would need to have > billion small files before worrying about that.

The write allocation size is an interesting concept, which the document fails to expand upon. This is an unfortunate gap. I would like to know more about its value proposition. Maybe there isn’t one, because any updates create a new file, so pre-allocating has no gain and may in fact cause degradation but is there a performance gain when creating a file of 1MB, and then streaming the bytes in? I don’t know, and hope to find out going forward.

I didn’t find the Maintainer, XML Configuration Files, multiple instance deployments, and administrative views particular useful at this stage, but took a mental note that this is a place to come to in the future. The backup & restore is discussed, but really it’s pretty simple, but dangerous. You stop the maintainer. You backup the application’s database first, and then you backup the blob store. When you restore, the maintainer must not be running until everything is restored. The diagrams say restore the Blob store first, and that you can start the restore of the content database but that simply amuses me as the content database will (typically) complete within minutes. So I am currently of the mind that you can restore whenever you want, but will not have access to the blobs.

It’s my bet that the following is the best restore approach…though don’t be surprised if I get killed by those more knowledgeable than Nerd smile

    1. Restore Primary Filegroup of Blob store as a Piecemeal Restore
    2. Restore RowData Filegroups of Blob store as a Piecemeal Restore
    3. Make the blob store database available to be accessed
    4. Restore the Content (Application’s) database

I would also be interested to evaluate that the blob store database can have their filegroups turned into Read-Only. Right now I don’t think we can make new records go to a new filegroup, but I do know that we can make new records go to a new blob store. As soon as a blob store is no longer getting populated or modified, then maybe we can make the database read-only (that would be awesome) or at least make the filestream filegroup read-only.

The next section in the document is all about Sharepoint. It’s worth a read, as we think about what the custom application will be capable of doing but for the most part I find the Sharepoint application imposes interesting restrictions.

So you think you can use RBS now….unfortunately not Disappointed smile

Before you can use RBS, you have to do additional steps. These are described in “How To: Install (Remote BLOB Store)”:

  1. Enable the Application’s database for RBS by running SP mssqlrbs.rbs_sp_enable_rbs
  2. Create a user table and enable the column running SP rbs_sp_register_column
    1. Remember to index the column that is been registered

NOTE: Unfortunately the document doesn’t provide recommendations or guidelines as it pertains to indexes, or other considerations. Therefore I know that I must investigate what are the experiences when you have billions of rows.

The document misses the above, yet it goes into detail about the GAC and configuration files. I struggled to get the references in my custom application sample, and I can’t really comment on how much of the GAC and configuration file recommendations actually assisted in resolving the issue. In the end I had to add the Microsoft.Data.SqlRemoteBlobs by browsing for the DLL.

I appreciated the insights into the various approaches to reading and writing the blobs. I would’ve liked the content to include transaction concepts, and guidelines. There is an example that demonstrates writing blobs using a buffer, however no insights about buffer sizes or what one should consider when developing a solution and trying to establish the appropriate buffer size.

I definitely don’t want to enlist a Distributed transaction, and therefore need to know why the transaction scope is required.

I’ve created a basic sample based on the articles “How To: Write a BLOB (Remote BLOB Store)” and “How To: Read a BLOB (Remote BLOB Store)”, I can’t wait to create samples using the other techniques to write and read blob data.

The document did a short blurb on creating a provider, not really an interest area for me right now, so it didn’t get much attention.

In summary, it was a good paper with opportunity for more information and areas that may qualify as redundant. Most importantly, and what I would consider a reason for a HIGH VALUE rating from me, is that it refrained Sharepoint orientated, which has made other documents mostly useless to me.

Posted in Books & Papers, Work | Tagged , , , , | 3 Comments