DBPedias

Your Database Knowledge Community

Kendra Little

  1. It’s a Lock: Due Diligence, Schema Changes, and You

    It’s morning standup, and someone says, “It’s no big deal, we just need to add a couple of columns. It’s already in the build, it works fine.”

    The next time this happens, stop and say, “Let’s take a closer look at that.”

    When you write schema changes for a relational database, take the time to investigate what your change will do at a granular level.

    Schema changes matter. A change that seems simple and works perfectly in a small environment can cause big problems in a large or active environment because of locking issues: it’s happened to me. Let’s keep it from happening to you. The good news is that even though the impact may be very different on an active or large system, you can do your investigation on a pretty small scale– right at home in your test environment with sample data.

    What Could Possibly Go Wrong?

    If you’re remembering to ask this question when you do any schema change, you’ve got the critical part down already. In this post, I’ll help show you how to find the answer.

    Here’s what you’ve already done:

    • Written the code for the schema change
    • Tested the schema change within the database
    • Tested the schema change for calling applications
    • Tested the schema change for any remote procedure calls or replicated subscribers

    If not, head back and do those first.

    Why Does Locking Matter?

    SQL Server’s engine uses locking to isolate resources and protect current transactions.

    Schema changes require the use of different kinds of locks. Data Definition Language (DDL) changes all require schema modification locks (SCH-M), too. As Books Online explains:

    During the time that it is held, the Sch-M lock prevents concurrent access to the table. This means the Sch-M lock blocks all outside operations until the lock is released.

    When you’re partying with schema, you’re partying alone. The other critical thing to know is that Sch-M locks aren’t the ONLY locks required for your change: you may need locks on objects related by constraints, or referenced by your object definition.

    With multiple locks in play, these are going to come in a sequence–and this quickly introduces lots of opportunities for blocking and deadlocks in your schema change.

    What to do? Make it a practice to look at the locks required by your schema change.

    Method 1: Lock Investigation with sys.dm_tran_locks and sp_whoIsActive (Quick and Dirty)

    SQL Server’s dynamic management objects provide the most convenient way to check out locks.

    I’m a big fan of Adam Machanic’s sp_whoisactive stored procedure. It’s incredibly quick and easy to use it to grab information on locks:

    • Open a connection to a test system where sp_whoisactive is installed;
    • Begin a transaction;
    • Run a command to make a schema change;
    • Run the following from another connection to look at locks that are being held: sp_whoisactive @get_locks=1
    • Roll back or commit your transaction.

    Sp_whoisactive will return a nicely organized XML summary of all the locks that are currently being held by querying the sys.dm_tran_locks view– with all the associated IDs translated into meaningful names for you. Adam has recently blogged about this here.

    Strengths: This is great for convenience and an initial quick look at a situation. I’ve found lots of interesting things just using this method against a test database. This can also be incredibly useful in gathering information from a production situation where there is contention.

    Warnings: Using an open transaction and sys.dm_tran_locks won’t show you all locks. Even though a transaction is open, the Database Engine will release some locks as soon as it is able to when doing a schema change, and you’re unlikely to see those in test. Also, As Adam notes in his documentation, it can be time consuming to use @get_locks in some situations. Supervise your performance when you use this option to look at active locks in production.

    Method 2: Trace Your Locks (Messy)

    You can look at lock granularity in SQL Server Profiler, or use Profiler to generate a script to run a server side SQL Trace.

    It’s easy to set up and run Profiler against a test environment, but you’ll quickly notice that tracing locks generates LOTS of rows, and there’s lots of IDs for you to translate to figure out exactly what’s being locked.

    To help work with the results, you can stream your Profiler output to a database table. Even better, you can use a server side trace to write to a file, then import it into a database with FN_TRACE_GETTABLE, then use the sys.trace_events and sys.trace_subclass_values to translate the profiler events and lock modes. Once the trace is in the database, you can resolve all those object IDs programmatically, which is great.

    Strengths: This is the best method for looking at locks on test systems prior to SQL 2008. You can programmatically automate the tracing and import and analysis of the data, which is great.

    Warnings: This method is suitable for test systems only– profiling locks in a test system with low activity can generate a large amount of rows, and doing so on a production system can be catastrophic. You’ll also find that events happen very quickly even in a test system, and matching up pairs of starting and finishing lock events can be tricky.

    Method 3: Extended Events (Difficult, but Fulfilling)

    Extended Events are the best way to handle this task on SQL Server 2008 instances and higher. Looking at locks in a test environment is a perfect way to get to know X Events– this is a technology you want to become familiar with!

    To use this method, you’ll create an event session and add events to the session. Events to look at locks include lock_acquired, lock_released, sql_statement_starting, and sql_statement_completed.

    You’ll need to select a target for your event session. I personally prefer to use an asynchronous file target if I’m looking at lock events in detail. Note that if you select the ring buffer target, it will clear for your session after you stop collecting events. (You can drop events from an active event session to stop collection, but I find it simpler to write to the file target.) Even against a test system, I prefer to collect only the data I need when looking at locks. This simplifies interpretation, limits my footprint on the instance, and allows me to share and review the data at a later time.

    Once this is configured, you simply start your session against your test system, then run your schema changes in another connection. You then stop your event session and query your results. If you used a file target, sys.fn_xe_file_target_read_file will read in the file for you so you can query the data, which is stored an an XML format.

    Strengths: Extended Events are more flexible and perform better than SQL Trace. You can use different targets, such as the bucketizer target to count occurrences, and you can also provide filters to collect only specific events.

    Warnings: Using Extended Events takes a bit of preparation and research. We don’t have a GUI application from the SQL Server team to configure or manage XEvent sessions yet. With some reading and testing, you don’t need one: just allow yourself time to explore and code the right solution. To get started, I recommend watching Jonathan Kehayias’ presentation Opening the SQL Server 2008 Toolbox – An Introduction to Extended Events from SQLBits VII

    Where to Go from Here

    Pick the test solution to evaluate locks that’s going to work best for your team. Pick the solution that you can automate, make repeatable, and use consistently. Regularly run this against test systems to analyze locks needed by schema changes, and talk about what the possible repercussions of your changes might be.

    Even if you decide that you want to use a quick method using sys.dm_tran_locks which won’t show you every single lock, using this tool is much safer than not checking at all. I’ve found lots of interesting gotchas this way.

    Remember: figuring out how to make it easy to figure out what locks will be required isn’t the whole story. You need to analyze and interpret those results for each change.

    If you take the time to think about locks you will find out ahead of time which changes are safe to run mid-week and which changes need to be carefully scheduled or rewritten. Add this to your practices, and I promise you’ll avoid some post-mortem meetings for changes gone wrong.

    As a bonus, mastering these techniques will make you a superstar at troubleshooting blocking issues and deadlocks. And that’s a great thing to be.

    ...
    If you like our posts, you'll love our free webcasts every Tuesday. Register now.

    Kendra Little

    Kendra specializes in high availability and performance tuning. She is a Microsoft Certified Master in SQL Server-- the highest technical SQL Server Certification available. Kendra loves databases and software development more than long walks on the beach. Those cartoons in her blog posts? She draws 'em all. Read more and contact Kendra.

    Website - Twitter - Facebook - More Posts

  2. Do you work with data? You’ve always got options.

    She's ready for the future. Are you?

    In preparing for the SQLPeople event, I thought about the role, motivation, and techniques of a “knowledge worker” in today’s society.

    I found Peter Drucker at the library. Drucker lived from 1909 to 2005 and was one of the original writers about management. He was a student of how people live and interact, and he thought deeply about the huge social change of the rise of the modern corporation. Starting his first job with quill and paper, Drucker lived through the era of computing and technology, and studied, wrote, and talked through it all.

    You are not interchangeable

    Drucker wrote about specialized knowledge workers. He saw that as our labor force becomes more specialized, workers no longer have identical capabilities. Years of training and experience are required for jobs in many different fields. Proficiency with technology is becoming an entry requirement for almost all labor in the western world.

    In technology, and working with data management and software development, the breadth of technologies available is creating a highly individualized workforce. A worker can have breadth across different products, working with open source and proprietary platforms.

    Or a worker can specialize on a suite of products, focussing in on data storage, processing, and optimization, or business intelligence. Expertise can be gained in operating systems, storage subsystems, or hardware. Schema design, access methods, techniques for scaling up or scaling out, data redundancy and business continuity— it goes on.

    The good news is: if you work with data, you are not interchangeable.

    In fact, if your’e working with data you have a huge amount of power. You’re in a field which is highly mobile and desired by multiple types of corporations. You can span different industries, and your skills can adapt and change over time.

    You must be ready for change

    Today, you can work for a company on another continent– and companies in your country can hire workers from other continents. Although workers are not interchangeable, workers are more mobile, and relocation is no longer required.

    This introduces fluidity and change into the workplace. As the corporation has grown and evolved, the relationship of workers to the corporation is changing– workers no longer expect to work for a single corporation for most of their adult life.

    Instead, we need to do our best work for our company, and also simultaneously expand our reach with technology. Change will be initiated by either your employer or yourself: in many ways it does not matter. What is important is if you have the resources, and confidence, to adapt quickly.

    Learning by doing

    In 2003, Drucker predicted that education would change, and we would come to view the two most important periods of education in people’s lives as the early childhood period, and the adult period.

    Although Aristotle viewed knowledge as being separated from action, in the Ethics he mentions that

    For the things we have to learn before we can do them, we learn by doing them, e.g. men become builders by building and lyreplayers by playing the lyre; so too, we become just by doing just acts, temperate by doing temperate acts, brave by doing brave acts.

    In technology, we find that learning is similar to these things– you may learn foundational skills in the classroom, but you learn to build large scale data repositories by building them. You learn to scale up a high traffic website by building it– depending on where you start from and where you’re going, you may build it several times. We learn a huge amount from experience, and we are constaintly expanding our experience and refining our opinions.

    This is related to courage. It takes courage to suggest a significant change or a redesign, to explore a new area, to start a new venture.

    See also: Jeff Atwood on Quantity Trumping Quality.

    Decisions

    The commonality I see between Aristotle and Drucker lies in decisions.

    Building software and managing data requires a constant stream of decisions. No single decision is irreversible, but many decisions are very time consuming and difficult to change later. We have a responsibility to make decisions well and to act well in our teams, but at times we are required to make decisions quickly.

    Making good decisions– acting well as a technologist– requires practice.

    What inspires me: the field of opportunity

    The field of opportunity in technology is now global.

    Step back from your daily life for a moment. Read about what life was like 100 years ago, 300 years ago, or farther.

    Think about the vast changes sweeping the world, and the ways you have to take part in those changes.

    If you’re already working in technology in any position, realize that you have huge power and potential to change your own life– far moreso than people who haven’t broken into the field.

    Let’s go out there and build something.

  3. Andy Leonard’s great idea: SQLPeople events

    Not too long ago, Andy Leonard (blog|twitter) dreamed up the idea to create the SQLPeople community. The community is forming around the stories and ideas of its members. The SQLPeople website shares stories.

    I spoke at the first SQLPeople event!

    And now SQLPeople events are starting: the first was held on Saturday, April 9 in Richmond, VA.

    I am proud to have been among the speakers– it was a thoughtful and inspiring event. Other speakers included:

    SQLPeople events are unique.

    Andy simply posed a question to speakers: What inspires you? What is your vision?

    What a refreshing question! I love speaking about technical, practical topics, but being asked to step back and take a look at the larger view was a great reminder to also look at the long view. It was a challenge to think even bigger.

    The format for the event included talks, interviews, videos, and lots of conversation with the audience.  It was great to have a single track and be able to hear everyone’s talk with the same group– and it was fantastic to see the different things that inspire us, and also the common threads that run through it all.

    If there’s one way to describe what made this day different, it’s that it was not directly focused on training, it was focused on thinking.

    It was an energizing day.

    Learn more

    If you like to think about data, check out SQLPeople.net, and attend a future event.

  4. To Do: Win Great SQL Training and a Cruise

    Wondering what to do when you get a break from work email today?

    Make your opening move in a fun contest.

    What’s to win?

    The prize is tempting booty: Idera Software is giving away a free trip to SQLCruise Alaska. And not just part of the trip, it’s the full meal deal: a 7-day cruise for two from Seattle to Alaska (departs from Seattle, WA on May 29th, 2011, returns June 5th, 2011), one seat in the SQLCruise training taking place aboard the cruise, and airfare for two to Seattle (up to $1,500).

    Wow. Did you notice that “cruise for two” and “airfare for two” part? You get the training AND you get to take along a friend or loved one. Pretty rad.

    The agenda’s been posted for the cruise— it’s going to be unique and amazing. The cruise is full of great technical content and training to build your skills and work with you to find practical changes to improve your work environment. I love how the agenda is laid out to change up the pace and keep everyone thinking and learning.

    Check out the awesome training on the Alaska SQLCruise here.

    How to enter

    Head on over to the contest forum, which also has all the official rules for entry.

    There’s a quick registration step (it’s painless, I did it myself), and then here’s how you enter:

    • Tell us about your SQL victory. Post an entry describing what horrid SQL beast you encountered and how you vanquished it.
    • Post a picture of yourself looking victorious (extra credit for Viking helmets and/or attire)
    • Include the phrase “I VANQUISHED THE BEAST!” in your entry.
    • Share your submission with your friends.

    Worried you might submit now and come up with a better idea later? It’s ok, you can enter more than once, and each entry will be judged individually. There’s no reason not to dive in now.

    Tell your story

    Getting fitted for my judge's wig

    I love this contest because it’s about STORIES, and it’s open to everyone with a story to tell.

    Tell a story about your victory. There’s so many people out there in the SQL community who have great passion and do so much. There’s so many of you who love to write and share your experiences.

    Think about the things you’ve done to save the day– in the office, on the forums, maybe even on… Twitter???

    I’m a judge!

    I’m honored to be one of five judges for the event. I look forward to reading about your SQL Victories– let’s hear it!

  5. I laughed, I cried, it was better than CATS: The Fast Track Data Warehouse 3.0 Reference Guide

    Careful what you say about cats, lady.

    You know what’s crazy?

    A comprehensive, technical, well thought-out, and ENJOYABLE document. One written with the occasional interesting diagram and a reasonable use of acronyms, with effective tables and practical advice. A document that’s written for a human being which has helpful links to supporting documentation, but still makes you think.

    Don’t get me wrong– a lot of people write very good documentation. And Microsoft publishes a really large volume of helpful information.

    It’s just remarkable when you find great documentation that is technical, covers a lot of ground, and yet is very readable.

    But I found some! It’s the Fast Track Data Warehouse 3.0 reference guide.

    But I don’t have a Fast Track Data Warehouse…

    Doesn’t matter, it’s still a really good read. You should read this document if:

    • You are interested in SQL Server
    • You are interested in Data Warehouses
    • You are interested in technical writing

    Along the way, this document talks about everything from categorizing workloads, startup options for data warehouses, Resource Governor, creating and configuring filegroups and managing fragmentation, determining optimal table structure, statistics, compression, loading data, benchmarking, and validation. That’s a lot of ground, and a lot of it is useful to think about for a wide variety of systems.

    Example: I came across this document while searching for specific use cases for a partitioned heap. The document talks about considerations for large partitioned objects in data warehouses, and when a partitioned heap might be appropriate vs a partitioned table with a clustered index– it does it quickly, neatly, and thoughtfully.

    The most interesting thing

    This isn’t a sales document, but it makes me want a Fast Track Data Warehouse.

    It makes me feel like a Fast Track system is for smart people– after all, smart people took a whole lot of time to write this doc. It just FEELS smart.

    To me, that’s really impressive– if I can write a little bit more like that every day, I’m moving in the right direction.

    DISCLAIMER: the title to this post does NOT refer to the SQL CAT team, who produce some pretty freaking amazing documents. Instead, it refers to free range clip-art cats, which may or may not have pianos pictured as falling on them in my slide decks. And possibly a Broadway musical which I never saw. And probably to SNL.

  6. Dynamic Management Quickie: Exploring SQL Server’s system views and functions as you work

    Note: If you like this post, here’s one you’re going to like even better! Check out DMV/DMF Info Just A Couple Clicks Away by Brad Schultz.

    There’s a lot of dynamic management and system objects to keep track of in SQL Server.

    We all sometimes have the moment when we can’t remember exactly which DMV, DMF or other system view/function returns a particular column, or if something even IS accessible from the system objects.

    When this happens, remember that it’s easy to query system object and column names. Sys.system_columns and sys.system_objects are here to help.

    Exploring the system views and functions yourself will also help you find new things.

    This example shows all the system views and functions which are likely to have to do with CPU:

    SELECT
    SCHEMA_NAME(SCHEMA_ID)+ '.' + o.Name AS SysObjName,
    o.type_desc,
    c.name AS ColumnName
    FROM sys.system_columns c
    INNER JOIN sys.system_objects o ON o.object_id=c.object_id
    WHERE c.name LIKE '%cpu%'
    

    I like to use this version of the query, which includes the URL to look up more about the DMV. I like to use the browser in SSMS itself to look these up, so I include the shortcut for that in the header.

    SELECT
    	SCHEMA_NAME(SCHEMA_ID)+ '.' + o.Name AS SysObjName,
    	o.type_desc,
    	c.name AS ColumnName,
    	'http://social.msdn.microsoft.com/Search/en-US/?Refinement=117&Query=' + SCHEMA_NAME(SCHEMA_ID)+ '.' + o.Name AS [Help! Ctrl + ALT + R to open web browser in SSMS]
    FROM sys.system_columns c
    INNER JOIN sys.system_objects o ON o.object_id=c.object_id
    WHERE c.name LIKE '%page%'
    

    The output looks like this (click for a larger view):

    I’ve started using this lately since it helps me explore as well as remember.

    Just when you think you know everything about the system objects, you’ll find something new.

    Fun example: look at all the columns like ‘%page%’.

  7. How To: Automate Continuous SQLServer Activity with Stored Procedures and Powershell Jobs

    The Goal

    It’s often useful to be able to run a bunch of stored procedures in the background over a period of time against a test instance.

    This can be nice for:

    • Demos and presentations.
    • Populating DMVs with data you can slice and dice.
    • Learning to use things like extended events and server side trace (which are much more interesting with something to look at).
    • Testing a variety of automation scripts.

    This post shows you how to create several stored procedures for AdventureWorks2008R2 which will provide different result sets and have slightly different run times when run with a variety of parameters– in this case, individual letters of the alphabet.

    You can then run PowerShell commands which start jobs in the background. Each job runs a stored procedure and loops through all letters of the alphabet, providing each one as a parameter. You can set the job to do that loop a configurable amount of times (the commands are set to 100). In other words, as given, each stored procedure will be run 2600 times. Since you’re running multiple jobs and they’re all going asynchronously in their own threads, you’ll have a variety of commands trying to run at the same time.

    Optional: you can start the PowerShell jobs under different credentials if you need.

    Alternatives: In the past, I’ve typically done things like this with T-SQL loops (often with dynamic SQL) and multiple Management Studio windows. This works OK, but it’s a little time consuming to open each window, paste everything in (or open multiple files), and start them all up. I find it much more convenient now to use scripts.

    Step 1: Create Stored Procedures with a single alphabet-based parameter

    Let’s get one thing clear: these procedures aren’t designed to run optimally, and they aren’t coded nicely.

    You’ll notice these procedures have all sorts of problems. And that’s by design– my goals are for testing things around these, so it’s really a little better for me if they don’t play perfectly nice.

    In other words, these sure ain’t for production. :)

    /****************
    Jump in the kiddie pool
    ********************/
    USE AdventureWorks2008R2;
    go
    
    /****************
    CREATE THE SCHEMA
    ********************/
    IF SCHEMA_ID(N'test')  IS NULL
    	EXEC sp_executesql N'CREATE SCHEMA test AUTHORIZATION dbo'
    GO
    
    /****************
    CREATE Silly Stored Procedures in the Schema
    ********************/
    IF OBJECT_ID(N'test.EmployeeByLastName', 'P') IS NULL
    	EXEC sp_executesql N'CREATE PROCEDURE test.EmployeeByLastName as return 0'
    GO
    ALTER PROCEDURE test.EmployeeByLastName
    	@lName nvarchar(255)
    AS
    	SELECT @lName = N'%' + @lName + N'%'
    
    	select *
    	FROM HumanResources.vEmployee
    	WHERE LastName LIKE @lName
    GO
    
    IF OBJECT_ID(N'test.EmployeeByFirstName', 'P') IS NULL
    	EXEC sp_executesql N'CREATE PROCEDURE test.EmployeeByFirstName as return 0'
    GO
    ALTER PROCEDURE test.EmployeeByFirstName
    	@fName nvarchar(255)
    AS
    	SELECT @fName = '%' + @fName + '%'
    
    	select *
    	FROM HumanResources.vEmployee
    	WHERE FirstName LIKE @fName
    GO
    
    IF OBJECT_ID(N'test.EmployeeDepartmentHistoryByLastName', 'P') IS NULL
    	EXEC sp_executesql N'CREATE PROCEDURE test.EmployeeDepartmentHistoryByLastName as return 0'
    GO
    ALTER PROCEDURE test.EmployeeDepartmentHistoryByLastName
    	@lName nvarchar(255)
    AS
    	SELECT @lName = N'%' + @lName + N'%'
    
    	select *
    	FROM HumanResources.vEmployeeDepartmentHistory
    	WHERE LastName LIKE @lName
    GO
    
    IF OBJECT_ID(N'test.EmployeeDepartmentHistoryByFirstName', 'P') IS NULL
    	EXEC sp_executesql N'CREATE PROCEDURE test.EmployeeDepartmentHistoryByFirstName as return 0'
    GO
    ALTER PROCEDURE test.EmployeeDepartmentHistoryByFirstName
    	@fName nvarchar(255)
    AS
    	SELECT @fName = '%' + @fName + '%'
    
    	select *
    	FROM HumanResources.vEmployeeDepartmentHistory
    	WHERE FirstName LIKE @fName
    GO
    
    IF OBJECT_ID(N'test.ProductAndDescriptionByKeyword', 'P') IS NULL
    	EXEC sp_executesql N'CREATE PROCEDURE test.ProductAndDescriptionByKeyword as return 0'
    GO
    ALTER PROCEDURE test.ProductAndDescriptionByKeyword
    	@keyword nvarchar(255)
    AS
    	SELECT @keyword = '%' + @keyword + '%'
    
    	select *
    	FROM Production.vProductAndDescription
    	WHERE Name LIKE @keyword OR ProductModel like @keyword OR description LIKE @keyword
    GO
    

    Once you’ve got the procedures written, you just need to set up your PowerShell commands.

    Step 2: Create PowerShell Jobs to Run the Procedures in Loops

    These commands use PowerShell background jobs.

    Even if you don’t know PowerShell, if you look at these commands you can pretty easily pick out where the 1 to 100 loop is, where the a to z loop is, and what commands are being run.

    Since the jobs are running to create load in the background and I don’t care about collecting query results, I pipe the output all to Out-Null.

    </span>
    <pre>#test.EmployeeByLastName
    Start-Job -ScriptBlock {Import-Module sqlps; foreach($_ in 1..100) {foreach ($_ in [char]"a"..[char]"z") {Invoke-Sqlcmd -Query "exec test.EmployeeByLastName '$([char]$_)'" -ServerInstance "YOURMACHINE\YOURINSTANCE" -Database AdventureWorks2008R2 | Out-Null }}}
    
    #"test.EmployeeByFirstName"
    Start-Job -ScriptBlock {Import-Module sqlps; foreach($_ in 1..100) {foreach ($_ in [char]"a"..[char]"z") {Invoke-Sqlcmd -Query "exec test.EmployeeByFirstName '$([char]$_)'" -ServerInstance "YOURMACHINE\YOURINSTANCE" -Database AdventureWorks2008R2 | Out-Null }}}
    
    #"test.EmployeeDepartmentHistoryByFirstName"
    Start-Job -ScriptBlock {Import-Module sqlps; foreach($_ in 1..100) {foreach ($_ in [char]"a"..[char]"z") {Invoke-Sqlcmd -Query "exec test.EmployeeDepartmentHistoryByFirstName '$([char]$_)'" -ServerInstance "YOURMACHINE\YOURINSTANCE" -Database AdventureWorks2008R2 | Out-Null }}}
    
    #"test.EmployeeDepartmentHistoryByLastName"
    Start-Job -ScriptBlock {Import-Module sqlps; foreach($_ in 1..100) {foreach ($_ in [char]"a"..[char]"z") {Invoke-Sqlcmd -Query "exec test.EmployeeDepartmentHistoryByLastName '$([char]$_)'" -ServerInstance "YOURMACHINE\YOURINSTANCE" -Database AdventureWorks2008R2 | Out-Null }}}
    
    #"test.ProductAndDescriptionByKeyword"
    Start-Job -ScriptBlock {Import-Module sqlps; foreach($_ in 1..100) {foreach ($_ in [char]"a"..[char]"z") {Invoke-Sqlcmd -Query "exec test.ProductAndDescriptionByKeyword '$([char]$_)'" -ServerInstance "YOURMACHINE\YOURINSTANCE" -Database AdventureWorks2008R2 | Out-Null }}}
    

    Each command will start an asynchronous background job.

    Step 3: Manage Jobs (if needed)

    Once the jobs are running in the background, you may want to check on their status. You can do so by running:

    get-job
    

    if you want to remove a job from the list, you can use Remove-Job with the job number, or you can remove all jobs (whether or not they are running) with:

    Remove-Job * -Force
    

    If you want to see the output of a job, you can use Receive-Job– supply the jobnumber. If you’re troubleshooting and want to see errors, you probably want to remove | Out-Null from the command that starts the job, and use a fewer number of loops. Then you can can receive the job’s output and see any errors.

    Receive-Job JOBNUMBER
    
  8. Internals Matter: Why You Should Check Your Page Verify Settings, and I Should Go to Masters Immersion Training (SQL Skills Contest Entry)

    This post is about two things:

    1) Your Page Verification Settings are Critical

    You should confirm you are running with the correct page verification settings on your SQL Server databases. It’s very important, and I’ll show you why.

    2) I Should Attend the Awesome SQLSkills Master Immersion Event – Internals and Performance

    My scripts to demo the importance of page verification settings are part of  my entry to win free registration to the SQLSkills Master Immersion Event on Internals and Performance in February. Read more about the training here.

    Keep reading to see why I hope this helps me win.

    Let’s Start with Page Verification

    The default setting for the PAGE_VERIFY database option for SQL 2000 was TORN_PAGE_DETECTION. As Books Online explains here:

    When TORN_PAGE_DETECTION is specified, a specific 2-bit pattern for each 512-byte sector in the 8-kilobyte (KB) database page is saved and stored in the database page header when the page is written to disk. When the page is read from disk, the torn bits stored in the page header are compared to the actual page sector information.

    SQL 2005 introduced a new page verify database option, CHECKSUM. This was the default for new databases, but the page verification option was not automatically changed when databases were migrated or upgraded from SQL 2000 to SQL 2005. Books Online explains that when you run with CHECKSUM…

    the Database Engine calculates a checksum over the contents of the whole page and stores the value in the page header when a page is written to disk. When the page is read from disk, the checksum is recomputed and compared to the checksum value stored in the page header.

    It is quite common to find databases on SQL 2005 instances and higher still using the TORN_PAGE_DETECTION setting for page verification. When a change to the setting is suggested, people naturally ask if the extra protection is worth the small amount of extra overhead of computing the checksum.

    It certainly is! This post demonstrates why.

    To Play Along At Home…

    You may want to read my prior post on causing database corruption with a hex editor here. This post explains how to edit data in unsupported ways, and these techniques should never be used near a production server, or a database of any value at all. Make sure to wash your hands before and after using these practices, and having a change of clothes handy is also advisable.

    You’ll also need to download the XVI32 hex editor by Christian Maas. No installer is necessary: download the zip file, then unzip all files to a directory and run XVI32.exe

    Our Victim Database

    Let’s say we have a database with entry data for a contest. The contest is, quite coincidentally, the SQLSkills Master Immersion Event on Internals and Performance in February!

    Let’s create our database and schema for our contest entries.

    USE master ;
    IF DB_ID('CorruptMe') IS NOT NULL
        BEGIN
            ALTER DATABASE CorruptMe SET SINGLE_USER WITH ROLLBACK IMMEDIATE
            DROP DATABASE CorruptMe
        END
    CREATE DATABASE CorruptMe ;
    GO
    ALTER DATABASE CorruptMe SET PAGE_VERIFY TORN_PAGE_DETECTION ;
    GO
    USE CorruptMe ;
    --This schema might have an issue or two ;) Just play along.
    CREATE TABLE dbo.Contest (
        ContestId INT IDENTITY,
        ContestName NVARCHAR(256),
        CONSTRAINT cxContestId_Contest UNIQUE CLUSTERED (ContestId)
     )
    CREATE TABLE dbo.ContestEntry (
        ContestEntryId INT IDENTITY,
        ContestId INT REFERENCES dbo.Contest ( ContestId ) ,
        FirstName NVARCHAR(256) ,
        LastName NVARCHAR(256) ,
        ContestStatus NVARCHAR(256),
        CONSTRAINT cxContestEntryId_ContestEntryId UNIQUE CLUSTERED (ContestEntryId)
    )
     
    INSERT  dbo.Contest ( ContestName)
    VALUES ( '5-Day Internals and Performance class in Dallas, February 21-25')
     
    INSERT  dbo.ContestEntry ( ContestId, FirstName, LastName, ContestStatus )
    VALUES ( 1, 'Hello', 'Kitty', 'Win!' ),
    	( 1, 'PePe', 'LePeu', 'Loss' ),
    	( 1, 'Kendra', 'Little', 'Sigh' )

    It looks like the contest has already been judged. How did everyone do?

    SELECT c.ContestName, ce.FirstName, ce.LastName, ce.ContestStatus
    FROM dbo.Contest c
    JOIN dbo.ContestEntry ce ON c.ContestId=ce.ContestId

    Well, it figures. Hello Kitty can be pretty persuasive. Congrats, Kitty.

    But What If…

    What if something happened to the data? The database is running with TORN_PAGE_DETECTION– let’s see what happens if we change some of the values  in the physical data (mdf) file with a hex editor.

    First, take a look at the pages for dbo.ContestEntry:

    DBCC IND ('CorruptMe', 'ContestEntry', 1)

    On my instance, we have one data page in our clustered index (PageType1). It’s PagePID 147. Let’s see if we can make some changes on that page and if SQL Server will notice.

    To do this, take the database offline, and figure out your physical file name and your offset:

    USE master ;
    ALTER DATABASE CorruptMe SET OFFLINE ;
     
    SELECT physical_name FROM sys.master_files WHERE name = 'CorruptMe' ;
    SELECT 147 * 8192 AS [My Offset]

    Then run the XV134 Hex editor with administrative privileges, and open the file.
    Use Address -> GoTo and enter your offset as calculated above. (In my case: 1204224).

    Here’s the data as I found it on the page (click to view in a new window):

    And here is the data after I made quite a few changes in the hex editor, just by typing in the right pane (click to view in a new window):

    That couldn’t possibly work, could  it?

    Let’s put the database back online and see.

    ALTER DATABASE CorruptMe SET ONLINE
    GO
    USE CorruptMe
    GO
    SELECT c.ContestName, ce.FirstName, ce.LastName, ce.ContestStatus
    FROM dbo.Contest c
    JOIN dbo.ContestEntry ce ON c.ContestId=ce.ContestId

    This returns the following. The values on the page aren’t just in a different order, they’re different than the original values:

    What About When I Run the Next CheckDb?

    Let’s see:

    DBCC CHECKDB('CorruptMe')

    This comes back clean. An excerpt from the output is:

    What Would Happen if We Were Using Page Verify CHECKSUM?

    You can see this by running through the demo again. This time, use CHECKSUM as your database PAGE_VERIFY option, either by default or by running this after it is created:

    ALTER DATABASE CorruptMe SET PAGE_VERIFY CHECKSUM;

    After editing the page as described in the hex editor, you will be able to bring the database back online successfully. However, as soon as you read the records from the dbo.ContestEntry table, you’ll get an 824 error for the page we edited, like this:

    Msg 824, Level 24, State 2, Line 1
    SQL Server detected a logical consistency-based I/O error: incorrect checksum (expected: 0x54bc84f5; actual: 0×41298172). It occurred during a read of page (1:147) in database ID …

    CheckDb will also alert you to the error the first time it is run.

    The Moral of the Story Is: CHECKSUM is the Way to Go For Page Verify

    This isn’t about security– this is about you knowing if changes to SQL Server’s files are happening outside of its control, whether by corruption or any other means.

    It’s very easy to check if you’re using CHECKSUM. Take a look at your settings today, and make a plan to move to checksum for any databases which need it!

    SELECT NAME
    FROM sys.DATABASES
    WHERE page_verify_option_desc <> 'CHECKSUM'

    Ideally you will include an index rebuild in your upgrade plan, or shortly thereafter– because the checksum needs to be written to each page, this doesn’t start working for a page  until it’s had a write.

    For all sorts of more information, see Paul Randal’s post on Myths around Checksums.

    Meanwhile, I’m Entering A Contest

    Back to our contest for the SQLSkills Master Immersion Event on Internals and Performance in February (info).

    My challenge was to express why I’d like to the go to the class, and why I’ll make the best use of the knowledge I gain.

    I’d love to go because:

    • I’m passionate about learning about internals and performance. Hopefully I’ve demonstrated that here!
    • I still have a lot to learn, and there’s no replacement for a great classroom experience.
    • I know SQLSkills training is highly interactive and is a great place to ask questions and get real-world, in-depth answers.
    • This training will help me grow as a professional.

    I’ll make the best use of the knowledge because:

    • I love using what I learn to explore more and write blog posts like this– it helps me learn more and share with the community.
    • I’m now the *only DBA* at a small, clever internet business. Every bit of expertise I can gain can help me work as part of a great team and make a real difference in our business.
    • I’m good at asking questions and helping others learn– I work hard to be a great fellow student and learn together with others.
    • I’ve clearly learned enough to be dangerous ;)

    What I’d Like to Do at the Training

    I would love to blog my way through my first immersion event. I certainly couldn’t (and wouldn’t) capture all the content we cover, but I’d like to capture what the experience is like. I’d also like to characterize how it  helps me think about, and work  with, SQL Server differently.

    I hope whomever is selected as the winner chooses to share their experience.

  9. Corrupting Databases for Dummies- Hex Editor Edition

    Corruption is so ugly it gets a lolworm instead of a lolcat.

    Let’s make one thing clear from the start:

    This Post Tells You How To Corrupt a SQL Server Database with a Hex Editor in Gruesome Detail

    And that’s all this post tells you. Not how to fix anything, just how to break it. If you aren’t familiar with corruption, corruption is bad. It is no fun at all on any data, or any server, that you care about.

    Where You (Possibly) Want To Do This

    You only want to do this on a test database, in a land far far away from your customers, for the purpose of practicing dealing with corruption. When things go badly, you want to  be  prepared. This post gives you the tools in a simple, step by step fashion, to create different types of corruption so that you can practice resolving them. Big Disclaimer: Do not run this in production. Or anywhere near production, or anything important. Ever. Only use this at home, in a dark room, alone, when not connected to your workplace, or anything you’ve ever cared about. If you corrupt the wrong pages in a user database, you may not be able to bring it back online. If you corrupt a system database, you may be reinstalling SQL Server.

    References, and Thanks to Paul Randal

    Everything I’m doing here I learned from Paul Randal’s blog posts. It just took me a little bit to understand how to use the hex editor and make sure I was doing it properly, so I thought I’d put down the steps I used here in detail. If you’d like to go straight to the source:

    First, Get Your Hex Editor

    Download XVI32 by Christian Maas. No installer is necessary: download the zip file, then unzip all files to a directory and run XVI32.exe

    Create a Database to Corrupt

    For our adventure, our database is named CorruptMe. We’ll create a single table, insert some data, and create a clustered index and nonclustered index on it. (Note: Data generation technique found on Stack Overflow, attributed to Itzik Ben-Gan.)
    USE master;
    IF db_id('CorruptMe') IS NOT NULL
    BEGIN
    	ALTER DATABASE CorruptMe SET SINGLE_USER WITH ROLLBACK IMMEDIATE
    	DROP DATABASE CorruptMe
    END
    
    CREATE DATABASE CorruptMe;
    GO
    
    --Make sure we're using CHECKSUM as our page verify option
    --I'll talk about other settings in a later post.
    ALTER DATABASE CorruptMe SET PAGE_VERIFY CHECKSUM;
    
    USE CorruptMe;
    
    --Insert some dead birdies
    CREATE TABLE dbo.DeadBirdies (
        birdId INT NOT NULL ,
        birdName NVARCHAR(256) NOT NULL,
        rowCreatedDate DATETIME2(0) NOT NULL )
    
    ;WITH
      Pass0 AS (SELECT 1 AS C UNION ALL SELECT 1),
      Pass1 AS (SELECT 1 AS C FROM Pass0 AS A, Pass0 AS B),
      Pass2 AS (SELECT 1 AS C FROM Pass1 AS A, Pass1 AS B),
      Pass3 AS (SELECT 1 AS C FROM Pass2 AS A, Pass2 AS B),
      Pass4 AS (SELECT 1 AS C FROM Pass3 AS A, Pass3 AS B),
      Pass5 AS (SELECT 1 AS C FROM Pass4 AS A, Pass4 AS B),
      Tally AS (SELECT ROW_NUMBER() OVER(ORDER BY C) AS NUMBER FROM Pass5)
    INSERT dbo.DeadBirdies (birdId, birdName, rowCreatedDate)
    SELECT NUMBER AS birdId ,
        'Tweetie' AS birdName ,
        DATEADD(mi, NUMBER, '2000-01-01')
    FROM Tally
    WHERE NUMBER <= 500000
    
    --Cluster on BirdId.
    CREATE UNIQUE CLUSTERED INDEX cxBirdsBirdId ON dbo.DeadBirdies(BirdId)
    --Create a nonclustered index on BirdName
    CREATE NONCLUSTERED INDEX ncBirds ON dbo.DeadBirdies(BirdName)
    GO
    Now we can take a look at the pages our table and nonclustered index got created on. I wanted to specifically corrupt a page in the nonclustered index on the DeadBirdies table. Of course if you wanted the clustered index, you could use index Id 1.
    DBCC IND ('CorruptMe', 'DeadBirdies', 2)
    I want to pick a data page for this nonclustered index, so I pick a PagePID where PageType=2. (The reference I use for DBCC IND is here.) I pick PagePID 2784. Note: If you’re following along, you may get a different PagePID if you use a different default fill factor.

    Optional: Check out the page with DBCC PAGE

    If you’d like to take a look at the page you’re about to corrupt, you can do so with the following command.
    --Turn on a trace flag to have the output of DBCC PAGE return in management studio
    --Otherwise it goes to the error log
    DBCC TRACEON (3604);
    GO
    DBCC PAGE('CorruptMe', 1,2784,3);

    Set the database offline

    You must take your victim database offline to render it fully helpless accessible to your hex editor.
    USE master;
    ALTER DATABASE CorruptMe SET OFFLINE;
    Also, get the name of your physical data file which you’ll open in your hex editor. Copy this to your clipboard.
    SELECT physical_name FROM sys.master_files WHERE name='CorruptMe';
    Figure out the starting offset of the page you want to corrupt. You do this simply by multiplying the page ID (PagePid) by 8192 (the number of bytes on a page).
    SELECT 2784*8192 AS [My Offset]

    It’s the Moment We’ve Been Waiting For: Trash That Page

    Fire up your hex editor: run XVI32.exe. Depending on your operating system, you may want to run this with elevated privileges / right click and “run as administrator”. Open the database file by using File → Open, and then the data file name you copied to the clipboard. (If you didn’t set the database offline, you’ll get an error that it’s in use. If you got an error that you don’t have permissions to view the file, make sure you do have permissions and that you ran XVI32.exe with elevated privileges.) Go to the page you want to corrupt by using Address → GoTo (or Ctrl + G), then paste in your Offset Value. You want to search for this as a decimal. XVI43.exe will take to right to the beginning of that page. You can see the ASCII representation of the data in the right pane. For our example, you should be able to see the word ‘Tweetie’ represented. I like to put the cursor  in the right pane at the beginning of the word ‘Tweetie’. XVI32.exe will automatically move the cursor in the left pane, to the appropriate location. You can corrupt the data  by editing in the right pane or left pane. For my example, I am replacing the ASCII ‘T’ in the first occurrence of the word ‘Tweetie’ with an ‘S’. You can edit more, but a little tiny corruption goes a long way. Save the file, and you’re done!

    Admire Your Own Corruption

    First, bring your database back online. If you correctly edited pages in the data, this should work just fine. Note: If you corrupted critical system tables early in the database, this may not work! If so, go back to the steps above to identify a good page offset.
    ALTER DATABASE CorruptMe SET ONLINE;
    You can see the corruption in a couple of different ways. If you have checksums enabled on the database, you can see the corruption by reading the page with the data on it. Since I corrupted a page in a nonclustered index in my example, I need to make sure I use that index. So I can see it with this query:
    USE CorruptMe;
    SELECT birdName FROM dbo.deadBirdies;
    That returns this big scary error, which confirms I did indeed corrupt page 2784:

    Msg 824, Level 24, State 2, Line 1 SQL Server detected a logical consistency-based I/O error: incorrect checksum (expected: 0xb633a8e1; actual: 0xaeb39361). It occurred during a read of page (1:2784) in database ID 18 at offset 0x000000015c0000 in file ‘D:\BlahBlahBlah\CorruptMe.mdf’. Additional messages in the SQL Server error log or system event log may provide more detail. This is a severe error condition that threatens database integrity and must be corrected immediately. Complete a full database consistency check (DBCC CHECKDB). This error can be caused by many factors; for more information, see SQL Server Books Online.

    You can also see this by running a CHECKDB or CHECKTABLE command.
    DBCC CHECKDB('CorruptMe')
    An excerpt from its output:
    Msg 8928, Level 16, State 1, Line 1
    Object ID 2105058535, index ID 2, partition ID 72057594038910976, alloc unit ID 72057594039828480 (type In-row data): Page (1:2784) could not be processed.  See other errors for details.
    Msg 8939, Level 16, State 98, Line 1
    Table error: Object ID 2105058535, index ID 2, partition ID 72057594038910976, alloc unit ID 72057594039828480 (type In-row data), page (1:2784). Test (IS_OFF (BUF_IOERR, pBUF->bstat)) failed. Values are 12716041 and -4.

    Now Sit Back and Laugh Maniacally. And Then Fix It.

    So, the whole point of this was probably to test something. So take a moment to enjoy the fact that FOR ONCE you don’t have to panic when you see these errors, because it’s all part of your master plan. Then go out and fix the corruption, and run your tests.
  10. Read from the Right End of the Index: BACKWARD Scans

    Optimizing queries is the most fun when you don’t need to add indexes. There’s nothing quite so nice as finding a way to make reading data faster, without slowing down writes or creating new data structures that need to be maintained.

    Here’s one way you can use BACKWARD scans to do this.

    The Scenario: Clustered index on an increasing integer, and you’d like recently created rows

    This is a common enough situation: you have a table with a clustered index on an integer value which increases with each row. You have another column which records the date the row was created.

    You’d like frequently query the most recently created rows over some period of time.

    The table has very frequent inserts, so for performance reasons you want to use the minimal indexes required. (And in general, this is the best practice.)

    Question: Do you need to add a nonclustered index on the column containing the date the row was created?

    Answer: Maybe not!

    Getting the right clustered index scan

    Say we’re working with the following table, which we have filled with five million rows of Tweetie birds. (Note: This generation technique is a tally table population technique which I found on Stack Overflow, which is attributed to Itzik Ben-Gan.)

    CREATE TABLE dbo.Birds (
        birdId INT NOT NULL ,
        birdName NVARCHAR(256) NOT NULL,
        rowCreatedDate DATETIME2(0) NOT NULL )
    GO	
     
    --Insert 5 million Tweetie birds
    --Make them as if they were all created a minute apart.
    ;WITH
      Pass0 AS (SELECT 1 AS C UNION ALL SELECT 1),
      Pass1 AS (SELECT 1 AS C FROM Pass0 AS A, Pass0 AS B),
      Pass2 AS (SELECT 1 AS C FROM Pass1 AS A, Pass1 AS B),
      Pass3 AS (SELECT 1 AS C FROM Pass2 AS A, Pass2 AS B),
      Pass4 AS (SELECT 1 AS C FROM Pass3 AS A, Pass3 AS B),
      Pass5 AS (SELECT 1 AS C FROM Pass4 AS A, Pass4 AS B),
      Tally AS (SELECT ROW_NUMBER() OVER(ORDER BY C) AS NUMBER FROM Pass5)
    INSERT dbo.Birds (birdId, birdName, rowCreatedDate)
    SELECT NUMBER AS birdId ,
        'Tweetie' AS birdName ,
        DATEADD(mi, NUMBER, '2000-01-01')
    FROM Tally
    WHERE NUMBER <= 5000000
     
    --Cluster on BirdId. We won't add any other indexes.
    CREATE UNIQUE CLUSTERED INDEX cxBirdsBirdId ON dbo.Birds(BirdId)

    Say we would just like to see the maximum value in the rowCreatedDate column.

    The most basic way to get this row is with this query:

    SELECT MAX(rowCreatedDate)
    FROM dbo.Birds

    However, that leads to a table scan. We get lots of reads: 22,975 logical reads and 201 physical reads.

    If we know we have a strong association between the BirdId column and the RowCreatedDate column, and that the highest ID in the table is the most recent row, we can rewrite the query like this:

    SELECT MAX(rowCreatedDate)
    FROM dbo.Birds
    WHERE birdId = (SELECT MAX(birdId) FROM dbo.Birds)

    This query still does a clustered index scan. But yet it does only 3 logical reads and 2 physical reads.

    Looking in the execution plan, our query was able to use the extra information we provided it to scan the index backwards. It stopped when it had everything it needed, which was after a short distance– after all, it only needed recent rows, and those are all at one end of the table.

    This backwards scan can be very useful, and can make using the MAX aggregate very useful.

    But you usually need more than just the max value…

    To see a bit more about how you extend this logic, compare these three queries:

    Query A

    This makes you think you need that non-clustered index: it does 22,975 logical reads, 305 physical reads, and 22968 read-ahead reads.

    --Only run against a test server, not good for production
    DBCC DROPCLEANBUFFERS
     
    SELECT birdId, birdName, rowCreatedDate
    FROM dbo.Birds
    WHERE rowCreatedDate >= '2009-07-01 05:00:00'

    Query B

    We can introduce the backwards scan by adding an ORDER BY BIrdId DESC to the query. Now we get 23019 logical reads, 47 physical reads, and 22960 read-ahead reads.

    --Only run against a test server, not good for production
    DBCC DROPCLEANBUFFERS
     
    SELECT birdId, birdName, rowCreatedDate
    FROM dbo.Birds
    WHERE rowCreatedDate >= '2009-07-01 05:00:00'
    ORDER BY birdid DESC

    Query C

    The this last query gives the optimizer extra information about using BirdId to do a BACKWARD scan to grab the maximum BirdId, and then use that to do a BACKWARD seek of the clustered index in nested loops to get the data. It does only 50 logical reads, 4 physical reads, and 817 read-ahead reads.

    --Only run against a test server, not good for production
    DBCC DROPCLEANBUFFERS
     
    SELECT birdId, birdName, rowCreatedDate
    FROM dbo.Birds
    WHERE birdId >=
    	(SELECT MAX(birdId)
    	FROM dbo.Birds
    	WHERE rowCreatedDate <= '2009-07-01 05:00:00')
    	AND rowCreatedDate >= '2009-07-01 05:00:00'
    ORDER BY birdId DESC

    Be Careful Out There

    The examples I’m using work because there is a correlation between the integer field and the date field. Not all tables may be like this. As with all queries, you need to be familiar with your data.

    Consider Your Options– Even the Ones You Don’t Think Are Great

    I’m quite sure BACKWARD index reads are covered in some talks and publications on tuning.  But I learned about this by considering multiple approaches, even those I didn’t think would work at first. It pays to try things out, and you can look a lot by looking carefully at execution plans (including the properties) and your Statistics IO output.

    What this means to me: it’s good to keep an open mind.

  1. 1
  2. Next ›
  3. Last »