DBPedias

Your Database Knowledge Community

Scott Herbert

  1. Generate a GUID/newid() in SSIS 2008

    Inexplicably, SSIS doesn't natively include a method for generating new GUIDs in the Derived Column Data Flow Transformation.

    The get-around is to create a custom Script Component. In SSIS 2008, we have a choice of using Visual Basic or C# to write the script code; I'm going to demonstrate a C# script.

    Let's say we have a Data Flow Source and a Data Flow Destination.
    (you can click on these pictures to enlarge):



    For the sake of this demo, the only difference between the source and the destination is that the destination has a GUID column, with no default constraint which adds its own NEWID() on insert. So we need to generate the GUID in SSIS. The usualy was we'd add a new column worth of data is by using the Derived Column transformation, but as I've mentioned (and you've probably found if you're reading this) there isn't a new GUID option.

    So we add a script component.



    Choose the "Transformation" option, hit OK, and drag the Source's green arrow onto the Script Component.

    Double click the script component, then "Inputs and Outputs", then expand Output 0 and Add Column. Call the column whatever you like, but I'm going to call it "SQLNinjaGUID" so you can see how it's referenced in the C# code. Change the data type of the column to unique identifier [DT_GUID].



    Jump back to the script tab, make sure we're using Microsoft Visual C# 2008 as the ScriptLanguage, Edit Script, and replace ALL of the code with the following.

    using System;
    using System.Data;
    using Microsoft.SqlServer.Dts.Pipeline.Wrapper;
    using Microsoft.SqlServer.Dts.Runtime.Wrapper;

    [Microsoft.SqlServer.Dts.Pipeline.SSISScriptComponentEntryPointAttribute]
    public class ScriptMain : UserComponent
    {

        
    public override void Input0_ProcessInputRow(Input0Buffer Row)
        
    {
            Row.SQLNinjaGUID
    = System.Guid.NewGuid();
        
    }

    }


    Now when you drag the Script Component's green arrow onto your Data Flow Destination, you should see SQLNinjaGUID (or whatever you called your new column) pop up as an available Input Column in the destination's Mapping tab.


  2. Project Lucy

    Quest Software, the kind people behind SQLServerPedia are running an experiment in cloud based performance analysis called Project Lucy.

    The idea is that you can upload a SQL .trc (Profiler trace) file, and Project Lucy will do some analysis for you - analysing CPU time, durations, IO etc. There's histograms of statement durations, and the ability to filter the trace easily by picking from drop downs.

    All this is of course possible by uploading your trace file to a database table (or saving the trace output directly to a table) and writing your own SQL queries. However, Project Lucy's aim is to start mining the crowd sourced data to provide comparison between your trace and similar workloads uploaded by the rest of the community.

    It's an interesting project, and like SQLServerPedia it needs community support to succeed. It's useful now, but to reach its potential it needs feeding with data. I've started uploading traces over the last couple of days, and I'm finding that the analyses available thus far are useful enough to keep you interested while Quest work on extending the functionalities. Each trace file you upload is its own "analysis", and all your analyses are saved for later appraisal.

    Also, they're giving away a $50 Amazon voucher each day (to US residents only unfortunately), so that's a pretty good incentive to give it a try.


  3. Get a list of all the Tables in a Database

    Script follows, with little to no fanfare!


    SELECT
            
    schemas.name + '.' + tables.name SchemaTableName
    FROM
            
    sys.tables
      
    JOIN
            
    sys.schemas
        
    ON tables.schema_id = schemas.schema_id


  4. Report Permissions

    A reader of this blog asked me this morning: "If you wanted to design a query to make a report for management that would show which AD Groups or users have access to which reports, how would you go about it."


    This is what I quickly whipped up:

    SELECT
             Catalog.Name ReportName
             ,Users.UserName
             ,Roles.RoleName
       FROM [dbo].[Catalog]
             JOIN
             dbo.PolicyUserRole
             ON   [Catalog].PolicyID = PolicyUserRole.PolicyID
             JOIN
             dbo.Users
             ON   PolicyUserRole.UserID = Users.UserID
             JOIN
             dbo.Roles
             ON   PolicyUserRole.RoleID = Roles.RoleID



    Can anyone see any problems with that?
  5. Microsoft Certified Master - Database Structures

    The MCM program is aimed at the uber-uber-uber SQL gurus, and from what I'm reading on various blogs it's no walk in the park for them.

    The awesome thing that's popped up out of the newly revised program is that there are a bunch of free training material videos available.

    While I'm not personally anywhere near thinking about attempting the MCM certification, I'm interested in hearing about what lies under the covers of SQL server from people who know it inside out. The absolute nitty gritty: like learning about DNA and mitochondria as opposed to the more systemic/anatomical view that the MCITP type courses cover.

    The first video in the series covers database structures: the ways that data is stored and managed on the disk. It's presented by Paul Randal, who spent 9 years working on the storage engine for SQL server, so the info is really from the horse's mouth.

    To paraphrase the summary of the video:

    Records - the rows, or "slots" which make up our tables,
    Pages - the 8kB chunks where the records live,
    Extents - collections of 8 contiguous pages,
    Allocation bitmaps - keep an eye on the extents, and;
    IAM chains and allocation units - keep track of what's living where.


    If you've a spare 42 minutes, check it out here.
  6. Formatting SQL Code for Blogs

    Up until recently, all the SQL code on this blog looked awful: completely lacking in formatting and unreadable (though still useful!). It looked like this:

    SELECT
    *
    FROM
    Ow.My.Eyes.Hurt

    Then I discovered The Simple-Talk Code Prettifier.

    You pop your formatted SQL code in (copied out of SSMS or Visual Studio), choose the style of HTML (forums in my case), whether to correct indenting and lenth of tabs etc., and voila: html code is produced which results in something real pretty like the below:

    SELECT
             *
       FROM
             Oh.That.Be.Nice
             


    Thanks to the HOBT (Aaron Alton) for blogging about this tool better and sooner than I.
  7. Importing an Excel Sheet to a Database Table.

    I've always found the SQL Server Import and Export Wizard a great way to pull data out of Excel into SQL Server. Except for when it doesn't work. It can be a frustrating beast, with all kinds of little idiosyncratic things to consider.

    However, it's easily avoided for basic imports from Excel. If you just want a table in a SQL database which looks and feels exactly like the source data, this command is your new friend:

    SELECT
             *
             INTO
             dbName.schemaName.tableName
       FROM
             OPENROWSET('Microsoft.ACE.OLEDB.12.0'
                , 'Excel 12.0;Database=excelFilePathName.xlsx'
                , [worksheetName$])

    Works a charm, and avoids clicking through a GUI which tends to crash when there's any slight problems.

    If you have any "missing provider" problems, which may happen on new 64 bit SQL servers, download the provider from Microsoft Access Database Engine 2010 Redistributable.
  8. Which Reports Aren't Inherting Permissions?

    Managing security settings in Reporting Services through the Report Manager website is usually pretty easy. However I've found that sometimes things go astray. Today I found a few reports which weren't inheriting permissions from their parent directory. Easy enough to fix, you jump into the security settings for that report and set "Revert to Parent Security". However, I have dozens of reports in each of a dozen directories; if I suspect this problem is going on in other reports, I'm not keen on a "needle in a haystack" search.

    Here's the code for identifying which reports are no longer inheriting permissions from their parent:

    USE ReportServer

    SELECT
    [Path],
    [Name]
    FROM
    [dbo].[Catalog]
    WHERE
    [PolicyRoot] = 1

  9. Another Standards Discussion: Views and Business Rules

    Let’s say that the business you work for has also been around for many years, and in that time loads of little rules about the data have appeared. Market A works differently to market B. This car shouldn’t be included in this list of products. When pulling together a list of shoppers, we only care about those that have credit cards.

    Every time a new project comes up, a new part of a public facing website or a new module in an in-house application, new stored procs are developed. Often, these procs are written by people who weren’t around when the DB schema involved was designed, or they haven’t got all of those little rules I discussed above memorized. Who does?

    This is where Views can come in handy. Instead of building each of those annoying little rules into each new Stored Procedure that’s developed, build them into Views, and point all Stored Procs towards Views instead of Tables.

    And I mean ALL Stored Procs. As a standard. If you follow this standard, you can be sure that when it’s time to implement a new business rule, you can find the appropriate View, alter some code, and all the dependant Stored Procedures will now follow that new business rule.

    For example, you have a table called Lotto.Entry. It records who has entered a lottery, and how old they are. Supplying the website, there are three hundred stored procedures which access data from this table. The CEO at your company decides that it’s no longer ethical to be reporting on people under 17 who have entered the lottery. If all those 300 Stored Procedures are pointing towards the table, you’re faced with moving data around, or changing all of those procs. If the procs are instead pointing towards Lotto.vEntry, that view can have a “WHERE Entry.Age > 17” clause added. Other Views should point towards this View instead of the underlying Table, so that the business rules are “inherited” across the system.

    Another advantage is that the business rules are centralized. You know to look towards the basic view of a table if you want to know how its data is handled.

    Pretending to delete data is another thing that can be done in views. Let’s say you want the users of the website/UI to feel like they’re deleting data, but you want an audit trail on that data, and you simply want to hide it from the users. Add an “IsDeleted” field to your table, and a “WHERE IsDeleted = 0” clause to the base view of that table. The data is then easily “undeleted” in cases of errors.

    Views are a centralized, transparent and predictable way of implementing rules in which your data is accessed.

  10. Object Naming Standards

    Having a look through the new features of SQL2008, I was intrigued to find the new Declarative Management Framework. In short, it's a way to enforce various policies on your environments.

    Without going too far into it (in this post), it made me think about what kinds of policies or standards I like to enforce as a Team Leader.

    One particular area is in object naming conventions. This is always going to be an idiosyncratic/personal aspect of programming, but I’ve found that if one person takes the initiative, other people will follow. An agreed standard makes it easy to predict what an object will be called, cutting down on searching and guessing when you’re not familiar with a schema that someone else has designed.

    Tables:
    - No tables have a plural name, i.e. “Lottery.Entry”, not “Lottery.Entries”

    Stored Procedures
    - have the prefix “s”
    - wherever possible include the main object they reference, what they are doing to the object, and by what parameters (if any, and only when there’s only one or two, otherwise use “filters” or “various” etc.) in the format “s[OBJECTNAME][ACTION]by[PARAMETERS]”. The main point is the object name and the action. This is open to a fair bit of interpretation, but here’s some examples
    o Lottery.sEntrySelectByID
    o Lottery.sEntryInsert
    o Lottery.sResultSelectByFilters
    o Lottery.sPredictionUpdate

    - no underscores. One of my bugbears is guessing whether or not an underscore might separate one part of an object name from another. The simplest answer is to not use them

    Views
    - have the prefix “v”
    - if they return fields from one table only, then they are named “v[TableName]”
    - it they return fields mostly from one table, and reference other tables/views in order to bring back names etc,. for IDs in the original table, then they are named “v[TableName]Overview”
    - no underscores

    Triggers
    - prefixed with “trig”
    - suffixed with “AfterInsert”, “InsteadOfInsert” etc
    - example: Lottery.trigEntryAfterInsert

    Functions
    - same rules as for stored procedures, but prefixed with “f”

    Keys
    - prefixed with “pk” for primary, “fk” for foreign, “uq” for unique
    - perhaps “uk” would be more consistent for uniques? Hmmm I may have stuffed that one up over the last few years….
    - reference the object name and the fields: e.g. fkEntry_LotteryID
    - hey what’s that underscore doing there? I’m fine with underscores in objects that are rarely referenced which writing code off the top of your head. I think they’re fine in keys, indices etc., anything that’s not a view, proc or table

    Indices
    - ixTABLENAME_FieldName1_FieldName2 etc
    - no need to reference INCLUDEd columns

    Constraints
    - “df” prefix for default constraints, e.g. dfEntryName
    - “ck” prefix for Check Constraints

    I’m sure there are plenty of people who would vehemently disagree with some of the choices I’ve made above. That’s great, and I’d love to hear why! My main point though is that having standards helps save time by making it easy for fellow developers/DBAs to predict what your objects are called when they’re searching for them. In a database like the main one I manage at the moment, where there are now over 10,000 objects, this can save some of the precious sanity we have left.

  1. 1
  2. Next ›
  3. Last »