Sometimes it is nice to mash together data from multiple connections into a single table. One scenario that calls for multiple connections per table is analyzing data feeds. For example, suppose you are nosy (like me) and like to check on home valuations on Zillow. Zillow has a free data set on the Azure Data Market, called HomeValuationAPIs, that lets you query the current valuation for any given home.
Let’s take an example. Here’s a house that makes me giggle – a home on offer for a cool $3.4 million in Redmond, WA. Not exactly a shocker that this house has sat on the market for over 8 months with no buyers. Redmond’s nice, but not that nice.
We can keep up to date on what Zillow thinks this property should be priced at using the Zillow API. The service URL to use is https://api.datamarket.azure.com/data.ashx/Zillow/HomeValuationAPIs/GetZestimate?ZillowPropertyID=48817763&$top=100. If you import this feed into SSDT, you see the Zillow estimates for this home.
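The feed comes back as a standard OData Atom response, which is why SSDT can import it directly. The sketch below shows the general shape of such a response; the property names inside `m:properties` are illustrative placeholders, not Zillow's actual schema, so inspect the real feed for the exact field names:

```xml
<feed xmlns="http://www.w3.org/2005/Atom"
      xmlns:m="http://schemas.microsoft.com/ado/2007/08/dataservices/metadata"
      xmlns:d="http://schemas.microsoft.com/ado/2007/08/dataservices">
  <entry>
    <content type="application/xml">
      <m:properties>
        <!-- illustrative field names; the real schema comes from the service -->
        <d:ZillowPropertyID>48817763</d:ZillowPropertyID>
        <d:Zestimate>3400000</d:Zestimate>
      </m:properties>
    </content>
  </entry>
</feed>
```

Each `entry` in the feed becomes a row in the imported table.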
Of course, one home is not enough for analysis. You need many homes in a single table to do anything interesting. Unfortunately, the tools only allow us to use one connection per table. Because the property ID is encoded in the data feed's URL, and the URL is what determines the connection string, the tools limit us to querying a single home per table. This is not very much fun.
A workaround is to create a second partition on the table that uses a second connection. This is not exposed anywhere in the UI, but it can be accomplished via script. Now, what I am about to show you is probably not officially supported. It just so happens to work, but asking the product team to fix bugs in it might result in dirty looks and postponement of fixes until the next release. Consider yourself warned.
Here is how you do it:
- Deploy the model with the data feed.
- Open SSMS and connect to the instance hosting the model.
- Right-click the connection for the data feed, then script a CREATE to a new query editor window.
- Modify the generated script, changing three lines: the ID, the Name, and the Connection String.
I used the ever-creative "DataFeed GetZestimate2" for both the Name and ID, and changed the ZillowPropertyID in the URL to 48817769, which is the house practically next door that has also been languishing on the market for a cool $3.4 million. Press F5 and the new connection is created. Refresh Object Explorer in SSMS to verify.
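For reference, the modified script ends up looking roughly like the sketch below. The outer `Create` element is the standard XMLA Create command; the database ID ("HomeValuations") and the trimmed-down `DataSource` body are illustrative assumptions here, since the real `ObjectDefinition` contents are whatever SSMS generated from your deployed model:

```xml
<Create xmlns="http://schemas.microsoft.com/analysisservices/2003/engine">
  <ParentObject>
    <!-- assumption: the database ID of your deployed model -->
    <DatabaseID>HomeValuations</DatabaseID>
  </ParentObject>
  <ObjectDefinition>
    <DataSource xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
                xsi:type="RelationalDataSource">
      <!-- the three lines to change: ID, Name, and ConnectionString -->
      <ID>DataFeed GetZestimate2</ID>
      <Name>DataFeed GetZestimate2</Name>
      <ConnectionString>Data Source=https://api.datamarket.azure.com/data.ashx/Zillow/HomeValuationAPIs/GetZestimate?ZillowPropertyID=48817769&amp;$top=100</ConnectionString>
      <!-- ...leave the rest of the scripted definition as generated... -->
    </DataSource>
  </ObjectDefinition>
</Create>
```

Note that only the URL's ZillowPropertyID value changes; everything else in the connection string stays as SSMS scripted it.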
- Now we need to add the account key for the Azure Data Market so we can process the data from this new connection. To do this:
- Right-click the newly created connection and select Properties.
- Select the Connection String and then click … to edit it.
- Set the Password to your account key on the Azure Data Market, and then click OK.
- Now we need to create a partition that uses this new connection. To do this:
- Right-click the table that contains the data feed and select Partitions.
- Click the copy button in the Partition Manager to create a copy of the current partition.
- Click the Script button to script out the partition definition, then cancel the Copy Partition dialog. We will create the new partition in script instead.
- Change three lines in the generated script: the DataSourceID, the ID, and the Name.
Use "DataFeed GetZestimate2" for the DataSourceID. Use whatever unique ID and Name you like; I used something creative like "Data Feed Content 2". Press F5 to execute the script.
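The modified partition script looks roughly like this. As with the connection script, the `ParentObject` IDs ("HomeValuations", "Model", "GetZestimate") are illustrative assumptions; the Script button fills in the real values from your model, and the query binding body stays as generated:

```xml
<Create xmlns="http://schemas.microsoft.com/analysisservices/2003/engine">
  <ParentObject>
    <!-- assumptions: these IDs come from your own scripted output -->
    <DatabaseID>HomeValuations</DatabaseID>
    <CubeID>Model</CubeID>
    <MeasureGroupID>GetZestimate</MeasureGroupID>
  </ParentObject>
  <ObjectDefinition>
    <Partition xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
      <!-- the three lines to change: ID, Name, and DataSourceID -->
      <ID>Data Feed Content 2</ID>
      <Name>Data Feed Content 2</Name>
      <Source xsi:type="QueryBinding">
        <DataSourceID>DataFeed GetZestimate2</DataSourceID>
        <!-- ...QueryDefinition stays as generated by the Script button... -->
      </Source>
    </Partition>
  </ObjectDefinition>
</Create>
```

Pointing the copied partition's DataSourceID at the second connection is the whole trick: the table now pulls rows from both feeds.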
- The new partition should appear in the Partition Manager. Click the Process button to load the data into it.
Now you have a table that uses data from multiple connections. Happy house hunting.