Prefix every column name with a specific string? - c#

I'm trying to manually map some rows to instances of their appropriate classes. I know that I need to use every column of every table, and map all of those columns from one table into a given class.
However, I was wondering if there is an easier way to do it. Right now, I have a class called School and a class called User. Each of these classes has a Name property, among other properties (the `Name` one is the important one, since both classes share it).
Right now, I am doing the following to map them down.
SELECT u.SomeOtherColumn, u.Name AS userName, s.SomeOtherColumn, s.Name AS schoolName FROM User AS u INNER JOIN School AS s ON something
I would love to do the following, but I can't, since Name is a mutual name between the classes.
SELECT u.*, s.* FROM User AS u INNER JOIN School AS s ON something
This however generates an error since they both have the column Name. Can I prefix them somehow? Like this for instance?
u.user_*, s.school_*
So that every column of each of those tables have a prefix? For instance user_Name and school_Name?

Years ago I wrote a bunch of functions and procedures to help me with developing automatic code-generation routines for SQL Servers and applications using dynamic SQL. Here is the one that I think would be most helpful to your situation:
CREATE FUNCTION [dbo].[ColumnString2]
(
    @TableName AS SYSNAME,        --table or view whose column names you want
    @Template AS NVARCHAR(MAX),   --replaces '{c}' with the name for every column,
    @Between AS NVARCHAR(MAX)     --puts this string between every column string
)
RETURNS NVARCHAR(MAX) AS
BEGIN
    DECLARE @str AS NVARCHAR(MAX);
    SELECT TOP 999
        @str = COALESCE(
            @str + @Between + REPLACE(@Template, N'{c}', COLUMN_NAME),
            REPLACE(@Template, N'{c}', COLUMN_NAME)
        )
    FROM INFORMATION_SCHEMA.COLUMNS
    WHERE TABLE_SCHEMA = COALESCE(PARSENAME(@TableName, 2), N'dbo')
      AND TABLE_NAME = PARSENAME(@TableName, 1)
    ORDER BY ORDINAL_POSITION;
    RETURN @str;
END
This allows you to format all of the column names of a table or view any way that you want. Simply pass it a table name and a Template string with '{c}' everywhere that you want the column name inserted for each column. It will do this for every column in @TableName, and add the @Between string in between them.
Here is an example of how to vertically format all of the column names for a table, renaming them with a prefix in a way that is suitable for inclusion into a SELECT query:
SELECT dbo.[ColumnString2](N'yourTable', N'
{c} As prefix_{c}', N',')
This function was intended for use with dynamic SQL, but you can use it too by executing it in Management Studio with your output set to Text (instead of Grid). Then cut and paste the output into your desired query, view or code text. (Be sure to change your SSMS Query options for Text Results to raise the "maximum number of characters displayed" from 256 to the max (8000). If that still gets cut off for you, then you can change this procedure to a function that outputs each column as a separate row, instead of as one single large string.)
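If the column names are already available in application code, the same template idea can be sketched outside the database. This is only an illustration in Python, not the answer's T-SQL: the function name column_string and the column list are hypothetical, and the sketch assumes the names need no bracket quoting.

```python
def column_string(columns, template, between):
    """Replace every '{c}' in the template with each column name and join the results."""
    return between.join(template.replace("{c}", name) for name in columns)


# Hypothetical column names for the User table, used for illustration only.
user_columns = ["Id", "Name"]
select_list = column_string(user_columns, "u.{c} AS user_{c}", ", ")
print(select_list)  # u.Id AS user_Id, u.Name AS user_Name
```

The resulting string can be pasted into a SELECT list in the same way as the output of the T-SQL function.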

Related

SQL/C# - Apply a function to columns within a SQL query

Is there a way to parse a given SQL SELECT query and wrap each column with a function call e.g. dbo.Foo(column_name) prior to running the SQL query?
We have looked into using a regular expression type 'replace' on the column names, however, we cannot seem to account for all the ways in which a SQL query can be written.
An example of the SQL query would be;
SELECT
[ColumnA]
, [ColumnB]
, [ColumnC] AS [Column C]
, CAST([ColumnD] AS VARCHAR(11)) AS [Bar]
, DATEPART([yyyy], GETDATE()) - DATEPART([yyyy], [ColumnD]) AS [Diff]
, [ColumnE]
FROM [MyTable]
WHERE LEN([ColumnE]) > 0
ORDER BY
[ColumnA]
, DATEPART([yyyy], [ColumnD]) - DATEPART([yyyy], GETDATE());
The result we require would be;
SELECT
[dbo].[Foo]([ColumnA])
, [dbo].[Foo]([ColumnB])
, [dbo].[Foo]([ColumnC]) AS [Column C]
, CAST([dbo].[Foo]([ColumnD]) AS VARCHAR(11)) AS [Bar]
, DATEPART([yyyy], GETDATE()) - DATEPART([yyyy], [dbo].[Foo]([ColumnD])) AS [Diff]
, [dbo].[Foo]([ColumnE])
FROM [MyTable]
WHERE LEN([dbo].[Foo]([ColumnE])) > 0
ORDER BY
[dbo].[Foo]([ColumnA])
, DATEPART([yyyy], [dbo].[Foo]([ColumnD])) - DATEPART([yyyy], GETDATE());
Any or all of the above columns might need the function called on them (including columns used in the WHERE and ORDER BY) which is why we require a query wide solution.
We have many pre-written queries like the above which need to be updated, which is why a manual update will be difficult.
The above example shows that some result columns might be calculated and some have simply been renamed. Most are also made up with joins and some contain case statements which I have left out for the purpose of this example.
Another scenario which would need to be accounted for is table name aliasing e.g. SELECT t1.ColumnA, t2.ColumnF etc.
Either a SQL or C# solution for solving this problem would be ideal.
Instead of replacing each occurrence of every column, you can replace the statement...
FROM MyTable
...with a subselect that includes all existing columns with the function call:
FROM (
SELECT dbo.Foo(ColumnA) AS ColumnA, dbo.Foo(ColumnB) AS ColumnB,
dbo.Foo(ColumnC) AS ColumnC --etc.
FROM MyTable
) AS MyTable
The rest of the query can remain unchanged. In case of table aliasing, you simply replace AS MyTable with AS t1.
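With many pre-written queries, the subselect text could be generated rather than typed by hand. The following Python sketch assumes the column list of each table is known up front and that identifiers need no quoting; the helper name wrap_table is hypothetical.

```python
def wrap_table(table, columns, func="dbo.Foo", alias=None):
    """Build a derived table that applies func to every column and keeps the column names."""
    select_list = ", ".join(f"{func}({col}) AS {col}" for col in columns)
    return f"(SELECT {select_list} FROM {table}) AS {alias or table}"


# Column names taken from the example query; the list itself is an assumption.
derived = wrap_table("MyTable", ["ColumnA", "ColumnB"])
print("FROM " + derived)
```

Passing alias="t1" yields the aliased form, so the rest of the query text can stay untouched.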
Another option you should consider is to create views in your database that would be essentially the subselect. Combined with a naming convention, you can easily replace the occurrences in your FROM (and JOIN) statements with the view name:
FROM MyTable_Foo AS t1
If you want to replace all queries that you'll ever use, consider renaming the tables and creating views that are named like the old tables.
On a more general note: You should reconsider your approach to the underlying problem, since what you are doing here takes away much of the power of SQL. The worst thing here is that once you call the function on all columns, you will not be able to use the indices on those columns, which could mean a serious hit on DB performance.

How to use sub Query in insert statement

I have tried, but I get the error:
Subqueries are not allowed in this context.
I have two tables, Product and Category, and want to look up the CategoryID based on the Category_Name.
The query is
Insert into Product(Product_Name,Product_Model,Price,Category_id)
values(' P1','M1' , 100, (select CategoryID from Category where Category_Name=Laptop))
Please tell me a solution with code.
(you didn't clearly specify what database you're using - this is for SQL Server but should apply to others as well, with some minor differences)
The INSERT command comes in two flavors:
(1) either you have all your values available, as literals or SQL Server variables - in that case, you can use the INSERT .. VALUES() approach:
INSERT INTO dbo.YourTable(Col1, Col2, ...., ColN)
VALUES(Value1, Value2, @Variable3, @Variable4, ...., ValueN)
Note: I would recommend always explicitly specifying the list of columns to insert data into. That way, you won't have any nasty surprises if your table suddenly has an extra column, or if your table has an IDENTITY or computed column. Yes, it's a tiny bit more work, once, but then your INSERT statement is as solid as it can be and you won't have to constantly fiddle with it when your table changes.
(2) if you don't have all your values as literals and/or variables, but instead you want to rely on another table, multiple tables, or views, to provide the values, then you can use the INSERT ... SELECT ... approach:
INSERT INTO dbo.YourTable(Col1, Col2, ...., ColN)
SELECT
SourceColumn1, SourceColumn2, @Variable3, @Variable4, ...., SourceColumnN
FROM
dbo.YourProvidingTableOrView
Here, you must define exactly as many items in the SELECT as your INSERT expects - and those can be columns from the table(s) (or view(s)), or those can be literals or variables. Again: explicitly provide the list of columns to insert into - see above.
You can use one or the other - but you cannot mix the two - you cannot use VALUES(...) and then have a SELECT query in the middle of your list of values - pick one of the two - stick with it.
So in your concrete case, you'll need to use:
INSERT INTO dbo.Product(Product_Name, Product_Model, Price, Category_id)
SELECT
' P1', 'M1', 100, CategoryID
FROM
dbo.Category
WHERE
Category_Name = 'Laptop'
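The INSERT ... SELECT form can be tried end to end with a small script. This is a sketch only: it uses Python's built-in sqlite3 module with an in-memory SQLite database as a stand-in for SQL Server, and the sample categories are made up. The literals are passed as parameters and the category ID comes from the SELECT.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Category (CategoryID INTEGER PRIMARY KEY, Category_Name TEXT);
    CREATE TABLE Product (Product_Name TEXT, Product_Model TEXT,
                          Price REAL, Category_id INTEGER);
    INSERT INTO Category (CategoryID, Category_Name) VALUES (1, 'Phone');
    INSERT INTO Category (CategoryID, Category_Name) VALUES (2, 'Laptop');
""")

# The three literals are bound as parameters; CategoryID is looked up by name.
conn.execute(
    """
    INSERT INTO Product (Product_Name, Product_Model, Price, Category_id)
    SELECT ?, ?, ?, CategoryID
    FROM Category
    WHERE Category_Name = ?
    """,
    ("P1", "M1", 100, "Laptop"),
)

rows = conn.execute(
    "SELECT Product_Name, Product_Model, Price, Category_id FROM Product"
).fetchall()
print(rows)
```

If no category matches, the SELECT returns no rows and nothing is inserted, which differs from a VALUES insert that would store a NULL.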
Try like this
Insert into Product
(
Product_Name,
Product_Model,
Price,Category_id
)
Select
'P1',
'M1' ,
100,
CategoryID
From
Category
where Category_Name='Laptop'
Try this:
DECLARE @CategoryID BIGINT = (select top 1 CategoryID from Category where Category_Name='Laptop')
Insert into Product(Product_Name,Product_Model,Price,Category_id)
values(' P1','M1' , 100, @CategoryID)

Reversing cross join input

A table is populated by the following stored procedure:
exec('
insert into tblSegments
(SegmentName, CarTypeID, EngineTypeID, AxleTypeID)
select distinct
''' + @SegmentName + '''
, CT.CarTypeID
, ET.EngineTypeID
, AT.AxleTypeID
from
tblCarTypes CT
cross join tblEngineTypes ET
cross join tblAxleTypes AT
where
CT.CarTypeName in (' + @CarTypes + ')
and ET.EngineTypeName in (' + @EngineTypes + ')
and AT.AxleTypeName in (' + @AxleTypes + ')
')
The parameters, with the exception of @SegmentName, are strings such as (for @CarTypes) 'hatchback','suv','sedan'.
Can the data in the table be used to create a list, for a single SegmentName, of the previous entries to the stored procedure akin to
Run1: @CarTypes, @EngineTypes, @AxleTypes
Run2: @CarTypes, @EngineTypes, @AxleTypes
Run3: @CarTypes, @EngineTypes, @AxleTypes
...?
Runs don't need to be in sequential order. The process can involve a combination of T-SQL and C#. I'm pretty sure this is impossible; perhaps someone can prove me wrong.
No, it's not possible because you're taking in a potentially comma-delimited string of values which will create separate rows in your result table. You can easily get a single value each for the CarTypes, EngineTypes and AxleTypes variables, but to group them separately by each execution of your dynamic SQL you would need some kind of executionID column or something to group the rows on per execution.
So you're correct that what you want to do is not possible with the schema design you've provided. If this is information you want to keep, I would create another table and populate it at runtime. You could put an identity column on the table that houses the input variables and use @@IDENTITY after the insert into that table to populate an executionID column in your main table, so you can easily associate the variable summary table with the cross-joined result table.
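A rough sketch of that execution-ID idea, using Python's sqlite3 module as a stand-in for SQL Server. The table tblRuns and the column RunID are hypothetical names, cursor.lastrowid plays the role of the identity value, and for brevity the sketch stores type names directly instead of looking up type IDs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tblRuns (
        RunID INTEGER PRIMARY KEY AUTOINCREMENT,
        SegmentName TEXT, CarTypes TEXT, EngineTypes TEXT, AxleTypes TEXT
    );
    CREATE TABLE tblSegments (
        RunID INTEGER, SegmentName TEXT,
        CarType TEXT, EngineType TEXT, AxleType TEXT
    );
""")


def record_run(segment, car_types, engine_types, axle_types):
    """Store the raw inputs once, then tag every cross-joined row with the run's ID."""
    cur = conn.execute(
        "INSERT INTO tblRuns (SegmentName, CarTypes, EngineTypes, AxleTypes) "
        "VALUES (?, ?, ?, ?)",
        (segment, ",".join(car_types), ",".join(engine_types), ",".join(axle_types)),
    )
    run_id = cur.lastrowid
    combos = [
        (run_id, segment, c, e, a)
        for c in car_types for e in engine_types for a in axle_types
    ]
    conn.executemany("INSERT INTO tblSegments VALUES (?, ?, ?, ?, ?)", combos)
    return run_id


record_run("A", ["hatchback", "suv"], ["petrol"], ["fwd", "awd"])
record_run("A", ["sedan"], ["diesel"], ["rwd"])

# The per-run inputs can now be listed directly, without reverse-engineering the rows.
runs = conn.execute(
    "SELECT RunID, CarTypes, EngineTypes, AxleTypes FROM tblRuns "
    "WHERE SegmentName = ? ORDER BY RunID",
    ("A",),
).fetchall()
segment_rows = conn.execute("SELECT COUNT(*) FROM tblSegments").fetchone()[0]
print(runs, segment_rows)
```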

Passing multiple rows of data to a stored procedure

I have a list of objects (created from several text files) in C#.net that I need to store in a SQL2005 database file. Unfortunately, Table-Valued Parameters began with SQL2008 so they won't help. I found from MSDN that one method is to "Bundle multiple data values into delimited strings or XML documents and then pass those text values to a procedure or statement" but I am rather new to stored procedures and need more help than that. I know I could create a stored procedure to create one record then loop through my list and add them, but that's what I'm trying to avoid. Thanks.
Input file example (Other files contain pricing and availability):
Matnr ShortDescription LongDescription ManufPartNo Manufacturer ManufacturerGlobalDescr GTIN ProdFamilyID ProdFamily ProdClassID ProdClass ProdSubClassID ProdSubClass ArticleCreationDate CNETavailable CNETid ListPrice Weight Length Width Heigth NoReturn MayRequireAuthorization EndUserInformation FreightPolicyException
10000000 A&D ENGINEERING SMALL ADULT CUFF FOR UA-767PBT UA-279 A&D ENGINEERING A&D ENG 093764011542 GENERAL General TDINTERNL TD Internal TDINTERNL TD Internal 2012-05-13 12:18:43 N 18.000 .350 N N N N
10000001 A&D ENGINEERING MEDIUM ADULT CUFF FOR UA-767PBT UA-280 A&D ENGINEERING A&D ENG 093764046070 GENERAL General TDINTERNL TD Internal TDINTERNL TD Internal 2012-05-13 12:18:43 N 18.000 .450 N N N N
Some DataBase File fields:
EffectiveDate varchar(50)
MfgName varchar(500)
MfgPartNbr varchar(500)
Cost varchar(200)
QtyOnHand varchar(200)
You can split multiple values from a single string quite easily. Say you can bundle the string like this, using a comma to separate "columns", and a semi-colon to separate "rows":
foo, 20120101, 26; bar, 20120612, 32
(This assumes that commas and semi-colons can't appear naturally in the data; if they can, you'll need to choose other delimiters.)
You can build a split routine like this, which includes an output column that allows you to determine the order the value appeared in the original string:
CREATE FUNCTION dbo.SplitStrings
(
    @List NVARCHAR(MAX),
    @Delimiter NVARCHAR(255)
)
RETURNS TABLE
AS
    RETURN (SELECT Number = ROW_NUMBER() OVER (ORDER BY Number),
        Item FROM (SELECT Number, Item = LTRIM(RTRIM(SUBSTRING(@List, Number,
        CHARINDEX(@Delimiter, @List + @Delimiter, Number) - Number)))
    FROM (SELECT ROW_NUMBER() OVER (ORDER BY [object_id])
        FROM sys.all_objects) AS n(Number)
    WHERE Number <= CONVERT(INT, LEN(@List))
        AND SUBSTRING(@Delimiter + @List, Number, 1) = @Delimiter
    ) AS y);
GO
Then you can query it like this (for simplicity and illustration I'm only handling 3 properties but you can extrapolate this for 11 or n):
DECLARE @x NVARCHAR(MAX); -- a parameter to your stored procedure
SET @x = N'foo, 20120101, 26; bar, 20120612, 32';
;WITH x AS
(
SELECT ID = s.Number, InnerID = y.Number, y.Item
-- parameter and "row" delimiter here:
FROM dbo.SplitStrings(@x, ';') AS s
-- output and "column" delimiter here:
CROSS APPLY dbo.SplitStrings(s.Item, ',') AS y
)
SELECT
prop1 = x.Item,
prop2 = x2.Item,
prop3 = x3.Item
FROM x
INNER JOIN x AS x2
ON x.InnerID = x2.InnerID - 1
AND x.ID = x2.ID
INNER JOIN x AS x3
ON x2.InnerID = x3.InnerID - 1
AND x2.ID = x3.ID
WHERE x.InnerID = 1
ORDER BY x.ID;
Results:
prop1 prop2 prop3
------ -------- -------
foo 20120101 26
bar 20120612 32
We use XML data types like this...
declare @contentXML xml
set @contentXML=convert(xml,N'<ROOT><V a="124694"/><V a="124699"/><V a="124701"/></ROOT>')
SELECT content_id
FROM dbo.table c WITH (nolock)
JOIN @contentXML.nodes('/ROOT/V') AS R ( v ) ON c.content_id = R.v.value('@a', 'INT')
Here is what it would look like if calling a stored procedure...
DbCommand dbCommand = database.GetStoredProcCommand("MyStroredProcedure");
database.AddInParameter(dbCommand, "dataPubXML", DbType.Xml, dataPublicationXml);
CREATE PROC dbo.usp_get_object_content
(
    @contentXML XML
)
AS
BEGIN
    SET NOCOUNT ON
    SELECT content_id
    FROM dbo.tblIVContent c WITH (nolock)
    JOIN @contentXML.nodes('/ROOT/V') AS R ( v ) ON c.content_id = R.v.value('@a', 'INT')
END
SQL Server does not parse XML very quickly so the use of the SplitStrings function might be more performant. Just wanted to provide an alternative.
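On the client side, the XML payload in the shape shown above (a ROOT element with one V element per value, the value held in attribute a) can be built with a standard XML library instead of string concatenation. A minimal Python sketch, assuming the values are integer IDs; the helper name ids_to_xml is hypothetical.

```python
import xml.etree.ElementTree as ET


def ids_to_xml(ids):
    """Build <ROOT><V a="..."/>...</ROOT> with one V element per integer ID."""
    root = ET.Element("ROOT")
    for value in ids:
        # int() rejects anything that is not an integer before it reaches the payload.
        ET.SubElement(root, "V", {"a": str(int(value))})
    return ET.tostring(root, encoding="unicode")


xml_payload = ids_to_xml([124694, 124699, 124701])
print(xml_payload)
```

The resulting string is what would be passed as the XML parameter of the stored procedure.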
I can think of a few options, but as I was typing one of them (the Split option) was posted by Mr. @Bertrand above. The only problem with it is that SQL just isn't that good at string manipulation.
So, another option would be to use a #Temp table that your sproc assumes will be present. Build dynamic SQL to the following effect:
Start a transaction, CREATE TABLE #InsertData with the shape you need, then loop over the data you are going to insert, using INSERT INTO #InsertData SELECT <values> UNION ALL SELECT <values>....
There are some limitations to this approach, one of which is that as the data set becomes very large you may need to split the INSERTs into batches. (I don't recall the specific error I got when I learned this myself, but for very long lists of values I have had SQL complain.) The solution, though, is simple: just generate a series of INSERTs with a smaller number of rows each. For instance, you might do 10 INSERT SELECTs with 1000 UNION ALLs each instead of 1 INSERT SELECT with 10000 UNION ALLs. You can still pass the entire batch as a part of a single command.
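The batching described above could be generated along these lines. This is a Python sketch under stated assumptions: #InsertData has a single integer column, the inputs are integer IDs (coerced with int() so nothing else can end up in the SQL text), and the function name batched_inserts is hypothetical.

```python
def batched_inserts(ids, batch_size=1000, table="#InsertData"):
    """Return one INSERT ... SELECT ... UNION ALL statement per batch of IDs."""
    statements = []
    for start in range(0, len(ids), batch_size):
        batch = ids[start:start + batch_size]
        selects = " UNION ALL ".join(f"SELECT {int(value)}" for value in batch)
        statements.append(f"INSERT INTO {table} {selects};")
    return statements


# A tiny batch size keeps the illustration readable.
statements = batched_inserts([1, 2, 3], batch_size=2)
command_text = "\n".join(statements)
print(command_text)
```

All statements can still be joined into one command text and sent in a single round trip.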
The advantage of this (despite its various disadvantages-- the use of temporary tables, long command strings, etc) is that it offloads all the string processing to the much more efficient C# side of the equation and doesn't require an additional persistent database object (the Split function; though, again, who doesn't need one of these sometimes)?
If you DO go with a Split() function, I'd encourage you to offload this to a SQLCLR function, and NOT a T-SQL UDF (for the performance reasons illustrated by the link above).
Finally, whatever method you choose, note that you'll have more problems if your data can include strings that contain the delimiter. For instance, in Aaron's answer you run into problems if the data is:
'I pity the foo!', 20120101, 26; 'bar, I say, bar!', 20120612, 32
Again, because C# is better at string handling than T-SQL, you'll be better off without using a T-SQL UDF to handle this.
Edit
Please note the following additional point to think about for the dynamic INSERT option.
You need to decide whether any input here is potentially dangerous input and would need to be cleaned before use. You cannot easily parameterize this data, so this is a significant one. In the place I used this strategy, I already had strong guarantees about the type of the data (in particular, I have used it for seeding a table with a list of integer IDs to process, so I was iterating over integers and not arbitrary, untrusted strings). If you don't have similar assurances, be aware of the dangers of SQL injection.

What is the best way, algorithm, method to difference large lists of data?

I am receiving a large list of current account numbers daily, and storing them in a database. My task is to find added and released accounts from each file. Right now, I have 4 SQL tables, (AccountsCurrent, AccountsNew, AccountsAdded, AccountsRemoved). When I receive a file, I am adding it entirely to AccountsNew. Then running the below queries to find which we added and removed.
INSERT AccountsAdded(AccountNum, Name) SELECT AccountNum, Name FROM AccountsNew WHERE AccountNum not in (SELECT AccountNum FROM AccountsCurrent)
INSERT AccountsRemoved(AccountNum, Name) SELECT AccountNum, Name FROM AccountsCurrent WHERE AccountNum not in (SELECT AccountNum FROM AccountsNew)
TRUNCATE TABLE AccountsCurrent
INSERT AccountsCurrent(AccountNum, Name) SELECT AccountNum, Name FROM AccountsNew
TRUNCATE TABLE AccountsNew
Right now, I am differencing about 250,000 accounts, but this is going to keep growing. Is this the best method, do you have any other ideas?
EDIT:
This is an MSSQL 2000 database. I'm using c# to process the file.
The only data I am focused on is the accounts that were added and removed between the last and current files. The AccountsCurrent, is only used to determine what accounts were added or removed.
To be honest, I think that I'd follow something like your approach. One thing is that you could remove the truncate, do a rename of the "new" to "current" and re-create "new".
Sounds like a history/audit process that might be better done using triggers. Have a separate history table that captures changes (e.g., timestamp, operation, who performed the change, etc.)
New and deleted accounts are easy to understand. "Current" accounts implies that there's an intermediate state between being new and deleted. I don't see any difference between "new" and "added".
I wouldn't have four tables. I'd have a STATUS table that would have the different possible states, and ACCOUNTS or the HISTORY table would have a foreign key to it.
Using IN clauses on long lists can be slow.
If the tables are indexed, using a LEFT JOIN can prove to be faster...
INSERT INTO [table] (
[fields]
)
SELECT
[fields]
FROM
[table1]
LEFT JOIN
[table2]
ON [join condition]
WHERE
[table2].[id] IS NULL
This assumes 1:1 relationships and not 1:many. If you have 1:many you can do any of...
1. SELECT DISTINCT
2. Use a GROUP BY clause
3. Use a different query, see below...
INSERT INTO [table] (
[fields]
)
SELECT
[fields]
FROM
[table1]
WHERE
NOT EXISTS (SELECT * FROM [table2] WHERE [condition to match tables 1 and 2])
-- # This is quick provided that all fields to match the two tables are
-- # indexed in both tables. Should then be much faster than the IN clause.
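Applied to the accounts tables from the question, the two anti-join forms (LEFT JOIN ... IS NULL and NOT EXISTS) might look like the sketch below. It uses Python's sqlite3 module with an in-memory database as a stand-in for SQL Server 2000, and the sample rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE AccountsCurrent (AccountNum TEXT PRIMARY KEY, Name TEXT);
    CREATE TABLE AccountsNew (AccountNum TEXT PRIMARY KEY, Name TEXT);
    INSERT INTO AccountsCurrent VALUES ('A1', 'Ann');
    INSERT INTO AccountsCurrent VALUES ('A2', 'Bob');
    INSERT INTO AccountsNew VALUES ('A2', 'Bob');
    INSERT INTO AccountsNew VALUES ('A3', 'Cy');
""")

# Added accounts: rows in AccountsNew with no match in AccountsCurrent.
added = conn.execute("""
    SELECT n.AccountNum, n.Name
    FROM AccountsNew AS n
    LEFT JOIN AccountsCurrent AS c ON c.AccountNum = n.AccountNum
    WHERE c.AccountNum IS NULL
""").fetchall()

# Removed accounts: rows in AccountsCurrent with no match in AccountsNew.
removed = conn.execute("""
    SELECT c.AccountNum, c.Name
    FROM AccountsCurrent AS c
    WHERE NOT EXISTS (
        SELECT 1 FROM AccountsNew AS n WHERE n.AccountNum = c.AccountNum
    )
""").fetchall()

print(added, removed)
```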
You could also subtract the intersection to get the differences in one table.
If the initial file is ordered in a sensible and consistent way (big IF!), it would run considerably faster as a C# program which logically compared the files.
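If both lists really are sorted the same way, the comparison is a single merge pass. The answer suggests C#; the sketch below shows the same logic in Python for brevity, assumes each list is sorted ascending with no duplicates, and uses a hypothetical function name.

```python
def diff_sorted(old, new):
    """Walk two sorted lists once and return (added, removed)."""
    added, removed = [], []
    i = j = 0
    while i < len(old) and j < len(new):
        if old[i] == new[j]:
            # Present in both files: unchanged.
            i += 1
            j += 1
        elif old[i] < new[j]:
            # Only in the previous file: the account was removed.
            removed.append(old[i])
            i += 1
        else:
            # Only in the current file: the account was added.
            added.append(new[j])
            j += 1
    removed.extend(old[i:])
    added.extend(new[j:])
    return added, removed


added_accounts, removed_accounts = diff_sorted([1, 2, 4, 7], [2, 3, 4, 8, 9])
print(added_accounts, removed_accounts)  # [3, 8, 9] [1, 7]
```

The pass is linear in the combined length of the two lists and needs neither file loaded into a hash table or database first.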
