SQL Server and Chinese character on INSERT/SELECT query

I use SQL Server 2014 with collation SQL_Latin1_General_CP1_CI_AS.
I have a C# program that inserts Chinese characters into my database, for example:
"你", "好".
In SQL Server Management Studio I can see them clearly, and I can also search for them with N'你'.
The issue is that for some characters it doesn't work:
"〇", "㐄".
When my C# program inserts these two characters, a UNIQUE constraint exception is raised (I put a unique constraint on the Chinese column in my database).
InnerException = {"Violation of UNIQUE KEY constraint 'AK_Dictionary_Chinese'. Cannot insert duplicate key in object 'dbo.Dictionary'. The duplicate key value is (㐄).\r\nThe statement has been terminated."}
And here is my issue: it seems that these two Chinese characters (I have around 70 similar cases) are not converted correctly to UTF-8, so they collide. If I remove the unique constraint, I can of course insert them into my database and see them in SQL Server Management Studio. But when I search for the character using N'〇', the database returns multiple matches: "〇", "㐄"...
So how can I deal with that? I tried changing the collation to a Chinese one, but I have the same issue.
Thanks for your help.
How I add the Chinese characters in my C# program:
My entity object:
public partial class Dictionary
{
    public int Id { get; set; }
    public string Chinese { get; set; }
    public string Pinyin { get; set; }
    public string English { get; set; }
}
I just add new entity objects to my context and call SaveChanges():
var word1 = new Dictionary()
{
    Chinese = "〇",
    Pinyin = "a",
    English = "b",
};
var word2 = new Dictionary()
{
    Chinese = "㐄",
    Pinyin = "c",
    English = "bdsqd",
};
// We insert them into our Db
using (var ctx = new DBEntities())
{
    ctx.Dictionaries.Add(word1);
    ctx.Dictionaries.Add(word2);
    ctx.SaveChanges();
}
If you want to try this at home, here is a small SQL script that reproduces the issue. You can run it in SQL Server Management Studio:
DECLARE @T AS TABLE
(
    ID INT IDENTITY PRIMARY KEY,
    Value NVARCHAR(256)
);
INSERT INTO @T(Value) VALUES (N'〇'), (N'㐄');
SELECT * FROM @T;
SELECT * FROM @T WHERE Value = N'〇';

I found the answer. In fact, the Chinese characters are "traditional" and SQL Server needs a bit of help. Collation was the right idea, but I had to specify a Chinese Traditional collation (to find it, I just ran the sample query with every available collation).
DECLARE @T AS TABLE
(
    ID INT IDENTITY PRIMARY KEY,
    Value NVARCHAR(256)
);
INSERT INTO @T(Value) VALUES (N'〇'), (N'㐄');
SELECT * FROM @T;
SELECT * FROM @T WHERE Value = N'㐄' COLLATE Chinese_Traditional_Pinyin_100_CI_AS;
Sorry for wasting your time. I really had tried Chinese collations, but not the traditional one (I didn't know these were traditional characters).
Fixed.
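A quick client-side check can help confirm that the collision happens in the database comparison and not in the C# strings. The sketch below uses plain .NET with no database, and its variable names are only illustrative. It shows that the two characters compare as different under ordinal comparison. A plausible explanation, offered as an assumption and not something established in this thread, is that the older SQL_Latin1_General_CP1_CI_AS collation assigns no sort weight to these code points and so treats them as equal, while the newer version-100 collations distinguish them.

```csharp
using System;

string zero = "〇";
string other = "㐄";

// Print the first UTF-16 code unit of each string to show they are different characters.
Console.WriteLine($"U+{(int)zero[0]:X4} vs U+{(int)other[0]:X4}");

// Ordinal comparison looks only at the code units, so the two strings are not equal.
bool sameOrdinal = string.Equals(zero, other, StringComparison.Ordinal);
Console.WriteLine(sameOrdinal); // prints "False"
```

If the values differ here but collide in SQL Server, the comparison rules (the collation) are the place to look, which matches the fix above.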

Try this without specifying Collation every time!
DECLARE @T AS TABLE
(
    ID INT IDENTITY PRIMARY KEY,
    Value NVARCHAR(256)
);
INSERT INTO @T(Value) VALUES (N'〇'), (N'㐄'), (N'㐄〇'), (N'〇㐄');
SELECT * FROM @T;
SELECT * FROM @T WHERE CAST(Value AS varbinary) like CAST(N'〇%' AS varbinary)
SELECT * FROM @T WHERE CAST(Value AS varbinary) like CAST(N'㐄%' AS varbinary)

Related

How do I insert multiple records using Dapper while also including other dynamic parameters?

Here is a truncated example of what I'm trying to do:
public class SomeObject
{
    public int OtherTableId { get; set; }
    public List<Guid> ComponentIds { get; set; }
}
var stuffToSave = new SomeObject { /* values omitted */ };
var sql = @"CREATE TABLE Components( ComponentId uniqueidentifier PRIMARY KEY )
INSERT INTO Components VALUES (@WhatGoesHere?)
SELECT * FROM OtherTable ot
JOIN Components c on c.ComponentId = ot.ComponentId
WHERE Id = @OtherTableId
DROP TABLE Components";
Connection.Execute(sql, stuffToSave);
I know from other SO questions that you can pass a list into an insert statement with Dapper, but I can't find any examples that pass a list as well as another parameter (in my example, OtherTableId), or that have a non-object list (List<Guid> as opposed to a List<SomeObject> that has properties with names to reference).
For the second issue, I could select the ComponentIds into a list to give them a name like:
stuffToSave.ComponentIds.Select(c => new { ComponentId = c })
but then I'm not sure what to put in my SQL query so that Dapper understands it should take the ComponentId property from my list of ComponentIds (the INSERT ... VALUES line).

I would still like to know the real way of accomplishing this, but I have this workaround that uses string interpolation:
var sql = $@"CREATE TABLE Components( ComponentId uniqueidentifier PRIMARY KEY )
INSERT INTO Components VALUES ('{string.Join($"'),{Environment.NewLine}('", request.ComponentIds)}')
SELECT * FROM OtherTable ot
JOIN Components c on c.ComponentId = ot.ComponentId
WHERE Id = @OtherTableId
DROP TABLE Components";
I'm not worried about SQL Injection since this is just interpolating a list of Guids, but I'd rather avoid this method if possible.
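One way to avoid interpolating the values, sketched here under stated assumptions, is to generate one named parameter per Guid and collect the values in a dictionary alongside the scalar parameter. The helper name BuildValues, the parameter names c0, c1, ... and the value 42 are all hypothetical. Dapper is generally able to take such a name/value collection (for example wrapped in DynamicParameters), but that step is an assumption and is not exercised in this sketch. Note also that SQL Server limits a single command to 2100 parameters, so this only suits modest list sizes.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: builds "(@c0), (@c1), ..." and a name -> value map for the parameters.
static (string ValuesSql, Dictionary<string, object> Map) BuildValues(IReadOnlyList<Guid> ids)
{
    var map = new Dictionary<string, object>();
    var rows = new List<string>();
    for (int i = 0; i < ids.Count; i++)
    {
        string name = "c" + i;           // parameter name without the leading @
        rows.Add("(@" + name + ")");     // placeholder as it appears in the SQL text
        map[name] = ids[i];
    }
    return (string.Join(", ", rows), map);
}

var componentIds = new List<Guid> { Guid.NewGuid(), Guid.NewGuid() };
var (valuesSql, sqlParams) = BuildValues(componentIds);
sqlParams["OtherTableId"] = 42; // hypothetical value for the scalar parameter

string insertSql = "INSERT INTO Components VALUES " + valuesSql;
Console.WriteLine(insertSql); // prints "INSERT INTO Components VALUES (@c0), (@c1)"
```

The generated text contains only placeholders, so the Guid values never become part of the SQL string.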

Can I use special characters in column name in SQL Server and C# data table

I have a question: can I use special characters such as %, $, +, -, # in SQL Server column names and in a C# DataTable?
I tried creating a table with these characters in SQL Server and the table was created, but I want to know whether this is actually supported.

As explained, you can, as long as the column name is enclosed in square brackets, but using spaces and special characters in column names is not good practice.
CREATE TABLE [TABLE1] (ID UNIQUEIDENTIFIER, [% Column1] INT, [$ Column2] INT, [+ Column3] INT, [- Column4] INT, [# Column5] INT);
INSERT INTO [TABLE1] (ID, [% Column1], [$ Column2], [+ Column3], [- Column4], [# Column5])
VALUES ('8C012194-5D8A-4A58-B225-F33F60875499', 1, 2, 3, 4, 5);
If you are using Entity Framework you can map your column to your model class like this:
[Table("Table1")]
public class Test
{
    public Guid Id { get; set; }
    [Column("% Column1")]
    public int Column1 { get; set; }
    [Column("$ Column2")]
    public int Column2 { get; set; }
    [Column("+ Column3")]
    public int Column3 { get; set; }
    [Column("- Column4")]
    public int Column4 { get; set; }
    [Column("# Column5")]
    public int Column5 { get; set; }
}

Azure SQL supports these special characters in column names, because the column_name data type in SQL Server is nvarchar(128).
You can find this in the documentation: COLUMNS (Transact-SQL)
For C#, as mukesh kudi said, you should use [] brackets.
For example, to refer to a column named '#' in a DataTable expression (here as the sort expression, which is the second argument of Select), the code should look like this:
var rows = dt.Select("","[#]");
You can refer to this blog post: How to access an column with special characters using DataTable.Select()?
Hope this helps.
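The bracket rule for DataTable expressions can be tried without a database. The following is a minimal sketch using System.Data; the column name "#" and the sample values are made up for illustration. It uses the bracketed name once in a filter expression and once in a sort expression.

```csharp
using System;
using System.Data;

var dt = new DataTable();
dt.Columns.Add("#", typeof(int)); // hypothetical column whose name is a special character
dt.Rows.Add(3);
dt.Rows.Add(1);
dt.Rows.Add(2);

// Filter on the column: inside the expression the name must be bracketed.
DataRow[] filtered = dt.Select("[#] >= 2");

// Sort by the column: the second argument of Select is the sort expression.
DataRow[] sorted = dt.Select("", "[#] DESC");

Console.WriteLine(filtered.Length); // prints "2"
Console.WriteLine(sorted[0]["#"]);  // prints "3"
```

The row indexer takes the plain column name, so brackets are only needed inside filter and sort expressions.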

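Welcome to SO. You can have column names with special characters or spaces, but you have to use square brackets [] in order to access those columns.
For example, given a table named Table with a single column %ID that holds the values 1, 2 and 3:
SELECT [%ID] FROM [Table]
(The table name is bracketed as well because TABLE is a reserved word.)
For more information on column and table name rules for MS SQL Server, you can check MSDN.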
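When column names like these are put into SQL text from C#, the brackets have to be added by the code that builds the statement. Here is a small sketch of such a helper; QuoteIdentifier is a hypothetical name and not part of any library used in this thread. It wraps the name in brackets and doubles any closing bracket inside it, which is the same escaping that the T-SQL QUOTENAME function applies on the server side.

```csharp
using System;

// Hypothetical helper: wraps a name in square brackets and doubles any ']'
// so that the bracket inside the name cannot end the identifier early.
static string QuoteIdentifier(string name) => "[" + name.Replace("]", "]]") + "]";

Console.WriteLine(QuoteIdentifier("%ID"));      // prints "[%ID]"
Console.WriteLine(QuoteIdentifier("odd]name")); // prints "[odd]]name]"
```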

Preventing duplication from SQL Query, ASP.net

I have the SQL query below, built with the Query Builder in Visual Studio. The same user appears 3 times in the result because the user has 3 different skills. How can I merge the 3 skills together, either in the SQL query or in a ListView control, so that only one row is displayed per user with all 3 skills listed?
SELECT users.role_id, users.username, users.first_name, users.last_name, users.description, roles.role_id, roles.role, skills.skill_id, skills.user_id, skills.skill
FROM users
INNER JOIN roles ON users.role_id = roles.role_id
INNER JOIN skills ON users.user_id = skills.user_id
WHERE (users.role_id = 3)

Use For XML Path(''), Type. It is a bit of a hack, because you're really creating an XML string without a root and fashioning odd elements, but it works well. Be sure to include the Type bit, otherwise the XML trick will attempt to convert special characters, like < and & into their XML escape sequences (here is an example).
Here is a simplified version of your problem in a SQL Fiddle. Below is the relevant Select snippet.
SELECT users.user_id, users.first_name,
    STUFF(
        (SELECT ', ' + skill
         FROM skills
         WHERE users.user_id = skills.user_id
         FOR XML PATH(''), TYPE
        ).value('.', 'VARCHAR(MAX)')
    , 1, 2, '') AS skill_list
FROM users
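If the merge is done on the client side instead (for example before binding to the ListView), the same idea can be sketched with LINQ: group the joined rows by user and join the skills into one string. The row shape and the sample data below are hypothetical stand-ins for the result of the original query.

```csharp
using System;
using System.Linq;

// Hypothetical rows as returned by the joined query: one row per (user, skill).
var rows = new[]
{
    (UserId: 1, FirstName: "Ann", Skill: "C#"),
    (UserId: 1, FirstName: "Ann", Skill: "SQL"),
    (UserId: 1, FirstName: "Ann", Skill: "HTML"),
    (UserId: 2, FirstName: "Bob", Skill: "Java"),
};

// One result per user, with that user's skills joined into a single comma-separated string.
var merged = rows
    .GroupBy(r => (r.UserId, r.FirstName))
    .Select(g => (g.Key.UserId, g.Key.FirstName, SkillList: string.Join(", ", g.Select(r => r.Skill))))
    .ToList();

foreach (var user in merged)
    Console.WriteLine($"{user.FirstName}: {user.SkillList}");
// prints "Ann: C#, SQL, HTML" then "Bob: Java"
```

Doing it in SQL keeps the result set small, while the client-side version avoids the XML trick, so which one fits better depends on how many rows are involved.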

Try using Stuff and For Xml
Here's the Fiddle:
http://sqlfiddle.com/#!6/fcf71/5
See if it helps, it's just a sample so you will have to change the column names.

You can use PIVOT on the Skill then group those skills into one column.
To make it simple, I test it with some sample data like the following:
CREATE SCHEMA _Test
CREATE TABLE _Test.SkillSet(SkillId INT IDENTITY(1,1) PRIMARY KEY, SkillName NVARCHAR(64))
INSERT INTO _Test.SkillSet(SkillName) VALUES('C/C++')
INSERT INTO _Test.SkillSet(SkillName) VALUES('C#')
INSERT INTO _Test.SkillSet(SkillName) VALUES('Java')
CREATE TABLE _Test.Employees(EmpId INT IDENTITY(1,1) PRIMARY KEY, FullName NVARCHAR(256))
INSERT INTO _Test.Employees(FullName) VALUES('Philip Hatt')
INSERT INTO _Test.Employees(FullName) VALUES('John Rosh')
CREATE TABLE _Test.Employee_Skill(EmpId INT FOREIGN KEY REFERENCES _Test.Employees(EmpId), SkillId INT FOREIGN KEY REFERENCES _Test.SkillSet(SkillId))
INSERT INTO _Test.Employee_Skill(EmpId, SkillId) VALUES(1, 1)
INSERT INTO _Test.Employee_Skill(EmpId, SkillId) VALUES(1, 2)
INSERT INTO _Test.Employee_Skill(EmpId, SkillId) VALUES(1, 3)
INSERT INTO _Test.Employee_Skill(EmpId, SkillId) VALUES(2, 2)
INSERT INTO _Test.Employee_Skill(EmpId, SkillId) VALUES(2, 3)
;WITH tEmpSkill
AS
(
    SELECT A.EmpId, A.FullName, C.SkillName
    FROM _Test.SkillSet C RIGHT JOIN
    (
        _Test.Employees A LEFT JOIN _Test.Employee_Skill B
        ON A.EmpId = B.EmpId
    )
    ON B.SkillId = C.SkillId
)
SELECT * FROM tEmpSkill
PIVOT(COUNT(SkillName) FOR SkillName IN([C/C++], [C#], [Java])) AS Competency
The query above gives me an intermediate result:
[Image: PIVOT result]
Now you can easily make a string containing all the skills needed for each employee. You can also search for some articles to use the PIVOT with unknown number of columns (skill sets), which may better serve your need.
Hope this can help.

Solution for INSERT or UPDATE and fetching newly created id on SQL server

Is it possible to perform an insert or update with the following constraints:
1. Dapper is used in the project.
2. The primary key is an auto-incrementing positive integer.
3. The newly inserted data may not have a PK value (or has a value of 0).
4. The data needs to have the PK value on a newly inserted row.
5. The query is being generated procedurally, and generalizing it would be preferable.
Something like:
int newId = db.QueryValue<int>( <<insert or update>>, someData );
I have read about different solutions, and the best solution seems to be this one :
merge tablename as target
using (values ('new value', 'different value'))
as source (field1, field2)
on target.idfield = 7
when matched then
update
set field1 = source.field1,
field2 = source.field2,
...
when not matched then
insert ( idfield, field1, field2, ... )
values ( 7, source.field1, source.field2, ... )
but
1. it seems to fail on the third constraint, and
2. it is not guaranteed to return the newly generated id.
Because of the 5th constraint (or preference), a stored procedure seems overly complicated.
What are the possible solutions? Thanks!

If your table has an auto-increment field, you can't assign a value to that field when inserting a record. OK you can, but it's normally a bad idea :)
Using the T-SQL MERGE statement you can put all of the values into the source table, including your default invalid identity value, then write the insert clause as:
when not matched then
insert (field1, field2, ...)
values (source.field1, source.field2, ...)
and use the OUTPUT clause to get the inserted identity value:
OUTPUT inserted.idfield
That said, I think you might be complicating your SQL code generation a little, especially for tables with a lot of fields. It is often better to generate distinct UPDATE and INSERT queries... especially if you've got some way of tracking the changes to the object so that you can only update the changed fields.
Assuming you're working on MS SQL, you can use SCOPE_IDENTITY() function after the INSERT statement to get the value of the identity field for the record in a composite statement:
INSERT INTO tablename(field1, field2, ...)
VALUES('field1value', 'field2value', ...);
SELECT CAST(SCOPE_IDENTITY() AS INT) ident;
When you execute this SQL statement you'll get back a resultset with the inserted identity in a single column. Your db.QueryValue<int> call will then return the value you're after.
For standard integer auto-increment fields the above is fine. For other field types, or for a more general case, try casting SCOPE_IDENTITY() result to VARCHAR(MAX) and parse the resultant string value to whichever type your identity column expects - GUID, etc.
In the general case, try this in your db class:
public string InsertWithID(string insertQuery, params object[] parms)
{
    // Append a SELECT that returns the identity generated by the INSERT.
    string query = insertQuery + "\nSELECT CAST(SCOPE_IDENTITY() AS VARCHAR(MAX)) ident;\n";
    return this.QueryValue<string>(query, parms);
}
And/or:
public int InsertWithIntID(string insertQuery, params object[] parms)
{
    // Same idea, but the identity is returned as an INT.
    string query = insertQuery + "\nSELECT CAST(SCOPE_IDENTITY() AS INT) ident;\n";
    return this.QueryValue<int>(query, parms);
}
That way you can just prepare your insert query and call the appropriate InsertWithID method to get the resultant identity value. That should satisfy your 5th constraint with luck :)

Ignoring accents while searching the database using Entity Framework

I have a database table that contains names with accented characters, such as ä.
I need to get all records using EF4 from a table that contains some substring regardless of accents.
So the following code:
myEntities.Items.Where(i => i.Name.Contains("a"));
should return all items with a name containing a, but also all items containing ä, â and so on. Is this possible?

If you set an accent-insensitive collation order on the Name column then the queries should work as required.

Setting an accent-insensitive collation will fix the problem.
You can change the collation of a column in SQL Server and Azure SQL Database with the following query.
ALTER TABLE TableName
ALTER COLUMN ColumnName NVARCHAR (100)
COLLATE SQL_LATIN1_GENERAL_CP1_CI_AI NOT NULL
SQL_LATIN1_GENERAL_CP1_CI_AI is the collation where LATIN1_GENERAL is English (United States), CP1 is code page 1252, CI is case-insensitive, and AI is accent-insensitive.

I know this is not a very clean solution, but after reading this I tried something like the following, passing the search term as a parameter:
var query = this.DataContext.Users.SqlQuery("SELECT * FROM dbo.Users WHERE LastName LIKE '%' + @p0 + '%' COLLATE Latin1_General_CI_AI", parameters.SearchTerm);
After that you are still able to call methods on 'query' object like Count, OrderBy, Skip etc.

You could create an SQL Function to remove the diacritics, by applying to the input string the collation SQL_Latin1_General_CP1253_CI_AI, like so:
CREATE FUNCTION [dbo].[RemoveDiacritics] (
    @input varchar(max)
) RETURNS varchar(max)
AS BEGIN
    DECLARE @result VARCHAR(max);
    SELECT @result = @input COLLATE SQL_Latin1_General_CP1253_CI_AI;
    RETURN @result;
END
Then add it in the DB context (in this case ApplicationDbContext) by mapping it with the attribute DbFunction:
public class ApplicationDbContext : IdentityDbContext<CustomIdentityUser>
{
    [DbFunction("RemoveDiacritics", "dbo")]
    public static string RemoveDiacritics(string input)
    {
        throw new NotImplementedException("This method can only be used with LINQ.");
    }
    public ApplicationDbContext(DbContextOptions<ApplicationDbContext> options)
        : base(options)
    {
    }
}
And use it in a LINQ query, for example:
var query = await db.Users.Where(a => ApplicationDbContext.RemoveDiacritics(a.Name).Contains(ApplicationDbContext.RemoveDiacritics(filter))).ToListAsync();

Accent-insensitive Collation as Stuart Dunkeld suggested is definitely the best solution ...
But maybe good to know:
Michael Kaplan once posted about stripping diacritics:
// Requires: using System.Globalization; using System.Text;
static string RemoveDiacritics(string stIn)
{
    // Decompose each character into its base character plus combining marks.
    string stFormD = stIn.Normalize(NormalizationForm.FormD);
    StringBuilder sb = new StringBuilder();
    for (int ich = 0; ich < stFormD.Length; ich++)
    {
        UnicodeCategory uc = CharUnicodeInfo.GetUnicodeCategory(stFormD[ich]);
        // Keep everything except the combining (non-spacing) marks.
        if (uc != UnicodeCategory.NonSpacingMark)
        {
            sb.Append(stFormD[ich]);
        }
    }
    return sb.ToString().Normalize(NormalizationForm.FormC);
}
Source
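So your code would be as follows. RemoveDiacritics cannot be translated to SQL by LINQ to Entities, so the filter has to run in memory, for example after AsEnumerable():
myEntities.Items.AsEnumerable().Where(i => RemoveDiacritics(i.Name).Contains("a"));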
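For in-memory filtering there is also a built-in alternative to stripping diacritics by hand: a culture-aware search that ignores non-spacing marks. The sketch below is a possible approach and not what the answer above uses. ContainsIgnoringAccents is a hypothetical helper name, and the result depends on the culture data available to the runtime, so treat it as an assumption to verify in your environment.

```csharp
using System;
using System.Globalization;

// Hypothetical helper: true when 'value' occurs in 'source', ignoring diacritics and case.
static bool ContainsIgnoringAccents(string source, string value)
{
    CompareInfo compare = CultureInfo.InvariantCulture.CompareInfo;
    // IgnoreNonSpace makes the search skip combining marks such as the dots on "ä".
    return compare.IndexOf(source, value, CompareOptions.IgnoreNonSpace | CompareOptions.IgnoreCase) >= 0;
}

Console.WriteLine(ContainsIgnoringAccents("Bär", "a"));
Console.WriteLine(ContainsIgnoringAccents("Bär", "x"));
```

Like RemoveDiacritics, this runs on the client only, so for large tables the accent-insensitive collation remains the better option.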
