How to check whether multiple tables were updated? - c#

While writing a stored procedure, I removed SET NOCOUNT ON and checked the rows-affected messages to see whether the table values had changed.
Then I realized this gives bad performance.
So I implemented it with @@ROWCOUNT instead.
But that is only a good idea when checking a single table.
In my stored procedure I update more than one table and delete from more than one table.
How can I efficiently return to the server side (where I will use .ExecuteScalar) whether the values were updated/deleted?

Variant 1: Create a trigger to log changes into a table, then read the information back from that table.
Variant 2: Use the system variable @@ROWCOUNT to get the number of rows affected.
Variant 3: Capture the number of rows affected after each statement into your own variable and return it with a SELECT, an OUTPUT parameter, or the OUTPUT clause (see the sketch below).
Variant 4: Rewrite your code to the pattern: int numberOfRecords = comm.ExecuteNonQuery();
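For Variant 3, a minimal sketch of the pattern, assuming a hypothetical procedure dbo.UpdateAndDelete that accumulates @@ROWCOUNT after each statement and hands the total back as a single scalar for ExecuteScalar:

using System;
using System.Data;
using System.Data.SqlClient;

static int RunUpdateAndDelete(string connectionString)
{
    // The procedure body would look roughly like:
    //   SET NOCOUNT ON;
    //   DECLARE @total int = 0;
    //   UPDATE dbo.TableA SET ... ;        SET @total = @total + @@ROWCOUNT;
    //   DELETE FROM dbo.TableB WHERE ... ; SET @total = @total + @@ROWCOUNT;
    //   SELECT @total;  -- single scalar value for ExecuteScalar
    using (var conn = new SqlConnection(connectionString))
    using (var cmd = new SqlCommand("dbo.UpdateAndDelete", conn))
    {
        cmd.CommandType = CommandType.StoredProcedure;
        conn.Open();
        // Non-zero means at least one row was updated or deleted.
        return Convert.ToInt32(cmd.ExecuteScalar());
    }
}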

Related

Read from one table and insert into another - one row at a time

I am dealing with a huge database with millions of rows. I would like to run an SQL statement through C#, which selects 1.2 million rows from one database, and inserts them into another after parsing and modifying some data.
I originally wanted to do this by first running the SELECT statement and parsing the data by iterating through the MySqlDataReader object that contains the data. That would be a big memory overhead, so I decided to select one row, parse it, insert it into the other database, and then move on to the next row.
How can this be done? I have tried the SELECT ... INTO syntax for a MySQL query, but this still seems to select all the data first and insert it afterwards.
Use the SqlBulkCopy class to move data from one source to another:
http://msdn.microsoft.com/en-us/library/system.data.sqlclient.sqlbulkcopy%28v=vs.110%29.aspx
I am not sure if you are able to add a new column to the existing table. If you are, you can use the new column as a flag, e.g. TRANSFERRED (boolean).
You would select one row at a time with the condition TRANSFERRED = FALSE and process it. After the row is processed, update it to TRANSFERRED = TRUE.
Alternatively, you need a unique id column in your existing table. Create a temp table that stores the ids of processed rows; that way you will know which rows have and have not been processed.
I am not quite sure what your error is. For your case, I suggest you use SELECT TOP 1000 (LIMIT 1000 in MySQL) to fetch the data in chunks, because inserting rows one by one is really slow, and then build a multi-row INSERT for each chunk. Note that SqlBulkCopy is only for SQL Server. Use a StringBuilder to build the SQL query, because concatenating plain strings has a big overhead.
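A minimal sketch of that batched approach with Connector/NET, assuming hypothetical table and column names (SourceTable/DestTable with Id, Payload); in real code, prefer parameters over string escaping:

using System.Text;
using MySql.Data.MySqlClient;

static void TransferInBatches(string sourceConnStr, string destConnStr)
{
    using (var src = new MySqlConnection(sourceConnStr))
    using (var dst = new MySqlConnection(destConnStr))
    {
        src.Open();
        dst.Open();
        using (var select = new MySqlCommand("SELECT Id, Payload FROM SourceTable", src))
        using (var reader = select.ExecuteReader()) // iterate rows one at a time (check your connector's buffering behavior)
        {
            var batch = new StringBuilder();
            int rowsInBatch = 0;
            while (reader.Read())
            {
                // Parse/modify the row here; the quote doubling is a crude escape.
                string payload = reader.GetString(1).Replace("'", "''");
                batch.Append(rowsInBatch == 0
                    ? "INSERT INTO DestTable (Id, Payload) VALUES "
                    : ", ");
                batch.Append('(').Append(reader.GetInt32(0))
                     .Append(",'").Append(payload).Append("')");
                if (++rowsInBatch == 1000) // flush every 1000 rows
                {
                    new MySqlCommand(batch.ToString(), dst).ExecuteNonQuery();
                    batch.Clear();
                    rowsInBatch = 0;
                }
            }
            if (rowsInBatch > 0) // flush the final partial batch
                new MySqlCommand(batch.ToString(), dst).ExecuteNonQuery();
        }
    }
}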

Add column to existing SQL Server table - Implications

I have an existing table in SQL Server with existing entries (over 1 million in fact).
This table gets updated, inserted and selected from on a regular basis by a front-end application. I want/need to add a datetime column e.g. M_DateModified that can be updated like so:
UPDATE Table SET M_DateModified = GETDATE()
whenever a button gets pressed on the front-end and a stored procedure gets called. This column will be added to an existing report as requested.
My problem, and question, is this: being one of the core tables of our app, will ALTERing the table and adding an additional column break other existing queries? Obviously, you can't insert into a table without specifying all values for all columns, so any existing INSERT queries will break (WHICH is a massive problem).
Any help on the best solution to this problem would be much appreciated.
First, as marc_s says, it should only affect SELECT * queries, and not even all of those would necessarily be affected.
Secondly, you only need to specify values for non-nullable fields without defaults on an INSERT, so if you make the new column NULL-able, you don't have to worry about that. Further, for a Created_Date-type column, it is typical to add a DEFAULT of GETDATE(), which fills it in for you when it is not specified.
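For example, a minimal sketch of adding such a column (hypothetical table and constraint names; run once from SSMS or from code):

using System.Data.SqlClient;

static void AddModifiedColumn(string connectionString)
{
    // NULL-able plus a DEFAULT means existing INSERTs keep working;
    // note the DEFAULT only fills rows inserted after the change,
    // and existing rows stay NULL until something updates them.
    const string ddl =
        "ALTER TABLE dbo.MyCoreTable " +
        "ADD M_DateModified datetime NULL " +
        "CONSTRAINT DF_MyCoreTable_MDateModified DEFAULT (GETDATE())";
    using (var conn = new SqlConnection(connectionString))
    using (var cmd = new SqlCommand(ddl, conn))
    {
        conn.Open();
        cmd.ExecuteNonQuery();
    }
}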
Thirdly, if you are still worried about impacting your existing code-base, then do the following:
Rename your table to something like "physicalTable".
Create a View with the same name that your table formerly had, that does a SELECT ... FROM physicalTable, listing the columns explicitly and in the same order, but do not include the M_DateModified field in it.
Leave your code unmodified, now referencing the View, instead of directly accessing the table.
Now your code can safely interact with the table without any changes (SQL DML code cannot tell the difference between a Table and a writeable View like this).
Finally, this kind of "ModifiedDate" column is a common need and is most often handled, first by making it NULL-able, then by adding an Insert & Update trigger that sets it automatically:
CREATE TRIGGER trg_physicalTable_SetModified ON physicalTable
AFTER INSERT, UPDATE AS
UPDATE y
SET M_DateModified = GETDATE()
FROM physicalTable y
JOIN inserted i ON y.PkId = i.PkId
This way the application does not have to maintain the field itself. As an added bonus, neither can the application set it incorrectly or falsely (this is a common and acceptable use of triggers in SQL).
If the new column is not mandatory, you have nothing to worry about. Unless you have some knuckleheads who wrote SELECT statements with a "*" instead of a column list.
Well, as long as your SELECTs are not *, those should be fine. For the INSERTs, if you give the field a default of GETDATE() and allow NULLs, you can exclude it and it will still be filled.
Depends on how your other queries are set up. If they are SELECT [Item1], [Item2], etc., then you won't face any issues. If it's a SELECT * FROM, then you may experience some unexpected results.
Keep in mind how you want to set it up: you'll either have to make it nullable, which could give you fits down the road, or set a default date, which could give you incorrect data for reporting, retrieval, queries, etc.

Primary key violation error in sql server 2008

I have created two threads in C# and I am calling two separate functions in parallel. Both functions read the last ID from the XYZ table and insert a new record with the value ID+1. Here the ID column is the primary key. When I execute both functions I get a primary key violation error. Both functions contain the query below:
insert into XYZ values((SELECT max(ID)+1 from XYZ),'Name')
It seems both functions read the value at the same time and try to insert the same value.
How can I solve this problem?
Let the database handle selecting the ID for you. It's obvious from your code above that what you really want is an auto-incrementing integer ID column, which the database can definitely handle for you. So set up your table properly and, instead of your current insert statement, do this:
insert into XYZ values('Name')
If your database table is already set up, you may be able to alter the column with a statement like the one below (note this is MySQL syntax; SQL Server cannot turn an existing column into an identity with a simple ALTER, as another answer explains):
alter table your_table modify column your_table_id int(size) auto_increment
Finally, if none of these solutions is adequate for whatever reason (including, as you indicated in the comments section, the inability to edit the table schema), then you can do as one of the other users suggested in the comments and create a synchronized method to find the next ID. You would basically create a static method that returns an int, issue your select-max-ID statement in that method, and use the returned result to insert your next record into the table. Since this method cannot guarantee a successful insert (external applications can still insert into the same table), you would also have to catch exceptions and retry on failure.
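A minimal sketch of that synchronized fallback; the lock only serializes inserts within this one process, which is why the retry is still needed (table and column names are from the question, everything else is hypothetical):

using System.Data.SqlClient;

static class XyzInserter
{
    private static readonly object IdLock = new object();

    public static void InsertName(string connectionString, string name)
    {
        lock (IdLock) // serializes ID generation within this process only
        {
            using (var conn = new SqlConnection(connectionString))
            using (var cmd = new SqlCommand(
                "INSERT INTO XYZ (ID, Name) " +
                "SELECT ISNULL(MAX(ID), 0) + 1, @name FROM XYZ", conn))
            {
                cmd.Parameters.AddWithValue("@name", name);
                conn.Open();
                try { cmd.ExecuteNonQuery(); }
                catch (SqlException ex) when (ex.Number == 2627) // PK violation
                {
                    // Another writer got the same ID first;
                    // re-running re-reads MAX(ID), so retry once.
                    cmd.ExecuteNonQuery();
                }
            }
        }
    }
}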
Set ID column to be "Identity" column. Then, you can execute your queries as:
insert into XYZ values('Name')
I think that you can't use ALTER TABLE to change a column to be an identity after the column is created. Use Management Studio to set this column to be an identity. If your table has many rows, this can be a long-running process, because it will actually copy your data to a new table (it performs table re-creation).
Most likely that option is disabled in your Management Studio. To enable it, open Tools -> Options -> Designers and uncheck the option "Prevent saving changes that require table re-creation". Depending on your table size, you will probably have to raise the timeout, too. Your table will be locked during that time.
A solution for such problems is to generate the ID using some kind of sequence.
For example, in SQL Server (2012 and later) you can create a sequence using the command below:
CREATE SEQUENCE Test.CountBy1
START WITH 1
INCREMENT BY 1 ;
GO
Then in C#, you can retrieve the next value from Test.CountBy1 and assign it to the ID before inserting.
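A minimal sketch of that, reusing the Test.CountBy1 sequence above (the XYZ schema is from the question, the rest is hypothetical):

using System;
using System.Data.SqlClient;

static void InsertWithSequence(string connectionString)
{
    using (var conn = new SqlConnection(connectionString))
    {
        conn.Open();
        // Ask SQL Server for the next sequence value...
        long nextId;
        using (var next = new SqlCommand("SELECT NEXT VALUE FOR Test.CountBy1", conn))
            nextId = Convert.ToInt64(next.ExecuteScalar());
        // ...then use it as the ID for the insert.
        using (var insert = new SqlCommand(
            "INSERT INTO XYZ (ID, Name) VALUES (@id, @name)", conn))
        {
            insert.Parameters.AddWithValue("@id", nextId);
            insert.Parameters.AddWithValue("@name", "Name");
            insert.ExecuteNonQuery();
        }
    }
}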
It sounds like you want a higher transaction isolation level or more restrictive locking.
I don't use these features too often, so hopefully somebody will suggest an edit if I'm wrong, but you want one of these:
-- specify the strictest isolation level
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE
insert into XYZ values((SELECT max(ID)+1 from XYZ),'Name')
or
-- make locks exclusive so other transactions cannot access the same rows
insert into XYZ values((SELECT max(ID)+1 from XYZ WITH (XLOCK)),'Name')

Show how many rows were deleted

I use a C# program and my database is in SQL Server 2008.
When a user deletes some rows from the database, I want to show him/her in the Windows application how many rows were deleted.
I want to know how I can send the SQL message to C# and show it to the user.
For example, when I delete 4 rows from a table, SQL Server shows a message like "(4 row(s) affected)". Now I want to send the number 4 to my C# program. How can I do it? Thank you.
If you are using SqlCommand from your .NET application to perform your delete/update, the result of ExecuteNonQuery() returns the number of rows affected by the last statement of the command.
See http://msdn.microsoft.com/en-us/library/system.data.sqlclient.sqlcommand.executenonquery.aspx.
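A minimal sketch, assuming a hypothetical Orders table in a WinForms app:

using System.Data.SqlClient;
using System.Windows.Forms;

static void DeleteAndReport(string connectionString)
{
    using (var conn = new SqlConnection(connectionString))
    using (var cmd = new SqlCommand(
        "DELETE FROM Orders WHERE Status = @status", conn))
    {
        cmd.Parameters.AddWithValue("@status", "Cancelled");
        conn.Open();
        int rowsDeleted = cmd.ExecuteNonQuery(); // rows affected by the DELETE
        MessageBox.Show(rowsDeleted + " row(s) deleted.");
    }
}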
If you're using the System.Data.SqlClient.SqlCommand.ExecuteNonQuery method or System.Data.Common.DbCommand.ExecuteNonQuery method, then the return value should be the number of rows affected by your statement (the last statement in your command, I think).
There is a caveat to this...if you execute a batch or stored procedure that does SET NOCOUNT ON, then the number of rows affected by each statement is not reported and ExecuteNonQuery will return -1 instead.
In T-SQL, there is a @@ROWCOUNT variable that you can access to get the number of rows affected by the last statement. Obviously you would need to grab it immediately after your DELETE statement, but I believe you could do a RETURN @@ROWCOUNT within your T-SQL if you are using SET NOCOUNT ON.
Alternatives would be to return the value as an OUTPUT parameter, especially if you have a batch of multiple statements and you'd like to know how many rows are affected by each. Some people like to use the T-SQL RETURN statement to report success/failure, so you may want to avoid returning "number of rows affected" for consistency's sake.
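A minimal sketch of the OUTPUT-parameter variant, assuming a hypothetical procedure dbo.DeleteOrders whose body sets @RowsDeleted from @@ROWCOUNT right after its DELETE:

using System.Data;
using System.Data.SqlClient;

static int DeleteViaProc(string connectionString)
{
    using (var conn = new SqlConnection(connectionString))
    using (var cmd = new SqlCommand("dbo.DeleteOrders", conn))
    {
        cmd.CommandType = CommandType.StoredProcedure;
        var rows = cmd.Parameters.Add("@RowsDeleted", SqlDbType.Int);
        rows.Direction = ParameterDirection.Output;
        conn.Open();
        cmd.ExecuteNonQuery();
        return (int)rows.Value; // filled in by: SET @RowsDeleted = @@ROWCOUNT
    }
}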
I imagine you would want to run a SELECT COUNT(*) with the same WHERE clause before you issue the DELETE, then capture the number and use it as needed.
Use the @@ROWCOUNT system variable.
You can return it from a Stored Procedure if you are using them.

Can I get the rowcount before executing a stored procedure?

I have some complex stored procedures that may return many thousands of rows, and take a long time to complete.
Is there any way to find out how many rows are going to be returned before the query executes and fetches the data?
This is with Visual Studio 2005, a Winforms application and SQL Server 2005.
You mentioned your stored procedures take a long time to complete. Is the majority of the time taken up during the process of selecting the rows from the database or returning the rows to the caller?
If it is the latter, maybe you can create a mirror version of your SP that just gets the count instead of the actual rows. If it is the former, well, there isn't really that much you can do since it is the act of finding the eligible rows which is slow.
A solution to your problem might be to re-write the stored procedure so that it limits the result set to some number, like:
SELECT TOP 1000 * FROM tblWHATEVER
in SQL Server, or
SELECT * FROM tblWHATEVER WHERE ROWNUM <= 1000
in Oracle. Or implement a paging solution so that the result set of each call is acceptably small.
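A minimal paging sketch using ROW_NUMBER() (available from SQL Server 2005 on), with hypothetical column names on the tblWHATEVER table from above:

using System.Data.SqlClient;

static void FetchPage(string connectionString, int first, int last)
{
    const string sql =
        "SELECT * FROM ( " +
        "  SELECT ROW_NUMBER() OVER (ORDER BY OrderDate) AS rn, * " +
        "  FROM tblWHATEVER " +
        ") AS paged " +
        "WHERE rn BETWEEN @first AND @last";
    using (var conn = new SqlConnection(connectionString))
    using (var cmd = new SqlCommand(sql, conn))
    {
        cmd.Parameters.AddWithValue("@first", first);
        cmd.Parameters.AddWithValue("@last", last);
        conn.Open();
        using (var reader = cmd.ExecuteReader())
        {
            while (reader.Read())
            {
                // bind each row to the grid, etc.
            }
        }
    }
}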
Make a stored proc to count the rows first:
SELECT COUNT(*) FROM table
Unless there's some aspect of the business logic of your app that allows calculating this, no. The database is going to have to do all the WHERE and JOIN logic to figure out how many rows there are, and that's the vast majority of the time spent in the SP.
You can't get the rowcount of a procedure without executing the procedure.
You could make a different procedure that accepts the same parameters, the purpose of which is to tell you how many rows the other procedure should return. However, the steps required by this procedure would normally be so similar to those of the main procedure that it should take just about as long as just executing the main procedure.
You would have to write a different version of the stored procedure to get a row count. This one would probably be much faster, because you could eliminate joins to tables you aren't filtering against, remove ordering, etc. For example, if your stored proc executed SQL such as:
select firstname, lastname, email, orderdate
from customer
inner join productorder on customer.customerid = productorder.customerid
where orderdate > @orderdate
order by lastname, firstname;
your counting version would be something like:
select count(*) from productorder where orderdate > @orderdate;
Not in general.
Through knowledge about the operation of the stored procedure, you may be able to get either an estimate or an accurate count (for instance, if the "core" or "base" table of the query is able to be quickly calculated, but it is complex joins and/or summaries which drive the time upwards).
But you would have to call the counting SP first and then the data SP, or you could look at using an SP that returns multiple result sets.
It could take as long to get a row count as to get the actual data, so I wouldn't advocate performing a count in most cases.
Some possibilities:
1) Does SQL Server expose its query optimiser findings in some way? i.e. can you parse the query and then obtain an estimate of the rowcount? (I don't know SQL Server).
2) Perhaps based on the criteria the user gives you can perform some estimations of your own. For example, if the user enters 'S%' in the customer surname field to query orders you could determine that that matches 7% (say) of the customer records, and extrapolate that the query may return about 7% of the order records.
Going on what Tony Andrews said in his answer, you can get the estimated query plan for your query with:
SET SHOWPLAN_TEXT OFF
GO
SET SHOWPLAN_ALL ON
GO
-- Replace with the call to your stored procedure
SELECT * FROM MyTable
GO
SET SHOWPLAN_ALL OFF
GO
This should return a table, or several tables, from which you can read the estimated row count (the EstimateRows column) for your query.
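A minimal sketch of reading that from C#; SET SHOWPLAN_ALL has to go in its own batch on the same connection, and which plan row carries the estimate depends on the plan shape, so this just takes the first row with a non-NULL EstimateRows:

using System;
using System.Data.SqlClient;

static double? EstimateRowCount(string connectionString, string query)
{
    using (var conn = new SqlConnection(connectionString))
    {
        conn.Open();
        new SqlCommand("SET SHOWPLAN_ALL ON", conn).ExecuteNonQuery();
        double? estimate = null;
        using (var cmd = new SqlCommand(query, conn))
        using (var reader = cmd.ExecuteReader()) // returns plan rows, not data
        {
            while (reader.Read() && estimate == null)
            {
                object v = reader["EstimateRows"];
                if (v != DBNull.Value)
                    estimate = Convert.ToDouble(v);
            }
        }
        new SqlCommand("SET SHOWPLAN_ALL OFF", conn).ExecuteNonQuery();
        return estimate;
    }
}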
You need to analyze the returned data set to determine what is a logical (meaningful) primary key for the result set being returned. In general this WILL be much faster than the complete procedure, because the server is not constructing a result set from data in all the columns of each row of each table; it is simply counting the rows. In general, it may not even need to read the actual table rows off disk to do this; it may simply need to count index nodes.
Then write another SQL statement that only includes the tables necessary to generate those key columns (hopefully this is a subset of the tables in the main SQL query), with the same WHERE clause and the same filtering predicate values.
Then add another optional parameter to the stored proc called, say, @CountsOnly, with a default of false (0), like so:
Alter Procedure <storedProcName>
    @param1 Type,
    -- Other current params
    @CountsOnly TinyInt = 0
As
Set NoCount On
If @CountsOnly = 1
    Select Count(*)
    From TableA A
    Join TableB B On etc. etc...
    Where < here put all filtering predicates >
Else
    <Here put old SQL that returns complete resultset with all data>
Return 0
You can then just call the same stored proc with @CountsOnly set equal to 1 to get just the count of records. Old code that calls the proc will still function as it used to, since the parameter defaults to false (0) when it is not included.
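A minimal sketch of the two-step call from the C# side, assuming the hypothetical procedure above:

using System;
using System.Data;
using System.Data.SqlClient;

static int GetCountOnly(string connectionString)
{
    using (var conn = new SqlConnection(connectionString))
    using (var cmd = new SqlCommand("storedProcName", conn))
    {
        cmd.CommandType = CommandType.StoredProcedure;
        // ... add the procedure's normal parameters here ...
        cmd.Parameters.AddWithValue("@CountsOnly", 1);
        conn.Open();
        // The If branch returns a single Count(*), so ExecuteScalar fits.
        return Convert.ToInt32(cmd.ExecuteScalar());
    }
    // Call again with @CountsOnly = 0 (or omit it) for the full result set.
}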
It's at least technically possible to run a procedure that puts the result set in a temporary table. Then you can find the number of rows before you move the data from server to application and would save having to create the result set twice.
But I doubt it's worth the trouble unless creating the result set takes a very long time, and in that case it may be big enough that the temp table would be a problem. Almost certainly the time to move the big table over the network will be many times what is needed to create it.
