Problem generating primary key with SSIS - c#

My boss and I are having a problem with an SSIS project.
Our data model sucks and doesn't have an automatic primary key, so we have to do the classic and nasty
Select Max(id) + 1 from customer
The problem is that between the moment my script task generates the PK and the moment I insert, about 10 more rows have already passed through the script task, so I get the same ID 10 times and the app crashes big time!
How can I handle this in SSIS?

I found a simple answer: I put the records I want to insert into a DataSet without any PK id, and outside my data flow I run a foreach loop that gets a new PK ID for each record and inserts them one by one.
DONE!

Create a TempWork table with the exact same structure as the final destination table, except make the PK an IDENTITY(n,1), where "n" is the next value based on the final destination table's PK. Use SSIS to insert into this TempWork table, and the IDs will be generated for you. When it is all done, do this:
INSERT INTO FinalTable (PK, col1, col2, ...) SELECT PK, col1, col2, ... FROM TempWork
then DROP TABLE TempWork
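To make the steps concrete, here is a minimal T-SQL sketch of that approach, assuming the destination is dbo.Customer with an int PK named Id and a single extra column Name (all names are illustrative); the dynamic SQL is only there because an IDENTITY seed cannot be parameterized:

-- Seed TempWork's identity at the destination's current MAX + 1
DECLARE @next int = (SELECT ISNULL(MAX(Id), 0) + 1 FROM dbo.Customer);
DECLARE @sql nvarchar(max) = N'
CREATE TABLE dbo.TempWork (
    Id   int IDENTITY(' + CAST(@next AS nvarchar(12)) + N', 1) NOT NULL PRIMARY KEY,
    Name nvarchar(100) NOT NULL
    -- ...remaining columns, same structure as dbo.Customer
);';
EXEC (@sql);

-- The SSIS data flow loads dbo.TempWork here; Id values are generated automatically.

-- When the load finishes, copy everything across and drop the work table.
INSERT INTO dbo.Customer (Id, Name)
SELECT Id, Name FROM dbo.TempWork;

DROP TABLE dbo.TempWork;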

Editing duplicate values in a database

I have a DataGridView pulling some items from my database. What I want to achieve is to be able to edit the pack size or the bar_code fields. I am aware of how to update values in a database, but how would I go about doing it if the data is the same? Meaning, in many instances a bar code would have multiple pack sizes related to the one bar code number. Let's say I have the below screenshot. A data entry error was made and the bar_code and PackSize columns are exactly the same. I want to change the first bar code to "1234". How would I achieve this? I can't say update barcode to 'textBox1.Text' where bar_code = '771313166386', because it would then change both rows. How do I go about only focusing on one row of data at a time?
You can try using this query to update only the first row:
UPDATE TOP (1) my_table
SET bar_code = '1234'
WHERE bar_code = '771313166386'
You should have an auto-increment id column or a Primary key in your table.
I'd suggest you handle duplicate-data manipulation at the backend rather than pulling the rows into the grid and handling it there.
The following query will help you retrieve the duplicate records based on the mentioned columns. You can change it to UPDATE or DELETE as per your requirement.
-- Using cte and ranking function
;With CTE
As
(
    Select
        Product,
        Description,
        BarCode,
        PackSize,
        Row_Number() Over(Partition By Product, BarCode, PackSize Order By Product) As RowNum
    From YourTable
)
Select * From CTE
-- Where RowNum > 1;
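For instance, a sketch of the DELETE variant, which keeps one row from each (Product, BarCode, PackSize) group and removes the rest (deleting through the CTE works here because it selects from a single table):

;With CTE
As
(
    Select
        Row_Number() Over(Partition By Product, BarCode, PackSize Order By Product) As RowNum
    From YourTable
)
Delete From CTE
Where RowNum > 1;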
Hope this is helpful :)
This might not directly answer your question, but it is important to mention that your table design is incorrect. You should ensure data integrity by creating a primary key on your table.
That way, when you need to update a product, you have only one row to update.
Then you can add more tables and use foreign key references between them.
You need a way to uniquely identify the products. Going by your sample data, I guess there isn't any primary key on your table.
One thing you can do is specify a unique constraint on a set of columns to ensure that this type of data entry cannot happen.
If you cannot come up with a list of columns that uniquely identifies the rows, you can use a surrogate key by adding an identity column and then, when updating, always use a predicate of the form thisIdentityColumn = value.
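A rough T-SQL sketch of both options, with illustrative table and column names (note the unique constraint can only be added once the existing duplicates have been cleaned up):

-- Option 1: a unique constraint, so exact duplicates can no longer be entered
ALTER TABLE YourTable
ADD CONSTRAINT UQ_YourTable_BarCode_PackSize UNIQUE (bar_code, PackSize);

-- Option 2: a surrogate identity key, so an update can target exactly one row
ALTER TABLE YourTable
ADD Id int IDENTITY(1, 1) NOT NULL;

UPDATE YourTable
SET bar_code = '1234'
WHERE Id = 42;   -- the Id of the single row you mean to change (illustrative value)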
A data entry error was made and the bar_code and PackSize columns are
the exact same
I think this is the key. Essentially, the exact duplicates are unintentional, and the rows should be unique. Further, it looks like bar_code + pack_size is your primary key (subject to data being entered correctly).
So, when you do an update, simply update the first row found that matches a bar_code and a pack_size. If it isn't unique, then the update should ensure that you are one step closer to unique rows in the database.
If you need a non-verbal answer, let me know.

How to insert the same LAST_INSERT_ID() into one table to create multiple records

I am programming a website with ASP.NET and C#, using MySQL.
I need to be able to record two rows with the same child_id.
My problem is with this one statement: it inserts one row and skips the last.
// data.ChildrenRecord.Services contains two records:
// babysit and tutor
foreach (string s in data.ChildrenRecord.Services)
{
    query += (" INSERT INTO ServiceChildren SET service='" + s + "', child_id=LAST_INSERT_ID();");
}
So the table should look something like this:
id  service  child_id
1   babysit  1
2   tutor    1
I use LAST_INSERT_ID() because child_id is a foreign key. I create a record in another table whose primary key is child_id. Afterwards, I use LAST_INSERT_ID() to reference that record's primary key child_id so that I may use it in my ServiceChildren table.
As it stands, my table looks like this:
id  service  child_id
1   babysit  1
I think I am on the right track now. If I log directly into the database, these statements work:
SET @last_id = LAST_INSERT_ID();
-- then the insert statements
INSERT INTO ServiceChildren(service, child_id) VALUES('babysit', @last_id);
INSERT INTO ServiceChildren(service, child_id) VALUES('tutor', @last_id);
But in my C#/ASP.NET code, the line
SET @last_id = LAST_INSERT_ID();
does not do anything; not even the prior insert statements populate the tables.
OK, I found out my problem. I needed to add Allow User Variables=True to the connection string to be allowed to create and use @last_id.

Query and export from unsortable table SQL Server

First, I am sorry for my bad English; it is not my first language.
My problem is: I have a table with around 10 million records of bank transactions. It doesn't have a PK and isn't sorted by any column.
My job is to create a page to filter the data and export it to CSV, but the limit of rows per CSV export is around 200k records.
I have some ideas:
Create 800 tables for the 800 ATMs (just an idea, I know it's stupid), move data from the main table into them once per day, and export to 800 CSV files.
Use LINQ to take 100k records at a time and skip the ones already taken on the next pass. But I am stuck because Skip needs OrderBy, and I get an OutOfMemoryException with it:
db.tblEJTransactions.OrderBy(u => u.Id).Take(100000).ToList()
Can anyone help me? Every idea is welcome (my boss said I can use anything, including creating hundreds of tables, using NoSQL, ...).
If you don't have a primary key in your table, then add one.
The simplest and easiest is to add an int IDENTITY column.
ALTER TABLE dbo.T
ADD ID int NOT NULL IDENTITY (1, 1)
ALTER TABLE dbo.T
ADD CONSTRAINT PK_T PRIMARY KEY CLUSTERED (ID)
If you can't alter the original table, create a copy.
Once the table has a primary key you can sort by it and select chunks/pages of 200K rows with predictable results.
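For example, a sketch of paging on the new key (assumes SQL Server 2012 or later for OFFSET/FETCH; each page can then be exported to its own CSV file):

DECLARE @Page int = 0;          -- 0, 1, 2, ... one page per CSV file
DECLARE @PageSize int = 200000; -- the export limit

SELECT *
FROM dbo.T
ORDER BY ID
OFFSET @Page * @PageSize ROWS
FETCH NEXT @PageSize ROWS ONLY;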
I'm not sure about my solution, but you can try it:
select top 1000000 *, row_number() over (order by (select null)) from tblEJTransactions
The above query returns the rows with a row number assigned, so you can page through them.
And then you can use LINQ to get the result.
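If it helps, one way to slice that numbered result into export-sized chunks (keeping in mind that ORDER BY (SELECT NULL) does not guarantee a stable order between runs):

SELECT *
FROM (
    SELECT *, row_number() OVER (ORDER BY (SELECT NULL)) AS rn
    FROM tblEJTransactions
) t
WHERE t.rn BETWEEN 1 AND 200000;   -- then 200001-400000, and so on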

SQL batch insert, avoid duplicates, no PK

I was given a task to insert over 1000 rows with 4 columns. The table in question does not have a PK or FK. Let's say it contains the columns ID, CustomerNo, and Description. The records that need to be inserted can have the same CustomerNo and Description values.
I read about importing the data into a temporary table, comparing it with the real table, removing duplicates, and moving the new records to the real table.
I could also run 1000 queries that each check whether such a record already exists and insert the data if it does not, but I'm too ashamed to try that out for obvious reasons.
I'm not expecting any specific code, because I did not give any specific details. What I'm hoping for is some pseudocode or general advice for completing such tasks. I can't wait to give some upvotes!
So the idea is, you don't want to insert an entry if there's already an entry with the same ID?
If so, after you import your data into a temporary table, you can accomplish what you're looking for in the where clause of a select statement:
insert into TargetTable (ID, CustomerNo, Description)
select ID, CustomerNo, Description from #data_source
where #data_source.ID not in (select TargetTable.ID from TargetTable)
I would suggest you load the data into a temp table or a table variable. Then you can do a "Select Into" using the DISTINCT keyword, which will remove the duplicated records.
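A minimal sketch of that idea, written as an INSERT ... SELECT DISTINCT from an assumed staging temp table (#StagingTable and TargetTable are illustrative names):

-- Bulk load the raw rows into #StagingTable first (BULK INSERT, SqlBulkCopy, etc.),
-- then move only the distinct rows across.
INSERT INTO TargetTable (ID, CustomerNo, Description)
SELECT DISTINCT ID, CustomerNo, Description
FROM #StagingTable;

Note that DISTINCT only removes duplicates within the loaded batch; combining it with the NOT IN check from the previous answer also avoids duplicating rows that already exist in the target table.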
You will always need to read the target table, unless you bulk load the target table into a temp table (at that point you will have two temp tables), compare both, eliminate the duplicates, and then insert into the target table. But even this is not exact, because a new insert into the target table can happen while you are doing this.

How to update an auto-incremented id when deleting a row in a table?

I am creating an application in C# that uses a MySQL database. I want to delete a row and update the auto-incremented id values in the table. For example, I have a table with two columns, id and station, which holds a list of stations. Something like this:
id  station
1   pt1
2   pt2
3   pt3
If I delete the second row, the table looks something like this:
id  station
1   pt1
3   pt3
Is there any way to update the ids in the table so that, in this example, the id in the third row has the value 2 instead of 3?
Thanks in advance!
An autoincrement column, by definition, should not be changed manually.
What happens if some other table uses this ID (3) as a foreign key to refer to that record in this table? That table would have to be changed accordingly.
(Think about it: your example is simple, but what happens if you delete ID = 2 in a table where the max(ID) is 100000? How many updates would that mean in the main table and in the referring tables?)
And in the end there is no real problem if you have gaps in your numbering.
I suggest you don't do anything special when a row is deleted. Yes you will have gaps in the ids, but why do you care? It is just an id.
If you change the value of id_station, you would also need to update the value in all tables that have an id_station field. It causes more unnecessary UPDATES.
The only way to change the value of the id column in other rows is with an UPDATE statement. There is no builtin mechanism to accomplish what you want.
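Purely as an illustration of what that manual renumbering looks like in MySQL (the table name station_list is assumed; any foreign keys pointing at the old id would have to be updated too, and the AUTO_INCREMENT counter is not rewound):

DELETE FROM station_list WHERE id = 2;       -- remove the second row
UPDATE station_list SET id = 2 WHERE id = 3; -- manually shift the later row down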
I concur with the other answers here; normally, we do not change the value of an id column in other rows when a row is deleted. Normally, that id column is a primary key, and ideally, that primary key value is immutable (it is assigned once and it doesn't change). If it does change, then any references to it will also need to change. (The ON UPDATE CASCADE option of a foreign key will propagate the change to a child table, for storage engines like InnoDB that support foreign keys, but not with MyISAM.)
Basically, changing an id value causes way more problems than it solves.
There is no "automatic" mechanism that changes the value of a column in other rows when a row is deleted.
With that said, there are times in the development cycle when I have had "static" data, wanted control over the id values, and made changes to them. But this is an administrative exercise, not a function performed by an application.
