I have a sorted list of insert statements that I am trying to write to an Access db. I have triple verified that the list of insert statements is in the correct order. When I open the mdb file the records are never in order. Maybe for the first 100 records, but after that it starts getting out of whack.
I am really at a loss here, any ideas? Note that this table is first being created dynamically in C#, i.e. the set of columns is not predictable each time this code needs to be run.
Maybe you just need to add an autonumber ID field to the tables; then the insertion order can be reproduced by sorting on that field.
When adding rows to any database, the concept of "Order inside a Table" is meaningless.
You get your order when retrieving records by using an ORDER BY.
Make sure you have an ID or TimeStamp column to sort on.
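For example, assuming the table has an autonumber column named ID (an illustrative name), the only way to guarantee that rows come back in insertion order is:

select *
from MyTable
order by ID;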
I am writing an application in C# that will copy data from one postgres table to another on a regular basis. I am using the NPGSql library.
I have run into the following issue: When there are thousands of rows to be copied (> 10k), the program runs very slowly.
I have tried:
For my first attempt, I pulled the entirety of the destination table, then compared the data I was inserting to the data that already existed. I would then write an insert or an update statement depending on whether the row already existed but had changes, or did not exist at all. This was the worst solution, as every individual statement had to be sent as a separate command.
Next, I tried adding an "on conflict" clause to the inserts. This let me send all of the inserts as bulk INSERT INTO.... statements, and the database would take care of updates. This was significantly faster, but not fast enough.
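For reference, a minimal sketch of that upsert pattern (table and column names are illustrative; ON CONFLICT requires Postgres 9.5 or later):

insert into destination_table (id, field1, field2)
values
    (1, 'a', 'x'),
    (2, 'b', 'y')
on conflict (id) do update
set field1 = excluded.field1,
    field2 = excluded.field2;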
I read about Postgres's COPY method, but it does not seem to suit my needs. It seems that COPY will do ONLY an insert, and NOT an upsert. Because I am modifying this table several times, some of the data will be new, but some will be old rows that need updating.
Has anyone come up with a fast way to UPSERT, provided that I need an option to EDIT a row, not just do a blanket mass INSERT of all of my data?
Please let me know if I can provide any other information
Thank you so much for your time
First of all, I assume the tables are on different databases, otherwise I would just do this all in DML.
I think copy is definitely your friend. There is no faster way to extract or load data, and then you can let the database do the heavy lifting.
On the source database:
copy source_table
to '/var/tmp/foo.csv' csv;
On the destination database:
truncate temp_table;
copy temp_table
from '/var/tmp/foo.csv' csv;
insert into destination_table
select *
from temp_table t
where not exists (
    select null
    from destination_table d
    where t.id = d.id
);

update destination_table d
set
    field1 = t.field1,
    field2 = t.field2
from temp_table t
where
    d.id = t.id and
    (d.field1 is distinct from t.field1 or
     d.field2 is distinct from t.field2);
If the data is readily available on both sides, something like this should work well.
A couple of other comments:
The insert into uses an anti-join, which is my favorite construct for inserting missing records.
On the update, it's important to specify the criteria for what you update: don't update everything, only the records that have changed. This will make a big difference in performance. Hopefully there is a fixed set of fields you can use to determine whether a record has changed.
If there is a field that indicates the record has been updated (last_update_date or something similar), a slightly lazier and wonderful approach is to delete those records and let the anti-join insert re-insert them. This removes the need for the update statement entirely and is much less code for tables with lots of columns; a sketch follows below.
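A minimal sketch of that delete-then-reinsert variant, assuming the staging table carries a last_update_date column (an illustrative name):

delete from destination_table d
using temp_table t
where d.id = t.id
  and t.last_update_date > d.last_update_date;

Running the anti-join insert from above afterwards re-inserts the refreshed rows.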
I have stored various records in the MySQL database "orkut". Now I want to sort that database through a Java program; I have connected to the database through the JDBC driver.
I want to sort the records in decreasing order of the field "number" (of type int), but I don't know the commands. I have a "con" reference variable that holds the connection to the MySQL database.
One more thing: there is a field "sr_no" that denotes the serial number of a record, and it is not the primary key. This field must not change when the records are reordered, since a serial number should not depend on the order of the records.
I want the sorted order permanently stored in the same database. I don't want a sorted ResultSet; I want a sorted database.
Don't try to sort this through Java; you'll kill yourself trying. SQL has an ORDER BY clause that does exactly this. Here's the SQL:
select
    number,
    sr_no
from tbl
order by number desc;
Also note that you cannot have a permanently sorted database. The way the data is stored on disk does not lend itself to being kept in whatever order you choose. You should never count on rows coming back in the same order unless you use an ORDER BY in your query.
As Eric said, you can't have a permanently sorted database. But if you want to execute this query on a large dataset very frequently, you can add an index, which every major database supports.
It will speed up your searching and sorting on that particular key.
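A minimal sketch (the table name tbl comes from the answer above; the index name is illustrative):

create index idx_tbl_number on tbl (number);

With the index in place, MySQL can satisfy ORDER BY number DESC by scanning the index in reverse instead of sorting the whole table.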
I was given a task to insert over 1000 rows with 4 columns. The table in question has no PK or FK. Let's say it contains the columns ID, CustomerNo, and Description. The records to be inserted can share the same CustomerNo and Description values.
I read about importing data to a temporary table, comparing it with the real table, removing duplicates, and moving new records to the real table.
I could also run 1000 queries, each checking whether such a record already exists and inserting the data if it does not. But I'm too ashamed to try that out for obvious reasons.
I'm not expecting any specific code, because I did not give any specific details. What I'm hoping for is some pseudocode or general advice for completing such tasks. I can't wait to give some upvotes!
So the idea is, you don't want to insert an entry if there's already an entry with the same ID?
If so, after you import your data into a temporary table, you can accomplish what you're looking for in the where clause of a select statement:
-- CustomerTable stands in for the real table ("table" itself is a reserved word)
insert into CustomerTable (ID, CustomerNo, Description)
select d.ID, d.CustomerNo, d.Description
from #data_source d
where d.ID not in (select t.ID from CustomerTable t)
I would suggest loading the data into a temp table or table variable. Then you can do a SELECT ... INTO using the DISTINCT keyword, which will remove the duplicated records.
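A minimal sketch of that, reusing the #data_source staging table from the answer above (the target name is illustrative):

select distinct ID, CustomerNo, Description
into #deduped
from #data_source;

You can then insert from #deduped into the real table without worrying about duplicates within the batch.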
You will always need to read the target table, unless you bulk load it into a temp table (at which point you will have two temp tables), compare both, eliminate the duplicates, and then insert into the target table. Even that is not entirely accurate, because a new row can be inserted into the target table while you are doing all this.
I am dealing with a huge database with millions of rows. I would like to run an SQL statement through C# that selects 1.2 million rows from one database and inserts them into another after parsing and modifying some of the data.
I originally wanted to do this by first running the select statement and then parsing the data by iterating through the MySqlDataReader object that holds it. That would be a memory overhead, so I have decided to select one row, parse it, insert it into the other database, and then move on to the next row.
How can this be done? I have tried the SELECT....INTO syntax for a MySQL query; however, this still seems to select all the data and then insert it afterwards.
Use the SqlBulkCopy class to move data from one source to another:
http://msdn.microsoft.com/en-us/library/system.data.sqlclient.sqlbulkcopy%28v=vs.110%29.aspx
I am not sure whether you are able to add a new column to the existing table. If you can, you can use the new column as a flag; it could be a boolean TRANSFERRED column.
You would select one row at a time with the condition TRANSFERRED = FALSE and process it. After that row is processed, you update it to TRANSFERRED = TRUE.
Alternatively, if you have a unique id column in your existing table, create a temp table that stores the ids of the processed rows; that way you will know which rows have been processed and which have not.
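A minimal sketch of the flag approach (table and column names are illustrative):

alter table source_table add column transferred boolean not null default false;

select * from source_table where transferred = false limit 1;
-- ...parse the row and insert it into the other database, then mark it done:
update source_table set transferred = true where id = 123;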
I am not quite sure what your error is. For your case, I suggest fetching the data in batches, e.g. with SELECT ... LIMIT 1000 in MySQL, because inserting rows one by one is really slow. You can then build one multi-row INSERT INTO query per batch. Note that SqlBulkCopy is only for SQL Server. I also suggest using a StringBuilder to build the query, because concatenating plain strings has a big overhead.
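The query such a batch would produce looks roughly like this (table and column names are illustrative):

insert into target_table (id, payload)
values
    (1, 'parsed-a'),
    (2, 'parsed-b'),
    (3, 'parsed-c');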
I want to store a great number of strings in my SQLite database, and I want them to always come back in the same order in which I added them. I know I could give them an autoincrementing primary key and sort by that, but since there can be up to 100,000 strings this seemed like a performance issue. Besides, the order should NEVER change or be sorted in any different way.
A short example (the table and column names are just for illustration):

insert into strings (value) values ('hghtzdz12g');
insert into strings (value) values ('jut65bdt');
insert into strings (value) values ('lkk7676nbgt');

select value from strings; -- should ALWAYS give: hghtzdz12g, jut65bdt, lkk7676nbgt
Any ideas how to achieve this?
Thanks
In a query like
SELECT * FROM MyTable ORDER BY MyColumn
the database does not need to sort the results if the column is indexed, because it can just scan through the index entries in order.
The rowid (or whatever you call the autoincrementing column) is an index, and an even more efficient one than a separate index, because the table itself is stored as a B-tree keyed on the rowid.
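A minimal sketch, reusing the illustrative strings table from the question (in SQLite, an INTEGER PRIMARY KEY column is an alias for the rowid):

create table strings (
    id    integer primary key,  -- alias for the rowid; records insertion order for free
    value text not null
);

select value from strings order by id;  -- rows come back in insertion order, no sort step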
If you are sure you will never need anything but exactly this array in exactly this order, you can cheat the database and put everything in a single blob field.
But then you should ask yourself why you chose a database in the first place.
The correct database solution is indeed a table using a key that you can sort by.
If this performance is not enough, you can have a look here for performance hints.
If you need ultra-fast performance, maybe a database is not the best tool for the job. Databases are chosen for their ACID properties; raw speed is not their main goal but a secondary objective, as it is for most software.