How to know when to stop filling an OracleDataAdapter - c#

I'm using the ODP.NET DLL in a project that is accessing Oracle.
Users can type in any SQL into a text box, that is then executed against the db. I've been trying to use the OracleDataAdapter to populate a datatable with the resultset, but I want to be able to return the resultset in stages (for large select queries).
An example of my problem is...
If a select query returns 13 rows of data, the code snippet below executes without issue until the fourth call to oda.Fill (where the start row is 15, which doesn't exist). I presume this is because it is calling into a reader that has already closed, or something similar.
It then will throw a System.InvalidOperationException with the message - Operation is not valid due to the current state of the object.
How can I find out how many rows in total the command will eventually contain (so that I don't encounter the exception)?
OracleDataAdapter oda = new OracleDataAdapter(oracleCommand);
oda.Requery = false;
DataTable dt = new DataTable();
var dts = new DataTable[] { dt };
oda.Fill(0, 5, dts);
var a = dts[0].Rows.Count;
oda.Fill(a, 5, dts);
var b = dts[0].Rows.Count;
oda.Fill(b, 5, dts);
var c = dts[0].Rows.Count;
oda.Fill(c, 5, dts);
var d = dts[0].Rows.Count;
Note: I've omitted the connection and oracle command objects for brevity.
EDIT 1:
I've just thought I could just wrap the SQL entered by the user in another query and execute it...
SELECT COUNT(*) FROM (...initial query in here...)
but this isn't exactly a clean solution, and surely there is a method somewhere that I haven't seen?
Thanks in advance.

For paging in Oracle, see: http://www.oracle.com/technology/oramag/oracle/06-sep/o56asktom.html
There is no way to know the record set count without running a separate count(*) query. This is by design. The DataReader and DataAdapter are forward-only, read only.
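If you really do need the total up front, here is a minimal sketch of that separate count query (userSql and conn are placeholders for the user's SQL text and an already-open OracleConnection; they are not in the original code):
// Hypothetical names: userSql is the text the user typed, conn is an open OracleConnection.
string countSql = "SELECT COUNT(*) FROM (" + userSql + ")";
using (var countCmd = new OracleCommand(countSql, conn))
{
    // ExecuteScalar returns an Oracle NUMBER; convert it to int.
    int totalRows = Convert.ToInt32(countCmd.ExecuteScalar());
}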
If efficiency is a concern (i.e., large record sets), one should let the database do the paging and not ask the OracleDataAdapter to run the full query. Imagine if Google filled a DataTable with all 1M+ results for each user search! The following article addresses this concern, although its examples use SQL Server:
http://www.asp.net/data-access/tutorials/efficiently-paging-through-large-amounts-of-data-cs
I've revised my example below to allow paging on any sql query. The calling procedure is responsible for keeping track of the user's current page and page size. If the result set is less than the requested page size, there are no more pages.
Of course, running custom sql from user input is a huge security risk. But that wasn't the question at hand.
Good luck! --Brett
DataTable GetReport(string sql, int pageIndex, int pageSize)
{
DataTable table = new DataTable();
int rowStart = pageIndex * pageSize + 1;
int rowEnd = (pageIndex + 1) * pageSize;
string qry = string.Format(
#"select *
from (select rownum ""ROWNUM"", a.*
from ({0}) a
where rownum <= :rowEnd)
where ""ROWNUM"" >= :rowStart
", sql);
try
{
using (OracleConnection conn = new OracleConnection(_connStr))
{
OracleCommand cmd = new OracleCommand(qry, conn);
cmd.Parameters.Add(":rowEnd", OracleDbType.Int32).Value = rowEnd;
cmd.Parameters.Add(":rowStart", OracleDbType.Int32).Value = rowStart;
cmd.CommandType = CommandType.Text;
conn.Open();
OracleDataAdapter oda = new OracleDataAdapter(cmd);
oda.Fill(table);
}
}
catch (Exception)
{
throw;
}
return table;
}
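A possible calling pattern, sketched under the assumption that the caller just wants to walk every page until the results run dry (userSql is a placeholder for the user's query):
// Illustrative only: pull pages until a short (or empty) page signals the end.
int pageIndex = 0;
const int pageSize = 5;
DataTable page;
do
{
    page = GetReport(userSql, pageIndex, pageSize);
    // ... hand this page to the UI here ...
    pageIndex++;
} while (page.Rows.Count == pageSize);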

You could add an Analytic COUNT to your query:
SELECT foo, bar, COUNT(*) OVER () AS TheCount FROM ... WHERE ...;
That way the count of the entire query is returned with each row in TheCount, and you could set your loop to terminate accordingly.
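As a rough illustration (cmd and the column name TheCount follow from the query above; the loop itself is just a sketch), the count can be read from the first fetched row and used as the stopping condition:
using (var reader = cmd.ExecuteReader())
{
    int totalRows = -1;
    int rowsRead = 0;
    while (reader.Read())
    {
        if (totalRows < 0)
            totalRows = Convert.ToInt32(reader["TheCount"]); // same value on every row
        rowsRead++;
        // ... process the row; rowsRead vs. totalRows tells you how far along you are ...
    }
}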

To gain control over the fill loop, you need to own the loop yourself.
Build your own function to fill the DataTable using an OracleDataReader.
To get column information, you can use dataReader.GetSchemaTable.
To fill the table:
MyTable.BeginLoadData()
Dim Values(mySchema.Rows.Count - 1) As Object
Do While MyReader.Read()
    MyReader.GetValues(Values)
    MyTable.Rows.Add(Values)
    ' Include here your control over the loaded row count
Loop
MyTable.EndLoadData()

Related

How to get data from multiple SELECT statements and store it in the DataSet

I am trying to execute multiple SELECT statements, such as,
DataSet ds = new DataSet();
string sql = #"SELECT * FROM CUSTOMERS;
SELECT * FROM CUSTOMERS WHERE AGE > 40;";
using (FbConnection connection = new FbConnection(ConectionString))
{
try
{
using (FbCommand cmd = connection.CreateCommand())
using (FbDataAdapter sda = new FbDataAdapter(cmd))
{
cmd.CommandText = sql;
cmd.CommandType = CommandType.Text;
connection.Open();
sda.Fill(ds);
}
}
catch (FbException e)
{
Error = e.Message;
}
finally
{
connection.Close();
}
}
return ds;
The above code works great for one SELECT statement, but it throws an exception when there are multiple SELECT statements.
Dynamic SQL Error
SQL error code = -104
Token unknown - line 2, column 1
SELECT
I have tried the FbBatchExecution as well but I don't know how to get the returned data from it. It works well when using multiple INSERT or DELETE statements.
You have to build ONE query out of those two using SQL UNION operator
https://en.wikipedia.org/wiki/Set_operations_(SQL)#UNION_operator
https://www.firebirdsql.org/file/documentation/reference_manuals/fblangref25-en/html/fblangref25-dml-select.html#fblangref25-dml-select-union
Note how your two queries fetch overlapping rows: the first query returns all the rows of the second, plus more:
SELECT * FROM CUSTOMERS;
SELECT * FROM CUSTOMERS WHERE AGE > 40;
Basically you have two ways to link them together
SELECT * FROM CUSTOMERS
UNION ALL
SELECT * FROM CUSTOMERS WHERE AGE > 40
and
SELECT * FROM CUSTOMERS
UNION DISTINCT
SELECT * FROM CUSTOMERS WHERE AGE > 40
The first option just appends one query's output after the other, without caring what data the queries brought back. Unless you add an ORDER BY clause it will most probably keep the rows in the order the sequential queries produced them, simply because that is the easiest thing to do. However, there is no guarantee of that in either the SQL standard or the Firebird documentation; it is purely an implementation detail, and there is some chance that in the future the rows could get reordered, interleaving the queries, even with UNION ALL and no ORDER BY (for example, if the subqueries were spawned onto different processors for parallelization).
The second option sorts the outputs in temporary buffers and excludes duplicates, which means more work for the server and more memory and/or disk used for those temporary buffers and the sort, but it ensures you do not end up with duplicate rows (which, for your specific queries, means you would get the same set of data as the first query alone).
Contrary to SQL Server, Firebird doesn't support execution of multiple statements in a single execute, nor can a single execute produce multiple result sets. If you want to execute multiple statements, you will need to execute them individually.
You also can't use FbBatchExecution because that is for executing inserts, updates, deletes, etc (statements that don't produce a result set).
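A minimal sketch of running the statements one at a time and collecting both result sets into one DataSet (the statement list and table names are just placeholders; ConectionString is the field from the question):
var statements = new[]
{
    "SELECT * FROM CUSTOMERS",
    "SELECT * FROM CUSTOMERS WHERE AGE > 40"
};
var ds = new DataSet();
using (var connection = new FbConnection(ConectionString))
{
    connection.Open();
    for (int i = 0; i < statements.Length; i++)
    {
        using (var cmd = new FbCommand(statements[i], connection))
        using (var sda = new FbDataAdapter(cmd))
        {
            // Each statement gets its own DataTable in the DataSet.
            var table = new DataTable("Result" + i);
            sda.Fill(table);
            ds.Tables.Add(table);
        }
    }
}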
I don't know the FbConnection provider, but if it implements IDbConnection you may be able to use Dapper, which solves this kind of problem with QueryMultiple:
https://dapper-tutorial.net/querymultiple
string sql = "SELECT * FROM Invoice WHERE InvoiceID = #InvoiceID; SELECT * FROM InvoiceItem WHERE InvoiceID = #InvoiceID;";
using (var connection = My.ConnectionFactory())
{
connection.Open();
using (var multi = connection.QueryMultiple(sql, new {InvoiceID = 1}))
{
var invoice = multi.Read<Invoice>().First();
var invoiceItems = multi.Read<InvoiceItem>().ToList();
}
}
Try this:
using (SqlConnection sql = new SqlConnection(key))
{
try
{
sql.Open();
SqlCommand cmd = new SqlCommand();
cmd.CommandText = #"SELECT * FROM CUSTOMERS;
SELECT * FROM CUSTOMERS WHERE AGE > 40;";
cmd.Connection = sql;
var dr = cmd.ExecuteReader();
if (dr.HasRows)
{
while (dr.Read())
{
//First query data
}
if (dr.NextResult())
{
if (dr.HasRows)
{
while (dr.Read())
{
//Second Query Data
}
}
}
}
}
catch (SqlException e)
{
Error = e.Message;
}
finally
{
sql.Close();
}
}

Oracle.ManagedDataAccess.Client.OracleCommand.ExecuteReader Missing Results

I have a very simple method that utilizes Oracle.ManagedDataAccess to query datatables in Oracle. The code is below.
private System.Data.DataTable ByQuery(Oracle.ManagedDataAccess.Client.OracleConnection connection, string query)
{
using(var cmd = new Oracle.ManagedDataAccess.Client.OracleCommand())
{
cmd.Connection = connection;
cmd.CommandText = query;
cmd.CommandType = System.Data.CommandType.Text;
var dr = cmd.ExecuteReader();
dr.Read();
var dataTable = new System.Data.DataTable();
dataTable.Load(dr);
var recordCount = dataTable.Rows.Count;
return dataTable;
}
}
Using a very simple query such as:
SELECT * FROM NKW.VR_ORDER_LI WHERE CTRL_NO = 10
returns 32 rows of data. However, when I run the exact same query from Oracle SQL Developer using the exact same user account as in my connection string for the C# app, I get 33 results.
I'm consistently missing a single row of data.
I've tried querying a different CTRL_NO:
SELECT * FROM NKW.VR_ORDER_LI WHERE CTRL_NO = 17
.Net returns 8 results.
Oracle Sql Developer returns 9 results.
I've tried removing the WHERE statement and just getting all results.
Still 1 row difference between the two.
I've tried googling for an answer but haven't been successful. Any help or advice would be appreciated.
UPDATE 1:
I've determined that I'm always missing the first result that I see in Oracle SQL Developer when I run the exact same query from my C# App.
UPDATE 2:
As suggested I took the DataTable out of the equation.
int rowCount = 0;
while(dr.Read())
{
rowCount++;
}
rowCount skipping the DataTable still results in a missing record.
UPDATE 3:
Tested against a completely different table (NKW.VR_ORDER_LI is actually a view). Still had the same results: for some reason I end up with one less row from ExecuteReader() than I do from within SQL Developer.
I ended up figuring out my issue from this thread:
Datareader skips first result
So the culprit in all of this was this part of the code:
var dr = cmd.ExecuteReader();
dr.Read();
var dataTable = new System.Data.DataTable();
dataTable.Load(dr);
The first dr.Read() wasn't necessary; DataTable.Load() does its own reading, so the explicit call consumed (and discarded) the first row. Removing that line solved the problem.
Final fix:
var dr = cmd.ExecuteReader();
var dataTable = new System.Data.DataTable();
dataTable.Load(dr);
I also went back to using the DataTable because it is more consistent with how we interact with transactional data throughout our project at the current time.

fast/efficient way to run a query in Access based on datatable rows?

I have a datatable that may have 1000 or so rows in it. I need to go thru the datatable row by row, get the value of a column, run a query (Access 2007 DB) and update the datatable with the result. Here's what I have so far, which works:
String FilePath = "c:\\MyDB.accdb";
string QueryString = "SELECT MDDB.NDC, MDDB.NDC_DESC "
+ "FROM MDDB_MASTER AS MDDB WHERE MDDB.NDC = #NDC";
OleDbConnection strAccessConn = new OleDbConnection("Provider=Microsoft.ACE.OLEDB.12.0;Data Source=" + FilePath);
strAccessConn.Open();
OleDbDataReader reader = null;
int rowcount = InputTable.Rows.Count; //InputTable is the datatable
int count = 0;
while (count < rowcount)
{
string NDC = InputTable.Rows[count]["NDC"].ToString();
//NDC is a column in InputTable
OleDbCommand cmd = new OleDbCommand(QueryString, strAccessConn);
cmd.Parameters.Add("#NDC", OleDbType.VarChar).Value = NDC;
reader = cmd.ExecuteReader();
while (reader.Read())
{
//update the NDCDESC column with the query result
//the query should only return 1 line
dataSet1.Tables["InputTable"].Rows[count]["NDCDESC"] = reader.GetValue(1).ToString();
}
dataGridView1.Refresh();
count++;
}
strAccessConn.Close();
However this seems very inefficient since the query needs to run one time for each row in the datatable. Is there a better way?
You're thinking of an update query. You don't actually have to go over every row one by one. SQL is a set based language, so you only have to write a single statement that it should do for all rows.
Do this:
1) Create > Query Design
2) Close the dialog that selects tables
3) Make sure you're in sql mode (top left corner)
4) Paste this:
UPDATE INPUTTABLE
INNER JOIN MDDB_MASTER ON INPUTTABLE.NDC = MDDB_MASTER.NDC
SET INPUTTABLE.NDCDESC = [MDDB_MASTER].[NDC_DESC];
5) Switch to design mode to see what happens. You may have to correct the InputTable name; I couldn't find its exact name. I'm assuming they're both in the same database.
You'll see the query type is now an update query.
You can set this SQL as the CommandText and run it through cmd.ExecuteNonQuery(), and the whole thing should run very quickly. If it doesn't, you'll need an index on one of the tables.
This works by joining the two tables on NDC and then copying NDC_DESC over from MDDB_MASTER to InputTable.
I missed the part about InputTable coming from Excel.
For better speed, instead of executing the query in Access over and over, you can get all rows from MDDB_MASTER into a datatable in one select statement:
SELECT MDDB.NDC, MDDB.NDC_DESC FROM MDDB_MASTER
And then use the DataTable.Select method to filter the right row.
mddb_master.Select("NDC = '" + NDC +'")
This will be done in memory and should be much faster than all the round trips you have now. Especially over the network these round trips are expensive. 225k rows should be only a few MB (roughly a JPEG image) so that shouldn't be an issue.
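A sketch of that in-memory approach (reusing InputTable and the open strAccessConn connection from the question; it assumes NDC is a text column and contains no apostrophes):
// Pull the whole master table across once...
var master = new DataTable();
using (var cmd = new OleDbCommand(
    "SELECT MDDB.NDC, MDDB.NDC_DESC FROM MDDB_MASTER AS MDDB", strAccessConn))
using (var adapter = new OleDbDataAdapter(cmd))
{
    adapter.Fill(master);
}
// ...then resolve each InputTable row from memory instead of a round trip.
foreach (DataRow row in InputTable.Rows)
{
    DataRow[] match = master.Select("NDC = '" + row["NDC"] + "'");
    if (match.Length > 0)
        row["NDCDESC"] = match[0]["NDC_DESC"].ToString();
}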
You could use the "IN" clause to build a bigger query such as:
string QueryString = "SELECT MDDB.NDC, MDDB.NDC_DESC "
+ "FROM MDDB_MASTER AS MDDB WHERE MDDB.NDC IN (";
int rowcount = InputTable.Rows.Count; //InputTable is the datatable
int count = 0;
while (count < rowcount)
{
string NDC = InputTable.Rows[count]["NDC"].ToString();
QueryString += (count == 0 ? "" : ",") + "'" + NDC + "'";
count++;
}
QueryString += ")";
You can optimize that with StringBuilders since that could be a lot of strings but that's a job for you. :)
Then in a single query, you would get all the NDC descriptions you need and avoid performing 1000 queries. You would then roll through the reader, find values in the InputTable, and update them. Of course, in this case, you're looping through the InputTable multiple times, but it might be a better option, especially if your InputTable could hold duplicate NDC values.
Also, note that you have an OleDbDataReader leak in your code: you keep reassigning the reader reference to a new instance before disposing of the old reader. Same with commands; you keep instantiating a new command but are not disposing of it properly.

want to run query on each database, How to calculate the performance difference in C# and in SQL Server

I am working on a SQL Server monitoring product and I have a database query that fetches table details for all the databases on a SQL Server instance.
For this I have two options.
Option 1: from code, first run select name from [master].sys.sysdatabases to get the names of all the databases, then fire my main query against each database using "USE <fetched DB name>;" + "mainQuery";.
Please check the following code for the same.
public DataTable GetResultsOfAllDB(string query)
{
SqlConnection con = new SqlConnection(_ConnectionString);
string locleQuery = "select name from [master].sys.sysdatabases";
DataTable dtResult = new DataTable("Result");
SqlCommand cmdData = new SqlCommand(locleQuery, con);
cmdData.CommandTimeout = 0;
SqlDataAdapter adapter = new SqlDataAdapter(cmdData);
DataTable dtDataBases = new DataTable("DataBase");
adapter.Fill(dtDataBases);
foreach (DataRow drDB in dtDataBases.Rows)
{
if (dtResult.Rows.Count >= 15000)
break;
locleQuery = " Use [" + Convert.ToString(drDB[0]) + "]; " + query;
cmdData = new SqlCommand(locleQuery, con);
adapter = new SqlDataAdapter(cmdData);
DataTable dtTemp = new DataTable();
adapter.Fill(dtTemp);
dtResult.Merge(dtTemp);
}
return dtResult;
}
Option 2: use the system stored procedure EXEC sp_MSforeachdb and store the fetched data in a table variable (or temp table), then select from it and drop it when done.
Check following query for the same
Declare @TableDetail table
(
field1 varchar(500),
field2 int,
field3 varchar(500),
field4 varchar(500),
field5 decimal(18,2),
field6 decimal(18,2)
)
INSERT @TableDetail EXEC sp_MSforeachdb 'USE [?]; QUERY/COMMAND FOR ALL DATABASES'
Select
field1, field2, field3, field4, field5, field6 FROM @TableDetail
Note: the second option takes longer, because if the number of databases and tables is huge it has to wait until every database has finished before returning anything.
Now my question is: which of the two options above is better, and why? Or is there another solution for the same?
Thanks in advance.
One key difference is the second option blocks until everything is done. All of the work is done on the SQL Server side. That has the downside of not being able to give feedback to the user as it runs, and it can potentially time out and is not resilient to network blips. This option can be used as a pure SQL script (some SQL admins like that), whereas the first needs a program.
In the first example, the client is doing more granular, iterative work, so you can give feedback to the user and retry after a network blip without redoing all of the work. In the first example, you can also use SqlConnectionStringBuilder instead of USE concatenation, as sketched below.
If performance is a concern, you could also potentially parallelize the first one with some locking around adapter.Fill
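For example, a minimal sketch of swapping the database via SqlConnectionStringBuilder instead of prefixing "USE [...]" (it reuses drDB, query, _ConnectionString, and dtResult from the question's loop; this is illustrative, not a drop-in replacement):
var builder = new SqlConnectionStringBuilder(_ConnectionString)
{
    InitialCatalog = Convert.ToString(drDB[0])   // target database for this iteration
};
using (var dbConn = new SqlConnection(builder.ConnectionString))
using (var dbCmd = new SqlCommand(query, dbConn))
using (var dbAdapter = new SqlDataAdapter(dbCmd))
{
    var dtTemp = new DataTable();
    dbAdapter.Fill(dtTemp);   // Fill opens and closes the connection itself
    dtResult.Merge(dtTemp);
}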
Both suck - they are both serial.
Use the first, get rid of the ridiculous objects (DataSet) and use tasks to parallelize X databases at the same time, with X determined by trying out how much load the server can handle.
Finished.
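A rough sketch of that parallel fan-out using Parallel.ForEach (which schedules tasks under the hood); databaseNames, query, and _ConnectionString are assumed inputs, and the degree of parallelism is the "X" to tune against server load:
var results = new System.Collections.Concurrent.ConcurrentBag<DataTable>();
Parallel.ForEach(databaseNames,                        // e.g. names read from sys.databases
    new ParallelOptions { MaxDegreeOfParallelism = 4 }, // the "X" to experiment with
    dbName =>
    {
        var builder = new SqlConnectionStringBuilder(_ConnectionString) { InitialCatalog = dbName };
        using (var conn = new SqlConnection(builder.ConnectionString))
        using (var cmd = new SqlCommand(query, conn))
        using (var adapter = new SqlDataAdapter(cmd))
        {
            var dt = new DataTable();
            adapter.Fill(dt);
            results.Add(dt);   // thread-safe collection, merge afterwards if needed
        }
    });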
If your queries are simple enough you can try to generate single script instead of execute queries in each DB one by one:
select 'DB1' as DB, Field1, Field2, ...
from [DB1]..[TableOrViewName]
union all
select 'DB2' as DB, Field1, Field2, ...
from [DB2]..[TableOrViewName]
union all
...
Everything looks fine. I just want to add using statements for the IDisposable objects:
public DataTable GetResultsOfAllDB(string query)
{
using (SqlConnection con = new SqlConnection(_ConnectionString))
{
string locleQuery = "select name from [master].sys.sysdatabases";
DataTable dtResult = new DataTable("Result");
using (SqlCommand cmdData = new SqlCommand(locleQuery, con))
{
cmdData.CommandTimeout = 0;
using (SqlDataAdapter adapter = new SqlDataAdapter(cmdData))
{
using (DataTable dtDataBases = new DataTable("DataBase"))
{
adapter.Fill(dtDataBases);
foreach (DataRow drDB in dtDataBases.Rows)
{
if (dtResult.Rows.Count >= 15000)
break;
locleQuery = " Use [" + Convert.ToString(drDB[0]) + "]; " + query;
cmdData = new SqlCommand(locleQuery, con);
adapter = new SqlDataAdapter(cmdData);
using (DataTable dtTemp = new DataTable())
{
adapter.Fill(dtTemp);
dtResult.Merge(dtTemp);
}
}
return dtResult;
}
}
}
}
}

How to do a batch update?

I am wondering is there a way to do batch updating? I am using ms sql server 2005.
I saw a way with the SqlDataAdapter, but it seems like you have to run the select statement with it first, then fill some dataset and make changes to the dataset.
Now I am using LINQ to SQL to do the select, so I want to try to keep it that way. However it is too slow to do massive updates. So is there a way I can keep my LINQ to SQL (for the select part) but use something different to do the mass update?
Thanks
Edit
I am interested in this staging-table approach but I am not sure how to do it, and it's still not clear to me how it will be faster, since I don't understand how the update part works.
So can anyone show me how this would work and how to deal with concurrent connections?
Edit2
This was my latest attempt at doing a mass update using XML; however, it uses too many resources and my shared hosting does not allow it to go through. So I need a different way, which is why I am now looking into a staging table.
using (TestDataContext db = new TestDataContext())
{
UserTable[] testRecords = new UserTable[2];
for (int count = 0; count < 2; count++)
{
UserTable testRecord = new UserTable();
if (count == 1)
{
testRecord.CreateDate = new DateTime(2050, 5, 10);
testRecord.AnotherField = true;
}
else
{
testRecord.CreateDate = new DateTime(2015, 5, 10);
testRecord.AnotherField = false;
}
testRecords[count] = testRecord;
}
StringBuilder sBuilder = new StringBuilder();
System.IO.StringWriter sWriter = new System.IO.StringWriter(sBuilder);
XmlSerializer serializer = new XmlSerializer(typeof(UserTable[]));
serializer.Serialize(sWriter, testRecords);
using (SqlConnection con = new SqlConnection(connectionString))
{
string sprocName = "spTEST_UpdateTEST_TEST";
using (SqlCommand cmd = new SqlCommand(sprocName, con))
{
cmd.CommandType = CommandType.StoredProcedure;
SqlParameter param1 = new SqlParameter("@UpdatedProdData", SqlDbType.VarChar, int.MaxValue);
param1.Value = sBuilder.Remove(0, 41).ToString();
cmd.Parameters.Add(param1);
con.Open();
int result = cmd.ExecuteNonQuery();
con.Close();
}
}
}
@Fredrik Johansson: I am not sure what you're saying will work. It seems to me you want me to make an update statement for each record. I can't do that, since I will need to update 1 to 50,000+ records and I won't know how many until that point.
Edit 3
So this is my SP now. I think it should be able to do concurrent connections but I wanted to make sure.
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
CREATE PROCEDURE [dbo].[sp_MassUpdate]
@BatchNumber uniqueidentifier
AS
BEGIN
update prod
set ProductQty = 50
from Product prod
join StagingTbl stage on prod.ProductId = stage.ProductId
where stage.BatchNumber = @BatchNumber
DELETE FROM StagingTbl
WHERE BatchNumber = @BatchNumber
END
You can use the SqlDataAdapter to do a batch update. It doesn't matter how you fill your dataset - L2SQL or whatever - you can use different methods to do the update. Just define the query to run using the data in your datatable.
The key here is the UpdateBatchSize. The data adapter will send the updates in batches of whatever size you define. You need to experiment with this value to see what number works best, but typically numbers of 500-1000 do best. SQL Server can then optimize the update and execute a little faster. Note that when doing batch updates, you cannot update the row source of the datatable.
I use this method to do updates of 10-100K rows and it usually runs in under 2 minutes. It will depend on what you are updating though.
Sorry, this is in VB….
Using da As New SqlDataAdapter
da.UpdateCommand = conn.CreateCommand
da.UpdateCommand.CommandTimeout = 300
da.AcceptChangesDuringUpdate = False
da.ContinueUpdateOnError = False
da.UpdateBatchSize = 1000 ' Experiment for best performance
da.UpdateCommand.UpdatedRowSource = UpdateRowSource.None 'Needed if UpdateBatchSize > 1
sql = "UPDATE YourTable"
sql += " SET YourField = #YourField"
sql += " WHERE ID = #ID"
da.UpdateCommand.CommandText = sql
da.UpdateCommand.UpdatedRowSource = UpdateRowSource.None
da.UpdateCommand.Parameters.Clear()
da.UpdateCommand.Parameters.Add("#YourField", SqlDbType.SmallDateTime).SourceColumn = "YourField"
da.UpdateCommand.Parameters.Add("#ID", SqlDbType.SmallDateTime).SourceColumn = "ID"
da.Update(ds.Tables("YourTable”)
End Using
Another option is to bulkcopy to a temp table, and then run a query to update the main table from it. This may be faster.
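A minimal sketch of that staging pattern, roughly matching the StagingTbl/sp_MassUpdate from Edit 3 (stagingData is an assumed DataTable whose rows carry a ProductId and the batchNumber in a BatchNumber column; column names are assumptions, not the asker's actual schema):
Guid batchNumber = Guid.NewGuid();
using (var conn = new SqlConnection(connectionString))
{
    conn.Open();
    // Bulk-load the staging rows in one shot...
    using (var bulk = new SqlBulkCopy(conn) { DestinationTableName = "StagingTbl" })
    {
        bulk.WriteToServer(stagingData);
    }
    // ...then let the stored procedure do the set-based update and clean-up.
    using (var cmd = new SqlCommand("sp_MassUpdate", conn))
    {
        cmd.CommandType = CommandType.StoredProcedure;
        cmd.Parameters.Add("@BatchNumber", SqlDbType.UniqueIdentifier).Value = batchNumber;
        cmd.ExecuteNonQuery();
    }
}
The per-call BatchNumber is what keeps concurrent connections from stepping on each other's staging rows.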
As allonym said, use SqlBulkCopy, which is very fast (I found speed improvements of over 200x - from 1500 secs to 6 s). However, you can use the DataTable and DataRow classes to provide data to SqlBulkCopy (which seems easier). Using SqlBulkCopy this way has the added advantage of being .NET 3.0 compliant as well (LINQ was added only in 3.5).
Check out http://msdn.microsoft.com/en-us/library/ex21zs8x%28v=VS.100%29.aspx for some sample code.
Use SqlBulkCopy, which is lightning-fast. You'll need a custom IDataReader implementation which enumerates over your linq query results. Look at http://code.msdn.microsoft.com/LinqEntityDataReader for more info and some potentially suitable IDataReader code.
You have to work with the expression trees directly, but it's doable. In fact, it's already been done for you, you just have to download the source:
Batch Updates and Deletes with LINQ to SQL
The alternative is to just use stored procedures or ad-hoc SQL queries using the ExecuteMethodCall and ExecuteCommand methods of the DataContext.
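For instance, DataContext.ExecuteCommand lets you issue a set-based update directly from LINQ to SQL; a small sketch reusing the TestDataContext from the question (the table and column names here are placeholders):
using (var db = new TestDataContext())
{
    // One round trip; the {0}/{1} placeholders are turned into SQL parameters.
    db.ExecuteCommand(
        "UPDATE Product SET ProductQty = {0} WHERE CreateDate < {1}",
        50, new DateTime(2015, 5, 10));
}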
You can use SqlDataAdapter to do a batch update even if the datatable is filled manually/programmatically (from LINQ or any other source).
Just remember to manually set the RowState for the rows in the datatable; use dataRow.SetModified() for this.
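A small sketch of that (note that SetModified only works on rows in the Unchanged state, so AcceptChanges comes first; table and adapter are placeholders):
table.AcceptChanges();                 // rows added programmatically start out as Added
foreach (DataRow row in table.Rows)
    row.SetModified();                 // now Update() sends them as UPDATEs, not INSERTs
adapter.Update(table);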
