I have asked a similar question before, but now I would appreciate specifics. I have 5-11 SQL queries that need to run in a C# .NET 4.5 web application. Currently they run sequentially, which results in slow response times.
I have talked to various architects and DBAs, and they all tell me this can be improved by running the queries in parallel, but they never give the specifics of how. When I ask, they become very vague ;0)
Is there some function available in Oracle that I could call to pass queries to run in parallel?
Alternatively, I have been looking into async/await, but the examples on the web are confusing (most involve returning control to the UI and then updating some text on the screen when the task finally completes). I would like to know how to call several methods so that they execute their SQL in parallel, and then wait for all of them to complete before proceeding.
If anyone could point me in the direction of good documentation or provide specific examples, I would appreciate it!
Updated with sample code. Could someone point out how to change this to async so that it waits for all the various calls to complete?
private CDTInspection GetDetailsInner(CDTInspection tInspection)
{
    // Call Method one to get data
    tInspection = Method1(tInspection);
    // Call Method two to get data
    Method2(tInspection);
    // Call Method three to get data
    Method3(tInspection);
    // Call Method four to get data
    Method4(tInspection);
    return tInspection;
}

private void Method2(CDTInspection tInspection)
{
    // Create the parameter list
    // Execute the query
    // Marshal the results
}
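You can create jobs using DBMS_SCHEDULER that run independently of your session. Read more about DBMS_SCHEDULER in the documentation.
For example, once the jobs exist (created with DBMS_SCHEDULER.CREATE_JOB), you could start them in parallel by passing false for use_current_session, so that each call returns without waiting for the job to finish: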
BEGIN
DBMS_SCHEDULER.RUN_JOB('pkg1.proc1', false);
DBMS_SCHEDULER.RUN_JOB('pkg2.proc2', false);
DBMS_SCHEDULER.RUN_JOB('pkg3.proc3', false);
END;
/
If you would like to run your 5-11 queries in parallel within your application you will have to start multiple threads and execute the queries within the threads in parallel.
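A minimal sketch of that approach in C#, using tasks from the thread pool rather than hand-made threads. QueryA, QueryB and QueryC are hypothetical stand-ins for the question's Method2 to Method4 (they only sleep and return a number), and the sketch assumes each real method would open its own database connection, since a single ADO.NET connection is generally not meant to be shared between threads.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class ParallelQueries
{
    // Hypothetical stand-ins for the real query methods: each one would
    // build its parameters, execute its SQL and marshal the results.
    static int QueryA() { Thread.Sleep(200); return 1; }
    static int QueryB() { Thread.Sleep(200); return 2; }
    static int QueryC() { Thread.Sleep(200); return 3; }

    public static async Task<int[]> RunAllAsync()
    {
        // Start every query on a thread-pool thread without waiting for it.
        Task<int> a = Task.Run(() => QueryA());
        Task<int> b = Task.Run(() => QueryB());
        Task<int> c = Task.Run(() => QueryC());

        // Continue only once all three have completed; the results come
        // back in the same order as the tasks were passed in.
        return await Task.WhenAll(a, b, c);
    }
}
```

If the calling method cannot be made async, Task.WaitAll(a, b, c) blocks the caller until all three tasks have finished, after which each task's Result property holds its value.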
However, if you want the database to execute a query in parallel on the database server(s), usually useful if the query is long running and you want to speed up the individual query execution time, then you can use Parallel Execution.
Parallel execution benefits systems with all of the following characteristics:
Symmetric multiprocessors (SMPs), clusters, or massively parallel systems
Sufficient I/O bandwidth
Underutilized or intermittently used CPUs (for example, systems where CPU usage is typically less than 30%)
Sufficient memory to support additional memory-intensive processes, such as sorting, hashing, and I/O buffers
The easiest way to implement parallel execution is via a hint:
SELECT /*+ PARALLEL */ col1, col2, col3 FROM mytable;
However, this might not be the best way, because it changes your query and has other downsides (for example, to deactivate parallelism again you would have to change the query again). Another way is to specify it at the table level:
ALTER TABLE mytable PARALLEL;
That lets you deactivate parallel execution again (ALTER TABLE mytable NOPARALLEL;) when it is no longer wanted, without changing the query itself.
I recently tested redis-benchmark on my Linux system and was impressed by the results. While benchmarking, I used pipelining of 16 commands. Now I am trying to do the same from C#.
My main problem is that I want to write some thousands of random data items into Redis, and I can't figure out how to use pipelining for this.
Thanks in advance.
The most explicit way to use pipelining in StackExchange.Redis is to use the CreateBatch API:
var db = conn.GetDatabase();
var batch = db.CreateBatch();
// not shown: queue some async operations **without** awaiting them (yet)
batch.Execute(); // this sends the queued commands
// now await the things you queued
However, note that you can achieve a lot without that, since:
concurrent load from different threads (whether sync or async) is multiplexed, allowing effective sharing of a single connection
the same trick of "issue multiple async operations but don't await them just yet" still works fine even without the batch API (using the batch API ensures that the batch is sent as a contiguous block without work from concurrent threads getting interleaved within the batch; this is similar to, but less strict than, the CreateTransaction() API)
Note also that in some bulk scenarios you might want to consider Lua (ScriptEvaluate()); this API is variadic, so it can adapt to arbitrary argument lengths. Your Lua script simply needs to inspect the sizes of KEYS and ARGV (discussed in the EVAL documentation).
I've created a data governance tool. In one of its modules I schedule a job that runs a few SQL queries. It currently runs synchronously, but I want to change it so that it runs faster.
The question is: should I use Parallel, Task, or async/await to achieve what I want?
EDIT: These queries are completely different from each other. Each one runs a different data quality check on different columns or tables. They are all SELECT queries, and I use the ExecuteScalar method, returning only the count of rows that break my quality rules, such as "how many null values are in column ABC".
I schedule them to run at night, as soon as the DWH is populated.
First, I suggest optimizing your SQL queries.
After that, use async/await, because there is no need to block a thread while each query is waiting on the database.
Take a look at the following topic:
Async/await and parallel in C#
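As a rough illustration of the async/await suggestion, the sketch below starts several independent checks and awaits them together. CountViolationsAsync, the rule names and the counts are hypothetical; Task.Delay stands in for the database round trip, where a real check would open its own connection and await DbCommand.ExecuteScalarAsync().

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

public static class QualityChecks
{
    // Hypothetical stand-in for one data quality check. The fake count is
    // passed in so the sketch runs without a database.
    static async Task<KeyValuePair<string, int>> CountViolationsAsync(string rule, int fakeCount)
    {
        await Task.Delay(100); // simulates waiting on the database
        return new KeyValuePair<string, int>(rule, fakeCount);
    }

    public static async Task<Dictionary<string, int>> RunAllAsync()
    {
        // Calling each method starts its check; nothing is awaited yet,
        // so all checks are in flight at the same time.
        var checks = new[]
        {
            CountViolationsAsync("nulls in ABC", 4),
            CountViolationsAsync("duplicate keys", 0),
            CountViolationsAsync("orphan rows", 7)
        };

        // Wait for every check, then index the counts by rule name.
        KeyValuePair<string, int>[] results = await Task.WhenAll(checks);
        return results.ToDictionary(r => r.Key, r => r.Value);
    }
}
```

No extra threads are needed here: while a check is waiting, its thread is free to do other work, which is the main difference from wrapping synchronous calls in Parallel or Task.Run.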
I have a method in .NET (v4.6, using Dapper) named BulkUpdate that modifies several tables and can involve around 10,000 rows or more. This operation can take from a few seconds to a couple of minutes, depending on the amount of data being inserted. Since I will be updating multiple related tables, I have to enclose all operations in a TransactionScope.
My question is: what is the best way to keep other read requests (outside the transaction) from being locked or made to wait while my BulkUpdate method is in progress? I do not want to add SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED to the beginning of every read, nor add a NOLOCK hint. Are there any other solutions?
Use tasks on the C# side: split the rows into batches of 100 or 1,000, give each batch to its own task, and run the tasks simultaneously. It may be useful for you; I have already improved my application's performance this way.
https://www.codeproject.com/Questions/1226752/Csharp-how-to-run-tasks-in-parallel
We developed Windows services that use threads to consume database records (generally on .NET 2.0). The code block is shown below.
for (int i = 0; i < ThreadCount; i++)
{
    ParameterizedThreadStart pts = new ParameterizedThreadStart(MyCode.DoWork);
    Thread t = new Thread(pts);
    t.Start(someObject);
}
ThreadCount is read from app.config. The MyCode.DoWork(Object someObject) code block selects some data from SQL Server and performs some operations on it. In addition, we call stored procedures, and the queries use hints such as WITH (ROWLOCK), etc.
while (someObject.Running)
{
    // select some data
}
My main question is how to improve my Windows service. Some articles say that creating threads manually increases CPU cost, etc. So how can I improve my application's performance? Would using the Task Parallel Library bring any advantage, that is, creating a task instead of creating a thread? Does a Task create threads itself based on the available CPU count? Or should I convert the code to something like this:
for (int i = 0; i < ThreadCount; i++) {
    Task t = new Task(() => MyCode.Work());
    t.Start(); // a Task created with "new" does not run until it is started
}
To improve the performance of your application, you need to find out where the performance is lacking and then why it's lacking. Unfortunately we can't do that for you, because it needs access to the running C# and SQL code.
I suggest doing some performance profiling, either with a profiler tool (I use Redgate's tool) or by adding profiling code to your application. Once you can see the bottlenecks, you can make a theory about what's causing them.
I would start with the database first - look at the cached execution plans to see if there are any clues. Try running the stored procedures separately from the service.
To keep things simple, and assuming the Windows service is the only client accessing the database:
1. Try to access the database from one thread. This reduces lock contention on the database tables, assuming there is contention in accessing the data.
2. Put the retrieved data into a queue and have TPL tasks process it to utilize all CPU cores, assuming your performance bottleneck is the CPU.
3. Then buy as many CPU cores as you can afford.
This is an intuitive pattern and there are many assumptions in it. You'll need to profile and analyze your program to know whether this is suitable.
I want to get the community's perspective on this. If I have a process which is heavily DB/IO bound, how smart would it be to parallelize individual process paths using the Task Parallel library?
I'll use an example: say I have a bunch of items and I need to do the following operations:
1. Query a DB for a list of items.
2. Do some aggregation operations to group certain items based on a dynamic list of parameters.
3. For each grouped result, query the database for something based on the aggregated result.
4. For each grouped result, do some numeric calculations (3 and 4 would happen sequentially).
5. Do some inserts and updates for the result calculated in #3.
6. Do some inserts and updates for each item returned in #1.
Logically speaking, I can parallelize into a graph of tasks at steps #3, #5, #6 as one item has no bearing on the result of the previous. However, each of these will be waiting on the database (sql server) which is fine and I understand that we can only process as far as the SQL server will let us.
But I want to logically distribute the task on the local machine so that it processes as fast as the Database lets us without having to wait for anything on our end. I've done some mock prototype where I substitute the db calls with Thread.Sleeps (I also tried some variations with .SpinWait, which was a million times faster), and the parallel version is waaaaay faster than the current implementation which is completely serial and not parallel at all.
What I'm afraid of is putting too much strain on the SQL Server. Is there anything I should consider before I go too far down this path?
If the parallel version is much faster than the serial version, I would not worry about the strain on your SQL server...unless of course the tasks you are performing are low priority compared to some other significant or time-critical operations that are also performed on the DB server.
I don't fully understand your description of the tasks, but it almost sounds like more of them should be performed directly in the database (I presume there are details that make that impossible?).
Another option would be to create a pipeline so that step 3 for the second group happens at the same time as step 4 for the first group. If you can overlap the updates at step 5, do that too. That way you're doing concurrent SQL access and processing, but not over-taxing the database, because you only have two concurrent database operations going on at once.
So you do steps 1 and 2 sequentially (I presume) to get a collection of groups that require further processing. Then your main thread starts:
for each group
    query the database
    place the results of the query into the calc queue
A second thread services the calc queue:
while not end of data
    dequeue result from calc queue
    do numeric calculations
    place the results of the calculation into the update queue
A third thread services the update queue:
while not end of data
    dequeue result from update queue
    update database
The System.Collections.Concurrent.BlockingCollection<T> is a very effective queue for this kind of thing.
The nice thing here is that you can scale it if you want, by adding multiple calculation threads, or more query/update threads if the SQL Server can handle more concurrent transactions.
I use something very similar to this in a daily merge/update program, with very good results. That particular process doesn't use SQL server, but rather standard file I/O, but the concepts translate very well.
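The three-stage pipeline described above could be sketched roughly as follows. QueryDatabase and Calculate are hypothetical stand-ins for the real work, the "update" stage just appends to a list instead of writing to a database, and the bounded capacity of 100 is an arbitrary assumption that keeps a fast stage from running far ahead of a slow one.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class Pipeline
{
    // Hypothetical stand-ins for the per-group query and the calculation.
    static int QueryDatabase(int group) { return group * 10; }
    static int Calculate(int queryResult) { return queryResult + 1; }

    public static List<int> Run(IEnumerable<int> groups)
    {
        var calcQueue = new BlockingCollection<int>(boundedCapacity: 100);
        var updateQueue = new BlockingCollection<int>(boundedCapacity: 100);
        var updated = new List<int>(); // stands in for the database updates

        // Stage 1: query once per group and feed the calc queue.
        Task query = Task.Run(() =>
        {
            try { foreach (int g in groups) calcQueue.Add(QueryDatabase(g)); }
            finally { calcQueue.CompleteAdding(); } // signals "end of data"
        });

        // Stage 2: calculate and feed the update queue.
        Task calc = Task.Run(() =>
        {
            try
            {
                foreach (int r in calcQueue.GetConsumingEnumerable())
                    updateQueue.Add(Calculate(r));
            }
            finally { updateQueue.CompleteAdding(); }
        });

        // Stage 3: apply the updates (only this task touches the list).
        Task update = Task.Run(() =>
        {
            foreach (int r in updateQueue.GetConsumingEnumerable())
                updated.Add(r);
        });

        Task.WaitAll(query, calc, update);
        return updated;
    }
}
```

GetConsumingEnumerable blocks while a queue is empty and ends once CompleteAdding has been called and the queue has drained, which is what plays the role of "while not end of data" in the pseudocode.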