I thought I was trying to do something very simple. I just want to report a running number on the screen so the user gets the idea that the SQL stored procedure I'm executing is working, and doesn't get impatient and start clicking buttons.
The problem is that I can't figure out how to actually call the progress reporter for the ExecuteNonQueryAsync command. It gets stuck in my reporting loop and never executes the command, but if I put the loop after the async command, the command runs and result is no longer zero, so the loop never executes.
Any thoughts, comments, ideas would be appreciated. Thank you so much!
int i = 0;
lblProcessing.Text = "Transactions " + i.ToString();
int result = 0;
while (result==0)
{
i++;
if (i % 500 == 0)
{
lblProcessing.Text = "Transactions " + i.ToString();
lblProcessing.Refresh();
}
}
// Yes - I know - the code never gets here - that is the problem!
result = await cmd.ExecuteNonQueryAsync();
The simplest way to do this is to use a second connection to monitor the progress, and report on it. Here's a little sample to get you started:
using System;
using System.Collections.Generic;
using System.Data;
using System.Data.SqlClient;
using System.Text;
using System.Threading.Tasks;
namespace Microsoft.Samples.SqlServer
{
public class SessionStats
{
public long Reads { get; set; }
public long Writes { get; set; }
public long CpuTime { get; set; }
public long RowCount { get; set; }
public long WaitTime { get; set; }
public string LastWaitType { get; set; }
public string Status { get; set; }
public override string ToString()
{
return $"Reads {Reads}, Writes {Writes}, CPU {CpuTime}, RowCount {RowCount}, WaitTime {WaitTime}, LastWaitType {LastWaitType}, Status {Status}";
}
}
public class SqlCommandWithProgress
{
public static async Task ExecuteNonQuery(string ConnectionString, string Query, Action<SessionStats> OnProgress)
{
using (var rdr = await ExecuteReader(ConnectionString, Query, OnProgress))
{
rdr.Dispose();
}
}
public static async Task<DataTable> ExecuteDataTable(string ConnectionString, string Query, Action<SessionStats> OnProgress)
{
using (var rdr = await ExecuteReader(ConnectionString, Query, OnProgress))
{
var dt = new DataTable();
dt.Load(rdr);
return dt;
}
}
public static async Task<SqlDataReader> ExecuteReader(string ConnectionString, string Query, Action<SessionStats> OnProgress)
{
var mainCon = new SqlConnection(ConnectionString);
using (var monitorCon = new SqlConnection(ConnectionString))
{
mainCon.Open();
monitorCon.Open();
var cmd = new SqlCommand("select ##spid session_id", mainCon);
var spid = Convert.ToInt32(cmd.ExecuteScalar());
cmd = new SqlCommand(Query, mainCon);
var monitorQuery = #"
select s.reads, s.writes, r.cpu_time, s.row_count, r.wait_time, r.last_wait_type, r.status
from sys.dm_exec_requests r
join sys.dm_exec_sessions s
on r.session_id = s.session_id
where r.session_id = @session_id";
var monitorCmd = new SqlCommand(monitorQuery, monitorCon);
monitorCmd.Parameters.Add(new SqlParameter("#session_id", spid));
var queryTask = cmd.ExecuteReaderAsync( CommandBehavior.CloseConnection );
var cols = new { reads = 0, writes = 1, cpu_time = 2, row_count = 3, wait_time = 4, last_wait_type = 5, status = 6 };
while (!queryTask.IsCompleted)
{
var firstTask = await Task.WhenAny(queryTask, Task.Delay(1000));
if (firstTask == queryTask)
{
break;
}
using (var rdr = await monitorCmd.ExecuteReaderAsync())
{
await rdr.ReadAsync();
var result = new SessionStats()
{
Reads = Convert.ToInt64(rdr[cols.reads]),
Writes = Convert.ToInt64(rdr[cols.writes]),
RowCount = Convert.ToInt64(rdr[cols.row_count]),
CpuTime = Convert.ToInt64(rdr[cols.cpu_time]),
WaitTime = Convert.ToInt64(rdr[cols.wait_time]),
LastWaitType = Convert.ToString(rdr[cols.last_wait_type]),
Status = Convert.ToString(rdr[cols.status]),
};
OnProgress(result);
}
}
return queryTask.Result;
}
}
}
}
Which you would call something like this:
class Program
{
static void Main(string[] args)
{
Run().Wait();
}
static async Task Run()
{
var constr = "server=localhost;database=tempdb;integrated security=true";
var sql = #"
set nocount on;
select newid() d
into #foo
from sys.objects, sys.objects o2, sys.columns
order by newid();
select count(*) from #foo;
";
using (var rdr = await SqlCommandWithProgress.ExecuteReader(constr, sql, s => Console.WriteLine(s)))
{
if (!rdr.IsClosed)
{
while (rdr.Read())
{
Console.WriteLine("Row read");
}
}
}
Console.WriteLine("Hit any key to exit.");
Console.ReadKey();
}
}
Which outputs:
Reads 0, Writes 0, CPU 1061, RowCount 0, WaitTime 0, LastWaitType SOS_SCHEDULER_YIELD, Status running
Reads 0, Writes 0, CPU 2096, RowCount 0, WaitTime 0, LastWaitType SOS_SCHEDULER_YIELD, Status running
Reads 0, Writes 0, CPU 4553, RowCount 11043136, WaitTime 198, LastWaitType CXPACKET, Status suspended
Row read
Hit any key to exit.
You're not going to be able to get ExecuteNonQueryAsync to do what you want here. For that to work, the method would have to return results row by row, or in chunks, while the SQL call is still running, but that's not how submitting a query batch to SQL Server works, nor how you'd want it to work from an overhead perspective. You hand a SQL statement to the server, and after it has finished processing the statement, it returns the total number of rows affected by the statement.
Do you just want to let the user know that something is happening, and you don't actually need to display current progress?
If so, you could just display a ProgressBar with its Style set to Marquee.
If you want this to be a "self-contained" method, you could display the progress bar on a modal form, and include the form code in the method itself.
E.g.
public void ExecuteNonQueryWithProgress(SqlCommand cmd) {
Form f = new Form() {
Text = "Please wait...",
Size = new Size(400, 100),
StartPosition = FormStartPosition.CenterScreen,
FormBorderStyle = FormBorderStyle.FixedDialog,
MaximizeBox = false,
ControlBox = false
};
f.Controls.Add(new ProgressBar() {
Style = ProgressBarStyle.Marquee,
Dock = DockStyle.Fill
});
f.Shown += async (sender, e) => {
await cmd.ExecuteNonQueryAsync();
f.Close();
};
f.ShowDialog();
}
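A call might look something like this (the procedure name and the connection variable are placeholders, not from the question):
// Hypothetical caller - the modal dialog keeps the UI "busy" until the command completes.
// `connection` is assumed to be an already-open SqlConnection.
var cmd = new SqlCommand("EXEC dbo.LongRunningProc", connection);
ExecuteNonQueryWithProgress(cmd);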
That is an interesting question. I have had to implement similar things in the past. In our case the priority was to:
Keep client side responsive in case the user doesn't want to stick around and wait.
Keep the user updated on activity and progress.
What I would do is use threading to run the process in the background like:
HostingEnvironment.QueueBackgroundWorkItem(ct => FunctionThatCallsSQLandTakesTime(p, q, s));
Then, using some way to estimate the work time, I would advance a progress bar on the client side on a clock. For this, query your data for a variable that has a roughly linear relationship to the work time needed by FunctionThatCallsSQLandTakesTime.
For example, the number of active users this month drives how long FunctionThatCallsSQLandTakesTime takes: for every 10,000 users it takes 5 minutes, so you can update your progress bar accordingly.
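In the WinForms context of the question, a rough sketch of that estimation idea might look like this (the 5-minutes-per-10,000-users ratio, the GetActiveUserCountThisMonth helper, and progressBar1 are all assumptions for illustration):
// Rough sketch: estimate the total time from a driver variable, then advance
// a progress bar on a UI timer while the background work runs.
int activeUsers = GetActiveUserCountThisMonth();          // hypothetical helper
double estimatedSeconds = activeUsers / 10000.0 * 300.0;  // assumed ratio: 5 minutes per 10,000 users
var stopwatch = System.Diagnostics.Stopwatch.StartNew();

var uiTimer = new System.Windows.Forms.Timer { Interval = 500 };
uiTimer.Tick += (s, e) =>
{
    double fraction = Math.Min(stopwatch.Elapsed.TotalSeconds / estimatedSeconds, 1.0);
    progressBar1.Value = (int)(fraction * progressBar1.Maximum);
};
uiTimer.Start();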
I'm wondering if this might be a reasonable approach:
IAsyncResult result = cmd2.BeginExecuteNonQuery();
int count = 0;
while (!result.IsCompleted)
{
count++;
if (count % 500 == 0)
{
lblProcessing.Text = "Transactions " + i.ToString();
lblProcessing.Refresh();
}
// Wait for 1/10 second, so the counter
// does not consume all available resources
// on the main thread.
System.Threading.Thread.Sleep(100);
}
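The same polling idea also works with ExecuteNonQueryAsync if you start the task without awaiting it right away. A sketch, awaiting Task.Delay so the UI thread is never blocked (lblProcessing is the label from the question; this sits inside the same async method as the original code):
// Start the command, then poll its Task while keeping the UI responsive.
Task<int> queryTask = cmd.ExecuteNonQueryAsync();
int count = 0;
while (!queryTask.IsCompleted)
{
    count++;
    lblProcessing.Text = "Transactions " + count.ToString();
    await Task.Delay(100);   // yields to the message loop instead of blocking it
}
int result = await queryTask;   // observe the result (and any exception)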
Related
I'm working on an importer in our web application. With the code I currently have, it runs fine and within reason when connecting to a local SQL Server. I'm also creating a .sql script that users can download.
Example 1
40k records, 8 columns, takes from 1 minute 30 seconds to 2 minutes
When I move it to production and an Azure App Service, it runs VERY slowly.
Example 2
40k records, 8 columns, takes from 15 to 18 minutes
The current database pricing tier is Standard S2 (50 DTUs).
Here is the code:
using (var sqlConnection = new SqlConnection(connectionString))
{
try
{
var generatedScriptFilePathInfo = GetImportGeneratedScriptFilePath(trackingInfo.UploadTempDirectoryPath, trackingInfo.FileDetail);
using (FileStream fileStream = File.Create(generatedScriptFilePathInfo.GeneratedScriptFilePath))
{
using (StreamWriter writer = new StreamWriter(fileStream))
{
sqlConnection.Open();
sqlTransaction = sqlConnection.BeginTransaction();
await writer.WriteLineAsync("/* Insert Scripts */").ConfigureAwait(false);
foreach (var item in trackingInfo.InsertSqlScript)
{
errorSqlScript = item;
using (var cmd = new SqlCommand(item, sqlConnection, sqlTransaction))
{
cmd.CommandTimeout = 800;
cmd.CommandType = CommandType.Text;
await cmd.ExecuteScalarAsync().ConfigureAwait(false);
}
currentRowLine++;
rowsProcessedUpdateEveryXCounter++;
rowsProcessedTotal++;
// append insert statement to the file
await writer.WriteLineAsync(item).ConfigureAwait(false);
}
// write out a couple of blank lines to separate insert statements from post scripts (if there are any)
await writer.WriteLineAsync(string.Empty).ConfigureAwait(false);
await writer.WriteLineAsync(string.Empty).ConfigureAwait(false);
}
}
}
catch (OverflowException exOverFlow)
{
sqlTransaction.Rollback();
sqlTransaction.Dispose();
trackingInfo.IsSuccessful = false;
trackingInfo.ImportMetricUpdateError = new ImportMetricUpdateErrorDTO(trackingInfo.ImportMetricId)
{
ErrorLineNbr = currentRowLine + 1, // add one to go ahead and count the record we are on to sync up with the file
ErrorMessage = string.Format(CultureInfo.CurrentCulture, "{0}", ImporterHelper.ArithmeticOperationOverflowFriendlyErrorText),
ErrorSQL = errorSqlScript,
RowsProcessed = currentRowLine
};
await LogImporterError(trackingInfo.FileDetail, exOverFlow.ToString(), currentUserId).ConfigureAwait(false);
await UpdateImportAfterFailure(trackingInfo.ImportMetricId, exOverFlow.Message, currentUserId).ConfigureAwait(false);
return trackingInfo;
}
catch (Exception ex)
{
sqlTransaction.Rollback();
sqlTransaction.Dispose();
trackingInfo.IsSuccessful = false;
trackingInfo.ImportMetricUpdateError = new ImportMetricUpdateErrorDTO(trackingInfo.ImportMetricId)
{
ErrorLineNbr = currentRowLine + 1, // add one to go ahead and count the record we are on to sync up with the file
ErrorMessage = string.Format(CultureInfo.CurrentCulture, "{0}", ex.Message),
ErrorSQL = errorSqlScript,
RowsProcessed = currentRowLine
};
await LogImporterError(trackingInfo.FileDetail, ex.ToString(), currentUserId).ConfigureAwait(false);
await UpdateImportAfterFailure(trackingInfo.ImportMetricId, ex.Message, currentUserId).ConfigureAwait(false);
return trackingInfo;
}
}
Questions
Is there any way to speed this up on Azure? Or is the only way to upgrade the DTUs?
We are looking into SQL Bulk Copy as well. Will this help any or still cause slowness on Azure: https://learn.microsoft.com/en-us/dotnet/api/system.data.sqlclient.sqlbulkcopy?redirectedfrom=MSDN&view=dotnet-plat-ext-5.0
Desired results
Run at the same speed when running it at a local SQL Server database
For now, I updated my code to batch the insert statements based on the record count. If the count is over 10k, it batches them by dividing the total by 10.
This helped performance BIG TIME on our Azure instance. I was able to add 40k records within 30 seconds. I also think part of the issue was how many deployment slots our Azure App Service uses.
We will also probably move to SqlBulkCopy later on, as users need to import larger Excel files (a rough sketch of that is included after the code below).
Thanks everyone for the help and insights!
// apply the insert SQL scripts if found.
if (string.IsNullOrWhiteSpace(trackingInfo.InsertSqlScript.ToString()) == false)
{
int? updateEveryXRecords = GetProcessedEveryXTimesForApplyingInsertStatementsValue(trackingInfo.FileDetail);
trackingInfo.FileDetail = UpdateImportMetricStatus(trackingInfo.FileDetail, ImportMetricStatus.ApplyingInsertScripts, currentUserId);
int rowsProcessedUpdateEveryXCounter = 0;
int rowsProcessedTotal = 0;
await UpdateImportMetricsRowsProcessed(trackingInfo.ImportMetricId, rowsProcessedTotal, trackingInfo.FileDetail.ImportMetricStatusHistories).ConfigureAwait(false);
bool isBulkMode = trackingInfo.InsertSqlScript.Count >= 10000;
await writer.WriteLineAsync("/* Insert Scripts */").ConfigureAwait(false);
int insertCounter = 0;
int bulkCounter = 0;
int bulkProcessingAmount = 0;
int lastInsertCounter = 0;
if (isBulkMode == true)
{
bulkProcessingAmount = trackingInfo.InsertSqlScript.Count / 10;
}
await LogInsertBulkStatus(trackingInfo.FileDetail, isBulkMode, trackingInfo.InsertSqlScript.Count, bulkProcessingAmount, currentUserId).ConfigureAwait(false);
StringBuilder sbInsertBulk = new StringBuilder();
foreach (var item in trackingInfo.InsertSqlScript)
{
if (isBulkMode == false)
{
errorSqlScript = item;
using (var cmd = new SqlCommand(item, sqlConnection, sqlTransaction))
{
cmd.CommandTimeout = 800;
cmd.CommandType = CommandType.Text;
await cmd.ExecuteScalarAsync().ConfigureAwait(false);
}
currentRowLine++;
rowsProcessedUpdateEveryXCounter++;
rowsProcessedTotal++;
// append insert statement to the file
await writer.WriteLineAsync(item).ConfigureAwait(false);
// Update database with the insert statement created count to alert the user of the status.
if (updateEveryXRecords.HasValue)
{
if (updateEveryXRecords.Value == rowsProcessedUpdateEveryXCounter)
{
await UpdateImportMetricsRowsProcessed(trackingInfo.ImportMetricId, rowsProcessedTotal, trackingInfo.FileDetail.ImportMetricStatusHistories).ConfigureAwait(false);
rowsProcessedUpdateEveryXCounter = 0;
}
}
}
else
{
sbInsertBulk.AppendLine(item);
if (bulkCounter < bulkProcessingAmount)
{
errorSqlScript = string.Format(CultureInfo.CurrentCulture, "IsBulkMode is True | insertCounter = {0}", insertCounter);
bulkCounter++;
}
else
{
// display to the end user
errorSqlScript = string.Format(CultureInfo.CurrentCulture, "IsBulkMode is True | currentInsertCounter value = {0} | lastInsertCounter (insertCounter when the last bulk insert occurred): {1}", insertCounter, lastInsertCounter);
await ApplyBulkInsertStatements(sbInsertBulk, writer, sqlConnection, sqlTransaction, trackingInfo, rowsProcessedTotal).ConfigureAwait(false);
bulkCounter = 0;
sbInsertBulk.Clear();
lastInsertCounter = insertCounter;
}
rowsProcessedTotal++;
}
insertCounter++;
}
// get the remaining records after finishing the forEach insert statement
if (isBulkMode == true)
{
await ApplyBulkInsertStatements(sbInsertBulk, writer, sqlConnection, sqlTransaction, trackingInfo, rowsProcessedTotal).ConfigureAwait(false);
}
}
/// <summary>
/// Applies the bulk insert statements.
/// </summary>
/// <param name="sbInsertBulk">The sb insert bulk.</param>
/// <param name="wrtier">The wrtier.</param>
/// <param name="sqlConnection">The SQL connection.</param>
/// <param name="sqlTransaction">The SQL transaction.</param>
/// <param name="trackingInfo">The tracking information.</param>
/// <param name="rowsProcessedTotal">The rows processed total.</param>
/// <returns>Task</returns>
private async Task ApplyBulkInsertStatements(
StringBuilder sbInsertBulk,
StreamWriter writer,
SqlConnection sqlConnection,
SqlTransaction sqlTransaction,
ProcessImporterTrackingDTO trackingInfo,
int rowsProcessedTotal)
{
var bulkInsertStatements = sbInsertBulk.ToString();
using (var cmd = new SqlCommand(bulkInsertStatements, sqlConnection, sqlTransaction))
{
cmd.CommandTimeout = 800;
cmd.CommandType = CommandType.Text;
await cmd.ExecuteScalarAsync().ConfigureAwait(false);
}
// append insert statement to the file
await writer.WriteLineAsync(bulkInsertStatements).ConfigureAwait(false);
// Update database with the insert statement created count to alert the user of the status.
await UpdateImportMetricsRowsProcessed(trackingInfo.ImportMetricId, rowsProcessedTotal, trackingInfo.FileDetail.ImportMetricStatusHistories).ConfigureAwait(false);
}
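Since moving to SqlBulkCopy is mentioned above, here is a minimal sketch of what that could look like; the destination table name and the DataTable source are assumptions for illustration, not the importer's real schema:
// Minimal SqlBulkCopy sketch - assumes `dataTable` has columns matching dbo.ImportTarget.
using (var bulkCopy = new SqlBulkCopy(sqlConnection, SqlBulkCopyOptions.Default, sqlTransaction))
{
    bulkCopy.DestinationTableName = "dbo.ImportTarget"; // hypothetical target table
    bulkCopy.BatchSize = 5000;                          // rows sent to the server per round trip
    bulkCopy.BulkCopyTimeout = 800;
    await bulkCopy.WriteToServerAsync(dataTable).ConfigureAwait(false);
}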
I have a .NET program that runs through a directory containing tens of thousands of relatively small files (around 10 MB each), calculates their MD5 hash and stores that data in an SQLite database. The whole process works fine, however it takes a relatively long time (1,094,353 ms, roughly 18 minutes, for around 60 thousand files) and I'm looking for ways to optimize it. Here are the solutions I've thought of:
Use additional threads and calculate the hash of more than one file simultaneously. Not sure how I/O speed would limit me with this one.
Use a better hashing algorithm. I've looked around and the one I'm currently using seems to be the fastest one (on C# at least).
Which would be the best approach, and are there any better ones?
Here's my current code:
private async Task<string> CalculateHash(string file, System.Security.Cryptography.MD5 md5) {
Task<string> MD5 = Task.Run(() =>
{
{
using (var stream = new BufferedStream(System.IO.File.OpenRead(file), 1200000))
{
var hash = md5.ComputeHash(stream);
var fileMD5 = string.Concat(Array.ConvertAll(hash, x => x.ToString("X2")));
return fileMD5;
}
};
});
return await MD5;
}
public async Task Main() {
using (var md5 = System.Security.Cryptography.MD5.Create()) {
foreach (var file in Directory.GetFiles(path)) {
var hash = await CalculateHash(file, md5);
// Adds `hash` to the database
}
}
}
Create a pipeline of work. The easiest way I know to build a pipeline that mixes the parts that must be single-threaded with the parts that can be multi-threaded is to use TPL Dataflow.
using System;
using System.IO;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;
public static class Example
{
private class Dto
{
public Dto(string filePath, byte[] data)
{
FilePath = filePath;
Data = data;
}
public string FilePath { get; }
public byte[] Data { get; }
}
public static async Task ProcessFiles(string path)
{
var getFilesBlock = new TransformBlock<string, Dto>(filePath => new Dto(filePath, File.ReadAllBytes(filePath))); //Only lets one thread do this at a time.
var hashFilesBlock = new TransformBlock<Dto, Dto>(dto => HashFile(dto),
new ExecutionDataflowBlockOptions{MaxDegreeOfParallelism = Environment.ProcessorCount, //We can multi-thread this part.
BoundedCapacity = 50}); //Only allow 50 byte[]'s to be waiting in the queue. It will unblock getFilesBlock once there is room.
var writeToDatabaseBlock = new ActionBlock<Dto>(WriteToDatabase,
new ExecutionDataflowBlockOptions {BoundedCapacity = 50});//MaxDegreeOfParallelism defaults to 1 so we don't need to specify it.
//Link the blocks together.
getFilesBlock.LinkTo(hashFilesBlock, new DataflowLinkOptions {PropagateCompletion = true});
hashFilesBlock.LinkTo(writeToDatabaseBlock, new DataflowLinkOptions {PropagateCompletion = true});
//Queue the work for the first block.
foreach (var filePath in Directory.EnumerateFiles(path))
{
await getFilesBlock.SendAsync(filePath).ConfigureAwait(false);
}
//Tell the first block we are done adding files.
getFilesBlock.Complete();
//Wait for the last block to finish processing its last item.
await writeToDatabaseBlock.Completion.ConfigureAwait(false);
}
private static Dto HashFile(Dto dto)
{
using (var md5 = System.Security.Cryptography.MD5.Create())
{
return new Dto(dto.FilePath, md5.ComputeHash(dto.Data));
}
}
private static async Task WriteToDatabase(Dto arg)
{
//Write to the database here.
}
}
This creates a pipeline with 3 segments.
One that is single-threaded and reads the files from the hard drive into memory, storing each as a byte[].
A second one that can use up to Environment.ProcessorCount threads to hash the files; it will only allow 50 items to sit in its inbound queue, and when that is full the first block stops processing new items until the next block is ready to accept more.
And a third one that is single-threaded and adds the data to the database; it also allows only 50 items in its inbound queue at a time.
Because of the two limits of 50, there will be at most about 100 byte[] arrays in memory (50 in the hashFilesBlock queue, 50 in the writeToDatabaseBlock queue); items currently being processed also count toward the BoundedCapacity limit.
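A call site for the pipeline might look something like this (the folder path is just a placeholder):
// Hypothetical call site - hashes every file under the given folder.
// Must be awaited from an async method.
await Example.ProcessFiles(@"C:\temp\files");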
Update: for fun, I wrote a version that reports progress too; it's untested, though, and uses C# 7 features.
using System;
using System.IO;
using System.Threading;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;
public static class Example
{
private class Dto
{
public Dto(string filePath, byte[] data)
{
FilePath = filePath;
Data = data;
}
public string FilePath { get; }
public byte[] Data { get; }
}
public static async Task ProcessFiles(string path, IProgress<ProgressReport> progress)
{
int totalFilesFound = 0;
int totalFilesRead = 0;
int totalFilesHashed = 0;
int totalFilesUploaded = 0;
DateTime lastReported = DateTime.UtcNow;
void ReportProgress()
{
if (DateTime.UtcNow - lastReported < TimeSpan.FromSeconds(1)) //Try to fire only once a second, but this code is not perfect so you may get a few rapid fire.
{
return;
}
lastReported = DateTime.UtcNow;
var report = new ProgressReport(totalFilesFound, totalFilesRead, totalFilesHashed, totalFilesUploaded);
progress.Report(report);
}
var getFilesBlock = new TransformBlock<string, Dto>(filePath =>
{
var dto = new Dto(filePath, File.ReadAllBytes(filePath));
totalFilesRead++; //safe because single threaded.
return dto;
});
var hashFilesBlock = new TransformBlock<Dto, Dto>(inDto =>
{
using (var md5 = System.Security.Cryptography.MD5.Create())
{
var outDto = new Dto(inDto.FilePath, md5.ComputeHash(inDto.Data));
Interlocked.Increment(ref totalFilesHashed); //Need the interlocked due to multithreaded.
ReportProgress();
return outDto;
}
},
new ExecutionDataflowBlockOptions{MaxDegreeOfParallelism = Environment.ProcessorCount, BoundedCapacity = 50});
var writeToDatabaseBlock = new ActionBlock<Dto>(arg =>
{
//Write to database here.
totalFilesUploaded++;
ReportProgress();
},
new ExecutionDataflowBlockOptions {BoundedCapacity = 50});
getFilesBlock.LinkTo(hashFilesBlock, new DataflowLinkOptions {PropagateCompletion = true});
hashFilesBlock.LinkTo(writeToDatabaseBlock, new DataflowLinkOptions {PropagateCompletion = true});
foreach (var filePath in Directory.EnumerateFiles(path))
{
await getFilesBlock.SendAsync(filePath).ConfigureAwait(false);
totalFilesFound++;
ReportProgress();
}
getFilesBlock.Complete();
await writeToDatabaseBlock.Completion.ConfigureAwait(false);
ReportProgress();
}
}
public class ProgressReport
{
public ProgressReport(int totalFilesFound, int totalFilesRead, int totalFilesHashed, int totalFilesUploaded)
{
TotalFilesFound = totalFilesFound;
TotalFilesRead = totalFilesRead;
TotalFilesHashed = totalFilesHashed;
TotalFilesUploaded = totalFilesUploaded;
}
public int TotalFilesFound { get; }
public int TotalFilesRead { get; }
public int TotalFilesHashed { get; }
public int TotalFilesUploaded { get; }
}
As far as I understand, Task.Run will queue a separate work item on the thread pool for every file you have there, which adds overhead and context switching. A case like you describe sounds like a good fit for Parallel.For or Parallel.ForEach, something like this:
public void CalcHashes(string path)
{
string GetFileHash(System.Security.Cryptography.MD5 md5, string fileName)
{
using (var stream = new BufferedStream(System.IO.File.OpenRead(fileName), 1200000))
{
var hash = md5.ComputeHash(stream);
var fileMD5 = string.Concat(Array.ConvertAll(hash, x => x.ToString("X2")));
return fileMD5;
}
}
ParallelOptions options = new ParallelOptions();
options.MaxDegreeOfParallelism = 8;
var filenames = Directory.GetFiles(path);   // files to hash
Parallel.ForEach(filenames, options, fileName =>
{
using (var md5 = System.Security.Cryptography.MD5.Create())
{
GetFileHash(md5, fileName);
}
});
}
EDIT: It seems Parallel.ForEach does not actually do the partitioning automatically, so I added a max degree of parallelism limit of 8. As a result:
107005 files
46628 ms
I'm trying to run a Thread infinitely, but it's not working ...
Follow the code:
namespace BTCPrice
{
public static class Price
{
private static volatile bool _shouldStop = false;
public static void RequestStop()
{
_shouldStop = true;
}
public static void Refresh(out string value, int tipo = 0, string source = "https://www.mercadobitcoin.net/api/")
{
while (_shouldStop == false)
{
JavaScriptSerializer serializer = new System.Web.Script.Serialization.JavaScriptSerializer();
WebClient cliente = new WebClient();
string json = cliente.DownloadString(string.Format("{0}/{1}/{2}", source, "BTC", "ticker"));
JObject j = JObject.Parse(json);
switch (tipo)
{
//Get High Price
case 0:
value = j["ticker"]["high"].ToString();
break;
//Get Low Price
case 1:
value = j["ticker"]["low"].ToString();
break;
default:
value = "default";
break;
}
Thread.Sleep(1000);
}
value = "Stopped";
}
}
}
On Start:
string result = "";
Thread workerThread = new Thread(() => {
Price.Refresh(out result);
MessageBox.Show(result);
Invoke(textBox1, result);
Thread.Sleep(1000);
});
No exception occurs ... as long as I remove the while (_shouldStop == false) loop from the class, the code works perfectly. However, I would like it so that, while the program is open, it keeps executing the code and updating the textbox with the value I get from the API.
Result without while (_shouldStop == false) in the class: (screenshot)
Expected result with the while loop: (screenshot)
You really shouldn't be using threads these days when there are excellent alternatives that handle all of the mess for you.
I'd suggest using Microsoft's Reactive Framework (aka "Rx"). Just NuGet "System.Reactive", "System.Reactive.Windows.Forms" (Windows Forms), "System.Reactive.Windows.Threading" (WPF).
Then you can do this:
int tipo = 0;
string source = "https://www.mercadobitcoin.net/api/";
string url = string.Format("{0}/{1}/{2}", source, "BTC", "ticker");
IObservable<string> feed =
from n in Observable.Interval(TimeSpan.FromSeconds(1.0))
from json in Observable.Using<string, WebClient>(() => new WebClient(), cliente => cliente.DownloadStringTaskAsync(url).ToObservable())
let j = JObject.Parse(json)
let high = j["ticker"]["high"].ToString()
let low = j["ticker"]["low"].ToString()
select tipo == 0 ? high : (tipo == 1 ? low : "default");
IDisposable subscription =
feed
.ObserveOn(this) // for Windows Forms, or .ObserveOnDispatcher() for WPF
.Subscribe(value =>
{
/* Do something with `value` */
});
You'll now get a steady stream of the string value every second. A background thread is used automatically and the results are automatically passed back to the UI thread.
When you want to stop the feed producing values just call subscription.Dispose();.
This code entirely replaces your Price class.
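If this lives in a form, one natural place to dispose it is when the form closes. A sketch, assuming subscription is kept in a field rather than a local:
// Sketch: keep the IDisposable in a field and dispose it when the form closes,
// so the polling stops with the window.
protected override void OnFormClosed(FormClosedEventArgs e)
{
    subscription.Dispose();
    base.OnFormClosed(e);
}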
Move the while loop from Price.Refresh into the thread, and have Price.Refresh return a string instead.
Thread workerThread = new Thread(() => {
while (true)
{
String result = Price.Refresh();
MessageBox.Show(result);
Invoke(textBox1, result);
Thread.Sleep(1000);
}
});
I agree with Scott Chamberlain that you should use a timer instead and rewrite this, but the above will work for you.
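For illustration, a sketch of that timer alternative, assuming Price.Refresh is reworked to do a single fetch and return a string as suggested above:
// Sketch: a WinForms timer polls once a second on the UI thread, so no Invoke is needed.
// Note the download runs on the UI thread here; make it async if that becomes noticeable.
var timer = new System.Windows.Forms.Timer { Interval = 1000 };
timer.Tick += (s, e) => textBox1.Text = Price.Refresh(); // hypothetical single-shot Refresh()
timer.Start();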
I migrated an Access database to SQL using Microsoft SQL Server Migration Assistant.
Now, I am having trouble reading data.
return reader.GetInt32(0); Throws Invalid attempt to read when no data is present exception after 32 rows retrieved. If I add CommandBehavior.SequentialAccess to my command, I am able to read 265 rows, every time.
The data
The query (in SQL Management Studio):
SELECT *
FROM Products2
WHERE Products2.PgId = 337;
There is nothing special about row 32, and if I reverse the order, it is still row 32 that kills it.
Row 265, just for good measure.
The code
The query:
SELECT *
FROM Products2
WHERE Products2.PgId = @productGroupId;
The parameter:
Name = "productGroupId"
Value = 337
The execution:
public async Task ExecuteAsync(string query, Action<IDataReader> onExecute, params DataParameter[] parameters)
{
using(var command = builder.BuildCommand(query))
{
foreach(var parameter in parameters)
command.AddParameter(parameter);
if(!connection.IsOpen)
connection.Open();
await Task.Run(async () =>
{
using(var reader = await command.ExecuteAsync())
if(await reader.ReadAsync())
onExecute(reader);
});
}
}
And reading:
await executor.ExecuteAsync(query, async reader =>
{
do
{
products.Add(GetProduct(reader));
columnValues.Enqueue(GetColumnValues(reader).ToList());
} while(await reader.ReadAsync());
}, parameters.ToArray());
await reader.ReadAsync() returns true, but when GetProduct(reader) calls reader.GetInt32(0); for the 32nd time, it throws the exception.
It works fine if the data is less than 32 rows, or 265 in case of SequentialAccess.
I tried increasing the CommandTimeout, but it didn't help. When I swap the connection to OleDb again, it works just fine.
Thanks in advance.
EDIT
If I replace * in the query with just a few specific columns, it works. When I read ~12 columns, it fails, but later than row 32.
As per request, GetProduct:
private Product GetProduct(IDataReader productReader)
{
return new Product
{
Id = productReader.ReadLong(0),
Number = productReader.ReadString(2),
EanNo = productReader.ReadString(3),
Frequency = productReader.ReadInt(4),
Status = productReader.ReadInt(5),
NameId = productReader.ReadLong(6),
KMPI = productReader.ReadByte(7),
OEM = productReader.ReadString(8),
CurvesetId = productReader.ReadInt(9),
HasServiceInfo = Convert.ToBoolean(productReader.ReadByte(10)),
ColumnData = new List<ColumnData>()
};
}
GetColumnData:
private IEnumerable<long> GetColumnValues(IDataReader productReader)
{
var columnValues = new List<long>();
int columnIndex = 11;
while(!productReader.IsNull(columnIndex))
columnValues.Add(productReader.ReadLong(columnIndex++));
return columnValues;
}
And the adapter:
public long ReadLong(int columnIndex)
{
return reader.GetInt32(columnIndex);
}
Alright, it is getting long. :)
Thanks to @Igor, I tried creating a small working example. This seems to work fine:
private static async Task Run()
{
var result = new List<Product>();
string conString = #" ... ";
var con = new SqlConnection(conString);
con.Open();
using(var command = new SqlCommand("SELECT * FROM Products2 WHERE Products2.PgId = #p;", con))
{
command.Parameters.Add(new SqlParameter("p", 337));
await Task.Run(async () =>
{
using(var productReader = await command.ExecuteReaderAsync())
while(await productReader.ReadAsync())
{
result.Add(new Product
{
Id = productReader.GetInt32(0),
Number = productReader.GetString(2),
EanNo = productReader.GetString(3),
Frequency = productReader.GetInt16(4),
Status = productReader.GetInt16(5),
NameId = productReader.GetInt32(6),
KMPI = productReader.GetByte(7),
OEM = productReader.GetString(8),
CurvesetId = productReader.GetInt16(9),
HasServiceInfo = Convert.ToBoolean(productReader.GetByte(10))
});
GetColumnValues(productReader);
}
});
}
Console.WriteLine("End");
}
private static IEnumerable<long> GetColumnValues(SqlDataReader productReader)
{
var columnValues = new List<long>();
int columnIndex = 11;
while(!productReader.IsDBNull(columnIndex))
columnValues.Add(productReader.GetInt32(columnIndex++));
return columnValues;
}
}
Here's the data in Access, just in case:
I managed to fix the problem.
I still have questions though, since I don't understand why this didn't affect Access.
I changed:
public async Task ExecuteAsync(string query, Action<IDataReader> onExecute, params DataParameter[] parameters)
{
using(var command = builder.BuildCommand(query))
{
foreach(var parameter in parameters)
command.AddParameter(parameter);
if(!connection.IsOpen)
connection.Open();
await Task.Run(async () =>
{
using(var reader = await command.ExecuteAsync())
if(await reader.ReadAsync())
onExecute(reader);
});
}
}
To:
public async Task ExecuteAsync(string query, Func<IDataReader, Task> onExecute, params DataParameter[] parameters)
{
using(var command = builder.BuildCommand(query))
{
foreach(var parameter in parameters)
command.AddParameter(parameter);
if(!connection.IsOpen)
connection.Open();
using(var reader = await command.ExecuteAsync())
await onExecute(reader);
}
}
If someone can explain why this helped, I would really appreciate it.
I have two parts to my application, which do a massive amount of inserts and updates respectively, and because of poor management there's a deadlock.
I am using Entity Framework to do my inserts and updates.
The following is my code for my TestSpool program. The purpose of this program is to insert x records at a given interval.
using System;
using System.Linq;
using System.Threading;
using System.Transactions;
namespace TestSpool
{
class Program
{
static void Main(string[] args)
{
using (var db = new TestEntities())
{
decimal start = 700001;
while (true)
{
using (TransactionScope scope = new TransactionScope())
{
//Random ir = new Random();
//int i = ir.Next(1, 50);
var objs = db.BidItems.Where(m => m.BidItem_Close == false);
foreach (BidItem bi in objs)
{
for (int j = 0; j <= 10; j++)
{
Transaction t = new Transaction();
t.Item_Id = bi.BidItemId;
t.User_Id = "Ghost";
t.Trans_Price = start;
t.Trans_TimeStamp = DateTime.Now;
start += 10;
db.Transactions.AddObject(t);
}
Console.WriteLine("Test Spooled for item " + bi.BidItemId.ToString() + " of " + 11 + " bids");
db.SaveChanges();
}
scope.Complete();
Thread.Sleep(5000);
}
}
}
}
}
}
The second part of the program is the TestServerClass. The server class is supposed to process a huge number of transactions from TestSpool, determine the highest transaction amount, and update another table.
using System;
using System.Linq;
using System.Transactions;
public class TestServerClass
{
public void Start()
{
try
{
using (var db = new TestServer.TestEntities())
{
while (true)
{
using (TransactionScope scope = new TransactionScope())
{
var objsItems = db.BidItems.Where(m => m.BidItem_Close == false);
foreach (TestServer.BidItem bi in objsItems)
{
var trans = db.Transactions.Where(m => m.Trans_Proceesed == null && m.Item_Id == bi.BidItemId).OrderBy(m => m.Trans_TimeStamp).Take(100);
if (trans.Count() > 0)
{
var tran = trans.OrderByDescending(m => m.Trans_Price).FirstOrDefault();
// TestServer.BidItem bi = db.BidItems.FirstOrDefault(m => m.BidItemId == itemid);
if (bi != null)
{
bi.BidMinBid_LastBid_TimeStamp = tran.Trans_TimeStamp;
bi.BidMinBid_LastBidAmount = tran.Trans_Price;
bi.BidMinBid_LastBidBy = tran.User_Id;
}
foreach (var t in trans)
{
t.Trans_Proceesed = "1";
db.Transactions.ApplyCurrentValues(t);
}
db.BidItems.ApplyCurrentValues(bi);
Console.WriteLine("Processed " + trans.Count() + " bids for Item " + bi.BidItemId);
db.SaveChanges();
}
}
scope.Complete();
}
}
}
}
catch (Exception e)
{
Start();
}
}
}
However, as both applications run concurrently, they deadlock pretty quickly, at random, in either the test or the server application. How do I optimise my code on both sides to prevent deadlocks? I am expecting a huge number of inserts from the TestSpool application.
Since they work on the same data and get in each other's way, I believe the cleanest way to do this would be to avoid executing these two at the same time.
Define a global static variable, or a mutex, or a flag of some kind, maybe on the database. Whoever starts executing raises the flag; the other one waits for the flag to come down. When the flag comes down, the other one raises it and starts executing.
To avoid long wait times in each class, you can alter both classes to process only a limited number of records each turn. You should also introduce a maximum wait time for the flag. The maximum record limit should be chosen carefully to ensure that each class finishes its job in less time than the maximum wait.
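If both programs run on the same machine, one simple way to implement that flag is a named Mutex shared by both processes. A rough sketch; the mutex name and the 30-second wait are arbitrary choices, not part of the original code:
// Sketch: a cross-process named mutex so only one program touches the tables at a time.
using (var mutex = new System.Threading.Mutex(false, @"Global\BidProcessing"))
{
    if (mutex.WaitOne(TimeSpan.FromSeconds(30)))   // maximum wait time, as suggested above
    {
        try
        {
            // process a limited batch of records, SaveChanges, scope.Complete() here
        }
        finally
        {
            mutex.ReleaseMutex();
        }
    }
}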