Memory buffer and IO operations - C#

Are the following two code samples equal in terms of performance?
Code Sample 1:
var count = 9999999999;
using (var sw = new StreamWriter())
{
    for (int i = 0; i < count; i++)
    {
        var result = SomeRelativeLongOperation(i);
        sw.WriteLine(result);
    }
}
Code Sample 2:
var count = 9999999999;
var resultCollection = new ....
using (var sw = new StreamWriter())
{
    for (int i = 0; i < count; i++)
    {
        resultCollection.Add(SomeRelativeLongOperation(i));
        if (resultCollection.Count % 100 == 0)
        {
            WriteBlock(sw, resultCollection);
            resultCollection.Clear();
        }
    }
}
I know that Windows uses memory buffers for IO operations. So, when I call the StreamWriter.WriteLine method, it first stores the data in memory and then flushes it to the hard drive, right?

StreamWriter is already buffered, so adding an additional buffer is simply going to make it less efficient.
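For illustration, a minimal sketch of relying on StreamWriter's own buffering and simply sizing the buffer up if needed, rather than batching results into a separate collection (the file name and the 64 KB buffer size are made-up values for this example):

using System.IO;
using System.Text;

class BufferingDemo
{
    static void Main()
    {
        // Arbitrary path and buffer size, for illustration only.
        using (var sw = new StreamWriter("output.txt", false, Encoding.UTF8, 1 << 16))
        {
            sw.AutoFlush = false; // default: lines accumulate in the writer's buffer
            for (int i = 0; i < 1000000; i++)
            {
                sw.WriteLine(i); // buffered in memory; written to the OS in large blocks
            }
        } // Dispose flushes any remaining buffered data
    }
}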

C# Reading XPS causing memory leak

I have a simple program that just reads an XPS file. I've read the following post and it did solve part of the issue:
Opening XPS document in .Net causes a memory leak
class Program
{
    static int intCounter = 0;
    static object _intLock = new object();

    static int getInt()
    {
        lock (_intLock)
        {
            return intCounter++;
        }
    }

    static void Main(string[] args)
    {
        Console.ReadLine();
        for (int i = 0; i < 100; i++)
        {
            Thread t = new Thread(() =>
            {
                var ogXps = File.ReadAllBytes(@"C:\Users\Nathan\Desktop\Objective.xps");
                readXps(ogXps);
                Console.WriteLine(getInt().ToString());
            });
            t.SetApartmentState(ApartmentState.STA);
            t.Start();
            Thread.Sleep(50);
        }
        Console.ReadLine();
    }

    static void readXps(byte[] originalXPS)
    {
        try
        {
            MemoryStream inputStream = new MemoryStream(originalXPS);
            string memoryStreamUri = "memorystream://" + Path.GetFileName(Guid.NewGuid().ToString() + ".xps");
            Uri packageUri = new Uri(memoryStreamUri);
            Package oldPackage = Package.Open(inputStream);
            PackageStore.AddPackage(packageUri, oldPackage);
            XpsDocument xpsOld = new XpsDocument(oldPackage, CompressionOption.Normal, memoryStreamUri);
            FixedDocumentSequence seqOld = xpsOld.GetFixedDocumentSequence();

            // The following did solve some of the memory issue
            // -----------------------------------------------
            var docPager = seqOld.DocumentPaginator;
            docPager.ComputePageCount();
            for (int i = 0; i < docPager.PageCount; i++)
            {
                FixedPage fp = docPager.GetPage(i).Visual as FixedPage;
                fp.UpdateLayout();
            }
            seqOld = null;
            // -----------------------------------------------

            xpsOld.Close();
            oldPackage.Close();
            oldPackage = null;
            inputStream.Close();
            inputStream.Dispose();
            inputStream = null;
            PackageStore.RemovePackage(packageUri);
        }
        catch (Exception e)
        {
        }
    }
}
The program reads the XPS file a hundred times. [Screenshots omitted: memory profiler snapshots before and after applying the fix.]
So the fix suggested in that post did eliminate some objects. However, I found that objects like Dispatcher, ContextLayoutManager and MediaContext still exist in memory, and their count is exactly 100. Is this normal behavior or a memory leak? How do I fix this? Thanks.
25/7/2018 Update
Adding the line Dispatcher.CurrentDispatcher.InvokeShutdown(); did get rid of the Dispatcher, ContextLayoutManager and MediaContext objects. I don't know if this is an ideal way to fix it.
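For reference, a sketch of where such a call could sit, assuming it runs on the same STA thread that touched the WPF/XPS objects (the exact placement is my assumption, not something stated in the original post):

Thread t = new Thread(() =>
{
    var ogXps = File.ReadAllBytes(@"C:\Users\Nathan\Desktop\Objective.xps");
    readXps(ogXps);
    Console.WriteLine(getInt().ToString());

    // Shut down the Dispatcher that WPF implicitly created for this thread,
    // so its Dispatcher/ContextLayoutManager/MediaContext can be collected.
    Dispatcher.CurrentDispatcher.InvokeShutdown();
});
t.SetApartmentState(ApartmentState.STA);
t.Start();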
It looks like the classes you're left with come from the XpsDocument, which implements IDisposable, but you never dispose it. A few of the other classes involved implement that same interface as well. As a rule of thumb, either wrap such instances in a using statement so their Dispose method is guaranteed to get called, or call Dispose yourself.
An improved version of your readXps method will look like this:
static void readXps(byte[] originalXPS)
{
    try
    {
        using (MemoryStream inputStream = new MemoryStream(originalXPS))
        {
            string memoryStreamUri = "memorystream://" + Path.GetFileName(Guid.NewGuid().ToString() + ".xps");
            Uri packageUri = new Uri(memoryStreamUri);
            using (Package oldPackage = Package.Open(inputStream))
            {
                PackageStore.AddPackage(packageUri, oldPackage);
                using (XpsDocument xpsOld = new XpsDocument(oldPackage, CompressionOption.Normal, memoryStreamUri))
                {
                    FixedDocumentSequence seqOld = xpsOld.GetFixedDocumentSequence();

                    // The following did solve some of the memory issue
                    // -----------------------------------------------
                    var docPager = seqOld.DocumentPaginator;
                    docPager.ComputePageCount();
                    for (int i = 0; i < docPager.PageCount; i++)
                    {
                        FixedPage fp = docPager.GetPage(i).Visual as FixedPage;
                        fp.UpdateLayout();
                    }
                    seqOld = null;
                    // -----------------------------------------------
                } // disposes the XpsDocument
            } // disposes the Package
            PackageStore.RemovePackage(packageUri);
        } // disposes the MemoryStream
    }
    catch (Exception e)
    {
        // really do something here, at least:
        Debug.WriteLine(e);
    }
}
This should at least clean up most of the objects. I'm not sure if you're going to see the effects in your profiling, as that depends on whether the objects are actually collected during your analysis. Profiling a debug build might give unanticipated results.
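If you want to rule out "not collected yet" while profiling, one common pattern (a suggestion of mine, not part of the original answer) is to force a full blocking collection right before taking the memory snapshot:

// Force a full, blocking garbage collection before taking a snapshot,
// so anything still in memory is genuinely rooted rather than just not collected yet.
GC.Collect();
GC.WaitForPendingFinalizers();
GC.Collect();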
As the remainder of those object instances seems to be bound to the System.Windows.Threading.Dispatcher, I suggest you keep a reference to your Threads (though at this point you might consider looking into Tasks) and, once all threads are done, call the static ExitAllFrames on the Dispatcher.
Your main method will then look like this:
Console.ReadLine();
Thread[] all = new Thread[100];
for (int i = 0; i < all.Length; i++)
{
    var t = new Thread(() =>
    {
        var ogXps = File.ReadAllBytes(@"C:\Users\Nathan\Desktop\Objective.xps");
        readXps(ogXps);
        Console.WriteLine(getInt().ToString());
    });
    t.SetApartmentState(ApartmentState.STA);
    t.Start();
    all[i] = t; // keep reference
    Thread.Sleep(50);
}

foreach (var t in all) t.Join(); // https://stackoverflow.com/questions/263116/c-waiting-for-all-threads-to-complete
all = null; // meh
Dispatcher.ExitAllFrames(); // https://stackoverflow.com/a/41953265/578411

Console.ReadLine();

WCF streaming large number objects

I have a WCF service that queries a database and returns a large number of records. There are so many records that the server runs out of memory and fails before it can return them.
So I want to send the records back as I fetch them from the database, or a set number back at a time.
For additional clarity, I cannot collect all the fetched records into a collection on the server, as the server runs out of memory before I have collected all of them. I want to try and find a way to send them back one by one or in chunks, in one call.
For example, in chunks:
Fetch first 1000 records
Add to collection
Send collection to client
Clear collection
Fetch next 1000 records, and repeat from step 2
So my idea is that the web service code will look something like this:
public IEnumerable<Customer> GetAllCustomers()
{
    // Setup Query
    string query = PrepareQuery();

    // Create Connection
    connection = new SqlConnection(ConnectionString);
    connection.Open();
    var sqlcommand = connection.CreateCommand();
    sqlcommand.CommandText = query.ToString();

    // Read Results
    var reader = sqlcommand.ExecuteReader();
    while (reader.Read())
    {
        Customer customer = new Customer();
        foreach (var column in Columns)
        {
            int fieldIndex = reader.GetOrdinal(column);
            object value = reader.GetValue(fieldIndex);
            customer[column.Name] = value;
        }
        yield return customer;
    }
}
I don't want to consider paging, as the ORDER BY on the SQL server is slow.
I'm looking for a way to do this in WCF.
I think you answer your own question: there are two ways to do it, streaming or chunking.
You can do streaming in WCF - see https://learn.microsoft.com/en-us/dotnet/framework/wcf/feature-details/large-data-and-streaming
You get a Stream to write to, so you need to handle yourself how you are going to encode your data on that stream, and how you are going to decode it at the client.
The alternative is chunking/paging. You just modify your service so it accepts e.g. a page number or some other way to indicate which page is needed.
Which one you choose depends on the application: how much data is there? What is the nature of the client? Is it possible to use some field to page on? And so on.
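For the chunking route, a rough sketch of what a hypothetical contract and client loop could look like (the names GetCustomersAfter, batchSize and the Id property on Customer are assumptions for illustration; it pages on an indexed key rather than an arbitrary ORDER BY, which the question says is slow):

using System.Collections.Generic;
using System.ServiceModel;

[ServiceContract]
public interface ICustomerService
{
    // Hypothetical chunked operation: the client asks for the next batch,
    // passing the last primary key it has already received (keyset paging).
    [OperationContract]
    List<Customer> GetCustomersAfter(long lastCustomerId, int batchSize);
}

public static class ChunkedClient
{
    public static void ReadAll(ICustomerService client)
    {
        long lastId = 0;
        while (true)
        {
            var chunk = client.GetCustomersAfter(lastId, 1000);
            foreach (var customer in chunk)
            {
                // process the record here
            }
            if (chunk.Count < 1000)
                break;                          // last, incomplete batch: we're done
            lastId = chunk[chunk.Count - 1].Id; // assumes Customer exposes its key as Id
        }
    }
}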
Here is some pseudo code for making a stream that can do this on the server side. It is based on the example here: https://learn.microsoft.com/en-us/dotnet/framework/wcf/feature-details/how-to-enable-streaming
I'm not writing the full compilable code for you, but this is the gist of it.
In the server:
public Stream GetBigData()
{
return new BigDataStream();
}
BigDataStream (the non-implemented methods are not shown):
class BigDataStream : Stream
{
    public BigDataStream()
    {
        // open DB connection
        // run your query
        // get a DataReader
    }

    // you need a buffer to encode your data between calls to Read
    List<byte> _encodeBuffer = new List<byte>();

    public override int Read(byte[] buffer, int offset, int count)
    {
        // read from the DataReader and populate the _encodeBuffer
        // until the _encodeBuffer contains at least count bytes
        // (or until there are no more records)
        // for example:
        while (_encodeBuffer.Count < count && _reader.Read())
        {
            // (1)
            // encode the record into a byte array. How to do this?
            // you can read into a class and then use the data
            // contract serialization for example. If you do this, you
            // will probably find it easier to prepend an integer which
            // specifies the length of the following encoded message.
            // This will make it easier for the client to deserialize it.

            // (2)
            // append the encoded record bytes (plus any length prefix
            // etc) to _encodeBuffer
        }

        // remove up to the first count bytes from _encodeBuffer
        // and copy them into buffer at the offset requested
        // return the number of bytes added
    }

    public override void Close()
    {
        // close the reader + db connection
        base.Close();
    }
}
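On the client side you then need to undo whatever encoding you chose. Here is a sketch of reading the length-prefixed framing hinted at in the comments above (the 4-byte little-endian prefix and the helper names are my assumptions, not part of the answer):

// Reads one length-prefixed record from the WCF response stream.
// Returns null at a clean end of stream. Assumes each record was written as
// [4-byte little-endian length][payload], matching the server-side sketch.
static byte[] ReadNextRecord(Stream stream)
{
    byte[] lengthPrefix = new byte[4];
    int read = ReadExactly(stream, lengthPrefix, 4);
    if (read == 0)
        return null;
    if (read < 4)
        throw new EndOfStreamException("Truncated length prefix.");

    int length = BitConverter.ToInt32(lengthPrefix, 0);
    byte[] payload = new byte[length];
    if (ReadExactly(stream, payload, length) < length)
        throw new EndOfStreamException("Truncated record payload.");
    return payload; // hand this to your deserializer (e.g. a DataContractSerializer)
}

// Stream.Read may return fewer bytes than asked for, so loop until done.
static int ReadExactly(Stream stream, byte[] buffer, int count)
{
    int total = 0;
    while (total < count)
    {
        int n = stream.Read(buffer, total, count - total);
        if (n == 0)
            break;
        total += n;
    }
    return total;
}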
Thanks to mikelegg & Reniuz for helping me come to a solution. I wish I could give them the tick for the right answer, but I am afraid the next developer to read this question would not fully benefit. So here is what I ended up with.
Setup the config files for the Server and Client (Follow link: Large Data and Streaming)
Followed this solution, can download source code from here
I had to change the DBRowStream.DBThreadProc method a bit to work so I post the source code:
DBRowStream Class:
void DBThreadProc(object o)
{
    SqlConnection con = null;
    SqlCommand com = null;
    try
    {
        con = new System.Data.SqlClient.SqlConnection(/*ConnectionString*/);
        com = new SqlCommand();
        com.Connection = con;
        com.CommandText = PrepareQuery();
        con.Open();
        SqlDataReader reader = com.ExecuteReader();

        int count = 0;

        MemoryStream memStream = memStream1;
        memStreamWriteStatus = 1;
        readyToWriteToMemStream1.WaitOne();
        while (reader.Read())
        {
            // Populate
            Customer customer = new Customer();
            foreach (var column in Columns)
            {
                int fieldIndex = reader.GetOrdinal(column);
                object value = reader.GetValue(fieldIndex);
                customer[column.Name] = value;
            }

            // Serialize: I used a custom Serializer
            // but BinaryFormatter should be fine
            DBDataFormatter.Serialize(memStream, customer);

            count++;
            if (count == PAGESIZE) // const int PAGESIZE = 10000
            {
                switch (memStreamWriteStatus)
                {
                    case 1: // done writing to stream 1
                        {
                            memStream1.Position = 0;
                            readyToSendFromMemStream1.Set();
                            // write stream 1 is done...waiting for stream 2
                            readyToWriteToMemStream2.WaitOne();
                            memStream = memStream2;
                            memStream.Position = 0;
                            memStream.SetLength(0); // Added: to reset the stream. Else was getting garbage data back
                            memStreamWriteStatus = 2;
                            break;
                        }
                    case 2: // done writing to stream 2
                        {
                            memStream2.Position = 0;
                            readyToSendFromMemStream2.Set();
                            // write on stream 2 is done...waiting for stream 1
                            readyToWriteToMemStream1.WaitOne();
                            // done waiting for stream 1
                            memStream = memStream1;
                            memStreamWriteStatus = 1;
                            memStream.Position = 0;
                            memStream.SetLength(0); // Added: reset the stream. Else was getting garbage data back
                            break;
                        }
                }
                count = 0;
            }
        }

        if (count > 0)
        {
            switch (memStreamWriteStatus)
            {
                case 1: // done writing to stream 1
                    {
                        memStream1.Position = 0;
                        readyToSendFromMemStream1.Set();
                        // END write stream 1 is done...waiting for stream 2
                        break;
                    }
                case 2: // done writing to stream 2
                    {
                        memStream2.Position = 0;
                        readyToSendFromMemStream2.Set();
                        // END write stream 2 is done...waiting for stream 1
                        break;
                    }
            }
        }

        bDoneWriting = true;
        bCanRead = false;
    }
    catch
    {
        throw;
    }
    finally
    {
        if (com != null)
        {
            com.Dispose();
            com = null;
        }
        if (con != null)
        {
            con.Close();
            con.Dispose();
            con = null;
        }
    }
}
And then the Client side:
private static void TestGetRecordsAndDump()
{
    const string FILE_NAME = "Records.CSV";
    File.Delete(FILE_NAME);
    var file = File.AppendText(FILE_NAME);
    long count = 0;
    try
    {
        var service = new ServiceReference1.DataServiceClient();
        var stream = service.GetDBRowStream();

        Console.WriteLine("Records Retrieved : ");
        Console.WriteLine("File Size (MB) : ");

        var canDoLastRead = true;
        while (stream.CanRead && canDoLastRead)
        {
            try
            {
                // Used a custom Deserializer, but BinaryFormatter should be fine
                Customer customer = DBDataFormatter.Deserialize(stream);
                file.Write(customer.ToString());
                count++;
            }
            catch
            {
                // Bug: stream.CanRead is not set to false at the end of the stream,
                // so this trick is used to know when all records have been returned.
                canDoLastRead = false;
            }
            finally
            {
                Console.SetCursorPosition("Records Retrieved : ".Length, 0);
                Console.Write(string.Format("{0} ", count));
                Console.SetCursorPosition("File Size (MB) : ".Length, 1);
                Console.Write(string.Format("{0:G} ", file.BaseStream.Length / 1024f / 1024f));
            }
        }
    }
    finally
    {
        file.Close();
    }
}
There is a bug I cannot seem to solve: stream.CanRead is not set to false when all the records have been returned. I have not been able to work out why, but at least now I can query large data sets and return all records without the server or client running out of memory.

Capture Slow Output from Method

I have a slow running utility method that logs one line of output at a time. I need to be able to output each of those lines and then read them from other locations in code. I have attempted using Tasks and Streams similar to the code below:
public static Task SlowOutput(Stream output)
{
    Task result = new Task(() =>
    {
        using (StreamWriter sw = new StreamWriter(output))
        {
            for (var i = 0; i < int.MaxValue; i++)
            {
                sw.WriteLine(i.ToString());
                System.Threading.Thread.Sleep(1000);
            }
        }
    });
    result.Start(); // start the task and hand it back so the caller can observe completion
    return result;
}
And then called like this:
MemoryStream ms = new MemoryStream();
var t = SlowOutput(ms);
using (var sr = new StreamReader(ms))
{
    while (!t.IsCompleted)
    {
        Console.WriteLine(sr.ReadLine());
    }
}
But of course, sr.ReadLine() is always empty because as soon as the method's sw.WriteLine() is called, it changes the position of the underlying stream to the end.
What I'm trying to do is pipe the output of the stream by maybe queueing up the characters that the method outputs and then consuming them from outside the method. Streams don't seem to be the way to go.
Is there a generally accepted way to do this?
What I would do is switch to a BlockingCollection<String>.
public static Task SlowOutput(BlockingCollection<string> output)
{
    return Task.Run(() =>
    {
        for (var i = 0; i < int.MaxValue; i++)
        {
            output.Add(i.ToString());
            System.Threading.Thread.Sleep(1000);
        }
        output.CompleteAdding();
    });
}
consumed by
var bc = new BlockingCollection<string>();
SlowOutput(bc);

// GetConsumingEnumerable blocks till an item is added to the collection. It leaves the
// foreach loop after CompleteAdding() is called and there are no more items to process.
foreach (var line in bc.GetConsumingEnumerable())
{
    Console.WriteLine(line);
}

DbContext OutOfMemoryException

I have a DbContext with a dataset of more than 20M records that has to be converted to a different data format. Therefore, I read the data into memory, perform some tasks and then dispose the DbContext. The code works fine, but after a while I get OutOfMemoryExceptions. I have been able to narrow it down to the following piece of code, where I retrieve 2M records, then release them and fetch them again. The first retrieval works just fine, the second one throws an exception.
// first call runs fine
using (var dbContext = new CustomDbContext())
{
    var list = dbContext.Items.Take(2000000).ToArray();
    foreach (var item in list)
    {
        // perform conversion tasks...
        item.Converted = true;
    }
}

// second call throws exception
using (var dbContext = new CustomDbContext())
{
    var list = dbContext.Items.Take(2000000).ToArray();
    foreach (var item in list)
    {
        // perform conversion tasks...
        item.Converted = true;
    }
}
Shouldn't the GC automatically release all memory allocated in the first using block, so that the second block runs as well as the first one?
In my actual code, I do not retrieve 2 million records at once, but something between 0 and 30K in each iteration. However, after about 15 minutes, I run out of memory, although all objects should have been released.
I suspect you have hit the LOH (Large Object Heap). Your objects are probably bigger than the 85,000-byte threshold and end up there, and since the LOH is not compacted by default, the GC doesn't help here.
Try this: https://www.simple-talk.com/dotnet/.net-framework/large-object-heap-compaction-should-you-use-it/
and see if your exception goes away.
i.e. add this between first and second part:
GCSettings.LargeObjectHeapCompactionMode = GCLargeObjectHeapCompactionMode.CompactOnce;
GC.Collect();
IEnumerable has GetEnumerator(), so you could try this to avoid the .ToArray() or .ToList() calls, which aren't necessary if you just want to read:
// first call
using (var dbContext = new CustomDbContext())
{
    foreach (var item in dbContext.Items.Take(2000000))
    {
        // perform conversion tasks...
        item.Converted = true;
    }
}

// second call
using (var dbContext = new CustomDbContext())
{
    foreach (var item in dbContext.Items.Take(2000000))
    {
        // perform conversion tasks...
        item.Converted = true;
    }
}
Running the GC will not help you; you have to run each iteration in a different context, and dispose of the context.
// ID is your primary key
long startID = 0;
while (true)
{
    using (var db = new CustomDbContext())
    {
        var slice = db.Items.Where(x => x.ID > startID)
                            .OrderBy(x => x.ID)
                            .Take(1000).ToList();

        // stop if there is nothing to process
        if (!slice.Any())
            break;

        foreach (var item in slice)
        {
            // your logic...
            item.Converted = true;
        }

        startID = slice.Last().ID;
    }
}
If you want to process these things faster, an alternative approach would be to run the slices in parallel.
Alternative approach
I would recommend dividing the data into slices of 100 x 100, so that 100 slices of 100 items can be processed in parallel.
You can always easily customize the slicing to meet your speed needs.
// extension method: must live in a static class so src.Slice(100) works below
public static IEnumerable<IEnumerable<T>> Slice<T>(this IEnumerable<T> src, int size)
{
    while (src.Any())
    {
        var s = src.Take(size);
        src = src.Skip(size);
        yield return s;
    }
}
long startID = 0;
while (true)
{
    using (var db = new CustomDbContext())
    {
        var src = db.Items.Where(x => x.ID > startID)
                          .OrderBy(x => x.ID)
                          .Take(10000).Select(x => x.ID).ToList();

        // stop if there is nothing to process
        if (!src.Any())
            break;

        Parallel.ForEach(src.Slice(100), slice =>
        {
            using (var sdb = new CustomDbContext())
            {
                foreach (var item in sdb.Items.Where(x => slice.Contains(x.ID)))
                {
                    item.Converted = true;
                }
            }
        });

        startID = src.Last();
    }
}
After refactoring, memory gets released. I don't know why, but it works.
private static void Debug()
{
    var iteration = 0;
    while (true)
    {
        Console.WriteLine("Iteration {0}", iteration++);
        Convert();
    }
}

private static void Convert()
{
    using (var dbContext = new CustomDbContext(args[0]))
    {
        var list = dbContext.Items.Take(2000000).ToList();
        foreach (var item in list)
        {
            item.Converted = true;
        }
    }
}
When I move the content of Convert() into the while loop in Debug(), an OutOfMemoryException is thrown.
private static void Debug()
{
    var iteration = 0;
    while (true)
    {
        Console.WriteLine("Iteration {0}", iteration++);
        using (var dbContext = new CustomDbContext(args[0]))
        {
            // OutOfMemoryException in second iteration
            var list = dbContext.Items.Take(2000000).ToList();
            foreach (var item in list)
            {
                item.Converted = true;
            }
        }
    }
}

Async I/O intensive code is running slower than non-async, why?

I am refactoring an application and trying to add an asynchronous version of an existing function to improve performance times in an ASP.NET MVC application. I understand that there is an overhead involved with asynchronous functions, but I expected that with enough iterations, the I/O intensive nature of loading the data from the database would more than compensate for the overhead penalty and that I would receive significant performance gains.
The TermusRepository.LoadByTermusId function loads data by retrieving a bunch of datatables from the database (using ADO.NET and the Oracle Managed Client), populates a model, and returns it. TermusRepository.LoadByTermusIdAsync is similar, except it does so asynchronously, with a slightly different method of loading up the datatable download tasks when there are multiple datatables to retrieve.
public async Task<ActionResult> AsyncPerformanceTest()
{
    var vm = new AsyncPerformanceTestViewModel();

    Stopwatch watch = new Stopwatch();
    watch.Start();
    for (int i = 0; i < 60; i++)
    {
        TermusRepository.LoadByTermusId<Termus2011_2012EndYear>("1");
        TermusRepository.LoadByTermusId<Termus2011_2012EndYear>("5");
        TermusRepository.LoadByTermusId<Termus2011_2012EndYear>("6");
        TermusRepository.LoadByTermusId<Termus2011_2012EndYear>("7");
    }
    watch.Stop();
    vm.NonAsyncElapsedTime = watch.Elapsed;

    watch.Reset();
    watch.Start();
    var tasks = new List<Task<Termus2011_2012EndYear>>();
    for (int i = 0; i < 60; i++)
    {
        tasks.Add(TermusRepository.LoadByTermusIdAsync<Termus2011_2012EndYear>("1"));
        tasks.Add(TermusRepository.LoadByTermusIdAsync<Termus2011_2012EndYear>("5"));
        tasks.Add(TermusRepository.LoadByTermusIdAsync<Termus2011_2012EndYear>("6"));
        tasks.Add(TermusRepository.LoadByTermusIdAsync<Termus2011_2012EndYear>("7"));
    }
    await Task.WhenAll(tasks.ToArray());
    watch.Stop();
    vm.AsyncElapsedTime = watch.Elapsed;

    return View(vm);
}
public static async Task<T> LoadByTermusIdAsync<T>(string termusId) where T : Appraisal
{
    var AppraisalHeader = new OracleCommand("select tu.termus_id, tu.manager_username, tu.evaluee_name, tu.evaluee_username, tu.termus_complete_date, termus_start_date, tu.termus_status, tu.termus_version, tn.managername from tercons.termus_users tu left outer join tercons.termus_names tn on tu.termus_id=tn.termus_id where tu.termus_id=:termusid");
    AppraisalHeader.BindByName = true;
    AppraisalHeader.Parameters.Add("termusid", termusId);
    var dt = await Database.GetDataTableAsync(AppraisalHeader);

    T Termus = Activator.CreateInstance<T>();
    var row = dt.AsEnumerable().Single();
    Termus.TermusId = row.Field<decimal>("termus_id").ToString();
    Termus.ManagerUsername = row.Field<string>("manager_username");
    Termus.EvalueeUsername = row.Field<string>("evaluee_username");
    Termus.EvalueeName = row.Field<string>("evaluee_name");
    Termus.ManagerName = row.Field<string>("managername");
    Termus.TERMUSCompleteDate = row.Field<DateTime?>("termus_complete_date");
    Termus.TERMUSStartDate = row.Field<DateTime>("termus_start_date");
    Termus.Status = row.Field<string>("termus_status");
    Termus.TERMUSVersion = row.Field<string>("termus_version");
    Termus.QuestionsAndAnswers = new Dictionary<string, string>();

    var RetrieveQuestionIdsCommand = new OracleCommand("select termus_question_id from tercons.termus_questions where termus_version=:termus_version");
    RetrieveQuestionIdsCommand.BindByName = true;
    RetrieveQuestionIdsCommand.Parameters.Add("termus_version", Termus.TERMUSVersion);
    var QuestionIdsDt = await Database.GetDataTableAsync(RetrieveQuestionIdsCommand);
    var QuestionIds = QuestionIdsDt.AsEnumerable().Select(r => r.Field<string>("termus_question_id"));

    // There's about 60 questions/answers, so this should result in 60 calls to the database.
    // It'd be a good spot to combine into a single DB call, but left it this way so I could
    // see if async would speed it up, for learning purposes.
    var DownloadAnswersTasks = new List<Task<DataTable>>();
    foreach (var QuestionId in QuestionIds)
    {
        var RetrieveAnswerCommand = new OracleCommand("select termus_response, termus_question_id from tercons.termus_responses where termus_id=:termus_id and termus_question_id=:questionid");
        RetrieveAnswerCommand.BindByName = true;
        RetrieveAnswerCommand.Parameters.Add("termus_id", termusId);
        RetrieveAnswerCommand.Parameters.Add("questionid", QuestionId);
        DownloadAnswersTasks.Add(Database.GetDataTableAsync(RetrieveAnswerCommand));
    }

    while (DownloadAnswersTasks.Count > 0)
    {
        var FinishedDownloadAnswerTask = await Task.WhenAny(DownloadAnswersTasks);
        DownloadAnswersTasks.Remove(FinishedDownloadAnswerTask);
        var AnswerDt = await FinishedDownloadAnswerTask;
        var Answer = AnswerDt.AsEnumerable().Select(r => r.Field<string>("termus_response")).SingleOrDefault();
        var QuestionId = AnswerDt.AsEnumerable().Select(r => r.Field<string>("termus_question_id")).SingleOrDefault();
        if (!String.IsNullOrEmpty(Answer))
        {
            Termus.QuestionsAndAnswers.Add(QuestionId, System.Net.WebUtility.HtmlDecode(Answer));
        }
    }

    return Termus;
}
public static async Task<DataTable> GetDataTableAsync(OracleCommand command)
{
    DataTable dt = new DataTable();
    using (var connection = GetDefaultOracleConnection())
    {
        command.Connection = connection;
        await connection.OpenAsync();
        dt.Load(await command.ExecuteReaderAsync());
    }
    return dt;
}
public static T LoadByTermusId<T>(string TermusId) where T : Appraisal
{
    var RetrieveAppraisalHeaderCommand = new OracleCommand("select tu.termus_id, tu.manager_username, tu.evaluee_name, tu.evaluee_username, tu.termus_complete_date, termus_start_date, tu.termus_status, tu.termus_version, tn.managername from tercons.termus_users tu left outer join tercons.termus_names tn on tu.termus_id=tn.termus_id where tu.termus_id=:termusid");
    RetrieveAppraisalHeaderCommand.BindByName = true;
    RetrieveAppraisalHeaderCommand.Parameters.Add("termusid", TermusId);
    var AppraisalHeaderDt = Database.GetDataTable(RetrieveAppraisalHeaderCommand);

    T Termus = Activator.CreateInstance<T>();
    var AppraisalHeaderRow = AppraisalHeaderDt.AsEnumerable().Single();
    Termus.TermusId = AppraisalHeaderRow.Field<decimal>("termus_id").ToString();
    Termus.ManagerUsername = AppraisalHeaderRow.Field<string>("manager_username");
    Termus.EvalueeUsername = AppraisalHeaderRow.Field<string>("evaluee_username");
    Termus.EvalueeName = AppraisalHeaderRow.Field<string>("evaluee_name");
    Termus.ManagerName = AppraisalHeaderRow.Field<string>("managername");
    Termus.TERMUSCompleteDate = AppraisalHeaderRow.Field<DateTime?>("termus_complete_date");
    Termus.TERMUSStartDate = AppraisalHeaderRow.Field<DateTime>("termus_start_date");
    Termus.Status = AppraisalHeaderRow.Field<string>("termus_status");
    Termus.TERMUSVersion = AppraisalHeaderRow.Field<string>("termus_version");
    Termus.QuestionsAndAnswers = new Dictionary<string, string>();

    var RetrieveQuestionIdsCommand = new OracleCommand("select termus_question_id from tercons.termus_questions where termus_version=:termus_version");
    RetrieveQuestionIdsCommand.BindByName = true;
    RetrieveQuestionIdsCommand.Parameters.Add("termus_version", Termus.TERMUSVersion);
    var QuestionIdsDt = Database.GetDataTable(RetrieveQuestionIdsCommand);
    var QuestionIds = QuestionIdsDt.AsEnumerable().Select(r => r.Field<string>("termus_question_id"));

    // There's about 60 questions/answers, so this should result in 60 calls to the database.
    // It'd be a good spot to combine into a single DB call, but left it this way so I could
    // see if async would speed it up, for learning purposes.
    foreach (var QuestionId in QuestionIds)
    {
        var RetrieveAnswersCommand = new OracleCommand("select termus_response from tercons.termus_responses where termus_id=:termus_id and termus_question_id=:questionid");
        RetrieveAnswersCommand.BindByName = true;
        RetrieveAnswersCommand.Parameters.Add("termus_id", TermusId);
        RetrieveAnswersCommand.Parameters.Add("questionid", QuestionId);
        var AnswersDt = Database.GetDataTable(RetrieveAnswersCommand);
        var Answer = AnswersDt.AsEnumerable().Select(r => r.Field<string>("termus_response")).SingleOrDefault();
        if (!String.IsNullOrEmpty(Answer))
        {
            Termus.QuestionsAndAnswers.Add(QuestionId, System.Net.WebUtility.HtmlDecode(Answer));
        }
    }

    return Termus;
}
public static DataTable GetDataTable(OracleCommand command)
{
    DataTable dt = new DataTable();
    using (var connection = GetDefaultOracleConnection())
    {
        command.Connection = connection;
        connection.Open();
        dt.Load(command.ExecuteReader());
    }
    return dt;
}

public static OracleConnection GetDefaultOracleConnection()
{
    return new OracleConnection(ConfigurationManager.ConnectionStrings[connectionstringname].ConnectionString);
}
Results for 60 iterations are:

Non-async: 18.4375460 seconds
Async: 19.8092854 seconds
The results of this test are consistent. No matter how many iterations I go through in the for loop of the AsyncPerformanceTest() action method, the async version runs about 1 second slower than the non-async one. (I run the test multiple times in a row to account for the JITter warming up.) What am I doing wrong that's causing the async to be slower than the non-async? Am I misunderstanding something fundamental about writing asynchronous code?
The asynchronous version will always be slower than the synchronous version when there is no concurrency. It's doing all of the same work as the non-async version, but with a small amount of overhead added to manage the asynchrony.
Asynchrony is advantageous, with respect to performance, because it improves availability. Each individual request will be slower, but if you make 1000 requests at the same time, the asynchronous implementation will be able to handle them all more quickly (at least in certain circumstances).
This happens because the asynchronous solution allows the thread that was allocated to handle the request to go back to the pool and handle other requests, whereas the synchronous solution forces the thread to sit there and do nothing while it waits for the operation to complete. There is overhead in structuring the program in a way that allows the thread to be freed up to do other work, but the advantage is the ability of that thread to go do other work. In your program there is no other work for the thread to do, so it ends up being a net loss.
Turns out the Oracle Managed Driver is "fake async", which would partially explain why my async code is running slower.
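"Fake async" usually means the Task-returning methods either complete synchronously or just push the blocking call onto a thread-pool thread, so no thread is actually released while the I/O is in flight. A rough illustration of the difference, using hypothetical helper methods rather than the actual ODP.NET internals:

// Truly asynchronous: the calling thread is released while the I/O is in flight.
public async Task<DataTable> GetDataTableTrulyAsync(DbCommand command)
{
    var dt = new DataTable();
    using (var reader = await command.ExecuteReaderAsync()) // genuine async I/O
    {
        dt.Load(reader);
    }
    return dt;
}

// "Fake" async: the same blocking call, just moved to a thread-pool thread.
// You pay the Task overhead but still occupy a thread for the full duration.
public Task<DataTable> GetDataTableFakeAsync(DbCommand command)
{
    return Task.Run(() =>
    {
        var dt = new DataTable();
        using (var reader = command.ExecuteReader())
        {
            dt.Load(reader);
        }
        return dt;
    });
}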
