Same method in Azure Function takes longer - C#

I was curious about how much the response times would be improved when moving an intensive method to an Azure Function.
So I created two small applications. The first one is an ASP.NET MVC application (App Service plan S1). The second one is an ASP.NET MVC application with one Azure Function (in a Function project): an S1 App Service plan combined with a Consumption plan for the Function App.
Both applications are identical except for one method, MergeDocumentsAndExport, which is moved to an Azure Function.
In my application a given number of Document records are created on application start.
public class Document {
[Key]
public int Id { get; set; }
public string Name { get; set; }
public byte[] Content { get; set; }
public DateTime DateCreated { get; set; }
}
The end user is able to download all the Documents in the database by clicking a button in a view. When this button is clicked, the method MergeDocumentsAndExport is called.
Without Azure Functions
public byte[] MergeDocumentsAndExport()
{
var startmoment = DateTime.Now;
var baseDocument = new Spire.Doc.Document();
var documents = _documentDataService.GetAllDocuments();
foreach (var document in documents)
{
using (var memoryStream = new MemoryStream(document.Content))
{
var documentToLoad = new Spire.Doc.Document();
documentToLoad.LoadFromStream(memoryStream, FileFormat.Doc);
foreach (Section section in documentToLoad.Sections)
{
baseDocument.Sections.Add(section.Clone());
}
}
}
byte[] byteArrayToReturn;
using (var memoryStream = new MemoryStream())
{
baseDocument.SaveToStream(memoryStream, FileFormat.Doc);
byteArrayToReturn = memoryStream.ToArray();
}
_responseLogger.Log(startmoment, DateTime.Now, nameof(MergeDocumentsAndExport));
return byteArrayToReturn;
}
The _responseLogger.Log(..) method logs the start and end of the method together with the method name and determines the execution time (maybe ResponseLogger isn't the best name for this service).
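For context, a minimal sketch of what such a logger could look like (the interface shape and the Trace output are my assumptions, not the actual implementation):

using System;

// Hypothetical sketch of the response logger described above: it records the
// start and end of a call together with the method name and the elapsed time.
public interface IResponseLogger
{
    void Log(DateTime start, DateTime end, string methodName);
}

public class TraceResponseLogger : IResponseLogger
{
    public void Log(DateTime start, DateTime end, string methodName)
    {
        var elapsed = end - start;
        System.Diagnostics.Trace.TraceInformation(
            "{0}: started {1:O}, ended {2:O}, took {3:N0} ms",
            methodName, start, end, elapsed.TotalMilliseconds);
    }
}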
With Azure Functions
The same method is transformed into an HTTP-triggered Azure Function.
[DependencyInjectionConfig(typeof(DependencyConfig))]
public static class MergeDocumentsAndExportFunction
{
[FunctionName("MergeDocumentsAndExportFunction")]
public static HttpResponseMessage Run(
[HttpTrigger(AuthorizationLevel.Function, "get", Route = "MergeDocumentsAndExport")]HttpRequestMessage req,
TraceWriter log,
[Inject]IDocumentDataService documentDataService,
[Inject]IResponseLogger responseLogger)
{
var startmoment = DateTime.Now;
log.Info("MergeDocumentsAndExportFunction processed a request.");
var baseDocument = new Document();
var documents = documentDataService.GetAllDocuments();
foreach (var document in documents)
{
using (var memoryStream = new MemoryStream(document.Content))
{
var documentToLoad = new Document();
documentToLoad.LoadFromStream(memoryStream, FileFormat.Doc);
foreach (Section section in documentToLoad.Sections)
{
baseDocument.Sections.Add(section.Clone());
}
}
}
using (var memoryStream = new MemoryStream())
{
baseDocument.SaveToStream(memoryStream, FileFormat.Doc);
// Create response to send document as byte[] back.
var response = req.CreateResponse(HttpStatusCode.OK);
var buffer = memoryStream.ToArray();
var contentLength = buffer.Length;
response.Content = new StreamContent(new MemoryStream(buffer));
response.Content.Headers.ContentType = new MediaTypeHeaderValue("application/ms-word");
response.Content.Headers.ContentLength = contentLength;
responseLogger.Log(startmoment, DateTime.Now, "MergeDocumentsAndExport");
return response;
}
}
}
In the MergeDocumentsAndExportFunction a bit of extra code is added, such as creating and returning an HttpResponseMessage. I didn't implement async calls, because this was only a minor test and I wanted to compare the synchronous execution time of the MergeDocumentsAndExport method in both environments.
The results I got were not what I was expecting. In almost every case the execution time was about the same or much longer for the Azure Function. I know there is a startup (cold start) time for an Azure Function, and I excluded those runs. Maybe the dependency injection with Autofac takes some time? The outcomes in seconds when 1000 Documents from the database are merged and exported:
Without Azure Function:
49,34
50,21
51,26
49,00
50,21
50,68
...and so on...
Average: 49,69 seconds.
With Azure Function:
133,64 (startup, excluded)
77,68
85,18
66,46
86,00
65,17
...and so on...
Average: 82,69 seconds.
The execution time of the same method in different environments can differ by 30 seconds when merging 1000 documents into one. How is it possible that the Azure Function takes longer to execute (+30 seconds)? Am I using it wrong? Thanks.

Azure Functions on the Consumption plan aren't well suited to improving the response time of long-running, CPU-intensive workloads.
Based on my observation, the instances have quite moderate performance, so the request duration probably won't go down compared to a fixed-plan App Service.
The strength of Functions is in short-lived executions with small or variable throughput.
If you still want to compare, you could deploy the same Function App on a fixed App Service plan, measure the timings there, and then choose what suits you best.

If an Azure Function on the Consumption plan is idle, it releases its reserved compute back to the host. When a new request comes in, it takes some time for the function to start up and acquire new compute.
You can run a Function App on an App Service plan with the "Always On" setting instead of the Consumption plan, and it should go faster.
https://learn.microsoft.com/en-us/azure/azure-functions/functions-scale

Related

AmazonS3Client Single connection Vs new connection for each call C#

I am using AmazonS3Client to read/write data to S3 object storage. In my code I am creating a new connection every time I do operations like Read, List Buckets, Upload, Rename, Delete, etc. After deploying my application to production I encountered some performance issues. After going through a few blogs, it was recommended to use a single AmazonS3Client connection. My code is below.
For every CRUD operation below, I am creating a new connection and then disposing it with a using block. I am planning to have a single connection and use it without a using block on every call. Is maintaining a single connection a good choice? I have ~400 users accessing the application at the same time.
public ObjectFileInfo(string path)
{
StorageClient = ObjectFileManager.GetClient();
objectFileInfo = ObjectFileManager.getFileInfo(StorageClient, path);
}
public class ObjectFileManager
{
public static Amazon.S3.AmazonS3Client GetClient()
{
AmazonS3Config Config = new AmazonS3Config();
AmazonS3Client StorageClient;
Config.RegionEndpoint = null;
Config.ServiceURL = ConfigurationManager.NGDMSobjECSEndPoint;
Config.AllowAutoRedirect = true;
Config.ForcePathStyle = true;
Config.Timeout = TimeSpan.FromMinutes(30);
StorageClient = new AmazonS3Client(ConfigurationManager.NGDMSobjECSUser, ConfigurationManager.NGDMSobjECSKey, Config);
return StorageClient;
}
public static string[] ListBuckets()
{
ListBucketsResponse Response;
//Creating AmazonS3Client and disposing it in using
using (AmazonS3Client StorageClient = GetClient())
{
Response = StorageClient.ListBuckets();
}
var BucketNames = from Bucket in Response.Buckets select Bucket.BucketName;
return BucketNames.ToArray();
}
public static bool DeleteFile(string keyName)
{
var delRequest = new DeleteObjectRequest
{
BucketName = bucketName,
Key = keyName
};
//Creating AmazonS3Client and disposing it in using
using (AmazonS3Client StorageClient = GetClient())
{
StorageClient.DeleteObject(delRequest);
}
return true;
}
}
I am planning to use a Singleton as below and remove the using block:
class S3ObjectStorageClient
{
/// <summary>
/// Singleton implementation of Object Storage Client
/// </summary>
private S3ObjectStorageClient()
{
}
public static AmazonS3Client Client
{
get
{
return S3Client.clientInstance;
}
}
/// <summary>
/// Nested private class to ensure Singleton
/// </summary>
private class S3Client
{
static S3Client()
{
}
internal static readonly AmazonS3Client clientInstance = ObjectFileManager.GetClient();
}
}
public ObjectFileInfo(string path)
{
StorageClient = S3ObjectStorageClient.Client; //Singleton
objectFileInfo = ObjectFileManager.getFileInfo(StorageClient, path);
}
public static string[] ListBuckets()
{
ListBucketsResponse Response;
//Singleton and removed using block
AmazonS3Client StorageClient = S3ObjectStorageClient.Client;
Response = StorageClient.ListBuckets();
var BucketNames = from Bucket in Response.Buckets select Bucket.BucketName;
return BucketNames.ToArray();
}
public static bool DeleteFile(string keyName)
{
var delRequest = new DeleteObjectRequest
{
BucketName = bucketName,
Key = keyName
};
//Singleton and removed using block
AmazonS3Client StorageClient = S3ObjectStorageClient.Client;
StorageClient.DeleteObject(delRequest);
return true;
}
}
As one of the authors of the AWS .NET SDK I can give a little more context. Under the covers, the AmazonS3Client, along with all of the other service clients in the SDK, manages a pool of HttpClients, which are the expensive objects to create. So when you create a new AmazonS3Client, the SDK reuses an HttpClient from a pool the SDK is managing.
If you are using a proxy with proxy credentials, then the SDK does have to create a new HttpClient each time a service client is created.
An area where there could be a potential performance issue with creating service clients all the time is determining the AWS credentials to use when an AWSCredentials object is not passed into the constructor. That means each service client has to resolve the credentials, which, if you are using an assume-role profile, could cause a lot of extra calls to perform the assume role. Getting credentials from instance metadata is optimized so that only one thread per process refreshes those credentials.
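To illustrate that last point, here is a hedged sketch of constructing the client once with explicitly resolved credentials so the lookup is not repeated per call (the profile name and region are placeholders, not values from the question):

using System;
using Amazon;
using Amazon.Runtime;
using Amazon.Runtime.CredentialManagement;
using Amazon.S3;

public static class SharedS3Client
{
    // Resolve credentials once and keep a single client for the whole process.
    public static readonly AmazonS3Client Instance = Create();

    private static AmazonS3Client Create()
    {
        var chain = new CredentialProfileStoreChain();
        if (!chain.TryGetAWSCredentials("my-profile", out AWSCredentials credentials))
            throw new InvalidOperationException("AWS profile 'my-profile' was not found.");

        // Passing AWSCredentials explicitly means they do not have to be
        // re-resolved every time a service client is constructed.
        return new AmazonS3Client(credentials, RegionEndpoint.USEast1);
    }
}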
Actually you can safely reuse it; according to the docs it is not a bad idea to create and reuse a client, but creating a new client is not very expensive either:
The best-known aspect of the AWS SDK for .NET are the various service clients that you can use to interact with AWS. Client objects are thread safe, disposable, and can be reused. (Client objects are inexpensive, so you are not incurring a large overhead by constructing multiple instances, but it’s not a bad idea to create and reuse a client.)
Thus, according to this, the performance benefits are probably not that huge. But since there is a small cost to creating a new client, I would always reuse the client. That said, according to the docs, your code
using (AmazonS3Client StorageClient = GetClient())
{
Response = StorageClient.ListBuckets();
}
is not really bad, just a bit less efficient than using a singleton. If you think it hurts your performance in a noticeable way, your best bet is to measure it, and if it really is the cause, refactor to using a singleton.
Both are valid approaches, but you'll certainly gain efficiency by using a singleton.
Moreover, dependency injection is promoted by AWS as the right pattern when it comes to using clients. For example, the new AWS service CodeGuru Profiler highlights multiple client instances as an optimization opportunity.
See also: https://aws.amazon.com/fr/blogs/developer/working-with-dependency-injection-in-net-standard-inject-your-aws-clients-part-1/
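To illustrate, here is a hedged sketch of that DI pattern using the AWSSDK.Extensions.NETCore.Setup package (the "AWS" configuration section and the Startup shape are assumptions):

using Amazon.S3;
using Microsoft.Extensions.Configuration;
using Microsoft.Extensions.DependencyInjection;

public class Startup
{
    private readonly IConfiguration _configuration;

    public Startup(IConfiguration configuration) => _configuration = configuration;

    public void ConfigureServices(IServiceCollection services)
    {
        // Reads region/profile settings from the "AWS" section of configuration.
        services.AddDefaultAWSOptions(_configuration.GetAWSOptions());

        // Registers IAmazonS3 with the container (singleton lifetime by default),
        // so the same client can be injected wherever it is needed instead of
        // being newed up on every call.
        services.AddAWSService<IAmazonS3>();
    }
}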

Idempotency for BigQuery load jobs using Google.Cloud.BigQuery.V2

You are able to create a CSV load job to load data from a CSV file in Google Cloud Storage by using the BigQueryClient in Google.Cloud.BigQuery.V2, which has a CreateLoadJob method.
How can you guarantee idempotency with this API, so that if, say, the network dropped before you got a response and you kicked off a retry, you would not end up with the same data being loaded into BigQuery multiple times?
Example API usage
private void LoadCsv(string sourceUri, string tableId, string timePartitionField)
{
var tableReference = new TableReference()
{
DatasetId = _dataSetId,
ProjectId = _projectId,
TableId = tableId
};
var options = new CreateLoadJobOptions
{
WriteDisposition = WriteDisposition.WriteAppend,
CreateDisposition = CreateDisposition.CreateNever,
SkipLeadingRows = 1,
SourceFormat = FileFormat.Csv,
TimePartitioning = new TimePartitioning
{
Type = _partitionByDayType,
Field = timePartitionField
}
};
BigQueryJob loadJob = _bigQueryClient.CreateLoadJob(sourceUri: sourceUri,
destination: tableReference,
schema: null,
options: options);
loadJob.PollUntilCompletedAsync().Wait();
if (loadJob.Status.Errors == null || !loadJob.Status.Errors.Any())
{
//Log success
return;
}
//Log error
}
You can achieve idempotency by generating your own job ID based on, for example, the file location you loaded and the target table (shown here as Python-style pseudocode; a C# sketch follows after the options):
job_id = 'my_load_job_{}'.format(hashlib.md5(sourceUri+_projectId+_datasetId+tableId).hexdigest())
var options = new CreateLoadJobOptions
{
WriteDisposition = WriteDisposition.WriteAppend,
CreateDisposition = CreateDisposition.CreateNever,
SkipLeadingRows = 1,
JobId = job_id, // add this
SourceFormat = FileFormat.Csv,
TimePartitioning = new TimePartitioning
{
Type = _partitionByDayType,
Field = timePartitionField
}
};
In this case, if you try to re-insert the same job_id, you get an error.
You can also regenerate this job_id later to check the job's status, in case polling failed.
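Since the snippet above is Python-style, here is a hedged C# sketch of the same deterministic job ID (the prefix and the choice of MD5 are illustrative; any stable hash of the inputs works):

using System;
using System.Security.Cryptography;
using System.Text;

public static class LoadJobIds
{
    // Deterministic job ID: the same source file + destination table always
    // produces the same ID, so a retried CreateLoadJob is rejected by BigQuery
    // as a duplicate instead of loading the data a second time.
    public static string Create(string sourceUri, string projectId, string datasetId, string tableId)
    {
        using (var md5 = MD5.Create())
        {
            byte[] hash = md5.ComputeHash(
                Encoding.UTF8.GetBytes(sourceUri + projectId + datasetId + tableId));
            return "my_load_job_" + BitConverter.ToString(hash).Replace("-", "").ToLowerInvariant();
        }
    }
}

Note that job IDs are unique per project, so if you ever legitimately need to load the same file into the same table again, you would have to include something like a batch date in the hashed input.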
There are two places you could end up losing the response:
When creating the job to start with
When polling for completion
The first one is relatively tricky to recover from without a job ID; you could list all the jobs in the project and try to find one that looks like the one you'd otherwise create.
However, the C# client library generates a job ID so that it can retry, or you can specify your own job ID via CreateLoadJobOptions.
The second failure time is much simpler: keep the returned BigQueryJob so you can retry the polling if that fails. (You could store the job name so that you can recover even if your process dies while waiting for it to complete, for example.)
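A hedged sketch of that recovery path, assuming you kept the job ID (either the one you supplied via CreateLoadJobOptions.JobId or the generated loadJob.Reference.JobId); treat the exact calls as a sketch against Google.Cloud.BigQuery.V2 rather than verified code:

using System.Linq;
using Google.Cloud.BigQuery.V2;

public static class LoadJobRecovery
{
    // If polling failed or the process died, look the job up again by ID and
    // resume waiting for it instead of re-submitting the load.
    public static bool WaitForLoadJob(BigQueryClient client, string jobId)
    {
        BigQueryJob job = client.GetJob(jobId);
        job = job.PollUntilCompleted();
        return job.Status.Errors == null || !job.Status.Errors.Any();
    }
}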

In ASP.Net Core, is it possible to start streaming JSON results?

I am using ASP.Net Core WebAPI.
I have a method that retrieves 10000 results from the database at a time, but I notice that it takes 1.17s to "wait" and 0.3s for the actual transfer (based on Chrome's network graph).
The results from the database (Postgres) are iterated through a DataReader, converted into a struct, added to a list, and ultimately returned as a JsonResult.
I do not know exactly what options to expect, but I would like to be able to start returning data as soon as possible to lower the total request time. I am also doing this for the first time on this platform, so I may not be doing things the best way.
[HttpGet("{turbine:int}")]
public IActionResult GetBearingTemperature(int turbine)
{
using (var connection = Database.GetConnection())
{
connection.Open();
int? page = GetPage();
var command = connection.CreateCommand();
if (page.HasValue)
{
command.CommandText = @"select turbine, timestamp, mainbearingtemperature from readings where turbine = :turbine limit 10000 offset :offset;";
command.Parameters.AddWithValue("offset", NpgsqlTypes.NpgsqlDbType.Integer, page.Value * 10000);
} else
{
command.CommandText = @"select turbine, timestamp, mainbearingtemperature from readings where turbine = :turbine limit 10000;";
}
command.Parameters.AddWithValue("turbine", NpgsqlTypes.NpgsqlDbType.Integer, 4, turbine);
var reader = command.ExecuteReader();
var collection = new List<BearingTemperature>();
if (reader.HasRows)
{
var bt = new BearingTemperature();
while (reader.Read())
{
bt.Time = reader.GetDateTime(1);
bt.Turbine = reader.GetInt32(0);
bt.Value = reader.GetDouble(2);
collection.Add(bt);
}
return new JsonResult(collection);
}
else
{
return new EmptyResult();
}
}
}
private int? GetPage()
{
if (Request.Query.ContainsKey("page"))
{
return int.Parse(Request.Query["page"]);
}
else return null;
}
struct BearingTemperature
{
public int Turbine;
public DateTime Time;
public double Value;
}
So I know this question is old, but this is very much possible in ASP.NET Core 2.2 (probably even in earlier versions, ever since IEnumerable<T> was supported as a return type on an action).
While I'm not entirely familiar with Postgres and DataReader, the functionality is there to stream the result to the client. Appending to a list and returning the result in its entirety takes up a lot of memory, depending on the size of the result, and streaming helps us avoid that.
Here is an example of an action, that returns an IEnumerable<string> that is streamed to the client (it is sent in chunks until everything has been delivered using the Transfer-Encoding: chunked header).
[HttpGet]
public IEnumerable<string> Get()
{
return GetStringsFor(10000);
}
private static readonly Random random = new Random();
private IEnumerable<string> GetStringsFor(int amount)
{
const string chars = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789";
while (amount-- > 0)
{
yield return new string(Enumerable.Repeat(chars, random.Next(1000)).Select(s => s[random.Next(s.Length)]).ToArray());
}
}
This will ensure that not everything is loaded into memory, but is sent on demand. You would be able to implement something similar in your case, where you're currently reading the data into memory, because that is a point where the system could just start sending the result instead.
private IEnumerable<BearingTemperature> ReadTemperatures(SqlDataReader reader)
{
if (reader.HasRows)
{
var bt = new BearingTemperature();
while (reader.Read())
{
bt.Time = reader.GetDateTime(1);
bt.Turbine = reader.GetInt32(0);
bt.Value = reader.GetDouble(2);
yield return bt;
}
}
yield break;
}
[HttpGet("{turbine:int}")]
public IEnumerable<BearingTemperature> GetBearingTemperature(int turbine)
{
using (var connection = Database.GetConnection())
{
<snip>
var reader = command.ExecuteReader();
return ReadTemperatures(reader);
}
}
Considering that your database is going to execute the query and return the entire result set, it's not possible for you to stream a partial result set (though you can google streaming database for other offerings). What you could do instead is use a paging technique combined with ajax to retrieve slices of the total result set and compose them together on the client to keep the responsiveness high and create the illusion of streaming query results.
You'll want to look at OFFSET and LIMIT clauses
On your API, you'd include parameters for offset and limit to allow the client to step through and retrieve the result set in whatever size chunks it wants; you can play with the sizes to determine what feels responsive enough (a sketch of such an endpoint follows below). Then on your client, you'd loop over an ajax call to your API, probably using jQuery, and keep requesting page after page, adding the results to the bound collection on the client (or creating UI elements, or whatever), until the results come back empty.
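A hedged sketch of such an endpoint, reusing the question's own query and Database.GetConnection() helper (the route, parameter names, and ordering column are assumptions):

[HttpGet("{turbine:int}/page")]
public IActionResult GetBearingTemperaturePage(int turbine, int offset = 0, int limit = 1000)
{
    using (var connection = Database.GetConnection())
    {
        connection.Open();
        var command = connection.CreateCommand();
        // Order by timestamp so consecutive pages line up deterministically.
        command.CommandText = @"select turbine, timestamp, mainbearingtemperature
                                from readings
                                where turbine = :turbine
                                order by timestamp
                                limit :limit offset :offset;";
        command.Parameters.AddWithValue("turbine", NpgsqlTypes.NpgsqlDbType.Integer, turbine);
        command.Parameters.AddWithValue("limit", NpgsqlTypes.NpgsqlDbType.Integer, limit);
        command.Parameters.AddWithValue("offset", NpgsqlTypes.NpgsqlDbType.Integer, offset);

        var page = new List<BearingTemperature>();
        using (var reader = command.ExecuteReader())
        {
            while (reader.Read())
            {
                page.Add(new BearingTemperature
                {
                    Turbine = reader.GetInt32(0),
                    Time = reader.GetDateTime(1),
                    Value = reader.GetDouble(2)
                });
            }
        }
        // An empty page tells the client it has reached the end of the result set.
        return new JsonResult(page);
    }
}

The client then keeps calling the endpoint with offset += limit until it receives an empty array.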
Alternatively, if showing the whole 10k records at once isn't necessary, you could simply page the results and provide an interface to step through the pages. One that I've used for such a purpose is from Sakura on GitHub: PagedList

Adding an AsParallel() call causes my code to break when writing a file

I'm building a console application that has to process a bunch of documents.
To keep it simple, the process is:
for each year between X and Y, query the DB to get a list of document references to process
for each of these references, process a local file
The process method is, I think, independent and can be parallelized as long as the input args are different:
private static bool ProcessDocument(
DocumentsDataset.DocumentsRow d,
string langCode
)
{
try
{
var htmFileName = d.UniqueDocRef.Trim() + langCode + ".htm";
var htmFullPath = Path.Combine(@"x:\path", htmFileName);
var missingHtmlFile = !File.Exists(htmFullPath);
if (!missingHtmlFile)
{
var html = File.ReadAllText(htmFullPath);
// ProcessHtml is quite long : it use a regex search for a list of reference
// which are other documents, then sends the result to a custom WS
ProcessHtml(ref html);
File.WriteAllText(htmFullPath, html);
}
return true;
}
catch (Exception exc)
{
Trace.TraceError("{0,8}Fail processing {1} : {2}","[FATAL]", d.UniqueDocRef, exc.ToString());
return false;
}
}
In order to enumerate my documents, I have this method:
private static IEnumerable<DocumentsDataset.DocumentsRow> EnumerateDocuments()
{
return Enumerable.Range(1990, 2020 - 1990).AsParallel().SelectMany(year => {
return Document.FindAll((short)year).Documents;
});
}
Document is a business class that wraps the retrieval of documents. The output of this method is a typed dataset (I'm returning the Documents table). The method expects a year, and I'm sure a document can't be returned for more than one year (year is actually part of the key).
Note the use of AsParallel() here too, but I never had an issue with this one.
Now, my main method is :
var documents = EnumerateDocuments();
var result = documents.Select(d => {
bool success = true;
foreach (var langCode in new string[] { "-e","-f" })
{
success &= ProcessDocument(d, langCode);
}
return new {
d.UniqueDocRef,
success
};
});
using (var sw = File.CreateText("summary.csv"))
{
sw.WriteLine("Level;UniqueDocRef");
foreach (var item in result)
{
string level;
if (!item.success) level = "[ERROR]";
else level = "[OK]";
sw.WriteLine(
"{0};{1}",
level,
item.UniqueDocRef
);
//sw.WriteLine(item);
}
}
This works as expected in this form. However, if I replace
var documents = EnumerateDocuments();
by
var documents = EnumerateDocuments().AsParallel();
it stops working, and I don't understand why.
The error appears exactly here (in my process method):
File.WriteAllText(htmFullPath, html);
It tells me that the file is already opened by another program.
I don't understand what can cause my program not to work as expected. Since my documents variable is an IEnumerable returning unique values, why is my process method breaking?
Thanks for any advice.
[Edit] Code for retrieving documents:
/// <summary>
/// Get all documents in data store
/// </summary>
public static DocumentsDS FindAll(short? year)
{
Database db = DatabaseFactory.CreateDatabase(connStringName); // MS Entlib
DbCommand cm = db.GetStoredProcCommand("Document_Select");
if (year.HasValue) db.AddInParameter(cm, "Year", DbType.Int16, year.Value);
string[] tableNames = { "Documents", "Years" };
DocumentsDS ds = new DocumentsDS();
db.LoadDataSet(cm, ds, tableNames);
return ds;
}
[Edit 2] Possible source of my issue, thanks to mquander. If I write:
var test = EnumerateDocuments().AsParallel().Select(d => d.UniqueDocRef);
var testGr = test.GroupBy(d => d).Select(d => new { d.Key, Count = d.Count() }).Where(c=>c.Count>1);
var testLst = testGr.ToList();
Console.WriteLine(testLst.Where(x => x.Count == 1).Count());
Console.WriteLine(testLst.Where(x => x.Count > 1).Count());
I get this result :
0
1758
Removing the AsParallel() returns the same output.
Conclusion: something is wrong with my EnumerateDocuments and it returns each document twice.
I have to dig into this; my source enumeration is probably the cause.
I suggest having each task put the file data into a global queue, and having a dedicated thread take write requests from the queue and do the actual writing.
In any case, the performance of writing in parallel to a single disk is much worse than writing sequentially, because the disk needs to seek to the next write location, so you are just bouncing the disk around between seeks. It's better to do the writes sequentially.
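A hedged sketch of that producer/consumer pattern using BlockingCollection (the class and member names are illustrative, not from the original code):

using System.Collections.Concurrent;
using System.IO;
using System.Threading.Tasks;

public static class SequentialFileWriter
{
    // Processing tasks enqueue (path, content) pairs; a single consumer task
    // drains the queue and performs every disk write sequentially.
    private static readonly BlockingCollection<(string Path, string Content)> Queue =
        new BlockingCollection<(string Path, string Content)>();

    public static Task Start() => Task.Run(() =>
    {
        foreach (var item in Queue.GetConsumingEnumerable())
        {
            File.WriteAllText(item.Path, item.Content);
        }
    });

    // Called from the parallel processing code instead of File.WriteAllText.
    public static void Enqueue(string path, string content) => Queue.Add((path, content));

    // Called once all producers are finished.
    public static void Complete() => Queue.CompleteAdding();
}

ProcessDocument would call SequentialFileWriter.Enqueue(htmFullPath, html) instead of File.WriteAllText, and the main method would call Complete() and then wait for the task returned by Start() once all documents have been processed.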
Is Document.FindAll((short)year).Documents threadsafe? Because the difference between the first and the second version is that in the second (broken) version, this call is running multiple times concurrently. That could plausibly be the cause of the issue.
It sounds like you're trying to write to the same file. Only one thread/program can write to a file at a given time, so you can't use Parallel for that part.
If you're reading from the same file, you need to open the file with read-only access so as not to put a write lock on it.
The simplest way to fix the issue is to place a lock around your File.WriteAllText, assuming the writing is fast and it's worth parallelizing the rest of the code.
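A minimal sketch of that suggestion, applied to the question's ProcessDocument method (the field name is illustrative):

private static readonly object WriteLock = new object();

// ...inside ProcessDocument, in place of the plain write after ProcessHtml(ref html):
lock (WriteLock)
{
    // Only one thread writes to disk at a time; the expensive ProcessHtml call
    // still runs in parallel outside the lock.
    File.WriteAllText(htmFullPath, html);
}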

Store the cache data locally

I am developing a C# WinForms application. It is a client that connects to a web service to get data. The data returned by the web service is a DataTable, which the client displays in a DataGridView.
My problem is that the client takes a long time to get all the data from the server (the web service is not local to the client), so I have to use a thread to get the data. This is my model:
Client creates a thread to get data -> thread completes and sends an event to the client -> client displays the data in a DataGridView on a form.
However, when the user closes the form and opens it again later, the client has to get the data again, which makes the client slow.
So I am thinking about caching the data:
Client <---get/add/edit/delete---> Cached Data <---get/add/edit/delete---> Server (web service)
Please give me some suggestions. For example, should the cached data live in a separate application on the same host as the client, or should the cache run inside the client itself?
Please suggest some techniques (and examples, if you have any) to implement this solution.
Thanks.
UPDATE: Hello everyone, maybe my problem sounds bigger than it is. I only want to cache data for the client's lifetime. I think the cached data should be stored in memory, and when the client wants to get data, it will check the cache first.
If you're using C# 2.0 and you're prepared to ship System.Web as a dependency, then you can use the ASP.NET cache:
using System.Web;
using System.Web.Caching;
Cache webCache;
object cachedObject;
object webServiceResult;
webCache = HttpContext.Current.Cache;
// See if there's a cached item already
cachedObject = webCache.Get("MyCacheItem");
if (cachedObject == null)
{
// If there's nothing in the cache, call the web service to get a new item
webServiceResult = new Object();
// Cache the web service result for five minutes
webCache.Add("MyCacheItem", webServiceResult, null, DateTime.Now.AddMinutes(5), Cache.NoSlidingExpiration, System.Web.Caching.CacheItemPriority.Normal, null);
}
else
{
// Item already in the cache - cast it to the right type
webServiceResult = (object)cachedObject;
}
If you're not prepared to ship System.Web, then you might want to take a look at the Enterprise Library Caching block.
If you're on .NET 4.0, however, caching has been pushed into the System.Runtime.Caching namespace. To use this, you'll need to add a reference to System.Runtime.Caching, and then your code will look something like this:
using System.Runtime.Caching;
MemoryCache cache;
object cachedObject;
object webServiceResult;
cache = new MemoryCache("StackOverflow");
cachedObject = cache.Get("MyCacheItem");
if (cachedObject == null)
{
// Call the web service
webServiceResult = new Object();
cache.Add("MyCacheItem", webServiceResult, DateTime.Now.AddMinutes(5));
}
else
{
webServiceResult = (object)cachedObject;
}
All these caches run in-process with the client. Because your data is coming from a web service, as Adam says, you're going to have difficulty determining the freshness of the data - you'll have to make a judgement call on how often the data changes and how long to cache it for.
Do you have the ability to make changes or additions to the web service?
If you can, Sync Services may be an option for you. You can define which tables are synchronised, and all the sync plumbing is managed for you.
Check out
http://msdn.microsoft.com/en-us/sync/default.aspx
and shout if you need more information.
You might try the Enterprise Library's Caching Application Block. It's easy to use, stores in memory and, if you ever need to later, it supports adding a backup location for persisting beyond the life of the application (such as to a database, isolated storage, file, etc.) and even encryption too.
Use EntLib 3.1 if you're stuck with .NET 2.0. There's not much new (for caching, at least) in the newer EntLibs aside from better customization support.
Identify which objects you would like to serialize, and cache to isolated storage. Specify the level of data isolation you would like (application level, user level, etc).
Example:
You could create a generic serializer; a very basic sample would look like this:
public class SampleDataSerializer
{
public static void Deserialize<T>(out T data, Stream stm)
{
var xs = new XmlSerializer(typeof(T));
data = (T)xs.Deserialize(stm);
}
public static void Serialize<T>(T data, Stream stm)
{
try
{
var xs = new XmlSerializer(typeof(T));
xs.Serialize(stm, data);
}
catch (Exception e)
{
throw;
}
}
}
Note that you should probably add some overloads to the Serialize and Deserialize methods to accommodate readers, or any other types you are actually using in your app (e.g., XmlDocument, etc.).
The operation to save to IsolatedStorage can be handled by a utility class (example below):
public class SampleIsolatedStorageManager : IDisposable
{
private string filename;
private string directoryname;
IsolatedStorageFile isf;
public SampleIsolatedStorageManager()
{
filename = string.Empty;
directoryname = string.Empty;
// create an ISF scoped to domain user...
isf = IsolatedStorageFile.GetStore(IsolatedStorageScope.User |
IsolatedStorageScope.Assembly | IsolatedStorageScope.Domain,
typeof(System.Security.Policy.Url), typeof(System.Security.Policy.Url));
}
public void Save<T>(T parm)
{
using (IsolatedStorageFileStream stm = GetStreamByStoredType<T>(FileMode.Create))
{
SampleDataSerializer.Serialize<T>(parm, stm);
}
}
public T Restore<T>() where T : new()
{
try
{
if (GetFileNameByType<T>().Length > 0)
{
T result = new T();
using (IsolatedStorageFileStream stm = GetStreamByStoredType<T>(FileMode.Open))
{
SampleDataSerializer.Deserialize<T>(out result, stm);
}
return result;
}
else
{
return default(T);
}
}
catch
{
try
{
Clear<T>();
}
catch
{
}
return default(T);
}
}
public void Clear<T>()
{
if (isf.GetFileNames(GetFileNameByType<T>()).Length > 0)
{
isf.DeleteFile(GetFileNameByType<T>());
}
}
private string GetFileNameByType<T>()
{
return typeof(T).Name + ".cache";
}
private IsolatedStorageFileStream GetStreamByStoredType<T>(FileMode mode)
{
var stm = new IsolatedStorageFileStream(GetFileNameByType<T>(), mode, isf);
return stm;
}
#region IDisposable Members
public void Dispose()
{
isf.Close();
}
#endregion
}
Finally, remember to add the following using clauses:
using System.IO;
using System.IO.IsolatedStorage;
using System.Xml.Serialization;
The actual code to use the classes above could look like this:
var myClass = new MyClass();
myClass.name = "something";
using (var mgr = new SampleIsolatedStorageManager())
{
mgr.Save<MyClass>(myClass);
}
This will save the instance you specify to be saved to the isolated storage. To retrieve the instance, simply call:
using (var mgr = new SampleIsolatedStorageManager())
{
mgr.Restore<MyClass>();
}
Note: the sample I've provided only supports one serialized instance per type. I'm not sure if you need more than that. Make whatever modifications you need to support further functionalities.
HTH!
You can serialise the DataTable to file:
http://forums.asp.net/t/1441971.aspx
Your only concern then is deciding when the cache has gone stale. Perhaps timestamp the file?
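A hedged sketch of that idea, using DataTable.WriteXml/ReadXml and the file's last write time as the staleness check (the cache path handling and the 10-minute freshness window are assumptions):

using System;
using System.Data;
using System.IO;

public static class DataTableFileCache
{
    private static readonly TimeSpan MaxAge = TimeSpan.FromMinutes(10);

    public static void Save(DataTable table, string path)
    {
        // XmlWriteMode.WriteSchema keeps the column types so the table round-trips.
        table.WriteXml(path, XmlWriteMode.WriteSchema);
    }

    public static DataTable TryLoad(string path)
    {
        if (!File.Exists(path))
            return null;

        // Use the file timestamp to decide whether the cache has gone stale.
        if (DateTime.UtcNow - File.GetLastWriteTimeUtc(path) > MaxAge)
            return null;

        var table = new DataTable();
        table.ReadXml(path);
        return table;
    }
}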
In our implementation every row in the database has a last-updated timestamp. Every time our client application accesses a table we select the latest last-updated timestamp from the cache and send that value to the server. The server responds with all the rows that have newer timestamps.
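A hedged sketch of that delta-sync approach on the client side; the IDataService contract, its GetRowsUpdatedAfter method, and the LastUpdated column are hypothetical stand-ins for your own web service and schema:

using System;
using System.Data;
using System.Linq;

// Hypothetical service contract: returns rows changed after the given timestamp.
public interface IDataService
{
    DataTable GetRowsUpdatedAfter(DateTime since);
}

public static class CacheRefresher
{
    // Requires a reference to System.Data.DataSetExtensions for AsEnumerable()/Field<T>().
    public static void Refresh(DataTable cachedTable, IDataService service)
    {
        // Newest LastUpdated value we already hold locally (MinValue if the cache is empty).
        DateTime latest = cachedTable.Rows.Count == 0
            ? DateTime.MinValue
            : cachedTable.AsEnumerable().Max(r => r.Field<DateTime>("LastUpdated"));

        // Ask the server only for rows changed after that timestamp and merge
        // them into the local cache; unchanged rows never cross the wire again.
        DataTable changedRows = service.GetRowsUpdatedAfter(latest);
        cachedTable.Merge(changedRows);
    }
}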
