Currently we use the Apache Ignite.NET thin client to cache different sets of data. When a data request arrives, we check whether the data is already stored in the cache and, if not, request it from the database and put it into the cache.
I want to prevent several database requests if two data requests arrive at the same time.
Is there any way to manually lock the cache before the first database request starts, so that the second data request waits until the first one has completed?
I cannot solve this using .NET concurrency primitives because the cache could be used by multiple client instances (load balancing).
I've already found the ICache.Lock(TK key) method, but it seems that it locks only the specified rows in the cache and is supported only in self-hosted mode, not by the Ignite.NET thin client.
Small piece of code that illustrates the issue:
var key = "cache_key";
using (var ignite = Ignition.StartClient(new Core.Client.IgniteClientConfiguration { Host = "127.0.0.1" }))
{
var cacheNames = ignite.GetCacheNames();
if (cacheNames.Contains(key))
{
return ignite.GetCache<int, Employee>(key).AsCacheQueryable();
}
else
{
var data = RequestDataFromDatabase();
var cache = ignite.CreateCache<int, Employee>(new CacheClientConfiguration(
EmployeeCacheName, new QueryEntity(typeof(int), typeof(Employee))));
cache.PutAll(data);
return cache.AsCacheQueryable();
}
}
The thin client doesn't have the required API.
If you don't need to check for individual records and only need to know whether the cache is available, you can simply call CreateCache every time. On every invocation after the first, it should throw an exception saying that a cache with that name has already been started.
try {
var cache = ignite.CreateCache<int, Employee>(new CacheClientConfiguration(
EmployeeCacheName, new QueryEntity(typeof(int), typeof(Employee))));
// Cache created by this call => add data here
} catch (IgniteClientException e) when (e.Message.Contains("already started")) {
// Return existing cache, don't add data
}
Alexandr has provided a good and simple solution if you just need to initialize the cache once.
If you need more complex synchronization logic, atomic cache operations (PutIfAbsent, Replace) can often replace locks. For example, we could have a special cache to track the status of other caches:
var statusCache = Client.GetOrCreateCache<string, string>("status");
if (statusCache.PutIfAbsent("cache-name", "created"))
{
// Just created, add data
...
//
statusCache.Put("cache-name", "populated");
}
else
{
// Already exists, wait for data
while (statusCache["cache-name"] != "populated")
Thread.Sleep(1000);
}
Hi, could someone guide me with the following problem? There must be tons of guides on it, but for some reason I can't get Google to find a nice how-to to follow.
I'm implementing this in an ASP.NET Core API, but I think the problem/solution could apply to any language.
The problem: I have to call a view from a database that is painfully slow; it takes about 15-30 seconds to return the ~300 rows.
It only returns the fields that are required. It joins a lot of tables across multiple databases. (There are other applications that update the data; I'm only interested in reading the result.)
The DBA says there is nothing he can do, so I have to find a solution, and why not, it could be fun.
Now the real problem is that there are about 250 autonomous clients requesting data. Each client requests data about every 2 minutes, and with the time it takes to select the data, it doesn't take long for the system to become unresponsive. The response contains the same data for all requests.
It would be acceptable to cache the rows for 5 minutes. Now how would I implement it so that only one request selects from the database and updates the cache, while all others read from the cache and perhaps wait for a short period if the cache is empty, while new data is being loaded into the cache from the database view?
(I could write a script to be scheduled to execute every x minutes, but it would be more fun to solve this in the application.)
I could perhaps make some cache tables in the database and let the API call check whether the cache table is empty; if it is, get the data from the slow view, populate the cache table and return the result. But then what would be a good solution to empty and populate the cache only once, and not multiple times, when multiple requests arrive in the time it takes to load the data from the view?
And perhaps there are better alternatives than caching in a database table?
I hope someone can help.
Your question is not specific to any technology, so you are asking for a concept. In general:
check cache without locking
return data if it is up to date
perform lock
check cache again
update cache
unlock and return data
You may read/use https://learn.microsoft.com/en-us/dotnet/core/extensions/caching
// pseudo code
async Task<Result> QueryFromCache()
{
// check whether the cache is up to date - without lock
var cacheData = await GetCacheData(); // latest data or null
if (cacheData == null)
{
// cache is empty: wait for the cache data
cacheData = await UpdateCache();
return cacheData.Data;
}
// data is still up to date?
if (cacheData.UpdateDate.AddMinutes(5) > DateTime.Now)
{
return cacheData.Data;
}
// Cache update is necessary
// Option 1: start a separate "fire and forget" task to fill the cache, but return the old data
_ = Task.Run(() => UpdateCache()); // do not await
return cacheData.Data; // return immediately with data older than 5 minutes
// Option 2 (use instead of option 1): wait for the update
// cacheData = await UpdateCache();
// return cacheData.Data;
}
async Task<CacheData> UpdateCache()
{
// "lock" is a reserved word in C#, hence the name cacheLock
var cacheLock = GetLock(); // lock, semaphore, database lock => depends on your architecture
try
{
// double-check whether the cache is up to date - with lock
var cacheData = await GetCacheData();
if (cacheData != null && cacheData.UpdateDate.AddMinutes(5) > DateTime.Now)
{
return cacheData;
}
// query data
var result = await PerformLongQuery();
// update cache
cacheData = new CacheData {
UpdateDate = DateTime.Now,
Data = result
};
await SetCacheData(cacheData);
return cacheData;
} finally {
cacheLock.Release();
}
}
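The check, lock, check-again steps above can be made concrete for a single process. The following is a minimal sketch, assuming all requests are served by one API instance; the class name SingleFlightCache and its members are hypothetical and not part of the answer above. With several load-balanced instances the SemaphoreSlim would have to be replaced by a distributed lock.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper: caches one value for a fixed time and lets only one caller reload it.
public sealed class SingleFlightCache<T>
{
    private sealed class Entry
    {
        public T Data;
        public DateTime LoadedAtUtc;
    }

    private readonly Func<Task<T>> _load;   // the slow query, supplied by the caller
    private readonly TimeSpan _maxAge;
    private readonly SemaphoreSlim _gate = new SemaphoreSlim(1, 1);
    private volatile Entry _entry;          // null until the first load completes

    public SingleFlightCache(Func<Task<T>> load, TimeSpan maxAge)
    {
        _load = load;
        _maxAge = maxAge;
    }

    private bool IsFresh(Entry e) => e != null && DateTime.UtcNow - e.LoadedAtUtc < _maxAge;

    public async Task<T> GetAsync()
    {
        // 1. Check without taking the lock.
        var entry = _entry;
        if (IsFresh(entry)) return entry.Data;

        // 2. Take the lock; concurrent callers wait here.
        await _gate.WaitAsync();
        try
        {
            // 3. Check again: another caller may have refreshed the cache while we waited.
            entry = _entry;
            if (IsFresh(entry)) return entry.Data;

            // 4. Only one caller reaches the slow query.
            var data = await _load();
            _entry = new Entry { Data = data, LoadedAtUtc = DateTime.UtcNow };
            return data;
        }
        finally
        {
            _gate.Release();
        }
    }
}
```

A single instance would be shared by all requests (for example as a singleton service), with the slow view query passed in as the load delegate and TimeSpan.FromMinutes(5) as the maximum age.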
Currently I'm developing an app with a small team using Xamarin.Forms. We need to communicate with a database to get some locations, order details and so on. We are using Google's Firebase (Realtime DB) for this purpose. Everything works fine when we write and read data. However, in the Firebase Console, in the usage tab, there are over 50 concurrent connections. This is weird since we are still developing and haven't released any version of our app. There should be at most 5 concurrent connections (we are a team of 5).
We are using the NuGet-Package FirebaseDatabase.net (4.0.4) https://www.nuget.org/packages/FirebaseDatabase.net/ to read and write to the database.
Multiple Listeners are used to be able to react to changes in the database (so far it seems that each listener is taking up one connection which doesn't seem to be correct).
The code below shows the initialization of the FirebaseClient which is called once in the constructor.
private FirebaseClient InitDbClient()
{
var dbClient = new FirebaseClient(Constants.Values.FIREBASE_DATABASE_URL, new FirebaseOptions()
{
AuthTokenAsyncFactory = () => Task.FromResult(_authToken)
});
return dbClient;
}
Each listener is implemented in a similar way to the following code:
public IDisposable SubscribeToChatMessages(string orderID)
{
var observer = _dbClient.Child($"orders/{orderID}/Chat/Messages").AsObservable<JObject>();
var subscribe = observer.Subscribe(t =>
{
if (t.EventType == Firebase.Database.Streaming.FirebaseEventType.Delete)
{
return;
}
ChatMessage msg;
try
{
msg = JsonConvert.DeserializeObject<ChatMessage>(t.Object.ToString());
}catch(Exception e)
{
Debug.WriteLine(e.Message);
msg = new ChatMessage() { C = null, T = new DateTime(), U = null };
}
//...do something with the chat message
});
return subscribe;
}
Since I'm not sure what the problem is I just put some of our code in here. It would be awesome if anyone has a solution for this problem or has any idea what we might try.
I found the answer myself. As I already suspected, each listener uses one connection. The reason for that is simple: the package is built on top of the Firebase REST API. In some other question it was mentioned that each listener is basically a streaming web socket (or something like that). Each of these consumes one of the concurrent connections.
As a workaround (to reduce the number of concurrent connections) I replaced all listeners that are not actually necessary with a combination of a timer and a simple db request. To be able to query for new data I added timestamps to the data itself. This allows me to use a db query similar to
dbClient.Child($"orders/{orderID}/Chat/Messages").OrderBy("Time").StartAt(lastRead).OnceAsync<>()...
This is (hopefully) a less broad version of this: Show progress on long-running process
I have a long running task in ASP.Net Core, and want to be able to provide progress for that task without using SignalR.
Here is an example endpoint:
[HttpGet]
[Route("/getter/world/{world}/cities")]
public async Task<IActionResult> GetCities(int world)
{
ApiGetter getter = new ApiGetter(_config);
try
{
//3rd party cookie
if (!await IsValidCookie(getter, world))
{
return BadRequest("Invalid Session");
}
IEnumerable<PlayerRank> rankings = await getter.GetWorldScoreRankings(world);
List<City> cities = new List<City>();
foreach (PlayerRank rank in rankings)
{
try
{
IEnumerable<City> playerCities = await getter.GetPlayerCities(rank, 5);
IEnumerable<City> uniqueCities = playerCities.GroupBy(c => c.CityID).Select(c => c.First()).ToList();
cities.AddRange(uniqueCities);
_repository.InsertCities(uniqueCities);
}
catch (DbUpdateException ex) when ((ex.InnerException as MySqlException)?.Number == 1062)
{
continue;
}
}
return Ok(cities.Count);
}
catch (DbUpdateException ex) when ((ex.InnerException as MySqlException)?.Number == 1062)
{
return BadRequest("City data already inserted for today");
}
}
This endpoint is managing multiple things, I am aware of the issues with not following SRP.
To try and avoid the "Question too broad" flag, here is what I want to sort out:
What I need to know:
How to keep a record of the progress of the foreach loop
How to return that to the client upon request
What I don't need to know:
How to make an AJAX request
How to poll with JavaScript
How to make endpoints
Since I can't return multiple times from the GetCities() method, I assume I'll need an additional endpoint that I can call to retrieve the current progress. If so, how do I keep a record of that progress in a way that makes it available to that other endpoint?
Optional if simple/in-scope: how do I ensure only the user that started the task sees the progress?
Edit: Would the Session TempData be appropriate for this? Write the progress to TempData, and further requests read it?
Your question already covers the basic "how do I get the status of a long running task" so I'll assume that you did your homework about the easy solutions.
One way to make sure that only the client that initiated the request can check the progress is to pass a client secret with the request (a client-generated string or number). That way, the progress can only be checked by someone who knows that secret. The downside of this approach is that if the client dies, it might lose that secret.
With that "client secret" in mind, let's answer the questions:
How to keep a record of the progress of the foreach loop
Use the client secret to keep track of the progress. Use a map of some sort (in memory or Redis) to map client secret <-> progress. On each loop iteration, update the map, and invalidate the entry after a certain time.
How to return that to the client upon request
Have an endpoint that checks the progress using the client secret.
how do I ensure only the user that started the task sees the progress?
The client secret solves that.
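The in-memory variant of that map could look roughly like the sketch below. It is only an illustration under assumptions: the class name ProgressTracker and its methods are hypothetical, it works only while all requests hit the same process, and a Redis-backed version would be needed behind a load balancer.

```csharp
using System;
using System.Collections.Concurrent;

// Hypothetical in-memory store mapping a client secret to the progress of its task.
public sealed class ProgressTracker
{
    private sealed class Entry
    {
        public int Done;
        public int Total;
        public DateTime UpdatedUtc;
    }

    private readonly ConcurrentDictionary<string, Entry> _progress =
        new ConcurrentDictionary<string, Entry>();

    // Called by the long-running loop after each iteration.
    public void Report(string clientSecret, int done, int total)
    {
        _progress[clientSecret] = new Entry { Done = done, Total = total, UpdatedUtc = DateTime.UtcNow };
    }

    // Called by the progress endpoint; returns null when the secret is unknown.
    public int? GetPercent(string clientSecret)
    {
        if (!_progress.TryGetValue(clientSecret, out var p)) return null;
        return p.Total == 0 ? 100 : p.Done * 100 / p.Total;
    }

    // Removes entries that have not been updated for maxAge (the invalidation step).
    public int RemoveStale(TimeSpan maxAge)
    {
        var removed = 0;
        var cutoff = DateTime.UtcNow - maxAge;
        foreach (var pair in _progress)
        {
            if (pair.Value.UpdatedUtc < cutoff && _progress.TryRemove(pair.Key, out _)) removed++;
        }
        return removed;
    }
}
```

The foreach loop in GetCities would call Report after each player, and a second endpoint taking the secret would return GetPercent. An unknown secret yields null, which the endpoint could translate to 404.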
I am consuming a web service provided to me by a vendor in a C# application. This application calls a web method in a loop, and that slows down the performance. It takes more than an hour to get the complete set of results.
Can I apply multi threading on my side to consume this web service in multiple threads and combine the results together?
Is there any better approach to retrieve data in minutes instead of hours?
First of all you have to make sure your vendor does indeed support this or does not prohibit it (which is very probable too).
The code itself to do this is fairly straightforward, using a method such as Parallel.For
Simple Example (google.com):
Parallel.For(0, norequests,
i => {
//Code that does your request goes here
} );
Explanation:
In a Parallel.For loop, all the requests get executed in-parallel (as implied in the name), which could potentially provide a very significant increase in performance.
Further reading:
MSDN on Parallel.For loops
You should really ask your vendor. We can only speculate about why it takes that long or if firing multiple requests will actually yield the same results as the one that takes long.
Basically, sending one request and getting one response should beat the multi-threaded variant, because it should be easier to optimize on the server side.
If you want to know why this is not the case with the current version of the service, ask the vendor.
This is only a sample of how you could call web services in parallel:
private void TestParallelForeach()
{
string[] uris = {"http://192.168.1.2", "http://192.168.1.3", "http://192.168.1.4"};
var results = new List<string>();
var syncObj = new object();
Parallel.ForEach(uris, uri =>
{
using (var webClient = new WebClient())
{
webClient.Encoding = Encoding.UTF8;
try
{
var result = webClient.DownloadString(uri);
lock (syncObj)
{
results.Add(result);
}
}
catch (Exception ex)
{
// Do error handling here...
}
}
});
// Do with "results" here....
}
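Because web service calls are I/O-bound, an alternative to Parallel.ForEach is to start asynchronous calls and cap how many are in flight, which also makes it easier to stay within whatever limit the vendor allows. This is a different technique from the answers above, shown here as a minimal sketch; the Throttled class and its RunAsync method are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class Throttled
{
    // Runs 'call' for every input with at most 'maxConcurrency' calls in flight.
    // The results are returned in the same order as the inputs.
    public static async Task<TResult[]> RunAsync<TInput, TResult>(
        IEnumerable<TInput> inputs, Func<TInput, Task<TResult>> call, int maxConcurrency)
    {
        using (var gate = new SemaphoreSlim(maxConcurrency))
        {
            var tasks = inputs.Select(async input =>
            {
                await gate.WaitAsync();
                try { return await call(input); }
                finally { gate.Release(); }
            }).ToList(); // ToList starts every task; the semaphore limits how many run at once

            return await Task.WhenAll(tasks);
        }
    }
}
```

With a shared HttpClient instance named httpClient (an assumption), a call could look like `var pages = await Throttled.RunAsync(uris, uri => httpClient.GetStringAsync(uri), 4);`. No lock around a result list is needed because Task.WhenAll collects the results.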
I am currently working on a system that makes calls to an external service and caches some of the data in the HttpContext.Current.Items collection for performance. The data can change quite regularly and it is user sensitive which is why we are currently storing it only for the duration of the current HttpRequest.
Example:
if (HttpContext.Current.Items[cacheKey] != null)
{
LogHelper.Debug<ExampleService>("[- CACHED RESULT -] GetUser({0})", () => email);
return (ExampleUser)HttpContext.Current.Items[cacheKey];
}
using (var client = new UserServiceClient())
{
using (new OperationContextScope(client.InnerChannel))
{
LogHelper.Debug<ExampleService>("GetUser({0})", () => email);
exampleUser = svc.GetUser(email);
HttpContext.Current.Items.Add(cacheKey, exampleUser);
}
}
In my local environment this behaves as expected, and it mostly does in staging as well, where the same thread is used for the duration of the request. In production, however, this is not the case, and there are still multiple calls to the external service within the same request. The logs show that the value in HttpContext.Current.Items[cacheKey] is not returned in cases where the thread ID does not match that of the original request.
This I guess means that my current understanding of HttpContext.Current.Items is wrong and that this is not a suitable solution for my needs.
My question therefore is can this be made to work across threads in the same request and if so should it, otherwise what suitable alternative is there?
One option is to use Session to store your data. Unfortunately, it's not applicable to API-specific requests (e.g. a mobile device calling the server API). Besides, server session state requires all of your data to be serializable (DB session state doesn't).
If Session does not satisfy your requirements, go to the next option: use a cache protected by something that identifies requests coming from the same user (a.k.a. an access token).
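The second option could be sketched as a short-lived cache whose keys include the access token, so that data fetched for one user is never returned to another. This is only an illustration under assumptions: TokenScopedCache is a hypothetical class, expired entries are not evicted here, and how the access token is obtained depends on the application.

```csharp
using System;
using System.Collections.Concurrent;

// Hypothetical short-lived cache whose keys include the caller's access token.
public sealed class TokenScopedCache<T>
{
    private sealed class Entry
    {
        public T Value;
        public DateTime ExpiresUtc;
    }

    private readonly ConcurrentDictionary<string, Entry> _entries =
        new ConcurrentDictionary<string, Entry>();
    private readonly TimeSpan _ttl;

    public TokenScopedCache(TimeSpan ttl)
    {
        _ttl = ttl;
    }

    public T GetOrAdd(string accessToken, string cacheKey, Func<T> load)
    {
        // The token is part of the key, so users never share entries.
        var key = accessToken + "|" + cacheKey;
        if (_entries.TryGetValue(key, out var hit) && hit.ExpiresUtc > DateTime.UtcNow)
            return hit.Value;

        // Two threads may both load here; the later write simply overwrites the earlier one.
        var value = load();
        _entries[key] = new Entry { Value = value, ExpiresUtc = DateTime.UtcNow + _ttl };
        return value;
    }
}
```

Unlike HttpContext.Current.Items, such a cache lives outside the request, so it is reachable from any thread, and the short time-to-live is what keeps frequently changing data from going stale.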