Let's say I have a REST API app that lets clients insert items into a database. I want many people to be able to call the API at the same time to insert objects, but I also want to be sure that no more than x items of a given type are ever inserted.
What would be the optimal strategy for that? Consider two cases: the app is hosted on a single node, or it is distributed across multiple nodes on different servers that share the same database.
If it is hosted on one machine, I assume I can use a shared semaphore. That is easy, but requests will block each other all the time.
In the distributed case, I assume there would have to be a transaction at the database level, but what could that look like if performance is important?
I assume this is a generic problem from a well-known class of problems, so maybe you can give me some hints about where I can read about it?
Thanks for help!
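One way the database-level option could look, as a minimal sketch rather than a definitive answer: keep one counter row per item type and reserve a slot with a conditional UPDATE before inserting. The table names (type_counts, items) and the helper try_insert are made up for illustration, and SQLite is used only so the example runs on its own. On a server database the row lock taken by the UPDATE is what serializes writers of the same type, so the exact behavior depends on your engine and isolation level.

```python
import sqlite3

def try_insert(conn, item_type, payload, limit):
    """Insert one item unless `limit` items of this type already exist."""
    with conn:  # one transaction: commits on success, rolls back on error
        # Make sure a counter row exists for this type (SQLite syntax).
        conn.execute(
            "INSERT OR IGNORE INTO type_counts(item_type, n) VALUES (?, 0)",
            (item_type,),
        )
        # Reserve a slot; the WHERE clause matches no row once the cap is hit.
        cur = conn.execute(
            "UPDATE type_counts SET n = n + 1 WHERE item_type = ? AND n < ?",
            (item_type, limit),
        )
        if cur.rowcount == 0:
            return False  # cap reached, nothing inserted
        conn.execute(
            "INSERT INTO items(item_type, payload) VALUES (?, ?)",
            (item_type, payload),
        )
        return True

# Hypothetical schema, in-memory for the demo.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE type_counts(item_type TEXT PRIMARY KEY, n INTEGER NOT NULL)")
conn.execute("CREATE TABLE items(id INTEGER PRIMARY KEY, item_type TEXT, payload TEXT)")

# Five attempts against a cap of three.
results = [try_insert(conn, "widget", f"w{i}", limit=3) for i in range(5)]
stored = conn.execute("SELECT COUNT(*) FROM items").fetchone()[0]
```

Because the check and the increment happen in a single statement, requests for different types do not block each other, and requests for the same type only wait for the duration of that short transaction.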
Related
In my application there are incoming messages and I would like to sample some of them, for simplicity let's say 1 every 10. I have a settings file in which I have the following properties:
MaxPerHour
MaxPerDay
MaxAllTime
It's not an option to keep the counts in the current class, so I somehow need to store them (in a database or in memory).
Another important point is that there are multiple collectors, so I need to be able to tell how many messages Collector1 has collected in the last hour / this day. It's also an async environment.
I am out of ideas, as I know that storing this data in a database would not be very performant.
It would be of value to know what your hosting environment is. I think a distributed cache may turn out to be your holy grail. You can share this instance over all your connectors and easily read/write/invalidate data over this shared instance.
For example, if you are running your system in Azure, Azure Cache for Redis is a great fit for this problem, but then again, your infra is key to the correct answer.
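To make the idea concrete, here is a rough sketch of per-collector counters using fixed time windows. Everything in it is an assumption for illustration: the class name SampleLimiter, the window scheme, and above all the plain dict, which stands in for whatever shared store you end up with. With Redis you would replace the dict reads and writes with atomic increments on keys that expire (Redis has INCR and EXPIRE for that).

```python
import time

# Window sizes in seconds; "all" (all time) has no window and is handled below.
WINDOWS = {"hour": 3600, "day": 86400}

class SampleLimiter:
    def __init__(self, max_per_hour, max_per_day, max_all_time, clock=time.time):
        self.limits = {"hour": max_per_hour, "day": max_per_day, "all": max_all_time}
        self.counts = {}    # stand-in for the shared cache
        self.clock = clock  # injectable so the demo does not have to wait

    def _keys(self, collector):
        now = int(self.clock())
        # One counter key per collector, per window, per window index.
        keys = {name: (collector, name, now // size) for name, size in WINDOWS.items()}
        keys["all"] = (collector, "all", 0)
        return keys

    def try_collect(self, collector):
        """Return True and count the sample if every limit still has room."""
        keys = self._keys(collector)
        if any(self.counts.get(k, 0) >= self.limits[name] for name, k in keys.items()):
            return False
        for k in keys.values():
            self.counts[k] = self.counts.get(k, 0) + 1
        return True

# Demo with a fake clock: 2 per hour, 3 per day, 10 all time.
now = [0.0]
limiter = SampleLimiter(2, 3, 10, clock=lambda: now[0])
first_hour = [limiter.try_collect("Collector1") for _ in range(3)]
now[0] = 3600.0  # one hour later
second_hour = [limiter.try_collect("Collector1") for _ in range(2)]
other = limiter.try_collect("Collector2")
```

In a single asyncio event loop the check and the increments run without an await in between, so they cannot interleave. Once the counters live in a shared cache, that step has to be made atomic on the cache side. This sketch also never evicts old window keys, which key expiry would take care of.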
I've built a small C# application that gets all the values I wanted, specifically product data, from a web service API. I have just switched to a larger database (still accessed through the web service API) and am now finding that I get blocked while iterating through the requests, since I've gone from 25 products to 1000 products.
Well, honestly, I'm guessing this is the cause, but the database is built the same way in both my trials, so the only thing that has changed is the amount of data.
I should mention that I'm new to the whole concept of using web services/APIs.
I have not tried anything specific yet. I guess one could use a timer between the requests, to let the server rest in between.
Currently I'm iterating through the list of all products, getting their URIs, then requesting each URI to get the specifics about each product.
In the best of worlds, I would like to either find a solution that lets me make fewer requests than I do today, or find a way to pause between requests. In other words, something comparable to SQL's SELECT * FROM dbTable: make one request and use the result.
Does anyone have any good suggestions? Thanks :)
Thanks to Jeremy, I understand that my background info was incomplete. My program's purpose is to fetch all the data (products) from a webshop and distribute it to other systems. So this process of iterating through the products will happen every so often, to detect changes made to them.
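The "timer between the requests" idea could be sketched like this. It is written in Python rather than C# just to keep it short, and the function name fetch_all, the fetch callback, and the 0.5 second interval are all placeholders, since the actual API and its rate limits are not known here. The clock and sleep functions are passed in so the demo can run without really waiting.

```python
import time

def fetch_all(uris, fetch, min_interval=0.5, clock=time.monotonic, sleep=time.sleep):
    """Call fetch(uri) for each URI, leaving at least min_interval seconds between calls."""
    results = []
    last_call = None
    for uri in uris:
        if last_call is not None:
            # Sleep only for whatever part of the interval has not elapsed yet.
            wait = min_interval - (clock() - last_call)
            if wait > 0:
                sleep(wait)
        last_call = clock()
        results.append(fetch(uri))
    return results

# Demo with a fake clock and a fake fetch, so nothing touches the network.
fake_now = [0.0]
pauses = []

def fake_sleep(seconds):
    pauses.append(seconds)
    fake_now[0] += seconds

products = fetch_all(
    ["/products/1", "/products/2", "/products/3"],
    fetch=lambda uri: {"uri": uri},
    min_interval=0.5,
    clock=lambda: fake_now[0],
    sleep=fake_sleep,
)
```

Throttling only spreads the load out. If the API offers a way to return product details in bulk, that would cut the number of requests itself, which is closer to the SELECT * comparison above.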
I am building a Wiki / Blog similar application, and I have a question about the best way to store the View Count for each of the articles. The requirement is that I only want to store the unique number of users that viewed the article and not the total view count. So far I have come up with 3 different ways to accomplish:
1. SQL Server stored procedure: the problem with this approach is that the data is stored in XML data type and it might be a bit complicated to achieve the requirement using this method. I am leaving this as a last resort.
2. MSMQ: this would work great, since I can process the requests serially. The only problem with this approach is that I cannot ensure that MSMQ is installed on the host server. This one is out of the question!
3. Using Application.Lock(): I know that using this method I can lock access to the Application object, update some entry in the application, update the database, and then call Application.Unlock(). While this sounds like a functional approach, it still feels like a workaround.
Does anyone have a suggestion on what I should do to achieve the requirement?
MSMQ and Application.Lock() are definitely not options to consider for something this simple. (Application.Lock() is a definite NO GO.)
I also see no reason for XML. A stored proc does not rely on XML.
Create a table
[page,userip]
on every view of the page
insert into <table>(page,userip) values(#page,#userip)
For the statistics, just issue a query
select count(distinct userip) from <table> where page=#page
This identifies a user by IP address, which is not completely failsafe, as multiple users can come from the same IP.
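Here is a small runnable sketch of that table approach, using SQLite from Python only so it is self-contained. The table name page_views and the two helper functions are made up for the example. It adds one assumption on top of the answer: a primary key on (page, userip), so repeat views from the same IP are dropped at insert time and the count becomes a plain COUNT(*). INSERT OR IGNORE is SQLite syntax; on SQL Server you would need an equivalent such as an insert guarded by NOT EXISTS.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# The composite primary key keeps at most one row per (page, userip) pair.
conn.execute(
    "CREATE TABLE page_views ("
    " page TEXT NOT NULL,"
    " userip TEXT NOT NULL,"
    " PRIMARY KEY (page, userip))"
)

def record_view(page, userip):
    """Record a view; repeat views from the same IP are silently ignored."""
    with conn:
        conn.execute(
            "INSERT OR IGNORE INTO page_views (page, userip) VALUES (?, ?)",
            (page, userip),
        )

def unique_views(page):
    """Number of distinct IPs that have viewed the page."""
    row = conn.execute(
        "SELECT COUNT(*) FROM page_views WHERE page = ?", (page,)
    ).fetchone()
    return row[0]

# Three views of "home", two of them from the same IP.
for ip in ["10.0.0.1", "10.0.0.2", "10.0.0.1"]:
    record_view("home", ip)
record_view("about", "10.0.0.1")
```

The database enforces uniqueness, so no application-level lock is needed even when several requests arrive at once.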
But why not investigate Google Analytics? It gives you all the info you need (and more).
I wonder if somebody could point me in the right direction. I've recently started playing with LinqToSQL and love the strongly typed data objects etc.
I'm just struggling to understand the impact on database performance etc. For example, say I was developing a simple user profile page. The page shows basic information about the user, some information on their recent activity, and a list of unread notifications.
If I was developing a stored procedure for this page, I could create a single SP which returns multiple datatables covering all of the required information - resulting in a single db call.
However, using LinqToSQL, this could result in many calls: one for user info, at least one for activity, and at least one for notifications. If I then want further info on the notifications, this may result in even more db calls.
Should I be worried about the number of db calls happening as a result of this design pattern? I.e., are the multiple db handshakes etc. going to degrade my db?
I'd appreciate your thoughts on this!
Thanks
David
LINQ to SQL can consume multiple results from a stored proc if you need to go that route. Unfortunately, the designer has problems mapping them correctly, so you will probably need to create your mapping manually. See http://www.thinqlinq.com/Default/Using-LINQ-to-SQL-to-return-Multiple-Results.aspx.
You can configure LINQ to SQL to eagerly load the child records if you know that you're going to need them for every parent record. Use the DataLoadOptions and .LoadWith to configure it.
You can also project an object graph with multiple child collections in the Select clause of a LINQ query to reduce the number of DB hits that you make.
Ultimately, you need to try a number of options to determine which route performs best in your situation. It's not a one-size-fits-all scenario.
Is it worse from a performance standpoint? Yes, it should be. Multiple round trips are usually worse than a single one.
The real question is, do you mind? Is your application going to receive enough visits to warrant the added complexity of a stored procedure? Or do you value the simplicity of future modifications over raw performance?
In any case, if you need the performance, you can create a stored procedure and map it on your context. This will give you a single call but still return the data as objects.
Here is an article explaining a bit about that option:
linq-to-sql-returning-multiple-result-sets
I need some input on how to design a database layer.
In my application I have a List of T. Each T holds information from multiple database tables.
There are of course multiple ways to do this.
Two ways that I can think of are:
chatty database layer and cacheable:
List<SomeX> list = new List<SomeX>();
foreach(...) {
    list.Add(new SomeX() {
        prop1 = dataRow["someId1"],
        prop2 = GetSomeValueFromCacheOrDb(dataRow["someId2"])
    });
}
The problem that I see with the above is that if we want a list of 500 items, it could potentially make 500 database requests. With all the network latency and that.
Another problem is that the users could have been deleted after we got the list from the database but before we are trying to get it from cache/db, which means that we will have null-problems. Which we have to handle manually.
The good thing is that it's highly cacheable.
non chatty but not cacheable:
List<SomeX> list = new List<SomeX>();
foreach(...) {
    list.Add(new SomeX() {
        prop1 = dataRow["someId1"],
        prop2 = dataRow["someValue"]
    });
}
The problem that I see with the above is that it's hard to cache, since potentially all users have unique lists. The other problem is that it will need a lot of joins, which could result in a lot of reads against the database.
The good thing is that we know for sure that all information exists after the query is run (inner join etc.).
not so chatty, but still cacheable:
A third option could be to first loop through the data rows and collect all the necessary someId2 values, and then make one more database request to fetch the data for all of them at once.
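That third option might look roughly like the sketch below. It uses Python and SQLite only so that it runs on its own. The tables (orders, users), their contents, and the function load_list are invented for the example, while the column names follow the snippets above. A cache lookup could be slotted in before the second query so that only the ids missing from the cache are fetched.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema and data; someId2 = 12 deliberately has no matching row.
conn.executescript("""
CREATE TABLE orders (someId1 INTEGER PRIMARY KEY, someId2 INTEGER);
CREATE TABLE users (someId2 INTEGER PRIMARY KEY, someValue TEXT);
INSERT INTO orders VALUES (1, 10), (2, 11), (3, 10), (4, 12);
INSERT INTO users VALUES (10, 'alice'), (11, 'bob');
""")

def load_list(conn):
    # Query 1: the rows that make up the list.
    rows = conn.execute(
        "SELECT someId1, someId2 FROM orders ORDER BY someId1"
    ).fetchall()
    # Collect each distinct someId2 once.
    ids = sorted({some_id2 for _, some_id2 in rows})
    if not ids:
        return []
    # Query 2: fetch every needed value in one round trip.
    placeholders = ",".join("?" for _ in ids)
    lookup = dict(conn.execute(
        f"SELECT someId2, someValue FROM users WHERE someId2 IN ({placeholders})",
        ids,
    ))
    # A deleted or missing row shows up as None and still has to be handled.
    return [{"prop1": id1, "prop2": lookup.get(id2)} for id1, id2 in rows]

items = load_list(conn)
```

This is two round trips regardless of list size, instead of one per item. Databases cap the number of parameters in a statement, so a very long id list may need to be split into chunks.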
"The problem that I see with the above is that if we want a list of 500 items, it could potentially make 500 database requests. With all the network latency and that."
True. It could also create unnecessary contention and consume server resources maintaining locks as you iterate over a query.
"Another problem is that the users could have been deleted after we got the list from the database but before we are trying to get it from cache/db, which means that we will have null-problems."
If I take that quote, then this quote:
"The good thing is that it's highly cacheable."
is not true, because you've cached stale data. So strike off the only advantage so far.
But to directly answer your question, the most efficient design, which seems to be what you are asking, is to use the database for what it is good for, enforcing ACID compliance and various constraints, most notably pk's and fk's, but also for returning aggregated answers to cut down on round trips and wasted cycles on the app side.
This means you either put SQL into your app code, which has been ruled to be Infinite Bad Taste by the Code Thought Police, or go to sprocs. Either one works. Putting the code into the App makes it more maintainable, but you'll never be invited to any more elegant OOP parties.
Some suggestions:
SQL is a set-based language, so don't design things around iterating in loops. Even in stored procedures, I still see cursors now and then where a set-based query would solve the issue. So always try to get the information with one query. Sometimes this isn't possible, but in the majority of cases it is. You can also design views to make your querying easier if you have a schema with many tables, so that the information you need can be pulled with one statement.
Use proxies. Let's say I have an object with 50 properties. At first you display a list of objects to the user. In this case, I would create a proxy of the most important properties and display that to the user, maybe two or three important ones like name, ID, etc. This cuts down on the amount of information sent initially. When the user actually wants to edit or change the object, then make a second query to get the "full" object. Only get what you need. This is especially important over the web when serializing XML between the layers.
Come up with a paging strategy. Most systems work fine until they get a lot of data, and then the query comes to a halt because it is returning thousands of rows. Page early and often. If you are doing a web application, paging directly in the database will probably be the most performant, because only the paged data is sent between the layers.
Data caching depends on the data. For highly volatile data (changing all the time), caching isn't worth it. But for semi-volatile or non-volatile data, caching can be worth it, but you have to manage the cache either directly or indirectly if you are using a built-in framework.
A good place to use a cache is, say, a zip codes table. Certainly, those don't change that often, and you could cache them to boost performance if you had a zip code drop-down in your application. This is just an example, but caching IMO depends on the type of data.
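To illustrate the paging suggestion above, here is a minimal sketch of paging done in the database, so that only one page of rows ever crosses the wire. It uses SQLite from Python to stay self-contained. The products table and the fetch_page helper are hypothetical, and LIMIT/OFFSET is SQLite syntax (SQL Server has its own equivalent).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT)")
# 25 sample rows, so the last page is only partly filled.
conn.executemany(
    "INSERT INTO products (id, name) VALUES (?, ?)",
    [(i, f"product {i}") for i in range(1, 26)],
)

def fetch_page(conn, page, page_size=10):
    """Return one page of rows; pages are numbered from 1."""
    offset = (page - 1) * page_size
    # A stable ORDER BY is needed, otherwise page boundaries are not well defined.
    return conn.execute(
        "SELECT id, name FROM products ORDER BY id LIMIT ? OFFSET ?",
        (page_size, offset),
    ).fetchall()

first = fetch_page(conn, 1)
last = fetch_page(conn, 3)
```

OFFSET still makes the database walk past the skipped rows, so for very deep pages it can be worth filtering on the last id seen instead (WHERE id > ? ORDER BY id LIMIT ?).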