db4o activation depth, FAQ, and best practice for web applications

Our database (SQL Server) holds 4,000,000 records and its physical size is 550 MB.
The entities in the database are related to each other as a graph. When I load an entity from the db to a depth of 5 levels, there is a problem: all records are loaded.
Is there a mechanism like Entity Framework's Include("MyProperty.ItsProperty")?
Which types are best to use with db4o databases?
Are there any issues with Guid or generic collections?
Is there a best practice for web applications with db4o? Session containers plus an embedded db4o database, or client/server db4o?
Thanks for the help.
Thanks for the good explanation. But I want to give my exact problem as a sample:
I have three entities (N-N relationship; B is an intersection entity; concept: graph):
class A
{
    public B[] BList;
    public int Number;
    public R R;
}
class B
{
    public A A;
    public C C;
    public D D;
    public int Number;
}
class C
{
    public B[] BList;
    public E E;
    public F F;
    public int Number;
}
I want to query dbContext.A.Include("BList.C.BList.A").Include("BList.C.E.G").Where(....)
I want to get: A.BList.C.BList.A.R
But I don't want to get: A.R
I want to get: A.BList.C.E.G
But I don't want to get: A.BList.C.F
I want to get: A.BList.C.E.G
But I don't want to get: A.BList.D
Note: these requirements can change from one query to another.
Extra question: is there any possibility to load
A.BList[#Number<120].C.BList.A[#Number>100] Super syntax :)

Activation: As you said, db4o uses its activation mechanism to control which objects are loaded. To prevent too many objects from being loaded, there are different strategies:
Lower the global default activation depth: configuration.Common.ActivationDepth = 2. Then use the strategies below to activate objects as needed.
Use class-specific activation configuration such as cascading activation, minimum and maximum activation depth, etc.
Activate objects explicitly on demand: container.Activate(theObject, 5)
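The three strategies above could be combined roughly as follows. This is only a sketch: the type and member names are written from memory of the db4o 8.0 .NET API (Db4oEmbedded, IEmbeddedConfiguration, CascadeOnActivate) and should be checked against the version in use. The class A here is a trimmed stand-in for the entity from the question, and the file path is supplied by the caller.

```csharp
using System.Collections.Generic;
using Db4objects.Db4o;
using Db4objects.Db4o.Config;

// Trimmed stand-in for the question's entity; the real class has more fields.
public class A
{
    public int Number;
}

public static class ActivationSketch
{
    public static void Run(string databasePath)
    {
        IEmbeddedConfiguration config = Db4oEmbedded.NewConfiguration();

        // Strategy 1: lower the global default activation depth.
        config.Common.ActivationDepth = 2;

        // Strategy 2: class-specific setting (here: cascade activation for A).
        config.Common.ObjectClass(typeof(A)).CascadeOnActivate(true);

        using (IObjectContainer container = Db4oEmbedded.OpenFile(config, databasePath))
        {
            IList<A> all = container.Query<A>();
            if (all.Count > 0)
            {
                // Strategy 3: activate one object deeper, only when it is needed.
                container.Activate(all[0], 5);
            }
        }
    }
}
```

Activating only the object that is about to be used keeps the rest of the graph unloaded, which is the point of lowering the global depth in the first place.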
However, all of this is rather painful on complex object graphs. The only strategy to get away from that pain is transparent activation. Create an attribute like TransparentlyActivated and use it to mark your stored classes. Then use Db4oTool to enhance your classes by adding the Db4oTool command to the post-build events in Visual Studio, like: PathTo\Db4oTool.exe -ta -debug -by-attribute:YourNamespace.TransparentlyActivated $(TargetPath)
Guid, generic collections:
No issues (in version 7.12 or 8.0). However, if you store your own structs, those are handled very poorly by db4o.
Web application: I recommend an embedded container, and then a session container for each request.
Update for the extended part of the question
Now to your case. For such a complex activation scheme I would use transparent activation.
I assume you are using properties and not public fields in your real scenario, otherwise transparent persistence doesn't work.
Transparent activation basically loads an object at the moment a method or property is called on it for the first time. So when you access the property A.R, A itself is loaded, but not the referenced objects. I'll go through one of your access patterns to show what I mean:
Getting 'A.BList.C.BList.A.R'
A is loaded when you access A.BList. The BList array is filled with unactivated objects.
You keep navigating further to BList.C. At this moment the B object in BList is loaded.
Then you access C.BList. db4o loads the C object.
And so on and so forth.
So when you get 'A.BList.C.BList.A.R', the 'A.R' of the starting object isn't loaded.
An unloaded object is represented by an 'empty' shell object, which has all values set to null or the default value. Arrays are always fully loaded, but at first filled with unactivated objects.
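To make the shell-object idea concrete, here is a small simulation in plain C#. It is not db4o code and does not use the db4o API; ShellObject, its "stored" fields and ActivationLog are hypothetical names invented for the illustration. Each property getter triggers activation of its own object on first access, while the object it references stays an unactivated shell until that one is touched in turn.

```csharp
using System.Collections.Generic;

// Simulation only: mimics how an enhanced class behaves, without db4o.
public sealed class ShellObject
{
    // Records the order in which objects were activated.
    public static readonly List<string> ActivationLog = new List<string>();

    private readonly string _name;
    private readonly int _storedNumber;        // value as it would sit in the database
    private readonly ShellObject _storedNext;  // reference as it would sit in the database

    private bool _activated;
    private int _number;       // stays 0 (default) until activation
    private ShellObject _next; // stays null until activation

    public ShellObject(string name, int storedNumber, ShellObject storedNext)
    {
        _name = name;
        _storedNumber = storedNumber;
        _storedNext = storedNext;
    }

    public bool IsActivated => _activated;

    public int Number { get { Activate(); return _number; } }

    public ShellObject Next { get { Activate(); return _next; } }

    // Fills this object's own fields; the referenced object is handed out
    // as a shell and is only activated when one of its members is accessed.
    private void Activate()
    {
        if (_activated) return;
        _number = _storedNumber;
        _next = _storedNext;
        _activated = true;
        ActivationLog.Add(_name);
    }
}
```

Reading a.Next activates only a; the object it returns is activated later, when its Number or Next is read.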
Note that there's no real query syntax for this kind of elaborate load request. You load your start object and then pull stuff in as you need it.
I also need to mention that this kind of access will perform terribly over the network with db4o.
One more hint: if you want to do elaborate work on a graph structure, you should also take a look at graph databases like Neo4J or Sones Graph DB.

Related

How to deal with IEnumerables/Arrays/Collections in Fluxor State?

I'm currently trying to implement Fluxor for my Blazor WASM app and all the instructions/tutorials I found recommended something like this example for the Store:
public record AppStore(
    int ClickCounter,
    bool IsLoading,
    WeatherForecast[]? Forecasts
);
The tutorials then only talk about the initial state, and updates only ever happen to the bool and the int, while the array is only ever replaced outright. I.e. the examples always fetch the complete data from the server, e.g. 100 entries.
Now, here's my question: How do I properly deal with the array in my reducer when I already have 100 entries in there and only want to add/update/delete one? Is that even a good idea in the first place?

The best thing to do is to use ImmutableList<T> or ImmutableArray<T> instead. ImmutableList<T> is optimised for returning a new instance that includes the old data without having to copy all of the elements; ImmutableArray<T> does copy its elements on each change, but is cheaper to read and iterate.
I've recently released a new library called Reducible that helps to create complex state reducers. It results in fewer updates (e.g. a new parent object isn't created if an item in the list is not replaced).
https://github.com/mrpmorris/Reducible/blob/master/README.md
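As an illustration of the ImmutableList<T> suggestion, a reducer for single-item changes might look like the sketch below. It uses only System.Collections.Immutable and plain static methods, not the Fluxor API or Reducible; the Id and Summary members of WeatherForecast and the ForecastReducers class are assumptions made up for the example.

```csharp
using System.Collections.Immutable;

// Hypothetical item shape; the real WeatherForecast may differ.
public record WeatherForecast(int Id, string Summary);

// Same state as in the question, with the array swapped for an ImmutableList.
public record AppStore(
    int ClickCounter,
    bool IsLoading,
    ImmutableList<WeatherForecast> Forecasts);

public static class ForecastReducers
{
    // Returns a new state whose list shares structure with the old list.
    public static AppStore Add(AppStore state, WeatherForecast item) =>
        state with { Forecasts = state.Forecasts.Add(item) };

    // Replaces the entry with the same Id; returns the same state if absent.
    public static AppStore Update(AppStore state, WeatherForecast item)
    {
        int index = state.Forecasts.FindIndex(f => f.Id == item.Id);
        return index < 0
            ? state
            : state with { Forecasts = state.Forecasts.SetItem(index, item) };
    }

    // Removes every entry with the given Id.
    public static AppStore Remove(AppStore state, int id) =>
        state with { Forecasts = state.Forecasts.RemoveAll(f => f.Id == id) };
}
```

In a Fluxor project these methods would be the bodies of the reducer methods, with the item or id taken from the dispatched action.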

Should I use linked list or list and how do I serialize it.

This is for c#
I'm an old dinosaur who has been writing 360 assembler since the 70s, trying to write stuff for the PC. Along the way I am replacing my old "write it myself" thinking with "use the existing infrastructure".
Here is what I have now. Two objects, System and Planet. A field in System has a pointer to the next System, there is also a second chain of Systems that meet current selection criteria. Also System has a pointer to Planet and Planet has a pointer to the next Planet. Planet also has a chain of all planets.
Now the questions. Should I use lists and have C# handle all the linking etc.? I'm fairly sure one object instance can be in multiple lists, so I can have one list of all systems and a second list of selected systems, plus a list of Planets in each system and another list of all Planets.
I also want to save this mess to disk. I've spent some time looking at serialization and it appears to be great at saving all the instances in a list, but things break down when you want to serialize multiple classes. Am I missing something basic (just a "yes" will send me back to looking), or do I have to roll my own?
I don't want code examples, just a gentle push in the direction I should be looking.

I would simply create two classes, one being the System with a List<Planet> containing all its planets and the other one being the Planet, containing a reference to its system (if one is required). The systems are themselves saved in a List<System>. Like the planets, they could hold a reference to their parent so they have access to the list, but if they don't need to, it's fine.
Saving this stuff should be three lines of code with a serializing system of your choice, either in text or binary (Json.Net, the Xml stuff .Net provides, yaml, binary formatter...).
Linked lists are not worth implementing here: they aren't as useful as dynamic arrays (like List<T> in System.Collections.Generic or std::vector<T> in C++), which resize themselves when needed, and they aren't that easy to keep track of. They definitely have applications, but this is not one of them IMO.
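For readers who do want to see the shape of it, a minimal sketch of the two-class layout with JSON serialization follows. It assumes System.Text.Json, which is one choice among the serializers listed above; the class is named StarSystem because a class called System would clash with the System namespace, and GalaxyStorage and the Name properties are hypothetical.

```csharp
using System.Collections.Generic;
using System.Text.Json;

public class Planet
{
    public string Name { get; set; } = "";
}

public class StarSystem
{
    public string Name { get; set; } = "";

    // The list owns the "next planet" bookkeeping; Planet needs no pointer.
    public List<Planet> Planets { get; set; } = new List<Planet>();
}

public static class GalaxyStorage
{
    // Serializes every system together with its nested planets.
    public static string Save(List<StarSystem> systems) =>
        JsonSerializer.Serialize(systems);

    // Rebuilds the systems and their planets from the saved text.
    public static List<StarSystem> Load(string json) =>
        JsonSerializer.Deserialize<List<StarSystem>>(json) ?? new List<StarSystem>();
}
```

The same StarSystem instance can be added to a second List<StarSystem> holding the current selection; both lists then refer to one object. Only the list of all systems needs saving, since the selection can be rebuilt from the criteria after loading.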

Should I use linked list or list...
The answer depends on what your object represents and how you are going to use it. For example, if I was representing houses, and the people who live at each house; then I might choose to have a collection of House objects. I'm using collection as a generic term there: specifically, I would probably use List<T> from the System.Collections.Generic namespace (where T can represent any type, so it would be a List<House> in this case), unless I needed something more specific like a Stack<T>, Queue<T>, Dictionary<T,U>, etc, etc.
Notice how in this approach, each House doesn't know which house is next, because the whole concept of 'next' relates to the collection of houses: each individual house doesn't need to know where it is in the collection - that's the responsibility of the collection. This is a design principle called "separation of concerns".
For example, if I wanted to create a different collection of House objects (e.g. the ones with red front doors), I could do so by creating a new collection, referring to the same House objects; whereas with the approach mentioned of an object having a reference to the next one, I would have to create a different House object because the next value would be different in those two collections.
Using List<T> allows you to focus on writing your classes, instead of having to write the implementation of the collection.
There are also performance reasons against using linked lists unless you only plan to access the data in sequential order.
Each House has-a collection of people. So I might put a property on House called People, of type List<Person>. And if I needed to get to the house that the person was associated with, I could have a property on Person called House, of type House.
I hope this structure of Houses and People corresponds to your scenario with Systems and Planets.
Maybe also worth looking at When should I use a List vs a LinkedList
...and how do I serialize it.
Plenty on the internet, try these...
How to Serialize List<T>?
https://www.thomaslevesque.com/2009/06/12/c-parentchild-relationship-and-xml-serialization/
Hope this helps to get you started.

From the sound of it, I would create System and Planet classes with a one-to-many reference from a System to its planets (a List here). To avoid strong coupling between System and Planet, one can look at the Chain of Responsibility pattern.
To save this data to a database, one can serialise it using Json.NET (Newtonsoft). SQL Server supports storing a JSON array directly.
Pseudo code:
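using System.Collections.Generic;

class Planet {
    public Planet(StarSystem system) { System = system; }
    // Back-reference to the owning system, set once at construction
    public StarSystem System { get; private set; }
}
// Named StarSystem here because a class called System clashes with the System namespace
class StarSystem {
    public Planet Planet { get; set; }
    // list of planets
    private List<Planet> planets = new List<Planet>();
    public List<Planet> Planets { get { return planets; } }
}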

Manage Cache collection by multiple properties

I have a Sensor class which contains a few properties: id, a, b.
Another class, called SensorCache, is responsible for managing an in-memory cache for my sensor collection.
SensorCache implements the "cache-aside" pattern (see here).
SensorCache works in the traditional way: each Sensor request (requests are made by the id property) first goes to SensorCache:
if it already exists in memory, SensorCache returns it
if it is not in memory, it fetches the required Sensor from my DB, saves it into the in-memory cache object (represented by a Dictionary) and returns it.
Currently my dictionary key is based on the Sensor.id field.
I got a new requirement to return a Sensor by 2 fields (a and b) while keeping my cache logic.
My cache object is currently built to search by a single property (Sensor.id), so I need to think about a new structure which is able to search in memory by 2 different options: Sensor.id, or the pair Sensor.a and Sensor.b.
What is the best approach to handle this?
I thought about holding two different objects, one for each kind of search, but this approach will consume much more memory (x2), so I want to hear other ideas before doing it.

You can write a separate class that implements IEquatable and overrides GetHashCode (and sometimes you have to, to achieve the required performance) but in this simple case, it sounds like you could use Tuple, that is, Dictionary<Tuple<(type of a), (type of b)>, Sensor>.
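A sketch of how the tuple-keyed dictionary could sit next to the existing id-keyed one is shown below. It uses a C# value tuple as the key in place of Tuple<,>, assumes a and b are strings and id is an int (the question does not say), and takes the database lookups as delegates; all of these names are hypothetical. Both dictionaries store references to the same Sensor instances, so the second index costs one extra entry per sensor rather than a second copy of the data.

```csharp
using System;
using System.Collections.Generic;

public class Sensor
{
    public Sensor(int id, string a, string b) { Id = id; A = a; B = b; }
    public int Id { get; }
    public string A { get; }
    public string B { get; }
}

public class SensorCache
{
    private readonly Dictionary<int, Sensor> _byId = new Dictionary<int, Sensor>();
    private readonly Dictionary<(string, string), Sensor> _byAB =
        new Dictionary<(string, string), Sensor>();

    // Stand-ins for the real database queries.
    private readonly Func<int, Sensor> _loadById;
    private readonly Func<string, string, Sensor> _loadByAB;

    public SensorCache(Func<int, Sensor> loadById, Func<string, string, Sensor> loadByAB)
    {
        _loadById = loadById;
        _loadByAB = loadByAB;
    }

    public Sensor GetById(int id)
    {
        if (_byId.TryGetValue(id, out Sensor cached)) return cached;
        return Store(_loadById(id));
    }

    public Sensor GetByAB(string a, string b)
    {
        if (_byAB.TryGetValue((a, b), out Sensor cached)) return cached;
        return Store(_loadByAB(a, b));
    }

    // Registers one instance under both keys; null means "not found in the DB".
    private Sensor Store(Sensor sensor)
    {
        if (sensor == null) return null;
        _byId[sensor.Id] = sensor;
        _byAB[(sensor.A, sensor.B)] = sensor;
        return sensor;
    }
}
```

Because a sensor is registered under both keys whenever it is loaded, a lookup by (a, b) after a lookup by id is served from memory.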

NHibernate lazy loading doesn't appear to be working for my domain?

I'm new to NHibernate, but have managed to get it all running fine for my latest project. But now I've reached the inevitable performance problem where I need to get beyond the abstraction to fix it.
I've created a nunit test to isolate the method that takes a long time. But first a quick overview of my domain model is probably a good idea:
I have a 'PmqccForm' which is an object that has a 'Project' object, which contains Name, Number etc and it also has a 'Questions' object, which is a class that itself contains properties for various different 'Question' objects. There is a JobVelocityQuestion object which itself has an answer and some other properties, and a whole bunch of similar Question objects.
This is what I'm talking about with my PmqccForm having a Questions object
This is the questions part of the model:
The key point is that I want to be able to type
form.Questions.JobVelocityQuestion
as there is always exactly one JobVelocityQuestion for each PmqccForm; it's the same for all the other questions. These are C# properties on the Questions object, which is just a holding place for them.
Now, the method that is causing me issues is this:
public IEnumerable<PmqccForm> GetPmqccFormsByUser(StaffMember staffMember)
{
    ISession session = NHibernateSessionManager.Instance.GetSession();
    ICriteria criteria = session.CreateCriteria(typeof(PmqccForm));
    criteria.CreateAlias("Project", "Project");
    criteria.Add(Expression.Eq("Project.ProjectLeader", staffMember));
    criteria.Add(Expression.Eq("Project.IsArchived", false));
    return criteria.List<PmqccForm>();
}
A look at the console output from the NUnit test, which just runs this method, shows that nearly 2000 SQL queries are being processed!
http://rodhowarth.com/otherstorage/queries.txt is the console log.
The thing is, at this stage I just want the form object, the actual questions can be accessed on a need to know basis. I thought that NHibernate was meant to be able to do this?
Here is my mapping file:
http://rodhowarth.com/otherstorage/hibernatemapping.txt
Can anyone give me a hint as to what I'm missing, or a way to optimize what I'm doing in relation to NHibernate?
What if I made the questions a collection, and then made the properties loop through this collection and return the correct one? Would this be better optimization from NHibernate's point of view?

Just try adding fetch="subselect" to the mapping for the Questions component and see if this solves the issue with multiple selects against that table. You should then see one second select instead of hundreds of separate queries, e.g.
<component name="Questions" insert="true" update="true" class="PmqccDomain.DomainObjects.Questions" fetch="subselect">
For more info, see Improving performance.

Is it bad practice to use an enum that maps to some seed data in a Database?

I have a table in my database called "OrderItemType" which has about 5 records for the different OrderItemTypes in my system. Each OrderItem contains an OrderItemType, and this gives me referential integrity. In my middletier code, I also have an enum which matches the values in this table so that I can have business logic for the different types.
My dev manager says he hates it when people do this, and I am not exactly sure why. Is there a better practice I should be following?

I do this all the time and I see nothing wrong with this. The fact of the matter is, there are values that are special to your application and your code needs to react differently to those values. Would your manager rather you hard-code an Int or a GUID to identify the Type? Or would he rather you derive a special object from OrderItem for each different Type in the database? Both of those suck much worse than an enum.

I don't see any problem in having enum values stored in the database, this actually prevents your code from dealing with invalid code types. After I started doing this I started to have fewer problems, actually. Does your manager offer any rationale for his hatred?

We do this, too. In our database we have an Int column that we map to an Enum value in the code.

If you have a real business concern for each of the specific types, then I would keep the enum and ditch it in the database.
The reason behind this approach is simple:
Every time you add an OrderType, you're going to have to add business logic for it. So that justifies it being in your business domain somewhere (whether it's an enum or not). However, in this case having it in the database doesn't do anything for you.

I have seen this done for performance reasons, but I think that using a caching mechanism would be preferable in most cases.

One alternative to help with the synchronization of the database values and the business logic enum values would be to use the EnumBuilder class to dynamically generate a .dll containing the current enum values from the database. Your business logic could then reference it and have IntelliSense-supported, synchronized enum values.
It's actually much less complicated than it sounds.
Here's a link to MSDN to explain how to dynamically build the enum.
http://msdn.microsoft.com/en-us/library/system.reflection.emit.enumbuilder.aspx
You just have to sub in the database access code to grab the enum values:

One more vote for you. I also map a database int to an application enum; in addition, I usually describe my enums like this:
public enum Operation
{
    [Description("Add item")]
    AddItem = 0,
    [Description("Remove item")]
    RemoveItem = 1
}
which leaves me absolutely free to add new values without needing to change the database, and with a very short workaround I can work with, e.g., lists containing descriptions (that are very strongly tied to the values!). Just a little bit of reflection reaches the goal!
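The "little bit of reflection" could look something like the helper below. This is one possible version, not necessarily the one the answer's author uses; EnumDescriptions and GetDescription are hypothetical names. It reads the DescriptionAttribute from the enum field and falls back to the value's own text when there is none.

```csharp
using System;
using System.ComponentModel;
using System.Reflection;

public enum Operation
{
    [Description("Add item")]
    AddItem = 0,
    [Description("Remove item")]
    RemoveItem = 1
}

public static class EnumDescriptions
{
    // Returns the [Description] text of an enum value, or value.ToString()
    // when the value has no attribute or is not a defined member.
    public static string GetDescription(Enum value)
    {
        FieldInfo field = value.GetType().GetField(value.ToString());
        DescriptionAttribute attribute = field?.GetCustomAttribute<DescriptionAttribute>();
        return attribute?.Description ?? value.ToString();
    }
}
```

A list for a drop-down can then be built by calling GetDescription for each value returned by Enum.GetValues(typeof(Operation)).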
In code, you can typically just add a property like this:
public class Order
{
    public int OrderTypeInt;
    public OrderTypeEnum OrderType
    {
        get { return (OrderTypeEnum)OrderTypeInt; }
        set { OrderTypeInt = (int)value; }
    }
}
