I have a simple parent-child relationship that I would like to load with LINQ to SQL, fetching the children at the same time as the parent. The generated SQL is doing too much work: it counts the children as well as joining to them. I will not update these objects or add children to the parent; I'm only interested in reading them. I have simplified the tables down to the bare minimum; in reality there are more columns. LINQ to SQL is generating the following SQL:
SELECT [t0].[parentId] AS [Id], [t0].[name], [t1].[childId] AS [Id2],
[t1].[parentId], [t1].[name] AS [name2],
( SELECT COUNT(*)
FROM [dbo].[linqchild] AS [t2]
WHERE [t2].[parentId] = [t0].[parentId]
) AS [value]
FROM [dbo].[linqparent] AS [t0]
LEFT OUTER JOIN [dbo].[linqchild] AS [t1] ON [t1].[parentId] = [t0].[parentId]
ORDER BY [t0].[parentId], [t1].[childId]
I don't know why the SELECT COUNT(*) ... is there, and I'd rather it went away. Both the parent and child tables will have millions of rows in production, and the extra subquery is costing a great deal of time; it seems unnecessary. Is there a way to make it go away? I'm also not sure where the ORDER BY is coming from.
The classes look like this.
[Table(Name = "dbo.linqparent")]
public class LinqParent
{
[Column(Name = "parentId", AutoSync = AutoSync.OnInsert, IsPrimaryKey = true, IsDbGenerated = true, CanBeNull = false)]
public long Id { get; set; }
[ Column( Name = "name", CanBeNull = false ) ]
public string name { get; set; }
[Association(OtherKey = "parentId", ThisKey = "Id", IsForeignKey = true)]
public IEnumerable<LinqChild> Kids { get; set; }
}
[Table(Name = "dbo.linqchild")]
public class LinqChild
{
[Column(Name = "childId", AutoSync = AutoSync.OnInsert, IsPrimaryKey = true, IsDbGenerated = true, CanBeNull = false)]
public long Id { get; set; }
[ Column( Name = "parentId", CanBeNull = false ) ]
public long parentId { get; set; }
[Column(Name = "name", CanBeNull = false)]
public string name { get; set; }
}
I'm using something like the following to query, there would be a where clause in production and an index that matches.
using (DataContext context = new DataContext(new DatabaseStringFinder().ConnectionString, new AttributeMappingSource()) { ObjectTrackingEnabled = false, DeferredLoadingEnabled = false })
{
    var loadOptions = new DataLoadOptions();
    loadOptions.LoadWith<LinqParent>(f => f.Kids);
    context.LoadOptions = loadOptions;

    var table = context.GetTable<LinqParent>();
    context.Log = Console.Out;

    // do something with table.
}
Unfortunately, no. ORMs are never the most performant solution; you'll always get better performance if you write your own SQL (or use stored procedures), but that's the tradeoff that gets made.
What you're seeing is standard practice with ORMs; rather than using a multiple-result query (which seems to me to be the most efficient way, but I'm not an ORM library author), the ORM flattens the entire graph into a single query and brings back all of the information it needs (including information that helps it determine which bits of data are duplicated) to rebuild the graph.
This is also where the ORDER BY comes from, as it requires that linked entities be in contiguous blocks.
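One workaround, offered here only as a sketch rather than anything from this answer, is to skip LoadWith and issue two plain queries, stitching the children onto the parents in memory. The connectionString variable and the WHERE filter below are placeholder assumptions:
using (var context = new DataContext(connectionString, new AttributeMappingSource()) { ObjectTrackingEnabled = false, DeferredLoadingEnabled = false })
{
    // First round trip: just the parents you care about.
    var parents = context.GetTable<LinqParent>()
                         .Where(p => p.Id < 1000)            // placeholder filter
                         .ToList();

    var parentIds = parents.Select(p => p.Id).ToList();

    // Second round trip: all children for those parents in one query
    // (LINQ to SQL translates List.Contains into an IN clause).
    var kidsByParent = context.GetTable<LinqChild>()
                              .Where(c => parentIds.Contains(c.parentId))
                              .ToList()
                              .ToLookup(c => c.parentId);

    foreach (var parent in parents)
        parent.Kids = kidsByParent[parent.Id];
}
With millions of rows, the IN-list approach only makes sense when the parent filter is selective; for broader extracts a view or stored procedure is likely the better fit.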
The query being generated is not all that inefficient. If you look at the estimated execution plan, you will see that the COUNT(*) expense is minimal. The ORDER BY clause should be ordering by your primary key, which is probably your clustered index, so it should also have very little impact on performance.
One thing to make sure of when testing the performance of your LINQ queries is that context.Log is not set. Setting it to Console.Out causes a huge performance hit.
Hope this helps.
Edit:
After looking a little closer at the execution plan, I see that even though my COUNT(*) was just a clustered index scan, it was still 33% of the execution cost, so I agree it is annoying to have this extra sub-select in the SQL. If this really is the performance bottleneck, you might want to consider creating a view or stored procedure to return your results.
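If you do take that route, you can still get typed results by handing the hand-written SQL (or a call to the view/procedure) to DataContext.ExecuteQuery&lt;T&gt;. The FlatRow type, the SQL text, and the parameter below are illustrative assumptions, not part of this answer:
public class FlatRow
{
    public long ParentId { get; set; }
    public string ParentName { get; set; }
    public long ChildId { get; set; }
    public string ChildName { get; set; }
}

// {0} is a positional parameter; ExecuteQuery maps columns to FlatRow by name.
var rows = context.ExecuteQuery<FlatRow>(
    @"SELECT p.parentId AS ParentId, p.name AS ParentName,
             c.childId  AS ChildId,  c.name  AS ChildName
      FROM dbo.linqparent p
      JOIN dbo.linqchild  c ON c.parentId = p.parentId
      WHERE p.parentId = {0}", parentId).ToList();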
Simple issue (EDITED: to show an easily reproducible example first, then the detailed scenario)
Having the following Classes:
public class SalesMetrics {
    public decimal SalesDollars { get; set; }
    public decimal SalesUnits { get; set; }
}

public class ProductGroup {
    public string ProductId { get; set; }
    public string ProductName { get; set; }
}
Using the following Dapper query, my result equals [{Key = null, Value = null}]:
IEnumerable<KeyValuePair<ProductGroup, SalesMetrics>> result = sqlConnection
    .Query<KeyValuePair<ProductGroup, SalesMetrics>>(
        sql: @"SELECT
                   1 AS ProductId,
                   'Test' AS ProductName,
                   1.00 AS SalesDollars,
                   1 AS SalesUnits");
I'm wondering if Dapper can handle a KeyValuePair as the output type; if so, what does the query need to look like?
Full Scenario (Why I'm needing this)
I am creating a sales query builder function that can group my sales results by different grouping predicates and should return a different result type based on that predicate type.
I am using the Dapper NuGet package to get my results from SQL Server, via the Dapper Query<T>() extension method on IDbConnection.
Basically, no matter the grouping type, I want to return a sum of SalesDollars and SalesUnits. For this part of the output, I've created the SalesMetrics class shown above.
I want my sales query function to accept the group class (ProductGroup, or any other class ...) as a generic parameter named TGroup. The function should return a collection of KeyValuePair<TGroup, SalesMetrics>.
Data Source
Here is the layout of my sales table, FlatSales:
CREATE TABLE dbo.FlatSales (
SalesDate DATE NOT NULL,
ProductId INT NOT NULL,
ProductName VARCHAR(100) NOT NULL,
ProductCategoryId INT NOT NULL,
ProductCategoryName VARCHAR(100) NOT NULL,
CustomerGroupId INT NOT NULL,
CustomerGroupName VARCHAR(100) NOT NULL,
CustomerId INT NOT NULL,
CustomerName VARCHAR(100) NOT NULL,
SalesUnits INT NOT NULL,
SalesDollars INT NOT NULL
)
Where I'm having an Issue
I have the following function for querying the DB.
public static IEnumerable<KeyValuePair<TGroup, SalesMetrics>> SalesTotalsCompute<TGroup>(System.Data.IDbConnection connection)
{
    string[] groupByColumnNames = typeof(TGroup)
        .GetProperties(bindingAttr: System.Reflection.BindingFlags.Public | System.Reflection.BindingFlags.Instance)
        .Select(x => x.Name)
        .ToArray();

    string joinedGroupByColumnsNames = string.Join(",", groupByColumnNames);

    return connection.Query<KeyValuePair<TGroup, SalesMetrics>>(sql: $@"
        SELECT SUM(SalesDollars) AS SalesDollars,
               SUM(SalesUnits) AS SalesUnits,
               {joinedGroupByColumnsNames}
        FROM dbo.FlatSales
        GROUP BY {joinedGroupByColumnsNames}
    ");
}
Collection Of NULL
The code does not fail, but it returns a list of KeyValuePairs in which both Key and Value are null.
I've tried aliasing my columns, like ProductName as [Key.ProductName], but that does not change anything (it doesn't fail either).
The generated SQL queries for ProductGroup are as follows (both return empty KeyValuePairs):
SELECT SUM(SalesDollars) AS SalesDollars,
SUM(SalesUnits) AS SalesUnits,
ProductId,ProductName
FROM dbo.FlatSales
GROUP BY ProductId,ProductName
OR
SELECT SUM(SalesDollars) AS [Value.SalesDollars],
SUM(SalesUnits) AS [Value.SalesUnits],
ProductId As [Key.ProductId],ProductName As [Key.ProductName]
FROM dbo.FlatSales
GROUP BY ProductId,ProductName
Any Ideas?
I doubt Dapper supports complex objects like that out-of-the-box.
Perhaps you can benefit from Dapper's multi-mapping feature:
public static IEnumerable<KeyValuePair<TGroup, SalesMetrics>> SalesTotalsCompute<TGroup>(System.Data.IDbConnection connection)
{
    string joinedGroupByColumnsNames = string.Join(",", GetCachedColumnNamesFor<TGroup>());

    return connection.Query<TGroup, SalesMetrics, KeyValuePair<TGroup, SalesMetrics>>(
        sql: $@"SELECT {joinedGroupByColumnsNames},
                       SUM(SalesDollars) AS SalesDollars,
                       SUM(SalesUnits) AS SalesUnits
                FROM dbo.FlatSales
                GROUP BY {joinedGroupByColumnsNames}",
        map: (groupData, salesMetricsData) => new KeyValuePair<TGroup, SalesMetrics>(groupData, salesMetricsData),
        splitOn: "SalesDollars");
}
Remarks
I've reordered the columns because splitOn needs the name of the column where the two objects split; otherwise you'd have to pass the first item from the joinedGroupByColumnsNames array, which is a bit more arbitrary.
If you're on .NET Standard, consider returning ValueTuples instead of KeyValuePairs.
Don't use reflection on every call; I suggest adding a GetCachedColumnNamesFor method that does the reflection only once, using a static ConcurrentDictionary and its GetOrAdd method (see the sketch below).
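A possible shape for that helper, sketched here for completeness (the class and member names are assumptions):
using System;
using System.Collections.Concurrent;
using System.Linq;
using System.Reflection;

internal static class ColumnNameCache
{
    private static readonly ConcurrentDictionary<Type, string[]> Cache =
        new ConcurrentDictionary<Type, string[]>();

    // Reflects the public instance properties of T once, then serves the cached copy.
    public static string[] GetCachedColumnNamesFor<T>()
    {
        return Cache.GetOrAdd(typeof(T), t => t
            .GetProperties(BindingFlags.Public | BindingFlags.Instance)
            .Select(p => p.Name)
            .ToArray());
    }
}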
Other approach
You could also let ProductGroup inherit from SalesMetrics (or make an ISalesMetric interface and let ProductGroup implement that interface) and do Query<ProductGroup>(...). An additional benefit would be that duplicate fields in both models would be blocked by the compiler.
The resulting method would then look like this:
public static IEnumerable<TSalesData> SalesTotalsCompute<TSalesData>(System.Data.IDbConnection connection)
    where TSalesData : ISalesMetric
{
    string joinedGroupByColumnsNames = string.Join(",", GetCachedNonSalesMetricColumnNamesFor<TSalesData>());

    return connection.Query<TSalesData>(sql: $@"
        SELECT SUM(SalesDollars) AS SalesDollars,
               SUM(SalesUnits) AS SalesUnits,
               {joinedGroupByColumnsNames}
        FROM dbo.FlatSales
        GROUP BY {joinedGroupByColumnsNames}
    ");
}
Here the GetCachedNonSalesMetricColumnNamesFor method reflects the properties of TSalesData, excluding those declared on the ISalesMetric interface, again caching the result.
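For completeness, a sketch of the calling side under this approach (ISalesMetric and ProductGroupSales are assumed shapes for illustration, not types from the question):
public interface ISalesMetric
{
    decimal SalesDollars { get; set; }
    decimal SalesUnits { get; set; }
}

// Grouping columns plus the summed metrics live on one flat type.
public class ProductGroupSales : ISalesMetric
{
    public string ProductId { get; set; }
    public string ProductName { get; set; }
    public decimal SalesDollars { get; set; }
    public decimal SalesUnits { get; set; }
}

// Usage: one row per product, each carrying its summed metrics.
IEnumerable<ProductGroupSales> totals = SalesTotalsCompute<ProductGroupSales>(connection);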
I am new to LINQ, so I have been following some tutorials and official docs such as How to: Map Database Relationships on the Developer Network.
Here is what I am doing:
1) I have two classes, Region and Locality (City in the code below), with a one-to-many relationship between these tables, so one Region has multiple Cities. I mapped the association like this:
Region:
private EntitySet<City> _cities = new EntitySet<City>();

[Association(Storage = "_cities", ThisKey = "RegionId", OtherKey = "RegionId")]
public EntitySet<City> Cities
{
    get { return _cities; }
    set { _cities.Assign(value); }
}
Region has two more fields, RegionId and Name. City has two fields too, CityId and Name (besides the RegionId foreign key, of course).
Now I populate the database, and I am able to select all cities using a query like the following:
var city = from City cities in db.cities
select cities;
And I can see all the properties belonging to the City entity. But when I perform this query:
var regiones = from Region region in db.regions
select region;
I can only access RegionId and Name, because the Cities EntitySet is always empty. I don't know what I am doing wrong, so I hope some of you can give me a hand.
Are you sure you have lazy loading enabled on your context?
If not, you can turn it on, or you could use db.regions.Include("Cities") to explicitly load Cities.
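If this is LINQ to SQL (as the attribute mapping suggests) rather than Entity Framework, eager loading is configured through DataLoadOptions rather than Include. A sketch, assuming db is your DataContext and the options are set before the first query:
var loadOptions = new DataLoadOptions();
loadOptions.LoadWith<Region>(r => r.Cities);   // fetch Cities together with each Region
db.LoadOptions = loadOptions;

var regiones = from Region region in db.regions
               select region;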
Sorry for wasting your time, guys. I finally found my mistake: I defined the primary key of Region with the flag IsDbGenerated = true, but I am assigning a pre-set value to the foreign key in City, so the relationship was broken. I fixed the Region primary key and voila.
We are using an extractor application that exports data from the database to CSV files. Based on some condition variable it extracts data from different tables, and for some conditions we have to use UNION ALL, as the data has to be extracted from more than one table. To satisfy the UNION ALL condition we are using nulls to match the number of columns.
Right now all the queries in the system are pre-built based on the condition variable. The problem is that whenever there is a change in the table projection (i.e. a new column added, an existing column modified, or a column dropped), we have to manually change the code in the application.
Can you please give some suggestions on how to extract the column names dynamically, so that changes in the table structure do not require changes in the code?
My concern is the condition that decides which table to query. The condition variable works like this:
if the condition is A, then load from TableX
if the condition is B, then load from TableA and TableY
We must know from which table we need to get data. Once we know the table it is straightforward to query the column names from the data dictionary. But there is one more condition, which is that some columns need to be excluded, and these columns are different for each table.
I am trying to solve the problem only for dynamically generating the column list, but my manager told me to design a solution at the conceptual level rather than just fixing this case. This is a very big system, with providers and consumers constantly loading and consuming data, so he wants a solution that is general.
So what is the best way to store the condition, table name, and excluded columns? One way is storing them in the database. Are there any other ways, and if so, which is best? I have to present at least a couple of ideas before finalizing.
Thanks,
A simple query like this gives you the column names of a table in Oracle:
Select COLUMN_NAME from user_tab_columns where table_name='EMP'
Use it in your code :)
Ok, MNC, try this for size (paste it into a new console app):
using System;
using System.Collections.Generic;
using System.Linq;
using Test.Api;
using Test.Api.Classes;
using Test.Api.Interfaces;
using Test.Api.Models;
namespace Test.Api.Interfaces
{
public interface ITable
{
int Id { get; set; }
string Name { get; set; }
}
}
namespace Test.Api.Models
{
public class MemberTable : ITable
{
public int Id { get; set; }
public string Name { get; set; }
}
public class TableWithRelations
{
public MemberTable Member { get; set; }
// list to contain partnered tables
public IList<ITable> Partner { get; set; }
public TableWithRelations()
{
Member = new MemberTable();
Partner = new List<ITable>();
}
}
}
namespace Test.Api.Classes
{
public class MyClass
{
private readonly IList<TableWithRelations> _tables;
public MyClass()
{
// tableA stuff
var tableA = new TableWithRelations { Member = { Id = 1, Name = "A" } };
var relatedclasses = new List<ITable>
{
new MemberTable
{
Id = 2,
Name = "B"
}
};
tableA.Partner = relatedclasses;
// tableB stuff
var tableB = new TableWithRelations { Member = { Id = 2, Name = "B" } };
relatedclasses = new List<ITable>
{
new MemberTable
{
Id = 3,
Name = "C"
}
};
tableB.Partner = relatedclasses;
// tableC stuff
var tableC = new TableWithRelations { Member = { Id = 3, Name = "C" } };
relatedclasses = new List<ITable>
{
new MemberTable
{
Id = 2,
Name = "D"
}
};
tableC.Partner = relatedclasses;
// tableD stuff
var tableD = new TableWithRelations { Member = { Id = 3, Name = "D" } };
relatedclasses = new List<ITable>
{
new MemberTable
{
Id = 1,
Name = "A"
},
new MemberTable
{
Id = 2,
Name = "B"
},
};
tableD.Partner = relatedclasses;
// add tables to the base tables collection
_tables = new List<TableWithRelations> { tableA, tableB, tableC, tableD };
}
public IList<ITable> Compare(int tableId, string tableName)
{
return _tables.Where(table => table.Member.Id == tableId
&& table.Member.Name == tableName)
.SelectMany(table => table.Partner).ToList();
}
}
}
namespace Test.Api
{
public class TestClass
{
private readonly MyClass _myclass;
private readonly IList<ITable> _relatedMembers;
public IList<ITable> RelatedMembers
{
get { return _relatedMembers; }
}
public TestClass(int id, string name)
{
this._myclass = new MyClass();
// the Compare method would take your two parameters and return
// a matching set of related tables
_relatedMembers = _myclass.Compare(id, name);
// now do something with the resulting list
}
}
}
class Program
{
static void Main(string[] args)
{
// change these values to suit, along with rules in MyClass
var id = 3;
var name = "D";
var testClass = new TestClass(id, name);
Console.Write(string.Format("For Table{0} on Id{1}\r\n", name, id));
Console.Write("----------------------\r\n");
foreach (var relatedTable in testClass.RelatedMembers)
{
Console.Write(string.Format("Related Table{0} on Id{1}\r\n",
relatedTable.Name, relatedTable.Id));
}
Console.Read();
}
}
I'll get back in a bit to see if it fits or not.
So what you are really after is designing a rule engine for building dynamic queries. This is no small undertaking. The requirements you have provided are:
Store rules (what you call a "condition variable")
Each rule selects from one or more tables
Additionally some rules specify columns to be excluded from a table
Rules which select from multiple tables are satisfied with the UNION ALL operator; tables whose projections do not match must be brought into alignment with null columns.
Some possible requirements you don't mention:
Format masking e.g. including or excluding the time element of DATE columns
Changing the order of columns in the query's projection
The previous requirement is particularly significant when it comes to the multi-table rules, because the projections of the tables need to match by datatype as well as number of columns.
Following on from that, the padding NULL columns may not necessarily be tacked on to the end of the projection e.g. a three column table may be mapped to a four column table as col1, col2, null, col3.
Some multi-table queries may need to be satisfied by joins rather than set operations.
Rules for adding WHERE clauses.
A mechanism for defining default sets of excluded columns (i.e. columns which are excluded every time a table is queried).
I would store these rules in database tables, because they are data, and storing data is what databases are for (unless you already have a rules engine to hand).
Taking the first set of requirements you need three tables:
RULES
-----
RuleID
Description
primary key (RuleID)
RULE_TABLES
-----------
RuleID
Table_Name
Table_Query_Order
All_Columns_YN
No_of_padding_cols
primary key (RuleID, Table_Name)
RULE_EXCLUDED_COLUMNS
---------------------
RuleID
Table_Name
Column_Name
primary key (RuleID, Table_Name, Column_Name)
I've used compound primary keys just because it's easier to work with them in this context e.g. running impact analyses; I wouldn't recommend it for regular applications.
I think all of these are self-explanatory except the additional columns on RULE_TABLES.
Table_Query_Order specifies the order in which the tables appear in UNION ALL queries; this matters only if you want to use the column_names of the leading table as headings in the CSV file.
All_Columns_YN indicates whether the query can be written as SELECT * or whether you need to query the column names from the data dictionary and the RULE_EXCLUDED_COLUMNS table.
No_of_padding_cols is a simplistic implementation for matching projections in those UNION ALL queries, by specifying how many NULLs to add to the end of the column list.
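To make that concrete, here is a rough sketch of assembling the UNION ALL query from the rule data. The names are illustrative assumptions; in practice the rule rows would come from the three tables above and the column lists from the data dictionary minus RULE_EXCLUDED_COLUMNS:
using System;
using System.Collections.Generic;
using System.Linq;

public class RuleTable
{
    public string TableName { get; set; }
    public int TableQueryOrder { get; set; }
    public int NoOfPaddingCols { get; set; }
    public IList<string> Columns { get; set; }   // dictionary columns minus excluded ones
}

public static class RuleQueryBuilder
{
    // Builds "SELECT c1, c2, NULL FROM t1 UNION ALL SELECT ..." from the rule rows.
    public static string Build(IEnumerable<RuleTable> ruleTables)
    {
        var selects = ruleTables
            .OrderBy(rt => rt.TableQueryOrder)
            .Select(rt =>
            {
                var cols = rt.Columns.Concat(Enumerable.Repeat("NULL", rt.NoOfPaddingCols));
                return "SELECT " + string.Join(", ", cols) + " FROM " + rt.TableName;
            });

        return string.Join(Environment.NewLine + "UNION ALL" + Environment.NewLine, selects);
    }
}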
I'm not going to tackle those requirements you didn't specify because I don't know whether you care about them. The basic thing is, what your boss is asking for is an application in its own right. Remember that as well as an application for generating queries you're going to need an interface for maintaining the rules.
MNC,
How about creating a dictionary up front of all the known tables involved in the application process (irrespective of the combinations, just a dictionary of the tables), keyed on table name? The values of this dictionary would be an IList<string> of the column names. This would allow you to compare two tables both on the number of columns present (dicTable[myVarTableName].Count) and by iterating round dicTable[myVarTableName].Value to pull out the column names.
At the end of the piece, you could use a little LINQ to determine the table with the greatest number of columns and create the structure with NULLs accordingly (see the sketch below).
Hope this gives food for thought.
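A rough sketch of that idea (the table and column names are made up; in practice they would be read from the data dictionary, and this snippet assumes System, System.Collections.Generic and System.Linq are imported):
// Dictionary of known tables, keyed on table name, each with its column names.
var dicTable = new Dictionary<string, IList<string>>
{
    { "TableX", new List<string> { "Col1", "Col2", "Col3" } },
    { "TableY", new List<string> { "Col1", "Col2" } }
};

// The widest table fixes the projection width for the UNION ALL.
int maxColumns = dicTable.Values.Max(cols => cols.Count);

foreach (var entry in dicTable)
{
    int paddingNeeded = maxColumns - entry.Value.Count;
    Console.WriteLine("{0} needs {1} NULL column(s) of padding", entry.Key, paddingNeeded);
}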
Learning a bit about LINQ.
I have the following code:
(Please excuse the pathetic size of the data set)
class Program
{
static void Main(string[] args)
{
var employees = new List<Employee>
{
new Employee
{
Name = "Bill Bailey",
EmployeeCode = 12345,
Department = "Comedy Lab",
DateOfBirth = DateTime.Parse("13/01/1964"),
CurrentEmployee = true
},
new Employee
{
Name = "Boris Johnson",
EmployeeCode = 56789,
Department = "Cycling Dept.",
DateOfBirth = DateTime.Parse("19/06/1964"),
CurrentEmployee = true
},
new Employee
{
Name = "Bruce Forsyth",
EmployeeCode = 5,
Department = "Comedy Lab",
DateOfBirth = DateTime.Parse("22/03/1928"),
CurrentEmployee = false
},
new Employee
{
Name = "Gordon Brown",
EmployeeCode = 666,
Department = "Backbenches",
DateOfBirth = DateTime.Parse("20/02/1951"),
CurrentEmployee = false
},
new Employee
{
Name = "Russell Howard",
EmployeeCode = 46576,
Department = "Comedy Lab",
DateOfBirth = DateTime.Parse("23/03/1980"),
CurrentEmployee = false
}
};
Func<Employee, bool> oapCalculator = (employee => employee.DateOfBirth.AddYears(65) < DateTime.Now);
var oaps1 = employees.Where(oapCalculator);
var oaps2 = (from employee in employees
where oapCalculator(employee)
select employee);
oaps1.ToList().ForEach(employee => Console.WriteLine(employee.Name));
oaps2.ToList().ForEach(employee => Console.WriteLine(employee.Name));
Console.ReadLine();
}
class Employee
{
public string Name { get; set; }
public int EmployeeCode { get; set; }
public string Department { get; set; }
public DateTime DateOfBirth { get; set; }
public bool CurrentEmployee { get; set; }
}
}
I have a few questions:
As far as I can tell, both of the featured LINQ queries are doing the same thing (black magic may be afoot).
Would they both be compiled down to the same IL?
If not, why, and which would be the most efficient given a sizable amount of data?
What is the best way to monitor LINQ query efficiency? Performance timers or something built-in?
Is the lambda expression the preferred method, as it is the most concise?
In a department of lambda-fearing Luddites, is it worth taking the plunge and teaching 'em up, or should I stick with the SQL-esque syntax?
Thanks
Re
var oaps1 = employees.Where(oapCalculator);
vs
var oaps2 = (from employee in employees
where oapCalculator(employee)
select employee);
There is a slight difference, in particular around the where oapCalculator(employee). The second query is mapped to:
var oaps2 = employees.Where(employee => oapCalculator(employee));
so this is an extra layer of delegate, and will also incur the (small) overhead of a capture-class due to the closure over the variable oapCalculator, and a dereference of this per iteration. But otherwise they are the same. In particular, the Select is trivially removed (in accordance with the spec).
In general, use whichever is clearest in any scenario. In this case, either seems fine, but you will find it easier to use .Where etc if you are regularly dealing in scenarios that involving delegates or Expressions.
I don't mean this to be snide, but sometimes it is better to try things out for yourself. Along those lines, here are some tools, and some of my own experiences.
1 and 2: Disassemble and find out! :) http://www.red-gate.com/products/reflector/
3: Profile your app. This is the answer to any perf-determining question, unless you're doing algorithm work (mathematical proofs, big-o). Profiling tools are built into VS.
4: Which do you prefer? How about your co-workers? This sounds like a statistical question, which would require a survey
5: Similar to 4, try it and find out! As you may have experienced, evangelizing new techniques to your co-workers will teach you as much as it will teach them.
I've found I've had about a 50% success rate with teaching general delegate/lambda usage. I made sure to come up with practical examples from my production test code, and showed how the equivalent imperative code had lots of duplication.
I tried going through the free SICP videos with my team (a real eye-opener on refactoring), and found it a pretty hard sell. Lisp isn't the most attractive language to the majority of programmers...
http://groups.csail.mit.edu/mac/classes/6.001/abelson-sussman-lectures/
Both LINQ queries are equivalent. The second uses syntactic sugar that the compiler translates to an expression similar to your first query before compiling. As far as what is preferred, use whatever seems more readable to you and your team.
I need to select a number of 'master' rows from a table, also returning for each result a number of detail rows from another table. What is a good way of achieving this without multiple queries (one for the master rows and one per result to get the detail rows)?
For example, with a database structure like below:
MasterTable:
- MasterId BIGINT
- Name NVARCHAR(100)
DetailTable:
- DetailId BIGINT
- MasterId BIGINT
- Amount MONEY
How would I most efficiently populate the data object below?
IList<MasterDetail> data;
public class Master
{
    private readonly List<Detail> _details = new List<Detail>();

    public long MasterId { get; set; }

    public string Name { get; set; }

    public IList<Detail> Details
    {
        get { return _details; }
    }
}

public class Detail
{
    public long DetailId { get; set; }

    public decimal Amount { get; set; }
}
Normally, I'd go for the two grids approach; however, you might also want to look at FOR XML: it is fairly easy (in SQL Server 2005 and above) to shape the parent/child data as XML and load it from there.
SELECT parent.*,
(SELECT * FROM child
WHERE child.parentid = parent.id FOR XML PATH('child'), TYPE)
FROM parent
FOR XML PATH('parent')
Also, LINQ to SQL supports this type of model, but you need to tell it which data you want ahead of time, via DataLoadOptions.LoadWith:
// sample from MSDN
Northwnd db = new Northwnd(@"c:\northwnd.mdf");

DataLoadOptions dlo = new DataLoadOptions();
dlo.LoadWith<Customer>(c => c.Orders);
db.LoadOptions = dlo;

var londonCustomers =
    from cust in db.Customers
    where cust.City == "London"
    select cust;

foreach (var custObj in londonCustomers)
{
    Console.WriteLine(custObj.CustomerID);
}
If you don't use LoadWith, you will get n+1 queries: one for the masters, and one child-list query per master row.
It can be done with a single query like this:
select MasterTable.MasterId,
MasterTable.Name,
DetailTable.DetailId,
DetailTable.Amount
from MasterTable
inner join
DetailTable
on MasterTable.MasterId = DetailTable.MasterId
order by MasterTable.MasterId
Then, in pseudocode:
foreach(row in result)
{
if (row.MasterId != currentMaster.MasterId)
{
list.Add(currentMaster);
currentMaster = new Master { MasterId = row.MasterId, Name = row.Name };
}
currentMaster.Details.Add(new Detail { DetailId = row.DetailId, Amount = row.Amount});
}
list.Add(currentMaster);
There are a few edges to knock off, but it should give you the general idea.
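Fleshed out slightly, as a sketch only (the SqlClient plumbing, column ordinals, and variable names are assumptions layered on top of the pseudocode above):
// Assumes: using System.Data.SqlClient; an open SqlConnection "connection";
// and joinSql holding the query shown above (ordered by MasterId).
var masters = new List<Master>();
Master currentMaster = null;

using (var command = new SqlCommand(joinSql, connection))
using (var reader = command.ExecuteReader())
{
    while (reader.Read())
    {
        long masterId = reader.GetInt64(0);

        // New master row encountered: start a new Master object.
        if (currentMaster == null || currentMaster.MasterId != masterId)
        {
            currentMaster = new Master { MasterId = masterId, Name = reader.GetString(1) };
            masters.Add(currentMaster);
        }

        currentMaster.Details.Add(new Detail
        {
            DetailId = reader.GetInt64(2),
            Amount = reader.GetDecimal(3)
        });
    }
}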
select < columns > from master
select < columns > from master M join Child C on M.Id = C.MasterID
You can do it with two queries and one pass on each result set:
Query for all masters ordered by MasterId, then query for all details also ordered by MasterId. Then, with two nested loops, iterate the master data and create a new Master object for each row in the outer loop; in the nested loop, iterate the details while they have the same MasterId as the current Master object and populate its _details collection.
Depending on the size of your dataset, you can pull all of the data into memory with two queries (one for all masters and one for all nested data) and then use that to programmatically create the sublists for each of your objects, giving something like:
List<Master> allMasters = GetAllMasters();
List<Detail> allDetail = GetAllDetail();

foreach (Master m in allMasters)
    foreach (Detail d in allDetail.FindAll(x => x.MasterId == m.MasterId))
        m.Details.Add(d);
You're essentially trading memory footprint for speed with this approach. You can easily adapt this so that GetAllMasters and GetAllDetail only return the master and detail items you're interested in. Also note that for this to work you need to add MasterId to the Detail class.
This is an alternative you might consider. It does cost $150 per developer, but time is money too...
We use an object persistence layer called EntitySpaces that generates the code for you to do exactly what you want, and you can regenerate whenever your schema changes. Populating the objects with data is transparent. Using the objects you described above would look like this (excuse my VB, but it works in C# too):
Dim master As New BusinessObjects.Master
master.LoadByPrimaryKey(43)

Console.WriteLine(master.Name)

For Each detail As BusinessObjects.Detail In master.DetailCollectionByMasterId
    Console.WriteLine(detail.Amount)
    detail.Amount *= 1.15
Next

With master.DetailCollectionByMasterId.AddNew
    .Amount = 13
End With

master.Save()