I have a database table with two fields: index (int) and email (varchar(100)).
I need to do the following:
1. Group all emails by domain name (all emails are already lowercase).
2. Select the emails from all groups so that the number of emails taken per domain does not exceed 20% of the total number of emails before step 1.
Code example:
DataContext db = new DataContext();
// Domains to group by
List<string> domains = new List<string>() { "gmail.com", "yahoo.com", "hotmail.com" };
Dictionary<string, List<string>> emailGroups = new Dictionary<string, List<string>>();
// Init dictionary
foreach (string thisDomain in domains)
{
    emailGroups.Add(thisDomain, new List<string>());
}
// Get distinct emails
var emails = db.Clients.Select(x => x.Email).Distinct();
// Total emails
int totalEmails = emails.Count();
// Largest number of emails allowed per domain (20% of the total);
// computed directly so that fewer than 100 emails cannot cause a division by zero
int maxPerDomain = totalEmails * 20 / 100;
// Run on each email
foreach (var thisEmail in emails)
{
    // Run on each domain
    foreach (string thisDomain in emailGroups.Keys)
    {
        // If email is from this domain (match the part after '@', not any substring)
        if (thisEmail.EndsWith("@" + thisDomain))
        {
            // Add to dictionary
            emailGroups[thisDomain].Add(thisEmail);
        }
    }
}
// Will store the final result
List<string> finalEmails = new List<string>();
// Run on each domain
foreach (string thisDomain in emailGroups.Keys)
{
    // More than 20%
    if (emailGroups[thisDomain].Count > maxPerDomain)
    {
        // Take only 20% and join to the final result
        finalEmails = finalEmails.Union(emailGroups[thisDomain].Take(maxPerDomain)).ToList();
    }
    else
    {
        // Join all to the final result
        finalEmails = finalEmails.Union(emailGroups[thisDomain]).ToList();
    }
}
Does anyone know a better way to do this?
I can't think of a way of doing this without hitting the DB at least twice: once for the grouping and once for the overall count. You could try something like:
var query = from u in db.Users
            // The domain is the part after '@'
            group u by u.Email.Split('@')[1] into g
            select new
            {
                Domain = g.Key,
                Users = g.ToList()
            };
query = query.Where(x => x.Users.Count <= (db.Users.Count() * 0.2));
Suppose you want the last items of each group, in ascending order:
int m = (int) (input.Count() * 0.2);
var result = input.GroupBy(x => x.email.Split('@')[1],
                           (key, g) => g.OrderByDescending(x => x.index).Take(m)
                                        .OrderBy(x => x.index))
                  .SelectMany(g => g); // If you want to get the last result without grouping
Or this:
var result = input.GroupBy(x => x.email.Split('@')[1],
                           (key, g) => g.OrderBy(x => x.index)
                                        .Skip(g.Count() - m))
                  .SelectMany(g => g); // If you want to get the last result without grouping
// Take() needs an int, so truncate 20% of the total to a whole number
var maxCount = (int)(db.Users.Count() * 0.2);
var query = (from u in db.Users
             group u by u.Email.Split('@')[1] into g
             select new
             {
                 Domain = g.Key,
                 Users = g.Take(maxCount).ToList()
             })
            .SelectMany(x => x.Users);
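The capping idea in these answers can be tried out in memory. This is only a minimal sketch, assuming the emails have already been loaded into a collection; DomainCap and CapPerDomain are hypothetical names, not part of the original code:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DomainCap
{
    // Keeps at most (fraction * total distinct emails) emails per domain.
    public static List<string> CapPerDomain(IEnumerable<string> emails, double fraction)
    {
        var distinct = emails.Distinct().ToList();
        // Truncate to a whole number of emails, because Take() expects an int.
        int maxPerDomain = (int)(distinct.Count * fraction);

        return distinct
            // The domain is everything after the last '@'.
            .GroupBy(e => e.Substring(e.LastIndexOf('@') + 1))
            // Take() returns the whole group when it is smaller than the cap.
            .SelectMany(g => g.Take(maxPerDomain))
            .ToList();
    }
}
```

Calling DomainCap.CapPerDomain(emails, 0.2) would apply the 20% limit from the question. Because Take() already handles groups below the cap, no separate "more than 20%" branch is needed.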
Let's say I have a list of class Car. It has the properties Id, Model, Year, Age and CarCode. A CarCode is unique within a model but not across all models.
I want to group by Model and Year together, then group each of those groups by Age, and then, for each Age group within each Model and Year group, send the Model, Age and list of CarCodes to a method that updates a DB table with those CarCodes.
What I want
var carModelYearGroupList=GroupBy CarLIst(Model, Year)
var carAgeGL= groupBy carModelYearGroupList(key.Age)
Parallel.ForEach(carModelYearGroupList, new ParallelOptions { MaxDegreeOfParallelism = 10 }, (car) =>
{
Parallel.ForEach(carAgeGL, new ParallelOptions { MaxDegreeOfParallelism = 10 }, (ca) =>
{
UpdateDb(Model m, Age a, List<string> carCodes);
});
});
What I tried
var consolidatedChildren =
from c in bla
group c by new
{
c.sreSif,
c.sreSerija,
} into gcs
from g in gcs
group g by new
{
g.sreIsplatio
} into gcse
select new
{
srecka= gcs.Key.sreSif,
serija = gcs.Key.sreSerija,
opBroj = gcse.Key.sreIsplatio,
sreckeTemp = gcse.ToList(),
};
foreach (var c in consolidatedChildren)
{
foreach (var isp in c.sreckeTemp)
{
List<string>IsplatniBrojevi= new List<string>();
IsplatniBrojevi.Add(isp.sreIspBroj);
}
}
However, gcs is not visible in these lines:
select new
{
srecka= gcs.Key.sreSif,
serija = gcs.Key.sreSerija,
Appreciate any help.
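In query syntax, `into` starts a new scope, so range variables declared before it (such as gcs) are no longer visible afterwards. One possible way around this is to nest the second GroupBy inside a lambda, where the outer group is still in scope. The sketch below is an illustration under assumptions: the Car class is reconstructed from the property names in the question (with guessed property types), and CarGrouping and GroupCars are hypothetical names.

```csharp
using System.Collections.Generic;
using System.Linq;

// Assumed shape of the Car class described in the question.
public class Car
{
    public int Id { get; set; }
    public string Model { get; set; }
    public int Year { get; set; }
    public int Age { get; set; }
    public string CarCode { get; set; }
}

public static class CarGrouping
{
    // Returns one entry per (Model, Year, Age) combination with its car codes.
    public static List<(string Model, int Year, int Age, List<string> CarCodes)> GroupCars(
        IEnumerable<Car> cars)
    {
        return cars
            .GroupBy(c => new { c.Model, c.Year })
            .SelectMany(modelYear => modelYear
                // modelYear.Key is still accessible inside this lambda.
                .GroupBy(c => c.Age)
                .Select(age => (modelYear.Key.Model,
                                modelYear.Key.Year,
                                Age: age.Key,
                                CarCodes: age.Select(c => c.CarCode).ToList())))
            .ToList();
    }
}
```

Each returned entry carries the Model, Age and CarCodes that the question wants to pass to its update method, so a single loop (parallel or not) over the result would be enough.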
I have a class whose query result must include a property that counts how many duplicates of each row's value exist in the table.
This code works, but it's very slow. Can you help me?
var lst = _uow.Repository.GetAll();
var query =
from p in lst
select new GetRfqResponse
{
ID = p.ID,
//bad performance
Count = lst.Where(x => x.Property == p.Property).AsQueryable().Count(),
//
};
Counting in a queryable list can be easily achieved using the Count() function:
// Find duplicated names
var byName = from s in studentList
group s by s.StudentName into g
select new { Name = g.Key, Count = g.Count() };
The following is for in-memory collections.
GroupBy can help here.
var propertyGroupedList = list.GroupBy(l => l.Property);
var query = list.Select(l => new GetRfqResponse {
    ID = l.ID,
    Count = propertyGroupedList.First(g => g.Key == l.Property).Count()
});
Or you can create a dictionary with Property as the key and the count as the value, so you only have to loop once to store the counts.
This lets you look up each count in constant time.
Dictionary<string, int> map = new Dictionary<string, int>();
foreach (var item in lst)
{
    // Check the current item's key, not the list itself
    if (!map.ContainsKey(item.Property))
    {
        map.Add(item.Property, 1);
    }
    else
        map[item.Property]++;
}
var z = lst.Select(l => new GetRfqResponse {
    ID = l.ID,
    Count = map[l.Property]
});
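The GroupBy answer and the dictionary answer can also be combined, building the lookup in one pass with GroupBy followed by ToDictionary. This is a minimal in-memory sketch; DuplicateCounter and CountPerKey are hypothetical names introduced only for illustration:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DuplicateCounter
{
    // Builds a key -> occurrence count lookup in a single pass over the items.
    public static Dictionary<TKey, int> CountPerKey<T, TKey>(
        IEnumerable<T> items, Func<T, TKey> keySelector)
    {
        return items
            .GroupBy(keySelector)
            .ToDictionary(g => g.Key, g => g.Count());
    }
}
```

With the names from the question, this would be used as var map = DuplicateCounter.CountPerKey(lst, x => x.Property); after which map[p.Property] gives the count for each row without rescanning the list.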
I'm trying to create a line chart using a datatable that has all the data that I need but I can't seem to group it successfully.
I need to group the datatable by month/year and count all the records that fall within each respective month/year.
I actually tried to extract the data and make the graph in Excel, so it should look like this.
Ignore the quarters, as those are auto-generated by Excel.
Here's the code:
var chartResults = (from myrow in chartInstruction.Chart.AsEnumerable()
group myrow by new
{
series = myrow.Field<string>(chartInstruction.Series),
xaxis = myrow.Field<DateTime>(chartInstruction.Xaxis).ToShortDateString()
}
into g
select new
{
Field = g.Key,
Total = g.Count()
}).ToArray();
string[] xAxis = chartResults.Select(x => x.Field.xaxis).Distinct().OrderBy(x => x).ToArray();
string[] series = chartResults.Select(x => x.Field.series).Distinct().OrderBy(x => x).ToArray();
var seriesInfo = new List<SeriesInfo>();
foreach (string s in series)
{
var tmp = new List<double>();
foreach (string xx in xAxis)
{
var found = chartResults.FirstOrDefault(x => x.Field.xaxis == xx && x.Field.series == s);
tmp.Add(found?.Total ?? 0);
}
seriesInfo.Add(new SeriesInfo { Name = s, Series = tmp.ToArray() });
}
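Note that the code above groups on ToShortDateString(), which produces one group per day rather than per month. One possible way to get month/year buckets is to group on the Year and Month components of the date instead. The sketch below works on a plain sequence of dates rather than the DataTable; MonthlyCounts and CountByMonth are hypothetical names:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class MonthlyCounts
{
    // Counts how many dates fall into each (year, month) bucket, in chronological order.
    public static List<(int Year, int Month, int Total)> CountByMonth(IEnumerable<DateTime> dates)
    {
        return dates
            .GroupBy(d => new { d.Year, d.Month })
            .OrderBy(g => g.Key.Year)
            .ThenBy(g => g.Key.Month)
            .Select(g => (g.Key.Year, g.Key.Month, Total: g.Count()))
            .ToList();
    }
}
```

In the DataTable query, the same idea would mean adding the date's Year and Month to the grouping key in place of the short date string. Sorting on the numeric year and month also avoids the ordering problems that come with sorting date strings alphabetically.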
I'm using LINQ for a search function. I have to search by a list of locations (display students from Tokyo, Berlin, New York). I have a foreach statement that goes through all the locations and adds the matching students to a list. My problem is that I can't display them all outside of the foreach. How can I declare var newstudents before the foreach?
Below is my code:
public void search(IEnumerable<string> Location)
{
    foreach (var l in Location)
    {
        var students = from s in db.Students select s;
        students = students.Where(s => s.City.Contains(l));
        var customers = students.ToList();
    }
    int custIndex = 1;
    Session["TopEventi"] = customers.ToDictionary(x => custIndex++, x => x);
    ViewBag.TotalNumberCustomers = customers.Count();
}
My problem is that I can't display them all outside of foreach. How
can I declare var newstudents before foreach?
Why can't you do that? You just need to declare the variable as IEnumerable<ClassName>:
// Start from an empty sequence; calling Concat on null would throw
IEnumerable<Student> customers = Enumerable.Empty<Student>();
foreach (var l in Location)
{
    var students = from s in db.Students
                   where s.City.Contains(l)
                   select s;
    customers = customers.Concat(students);
}
customers = customers.ToList();
But you don't need the foreach at all, you can do it with one LINQ query:
IEnumerable<Student> customers = db.Students
.Where(s => Location.Any(l => s.City.Contains(l)));
This approach searches for the location as a substring of Student.City.
Get rid of the loop entirely.
public void search(IEnumerable<string> Location)
{
    string[] locations = Location.ToArray();
    // Exact match: keeps students whose City equals one of the locations
    var customers = db.Students.Where(s => locations.Contains(s.City)).ToList();
}
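The two loop-free answers differ in how they match: one tests whether City contains a location as a substring, the other tests whether City equals a location exactly. A small in-memory sketch of the difference, using plain strings instead of Student entities; CityFilter, MatchSubstring and MatchExact are hypothetical names:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class CityFilter
{
    // Substring match: keeps a city if it contains any of the locations.
    public static List<string> MatchSubstring(IEnumerable<string> cities, IEnumerable<string> locations)
    {
        var wanted = locations.ToList();
        return cities.Where(c => wanted.Any(l => c.Contains(l))).ToList();
    }

    // Exact match: keeps a city only if it equals one of the locations.
    public static List<string> MatchExact(IEnumerable<string> cities, IEnumerable<string> locations)
    {
        var wanted = new HashSet<string>(locations);
        return cities.Where(c => wanted.Contains(c)).ToList();
    }
}
```

For example, with the location "New York", the substring version would keep a city stored as "New York City" while the exact version would not. Which behaviour is right depends on how the City values are stored.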
You could declare the list outside the foreach, and inside the loop only do something like:
yourList.AddRange(students.ToList());
You can declare a dictionary that maps each location to its students, and add new lists to it in the loop.
Also, your use of the word newstudents is a bit confusing: your code isn't looking for new students, it only maps students to their location. Anyway, here is newStudents declared outside the loop:
public void search(IEnumerable<string> Location)
{
    // Map each location (a string) to the students found there
    Dictionary<string, List<Student>> newStudents = new Dictionary<string, List<Student>>();
    foreach (var l in Location)
    {
        var students = from s in db.Students select s;
        students = students.Where(s => s.City.Contains(l));
        newStudents[l] = students.ToList();
    }
    int custIndex = 1;
    //what is this for? seeing lastly added
    // Flatten the per-location lists, since customers no longer exists here
    Session["TopEventi"] = newStudents.Values.SelectMany(list => list).ToDictionary(x => custIndex++, x => x);
    ViewBag.TotalNumberCustomers = newStudents.Values.Sum(list => list.Count);
}
I have made a group by statement on a datatable like this:
var finalResult = (from r in result.AsEnumerable()
group r by new
{
r.Agent,
r.Reason
} into grp
select new
{
Agent = grp.Key.Agent,
Reason = grp.Key.Reason,
Count = grp.Count()
}).ToList();
The finalResult will be like this:
agent1 reason1 4
agent1 reason2 7
agent2 reason1 8
agent2 reason2 3
..
...
...
agentn reason1 3
agentn reason2 11
I want to loop over the agent names in order to get the reasons, and the count for each reason, for each agent. In other words, I need to build this:
Can you please tell me how to loop over the agent names in the finalResult variable?
You need one more GroupBy and you are done:
var solution =
finalResult
.GroupBy(x => x.Agent);
foreach (var group in solution)
{
// group.Key is the agent
// All items in group are a sequence of reasons and counts for this agent
foreach (var item in group)
{
// Item has <Agent, Reason, Count> and belongs to the agent from group.Key
}
}
Outer loop goes over all the agents (so Agent1, Agent2, etc.) while inner loop will go through all reasons for the current agent.
You might want to try GroupBy in LINQ.
Perhaps:
var agentGroups = finalResult
.GroupBy(x => x.Agent)
.Select(ag => new
{
Agent = ag.Key,
ReasonCounts = ag.GroupBy(x => x.Reason)
.Select(g => new
{
Agent = ag.Key,
Reason = g.Key,
Count = g.Sum(x => x.Count)
}).ToList(),
Total_Count = ag.Sum(x => x.Count)
});
foreach (var agentGroup in agentGroups)
{
string agent = agentGroup.Agent;
int totalCount = agentGroup.Total_Count;
foreach (var reasonCount in agentGroup.ReasonCounts)
{
string reason = reasonCount.Reason;
int count = reasonCount.Count;
}
}