I have to load and use my data from a db. The data is represented like this:
group_id term
1 hello
1 world
1 bye
2 foo
2 bar
etc.
What is a good C# collection to load and use this data?
Looks like you need a Dictionary<int, List<string>>:
var dict = new Dictionary<int, List<string>>();
dict.Add(1, new List<string> { "hello", "world", "bye" });
dict.Add(2, new List<string> { "foo", "bar" });
It all depends on what you have to do with the collection but it seems like Lookup is a good candidate in case you need to group by group_id.
If your data is in a datatable:
var lookup = table.AsEnumerable().ToLookup(row => row.Field<int>("group_id"));
and then access the groups the following way:
foreach (var group in lookup)
{
int groupID = group.Key;
IEnumerable<DataRow> groupRows = group;
}
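If the rows are not in a DataTable, the same grouping works on any in-memory sequence. Here is a minimal sketch, assuming the rows have already been read into (GroupId, Term) tuples; the tuple shape and variable names are mine, not from the question:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Assumption: the rows were already read from the db into tuples.
var rows = new List<(int GroupId, string Term)>
{
    (1, "hello"), (1, "world"), (1, "bye"),
    (2, "foo"), (2, "bar"),
};

// The key selector picks the group id; the element selector keeps only the term.
ILookup<int, string> lookup = rows.ToLookup(r => r.GroupId, r => r.Term);

foreach (var group in lookup)
{
    Console.WriteLine($"{group.Key}: {string.Join(", ", group)}");
}

// Indexing a key that is not present yields an empty sequence, not an exception.
int missingCount = lookup[99].Count();
Console.WriteLine(missingCount); // 0
```

Unlike a Dictionary, a Lookup is immutable once built, so it fits best when the data is loaded once and then only read.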
It depends very strongly on what you need to do with your data.
If you just need to list your data, create a class which holds the data and use a List<Vocable>.
public class Vocable
{
public int Group { get; set; }
public string Term { get; set; }
}
List<Vocable> vocables;
If you need to look up all terms belonging to a group, use a Dictionary<int, List<string>> using the group id as key and a list of terms as value.
If you need to look up the group a term belongs to, use a Dictionary<string, int> using the term as key and the group id as value.
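Both dictionaries can be filled in one pass over the rows. This is only a sketch: the (GroupId, Term) tuples stand in for whatever your data access code returns, and it assumes each term belongs to exactly one group.

```csharp
using System;
using System.Collections.Generic;

// Assumption: the rows were already read from the db into tuples.
var rows = new List<(int GroupId, string Term)>
{
    (1, "hello"), (1, "world"), (1, "bye"),
    (2, "foo"), (2, "bar"),
};

var termsByGroup = new Dictionary<int, List<string>>();
var groupByTerm = new Dictionary<string, int>();

foreach (var (groupId, term) in rows)
{
    // Create the list the first time a group id is seen.
    if (!termsByGroup.TryGetValue(groupId, out var terms))
    {
        terms = new List<string>();
        termsByGroup.Add(groupId, terms);
    }
    terms.Add(term);

    // Assumes a term appears in one group only; the indexer overwrites on a repeat.
    groupByTerm[term] = groupId;
}

Console.WriteLine(string.Join(", ", termsByGroup[1])); // hello, world, bye
Console.WriteLine(groupByTerm["foo"]);                 // 2
```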
To load and save data from and to a DB you can use the DataSet/DataTable/DataRow classes.
Look at DataAdapter and the related classes; which ones you use depends on your database server (MySQL, MS SQL, ...).
If you want to work with objects, I suggest an ORM such as Entity Framework (Microsoft) or SubSonic.
http://subsonicproject.com/docs/The_5_Minute_Demo
With an ORM you can use LINQ queries like this:
// Define a query that returns all Department
// objects and course objects, ordered by name.
var departmentQuery = from d in schoolContext.Departments.Include("Courses")
orderby d.Name
select d;
Coming from this list
List<Foo> list = new List<Foo>()
{
new Foo("Albert", 49, 8),
new Foo("Barbara", 153, 45),
new Foo("Albert", -23, 55)
};
I want to get to a dictionary with the names as key and the first Foo-object from the list with that given name as value.
Is there a way to write the logic in a more succinct way using LINQ than what I did here?
Dictionary<string, Foo> fooByName = new Dictionary<string, Foo>();
foreach (var a in assignmentIndetifiers)
{
if (!names.ContainsKey(a.ValueText))
{
names.Add(a.ValueText, a);
}
}
You could try something like this:
var dictionary = list.GroupBy(f=>f.Name)
.Select(gr=>new { Name = gr.Key, Foo = gr.First() })
.ToDictionary(x=>x.Name, x=>x.Foo);
I have assumed that Foo has a property called Name, which is the one you want to use as the key for your dictionary. If not so, you should change this correspondingly.
Essentially, we group the items in list by Name, then project each group to an anonymous type with two properties, Name and Foo, which correspond to the key and value of the dictionary you want, and finally call the ToDictionary method.
Update
As Igor correctly pointed out, you can bypass the Select projection entirely and call ToDictionary immediately:
var dictionary = list.GroupBy(f=>f.Name)
.ToDictionary(gr=>gr.Key, gr=>gr.First());
As an alternative to GroupBy() you can use MoreLinq's DistinctBy() (available via NuGet):
var names = items.DistinctBy(x => x.ValueText).ToDictionary(x => x.ValueText);
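On .NET 6 or later, DistinctBy is also part of System.Linq, so no extra package is needed there. A small sketch comparing both approaches, with (Name, A, B) tuples standing in for the question's Foo class (the field names A and B are my assumption):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Tuples stand in for Foo; A and B are hypothetical field names.
var list = new List<(string Name, int A, int B)>
{
    ("Albert", 49, 8),
    ("Barbara", 153, 45),
    ("Albert", -23, 55),
};

// GroupBy keeps source order inside each group, so First() is the first occurrence.
var viaGroupBy = list.GroupBy(f => f.Name)
                     .ToDictionary(g => g.Key, g => g.First());

// Enumerable.DistinctBy (.NET 6+): the current implementation yields
// the first element seen for each key.
var viaDistinctBy = list.DistinctBy(f => f.Name)
                        .ToDictionary(f => f.Name);

Console.WriteLine(viaGroupBy["Albert"].A);    // 49
Console.WriteLine(viaDistinctBy["Albert"].A); // 49
```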
I have a development scenario where I am joining two collections with Linq; a single list of column header objects which contain presentation metadata, and an enumeration of kv dictionaries which result from a web service call. I can currently iterate (for) through the dictionary enumeration, and join the single header list to the current kv dictionary without issue. After joining, I emit a curated array of dictionary values for each iteration.
What I would like to do is eliminate the for loop, and join the single header list directly to the entire enumeration. I understand the 1-to-1 collection join pretty well, but the 1-to-N syntax is eluding me.
Details
I have the following working method:
public void GetQueryResults(DataTable outputTable)
{
var odClient = new ODataClient(UrlBase);
var odResponse = odClient.FindEntries(CommandText);
foreach (var row in odResponse)
{
var rowValues = OutputFields
.Join(row, h => h.Key, r => r.Key,
(h, r) => new { Header = h, ResultRow = r })
.Select(r => r.ResultRow.Value);
outputTable.Rows.Add(rowValues.ToArray());
}
}
odResponse contains IEnumerable<IDictionary<string, object>>; OutputFields contains IList<QueryField>; the .Join produces an enumeration of anons containing matched field metadata (.Header) and response kv pairs (.ResultRow); finally, the .Select emits the matched response values for row consumption. The OutputField collection looks like this:
class QueryField
{
public string Key { get; set; }
public string Label { get; set; }
public int Order { get; set; }
}
Which is declared as:
public IList<QueryField> OutputFields { get; private set; }
By joining the collection of field headers to the response rows, I can pluck just the columns I need from the response. If the header keys contain { "size", "shape", "color" } and the response keys contain { "size", "mass", "color", "longitude", "latitude" }, I will get an array of values for { "size", "shape", "color" }, where shape is null, and the mass, longitude, and latitude values are ignored. For the purposes of this scenario, I am not concerned with ordering. This all works a treat.
Problem
What I'd like to do is refactor this method to return an enumeration of value array rows, and let the caller manage the consumption of the data:
public IEnumerable<string[]> GetQueryResults()
{
var odClient = new ODataClient(UrlBase);
var odResponse = odClient.FindEntries(CommandText);
var responseRows = //join OutputFields to each row in odResponse by .Key
return responseRows;
}
Followup Question
Would a Linq-implemented solution for this refactor require an immediate scan of the enumeration, or can it pass back a lazy result? The purpose of the refactor is to improve encapsulation without causing redundant collection scans. I can always build imperative loops to reformat the response data the hard way, but what I'd like from Linq is something like a closure.
Thanks heaps for spending the time to read this; any suggestions are appreciated!
I'm not completely sure what you mean but could it be you're meaning something like this?
public IEnumerable<object[]> GetQueryResults()
{
var odClient = new ODataClient(UrlBase);
var odResponse = odClient.FindEntries(CommandText);
// Build each row with a nested LINQ query; ToArray() turns the
// matched values into the object[] for that row.
var responseRows = from row in odResponse
                   select (from field in row
                           join outputfield in OutputFields
                           on field.Key equals outputfield.Key
                           select field.Value).ToArray();
return responseRows;
}
This replaces filling a DataTable. It creates an array of objects for each row, filled with field.Value wherever field.Key exists in OutputFields. The whole thing is wrapped in a lazy IEnumerable (from row in odResponse).
Usage:
var responseRows = GetQueryResults();
foreach(var rowValues in responseRows)
outputTable.Rows.Add(rowValues);
The trick here is that, within one query, you iterate the list and run a subquery on each row's fields, storing the subquery result directly in an object[]. Each object[] is only created when responseRows is iterated. I think this answers your second question: the result is lazy.
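To convince yourself that nothing is scanned until the caller iterates, you can count how many rows are actually pulled. This is only a sketch: FetchRows and outputKeys are hypothetical stand-ins for odClient.FindEntries and the OutputFields keys.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the keys of OutputFields.
var outputKeys = new List<string> { "size", "shape", "color" };
int rowsRead = 0;

// Hypothetical stand-in for the web service response; it is itself lazy.
IEnumerable<IDictionary<string, object>> FetchRows()
{
    var data = new[]
    {
        new Dictionary<string, object> { ["size"] = 1, ["mass"] = 2, ["color"] = "red" },
        new Dictionary<string, object> { ["size"] = 3, ["mass"] = 4, ["color"] = "blue" },
    };
    foreach (var row in data)
    {
        rowsRead++; // counts how many rows have actually been pulled
        yield return row;
    }
}

// Nothing is enumerated here; the query only describes the work.
IEnumerable<object[]> responseRows =
    from row in FetchRows()
    select (from key in outputKeys
            join field in row on key equals field.Key
            select field.Value).ToArray();

int readBeforeIteration = rowsRead;

// Enumerating the query is what pulls the rows and builds each object[].
var materialized = responseRows.ToList();
int readAfterIteration = rowsRead;

Console.WriteLine($"{readBeforeIteration} {readAfterIteration}"); // 0 2
```

Note that join is an inner join, so a header key with no match in a row ("shape" here) is simply left out of that row's array.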
I am parsing a test file, in the form of:
[Person]: [Name]-[John Doe], [Age]-[113], [Favorite Color]-[Red].
[Person]: [Name]-[John Smith], [Age]-[123], [Favorite Color]-[Blue].
[Person]: [Name]-[John Sandles], [Age]-[133], [Favorite Color]-[Green].
[Person]: [Name]-[Joe Blogs], [Age]-[143], [Favorite Color]-[Khaki].
As you can see, the values are not duplicated (though I want to account for future dupes), but the Keys are dupes. The keys being the parts before the hyphen (-).
But every time I get these into a Dictionary it has a fit and tells me dupes aren't allowed. Why doesn't the Dictionary allow dupes? And how can I overcome this?
The Dictionary has the TKey part of it being hashed for fast lookup, if you have dupes in there, you'll get into collisions and complexities, which will reduce your ability to look things up quickly and efficiently. That is why dupes are not allowed.
You could make a struct with the data in it, and put that in a Dictionary<ID, MyStruct> for example. This way you avoid dupes in the key (which is unique for each struct), and you have all your data in a Dictionary.
A Dictionary can have dupes in its values but not in its keys, because otherwise how would you tell which key's value you want?
And how can I overcome this
Use a KeyValuePair<TKey, TValue>[], but in that case, too, how will you tell which key's value you want?
You can use the Wintellect Power Collections' MultiDictionary class. Power Collections is a long-established set of collection classes for .Net 2 or later. It hasn't been updated for 5 years, but it doesn't need to be.
See here: http://powercollections.codeplex.com/
Download it here: http://powercollections.codeplex.com/releases/view/6863
The simplest thing to do is to use a Dictionary<string, List<string>>.
Usage:
foreach(var person in persons)
{
List<string> list;
if(!dict.TryGetValue(person.Key, out list))
{
list = new List<string>();
dict.Add(person.Key, list);
}
list.Add(person.Data);
}
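Applied to the file format from the question, that could look roughly like this. The regular expression is my guess at the format based on the sample lines, so treat it as a sketch rather than a complete parser:

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

var lines = new[]
{
    "[Person]: [Name]-[John Doe], [Age]-[113], [Favorite Color]-[Red].",
    "[Person]: [Name]-[John Smith], [Age]-[123], [Favorite Color]-[Blue].",
};

// Matches "[key]-[value]" pairs; the pattern is an assumption about the file format.
var pair = new Regex(@"\[([^\]]+)\]-\[([^\]]*)\]");

var dict = new Dictionary<string, List<string>>();
foreach (var line in lines)
{
    foreach (Match m in pair.Matches(line))
    {
        string key = m.Groups[1].Value;

        // A repeated key appends to its existing list instead of throwing.
        if (!dict.TryGetValue(key, out var values))
        {
            values = new List<string>();
            dict.Add(key, values);
        }
        values.Add(m.Groups[2].Value);
    }
}

Console.WriteLine(string.Join(", ", dict["Name"])); // John Doe, John Smith
```

This keeps every value per key, but it loses which values came from the same line, so it only fits if you do not need to reassemble individual persons later.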
Lookup<TKey, TElement> class from System.Linq namespace represents a collection of keys each mapped to one or more values. More info: MSDN
List<Person> list= new List<Person>();
// ...
var lookup = list.ToLookup(person => person.Key, person => new {Age=person.Age, Color=person.Color});
var entriesWithKeyX = lookup["X"]; // sequence of the projected { Age, Color } objects
public class Person
{
public string Key { get; set; }
public string Age { get; set; }
public string Color { get; set; }
}
Based on your question I suppose that you are using [Name], [Age] and [Favorite Color] as keys. There are many ways to put your data into the dictionary using these keys, but the real question is: how will you get it back?
The keys in a Dictionary must be unique, so you need to find some unique data to use as a key.
In your case the test file looks like a list of persons, where each line contains one person's data. So the most natural way is to build a dictionary with one entry per person, where the 'unique data' is the person's name, as long as it is not duplicated.
In real life however Person's name is usually a bad choice, (not only because it may change over time, but also because the probability of identical names is very high), so artificial keys are used instead (row number, Guids, etc.)
Edit: I see that the number of properties may vary. So you need to use nested dictionaries. Outer - for 'Persons' and inner for Person properties:
Dictionary<string, Dictionary<string, string>> person_property_value;
However for your data structure to be more understandable you should put the inner dictionary inside Person class:
class Person{
public readonly Dictionary<string, string> props;
public Person()
{
props = new Dictionary<string, string>();
}
}
Now you add the Person as:
Person p = new Person();
p.props["Name"] = "John Doe";
p.props["Age"] = "113";
dictionary.Add("John Doe", p);
And get it back as:
Person p = dictionary[Name];
Now to allow several persons share the same name you declare the dictionary as Dictionary<string, List<Person>>
I have the following class:
public class Example
{
String name;
Dictionary<String, decimal> data;
public Example()
{
data = new Dictionary<String, decimal>();
}
}
Then, using LINQ, I need to retrieve all distinct String keys in the data field.
For example:
e1: 1 - [["a", 2m],["b",3m]]
e2: 2 - [["b", 2m],["c",3m]]
I'll need a list with: ["a","b","c"]
I hope I was clear enough.
Thanks.
PS: One thing I forgot to mention: I have a List of Examples.
Assuming you mean you have a collection of Examples (e1, e2...):
var keys = examples.SelectMany(example => example.data.Keys)
.Distinct();
var keys =
(from ex in examples
from key in ex.data.Keys
select key).Distinct();
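For completeness, here is a runnable sketch of the SelectMany approach. The (Name, Data) tuples are a stand-in for the Example class, since its fields are not publicly accessible as posted:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Tuples stand in for Example; Data plays the role of its dictionary field.
var examples = new List<(string Name, Dictionary<string, decimal> Data)>
{
    ("1", new Dictionary<string, decimal> { ["a"] = 2m, ["b"] = 3m }),
    ("2", new Dictionary<string, decimal> { ["b"] = 2m, ["c"] = 3m }),
};

// SelectMany flattens the per-example key collections into one sequence;
// Distinct removes the repeated "b". OrderBy is only there to make the
// output order predictable.
List<string> keys = examples.SelectMany(e => e.Data.Keys)
                            .Distinct()
                            .OrderBy(k => k)
                            .ToList();

Console.WriteLine(string.Join(",", keys)); // a,b,c
```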
At the moment I am using one list to store one part of my data, and it's working perfectly in this format:
Item
----------------
Joe Bloggs
George Forman
Peter Pan
Now, I would like to add another line to this list, for it to work like so:
NAME EMAIL
------------------------------------------------------
Joe Bloggs joe#bloggs.com
George Forman george#formangrills.co
Peter Pan me#neverland.com
I've tried using this code to create a list within a list, and this code is used in another method in a foreach loop:
// Where List is instantiated
List<List<string>> list2d = new List<List<string>>();
...
// Where DataGrid instance is given the list
dg.DataSource = list2d;
dg.DataBind();
...
// In another method, where all people add their names and emails, then are added
// to the two-dimensional list
foreach (People p in ppl.results) {
list.Add(results.name);
list.Add(results.email);
list2d.Add(list);
}
When I run this, I get this result:
Capacity Count
----------------
16 16
16 16
16 16
... ...
Where am I going wrong here? How can I get the output I desire with the code I am using right now?
Why don't you use a List<People> instead of a List<List<string>> ?
Highly recommend something more like this:
public class Person {
public string Name {get; set;}
public string Email {get; set;}
}
var people = new List<Person>();
Easier to read, easy to code.
If for some reason you don't want to define a Person class and use List<Person> as advised, you can use a tuple, such as (C# 7):
var people = new List<(string Name, string Email)>
{
("Joe Bloggs", "joe#bloggs.com"),
("George Forman", "george#formangrills.co"),
("Peter Pan", "me#neverland.com")
};
var georgeEmail = people[1].Email;
The Name and Email member names are optional, you can omit them and access them using Item1 and Item2 respectively.
There are defined tuples for up to 8 members.
For earlier versions of C#, you can still use a List<Tuple<string, string>> (or preferably ValueTuple using this NuGet package), but you won't benefit from customized member names.
Where does the variable results come from?
This block:
foreach (People p in ppl.results) {
list.Add(results.name);
list.Add(results.email);
list2d.Add(list);
}
Should probably read more like:
foreach (People p in ppl.results) {
var list = new List<string>();
list.Add(p.name);
list.Add(p.email);
list2d.Add(list);
}
It's old but thought I'd add my two cents...
Not sure if it will work but try using a KeyValuePair:
List<KeyValuePair<?, ?>> LinkList = new List<KeyValuePair<?, ?>>();
LinkList.Add(new KeyValuePair<?, ?>(Object, Object));
You'll end up with something like this:
LinkList[0] = <Object, Object>
LinkList[1] = <Object, Object>
LinkList[2] = <Object, Object>
and so on...
You should use List<Person> or a HashSet<Person>.
Please show more of your code.
If that last piece of code declares and initializes the list variable outside the loop you're basically reusing the same list object, thus adding everything into one list.
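A small sketch of the difference, assuming the list was indeed created once outside the loop (the question does not show that part, so this is a guess at the cause):

```csharp
using System;
using System.Collections.Generic;

var people = new[]
{
    (Name: "Joe Bloggs", Email: "joe#bloggs.com"),
    (Name: "Peter Pan", Email: "me#neverland.com"),
};

// Suspected bug: one list instance is created once and reused for every person.
var shared = new List<string>();
var buggy = new List<List<string>>();
foreach (var p in people)
{
    shared.Add(p.Name);
    shared.Add(p.Email);
    buggy.Add(shared); // adds another reference to the same list
}

// Fix: create a fresh inner list on every iteration.
var fixedRows = new List<List<string>>();
foreach (var p in people)
{
    fixedRows.Add(new List<string> { p.Name, p.Email });
}

Console.WriteLine(buggy[0].Count);                      // 4
Console.WriteLine(ReferenceEquals(buggy[0], buggy[1])); // True
Console.WriteLine(fixedRows[0].Count);                  // 2
```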
Also show where .Capacity and .Count come into play; how did you get those values?