I am trying to store movie ratings by users in a Dictionary. The file from which the data is acquired is of the form
UserID | MovieID | Rating | Timestamp
They are tab separated values
//Take the first 100 lines from the file and store each line as a array element of text
string[] text = System.IO.File.ReadLines(#File path).Take(100).ToArray();
//extDic[username] - [moviename][rating] is the structure
Dictionary<string,Dictionary<string,double>> extDic=new Dictionary<string,Dictionary<string,double>>();
Dictionary<string, double> movie=new Dictionary<string,double>();
foreach(string s in text)
{
int rating;
string username=s.Split('\t')[0];
string moviename=s.Split('\t')[1];
Int32.TryParse(s.Split('\t')[2], out rating);
movie.Add(moviename,rating);
if (extDic.ContainsKey(username))
{
//Error line
extDic[username].Add(moviename, rating);
}
else
{
extDic.Add(username, movie);
}
movie.Clear();
}
I get the following error on the error line: "An item with the same key has already been added". I understand what the error means and have tried to avoid it by checking with an if statement, but that doesn't solve it.
Also, I wonder whether movie.Clear() has any significance here?
There must be duplicates of that user and movie.
To fix the error, you can use this for your "error line":
extDic[username][moviename] = rating;
Though there may be other problems afoot.
The problem might be caused by the fact that you are using the variable movie as a value for all the entries in the extDic dictionary. movie is nothing but a reference, so when you are doing a movie.Clear() you are clearing all the values from extDic.
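To see the aliasing in isolation, here is a minimal sketch (the class and method names are made up for illustration). It replays two lines of the file for the same user, the way the loop in the question does. Because extDic["user1"] and movie are the same object, the movie.Add call has already inserted the key by the time the "error line" runs, so the second Add throws even though the file contains no duplicate.

```csharp
using System;
using System.Collections.Generic;

public static class SharedReferenceDemo
{
    // Returns true if the "error line" throws for a user's second rating.
    public static bool SecondAddThrows()
    {
        var extDic = new Dictionary<string, Dictionary<string, double>>();
        var movie = new Dictionary<string, double>();

        // First line of the file: user1 rates movie1.
        movie.Add("movie1", 4);
        extDic.Add("user1", movie);   // stores a reference to movie, not a copy
        movie.Clear();                // also empties extDic["user1"]

        // Second line of the file: user1 rates movie2.
        movie.Add("movie2", 5);       // extDic["user1"] now contains movie2 as well
        try
        {
            extDic["user1"].Add("movie2", 5);   // same object, key already present
            return false;
        }
        catch (ArgumentException)
        {
            return true;
        }
    }
}
```

Calling SharedReferenceDemo.SecondAddThrows() should return true, which is why the ContainsKey check in the question does not help.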
You could entirely remove the variable movie and replace it with a fresh instance of new Dictionary<string, double>()
string[] text = System.IO.File.ReadLines(#File path).Take(100).ToArray();
//extDic[username] - [moviename][rating] is the structure
Dictionary<string,Dictionary<string,double>> extDic=new Dictionary<string,Dictionary<string,double>>();
foreach(string s in text)
{
int rating;
//split only once
string[] splitted = s.Split('\t');
//UPDATE: skip the current line if it has fewer than the three fields we need
//(each line holds UserID, MovieID, Rating and Timestamp)
if(splitted.Length < 3){
continue;
}
string username=splitted[0];
string moviename=splitted[1];
//skip the current line if the rating is not a valid integer
if(!Int32.TryParse(splitted[2], out rating)){
continue;
}
//UPDATE: skip the current line if the user name or movie name is not valid
if(string.IsNullOrWhiteSpace(username) || string.IsNullOrWhiteSpace(moviename)){
continue;
}
if(!extDic.ContainsKey(username)){
//create a new Dictionary for every new user
extDic.Add(username, new Dictionary<string,double>());
}
//at this point we are sure to have all the keys set up
//let's assign the movie rating
extDic[username][moviename] = rating;
}
Your problem is that you are adding the same dictionary to all users, so when two users have rated the same movie you will see this exception.
int rating = 0;
var result = from line in text
let tokens = line.Split('\t')
let username = tokens[0]
let moviename = tokens[1]
where Int32.TryParse(tokens[2], out rating)
group new { moviename, rating } by username;
The above code will give you a structure that, from a tree perspective, is similar to your own. If you need the lookup capability you can simply call .ToDictionary:
var extDic = result.ToDictionary(x => x.Key, x => x.ToDictionary(y => y.moviename, y => (double)y.rating));
The reason I rewrote it in LINQ is that it's a lot harder to make those kinds of mistakes using something that's side-effect free like LINQ.
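For comparison, the same grouping can be sketched in method syntax. This version is an illustration rather than the answer's exact code. It assumes tab-separated lines with at least three fields, parses ratings with the invariant culture, and lets the last rating win when a user rated the same movie twice (a plain ToDictionary would throw on such a duplicate). RatingGrouping and Build are made-up names.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;

public static class RatingGrouping
{
    // Builds user -> (movie -> rating) from tab-separated lines.
    public static Dictionary<string, Dictionary<string, double>> Build(IEnumerable<string> lines)
    {
        return lines
            .Select(line => line.Split('\t'))
            // Keep only lines that have a parsable rating in the third field.
            .Where(t => t.Length >= 3 && IsRating(t[2]))
            .GroupBy(t => t[0])
            .ToDictionary(
                user => user.Key,
                user => user
                    // Group again by movie so duplicates collapse; the last one wins.
                    .GroupBy(t => t[1])
                    .ToDictionary(movie => movie.Key, movie => ParseRating(movie.Last()[2])));
    }

    private static bool IsRating(string s)
    {
        double ignored;
        return double.TryParse(s, NumberStyles.Float, CultureInfo.InvariantCulture, out ignored);
    }

    private static double ParseRating(string s)
    {
        return double.Parse(s, NumberStyles.Float, CultureInfo.InvariantCulture);
    }
}
```

With the question's data you would call something like RatingGrouping.Build(text) and then read a rating as extDic[username][moviename].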
I'm really new to programming, so take this with a grain of salt.
I've made two arrays that correspond to each other: one is a Name array and one is a Phone Number array. The idea is that spot [1] in NameArray corresponds to spot [1] in PhoneArray. In other words, I need to keep these 'pairings' intact.
I'm trying to make a function that deletes one of the spots in the array, and shifts everything down one, as to fill the space left empty by the deleted element.
namearray = namearray.Where(f => f != iNum).ToArray();
is what I've tried, with iNum being the number corresponding to the element marked for deletion in the array.
I've also tried converting it to a list, removing the item, then array-ing it again.
var namelist = namearray.ToList();
var phonelist = phonearray.ToList();
namelist.Remove(txtName.Text);
phonelist.Remove(txtPhone.Text);
namearray = namelist.ToArray();
phonearray = phonelist.ToArray();
lbName.Items.Clear();
lbPhone.Items.Clear();
lbName.Items.AddRange(namearray);
lbPhone.Items.AddRange(phonearray);
with txtName.Text and txtPhone.Text being the strings for deletion in the corresponding list boxes.
Can someone suggest a better way to do it / What I'm doing wrong / How to fix?
Thanks guys
-Zack
A better way would be to have an array of a class that contains a Name and Phone Number object:
public class PersonData
{
public string Name { get; set; }
public string Phone { get; set; }
}
public PersonData[] data;
That way, instead of keeping two arrays in sync, it's one array with all the appropriate data.
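A rough sketch of how deletion could then look, assuming you are free to swap the array for a List<PersonData> (that swap, and the PhoneBook helper name, are my assumptions and not part of the answer above). Removing an entry takes the name and the phone number out together, and the list closes the gap for you.

```csharp
using System;
using System.Collections.Generic;

public class PersonData
{
    public string Name { get; set; }
    public string Phone { get; set; }
}

public static class PhoneBook
{
    // Removes every entry with the given name and returns how many were removed.
    public static int RemoveByName(List<PersonData> data, string name)
    {
        return data.RemoveAll(p => p.Name == name);
    }
}
```

If you know the position instead of the name, data.RemoveAt(index) does the same for a single entry.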
Try a loop through both arrays, moving the values of each down an index each time.
Start the loop at the index of the value you want to delete. So you would find the index of the value you want with Array.IndexOf, store it as deleteIndex, and run the loop starting from that index.
When you hit the end of the array, set the last value to null or string.Empty (depending on what type the array holds).
A bit like this:
// Array.IndexOf returns -1 if the value is not found, in which case the loop is skipped.
int deleteIndex = Array.IndexOf(namearray, "TheStringYouWantToDelete");
for (int i = deleteIndex; i >= 0 && i < namearray.Length; i++)
{
if (i == namearray.Length - 1) // The "last" item in the array.
{
namearray[i] = string.Empty; // Or null, or your chosen "empty" value.
phonearray[i] = string.Empty; // Or null, or your chosen "empty" value.
}
else
{
namearray[i] = namearray[i+1];
phonearray[i] = phonearray[i+1];
}
}
This will work for deleting and moving values 'down' in index.
You could also rewrite the code for moving them the other way, as it would work similarly.
Reordering them completely? Different ball game...
Hope this helps.
If the namearray and phonearray contain strings and you know the index of the element to remove (iNum) then you need to use the overload of the Where extension that takes a second parameter, the index of the current element in the evaluation
namearray = namearray.Where((x, y) => y != iNum).ToArray();
However the suggestion to use classes for your task is the correct one. Namearray and Phonearray (and whatever else you need to handle in future) are to be thought as properties of a Person class and instead of using arrays use a List<Person>
public class Person
{
public string FirstName {get; set;}
public string LastName {get; set;}
public string Phone {get; set;}
}
List<Person> people = new List<Person>()
{
{new Person() {FirstName="Steve", LastName="OHara", Phone="123456"}},
{new Person() {FirstName="Mark", LastName="Noname", Phone="789012"}}
};
In this scenario, removing an item by its LastName could be written as
people = people.Where(x => x.LastName != "OHara").ToList();
(or, as before, using the index in the list of the element to remove)
people = people.Where((x, y) => y != iNum).ToList();
The other answers provide some better design suggestions, but if you're using ListBoxes and want to stick with arrays, you can do this to synchronize them:
int idx = lbName.Items.IndexOf(txtName.Text);
if (idx > -1)
{
lbName.Items.RemoveAt(idx);
lbPhone.Items.RemoveAt(idx);
}
namearray = lbName.Items.Cast<string>().ToArray<string>();
phonearray = lbPhone.Items.Cast<string>().ToArray<string>();
Use a dictionary instead.
Dictionary<string, string> phoneBook = new Dictionary<string, string>();
phoneBook["Foo"] = "1234567890";
phoneBook["Bar"] = "0987654321";
phoneBook.Remove("Bar");
Hello, I need some assistance with this issue. Hopefully I can describe it well.
I have a parser that goes through a document, finds session IDs, strips some tags from them and places them into a list.
while ((line = sr.ReadLine()) != null)
{
Match sID = sessionId.Match(line);
if (sID.Success)
{
String sIDString;
String sid = sID.ToString();
sIDString = Regex.Replace(sid, "<[^>]+>", string.Empty);
sessionIDList.Add(sIDString);
}
}
Then I go through the list and get the distinct session IDs.
List<String> distinctSessionID = sessionIDList.Distinct().ToList();
Now I need to go through the document again and add the lines that match each session ID to the list. This is the part that I am having an issue with.
Do I need to create a 2D list so I can add the matching log lines to the corresponding session IDs?
I was looking at this but cannot seem to figure out a way that I could copy over my Distinct list then add the Lines I need into the new array.
From what I can test it looks like this would add the value into the masterlist
List<List<string>> masterLists = new List<List<string>>();
Foreach (string value in distinctSessionID)
{
masterLists[0].Add(value);
}
How do I add Lines I need to the corresponding Masterlist. Say masterList[0].Add value is 1, how do i add the lines to 1?
masterList[0][0].add(myLInes);
Basically i want
Sessionid1
-------> related log line
-------> Related log line
SessionID2
-------> related log line
-------> related log line.
So on and so forth. I have the parsing all working; the only issue is getting the values into a second list of strings.
Thanks,
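What you can do is simply create a class with public properties and make a list of that custom class.
public class Session
{
// The session IDs in the question are strings, so the ID is stored as a string
public string SessionId { get; set; }
public List<string> SessionLog { get; set; }
// Create the list up front so that SessionLog.Add does not throw a NullReferenceException
public Session() { SessionLog = new List<string>(); }
}
List<Session> objList = new List<Session>();
var session1 = new Session();
session1.SessionId = "1";
session1.SessionLog.Add("description line1");
objList.Add(session1);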
Here is one way to do it:
public class MultiDimDictList : Dictionary<string, List<string>> { }
MultiDimDictList myDictList = new MultiDimDictList();
foreach (string value in distinctSessionID)
{
myDictList.Add(value, new List<string>());
for (int j = 0; j < lengthofLines; j++)
{
myDictList[value].Add(myLine);
}
}
You would need to replace lengthofLines with the number of lines you have for that session, and myLine with the line to add on each iteration.
See Charles Bretana's answer here
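Pulling the suggestions above together, here is a sketch that collects the log lines per session ID in a single pass with a Dictionary<string, List<string>>. The regex is a stand-in, since the question does not show the real pattern: it assumes the ID appears as <sessionId>...</sessionId>. SessionLogGrouping and Group are made-up names.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class SessionLogGrouping
{
    // Hypothetical pattern; replace it with the one your parser already uses.
    private static readonly Regex SessionId = new Regex("<sessionId>([^<]+)</sessionId>");

    // Maps each session ID to the log lines that mention it, in file order.
    public static Dictionary<string, List<string>> Group(IEnumerable<string> lines)
    {
        var result = new Dictionary<string, List<string>>();
        foreach (string line in lines)
        {
            Match m = SessionId.Match(line);
            if (!m.Success)
            {
                continue;   // line does not belong to any session
            }

            string id = m.Groups[1].Value;
            List<string> logLines;
            if (!result.TryGetValue(id, out logLines))
            {
                // First time this ID is seen: create its list.
                logLines = new List<string>();
                result.Add(id, logLines);
            }
            logLines.Add(line);
        }
        return result;
    }
}
```

The keys of the returned dictionary are already distinct, so the separate Distinct() pass and the second read of the document would no longer be needed.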
I'm new to C# and programming as a whole and I've been unable to come up with a solution to what I want to do. I want to be able to create a way to display several arrays containing elements from three external text files with values on each line (e.g. @"Files\Column1.txt", @"Files\Column2.txt", @"Files\Column3.txt"). They then need to be displayed like this in the command line:
https://www.dropbox.com/s/0telh1ils201wpy/Untitled.png?dl=0
I also need to be able to sort each column individually (e.g. column 3 from lowest to highest).
I've probably explained this horribly but I'm not sure how else to put it! Any possible solutions will be greatly appreciated!
One way to do it would be to store the corresponding items from each file in a Tuple, and then store those in a List. This way the items will all stay together, but you can sort your list on any of the Tuple fields. If you were doing anything more detailed with these items, I would suggest creating a simple class to store them, so the code would be more maintainable.
Something like:
public class Item
{
public DayOfWeek Day { get; set; }
public DateTime Date { get; set; }
public string Value { get; set; }
}
The example below could easily be converted to use such a class, but for now it uses a Tuple<string, string, string>. As an intermediate step, you could easily convert the items as you create the Tuple to get more strongly-typed versions, for example, you could have Tuple<DayOfWeek, DateTime, string>.
Here's the sample code for reading your file items into a list, and how to sort on each item type:
public static void Main()
{
// For testing sake, I created some dummy files
var file1 = @"D:\Public\Temp\File1.txt";
var file2 = @"D:\Public\Temp\File2.txt";
var file3 = @"D:\Public\Temp\File3.txt";
// Validation that files exist and have same number
// of items is intentionally left out for the example
// Read the contents of each file into a separate variable
var days = File.ReadAllLines(file1);
var dates = File.ReadAllLines(file2);
var values = File.ReadAllLines(file3);
var itemCount = days.Length;
// The list of items read from each file
var fileItems = new List<Tuple<string, string, string>>();
// Add a new item for each line in each file
for (int i = 0; i < itemCount; i++)
{
fileItems.Add(new Tuple<string, string, string>(
days[i], dates[i], values[i]));
}
// Display the items in console window
fileItems.ForEach(item =>
Console.WriteLine("{0} {1} = {2}",
item.Item1, item.Item2, item.Item3));
// Example for how to order the items:
// By days
fileItems = fileItems.OrderBy(item => item.Item1).ToList();
// By dates
fileItems = fileItems.OrderBy(item => item.Item2).ToList();
// By values
fileItems = fileItems.OrderBy(item => item.Item3).ToList();
// Order by descending
fileItems = fileItems.OrderByDescending(item => item.Item1).ToList();
// Show the values based on the last ordering
fileItems.ForEach(item =>
Console.WriteLine("{0} {1} = {2}",
item.Item1, item.Item2, item.Item3));
}
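The class-based variant mentioned earlier might look roughly like this. It is only a sketch: it assumes the first file holds English day names and the second holds dates written as yyyy-MM-dd, neither of which is confirmed by the question, and ItemLoader is a made-up name. The benefit is that sorting by Date or Day then uses real date and weekday ordering instead of string ordering.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

public class Item
{
    public DayOfWeek Day { get; set; }
    public DateTime Date { get; set; }
    public string Value { get; set; }
}

public static class ItemLoader
{
    // Combines the i-th line of each file into one strongly typed Item.
    public static List<Item> Load(string[] days, string[] dates, string[] values)
    {
        // Use the shortest file so a missing line cannot cause an out-of-range index.
        int count = Math.Min(days.Length, Math.Min(dates.Length, values.Length));
        var items = new List<Item>();
        for (int i = 0; i < count; i++)
        {
            items.Add(new Item
            {
                // Assumed format: "Monday", "Tuesday", ... (case-insensitive)
                Day = (DayOfWeek)Enum.Parse(typeof(DayOfWeek), days[i], true),
                // Assumed format: 2020-01-31
                Date = DateTime.ParseExact(dates[i], "yyyy-MM-dd", CultureInfo.InvariantCulture),
                Value = values[i]
            });
        }
        return items;
    }
}
```

Sorting works the same way as with the tuples, for example items.OrderBy(item => item.Date).ToList() (with using System.Linq).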
So I'm using C# and Visual Studio. I am reading a file of students and their information. The number of students is variable, but I want to grab their information. At the moment I just want to segment the student's information based off of the string "Student ID" because each student's section starts with Student ID. I'm using ReadAllText and setting it equal to a string and then feeding that string to my function splittingStrings. The file will look like this:
student ID 1
//bunch of info
student ID 2
//bunch of info
student ID 3
//bunch of info
.
.
.
I'm wanting to split each segment into a list since the number of students will be unknown, and the information for each student will vary. So I looked into both Regular string split and Regex string splitting. For regular strings I tried this.
public static List<string> StartParse = new List<string>();
public static void splittingStrings(string v)
{
string[] DiagDelimiters = new string[] {"Student ID "};
StartParse.Add(v.Split(DiagDelimiters, StringSplitOptions.None);
}
And this is what I tried with Regex:
StartParse.Add(Regex.Split("Student ID ");
I haven't used Lists before, but from what I've read they are dynamic and easy to use. My only trouble I'm getting is that all examples I see with split are in combination with an array so syntactically I'm not sure how to do a split on a string and insert it into a list. For output my goal is to have the student segments divided so that if I need to I can call a particular segment later.
Let me verify that I'm after that batch of information not the ID's alone. A lot of the questions seem to be focused on that so I felt I needed to verify that.
To those suggesting other storage bodies:
example of what list will hold:
position 0 will hold [<id> //bunch of info]
position 1 will hold [<anotherID> //bunch of info]
.
.
.
So I'm just using the List to do multiple operations on for information that I need. The information will be FAR more manageable if I can segment them into the list as shown above. I'm aware of dictionaries, but I have to store this information either in sql tables or inside text files depending on the contents of the segments. An example would be if one segment was really funky then I would send an error report that one student's information is bad. Otherwise insert neccessary information into sql table. But I'm having to work with multiple things from the segments so I felt the List was the best way to go since I'll have to also go back and forth in the segment to cross check bits of information with earlier things in that segment I found.
There is no need to use Regex here and I would recommend against it. Simply splitting on white space will do the trick. Let's pretend you have a list which contains each of those lines (student ID 1, student ID 2, etc.); you can get a list of the IDs very simply like so:
List<string> ids = students.Select(x => x.Split(' ')[2]).ToList();
The statement above essentially says: for each string in students, split the string and return the third token (index 2, because it's 0-indexed). I then call ToList because Select by default returns an IEnumerable<T>, but I wouldn't worry about those details just yet. If you don't have a list with each of the lines you showed, the idea stays much the same, only you would add the items to your ids list one by one as you split the string. For any given string of the form student ID x, I would get x on its own with myString.Split(' ')[2]; that is the basis of the expression I pass into Select.
Based on the OP's comment here is a way to get all of the data without the Student Id part of each batch.
// The delimiter is case-sensitive, so it has to match the file ("student ID ")
string[] batches = input.Split(new string[] { "student ID " }, StringSplitOptions.RemoveEmptyEntries);
If you really need a list then you can just call ToList() and change type of batches to List<string> but that would probably just be a waste of CPU cycles.
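If you do want the List<string> the question asks for, a small sketch of the split could look like this. StudentSplitter and SplitBatches are made-up names, and the delimiter text is assumed to match the file exactly. Each resulting entry starts with the ID, followed by that student's information.

```csharp
using System;
using System.Collections.Generic;

public static class StudentSplitter
{
    // Splits the whole file text into one entry per student.
    // Each entry begins with the ID that followed "student ID ".
    public static List<string> SplitBatches(string input)
    {
        string[] delimiters = { "student ID " };
        string[] batches = input.Split(delimiters, StringSplitOptions.RemoveEmptyEntries);
        return new List<string>(batches);
    }
}
```

In the question's code this would replace the failing Add call: StartParse = StudentSplitter.SplitBatches(v); or alternatively StartParse.AddRange(...) with the array returned by Split.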
Here's a rough sketch of what I'd do:
static List<int> ids = new List<int>();
static void ParseStudentId(string str) {
var spl = str.Split(' ');
ids.Add(int.Parse(spl[spl.Length - 1])); // this will fetch 1 from "Student Id 1"
}
static void Main() {
ParseStudentId("Student Id 1");
ParseStudentId("Student Id 2");
ParseStudentId("Student Id 3");
foreach (int id in ids)
Console.WriteLine(id); // will result in:
// 1
// 2
// 3
}
Forgive me, I'm a Java programmer, so the casing may not be fully idiomatic. :)
Try this one:
StartParse = new List<string>(Regex.Split(v, @"(?<!^)(?=student ID \d+)"));
The pattern (?<!^)(?=student ID \d+) splits the string at every position where student ID followed by a number begins, except at the very beginning of the string.
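A small sketch of what that split produces on sample input. The sample text and the names RegexSplitDemo and SplitStudents are made up for illustration. Because the pattern is a zero-width lookahead, nothing is consumed, so each segment keeps its "student ID n" header.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class RegexSplitDemo
{
    // Splits before every "student ID <number>" except the one at the very start.
    public static List<string> SplitStudents(string text)
    {
        return new List<string>(Regex.Split(text, @"(?<!^)(?=student ID \d+)"));
    }
}
```

For the text "student ID 1\ninfo a\nstudent ID 2\ninfo b" this should give two entries, the second one starting with "student ID 2".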
Check this code
public List<string> GetStudents(string filename)
{
List<string> students = new List<string>();
StringBuilder builder = new StringBuilder();
using (StreamReader reader = new StreamReader(filename)){
string line = "";
while (!reader.EndOfStream)
{
line = reader.ReadLine();
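if (line.StartsWith("student ID") && builder.Length > 0)
{
// A new student starts here, so store the previous one first
students.Add(builder.ToString());
builder.Clear();
}
// AppendLine keeps the line breaks, so the lines of a segment do not run together
builder.AppendLine(line);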
}
if (builder.Length > 0)
students.Add(builder.ToString());
}
return students;
}
I have a flat file with an unfortunately dynamic column structure. There is a value that is in a hierarchy of values, and each tier in the hierarchy gets its own column. For example, my flat file might resemble this:
StatisticID|FileId|Tier0ObjectId|Tier1ObjectId|Tier2ObjectId|Tier3ObjectId|Status
1234|7890|abcd|efgh|ijkl|mnop|Pending
...
The same feed the next day may resemble this:
StatisticID|FileId|Tier0ObjectId|Tier1ObjectId|Tier2ObjectId|Status
1234|7890|abcd|efgh|ijkl|Complete
...
The thing is, I don't care much about all the tiers; I only care about the id of the last (bottom) tier, and all the other row data that is not a part of the tier columns. I need normalize the feed to something resembling this to inject into a relational database:
StatisticID|FileId|ObjectId|Status
1234|7890|ijkl|Complete
...
What would be an efficient, easy-to-read mechanism for determining the last tier object id, and organizing the data as described? Every attempt I've made feels kludgy to me.
Some things I've done:
I have tried to examine the column names for regular expression patterns, identify the columns that are tiered, order them by name descending, and select the first record... but I lose the ordinal column number this way, so that didn't look good.
I have placed the columns I want into an IDictionary<string, int> object to reference, but again reliably collecting the ordinal of the dynamic columns is an issue, and it seems this would be rather non-performant.
I ran into a similar problem a few years ago. I used a Dictionary to map the columns; it was not pretty, but it worked.
First make a Dictionary:
private Dictionary<int, int> GetColumnDictionary(string headerLine)
{
Dictionary<int, int> columnDictionary = new Dictionary<int, int>();
List<string> columnNames = headerLine.Split('|').ToList();
string maxTierObjectColumnName = GetMaxTierObjectColumnName(columnNames);
for (int index = 0; index < columnNames.Count; index++)
{
if (columnNames[index] == "StatisticID")
{
columnDictionary.Add(0, index);
}
if (columnNames[index] == "FileId")
{
columnDictionary.Add(1, index);
}
if (columnNames[index] == maxTierObjectColumnName)
{
columnDictionary.Add(2, index);
}
if (columnNames[index] == "Status")
{
columnDictionary.Add(3, index);
}
}
return columnDictionary;
}
private string GetMaxTierObjectColumnName(List<string> columnNames)
{
// Edit this function if the tier number can be greater than 9 (string ordering puts Tier10 before Tier2)
var maxTierObjectColumnName = columnNames.Where(c => c.Contains("Tier") && c.Contains("Object")).OrderBy(c => c).Last();
return maxTierObjectColumnName;
}
And after that it's simply running thru the file:
private List<DataObject> ParseFile(string fileName)
{
// The using block closes the file even if parsing throws
using (StreamReader streamReader = new StreamReader(fileName))
{
string headerLine = streamReader.ReadLine();
Dictionary<int, int> columnDictionary = this.GetColumnDictionary(headerLine);
string line;
List<DataObject> dataObjects = new List<DataObject>();
while ((line = streamReader.ReadLine()) != null)
{
var lineValues = line.Split('|');
dataObjects.Add(
new DataObject()
{
StatisticId = lineValues[columnDictionary[0]],
FileId = lineValues[columnDictionary[1]],
ObjectId = lineValues[columnDictionary[2]],
Status = lineValues[columnDictionary[3]]
}
);
}
return dataObjects;
}
}
I hope this helps (even a little bit).
Personally I would not try to reformat your file. I think the easiest approach would be to parse each row from the front and the back. For example:
itemArray = getMyItems();
statisticId = itemArray[0];
fileId = itemArray[1];
//and so on for the rest of your pre-tier columns
//Then get the second to last column, which will be the last tier
//(the last column, at Length - 1, is Status)
lastTierId = itemArray[itemArray.Length - 2];
Since you know the last tier will always be second from the end you can just start at the end and work your way forwards. This seems like it would be much easier than trying to reformat the datafile.
If you really want to create a new file, you could use this approach to get the data you want to write out.
I don't know C# syntax, but something along these lines:
split line in parts with | as separator
get parts [0], [1], [length - 2] and [length - 1]
pass the parts to the database handling code
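The steps above can be sketched in C# as follows. This assumes, as in the sample feeds, that the first two columns are StatisticID and FileId, that Status is always last, and that the bottom tier sits directly before it. FeedNormalizer and NormalizeLine are made-up names.

```csharp
using System;

public static class FeedNormalizer
{
    // Reduces a row with any number of tier columns to
    // StatisticID|FileId|ObjectId|Status.
    public static string NormalizeLine(string line)
    {
        string[] parts = line.Split('|');
        if (parts.Length < 4)
        {
            throw new FormatException("Expected at least four columns: " + line);
        }

        // Front: StatisticID and FileId. Back: last tier and Status.
        return string.Join("|",
            parts[0],
            parts[1],
            parts[parts.Length - 2],
            parts[parts.Length - 1]);
    }
}
```

The same call also maps the header row, except that the tier column keeps its original name (for example Tier2ObjectId) instead of ObjectId, so you may want to write the header yourself.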