Sort text file data into an array - c#

I'm working on a homework problem for my computer science class. A city's census data is in a text file holding records for its citizens. Each line has four fields (age, gender, marital status, and district) of different data types, separated by commas. For example: 22, F, M, 1.
How should I approach this? My thoughts are that I should use two 1D arrays, one for age and one for district. I need to be able to later count how many people are in each district, and how many people are in different age groups for the whole city.
How do I read each line and get the info I want into each array?
Edit:
This is what I've managed to do so far. I'm trying to separate my data from fields into four different arrays. This is where I'm stuck.
FileStream fStream = new FileStream("test.txt", FileMode.Open, FileAccess.Read);
StreamReader inFile = new StreamReader(fStream);
string inputRecord = "";
string[] fields;
int[] ageData = new int[1000];
string[] genderData = new string[1000];
string[] maritalData = new string[1000];
int[] districtData = new int[1000];
inputRecord = inFile.ReadLine();
while (inputRecord != null)
{
fields = inputRecord.Split(',');
int i = 0;
ageData[i] = int.Parse(fields[0]);
genderData[i] = fields[1];
maritalData[i] = fields[2];
districtData[i] = int.Parse(fields[3]);
i++;
inputRecord = inFile.ReadLine();
}
Edit 2:
First question, I've decided to use the below code to find out how many citizens are in each district of the census data.
for (int x = 1; x <= 22; x++)
for (int y = 0; y < districtData.Length; y++)
if (districtData[y] == x)
countDist[x]++;
for (int x = 1; x <= 22; x++)
Console.WriteLine("District " + x + " has " + countDist[x] + " citizens");
In my Console.WriteLine output, the columns are thrown off when x reaches two digits. How could I line up my columns better?
Second question: I am not quite sure how to separate the values I have in ageData into age groups using an if statement.

It sounds like the four fields have something in common: together they represent a person surveyed by the census. That's a good time to use a class along the lines of
public class Person
{
public int Age { get; set; }
public string Gender { get; set; }
public string MaritalStatus { get; set; }
public int District { get; set; }
}
Then, just read in all of the lines from the text file (if it's small, it's fine to use File.ReadAllLines()), and then create an instance of Person for each line in the file.
You can create a
List<Person> people;
to hold the Person instances that you parse from the text file.
Since the lines appear to be separated by commas, have a look at String.Split().
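The steps above (split each line, build a Person, collect the results in a list) can be sketched roughly as follows. This is only an illustration under assumptions: the CensusParser class and its Parse method are hypothetical names, the Person class mirrors the one suggested above, and lines that do not have four fields or whose numbers do not parse are simply skipped.

```csharp
using System;
using System.Collections.Generic;

public class Person
{
    public int Age { get; set; }
    public string Gender { get; set; }
    public string MaritalStatus { get; set; }
    public int District { get; set; }
}

// Hypothetical helper, not part of the original answer.
public static class CensusParser
{
    // Turns lines such as "22, F, M, 1" into Person objects.
    public static List<Person> Parse(IEnumerable<string> lines)
    {
        var people = new List<Person>();
        foreach (string line in lines)
        {
            string[] fields = line.Split(',');
            if (fields.Length != 4)
                continue; // skip blank or malformed lines

            int age, district;
            // TryParse reports failure instead of throwing on bad numbers
            if (!int.TryParse(fields[0].Trim(), out age) ||
                !int.TryParse(fields[3].Trim(), out district))
                continue;

            people.Add(new Person
            {
                Age = age,
                Gender = fields[1].Trim(),
                MaritalStatus = fields[2].Trim(),
                District = district
            });
        }
        return people;
    }
}
```

With a small file, this could be called as CensusParser.Parse(File.ReadAllLines("test.txt")), which needs using System.IO.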
UPDATE
The attempt in your edit is pretty close. You keep creating a new i and initializing it to 0. Instead, initialize it outside your loop:
int i = 0;
while (inputRecord != null)
{
fields = inputRecord.Split(',');
Also, you may want to trim excess spaces off your input. If the fields are separated with ", " rather than just ",", you will have excess spaces in your fields.
genderData[i] = fields[1].Trim();
maritalData[i] = fields[2].Trim();

How about this?
List<string[]> o = File.ReadAllLines(@"C:\TestCases\test.txt").Select(x => x.Split(',')).OrderBy(y => int.Parse(y[0])).ToList();
Each person is a string array in the list.
Each property is an index in the array, e.g. age is first.
The above code reads all lines, splits each on commas, orders them by age (parsed as a number, so that 9 sorts before 10), and adds them to the list.

public static class PersonsManager
{
public static PersonStatistics LoadFromFile(string filePath)
{
var statistics = new PersonStatistics();
using (var reader = new StreamReader(filePath))
{
var separators = new[] { ',' };
while (!reader.EndOfStream)
{
var line = reader.ReadLine();
if (string.IsNullOrWhiteSpace(line))
continue; //--malformed line
var lParts = line.Split(separators, StringSplitOptions.RemoveEmptyEntries);
if (lParts.Length != 4)
continue; //--malformed line
var person = new Person
{
Age = int.Parse(lParts[0].Trim()),
Gender = lParts[1].Trim(),
MaritalStatus = lParts[2].Trim(),
District = int.Parse(lParts[3].Trim())
};
statistics.Persons.Add(person);
}
}
return statistics;
}
}
public class PersonStatistics
{
public List<Person> Persons { get; private set; }
public PersonStatistics()
{
Persons = new List<Person>();
}
public IEnumerable<Person> GetAllByGender(string gender)
{
return GetByPredicate(p => string.Equals(p.Gender, gender, StringComparison.InvariantCultureIgnoreCase));
}
//--NOTE: add defined queries as many as you need
public IEnumerable<Person> GetByPredicate(Predicate<Person> predicate)
{
return Persons.Where(p => predicate(p)).ToArray();
}
}
public class Person
{
public int Age { get; set; }
public string Gender { get; set; }
public string MaritalStatus { get; set; }
public int District { get; set; }
}
Usage:
var statistics = PersonsManager.LoadFromFile(@"d:\persons.txt");
var females = statistics.GetAllByGender("F");
foreach (var p in females)
{
Console.WriteLine("{0} {1} {2} {3}", p.Age, p.Gender, p.MaritalStatus, p.District);
}
I hope it helps.
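The two follow-up questions from the original post (lining up the output columns and splitting ages into groups) could be approached along these lines. This is a sketch under assumptions: CensusReport and its methods are hypothetical names, and the age-group boundaries are made up for illustration, since the assignment's real boundaries are not given. The alignment uses the width component of composite format strings, for example {0,2}.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, not part of the original answers.
public static class CensusReport
{
    // Counts how many entries fall in each district number.
    public static SortedDictionary<int, int> CountByDistrict(IEnumerable<int> districts)
    {
        var counts = new SortedDictionary<int, int>();
        foreach (int d in districts)
        {
            int current;
            counts.TryGetValue(d, out current); // current stays 0 when the key is new
            counts[d] = current + 1;
        }
        return counts;
    }

    // Maps an age to a group label. The boundaries here are assumptions.
    public static string AgeGroup(int age)
    {
        if (age < 18) return "Under 18";
        if (age <= 30) return "18-30";
        if (age <= 45) return "31-45";
        if (age <= 64) return "46-64";
        return "65 and over";
    }

    // {0,2} right-aligns the district in a 2-character column,
    // {1,5} right-aligns the count in a 5-character column.
    public static string FormatRow(int district, int count)
    {
        return string.Format("District {0,2} has {1,5} citizens", district, count);
    }
}
```

CountByDistrict accepts any sequence of district numbers, so it works with an int[] as well as with statistics.Persons.Select(p => p.District). Each entry of the result can then be passed to FormatRow and written with Console.WriteLine.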

Related

Adding all class object array values to a text file

I want to write all the property values assigned when instantiating the Course class to a text file.
CourseManager courseMng = new CourseManager();
Course prog1 = new Course();
Console.WriteLine(prog1.GetInfo());
Console.WriteLine();
prog1.CourseCode = "COMP100";
prog1.Name = "Programming 1";
prog1.Description = "Programming1 description";
prog1.NoOfEvaluations = 3;
Console.WriteLine(prog1.GetInfo());
Course prog2 = new Course("COMP123", "Programming2") { Description = "prog 2 desc", NoOfEvaluations = 2 };
Console.WriteLine(prog2.GetInfo());
courseMng.AddCourse(prog1);
courseMng.AddCourse(prog2);
That was my Main; this is my CourseManager class:
Course[] courses;
int numberOfCourses;
public int NumberOfCourses
{
get { return numberOfCourses; }
set { numberOfCourses = value; }
}
public Course[] Courses
{
get
{
return courses;
}
set
{
courses = value;
}
}
public CourseManager()
{
Courses = new Course[100];
}
public void AddCourse(Course aCourse)
{
Courses[numberOfCourses] = aCourse;
numberOfCourses++;
aCourse.Manager = this;
}
public void ExportCourses(string fileName, char Delim)
{
FileStream stream = null;
StreamWriter writer;
try
{
stream = new FileStream(fileName, FileMode.Create, FileAccess.Write);
writer = new StreamWriter(stream);
Course aCourse = new Course();
for(int i =0; i<numberOfCourses;i++)
{
//writer.WriteLine(courses.ToString());
writer.WriteLine("{0}{1}{2}{1}{3}{1}{4}", aCourse.CourseCode, Delim, aCourse.Name, aCourse.Description, aCourse.NoOfEvaluations);
}
writer.Close();
}
finally
{
stream.Close();
}
}
So the problem is that when I call WriteLine, it just prints empty values. I want CourseCode, Name, Description and NoOfEvaluations to be written to the .txt file.
If you need any other code, please let me know.
Thanks in advance.
Your issue is where you are getting your data from in the lines
Course aCourse = new Course();
for(int i =0; i<numberOfCourses;i++)
{
//writer.WriteLine(courses.ToString());
writer.WriteLine("{0}{1}{2}{1}{3}{1}{4}", aCourse.CourseCode, Delim, aCourse.Name, aCourse.Description, aCourse.NoOfEvaluations);
}
The issue is that you are writing out the property values of aCourse, which is never set to anything other than a new Course() (as per the first line).
Try changing it to this
for(int i =0; i<numberOfCourses;i++)
{
Course aCourse = Courses[i];
//writer.WriteLine(courses.ToString());
writer.WriteLine("{0}{1}{2}{1}{3}{1}{4}", aCourse.CourseCode, Delim, aCourse.Name, aCourse.Description, aCourse.NoOfEvaluations);
}
In the second example we are setting the value of aCourse to be the next course in the iterator.
As per your comment
i have a file with different courses and i want to add it to courses
array. I am doing this
Course C = new Course();
C.CourseCode = reader.ReadLine();
C.Name = reader.ReadLine();
C.Description = reader.ReadLine();
C.NoOfEvaluations = Int32.Parse(reader.ReadLine());
courses[index++] = C;
But i am getting invalid input error at NoofEvaluations any solution?
The issue here is that you exported the data on a single line via the writer.WriteLine method, delimiting the values with a delimiter such as |, or more exactly with the char Delim parameter. In this case you need to read a single line and split it into an array by the same character. Something like:
var line = reader.ReadLine();
var array = line.Split('|'); // or char Delim if you prefer
var c = new Course();
c.CourseCode = array[0];
c.Name = array[1];
c.Description = array[2];
c.NoOfEvaluations = int.Parse(array[3]);
courses[index++] = c;
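To show the write and read sides agreeing on one format, here is a small round-trip sketch. The names CourseRecord and CourseLine are hypothetical stand-ins (the real Course class is not shown in full), and using int.TryParse so that a malformed line returns false instead of throwing is an assumption about the desired behavior.

```csharp
using System;

// Hypothetical stand-in for the Course class in the question.
public class CourseRecord
{
    public string CourseCode { get; set; }
    public string Name { get; set; }
    public string Description { get; set; }
    public int NoOfEvaluations { get; set; }
}

public static class CourseLine
{
    // Writes one course as a single delimited line, in the same field order as the export.
    public static string Format(CourseRecord c, char delim)
    {
        return string.Join(delim.ToString(),
            c.CourseCode, c.Name, c.Description, c.NoOfEvaluations.ToString());
    }

    // Reads a line back; returns false instead of throwing when the line is malformed.
    public static bool TryParse(string line, char delim, out CourseRecord course)
    {
        course = null;
        if (string.IsNullOrWhiteSpace(line))
            return false;

        string[] parts = line.Split(delim);
        int evaluations;
        if (parts.Length != 4 || !int.TryParse(parts[3], out evaluations))
            return false;

        course = new CourseRecord
        {
            CourseCode = parts[0],
            Name = parts[1],
            Description = parts[2],
            NoOfEvaluations = evaluations
        };
        return true;
    }
}
```

Note that this simple format breaks if a field itself contains the delimiter character, so the delimiter should be one that never appears in the data.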
Instead of fixing your problem, I'm going to suggest different code to clean things up. I would simplify as follows:
public class Course
{
public string CourseCode { get; set; }
public string Name { get; set; }
public string Description { get; set; }
public int NoOfEvaluations { get; set; }
public IEnumerable<Course> Courses { get; set; }
public Course()
{
Courses = new List<Course>();
}
//this will format the text to your specification
public string ToDelimitedString(char Delimiter)
{
return String.Format("{0}{4}{1}{4}{2}{4}{3}{5}"
, this.CourseCode
, this.Name
, this.Description
, this.NoOfEvaluations.ToString()
, Delimiter
, Environment.NewLine);
}
public void Export(string FileName, char Delimiter)
{
//this will be more efficient than writing one line at a time
var coursesExport = new StringBuilder();
foreach(var course in this.Courses)
{
coursesExport.Append(course.ToDelimitedString(Delimiter));
}
File.WriteAllText(FileName, coursesExport.ToString());
}
}

c# Getting the Index from an array inside a JSon Object

I'm trying to find out the index of a string in an array within a JObject object. For example, you could give frc610 and it would return 0.
// Get rankings JSON file from thebluealliance.com
string TBArankings = @"https://www.thebluealliance.com/api/v2/district/ont/2017/rankings?X-TBA-App-Id=frc2706:ONT-ranking-system:v01";
var rankings = new WebClient().DownloadString(TBArankings);
string usableTeamNumber = "frc" + teamNumberString;
string team_key = "";
int rank = 0;
dynamic arr = JsonConvert.DeserializeObject(rankings);
foreach (dynamic obj in arr)
{
team_key = obj.team_key;
rank = obj.rank;
}
int index = Array.IndexOf(arr, (string)usableTeamNumber); // <-- This is where the exception is thrown.
Console.WriteLine(index);
// Wait 20 seconds
System.Threading.Thread.Sleep(20000);
Here's the json file I'm using.
I've tried multiple different solutions, none of which worked.
You could just keep the index in a variable.
string usableTeamNumber = $"frc{teamNumberString}";
string team_key = "";
int rank = 0;
int index = 0;
int count = 0;
dynamic arr = JsonConvert.DeserializeObject(rankings);
foreach (dynamic obj in arr)
{
team_key = obj.team_key;
rank = obj.rank;
if (usableTeamNumber.Equals(team_key)) {
index = count;
}
count++;
}
Console.WriteLine(index);
Create a class that mimics your data structure, like this (it only has 3 of the root fields):
public class EventPoints
{
public int point_total { get; set; }
public int rank { get; set; }
public string team_key { get; set; }
}
Then you can deserialize the JSON into a list of those objects and use LINQ or other tools to query that list:
string teamNumberString = "frc2056";
string TBArankings = @"https://www.thebluealliance.com/api/v2/district/ont/2017/rankings?X-TBA-App-Id=frc2706:ONT-ranking-system:v01";
var rankings = new WebClient().DownloadString(TBArankings);
List<EventPoints> eps = JsonConvert.DeserializeObject<List<EventPoints>>(rankings);
EventPoints sp = eps.Where(x => x.team_key.Equals(teamNumberString)).FirstOrDefault();
Console.WriteLine(eps.IndexOf(sp));
Console.ReadLine();
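Once the rankings are in a typed list, List<T>.FindIndex returns the position directly and gives -1 when nothing matches. The sketch below leaves out the download and the JSON step so that it stands alone. RankingEntry and RankingSearch are hypothetical names, and the case-insensitive comparison is an assumption.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the deserialized ranking objects.
public class RankingEntry
{
    public int rank { get; set; }
    public string team_key { get; set; }
}

public static class RankingSearch
{
    // Returns the zero-based position of the team in the list, or -1 when it is absent.
    public static int IndexOfTeam(List<RankingEntry> rankings, string teamKey)
    {
        return rankings.FindIndex(e =>
            string.Equals(e.team_key, teamKey, StringComparison.OrdinalIgnoreCase));
    }
}
```

Checking the result for -1 before using it distinguishes a missing team from the team at position 0.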

Excluding items from a list that exist in another list

I have a list for example List<string> ListProviderKeys that has some values in it.
I also have a second list from a class below, for example List<ChangesSummary> SecondList;
public class ChangesSummary
{
public string TableName { get; set; }
public string ProviderKey { get; set; }
public string ProviderAdrsKey { get; set; }
public string ProviderSpecialtyKey { get; set; }
public string FieldName{ get; set; }
}
Imagine the values the first list holds are the same kind of values we put in the ProviderKey field in the second list.
Now what I want is to trim down the second list so that it only has entries whose ProviderKey is NOT already in the first list.
How can I do that? I know the Except operator, but I'm not sure how to apply it in this situation.
The best I can think of is :
A) Create dictionary and use its fast lookups
B) Use the LINQ .Where() method with .ContainsKey() on this dictionary, which is implemented as a hash table and performs quick lookups.
This reduces each lookup to close to O(1), rather than O(N) or worse per lookup when .Where() is combined with .Any() or .Contains() on a list, which amounts to nested loops.
From MSDN page :
The Dictionary generic class provides a mapping from a set of keys to
a set of values. Each addition to the dictionary consists of a value
and its associated key. Retrieving a value by using its key is very
fast, close to O(1), because the Dictionary class is implemented as a
hash table.
So what we can do is :
Dictionary<string, string> dict = ListProviderKeys.ToDictionary(s => s);
var newList = SecondList.Where(e => !dict.ContainsKey(e.ProviderKey)).ToList();
Here is a very simple, short, but complete example illustrating it and also testing its performance :
class Person
{
public int Id { get; set; }
}
class Program
{
static void Main(string[] args)
{
List<int> ints = new List<int>();
List<Person> People = new List<Person>(1000);
for (int i = 0; i < 7500; i++)
{
ints.Add(i);
ints.Add(15000 - i - 1);
}
for (int i = 0; i < 45000; i++)
People.Add(new Person() { Id = i });
Stopwatch s = new Stopwatch();
s.Start();
// code A (feel free to uncomment it)
//Dictionary<int, int> dict = ints.ToDictionary(p => p);
//List<Person> newList = People.Where(p => !dict.ContainsKey(p.Id)).ToList();
// code B
List<Person> newList = People.Where(p => !ints.Contains(p.Id)).ToList();
s.Stop();
Console.WriteLine(s.ElapsedMilliseconds);
Console.WriteLine("Number of elements " + newList.Count);
Console.ReadKey();
}
}
In release mode the results are:
Both code A and code B output 30,000 elements, but
code B took more than 2000 ms and code A only 5 ms.
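A HashSet<string> offers the same hash-based lookup as the dictionary in code A, without needing a value for each key, and it quietly ignores duplicate keys (ToDictionary throws an ArgumentException on duplicates). A minimal sketch, in which ProviderFilter and ExcludeKnown are hypothetical names and ChangesSummary is cut down to the fields the sketch uses:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Trimmed-down copy of the class from the question.
public class ChangesSummary
{
    public string ProviderKey { get; set; }
    public string FieldName { get; set; }
}

// Hypothetical helper, not part of the original answer.
public static class ProviderFilter
{
    // Keeps only the summaries whose ProviderKey is not in providerKeys.
    public static List<ChangesSummary> ExcludeKnown(
        IEnumerable<ChangesSummary> summaries, IEnumerable<string> providerKeys)
    {
        // Building the set is O(N); each Contains call is then close to O(1).
        var known = new HashSet<string>(providerKeys);
        return summaries.Where(s => !known.Contains(s.ProviderKey)).ToList();
    }
}
```

For the lists in the question, this would be called as ProviderFilter.ExcludeKnown(SecondList, ListProviderKeys).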
public class Programm
{
public static void Main()
{
List<ChangesSummary> summaries = new List<ChangesSummary>();
summaries.Add(new ChangesSummary()
{
FieldName = "1",
ProviderKey = "Test1",
});
summaries.Add(new ChangesSummary()
{
FieldName = "2",
ProviderKey = "Test2",
});
summaries.Add(new ChangesSummary()
{
FieldName = "3",
ProviderKey = "Test3",
});
List<string> listProviderKeys = new List<string>();
listProviderKeys.Add("Test1");
listProviderKeys.Add("Test3");
var res = summaries.Where(x => !listProviderKeys.Contains(x.ProviderKey));
res.ToList().ForEach(x => Console.WriteLine(x.ProviderKey));
Console.ReadLine();
}
}
public class ChangesSummary
{
public string TableName { get; set; }
public string ProviderKey { get; set; }
public string ProviderAdrsKey { get; set; }
public string ProviderSpecialtyKey { get; set; }
public string FieldName { get; set; }
}
I think in this case a simple Where would be easier and more readable to apply.
var first = new List<string> { "a" };
var second = new List<ChangesSummary>()
{
new ChangesSummary() { ProviderKey = "a" },
new ChangesSummary() { ProviderKey = "b" }
};
var result = second.Where(item => !first.Contains(item.ProviderKey));
// result
// .ToList()
// .ForEach(item => Console.WriteLine(item.ProviderKey));
I believe this will work:
List<ChangesSummary> ExceptionList = SecondList.
Where(x => !ListProviderKeys.Any(key => x.ProviderKey == key)).ToList();

Use 'LoadfromCollection' with a list containing another list inside

My problem is that I have a list that contains a few strings and inside this list another list of decimals, something like this:
public class excelInventario
{
public excelInventario() { cols = new List<decimal>(); }
public string codigo { get; set; }
public string nombre { get; set; }
public List<decimal> cols { get; set; } // List of columns
public decimal suma { get; set; }
public decimal stock { get; set; }
public decimal diferencia { get; set; }
public decimal precio { get; set; }
}
and now I need to put this in Excel. The problem is that when I use the method LoadFromCollection(MyList), the strings appear correctly in Excel, but the list of decimals does not; instead the cell shows:
System.Collections.Generic.List`1[System.Decimal].
Can I adapt this method, or do I need to use a loop and put in the row values "manually" one by one?
I suspect this second option will be inefficient.
---------------EDIT TO ADD MORE CODE--------------
int tamcolumnas=excelin[0].cols.Count;
using (ExcelPackage package = new ExcelPackage(file))
{
ExcelWorksheet hoja = package.Workbook.Worksheets.Add("Comparativo unidades contadas VS stock");
hoja.Cells["A1"].Value = "CODART";
hoja.Cells["B1"].Value = "NOMBRE";
for(int i=0;i<tamcolumnas;i++)
{ hoja.Cells[1, i+3].Value = "COL"+(i+1); }
var MyList = new List<excelInventario>();
hoja.Cells.LoadFromCollection(MyList,true);
hoja.Cells[2, 3].LoadFromArrays(MyList.Select((r) => r.cols.Cast<object>).ToArray()));
This last line is where it fails.
It says:
System.ArgumentOutOfRangeException
The specified argument is outside the range of valid values.
Since those are lists, the closest you can get to automation is LoadFromArrays, since they are not true objects. It's not exactly pretty, since it requires casting, so check for performance hits. Otherwise, it may be best to use plain old loops. Here is what I mean:
[TestMethod]
public void ListOfList_Test()
{
//http://stackoverflow.com/questions/33825995/how-to-use-loadfromcollection-in-epplus-with-a-list-containing-another-list-insi
//Throw in some data
var MyList = new List<TestExtensions.excelInventario>();
for (var i = 0; i < 10; i++)
{
var row = new TestExtensions.excelInventario
{
codigo = Path.GetRandomFileName(),
nombre = i.ToString(),
cols = new List<decimal> {i, (decimal) (i*1.5), (decimal) (i*2.5)}
};
MyList.Add(row);
}
//Create a test file
var fi = new FileInfo(@"c:\temp\ListOfList.xlsx");
if (fi.Exists)
fi.Delete();
int tamcolumnas = 10; // excelin[0].cols.Count;
using (ExcelPackage package = new ExcelPackage(fi))
{
ExcelWorksheet hoja = package.Workbook.Worksheets.Add("Comparativo unidades contadas VS stock");
hoja.Cells["A1"].Value = "CODART";
hoja.Cells["B1"].Value = "NOMBRE";
for (int i = 0; i < tamcolumnas; i++)
{
hoja.Cells[1, i + 3].Value = "COL" + (i + 1);
}
//var MyList = new List<TestExtensions.excelInventario>();
hoja.Cells.LoadFromCollection(MyList, true);
//hoja.Cells[2, 3].LoadFromArrays(MyList.Select((r) => r.cols.Cast<object>).ToArray()));
hoja.Cells[2, 3].LoadFromArrays(MyList.Select((r) => r.cols.Cast<object>().ToArray()));
package.Save();
}
}
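An alternative to calling LoadFromCollection and LoadFromArrays separately is to flatten each row into a single object[] first, so that one LoadFromArrays call writes the strings and the decimals together. The sketch below shows only the flattening step and does not reference EPPlus. InventoryRow and InventoryFlattener are hypothetical names, and the row class is cut down to three of the question's fields.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Trimmed-down stand-in for excelInventario from the question.
public class InventoryRow
{
    public string codigo { get; set; }
    public string nombre { get; set; }
    public List<decimal> cols { get; set; }
}

// Hypothetical helper, not part of the original answer.
public static class InventoryFlattener
{
    // Builds one object[] per row: codigo, nombre, then every value in cols.
    public static List<object[]> ToRows(IEnumerable<InventoryRow> items)
    {
        return items
            .Select(r => new object[] { r.codigo, r.nombre }
                .Concat(r.cols.Cast<object>()) // boxes each decimal
                .ToArray())
            .ToList();
    }
}
```

The result has the same shape as the argument passed to LoadFromArrays in the test above, so it could presumably be written starting at hoja.Cells[2, 1] below the header row. That placement is an assumption and has not been checked against EPPlus.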

Get string value and add to specific variables

I want to get the values from an array and put them into variables.
This is the array: {1,eli}
CsvValues = RowData.Split(new string[] {","},
StringSplitOptions.RemoveEmptyEntries); // RowData is {1,eli}
List<string> elements = new List<string>();
foreach (string data in CsvValues)
{
elements.Add(data);
}
and then I want to put it here:
result.Add(new wsSample()
{
id = elements[0],
name = elements[1]
});
How do I assign the elements' values to id and name?
public class wsSample
{
[DataMember]
public string id { get; set; }
[DataMember]
public string name { get; set; }
}
How is the rest of the input array structured?
If it is elements = {"1", "eli", "2", "manning"},
then you might be better off using a for loop.
I think this is what you are looking for:
List<wsSample> samples = new List<wsSample>();
// Step through the list two items at a time: id first, then name
for (int i = 0; i + 1 < elements.Count; i += 2)
{
samples.Add(new wsSample()
{
id = elements[i],
name = elements[i + 1]
});
}
