So I have a xml structure with the following format:
<Smart>
  <Attribute>
    <id>1</id>
    <name>name</name>
    <description>description</description>
  </Attribute>
  ...
</Smart>
I then need to take user input to produce a DataTable; depending on what the user enters, different constants will be used. The ID is used to distinguish between the different constants, and all of these constants are predefined before startup. The following is my code to find the desired constants and store them in a DataTable:
for (int row = 0; row < rowcount; row++)
{
    found = false;
    XmlTextReader textReader = new XmlTextReader("Smart_Attributes.xml");
    textReader.ReadStartElement("Smart");
    while (!found)
    {
        textReader.ReadStartElement("Attribute");
        DataId = Convert.ToByte(textReader.ReadElementString("id"));
        if (DataId > id)
        {
            dataView[count][5] = "Unknown";
            dataView[count][7] = "Unknown";
            found = true;
        }
        if (DataId == id)
        {
            dataView[count][5] = textReader.ReadElementString("name");
            dataView[count][7] = textReader.ReadElementString("description");
            found = true;
        }
        else
        {
            // consume the name and description elements so the reader stays positioned correctly
            textReader.ReadElementString("name");
            textReader.ReadElementString("description");
        }
        textReader.ReadEndElement(); // </Attribute>
    }
    textReader.Close(); // release the file before the next row re-opens it
    count++;
}
This does work: the desired constants are found for their corresponding rows. However, it seems like a lot of work for not much gain. Could this potentially be done better using something like a DataSet? Any other suggestions would be super helpful.
The benefit of the XmlTextReader lies in its forward-only nature. This allows you to read a very large file in just one sweep. If your XML file is smaller, and you're happy to hold the whole structure in memory at any one time, you can use Linq2Xml, and read it into an XDocument class.
var doc = XDocument.Load("Smart_Attributes.xml");
var rows = doc.Descendants("Attribute")
    .Where(e => e.Element("id").Value == id)
    .Select(e => new
    {
        Name = e.Element("name").Value,
        Description = e.Element("description").Value
    });
// Load rows into your datatable
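To round off that last comment, here is a minimal sketch of loading the result into a DataTable, assuming a table with "Name" and "Description" columns (the table and column names are illustrative, not from the original code):
var table = new DataTable();
table.Columns.Add("Name");
table.Columns.Add("Description");
foreach (var row in rows)
{
    // Each anonymous object from the LINQ query becomes one DataTable row
    table.Rows.Add(row.Name, row.Description);
}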
Yes, I'm well aware I shouldn't do this, but I have no choice. I'd agree that it's an XY problem, but since I can't update the service I have to use, it's out of my hands. I need some help to save some time, and maybe learn something handy in the process.
I'm looking to map a list of models (items in this example) to what are essentially numbered variables of a service I'm posting to; in the example, those are the fields of 'newUser'.
Additionally, there may not always be X items in the list (on the right in the example), yet I have a finite number (say 10) of numbered variables on 'newUser' to map to (on the left in the example). So I'll also have to perform a bunch of checks to avoid indexing a value that isn't there.
Current example:
if (items.Count >= 1 && !string.IsNullOrWhiteSpace(items[0].id))
{
    newUser.itemId1 = items[0].id;
    newUser.itemName1 = items[0].name;
    newUser.itemDate1 = items[0].date;
    newUser.itemBlah1 = items[0].blah;
}
else
{
    // This isn't necessary, but this is effectively what will happen
    newUser.itemId1 = string.Empty;
    newUser.itemName1 = string.Empty;
    newUser.itemDate1 = string.Empty;
    newUser.itemBlah1 = string.Empty;
}
if (items.Count >= 2 && !string.IsNullOrWhiteSpace(items[1].id))
{
    newUser.itemId2 = items[1].id;
    newUser.itemName2 = items[1].name;
    newUser.itemDate2 = items[1].date;
    newUser.itemBlah2 = items[1].blah;
}
// Removed the else to clean it up, but you get the idea.
// And so on, repeated many more times..
I looked into an example using Dictionary, but I'm unsure of how to map that to the model without just manually mapping all the variables.
PS: To all who come across this question, if you're implementing numbered variables in your API, please don't- it's wildly unnecessary and time consuming.
As an alternative to fiddling with the JSON, you could get down and dirty and use Reflection.
Given the following test data:
const int maxItemsToSend = 3;
class ItemToSend {
public string
itemId1, itemName1,
itemId2, itemName2,
itemId3, itemName3;
}
ItemToSend newUser = new();
record Item(string id, string name);
Item[] items = { new("1", "A"), new("2", "B") };
Using the rules you set forth in the question, we can loop through the projected fields like so:
// If `itemId1`, `itemId2`, etc. are fields:
var fields = typeof(ItemToSend).GetFields();
// If they're properties, replace GetFields() with
// .GetProperties(BindingFlags.Instance | BindingFlags.Public);
for (var i = 1; i <= maxItemsToSend; i++)
{
    // bounds check
    var item = (items.Count() >= i && !string.IsNullOrWhiteSpace(items[i - 1].id))
        ? items[i - 1] : null;
    // Use Reflection to find and set the fields
    fields.FirstOrDefault(f => f.Name.Equals($"itemId{i}"))
        ?.SetValue(newUser, item?.id ?? string.Empty);
    fields.FirstOrDefault(f => f.Name.Equals($"itemName{i}"))
        ?.SetValue(newUser, item?.name ?? string.Empty);
}
It's not pretty, but it works. Here's a fiddle.
I have thousands of lines of data in a text file that I want to make easily searchable by converting it into something more structured (I am hoping XML or some other large data structure, though I am not sure which would be best for what I have in mind).
The data looks like this for each line:
Book 31, Thomas,George, 32, 34, 154
(each book is not unique; these are index entries, so a book will have several entries for the people listed in it, and the numbers are the pages on which they are listed)
So I am kind of lost on how to do this. I would want to read the .txt file and trim out all the spaces and commas; I basically get how to prep the data. But how would I programmatically create that many elements and values in XML, or populate some other large data structure?
If your CSV file does not change too much and the structure is stable, you could simply parse it into a list of objects at startup:
private class BookInfo {
    public string title { get; set; }
    public string person { get; set; }
    public List<int> pages { get; set; }
}
private List<BookInfo> allbooks = new List<BookInfo>();
public void parse() {
    var lines = File.ReadAllLines(filename); // you could also read the file line by line here to avoid reading the complete file into memory
    foreach (var l in lines) {
        var info = l.Split(',').Select(x => x.Trim()).ToArray();
        var b = new BookInfo {
            title = info[0],
            person = info[1] + ", " + info[2],
            pages = info.Skip(3).Select(x => int.Parse(x)).ToList()
        };
        allbooks.Add(b);
    }
}
Then you can easily search the allbooks list with LINQ, for instance:
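Here is a minimal sketch of a couple of such searches (the person and title strings below are just illustrations based on the sample line in the question):
// All index entries that mention a given person
var hits = allbooks.Where(b => b.person == "Thomas, George").ToList();

// All pages on which that person appears in a particular book
var pages = allbooks
    .Where(b => b.title == "Book 31" && b.person == "Thomas, George")
    .SelectMany(b => b.pages)
    .ToList();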
EDIT
Now that you have clarified your input, I have adapted the parsing a little to better fit your needs.
If you want to search your book list more easily by either the title or the person, you can also create a lookup on each of the properties:
var titleLookup = allbooks.ToLookup(x=> x.title);
var personLookup = allbooks.ToLookup(x => x.person);
So personLookup["Thomas, George"] will give you a list of all bookinfos that mention "Thomas, George" and titleLookup["Book 31"] will give you a list of all bookinfos for "Book 31", ie all persons mentioned in that book.
If you want to make the CSV file easily searchable, you can convert it to a DataTable. If you prefer XML, you can use LINQ to XML to search it.
The following class generates either a DataTable or an XML representation. You can pass the delimiter and includeHeader, or use the defaults:
class CsvUtility
{
    public DataTable Csv2DataTable(string fileName, bool includeHeader = false, char separator = ',')
    {
        IEnumerable<string> reader = File.ReadAllLines(fileName);
        var data = new DataTable("Table");
        var headers = reader.First().Split(separator);
        if (includeHeader)
        {
            foreach (var header in headers)
            {
                data.Columns.Add(header.Trim());
            }
            reader = reader.Skip(1);
        }
        else
        {
            for (int index = 0; index < headers.Length; index++)
            {
                var header = "Field" + index; // headers[index];
                data.Columns.Add(header);
            }
        }
        foreach (var row in reader)
        {
            if (row != null) data.Rows.Add(row.Split(separator));
        }
        return data;
    }

    public string Csv2Xml(string fileName, bool includeHeader = false, char separator = ',')
    {
        var dt = Csv2DataTable(fileName, includeHeader, separator);
        var stream = new StringWriter();
        dt.WriteXml(stream);
        return stream.ToString();
    }
}
Example usage:
CsvUtility csv = new CsvUtility();
var dt = csv.Csv2DataTable("f1.txt");
// Search for string in any column
DataRow[] filteredRows = dt.Select("Field1 LIKE '%" + "Thomas" + "%'");
//search in certain field
var filtered = dt.AsEnumerable().Where(r => r.Field<string>("Field1").Contains("Thomas"));
//generate xml
var xml= csv.Csv2Xml("f1.txt");
Console.WriteLine(xml);
/*
output of xml for your sample:
<DocumentElement>
<Table>
<Field0>Book 31</Field0>
<Field1> Thomas</Field1>
<Field2>George</Field2>
<Field3> 32</Field3>
<Field4> 34</Field4>
<Field5> 154</Field5>
</Table>
</DocumentElement>
*/
Let's say I have two List<string>. These are populated from the results of reading a text file
List owner contains:
cross
jhill
bbroms
List assignee contains:
Chris Cross
Jack Hill
Bryan Broms
During the read from a SQL source (the SQL statement contains a join)... I would perform
if(sqlReader["projects.owner"] == "something in owner list" || sqlReader["assign.assignee"] == "something in assignee list")
{
// add this projects information to the primary results LIST
list_by_owner.Add(sqlReader["projects.owner"],sqlReader["projects.project_date_created"],sqlReader["projects.project_name"],sqlReader["projects.project_status"]);
// if the assignee is not null, add also to the secondary results LIST
// logic to determine if assign.assignee is null goes here
list_by_assignee.Add(sqlReader["assign.assignee"],sqlReader["projects.owner"],sqlReader["projects.project_date_created"],sqlReader["projects.project_name"],sqlReader["projects.project_status"]);
}
I do not want to end up using nested foreach.
The FOR loop would probably suffice. Someone had mentioned Zip to me, but I wasn't sure whether that would be a preferable route in my situation.
One loop to iterate through both lists (assuming both have the same count):
for (int i = 0; i < alpha.Count; i++)
{
    var itemAlpha = alpha[i]; // <= your object from list alpha
    var itemBeta = beta[i];   // <= your object from list beta
    // write your code here
}
From what you describe, you don't need to iterate at all.
This is what you need:
http://msdn.microsoft.com/en-us/library/bhkz42b3.aspx
Usage:
if (listAlpha.Contains(resultA) || listBeta.Contains(resultA))
{
    // do your operation
}
List iteration happens implicitly inside the Contains method, and that's 2n comparisons versus n*n for nested iteration.
If you do need to go that route, you would be better off iterating each list sequentially, one after the other.
This list is maybe better represented as a List<KeyValuePair<string, string>> which would pair the two list values together in a single list.
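For instance, a minimal sketch of building that paired list from the two lists in the question (assuming both lists are the same length and aligned by index):
var pairs = new List<KeyValuePair<string, string>>();
for (int i = 0; i < owner.Count; i++)
{
    // owner[i] and assignee[i] describe the same person in the source data
    pairs.Add(new KeyValuePair<string, string>(owner[i], assignee[i]));
}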
There are several options for this. The least "painful" would be a plain old for loop:
for (var index = 0; index < alpha.Count; index++)
{
    var alphaItem = alpha[index];
    var betaItem = beta[index];
    // Do something.
}
Another interesting approach is using the indexed LINQ overloads (but you need to remember they are evaluated lazily, so you have to consume the resulting enumerable), for example:
alpha.Select((alphaItem, index) =>
{
    var betaItem = beta[index];
    // Do something, then return a value so the query can be materialized
    return alphaItem;
}).ToList();
Or you can enumerate both collection if you use the enumerator directly:
using (var alphaEnumerator = alpha.GetEnumerator())
using (var betaEnumerator = beta.GetEnumerator())
{
    while (alphaEnumerator.MoveNext() && betaEnumerator.MoveNext())
    {
        var alphaItem = alphaEnumerator.Current;
        var betaItem = betaEnumerator.Current;
        // Do something
    }
}
Zip (if you need pairs) or Concat (if you need a combined list) are other possible options for iterating two lists at the same time.
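A minimal sketch of both options (assuming alpha and beta are the lists from above; Zip stops at the shorter list, while Concat simply appends one sequence after the other):
// Pairwise iteration with Zip
foreach (var pair in alpha.Zip(beta, (a, b) => new { Alpha = a, Beta = b }))
{
    // pair.Alpha and pair.Beta are the elements at the same position
}

// One combined sequence with Concat
foreach (var item in alpha.Concat(beta))
{
    // every element of alpha, then every element of beta
}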
I like doing something like this to enumerate over parallel lists:
int alphaCount = alpha.Count;
int betaCount = beta.Count;
int i = 0;
while (i < alphaCount && i < betaCount)
{
    var a = alpha[i];
    var b = beta[i];
    // handle matched alpha/beta pairs
    ++i;
}
while (i < alphaCount)
{
    var a = alpha[i];
    // handle unmatched alphas
    ++i;
}
while (i < betaCount)
{
    var b = beta[i];
    // handle unmatched betas
    ++i;
}
I have a flat file with an unfortunately dynamic column structure. There is a value that is in a hierarchy of values, and each tier in the hierarchy gets its own column. For example, my flat file might resemble this:
StatisticID|FileId|Tier0ObjectId|Tier1ObjectId|Tier2ObjectId|Tier3ObjectId|Status
1234|7890|abcd|efgh|ijkl|mnop|Pending
...
The same feed the next day may resemble this:
StatisticID|FileId|Tier0ObjectId|Tier1ObjectId|Tier2ObjectId|Status
1234|7890|abcd|efgh|ijkl|Complete
...
The thing is, I don't care much about all the tiers; I only care about the id of the last (bottom) tier and all the other row data that is not part of the tier columns. I need to normalize the feed to something resembling this to insert into a relational database:
StatisticID|FileId|ObjectId|Status
1234|7890|ijkl|Complete
...
What would be an efficient, easy-to-read mechanism for determining the last tier object id, and organizing the data as described? Every attempt I've made feels kludgy to me.
Some things I've done:
I have tried to examine the column names for regular expression patterns, identify the columns that are tiered, order them by name descending, and select the first record... but I lose the ordinal column number this way, so that didn't look good.
I have placed the columns I want into an IDictionary<string, int> object to reference, but again reliably collecting the ordinal of the dynamic columns is an issue, and it seems this would be rather non-performant.
I ran into a similar problem a few years ago. I used a Dictionary to map the columns; it was not pretty, but it worked.
First make a Dictionary:
private Dictionary<int, int> GetColumnDictionary(string headerLine)
{
    Dictionary<int, int> columnDictionary = new Dictionary<int, int>();
    List<string> columnNames = headerLine.Split('|').ToList();
    string maxTierObjectColumnName = GetMaxTierObjectColumnName(columnNames);
    for (int index = 0; index < columnNames.Count; index++)
    {
        if (columnNames[index] == "StatisticID")
        {
            columnDictionary.Add(0, index);
        }
        if (columnNames[index] == "FileId")
        {
            columnDictionary.Add(1, index);
        }
        if (columnNames[index] == maxTierObjectColumnName)
        {
            columnDictionary.Add(2, index);
        }
        if (columnNames[index] == "Status")
        {
            columnDictionary.Add(3, index);
        }
    }
    return columnDictionary;
}

private string GetMaxTierObjectColumnName(List<string> columnNames)
{
    // Edit this function if the tier number in TierXObjectId can be greater than 9
    var maxTierObjectColumnName = columnNames.Where(c => c.Contains("Tier") && c.Contains("Object")).OrderBy(c => c).Last();
    return maxTierObjectColumnName;
}
And after that it's simply a matter of running through the file:
private List<DataObject> ParseFile(string fileName)
{
    var dataObjects = new List<DataObject>();
    using (var streamReader = new StreamReader(fileName))
    {
        string headerLine = streamReader.ReadLine();
        Dictionary<int, int> columnDictionary = this.GetColumnDictionary(headerLine);
        string line;
        while ((line = streamReader.ReadLine()) != null)
        {
            var lineValues = line.Split('|');
            dataObjects.Add(
                new DataObject()
                {
                    StatisticId = lineValues[columnDictionary[0]],
                    FileId = lineValues[columnDictionary[1]],
                    ObjectId = lineValues[columnDictionary[2]],
                    Status = lineValues[columnDictionary[3]]
                }
            );
        }
    }
    return dataObjects;
}
I hope this helps (even a little bit).
Personally I would not try to reformat your file. I think the easiest approach would be to parse each row from the front and the back. For example:
itemArray = getMyItems();
statisticId = itemArray[0];
fileId = itemArray[1];
//and so on for the rest of your pre-tier columns
//Then get the second-to-last column, which will be the last tier (the last column is Status)
lastTierId = itemArray[itemArray.Length - 2];
Since you know the last tier will always be second from the end you can just start at the end and work your way forwards. This seems like it would be much easier than trying to reformat the datafile.
If you really want to create a new file, you could use this approach to get the data you want to write out.
I don't know C# syntax, but something along these lines:
split line in parts with | as separator
get parts [0], [1], [length - 2] and [length - 1]
pass the parts to the database handling code
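In C#, that outline might look roughly like this (a sketch only, assuming line holds one row of the feed; the SaveToDatabase helper is hypothetical and stands in for whatever database code you already have):
var parts = line.Split('|');
var statisticId = parts[0];
var fileId = parts[1];
var objectId = parts[parts.Length - 2]; // last tier column, just before Status
var status = parts[parts.Length - 1];
SaveToDatabase(statisticId, fileId, objectId, status); // hypothetical database call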
I am using Lucene.Net version 2.0.5.
I want to open and display all the records in the index file in a grid (table) format in an ASP.NET web application, and also provide edit option for each cell in that grid.
But I don't know how to read each row from the index file.
I used the code below:
private List<String> GetIndexTerms(string indexFolder)
{
    List<String> termlist = new List<string>();
    IndexReader reader = IndexReader.Open(indexFolder, false);
    TermEnum terms = reader.Terms();
    while (terms.Next())
    {
        Term term = terms.Term();
        String termText = term.Text();
        int frequency = reader.DocFreq(term);
        termlist.Add(termText);
    }
    reader.Close();
    return termlist;
}
but it returns a list of individual terms, and I am unable to aggregate the data by row (record).
Let me know if there is a way to read the file row by row, or whether I need to update the version of Lucene I am currently using.
Also, please provide any links to better Lucene.Net documentation.
You can read all the records/rows (documents in Lucene terminology) directly from the index, without searching:
var reader = IndexReader.Open(dir);
for (int i = 0; i < reader.MaxDoc(); i++)
{
    if (reader.IsDeleted(i)) continue;
    Document d = reader.Document(i);
    var fieldValuePairs = d.GetFields()
        .Select(f => new
        {
            Name = f.Name(),
            Value = f.StringValue()
        })
        .ToArray();
}
PS: v2.0.5 is very old; try the latest and greatest Lucene.Net.
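If you do upgrade, the same idea carries over; here is a rough sketch against the Lucene.Net 4.8 API from memory (the index path is a placeholder, and deleted documents are skipped via the live-docs bitset rather than IsDeleted):
using Lucene.Net.Documents;
using Lucene.Net.Index;
using Lucene.Net.Store;

using (var dir = FSDirectory.Open("path/to/index"))        // placeholder path
using (var reader = DirectoryReader.Open(dir))
{
    var liveDocs = MultiFields.GetLiveDocs(reader);         // null when nothing has been deleted
    for (int i = 0; i < reader.MaxDoc; i++)
    {
        if (liveDocs != null && !liveDocs.Get(i)) continue; // skip deleted documents
        Document d = reader.Document(i);
        foreach (IIndexableField f in d.Fields)
        {
            var name = f.Name;
            var value = f.GetStringValue();                 // null for non-string fields
        }
    }
}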