How to read rows from Lucene.Net's index files - c#

I am using Lucene.Net version 2.0.5.
I want to open an index and display all of its records in a grid (table) format in an ASP.NET web application, and also provide an edit option for each cell in that grid.
But I don't know how to read each row from the index file.
I used the code below:
private List<String> GetIndexTerms(string indexFolder)
{
    List<String> termlist = new List<string>();
    IndexReader reader = IndexReader.Open(indexFolder, false);
    TermEnum terms = reader.Terms();
    while (terms.Next())
    {
        Term term = terms.Term();
        String termText = term.Text();
        int frequency = reader.DocFreq(term);
        termlist.Add(termText);
    }
    reader.Close();
    return termlist;
}
but it returns a list of individual terms, and I am unable to aggregate the data back into rows (records).
Let me know if there is a way to read the index row by row, or whether I need to update the version of Lucene I am currently using.
Also, please provide links to any better documentation sites for Lucene.Net.

You can read all the records/rows (documents in Lucene terminology) directly from the index without searching:
var reader = IndexReader.Open(dir);
for (int i = 0; i < reader.MaxDoc(); i++)
{
    if (reader.IsDeleted(i)) continue;

    Document d = reader.Document(i);
    var fieldValuePairs = d.GetFields()
        .Select(f => new
        {
            Name = f.Name(),
            Value = f.StringValue()
        })
        .ToArray();
}
PS: v2.0.5 is very old; try the latest and greatest Lucene.Net.
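Since the original question asks about showing the records in a grid, one approach is to load the field/value pairs into a DataTable and bind that to the grid. The following is only a rough sketch, not tested against 2.0.5: it uses the 2.x-style method calls shown above (in newer Lucene.Net versions Name, StringValue, MaxDoc, etc. are properties), and recordsGrid is a hypothetical GridView control on the page.
var table = new System.Data.DataTable();
var reader = IndexReader.Open(indexFolder, false);

for (int i = 0; i < reader.MaxDoc(); i++)
{
    if (reader.IsDeleted(i)) continue;

    Document doc = reader.Document(i);

    // Add a column the first time a stored field name is seen
    foreach (Field field in doc.GetFields())
    {
        if (!table.Columns.Contains(field.Name()))
            table.Columns.Add(field.Name());
    }

    var row = table.NewRow();
    foreach (Field field in doc.GetFields())
        row[field.Name()] = field.StringValue();
    table.Rows.Add(row);
}
reader.Close();

// recordsGrid is assumed to be a GridView control on the ASP.NET page
recordsGrid.DataSource = table;
recordsGrid.DataBind();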

Related

In C# how do I go through a Google Sheets document and write into a specific cell

Following a tutorial, I have set up everything that needs to be set up for Google Sheets API v4. In my Google Sheets document, I have the names of students in the first column, and in the second column I want to put their GPA. In my code, I made two variables that the user inputs, string name and string gpa. I want to go through column A, look for that name, and insert that GPA next to it. I know I should probably use a for loop to go through the column and compare every cell with the string the user typed, but nothing I have tried so far has worked.
I wrote a simple method that can get entries; for now it only prints, but that can easily be changed:
static void ReadEntries()
{
    var range = $"{sheet}!A1:F10";
    var request = service.Spreadsheets.Values.Get(SpreadsheetId, range);
    var response = request.Execute();
    var values = response.Values;
    if (values != null && values.Count > 0)
    {
        foreach (var row in values)
        {
            Console.WriteLine("{0} | {1}", row[0], row[1]);
        }
    }
    else
    {
        Console.WriteLine("No data found");
    }
}
and a method that can update a specific cell:
static void UpdateEntry()
{
    var range = $"{sheet}!B2"; // example
    var valueRange = new ValueRange();
    var objectList = new List<object>() { "updated" };
    valueRange.Values = new List<IList<object>> { objectList };
    var updateRequest = service.Spreadsheets.Values.Update(valueRange, SpreadsheetId, range);
    updateRequest.ValueInputOption = SpreadsheetsResource.ValuesResource.AppendRequest.ValueInputOptionEnum.USERENTERED;
    var updateResponse = updateRequest.Execute();
}
EDIT: I need help with making a for loop to go through my A column and find the student with the same name. I know how to update a cell. I just don't know how to find a cell that needs updating.
Sounds like you are very close. You already have the value you are searching for in row[0] inside the loop, so all you need to do is track the row number as you go.
if (values != null && values.Count > 0)
{
    int rowNo = 0;
    foreach (var row in values)
    {
        rowNo++;
        Console.WriteLine("{0} | {1}", row[0], row[1]);
        if (row[0].ToString() == "John")
        {
            string rangeToUpdate = $"{sheet}!B{rowNo}:B{rowNo}";
            ...
        }
    }
}
You could also change from using a foreach to a standard for loop.
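Putting the read and update pieces together, the whole lookup-and-update flow could look roughly like this. This is only a sketch using the service, SpreadsheetId, and sheet variables from the question; the method name UpdateGpaForStudent is made up for illustration, and it assumes the names live in column A starting at row 1.
static void UpdateGpaForStudent(string name, string gpa)
{
    // Read names (column A) and GPAs (column B) for the whole sheet
    var readRange = $"{sheet}!A:B";
    var response = service.Spreadsheets.Values.Get(SpreadsheetId, readRange).Execute();
    var values = response.Values;
    if (values == null) return;

    // A standard for loop lets the index double as the (1-based) sheet row number
    for (int i = 0; i < values.Count; i++)
    {
        if (values[i].Count > 0 && values[i][0].ToString() == name)
        {
            var writeRange = $"{sheet}!B{i + 1}";
            var valueRange = new ValueRange
            {
                Range = writeRange,
                Values = new List<IList<object>> { new List<object> { gpa } }
            };
            var updateRequest = service.Spreadsheets.Values.Update(valueRange, SpreadsheetId, writeRange);
            updateRequest.ValueInputOption =
                SpreadsheetsResource.ValuesResource.UpdateRequest.ValueInputOptionEnum.USERENTERED;
            updateRequest.Execute();
            break;
        }
    }
}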
I'm not experienced with the .NET client library for the Sheets API.
However, having used the Sheets API with the Node and Python client libraries, I can point you to the documentation you should follow. This is the official API documentation, with code examples for each language that has a Google-provided client library.
For example, here is the spreadsheets.values.update documentation that you are using, with a code example for C#.
On to the question then:
According to the JSON representation of a ValueRange, ValueRange.Range does not seem to be optional, even though it is redundant. You might need to add ValueRange.Range = range; to your code.
Also, you are using SpreadsheetsResource.ValuesResource.AppendRequest instead of SpreadsheetsResource.ValuesResource.UpdateRequest when defining your ValueInputOption.
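In other words, something along these lines (untested on my side, since I haven't used the .NET client myself):
valueRange.Range = range; // explicitly set the (redundant) range on the request body
updateRequest.ValueInputOption =
    SpreadsheetsResource.ValuesResource.UpdateRequest.ValueInputOptionEnum.USERENTERED;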
Let me know if it helped!
Update
This also seems to be a duplicate of Update a cell with C# and Sheets API V4

c# FlaUi get all the Values from DataGridView of another program

I'm trying to pull all the values from another program's DataGridView. For that I'm using FlaUI. I wrote code that does what I want; however, it is very slow. Is there a faster way to pull all the values from another program's DataGridView using FlaUI?
My code:
var desktop = automation.GetDesktop();
var window = desktop.FindFirstDescendant(cf => cf.ByName("History: NEWLIFE")).AsWindow();
var table = window.FindFirstDescendant(cf => cf.ByName("DataGridView")).AsDataGridView();
int rowscount = (table.FindAllChildren(cf => cf.ByProcessId(30572)).Length) - 2;
// Remove the last row if we have the "add" row
for (int i = 0; i < rowscount; i++)
{
    string string1 = "Row " + i;
    string string2 = "Symbol Row " + i;
    var RowX = table.FindFirstDescendant(cf => cf.ByName(string1));
    var SymbolRowX = RowX.FindFirstDescendant(cf => cf.ByName(string2));
    SCAN.Add("" + SymbolRowX.Patterns.LegacyIAccessible.Pattern.Value);
}
var message = string.Join(Environment.NewLine, SCAN);
MessageBox.Show(message);
Thank you in advance.
Searching for descendants is pretty slow, as it walks all objects in the tree until it finds the desired control (or there are no controls left). It can be much faster to use the grid pattern to find the desired cells, or to get all rows at once and loop through them.
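For example, something along these lines might be noticeably faster. This is a rough, untested sketch; the Rows, Cells, and Value members come from FlaUI's DataGridView wrapper and may differ between FlaUI versions.
var table = window.FindFirstDescendant(cf => cf.ByName("DataGridView")).AsDataGridView();

// One call per row/cell collection instead of a descendant search per row
foreach (var row in table.Rows)
{
    var cells = row.Cells;
    if (cells.Length > 0)
    {
        // The first cell is assumed to hold the symbol, as in the original code
        SCAN.Add(cells[0].Value);
    }
}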
Alternatively, you could try caching, since UIA uses inter-process calls, which are generally slow. Each Find method or value property read makes such a call, and with a large grid that can add up pretty badly. For that exact case, using UIA caching can make sense.
With caching, you get everything you need (all descendants of the table and the LegacyIAccessible pattern) in one go inside a cache request, and then loop through those elements in code with CachedChildren and the like.
A simple example of this can be found in the FlaUI wiki at https://github.com/FlaUI/FlaUI/wiki/Caching:
var grid = <FindGrid>.AsGrid();
var cacheRequest = new CacheRequest();
cacheRequest.TreeScope = TreeScope.Descendants;
cacheRequest.Add(Automation.PropertyLibrary.Element.Name);
using (cacheRequest.Activate())
{
    var rows = grid.Rows;
    foreach (var row in rows)
    {
        foreach (var cell in row.CachedChildren)
        {
            Console.WriteLine(cell.Name);
        }
    }
}

Is there a way to dynamically create an object at run time in .NET 3.5?

I'm working on an importer that takes tab-delimited text files. The first line of each file contains 'columns' like ItemCode, Language, ImportMode, etc., and there can be a varying number of columns.
I'm able to get the names of each column, whether there is one or ten, and so on. I use a method to achieve this that returns List<string>:
private List<string> GetColumnNames(string saveLocation, int numColumns)
{
    var data = File.ReadAllLines(saveLocation);
    var columnNames = new List<string>();
    for (int i = 0; i < numColumns; i++)
    {
        var cols = from lines in data
                       .Take(1)
                       .Where(l => !string.IsNullOrEmpty(l))
                       .Select(l => l.Split(delimiter.ToCharArray(), StringSplitOptions.None))
                       .Select(value => string.Join(" ", value))
                   let split = lines.Split(' ')
                   select new
                   {
                       Temp = split[i].Trim()
                   };
        foreach (var x in cols)
        {
            columnNames.Add(x.Temp);
        }
    }
    return columnNames;
}
If I always knew what columns to expect, I could just create a new object, but since I don't, I'm wondering: is there a way I can dynamically create an object with properties that correspond to whatever GetColumnNames() returns?
Any suggestions?
For what it's worth, here's how I used DataTables to achieve what I wanted.
// saveLocation is the file location
// numColumns comes from another method that gets the number of columns in the file
var columnNames = GetColumnNames(saveLocation, numColumns);
var table = new DataTable();
foreach (var header in columnNames)
{
    table.Columns.Add(header);
}
// itemAttributeData is the file split into lines
foreach (var row in itemAttributeData)
{
    table.Rows.Add(row);
}
Although there was a bit more work involved to be able to manipulate the data in the way I wanted, Karthik's suggestion got me on the right track.
You could create a dictionary of strings per row, where the key is the "property" (column) name and the value is that column's value for the row.
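A minimal sketch of that idea, assuming tab-delimited lines and that the first line holds the column names:
var rows = new List<Dictionary<string, string>>();
var lines = File.ReadAllLines(saveLocation);
var headers = lines[0].Split('\t');

foreach (var line in lines.Skip(1))
{
    var fields = line.Split('\t');
    var record = new Dictionary<string, string>();
    for (int i = 0; i < headers.Length && i < fields.Length; i++)
    {
        record[headers[i]] = fields[i];   // column name -> value for this row
    }
    rows.Add(record);
}

// Access a "property" by column name, e.g.:
// string itemCode = rows[0]["ItemCode"];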

How to read tables from a particular place in a document?

When I use the line below, it reads all the tables of that particular document:
foreach (Microsoft.Office.Interop.Word.Table tableContent in document.Tables)
But I want to read only the tables that appear between one identifier and another.
The identifiers look like [SRS oraganisation_123] and [SRS Oraganisation_456].
I want to read only the tables between those two identifiers.
Suppose the 34th page contains my first identifier; then I want to read all tables from that point until I come across my second identifier. I don't want to read the remaining tables.
Please ask me for any clarification on the question.
Say the start and end identifiers are stored in variables called myStartIdentifier and myEndIdentifier:
Range myRange = doc.Range();
int iTagStartIdx = 0;
int iTagEndIdx = 0;

if (myRange.Find.Execute(myStartIdentifier))
    iTagStartIdx = myRange.Start;

myRange = doc.Range();
if (myRange.Find.Execute(myEndIdentifier))
    iTagEndIdx = myRange.Start;

foreach (Table tbl in doc.Range(iTagStartIdx, iTagEndIdx).Tables)
{
    // Your code goes here
}
Not sure how your program is structured... but if you can access the identifier in tableContent then you should be able to write a LINQ query.
var identifiers = new List<string>();
identifiers.Add("myIdentifier");

// Assumes tableContent exposes some identifier, as described above
var tablesWithOnlyTheIdentifiersIWant = document.Tables
    .Cast<Table>()
    .Where(tableContent => identifiers.Contains(tableContent.Identifier));

foreach (var tableContent in tablesWithOnlyTheIdentifiersIWant)
{
    // Do something
}
Go through the following code and see if it helps you.
System.Data.DataTable dt = new System.Data.DataTable();

// r is assumed to be the header row of the Word table
foreach (Microsoft.Office.Interop.Word.Cell c in r.Cells)
{
    if (c.Range.Text == "Content you want to compare")
        dt.Columns.Add(c.Range.Text);
}

foreach (Microsoft.Office.Interop.Word.Row row in newTable.Rows)
{
    System.Data.DataRow dr = dt.NewRow();
    int i = 0;
    foreach (Cell cell in row.Cells)
    {
        if (!string.IsNullOrEmpty(cell.Range.Text) && (cell.Range.Text == "Text you want to compare with"))
        {
            dr[i] = cell.Range.Text;
        }
        i++;
    }
    dt.Rows.Add(dr);
}
Also go through the third answer at the following link:
Replace bookmark text in Word file using Open XML SDK

Reading a CSV file in .NET?

How do I read a CSV file using C#?
One option, without using third-party components, is to use the Microsoft.VisualBasic.FileIO.TextFieldParser class (http://msdn.microsoft.com/en-us/library/microsoft.visualbasic.fileio.textfieldparser.aspx). It provides all the functions needed for parsing CSV. You just need to reference the Microsoft.VisualBasic assembly.
var parser = new Microsoft.VisualBasic.FileIO.TextFieldParser(file);
parser.TextFieldType = Microsoft.VisualBasic.FileIO.FieldType.Delimited;
parser.SetDelimiters(new string[] { ";" });
while (!parser.EndOfData)
{
    string[] row = parser.ReadFields();
    /* do something */
}
You can use the Microsoft.VisualBasic.FileIO.TextFieldParser class in C#:
using System;
using System.Data;
using Microsoft.VisualBasic.FileIO;

static void Main()
{
    string csv_file_path = @"C:\Users\Administrator\Desktop\test.csv";
    DataTable csvData = GetDataTableFromCSVFile(csv_file_path);
    Console.WriteLine("Rows count:" + csvData.Rows.Count);
    Console.ReadLine();
}

private static DataTable GetDataTableFromCSVFile(string csv_file_path)
{
    DataTable csvData = new DataTable();
    try
    {
        using (TextFieldParser csvReader = new TextFieldParser(csv_file_path))
        {
            csvReader.SetDelimiters(new string[] { "," });
            csvReader.HasFieldsEnclosedInQuotes = true;
            string[] colFields = csvReader.ReadFields();
            foreach (string column in colFields)
            {
                DataColumn datacolumn = new DataColumn(column);
                datacolumn.AllowDBNull = true;
                csvData.Columns.Add(datacolumn);
            }
            while (!csvReader.EndOfData)
            {
                string[] fieldData = csvReader.ReadFields();
                // Treat empty values as null
                for (int i = 0; i < fieldData.Length; i++)
                {
                    if (fieldData[i] == "")
                    {
                        fieldData[i] = null;
                    }
                }
                csvData.Rows.Add(fieldData);
            }
        }
    }
    catch (Exception ex)
    {
        // Swallowing exceptions here hides parse errors; consider logging or rethrowing
    }
    return csvData;
}
You could try CsvHelper, which is a project I work on. Its goal is to make reading and writing CSV files as easy as possible, while being very fast.
Here are a few ways you can read from a CSV file.
// By type
var records = csv.GetRecords<MyClass>();
var records = csv.GetRecords( typeof( MyClass ) );

// Dynamic
var records = csv.GetRecords<dynamic>();

// Using an anonymous type for the class definition
var anonymousTypeDefinition = new
{
    Id = default( int ),
    Name = string.Empty,
    MyClass = new MyClass()
};
var records = csv.GetRecords( anonymousTypeDefinition );
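The snippets above assume a CsvReader named csv already exists. A minimal setup could look like this (a sketch, not taken from the answer; the constructor signature has changed between CsvHelper versions, and recent versions take a CultureInfo):
using System.Globalization;
using System.IO;
using System.Linq;
using CsvHelper;

using (var reader = new StreamReader("data.csv"))
using (var csv = new CsvReader(reader, CultureInfo.InvariantCulture))
{
    var records = csv.GetRecords<MyClass>().ToList();
}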
I usually use a simplistic approach like this one:
var path = Server.MapPath("~/App_Data/Data.csv");
var csvRows = System.IO.File.ReadAllLines(path, Encoding.Default).ToList();

foreach (var row in csvRows.Skip(1))
{
    var columns = row.Split(';');
    var field1 = columns[0];
    var field2 = columns[1];
    var field3 = columns[2];
}
I just used this library in my application. http://www.codeproject.com/KB/database/CsvReader.aspx. Everything went smoothly using this library, so I'm recommending it. It is free under the MIT License, so just include the notice with your source files.
I didn't display the CSV in a browser, but the author has some samples for Repeaters or DataGrids. I did run one of his test projects to test a Sort operation I have added and it looked pretty good.
You can try Cinchoo ETL - an open source library for reading and writing CSV files.
Here are a couple of ways you can read a CSV file. Given a file emp.csv with the following contents:
Id, Name
1, Tom
2, Mark
this is how you can use the library to read it:
using (var reader = new ChoCSVReader("emp.csv").WithFirstLineHeader())
{
    foreach (dynamic item in reader)
    {
        Console.WriteLine(item.Id);
        Console.WriteLine(item.Name);
    }
}
If you have a POCO class defined to match up with the CSV file, like below:
public class Employee
{
    public int Id { get; set; }
    public string Name { get; set; }
}
you can parse the same file using this POCO class as below:
using (var reader = new ChoCSVReader<Employee>("emp.csv").WithFirstLineHeader())
{
    foreach (var item in reader)
    {
        Console.WriteLine(item.Id);
        Console.WriteLine(item.Name);
    }
}
Please check out articles at CodeProject on how to use it.
Disclaimer: I'm the author of this library
I recommend Angara.Table; for save/load, see http://predictionmachines.github.io/Angara.Table/saveload.html.
It infers column types, can save CSV files, and is much faster than TextFieldParser. It follows RFC 4180 for the CSV format and supports multiline strings, NaNs, and escaped strings containing the delimiter character.
The library is under the MIT license. The source code is at https://github.com/Microsoft/Angara.Table.
Though its API is focused on F#, it can be used in any .NET language, just not as succinctly as in F#.
Example:
using Angara.Data;
using System.Collections.Immutable;
...
var table = Table.Load("data.csv");

// Print schema:
foreach (Column c in table)
{
    string colType;
    if (c.Rows.IsRealColumn) colType = "double";
    else if (c.Rows.IsStringColumn) colType = "string";
    else if (c.Rows.IsDateColumn) colType = "date";
    else if (c.Rows.IsIntColumn) colType = "int";
    else colType = "bool";
    Console.WriteLine("{0} of type {1}", c.Name, colType);
}

// Get column data:
ImmutableArray<double> a = table["a"].Rows.AsReal;
ImmutableArray<string> b = table["b"].Rows.AsString;

Table.Save(table, "data2.csv");
You might be interested in the Linq2Csv library at CodeProject. One thing you would need to check is whether it only reads the data when it is needed, so that you won't need a lot of memory when working with bigger files.
As for displaying the data in the browser, you could do many things to accomplish it. If you were more specific about your requirements, the answer could be more specific, but here are things you could do:
1. Use the HttpListener class to write a simple web server (you can find many samples online for hosting a mini HTTP server).
2. Use ASP.NET or ASP.NET MVC: create a page and host it using IIS (see the sketch below).
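For the ASP.NET option, binding the parsed data to a GridView is usually enough to get it on screen. A minimal Web Forms sketch, assuming GridView1 is a control on the .aspx page and GetDataTableFromCSVFile refers to the TextFieldParser example earlier on this page:
protected void Page_Load(object sender, EventArgs e)
{
    var csvPath = Server.MapPath("~/App_Data/data.csv");
    DataTable csvData = GetDataTableFromCSVFile(csvPath);

    GridView1.DataSource = csvData;
    GridView1.DataBind();
}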
It seems like there are quite a few projects on CodeProject or CodePlex for CSV parsing.
Here is another CSV parser on CodePlex:
http://commonlibrarynet.codeplex.com/
This library has components for CSV parsing, INI file parsing, and command-line parsing as well. It's working well for me so far. The only thing is it doesn't have a CSV writer.
This is just for parsing the CSV. For displaying it in a web page, it is simply a matter of taking the list and rendering it however you want.
Note: This code example does not handle the situation where the input string line contains newlines.
public List<string> SplitCSV(string line)
{
    if (string.IsNullOrEmpty(line))
        throw new ArgumentException();

    List<string> result = new List<string>();
    int index = 0;
    int start = 0;
    bool inQuote = false;

    // parse line
    foreach (char c in line)
    {
        switch (c)
        {
            case '"':
                inQuote = !inQuote;
                break;
            case ',':
                if (!inQuote)
                {
                    result.Add(line.Substring(start, index - start)
                        .Replace("\"", ""));
                    start = index + 1;
                }
                break;
        }
        index++;
    }
    if (start < index)
    {
        result.Add(line.Substring(start, index - start).Replace("\"", ""));
    }
    return result;
}
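For example, to parse a whole file with this helper (a small sketch; it inherits the limitation noted above about embedded newlines):
var rows = new List<List<string>>();
foreach (var line in System.IO.File.ReadAllLines("data.csv"))
{
    rows.Add(SplitCSV(line));
}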
I have been maintaining an open source project called FlatFiles for several years now. It's available for .NET Core and .NET 4.5.1.
Unlike most of the alternatives, it allows you to define a schema (similar to the way EF code-first works) with an extreme level of precision, so you aren't fighting conversion issues all the time. You can map directly to your data classes, and there is also support for interfacing with the older ADO.NET classes.
Performance-wise, it has been tuned to be one of the fastest parsers for .NET, with a plethora of options for quirky format differences. There's also support for fixed-length files, if you need it.
You can use this library: Sky.Data.Csv
https://www.nuget.org/packages/Sky.Data.Csv/
It is a really fast CSV reader library, and it's really easy to use:
using Sky.Data.Csv;

var readerSettings = new CsvReaderSettings { Encoding = Encoding.UTF8 };
using (var reader = CsvReader.Create("path-to-file", readerSettings))
{
    foreach (var row in reader)
    {
        // do something with the data
    }
}
It also supports reading typed objects with the CsvReader<T> class, which has the same interface.
