I have a method that reads a text file.
I need to get the words in the text file that start with "ART".
I have a foreach loop that iterates over the method's result.
class ProductsList
{
public static void Main()
{
String path = @"D:\ProductsProjects\products.txt";
GetProducts(path, s => s.StartsWith("ART"));
//foreach (String product in GetProducts(path, s => s.StartsWith("ART")))
//Console.Write("{0}; ", product);
}
My method looks like this:
public static String GetProducts(String path, Func<String, bool> lambda)
{
try {
using (StreamReader sr = new StreamReader(path)){
string[] products= sr.ReadToEnd().Split(' ');
// need to get all the products starting with ART
foreach (string s in products){
return s;
}
}
}
catch (IOException ioe){
Console.WriteLine(ioe.Message);
}
return "";
}
}
I'm having problems with the lambda in the method. I'm new to working with lambdas and I don't really know how to apply the lambda inside the method.
I'm sorry if I can't really explain myself that well.
Just add it here:
foreach (string s in products.Where(lambda))
Update:
You should change your method like this so that it returns a sequence of products and not just a single one:
public static IEnumerable<string> GetProducts(String path, Func<String, bool> lambda)
{
using (StreamReader sr = new StreamReader(path))
{
string[] products = sr.ReadToEnd().Split(' ');
// need to get all the products starting with ART
foreach (string s in products.Where(lambda))
{
yield return s;
}
}
}
Your code is wrong in that it only ever returns one string, whereas you want to return multiple strings. If the list of products is large, reading the whole file up front could also take a while, so I'd recommend doing it this way:
public static IEnumerable<string> GetProducts(string path, Func<string, bool> matcher)
{
using(var stream = new FileStream(path, FileMode.Open, FileAccess.Read, FileShare.None))
{
using(var reader = new StreamReader(stream))
{
string line;
// ReadLine returns null at end of file, which also covers an empty file
while ((line = reader.ReadLine()) != null)
{
if (matcher(line)) yield return line;
}
}
}
}
Then using it is as simple as:
foreach(var product in GetProducts("abc.txt", s => s.StartsWith("ART")))
{
Console.WriteLine("This is a matching product: {0}", product);
}
This code has the benefit of returning all of the lines that match the predicate (the lambda), as well as doing so using an iterator block, which means it doesn't actually read the next line until you ask for it.
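For comparison, here is a minimal sketch of the same idea using File.ReadLines, which (on .NET 4.0 and later) also reads lazily, so the filter can be a plain LINQ Where. The class name, the temp file, and the sample lines are made up for illustration, and it filters whole lines rather than space-separated words.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

static class ProductFilterSketch
{
    // File.ReadLines streams the file line by line; Where applies the predicate lazily.
    public static IEnumerable<string> GetProducts(string path, Func<string, bool> matcher)
    {
        return File.ReadLines(path).Where(matcher);
    }

    static void Main()
    {
        // Hypothetical sample data written to a temp file so the sketch is self-contained.
        string path = Path.GetTempFileName();
        File.WriteAllLines(path, new[] { "ART001", "BOOK17", "ART002" });

        foreach (var product in GetProducts(path, s => s.StartsWith("ART")))
        {
            Console.WriteLine("This is a matching product: {0}", product);
        }

        File.Delete(path);
    }
}
```

Because nothing is read until the foreach starts, the file must still exist when the result is enumerated.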
Related
I am new to object-oriented programming and I am working on a small personal project with some SQL scripts.
I have a scenario where a SQL script calls a static method with a file path as input.
queries = Select Query from Table where Utils.ContainsKeyword(Query, @Path1) AND NOT Utils.ContainsKeyword(Query, @Path2);
I had initially created a static class that does the following:
public static class Utils
{
public static bool ContainsKeyword(string query, string path)
{
var isQueryInFile = false;
var stringFromFile = GetStringFromFile(path);
List<Regex> regexList = GetRegexList(stringFromFile);
if(regexList!= null)
{
isQueryInFile = regexList.Any(pattern => pattern.IsMatch(query));
}
return isQueryInFile;
}
private static string GetStringFromFile(string path)
{
var words = String.Empty;
if(!string.IsNullOrEmpty(path))
{
try
{
using (StreamReader sr = File.OpenText(path))
{
words = sr.ReadToEnd().Replace(Environment.NewLine, "");
}
}
catch { return words; }
}
return words;
}
private static List<Regex> GetRegexList(string words)
{
if(string.IsNullOrEmpty(words)) { return null; }
return words.Split(',').Select(w => new Regex(@"\b" + Regex.Escape(w) + @"\b", RegexOptions.Compiled | RegexOptions.IgnoreCase)).ToList();
}
}
My problem is that I neither want to read from the file every time the ContainsKeyword static method is called nor do I want to create a new RegexList every time. Also, I cannot change the SQL script and I have to send the path to the file as an input parameter for the method call in the SQL script since the path might change in the future.
Is there a way to make sure I read the contents from the input path only once, store them in a string, and use that string to match against different input queries?
To read the content only once, you will probably need to keep it in memory. Memory capacity could be an issue.
public Dictionary<string, string> FileContentCache { get; set; } // make sure that gets initialized
public string GetFileContentCache(string path)
{
if (FileContentCache == null) FileContentCache = new Dictionary<string, string>();
if (FileContentCache.ContainsKey(path))
return FileContentCache[path];
var fileData = GetStringFromFile(path);
FileContentCache.Add(path, fileData);
return fileData;
}
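Building on the caching idea, here is a hedged sketch that caches the compiled regex list per path rather than the raw text, so neither the file read nor the regex construction is repeated. It assumes the host keeps static state alive between calls, which may not hold in every SQL runtime. The names KeywordCache and Load are made up for illustration. Unlike the original, it drops empty entries, because an empty keyword would produce a pattern that matches almost any query.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Text.RegularExpressions;

public static class KeywordCache
{
    // One entry per path; ConcurrentDictionary keeps lookups safe across threads.
    private static readonly ConcurrentDictionary<string, List<Regex>> Cache =
        new ConcurrentDictionary<string, List<Regex>>();

    public static bool ContainsKeyword(string query, string path)
    {
        if (string.IsNullOrEmpty(query) || string.IsNullOrEmpty(path)) return false;
        // GetOrAdd may call Load more than once under a race, but keeps a single result per path.
        List<Regex> patterns = Cache.GetOrAdd(path, Load);
        return patterns.Any(p => p.IsMatch(query));
    }

    private static List<Regex> Load(string path)
    {
        string text;
        try { text = File.ReadAllText(path).Replace(Environment.NewLine, ""); }
        catch (Exception) { return new List<Regex>(); } // a failed read is cached as "no keywords"

        return text
            .Split(new[] { ',' }, StringSplitOptions.RemoveEmptyEntries)
            .Select(w => new Regex(@"\b" + Regex.Escape(w) + @"\b",
                                   RegexOptions.Compiled | RegexOptions.IgnoreCase))
            .ToList();
    }
}
```

One trade-off of this sketch: the cache never expires, so edits to the keyword file are not picked up until the process restarts.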
I have a csv file with the following data:
500000,0.005,6000
690000,0.003,5200
I need to add each line as a separate array, so 500000, 0.005, 6000 would be array1. How would I do this?
Currently my code adds each column into one element.
For example data[0] is showing 500000
690000
static void ReadFromFile(string filePath)
{
try
{
// Create an instance of StreamReader to read from a file.
// The using statement also closes the StreamReader.
using (StreamReader sr = new StreamReader(filePath))
{
string line;
// Read and display lines from the file until the end of
// the file is reached.
while ((line = sr.ReadLine()) != null)
{
string[] data = line.Split(',');
Console.WriteLine(data[0] + " " + data[1]);
}
}
}
catch (Exception e)
{
// Let the user know what went wrong.
Console.WriteLine("The file could not be read:");
Console.WriteLine(e.Message);
}
}
Using the limited data set you've provided...
const string test = @"500000,0.005,6000
690000,0.003,5200";
var result = test.Split('\n')
.Select(x=> x.Split(',')
.Select(y => Convert.ToDecimal(y))
.ToArray()
)
.ToArray();
foreach (var element in result)
{
Console.WriteLine($"{element[0]}, {element[1]}, {element[2]}");
}
Can it be done without LINQ? Yes, but it's messy...
const string test = @"500000,0.005,6000
690000,0.003,5200";
List<decimal[]> resultList = new List<decimal[]>();
string[] lines = test.Split('\n');
foreach (var line in lines)
{
List<decimal> decimalValueList = new List<decimal>();
string[] splitValuesByComma = line.Split(',');
foreach (string value in splitValuesByComma)
{
decimal convertedValue = Convert.ToDecimal(value);
decimalValueList.Add(convertedValue);
}
decimal[] decimalValueArray = decimalValueList.ToArray();
resultList.Add(decimalValueArray);
}
decimal[][] resultArray = resultList.ToArray();
That will give the exact same output as the first example.
If you can use a List<string[]>, you do not have to worry about the array length.
In the following example, the variable lines will be a list of arrays, like:
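As a variation on the two examples above, here is a small sketch that takes the lines as an array (for instance from File.ReadAllLines) and parses with CultureInfo.InvariantCulture, so "0.005" is read the same way regardless of the machine's regional settings. The class and method names are hypothetical.

```csharp
using System;
using System.Globalization;
using System.IO;
using System.Linq;

static class CsvRowsSketch
{
    // Each non-blank line becomes its own decimal[].
    public static decimal[][] ParseRows(string[] lines)
    {
        return lines
            .Where(line => !string.IsNullOrWhiteSpace(line))
            .Select(line => line.Split(',')
                .Select(v => decimal.Parse(v.Trim(), CultureInfo.InvariantCulture))
                .ToArray())
            .ToArray();
    }

    static void Main()
    {
        // In real use this could be ParseRows(File.ReadAllLines(filePath)).
        var rows = ParseRows(new[] { "500000,0.005,6000", "690000,0.003,5200" });
        foreach (var row in rows)
        {
            Console.WriteLine(string.Join(", ",
                row.Select(d => d.ToString(CultureInfo.InvariantCulture))));
        }
    }
}
```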
["500000", "0.005", "6000"]
["690000", "0.003", "5200"]
static void ReadFromFile(string filePath)
{
try
{
// Create an instance of StreamReader to read from a file.
// The using statement also closes the StreamReader.
using (StreamReader sr = new StreamReader(filePath))
{
List<string[]> lines = new List<string[]>();
string line;
// Read and display lines from the file until the end of
// the file is reached.
while ((line = sr.ReadLine()) != null)
{
string[] splittedLine = line.Split(',');
lines.Add(splittedLine);
}
}
}
catch (Exception e)
{
// Let the user know what went wrong.
Console.WriteLine("The file could not be read:");
Console.WriteLine(e.Message);
}
}
While the other answers use Split, I will suggest a more "by the book", specification-driven approach.
You have some CSV values in a file. Find a name for the object stored in the CSV, name every column, and give each one a type.
Define the default value of each field. Define what happens for a missing column or a malformed field. Is there a header?
Now that you know what you have, define what you want. Once again: object name -> property -> type.
Believe it or not, simply defining your input and output solves most of your issue.
Use CsvHelper to simplify your code.
CSV File Definition:
public class CsvItem_WithARealName
{
public int data1;
public decimal data2;
public int goodVariableNames;
}
public class CsvItemMapper : ClassMap<CsvItem_WithARealName>
{
public CsvItemMapper()
{ //mapping based on index. cause file has no header.
Map(m => m.data1).Index(0);
Map(m => m.data2).Index(1);
Map(m => m.goodVariableNames).Index(2);
}
}
A CSV reader method: point it at a document and it will give you the CSV items.
Here we have some configuration: no header, and InvariantCulture for decimal conversion.
private IEnumerable<CsvItem_WithARealName> GetCsvItems(string filePath)
{
using (var fileReader = File.OpenText(filePath))
using (var csvReader = new CsvHelper.CsvReader(fileReader))
{
csvReader.Configuration.CultureInfo = CultureInfo.InvariantCulture;
csvReader.Configuration.HasHeaderRecord = false;
csvReader.Configuration.RegisterClassMap<CsvItemMapper>();
while (csvReader.Read())
{
var record = csvReader.GetRecord<CsvItem_WithARealName>();
yield return record;
}
}
}
Usage :
var filename = "csvExemple.txt";
var items = GetCsvItems(filename);
I'm trying to gather a list of website links which, starting from the root directory, can branch down into many sub-directory links. Below is a link to a simplified graphic that illustrates the structure. I'm only concerned with getting the links in green. Yellow links always lead to other links, so my output array would contain A, B, D, F, G, H, I. I'm trying to code this in C#.
In generic terms, you can do something like
private static IEnumerable<T> Leaves<T>(T root, Func<T, IEnumerable<T>> childSource)
{
var children = childSource(root).ToList();
if (!children.Any()) {
yield return root;
yield break;
}
foreach (var descendant in children.SelectMany(child => Leaves(child, childSource)))
{
yield return descendant;
}
}
Here, childSource is assumed to be a function that can take an element and return that element's children. In your case, you'll want to make a function that uses something like HtmlAgilityPack to take a given url, download it, and return links from that.
private static string Get(int msBetweenRequests, string url)
{
try
{
var webRequest = WebRequest.CreateHttp(url);
using (var webResponse = webRequest.GetResponse())
using (var responseStream = webResponse.GetResponseStream())
using (var responseStreamReader = new StreamReader(responseStream, System.Text.Encoding.UTF8))
{
var result = responseStreamReader.ReadToEnd();
return result;
}
}
catch
{
return null; // really nothing sensible to do here
}
finally
{
// let's be nice to the server we're crawling
System.Threading.Thread.Sleep(msBetweenRequests);
}
}
private static IEnumerable<string> ScrapeForLinks(string url)
{
var noResults = Enumerable.Empty<string>();
var html = Get(1000, url);
if (string.IsNullOrWhiteSpace(html)) return noResults;
var d = new HtmlAgilityPack.HtmlDocument();
d.LoadHtml(html);
var links = d.DocumentNode.SelectNodes("//a[@href]");
return links == null ? noResults :
links.Select(
link =>
link
.Attributes
.Where(a => a.Name.ToLower() == "href")
.Select(a => a.Value)
.First()
)
.Select(linkUrl => FixRelativePaths(url, linkUrl))
;
}
private static string FixRelativePaths(string baseUrl, string relativeUrl)
{
var combined = new Uri(new Uri(baseUrl), relativeUrl);
return combined.ToString();
}
Note that, in a naive approach, you'll run into an infinite loop if there are any cycles in the links between these pages. To alleviate this, you'll want to avoid expanding the children of a url you've visited before.
private static Func<string, IEnumerable<string>> DontVisitMoreThanOnce(Func<string, IEnumerable<string>> naiveChildSource)
{
var alreadyVisited = new HashSet<string>();
return s =>
{
var children = naiveChildSource(s).Select(RemoveTrailingSlash).ToList();
var filteredChildren = children.Where(c => !alreadyVisited.Contains(c)).ToList();
alreadyVisited.UnionWith(children);
return filteredChildren;
};
}
private static string RemoveTrailingSlash(string url)
{
return url.TrimEnd(new[] {'/'});
}
In case you'd like to prevent your crawler from escaping onto the internet and spending time on Youtube, you'll want
private static Func<string, IEnumerable<string>> DontLeaveTheDomain(
string domain,
Func<string, IEnumerable<string>> wanderer)
{
return u => wanderer(u).Where(l => l.StartsWith(domain));
}
Once you've defined these things, what you want is just
var results = Leaves(
myUrl,
DontLeaveTheDomain(
myDomain,
DontVisitMoreThanOnce(ScrapeForLinks)))
.Distinct()
.ToList();
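To see how Leaves behaves without touching the network, here is a minimal sketch that runs it over an in-memory tree. The page names and the dictionary are hypothetical stand-ins for real URLs and scraped links, and the tree is acyclic, so no cycle guard is needed here.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class LeavesSketch
{
    // Same shape as the Leaves method above: yield only nodes that have no children.
    public static IEnumerable<T> Leaves<T>(T root, Func<T, IEnumerable<T>> childSource)
    {
        var children = childSource(root).ToList();
        if (!children.Any())
        {
            yield return root;
            yield break;
        }
        foreach (var leaf in children.SelectMany(c => Leaves(c, childSource)))
        {
            yield return leaf;
        }
    }

    static void Main()
    {
        // Hypothetical site map: keys are pages, values are the links found on them.
        var links = new Dictionary<string, string[]>
        {
            { "root", new[] { "A", "C" } },
            { "C",    new[] { "B", "E" } },
            { "E",    new[] { "D" } },
        };

        Func<string, IEnumerable<string>> childSource = page =>
        {
            string[] found;
            return links.TryGetValue(page, out found) ? found : Enumerable.Empty<string>();
        };

        // Prints "A,B,D": C and E have children, so they are not leaves.
        Console.WriteLine(string.Join(",", Leaves("root", childSource)));
    }
}
```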
I am trying to use a CSV parser which I found on the net in my project. The problem is I am getting a null reference exception when I try to convert the string to a Tag and my collection does not get populated. Can anyone assist? Thanks
CSV Parser
private static IEnumerable<string[]> parseCSV(string path)
{
List<string[]> parsedData = new List<string[]>();
try
{
using (StreamReader readFile = new StreamReader(path))
{
string line;
string[] row;
while ((line = readFile.ReadLine()) != null)
{
row = line.Split(',');
parsedData.Add(row);
}
}
}
catch (Exception e)
{
System.Windows.MessageBox.Show(e.Message);
}
return parsedData;
}
Tag Class
public class Tag
{
public Tag(string name, int weight)
{
Name = name;
Weight = weight;
}
public string Name { get; set; }
public int Weight { get; set; }
public static IEnumerable<Tag> CreateTags(IEnumerable<string> words)
{
Dictionary<string, int> tags = new Dictionary<string, int>();
foreach (string word in words)
{
int count = 1;
if (tags.ContainsKey(word))
{
count = tags[word] + 1;
}
tags[word] = count;
}
return tags.Select(kvp => new Tag(kvp.Key, kvp.Value));
}
}
Validate all method arguments before you use them!
It breaks on this line: foreach (string word in words)
Remember that foreach loops work by calling GetEnumerator on the collection iterated over. That is, your foreach loop causes a call to words.GetEnumerator, and this call fails if words is null.
Therefore, validate your argument words by adding a guard at the very start of your CreateTags method:
if (words == null)
{
throw new ArgumentNullException("words");
}
This will help you find the location in your code where null is passed into CreateTags, and you can then continue fixing the calling code.
Suggestion: Avoid null whenever possible.
As a very general rule, try to avoid using null values whenever possible. For example, when your code is dealing with sets and collections of items, you could make sure that it also works correctly with empty collections. In a second step, make sure that you never use null to represent an empty collection; instead, use e.g. LINQ's Enumerable.Empty<TItem>() generator to create an empty collection.
One place where you could start doing this is in the CreateTags method by ensuring that no matter what the inputs are, that method will always return a valid, non-null (but possibly empty) collection:
if (words == null)
{
return Enumerable.Empty<Tag>(); // You could do without LINQ by writing:
// return new Tag[] { };
}
Every method should run sanity checks on the arguments it accepts to ensure the arguments are valid input parameters. I would probably do something like
public static IEnumerable<Tag> CreateTags(IEnumerable<string> words)
{
if(words==null)
{
//either throw a new ArgumentException or
return null; //or return Enumerable.Empty<Tag>();
}
Dictionary<string, int> tags = new Dictionary<string, int>();
foreach (string word in words)
{
int count = 1;
if (tags.ContainsKey(word))
{
count = tags[word] + 1;
}
tags[word] = count;
}
return tags.Select(kvp => new Tag(kvp.Key, kvp.Value));
}
As to why your "words" param is null, it would be helpful to see the CSV file you are trying to parse in.
Hope this helps!
My file named as test.txt contains
This document is divided into about 5 logical sections starting with a feature and structure overview, followed by an overview of built in column and cell types. Next is an overview of working with data, followed by an overview of specific major features. Lastly, a “best practice” section concludes the main part of this document.
Now I want to delete the 2nd line of the file.
How can I do it using C#?
Thanks in advance.
Naveenkumar
int lineNum = 1; // zero-based index of the line to remove (1 = 2nd line)
List<string> lines = File.ReadAllLines(@"filename.txt").ToList();
if (lines.Count > lineNum)
{
lines.RemoveAt(lineNum);
}
File.WriteAllLines(@"filename.txt", lines.ToArray());
You can achieve this by splitting the text by \n and then using LINQ to select the lines you want to keep, and re-joining them.
var lineNum=5; // zero-based index of the line to drop
var lines=File
.ReadAllText(@"src.txt")
.Split('\n');
var outTxt=String
.Join(
"\n",
lines
.Take(lineNum)
.Concat(lines.Skip(lineNum+1))
.ToArray()
);
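The snippet above builds the joined text but does not write it back. As a rough sketch of the full round trip, the indexed overload of Where can drop a line by position before the file is rewritten. The class name, helper name, and temp file here are made up for illustration, and the whole file is held in memory, so this suits small files.

```csharp
using System;
using System.IO;
using System.Linq;

static class RemoveLineSketch
{
    // Returns the lines with the one at the given zero-based index removed.
    public static string[] WithoutLine(string[] lines, int index)
    {
        return lines.Where((line, i) => i != index).ToArray();
    }

    static void Main()
    {
        // Hypothetical three-line file so the sketch is self-contained.
        string path = Path.GetTempFileName();
        File.WriteAllLines(path, new[] { "first", "second", "third" });

        // Index 1 is the 2nd line.
        File.WriteAllLines(path, WithoutLine(File.ReadAllLines(path), 1));

        Console.WriteLine(string.Join("|", File.ReadAllLines(path))); // prints "first|third"
        File.Delete(path);
    }
}
```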
Here's a pretty efficient way to do it.
FileInfo x = new FileInfo(@"path\to\original");
string xpath = x.FullName;
FileInfo y = new FileInfo(@"path\to\temporary\new\file");
using (var reader = x.OpenText())
using (var writer = y.AppendText())
{
// write 1st line
writer.WriteLine(reader.ReadLine());
reader.ReadLine(); // skip 2nd line
// write all remaining lines
while (!reader.EndOfStream)
{
writer.WriteLine(reader.ReadLine());
}
}
x.Delete();
y.MoveTo(xpath);
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.IO;
namespace rem2ndline
{
class Program
{
static void Main(string[] args)
{
string inPath = @"c:\rem2ndline.txt";
string outPath = @"c:\rem2ndlineresult.txt";
StringBuilder builder = new StringBuilder();
using (FileStream fso = new FileStream(inPath, FileMode.Open))
{
using (StreamReader rdr = new StreamReader(fso))
{
int lineCount = 0;
bool canRead = true;
while (canRead)
{
var line = rdr.ReadLine();
lineCount++;
if (line == null)
{
canRead = false;
}
else
{
if (lineCount != 2)
{
builder.AppendLine(line);
}
}
}
}
}
using(FileStream fso2 = new FileStream(outPath, FileMode.Create)) // Create truncates any existing file
{
using (StreamWriter strw = new StreamWriter(fso2))
{
strw.Write(builder.ToString());
}
}
}
}
}
Here's what I'd do. The advantage is that you don't have to have the file in memory all at once, so memory requirements should be similar for files of varying sizes (as long as the lines contained in each of the files are of similar length). The drawback is that you can't pipe back to the same file - you have to mess around with a Delete and a Move afterwards.
The extension methods may be overkill for your simple example, but those are two extension methods I come to rely on again and again, as well as the ReadFile method, so I'd typically only have to write the code in Main().
class Program
{
static void Main()
{
var file = @"C:\myFile.txt";
var tempFile = Path.ChangeExtension(file, "tmp");
using (var writer = new StreamWriter(tempFile))
{
ReadFile(file)
.FilterI((i, line) => i != 1)
.ForEach(l => writer.WriteLine(l));
}
File.Delete(file);
File.Move(tempFile, file);
}
static IEnumerable<String> ReadFile(String file)
{
using (var reader = new StreamReader(file))
{
while (!reader.EndOfStream)
{
yield return reader.ReadLine();
}
}
}
}
static class IEnumerableExtensions
{
public static IEnumerable<T> FilterI<T>(
this IEnumerable<T> seq,
Func<Int32, T, Boolean> filter)
{
var index = 0;
foreach (var item in seq)
{
if (filter(index, item))
{
yield return item;
}
index++;
}
}
public static void ForEach<T>(
this IEnumerable<T> seq,
Action<T> action)
{
foreach (var item in seq)
{
action(item);
}
}
}