IComparer not being called


I have a problem where my custom IComparer is not being called. I suspect it might be because I expect it to work with objects of different types. Why is it not being called?
The IComparer:
public class MyComparer : IComparer<object>
{
public string[] needles { get; set; }
public MyComparer(string[] argument)
{
needles = argument;
}
public int Compare(object x, object y)
{
int rankA = x.getRankedResults(needles);
int rankB = y.getRankedResults(needles);
if (rankA > rankB) { return 1; }
else if (rankA < rankB) { return -1; }
else { return 0; }
}
}
public static class MyComparerExtensions
{
public static int getRankedResults(this object o, string[] needle)
{
int result = 0;
if (o.GetType() == typeof(StaticPage))
{
var orig = o as StaticPage;
result = needle.countOccurences(orig.PageTitle.StripHTML(), orig.BodyCopy.StripHTML());
}
else if (o.GetType() == typeof(CustomerNewsArticle))
{
var orig = o as CustomerNewsArticle;
result = needle.countOccurences(orig.Title.StripHTML(), orig.BodyCopy.StripHTML());
}
else if (o.GetType() == typeof(Blog))
{
var orig = o as Blog;
result = needle.countOccurences(orig.Title.StripHTML(), orig.Body.StripHTML());
}
else if (o.GetType() == typeof(PressRelease))
{
var orig = o as PressRelease;
result = needle.countOccurences(orig.title.StripHTML(), orig.body.StripHTML());
}
return result;
}
//count the total occurrences of the needles inside the haystack
private static int countOccurences(this string[] needle, params string[] haystack)
{
int occurences = 0;
foreach (var n in needle)
{
foreach (var h in haystack)
{
var nh = h;
int lenghtDif = nh.Length - nh.Replace(n, "").Length;
occurences = occurences + (lenghtDif / n.Length);
}
}
return occurences;
}
}
And how it is being called:
string[] needles = new string[] { "one", "two", "three" };
Haystacks.OrderBy(o=> o, new MyComparer(needles));
So "Haystacks" contains 42 objects of different types (they are rows from different tables; inside "getRankedResults" I detect the type of the object and choose the fields used to calculate the rank value). When I call OrderBy with MyComparer, I can see the argument being set in the constructor, but "Compare(object x, object y)" is never called.
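One likely cause, assuming Haystacks is an ordinary IEnumerable<object> (the question does not show its type): OrderBy uses deferred execution, so the comparer only runs when the ordered sequence is enumerated, and the call shown above discards its result. A minimal sketch, with a hypothetical CountingComparer standing in for MyComparer:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for MyComparer: counts how often Compare runs.
public class CountingComparer : IComparer<object>
{
    public int Calls { get; private set; }

    public int Compare(object x, object y)
    {
        Calls++;
        // Compare by string length purely for illustration.
        return x.ToString().Length.CompareTo(y.ToString().Length);
    }
}

public static class DeferredOrderByDemo
{
    // Returns the comparer call counts before and after enumeration.
    public static (int before, int after) Run()
    {
        var haystacks = new List<object> { "three", "a", "bb" };
        var comparer = new CountingComparer();

        // Building the query does not sort anything yet.
        var ordered = haystacks.OrderBy(o => o, comparer);
        int before = comparer.Calls;

        // Enumerating (ToList, foreach, ...) triggers the sort and the comparer.
        var sorted = ordered.ToList();
        int after = comparer.Calls;

        return (before, after);
    }
}
```

Here before is 0 and after is greater than 0. If the same thing is happening in the question, keeping and materializing the result, for example var sorted = Haystacks.OrderBy(o => o, new MyComparer(needles)).ToList();, should make Compare run.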

Related

getting a performance hit for nested for loop in C#

I have an array of 100k strings. I need to check whether a string has occurred previously; if it has, all I have to do is return that string. I wrote the code using nested for loops, but the performance is bad: it takes 1.9 minutes to process a string[100K], whereas I expect it to finish within a couple of seconds.
My concern is not the logic but the loop. How can I make the loop more efficient?
public string StartMatchingProcess(string[]inputArray)
{
string[] stringArray = inputArray;
string result = string.Empty;
for (long i = 0; i < stringArray.Length; i++)
{
for (long j = 0; j <= i; j++)
{
if(i == j) break;
if (IsPrefix(stringArray[i], stringArray[j]))
{
return stringArray[i];
}
}
}
Console.WriteLine("GOOD SET");
return result;
}
public bool IsPrefix(string string1,string string2)
{
if (AreString1ValuesValid(string1, string2))
{
if (string1 == string2.Substring(0, string1.Length))
{
Console.WriteLine("BAD SET");
Console.WriteLine(string1);
return true;
}
}
else if (AreString2ValuesValid(string1, string2))
{
if (string2 == string1.Substring(0, string2.Length))
{
Console.WriteLine("BAD SET");
Console.WriteLine(string1);
return true;
}
}
return false;
}
public bool AreString1ValuesValid(string string1, string string2)
=> string1.Length <= string2.Length;
public bool AreString2ValuesValid(string string1, string string2)
=> string2.Length <= string1.Length;

Sort the initial array, and you can check neighbors only:
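// Requires: using System; using System.Linq;
public string StartMatchingProcess(string[] inputArray) {
    if (null == inputArray)
        throw new ArgumentNullException(nameof(inputArray));
    // With ordinal ordering, a string that is a prefix of another is immediately
    // followed by a string that starts with it, so checking neighbors is enough.
    string[] sorted = inputArray.OrderBy(item => item, StringComparer.Ordinal).ToArray();
    for (int i = 1; i < sorted.Length; ++i) {
        string prior = sorted[i - 1];
        string current = sorted[i];
        // Ordinal comparison keeps the prefix test consistent with the sort order.
        if (current.StartsWith(prior, StringComparison.Ordinal))
            return prior;
    }
    return "";
}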
This gives O(n * log(n)) time complexity, versus O(n**2) for the initial solution.

It's a really bad idea to use nested loops for this task: the approach has O(n*n) complexity, and in the worst case a 100k array needs roughly n*n/2 = 5.000.000.000 calls to Substring().
There are data structures designed specifically for strings. For example, you can use a trie:
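public string StartMatchingProcess(string[] inputArray)
{
    // Trie is assumed to be a user-supplied class. HasPrefix(w) is taken to mean
    // "w is in a prefix relation with a stored word" and HasWord(w) "w is already stored".
    var trie = new Trie();
    foreach (var w in inputArray)
    {
        // Check before inserting; otherwise every word would match itself.
        if (trie.HasPrefix(w) || trie.HasWord(w))
            return w;
        trie.AddWord(w);
    }
    return string.Empty;
}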
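The Trie class used above is not defined in the answer. As a rough, self-contained sketch of the idea, the hypothetical PrefixTrie below (its name and its AddAndCheckConflict method are assumptions, not a library API) inserts each word once and reports whether it is in a prefix relation with a word inserted earlier:

```csharp
using System;
using System.Collections.Generic;

// Hypothetical minimal trie; not a library type.
public class PrefixTrie
{
    private sealed class TrieNode
    {
        public Dictionary<char, TrieNode> Children { get; } = new Dictionary<char, TrieNode>();
        public bool IsWordEnd { get; set; }
    }

    private readonly TrieNode _root = new TrieNode();
    private int _count;

    // Inserts word and returns true if it conflicts with a word already stored:
    // a stored word is a prefix of it, or it is a prefix of (or equal to) a stored word.
    public bool AddAndCheckConflict(string word)
    {
        var node = _root;
        bool conflict = node.IsWordEnd;   // a stored empty string is a prefix of everything
        bool createdNode = false;

        foreach (char c in word)
        {
            if (!node.Children.TryGetValue(c, out var next))
            {
                next = new TrieNode();
                node.Children[c] = next;
                createdNode = true;
            }
            node = next;
            // A stored word ends here, so it is a prefix of (or equal to) the new word.
            if (node.IsWordEnd) conflict = true;
        }

        // No node was created: the whole word lies on an existing path,
        // so it is a prefix of (or equal to) a stored word.
        if (!createdNode && _count > 0) conflict = true;

        node.IsWordEnd = true;
        _count++;
        return conflict;
    }
}

public static class PrefixSearch
{
    // Returns the first word that conflicts with an earlier one, or "" if none does.
    public static string FindFirstConflict(string[] inputArray)
    {
        var trie = new PrefixTrie();
        foreach (var w in inputArray)
        {
            if (trie.AddAndCheckConflict(w))
                return w;
        }
        return string.Empty;
    }
}
```

Each word is walked once, so the total work is proportional to the number of characters in the input rather than to the square of the number of words.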

If you are just trying to determine whether your array has duplicate string values, consider using LINQ to count the occurrences.
string[] arrayTest = new string[] { "hello", "hello", "world"};
string myValue = "hello";
var stringCount = arrayTest.Where(n => n == myValue).Count();
if (stringCount > 1) return myValue;
In the above, we check to see if "hello" is in the array more than once, and if it is, we return it.
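The snippet above checks a single known value. If the duplicated value is not known in advance, one possible approach is to track the values seen so far in a HashSet; the DuplicateFinder class and FirstDuplicate method below are hypothetical names used only for this sketch:

```csharp
using System;
using System.Collections.Generic;

public static class DuplicateFinder
{
    // Returns the first value seen a second time, or null if all values are distinct.
    public static string FirstDuplicate(IEnumerable<string> values)
    {
        var seen = new HashSet<string>(StringComparer.Ordinal);
        foreach (var value in values)
        {
            // HashSet.Add returns false when the value is already present.
            if (!seen.Add(value))
                return value;
        }
        return null;
    }
}
```

This makes a single pass over the array, and it only detects exact duplicates, not prefixes.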

Here is a complete solution using LINQ.
public class Node
{
public char letter { get; }
public int Index { get; set; }
public List<Node> ChildList { get; set; } = new List<Node>();
public Node(char item, int index)
{
Index = index;
letter = item;
}
}
public class NoPrefixSet
{
public Dictionary<char, Node> ParentNode { get; set; } = new Dictionary<char, Node>();
public string GenerateNodes(string[] inputArray)
{
for (int i = 0; i < inputArray.Length; i++)
{
if (IsWordPrefix(inputArray[i]))
{
Console.WriteLine("BAD SET");
Console.WriteLine(inputArray[i]);
return inputArray[i];
}
}
Console.WriteLine("Good Set");
return "Good Set";
}
private void InsertNodeInParent(char item)
=> ParentNode.Add(item, new Node(item, 0));
private bool IsWordPrefix(string word)
{
//Check parent
Node parentNode = null;
bool hasNotInserted = false;
int similarCounter = 0;
if (!ParentNode.Any(a => a.Key == word[0]))
{
InsertNodeInParent(word[0]);
}
parentNode = ParentNode.Where(a => a.Key == word[0]).FirstOrDefault().Value;
for (int letterIndex = 0; letterIndex < word.Length; letterIndex++)
{
if (!parentNode.ChildList.Any(a => a.letter == word[letterIndex]))
{
parentNode.ChildList.Add(new Node(word[letterIndex], letterIndex));
}
else
{
if (!parentNode.ChildList.Where(a => a.letter == word[letterIndex]).First().ChildList.Any() || word.Length == letterIndex+1)
{
if (similarCounter == letterIndex)
return hasNotInserted = true;
}
similarCounter++;
}
parentNode = parentNode.ChildList.Where(a => a.letter == word[letterIndex] && a.Index == letterIndex).First();
}
return hasNotInserted;
}
public void ReadInput()
{
long data = Convert.ToInt64(Console.ReadLine());
string[] stringArray = new string[data];
for (long i = 0; i < data; i++)
{
stringArray[i] = Console.ReadLine();
}
GenerateNodes(stringArray);
}
}

how to multiply and divide in a static stack?

This is the static array-based stack I have been given for making an RPN calculator. With this code the RPN calculator adds and subtracts. Now I need to extend it to multiply and divide, but I don't know how.
public class IntStack
{
private const int maxsize = 10;
private int top = 0;
private int[] array = new int[maxsize];
public void Push(int value)
{
array[top++] = value;
}
public int Pop()
{
return array[--top];
}
public int Peek()
{
return array[top - 1];
}
public bool IsEmpty()
{
return top == 0;
}
public bool IsFull()
{
return top == maxsize;
}
public string Print()
{
StringBuilder output = new StringBuilder();
for (int i = top - 1; i >= 0; i--)
output.Append(array[i] + Environment.NewLine);
return output.ToString();
}
}

Here are some methods you can add to your IntStack class that will perform the multiply and division operations. I've added minimal error checking.
public void Multiply()
{
    // array.Length is always maxsize; top is the number of values actually stored.
    if (top < 2)
        return;
    var factor2 = Pop();
    var factor1 = Pop();
    Push(factor1 * factor2);
}
public void Divide()
{
    if (top < 2)
        return;
    // In RPN the right-hand operand (the divisor) was pushed last, so it is popped first.
    var divisor = Pop();
    var numerator = Pop();
    if (divisor == 0) { // Return stack back to original state.
        Push(numerator);
        Push(divisor);
        return;
    }
    Push(numerator / divisor);
}
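To show how such operations fit into an evaluator, here is a small sketch that uses the built-in Stack<int> instead of IntStack so that it stands on its own. The RpnCalculator class, its Evaluate method, and the space-separated token format are assumptions made for the example, not part of the original calculator:

```csharp
using System;
using System.Collections.Generic;

public static class RpnCalculator
{
    // Evaluates a space-separated RPN expression such as "6 2 / 3 *".
    public static int Evaluate(string expression)
    {
        var stack = new Stack<int>();
        foreach (var token in expression.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries))
        {
            if (int.TryParse(token, out int number))
            {
                stack.Push(number);
                continue;
            }

            if (stack.Count < 2)
                throw new InvalidOperationException("Operator '" + token + "' needs two operands.");

            // The right-hand operand was pushed last, so it is popped first.
            int right = stack.Pop();
            int left = stack.Pop();

            switch (token)
            {
                case "+": stack.Push(left + right); break;
                case "-": stack.Push(left - right); break;
                case "*": stack.Push(left * right); break;
                case "/":
                    if (right == 0) throw new DivideByZeroException();
                    stack.Push(left / right);
                    break;
                default:
                    throw new FormatException("Unknown token: " + token);
            }
        }

        if (stack.Count != 1)
            throw new FormatException("Expression did not reduce to a single value.");
        return stack.Pop();
    }
}
```

For example, RpnCalculator.Evaluate("6 2 /") divides 6 by 2; the pop order matters for subtraction and division but not for addition and multiplication.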

Getting the longest string using properties

I have the following code:
public string Longest
{
get
{
int min = int.MinValue;
string longest = "";
for (Node i = Head; i != null; i = i.Next)
{
if (i.Text.Length > min)
{
longest = i.Text.Length.ToString();
}
return longest;
}
return longest;
}
}
The problem is I have those strings:
List text = new List();
text.Add("Petar");
text.Add("AHS");
text.Add("Google");
text.Add("Me");
When I try out the property, it says that the longest string is 5, but that's not true: the longest string has six characters. I've tried to find the problem, but I couldn't.

Your code has several problems:
A string's length is never less than 0, so you don't need to use int.MinValue
You are returning on the first iteration
You are not updating min after finding a longer value
You are returning the length of the string, not the string itself
Your code should look like this:
public string Longest
{
get
{
int longestLength = 0;
string longestWord = string.Empty;
for (Node i = Head; i != null; i = i.Next)
{
if (i.Text.Length > longestLength)
{
longestLength = i.Text.Length;
longestWord = i.Text;
}
}
return longestWord;
}
}
If what you want to return is the maximum length instead of the word with the maximum length, your property is both wrongly named and typed, and it should look like this instead:
public int MaximumLength
{
get
{
int maximumLength = 0;
for (Node i = Head; i != null; i = i.Next)
{
if (i.Text.Length > maximumLength)
{
maximumLength = i.Text.Length;
}
}
return maximumLength;
}
}

If you have an IEnumerable<string>, then do the following:
var list = new List<string>();
list.Add("AAA");
list.Add("AAAAA");
list.Add("A");
list.Add("AAAA");
list.Add("AAAAAA");
list.Add("AA");
// max has the longest string
var max = list.Aggregate(string.Empty,
(bookmark, item) => item.Length>bookmark.Length ? item : bookmark);
or using a loop
string max = string.Empty;
int length=0;
foreach(var item in list)
{
if(item.Length>length)
{
max = item;
length = item.Length;
}
}
But it appears you have a linked list which I recreated as a skeleton below:
public class Node
{
public Node(string text)
{
this.Text = text;
this.Head = this;
}
public Node(Node parent, string text): this(text)
{
if(parent!=null)
{
parent.Next = this;
this.Head = parent.Head;
}
}
public Node Head { get; }
public Node Next { get; set; }
public string Text { get; }
public Node Add(string text) => new Node(this, text);
}
and finding the longest string with a loop is
var list = new Node("AAA");
list = list.Add("AAAAA");
list = list.Add("A");
list = list.Add("AAAA");
list = list.Add("AAAAAA");
list = list.Add("AA");
string max = list.Text;
int length = max.Length;
for(Node node = list.Head; node != null; node = node.Next)
{
if(node.Text.Length > length)
{
max = node.Text;
length= node.Text.Length;
}
}
// max has the longest string
Edit 1
I took the linked list and made it IEnumerable<string> by moving your loop code into a method:
public class Node : IEnumerable<string>
{
public Node(string text)
{
this.Text = text;
this.Head = this;
}
public Node(Node parent, string text) : this(text)
{
if(parent!=null)
{
parent.Next = this;
this.Head = parent.Head;
}
}
public Node Head { get; }
public Node Next { get; set; }
public string Text { get; }
public Node Add(string text) => new Node(this, text);
public IEnumerator<string> GetEnumerator()
{
// Loop through the list, starting from head to end
for(Node node = Head; node != null; node = node.Next)
{
yield return node.Text;
}
}
IEnumerator IEnumerable.GetEnumerator() => GetEnumerator();
}
and now I can use a single LINQ statement
var list = new Node("AAA");
list = list.Add("AAAAA");
list = list.Add("A");
list = list.Add("AAAA");
list = list.Add("AAAAAA");
list = list.Add("AA");
// max has the longest string
var max = list.Aggregate(string.Empty,
(bookmark, item) => item.Length>bookmark.Length ? item : bookmark);
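As an aside, on .NET 6 or later the same query can be written with Enumerable.MaxBy, which was added to LINQ in that release. A minimal sketch, assuming a non-empty sequence; the LongestFinder class is a hypothetical wrapper for the example:

```csharp
using System.Collections.Generic;
using System.Linq;

public static class LongestFinder
{
    // Requires .NET 6 or later, where Enumerable.MaxBy was introduced.
    public static string Longest(IEnumerable<string> items)
        => items.MaxBy(item => item.Length);
}
```

Because Node implements IEnumerable<string> above, list.MaxBy(item => item.Length) should also work directly on the linked list.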

nested loops to IDataReader

I have a program that writes a huge DataTable (2.000.000 to 70.000.000 rows, depending on the configuration) to a database using SqlBulkCopy.
I decided to change the loop that populates this table into an IDataReader, because the number of rows often causes an OutOfMemoryException.
The table is populated like this:
// int[] firsts;
// string[] seconds;
// byte[] thirds;
var table = new DataTable();
foreach(var f in firsts)
{
foreach(var s in seconds)
{
foreach(var t in thirds)
{
var row = table.NewRow();
row[0] = f;
row[1] = s;
row[2] = t;
table.Rows.Add(row);
}
}
// here I also bulk load the table and clear it
}
So in my IDataReader class I will loop by index. This is my attempt:
class TableReader : IDataReader
{
bool Eof = false;
int FirstIndex;
int SecondIndex;
int ThirdIndex;
//those are populated via constructor
int[] firsts;
string[] seconds;
byte[] thirds;
// this will be retrieved automatically via indexer
object[] Values;
public bool Read()
{
if(ThirdIndex != thirds.Length
&& SecondIndex < seconds.Length
&& FirstIndex < firsts.Length)
{
Values[0] = firsts[FirstIndex];
Values[1] = seconds[SecondIndex];
Values[2] = thirds[ThirdIndex++];
}
else if(SecondIndex != seconds.Length)
{
ThirdIndex = 0;
SecondIndex++;
}
else if(FirstIndex != firsts.Length)
{
SecondIndex = 0;
FirstIndex++;
}
else
{
Eof = true;
}
return !Eof;
}
}
I first wrote this using a while(true) loop with a break instead of the Eof flag, but I can't figure out how to make it work.
Can anyone help?

This is actually possible if you implement IDataReader and use the "yield return" statement to provide rows. IDataReader is a bit of a pain to implement, but it isn't complex at all. The code below can be adapted to load a terabyte's worth of data into the database without running out of memory.
I replaced the DataRow objects with a single object array that is reused throughout the data read.
Because there's no DataTable object to represent the columns, I had to do this myself by storing the data types and column names separately.
class TestDataReader : IDataReader {
int[] firsts = { 1, 2, 3, 4 };
string[] seconds = { "abc", "def", "ghi" };
byte[] thirds = { 0x30, 0x31, 0x32 };
// The data types of each column.
Type[] dataTypes = { typeof(int), typeof(string), typeof(byte) };
// The names of each column.
string[] names = { "firsts", "seconds", "thirds" };
// This function uses coroutines to turn the "push" approach into a "pull" approach.
private IEnumerable<object[]> GetRows() {
// Just re-use the same array.
object[] row = new object[3];
foreach (var f in firsts) {
foreach (var s in seconds) {
foreach (var t in thirds) {
row[0] = f;
row[1] = s;
row[2] = t;
yield return row;
}
}
// here the original loop bulk loaded the table and cleared it
}
}
// Everything below basically wraps this.
IEnumerator<object[]> rowProvider;
public TestDataReader() {
rowProvider = GetRows().GetEnumerator();
}
public object this[int i] {
get {
return GetValue(i);
}
}
public object this[string name] {
get {
return GetValue(GetOrdinal(name));
}
}
public int Depth { get { return 0; } }
public int FieldCount { get { return dataTypes.Length; } }
public bool IsClosed { get { return false; } }
public int RecordsAffected { get { return 0; } }
// These don't really do anything.
public void Close() { Dispose(); }
public void Dispose() { rowProvider.Dispose(); }
public string GetDataTypeName(int i) { return dataTypes[i].Name; }
public Type GetFieldType(int i) { return dataTypes[i]; }
// These functions get basic data types.
public bool GetBoolean(int i) { return (bool) rowProvider.Current[i]; }
public byte GetByte(int i) { return (byte) rowProvider.Current[i]; }
public char GetChar(int i) { return (char) rowProvider.Current[i]; }
public DateTime GetDateTime(int i) { return (DateTime) rowProvider.Current[i]; }
public decimal GetDecimal(int i) { return (decimal) rowProvider.Current[i]; }
public double GetDouble(int i) { return (double) rowProvider.Current[i]; }
public float GetFloat(int i) { return (float) rowProvider.Current[i]; }
public Guid GetGuid(int i) { return (Guid) rowProvider.Current[i]; }
public short GetInt16(int i) { return (short) rowProvider.Current[i]; }
public int GetInt32(int i) { return (int) rowProvider.Current[i]; }
public long GetInt64(int i) { return (long) rowProvider.Current[i]; }
public string GetString(int i) { return (string) rowProvider.Current[i]; }
public object GetValue(int i) { return (object) rowProvider.Current[i]; }
public string GetName(int i) { return names[i]; }
public bool IsDBNull(int i) {
object obj = rowProvider.Current[i];
return obj == null || obj is DBNull;
}
// Looks up a field number given its name.
public int GetOrdinal(string name) {
return Array.FindIndex(names, x => x.Equals(name, StringComparison.OrdinalIgnoreCase));
}
// Populate "values" given the current row of data.
public int GetValues(object[] values) {
if (values == null) {
return 0;
} else {
int len = Math.Min(values.Length, rowProvider.Current.Length);
Array.Copy(rowProvider.Current, values, len);
return len;
}
}
// This reader only supports a single result set.
public bool NextResult() {
return false;
}
// Move to the next row.
public bool Read() {
return rowProvider.MoveNext();
}
// Don't bother implementing these in any meaningful way.
public long GetBytes(int i, long fieldOffset, byte[] buffer, int bufferoffset, int length) {
throw new NotImplementedException();
}
public long GetChars(int i, long fieldoffset, char[] buffer, int bufferoffset, int length) {
throw new NotImplementedException();
}
public IDataReader GetData(int i) {
throw new NotImplementedException();
}
public DataTable GetSchemaTable() {
return null;
}
}
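If you would rather keep explicit indices, as in the question's attempt, the three loops can be advanced like an odometer: increment the innermost index and carry into the outer ones when it overflows. The sketch below shows only that index logic in a hypothetical TripleCursor class (not a full IDataReader); its names are assumptions made for the example:

```csharp
using System;

// Hypothetical cursor class showing only the index logic, not a full IDataReader.
public class TripleCursor
{
    private readonly int[] _firsts;
    private readonly string[] _seconds;
    private readonly byte[] _thirds;
    private int _f = 0;
    private int _s = 0;
    private int _t = -1;   // starts before the first element

    public object[] Values { get; } = new object[3];

    public TripleCursor(int[] firsts, string[] seconds, byte[] thirds)
    {
        _firsts = firsts;
        _seconds = seconds;
        _thirds = thirds;
    }

    // Advances to the next (first, second, third) combination; false when exhausted.
    public bool Read()
    {
        if (_firsts.Length == 0 || _seconds.Length == 0 || _thirds.Length == 0)
            return false;
        if (_f >= _firsts.Length)
            return false;   // already exhausted

        // Advance the innermost index and carry into the outer ones on overflow.
        if (++_t == _thirds.Length)
        {
            _t = 0;
            if (++_s == _seconds.Length)
            {
                _s = 0;
                if (++_f == _firsts.Length)
                    return false;
            }
        }

        Values[0] = _firsts[_f];
        Values[1] = _seconds[_s];
        Values[2] = _thirds[_t];
        return true;
    }
}
```

Unlike the attempt in the question, every call that returns true has filled Values with a new row, so no row is emitted twice when an index wraps around.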

How to extract properties used in a Expression<Func<T, TResult>> query and test their value?

I need to create a function to evaluate queries for some rules before executing them. Here's the code:
public class DataInfo
{
public int A { get; set; }
public int B { get; set; }
public int C { get; set; }
}
static class Program
{
static void Main()
{
var data = new DataInfo()
{
A = 10,
B = 5,
C = -1
};
// the result should be 0, because C is -1
int result = Calcul<DataInfo>(data, x => x.A / x.B + x.C);
}
static int Calcul<T>(T data, Expression<Func<T, int>> query)
{
// PSEUDO CODE
// if one property used in the query have a
// value of -1 or -2 then return 0
// {
// return 0;
// }
// if one property used in the query have a
// value of 0 AND it is used on the right side of
// a Divide operation then return -1
// {
// return -1;
// }
// if the query respect the rules, apply the query and return the value
return query.Compile().Invoke(data);
}
}
In the previous code, Calcul wants to divide A (10) by B (5) and then add C (-1). The rules say that if any property used in the query has a value of -1 or -2, the function returns 0, so in this example the returned value should be 0. If the query respects the rules, the query is applied to the data and its value is returned.
So how can I extract the properties used in the query and test their values before applying the query to the data?

You need to use an ExpressionVisitor to test the property values. Here is an example of how you could implement the logic.
using System;
using System.Linq.Expressions;
using System.Reflection;
namespace WindowsFormsApplication1
{
static class Program
{
[STAThread]
static void Main()
{
// HasDivideByZero - the result should be -1
int result1 = Calcul<DataInfo>(new DataInfo { A = 10, B = 0, C = 1 }, x => x.A / x.B + x.C);
// HasNegative - the result should be 0
int result2 = Calcul<DataInfo>(new DataInfo { A = 10, B = 5, C = -1 }, x => x.A / x.B + x.C);
// the result should be 3
int result3 = Calcul<DataInfo>(new DataInfo { A = 10, B = 5, C = 1 }, x => x.A / x.B + x.C);
}
static int Calcul<T>(T data, Expression<Func<T, int>> query)
{
if (NegativeValueChecker<T>.HasNegative(data, query))
{
return 0;
}
if (DivideByZeroChecker<T>.HasDivideByZero(data, query))
{
return -1;
}
return query.Compile().Invoke(data);
}
}
class DivideByZeroChecker<T> : ExpressionVisitor
{
private readonly T _data;
private bool _hasDivideByZero;
public static bool HasDivideByZero(T data, Expression expression)
{
var visitor = new DivideByZeroChecker<T>(data);
visitor.Visit(expression);
return visitor._hasDivideByZero;
}
public DivideByZeroChecker(T data)
{
this._data = data;
}
protected override Expression VisitBinary(BinaryExpression node)
{
    // Only inspect divisions whose right-hand side is a plain property access;
    // anything else (constants, nested expressions) is skipped instead of failing on a cast.
    if (!this._hasDivideByZero && node.NodeType == ExpressionType.Divide
        && node.Right is MemberExpression rightMemberExpression
        && rightMemberExpression.Member is PropertyInfo propertyInfo)
    {
        var value = Convert.ToInt32(propertyInfo.GetValue(this._data, null));
        this._hasDivideByZero = value == 0;
    }
    return base.VisitBinary(node);
}
}
class NegativeValueChecker<T> : ExpressionVisitor
{
private readonly T _data;
private bool _hasNegative;
public static bool HasNegative(T data, Expression expression)
{
var visitor = new NegativeValueChecker<T>(data);
visitor.Visit(expression);
return visitor._hasNegative;
}
public NegativeValueChecker(T data)
{
this._data = data;
}
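protected override Expression VisitMember(MemberExpression node)
{
    // Skip members that are not properties (for example fields) instead of failing on a cast.
    if (!this._hasNegative && node.Member is PropertyInfo propertyInfo)
    {
        var value = Convert.ToInt32(propertyInfo.GetValue(this._data, null));
        // The question's rule names -1 and -2; this check is broader and flags any negative value.
        this._hasNegative = value < 0;
    }
    return base.VisitMember(node);
}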
}
class DataInfo
{
public int A { get; set; }
public int B { get; set; }
public int C { get; set; }
}
}
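If you only need the list of properties a query reads, the same visitor technique can collect them in one pass. The sketch below is one possible shape; PropertyCollector and PropertyInspector are hypothetical names, and it assumes the properties are read directly from the lambda parameter (x.A, not x.Inner.A):

```csharp
using System;
using System.Collections.Generic;
using System.Linq.Expressions;
using System.Reflection;

// Hypothetical helper: collects the properties read directly from the lambda parameter.
public class PropertyCollector : ExpressionVisitor
{
    private readonly List<PropertyInfo> _properties = new List<PropertyInfo>();

    public static IReadOnlyList<PropertyInfo> Collect(Expression expression)
    {
        var collector = new PropertyCollector();
        collector.Visit(expression);
        return collector._properties;
    }

    protected override Expression VisitMember(MemberExpression node)
    {
        // Keep only property accesses made directly on the parameter, once each.
        if (node.Member is PropertyInfo property
            && node.Expression is ParameterExpression
            && !_properties.Contains(property))
        {
            _properties.Add(property);
        }
        return base.VisitMember(node);
    }
}

public static class PropertyInspector
{
    // Maps each property used in the query to its current value on data.
    public static Dictionary<string, object> UsedValues<T>(T data, Expression<Func<T, int>> query)
    {
        var result = new Dictionary<string, object>();
        foreach (var property in PropertyCollector.Collect(query))
            result[property.Name] = property.GetValue(data);
        return result;
    }
}
```

With the DataInfo class above, PropertyInspector.UsedValues(data, x => x.A / x.B + x.C) would yield entries for A, B and C, which can then be tested against the rules before compiling the query.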

Take a look at the source of Moq: http://code.google.com/p/moq/.
