C# Efficient de-duplication of single Datatable column's data - c#

I have a DataTable with some data (example below) and need to de-duplicate any names in the Name column by appending [1], [2], etc.
The current code below works but is slow on large tables.
Any tips on the most efficient way of doing this in C# would be appreciated.
Current Table sample:
- ID Name X Y
- 1 John 45 66
- 2 Paul 44 66
- 3 George 88 102
- 4 John 33 90
- 5 John 53 37
- 6 Paul 97 65
- 7 Ringo 01 87
- 8 Ringo 76 65
Required Table sample:
- ID Name X Y
- 1 John[1] 45 66
- 2 Paul[1] 44 66
- 3 George 88 102
- 4 John[2] 33 90
- 5 John[3] 53 37
- 6 Paul[2] 97 65
- 7 Ringo[1] 01 87
- 8 Ringo[2] 76 65
Current code below:
foreach (DataRow aRow in ds.Tables[0].Rows) // run through all
{
string aName = aRow["Name"].ToString();
DataRow[] FoundRows = ds.Tables[0].Select("Name = '" + aName +"'"); // Find all rows with same name
if (FoundRows.Length > 1) // As will always find itself
{
int i = 1;
foreach (DataRow row in FoundRows)
{
row["Name"] = row["Name"].ToString() + "[" + i + "]";
i++;
}
ds.Tables[0].AcceptChanges(); // Ensure the rows are updated before looping around.
}
}

Here is one approach
DataTable table = new DataTable();
//test data
table.Columns.Add("Name");
table.Columns.Add("X", typeof(int));
table.Rows.Add(new object[] { "john", 10 });
table.Rows.Add(new object[] { "paul", 44 });
table.Rows.Add(new object[] { "ringo", 312 });
table.Rows.Add(new object[] { "george", 30 });
table.Rows.Add(new object[] { "john", 100 });
table.Rows.Add(new object[] { "paul", 443 });
//converting DataTable to enumerable collection of rows and then grouping by name,
//skipping groups with only one row(such as george or ringo)
var groupedData = table.AsEnumerable().GroupBy(row => row[0].ToString()).Where(g => g.Count() > 1);
//iterate through each group of <string, DataRow>
foreach (var group in groupedData)
{
int counter = 1; //counter for "[x]" suffix
//iterate through all rows under one name, eg. John
foreach (var groupedItem in group)
{
//add [x]
groupedItem[0] = string.Format("{0}[{1}]", group.Key, counter);
counter++;
}
}
EDIT: simplified code and made it a bit more efficient, as suggested by AdrianWragg

A good old for loop that updates the whole table in one pass will probably be the fastest approach:
var foundNames = new Dictionary<string, int>();
for (int rowInd = 0; rowInd < dataTable.Rows.Count; rowInd++)
{
// If name is not yet found in foundNames, then store its row
// index. Don't update the dataTable yet -- this is the only
// occurrence so far.
// The index is stored inverted to distinguish from count.
//
// If name is found in foundNames, retrieve the count.
// If count is inverted (non-positive), then we've encountered
// the name second time. In this case update the row with the
// first occurrence and the current row too. Store the count of 2.
//
// If count is positive, then it's third or even later occurrence.
// Update the current row only and store the incremented count.
var name = dataTable.Rows[rowInd]["Name"].ToString();
int count;
if (!foundNames.TryGetValue(name, out count))
foundNames.Add(name, -rowInd);
else
{
if (count <= 0)
{
dataTable.Rows[-count]["Name"] = name + "[1]";
count = 1;
}
count++;
dataTable.Rows[rowInd]["Name"] = name + "[" + count + "]";
foundNames[name] = count;
}
}
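The same idea can also be written as two plain passes, which trades one extra scan of the table for simpler bookkeeping. This is only a minimal sketch: `NameDeduplicator` and `AppendSuffixes` are hypothetical names, and it assumes the column holds values whose `ToString()` is the name to compare.

```csharp
using System;
using System.Collections.Generic;
using System.Data;

public static class NameDeduplicator
{
    // Appends [1], [2], ... to every value that occurs more than once in the
    // given column; values that occur only once are left untouched.
    public static void AppendSuffixes(DataTable table, string columnName)
    {
        // Pass 1: count how often each name occurs.
        var totals = new Dictionary<string, int>();
        foreach (DataRow row in table.Rows)
        {
            string name = row[columnName].ToString();
            totals.TryGetValue(name, out int seen); // seen is 0 when the name is new
            totals[name] = seen + 1;
        }

        // Pass 2: number the rows whose name occurs at least twice.
        var used = new Dictionary<string, int>();
        foreach (DataRow row in table.Rows)
        {
            string name = row[columnName].ToString();
            if (totals[name] < 2) continue;
            used.TryGetValue(name, out int last);
            used[name] = last + 1;
            row[columnName] = name + "[" + (last + 1) + "]";
        }
    }
}
```

Both passes do one dictionary lookup per row, so the cost grows linearly with the row count instead of running a Select for every row.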

Related

Join keys in key-value pair list with close value c#

I have a list of key-value pairs of <string, int>. I want to merge the keys that have close values (within ±3 of each other) into a new string and add each new string to a list.
Here are the keys and values of my list:
Luger: 9
Burger: 9
Le: 21
Pigeon: 21
Burger: 21
Hamburger: 25
Double: 30
Animal: 31
Style: 31
The: 43
Original: 43
Burger: 44
Here's the output that I want to achieve:
Luger Burger
Le Pigeon Burger
Hamburger
Double Animal Style
The Original Burger
To achieve this, I first created a list containing these key-value pairs. I then iterated through each item, tried to find close values, merged them into a new key-value pair, and deleted the matched index. But that doesn't work properly. This is the code so far:
for (int i = 0; i < wordslist.Count; i++)
{
for (int j = 0; j < wordslist.Count; j++)
{
if (wordslist[i].Value <= wordslist[j].Value + 3 && wordslist[i].Value >= wordslist[j].Value - 3)
{
wordslist.Add(
new KeyValuePair<string, int>(wordslist[i].Key + " " + wordslist[j].Key, wordslist[i].Value)
);
wordslist.RemoveAt(j);
}
}
wordslist.RemoveAt(i);
}
This doesn't work and produces repetitive results, as below:
Pigeon: 21
Style: 30
Burger: 30
Double Double Animal: 30
Burger Burger: 31
Original Original The The Original Burger Original Burger: 42
Is there an algorithm that can iterate through these items, construct a string by merging the keys that have close values, and add each merged string to a list?
You can simplify this logic:
public IEnumerable<string> GetPlusOrMinus3(Dictionary<string, int> fullList, int checkNumber)
{
return fullList.Where(w => checkNumber <= w.Value + 3
&& checkNumber >= w.Value - 3)
.Select(s => $"{s.Key}: {s.Value}" );
}
The string format isn't perfect for you, but the logic should hold.
And in use you could do something like:
var forOne = GetPlusOrMinus3(values, 1);
var resultString = String.Join(", ", forOne);
Console.WriteLine(resultString);
Which would write out:
one: 1, two: 2, four: 4
And to loop through everything:
foreach(var entryValue in values.Values)
{
Console.WriteLine(String.Join(", ", GetPlusOrMinus3(values, entryValue)));
}
Or to loop through everything without reusing any results:
var matchedNumbers = new List<int>();
foreach(var entryValue in values.Values)
{
var matchResults = values.Where(w => entryValue <= w.Value + 3 && entryValue >= w.Value - 3
&& !matchedNumbers.Contains(w.Value)).ToDictionary(x => x.Key, x => x.Value);
if (matchResults.Any())
{
matchedNumbers.AddRange(matchResults.Select(s => s.Value).ToList());
Console.WriteLine(String.Join(", ",
GetPlusOrMinus3(matchResults, entryValue)));
}
}
Logs:
one: 1, two: 2, four: 4
twelve: 12, 10: 10, eleven: 11
six: 6
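For the list in the question, which contains repeated keys such as Burger and is already ordered by value, a single pass over a list of pairs may be closer to the requested output. The following is a rough sketch under those assumptions: `CloseValueMerger` and `Merge` are hypothetical names, and "close" is taken to mean within `tolerance` of the first value in the current group.

```csharp
using System;
using System.Collections.Generic;

public static class CloseValueMerger
{
    // Joins the keys of consecutive pairs whose values lie within `tolerance`
    // of the first value in the current group. Assumes the input is already
    // ordered by value, as in the question.
    public static List<string> Merge(IEnumerable<KeyValuePair<string, int>> words, int tolerance)
    {
        var result = new List<string>();
        var current = new List<string>();
        int groupStart = 0;
        foreach (var word in words)
        {
            // Too far from the start of the group: emit the group and start a new one.
            if (current.Count > 0 && Math.Abs(word.Value - groupStart) > tolerance)
            {
                result.Add(string.Join(" ", current));
                current.Clear();
            }
            if (current.Count == 0) groupStart = word.Value;
            current.Add(word.Key);
        }
        if (current.Count > 0) result.Add(string.Join(" ", current));
        return result;
    }
}
```

Because nothing is added to or removed from the list while it is being iterated, the repeated and skipped entries produced by the nested loops in the question cannot occur.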

How to access the string array elements for search?

I have a text file that consists of numbers, and I read it into a string array.
One line of my text file looks like this:
2 3 9 14 23 26 34 36 39 40 52 55 59 63 67 76 85 86 90 93 99 108 114:275:5 8 1 14 10 6 10 18 12 25 7 40 1 30 18 8 2 1 5 21 10 2 21
Every line is stored in one element of the string array.
Now how can I access the array elements as int values, and search and calculate over the whole array?
This is my array:
string [] lines = File.ReadAllLines(txtPath.Text);
For example, I want to return the indexes of the array elements that include the number 14.
This is the easiest and clearest way to solve it. I commented so you can better understand what happens in the entire program.
class Program
{
static void Main(string[] args)
{
// this is your array of strings (lines)
string[] lines = new string[1] {
"2 3 9 14 23 26 34 36 39 40 52 55 59 63 67 76 85 86 90 93 99 108 114:275:5 8 1 14 10 6 10 18 12 25 7 40 1 30 18 8 2 1 5 21 10 2 21"
};
// this dictionary contains the line index and the list of indexes containing number 14
// in that line
Dictionary<int, List<int>> dict = new Dictionary<int, List<int>>();
// iterating over lines array
for (int i = 0; i < lines.Length; i++)
{
// creating the list of indexes and the dictionary key
List<int> indexes = new List<int>();
dict.Add(i, indexes);
// splitting the line by space to get numbers
string[] lineElements = lines[i].Split(' ');
// iterating over line elements
for (int j = 0; j < lineElements.Length; j++)
{
int integerNumber;
// checking if the string lineElements[j] is a number (because there also this case 114:275:5)
if (int.TryParse(lineElements[j], out integerNumber))
{
// if it is we check if the number is 14, in that case we add that index to the indexes list
if (integerNumber == 14)
{
indexes.Add(j);
}
}
}
}
// Printing out lines and indexes:
foreach (int key in dict.Keys)
{
Console.WriteLine(string.Format("LINE KEY: {0}", key));
foreach (int index in dict[key])
{
Console.WriteLine(string.Format("INDEX ELEMENT: {0}", index));
}
Console.WriteLine("------------------");
}
Console.ReadLine();
}
}
UPDATE 1:
As you requested:
special thanks for your clear answering.if i want to do search for all of my array elements what can i do? it means instead of only
number'14' i want to print indexes of all numbers that appear in
indexes
If you want to print all the indexes you should Console.WriteLine(j), that is the index of the inner for cycle, instead of checking the number value if (integerNumber == 14).
So, this is the program:
class Program
{
static void Main(string[] args)
{
// this is your array of strings (lines)
string[] lines = new string[1] {
"2 3 9 14 23 26 34 36 39 40 52 55 59 63 67 76 85 86 90 93 99 108 114:275:5 8 1 14 10 6 10 18 12 25 7 40 1 30 18 8 2 1 5 21 10 2 21"
};
// this dictionary contains the line index and the list of indexes containing number 14
// in that line
Dictionary<int, List<int>> dict = new Dictionary<int, List<int>>();
// iterating over lines array
for (int i = 0; i < lines.Length; i++)
{
// creating the list of indexes and the dictionary key
List<int> indexes = new List<int>();
dict.Add(i, indexes);
// splitting the line by space to get numbers
string[] lineElements = lines[i].Split(' ');
// iterating over line elements
for (int j = 0; j < lineElements.Length; j++)
{
// printing all indexes of the current line
Console.WriteLine(string.Format("Element index: {0}", j));
}
}
Console.ReadLine();
}
}
UPDATE 2:
As you requested:
if i want to search my line till first " : " apper and then search next line, what can i do?
You need to break the for cycle when you are on the element with :
class Program
{
static void Main(string[] args)
{
// this is your array of strings (lines)
string[] lines = new string[1] {
"2 3 9 14 23 26 34 36 39 40 52 55 59 63 67 76 85 86 90 93 99 108 114:275:5 8 1 14 10 6 10 18 12 25 7 40 1 30 18 8 2 1 5 21 10 2 21"
};
// this dictionary contains the line index and the list of indexes containing number 14
// in that line
Dictionary<int, List<int>> dict = new Dictionary<int, List<int>>();
// iterating over lines array
for (int i = 0; i < lines.Length; i++)
{
// creating the list of indexes and the dictionary key
List<int> indexes = new List<int>();
dict.Add(i, indexes);
// splitting the line by space to get numbers
string[] lineElements = lines[i].Split(' ');
// iterating over line elements
for (int j = 0; j < lineElements.Length; j++)
{
// I'm saving the content of lineElements[j] as a string
string element = lineElements[j];
// I'm checking if the element saved as string contains the string ":"
if (element.Contains(":"))
{
// If it does, I'm breaking the cycle, and I'll continue with the next line
break;
}
int integerNumber;
// checking if the string lineElements[j] is a number (because there also this case 114:275:5)
if (int.TryParse(lineElements[j], out integerNumber))
{
// if it is we check if the number is 14, in that case we add that index to the indexes list
if (integerNumber == 14)
{
indexes.Add(j);
}
}
}
}
// Printing out lines and indexes:
foreach (int key in dict.Keys)
{
Console.WriteLine(string.Format("LINE KEY: {0}", key));
foreach (int index in dict[key])
{
Console.WriteLine(string.Format("INDEX ELEMENT: {0}", index));
}
Console.WriteLine("------------------");
}
Console.ReadLine();
}
}
As you can see, if you run this piece of code and compare it with the first version, in output you'll get only the index of the first 14 occurrence, because the second one is after the string with :.
First you must get all the content of the file as a string array:
public string[] readAllInFile(string filepath){
var lines = File.ReadAllLines(filepath);
var fileContent = string.Join(" ", lines);//join all lines of the file content in one variable
// split on ' ' and ':' so that a token like 114:275:5 becomes three separate numbers
return fileContent.Split(' ', ':');//each number in one index of the array
}
and at usage time you can do this:
var MyFileContent = readAllInFile(txtPath.Text);
int x = Convert.ToInt32(MyFileContent[2]);
IEnumerable<int> numbers = MyFileContent.Select(m => int.Parse(m));
var sumOf = numbers.Sum();
You can use LINQ to have more tools on collections.
// split on ' ' and ':' because int.Parse throws on a token like 114:275:5
var linesAsInts = lines.Select(x => x.Split(' ', ':').Select(int.Parse));
var filteredLines = linesAsInts.Where(x => x.Contains(14));
// define value delimiters.
var splitChars = new char[] { ' ', ':' };
// read lines and parse into enumerable of enumerable of ints.
var lines = File.ReadAllLines(txtPath.Text)
.Select(x => x.Split(splitChars)
.Select(int.Parse));
// search in array.
var occurences = lines
.Select((line,lineIndex) => line
.Select((integer, integerIndex) => new { integer, integerIndex })
.Where(x => x.integer == 10)
.Select(x => x.integerIndex));
// calculate all of array.
var total = lines.Sum(line => line.Sum());
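The LINQ versions above rely on int.Parse, which throws as soon as a token is not a clean integer. If the file may contain stray tokens, a loop with int.TryParse is more forgiving. This is a minimal sketch, not the method of any answer above: `LineSearch` and `LinesContaining` are hypothetical names, and both the space and the colon are assumed to be separators.

```csharp
using System;
using System.Collections.Generic;

public static class LineSearch
{
    // Returns the indexes of the lines that contain `target` as a whole number.
    // Tokens that are not valid integers are skipped instead of throwing.
    public static List<int> LinesContaining(string[] lines, int target)
    {
        var hits = new List<int>();
        char[] separators = { ' ', ':' };
        for (int i = 0; i < lines.Length; i++)
        {
            string[] tokens = lines[i].Split(separators, StringSplitOptions.RemoveEmptyEntries);
            foreach (string token in tokens)
            {
                if (int.TryParse(token, out int number) && number == target)
                {
                    hits.Add(i);
                    break; // one match is enough for this line
                }
            }
        }
        return hits;
    }
}
```

Calling `LineSearch.LinesContaining(lines, 14)` on the array from the question would give the line indexes, and the same loop can collect the element index instead of `i` if positions within a line are needed.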

Check if string contains a successive pair of numbers

I have a list and want to check whether list[i] contains the string "6 1". But this code thinks 6 13 24 31 35 contains "6 1", which is wrong.
6 13 24 31 35
1 2 3 6 1
string stringCheck = "6 1";
List<string> list = new List<string>();
list.Add("6 13 24 31 35");
list.Add("1 2 3 6 1");
for (int i=0; i<list.Count; i++)
{
if (list[i].Contains(stringCheck))
{
// this reports two matches, but only one entry in the list should match
}
}
But this code thinks 6 13 24 31 35 contains "6 1". Its false. […]
List<string> list = new List<string>();
list.Add("6 13 24 31 35");
list.Add("1 2 3 6 1");
No, it's true because you are dealing with sequences of characters, not sequences of numbers here, so your numbers get treated as characters.
If you really are working with numbers, why not reflect that in the data type chosen for your list?
// using System.Linq;
var xss = new int[][]
{
new int[] { 6, 13, 24, 31, 35 },
new int[] { 1, 2, 3, 6, 1 }
};
foreach (int[] xs in xss)
{
if (xs.Where((_, i) => i < xs.Length - 1 && xs[i] == 6 && xs[i + 1] == 1).Any())
{
// list contains a 6, followed by a 1
}
}
or if you prefer a more procedural approach:
foreach (int[] xs in xss)
{
// check every adjacent pair, so a later "6, 1" is found even if an earlier 6 is not followed by a 1
for (int i = 0; i < xs.Length - 1; i++)
{
if (xs[i] == 6 && xs[i + 1] == 1)
{
// list contains a 6, followed by a 1
break;
}
}
}
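If the data has to stay as strings, the same adjacent-pair check can be done on whole tokens instead of characters. The following is a small sketch under that assumption: `PairSearch` and `ContainsPair` are hypothetical names, and numbers are assumed to be separated by single or repeated spaces.

```csharp
using System;

public static class PairSearch
{
    // True when `first` is immediately followed by `second` as whole tokens,
    // so "6 13" does not count as containing the pair "6", "1".
    public static bool ContainsPair(string line, string first, string second)
    {
        string[] tokens = line.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
        for (int i = 0; i < tokens.Length - 1; i++)
        {
            if (tokens[i] == first && tokens[i + 1] == second)
            {
                return true;
            }
        }
        return false;
    }
}
```

With the two strings from the question, `ContainsPair("6 13 24 31 35", "6", "1")` is false and `ContainsPair("1 2 3 6 1", "6", "1")` is true.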
See also:
Finding a subsequence in longer sequence
Find sequence in IEnumerable<T> using Linq

Reorder HTML Table Cells

I have an ASP.NET page that displays a table in which every cell represents an object. The objects are generated by the C# code-behind and sent to the page as JSON; a JavaScript function reads every object and builds an HTML cell that displays the object's properties.
I have a defined row and column quantity for each case. For example, if my total of objects is 20, the number of columns would be 4 and the number of rows would be 5.
Each object has a numeric identifier, from 1 to 20 in this example. The objects fill the table from left to right, row by row, resulting in a grid like this:
1 2 3 4
5 6 7 8
9 10 11 12
13 14 15 16
17 18 19 20
The object has the following structure:
public class AreaData
{
public string AreaId;
public int Rows;
public int Columns;
public int TotalCells;
public bool Orientation;
public List<CellData> Cells;
}
public class CellData
{
public string CellName;
public int CellNumber;
}
The basic outline of the Javascript function that makes the table is:
function ReturnGrid(object, settings) {
var objId = settings["objId"];
var objName = settings["objName"];
if ((settings["columns"] * settings["rows"]) == object[objName].length) {
var table = CreateFullElement("table", "id", "Table" + object[objId], "class", settings["styles"]["table_css"], "style", settings["styles"]["table_style"]);
var tableBody = document.createElement("tbody");
var index = 0;
for (var j = 0; j < settings["rows"]; j++) {
var tableRow = document.createElement("tr");
for (var i = 0; i < settings["columns"]; i++) {
var tableCell = CreateFullElement("td", "id", "CellName-" + object[objName][index]["CellName"], "title", object[objName][index]["CellName"], "style", settings["styles"]["td_style"]);
object[objName][index]["ServiceTag"] = object[objName][index]["ServiceTag"].toUpperCase();
var cellContainer;
cellContainer = CreateFullElement("div", "id", object[objName][index]["CellName"], "class", settings["classes"]["cellContent_css"], "style", cellContent_style + "background-color:" + settings["deviceStatusColors"][object[objName][index]["DeviceStatus"]]+";");
var cellNumText = document.createTextNode(object[objName][index]["CellNumber"]);
cellNumContainer = CreateFullElement("div", "class", settings["classes"]["cellNumContent_css"], "style", cellNumContent_style + textcolor_Style);
cellNumContainer.appendChild(cellNumText);
tableCell.appendChild(cellContainer);
tableRow.appendChild(tableCell);
index++;
}
tableBody.appendChild(tableRow);
}
table.appendChild(tableBody);
return table;
}
else {
console.log("The values of rows and columns does not match with the length of the object");
}
}
However, I want to rearrange that grid based on two conditions:
1) Fill the grid up to down
1 6 11 16
2 7 12 17
3 8 13 18
4 9 14 19
5 10 15 20
2)If required fill the grid as a mirror.
16 11 6 1
17 12 7 2
18 13 8 3
19 14 9 4
20 15 10 5
Is there a way to implement this functionality in JavaScript, or with a ready-made jQuery function? Or do I have to sort the objects before serializing them to JSON?
Thanks in advance
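One way to read the two target grids is as an index mapping: the cell drawn at (row, col) is the one stored at position col * rows + row in the original list, and the mirrored grid uses the same formula with the column counted from the right. A minimal sketch of reordering on the C# side before serialization, assuming the list is in left-to-right order and that the existing JavaScript keeps filling the table row by row (`GridOrder` and `ToColumnMajor` are hypothetical names):

```csharp
using System;
using System.Collections.Generic;

public static class GridOrder
{
    // Reorders cells so that a renderer which fills the table left to right,
    // row by row, ends up showing them top to bottom, column by column.
    // With mirror = true the columns run from right to left.
    public static List<T> ToColumnMajor<T>(IList<T> cells, int rows, int columns, bool mirror)
    {
        if (cells.Count != rows * columns)
        {
            throw new ArgumentException("rows * columns must equal the number of cells");
        }
        var result = new List<T>(cells.Count);
        for (int row = 0; row < rows; row++)
        {
            for (int col = 0; col < columns; col++)
            {
                int sourceCol = mirror ? columns - 1 - col : col;
                result.Add(cells[sourceCol * rows + row]);
            }
        }
        return result;
    }
}
```

For 20 cells with 5 rows and 4 columns, the first rendered row becomes 1 6 11 16, or 16 11 6 1 when mirrored. The same arithmetic could be applied to `index` inside the JavaScript loops instead.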

Merge 2 DataTables

I have 2 DataTables with these fields:
DataTable1
Hour - Value1
0 - 34
1 - 22
2 - NULL
3 - 12
..
..
23 - 10
DataTable2
Hour - Value1
0 - NULL
1 - 22
2 - 35
3 - 11
..
..
23 - NULL
I need to populate a single DataTable that has the same "Hour" field (the hours of the day). For each hour, it should take the value from DataTable1 if that value is not null. If the value in DataTable1 is null, it should take the corresponding value from DataTable2.
If the value from DataTable2 is also null, I have to log the error.
For my example, I need to get:
DataTableResult
Hour - Value1
0 - 34
1 - 22
2 - 35
3 - 12
..
..
23 - 10
How can I get this?
This question follows simple conditional logic: you need to process each row in a foreach statement.
Whenever a field in a row of DataTable1 is null, go to the matching row in DataTable2 and check whether it has a value. If it does, that becomes the new value in DataTable1.
If it does not, throw (or log) an error.
I have provided this link on how to compare, but really you don't need it, as all you are doing is testing for null in a field in a row.
Using Linq to objects and assuming dataTable1 and dataTable2 have the same columns:
// using System; using System.Data; using System.Linq;
var hoursMap1 = dataTable1.Rows.Cast<DataRow>().ToDictionary(row => Convert.ToInt32(row[0]));
var hoursMap2 = dataTable2.Rows.Cast<DataRow>().ToDictionary(row => Convert.ToInt32(row[0]));
var resultTable = new DataTable();
// Clone the columns from table 1
for (int i = 0; i < dataTable1.Columns.Count; i++)
{
var column = dataTable1.Columns[i];
resultTable.Columns.Add(column.ColumnName, column.DataType);
}
foreach (var entry in hoursMap1)
{
int hours = entry.Key;
DataRow row1 = entry.Value;
DataRow row2;
if (!hoursMap2.TryGetValue(hours, out row2))
{
// Hours in table 1 but not table 2, handle error
continue; // skip this hour so that row2 is never dereferenced while null
}
var fields = new object[resultTable.Columns.Count];
int fieldIndex = 0;
fields[fieldIndex++] = hours;
for (int i = 1; i < row1.ItemArray.Length; i++)
{
// A DataTable stores missing values as DBNull.Value, not as null
var field1 = row1[i];
var field2 = row2[i];
var newField = field1 != DBNull.Value ? field1 : field2;
if (newField == DBNull.Value)
{
// Field neither in table 1 or table 2, handle error
}
fields[fieldIndex++] = newField;
}
resultTable.Rows.Add(fields);
}
