I am looking for an efficient way to sort the data in a 2D array. The array can have many rows and columns but, in this example, I will just limit it to 6 rows and 5 columns. The data is strings as some are words. I only include one word below but in the real data there are a few columns of words. I realise if we sort, we should treat the data as numbers?
string[,] WeatherDataArray = new string[6,5];
The data is a set of weather data that is read every day and logged. This data goes through many parts of their system which I cannot change and it arrives to me in a way that it needs sorting. An example layout could be:
Day number, temperature, rainfall, wind, cloud
The matrix of data could look like this
3,20,0,12,cumulus
1,20,0,11,none
23,15,0,8,none
4,12,0,1,cirrus
12,20,0,12,cumulus
9,15,2,11,none
They now want the data sorted so it will have temperature in descending order and day number in ascending order. The result would be
1,20,0,11,none
3,20,0,12,cumulus
12,20,0,12,cumulus
9,15,2,11,none
23,15,0,8,none
4,12,0,1,cirrus
The array is stored and later they can extract it to a table and do lots of analysis on it. The extraction side is not changing so I cannot sort the data in the table, I have to create the data in the correct format to match the existing rules they have.
I could parse each row of the array and sort them but this seems a very long-handed method. There must be a quicker more efficient way to sort this 2D array by two columns? I think I could send it to a function and get returned the sorted array like:
private string[,] SortData(string[,] Data)
{
//In here we do the sorting
}
Any ideas please?
I agree with the other answer that it's probably best to parse each row of the data into an instance of a class that encapsulates the data, creating a new 1D array or list from that data. Then you'd sort that 1D collection and convert it back into a 2D array.
However another approach is to write an IComparer class that you can use to compare two rows in a 2D array like so:
public sealed class WeatherComparer: IComparer
{
readonly string[,] _data;
public WeatherComparer(string[,] data)
{
_data = data;
}
public int Compare(object? x, object? y)
{
int row1 = (int)x;
int row2 = (int)y;
double temperature1 = double.Parse(_data[row1, 1]);
double temperature2 = double.Parse(_data[row2, 1]);
if (temperature1 < temperature2)
return 1;
if (temperature2 < temperature1)
return -1;
int day1 = int.Parse(_data[row1,0]);
int day2 = int.Parse(_data[row2,0]);
return day1.CompareTo(day2);
}
}
Note that this includes a reference to the 2D array to be sorted, and parses the elements for sorting as necessary.
Then you need to create a 1D array of indices, which is what you are actually going to sort. (You can't sort a 2D array, but you CAN sort a 1D array of indices that reference the rows of the 2D array.)
public static string[,] SortData(string[,] data)
{
int[] indexer = Enumerable.Range(0, data.GetLength(0)).ToArray();
var comparer = new WeatherComparer(data);
Array.Sort(indexer, comparer);
string[,] result = new string[data.GetLength(0), data.GetLength(1)];
for (int row = 0; row < indexer.Length; ++row)
{
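// After sorting, indexer[row] is the original index of the row that belongs at position 'row'.
int source = indexer[row];
for (int col = 0; col < data.GetLength(1); ++col)
result[row, col] = data[source, col];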
}
return result;
}
Then you can call SortData to sort the data:
public static void Main()
{
string[,] weatherDataArray = new string[6, 5]
{
{ "3", "20", "0", "12", "cumulus" },
{ "1", "20", "0", "11", "none" },
{ "23", "15", "0", "8", "none" },
{ "4", "12", "0", "1", "cirrus" },
{ "12", "20", "0", "12", "cumulus" },
{ "9", "15", "2", "11", "none" }
};
var sortedWeatherData = SortData(weatherDataArray);
for (int i = 0; i < sortedWeatherData.GetLength(0); ++i)
{
for (int j = 0; j < sortedWeatherData.GetLength(1); ++j)
Console.Write(sortedWeatherData[i,j] + ", ");
Console.WriteLine();
}
}
Output:
1, 20, 0, 11, none,
3, 20, 0, 12, cumulus,
12, 20, 0, 12, cumulus,
9, 15, 2, 11, none,
23, 15, 0, 8, none,
4, 12, 0, 1, cirrus,
Note that this code does not contain any error checking - it assumes there are no nulls in the data, and that all the parsed data is in fact parsable. You might want to add appropriate error handling.
Try it on .NET Fiddle: https://dotnetfiddle.net/mwXyMs
I would suggest parsing the data into objects that can be sorted by conventional methods. Like using LINQ:
myObjects.OrderBy(obj => obj.Property1)
.ThenBy(obj=> obj.Property2);
Treating data as a table of strings will just make processing more difficult, since at every step you would need to parse values, handle potential errors since a string may be empty or contain an invalid value etc. It is a much better design to do all this parsing and error handling once when the data is read, and convert it to text-form again when writing it to disk or handing it over to the next system.
If this is a legacy system with lots of parts that handle the data in text-form I would still argue to parse the data first, and do it in a separate module so it can be reused. This should allow the other parts to be rewritten part by part to use the object format.
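As a rough sketch of that "parse once, sort, write back" idea applied to the weather data from the question: the helper name SortByTemperatureThenDay is made up for illustration, a tuple stands in for the class you would normally write, and I am assuming invariant-culture number formats and a compiler that supports top-level statements (C# 9 or later).

```csharp
using System;
using System.Globalization;
using System.Linq;

// Hypothetical helper: parse the sort keys of each row once, order the rows,
// then build a new string[,] in the sorted order.
static string[,] SortByTemperatureThenDay(string[,] data)
{
    int rows = data.GetLength(0);
    int cols = data.GetLength(1);

    var ordered = Enumerable.Range(0, rows)
        .Select(i => (
            Day: int.Parse(data[i, 0], CultureInfo.InvariantCulture),
            Temperature: double.Parse(data[i, 1], CultureInfo.InvariantCulture),
            SourceRow: i))
        .OrderByDescending(x => x.Temperature) // temperature descending
        .ThenBy(x => x.Day)                    // day number ascending
        .ToArray();

    var result = new string[rows, cols];
    for (int r = 0; r < rows; r++)
        for (int c = 0; c < cols; c++)
            result[r, c] = data[ordered[r].SourceRow, c];
    return result;
}

string[,] weather =
{
    { "3", "20", "0", "12", "cumulus" },
    { "1", "20", "0", "11", "none" },
    { "23", "15", "0", "8", "none" },
    { "4", "12", "0", "1", "cirrus" },
    { "12", "20", "0", "12", "cumulus" },
    { "9", "15", "2", "11", "none" }
};

string[,] sortedWeather = SortByTemperatureThenDay(weather);
for (int r = 0; r < sortedWeather.GetLength(0); r++)
    Console.WriteLine($"{sortedWeather[r, 0]},{sortedWeather[r, 1]}");
```

int.Parse and double.Parse will throw on bad input here; in a real module you would probably want TryParse and a decision about what to do with invalid rows.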
If this is completely infeasible, you need to either convert the data to a jagged array, i.e. string[][], or write your own sorting that can swap rows in a multidimensional array.
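For the jagged-array route, something along these lines might do. It is only a sketch: ToJagged is a hypothetical helper, the sample rows are a shortened version of the question's data, and I am assuming invariant-culture numbers and top-level statements (C# 9 or later). Once each row is its own string[], Array.Sort can reorder whole rows with an ordinary comparison.

```csharp
using System;
using System.Globalization;

// Hypothetical helper: copy a rectangular array into an array of row arrays.
static string[][] ToJagged(string[,] source)
{
    var jaggedRows = new string[source.GetLength(0)][];
    for (int r = 0; r < jaggedRows.Length; r++)
    {
        jaggedRows[r] = new string[source.GetLength(1)];
        for (int c = 0; c < jaggedRows[r].Length; c++)
            jaggedRows[r][c] = source[r, c];
    }
    return jaggedRows;
}

string[,] readings =
{
    { "3", "20", "0", "12", "cumulus" },
    { "23", "15", "0", "8", "none" },
    { "1", "20", "0", "11", "none" },
    { "9", "15", "2", "11", "none" }
};

string[][] byRow = ToJagged(readings);

// Temperature (column 1) descending, then day number (column 0) ascending.
Array.Sort(byRow, (a, b) =>
{
    int byTemperature = double.Parse(b[1], CultureInfo.InvariantCulture)
        .CompareTo(double.Parse(a[1], CultureInfo.InvariantCulture));
    return byTemperature != 0
        ? byTemperature
        : int.Parse(a[0], CultureInfo.InvariantCulture)
            .CompareTo(int.Parse(b[0], CultureInfo.InvariantCulture));
});

foreach (string[] row in byRow)
    Console.WriteLine(string.Join(",", row));
```

Note that this parses the same cells repeatedly during the sort, which is exactly the overhead the object-based approach avoids. Array.Sort is also not a stable sort, so rows that tie on both keys may come out in any order.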
I had fun trying to make something better than the accepted answer, and I think I did.
Reasons it is better:
Which columns it uses to sort and whether in an ascending or descending order is not hardcoded, but passed in as parameters. In the post, I understood that they might change their mind in the future as to how to sort the data.
It supports sorting by columns that do not contain numbers, for if they ever want to sort by the name columns.
In my testing, for large data, it is much faster and allocates less memory.
Reasons it is faster:
It never parses the same index of Data twice. It caches the numbers.
When copying, it uses Span.CopyTo instead of indices.
It doesn't create a new Data array, it sorts the rows in place. This also means it won't copy the rows that are already in the correct spots.
Here's the usage:
DataSorter.SortDataWithSortAguments(array, (1, false), (0, true));
And here's the code:
using System;
using System.Collections.Generic;
using System.Linq;
using System.Runtime.InteropServices;
namespace YourNamespace;
public static class DataSorter
{
public static void SortDataWithSortAguments(string[,] Data, params (int columnIndex, bool ascending)[] sortingParams)
{
if (sortingParams.Length == 0)
{
return;
// maybe throw an exception instead? depends on what you want
}
if (sortingParams.Length > 1)
{
var duplicateColumns =
from sortingParam in sortingParams
group false by sortingParam.columnIndex
into sortingGroup
where sortingGroup.Skip(1).Any()
select sortingGroup.Key;
var duplicateColumnsArray = duplicateColumns.ToArray();
if (duplicateColumnsArray.Length > 0)
{
throw new ArgumentException($"Cannot sort by the same column twice. Duplicate columns are: {string.Join(", ", duplicateColumnsArray)}");
}
}
for (int i = 0; i < sortingParams.Length; i++)
{
int col = sortingParams[i].columnIndex;
if (col < 0 || col >= Data.GetLength(1))
{
throw new ArgumentOutOfRangeException(nameof(sortingParams), $"Column index {col} is not within range 0 to {Data.GetLength(1) - 1}");
}
}
int[] linearRowIndeces = new int[Data.GetLength(0)];
for (int i = 0; i < linearRowIndeces.Length; i++)
{
linearRowIndeces[i] = i;
}
Span<int> sortedRows = SortIndecesByParams(Data, sortingParams, linearRowIndeces);
SortDataRowsByIndecesInPlace(Data, sortedRows);
}
private static float[]? GetColumnAsNumbersOrNull(string[,] Data, int columnIndex)
{
if (!float.TryParse(Data[0, columnIndex], out float firstNumber))
{
return null;
}
// if the first row of the given column is a number, assume all rows of the column should be numbers as well
float[] column = new float[Data.GetLength(0)];
column[0] = firstNumber;
for (int row = 1; row < column.Length; row++)
{
if (!float.TryParse(Data[row, columnIndex], out column[row]))
{
throw new ArgumentException(
$"Rows 0 to {row - 1} of column {columnIndex} contained numbers, but row {row} doesn't");
}
}
return column;
}
private static Span<int> SortIndecesByParams(
string[,] Data,
ReadOnlySpan<(int columnIndex, bool ascending)> sortingParams,
IEnumerable<int> linearRowIndeces)
{
var (firstColumnIndex, firstAscending) = sortingParams[0];
var firstColumn = GetColumnAsNumbersOrNull(Data, firstColumnIndex);
IOrderedEnumerable<int> sortedRowIndeces = (firstColumn, firstAscending) switch
{
(null, true) => linearRowIndeces.OrderBy(row => Data[row, firstColumnIndex]),
(null, false) => linearRowIndeces.OrderByDescending(row => Data[row, firstColumnIndex]),
(not null, true) => linearRowIndeces.OrderBy(row => firstColumn[row]),
(not null, false) => linearRowIndeces.OrderByDescending(row => firstColumn[row])
};
for (int i = 1; i < sortingParams.Length; i++)
{
var (columnIndex, ascending) = sortingParams[i];
var column = GetColumnAsNumbersOrNull(Data, columnIndex);
sortedRowIndeces = (column, ascending) switch
{
(null, true) => sortedRowIndeces.ThenBy(row => Data[row, columnIndex]),
(null, false) => sortedRowIndeces.ThenByDescending(row => Data[row, columnIndex]),
(not null, true) => sortedRowIndeces.ThenBy(row => column[row]),
(not null, false) => sortedRowIndeces.ThenByDescending(row => column[row])
};
}
return sortedRowIndeces.ToArray();
}
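private static void SortDataRowsByIndecesInPlace(string[,] Data, Span<int> sortedRows)
{
// sortedRows[i] is the index of the row that must end up at position i.
Span<string> tempRow = new string[Data.GetLength(1)];
for (int start = 0; start < sortedRows.Length; start++)
{
if (sortedRows[start] == start) continue; // row is already in place
// Walk the cycle that begins at 'start', pulling each source row into its final position.
MemoryMarshal.CreateSpan(ref Data[start, 0], tempRow.Length).CopyTo(tempRow);
int current = start;
while (sortedRows[current] != start)
{
int source = sortedRows[current];
Span<string> sourceRow = MemoryMarshal.CreateSpan(ref Data[source, 0], tempRow.Length);
sourceRow.CopyTo(MemoryMarshal.CreateSpan(ref Data[current, 0], tempRow.Length));
sortedRows[current] = current; // mark as done
current = source;
}
// The row saved from 'start' closes the cycle.
tempRow.CopyTo(MemoryMarshal.CreateSpan(ref Data[current, 0], tempRow.Length));
sortedRows[current] = current;
}
}
}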
}
PS: I should not have spent so much time working on this considering my responsibilities, but it was fun.
I'm new to C# and want to process strings according to the following pattern:
var data = new List<object> { "ABCDEFGHIJKLMNO", 80, "TestMain", "PQRSTUVWXY" };
/*
- if string contains > 5 characters --> Split
- check, which is the longest array from the split
- use the longest split to be an array 2D
*/
// expected result
var new_data = new List<object[]> {
new object[] { "ABCDE", 80, "TestM", "PQRST" },
new object[] { "FGHIJ", " ", "ain", "UVWXY" },
new object[] { "KLMNO", " ", " ", " " }
};
You will have to constrain your List<object> to a List<string>, since you cannot assure a valid conversion back to the original type, once you split it.
var data = new List<object> { "ABCDEFGHIJKLMNO", 80, "TestMain", "PQRSTUVWXY" };
List<string> stringData = data.Select(o => o.ToString()).ToList();
const int maxCharacters = 5;
int nrOfEntries = data.Count;
List<string[]> result = new List<string[]>();
while (true)
{
bool finished = true;
string[] newRow = new string[nrOfEntries];
for (int i = 0; i < nrOfEntries; i++)
{
string currentString = stringData[i];
if (string.IsNullOrEmpty(currentString))
{
newRow[i] = " ";
continue;
}
int length = currentString.Length;
int charactersToTake = Math.Min(length, maxCharacters);
int charactersRemaining = length - charactersToTake;
newRow[i] = currentString.Substring(0, charactersToTake);
switch (charactersRemaining)
{
case 0:
stringData[i] = null;
break;
default:
stringData[i] = currentString.Substring(charactersToTake, charactersRemaining);
finished = false;
break;
}
}
result.Add(newRow);
if(finished)
break;
}
You could use List<object[]> result, but that list will only contain strings (and will only be useful as such) since there is no way you can convert back arbitrary objects, as stated before.
I would use LINQ to solve the problem. (Be sure you have using System.Linq; at the top of your code file!)
First of all, we define a function to break down an object into several strings with length 5 or less or the object itself, if it is not a string.
const int maxStringLength = 5;
object[] BreakDownObject(object o)
=> BreakDownObjectToEnumerable(o).ToArray();
IEnumerable<object> BreakDownObjectToEnumerable(object o)
{
// If object is a string, then yield return every part
// with 5 characters (or fewer than 5, if necessary,
// for the last one)
if(o is string s)
{
for(int i = 0; i < s.Length; i += maxStringLength)
{
yield return s.Substring(i, Math.Min(s.Length - i, maxStringLength));
}
}
// object is not a string, don't break it up
else
{
yield return o;
}
}
We use Substring in combination with Math.Min: if fewer than 5 characters remain after the current index, we take only the remaining characters for the substring.
If we use this function on all items of the list we get an array of arrays of object. This array could be interpreted as "columns", because the first index gives us the columns, and the second index the subsequent broken down strings.
var data = new List<object> { "ABCDEFGHIJKLMNO", 80, "TestMain", "PQRSTUVWXY" };
object[][] columns = data.Select(BreakDownObject).ToArray();
Now we want to transpose the array, so rows first. We write a function, that takes an index and our array of arrays and returns the row with that index. (Again I use Linq-IEnumerable for easier creation of the array):
object[] GetRowAtIndex(int index, object[][] columns)
=> GetRowAtIndexAsEnumerable(index, columns).ToArray();
IEnumerable<object> GetRowAtIndexAsEnumerable(int index, object[][] columns)
{
foreach(var column in columns)
{
// Each column has different length,
// if index is less than length, we
// return the item at that index
if(index < column.Length)
{
yield return column[index];
}
// If index is greater or equal length
// we return a string with a single space
// instead.
else
{
yield return " ";
}
}
}
This function also fills up missing items in the columns with a one-space string.
Last but not least, we iterate through the rows, until no column has items left:
List<object[]> GetAllRows(object[][] columns)
=> GetAllRowsAsEnumerable(columns).ToList();
IEnumerable<object[]> GetAllRowsAsEnumerable(object[][] columns)
{
int index = 0;
while(true)
{
// Check if any column has items left
if(!columns.Any(column => index < column.Length))
{
// No column with items left, leave the loop!
yield break;
}
// return the row at index
yield return GetRowAtIndex(index, columns);
// Increase index
++index;
}
}
Put it together as one function:
List<object[]> BreakDownData(List<object> data)
{
object[][] columns = data.Select(BreakDownObject).ToArray();
return GetAllRows(columns);
}
After that, your code would be:
var data = new List<object> { "ABCDEFGHIJKLMNO", 80, "TestMain", "PQRSTUVWXY" };
var new_data = BreakDownData(data);
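If you are on .NET 6 or later, the same split-then-transpose idea can probably be written more compactly with Enumerable.Chunk. This is a different technique from the iterator functions above, not a drop-in replacement for them; the names chunkSize, pieces and table are only for illustration, and the sketch assumes top-level statements and a non-empty input list.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

const int chunkSize = 5;
var input = new List<object> { "ABCDEFGHIJKLMNO", 80, "TestMain", "PQRSTUVWXY" };

// Split each string into pieces of at most chunkSize characters;
// any non-string object stays whole as a single piece.
object[][] pieces = input
    .Select(o => o is string s
        ? s.Chunk(chunkSize).Select(chars => (object)new string(chars)).ToArray()
        : new[] { o })
    .ToArray();

// The longest column decides how many rows the table needs.
int rowCount = pieces.Max(p => p.Length);

// Transpose: row r takes the r-th piece of every column, or " " if the column is exhausted.
List<object[]> table = Enumerable.Range(0, rowCount)
    .Select(r => pieces.Select(p => r < p.Length ? p[r] : " ").ToArray())
    .ToList();

foreach (object[] row in table)
    Console.WriteLine(string.Join(" | ", row));
```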
For each element, I want to find the distance to the closest value among the other elements.
That is the delta value between the element and its nearest neighbour. It will always be a positive number because it is an absolute value.
class Element {
double DeltaValue;
double ElementValue;
public Element(double n) {
ElementValue = n;
}
static void Main() {
List<Element> ListElements = new List<Element>();
ListElements.Add(new Element(3));
ListElements.Add(new Element(10));
ListElements.Add(new Element(43));
ListElements.Add(new Element(100));
ListElements.Add(new Element(30));
ListElements.Add(new Element(140));
for(int i = 0; i < ListElements.Count; i++) {
// ListElements[i].DeltaValue = ??? <- the problem is here
// example: ListElements[2].DeltaValue should be 13, because 43-30=13
}
}
}
Just sort the values in increasing order. For each element, the closest value is then either its predecessor or its successor in the sorted order, so the delta is the smaller of those two differences. The first and last elements have only one neighbour, so for them you just take the difference to that one.
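A minimal sketch of that idea, assuming plain double values instead of the Element class from the question. NearestDeltas is a hypothetical helper name, and the code uses top-level statements (C# 9 or later). It sorts indices rather than the values themselves so that each delta can be stored back at the element's original position.

```csharp
using System;
using System.Linq;

// Hypothetical helper: for every value, the distance to its nearest other value.
static double[] NearestDeltas(double[] values)
{
    // Indices ordered by value, so neighbours in 'order' are the only candidates.
    int[] order = Enumerable.Range(0, values.Length).OrderBy(i => values[i]).ToArray();
    var result = new double[values.Length];
    for (int k = 0; k < order.Length; k++)
    {
        double best = double.PositiveInfinity; // stays infinite if there is only one value
        if (k > 0)
            best = Math.Min(best, values[order[k]] - values[order[k - 1]]);
        if (k < order.Length - 1)
            best = Math.Min(best, values[order[k + 1]] - values[order[k]]);
        result[order[k]] = best; // stored at the element's original position
    }
    return result;
}

double[] deltas = NearestDeltas(new double[] { 3, 10, 43, 100, 30, 140 });
Console.WriteLine(string.Join(", ", deltas)); // 7, 7, 13, 40, 13, 40
```

Here deltas[2] is 13, matching the 43-30=13 example from the question.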
You should be able to do it in one line with LINQ via the following:
public static int GetClosestVal(this int[] values, int place)
{
return values.OrderBy(v => Math.Abs(v - values[place])).ToArray()[1];
}
The following outputs 30
var testArray = new [] {3, 10, 43, 100, 30, 140};
Console.Write(testArray.GetClosestVal(2));
Basically speaking you sort by the absolute difference between each item and the chosen item, then grab the second item in the list since the first will always be the item itself (since n-n=0)
Thus the sorted list should be [43, 30, 10, 3, 100, 140]
I'm not sure whether I understand your question correctly. If I do, then the following code snippet can help you:
class Program
{
static void Main(string[] args)
{
Elements ListElements = new Elements();
ListElements.ElementValue.Add(3);
ListElements.ElementValue.Add(10);
ListElements.ElementValue.Add(43);
ListElements.ElementValue.Add(100);
ListElements.ElementValue.Add(30);
ListElements.ElementValue.Add(140);
ListElements.CreateDeltaValues();
for (int i = 0; i < ListElements.DeltaValue.Count; i++)
{
Console.WriteLine("ListElement["+i+"]: " + ListElements.DeltaValue[i]);
//example as for ListElements[2].DeltaValue will be 13; because 43-30=13;
}
Console.ReadKey();
}
}
public class Elements
{
public List<double> DeltaValue = new List<double>();
public List<double> ElementValue = new List<double>();
public void CreateDeltaValues()
{
this.ElementValue.Sort();
for (int i = 1; i < this.ElementValue.Count; i++)
{
var deltaValue = this.ElementValue[i] - this.ElementValue[i-1];
this.DeltaValue.Add(deltaValue);
}
}
}
It's a console application, but this code should work also for other app models.
This code generates the following output:
ListElement[0]: 7
ListElement[1]: 20
ListElement[2]: 13
ListElement[3]: 57
ListElement[4]: 40
note: BJ Myers comment was useful and was the answer in fact. However, as it was a comment, I couldn't mark that as an answer but I've placed the corrected code (using his advice) at the end of this question.
Original question below continues:
This situation may look weird at first but here is what I intend to do:
Similar to the syntax in Python, instead of creating a multidimensional array (a 2-d array, to be exact), I want to create an array of arrays (a vector of vectors, in fact).
I'm aware that C# will not let me create pointers in safe code, but I'm still curious whether there is a safer way to accomplish this task without leaving the limits of safe code.
So, I came up with the code below but couldn't figure out how to extract a specific row from the array (as shown between the comment lines).
Is it possible to pass the r'th row at once or do I need to create another temporary storage for r'th row and then pass that temporary vector through?
(System: Windows-10, VS-2013, C#)
using System;
public class Vector {
public double[] data;
public Vector(double[] data) {
this.data = new double[data.GetLength(0)];
this.data = data;
}
}
public class Matrix {
private int row, col;
public Matrix(double[,] data) {
this.row = data.GetLength(0);
this.col = data.GetLength(1);
Vector[] v = new Vector[this.row];
for (int r = 0; r < this.row; r++) {
// ****** this line below ******
v[r] = new Vector(data[r,???]);
// ****** how to extract the r'th row ******
}
}
static void Main(string[] args) {
double[,] data = { { 9.0, 8.0, 7.0 }, { 5.0, 6.0, 4.0 }, { 3.0, 2.0, 2.0 } };
Matrix A = new Matrix(data);
Console.ReadLine();
}
}
The corrected code is below:
using System;
public class Vector {
public double[] data;
public Vector(double[] data) {
this.data = new double[data.GetLength(0)];
this.data = data;
for (int i = 0; i < data.GetLength(0); i++) {
Console.Write("{0: 0.000 }", this.data[i]);
}
Console.WriteLine();
}
}
public class Matrix {
private int row, col;
public Matrix(double[][] data) {
this.row = data.GetLength(0);
this.col = data[0].GetLength(0);
Vector[] v = new Vector[this.row];
for (int r = 0; r < row; r++) {
v[r] = new Vector(data[r]);
}
Console.WriteLine("rows: " + this.row.ToString());
Console.WriteLine("cols: " + this.col.ToString());
}
static void Main(string[] args) {
double[][] data = { new double[] { 9.0, 8.0, 7.0 },
new double[] { 5.0, 6.0, 4.0 },
new double[] { 3.0, 2.0, 2.0 } };
Matrix A = new Matrix(data);
Console.ReadLine();
}
}
Well, you want to make an array-like class and access it like one? Make an indexer. What is an indexer? It's a way to make your class accessible like an array.
Look over the link for examples, I'll help you with your specific case.
public class Vector {
public double[] data;
public double this[int i]
{
get
{
// This indexer is very simple, and just returns or sets
// the corresponding element from the internal array.
return data[i];
}
set
{
data[i] = value;
}
}
public Vector(double[] data) {
this.data = new double[data.GetLength(0)];
this.data = data;
}
}
Once it's defined like so, this is perfectly valid:
double[] elementArray = new double[data.GetLength(1)]; // declaring an array the size of the second dimension of the data array
for(int i =0; i<data.GetLength(1);i++)
{
elementArray[i] = data[r,i]; // copying each element of row r into the array
}
v[r] = new Vector(elementArray);
EDIT: BJ Myers' comment is right, this solution works perfectly for a jagged array too, but make sure that you declare it properly like he mentioned.
EDIT 2: Using a list is pointless here, changed the structure to an array.
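If you want to stay with double[,] and still pull out row r in one call, one option (a sketch, not the only way) is Buffer.BlockCopy, which works on rectangular arrays of primitive types because their elements are laid out row by row. GetRow is a hypothetical helper name. The snippet uses top-level statements for brevity; on an older compiler such as the one in VS 2013 you would put the function inside a class and the rest inside Main.

```csharp
using System;

// Hypothetical helper: copy row r of a rectangular array into a new 1D array.
static double[] GetRow(double[,] matrix, int r)
{
    int cols = matrix.GetLength(1);
    var rowValues = new double[cols];
    // Buffer.BlockCopy counts in bytes, so offsets and lengths are scaled by sizeof(double).
    Buffer.BlockCopy(matrix, r * cols * sizeof(double), rowValues, 0, cols * sizeof(double));
    return rowValues;
}

double[,] sample = { { 9.0, 8.0, 7.0 }, { 5.0, 6.0, 4.0 }, { 3.0, 2.0, 2.0 } };
double[] second = GetRow(sample, 1);
Console.WriteLine(string.Join(", ", second)); // 5, 6, 4
```

With that helper, the line in the Matrix constructor could become v[r] = new Vector(GetRow(data, r)). The copy is still a temporary array per row; it just avoids writing the inner loop by hand.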
I have an array of arrays - information about selection in Excel using VSTO, where each element means start and end selection position.
For example,
int[][] selection = {
new int[] { 1 }, // column A
new int[] { 6 }, // column F
new int[] { 6 }, // column F
new int[] { 8, 9 }, // columns H:I
new int[] { 8, 9 }, // columns H:I
new int[] { 12, 15 } // columns L:O
};
Could you please help me to find a way, maybe using LINQ or Extension methods, to remove duplicated elements? I mean: F and F, H:I and H:I, etc.
If you want to use a pure LINQ/extension method solution, then you'll need to define your own implementation of IEqualityComparer for arrays/sequences. (Unless I'm missing something obvious, there's no pre-existing array or sequence comparer in the BCL). This isn't terribly hard however - here's an example of one that should do the job pretty well:
public class SequenceEqualityComparer<T> : IEqualityComparer<IEnumerable<T>>
{
public bool Equals(IEnumerable<T> x, IEnumerable<T> y)
{
return Enumerable.SequenceEqual(x, y);
}
// Probably not the best hash function for an ordered list, but it should do the job in most cases.
public int GetHashCode(IEnumerable<T> obj)
{
int hash = 0;
int i = 0;
foreach (var element in obj)
hash = unchecked((hash * 37 + hash) + (element.GetHashCode() << (i++ % 16)));
return hash;
}
}
The advantage of this is that you can then simply call the following to remove any duplicate arrays.
var result = selection.Distinct(new SequenceEqualityComparer<int>()).ToArray();
Hope that helps.
First you need a way to compare the integer arrays. To use it with the classes in the framework, you do that by making an EqualityComparer. If the arrays are always sorted, that is rather easy to implement:
public class IntArrayComparer : IEqualityComparer<int[]> {
public bool Equals(int[] x, int[] y) {
if (x.Length != y.Length) return false;
for (int i = 0; i < x.Length; i++) {
if (x[i] != y[i]) return false;
}
return true;
}
public int GetHashCode(int[] obj) {
int code = 0;
foreach (int value in obj) code ^= value;
return code;
}
}
Now you can use an integer array as key in a HashSet to get the unique arrays:
int[][] selection = {
new int[] { 1 }, // column A
new int[] { 6 }, // column F
new int[] { 6 }, // column F
new int[] { 8, 9 }, // columns H:I
new int[] { 8, 9 }, // columns H:I
new int[] { 12, 15 } // columns L:O
};
HashSet<int[]> arrays = new HashSet<int[]>(new IntArrayComparer());
foreach (int[] array in selection) {
arrays.Add(array);
}
The HashSet just throws away duplicate values, so it now contains four integer arrays.