Pick random records from a DataTable - C#

I'm trying to create an application that imports an Excel file, reads the data from it, and returns n records at random as winners, where n is the number of winners the user wants from that list. I read the data from the Excel file into a DataTable called dt. Here is a small overview:
Those are the first 30 records in the Excel file that will be imported into dt. If the user keys in 10 (the total number of winners), I need to pick 10 winners at random from dt. As you can see, some entries are duplicated: for example, in column D the entry named "H" has 6 rows. If the application chooses one of them, the other "H" rows have to be removed, but only after that one has been chosen. Removing the duplicates before choosing would lower those entrants' chances of winning the better prizes.

Could you try something like,
dt2 = dt.Clone();
dt.AsEnumerable().Select(x => x["IC_NUMBER"].ToString()).Distinct().ToList().ForEach(x =>
{
DataRow[] dr = dt.Select("IC_NUMBER = '" + x + "'");
dt2.ImportRow(dr[0]);
dr.ToList().ForEach(y => dt.Rows.Remove(y));
dt.AcceptChanges();
});
EDIT:
int totalWinners = 10;
Random rnd = new Random();
dt2 = dt.Clone();
for (int i = 1; i <= totalWinners && dt.Rows.Count > 0; i++)
{
//Pick random datarow (the upper bound of Next is exclusive, so pass Count, not Count - 1)
DataRow selectedWinner = dt.Rows[rnd.Next(0, dt.Rows.Count)];
//Insert it in the second table
dt2.ImportRow(selectedWinner);
//Retrieve other datarows that have same 'IC NUMBER'
var rows = dt.AsEnumerable().Where(x => x["IC NUMBER"].ToString() ==
selectedWinner["IC NUMBER"].ToString());
//Delete all the rows with the selected IC NUMBER in the first table
rows.ToList().ForEach(y => dt.Rows.Remove(y));
dt.AcceptChanges();
}
Hope this helps...
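The same idea can be sketched as a self-contained method. This is only an illustration: the names `WinnerPicker` and `PickWinners` are hypothetical, and the key column name is passed in by the caller because the real column name is not certain from the question. It draws from a copy of the row list, so the source table is left untouched, and it stops early if the pool runs out before enough winners are found.

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Linq;

public static class WinnerPicker
{
    // Picks up to totalWinners rows at random. Every row counts as one
    // "ticket", so a person with six rows is more likely to be drawn,
    // but can appear in the result only once.
    public static DataTable PickWinners(DataTable entries, string keyColumn,
                                        int totalWinners, Random rnd)
    {
        DataTable winners = entries.Clone();                      // same schema, no rows
        List<DataRow> pool = entries.Rows.Cast<DataRow>().ToList(); // working copy of the row list

        while (winners.Rows.Count < totalWinners && pool.Count > 0)
        {
            // Next(n) returns a value in 0..n-1, so every remaining row can be drawn.
            DataRow picked = pool[rnd.Next(pool.Count)];
            winners.ImportRow(picked);

            // Remove every remaining row that belongs to the same person.
            string key = picked[keyColumn].ToString();
            pool.RemoveAll(r => r[keyColumn].ToString() == key);
        }
        return winners;
    }
}
```

It could be called as `WinnerPicker.PickWinners(dt, "IC_NUMBER", 10, new Random())`, assuming the key column is really named "IC_NUMBER".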

Related

How to split DataTable rows using a for loop?

I have a DataTable like this:
And I want to write a for loop that shows debit and credit line on its own separate line like this:
Here is my unfinished code:
DataTable dt = new DataTable();
dt.Columns.Add("DEBIT", typeof(string));
dt.Columns.Add("CREDIT", typeof(string));
dt.Columns.Add("AMOUNT", typeof(double));
dt.Rows.Add("Debit1", "Credit1", 10);
dt.Rows.Add("Debit2", "Credit2", 8);
dt.Rows.Add("Debit3", "Credit3", 12);
for (int i=1; i <= dt.Rows.Count; i++)
{
//In the first image (the DataTable), each of the three rows shows its debit and credit on the same line. Normally the debit line and the credit line each appear on their own line.
//With the DataTable above, I want to construct a for loop that produces three debit lines and three credit lines, as shown in the second image. In this case it gives 6 lines.
}
I would much appreciate it if you could help me with this.
Steps:
Start the loop in reverse (so you can easily insert rows).
Create a new row for the credit and fill it with the relevant data.
Remove the credit data from the original row.
Insert the new row in the position following the original row.
Something like this should do the trick:
for (int i = dt.Rows.Count - 1; i >= 0; i--)
{
var row = dt.Rows[i];
if (!string.IsNullOrEmpty(row["CREDIT"].ToString()))
{
var creditRow = dt.NewRow();
creditRow["CREDIT"] = row["CREDIT"];
creditRow["AMOUNT"] = row["AMOUNT"];
row["CREDIT"] = string.Empty;
dt.Rows.InsertAt(creditRow, i + 1);
}
}

"The source contains no DataRows" error when the for loop runs only one iteration

I am making a program in Visual Studio that reads an Excel file in a specific format, converts the data to a different format, and stores it in a database table.
Below you can find the part of my code where something strange happens:
//copy schema into new datatable
DataTable _longDataTable = _library.Clone();
foreach (DataRow drlibrary in _library.Rows)
{
//count number of variables in a row
string check = drlibrary["Check"].ToString();
int varCount = check.Length - check.Replace("{", "").Length;
int count_and = 0;
if (check.Contains("and") || check.Contains("or"))
{
count_and = Regex.Matches(check, "and").Count;
varCount = varCount - count_and;
}
//loop through number of counted variables in order to add rows to long datatable (one row per variable)
for (int i = 1; i <= varCount; i++)
{
var newRow = _longDataTable.NewRow();
newRow.ItemArray = drlibrary.ItemArray;
string j = i.ToString();
//fill variablename with variable number
if (i < 10)
{
newRow["VariableName"] = "Variable0" + j;
}
else
{
newRow["VariableName"] = "Variable" + j;
}
}
}
When varCount equals 1, I get the following error message when I run the program after loading an Excel file:
The source contains no DataRows.
I don't know why I can't run the for loop with just one iteration. Can anyone help me?

DataTable out of memory exception

We have lots of old Excel files that contain quite a bit of data. I am trying to get this data into SQL Server.
I have a C# application that I have used before to upload data from Excel to SQL Server; the code is shown below.
The Excel sheet has dates going across the sheet in row 4. The first date is in cell D4. There are IDs (strings) going down from cell A5 to A11005. The values are of type double.
I am getting a System.OutOfMemoryException, which surprises me: it was thrown on the 10,785th row and 333rd column. Is it really out of memory? To be honest, I didn't think this was a huge amount of data for a DataTable.
11,000 IDs and 785 dates, so 8,635,000 doubles. Is Visual Studio out of memory? I have a 64-bit PC with 32 GB RAM.
DataTable dt = new DataTable();
dt.Columns.Add("DateTM", typeof(DateTime));
dt.Columns.Add("Id", typeof(string));
dt.Columns.Add("Vtm", typeof(double));
OpenExcelWorkbook(path + fileName, true);
XlWorksheet = (Excel.Worksheet)XlWorkbook.Worksheets["Sheet1"];
Rng = XlWorksheet.UsedRange;
object[,] valueArray = (object[,])Rng.get_Value(Excel.XlRangeValueDataType.xlRangeValueDefault);
XlWorkbook.Close(false);
// dates start in cell D4
DateTime[] dates = new DateTime[valueArray.GetLength(1) - 3];
for (int t = 4; t <= valueArray.GetLength(1); t++)
dates[t - 4] = Convert.ToDateTime(valueArray[4, t]);
// values start from row 5
for (int n = 5; n <= valueArray.GetLength(0); n++)
{
string id = valueArray[n, 1].ToString().Trim();
// dates start from column D
for (int m = 4; m <= valueArray.GetLength(1); m++)
{
double vt = Convert.ToDouble(valueArray[n, m]);
if (vt == -2146826246) // for any #N/A values
vt = -999;
dt.Rows.Add(dates[m - 4], id, vt);
}
}
using (SqlBulkCopy sqlBulk = new SqlBulkCopy(UtilityLibrary.Database.Connections.Myconnection))
{
sqlBulk.BulkCopyTimeout = 0;
sqlBulk.DestinationTableName = "tblMyTbl";
sqlBulk.WriteToServer(dt);
}
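If the process really is running out of memory (a 32-bit process, for instance, has a limited address space regardless of how much RAM is installed), one option is to avoid holding all 8.6 million rows in one DataTable and to write them out in batches instead. The following is a minimal sketch under that assumption, not a confirmed fix for this case. `BatchBuffer` is a hypothetical helper, and the `flush` delegate stands in for a call such as `sqlBulk.WriteToServer(table)`.

```csharp
using System;
using System.Data;

// Hypothetical helper: collects rows in a small reusable DataTable and
// hands the table to a callback whenever batchSize rows have accumulated.
public sealed class BatchBuffer
{
    private readonly DataTable _table;
    private readonly int _batchSize;
    private readonly Action<DataTable> _flush;

    public BatchBuffer(DataTable schema, int batchSize, Action<DataTable> flush)
    {
        _table = schema.Clone();   // copy the column definitions only
        _batchSize = batchSize;
        _flush = flush;
    }

    public void Add(params object[] values)
    {
        _table.Rows.Add(values);
        if (_table.Rows.Count >= _batchSize)
            Flush();
    }

    // Call once more after the last Add to write the final partial batch.
    public void Flush()
    {
        if (_table.Rows.Count == 0) return;
        _flush(_table);    // e.g. sqlBulk.WriteToServer(table)
        _table.Clear();    // drop the rows so the next batch starts empty
    }
}
```

In the loop above, `dt.Rows.Add(dates[m - 4], id, vt)` would become `buffer.Add(dates[m - 4], id, vt)`, with `buffer.Flush()` called once after the loops, inside the `using` block that owns the SqlBulkCopy.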

Fastest Way to Loop through SQL Database column against Excel Column - C#

I have a SQL table with two columns: OldValue and NewValue. I have the same two columns in an Excel spreadsheet. I want to find the quickest way to iterate through both the database and the spreadsheet, checking whether the OldValue column in the database matches the OldValue column in the spreadsheet.
My logic iterates over the entire SQL column (333228 records) looking for a match against the Excel column, which has 153 000 rows. This iteration is performance heavy and runs for hours without finishing; it ends up hanging. How can I do this quickly? 153 000 x 333228 is roughly 51 billion iterations, which is computationally intensive.
I read https://codereview.stackexchange.com/questions/47368/looping-through-an-excel-document-in-c but couldn't find what I was looking for. The code works and has already found 500 matches, but it's slow considering I need to get through 333228 records in the database.
List<sim_info> exel_sims = new List<sim_info>();
Microsoft.Office.Interop.Excel.Application Excel_app = new Microsoft.Office.Interop.Excel.Application();
Microsoft.Office.Interop.Excel.Workbooks work_books = Excel_app.Workbooks;
string excel_file_path = Application.StartupPath + "\\TestSample";
Microsoft.Office.Interop.Excel.Workbook work_book = work_books.Open(excel_file_path);
work_book.SaveAs(excel_file_path + ".csv", Microsoft.Office.Interop.Excel.XlFileFormat.xlCSVWindows);
Microsoft.Office.Interop.Excel.Sheets work_sheets = work_book.Worksheets;
Microsoft.Office.Interop.Excel.Worksheet work_sheet = (Microsoft.Office.Interop.Excel.Worksheet)work_sheets.get_Item(1);
for (int j = 2; j < work_sheet.Rows.Count; j++)
{
try
{
temp_sim_info.msisdn = cell_to_str(work_sheet.Cells[j, 1]).Trim();
temp_sim_info.mtn_new_number = cell_to_str(work_sheet.Cells[j, 8]).Trim();
temp_sim_info.status = cell_to_str(work_sheet.Cells[j, 9]).Trim();
if (temp_sim_info.msisdn.Length < 5 || temp_sim_info.mtn_new_number.Length > 15) //Valid cellphone number length contains 11 digits +27XXXXXXXXX / 14 digits for the new msisdn. This condition checks for invalid cellphone numbers
{
if (zero_count++ > 10)
break;
}
else
{
zero_count = 0;
exel_sims.Add(temp_sim_info);
if (exel_sims.Count % 10 == 0)
{
txtExcelLoading.Text = exel_sims.Count.ToString();
}
}
}
catch
{
if (zero_count++ > 10)
break;
}
// }
txtExcelLoading.Text = exel_sims.Count.ToString();
work_sheet.Columns.AutoFit();
for (int i = 0; i < TestTableInstance.Rows.Count; i++)
{
string db_oldNumbers = "";
string db_CellNumber = "";
if (!TestTableInstance.Rows[i].IsNull("OldNumber"))
db_oldNumbers = TestTableInstance[i].OldNumber;
else
db_oldNumbers = TestTableInstance[i].CellNumber;
if (!TestTableInstance.Rows[i].IsNull("CellNumber"))
db_CellNumber = temp_sim_info.mtn_new_number;
for (int k = 0; k < exel_sims.Count; k++)
{
sim_info sim_Result = exel_sims.Find(x => TestTableInstance[i].CellNumber == x.msisdn);
if (TestTableInstance[i].CellNumber == exel_sims[k].msisdn && sim_Result != null)
{
//If match found then do logic here
}
}
}
}
MessageBox.Show("DONE");
TestTableInstance is a DataSet of the database loaded in memory. The second inner loop iterates the entire DB column for each record until it finds a match in the first row of the OldValue column in the spreadsheet.
My code works. It's tried and tested with an Excel sheet of 800 rows and a DB table of 1000 records, where it completes in under 5 minutes. But with hundreds of thousands of records it hangs for hours.
Exactly! Why the heck are you using C# for this? Load the Excel file into a temp table in your DB and do a comparison between your actual SQL table (which allegedly has all the data you have in the Excel file) and the temp table (or view). This kind of comparison should complete in a couple of seconds.
select *
from dbtest02.dbo.article d2
left join dbtest01.dbo.article d1 on d2.id=d1.id
The left join shows all rows from the left table "dbtest02.dbo.article", even if there are no matches in "dbtest01.dbo.article".
OR
select * from dbtest02.dbo.article
except
select * from dbtest01.dbo.article
See the link below for some other ideas of how to do this.
https://www.mssqltips.com/sqlservertip/2779/ways-to-compare-and-find-differences-for-sql-server-tables-and-data/
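If the comparison has to stay in C#, the nested loop can be replaced by a hash lookup, so each database value is checked once instead of being compared against every spreadsheet row. This is a minimal sketch that assumes both columns have already been read into string sequences; `NumberMatcher` and `FindMatches` are hypothetical names, and the ordinal (case-sensitive) comparison is an assumption about how the values should match.

```csharp
using System;
using System.Collections.Generic;

public static class NumberMatcher
{
    // Returns the database values that also occur in the spreadsheet.
    public static List<string> FindMatches(IEnumerable<string> dbValues,
                                           IEnumerable<string> excelValues)
    {
        // Build the lookup once: a single pass over the spreadsheet values.
        var excelSet = new HashSet<string>(excelValues, StringComparer.Ordinal);

        var matches = new List<string>();
        foreach (string value in dbValues)
        {
            // HashSet.Contains is O(1) on average, so no inner loop is needed.
            if (value != null && excelSet.Contains(value))
                matches.Add(value);
        }
        return matches;
    }
}
```

With this shape the work grows with the sum of the two row counts rather than their product. Reading the spreadsheet cell by cell through interop may still dominate the running time, so the values are best loaded in bulk first.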

Fastest coding way to add rows to an ASP.NET DataTable

What's the fastest way, in terms of speed of coding, to add rows to a DataTable? I don't need to know either the names of the columns or their data types. Is it possible to add rows without first specifying the number or names of the DataTable columns?
DataTable t = new DataTable();
t.Rows.Add(value1,
value1,
value2,
value3,
...
valueN
);
DataSet ds = new DataSet();
ds.Tables.Add(t);
If the input comes out of a collection, you could loop it to create the DataColumns with the correct type:
var data = new Object[] { "A", 1, 'B', 2.3 };
DataTable t = new DataTable();
// create all DataColumns
for (int i = 0; i < data.Length; i++)
{
t.Columns.Add(new DataColumn("Column " + i, data[i].GetType()));
}
// add the row to the table
t.Rows.Add(data);
To answer your first question: no, you have to have columns defined on the table. You can't just say, "Hey, make a column for all these values." Nothing stopping you from creating columns on the fly, though, as Mr. Schmelter says.
Without knowing the rows or columns (you first need to add the columns; without them this is not possible):
for(int intCount = 0;intCount < dt.Rows.Count; intCount++)
{
for(int intSubCount = 0; intSubCount < dt.Columns.Count; intSubCount++)
{
dt.Rows[intCount][intSubCount] = yourValue; // or assign to something
}
}
