Create an odd byte array - c#

I am after some help in creating a byte array with the following layout:
bytes 1-2 : An integer, n, that specifies the length of the file name
3 - n+2 : The name of the file
n+3 - n+10 : The last modified date of the file
n+11 - n+12 : Integer with value 1
n+13 - n+16 : long integer with the length of the file data
n+17 - n+20 : long integer with value 0
n+21 - end : The file's content.
I already have the following code, which places the file into the byte array, but this is only the last portion.
byte[] filebytes;
st.birth_certificate = detail[4];
downloadfile.HTML = detail[4];
downloadfile.fileName = downloadfile.GetFileNameFromUrl(st.birth_certificate);
downloadfile.toLocation = @"c:\temp\" + downloadfile.fileName;
if (downloadfile.DownloadFile())
{
filebytes= File.ReadAllBytes(downloadfile.toLocation);
st.birth_certificate_file = filebytes;
}
Any help would be greatly appreciated.

It is easier to do this with a BinaryReader. I'm not sure whether the numbers are stored as binary values or as ASCII digits (or whether they are big- or little-endian), so I'm guessing a little. The code may need some minor tweaks:
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.IO;
namespace ConsoleApplication1
{
class Program
{
static void Main(string[] args)
{
string URL = "enter your url here";
FileStream sReader = File.OpenRead(URL);
BinaryReader reader = new BinaryReader(sReader);
int filenameLength = reader.ReadInt16();
string filename = Encoding.UTF8.GetString(reader.ReadBytes(filenameLength));
int year = int.Parse(Encoding.UTF8.GetString(reader.ReadBytes(4)));
int month = int.Parse(Encoding.UTF8.GetString(reader.ReadBytes(2)));
int day = int.Parse(Encoding.UTF8.GetString(reader.ReadBytes(2)));
DateTime date = new DateTime(year, month, day);
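short number1 = reader.ReadInt16(); // integer with value 1
int dataLength = reader.ReadInt32(); // length of the file data
int number2 = reader.ReadInt32(); // integer with value 0
byte[] data = reader.ReadBytes((int)(reader.BaseStream.Length - reader.BaseStream.Position)); // the rest is the file's content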
}
}
}
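For the opposite direction (building the array the question asks for), here is a minimal sketch using a BinaryWriter over a MemoryStream. FileRecord.Build is a hypothetical helper name, not something from the question. It assumes little-endian integers (which is what BinaryWriter writes), a UTF-8 file name, and an 8-byte date stored as ASCII text in yyyyMMdd form, which matches the guess made in the reader above. If the real format stores the date differently, only that one line needs to change.

```csharp
using System;
using System.Globalization;
using System.IO;
using System.Text;

public static class FileRecord
{
    // Hypothetical helper: packs the fields in the order listed in the question.
    public static byte[] Build(string fileName, DateTime lastModified, byte[] content)
    {
        byte[] nameBytes = Encoding.UTF8.GetBytes(fileName);

        // Assumption: the 8-byte date field is ASCII text in yyyyMMdd form.
        byte[] dateBytes = Encoding.ASCII.GetBytes(
            lastModified.ToString("yyyyMMdd", CultureInfo.InvariantCulture));

        using (var stream = new MemoryStream())
        using (var writer = new BinaryWriter(stream))
        {
            writer.Write((short)nameBytes.Length); // bytes 1-2: n, length of the file name
            writer.Write(nameBytes);               // n bytes: the file name
            writer.Write(dateBytes);               // 8 bytes: last modified date
            writer.Write((short)1);                // 2 bytes: integer with value 1
            writer.Write(content.Length);          // 4 bytes: length of the file data
            writer.Write(0);                       // 4 bytes: integer with value 0
            writer.Write(content);                 // remaining bytes: the file's content
            writer.Flush();
            return stream.ToArray();
        }
    }
}

public static class Program
{
    public static void Main()
    {
        byte[] content = { 1, 2, 3 };
        byte[] record = FileRecord.Build("a.txt", new DateTime(2020, 1, 2), content);
        Console.WriteLine(record.Length); // 2 + 5 + 8 + 2 + 4 + 4 + 3 = 28
    }
}
```

With a real file, the arguments would typically come from Path.GetFileName, File.GetLastWriteTime and File.ReadAllBytes.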

Related

Removing extra letter sets in an inconsistent text file using Regex

I have a hard time figuring out how to remove extra letters using Regex.
I have this example below that says that it has 42 of "|" (vertical bars) per line.
|V.7|42|
1|0|1|58|4|4|351|25|8|||1|0||6|3|1000|49|20|430|17|6|0|10|0|1200|25||30|20|20|20|20|0|100|61028|1|0|0|1|1|0|
1|0|1|58|4|4|351|25|8|||1|0||6|3|1000|49|20|430|17|6|0|10|0|1200|25||30|20|20|20|20|0|100|61028|1|0|0|1|1|0|
2|543|2|58|4|4|366|26|9|100||2|200||8|3|1000|49|20|430|17|6|10|21|54|2400|36||30|20|20|20|20|543|150|61028|2|100|1|2|2|0|
3|1230|3|60|5|5|390|26|10|100||3|1500||10|3|1000|49|20|430|17|6|10|32|123|4800|46||30|20|20|20|20|1230|200|61028|3|1000|2|3|3|0|
4|2002|4|61|6|6|424|27|12|100||4|6000||12|4|769|37|15|315|12|4|10|45|200|9600|57||30|20|20|20|20|2002|250|61028|4|5000|3|4|4|0|
5|3306|5|63|7|7|468|29|14|100||5|18000||16|4|556|27|11|208|8|2|10|58|331||69||30|20|20|20|20|3306|300|61027|1|10000|4|5|5|0|
6|4950|6|66|8|8|522|31|17|100||6|||18|4|435|21|9|147|6|1|10|74|495||80||30|20|20|20|20|4950|350|61027|2|30000|5|6|6|0|
7|6947|7|69|10|10|585|33|20|100||7|||20|4|333|17|7|97|4|1|10|90|695||92||20|15|15|15|15|6947|400|61027|3|50000|6|7|7|0|
8|9309|8|73|12|12|658|35|24|100||8|||24|4|286|14|6|73|3|1|10|109|931||105||20|15|15|15|15|9309|450|61026|1|100000|7|8|8|0|
9|12050|9|77|14|14|741|38|28|100||9|||27|5|250|13|5|55|3|1|10|129|1205||117||20|15|15|15|15|12050|500|61026|2|300000|8|9|9|0|
10|15183|10|82|16|16|834|41|33|100|100|10|||29|5|222|11|4|0|0|0|10|151|1366||130|5|20|15|15|15|15|15183|550|61025|1|500000|9|10|10|0|
11|18720|11|87|19|19|936|45|38|100|100|11|||31|5|200|10|4|0|0|0|11|176|1685||143|10|20|15|15|15|15|18720|600|||||||0|
12|21335|12|92|22|22|1048|48|44|100|100|12|||36|5|182|9|4|0|0|0|12|203|2134||157|15|10|15|10|10|10|21335|650|||||||0|
Now I have another one with 45, what I want is to remove the new letters so that it has exactly 42 vertical bars like above.
|V.8|45|
1|0|1|58|4|4|351|25|8|||1|0||6|3|1000|49|20|430|17|6|0|10|0|1200|25||30|20|20|20|20|0|100|61028|1|0|0|1|1|0|5000|40022|1|
2|543|2|58|4|4|366|26|9|100||2|200||8|3|1000|49|20|430|17|6|10|21|54|2400|36||30|20|20|20|20|543|150|61028|2|100|1|2|2|0|25000|61034|1|
3|1230|3|60|5|5|390|26|10|100||3|1500||10|3|1000|49|20|430|17|6|10|32|123|4800|46||30|20|20|20|20|1230|200|61028|3|1000|2|3|3|0|75000|40250|1|
4|2002|4|61|6|6|424|27|12|100||4|6000||12|4|769|37|15|315|12|4|10|45|200|9600|57||30|20|20|20|20|2002|250|61028|4|5000|3|4|4|0|160000|61035|1|
5|3306|5|63|7|7|468|29|14|100||5|18000||16|4|556|27|11|208|8|2|10|58|331||69||30|20|20|20|20|3306|300|61027|1|10000|4|5|5|0|300000|40355|3|
6|4950|6|66|8|8|522|31|17|100||6|||18|4|435|21|9|147|6|1|10|74|495||80||30|20|20|20|20|4950|350|61027|2|30000|5|6|6|0||||
7|6947|7|69|10|10|585|33|20|100||7|||20|4|333|17|7|97|4|1|10|90|695||92||20|15|15|15|15|6947|400|61027|3|50000|6|7|7|0||||
8|9309|8|73|12|12|658|35|24|100||8|||24|4|286|14|6|73|3|1|10|109|931||105||20|15|15|15|15|9309|450|61026|1|100000|7|8|8|0||||
9|12050|9|77|14|14|741|38|28|100||9|||27|5|250|13|5|55|3|1|10|129|1205||117||20|15|15|15|15|12050|500|61026|2|300000|8|9|9|0||||
10|15183|10|82|16|16|834|41|33|100|100|10|||29|5|222|11|4|0|0|0|10|151|1366||130|5|20|15|15|15|15|15183|550|61025|1|500000|9|10|10|0||||
11|18720|11|87|19|19|936|45|38|100|100|11|||31|5|200|10|4|0|0|0|11|176|1685||143|10|20|15|15|15|15|18720|600|||||||0||||
12|21335|12|92|22|22|1048|48|44|100|100|12|||36|5|182|9|4|0|0|0|12|203|2134||157|15|10|15|10|10|10|21335|650|||||||0||||
And I have this code at the moment:
public string Fix(string FileName, int columnsCount)
{
var InputFile = File.ReadLines(FileName).Skip(1).ToArray();
string Result = "";
for(int i = 0; i < InputFile.Length; i++)
{
int FoundMatches = Regex.Matches(Regex.Escape(InputFile[i]), FindWhatTxtBox.Text).Count;
// If too many letters found, trim the rest.
if(FoundMatches > CountTxtBox.Text.Length)
{
string CurrentLine = InputFile[i];
}
}
return Result;
}
As you can see, each field between two vertical bars holds either one number or nothing. How can I remove the extra letters?
Do you have to use a RegEx? It can also be done with string manipulation like this:
using System;
using System.Linq;
public class Program
{
public static void Main()
{
string s = "1|0|1|58|4|4|351|25|8|||1|0||6|3|1000|49|20|430|17|6|0|10|0|1200|25||30|20|20|20|20|0|100|61028|1|0|0|1|1|0|5000|40022|1|";
var arr = s.Split('|') ;
var retVal = String.Join("|", arr.Take(43));
Console.WriteLine(retVal);
}
}
It takes 43 because the first digit looks like a counter to me, but you can make it 42 of course. Note that Take does not throw when a line has fewer than 43 entries; it simply returns the entries that exist, so shorter lines pass through unchanged.
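Since every line in the sample ends with a vertical bar, a variant of the split-and-join idea that keeps exactly N bar-terminated fields may be closer to what the question asks for. This is only a sketch: FieldTrimmer.Trim is a hypothetical name, and it assumes each field is followed by its own "|".

```csharp
using System;
using System.Linq;

public static class FieldTrimmer
{
    // Keeps at most maxFields fields, each followed by its '|' terminator.
    public static string Trim(string line, int maxFields)
    {
        string[] fields = line.Split('|');

        // A line ending in '|' produces one empty trailing element; it is not a field.
        if (fields.Length > 0 && fields[fields.Length - 1].Length == 0)
        {
            fields = fields.Take(fields.Length - 1).ToArray();
        }

        return string.Concat(fields.Take(maxFields).Select(f => f + "|"));
    }
}

public static class Program
{
    public static void Main()
    {
        // Shortened sample line: four wanted fields followed by three extra ones.
        Console.WriteLine(FieldTrimmer.Trim("1|0|1|58|5000|40022|1|", 4)); // 1|0|1|58|
    }
}
```

For the files in the question the call would be Trim(line, 42), applied to every line after the header.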
This is too simple to need Regex. See the code below:
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.IO;
namespace ConsoleApplication1
{
class Program
{
const string INPUT_FILENAME = @"c:\temp\test.txt";
const string OUTPUT_FILENAME = @"c:\temp\test1.txt";
static void Main(string[] args)
{
StreamReader reader = new StreamReader(INPUT_FILENAME);
StreamWriter writer = new StreamWriter(OUTPUT_FILENAME);
string inputLine = "";
int lineCount = 0;
while ((inputLine = reader.ReadLine()) != null)
{
if (++lineCount == 1)
{
writer.WriteLine(inputLine);
}
else
{
string[] inputArray = inputLine.Split(new char[] {'|'});
writer.WriteLine(string.Join("|", inputArray.Take(43)));
}
}
reader.Close();
writer.Flush();
writer.Close();
}
}
}
Here is a data file, let us keep it easy by only needing 5 items but still using Regex.
Keep your examples small for StackOverflow...one will get more answers.
The below code can be changed to 42 ({0,42}) or any number as needed, but the example will read then write out only 5.
Data File
1|2|3|4|5|6|7|8|9|10
10|9|8|7|6|5|4|3|2|1|0|1|
||||||||||||11|12|
Code To get 0 to 5 Items per line
var data = File.ReadAllText(@"C:\Temp\test.txt");
string pattern = @"^(\d*\|){0,5}";
File.WriteAllLines(@"C:\Temp\testOut.txt",
Regex.Matches(data, pattern, RegexOptions.Multiline)
.OfType<Match>()
.Select(mt => mt.Groups[0].Value));
Resultant File
1|2|3|4|5|
10|9|8|7|6|
|||||

C# Writing Binary Data

I am trying to get some data to write to a binary file. The data consists of multiple values (strings, decimal, ints) that need to be a single string and then written to a binary file.
What I have so far creates the file, but it's putting my string in there as they appear and not converting them to binary, which I assume should look like 1010001010 etc. when I open the file in notepad?
The actual output is Jesse23023130123456789.54321 instead of the binary digits.
Where have I steered myself wrong on this?
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
using System.IO;
namespace BinaryData
{
class Program
{
static void Main(string[] args)
{
string name = "Jesse";
int courseNum = 230;
int num = 23130;
decimal d = 123456789.54321M;
string combined = name + courseNum + num + d;
FileStream writeStream;
writeStream = new FileStream("BinaryData.dat", FileMode.Create);
BinaryWriter bw = new BinaryWriter(writeStream);
bw.Write(combined);
}
}
}
There's more than one way to do this, but here's a basic approach. After you combine everything into a single string, iterate through the string and convert each character into its binary representation with Convert.ToString(char, 2). ASCII characters will normally be 7 bits or less in length, so you'll need PadLeft(8, '0') to ensure 8 bits per byte. Then for the reverse you just grab 8 bits at a time and convert them back to the ASCII character. Without padding with leading 0's to ensure eight bits, you won't be sure how many bits make up each character in the file.
using System;
using System.Text;
public class Program
{
public static void Main()
{
string name = "Jesse";
int courseNum = 230;
int num = 23130;
decimal d = 123456789.54321M;
string combined = name + courseNum + num + d;
// Translate ASCII to binary
StringBuilder sb = new StringBuilder();
foreach (char c in combined)
{
sb.Append(Convert.ToString(c, 2).PadLeft(8, '0'));
}
string binary = sb.ToString();
Console.WriteLine(binary);
// Translate binary to ASCII
StringBuilder decodedBinary = new StringBuilder();
for (int i = 0; i < binary.Length; i += 8)
{
decodedBinary.Append(Convert.ToChar(Convert.ToByte(binary.Substring(i, 8), 2)));
}
Console.WriteLine(decodedBinary);
}
}
Results:
01001010011001010111001101110011011001010011001000110011001100000011001000110011001100010011001100110000001100010011001000110011001101000011010100110110001101110011100000111001001011100011010100110100001100110011001000110001
Jesse23023130123456789.54321
Here you go:
The main method:
static void Main(string[] args)
{
string name = "Jesse";
int courseNum = 230;
int num = 23130;
decimal d = 123456789.54321M;
string combined = name + courseNum + num + d;
string bitString = GetBits(combined);
System.IO.File.WriteAllText(@"your_full_path_with_existing_text_file", bitString);
Console.ReadLine();
}
The method returns the bits (0s and 1s) for your input string:
public static string GetBits(string input)
{
StringBuilder sb = new StringBuilder();
foreach (byte b in Encoding.Unicode.GetBytes(input))
{
sb.Append(Convert.ToString(b, 2).PadLeft(8, '0')); // pad so every byte is exactly 8 bits
}
return sb.ToString();
}
File.WriteAllText creates the file if it does not exist and overwrites it if it does. This example already has a .txt file created, so it just needs the full path to write to it.
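Both answers above produce a text file of '0' and '1' characters. If the goal is instead a compact binary file, one option is to write each value separately with BinaryWriter and read it back with BinaryReader in the same order. The sketch below assumes that is acceptable for the assignment; BinaryDemo, Save and Load are hypothetical names. Opened in Notepad, the name will still be readable, because a text editor shows bytes as characters, but the numbers will appear as unreadable symbols.

```csharp
using System;
using System.Globalization;
using System.IO;

public static class BinaryDemo
{
    // Writes each value in its binary form instead of one concatenated string.
    public static void Save(string path, string name, int courseNum, int num, decimal d)
    {
        using (var writer = new BinaryWriter(File.Open(path, FileMode.Create)))
        {
            writer.Write(name);      // length prefix followed by the UTF-8 bytes
            writer.Write(courseNum); // 4 bytes
            writer.Write(num);       // 4 bytes
            writer.Write(d);         // 16 bytes
        }
    }

    // Reads the values back in the same order they were written.
    public static string Load(string path)
    {
        using (var reader = new BinaryReader(File.OpenRead(path)))
        {
            string name = reader.ReadString();
            int courseNum = reader.ReadInt32();
            int num = reader.ReadInt32();
            decimal d = reader.ReadDecimal();
            return string.Format(CultureInfo.InvariantCulture,
                "{0} {1} {2} {3}", name, courseNum, num, d);
        }
    }
}

public static class Program
{
    public static void Main()
    {
        string path = Path.Combine(Path.GetTempPath(), "BinaryData.dat");
        BinaryDemo.Save(path, "Jesse", 230, 23130, 123456789.54321M);
        Console.WriteLine(BinaryDemo.Load(path));       // Jesse 230 23130 123456789.54321
        Console.WriteLine(new FileInfo(path).Length);   // 30 (6 + 4 + 4 + 16 bytes)
    }
}
```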

C# setting length of main string as length of a substring

I have a problem with the .Length code.
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
class Program
{
static void Main(string[] args)
{
string Str_Basics = "AIKdepNCZSIDETe";
int Long_Str_Bas;
string Sub_Str_1;
string Sub_Str_2;
Long_Str_Bas = Str_Basics.Length;
//Provide value for M
int M = 0;
Console.WriteLine("Provide value for M");
M = Convert.ToInt32(Console.ReadLine());
for (int i = 0; i < Long_Str_Bas; i++) ;
// First substring
Sub_Str_1 = Str_Basics.Substring(1, (M - 1));
// Second substring
Sub_Str_2 = Str_Basics.Substring((M + 1),Long_Str_Bas);
Console.WriteLine("Substring is " + Sub_Str_1);
Console.WriteLine("Substring is " + Sub_Str_2);
Console.ReadKey();
}
}
I do not know how to use Str_Basics.Length as one of the arguments for Sub_Str_2. If anyone could explain to me how .Length works, I would be really thankful.
It sounds like you just want to do this:
// First substring
Sub_Str_1 = Str_Basics.Substring(0, M);
// Second substring
Sub_Str_2 = Str_Basics.Substring(M);
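To spell out why this works: Length is the number of characters, valid indexes run from 0 to Length - 1, and the second argument of Substring(start, length) is a count of characters, not an end position. The original call Substring(M + 1, Long_Str_Bas) asks for Length characters starting at M + 1, which runs past the end of the string and throws ArgumentOutOfRangeException. A small sketch (SplitAt is a hypothetical helper name, and it assumes the intent is to cut the string into the first M characters and the rest):

```csharp
using System;

public static class Program
{
    // Splits text into its first m characters and the remainder.
    public static string[] SplitAt(string text, int m)
    {
        string first = text.Substring(0, m); // m characters starting at index 0
        string second = text.Substring(m);   // same as text.Substring(m, text.Length - m)
        return new[] { first, second };
    }

    public static void Main()
    {
        string text = "AIKdepNCZSIDETe";      // Length is 15, indexes are 0..14
        string[] parts = SplitAt(text, 5);
        Console.WriteLine(parts[0]);          // AIKde
        Console.WriteLine(parts[1]);          // pNCZSIDETe
    }
}
```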

How to improve performance and speed in my code (especially in double.Parse)?

I have a testFile.txt file (around 400 MB). It contains OHLC stock prices with a timeframe of 1 minute.
The structure of each line is "stock name, date, time, price open, price high, price low, price close, volume", for example "OTHE,20010102,230100,1.9007,1.9007,1.9007,1.9007,4".
My major problem is that this code is very slow. I measured the speed and found that the critical part is the double.Parse part. Is it possible to change the code to increase performance?
My c# parsing code:
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Text;
using System.Threading;
using System.Threading.Tasks;
using System.Globalization;
namespace ConsoleApplication3
{
class Program
{
static void Main(string[] args)
{
string sourceDir = "D:\\testFile.txt",
outDir = "D:\\result.txt";
Thread.CurrentThread.CurrentCulture = System.Globalization.CultureInfo.InvariantCulture;
using (StreamReader sr = new StreamReader(sourceDir))
{
int divider = 5;
string line = sr.ReadLine();
StreamWriter sw = new StreamWriter(outDir);
List<string> listLine = new List<string>();
List<double> listOpen = new List<double>();
List<double> listHigh = new List<double>();
List<double> listLow = new List<double>();
List<double> listClose = new List<double>();
List<double> listVolume = new List<double>();
DateTime dateTimeOut = new DateTime();
string formatDate = "yyyyMMddHHmmss";
string newLine = "";
double priceOpen, priceHigh, priceLow, priceClose, volume;
//read first line, but don't write it
line = sr.ReadLine();
while (line != null)
{
listLine = line.Split(',').ToList();
dateTimeOut = DateTime.ParseExact(listLine[1] + listLine[2], formatDate, null);
double.TryParse(listLine[3], out priceOpen);
double.TryParse(listLine[4], out priceHigh);
double.TryParse(listLine[5], out priceLow);
double.TryParse(listLine[6], out priceClose);
double.TryParse(listLine[7], out volume);
listOpen.Add(priceOpen);
listHigh.Add(priceHigh);
listLow.Add(priceLow);
listClose.Add(priceClose);
listVolume.Add(volume);
if (dateTimeOut.Minute % divider == 0)
{
newLine = dateTimeOut + "," + listOpen[0] + "," + listHigh.Max() + "," + listLow.Min() + "," + listClose[4] + "," + listVolume.Max();
sw.WriteLine(newLine);
}
line = sr.ReadLine();
}
sr.Close();
}
}
}
}
Upd. The problem is here:
if (dateTimeOut.Minute % divider == 0)
{
newLine = "";
sw.WriteLine(newLine);
}
I do not believe that the Double.Parse() is the bottleneck.
I wrote a test program (shown below). The release build parses one hundred million doubles in less than twenty seconds:
using System;
using System.Diagnostics;
namespace Demo
{
internal class Program
{
private void run()
{
string s = "12345.6789";
double result;
Stopwatch sw = Stopwatch.StartNew();
for (int i = 0; i < 100000000; ++i)
double.TryParse(s, out result);
Console.WriteLine("Took " + sw.Elapsed);
}
private static void Main()
{
new Program().run();
}
}
}
You are using the LINQ Max() and Min() functions, which iterate through the whole collection. Since they are called thousands of times in a loop, and the collection contains millions of elements, this is very inefficient. Instead, store the min and max values outside the loop and update them on every iteration:
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Text;
using System.Threading;
using System.Threading.Tasks;
using System.Globalization;
namespace ConsoleApplication3
{
class Program
{
static void Main(string[] args)
{
string sourceDir = "D:\\testFile.txt",
outDir = "D:\\result.txt";
Thread.CurrentThread.CurrentCulture = System.Globalization.CultureInfo.InvariantCulture;
using (StreamReader sr = new StreamReader(sourceDir))
{
int divider = 5;
string line = sr.ReadLine();
StreamWriter sw = new StreamWriter(outDir);
List<string> listLine = new List<string>();
List<double> listOpen = new List<double>();
List<double> listHigh = new List<double>();
List<double> listLow = new List<double>();
List<double> listClose = new List<double>();
List<double> listVolume = new List<double>();
DateTime dateTimeOut = new DateTime();
string formatDate = "yyyyMMddHHmmss";
string newLine = "";
double priceOpen, priceHigh, priceLow, priceClose, volume;
//read first line, but don't write it
line = sr.ReadLine();
double highMax = double.MinValue;
double lowMin = double.MaxValue;
double volumeMax = double.MinValue;
while (line != null)
{
listLine = line.Split(',').ToList();
dateTimeOut = DateTime.ParseExact(listLine[1] + listLine[2], formatDate, null);
double.TryParse(listLine[3], out priceOpen);
double.TryParse(listLine[4], out priceHigh);
double.TryParse(listLine[5], out priceLow);
double.TryParse(listLine[6], out priceClose);
double.TryParse(listLine[7], out volume);
listOpen.Add(priceOpen);
listHigh.Add(priceHigh);
listLow.Add(priceLow);
listClose.Add(priceClose);
listVolume.Add(volume);
/*Here is implementation of accumulative max/min calculation*/
if (highMax < priceHigh)
{
highMax = priceHigh;
}
if (lowMin > priceLow)
{
lowMin = priceLow;
}
if (volumeMax < volume)
{
volumeMax = volume;
}
if (dateTimeOut.Minute % divider == 0)
{
newLine = dateTimeOut + "," + listOpen[0] + "," + highMax + "," + lowMin + "," + listClose[4] + "," + volumeMax;
sw.WriteLine(newLine);
}
line = sr.ReadLine();
}
sr.Close();
}
}
}
}
In this case you don't even need to add the parsed values to lists (if you have no other use for them), so you can remove the lists completely, saving some more memory and time.
double.Parse is very slow because there are many ways to represent double values: 1000; 1000.1; 1e3, 1.353e+34, -23.24e-123 etc.
If you have only one predefined format (and it is likely you have), say 10394.324 without exponential form support, then you can implement a much more efficient custom parser: read character by character from the stream, check whether each character is a space, digit or dot, and accumulate the result or handle it accordingly. It is relatively simple to implement and will provide much better performance. I suppose a 400 MB file can be parsed in less than 10 seconds if your hard drive can be read that fast =).
Also I wouldn't recommend using string.Split with such a large number of strings: it will use a great deal of memory and make garbage collections occur often, which will probably slow down your code even more than double.Parse. Instead, read the stream byte by byte.
One more point to mention is that ToList() creates a new list and copies (references to) all elements of the source collection into it. That is also an unneeded operation that costs significant time and memory.
And finally, string concatenation shouldn't be done using the '+' operator.
So I think your problem may be in these lines:
line.Split(',').ToList();
newLine = dateTimeOut + "," + listOpen[0] + "," + listHigh.Max() + "," + listLow.Min() + "," + listClose[4] + "," + listVolume.Max();
If running your program consumes all of the machine's memory, then the problem is almost certainly here.
Try replacing the second line with a few consecutive calls to sw.Write() to avoid the '+' operators, and implement a streaming double parser that doesn't require string splitting.
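As a rough illustration of the custom parser idea, here is a minimal sketch that handles only an optional minus sign, digits and a single dot, and reads the number directly out of a larger string so that no Split is needed. FastDouble is a hypothetical name, the 15-digit limit is an assumption chosen to keep the arithmetic exact, and whether this is actually faster than double.Parse on your data is something to measure rather than assume.

```csharp
using System;

public static class FastDouble
{
    private static readonly double[] Pow10 =
    {
        1e0, 1e1, 1e2, 1e3, 1e4, 1e5, 1e6, 1e7,
        1e8, 1e9, 1e10, 1e11, 1e12, 1e13, 1e14, 1e15
    };

    // Parses s[start .. start + length) in the fixed form [-]digits[.digits].
    public static double Parse(string s, int start, int length)
    {
        int i = start;
        int end = start + length;
        bool negative = false;
        if (i < end && s[i] == '-') { negative = true; i++; }

        long mantissa = 0;      // all digits, with the dot ignored
        int digits = 0;         // total number of digits seen
        int fractionDigits = 0; // digits after the dot
        bool seenDot = false;

        for (; i < end; i++)
        {
            char c = s[i];
            if (c >= '0' && c <= '9')
            {
                mantissa = mantissa * 10 + (c - '0');
                digits++;
                if (seenDot) fractionDigits++;
            }
            else if (c == '.' && !seenDot)
            {
                seenDot = true;
            }
            else
            {
                throw new FormatException("Unexpected character '" + c + "'.");
            }
        }

        // Assumption: at most 15 digits, so the mantissa and the divisor are exact doubles.
        if (digits == 0 || digits > 15) throw new FormatException("Unsupported number.");

        double value = mantissa / Pow10[fractionDigits];
        return negative ? -value : value;
    }
}

public static class Program
{
    public static void Main()
    {
        string line = "OTHE,20010102,230100,1.9007,1.9007,1.9007,1.9007,4";
        int fieldStart = 0, field = 0;

        // Walk the commas by hand and parse fields 3..7 in place.
        for (int i = 0; i <= line.Length; i++)
        {
            if (i == line.Length || line[i] == ',')
            {
                if (field >= 3) Console.WriteLine(FastDouble.Parse(line, fieldStart, i - fieldStart));
                fieldStart = i + 1;
                field++;
            }
        }
    }
}
```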
The problem was with List<>. I made a stupid mistake: I forgot about List.Clear().
So, thanks to all, especially to Oleksandr and Matthew.
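Putting the two points together (running min/max instead of lists, and resetting the state after each output line), the per-window bookkeeping could look roughly like the sketch below. BarAccumulator is a hypothetical class, not code from the thread, and it keeps the maximum volume only because that is what the original code computes.

```csharp
using System;

public sealed class BarAccumulator
{
    private bool hasData;

    public double Open { get; private set; }
    public double High { get; private set; }
    public double Low { get; private set; }
    public double Close { get; private set; }
    public double VolumeMax { get; private set; }

    // Folds one 1-minute row into the current window.
    public void Add(double open, double high, double low, double close, double volume)
    {
        if (!hasData)
        {
            // First row of the window: it defines the open and seeds the extremes.
            Open = open;
            High = high;
            Low = low;
            VolumeMax = volume;
            hasData = true;
        }
        else
        {
            High = Math.Max(High, high);
            Low = Math.Min(Low, low);
            VolumeMax = Math.Max(VolumeMax, volume);
        }
        Close = close; // the last row seen so far defines the close
    }

    // Call after writing a line, so the next window starts fresh (the role of List.Clear()).
    public void Reset()
    {
        hasData = false;
    }
}

public static class Program
{
    public static void Main()
    {
        var bar = new BarAccumulator();
        bar.Add(1.0, 1.5, 0.9, 1.2, 4);
        bar.Add(1.2, 1.8, 1.1, 1.7, 9);
        Console.WriteLine(bar.Open + "," + bar.High + "," + bar.Low + "," + bar.Close + "," + bar.VolumeMax);

        bar.Reset();
        bar.Add(2.0, 2.1, 1.9, 2.0, 3);
        Console.WriteLine(bar.Open + "," + bar.High + "," + bar.Low + "," + bar.Close + "," + bar.VolumeMax);
    }
}
```

In the parsing loop, Add would be called for every row, and the write plus Reset would go inside the `dateTimeOut.Minute % divider == 0` branch.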

c# how to convert float to int

I need to convert float to int (single precision, 32 bits) like:
'float: 2 (hex: 40000000) to int: 1073741824'. Any idea how to implement that?
I was looking for it in msdn help but with no result.
float f = ...;
int i = BitConverter.ToInt32(BitConverter.GetBytes(f), 0);
BitConverter.DoubleToInt64Bits, as per the accepted answer of this question.
If the above solution is no good for you (due to it acting upon double/Double rather than float/Single) then see David Heffernan's answer.
David, THANKS. That was the short answer to my long search for an analogue of the Java method Float.floatToIntBits. Here is the entire code:
static void Main()
{
float tempVar = -27.25f;
int intBits = BitConverter.ToInt32(BitConverter.GetBytes(tempVar), 0);
string input = Convert.ToString(intBits, 2);
input = input.PadLeft(32, '0');
string sign = input.Substring(0, 1);
string exponent = input.Substring(1, 8);
string mantissa = input.Substring(9, 23);
Console.WriteLine();
Console.WriteLine("Sign = {0}", sign);
Console.WriteLine("Exponent = {0}", exponent);
Console.WriteLine("Mantissa = {0}", mantissa);
}
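For reference, here is a small sketch that checks the BitConverter approach against the example from the question (float 2 should give 1073741824, hex 40000000) and shows the reverse direction. FloatBits is a hypothetical wrapper name. On newer runtimes, BitConverter.SingleToInt32Bits does the same reinterpretation in a single call.

```csharp
using System;

public static class FloatBits
{
    // Reinterprets the 4 bytes of a float as a 32-bit int (no numeric conversion).
    public static int ToInt32Bits(float value)
    {
        return BitConverter.ToInt32(BitConverter.GetBytes(value), 0);
    }

    // The reverse: reinterprets the 4 bytes of an int as a float.
    public static float FromInt32Bits(int bits)
    {
        return BitConverter.ToSingle(BitConverter.GetBytes(bits), 0);
    }
}

public static class Program
{
    public static void Main()
    {
        int bits = FloatBits.ToInt32Bits(2f);
        Console.WriteLine(bits);                           // 1073741824
        Console.WriteLine(bits.ToString("X8"));            // 40000000
        Console.WriteLine((int)2f);                        // 2 (a cast converts the value instead)
        Console.WriteLine(FloatBits.FromInt32Bits(bits));  // 2
    }
}
```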
If you're targeting a platform where BitConverter isn't available, or you simply want another way to reinterpret a float as a 32-bit int, use a memory stream:
using System;
using System.IO;
namespace Stream
{
class Program
{
static void Main (string [] args)
{
float f = 1;
int i;
MemoryStream s = new MemoryStream ();
BinaryWriter w = new BinaryWriter (s);
w.Write (f);
s.Position = 0;
BinaryReader r = new BinaryReader (s);
i = r.ReadInt32 ();
s.Close ();
Console.WriteLine ("Float " + f + " = int " + i);
}
}
}
It is a bit long winded though.
