C#: Japanese characters with Unicode encoding

The code is meant to write a Unicode string to a file as Japanese characters:
String s = "\u30a2\u30c3\u30d7\u30ed\u30fc\u30c9\u3059\u308b\u30d5\u30a1\u30a4\u30eb\u304c\u6307\u5b9a\u3055\u308c\u3066\u3044\u307e\u305b\u3093";
var Bytes = Encoding.Unicode.GetBytes(s);
string key = Encoding.UTF8.GetString(Encoding.Convert(Encoding.Unicode, Encoding.UTF8, Bytes));
key is what I want to write to the file, but it has the value \u30a2\u30c3\u30d7\u30ed\u30fc\u30c9\u3059\u308b\u30d5\u30a1\u30a4\u30eb\u304c\u6307\u5b9a\u3055\u308c\u3066\u3044\u307e\u305b\u3093. Any ideas what's wrong?

What's wrong is that a string (key) has no notion of the bytes used to store it. In this case, your string is:
String:
アップロードするファイルが指定されていません
This is exactly what
"\u30a2\u30c3\u30d7\u30ed\u30fc\u30c9\u3059\u308b\u30d5\u30a1\u30a4\u30eb\u304c\u6307\u5b9a\u3055\u308c\u3066\u3044\u307e\u305b\u3093"
means. The expression '\u30a2' looks like two raw bytes, but it is just an escape sequence for the single character 'ア'.
If you save to a UTF-8 file, the bytes written will be:
UTF-8 bytes
File.WriteAllText("temp.txt", "アップロードするファイルが指定されていません", Encoding.UTF8);
The contents will be (in bytes, after the leading EF BB BF byte order mark):
E3 82 A2 E3 83 83 E3 83 97 E3 83 AD E3 83 BC E3 83 89 E3 81 99 E3 82 8B E3 83
95 E3 82 A1 E3 82 A4 E3 83 AB E3 81 8C E6 8C 87 E5 AE 9A E3 81 95 E3 82 8C E3
81 A6 E3 81 84 E3 81 BE E3 81 9B E3 82 93
UTF-16 bytes
File.WriteAllText("temp.txt", "アップロードするファイルが指定されていません", Encoding.Unicode);
The contents will be (in bytes, after the leading FF FE byte order mark):
A2 30 C3 30 D7 30 ED 30 FC 30 C9 30 59 30 8B 30 D5 30 A1 30 A4 30 EB 30 4C 30
07 63 9A 5B 55 30 8C 30 66 30 44 30 7E 30 5B 30 93 30
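The same point can be checked in code. The following is a minimal sketch (the class and method names are made up for illustration): the string is one sequence of characters, and only the encoding you pick decides which bytes come out.

```csharp
using System;
using System.Text;

public static class EncodingDemo
{
    // Returns the bytes of `text` in the given encoding as "E3-82-A2"-style hex.
    public static string Hex(string text, Encoding encoding)
    {
        return BitConverter.ToString(encoding.GetBytes(text));
    }

    // Encodes `text` to bytes and decodes it again with the same encoding.
    public static string RoundTrip(string text, Encoding encoding)
    {
        return encoding.GetString(encoding.GetBytes(text));
    }
}
```

Calling EncodingDemo.Hex("\u30a2", Encoding.UTF8) gives E3-82-A2, while EncodingDemo.Hex("\u30a2", Encoding.Unicode) gives A2-30. RoundTrip returns the original string in both cases, because the string never "was" UTF-8 or UTF-16 in the first place.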

One doesn't "convert" Unicode to UTF-8 :-/
Unicode, besides being the parent for the entire set of specifications, can be thought of as "simply" defining code-points/characters and the rules of interaction. The UTF-8 encoding is the specific set of rules to map a sequence of Unicode code-points into a sequence of octets (8-bit bytes).
Try this in LINQPad:
String s = "\u30a2\u30c3\u30d7\u30ed";
s.Dump(); // original string
var bytes = Encoding.UTF8.GetBytes(s);
bytes.Dump(); // see UTF-8 encoded byte sequence
string key = Encoding.UTF8.GetString(bytes);
key.Dump(); // contents restored
UTF-8 exists only at the byte level; the string itself has no encoding.
Happy coding.

Related

Excel CSV Encoding issues

I have a question about Danish characters when opening a saved CSV file in Excel. See the code below:
[HttpGet]
[Route("/progress/data.csv")]
[Produces("text/csv")]
public IActionResult GetCSV()
{
    StringBuilder sb = new StringBuilder();
    sb.AppendLine("æø;2;3;");
    Encoding encode = Encoding.UTF8;
    return File(encode.GetBytes(sb.ToString()), "text/csv", "data.csv");
}
I am using .NET Core 2.1 and the result of this export is that the two first characters æø are displayed as æà .
I am aware that this is a known problem, but so far I have not found a solution. During the last 4 hours I have tried at least 15 different approaches: UTF-8 with and without a BOM, manually adding the BOM with System.Text.Encoding.UTF8.GetPreamble(), various MemoryStream and StreamWriter solutions, and windows-1252 via CodePagesEncodingProvider.Instance.GetEncoding(1252), but nothing works. When I open the file in Excel, the result is always something different than expected.
Does anyone have a solution for this?
Well, the problem is the way Excel deals with the BOM. You might have found out that you can use a StreamWriter; its documentation notes:
StreamWriter defaults to using an instance of UTF8Encoding unless specified otherwise. This instance of UTF8Encoding is constructed without a byte order mark (BOM), so its GetPreamble method returns an empty byte array. The default UTF-8 encoding for this constructor throws an exception on invalid bytes. This behavior is different from the behavior provided by the encoding object in the Encoding.UTF8 property. To specify a BOM and determine whether an exception is thrown on invalid bytes, use a constructor that accepts an encoding object as a parameter, such as StreamWriter(String, Boolean, Encoding) or StreamWriter.
So I just created a custom implementation of IActionResult:
public class Utf8ForExcelCsvResult : IActionResult
{
    public string Content { get; set; }
    public string ContentType { get; set; }
    public string FileName { get; set; }

    public Task ExecuteResultAsync(ActionContext context)
    {
        var Response = context.HttpContext.Response;
        Response.Headers["Content-Type"] = this.ContentType;
        Response.Headers["Content-Disposition"] = $"attachment; filename={this.FileName}; filename*=UTF-8''{this.FileName}";
        // Passing Encoding.UTF8 makes the StreamWriter emit the BOM before the content.
        using (var sw = new StreamWriter(Response.Body, System.Text.Encoding.UTF8))
        {
            sw.Write(Content);
        }
        return Task.CompletedTask;
    }
}
When you need to open such a CSV file in Excel, simply return a Utf8ForExcelCsvResult:
[HttpGet]
[Route("/progress/data.csv")]
[Produces("text/csv")]
public IActionResult MyFileDownload()
// public Utf8ForExcelCsvResult MyFileDownload()
{
    StringBuilder sb = new StringBuilder();
    sb.AppendLine("æø;2;3;");
    sb.AppendLine("გამარჯობა");
    sb.AppendLine("ဟယ်လို");
    sb.AppendLine("ສະບາຍດີ");
    sb.AppendLine("cześć");
    sb.AppendLine("こんにちは");
    sb.AppendLine("你好");
    Console.WriteLine(sb.ToString());
    return new Utf8ForExcelCsvResult()
    {
        Content = sb.ToString(),
        ContentType = "text/csv",
        FileName = "hello.csv",
    };
}
We can use PowerShell to inspect the hex representation of the CSV file with Format-Hex -Path .\hello.csv:
00 01 02 03 04 05 06 07 08 09 0A 0B 0C 0D 0E 0F
00000000 EF BB BF C3 A6 C3 B8 3B 32 3B 33 3B 0D 0A E1 83 æø;2;3;..á
00000010 92 E1 83 90 E1 83 9B E1 83 90 E1 83 A0 E1 83 AF ááá á¯
00000020 E1 83 9D E1 83 91 E1 83 90 0D 0A E1 80 9F E1 80 ááá..áá
00000030 9A E1 80 BA E1 80 9C E1 80 AD E1 80 AF 0D 0A E0 áºáá­á¯..à
00000040 BA AA E0 BA B0 E0 BA 9A E0 BA B2 E0 BA 8D E0 BA ºªàº°àºàº²àºàº
00000050 94 E0 BA B5 0D 0A 63 7A 65 C5 9B C4 87 0D 0A E3 ີ..czeÅ..ã
00000060 81 93 E3 82 93 E3 81 AB E3 81 A1 E3 81 AF 0D 0A ãã«ã¡ã¯..
00000070 E4 BD A0 E5 A5 BD 0D 0A 你好..
Here the first three bytes, EF BB BF, are the UTF-8 byte order mark (BOM).
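To check the BOM behaviour without a web server, here is a small sketch that writes through a StreamWriter into a MemoryStream and returns the raw bytes. BomDemo and WriteToBytes are names invented for this illustration; only the StreamWriter, MemoryStream and encoding classes are real framework types.

```csharp
using System.IO;
using System.Text;

public static class BomDemo
{
    // Writes `text` through a StreamWriter that uses `encoding`
    // and returns exactly the bytes that reached the stream.
    public static byte[] WriteToBytes(string text, Encoding encoding)
    {
        using (var ms = new MemoryStream())
        {
            // Last argument is leaveOpen: keep the MemoryStream usable after the writer is disposed.
            using (var sw = new StreamWriter(ms, encoding, 1024, true))
            {
                sw.Write(text);
            }
            return ms.ToArray();
        }
    }
}
```

With Encoding.UTF8 the result for "æø" starts with EF BB BF followed by C3 A6 C3 B8, matching the dump above. With new UTF8Encoding(false), which is what StreamWriter uses by default, the same call returns only C3 A6 C3 B8.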

C# - ecc-certificate requested with BouncyCastle seems to be invalid in .NET

As it turned out in the comments to this SO question, the source of the problem lies elsewhere, so I decided to ask a new question.
I request a certificate from our PKI for an ECC key pair (the curve is brainpoolP384r1).
This is done via a registration authority that does the proof of possession. After that I attach the private key to the issued certificate using some of the code in these helpful questions/answers: generate-certificate-using-ecdsa-in-c-sharp and translating-elliptic-curve-parameters-bc-to-ms.
Then I store the certificate with the private key in the MY store. So far everything works, and the certificate is shown as valid in the MMC console.
But if I look at it with certutil -user -store my, it reports the following error (unfortunately the output is in German, but I explain the errors below):
Seriennummer: 4cce6787580be9db
Aussteller: C=DE, O=TestIt, CN=ManagementCA
Nicht vor: 01.03.2018 08:30
Nicht nach: 29.02.2020 08:30
Antragsteller: CN=test@my.domain
Kein Stammzertifikat
Zertifikathash(sha1): 3cd94f55fa6d1c66eff9ed1cc45649006ac12616
Schlüsselcontainer = {5C4E984A-D2DB-4BE5-BD82-7A6826C4A389}
Eindeutiger Containername: 28083d7c2cef0143c31de128e470b486_6097f4ab-4eeb-4550-91e6-2c748bfb85d3
Anbieter = Microsoft Software Key Storage Provider
Der private Schlüssel eignet sich nicht zum Nur-Text-Export.
Öffentlicher Schlüssel des Zertifikats:
Version: 3
Öffentlicher Schlüssel-Algorithmus:
Algorithmus Objekt-ID: 1.2.840.10045.2.1 ECC
Algorithmusparameter:
0000 30 82 01 40 02 01 01 30 3c 06 07 2a 86 48 ce 3d
0010 01 01 02 31 00 8c b9 1e 82 a3 38 6d 28 0f 5d 6f
0020 7e 50 e6 41 df 15 2f 71 09 ed 54 56 b4 12 b1 da
0030 19 7f b7 11 23 ac d3 a7 29 90 1d 1a 71 87 47 00
0040 13 31 07 ec 53 30 64 04 30 7b c3 82 c6 3d 8c 15
0050 0c 3c 72 08 0a ce 05 af a0 c2 be a2 8e 4f b2 27
0060 87 13 91 65 ef ba 91 f9 0f 8a a5 81 4a 50 3a d4
0070 eb 04 a8 c7 dd 22 ce 28 26 04 30 04 a8 c7 dd 22
0080 ce 28 26 8b 39 b5 54 16 f0 44 7c 2f b7 7d e1 07
0090 dc d2 a6 2e 88 0e a5 3e eb 62 d5 7c b4 39 02 95
00a0 db c9 94 3a b7 86 96 fa 50 4c 11 04 61 04 1d 1c
00b0 64 f0 68 cf 45 ff a2 a6 3a 81 b7 c1 3f 6b 88 47
00c0 a3 e7 7e f1 4f e3 db 7f ca fe 0c bd 10 e8 e8 26
00d0 e0 34 36 d6 46 aa ef 87 b2 e2 47 d4 af 1e 8a be
00e0 1d 75 20 f9 c2 a4 5c b1 eb 8e 95 cf d5 52 62 b7
00f0 0b 29 fe ec 58 64 e1 9c 05 4f f9 91 29 28 0e 46
0100 46 21 77 91 81 11 42 82 03 41 26 3c 53 15 02 31
0110 00 8c b9 1e 82 a3 38 6d 28 0f 5d 6f 7e 50 e6 41
0120 df 15 2f 71 09 ed 54 56 b3 1f 16 6e 6c ac 04 25
0130 a7 cf 3a b6 af 6b 7f c3 10 3b 88 32 02 e9 04 65
0140 65 02 01 01
Länge des öffentlichen Schlüssels: 384 Bits
Öffentlicher Schlüssel: Nicht verwendete Bits = 0
0000 04 0e e2 21 a3 24 11 58 28 f9 12 fe 7a 2d 26 5f
0010 ad 90 cc 79 1c b6 68 3a b0 ff f2 df 68 17 84 cd
0020 5f a7 9e 27 10 00 ea 6a 47 d2 74 9f c4 15 36 d1
0030 98 5e 65 5b 2e 7e 61 d4 16 85 ed 3f 24 6b c1 2c
0040 ef 48 b2 26 77 2b c3 61 05 44 e3 1c 2a 31 cb c1
0050 f6 e1 cc a2 d6 3e d8 ac 36 8f ea e7 df 7d b0 9d
0060 9d
Schlüssel-ID-Hash(rfc-sha1): e90f7f6c93e660db6742585d6dd5327f08e2469b
Schlüssel-ID-Hash(sha1): 1a0440ec89a6b951169c97d7d766c477c5a9128d
Schlüssel-ID-Hash(bcrypt-sha1): 66a3af2d30d59c36337fcf153693e6bc0111c14b
Schlüssel-ID-Hash(bcrypt-sha256): cfdf5d9b466f595f9c23dec4430a19b5134f41c882bea709ab120ed1c2496ce9
Container des öffentlichen Schlüssels:
Öffentlicher Schlüssel-Algorithmus:
Algorithmus Objekt-ID: 1.2.840.10045.2.1 ECC
Algorithmusparameter:
06 09 2b 24 03 03 02 08 01 01 0b
1.3.36.3.3.2.8.1.1.11 brainpoolP384r1
Länge des öffentlichen Schlüssels: 384 Bits
Öffentlicher Schlüssel: Nicht verwendete Bits = 0
0000 04 0e e2 21 a3 24 11 58 28 f9 12 fe 7a 2d 26 5f
0010 ad 90 cc 79 1c b6 68 3a b0 ff f2 df 68 17 84 cd
0020 5f a7 9e 27 10 00 ea 6a 47 d2 74 9f c4 15 36 d1
0030 98 5e 65 5b 2e 7e 61 d4 16 85 ed 3f 24 6b c1 2c
0040 ef 48 b2 26 77 2b c3 61 05 44 e3 1c 2a 31 cb c1
0050 f6 e1 cc a2 d6 3e d8 ac 36 8f ea e7 df 7d b0 9d
0060 9d
Schlüssel-ID-Hash(rfc-sha1): e90f7f6c93e660db6742585d6dd5327f08e2469b
Schlüssel-ID-Hash(sha1): 79d60ac0a75a30e1ba3f07ccc4dbace00610696c
Schlüssel-ID-Hash(bcrypt-sha1): 66a3af2d30d59c36337fcf153693e6bc0111c14b
Schlüssel-ID-Hash(bcrypt-sha256): cfdf5d9b466f595f9c23dec4430a19b5134f41c882bea709ab120ed1c2496ce9
FEHLER: Öffentlicher Schlüssel stimmt nicht mit gespeichertem Schlüsselsatz überein.
Das Testen der Signatur ist fehlgeschlagen.
The last part translates as:
ERROR: Certificate public key does NOT match stored keyset
Signature test FAILED
You can see that the public key itself is identical; only the AlgorithmIdentifier is different. I do not know where the container public key comes from. The certification request matches the AlgorithmIdentifier of the certificate public key shown above.
If I try to create the CSR with the AlgorithmIdentifier of the container public key shown above, the PKI reports the error "encoded key spec not recognized".
All this leads me to the conclusion that I am doing something wrong with the CSR.
I will not post the whole code, as it spans multiple classes, just the part where the SubjectPublicKeyInfo is constructed, since this seems to be where things go wrong:
Code that creates valid ASN.1 but results in the error above (signature test failed):
byte[] publicKey = ecPublicKeyParameters.Q.GetEncoded();
string base64PublicKey = Convert.ToBase64String(publicKey);
// This base64PublicKey is send to the server, that handles it with the code below:
var ecPars = TeleTrusTNamedCurves.GetByName("brainpoolP384r1");
ECDomainParameters ecDomPars = new ECDomainParameters(
ecPars.Curve,
ecPars.G,
ecPars.N,
ecPars.H,
ecPars.GetSeed());
var curve = ecDomPars.Curve;
byte[] data = Convert.FromBase64String(base64PublicKey);
var ecPoint = curve.DecodePoint(data);
ECPublicKeyParameters publicKey = new ECPublicKeyParameters(ecPoint, ecDomPars);
SubjectPublicKeyInfo publicKeyInfo = SubjectPublicKeyInfoFactory.CreateSubjectPublicKeyInfo(ecPublicKeyParameters);
Code that creates valid ASN.1 but makes the PKI complain:
// Using ECDsaCng directly in the hope that the certificate will be valid for windows:
ECParameters ecParams = ecdsaPair.ExportParameters(false);
ECPoint ecPoint = ecParams.Q;
IEnumerable<byte> blobBytes = ecPoint.X.Concat(ecPoint.Y);
byte[] eccblob = blobBytes.ToArray();
// Sending this to the server where it will be used for SubjectPublicKeyInfo
AlgorithmIdentifier algorithmIdentifier = new AlgorithmIdentifier(new DerObjectIdentifier("1.2.840.10045.2.1"), new DerObjectIdentifier("1.3.36.3.3.2.8.1.1.11"));
SubjectPublicKeyInfo publicKeyInfo = new SubjectPublicKeyInfo(algorithmIdentifier, new DerBitString(eccblob));
How can I create a valid SubjectPublicKeyInfo that satisfies the PKI and the certutil validation?
The problem with the cert-as-created seems to be that the certificate uses explicit curve domain parameters (which the RFCs frown on), and the private key did curve normalization to get back to a named curve, which you seem to have identified given the second approach.
In your "more manual" approach you didn't encode the public key correctly, you need a leading 04 to indicate that you're sending an uncompressed coordinate pair. (04 [x coordinate] [y coordinate]). The CA likely rejected your request because it didn't understand it.
04 0e e2 21 a3 24 11 58 28 f9 12 fe 7a 2d 26 5f
ad 90 cc 79 1c b6 68 3a b0 ff f2 df 68 17 84 cd
5f a7 9e 27 10 00 ea 6a 47 d2 74 9f c4 15 36 d1
98 5e 65 5b 2e 7e 61 d4 16 85 ed 3f 24 6b c1 2c
ef 48 b2 26 77 2b c3 61 05 44 e3 1c 2a 31 cb c1
f6 e1 cc a2 d6 3e d8 ac 36 8f ea e7 df 7d b0 9d
9d
Note the leading 04, and how it has odd length
FWIW: If you're on .NET Core 2.0 you can do this without BouncyCastle via System.Security.Cryptography.X509Certificates.CertificateRequest. That class is also available in the 4.7.2 early access release.
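The missing 04 prefix can be added with a few lines. This is only a sketch of the uncompressed point layout described above (04, then X, then Y); EcPointEncoding and ToUncompressedPoint are hypothetical helper names, not part of BouncyCastle or .NET.

```csharp
using System;

public static class EcPointEncoding
{
    // Builds the uncompressed point encoding: 0x04 || X || Y.
    public static byte[] ToUncompressedPoint(byte[] x, byte[] y)
    {
        if (x == null) throw new ArgumentNullException(nameof(x));
        if (y == null) throw new ArgumentNullException(nameof(y));
        if (x.Length != y.Length)
            throw new ArgumentException("X and Y must have the same length.");

        var result = new byte[1 + x.Length + y.Length];
        result[0] = 0x04;                                       // marker for "uncompressed"
        Buffer.BlockCopy(x, 0, result, 1, x.Length);            // X coordinate
        Buffer.BlockCopy(y, 0, result, 1 + x.Length, y.Length); // Y coordinate
        return result;
    }
}
```

In the question's second snippet, the array passed to DerBitString would then be EcPointEncoding.ToUncompressedPoint(ecPoint.X, ecPoint.Y) instead of the bare X and Y concatenation. For a 384-bit curve that is 1 + 48 + 48 = 97 bytes, which is the odd length visible in the dump.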

Why I am not able to read VarInt field from binary data using protobuf-net

Hi everyone,
I am using the protobuf-net library to serialize and deserialize text data into binary files. I had a similar error in the past, but that time I had made the mistake of writing binary data to a text file. This time I am sure that the file is written in binary mode. When I read the data, I get an EndOfStream exception: Attempted to read past the end of the stream.
I have a message header before each object in the binary file.
message HeaderMessage {
required double timestamp = 1;
required string ric_code = 2;
required int32 count = 3;
required int32 total_message_size = 4;
}
I get the exception when reading the total_message_size field at a fixed location:
HEADER: 1111 1 1 hk 0
File: 398909440 bytes
Reading data objects:
1073561: 09 e3 a5 9b c4 0c b3 e0 40 12 07 31 30 39 33 2e 48 4b 18 04 20 5a
1073677: 09 e3 a5 9b c4 0c b3 e0 40 12 07 30 32 39 37 2e 48 4b 18 02 20 2d
1073748: 09 e3 a5 9b c4 0c b3 e0 40 12 07 30 32 39 37 2e 48 4b 18 04 20 5a
1073864: 09 e3 a5 9b c4 0c b3 e0 40 12 07 38 31 37 33 2e 48 4b 18 02 20 2d
1073935: 09 e3 a5 9b c4 0c b3 e0 40 12 07 38 31 37 33 2e 48 4b 18 04 20 5b
1074052: 09 e3 a5 9b c4 0c b3 e0 40 12 07 30 32 33 35 2e 48 4b 18 02 20 2d
1074123: 09 e3 a5 9b c4 0c b3 e0 40 12 07 30 36 30 33 2e 48 4b 18 02 20 2d
1074194: 09 e3 a5 9b c4 0c b3 e0 40 12 07 30 36 30 33 2e 48 4b 18 04 20 5b
1074311: 09 e3 a5 9b c4 0c b3 e0 40 12 07 30 32 33 35 2e 48 4b 18 06 20 8a
In the above output, the first field is the stream position. The total stream length is 398909440, so it is not possible that the stream has reached its end. When I print the individual fields at the point where reading fails, I see that the ProtoReader class always fails on the total_message_size field.
In the above output, the last row is the culprit, where protobuf-net is not able to read the data:
1074311: 09 e3 a5 9b c4 0c b3 e0 40 12 07 30 32 33 35 2e 48 4b 18 06 20 8a
If we split the fields, the data looks as follows:
field1 timestamp field: type: 09 payload: e3 a5 9b c4 0c b3 e0 40
field2 ric_code field: type: 12 payload: 07 30 32 33 35 2e 48 4b
field3 count field: type: 18 payload: 06
field4 total_message_size: type: 20 payload: 8a
The exception is raised while reading the payload of the 4th field, whose value is 8a (decimal 138).
Stack trace is as follows:
at ProtoBuf.ProtoReader.TryReadUInt32VariantWithoutMoving(Boolean trimNegative, UInt32& value) in C:\Dev\protobuf-net\protobuf-net\ProtoReader.cs:line 101
at ProtoBuf.ProtoReader.ReadUInt32Variant(Boolean trimNegative) in C:\Dev\protobuf-net\protobuf-net\ProtoReader.cs:line 138
at ProtoBuf.ProtoReader.ReadInt32() in C:\Dev\protobuf-net\protobuf-net\ProtoReader.cs:line 264
at protobuf_test.Program.Main(String[] args) in H:\Personal\Visual Studio 2010\Projects\protobuf-test\protobuf-test\Program.cs:line 80
at System.AppDomain._nExecuteAssembly(RuntimeAssembly assembly, String[] args)
at System.AppDomain.ExecuteAssembly(String assemblyFile, Evidence assemblySecurity, String[] args)
at Microsoft.VisualStudio.HostingProcess.HostProc.RunUsersAssembly()
at System.Threading.ThreadHelper.ThreadStart_Context(Object state)
at System.Threading.ExecutionContext.Run(ExecutionContext executionContext, ContextCallback callback, Object state, Boolean ignoreSyncCtx)
at System.Threading.ExecutionContext.Run(ExecutionContext executionContext, ContextCallback callback, Object state)
at System.Threading.ThreadHelper.ThreadStart()
What is the issue with reading the value 138 in this case?
Regards,
Alok
0x8a is not a valid varint. Varint encoding uses the MSB as a continuation bit, meaning: if the MSB is set, there is at least one more byte expected (it continues until the MSB is not set, combining the remaining 7-bit chunks little-endian style). Consequently, 0x8a cannot exist by itself in a valid varint. 0x8a and something else, sure. You can see this in the wire spec. Please ensure you haven't accidentally cut the end off this individual message, or misreported the length (since I gather each record is individually wrapped with a size prefix).
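The continuation-bit rule can be illustrated with a small standalone decoder. This is a sketch written for this explanation, not protobuf-net's actual reader; VarintDemo and ReadVarint32 are made-up names.

```csharp
using System;
using System.IO;

public static class VarintDemo
{
    // Reads one base-128 varint (up to 32 bits) starting at `offset`.
    public static uint ReadVarint32(byte[] data, int offset)
    {
        uint value = 0;
        for (int shift = 0; shift < 35; shift += 7)
        {
            if (offset >= data.Length)
                throw new EndOfStreamException("Varint continues past the end of the data.");

            byte b = data[offset++];
            value |= (uint)(b & 0x7F) << shift;   // low 7 bits carry the payload
            if ((b & 0x80) == 0) return value;    // MSB clear: this was the last byte
        }
        throw new InvalidDataException("Varint is longer than 5 bytes.");
    }
}
```

A single byte 5A decodes to 90, and the two bytes 8A 01 decode to 138. A lone 8A throws, because its high bit promises another byte that is not there. That is consistent with the record in the question having lost the byte after 8A.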

Regex Exclude Pattern

I'm trying to exclude some data from a string using regex.
var match = Regex.Match(text, @"^(24 47(.*?)0D 0A)$");
The idea is to filter out any data starting with "24 47" and ending with "0D 0A".
The source string:
A0 A1 00 02 83 00 83 0D 0A
A0 A1 00 02 84 1B 9F 0D 0A
24 47 50 47 47 41 2C 31 32 31 39 30 37 2E 30 30 30 2C 32 34 30 30 2E 30 30 30 30 2C 4E 2C 31 32 31 30 30 2E 30 30 30 30 2C 45 2C 30 2C 30 30 2C 30 2E 30 2C 30 2E 30 2C 4D 2C 30 2E 30 2C 4D 2C 2C 30 30 30 30 2A 36 35 0D 0A
24 47 50 47 53 41 2C 41 2C 31 2C 2C 2C 2C 2C 2C 2C 2C 2C 2C 2C 2C 2C 30 2E 30 2C 30 2E 30 2C 30 2E 30 2A 33 30 0D 0A
24 47 50 52 4D 43 2C 31 32 31 39 30 37 2E 30 30 30 2C 56 2C 32 34 30 30 2E 30 30 30 30 2C 4E 2C 31 32 31 30 30 2E 30 30 30 30 2C 45 2C 30 30 30 2E 30 2C 30 30 30 2E 30 2C 32 38 30 36 30 36 2C 2C 2C 4E 2A 37 34 0D 0A
24 47 50 56 54 47 2C 30 30 30 2E 30 2C 54 2C 2C 4D 2C 30 30 30 2E 30 2C 4E 2C 30 30 30 2E 30 2C 4B 2C 4E 2A 30 32 0D 0A
But I only want this:
A0 A1 00 02 83 00 83 0D 0A
A0 A1 00 02 84 1B 9F 0D 0A
Your regex requires the entire string to start with "24 47" and end with "0D 0A". You want the multiline option, which makes ^ and $ match the start and end of each line:
var match = Regex.Match(text, @"^24 47(.*)0D 0A$", RegexOptions.Multiline);
If you want to exclude those lines, then use a negative lookahead:
var match = Regex.Match(text, @"^(?!24 47(.*)0D 0A$).*$", RegexOptions.Multiline);
If you want to find and remove delimited substrings anywhere in a long, contiguous string without line breaks, try this:
resultString = Regex.Replace(subjectString, @"\b24 47(.*?)0D 0A\b", "<removed>");
^ matches the start of a string and $ matches the end. If your "24 47" and "0D 0A" can be in the middle of your string, then consider removing ^ and $.
var textFiltered = Regex.Replace(originalText, @"(24 47(.*?)\r\n)", "");
Update: try this, just tested:
string replace = Regex.Replace(input, @"(24 47(.*?)0D 0A *(\r\n)*)", "", RegexOptions.Multiline);
Do you need Replace instead of Match?
text = Regex.Replace(text, @"^(24 47(.*?)0D 0A)$", "");
In order to match the special line characters (CR and LF), you have to set the options to Singleline. And you have to replace with an empty string.
text = Regex.Replace(text, @"^(24 47(.*?)\r\n)$", "", RegexOptions.Singleline);
See here.
You can iterate over the list of strings, try to match the regex ^24 47.*0D 0A $, and select those strings in case of which the match is not successful. Note the extra space before the $. The example strings you gave end with a space.
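The line-by-line approach from the last answer could look roughly like this. It is a sketch under the assumption that the input is one hex dump per line; HexLineFilter and RemoveMatchingLines are illustrative names.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class HexLineFilter
{
    // Matches a whole line that starts with "24 47" and ends with "0D 0A",
    // tolerating trailing whitespace after the last byte.
    private static readonly Regex Excluded = new Regex(@"^24 47.*0D 0A\s*$");

    // Returns `text` without the lines that match the pattern above.
    public static string RemoveMatchingLines(string text)
    {
        var kept = text.Split('\n')
                       .Select(line => line.TrimEnd('\r'))   // cope with CRLF input
                       .Where(line => !Excluded.IsMatch(line));
        return string.Join("\n", kept);
    }
}
```

Splitting first sidesteps the question of how ^ and $ interact with "\r\n" line endings, and the \s* before $ covers the trailing space mentioned above.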

How to store/retrieve RSA public/private key

I want to use RSA public key encryption. What is the best way to store or retrieve private and public keys? Is XML a good idea here?
How to get the keys?
RSAParameters privateKey = RSA.ExportParameters(true);
RSAParameters publicKey = RSA.ExportParameters(false);
Because RSAParameters have the following members: D, DP, DQ, Exponent, InverseQ, Modulus, P, Q
Which one is the key?
I wanted to point out something in response to a comment by ala asking if:
Public Key = modulus + exponent
That is exactly correct. There are a few ways of storing this exponent + modulus. The first attempt at a standard was in RFC 3447 (Public-Key Cryptography Standards (PKCS) #1: RSA Cryptography Specifications Version 2.1), which defines a structure for a public key called RSAPublicKey:
RSAPublicKey ::= SEQUENCE {
modulus INTEGER, -- n
publicExponent INTEGER -- e
}
The same RFC goes on to declare that you should use the DER flavor of ASN.1 encoding to store the public key. I have a sample public key:
publicExponent: 65537 (by convention, nearly all RSA public keys use 65537 as their exponent)
modulus: 0xDC 67 FA F4 9E F2 72 1D 45 2C B4 80 79 06 A0 94 27 50 82 09 DD 67 CE 57 B8 6C 4A 4F 40 9F D2 D1 69 FB 99 5D 85 0C 07 A1 F9 47 1B 56 16 6E F6 7F B9 CF 2A 58 36 37 99 29 AA 4F A8 12 E8 4F C7 82 2B 9D 72 2A 9C DE 6F C2 EE 12 6D CF F0 F2 B8 C4 DD 7C 5C 1A C8 17 51 A9 AC DF 08 22 04 9D 2B D7 F9 4B 09 DE 9A EB 5C 51 1A D8 F8 F9 56 9E F8 FB 37 9B 3F D3 74 65 24 0D FF 34 75 57 A4 F5 BF 55
The DER ASN.1 encoding of this public key is:
30 81 89 ;SEQUENCE (0x89 bytes = 137 bytes)
| 02 81 81 ;INTEGER (0x81 bytes = 129 bytes)
| | 00 ;leading zero of INTEGER
| | DC 67 FA
| | F4 9E F2 72 1D 45 2C B4 80 79 06 A0 94 27 50 82
| | 09 DD 67 CE 57 B8 6C 4A 4F 40 9F D2 D1 69 FB 99
| | 5D 85 0C 07 A1 F9 47 1B 56 16 6E F6 7F B9 CF 2A
| | 58 36 37 99 29 AA 4F A8 12 E8 4F C7 82 2B 9D 72
| | 2A 9C DE 6F C2 EE 12 6D CF F0 F2 B8 C4 DD 7C 5C
| | 1A C8 17 51 A9 AC DF 08 22 04 9D 2B D7 F9 4B 09
| | DE 9A EB 5C 51 1A D8 F8 F9 56 9E F8 FB 37 9B 3F
| | D3 74 65 24 0D FF 34 75 57 A4 F5 BF 55
| 02 03 ;INTEGER (0x03 = 3 bytes)
| | 01 00 01 ;hex for 65537. see it?
If you take that entire above DER ASN.1 encoded modulus+exponent:
30 81 89 02 81 81 00 DC 67 FA
F4 9E F2 72 1D 45 2C B4 80 79 06 A0 94 27 50 82
09 DD 67 CE 57 B8 6C 4A 4F 40 9F D2 D1 69 FB 99
5D 85 0C 07 A1 F9 47 1B 56 16 6E F6 7F B9 CF 2A
58 36 37 99 29 AA 4F A8 12 E8 4F C7 82 2B 9D 72
2A 9C DE 6F C2 EE 12 6D CF F0 F2 B8 C4 DD 7C 5C
1A C8 17 51 A9 AC DF 08 22 04 9D 2B D7 F9 4B 09
DE 9A EB 5C 51 1A D8 F8 F9 56 9E F8 FB 37 9B 3F
D3 74 65 24 0D FF 34 75 57 A4 F5 BF 55 02 03 01
00 01
and you PEM encode it (i.e. base64):
MIGJAoGBANxn+vSe8nIdRSy0gHkGoJQnUIIJ3WfOV7hsSk9An9LRafuZXY
UMB6H5RxtWFm72f7nPKlg2N5kpqk+oEuhPx4IrnXIqnN5vwu4Sbc/w8rjE
3XxcGsgXUams3wgiBJ0r1/lLCd6a61xRGtj4+Vae+Ps3mz/TdGUkDf80dV
ek9b9VAgMBAAE=
It's a convention to wrap that base64 encoded data in:
-----BEGIN RSA PUBLIC KEY-----
MIGJAoGBANxn+vSe8nIdRSy0gHkGoJQnUIIJ3WfOV7hsSk9An9LRafuZXY
UMB6H5RxtWFm72f7nPKlg2N5kpqk+oEuhPx4IrnXIqnN5vwu4Sbc/w8rjE
3XxcGsgXUams3wgiBJ0r1/lLCd6a61xRGtj4+Vae+Ps3mz/TdGUkDf80dV
ek9b9VAgMBAAE=
-----END RSA PUBLIC KEY-----
And that's how you get a PEM DER ASN.1 PKCS#1 RSA public key.
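The base64-and-wrap step can be sketched in a few lines. PemWriter and ToPem are hypothetical names, and the 64-character line width is an assumption of this sketch; the sample above happens to be wrapped at a different width.

```csharp
using System;
using System.Text;

public static class PemWriter
{
    // Base64-encodes `der` and wraps it in BEGIN/END lines for the given label,
    // breaking the base64 text every 64 characters.
    public static string ToPem(string label, byte[] der)
    {
        string b64 = Convert.ToBase64String(der);
        var sb = new StringBuilder();
        sb.Append("-----BEGIN ").Append(label).Append("-----\n");
        for (int i = 0; i < b64.Length; i += 64)
        {
            sb.Append(b64, i, Math.Min(64, b64.Length - i)).Append('\n');
        }
        sb.Append("-----END ").Append(label).Append("-----\n");
        return sb.ToString();
    }
}
```

Passing the DER bytes listed above with the label "RSA PUBLIC KEY" should produce the same base64 text as shown, only with different line breaks. Even the three header bytes 30 81 89 alone already encode to MIGJ, the first four characters of the sample.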
The next standard was RFC 4716 (The Secure Shell (SSH) Public Key File Format). They included an algorithm identifier (ssh-rsa), before the exponent and modulus:
string "ssh-rsa"
mpint e
mpint n
They didn't want to use DER ASN.1 encoding (as it is horrendously complex), and instead opted for 4-byte length prefixing:
00000007 ;7 byte algorithm identifier
73 73 68 2d 72 73 61 ;"ssh-rsa"
00000003 ;3 byte exponent
01 00 01 ;hex for 65,537
00000080 ;128 byte modulus
DC 67 FA F4 9E F2 72 1D 45 2C B4 80 79 06 A0 94
27 50 82 09 DD 67 CE 57 B8 6C 4A 4F 40 9F D2 D1
69 FB 99 5D 85 0C 07 A1 F9 47 1B 56 16 6E F6 7F
B9 CF 2A 58 36 37 99 29 AA 4F A8 12 E8 4F C7 82
2B 9D 72 2A 9C DE 6F C2 EE 12 6D CF F0 F2 B8 C4
DD 7C 5C 1A C8 17 51 A9 AC DF 08 22 04 9D 2B D7
F9 4B 09 DE 9A EB 5C 51 1A D8 F8 F9 56 9E F8 FB
37 9B 3F D3 74 65 24 0D FF 34 75 57 A4 F5 BF 55
Take the entire above byte sequence and base-64 encode it:
AAAAB3NzaC1yc2EAAAADAQABAAAAgNxn+vSe8nIdRSy0gHkGoJQnUIIJ3WfOV7hs
Sk9An9LRafuZXYUMB6H5RxtWFm72f7nPKlg2N5kpqk+oEuhPx4IrnXIqnN5vwu4S
bc/w8rjE3XxcGsgXUams3wgiBJ0r1/lLCd6a61xRGtj4+Vae+Ps3mz/TdGUkDf80
dVek9b9V
And wrap it in the OpenSSH header and trailer:
---- BEGIN SSH2 PUBLIC KEY ----
AAAAB3NzaC1yc2EAAAADAQABAAAAgNxn+vSe8nIdRSy0gHkGoJQnUIIJ3WfOV7hs
Sk9An9LRafuZXYUMB6H5RxtWFm72f7nPKlg2N5kpqk+oEuhPx4IrnXIqnN5vwu4S
bc/w8rjE3XxcGsgXUams3wgiBJ0r1/lLCd6a61xRGtj4+Vae+Ps3mz/TdGUkDf80
dVek9b9V
---- END SSH2 PUBLIC KEY ----
Note: That OpenSSH uses four dashes with a space (---- ) rather than five dashes and no space (-----).
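The 4-byte length prefixing shown in the byte layout above is simple to reproduce. The sketch below only covers the layout exactly as listed in this answer and does not try to implement every rule of the SSH mpint type; SshWire, WriteLengthPrefixed and EncodeHeader are names invented for the example.

```csharp
using System;
using System.IO;
using System.Text;

public static class SshWire
{
    // Writes a 4-byte big-endian length followed by the bytes themselves.
    public static void WriteLengthPrefixed(Stream output, byte[] value)
    {
        int n = value.Length;
        output.WriteByte((byte)(n >> 24));
        output.WriteByte((byte)(n >> 16));
        output.WriteByte((byte)(n >> 8));
        output.WriteByte((byte)n);
        output.Write(value, 0, n);
    }

    // Encodes the algorithm name and the exponent, the first two fields of the layout.
    public static byte[] EncodeHeader(string algorithm, byte[] exponent)
    {
        using (var ms = new MemoryStream())
        {
            WriteLengthPrefixed(ms, Encoding.ASCII.GetBytes(algorithm));
            WriteLengthPrefixed(ms, exponent);
            return ms.ToArray();
        }
    }
}
```

For "ssh-rsa" and the exponent 01 00 01 this yields 00 00 00 07 73 73 68 2D 72 73 61 00 00 00 03 01 00 01, and base64-encoding those 18 bytes gives AAAAB3NzaC1yc2EAAAADAQAB, which is how the sample key begins.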
The next standard was RFC 2459 (Internet X.509 Public Key Infrastructure Certificate and CRL Profile). They took the PKCS#1 public key format:
RSAPublicKey ::= SEQUENCE {
modulus INTEGER, -- n
publicExponent INTEGER -- e
}
and extended it to include an algorithm identifier prefix (in case you want to use a public key encryption algorithm other than RSA):
SubjectPublicKeyInfo ::= SEQUENCE {
algorithm AlgorithmIdentifier,
subjectPublicKey RSAPublicKey }
The "Algorithm Identifier" for RSA is 1.2.840.113549.1.1.1, which comes from:
1 - ISO assigned OIDs
1.2 - ISO member body
1.2.840 - USA
1.2.840.113549 - RSADSI
1.2.840.113549.1 - PKCS
1.2.840.113549.1.1 - PKCS-1
1.2.840.113549.1.1.1 - rsaEncryption
X.509 is an awful standard that defines a horribly complicated way of encoding an OID into hex, but in the end the DER ASN.1 encoding of an X.509 SubjectPublicKeyInfo RSA public key is:
30 81 9F ;SEQUENCE (0x9f bytes = 159 bytes)
| 30 0D ;SEQUENCE (0x0d bytes = 13 bytes)
| | 06 09 ;OBJECT_IDENTIFIER (0x09 = 9 bytes)
| | 2A 86 48 86 ;Hex encoding of 1.2.840.113549.1.1.1
| | F7 0D 01 01 01
| | 05 00 ;NULL (0 bytes)
| 03 81 8D 00 ;BIT STRING (0x8d bytes = 141 bytes)
| | 30 81 89 ;SEQUENCE (0x89 bytes = 137 bytes)
| | | 02 81 81 ;INTEGER (0x81 bytes = 129 bytes)
| | | 00 ;leading zero of INTEGER
| | | DC 67 FA
| | | F4 9E F2 72 1D 45 2C B4 80 79 06 A0 94 27 50 82
| | | 09 DD 67 CE 57 B8 6C 4A 4F 40 9F D2 D1 69 FB 99
| | | 5D 85 0C 07 A1 F9 47 1B 56 16 6E F6 7F B9 CF 2A
| | | 58 36 37 99 29 AA 4F A8 12 E8 4F C7 82 2B 9D 72
| | | 2A 9C DE 6F C2 EE 12 6D CF F0 F2 B8 C4 DD 7C 5C
| | | 1A C8 17 51 A9 AC DF 08 22 04 9D 2B D7 F9 4B 09
| | | DE 9A EB 5C 51 1A D8 F8 F9 56 9E F8 FB 37 9B 3F
| | | D3 74 65 24 0D FF 34 75 57 A4 F5 BF 55
| | | 02 03 ;INTEGER (0x03 = 3 bytes)
| | | 01 00 01 ;hex for 65537. see it?
You can see in the decoded ASN.1 how they just prefixed the old RSAPublicKey with an OBJECT_IDENTIFIER.
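For the curious, the OID-to-hex step is less mysterious once written down. The following sketch encodes only the content bytes of an OBJECT IDENTIFIER (the part after the 06 tag and the length byte); OidEncoding and its methods are hypothetical helpers, not a library API.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class OidEncoding
{
    // Encodes a dotted OID into the content bytes of a DER OBJECT IDENTIFIER
    // (without the 06 tag and the length byte).
    public static byte[] EncodeContent(string dottedOid)
    {
        long[] arcs = dottedOid.Split('.').Select(s => long.Parse(s)).ToArray();
        if (arcs.Length < 2) throw new ArgumentException("An OID needs at least two arcs.");

        var bytes = new List<byte>();
        AppendBase128(bytes, arcs[0] * 40 + arcs[1]);   // the first two arcs share one value
        for (int i = 2; i < arcs.Length; i++) AppendBase128(bytes, arcs[i]);
        return bytes.ToArray();
    }

    // Appends `value` in base 128, most significant group first;
    // every byte except the last has its high bit set.
    private static void AppendBase128(List<byte> output, long value)
    {
        var groups = new Stack<byte>();
        groups.Push((byte)(value & 0x7F));
        while ((value >>= 7) > 0) groups.Push((byte)((value & 0x7F) | 0x80));
        output.AddRange(groups);   // a Stack enumerates from the most recently pushed item
    }
}
```

EncodeContent("1.2.840.113549.1.1.1") returns 2A 86 48 86 F7 0D 01 01 01, the nine bytes in the dump above.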
Taking the above bytes and PEM (i.e. base-64) encoding them:
MIGfMA0GCSqGSIb3DQEBAQUAA4GNADCBiQKBgQDcZ/r0nvJyHUUstIB5BqCUJ1CC
Cd1nzle4bEpPQJ/S0Wn7mV2FDAeh+UcbVhZu9n+5zypYNjeZKapPqBLoT8eCK51y
Kpzeb8LuEm3P8PK4xN18XBrIF1GprN8IIgSdK9f5SwnemutcURrY+PlWnvj7N5s/
03RlJA3/NHVXpPW/VQIDAQAB
The standard is then to wrap this with a header similar to RSA PKCS#1, but without the "RSA" (since it could be something other than RSA):
-----BEGIN PUBLIC KEY-----
MIGfMA0GCSqGSIb3DQEBAQUAA4GNADCBiQKBgQDcZ/r0nvJyHUUstIB5BqCUJ1CC
Cd1nzle4bEpPQJ/S0Wn7mV2FDAeh+UcbVhZu9n+5zypYNjeZKapPqBLoT8eCK51y
Kpzeb8LuEm3P8PK4xN18XBrIF1GprN8IIgSdK9f5SwnemutcURrY+PlWnvj7N5s/
03RlJA3/NHVXpPW/VQIDAQAB
-----END PUBLIC KEY-----
And that's how you invent an X.509 SubjectPublicKeyInfo/OpenSSL PEM public key format.
The list of standard formats for an RSA public key doesn't stop there. Next is the proprietary public key format used by OpenSSH:
ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAAAgNxn+vSe8nIdRSy0gHkGoJQnUIIJ3WfOV7hs Sk9An9LRafuZXYUMB6H5RxtWFm72f7nPKlg2N5kpqk+oEuhPx4IrnXIqnN5vwu4Sbc/w8rjE3XxcGsgXUams3wgiBJ0r1/lLCd6a61xRGtj4+Vae+Ps3mz/TdGUkDf80dVek9b9V
Which is actually the SSH public key format above, but prefixed with ssh-rsa, rather than wrapped in ---- BEGIN SSH2 PUBLIC KEY ----/---- END SSH2 PUBLIC KEY ----.
This is where the ease of the XML RSAKeyValue public key comes in:
Exponent: 0x 010001 base64 encoded is AQAB
Modulus: 0x 00 dc 67 fa f4 9e f2 72 1d 45 2c b4 80 79 06 a0 94 27 50 82 09 dd 67 ce 57 b8 6c 4a 4f 40 9f d2 d1 69 fb 99 5d 85 0c 07 a1 f9 47 1b 56 16 6e f6 7f b9 cf 2a 58 36 37 99 29 aa 4f a8 12 e8 4f c7 82 2b 9d 72 2a 9c de 6f c2 ee 12 6d cf f0 f2 b8 c4 dd 7c 5c 1a c8 17 51 a9 ac df 08 22 04 9d 2b d7 f9 4b 09 de 9a eb 5c 51 1a d8 f8 f9 56 9e f8 fb 37 9b 3f d3 74 65 24 0d ff 34 75 57 a4 f5 bf 55 base64 encoded is ANxn+vSe8nIdRSy0gHkGoJQnUIIJ3WfOV7hsSk9An9LRafuZXYUMB6H5RxtWFm72f7nPKlg2N5kpqk+oEuhPx4IrnXIqnN5vwu4Sbc/w8rjE3XxcGsgXUams3wgiBJ0r1/lLCd6a61xRGtj4+Vae+Ps3mz/TdGUkDf80dVek9b9V.
This means the XML is:
<RSAKeyValue>
<Modulus>ANxn+vSe8nIdRSy0gHkGoJQnUIIJ3WfOV7hsSk9An9LRafuZXYUMB6H5RxtWFm72f7nPKlg2N5kpqk+oEuhPx4IrnXIqnN5vwu4Sbc/w8rjE3XxcGsgXUams3wgiBJ0r1/lLCd6a61xRGtj4+Vae+Ps3mz/TdGUkDf80dVek9b9V</Modulus>
<Exponent>AQAB</Exponent>
</RSAKeyValue>
Much simpler. A downside is that it doesn't wrap, copy, and paste as nicely as (i.e. XML is not as user friendly as):
-----BEGIN RSA PUBLIC KEY-----
MIGJAoGBANxn+vSe8nIdRSy0gHkGoJQnUIIJ3WfOV7hsSk9An9LRafuZXY
UMB6H5RxtWFm72f7nPKlg2N5kpqk+oEuhPx4IrnXIqnN5vwu4Sbc/w8rjE
3XxcGsgXUams3wgiBJ0r1/lLCd6a61xRGtj4+Vae+Ps3mz/TdGUkDf80dV
ek9b9VAgMBAAE=
-----END RSA PUBLIC KEY-----
But it makes a great neutral storage format.
See also
Translator, Binary: Great for decoding and encoding base64 data
ASN.1 JavaScript decoder: Great for decoding ASN.1 encoded hex data (that you get from Translator, Binary)
Microsoft ASN.1 Documentation: Describes the Distinguished Encoding Rules (DER) used for ASN.1 structures (you won't find a better set of documentation anywhere else; I would argue Microsoft's is the only real documentation)
What I have done successfully is to store the keys as XML. There are two methods in RSACryptoServiceProvider: ToXmlString and FromXmlString. The ToXmlString will return an XML string containing either just the public key data or both the public and private key data depending on how you set its parameter. The FromXmlString method will populate the RSACryptoServiceProvider with the appropriate key data when provided an XML string containing either just the public key data or both the public and private key data.
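A minimal sketch of that round trip, using the abstract RSA class rather than RSACryptoServiceProvider specifically (RsaXmlDemo and its method names are made up here). Note that on some older .NET Core versions ToXmlString and FromXmlString are not supported and throw PlatformNotSupportedException.

```csharp
using System;
using System.Security.Cryptography;

public static class RsaXmlDemo
{
    // Exports only the public part of `rsa` as an <RSAKeyValue> XML string.
    public static string ExportPublicXml(RSA rsa)
    {
        return rsa.ToXmlString(false);   // false: do not include private parameters
    }

    // Creates a new RSA object and loads the key material from `xml`.
    // The caller is responsible for disposing the returned object.
    public static RSA ImportXml(string xml)
    {
        RSA rsa = RSA.Create();
        rsa.FromXmlString(xml);
        return rsa;
    }
}
```

Passing true to ToXmlString would include the private parameters as well, so that variant of the XML needs to be protected like any other private key file.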
Use an existing standard format, like PEM. Your crypto library should provide functions to load and save keys from files in PEM format.
Exponent and Modulus are the Public key. D and Modulus are the Private key. The other values allow faster computation for the holder of the Private key.
The public key is identified by Modulus and Exponent. The private key is identified by the other members.
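One way to see which members belong to which half is to look at what ExportParameters actually fills in. The helper below is a small illustration (RsaParametersDemo and HasPrivateKey are invented names).

```csharp
using System.Security.Cryptography;

public static class RsaParametersDemo
{
    // True when `p` carries private key material (the private exponent D).
    public static bool HasPrivateKey(RSAParameters p)
    {
        return p.D != null;
    }
}
```

For the result of ExportParameters(false), only Modulus and Exponent are set and HasPrivateKey returns false. For ExportParameters(true), D and the other private members are populated as well.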
I suppose the answer from @Ian Boyd is not quite precise: the format should be called SSH2 instead of OpenSSH, since RFC 4716 defines it for SSH2, whereas the OpenSSH format is proprietary:
Note: That OpenSSH uses four dashes with a space (---- ) rather than five dashes and no space (-----).
Is XML a good idea here?
Private keys are normally stored in HSMs or smart cards, which provides good security.
