How do you read a UTF8 Arabic text file in Metro? - c#

I'm using the following code to read the contents of the text file. The file is encoded in some sort of Utf8 format:
String File = "ms-appx:///Arabic/file.txt";
contents = await Windows.Storage.PathIO.ReadTextAsync(File, Windows.Storage.Streams.UnicodeEncoding.Utf8);
But the above gives me the error:
WinRT information: No mapping for the Unicode character exists in the target multi-byte code page.
Any ideas what I'm doing wrong here?
Thanks

I had a similar issue reading text files that contained certain characters (’, °, –) and were saved with the "Western European (Windows) - Codepage 1252" encoding.
The solution in my case was to force Visual Studio to save the files using UTF-8 encoding.
Open the file in Visual Studio
File > Advanced Save Options... >
Change the encoding to "Unicode (UTF-8 with signature) - Codepage 65001"
Save the file
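
To illustrate why re-saving helps: a byte that is a valid character in code page 1252 is often an invalid sequence in UTF-8, and a strict UTF-8 reader rejects it. The following is a minimal sketch using plain .NET rather than the WinRT PathIO API; the class and method names are hypothetical.

```csharp
using System;
using System.Text;

static class Utf8Check
{
    // Returns true when the bytes form a valid UTF-8 sequence.
    public static bool IsValidUtf8(byte[] data)
    {
        // Second argument (throwOnInvalidBytes: true) makes the decoder throw
        // instead of silently substituting U+FFFD for bad sequences.
        var strictUtf8 = new UTF8Encoding(false, true);
        try
        {
            strictUtf8.GetString(data, 0, data.Length);
            return true;
        }
        catch (DecoderFallbackException)
        {
            return false;
        }
    }

    static void Main()
    {
        byte[] savedAs1252 = { 0x31, 0x30, 0xB0 };       // "10°" in code page 1252
        byte[] savedAsUtf8 = { 0x31, 0x30, 0xC2, 0xB0 }; // "10°" in UTF-8
        Console.WriteLine(IsValidUtf8(savedAs1252)); // False
        Console.WriteLine(IsValidUtf8(savedAsUtf8)); // True
    }
}
```

A check like this can confirm whether a file really is UTF-8 before passing it to ReadTextAsync.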

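Try using Windows.Storage.Streams.DataReader to load the raw bytes and decode them yourself:
// GetFileAsync expects a path relative to the folder, not an ms-appx:/// URI.
StorageFolder folder =
Windows.ApplicationModel.Package.Current.InstalledLocation;
StorageFile file = await folder.GetFileAsync(@"Arabic\file.txt");
var stream = await file.OpenAsync(FileAccessMode.Read);
Windows.Storage.Streams.DataReader mreader =
new Windows.Storage.Streams.DataReader(stream.GetInputStreamAt(0));
// StorageFile has no Size property; take the length from the opened stream.
byte[] dgram = new byte[stream.Size];
await mreader.LoadAsync((uint)dgram.Length);
mreader.ReadBytes(dgram);
// Decode the loaded bytes as UTF-8 text.
string contents = System.Text.Encoding.UTF8.GetString(dgram, 0, dgram.Length);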
Hope it helps.

Related

RichTextBox shows results in Chinese?

Trying to import a plain-text file with English characters using a RichTextBox in C# with UWP and VS 2017. It imports fine, except that all the characters are Chinese. I have to use the StorageFile class for the file because that's the only one that works with the UWP file privacy restrictions. I tried all the TextSetOptions values with no success and can't find a way to specify the format in either the stream or the RichTextBox. Here's the code:
StorageFile file = await StorageFile.GetFileFromPathAsync(filePath);
IRandomAccessStream stream = await file.OpenAsync(Windows.Storage.FileAccessMode.Read);
/* NOTE: RichTextBox (Name="editor") is defined in Xaml */
editor.Document.LoadFromStream(Windows.UI.Text.TextSetOptions.ApplyRtfDocumentDefaults, stream);
As noted in the comments, this is due to an encoding mismatch. The API expects UTF-16 but you have UTF-8 (or maybe ASCII). Consider using FileIO.ReadTextAsync instead. This should auto-detect the encoding, or if it doesn't there is an overload where you can specify it directly.
Note that if you have a file encoded with an ANSI codepage (not any flavour of Unicode) you'll need to convert it first (check other SO posts).
The UWP RichTextBox expects a UTF-16 ("Unicode") random-access stream, so I just had to re-encode the file contents to match:
string x = await FileIO.ReadTextAsync(file);
byte[] bytes = System.Text.Encoding.Unicode.GetBytes(x);
InMemoryRandomAccessStream randomAccessStream = new InMemoryRandomAccessStream();
await randomAccessStream.WriteAsync(bytes.AsBuffer());
IRandomAccessStream stream2 = randomAccessStream; //await file.OpenAsync(Windows.Storage.FileAccessMode.Read);
editor.Document.LoadFromStream(Windows.UI.Text.TextSetOptions.ApplyRtfDocumentDefaults, stream2);
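
The "Chinese characters" symptom follows directly from the mismatch described above: when UTF-8 (or ASCII) bytes are decoded as UTF-16, each pair of single-byte English letters is fused into one 16-bit code unit, which usually lands in the CJK range. A minimal sketch in plain .NET, with hypothetical class and method names:

```csharp
using System;
using System.Text;

static class MisreadDemo
{
    // Decodes the UTF-8 bytes of `text` as if they were little-endian UTF-16.
    public static string ReadUtf8AsUtf16(string text)
    {
        byte[] utf8Bytes = Encoding.UTF8.GetBytes(text);
        return Encoding.Unicode.GetString(utf8Bytes, 0, utf8Bytes.Length);
    }

    static void Main()
    {
        string misread = ReadUtf8AsUtf16("Hi");
        // 'H' (0x48) and 'i' (0x69) are paired into the single code unit 0x6948,
        // which falls in the CJK Unified Ideographs block.
        Console.WriteLine(((int)misread[0]).ToString("X4")); // 6948
    }
}
```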

Open .prn file that includes image with right Encoding using c#

I need to open a .prn file and replace some strings.
In the .prn file I included an image, that has a string like this:
When I open the .prn file, C# is not able to read the string as it is.
Probably, it misses some encoding, but not sure which one.
I tried different encodings, but without success.
Here is the code that opens the file in read mode:
string text = File.ReadAllText(root + @"testImage.prn");
C# reads that string in this way
and I'm not able to print the file with the image included.
Thanks in advance for your help.
PRN files often use a single-byte ISO encoding such as ISO-8859-1. Try reading the file with System.IO.StreamReader and specify the desired encoding explicitly.
The following example worked perfectly in my case:
System.Text.Encoding encoding = System.Text.Encoding.GetEncoding("ISO-8859-1");
string text;
using (System.IO.StreamReader sr = new System.IO.StreamReader(path, encoding))
{
text = sr.ReadToEnd();
}
In Java, it worked this way for me: Using Stream and charset ISO-8859-1.
Stream<String> stream = Files.lines(Paths.get(filePath), Charset.forName("ISO-8859-1"));
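
One likely reason ISO-8859-1 works here is that it maps every byte value 0 to 255 to exactly one character, so binary image data embedded in the file survives a decode and re-encode unchanged. That this is what affected the asker's file is an assumption. A minimal sketch of the round trip, with hypothetical names:

```csharp
using System;
using System.Text;

static class Latin1RoundTrip
{
    // Decodes all 256 byte values as ISO-8859-1, encodes them again,
    // and reports whether every byte came back unchanged.
    public static bool RoundTripsAllBytes()
    {
        Encoding latin1 = Encoding.GetEncoding("ISO-8859-1");
        byte[] original = new byte[256];
        for (int i = 0; i < original.Length; i++)
        {
            original[i] = (byte)i;
        }

        string text = latin1.GetString(original, 0, original.Length);
        byte[] roundTripped = latin1.GetBytes(text);

        if (roundTripped.Length != original.Length) return false;
        for (int i = 0; i < original.Length; i++)
        {
            if (roundTripped[i] != original[i]) return false;
        }
        return true;
    }

    static void Main()
    {
        Console.WriteLine(RoundTripsAllBytes()); // True
    }
}
```

After replacing strings, the file would need to be written back with the same encoding (for example File.WriteAllText(path, text, encoding)) so the image bytes stay intact.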

Exception reading *.htm file from local app data (Metro App)

I'm using FileIO.ReadTextAsync() to read an *.htm webpage which I have saved into "ms-appdata:///local", using Utf8 encoding.
But I get a System.ArgumentOutOfRangeException when doing it. Additional information is No mapping for the Unicode character exists in the target multi-byte code page.
Reading an ordinary *.txt file using the same function works fine. What am I doing wrong ?
Edit : Code
async private void Button_Click(object sender, RoutedEventArgs e)
{
StorageFile SF = await StorageFile.GetFileFromApplicationUriAsync(new Uri("ms-appdata:///local/test3.html"));
string html = await FileIO.ReadTextAsync(SF, Windows.Storage.Streams.UnicodeEncoding.Utf8);
}
Change the file encoding using Visual Studio. When I opened the file it had the encoding: "Western European (Windows) - Codepage 1252"
Open the file in Visual Studio
File > Advanced Save Options... >
Change the encoding to "Unicode (UTF-8 with signature) - Codepage 65001"
Save the file
Credits: Advanced save options in visual studio

How to convert the encoded file to another with c#

I need to convert a file from the default Windows encoding to another specific encoding, such as "IBM864", and then save the file in the new encoding.
Can anyone help, please?
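Read the input file, passing the source encoding explicitly (without it, File.ReadAllText assumes UTF-8 unless it detects a byte-order mark):
string content = File.ReadAllText(inputFilePath, Encoding.Default); // Encoding.Default is the Windows ANSI code page on .NET Framework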
Write the content with the specified encoding:
Encoding enc = Encoding.GetEncoding(864); //864 is the codepage for IBM864-Arabic (864)
File.WriteAllText(outputFilePath,content,enc);
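
As an alternative sketch, the same conversion can be done at the byte level with Encoding.Convert, which avoids any guessing about the source encoding. The class name, method name, and temp-file demo below are hypothetical, and the demo uses UTF-8 and UTF-16 only because they are available everywhere. For IBM864, pass Encoding.GetEncoding(864) as the target; on .NET Core and .NET 5+ that call additionally requires Encoding.RegisterProvider(CodePagesEncodingProvider.Instance) first.

```csharp
using System;
using System.IO;
using System.Text;

static class FileTranscoder
{
    // Re-encodes the bytes of inputPath from `source` to `target`
    // and writes the result to outputPath.
    public static void Transcode(string inputPath, string outputPath,
                                 Encoding source, Encoding target)
    {
        byte[] sourceBytes = File.ReadAllBytes(inputPath);
        byte[] targetBytes = Encoding.Convert(source, target, sourceBytes);
        File.WriteAllBytes(outputPath, targetBytes);
    }

    static void Main()
    {
        string input = Path.GetTempFileName();
        string output = Path.GetTempFileName();
        try
        {
            // Write a small UTF-8 file (no byte-order mark) to convert.
            File.WriteAllBytes(input, Encoding.UTF8.GetBytes("abc 123"));
            Transcode(input, output, Encoding.UTF8, Encoding.Unicode);
            Console.WriteLine(File.ReadAllText(output, Encoding.Unicode));
        }
        finally
        {
            File.Delete(input);
            File.Delete(output);
        }
    }
}
```

Characters that have no equivalent in the target code page are replaced by the encoder's fallback character, so it is worth checking the output when converting to a small code page.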

How to change text file encoding programmatically using C# in WinRT/Windows store app

I need to change the encoding of some text files from UTF-8 to ASCII programmatically in my Windows Store app project (C#). On WinRT/Win8.1, we can do this manually by opening the file in Notepad and choosing the "Save As" menu, but my question is how to do it in code (C#).
[EDIT]
In WinRT, we can use FileIO.WriteLinesAsync() or FileIO.WriteTextAsync() to save a string to a text file, but we can only specify a UnicodeEncoding value as the encoding. So the SDK is quite different from the full-fledged .NET SDK.
[EDIT]
I know ASCII is a subset of UTF-8, but I really need to make sure the file encoding is ASCII, because I want to upload the file to a web site that only accepts ASCII-encoded txt files (UTF-8/Unicode encoding causes it to complain about a file format error).
[EDIT]
Problem solved:
public async void SaveStringToAnsiFile()
{
StorageFile file = await ApplicationData.Current.LocalFolder.CreateFileAsync("test.txt", CreationCollisionOption.ReplaceExisting);
await Windows.Storage.FileIO.WriteBytesAsync(file, Encoding.GetEncoding("gb2312").GetBytes("abcd→1234"));
}
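Since ASCII isn't directly supported, you'll need to convert the text to a byte array and use something like WriteBytesAsync. Here's a simple technique. Of course, non-ASCII characters won't work (but that's not what you need anyway): characters above U+00FF make Convert.ToByte throw an OverflowException, and those between U+0080 and U+00FF are written as single non-ASCII bytes.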
string str = "these are characters";
byte[] bytes = new byte[str.Length];
for (var i = 0; i < str.Length; i++)
{
bytes[i] = Convert.ToByte(str[i]);
}
// create the file here ... then ...
await Windows.Storage.FileIO.WriteBytesAsync(file, bytes);
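
If the output must be strictly ASCII, the loop above can be tightened to reject anything outside the 7-bit range instead of letting it through. The following is a minimal sketch with hypothetical names; the resulting array would then be passed to WriteBytesAsync as in the answer.

```csharp
using System;

static class AsciiBytes
{
    // Converts text to ASCII bytes, throwing if any character is above U+007F.
    public static byte[] FromString(string text)
    {
        byte[] bytes = new byte[text.Length];
        for (int i = 0; i < text.Length; i++)
        {
            char c = text[i];
            if (c > 0x7F)
            {
                throw new ArgumentException(
                    "Non-ASCII character at index " + i + ".", "text");
            }
            bytes[i] = (byte)c;
        }
        return bytes;
    }

    static void Main()
    {
        byte[] ok = FromString("abcd1234");
        Console.WriteLine(ok.Length); // 8

        try
        {
            FromString("abcd\u21921234"); // contains the arrow character U+2192
        }
        catch (ArgumentException)
        {
            Console.WriteLine("rejected"); // prints "rejected"
        }
    }
}
```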
