AmazonS3Client.SelectObjectContentAsync - downloading a large JSON Lines file - unwanted line breaks - C#

I am trying to download the content of a file from an S3 bucket using the SelectObjectContentAsync method from the AWS SDK for C#.
But there are unwanted line breaks (\n) in the middle of the raw data.
Data example:
{"Id":1,"Name":"aaa"}, {"Id":2,"N
\name":"bbb"}
My Code :
var amazonS3Client = new AmazonS3Client(awsAccessKeyId, awsSecretAccessKey, region);
SelectObjectContentRequest selectObjectContentRequest = new SelectObjectContentRequest()
{
Bucket = bucketName,
Key = key,
ExpressionType = ExpressionType.SQL,
Expression = query,
InputSerialization = new InputSerialization()
{
JSON = new JSONInput()
{
JsonType = JsonType.Lines
},
CompressionType = CompressionType.Gzip
},
OutputSerialization = new OutputSerialization()
{
JSON = new JSONOutput()
{
RecordDelimiter = ","
}
}
};
using (var content = amazonS3Client.SelectObjectContentAsync(selectObjectContentRequest).Result.Payload)
{
foreach (var item in content)
{
if (item is RecordsEvent recordsEvent)
{
using (var reader = new StreamReader(recordsEvent.Payload, Encoding.UTF8))
{
using (var file = new StreamWriter(path, true))
{
file.WriteLine(reader.ReadToEnd());
}
}
}
}
}
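One possible cause, judging only from the code above: each RecordsEvent payload is a chunk of the result stream, and a chunk may end in the middle of a record. Calling WriteLine once per chunk then appends a line terminator wherever a chunk happens to end. The sketch below makes no S3 call; the chunk strings and the ChunkJoinDemo class are made up for illustration, and it only compares Write with WriteLine over in-memory chunks.

```csharp
using System;
using System.IO;

public static class ChunkJoinDemo
{
    // Mirrors the question's loop: every chunk is followed by a line terminator.
    public static string JoinWithWriteLine(string[] chunks)
    {
        using (var writer = new StringWriter { NewLine = "\n" })
        {
            foreach (var chunk in chunks)
            {
                writer.WriteLine(chunk);
            }
            return writer.ToString();
        }
    }

    // Writes the chunks back to back, so a record split across chunks is rejoined.
    public static string JoinWithWrite(string[] chunks)
    {
        using (var writer = new StringWriter())
        {
            foreach (var chunk in chunks)
            {
                writer.Write(chunk);
            }
            return writer.ToString();
        }
    }

    public static void Main()
    {
        // Hypothetical chunks: the second record is split across two payloads.
        var chunks = new[] { "{\"Id\":1,\"Name\":\"aaa\"},{\"Id\":2,\"N", "ame\":\"bbb\"}," };
        Console.WriteLine(JoinWithWriteLine(chunks)); // line break inside the second record
        Console.WriteLine(JoinWithWrite(chunks));     // records intact
    }
}
```

If this is indeed the cause, the equivalent change in the question's loop would be file.Write(...) instead of file.WriteLine(...), ideally with the output file opened once outside the foreach.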

Related

Double values get converted to string values when the double values have decimal points

I am trying to use CSVHelper library to write data into a MemoryStream and then generate a CSV file using that MemoryStream.
The problem is that double values get converted to weird string values when they have decimal points. The expected output and the weird output are shown at the bottom.
Does anyone know how to overcome this issue? Or is there a mistake in the code below?
public class Foo
{
public Foo()
{
}
public double valOne { get; set; }
public double valTwo { get; set; }
}
public class FooMap : ClassMap<Foo>
{
public FooMap()
{
Map(m => m.valOne).Index(0).Name("Val One");
Map(m => m.valTwo).Index(1).Name("Val Two");
}
}
var records = new List<Foo> {
new Foo{valOne = 3224.12, valTwo = 4122},
new Foo{valOne = 2030.20, valTwo = 5555},
};
var config = new CsvConfiguration(CultureInfo.CurrentCulture) { Delimiter = ",", HasHeaderRecord = true };
using (var memoryStream = new MemoryStream())
using (var writer = new StreamWriter(memoryStream))
using (var csv = new CsvWriter(writer, config))
{
csv.Context.RegisterClassMap<FooMap>();
csv.WriteHeader<Foo>();
csv.NextRecord();
foreach (var record in records)
{
csv.WriteRecord(record);
csv.NextRecord();
}
writer.Flush();
var result = Encoding.UTF8.GetString(memoryStream.ToArray());
byte[] bytes = Encoding.ASCII.GetBytes(result);
return new FileContentResult(bytes, "text/csv")
{
FileDownloadName = "Sample_Report_Name"
};
}
Expected Output:
Val One, Val Two
3224.12,4122
2030.20,5555
Weird Output:
Val One, Val Two
"3224,12",4122
"2030,20",5555
The issue is that the CurrentCulture of the computer running the code uses a comma instead of a period as the decimal separator. Because the formatted value then contains the delimiter, the field gets quoted. Using CultureInfo.InvariantCulture instead of CultureInfo.CurrentCulture should fix the formatting issue.
Also, you can simplify your code by using csv.WriteRecords(records).
var records = new List<Foo> {
new Foo{valOne = 3224.12, valTwo = 4122},
new Foo{valOne = 2030.20, valTwo = 5555},
};
var config = new CsvConfiguration(CultureInfo.InvariantCulture) { Delimiter = ",", HasHeaderRecord = true };
using (var memoryStream = new MemoryStream())
using (var writer = new StreamWriter(memoryStream))
using (var csv = new CsvWriter(writer, config))
{
csv.Context.RegisterClassMap<FooMap>();
csv.WriteRecords(records);
writer.Flush();
var result = Encoding.UTF8.GetString(memoryStream.ToArray());
byte[] bytes = Encoding.ASCII.GetBytes(result);
return new FileContentResult(bytes, "text/csv")
{
FileDownloadName = "Sample_Report_Name"
};
}
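To see the culture effect in isolation, here is a minimal sketch that uses only the base class library, not CsvHelper. The CultureFormatDemo class and the comma-decimal NumberFormatInfo are assumptions made for the demo; they stand in for whatever CurrentCulture the affected machine has.

```csharp
using System;
using System.Globalization;

public static class CultureFormatDemo
{
    // Stand-in for a culture whose decimal separator is a comma (an assumption for this demo).
    public static NumberFormatInfo CommaDecimalFormat()
    {
        var format = (NumberFormatInfo)NumberFormatInfo.InvariantInfo.Clone();
        format.NumberDecimalSeparator = ",";
        return format;
    }

    // Formats the value with the given provider, as a writer would when converting a double to text.
    public static string Format(double value, IFormatProvider provider)
    {
        return value.ToString(provider);
    }

    public static void Main()
    {
        Console.WriteLine(Format(3224.12, CommaDecimalFormat()));         // 3224,12
        Console.WriteLine(Format(3224.12, CultureInfo.InvariantCulture)); // 3224.12
    }
}
```

The first value contains a comma, which collides with a "," delimiter; the second does not.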

csvhelper: how to write a specific "cell" on an existing csv on C#?

I have a customer list in csv format which I'm using to send out emails. I would like to write to the CSV after each row has been executed in order to place a conditional rule. I'm using csvhelper to manipulate the file. Here's the code:
var scan = new StreamReader(myBlob);
var csvv = new CsvReader(scan, CultureInfo.InvariantCulture);
var records = csvv.GetRecords<Records>().ToList();
var scanwriter = new StreamWriter(myBlob4);
var csvwriter = new CsvWriter(scanwriter, CultureInfo.InvariantCulture);
foreach (Records record in records)
{
var from = new EmailAddress("example.com", "John");
var to = new EmailAddress(record.Email, record.Name);
var subject = "exapmple";
var msg = MailHelper.CreateSingleEmail(from, to, subject, txtf, htmlf);
StringBuilder text = new StringBuilder();
text.AppendFormat("sent", record.EmailSent);
csvwriter.WriteField(record.EmailSent);
csvwriter.NextRecord();
var response = await client.SendEmailAsync(msg);
}
However my csv is not appending the "sent" value to the file under the emailsent column. I'm using StringBuilder which might not be helpful in this scenario.
It seems like you are trying to do something more like this.
void Main()
{
var records = new List<SendEmail>
{
new SendEmail{ Email = "example.com", Name = "John" },
new SendEmail{ Email = "example2.com", Name = "Jenny" }
};
var csvwriter = new CsvWriter(Console.Out, CultureInfo.InvariantCulture);
foreach (var record in records)
{
// var from = new EmailAddress("example.com", "John");
// var to = new EmailAddress(record.Email, record.Name);
//
// var subject = "exapmple";
//
// var msg = MailHelper.CreateSingleEmail(from, to, subject, txtf, htmlf);
record.EmailSent = "sent";
csvwriter.WriteRecord(record);
csvwriter.NextRecord();
//var response = await client.SendEmailAsync(msg);
}
}
public class SendEmail
{
public string Email { get; set; }
public string Name { get; set; }
public string EmailSent { get; set; }
}
// using blocks make sure the streams are disposed and file handles are closed properly,
// even if an exception is thrown
using(var scan = new StreamReader(myBlob))
using (var csvv = new CsvReader(scan, CultureInfo.InvariantCulture))
using (var scanwriter = new StreamWriter(myBlob4))
using (var csvwriter = new CsvWriter(scanwriter, CultureInfo.InvariantCulture))
{
var records = csvv.GetRecords<Records>(); //ToList() was not needed or helpful here
foreach (var record in records)
{
var from = new EmailAddress("example.com", "John");
var to = new EmailAddress(record.Email, record.Name);
var subject = "example";
var msg = MailHelper.CreateSingleEmail(from, to, subject, txtf, htmlf);
csvwriter.WriteField($"sent {record.EmailSent}");
csvwriter.NextRecord();
var response = await client.SendEmailAsync(msg);
}
}
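As a side note on why the original loop never produced a "sent" value: StringBuilder.AppendFormat only substitutes arguments into {0}-style placeholders, and the StringBuilder in the question was never passed to the CSV writer in the first place. A minimal sketch of the first point, using only the base class library (the AppendFormatDemo class is made up for illustration):

```csharp
using System;
using System.Text;

public static class AppendFormatDemo
{
    // No placeholder in the format string, so the argument is silently ignored.
    public static string WithoutPlaceholder(string value)
    {
        var text = new StringBuilder();
        text.AppendFormat("sent", value);
        return text.ToString();
    }

    // With a {0} placeholder the argument is substituted into the text.
    public static string WithPlaceholder(string value)
    {
        var text = new StringBuilder();
        text.AppendFormat("sent {0}", value);
        return text.ToString();
    }

    public static void Main()
    {
        Console.WriteLine(WithoutPlaceholder("yes")); // sent
        Console.WriteLine(WithPlaceholder("yes"));    // sent yes
    }
}
```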

How do I serialize multiple items with foreach using JSON in c#?

I am trying to serialize objects from 2 ListViews. ListView#2 is used to display objects that are grouped, while ListView#1 displays the said groups. (Selecting a group in #1 displays a different set of objects in #2.)
JsonSerializer serializer = new JsonSerializer();
using (StreamWriter sw = new StreamWriter(path + "\\data.txt"))
{
foreach (ListViewItem group in lV_groups.Items)
{
foreach (ListViewItem item in lV_items.Items)
{
List<ItemSerObj> itemsObj = new List<ItemSerObj>()
{
new ItemSerObj
{
ItemName = item.SubItems[0].Text,
Value = item.SubItems[1].Text,
Quality = item.SubItems[2].Text,
TimeStamp = item.SubItems[3].Text
}
};
GroupSerObj serializeGroup = new GroupSerObj
{
GroupName = group.SubItems[0].Text,
UpdateRate = group.SubItems[1].Text,
Active = group.SubItems[2].Text,
Items = itemsObj
};
using (JsonWriter writer = new JsonTextWriter(sw))
{
serializer.Serialize(writer, serializeGroup); //Where exception occurs.
}
}
}
}
I am getting "System.ObjectDisposedException: 'Cannot write to a closed TextWriter'" exception.
Simply change it to something like this:
JsonSerializer serializer = new JsonSerializer();
using (StreamWriter sw = new StreamWriter(path + "\\data.txt"))
{
using (JsonWriter writer = new JsonTextWriter(sw))
{
foreach (ListViewItem group in lV_groups.Items)
{
List<ItemSerObj> itemsObj = new List<ItemSerObj>();
foreach (ListViewItem item in lV_items.Items)
{
itemsObj.Add(
new ItemSerObj
{
ItemName = item.SubItems[0].Text,
Value = item.SubItems[1].Text,
Quality = item.SubItems[2].Text,
TimeStamp = item.SubItems[3].Text
});
}
GroupSerObj serializeGroup = new GroupSerObj
{
GroupName = group.SubItems[0].Text,
UpdateRate = group.SubItems[1].Text,
Active = group.SubItems[2].Text,
Items = itemsObj
};
serializer.Serialize(writer, serializeGroup);
}
}
}
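The exception occurs because disposing the JsonTextWriter at the end of the first iteration also closes the underlying StreamWriter (CloseOutput is true by default), so the next Serialize call writes to a closed writer. Creating the JsonTextWriter once, outside the loops, avoids that. In addition, each group iteration now creates a new list that is filled inside the inner foreach loop; the list is then assigned to the GroupSerObj, which is serialized once per group.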
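The dispose-order problem can be reproduced without Json.NET. The sketch below is only an illustration: ClosingWrapper is a hypothetical stand-in for any writer that closes the TextWriter it wraps when disposed, and DisposeOrderDemo is likewise made up.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical stand-in for a writer that closes the wrapped TextWriter on Dispose.
public sealed class ClosingWrapper : IDisposable
{
    private readonly TextWriter _inner;
    public ClosingWrapper(TextWriter inner) { _inner = inner; }
    public void Write(string value) { _inner.Write(value); }
    public void Dispose() { _inner.Dispose(); }
}

public static class DisposeOrderDemo
{
    // Wrapper created and disposed per item: the second item hits a closed writer.
    public static bool PerItemWrapperThrows(string[] items)
    {
        using (var sw = new StreamWriter(new MemoryStream()))
        {
            try
            {
                foreach (var item in items)
                {
                    using (var wrapper = new ClosingWrapper(sw))
                    {
                        wrapper.Write(item);
                    }
                }
                return false;
            }
            catch (ObjectDisposedException)
            {
                return true;
            }
        }
    }

    // Wrapper created once around the whole loop: every item is written.
    public static string SharedWrapperOutput(string[] items)
    {
        var stream = new MemoryStream();
        using (var sw = new StreamWriter(stream))
        using (var wrapper = new ClosingWrapper(sw))
        {
            foreach (var item in items)
            {
                wrapper.Write(item);
            }
        }
        // ToArray still works after the stream has been closed.
        return Encoding.UTF8.GetString(stream.ToArray());
    }

    public static void Main()
    {
        var items = new[] { "first", "second" };
        Console.WriteLine(PerItemWrapperThrows(items)); // True
        Console.WriteLine(SharedWrapperOutput(items));  // firstsecond
    }
}
```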

Lucene.Net (4.8) AutoComplete / AutoSuggestion

I'd like to implement a searchable index using Lucene.Net 4.8 that supplies a user with suggestions / autocomplete for single words & phrases.
The index has been created successfully; the suggestions are where I've stalled.
Version 4.8 seems to have introduced a substantial number of breaking changes, and none of the available samples I've found work.
Where I stand
For reference, LuceneVersion is this:
private readonly LuceneVersion LuceneVersion = LuceneVersion.LUCENE_48;
Solution 1
I've tried this, but can't get past reader.Terms:
public void TryAutoComplete()
{
var analyzer = new EnglishAnalyzer(LuceneVersion);
var config = new IndexWriterConfig(LuceneVersion, analyzer);
RAMDirectory dir = new RAMDirectory();
using (IndexWriter iw = new IndexWriter(dir, config))
{
Document d = new Document();
TextField f = new TextField("text","",Field.Store.YES);
d.Add(f);
f.SetStringValue("abc");
iw.AddDocument(d);
f.SetStringValue("colorado");
iw.AddDocument(d);
f.SetStringValue("coloring book");
iw.AddDocument(d);
iw.Commit();
using (IndexReader reader = iw.GetReader(false))
{
TermEnum terms = reader.Terms(new Term("text", "co"));
int maxSuggestsCpt = 0;
// will print:
// colorado
// coloring book
do
{
Console.WriteLine(terms.Term.Text);
maxSuggestsCpt++;
if (maxSuggestsCpt >= 5)
break;
}
while (terms.Next() && terms.Term.Text.StartsWith("co"));
}
}
}
reader.Terms no longer exists. Being new to Lucene, it's unclear how to refactor this.
Solution 2
Trying this, I'm thrown an error:
public void TryAutoComplete2()
{
using(var analyzer = new EnglishAnalyzer(LuceneVersion))
{
IndexWriterConfig config = new IndexWriterConfig(LuceneVersion, analyzer);
RAMDirectory dir = new RAMDirectory();
using(var iw = new IndexWriter(dir,config))
{
Document d = new Document()
{
new TextField("text", "this is a document with a some words",Field.Store.YES),
new Int32Field("id", 42, Field.Store.YES)
};
iw.AddDocument(d);
iw.Commit();
using (IndexReader reader = iw.GetReader(false))
using (SpellChecker speller = new SpellChecker(new RAMDirectory()))
{
//ERROR HERE!!!
speller.IndexDictionary(new LuceneDictionary(reader, "text"), config, false);
string[] suggestions = speller.SuggestSimilar("dcument", 5);
IndexSearcher searcher = new IndexSearcher(reader);
foreach (string suggestion in suggestions)
{
TopDocs docs = searcher.Search(new TermQuery(new Term("text", suggestion)), null, Int32.MaxValue);
foreach (var doc in docs.ScoreDocs)
{
System.Diagnostics.Debug.WriteLine(searcher.Doc(doc.Doc).Get("id"));
}
}
}
}
}
}
When debugging, speller.IndexDictionary(new LuceneDictionary(reader, "text"), config, false); throws a "The object cannot be set twice!" error, which I can't explain.
Any thoughts are welcome.
Clarification
I'd like to return a list of suggested terms for a given input, not the documents or their full content.
For example, if a document contains "Hello, my name is Clark. I'm from Atlanta," and I submit "Atl," then "Atlanta" should come back as a suggestion.
If I am understanding you correctly you may be over-complicating your index design a bit. If your goal is to use Lucene for auto-complete, you want to create an index of the terms you consider complete. Then simply query the index using a PrefixQuery using a partial word or phrase.
using Lucene.Net.Analysis;
using Lucene.Net.Analysis.En;
using Lucene.Net.Documents;
using Lucene.Net.Index;
using Lucene.Net.Search;
using Lucene.Net.Store;
using Lucene.Net.Util;
using System;
using System.Linq;
namespace LuceneDemoApp
{
class LuceneAutoCompleteIndex : IDisposable
{
const LuceneVersion Version = LuceneVersion.LUCENE_48;
RAMDirectory Directory;
Analyzer Analyzer;
// An IndexWriterConfig cannot be shared between IndexWriter instances (reusing one
// throws "The object cannot be set twice!"), so build a fresh config for each writer.
private IndexWriterConfig CreateWriterConfig()
{
return new IndexWriterConfig(Version, Analyzer) { OpenMode = OpenMode.CREATE_OR_APPEND };
}
// Adds one complete term as its own document; StringField stores the value as a single, unanalyzed token.
private void IndexDoc(IndexWriter writer, string term)
{
Document doc = new Document();
doc.Add(new StringField(FieldName, term, Field.Store.YES));
writer.AddDocument(doc);
}
public LuceneAutoCompleteIndex(string fieldName, int maxResults)
{
FieldName = fieldName;
MaxResults = maxResults;
Directory = new RAMDirectory();
Analyzer = new EnglishAnalyzer(Version);
}
public string FieldName { get; }
public int MaxResults { get; set; }
public void Add(string term)
{
using (var writer = new IndexWriter(Directory, CreateWriterConfig()))
{
IndexDoc(writer, term);
}
}
public void AddRange(string[] terms)
{
using (var writer = new IndexWriter(Directory, CreateWriterConfig()))
{
foreach (string term in terms)
{
IndexDoc(writer, term);
}
}
}
public string[] WhereStartsWith(string term)
{
using (var reader = DirectoryReader.Open(Directory))
{
IndexSearcher searcher = new IndexSearcher(reader);
var query = new PrefixQuery(new Term(FieldName, term));
TopDocs foundDocs = searcher.Search(query, MaxResults);
var matches = foundDocs.ScoreDocs
.Select(scoreDoc => searcher.Doc(scoreDoc.Doc).Get(FieldName))
.ToArray();
return matches;
}
}
public void Dispose()
{
Directory.Dispose();
Analyzer.Dispose();
}
}
}
Running this:
var indexValues = new string[] { "apple fruit", "appricot", "ape", "avacado", "banana", "pear" };
var index = new LuceneAutoCompleteIndex("fn", 10);
index.AddRange(indexValues);
var matches = index.WhereStartsWith("app");
foreach (var match in matches)
{
Console.WriteLine(match);
}
You get this:
apple fruit
appricot
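For comparison, the same lookup can be sketched without Lucene as a plain in-memory prefix filter. This is not how Lucene works internally and PrefixLookupDemo is a made-up name; it only illustrates the matching rule. As far as I know, a StringField value is indexed verbatim as one token and the PrefixQuery term is not analyzed, so the match is case-sensitive in the same way the ordinal comparison below is.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class PrefixLookupDemo
{
    // Returns up to maxResults terms that start with the prefix (case-sensitive, ordinal comparison).
    public static string[] WhereStartsWith(IEnumerable<string> terms, string prefix, int maxResults)
    {
        return terms
            .Where(t => t.StartsWith(prefix, StringComparison.Ordinal))
            .Take(maxResults)
            .ToArray();
    }

    public static void Main()
    {
        // Same sample values as in the answer above, spelling included.
        var values = new[] { "apple fruit", "appricot", "ape", "avacado", "banana", "pear" };
        foreach (var match in WhereStartsWith(values, "app", 10))
        {
            Console.WriteLine(match);
        }
    }
}
```

A prefix of "App" returns nothing here, so input would need to be normalized (for example lower-cased) the same way at indexing time and at query time.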

How to exclude header when writing data to CSV

I am writing data from a public class to a CSV file. Because I am appending to an existing file, I want to skip the header and write only the data from the class. My code below writes both the header and the data. Hope to get help. Thanks.
Record.cs - my class
public class Record
{
public string Name
{
get; set;
}
public DateTime DateOfBirth
{
get; set;
}
}
Form1.cs - my form
public partial class Form1 : Form
{
private List<Record> records;
public Form1()
{
InitializeComponent();
records = new List<Record>();
}
private void Savetocsv_Click(object sender, EventArgs e)
{
var myDocument = Environment.GetFolderPath(Environment.SpecialFolder.MyDocuments);
using (var writer = new StreamWriter(myDocument + "/my-data.csv", append: true))
{
using (var csv = new CsvWriter(writer))
{
csv.WriteRecords(records);
}
}
}
}
Using the configuration, you can set the HasHeaderRecord property:
HasHeaderRecord :
Gets or sets a value indicating if the CSV file has a header record.
Default is true.
var records = new List<Foo>
{
new Foo { Id = 1, Name = "one" },
new Foo { Id = 1, Name = "one" },
};
using (var writer = new StreamWriter($"file.csv"))
using (var csv = new CsvWriter(writer, new Configuration { HasHeaderRecord = false }))
{
csv.WriteRecords(records);
}
Result file : "file.csv"
1;one
1;one
Or simply loop over the records and write them:
var records = new List<Foo>
{
new Foo { Id = 1, Name = "one" },
new Foo { Id = 1, Name = "one" }
};
using (var writer = new StreamWriter($"file.csv"))
using (var csv = new CsvWriter(writer))
{
foreach (var record in records)
{
csv.WriteRecord(record);
csv.NextRecord();
}
}
The name of the configuration class has changed in newer versions of CsvHelper: it is now CsvConfiguration instead of Configuration.
using (var csv = new CsvWriter(outputStream, new CsvConfiguration(CultureInfo.InvariantCulture)
{
HasHeaderRecord = false
}))
Change your writing method as follows and let the CsvHelper writer configuration do the trick (note HasHeaderRecord):
private void Savetocsv_Click(object sender, EventArgs e)
{
var myDocument = Environment.GetFolderPath(Environment.SpecialFolder.MyDocuments);
using (var writer = new StreamWriter(myDocument + "/my-data.csv", append: true))
{
using (var csv = new CsvWriter(writer, new Configuration { HasHeaderRecord = false }))
{
csv.WriteRecords(records);
}
}
}
I don't know which CsvWriter you are using, but the one here has a HasHeaderRecord property that you can use to ignore or include headers.
private void Savetocsv_Click(object sender, EventArgs e)
{
var myDocument = Environment.GetFolderPath(Environment.SpecialFolder.MyDocuments);
using (var writer = new StreamWriter(myDocument + "/my-data.csv", append: true))
{
using (var csv = new CsvWriter(writer))
{
csv.Configuration.HasHeaderRecord = false; // false skips the header row; true (the default) writes it
csv.WriteRecords(records);
}
}
}
Remove the first row from records before calling:
csv.WriteRecords(records);
(If you need to leave records unchanged, add the headers back again after calling WriteRecords(...).)
private void Savetocsv_Click(object sender, EventArgs e)
{
var myDocument = Environment.GetFolderPath(Environment.SpecialFolder.MyDocuments);
using (var writer = new StreamWriter(myDocument + "/my-data.csv", append: true))
{
using (var csv = new CsvWriter(writer))
{
records.RemoveAt(0); // Removes the header row.
csv.WriteRecords(records);
}
}
}
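When appending, a common refinement is to write the header only if the file is new or empty, so the first run still produces one. The sketch below shows that check with plain StreamWriter calls rather than CsvHelper; AppendCsvDemo, the header text and the row strings are made up for illustration. With CsvHelper, the same boolean could presumably be assigned to HasHeaderRecord.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class AppendCsvDemo
{
    // Appends rows to the file and writes the header only when the file is missing or empty.
    public static void Append(string path, string header, IEnumerable<string> rows)
    {
        bool needsHeader = !File.Exists(path) || new FileInfo(path).Length == 0;
        using (var writer = new StreamWriter(path, append: true))
        {
            if (needsHeader)
            {
                writer.WriteLine(header);
            }
            foreach (var row in rows)
            {
                writer.WriteLine(row);
            }
        }
    }

    public static void Main()
    {
        var path = Path.Combine(Path.GetTempPath(), "my-data-demo.csv");
        File.Delete(path); // start from a clean file for the demo
        Append(path, "Name,DateOfBirth", new[] { "Ann,2000-01-01" });
        Append(path, "Name,DateOfBirth", new[] { "Bob,1999-12-31" });
        Console.WriteLine(File.ReadAllText(path)); // header appears once, followed by both rows
    }
}
```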
