Serialize dynamic list to CSV without header in ServiceStack.Text - C#

I'm trying to generate a csv file using CsvSerializer.SerializeToCsv(data), but I want to omit the headers.
I read this question, but this is not working as I'm using a list of dynamic objects.
I've tried:
IEnumerable<dynamic> data = ...;
CsvConfig<object>.OmitHeaders = true;
string csvFile = CsvSerializer.SerializeToCsv(data);
And
IEnumerable<dynamic> data = ...;
CsvConfig<dynamic>.OmitHeaders = true;
string csvFile = CsvSerializer.SerializeToCsv(data);
Both options are serializing the csvFile with headers, which I don't need.

Since I didn't find a way to do this with the library, I opted to do it manually. Something like this worked for me:
var parsedData = new List<string>();
// parse data into comma separated objects
parsedData.AddRange(data.Select(d =>
{
var dProperties = (IDictionary<string, object>)d;
var valuesFixed = dProperties.Values.Select(v => v.ToString().ToRFC4180String());
return string.Join(",", valuesFixed);
}));
var file = string.Join("\r\n", parsedData);
Where ToRFC4180String is just an extension method that handles special characters according to the RFC 4180 standard:
public static string ToRFC4180String(this string value)
{
if(value.Contains("\""))
value = value.Replace("\"", "\"\"");
if(value.Contains("\"")
|| value.Contains("\n")
|| value.Contains("\r")
|| value.Contains("\r\n")
|| value.Contains(","))
return $"\"{value}\"";
return value;
}
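For reference, the manual approach above can be assembled into one self-contained sketch. It assumes every dynamic item is an ExpandoObject (so the cast to IDictionary<string, object> succeeds). The names CsvWithoutHeader, ToRfc4180 and Serialize are made up for illustration, and writing null values as empty fields is a choice of this sketch, not something the original code handled.

```csharp
using System;
using System.Collections.Generic;
using System.Dynamic;
using System.Linq;

public static class CsvWithoutHeader
{
    // Escapes one field per RFC 4180: embedded quotes are doubled, and the
    // field is wrapped in quotes if it contains a quote, comma, CR or LF.
    public static string ToRfc4180(string value)
    {
        if (value == null) return string.Empty; // assumption: null becomes an empty field
        bool needsQuotes = value.IndexOfAny(new[] { '"', ',', '\r', '\n' }) >= 0;
        string escaped = value.Replace("\"", "\"\"");
        return needsQuotes ? "\"" + escaped + "\"" : escaped;
    }

    // Joins the property values of each item with commas and the rows with CRLF.
    // No header row is written.
    public static string Serialize(IEnumerable<dynamic> data)
    {
        var lines = data.Select(d =>
        {
            var properties = (IDictionary<string, object>)d;
            return string.Join(",", properties.Values.Select(v => ToRfc4180(v?.ToString())));
        });
        return string.Join("\r\n", lines);
    }
}
```

Usage would look like `string csvFile = CsvWithoutHeader.Serialize(data);`, where data is the same IEnumerable<dynamic> as in the question.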

Related

C# how to write JSON with enumerated identifiers

I am writing a code to convert Excel to JSON (so far it works).
But I have a problem: I need to number each line that I am writing after the word Match_ (i.e. Match_1, Match_2, Match_3).
If you look towards the end of the code, I tried using a for loop, but then it gives me all Match_i.
How can I use the Replace command so that I can put the corresponding number after the word Match_?
IP = another string I am adding to the sentence. Ignore it.
row[0] = the text taken as-is from the row in the Excel sheet.
Match_ is not a variable; it is literal text. I could also write Oded_ there, and then it would write Oded_ = (IP string) + (Excel text in row[0]).
Match_ is the text I am trying to replace from within the JSON, as I cannot use a for loop inside the LINQ query.
using (var conn = new OleDbConnection(connectionString))
{
conn.Open();
var cmd = conn.CreateCommand();
cmd.CommandText = $"SELECT * FROM [{sheetName}$]";
using (var rdr = cmd.ExecuteReader())
{
if (rdr != null)
{
//LINQ query - when executed will create anonymous objects for each row
var query = rdr.Cast<DbDataRecord>().Select(row => new
{
Match_ = IP + row[0]
});
//Generates JSON from the LINQ query
var json = JsonConvert.SerializeObject(query);
//Write the file to the destination path
for (int i = 1; i<200; i++)
{
json = json.Replace("match_", "match_" + i );
}
File.WriteAllText(destinationPath, json);
}
}
So, after it is assigned, query is an IEnumerable<> of your anonymous type that will have zero to many rows. Those rows are not actually evaluated yet. The important thing to remember is that you are making an anonymous type, not an anonymous object, so every element of your result must be of that type; you can't change the property name from one element to the next.
There are many ways to achieve what you want, but possibly the most expedient is to use the Select overload that also passes the element's index, then return a JObject, something like this:
using Newtonsoft.Json;
using Newtonsoft.Json.Linq;
...
var query = rdr.Cast<DbDataRecord>().Select((row, i) => {
var result = new JObject();
result.Add($"match_{i + 1}", IP + row[0]); // i is zero-based, so add 1 to start at match_1
return result;
});
Then you won't have to do any error-prone and costly string manipulation on your JSON; it will already be formatted correctly.
Here is a full working example of this in action,
using System;
using Newtonsoft.Json;
using Newtonsoft.Json.Linq;
using System.Linq;
public class Program
{
public static void Main()
{
var query = Enumerable
.Range(1,5)
.Select( (n, i) =>
{
var result = new JObject();
result.Add($"match_{i + 1}", n); // i is zero-based, so add 1 to start at match_1
return result;
});
Console.WriteLine(
JsonConvert.SerializeObject(
query,
Formatting.Indented));
}
}
It is possible to do this with the more modern System.Text.Json but you'll have to embed the work in a writer.
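A minimal sketch of what the writer-based System.Text.Json version might look like, assuming the values are plain strings. The class name NumberedJson and the method Write are hypothetical; Utf8JsonWriter and its WriteStartArray, WriteStartObject and WriteString methods are the real API.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Text;
using System.Text.Json;

public static class NumberedJson
{
    // Writes [{"match_1": v1}, {"match_2": v2}, ...] directly with Utf8JsonWriter,
    // so each property name is generated as the element is written.
    public static string Write(IEnumerable<string> values)
    {
        using (var stream = new MemoryStream())
        {
            using (var writer = new Utf8JsonWriter(stream))
            {
                writer.WriteStartArray();
                int i = 1; // numbering starts at 1
                foreach (var value in values)
                {
                    writer.WriteStartObject();
                    writer.WriteString($"match_{i}", value);
                    writer.WriteEndObject();
                    i++;
                }
                writer.WriteEndArray();
            } // disposing the writer flushes the remaining bytes to the stream
            return Encoding.UTF8.GetString(stream.ToArray());
        }
    }
}
```

In the question's code, the input could be something like `rdr.Cast<DbDataRecord>().Select(row => IP + row[0])`.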
Try regex.
using System;
using System.Text.RegularExpressions;
class Program
{
    // Number of matches replaced so far.
    int i = 0;
    static void Main(string[] args)
    {
        string json = "match_ abc match_ def match_ hijmatch_";
        string pattern = "match_";
        Program p = new Program();
        MatchEvaluator myEvaluator = new MatchEvaluator(p.ReplaceCC);
        Regex r = new Regex(pattern);
        string output = r.Replace(json, myEvaluator);
        Console.WriteLine(output);
    }
    // Appends the occurrence number to each regex match.
    public string ReplaceCC(Match m)
    {
        i++;
        return m.Value + i.ToString();
    }
}

How to change the value of a StringBuilder Parsing through JArray

I have struggled to finish this task; if anyone can give me a hint I would be very thankful.
My main task is to get data from the database using FOR JSON AUTO, which is working :)
select field1, field2, field3 from table FOR JSON AUTO;
Then, after connecting to the database, I use a StringBuilder to build a JSON array of objects, which is working :)
var jsonResult = new StringBuilder();
if(!r.HasRows)
{
jsonResult.Append("[]");
}
else
{
while(r.Read())
{
jsonResult.Append(r.GetValue(0).ToString());
}
// JArray array = JArray...
}
After that I am trying to change the value of field1 for each object inside the JSON array:
JArray array = JArray.Parse(jsonResult.ToString());
foreach (JObject obj in array.Children<JObject>())
{
foreach (JProperty singleProp in obj.Properties())
{
string name = singleProp.Name;
string value = singleProp.Value.ToString();
if(name.ToString() == "field1")
{
Int64 newID = 1234;
value = newID.ToString();
}
}
}
This is working, but my BIG QUESTION is: how can I get it changed inside jsonResult?
You simply have to replace the value that you want to update. Since StringBuilder has a built-in Replace method, you can use that.
JArray arr = JArray.Parse(jsonResult.ToString());
foreach (JObject obj in arr.Children<JObject>())
{
    foreach (JProperty singleProp in obj.Properties())
    {
        string name = singleProp.Name;
        string value = singleProp.Value.ToString();
        if (name.ToString().Equals("field1")) //good practice
        {
            Int64 newID = 1234;
            jsonResult.Replace(value, newID.ToString()); //replaces the old value with the new value, updating jsonResult directly
        }
        //not necessary, explanation is given below
        var jsonElement = JsonSerializer.Deserialize<JsonElement>(jsonResult.ToString());
        result = JsonSerializer.Serialize(jsonElement, options);
    }
}
For better formatting, I used JsonSerializer so that the output looks like an indented JSON object rather than one long string without line breaks.
var options = new JsonSerializerOptions()
{
    WriteIndented = true
};
var result = "";
while (r.Read())
{
    jsonResult.Append(r.GetValue(0).ToString());
    // (the code above goes here)
}
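One thing to keep in mind is that StringBuilder.Replace swaps every occurrence of the old text, wherever it appears in the JSON. As an alternative sketch (not the approach described above), the property can be set on the parsed JArray and the StringBuilder rebuilt from it. The class JsonFieldUpdater and its SetField method are hypothetical names; JArray.Parse, the JObject indexer and ToString(Formatting) are Json.NET calls.

```csharp
using System.Text;
using Newtonsoft.Json;
using Newtonsoft.Json.Linq;

public static class JsonFieldUpdater
{
    // Sets fieldName to newValue on every object of the JSON array held in
    // jsonResult, then rewrites jsonResult with the updated JSON.
    public static void SetField(StringBuilder jsonResult, string fieldName, long newValue)
    {
        JArray array = JArray.Parse(jsonResult.ToString());
        foreach (JObject obj in array.Children<JObject>())
        {
            // Only touch objects that already have the property.
            if (obj.Property(fieldName) != null)
            {
                obj[fieldName] = newValue;
            }
        }
        jsonResult.Clear();
        jsonResult.Append(array.ToString(Formatting.None));
    }
}
```

With the question's variables this would be called as `JsonFieldUpdater.SetField(jsonResult, "field1", 1234);`. Passing Formatting.Indented instead of Formatting.None would give indented output.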

Exporting MongoDB Documents to CSV in C#

I want to export a CSV table from the items of an IMongoCollection from MongoDB.Driver using C#.
How would I be able to do this efficiently? I was thinking of doing this by retrieving the documents from the collection and either converting them to a JSON-like format or using a StringBuilder to create the CSV file, using an array of PropertyInfo to access the fields of the retrieved object.
Can someone come up with an example of how I would be able to do this?
The obvious way seems to be to get all the header data somehow (see further below), then iterate through the collection and, if you write the CSV by hand (which people generally discourage), build the lines as strings, writing to the file in batches if your collection is large.
HashSet<string> fields = new HashSet<string>();
BsonDocument query = BsonDocument.Parse(filter);
var result = database.GetCollection<BsonDocument>(collection).Find(new BsonDocument());
// Populate fields with all unique fields, see below for examples how.
var csv = new StringBuilder();
string headerLine = string.Join(",", fields);
csv.AppendLine(headerLine);
foreach (var element in result.ToListAsync().Result)
{
string line = null;
foreach (var field in fields)
{
BsonValue value;
if (field.Contains("."))
{
value = GetNestedField(element, field);
}
else
{
value = element.GetElement(field).Value;
}
// Example deserialize to string
switch (value.BsonType)
{
case BsonType.ObjectId:
line = line + value.ToString();
break;
case BsonType.String:
line = line + value.ToString();
break;
case BsonType.Int32:
line = line + value.AsInt32.ToString();
break;
}
line = line + ",";
}
csv.AppendLine(line);
}
File.WriteAllText("D:\\temp.csv", csv.ToString());
In the case of your own objects you'd have to use your own deserializer.
HOWEVER I'd recommend using the mongoexport tool if you can.
You could simply run the exe from your application, feeding in arguments as required. Keep in mind though, that it requires explicit fields.
ProcessStartInfo startInfo = new ProcessStartInfo();
startInfo.FileName = @"C:\mongodb\bin\mongoexport.exe";
startInfo.Arguments = @"-d testDB -c testCollection --type csv --fields name,address.street,address.zipCode --out .\output.csv";
startInfo.UseShellExecute = false;
Process exportProcess= new Process();
exportProcess.StartInfo = startInfo;
exportProcess.Start();
exportProcess.WaitForExit();
More on mongoexport such as paging, additional queries and field file:
https://docs.mongodb.com/manual/reference/program/mongoexport/
Getting Unique Field Names
In order to find ALL field names you could do this a number of ways. Using BsonDocument as a generic data example.
Recursively traverse through your IMongoCollection results. This is going to have to be through the entire collection, so performance may not be great.
Example:
HashSet<string> fields = new HashSet<string>();
var result = database.GetCollection<BsonDocument>(collection).Find(new BsonDocument());
foreach (var element in result.ToListAsync().Result)
{
ProcessTree(fields, element, "");
}
private void ProcessTree(HashSet<string> fields, BsonDocument tree, string parentField)
{
foreach (var field in tree)
{
string fieldName = field.Name;
if (parentField != "")
{
fieldName = parentField + "." + fieldName;
}
if (field.Value.IsBsonDocument)
{
ProcessTree(fields, field.Value.ToBsonDocument(), fieldName);
}
else
{
fields.Add(fieldName);
}
}
}
Perform a MapReduce operation to return all fields. Scanning nested fields becomes more complex with this method however. See this.
Example:
string map = @"function() {
for (var key in this) { emit(key, null); }
}";
string reduce = @"function(key, stuff) { return null; }";
string finalize = @"function(key, value){
return key;
}";
MapReduceOptions<BsonDocument, BsonValue> options = new MapReduceOptions<BsonDocument, BsonValue>();
options.Finalize = new BsonJavaScript(finalize);
var results = database.GetCollection<BsonDocument>(collection).MapReduceAsync(
new BsonJavaScript(map),
new BsonJavaScript(reduce),
options).Result.ToListAsync().Result;
foreach (BsonValue result in results.Select(item => item["_id"]))
{
Debug.WriteLine(result.AsString);
}
Perform an Aggregation operation. You'd need to unwind as many times as required to get all nested fields.
Example:
string[] pipeline = new string[3];
pipeline[0] = "{ '$project':{ 'arrayofkeyvalue':{ '$objectToArray':'$$ROOT'}}}";
pipeline[1] = "{ '$unwind':'$arrayofkeyvalue'}";
pipeline[2] = "{ '$group':{'_id':null,'fieldKeys':{'$addToSet':'$arrayofkeyvalue.k'}}}";
var stages = pipeline.Select(s => BsonDocument.Parse(s)).ToList();
var result = await database.GetCollection<BsonDocument>(collection).AggregateAsync<BsonDocument>(stages);
foreach (BsonValue fieldName in result.Single().GetElement("fieldKeys").Value.AsBsonArray)
{
Debug.WriteLine(fieldName.AsString);
}
Nothing perfect here and I couldn't tell you which is the most efficient but hopefully something to help.

Json.Net convert complex querystring to JsonString

I am implementing a utility method to convert queryString to JsonString.
My code is as follows:
public static string GetJsonStringFromQueryString(string queryString)
{
var nvs = HttpUtility.ParseQueryString(queryString);
var dict = nvs.AllKeys.ToDictionary(k => k, k => nvs[k]);
return JsonConvert.SerializeObject(dict, new KeyValuePairConverter());
}
When I test with the following code:
var postString = "product[description]=GreatStuff" +
"&product[extra_info]=Extra";
string json = JsonHelper<Product>.GetJsonStringFromQueryString(postString);
I got
{
"product[description]":"GreatStuff",
"product[extra_info]":"Extra",
...
}
what I would like to get is
{
"product":{
"description": "GreatStuff",
"extra_info" : "Extra",
...
}
}
How can I achieve this without using System.Web.Script Assembly? (I am on Xamarin and have no access to that library)
You need to strip the product[...] wrapper from each key, keeping only the inner property name, to get what you want.
That is, you should clean up the keys of the parsed query string before serializing, this way:
string queryString = "product[description]=GreatStuff" +
"&product[extra_info]=Extra";
var queryStringCollection = HttpUtility.ParseQueryString(queryString);
var cleanQueryStringDictionary = queryStringCollection.AllKeys
.ToDictionary
(
key => key.Replace("product[", string.Empty).Replace("]", string.Empty),
key => queryStringCollection[key]
);
var holder = new { product = cleanQueryStringDictionary };
string jsonText = JsonConvert.SerializeObject(holder);
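If the outer name is not always product, the same idea can be generalized. The following is only a sketch: it assumes keys have at most one level of brackets (name[key]), splits the query string by hand instead of using HttpUtility, and uses the hypothetical names QueryStringJson, ToNestedJson and Decode.

```csharp
using System;
using Newtonsoft.Json;
using Newtonsoft.Json.Linq;

public static class QueryStringJson
{
    // Turns "a[b]=1&a[c]=2&d=3" into {"a":{"b":"1","c":"2"},"d":"3"}.
    // Only one level of brackets is handled (an assumption of this sketch).
    public static string ToNestedJson(string queryString)
    {
        var root = new JObject();
        var pairs = queryString.TrimStart('?')
            .Split(new[] { '&' }, StringSplitOptions.RemoveEmptyEntries);
        foreach (var pair in pairs)
        {
            var parts = pair.Split(new[] { '=' }, 2);
            string key = Decode(parts[0]);
            string value = parts.Length > 1 ? Decode(parts[1]) : string.Empty;

            int open = key.IndexOf('[');
            if (open > 0 && key.EndsWith("]"))
            {
                string outer = key.Substring(0, open);
                string inner = key.Substring(open + 1, key.Length - open - 2);
                // Reuse the nested object if it exists, otherwise create it.
                var child = root[outer] as JObject;
                if (child == null)
                {
                    child = new JObject();
                    root[outer] = child;
                }
                child[inner] = value;
            }
            else
            {
                root[key] = value;
            }
        }
        return root.ToString(Formatting.None);
    }

    // Form-style decoding: '+' means space, then percent-escapes are resolved.
    private static string Decode(string text)
    {
        return Uri.UnescapeDataString(text.Replace('+', ' '));
    }
}
```

All values come out as JSON strings, as in the dictionary-based version above.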

Regular Expression with Lambda Expression

I've got several text files which should be tab delimited, but actually are delimited by an arbitrary number of spaces. I want to parse the rows from the text file into a DataTable (the first row of the text file has headers for property names). This got me thinking about building an extensible, easy way to parse text files. Here's my current working solution:
string filePath = @"C:\path\lowbirthweight.txt";
//regex to remove multiple spaces
Regex regex = new Regex(@"[ ]{2,}", RegexOptions.Compiled);
DataTable table = new DataTable();
var reader = ReadTextFile(filePath);
//headers in first row
var headers = reader.First();
//skip headers for data
var data = reader.Skip(1).ToArray();
//remove arbitrary spacing between column headers and table data
headers = regex.Replace(headers, @" ");
for (int i = 0; i < data.Length; i++)
{
data[i] = regex.Replace(data[i], @" ");
}
//make ready the DataTable, split resultant space-delimited string into array for column names
foreach (string columnName in headers.Split(' '))
{
table.Columns.Add(new DataColumn() { ColumnName = columnName });
}
foreach (var record in data)
{
//split into array for row values
table.Rows.Add(record.Split(' '));
}
//test prints correctly to the console
Console.WriteLine(table.Rows[0][2]);
}
static IEnumerable<string> ReadTextFile(string fileName)
{
using (var reader = new StreamReader(fileName))
{
while (!reader.EndOfStream)
{
yield return reader.ReadLine();
}
}
}
In my project I've already received several large (gigabyte-plus) text files that are not in the format in which they are purported to be. So I can see having to write methods such as these with some regularity, albeit with a different regular expression. Is there a way to do something like
data = data.SmartRegex(x => x.AllowOneSpace), where I can use a regular expression to iterate over the collection of strings?
Is something like the following on the right track?
public static class SmartRegex
{
public static Expression AllowOneSpace(this List<string> data)
{
//no idea how to return an expression from a method
}
}
I'm not too overly concerned with performance, just would like to see how something like this works
You should consult with your data source and find out why your data is bad.
As for the API design that you are trying to implement:
public class RegexCollection
{
private readonly Regex _allowOneSpace = new Regex(" ");
public Regex AllowOneSpace { get { return _allowOneSpace; } }
}
public static class RegexExtensions
{
public static IEnumerable<string[]> SmartRegex(
this IEnumerable<string> collection,
Func<RegexCollection, Regex> selector
)
{
var regexCollection = new RegexCollection();
var regex = selector(regexCollection);
return collection.Select(l => regex.Split(l));
}
}
Usage:
var items = new List<string> { "Hello world", "Goodbye world" };
var results = items.SmartRegex(x => x.AllowOneSpace);
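For the files in the question, where columns are separated by an arbitrary number of spaces, the same selector pattern can be tried with a regex that treats a run of spaces as one delimiter. This is a sketch under that assumption; DelimiterPatterns, Spaces and SplitWith are hypothetical names chosen so they do not clash with the classes above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public class DelimiterPatterns
{
    // One or more spaces, so a run of spaces counts as a single delimiter.
    public Regex Spaces { get; } = new Regex(" +", RegexOptions.Compiled);
}

public static class DelimiterExtensions
{
    // Splits every line with the regex picked by the selector.
    // Lines are trimmed first so leading or trailing spaces do not
    // produce empty columns.
    public static IEnumerable<string[]> SplitWith(
        this IEnumerable<string> lines,
        Func<DelimiterPatterns, Regex> selector)
    {
        var regex = selector(new DelimiterPatterns());
        return lines.Select(line => regex.Split(line.Trim()));
    }
}
```

Each resulting string[] could then be passed to table.Rows.Add, for example `foreach (var row in data.SplitWith(x => x.Spaces)) table.Rows.Add(row);`.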
