Replace Value Text Over Multiple Lines - c#

I have multiple text files and I need to replace a string inside each of them with something else. Each file can contain up to 500 occurrences.
In the example below, I need to replace everything after Synopsis with a string of my choosing, such as "not available".
I can read the file, and I even tried using two markers (a start and an end), but it is not working:
List of all books
Romance
Book number: 1
Title : Something Title
Author : Some Author
ISBN: 45425425423
Written Date : Some date
Release Date: Some date
Characters: Blah Blah
Genre: Romance
Synopsis : Blah Blah Blah Blah Blah Blah Blah Blah
Blah Blah Blah Blah Blah Blah Blah Blah
Blah Blah Blah Blah Blah Blah Blah Blah
Blah Blah Blah Blah Blah Blah Blah Blah
Book number: 2
Title : Something Title
Author : Some Author
ISBN: 45425425423
Written Date : Some date
Release Date: Some date
Characters: Blah Blah
Genre: Romance
Synopsis : Blah Blah Blah Blah Blah Blah Blah Blah
Blah Blah Blah Blah Blah Blah Blah Blah
Blah Blah Blah Blah Blah Blah Blah Blah
Blah Blah Blah Blah Blah Blah Blah Blah
Book number: 3
Title : Something Title
Author : Some Author
ISBN: 45425425423
Written Date : Some date
Release Date: Some date
Characters: Blah Blah
Genre: Romance
Synopsis : Blah Blah Blah Blah Blah Blah Blah Blah
Blah Blah Blah Blah Blah Blah Blah Blah
Comedy
Book number: 1
Title : Something Title
Author : Some Author
ISBN: 45425425423
Written Date : Some date
Release Date: Some date
Characters: Blah Blah
Genre: Romance
Synopsis : Blah Blah Blah Blah Blah Blah Blah Blah
Blah Blah Blah Blah Blah Blah Blah Blah
Blah Blah Blah Blah Blah Blah Blah Blah
Blah Blah Blah Blah Blah Blah Blah Blah
This is the code I have tried so far.
string[] fileLines = File.ReadAllLines(@"file.txt", Encoding.UTF8);
string markS = "Synopsis :";
string markE = "Book number:";
string replacedText = "not available";
for (int i = 0; i < fileLines.Length; i++)
{
string line = fileLines[i];
int start = line.IndexOf(markS);
int end = line.LastIndexOf(markE);
Console.WriteLine(start + " " + end);
// if (start >= 0 && end >= 0)
if (start >= 0 && end >= 0)
{
Console.WriteLine(line);
int length = line.IndexOf(markE) + markE.Length - start;
//include the markers in the substring
//in case the substring occurs elsewhere without the markers
var textToReplace = line.Substring(start, length);
//add the markers to the replacement
string replacement = markS + replacedText + markE;
var result = line.Replace(textToReplace, replacement);
fileLines[i] = result;
}
File.WriteAllLines(@"test2.txt", fileLines);
}

Using regular expressions and Regex.Replace we can replace the text as so:
var pattern = @"(^Synopsis\s:)(.+?)(^Book\snumber|\Z)";
var replacements =
Regex.Replace(text,
pattern,
$"$1 Jabberwocky {Environment.NewLine}$3",
RegexOptions.Multiline | RegexOptions.Singleline);
The above code will replace the multi-line "Synopsis" value with "Jabberwocky".
You may need to extend the (^Book\snumber|\Z) capture with your section headers, for example (^Book\snumber|^Romance|^Comedy|\Z), so that a synopsis followed directly by a genre line also ends in the right place. I leave that up to you to work out since it's your data.
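As a minimal sketch of that tweak, the helper below wraps the same Regex.Replace call and adds the two genre headers from the sample data as terminators. The class and method names are hypothetical, and the header list (Romance, Comedy) is an assumption based only on the sample in the question; real files may need more entries.

```csharp
using System;
using System.Text.RegularExpressions;

public static class SynopsisRegex
{
    // Hypothetical helper, not part of the original answer.
    public static string ReplaceSynopses(string text, string replacement)
    {
        // Group 1: the "Synopsis :" label at the start of a line.
        // Group 2: the synopsis body, matched lazily across lines (Singleline).
        // Group 3: what ends the synopsis: the next book, a genre header line
        //          (assumed to be Romance or Comedy), or the end of the text.
        var pattern = @"(^Synopsis\s:)(.+?)(^Book\snumber|^Romance\r?$|^Comedy\r?$|\Z)";
        return Regex.Replace(
            text,
            pattern,
            m => m.Groups[1].Value + " " + replacement + Environment.NewLine + m.Groups[3].Value,
            RegexOptions.Multiline | RegexOptions.Singleline);
    }
}
```

A call such as File.WriteAllText(outputPath, SynopsisRegex.ReplaceSynopses(File.ReadAllText(inputPath), "not available")) would then process a whole file in one pass; inputPath and outputPath are placeholders here.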

Your post states that you need to replace a string inside with something else, and you could consider the entire file as "one long string" and do some processing on that. The Regex solution is probably a great way to go in that case.
But I read your code carefully from the perspective of what you're actually trying to do. OK, so my crystal ball isn't 100% accurate, but I believe that if we meet back up a few weeks from now, it will have occurred to you that you want certain elements:
A class that represents a Book.
A serialization method (like Json) that can take a file and turn it into a Book and vice versa.
A way to search the books (like SQLite) based on the properties in the Book class.
Having a Book class would simplify the substitution that you want to do because the Synopsis property would already be separated out. Then you could perform a standard string.Replace in a targeted way.
Book class
class Book
{
public string? BookNumber { get; set; }
public string? Title { get; set; }
public string? Author { get; set; }
public string? ISBN { get; set; }
public string? Written { get; set; }
public string? Release { get; set; }
public string? Characters { get; set; }
public string? Genre { get; set; }
public string? Synopsis { get; set; }
// Display
public override string ToString()
{
return
$"Book number : {BookNumber}{Environment.NewLine}" +
$"Title : {Title}{Environment.NewLine}" +
$"Author : {Author}{Environment.NewLine}" +
$"ISBN : {ISBN}{Environment.NewLine}" +
$"Written Date : {Written}{Environment.NewLine}" +
$"Release Date : {Release}{Environment.NewLine}" +
$"Characters : {Characters}{Environment.NewLine}" +
$"Genre : {Genre}{Environment.NewLine}" +
$"Synopsis : {Synopsis}{Environment.NewLine}";
}
}
Replace:
book.Synopsis = book.Synopsis.Replace("Blah", "Marklar");
Serialization - The hard way
You stated that the .txt files are on disk. Here's a method that uses the string representations in your post to turn a "file" into a Book.
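// Parameterless constructor, so that a serializer such as Json.NET can create a Book without a file.
public Book() { }
public Book(string file)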
{
var synopsis = new StringBuilder();
foreach (var line in File.ReadAllLines(file))
{
var parse = line.Split(':').Select(_ => _.Trim()).ToArray();
if (parse.Length == 1)
{
synopsis.Append("\t" + parse[0] + Environment.NewLine);
}
else
{
var property = parse[0];
switch (property)
{
case "Book number": BookNumber = parse[1]; break;
case "Title": Title = parse[1]; break;
case "Author": Author = parse[1]; break;
case "ISBN": ISBN = parse[1]; break;
case "Written Date": Written = parse[1]; break;
case "Release Date": Release = parse[1]; break;
case "Characters": Characters = parse[1]; break;
case "Genre": Genre = parse[1]; break;
case "Synopsis": synopsis.Append(parse[1] + Environment.NewLine); break;
default: Debug.Assert(false, $"Error reading '{property}'"); break;
}
}
}
Synopsis = synopsis.ToString();
}
That's a lot of work, but now you can replace a value that spans multiple lines, starting from a raw file:
var book = new Book("492C9F2A7E73.txt");
book.Synopsis = book.Synopsis.Replace("Blah", "Marklar");
Serialization - An easier way
But please also consider using something like the Newtonsoft.Json NuGet package to simplify your serialization. It still writes the file in plain text, and you'll even see the ':' character used in a similar way to your file listings. But the format lets Json.NET reconstruct a Book object directly.
SAVE
var path = Path.Combine(dir, $"{book.ISBN}.json");
File.WriteAllText(path, JsonConvert.SerializeObject(book));
Result in file (this is after doing the replacement):
{
"BookNumber": "2",
"Title": "Something Title",
"Author": "Some Author",
"ISBN": "7E092CB94CCD",
"Written": "Some date",
"Release": "Some date",
"Characters": "Blah Blah",
"Genre": "Romance",
"Synopsis": "Marklar Marklar Marklar Marklar Marklar Marklar Marklar Marklar\r\n\tMarklar Marklar Marklar Marklar Marklar Marklar Marklar Marklar\r\n\tMarklar Marklar Marklar Marklar Marklar Marklar Marklar Marklar\r\n\tMarklar Marklar Marklar Marklar Marklar Marklar Marklar Marklar\r\n"
}
LOAD
var book = JsonConvert.DeserializeObject<Book>(File.ReadAllText("492C9F2A7E73.json"));
QUERY
var romanceBooks = database.Query<Book>("SELECT * FROM books WHERE Genre='Romance'");
There's more than one way to do what you asked. The benefit of doing something like this is to set you up going forward for a search engine using the Book class.
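One of those other ways, sketched here under stated assumptions, is a plain line-by-line pass that works on the original multi-book file without regular expressions. The helper name and the sectionHeaders parameter are hypothetical; the caller has to supply the genre header lines because a header such as "Comedy" cannot otherwise be told apart from a synopsis continuation line.

```csharp
using System;
using System.Collections.Generic;

public static class SynopsisLineFilter
{
    // Hypothetical helper: copies every line, but collapses each synopsis
    // (its first line plus any continuation lines) into a single line.
    public static List<string> ReplaceSynopses(
        IEnumerable<string> lines, string replacement, ISet<string> sectionHeaders)
    {
        var result = new List<string>();
        bool inSynopsis = false;
        foreach (var line in lines)
        {
            if (line.StartsWith("Synopsis", StringComparison.Ordinal))
            {
                result.Add("Synopsis : " + replacement);
                inSynopsis = true;
                continue;
            }
            if (inSynopsis)
            {
                // A synopsis ends at the next book or at a known section header.
                bool endsSynopsis =
                    line.StartsWith("Book number", StringComparison.Ordinal) ||
                    sectionHeaders.Contains(line.Trim());
                if (!endsSynopsis)
                {
                    continue; // drop the continuation line
                }
                inSynopsis = false;
            }
            result.Add(line);
        }
        return result;
    }
}
```

The result can be written back with File.WriteAllLines once, after the loop has finished, rather than on every iteration.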

Related

Empty String Input Validation

I've been having trouble understanding why my custom empty-string validation method does not work, whereas checking for an empty string directly does.
Validation.EmptyValidation(title,
"Please, do not leave the course title field empty!" +
"\r\nEnter the course title: ");
It does not output the course title in the end, but when I do it this way it does:
while (string.IsNullOrEmpty(title))
{
Console.WriteLine("No empty string: ");
title = Console.ReadLine();
}
Class:
Console.WriteLine("* Create Course *\r\n");
Console.WriteLine("Enter the course title: ");
string title = Console.ReadLine();
while (string.IsNullOrEmpty(title))
{
Console.WriteLine("No empty string: ");
title = Console.ReadLine();
}
Validation.EmptyValidation(title,
"Please, do not leave the course title field empty!" +
"\r\nEnter the course title: ");
Console.WriteLine("\r\nEnter the course description: ");
string description = Console.ReadLine();
Validation.EmptyValidation(description,
"Please, do not leave the course description field empty!" +
"\r\nEnter the course description: ");
Console.WriteLine("\r\nEnter the number of students in the course: ");
string studentsInput = Console.ReadLine();
int.TryParse(studentsInput, out int students);
CreateCourse(currentCourse, title, description, students);
public static Course CreateCourse(Course _currentCourse, string title, string description, int students)
{
Course course = new Course(title, description, students);
_currentCourse = course;
_currentCourse.Title = course.Title;
Console.WriteLine($"\r\nThank you for registering the {_currentCourse.Title} course.\r\n" +
$"\r\nCourse Information" +
$"\r\nTitle: {_currentCourse.Title}" +
$"\r\nDescription: {_currentCourse.Description}" +
$"\r\nStudents: {_currentCourse.Capacity}");
return _currentCourse;
}
Empty Validation Method:
public static string EmptyValidation(string input, string prompt)
{
while (string.IsNullOrEmpty(input))
{
Console.WriteLine(prompt);
input = Console.ReadLine();
}
return input;
}
There are a couple of things going wrong here.
// you weren't returning the results
title = Validation.EmptyValidation(title,
"Please, do not leave the course title field empty!" +
"\r\nEnter the course title: ");
Also, if you no longer need the other validation, it is best to remove it:
//while (string.IsNullOrEmpty(title))
//{
// Console.WriteLine("No empty string: ");
// title = Console.ReadLine();
// }
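The reason the return value matters is that the string parameter is passed by value: assigning to input inside the method never changes the caller's variable. The sketch below shows this with a variant of EmptyValidation that reads from a TextReader instead of the console, purely so it can run unattended; that change, the extra parameters, and the class name are assumptions made for illustration.

```csharp
using System;
using System.IO;

public static class ValidationSketch
{
    // Same loop as EmptyValidation, with the console replaced by injected
    // reader/writer objects (an assumption, so the sketch runs without user input).
    public static string EmptyValidation(string input, string prompt, TextReader reader, TextWriter writer)
    {
        while (string.IsNullOrEmpty(input))
        {
            writer.WriteLine(prompt);
            input = reader.ReadLine();
            if (input == null)
            {
                throw new EndOfStreamException("No more input available.");
            }
        }
        return input;
    }

    public static void Demo()
    {
        string title = "";

        // Return value ignored: title is still empty afterwards.
        EmptyValidation(title, "Enter the course title: ", new StringReader("\nAlgebra\n"), Console.Out);
        Console.WriteLine($"Ignored result:  '{title}'");

        // Return value assigned: title now holds the validated text.
        title = EmptyValidation(title, "Enter the course title: ", new StringReader("\nAlgebra\n"), Console.Out);
        Console.WriteLine($"Assigned result: '{title}'");
    }
}
```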

Can Json Schema Validation via Newtonsoft.Json.Schema validate VALUES?

I have a small sample. If my JSON is good, it works correctly. If I change the "tag" (that is, the property name), it works correctly by reporting validation messages. If I change the value of a Guid to a non-Guid value, the JSON Schema validation does not fail.
Is there a way to fail validation for a Guid value?
public class MyCoolObject
{
public Guid TheUuid { get; set; }
public Int32 TheInteger { get; set; }
public DateTime TheDateTime { get; set; }
}
and my test method. When i = 2 (and I'm setting the string to contain "NOTAGUID-3333-3333-3333-333333333333"), that is when I don't get error messages like I would like to.
private static void RunJsonSchemaValidate()
{
/* Note, the TheUuid is of type "string" and format "guid" */
string jsonSchemaText = @"
{
""typeName"": ""MyCoolObject"",
""additionalProperties"": false,
""type"": ""object"",
""required"": [
""TheUuid"",
""TheInteger"",
""TheDateTime""
],
""properties"": {
""TheUuid"": {
""type"": ""string"",
""format"": ""guid""
},
""TheInteger"": {
""type"": ""integer""
},
""TheDateTime"": {
""type"": ""string"",
""format"": ""date-time""
}
},
""$schema"": ""http://json-schema.org/draft-04/schema#""
}
";
Newtonsoft.Json.Schema.JSchema jschem = Newtonsoft.Json.Schema.JSchema.Parse(jsonSchemaText);
for (int i = 0; i < 3; i++)
{
string jsonContent = string.Empty;
switch (i)
{
case 1:
/* bad json, change the property NAME */
jsonContent = @"{
""TheUuidXXX"": ""33333333-3333-3333-3333-333333333333"",
""TheInteger"": 2147483647,
""TheDateTime"": ""2017-08-22T15:32:10.7023008-04:00""
}";
break;
case 2:
/* bad json, change the property VALUE */
jsonContent = @"{
""TheUuid"": ""NOTAGUID-3333-3333-3333-333333333333"",
""TheInteger"": 2147483647,
""TheDateTime"": ""2017-08-22T15:32:10.7023008-04:00""
}";
break;
case 3:
/* bad json, bad integer */
jsonContent = @"{
""TheUuid"": ""33333333-3333-3333-3333-333333333333"",
""TheInteger"": notAnumber,
""TheDateTime"": ""2017-08-22T15:32:10.7023008-04:00""
}";
break;
case 4:
/* bad json, bad date */
jsonContent = @"{
""TheUuid"": ""33333333-3333-3333-3333-333333333333"",
""TheInteger"": 2147483647,
""TheDateTime"": ""NOTADATE""
}";
break;
default:
/* good json */
jsonContent = @"{
""TheUuid"": ""33333333-3333-3333-3333-333333333333"",
""TheInteger"": 2147483647,
""TheDateTime"": ""2017-08-22T15:32:10.7023008-04:00""
}";
break;
}
/* START THE MEAT OF THIS PROCEDURE */
Newtonsoft.Json.Linq.JObject jobj = Newtonsoft.Json.Linq.JObject.Parse(jsonContent);
IList<string> messages;
bool valid = jobj.IsValid(jschem, out messages);
/* END THE MEAT OF THIS PROCEDURE */
if (!valid)
{
string errorMsg = "i=" + i.ToString() + ":" + string.Join(",", messages);
Console.WriteLine(string.Empty);
Console.WriteLine(string.Empty);
Console.WriteLine(errorMsg);
}
else
{
Console.WriteLine(string.Empty);
Console.WriteLine(string.Empty);
Console.WriteLine("i=" + i.ToString() + ":" + "Good json Yes");
MyCoolObject thisShouldWorkWhenValidationPasses = Newtonsoft.Json.JsonConvert.DeserializeObject<MyCoolObject>(jsonContent);
}
Console.WriteLine(string.Empty);
Console.WriteLine("--------------------------------------------------");
Console.WriteLine(string.Empty);
}
}
and the packages
<?xml version="1.0" encoding="utf-8"?>
<packages>
<package id="Newtonsoft.Json" version="10.0.2" targetFramework="net45" />
<package id="Newtonsoft.Json.Schema" version="3.0.3" targetFramework="net45" />
</packages>
So what is happening is that when i=2, the json-schema passes, but then MyCoolObject thisShouldWorkWhenValidationPasses throws an exception....
i=2:Good json Yes
Unhandled Exception: Newtonsoft.Json.JsonSerializationException: Error
converting value "NOTAGUID-3333-3333-3333-333333333333" to type
'System.Guid'. Path 'TheUuid', line 2, position 77. --->
System.ArgumentException: Could not cast or convert from System.String
to System.Guid.
:(
I'm trying to have the json-schema fail earlier.
The end-game is to perform a json-schema-validation without exceptions getting thrown. Then after "everything is clear" try to load the objects. My real stuff is more complex, but this small demo shows the issue(s).
I also replaced the "meat of this procedure" with the below code
/* START THE MEAT OF THIS PROCEDURE */
Newtonsoft.Json.JsonTextReader reader = new Newtonsoft.Json.JsonTextReader(new System.IO.StringReader(jsonContent));
Newtonsoft.Json.Schema.JSchemaValidatingReader validatingReader = new Newtonsoft.Json.Schema.JSchemaValidatingReader(reader);
validatingReader.Schema = jschem;
IList<string> messages = new List<string>();
validatingReader.ValidationEventHandler += (o, a) => messages.Add(a.Message);
Newtonsoft.Json.JsonSerializer serializer = new Newtonsoft.Json.JsonSerializer();
/* below is the issue with this code..you still try to serialize the object...and that can throw an exception */
MyCoolObject p = serializer.Deserialize<MyCoolObject>(validatingReader);
bool valid = !messages.Any();
/* END THE MEAT OF THIS PROCEDURE */
But again, this approach can still throw exceptions while I am only trying to validate.
Thanks to Jeroen Mostert for the hint that led me to this solution:
/* START THE MEAT OF THIS PROCEDURE */
IList<string> deserializeMessages = new List<string>();
/* first get any serialization issues */
MyCoolObject p = JsonConvert.DeserializeObject<MyCoolObject>(jsonContent,
new JsonSerializerSettings
{
Error = delegate (object sender, Newtonsoft.Json.Serialization.ErrorEventArgs args)
{
deserializeMessages.Add(args.ErrorContext.Error.Message);
args.ErrorContext.Handled = true;
}
});
IList<string> jsonSchemaMessages = new List<string>();
bool jsonSchemaIsValid = true;
/* now, only if there were no serialization issues, look at the schema */
if (!deserializeMessages.Any())
{
Newtonsoft.Json.Linq.JObject jobj = Newtonsoft.Json.Linq.JObject.Parse(jsonContent);
jsonSchemaIsValid = jobj.IsValid(jschem, out jsonSchemaMessages);
}
IEnumerable<string> allMessages = deserializeMessages.Union(jsonSchemaMessages);
bool overallValid = !allMessages.Any();
/* END THE MEAT OF THIS PROCEDURE */
This gives me the desired output for this situation:
i=0:Good json Yes
i=1:Property 'TheUuidXXX' has not been defined and the schema does not
allow additional properties. Path 'TheUuidXXX', line 2, position
41.,Required properties are missing from object: TheUuid. Path '', line 1, position 1.
i=2:Error converting value "NOTAGUID-3333-3333-3333-333333333333" to
type 'System.Guid'. Path 'TheUuid', line 2, position 77.
i=3:Unexpected character encountered while parsing value: o. Path
'TheInteger', line 3, position 41.,Error parsing boolean value. Path
'TheInteger', line 3, position 42.
i=4:Could not convert string to DateTime: NOTADATE. Path
'TheDateTime', line 4, position 50.
PRESS ENTER TO EXIT
I'm still wrapping my head around it a little. But in my specific situation (where I want to respond to the HTTP request immediately if there was a JSON issue), it works.
I won't mark this as "the answer" in case anyone comes up with something better.
Note, I changed my i for loop to be < 5
for (int i = 0; i < 5; i++)
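A different, exception-free option is to check individual values with the framework's TryParse methods before deserializing. This is a swapped-in technique rather than schema validation, and the sketch assumes the raw string values have already been pulled out of the JSON; the class and method names are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

public static class ValueChecks
{
    // Hypothetical helper: collects messages instead of throwing.
    public static IList<string> Check(string uuidText, string dateTimeText)
    {
        var messages = new List<string>();

        // Guid.TryParse returns false for malformed input rather than throwing.
        if (!Guid.TryParse(uuidText, out _))
        {
            messages.Add($"'{uuidText}' is not a valid Guid.");
        }

        // Invariant culture keeps the check independent of machine settings.
        if (!DateTime.TryParse(dateTimeText, CultureInfo.InvariantCulture,
                               DateTimeStyles.RoundtripKind, out _))
        {
            messages.Add($"'{dateTimeText}' is not a valid DateTime.");
        }

        return messages;
    }
}
```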

Generate unique email in c# for Microsoft Identity Manager

I have a DB and Microsoft Identity Manager to generate user accounts from HR to MS Active Directory and so on.
I have the following code to generate a unique email:
case "mailgenerate":
if (mventry["email"].IsPresent)
{
// Do nothing, the mail was already generated.
}
else
{
if (csentry["FIRST"].IsPresent && csentry["LAST"].IsPresent)
{
string FirstName = replaceRUEN(csentry["FIRST"].Value);
string LastName = replaceRUEN(csentry["LAST"].Value);
string email = FirstName + "." + LastName + "@test.domain.com";
string newmail = GetCheckedMail(email, mventry);
if (newmail.Equals(""))
{
throw new TerminateRunException("A unique mail could not be found");
}
mventry["email"].Value = newmail;
}
}
break;
//Generate mail Name method
string GetCheckedMail(string email, MVEntry mventry)
{
MVEntry[] findResultList = null;
string checkedmailName = email;
for (int nameSuffix = 1; nameSuffix < 100; nameSuffix++)
{
//added ; and if corrected
findResultList = Utils.FindMVEntries("email", checkedmailName,1);
if (findResultList.Length == 0)
{
// The current mailName is not in use.
return (checkedmailName);
}
MVEntry mvEntryFound = findResultList[0];
if (mvEntryFound.Equals(mventry))
{
return (checkedmailName);
}
// If the passed email is already in use, then add an integer value
// then verify if the new value exists. Repeat until a unique email is checked
checkedmailName = checkedmailName + nameSuffix.ToString();
}
// Return an empty string if no unique mailnickName could be created.
return "";
}
Problem:
When I run the sync cycle for the first time, I get a normal email like
duplicateuser1@test.domain.com
On the next sync cycle, these emails are updated to
duplicateuser@test.domain.com1
I'm also using this code to generate mailnickname and accountname without any problems.
Can anybody say why this happens?
Thanks!
The problem is the line:
checkedmailName = checkedmailName + nameSuffix.ToString();
checkedmailName has a value like this: firstName.lastName@test.domain.com
So, you're doing this:
checkedmailName = firstName.lastName@test.domain.com + 1;
You need to do something like this:
checkedmailName = checkedmailName.Split('@')[0] + nameSuffix.ToString() + "@" + checkedmailName.Split('@')[1];
With this, you take the part before the @, add the integer value, and then append the @ and the domain.
Updated by author of thread I changed split -> Split and it works. Thanks!
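One thing to watch with that line is that it appends to the already-suffixed value, so successive loop iterations would produce name1, name12, name123 and so on. A minimal sketch that avoids this builds each candidate from the original email instead; the helper name is hypothetical and it is not part of the FIM/MIM API.

```csharp
using System;

public static class MailNames
{
    // Hypothetical helper: inserts the numeric suffix before the '@' of the ORIGINAL address.
    public static string WithSuffix(string email, int suffix)
    {
        int at = email.LastIndexOf('@');
        if (at < 0)
        {
            // No domain part: fall back to plain appending.
            return email + suffix;
        }
        return email.Substring(0, at) + suffix + email.Substring(at);
    }
}
```

Inside GetCheckedMail the loop would then assign checkedmailName = MailNames.WithSuffix(email, nameSuffix), always starting from the unchanged email parameter.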

c# take certain words from String

When I get http response it looks like this:
{
"course_editions": {
"2014/SL": [
{
"course_id": "06-DEGZLI0",
"term_id": "2014/SL",
"course_name": {
"en": "Preparation for bachelor exam",
}
},
{
"course_id": "06-DPRALW0",
"term_id": "2014/SL",
"course_name": {
"en": "Work experience",
}
},
{
I would like to be able to extract the course titles only, e.g.:
Work experience
Preparation for bachelor exam
I've tried this:
string probably_json = GetResponse(url_courses);
object obj = JsonConvert.DeserializeObject(probably_json);
using (StringReader reader = new StringReader(obj.ToString().Replace("\\t", " ").Replace("\\n", "\n")))
{
string line;
int lineNo = 0;
while ((line = reader.ReadLine()) != null)
{
if (line.Contains("en"))
{
string output = line.Substring(0, line.Length-1);
Console.WriteLine(output);
}
++lineNo;
}
} // End Using StreamReader
But that's all I've got:
"en": "Preparation for bachelor exam" "en": "Work experience"
What am I supposed to do to get the course title only?
If you are using json.net anyways, make it do some work, don't parse yourself:
var result = JObject
.Parse(probably_json)
.SelectTokens("['course_editions'].['2014/SL'].[*].['course_name'].['en']");
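If Json.NET is not an option, the same traversal can be sketched with the built-in System.Text.Json instead. This is a different library from the one used in the answer; the class name, method name, and term parameter are hypothetical, and AllowTrailingCommas is enabled only because the sample response shows a comma after each "en" value.

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;

public static class CourseTitles
{
    // Hypothetical helper: returns the English course names for one term.
    public static List<string> Extract(string json, string term)
    {
        var options = new JsonDocumentOptions { AllowTrailingCommas = true };
        using var doc = JsonDocument.Parse(json, options);

        var titles = new List<string>();
        var courses = doc.RootElement
            .GetProperty("course_editions")
            .GetProperty(term);

        // Each array element is one course; its title sits under course_name.en.
        foreach (var course in courses.EnumerateArray())
        {
            titles.Add(course.GetProperty("course_name").GetProperty("en").GetString() ?? "");
        }
        return titles;
    }
}
```

Calling CourseTitles.Extract(probably_json, "2014/SL") and printing each entry would give one title per line.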

Transform Search String into FullText Compatible Search String?

I'm working with the fulltext search engine of MSSQL 2008 which expects a search string like this:
("keyword1" AND "keyword2*" OR "keyword3")
My users are entering things like this:
engine 2009
"san francisco" hotel december xyz
stuff* "in miami" 1234
something or "something else"
I'm trying to transform these into fulltext engine compatible strings like these:
("engine" AND "2009")
("san francisco" AND "hotel" AND "december" AND "xyz")
("stuff*" AND "in miami" AND "1234")
("something" OR "something else")
I have a really difficult time with this, tried doing it using counting quotation marks, spaces and inserting etc. but my code looks like horrible for-and-if vomit.
Can someone help?
Here you go:
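using System;
using System.Collections.Generic;
using System.Text;
using System.Text.RegularExpressions;
class Program {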
static void Main(string[] args) {
// setup some test expressions
List<string> searchExpressions = new List<string>(new string[] {
"engine 2009",
"\"san francisco\" hotel december xyz",
"stuff* \"in miami\" 1234 ",
"something or \"something else\""
});
// display and parse each expression
foreach (string searchExpression in searchExpressions) {
Console.WriteLine(string.Concat(
"User Input: ", searchExpression,
"\r\n\tSql Expression: ", ParseSearchExpression(searchExpression),
"\r\n"));
}
Console.ReadLine();
}
private static string ParseSearchExpression(string searchExpression) {
// replace all 'spacecharacters' that exists within quotes with character 0
string temp = Regex.Replace(searchExpression, @"""[^""]+""", (MatchEvaluator)delegate(Match m) {
return Regex.Replace(m.Value, @"[\s]", "\x00");
});
// split string on any spacecharacter (thus: quoted items will not be splitted)
string[] tokens = Regex.Split(temp, @"[""\s]+", RegexOptions.IgnoreCase | RegexOptions.ExplicitCapture);
// generate result
StringBuilder result = new StringBuilder();
string tokenLast = string.Empty;
foreach (string token in tokens) {
if (token.Length > 0) {
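// skip the AND/OR operators themselves; they only decide which joiner precedes the next keyword
if (!token.Equals("AND", StringComparison.OrdinalIgnoreCase) && !token.Equals("OR", StringComparison.OrdinalIgnoreCase)) {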
if (result.Length > 0) {
result.Append(tokenLast.Equals("OR", StringComparison.OrdinalIgnoreCase) ? " OR " : " AND ");
}
result.Append("\"").Append(token.Replace("\"", "\"\"").Replace("\x00", " ")).Append("\"");
}
tokenLast = token;
}
}
if (result.Length > 0) {
result.Insert(0, "(").Append(")");
}
return result.ToString();
}
}
