C# programming for trimming first three lines and last four lines - c#

Below is my file. I want to get the "name" node from the XML in it using C#.
'EventObjectsRead' ('73')
message attributes:
SATRCFG_OBJECT [xml] =
<ConfData>
<CfgAgentGroup>
<CfgGroup>
<DBID value="225"/>
<tenantDBID value="101"/>
<name value="CBD"/>
<routeDNDBIDs>
<DBID value="825"/>
</routeDNDBIDs>
<capacityTableDBID value="0"/>
<quotaTableDBID value="0"/>
<state value="1"/>
<capacityRuleDBID value="0"/>
<siteDBID value="0"/>
<contractDBID value="0"/>
</CfgGroup>
<agentDBIDs>
<DBID value="128"/>
<DBID value="133"/>
<DBID value="135"/>
<DBID value="385"/>
<DBID value="433"/>
</agentDBIDs>
</CfgAgentGroup>
</ConfData>
IATRCFG_TOTALCOUNT [int] = 1
IATRCFG_OBJECTCOUNT [int] = 1
IATRCFG_OBJECTTYPE [int] = 5
IATRCFG_REQUESTID [int] = 3
Is there a way to get the "name" node directly from the text above, or do I need to trim the first three lines and the last four lines first? If so, how can I do that?

You could extract the node you are looking for using Regex on the original string (where str is your string data):
// Use Regex to match the exact string and parse that to XElement.
string nameXML = Regex.Match(str, @"<name +value="".*"" */>").Groups[0].Value;
XElement name = XElement.Parse(nameXML);
Or here is an example where you can strip the invalid lines, parse the XML and then access the data from an XML object:
// Split the string into groups using newline as a delimiter.
string[] groups = str.Split(new[] { Environment.NewLine }, StringSplitOptions.None);
// Use skip and take to trim the first 3 and last 4 elements.
// Rejoin the remainder back together with empty strings and parse the XElement.
string xmlString = string.Join(string.Empty, groups.Take(groups.Length - 4).Skip(3));
XElement xml = XElement.Parse(xmlString);
// Use Descendants and First to get the first node called 'name' in the XML.
XElement name = xml.Descendants("name").First();

Here are two ways to achieve this. Either with string operation or with RegEx:
class Program
{
static void Main(string[] args)
{
Console.WriteLine("Name: {0}", GetNameFromFileString(File.ReadAllText("file.txt")));
Console.WriteLine("Name: {0}", GetNameFromFile("file.txt"));
}
private static string GetNameFromFileString(string filecontent)
{
Regex r = new Regex("(?<Xml><ConfData>.*</ConfData>)", RegexOptions.Singleline);
var match = r.Match(filecontent);
var xmlString = match.Groups["Xml"].ToString();
return GetNameFromXmlString(xmlString);
}
private static string GetNameFromFile(string filename)
{
var lines = File.ReadAllLines(filename);
var xml = new StringBuilder();
var isXml = false;
foreach (var line in lines)
{
if (line.Contains("<ConfData>"))
isXml = true;
if (isXml)
xml.Append(line.Trim());
if (line.Contains("</ConfData>"))
isXml = false;
}
var text = xml.ToString();
return GetNameFromXmlString(text);
}
private static string GetNameFromXmlString(string text)
{
var xDocument = XDocument.Parse(text);
var cfgAgentGroup = xDocument.Root.Element("CfgAgentGroup");
var cfgGroup = cfgAgentGroup.Element("CfgGroup");
var name = cfgGroup.Element("name");
var nameValue = name.Attribute("value");
var value = nameValue.Value;
return value;
}
}

From the string you provided and the description of what you want to do, I assume that you want to extract the XML from the file. I would do it in the following way:
string text = System.IO.File.ReadAllText(@"C:\docs\myfile.txt");
// Lazily match everything from <ConfData> up to the first </ConfData>, including line breaks.
Regex r = new Regex("<ConfData>(.|\r\n)*?</ConfData>");
var v = r.Match(text);
// Groups[0] is the whole match, so it already contains both <ConfData> tags.
string myResult = v.Groups[0].ToString();
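The same idea also works without a regular expression. Here is a minimal sketch, assuming the text contains exactly one `<ConfData>` block; `ConfDataReader` and `GetName` are hypothetical names used only for illustration. It cuts the block out with `IndexOf`/`LastIndexOf`, so it does not depend on how many header or trailer lines surround the XML.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

public static class ConfDataReader
{
    // Hypothetical helper: cuts the <ConfData>...</ConfData> block out of the
    // surrounding text and returns the "value" attribute of the first <name> element.
    public static string GetName(string logText)
    {
        const string openTag = "<ConfData>";
        const string closeTag = "</ConfData>";

        int start = logText.IndexOf(openTag, StringComparison.Ordinal);
        int end = logText.LastIndexOf(closeTag, StringComparison.Ordinal);
        if (start < 0 || end < start)
            throw new FormatException("No <ConfData> block found.");

        // Keep everything from the opening tag through the end of the closing tag.
        string xml = logText.Substring(start, end - start + closeTag.Length);
        XElement root = XElement.Parse(xml);

        // Descendants searches the whole tree, so the nesting depth does not matter.
        XElement name = root.Descendants("name").FirstOrDefault();
        return name?.Attribute("value")?.Value; // null when there is no <name value="..."/>
    }
}
```

Calling `ConfDataReader.GetName(File.ReadAllText("file.txt"))` on the file from the question should give the value of the `name` node.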

Related

C# XDocument right way to escape symbols

Hello, I'm struggling with escaping in XML. The problem is that my output is escaped twice, and I don't understand why it's happening.
Code below:
private static string FixSingleEncoding(string data)
{
//data?.Replace("&", "&amp;").Replace("<", "&lt;").Replace(">", "&gt;").Replace("\"", "&quot;").Replace("'", "&apos;");
return System.Net.WebUtility.HtmlEncode(data); //SecurityElement.Escape(data);//
}
private static XDocument FixEncoding(XDocument instance)
{
XNamespace naming = instance.Root.Name.Namespace;
var result = instance.Descendants(naming + "dataset").ToList();
var count = result.Count;
for (int i = 0; i < count; i++)
{
result[i].Value = FixSingleEncoding(result[i].Value);
}
return instance;
}
public static bool CreateNewDataset(string path, string data)
{
Debug.WriteLine("CALL");
XDocument xdoc = XDocument.Load(Path.Combine(MasterLocation, path));
xdoc = FixEncoding(xdoc);
XNamespace df = xdoc.Root.Name.Namespace;
XElement root = new XElement(df+"changeSet");
root.Add(new XAttribute("id", "My Name"));
root.Add(new XAttribute("author", "Test"));
string final = data;
XElement innerelement = new XElement(df + "data", final);
innerelement.Add(new XAttribute("endDelimiter", "GO"));
root.Add(innerelement);
xdoc.Root.Add(root);
xdoc.Save(Path.Combine(MasterLocation, path));
return true;
}
The problem is that when I load the XML file for the first time and call CreateNewDataset, it reads all the data from the XML file and unescapes the old data. So I added the FixEncoding method, but then another problem showed up: the data is now escaped twice. I know it is exactly twice because, when I convert the XML entities to a string in VS Code, I have to convert twice to get a readable string. CreateNewDataset is called only once, yet the data is escaped twice. What am I missing?
Entered data:
IF EXISTS ( SELECT *
FROM sysobjects
WHERE id = object_id(N'[dbo].[table1]')
and OBJECTPROPERTY(id, N'IsProcedure') = 0)
Original code before CreateNewDataset:
<changeSet id="Test" author="My Name">
<data endDelimiter="GO">
IF EXISTS ( SELECT *
FROM sysobjects
WHERE id = object_id(N&apos;[dbo].[table1]&apos;)
and OBJECTPROPERTY(id, N&apos;IsProcedure&apos;) = 0)
</data>
</changeSet>
After CreateNewDataset (without FixEncoding):
<changeSet id="Test" author="My Name">
<data endDelimiter="GO">
IF EXISTS ( SELECT *
FROM sysobjects
WHERE id = object_id(N'[dbo].[table1]')
and OBJECTPROPERTY(id, N'IsProcedure') = 0)
</data>
</changeSet>
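One likely explanation, offered as a sketch rather than a confirmed diagnosis: LINQ to XML hands back already-unescaped text through `Value` and escapes it again when the document is saved, so encoding the value by hand adds a second layer. Note also that an apostrophe does not have to be written as `&apos;` inside element text, so the output without FixEncoding represents the same data as the original file. `EscapingDemo` and its method names are hypothetical.

```csharp
using System;
using System.Net;
using System.Xml.Linq;

public static class EscapingDemo
{
    // Serializes a <data> element whose text is the given string.
    // LINQ to XML escapes '<' and '&' on its own while writing.
    public static string Serialize(string text)
    {
        return new XElement("data", text).ToString();
    }

    // Encodes by hand first, the way FixSingleEncoding does, then serializes.
    // The '&' produced by the manual encoding gets escaped a second time.
    public static string SerializeAfterManualEncoding(string text)
    {
        return Serialize(WebUtility.HtmlEncode(text));
    }

    // Parses a <data> element and returns its text as LINQ to XML exposes it,
    // that is, with entities such as &apos; already resolved.
    public static string ReadValue(string xml)
    {
        return XElement.Parse(xml).Value;
    }
}
```

With these helpers, `Serialize("a < b")` gives `<data>a &lt; b</data>`, while `SerializeAfterManualEncoding("a < b")` gives `<data>a &amp;lt; b</data>`. If that matches what you see, dropping FixEncoding and assigning the raw string to the element should be enough.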

C# Replace regex matched pattern using dictionary

I am trying to replace a pattern in my string where only the words between the tags should be replaced. The word that needs to be replaced resides in a dictionary as key and value pair.
Currently this is what I am trying:
string input = "<a>hello</a> <b>hello world</b> <c>I like apple</c>";
string pattern = @"(?<=>)(.)?[^<>]*(?=</)";
Regex match = new Regex(pattern, RegexOptions.IgnoreCase);
MatchCollection matches = match.Matches(input);
var dictionary1 = new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase);
dictionary1.Add("hello", "Hi");
dictionary1.Add("world", "people");
dictionary1.Add("apple", "fruit");
string output = "";
output = match.Replace(input, replace => { return dictionary1.ContainsKey(replace.Value) ? dictionary1[replace.Value] : replace.Value; });
Console.WriteLine(output);
Console.ReadLine();
Using this, it does replace but only the first 'hello' and not the second one. I want to replace every occurrence of 'hello' between the tags.
Any help will be much appreciated.
The problem is that the matches are:
hello
hello world
I like apple
so e.g. hello world is not in your dictionary.
Based on your code, this could be a solution:
using System;
using System.Text.RegularExpressions;
using System.Collections.Generic;
public class Program
{
public static void Main()
{
var dictionary1 = new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase);
dictionary1.Add("hello", "Hi");
dictionary1.Add("world", "people");
dictionary1.Add("apple", "fruit");
string input = "<a>hello</a> <b>hello world</b> <c>I like apple</c>";
string pattern = ("(?<=>)(.)?[^<>]list|" + GetKeyList(dictionary1) + "(?=</)");
Regex match = new Regex(pattern, RegexOptions.IgnoreCase);
MatchCollection matches = match.Matches(input);
string output = "";
output = match.Replace(input, replace => {
Console.WriteLine(" - " + replace.Value);
return dictionary1.ContainsKey(replace.Value) ? dictionary1[replace.Value] : replace.Value;
});
Console.WriteLine(output);
}
private static string GetKeyList(Dictionary<string, string> list)
{
return string.Join("|", new List<string>(list.Keys).ToArray());
}
}
Fiddle: https://dotnetfiddle.net/zNkEDv
If someone wants to dig into this and tell me why I need a "list|" in the pattern (because otherwise the first item is ignored), I'd appreciate it.
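On the "list|" question: in a regular expression, `|` has the lowest precedence, so a pattern of the form `prefix` + `hello|world|apple` + `suffix` is read as three alternatives: `prefix hello`, `world`, and `apple suffix`. The lookbehind and `(.)?[^<>]` therefore attach only to the first alternative, which is why the first key appears to be ignored unless a dummy first item absorbs them. Wrapping the keys in a group avoids this. Below is a minimal sketch under that assumption; `TagTextReplacer` and `ReplaceInsideTags` are hypothetical names, and the lookahead is a simplification that treats any key followed by text and then a closing tag as "inside a tag".

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class TagTextReplacer
{
    // Replaces every dictionary key that occurs as a whole word between an
    // opening and a closing tag. Assumes the dictionary is case-insensitive.
    public static string ReplaceInsideTags(string input, IDictionary<string, string> replacements)
    {
        // Escape each key and wrap the alternation in a non-capturing group so
        // that the word boundaries and the lookahead apply to every key.
        string keys = string.Join("|", replacements.Keys.Select(Regex.Escape));

        // (?=[^<>]*</) requires a closing tag to follow before any other tag.
        string pattern = @"\b(?:" + keys + @")\b(?=[^<>]*</)";

        return Regex.Replace(
            input,
            pattern,
            m => replacements.TryGetValue(m.Value, out string replacement) ? replacement : m.Value,
            RegexOptions.IgnoreCase);
    }
}
```

For the input from the question this replaces both occurrences of "hello" as well as "world" and "apple", and it leaves text outside the tags untouched.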
This is another way of doing it - I parse the string into XML and then select elements containing the keys in your dictionary and then replace each element's value.
However, you have to have a valid XML document - your example lacks a root node.
var xDocument = XDocument.Parse("<root><a>hello</a> <b>hello world</b> <c>I like apple</c></root>");
var dictionary1 = new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase) { { "hello", "Hi" }, { "world", "people" }, { "apple", "fruit" } };
string pattern = @"\w+";
Regex match = new Regex(pattern, RegexOptions.IgnoreCase);
var xElements = xDocument.Root.Descendants()
.Where(x => dictionary1.Keys.Any(s => x.Value.Contains(s)));
foreach (var xElement in xElements)
{
var updated = match.Replace(xElement.Value,
replace => {
return dictionary1.ContainsKey(replace.Value)
? dictionary1[replace.Value] : replace.Value; });
xElement.Value = updated;
}
string output = xDocument.ToString(SaveOptions.DisableFormatting);
This pattern of "\w+" matches words, not spaces.
This LINQ selects descendants of the root node where the element value contains any of the keys of your dictionary:
var xElements = xDocument.Root.Descendants().Where(x => dictionary1.Keys.Any(s => x.Value.Contains(s)));
I then iterate through the XElement enumerable collection returned and apply your replacement MatchEvaluator to just the string value, which is a lot easier!
The final output is <root><a>Hi</a><b>Hi people</b><c>I like fruit</c></root>. You could then remove the opening and closing <root> and </root> tags, but I don't know what your complete XML looks like.
This will do what you want (from what you have provided so far):
private static Dictionary<string, string> dict;
static void Main(string[] args)
{
dict =
new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase)
{
{ "hello", "Hi" },
{ "world", "people" },
{ "apple", "fruit" }
};
var input = "<a>hello</a> <b>hello world</b> apple <c>I like apple</c> hello";
var pattern = @"<.>([^<>]+)<\/.>";
var output = Regex.Replace(input, pattern, Replacer);
Console.WriteLine(output);
Console.ReadLine();
}
static string Replacer(Match match)
{
var value = match.Value;
foreach (var kvp in dict)
{
if (value.Contains(kvp.Key)) value = value.Replace(kvp.Key, kvp.Value);
}
return value;
}

Not able to read XML string in C#

I have created an XML string and I am looping over it to read values, but execution never enters the foreach loop. The same loop works in my other code.
My code is:
XML string:
<SuggestedReadings>
<Suggestion Text="Customer Centricity" Link="http://wdp.wharton.upenn.edu/book/customer-centricity/?utm_source=Coursera&utm_medium=Web&utm_campaign=custcent" SuggBy="Pete Fader’s" />
<Suggestion Text="Global Brand Power" Link="http://wdp.wharton.upenn.edu/books/global-brand-power/?utm_source=Coursera&utm_medium=Web&utm_campaign=glbrpower" SuggBy="Barbara Kahn’s" />
</SuggestedReadings>
Code Is:
string str = CD.SRList.Replace("&", "&amp;");
XmlDocument xmlDoc = new XmlDocument();
xmlDoc.LoadXml(str);
XmlNode SuggestionListNode = xmlDoc.SelectSingleNode("/SuggestedReadings/Suggestion");
foreach (XmlNode node in SuggestionListNode)
{
COURSESUGGESTEDREADING CSR = new COURSESUGGESTEDREADING();
var s = db.COURSESUGGESTEDREADINGS.OrderByDescending(o => o.SRID);
CSR.SRID = (s == null ? 0 : s.FirstOrDefault().SRID) + 1;
CSR.COURSEID = LibId;
CSR.TEXT = node.Attributes.GetNamedItem("Text").Value;
CSR.LINK = node.Attributes.GetNamedItem("Link").Value;
CSR.SUGBY = node.Attributes.GetNamedItem("SuggBy").Value;
CSR.ACTIVEFLAG = "Y";
CSR.CREATEDBY = CD.CreatedBy;
CSR.CREATEDDATE = DateTime.Now;
db.COURSESUGGESTEDREADINGS.Add(CSR);
}
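You should use SelectNodes, not SelectSingleNode, since you are trying to get multiple rows out of the XML document. SelectSingleNode returns only the first Suggestion element, and a foreach over a single XmlNode enumerates its child nodes; that element has none, so the loop body never runs.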
Use this:
XmlNodeList SuggestionListNode = xmlDoc.SelectNodes("//Suggestion");
foreach (XmlNode node in SuggestionListNode)
{
}
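To fill in the loop body, here is a minimal sketch of the SelectNodes approach. `SuggestionReader` and `ReadTexts` are hypothetical names, and the input is assumed to be well-formed XML, meaning every `&` in the links is already written as `&amp;`.

```csharp
using System.Collections.Generic;
using System.Xml;

public static class SuggestionReader
{
    // Returns the Text attribute of every <Suggestion> element.
    public static List<string> ReadTexts(string xml)
    {
        var doc = new XmlDocument();
        doc.LoadXml(xml);

        var texts = new List<string>();

        // SelectNodes returns all matching elements, so the loop runs once per
        // <Suggestion>; SelectSingleNode would return only the first one.
        foreach (XmlNode node in doc.SelectNodes("/SuggestedReadings/Suggestion"))
        {
            texts.Add(node.Attributes["Text"].Value);
        }

        return texts;
    }
}
```

The Link and SuggBy attributes can be read the same way inside the loop.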
You can try this.
XDocument xdoc = XDocument.Load("data.xml");
var xmlData = from lv1 in xdoc.Descendants("Suggestion")
select new {
Text = lv1.Attribute("Text").Value,
Link = lv1.Attribute("Link").Value,
SuggBy = lv1.Attribute("SuggBy").Value
};
foreach (var item in xmlData){
// your logic here
}

To return the multiple values from for loop

I have parsed the XML document and used a for loop to collect different values into a string, but when I return the value I get only the last one obtained. I want to return all the individual values so that I can store them in any file format.
Below is my code:
XmlDocument xmlDOC = new XmlDocument();
xmlDOC.LoadXml(periodID_Value_Before_OffSet); // string storing my XML
var value = xmlDOC.GetElementsByTagName("value");
var xmlActions = new string[value.Count];
string values = "";
string Period1 = "";
string periodlevel_period1 = "";
var listOfStrings = new List<string>();
string modified_listofstrings = listOfStrings.ToString();
string arrayOfStrings = "";
for (int i = 0; i < value.Count; i++)
{
var xmlAttributeCollection = value[i].Attributes;
if (xmlAttributeCollection != null)
{
var action = xmlAttributeCollection["periodid"];
xmlActions[i] = action.Value;
values += action.Value + ",";
string vals = values.Split(',')[1];
string counts = values;
string[] periods = counts.Split(',');
Period1 = periods[i];
// periodlevel_period1 = Client.GetAttributeAsString(sessionId, Period1, "name", "");
modified_listofstrings = Client.GetAttributeAsString(sessionId, Period1, "name", "");
modified_listofstrings.ToArray().ToString();
//listOfStrings = periodlevel_period1;
}
}
return modified_listofstrings;
The modified_listofstrings string returns only the last value. I want to return an array of all the values obtained while looping.
----------Updated question----------
Below is my sample XML:
<string xmlns="http://tempuri.org/">
<ResultSetHierarchy totalResultsReturned="1" totalResults="1" firstIndex="0" maxCount="-1">
<object id="SC.1938773693.238">
<measure.values>
<series id="SC.1938773693.108280985">
<value periodid="SC.1938773693.394400760" value="17" />
<value periodid="SC.1938773693.1282504058" value="15" />
<value periodid="SC.1938773693.1631528570" value="13" />
</series>
</measure.values>
</object>
</ResultSetHierarchy>
</string>
I want output such as "SC.1938773693.394400760":"17", and so on for every periodid.
Based on the provided information I have updated the answer.
List<string> items = new List<string>();
XmlDocument xmlDOC = new XmlDocument();
xmlDOC.Load(@"E:\Delete Me\ConsoleApplication1\ConsoleApplication1\bin\Debug\List.xml");
var elements = xmlDOC.GetElementsByTagName("value");
foreach (var item in elements)
{
XmlElement value = (XmlElement)item;
items.Add(string.Format("{0}:{1}", value.GetAttribute("periodid"), value.GetAttribute("value")));
}
It looks like you're trying to:
Load an XmlDocument
Get a list of all the attributes of name 'periodid'
Look each periodid up using a webservice call
Return a list of the lookup results
If that is correct, the following method should do what you need:
public List<string> GetListOfData()
{
XmlDocument xmlDOC = new XmlDocument();
xmlDOC.LoadXml("<Html><value periodid='Yabba'>YabbaValue</value><value periodid='Dabba'>DabbaValue</value><value periodid='Doo'>DooValue</value></Html>"); // string storing my XML
var value = xmlDOC.GetElementsByTagName("value");
var listOfStrings = new List<string>();
for (int i = 0; i < value.Count; i++)
{
var xmlAttributeCollection = value[i].Attributes;
if (xmlAttributeCollection != null)
{
var action = xmlAttributeCollection["periodid"];
string Period1 = action.Value;
listOfStrings.Add(QPR_webService_Client.GetAttributeAsString(sessionId, Period1, "name", "") + ":" + value[i].InnerText);
}
}
return listOfStrings;
}
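For the "periodid":"value" output asked for in the updated question, a LINQ to XML version is another option. This is a minimal sketch, assuming the XML is well formed and uses the default namespace `http://tempuri.org/` shown in the sample; `PeriodValueReader` and `ReadPairs` are hypothetical names.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class PeriodValueReader
{
    // Returns one "periodid":"value" string per <value> element.
    public static List<string> ReadPairs(string xml)
    {
        XDocument doc = XDocument.Parse(xml);

        // The sample declares a default namespace, and LINQ to XML needs it
        // as part of the element name. Unprefixed attributes have no namespace.
        XNamespace ns = "http://tempuri.org/";

        return doc.Descendants(ns + "value")
                  .Select(v => string.Format("\"{0}\":\"{1}\"",
                      (string)v.Attribute("periodid"),
                      (string)v.Attribute("value")))
                  .ToList();
    }
}
```

The returned list holds every pair, so it can be joined with `string.Join` or written to a file line by line.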

How to get the unescaped length of XElement inner text?

I am trying to parse the following Java resources file, which is XML.
I am parsing using C# and XDocument tools, so not a Java question here.
<?xml version="1.0" encoding="utf-8"?>
<resources>
<string name="problem">&#160;test&#160;</string>
<string name="no_problem"> test </string>
</resources>
The problem is that the XDocument.Load(string path) method loads this as an XDocument with 2 seemingly identical XElements.
I load the file.
string filePath = @"c:\res.xml"; // whatever
var xDocument = XDocument.Load(filePath);
When I parse the XDocument object, here is the problem.
foreach (var node in xDocument.Root.Nodes())
{
if (node.NodeType == XmlNodeType.Element)
{
var xElement = node as XElement;
if (xElement != null) // just to be sure
{
var elementText = xElement.Value;
Console.WriteLine("Text = '{0}', Length = {1}",
elementText, elementText.Length);
}
}
}
This produces the following 2 lines :
"Text = ' test ', Length = 6"
"Text = ' test ', Length = 6"
I want to get the following 2 lines :
"Text = ' test ', Length = 6"
"Text = '&#160;test&#160;', Length = 16"
Document encoding is UTF8, if this is relevant somehow.
string filePath = @"c:\res.xml"; // whatever
var xDocument = XDocument.Load(filePath);
String one = (xDocument.Root.Nodes().ElementAt(0) as XElement).Value; // "test" surrounded by non-breaking spaces (char 160)
String two = (xDocument.Root.Nodes().ElementAt(1) as XElement).Value; // "test" surrounded by regular spaces (char 32)
Console.WriteLine(one == two); //false
Console.WriteLine(String.Format("{0} {1}", (int)one[0], (int)two[0]));//160 32
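You have two different strings, and &#160; is there, but as the Unicode character itself (a non-breaking space, char 160) rather than as the entity text.
One possible way to get things back is to manually replace the non-breaking space with "&#160;":
String result = one.Replace(((char) 160).ToString(), "&#160;");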
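A small self-contained check of this behavior, as a sketch: the parser resolves the numeric character reference `&#160;` into the single character U+00A0, so the parsed text has length 6, and the escaped form has to be rebuilt by hand to get 16. `NbspDemo` and its method names are hypothetical.

```csharp
using System.Xml.Linq;

public static class NbspDemo
{
    // Returns the element text as LINQ to XML exposes it: character
    // references such as &#160; are already resolved to real characters.
    public static string ReadText(string xml)
    {
        return XElement.Parse(xml).Value;
    }

    // Rebuilds the escaped form by replacing each non-breaking space
    // (U+00A0, decimal 160) with the six-character text "&#160;".
    public static string ReEscapeNbsp(string value)
    {
        return value.Replace("\u00A0", "&#160;");
    }
}
```

For `<string name="problem">&#160;test&#160;</string>`, `ReadText` gives a 6-character string whose first character has code 160, and `ReEscapeNbsp` of that string has length 16.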
Thanks to Dmitry, following his suggestion, I have made a function to make stuff work for a list of unicode codes.
private static readonly List<int> UnicodeCharCodesReplace =
new List<int>() { 160 }; // put integers here
public static string UnicodeUnescape(this string input)
{
var chars = input.ToCharArray();
var sb = new StringBuilder();
foreach (var c in chars)
{
if (UnicodeCharCodesReplace.Contains(c))
{
// Append &#code; instead of character
sb.Append("&#");
sb.Append(((int) c).ToString());
sb.Append(";");
}
else
{
// Append character itself
sb.Append(c);
}
}
return sb.ToString();
}
