How to check that your XML documents do not contain any external resources

I am looking for the best way to detect XML documents that contain external resources.
I received this finding during a Veracode analysis:
Configure the XML parser to disable external entity resolution.
I can set the XmlResolver to null, but we also depend on third-party DLLs. So I would like to check whether the XML contains any external resource and reject the file immediately if it does.
We do not use DTDs for our XML documents.
Here are the two options I could think of. I guess they are almost the same; I just want to make sure I'm not missing anything.
// Check for a DTD element in the XML; if it contains one, ignore this document.
public bool IsValid(string xml)
{
    if (xml.Contains("<!DOCTYPE"))
    {
        return false;
    }
    return true;
}
or
public bool IsValid(string xml)
{
    XmlReaderSettings xs = new XmlReaderSettings() { DtdProcessing = DtdProcessing.Prohibit };
    try
    {
        XmlReader.Create(xml, xs);
        return true;
    }
    catch (Exception ex)
    {
        return false;
    }
}
Also, this only handles DTDs. How can we check for other external resources, such as entities and schemas? What is the process for checking for all external entities? Thanks for your help.
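A minimal sketch of the second option, assuming the XML arrives as an in-memory string and that rejecting every document with a DOCTYPE is acceptable (the question states that no DTDs are used). Two details differ from the snippet above: the XmlReader.Create(string, XmlReaderSettings) overload treats the string as a URI rather than as XML content, so the text is wrapped in a StringReader here, and XmlReader.Create does not parse anything by itself, so the sketch reads to the end of the document. The names XmlSafetyCheck and IsSafe are illustrative, not part of any library.

```csharp
using System;
using System.IO;
using System.Xml;

public static class XmlSafetyCheck
{
    // Returns false if the text is not well-formed XML or contains a DOCTYPE.
    // Hypothetical helper for illustration; throws ArgumentNullException on null input.
    public static bool IsSafe(string xml)
    {
        var settings = new XmlReaderSettings
        {
            DtdProcessing = DtdProcessing.Prohibit, // any <!DOCTYPE ...> raises XmlException
            XmlResolver = null                      // never fetch external resources
        };

        try
        {
            using (XmlReader reader = XmlReader.Create(new StringReader(xml), settings))
            {
                while (reader.Read()) { } // parsing happens here, not in Create
            }
            return true;
        }
        catch (XmlException)
        {
            return false;
        }
    }
}
```

Because XML entities can only be declared inside a DTD, prohibiting DTDs should also rule out external entities. As far as I know, schema location hints (xsi:schemaLocation) are only followed when schema validation with the ProcessSchemaLocation flag is enabled, which is not the case in this sketch.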

Related

How to cache reading .csv files in C#

This may be a noob question, but I need some help. I have written two simple methods in C#: ReadCsv_IT and GetTranslation. The ReadCsv_IT method reads from a CSV file. The GetTranslation method calls ReadCsv_IT and returns the translation of the input (string key).
My problem is that in the future I will have to call GetTranslation many times, and I obviously don't want to read the .csv file every time. So I was thinking about using an in-memory cache, so that the .csv file is not read on every request. But I am not sure how to do it or what else I could do to optimize my program. Can anyone please help?
public string ReadCsv_IT(string key)
{
    string newKey = "";
    using (var streamReader = new StreamReader(@"MyResource.csv"))
    {
        CsvReader csv = new CsvReader(streamReader);
        csv.Configuration.Delimiter = ";";
        List<DataRecord> rec = csv.GetRecords<DataRecord>().ToList();
        DataRecord record = rec.FirstOrDefault(a => a.ORIGTITLE == key);
        if (record != null)
        {
            //DOES THE LOCALIZATION with the help of the .csv file.
        }
    }
    return newKey;
}
Here is the GetTranslation method:
public string GetTranslation(string key, string culture = null)
{
    string result = "";
    if (culture == null)
    {
        culture = Thread.CurrentThread.CurrentCulture.Name;
    }
    if (culture == "it-IT")
    {
        result = ReadCsv_IT(key);
    }
    return result;
}
Here is also the class DataRecord:
class DataRecord
{
    public string ORIGTITLE { get; set; }
    public string REPLACETITLE { get; set; }
    public string ORIGTOOLTIP { get; set; }
}

Two options, IMO.
First: turn your stream into an object. In other words, make the stream a member of a class, so that you can refer to that member instead of reopening the file.
Second: initialize your stream in the scope that calls GetTranslation, and pass it as a parameter to GetTranslation and ReadCsv_IT.

Brecht C and Thom Hubers have already given you good advice. I would like to add one more point, though: using csv files for localization in .NET is not really a good idea. Microsoft recommends using a resource-based approach (this article is a good starting point). It seems to me that you are trying to write code for something that is already built into .NET.
From a translation point of view, CSV files are not the best possible format either. First of all, they are not really standardized: many tools have slightly different ways of handling commas, quotes, and line breaks that are part of the translated text. Besides, translators will be tempted to open them in Excel, and, unless handled with caution, Excel will write out translations in whatever encoding it deems best.
If the project you are working on is for learning, please feel free to go ahead with it. But if you are developing software that will be used by customers, updated, translated into several target languages, and redeployed, I would recommend reconsidering your internationalization approach.

@Brecht C is right; use that answer to start. When a value has to be cached for use by multiple threads or instances, take a look at InMemoryCache or Redis once performance and distribution over several clients become an issue.
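The caching idea discussed in these answers can be sketched as follows. This is only an illustration under stated assumptions: the file is semicolon-delimited with a header row and the column order ORIGTITLE;REPLACETITLE;ORIGTOOLTIP, and it is parsed with a plain Split instead of the CsvReader used in the question so that the example stays self-contained (quoted fields are therefore not handled). TranslationCache is a hypothetical class name.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public class TranslationCache
{
    // Lazy<T> is thread-safe by default, so the file is read once, on first use.
    private readonly Lazy<Dictionary<string, string>> _translations;

    public TranslationCache(string csvPath)
    {
        _translations = new Lazy<Dictionary<string, string>>(() => Load(csvPath));
    }

    // Reads the whole file into a dictionary: ORIGTITLE -> REPLACETITLE.
    private static Dictionary<string, string> Load(string csvPath)
    {
        var map = new Dictionary<string, string>();
        bool isHeader = true;
        foreach (string line in File.ReadLines(csvPath))
        {
            if (isHeader) { isHeader = false; continue; } // skip the header row
            string[] fields = line.Split(';');
            if (fields.Length >= 2)
            {
                map[fields[0]] = fields[1];
            }
        }
        return map;
    }

    // Returns the cached translation, or an empty string when the key is unknown
    // (mirroring the empty default in the question's code).
    public string GetTranslation(string key)
    {
        string value;
        return _translations.Value.TryGetValue(key, out value) ? value : "";
    }
}
```

A single instance would be created once (for example at application start) and shared, so that every later call is a dictionary lookup rather than a file read.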

CodeFluent Aspect for Full-Text Index

I'm trying to develop a CodeFluent aspect that marks a property of an entity as a full-text index.
I've found this link, which does something similar to what I'm aiming for.
http://blog.codefluententities.com/2012/11/27/using-the-sql-server-template-producer-to-generate-clustered-indexes/
However, this uses a SQL template producer. Is there any way to set a property to be a full-text index entirely in the aspect itself, so I don't have to install and maintain both a template producer and an aspect for all projects?
Here's the C# aspect code I have so far:
public class FullTextIndexing : IProjectTemplate
{
public static readonly XmlDocument Descriptor;
public const string Namespace = "http://www.softfluent.com/aspects/samples/FullTextIndexing";
static FullTextIndexing()
{
Descriptor = new XmlDocument();
Descriptor.LoadXml(
@"<cf:project xmlns:cf='http://www.softfluent.com/codefluent/2005/1' defaultNamespace='FullTextIndexing'>
<cf:pattern name='Full Text Indexing' namespaceUri='" + Namespace + @"' preferredPrefix='fti' step='Tables'>
<cf:message class='_doc'>CodeFluent Full Text Indexing Aspect</cf:message>
<cf:descriptor name='fullTextIndexing'
typeName='boolean'
category='Full Text Indexing'
targets='Property'
defaultValue='false'
displayName='Full-Text Index'
description='Determines if property should be full text indexed.' />
</cf:pattern>
</cf:project>");
}
public Project Project { get; set; }
public XmlDocument Run(IDictionary context)
{
if (context == null || !context.Contains("Project"))
{
// we are probably called for meta data inspection, so we send back the descriptor xml
return Descriptor;
}
// the dictionary contains at least these two entries
Project = (Project)context["Project"];
// the dictionary contains at least these two entries
XmlElement element = (XmlElement)context["Element"];
Project project = (Project)context["Project"];
foreach (Entity entity in project.Entities)
{
Console.WriteLine(">>PROPERTY LOGGING FOR ENTITY "+entity.Name.ToUpper()+":<<");
foreach (Property property in entity.Properties)
{
Log(property);
if(MustFullTextIndex(property))
{
Console.WriteLine("CHANGING PROPERTY");
property.TypeName = "bool";
Log(property);
}
}
}
// we have no specific Xml to send back, but aspect description
return Descriptor;
}
private static bool MustFullTextIndex(Property property)
{
return property != null && property.IsPersistent && property.GetAttributeValue("fullTextIndexing", Namespace, false);
}
private static void Log(Property property)
{
Console.WriteLine(property.Trace());
}
}
EDIT ONE:
Following Meziantou's answer, I'm trying to create a template producer, but it's giving me compilation errors when I try to add the new template producer to the project producers list, so I'm probably doing it wrong.
The error says:
Cannot convert type 'CodeFluent.Model.Producer' to 'CodeFluent.Producers.SqlServer.TemplateProducer'
Here's the code I have thus far:
public XmlDocument Run(IDictionary context)
{
if (context == null || !context.Contains("Project"))
{
// we are probably called for meta data inspection, so we send back the descriptor xml
return Descriptor;
}
// the dictionary contains at least these two entries
XmlElement element = (XmlElement)context["Element"];
Project project = (Project)context["Project"];
CodeFluent.Producers.SqlServer.TemplateProducer producer = new CodeFluent.Producers.SqlServer.TemplateProducer();
producer.AddNamespace("CodeFluent.Model");
producer.AddNamespace("CodeFluent.Model.Persistence");
producer.AddNamespace("CodeFluent.Producers.SqlServer");
Console.WriteLine(producer.Element);
//TODO: Need to figure out how to modify the actual template's contents
project.Producers.Add(producer); //Error happens here
// we have no specific Xml to send back, but aspect description
return Descriptor;
}

In the sample code, the aspect is used only because it has a descriptor. Descriptors are used by CodeFluent Entities to populate the property grid:
<cf:descriptor name="IsClusteredIndex" typeName="boolean" targets="Property" defaultValue="false" displayName="IsClusteredIndex" />
So when you set the value of this property to true or false, the xml attribute ns:IsClusteredIndex is added or removed from the xml file.
Then the SQL Template reads the value of the attribute to generate the expected SQL file:
property.GetAttributeValue("sa:IsClusteredIndex", false)
So the aspect is not mandatory, but provides a graphical interface friendly way to add/remove the attribute. If you don't need to integrate into the graphical interface, you can safely remove the aspect.
If your goal is to integrate into the graphical interface, you need an aspect (XML or DLL) or a producer. If you don't want to create a producer, you can embed the template into your aspect. During the build, you can extract the SQL template and add the SQL Template producer to the project. This way everything is located in the aspect.

XML Validation against XSD always returns true

I have a c# script that validates an XML document against an XSD document, as follows:
static bool IsValidXml(string xmlFilePath, string xsdFilePath)
{
XmlReaderSettings settings = new XmlReaderSettings();
settings.Schemas.Add(null, xsdFilePath);
settings.ValidationType = ValidationType.Schema;
settings.Schemas.Compile();
try
{
XmlReader xmlRead = XmlReader.Create(xmlFilePath, settings);
while (xmlRead.Read())
{ };
xmlRead.Close();
}
catch (Exception e)
{
return false;
}
return true;
}
I've put this together after looking at a number of MSDN articles and questions here where this is the solution. It does correctly check that the XSD is well-formed (it returns false if I mess with the file) and that the XML is well-formed (it also returns false when messed with).
I've also tried the following, but it does the exact same thing:
static bool IsValidXml(string xmlFilePath, string xsdFilePath)
{
XDocument xdoc = XDocument.Load(xmlFilePath);
XmlSchemaSet schemas = new XmlSchemaSet();
schemas.Add(null, xsdFilePath);
try
{
xdoc.Validate(schemas, null);
}
catch (XmlSchemaValidationException e)
{
return false;
}
return true;
}
I've even pulled a completely random XSD off the internet and thrown it into both scripts, and it still validates on both. What am I missing here?
Using .NET 3.5 within an SSIS job.

In .NET you have to check for yourself whether the validator actually matches a schema component; if it doesn't, no exception is thrown, so your code will not work as you expect.
A match means one or both of the following:
- There is a global element in your schema set whose qualified name is the same as your XML document element's qualified name.
- The document element has an xsi:type attribute whose value is a qualified name pointing to a global type in your schema set.
In streaming mode, you can do this check easily. This pseudo-kind-of-code should give you an idea (error handling not shown, etc.):
using (XmlReader reader = XmlReader.Create(xmlfile, settings))
{
    reader.MoveToContent();
    var qn = new XmlQualifiedName(reader.LocalName, reader.NamespaceURI);
    // element test: schemas.GlobalElements.Contains(qn);
    // check if there's an xsi:type attribute: reader["type", XmlSchema.InstanceNamespace] != null;
    // if it exists, resolve the value of the xsi:type attribute to an XmlQualifiedName
    // type test: schemas.GlobalTypes.Contains(qn);
    // if all good, keep reading; otherwise, break here after setting your error flag, etc.
}
You might also consider the XmlNode.SchemaInfo which represents the post schema validation infoset that has been assigned to a node as a result of schema validation. I would test different conditions and see how it works for your scenario. The first method is recommended to reduce the attack surface in DoS attacks, as it is the fastest way to detect completely bogus payloads.
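The root-element check described in this answer can be sketched as a small pre-check that runs before the validating read loop. This is an illustration under assumptions rather than a definitive implementation: the XML is passed as a string, a plain non-validating reader is used just to peek at the document element, and the names RootSchemaCheck and RootMatchesSchema are hypothetical.

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.Schema;

public static class RootSchemaCheck
{
    // True when the document element is declared as a global element in the
    // schema set, or carries an xsi:type that names a global type in it.
    public static bool RootMatchesSchema(string xml, XmlSchemaSet schemas)
    {
        schemas.Compile(); // GlobalElements and GlobalTypes are filled in by Compile

        using (XmlReader reader = XmlReader.Create(new StringReader(xml)))
        {
            reader.MoveToContent(); // skip the XML declaration, comments, whitespace
            var elementName = new XmlQualifiedName(reader.LocalName, reader.NamespaceURI);
            if (schemas.GlobalElements.Contains(elementName))
            {
                return true;
            }

            string xsiType = reader.GetAttribute("type", XmlSchema.InstanceNamespace);
            if (xsiType == null)
            {
                return false;
            }

            // Resolve the "prefix:local" value against the in-scope namespaces.
            int colon = xsiType.IndexOf(':');
            string prefix = colon < 0 ? string.Empty : xsiType.Substring(0, colon);
            string local = xsiType.Substring(colon + 1);
            string ns = reader.LookupNamespace(prefix) ?? string.Empty;
            return schemas.GlobalTypes.Contains(new XmlQualifiedName(local, ns));
        }
    }
}
```

If the check returns false, the document can be rejected immediately; otherwise the usual validating read loop can run with the same schema set.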

Keeping track of user customizations in C#

Good evening. I have an application with a drop-down list. This drop-down list is meant to be a list of commonly visited websites that the user can alter.
My question is: how can I store these values in such a manner that users can change them?
For example, I, as the user, decide I want Google to be my first website and YouTube to be my second.
I have considered making a "settings" file, but is it practical to put 20+ websites into a settings file and then load them at startup? I also considered a local database, but that may be overkill for such a simple need.
Please point me in the right direction.

Given that you have already excluded a database (probably for the right reasons, as it may be overkill for a small app), I'd recommend writing the data to a local file, but not as plain text.
Preferably, serialize it as either XML or JSON.
This approach has at least two benefits:
- More complex data can be stored in the future. For example, while the order can be implicit, it can also be made explicit, or additional data such as the last time the URL was used can be added.
- Structured data is easier to validate against random corruption. With a plain text file, it would be much harder to ensure its integrity.

The best option would be to use the serializer and deserializer built into C#, which let you work with the file in an object-oriented way. At the same time, you don't need to worry about the details of storing the data in files.
Here is some sample code I quickly wrote for you.
using System;
using System.IO;
using System.Collections;
using System.Xml.Serialization;
namespace ConsoleApplication3
{
public class UrlSerializer
{
private static void Write(string filename)
{
URLCollection urls = new URLCollection();
urls.Add(new Url { Address = "http://www.google.com", Order = 1 });
urls.Add(new Url { Address = "http://www.yahoo.com", Order = 2 });
XmlSerializer x = new XmlSerializer(typeof(URLCollection));
// Dispose the writer so the file is flushed and closed.
using (TextWriter writer = new StreamWriter(filename))
{
x.Serialize(writer, urls);
}
}
private static URLCollection Read(string filename)
{
var x = new XmlSerializer(typeof(URLCollection));
// Dispose the reader so the file handle is released.
using (TextReader reader = new StreamReader(filename))
{
return (URLCollection)x.Deserialize(reader);
}
}
}
public class URLCollection : ICollection
{
public string CollectionName;
private ArrayList _urls = new ArrayList();
public Url this[int index]
{
get { return (Url)_urls[index]; }
}
public void CopyTo(Array a, int index)
{
_urls.CopyTo(a, index);
}
public int Count
{
get { return _urls.Count; }
}
public object SyncRoot
{
get { return this; }
}
public bool IsSynchronized
{
get { return false; }
}
public IEnumerator GetEnumerator()
{
return _urls.GetEnumerator();
}
public void Add(Url url)
{
if (url == null) throw new ArgumentNullException("url");
_urls.Add(url);
}
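}
// Url was not shown in the original answer; this minimal definition is assumed from how it is used above.
public class Url
{
public string Address { get; set; }
public int Order { get; set; }
}
}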

You clearly need some sort of persistence, for which there are a few options:
Local database
- As you have noted, total overkill. You are just storing a list, not relational data
Simple text file
- Pretty easy, but maybe not the most "professional" way. Using XML serialization to this file would allow for complex data types.
Settings file
- Are these preferences really settings? If they are, then this makes sense.
The Registry
- This is great for settings you don't want your users to ever manually mess with. Probably not the best option for a significant amount of data, though.
I would go with number 2. It doesn't sound like you need any fancy encoding or security, so just store everything in a text file. *.ini files tend to meet this description, but you can use any extension you want. A settings file doesn't seem like the right place for this scenario.

XmlDocument.Validate does not fire for multiple errors

I am trying to validate an incoming XmlDocument against an existing XmlSchemaSet. The code is as follows:
public class ValidateSchemas
{
private bool _isValid = true;
public List<string> errorList = new List<string>();
public bool ValidateDocument(XmlDocument businessDocument)
{
XmlSchemaSet schemaSet = SchemaLoader.Loader();
bool isValid = Validate(businessDocument, SchemaLoader._schemaSet);
return isValid;
}
public bool Validate(XmlDocument document, XmlSchemaSet schema)
{
ValidationEventHandler eventHandler = new ValidationEventHandler(HandleValidationError);
document.Schemas = schema;
document.Validate(eventHandler);
return _isValid;
}
private void HandleValidationError(object sender, ValidationEventArgs ve)
{
_isValid = false; errorList.Add(ve.Message);
}
}
The code works fine from a validation perspective. However, errorList captures only the error for the first node; it does not capture the errors for the other nodes. It looks like the event is fired only once. How can I capture all of the errors? Please note that I receive an XmlDocument as input, which is why I am not using a reader.

That's exactly the expected behavior of the XmlDocument.Validate method. Once it finds a validation error, it stops the validation process and returns the error. So the user has to fix that error and validate again.
This behavior is different from the Visual Studio error list. For example, a single syntax error in the code sometimes produces hundreds of errors, even though you only have to fix one thing in one place. So there are both pros and cons, depending on the circumstances. However, I don't think you can easily get all the validation errors for an XmlDocument; it inherently works in a different way.
