Is it possible to restrict/require certain capabilities in a Stream parameter? - c#

I'm writing an application that creates catalogs of files. Currently the catalog information is stored in an XML file, but I'm trying to abstract the interface to a catalog to allow for other future storage mechanisms such as a single ZIP file, SQL server, or HTTP server. So rather than returning a file path, the abstract Catalog class returns files as byte Streams. This allows the source of a file to be a disk, but also a database or a web server. See my previous related question.
However, the root Stream class includes Streams with different capabilities. Some streams can only be read, others can only be written to. Still some streams support seeking, while other streams do not.
Is there any way to restrict the capabilities of the stream returned by a property or method? For example, my Catalog class looks something like this:
public abstract class Catalog
{
    ...
    public abstract Stream File
    {
        get;
    }
    ...
}
Is there some way to ensure that File will always return a readable stream that supports seeking?

Well, you can check the CanRead, CanWrite and CanSeek properties of the stream.
I'm not sure I understand your question correctly, though... What exactly are you trying to do?
Some streams will never be seekable (for instance NetworkStream, GZipStream...), so if you're working on those types of stream, there is no way to force them to seek.
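As a rough sketch of the property checks mentioned above: the helper below verifies CanRead and CanSeek at run time. Buffering a non-seekable stream into a MemoryStream is an assumption on my part (it is not something the answer proposes) and only makes sense when the content fits in memory. The names StreamGuards and EnsureReadableAndSeekable are hypothetical.

```csharp
using System;
using System.IO;

// Hypothetical helper, not part of the .NET Framework.
public static class StreamGuards
{
    // Returns a stream that is readable and seekable, or throws.
    public static Stream EnsureReadableAndSeekable(Stream source)
    {
        if (source == null) throw new ArgumentNullException(nameof(source));
        if (!source.CanRead) throw new ArgumentException("Stream must be readable.", nameof(source));

        // Already seekable: hand it back unchanged.
        if (source.CanSeek) return source;

        // Assumption: the content is small enough to buffer in memory.
        var buffer = new MemoryStream();
        source.CopyTo(buffer);
        buffer.Position = 0;
        return buffer;
    }
}
```

A Catalog implementation could pass its stream through such a helper before returning it, so callers get a run-time guarantee even though the Stream type itself cannot express one.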
If you just want to restrict the functionality of a stream (for instance, prevent writing to a stream that is normally writable), you can create a wrapper that delegates its implementation to the underlying stream, but throws an exception for "disabled" methods.
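The wrapper idea could look something like the following minimal sketch. ReadOnlyStream is a made-up name; it delegates reading and seeking to the underlying stream and throws NotSupportedException for the "disabled" write operations.

```csharp
using System;
using System.IO;

// Hypothetical wrapper that exposes an existing stream as read-only.
public sealed class ReadOnlyStream : Stream
{
    private readonly Stream _inner;

    public ReadOnlyStream(Stream inner)
    {
        if (inner == null) throw new ArgumentNullException(nameof(inner));
        if (!inner.CanRead) throw new ArgumentException("Stream must be readable.", nameof(inner));
        _inner = inner;
    }

    public override bool CanRead => true;
    public override bool CanSeek => _inner.CanSeek;
    public override bool CanWrite => false; // writing is disabled

    public override long Length => _inner.Length;

    public override long Position
    {
        get => _inner.Position;
        set => _inner.Position = value;
    }

    public override int Read(byte[] buffer, int offset, int count) =>
        _inner.Read(buffer, offset, count);

    public override long Seek(long offset, SeekOrigin origin) =>
        _inner.Seek(offset, origin);

    // Nothing to flush because nothing is ever written.
    public override void Flush() { }

    public override void SetLength(long value) =>
        throw new NotSupportedException("Stream is read-only.");

    public override void Write(byte[] buffer, int offset, int count) =>
        throw new NotSupportedException("Stream is read-only.");

    protected override void Dispose(bool disposing)
    {
        if (disposing) _inner.Dispose();
        base.Dispose(disposing);
    }
}
```

Whether the wrapper should dispose the underlying stream is a design choice; here it does, on the assumption that the wrapper owns it.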

How about abstracting the underlying persistence mechanism? What do your callers need? If they all need the same behaviour from your 'File', can you not create an interface for all your potential stores to implement, rather than have them all return 'Stream' classes?

Related

Intercepting a filestream...Impossible?

No doubt this isn't possible, but I would like to see if anyone has an ingenious suggestion. We have a third-party assembly which can output an image stored internally within a bespoke database to a file using an internal method, 'SaveToFile'. An example:
3rdParty.Scripting.ImageManager man = new 3rdParty.Scripting.ImageManager("ref");
3rdParty.Scripting.Image itemImg = man.GetImage(orderNumber);
itemImg.SaveToFile(@"c:\file.jpg");
itemImg.SaveToFile has no return value; it just creates a bitmap internally and writes that to a filestream. We have absolutely no access to the compiled method.
What I need to do is somehow intercept the filestream and read the bitmap. I know this probably isn't possible, but I'm no expert, so I wanted to see if there is a magical way to do this.
If all else fails I'll save the file and then read it back. I just want to avoid saving to disk if I can obtain the data directly and eventually convert it to a base64 string value.
Unfortunately unless the 3rd party library provides a SaveToStream method where you could provide the stream from the outside there's no way to achieve what you are after. You will have to save the contents to a temporary file and then read the contents back.
That's why it's usually best practice when designing a library to provide methods taking Streams as I/O parameters as this would give the consumer the control of whether he wants to save it to a file, memory or network stream.
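The temporary-file workaround could be sketched as below. The class TempFileCapture and its method are hypothetical; the saveToFile delegate stands in for a file-only API such as the third-party SaveToFile, whose real behaviour I can only assume from the question.

```csharp
using System;
using System.IO;

// Hypothetical helper for APIs that can only write to a file path.
public static class TempFileCapture
{
    // saveToFile is any action that writes its output to the given path.
    public static string CaptureAsBase64(Action<string> saveToFile)
    {
        string path = Path.GetTempFileName();
        try
        {
            saveToFile(path);                        // let the library write the file
            byte[] bytes = File.ReadAllBytes(path);  // read the content back
            return Convert.ToBase64String(bytes);
        }
        finally
        {
            File.Delete(path);                       // always clean up the temp file
        }
    }
}
```

With the code from the question, the call might look like TempFileCapture.CaptureAsBase64(p => itemImg.SaveToFile(p)).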

Deserializing .NET stream with multiple objects

I have a MemoryStream which I write into as I receive data off the network. Since the data can be broken up, there is the potential for the stream to contain a partial message or multiple messages. When deserializing, I place the pointer back at the beginning of the stream and try to deserialize a class of mine. I have the deserialize call wrapped in a try/catch block, but when I get to the deserialize line, the application just quits (no exception, no more lines run in the function, etc.).
I have multiple questions:
What is the best way to receive a stream of XML data from the network that may or may not be complete, and if so may or may not have more than one message?
Does the deserializer need to know about the encoding to decode the XML within the MemoryStream?
Does deserialization place the stream pointer after the deserialized object?
Can you deserialize multiple objects within a single stream?
1) You can leverage the XmlReader class which "provides forward-only, read-only access to a stream of XML data". That may help you translate xml data that may not be complete. http://msdn.microsoft.com/en-us/library/vstudio/system.xml.xmlreader
2) If you are referring to mixing ASCII, UTF-8, etc., then yes; otherwise I am not sure what the question is.
3) That depends on the deserializer you are using.
4) Yes, with the XmlReader class you can extract attributes and XML fragments for later consumption (although the solution is not elegant and is rather ugly).
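One way points 1 and 4 might be combined is sketched below: an XmlReader configured with ConformanceLevel.Fragment reads several top-level elements from one stream. FragmentReader and ReadMessages are hypothetical names, and using XNode.ReadFrom to materialise each message is my own choice, not something the answer prescribes. A truncated trailing message would make the reader throw an XmlException, so the caller would still need to catch that and wait for more data.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Xml;
using System.Xml.Linq;

// Hypothetical helper: reads every complete top-level element from a stream.
public static class FragmentReader
{
    public static List<XElement> ReadMessages(Stream stream)
    {
        // Fragment mode allows more than one root element.
        var settings = new XmlReaderSettings { ConformanceLevel = ConformanceLevel.Fragment };
        var messages = new List<XElement>();

        using (XmlReader reader = XmlReader.Create(stream, settings))
        {
            reader.MoveToContent();
            while (!reader.EOF)
            {
                if (reader.NodeType == XmlNodeType.Element)
                {
                    // ReadFrom consumes the whole element and advances the reader past it.
                    messages.Add((XElement)XNode.ReadFrom(reader));
                }
                else
                {
                    reader.Read(); // skip whitespace, comments, etc.
                }
            }
        }
        return messages;
    }
}
```

Each XElement could then be handed to a deserializer of your choice, for example via XElement.CreateReader().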

Why use serialization

I've seen a couple of examples with the Serializable attribute, like this:
[Serializable]
public class sampleClass
{
    public int Property1 { get; set; }
    public string Property2 { get; set; }

    public sampleClass() { }

    public sampleClass(int pr1, int pr2)
    {
        // Assign the constructor arguments to the properties.
        Property1 = pr1;
        Property2 = pr2.ToString();
    }
}
I never had a good grasp of how this works, but from MSDN:
Serialization allows the developer to save the state of an object
and recreate it as needed, providing storage of objects as well as
data exchange. Through serialization, a developer can perform actions
like sending the object to a remote application by means of a Web
Service, passing an object from one domain to another, passing an
object through a firewall as an XML string, or maintaining security or
user-specific information across applications.
But the problem is that in my code example I see no use for it. It is just an object that is used to retrieve data from the database, nothing special. When should serialization be used, and when should it not?
For example, should I always use serialization because it is more secure? Is it going to be slower this way?
Update: Thanks for all nice answers
Serialization is useful any time you want to move a representation of your data into or out of your process boundary.
Saving an object to disk is a trivial example you'll see in many tutorials.
More commonly, serialization is used to transfer data to and from a web service, or to persist data to or from a database.
Several answers have covered the reasons of why you might want to use serialization in general. You seem to also want to know why a specific class has attribute [Serializable] and you are wondering why that may have been done.
With ASP.NET the default Session state storage is InProc, which allows you to store any object as a reference and leave it on the heap. This is the best-performing way to store session state; however, it only works if you are using a single worker process or if all your session state could be rebuilt automatically if the worker process were to change (unlikely). For the other state storage modes (StateServer and SQL Server), all the session state objects must be serializable, as the ASP.NET engine will first serialize these objects using binary serialization before sending them to the storage medium.
In your case, you may be using InProc. One reason though to still mark all classes that are used in session state as Serializable and test them that way is that you may have a need to change this in the future (for example, to use a Web Farm). If you do not design your session state classes with this in mind it will be quite difficult to do the migration in the future.
Also, just because you can remove the Serializable attribute and the program "works" in one environment does not mean that it will work in another environment. For example, it may work fine for you under the Visual Studio test web server (which always uses InProc session state mode) and even in a development IIS instance, but a production IIS instance may be set up to use a different storage mode.
These environmental/configuration differences are not necessarily limited to ASP.NET applications. There are other application engines that may do this or even standalone applications that do (it is not difficult to build this kind of configurable environment).
Finally, you may be working with a library which may be consumed by different applications. Some may need to store state in a serializable manner and others may not.
Because of these factors it is often a very good idea, at least when building a library, to consider marking simple value classes or state management classes with [Serializable]. Keep in mind that this increases the work for testing these classes and there are limits to what can be serialized (i.e. a class that contains a socket reference or open file reference may not be a good candidate for serialization as open external resources cannot be serialized) so do not overuse this.
You asked if using [Serializable] will be slower. No, it will not be. This attribute has no direct effect on performance. However, if the application environment is changed to serialize the object, then yes, performance will be affected. It is the act of serializing and deserializing that is slower than just storing the object on the heap. [Note that some routines could be written to look for the Serializable attribute and then choose to serialize, but this is rare; usually it is like ASP.NET and left up to an administrator or user to decide if they want to change the storage medium.]
The MSDN quote you provide explains when serialization is useful: for transporting or storing data. Writing to a file is serialization, and serialization is required to send an object over a network.
If you are just populating the object in a single application, perhaps from a database, then indeed: serialization is not a concern at all. Marking a class for serialization has no impact on security or performance: if you don't need it, don't worry about it.
Note also that [Serializable] mainly relates to BinaryFormatter, but there are actually many more serializers than that. For example: you might want to expose your object via JSON or XML - both of those require serialization, but neither requires [Serializable].
Simple example: imagine you have a custom class to store application settings.
namespace My.Namespace
{
    [Serializable]
    public class Settings
    {
        public string Setting1 { get; set; }
        public string Setting2 { get; set; }
    }
}
You could then have an XML file such as:
<?xml version="1.0" encoding="utf-8" ?>
<Settings>
    <Setting1>Foo</Setting1>
    <Setting2>Bar</Setting2>
</Settings>
Using XmlSerializer you could simply serialize and deserialize your settings.
It is also necessary for your class to be Serializable if you wish to stuff it into ASP.NET ViewState.
These are very basic examples, but they demonstrate its usefulness.
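A minimal sketch of the XmlSerializer round trip for such a Settings class might look like this. SettingsStore and its methods are hypothetical names. Note that XmlSerializer works on public properties and does not itself need the [Serializable] attribute, which is why it is omitted here.

```csharp
using System.IO;
using System.Xml.Serialization;

public class Settings
{
    public string Setting1 { get; set; }
    public string Setting2 { get; set; }
}

// Hypothetical helper that saves and loads Settings as XML.
public static class SettingsStore
{
    public static void Save(Settings settings, string path)
    {
        var serializer = new XmlSerializer(typeof(Settings));
        using (var stream = File.Create(path))
        {
            serializer.Serialize(stream, settings);
        }
    }

    public static Settings Load(string path)
    {
        var serializer = new XmlSerializer(typeof(Settings));
        using (var stream = File.OpenRead(path))
        {
            return (Settings)serializer.Deserialize(stream);
        }
    }
}
```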
"When should serialization be used, and when should it not?"
Let me give you one practical example. In one of my applications, I was given XML schemas (XSD files) for request and response XML files. I need to parse the request XML file, process it, and save the information into several tables. Later I need to prepare the response XML file accordingly and send it back to our client.
I used Xsd2Code to generate C# classes based on the schema. So parsing the request XML file is simply deserializing it into the generated request class object. Then I can access properties on the object the way they appear in the request XML file. Likewise, generating the response XML file is simply serializing the generated response class object, which I populate in my code. This way I can work with C# objects rather than XML files. I hope it makes sense.
"For example, should I always use serialization because it is more secure?"
I don't think this is related to security in any way.

What is the difference between File and FileInfo in C#?

I've been reading that the static methods of the File Class are better used to perform small and few tasks on a file like checking to see if it exists and that we should use an instance of the FileInfo Class if we are going to perform many operations on a specific file.
I understand this and can simply use them that way blindly, but I would like to know why there is a difference.
What is it about the way they work that makes them suitable for different situations? What is the point of having two different classes that seem to do the same thing in different ways?
It would be helpful if someone could answer at least one of these questions.
Generally if you are performing a single operation on a file, use the File class. If you are performing multiple operations on the same file, use FileInfo.
The reason to do it this way is because of the security checking done when accessing a file. When you create an instance of FileInfo, the check is only performed once. However, each time you use a static File method the check is performed.
The methods of the File and FileInfo classes are similar, but they differ in that the methods of the File class are static, so you need to pass more parameters than you would for the methods of the FileInfo instance.
You need to do this because it operates on a specific file; for example, the FileInfo.CopyTo() method takes one parameter for the destination path that's used to copy the file, whereas the File.Copy() method takes two parameters for the source path and the destination path.
References
http://aspfree.com/c/a/C-Sharp/A-Look-at-C-Sharp-File-and-FileInfo-Classes/1/
http://intelliott.com/blog/PermaLink,guid,ce9edbdb-6484-47cd-a5d6-63335adae02b.aspx
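The parameter difference between File.Copy() and FileInfo.CopyTo() described in this answer can be seen side by side in a small sketch. CopyExamples and its method names are made up for illustration.

```csharp
using System.IO;

// Hypothetical wrappers showing the two call shapes.
public static class CopyExamples
{
    // Static method: both source and destination paths are passed in.
    public static void CopyWithFile(string source, string destination)
    {
        File.Copy(source, destination);
    }

    // Instance method: the FileInfo already knows its source file,
    // so only the destination is passed.
    public static FileInfo CopyWithFileInfo(string source, string destination)
    {
        var info = new FileInfo(source);
        return info.CopyTo(destination);
    }
}
```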
The File.Exists will perform much faster than a new FileInfo(filePath).Exists - especially over a network and provided the files actually exist. This is because File.Exists will only check for existence of the file, whereas a new FileInfo(filePath).Exists first constructs a FileInfo object, which contains all the properties (dates, size etc) of the file (if it exists).
In my experience with this, even checking for the existence of 10 files over the network is noticeably faster (ie 20ms vs 200ms) by using File.Exists.
File is optimized for one-off operations on a file, FileInfo is optimized around multiple operations on the same file, but in general there isn't that much difference between the different method implementations.
If you want to compare the exact implementations, use Reflector to look at both classes.
A FileInfo may be needed to deal with Access Control properties. For the rest it is a Static versus Instance choice and you can pick what is convenient.
FileInfo is an instance of a file, thus representing the file itself. File is a utility class, so it can work with any file.
FileInfo:
Needs to be instantiated before use
Contains instance methods
Caches info about the file; you need to call Refresh to get the latest info about the file
File:
No need to instantiate
Contains static methods
Does not cache, so you get the latest info every time you use it
src:
FileInfo
File
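The caching behaviour of FileInfo listed above can be observed with a small experiment along these lines. CachingDemo is a hypothetical name, and the exact values printed may depend on the platform, so no output is claimed here.

```csharp
using System;
using System.IO;

// Hypothetical demo of FileInfo's cached state versus Refresh().
public static class CachingDemo
{
    public static void Run()
    {
        string path = Path.GetTempFileName();
        try
        {
            File.WriteAllText(path, "abc");

            var info = new FileInfo(path);
            long before = info.Length;          // first access snapshots the file's state

            File.AppendAllText(path, "defgh");  // change the file behind FileInfo's back

            long cached = info.Length;          // may still be the old, cached value
            info.Refresh();                     // re-read the state from disk
            long refreshed = info.Length;       // now reflects the appended text

            Console.WriteLine("before={0} cached={1} refreshed={2}", before, cached, refreshed);
        }
        finally
        {
            File.Delete(path);
        }
    }
}
```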
Yes, and one of the reasons could be, as Nag said, that File is a utility class and hence no instance needs to be created. At the same time, because File is a utility class, each call requires a security check.
On the other hand, FileInfo requires an instance to be created, and the security check is done at that point. Thus, performing multiple operations using FileInfo will not invoke further security checks.
Recently I faced a problem with File.Exists; I hate this function. After that I used the FileInfo class's Exists property and my program worked correctly.
What actually happened is that File.Exists worked well in the development environment, but in the live environment this function was blocking the file object, so I was getting an access denied error and was not able to use the file.
This is what I learned.
I will never use the File.Exists method; it is best to create the object and then use it. Be wary of using static methods.
The major difference between the File class and the FileInfo class is the following.
Members of both the File and FileInfo classes are decorated with the [System.Security.SecurityCritical] and [System.Security.SecuritySafeCritical] attributes, but the File class has 'multiple security checks' compared to the FileInfo class, and the check is performed each time you call a static member of the File class.
When you create an instance of FileInfo, the check is performed only once.
Apart from this, another minor difference is that File is a static class whereas FileInfo is an instance class.
Therefore, to access the members of the FileInfo class you need to create an instance, whereas with the File class you can access its members directly without creating an instance.
If you are performing multiple operations on the same file, it can be more efficient to use FileInfo instance methods instead of the corresponding static methods of the File class.
However, File class provides more methods as compared to FileInfo class.
Note: Either the SecurityCriticalAttribute attribute or the SecuritySafeCriticalAttribute attribute must be applied to code for the code to perform security-critical operations.

Save struct into a file In C#

I want to save a struct of data in a file in C#, but I don't want to use serialization and deserialization to implement it.
I want to implement this the way I would in C and C++.
System.IO - File and Streams
To implement it in the "old fashioned way" in C#/.NET, based on the assumption C++ might use raw files and streams, you need to start in the .NET Framework's System.IO namespace.
Note: This allows you complete customization over the file reading/writing process so you don't have to rely on implicit mechanisms of serialization.
Files can be managed and opened using System.IO.File and System.IO.FileInfo to access Streams. (See the inheritance hierarchy at the bottom of that page to see the different kinds of streams.)
Helper Classes
So you don't have to manipulate bits and bytes directly (unless you want to).
For binary file access you can use System.IO.BinaryReader and BinaryWriter. For example, they easily convert between native data types and stream bytes.
For text-based file access use System.IO.StreamReader and StreamWriter. These let you use strings and characters instead of worrying about bytes.
Random Access
If random access is supported on the stream, use a method such as Stream.Seek(..) to jump around based on whatever algorithm you decide on for determining record lengths and such.
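Putting the helper classes and Stream.Seek together, a fixed-length record scheme might be sketched like this. The Point3 struct, the RecordFile class and the 16-byte record layout are all assumptions made for illustration; any fixed-size layout would work the same way.

```csharp
using System.IO;
using System.Text;

// Hypothetical record type with a fixed binary size.
public struct Point3
{
    public int X;
    public int Y;
    public double Weight;
}

// Hypothetical helper: stores Point3 records at index * RecordSize.
public static class RecordFile
{
    // int + int + double = 16 bytes per record.
    public const int RecordSize = sizeof(int) + sizeof(int) + sizeof(double);

    public static void Write(Stream stream, int index, Point3 p)
    {
        stream.Seek((long)index * RecordSize, SeekOrigin.Begin);
        // leaveOpen: true keeps the underlying stream usable afterwards.
        using (var writer = new BinaryWriter(stream, Encoding.UTF8, true))
        {
            writer.Write(p.X);
            writer.Write(p.Y);
            writer.Write(p.Weight);
        }
    }

    public static Point3 Read(Stream stream, int index)
    {
        stream.Seek((long)index * RecordSize, SeekOrigin.Begin);
        using (var reader = new BinaryReader(stream, Encoding.UTF8, true))
        {
            // Fields must be read in the same order they were written.
            Point3 p;
            p.X = reader.ReadInt32();
            p.Y = reader.ReadInt32();
            p.Weight = reader.ReadDouble();
            return p;
        }
    }
}
```

Because every record has the same length, record N always starts at byte N * RecordSize, which is what makes seeking to an arbitrary record possible.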
You can use Marshal.PtrToStructure and Marshal.StructureToPtr to just dump the content to/from untyped data in a byte array, which you can easily push to the file as one block. Just don't try this if your structure contains references to other objects (try keeping indexes instead, perhaps).
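A rough sketch of the StructureToPtr / PtrToStructure approach follows. The Header struct, the RawStruct class and the choice of Pack = 1 are assumptions for illustration; the struct must contain only blittable value-type fields for this to be safe.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical struct with an explicit sequential layout and no padding.
[StructLayout(LayoutKind.Sequential, Pack = 1)]
public struct Header
{
    public int Id;
    public short Version;
    public double Scale;
}

// Hypothetical helper converting a struct to and from raw bytes.
public static class RawStruct
{
    public static byte[] ToBytes<T>(T value) where T : struct
    {
        int size = Marshal.SizeOf(typeof(T));
        byte[] bytes = new byte[size];
        IntPtr buffer = Marshal.AllocHGlobal(size);
        try
        {
            Marshal.StructureToPtr(value, buffer, false); // struct -> unmanaged memory
            Marshal.Copy(buffer, bytes, 0, size);         // unmanaged memory -> byte[]
        }
        finally
        {
            Marshal.FreeHGlobal(buffer);
        }
        return bytes;
    }

    public static T FromBytes<T>(byte[] bytes) where T : struct
    {
        int size = Marshal.SizeOf(typeof(T));
        if (bytes.Length < size) throw new ArgumentException("Buffer too small.", nameof(bytes));
        IntPtr buffer = Marshal.AllocHGlobal(size);
        try
        {
            Marshal.Copy(bytes, 0, buffer, size);               // byte[] -> unmanaged memory
            return (T)Marshal.PtrToStructure(buffer, typeof(T)); // unmanaged memory -> struct
        }
        finally
        {
            Marshal.FreeHGlobal(buffer);
        }
    }
}
```

The resulting byte array can be written with Stream.Write or File.WriteAllBytes as a single block.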
If you don't want to serialize it, you can always just use a BitConverter to convert the members to bytes via GetBytes, and write these directly to a Stream.
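The BitConverter variant could look something like the sketch below. BitConverterExample and the two-field layout (an int followed by a double) are hypothetical. BitConverter uses the machine's byte order, so files written this way are only portable between machines with the same endianness.

```csharp
using System;
using System.IO;

// Hypothetical helper writing two fields as raw bytes.
public static class BitConverterExample
{
    public static void WriteFields(Stream stream, int id, double value)
    {
        byte[] idBytes = BitConverter.GetBytes(id);       // 4 bytes
        byte[] valueBytes = BitConverter.GetBytes(value); // 8 bytes
        stream.Write(idBytes, 0, idBytes.Length);
        stream.Write(valueBytes, 0, valueBytes.Length);
    }

    public static void ReadFields(Stream stream, out int id, out double value)
    {
        byte[] buffer = new byte[sizeof(int) + sizeof(double)];
        int read = 0;
        // Stream.Read may return fewer bytes than requested, so loop until full.
        while (read < buffer.Length)
        {
            int n = stream.Read(buffer, read, buffer.Length - read);
            if (n == 0) throw new EndOfStreamException();
            read += n;
        }
        id = BitConverter.ToInt32(buffer, 0);
        value = BitConverter.ToDouble(buffer, sizeof(int));
    }
}
```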
