I know that any running application (whether it's built with C#, C, C++, Java, etc.) will have elements exposed in memory. I'm curious how to control what is exposed in memory, and how.
I'm curious because I know that many games get hacked or modified by a user viewing the game's memory contents and altering them. I just want to know more details about how this works. I know special programs must be used to even dive into the memory, and conversions and such must happen for it to be even somewhat readable.
Let's take an extremely simple example and I'll ask some questions about it.
using System.Security;

static class Program2
{
    private static SecureString fSecureString;
    public static string fPublicString = "Test123";
    private static string fPrivateString = "321tesT";

    static void Main2()
    {
    }
}

class TestClass
{
    private string fInstancedPrivateString;

    public TestClass()
    {
        fInstancedPrivateString = "InstancedSet";
    }

    private string DoSomething()
    {
        return fInstancedPrivateString.ToLower();
    }
}
Given the code above, I imagine that fPublicString is pretty visible. What elements can someone reading memory see? Can they read the variable name, or do they just see a memory address and an assigned value (Test123)? What about methods like DoSomething inside an instanced class? Can someone see that in memory and write malicious code to execute it at will?
I'm just curious how much of this I need to keep in mind while writing applications (or games). I understand the general idea of access modifiers (public/private/etc.) and their relation to what other code can see, but I'm curious whether they have any bearing on how things are represented in memory.
My final question is very specific: EverQuest (the game) has a hack called MacroQuest which, from my understanding, reads memory using the proper offsets and can then execute code on the EQ client side or simply change values stored in the client's memory. How did EQ get this so wrong? Was it poor programming on their end? A technology limitation that is more or less resolved now? Or can this technically be done to virtually every piece of software, given the right amount of knowledge?
Overall, I guess I could use a good tutorial, article, or book that provides some details on how code looks in memory.
Knowing that your application's memory can be read should not be something a "normal" developer needs to worry about. The number of users able to exploit this in a useful way is, in the grand scheme, very small, and it only really matters for sensitive parts of your application anyway (licensing, passwords, and other personally identifiable information). Otherwise, the risk is negligible.
If the effort of protecting it can't be justified by its cost, why should the person/group/etc. paying to have it built worry? It isn't worth investing the time when there are always plenty of other things that could use that time.
Should Notepad or MS Word care that you can write a sniffer to listen to what is being typed? Probably not. Why? Because it really doesn't affect the bottom line or pose any realistic risk.
Related
This might sound like a very strange question, but I work on a project which needs to have circular references within it. They are, in fact, unavoidable, because users can create their own circular references within the GUI, and this is absolutely intended... Please don't ask why; it would take ages to explain.
All the questions, answers, and resources I found that discuss circular references provide solutions and approaches for avoiding one. None that I have read contained a solution for how to make one without killing the underlying computational resources.
Issues I see
Such a circular reference seems to me to always have the potential to completely overwhelm the underlying system, be it a simple home computer or the research supercomputer this program is meant to run on.
This is because the resources available are always finite, but circular references are infinite by nature.
The resources I see which might be an issue here are:
computational power (CPU)
working memory (RAM)
Data storage
Network bandwidth
How could it be possible to mitigate those issues?
Mitigation could take place by making sure that the program itself is only ever able to increase its need for computational resources in a very minor, incremental fashion. If measures are then implemented which, based on data gathered about the whole system as a unit, let us decide whether further evolutions are even necessary to improve the perceived quality of the system, that would help us cap the need for computational resources.
One way I could imagine this capping taking place is by introducing time as a limiting factor. The program could be designed such that it only considers re-evaluating "itself" after a given amount of time. If this interval and the quality limit are carefully chosen to match the underlying computational resources, I feel the resource issues with circular references could be mitigated.
Code Snippet
Find below a very simplified code snippet. Point1 and Point2 are completely independent in nature; they could even be on different threads (actually, that's one idea for how it could be done, but I don't understand multithreading well enough to decide whether it would be a good approach). The action first begins when they are attached to one another. I do not care whether the "first this, then that" behavior happens in a specific order. The only thing I care about is that all interactions between those two Points have taken place at some point in the future (after their attachment).
namespace Circularity
{
    class Program
    {
        static void Main(string[] args)
        {
            Point Point1 = new Point();
            Point Point2 = new Point();
            Point1.attach(Point2);
        }
    }

    class Point
    {
        private ulong Value;

        public Point()
        {
            Value = ulong.MaxValue / 2;
        }

        public void attach(Point otherPoint)
        {
            if (Value < ulong.MaxValue) Value++;
            otherPoint.attach(this);
        }
    }
}
This code leads instantly to a stack overflow, but I don't understand the underlying concepts of the stack well enough to implement a countermeasure. I tried to apply the time concept here already, but it just takes longer to overflow.
The reason you're getting a stack overflow is that you're calling attach recursively, so you keep adding stack frames. The CLR can't handle that many, and as you've witnessed, it quickly maxes out. One strategy here would be to use Continuation Passing Style, so you avoid building up a stack of method calls.
When and how to use continuation passing style
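Here's a minimal sketch of the idea, using an explicit work queue rather than literal continuations; the effect is the same in that pending interactions are stored as data instead of stack frames. The Limit constant is hypothetical, added only so the example terminates:

using System.Collections.Generic;

class Point
{
    private const ulong Limit = 1000000; // hypothetical cap, added only so the loop terminates
    private ulong Value;

    public void Attach(Point otherPoint)
    {
        // Each queue entry stands in for what would have been a recursive stack frame.
        Queue<KeyValuePair<Point, Point>> work = new Queue<KeyValuePair<Point, Point>>();
        work.Enqueue(new KeyValuePair<Point, Point>(this, otherPoint));

        while (work.Count > 0)
        {
            KeyValuePair<Point, Point> pair = work.Dequeue();
            Point source = pair.Key;
            Point target = pair.Value;

            if (source.Value >= Limit) continue; // saturated: stop scheduling work

            source.Value++;
            // Schedule the reciprocal interaction instead of calling it recursively.
            work.Enqueue(new KeyValuePair<Point, Point>(target, source));
        }
    }
}

However deep the chain of interactions gets, the depth of the call stack stays constant.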
I have developed a habit of sometimes doing this particular thing, and I'm wondering why I do it. Is there any advantage?
Here's an example from a Unity3d game.
In my class I want to do various calculations and so forth with a float, ThingYposition, which is a field stored somewhere in Thing.transform.position.y. Rather than write Thing.transform.position.y so many times, I just make a copy of the float I want at the beginning of the program.
public GameObject Thing;
private float ThingYposition;

public void Start()
{
    ThingYposition = Thing.transform.position.y;
}

public void Update()
{
    // Do stuff every frame with ThingYposition
}
This way my lines of code will be a little less cluttered, but the program will use a little more memory, since I'm now storing that float twice. But will it be any faster? Does accessing a deeply embedded field like Thing.transform.position.y actually use any more processing power than accessing my float field?
Do you think this is a harmless habit, or should I stop?
Also, please note that in this example I don't care if the original changes, and I don't want to change it.
You already stated you don't care if the original changes, so I'll skip that part. The only advantage I can see is in a multi-threaded environment. You don't have to worry about another thread mucking with Thing, since you have a private copy of ThingYposition.
In terms of efficiency, you're well into micro-optimization territory here. If you're having a problem, profile it and experiment with alternatives. But I can't imagine this is something you really need to worry about.
Since you don't care whether the original position changes and will not change it yourself, this is probably the best approach for the use case you described.
The answer to the other part of your question (is it faster to access a local copy vs. a "deeply embedded field") depends on how Thing.transform.position.y is implemented. If it is just a member field, the access times would be essentially the same for a local copy or the "deeply embedded field". If Thing.transform.position.y is calculated on every access, the local copy would be faster.
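To illustrate the distinction (this FakeTransform is purely illustrative, not Unity's actual implementation): a plain field is a direct memory read, while a property may run code on every access.

class FakeTransform
{
    public float y; // plain field: a direct memory read

    // property: a method call on every access (which may or may not be inlined)
    public float Y
    {
        get { return ComputeY(); }
    }

    private float ComputeY()
    {
        // imagine this recomputing the value from a parent hierarchy on every read
        return y;
    }
}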
I'm programming for a game in XNA and attempting to create a universal math object with a stored output location supplied during construction.
My plan was to use ref in the constructor, but I'm not sure how to hold/store that reference in the object beyond the initial call...
public MathObject(ref float OutParam)
{
    Out = OutParam; // This obviously won't do what I want... but it's where I'd like to do it.
}
In the update I'd like to state the input and have the product modify the stored output location:
foreach (MathObject MatOb in MathList)
{
    MatOb.Update(time);
}
The idea was to create a modular math tool to use throughout the code and, on creation, point it at a pre-existing parameter elsewhere (the "output") that it will modify in the update (without re-referencing). The hope was that this would allow a single loop to direct every instance of the tool to modify its given output.
As I understand it, in C++ this is possible by storing the address of the parameter to be modified within the math object, then using that in the update to point to and modify the memory at that location.
Is something similar possible in C# without the use of unsafe code?
Should unsafe code always be avoided?
Edit:
-- Intended Use --
I'd like to be able to create objects with an adjustable "set and forget" output location.
For instance, I've built a simple Bezier curve editor that works within the game interface. I can set the output locations in the code so that a given curve always adjusts specific parameters (character position, for example), but it would be nice to also modify what the output is connected to within the interface.
The specific applications would be mostly for in-game editing. I understand editors are most practical when self-contained, but this would be for limited, game-console-friendly editing functionality (less robust, but similar in principle to the editing capabilities of Little Big Planet).
My background is in 3D design and animation, so I'm used to working with node-based editing systems: creating various utility nodes and adjusting inputs and outputs to drive parameters for shading, rigging models, etc. I'm certainly not attempting to re-create this in game, but I'm curious about carrying over certain principles to limited in-game editing functionality. Just troubleshooting how best to go about it.
Thanks for the replies!
The way to do this in C# is to use a pair of get/set delegates. You could use this handy-dandy helper struct to store them:
public struct Ref<T>
{
    Func<T> get;
    Action<T> set;

    public Ref(Func<T> get, Action<T> set)
    {
        this.get = get;
        this.set = set;
    }

    public T Value { get { return get(); } set { set(value); } }
}
Given some class like this:
class Foo { public float bar; }
Use it like this:
Foo myFoo = new Foo();
Ref<float> barRef = new Ref<float>(() => myFoo.bar, (v) => myFoo.bar = v);
// Ta-Da:
barRef.Value = 12.5f;
Console.WriteLine(barRef.Value);
(Please note that I haven't actually tested the code here. The concept works and I've successfully used it before, but I might have messed up my syntax by typing this up off the top of my head.)
Because this question is tagged with XNA, I should briefly talk about performance:
You'll probably find this performs about an order of magnitude or so slower than the equivalent memory access - so it's not suitable for tight loops.
Creating these things allocates memory. This is very bad for performance for a number of reasons. So avoid creating these inside your draw/update loop. During loading is fine, though.
And, finally, this is uglier than simply accessing the property directly, so be sure to use it only where you actually need it.
I wouldn't know how to do this in C# without unsafe code, but if you absolutely must tackle your problem this way without using unsafe code, then maybe memory-mapped files are your friend. Even so, these weren't available for .NET development until .NET 4.0, and I'm not sure how this option compares to unsafe code performance-wise.
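A minimal sketch of that idea, assuming .NET 4.0's System.IO.MemoryMappedFiles API (the SharedFloat wrapper and the mapping name are illustrative, not part of any framework):

using System.IO.MemoryMappedFiles;

class SharedFloat
{
    private readonly MemoryMappedViewAccessor view;

    public SharedFloat(string mapName)
    {
        // CreateOrOpen lets separate components attach to the same named mapping.
        MemoryMappedFile mmf = MemoryMappedFile.CreateOrOpen(mapName, sizeof(float));
        view = mmf.CreateViewAccessor();
    }

    public float Value
    {
        get { return view.ReadSingle(0); }   // read the shared location
        set { view.Write(0, value); }        // write through to the shared location
    }
}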
I think what you need is the observer design pattern. Items interested in updates to the math object register for the event (i.e. MathObjectChange) and react accordingly.
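A minimal sketch of that pattern using a plain C# event (the MathObject shape and member names here are assumptions, not from the question):

using System;

class MathObject
{
    // Observers subscribe here; the ref-style "output" becomes an event handler.
    public event Action<float> MathObjectChange;

    private float result;
    public float Result
    {
        get { return result; }
        set
        {
            result = value;
            if (MathObjectChange != null) MathObjectChange(result); // notify all observers
        }
    }
}

// Usage: wire the output once, then every update flows to the target.
// mathObject.MathObjectChange += v => character.PositionY = v; // hypothetical target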
Here's an interesting article that I found on the web.
It talks about how this firm is able to parse a huge amount of financial data in a managed environment, essentially through object reuse and avoiding immutables such as string. They then go on to show that their program doesn't do any GC during the continuous-operation phase.
This is pretty impressive, and I'd like to know if anyone else here has more detailed guidelines on how to do this. For one, I'm wondering how the heck you can avoid using string when blatantly some of the data inside the messages are strings, and whatever client application is looking at the messages will want to be passed those strings. Also, what do you allocate in the startup phase? How will you know it's enough? Is it simply a matter of claiming a big chunk of memory and keeping a reference to it so that GC doesn't kick in? What about whatever client application is using the messages? Does it also need to be written according to these stringent standards?
Also, would I need a special tool to look at the memory? I've been using SciTech memory profiler thus far.
I found the paper you linked to rather deficient:
It assumes, and wants you to assume, that garbage collection is the ultimate latency killer. They have not explained why they think so, nor have they explained in what way their system is not basically a custom-made garbage collector in disguise.
It talks about the amount of memory cleaned up in garbage collection, which is irrelevant: the time taken to garbage collect depends more on the number of objects, irrespective of their size.
The table of “results” at the bottom provides no comparison to a system that uses .NET’s garbage collector.
Of course, this doesn’t mean they’re lying and it’s nothing to do with garbage collection, but it basically means that the paper is just trying to sound impressive without actually divulging anything useful that you could use to build your own.
One thing to note from the beginning is where they say "Conventional wisdom has been developing low latency messaging technology required the use of unmanaged C++ or assembly language". In particular, they are talking about a sort of case where people would often dismiss a .NET (or Java) solution out of hand. For that matter, a relatively naïve C++ solution probably wouldn't make the grade either.
Another thing to consider here is that they haven't so much gotten rid of the GC as replaced it: there's code there managing object lifetime, but it's their own code.
There are several different ways one could do this instead. Here's one. Say I need to create and destroy several Foo objects as my application runs. Foo creation is parameterised by an int, so the normal code would be:
public class Foo
{
    private readonly int _bar;

    public Foo(int bar)
    {
        _bar = bar;
    }

    /* other code that makes this class actually interesting. */
}

public class UsesFoo
{
    public void FooUsedHere(int param)
    {
        Foo baz = new Foo(param);
        // Do something here.
        // baz falls out of scope and is liable to GC collection.
    }
}
A much different approach is:
public class Foo
{
    // Pool size: tune to your workload (the value here is only a placeholder).
    private const int MOST_POSSIBLY_NEEDED = 1024;

    private static readonly Foo[] FOO_STORE = new Foo[MOST_POSSIBLY_NEEDED];
    private static Foo FREE;

    static Foo()
    {
        // Pre-allocate every Foo up front and thread them into a free list.
        Foo last = FOO_STORE[MOST_POSSIBLY_NEEDED - 1] = new Foo();
        int idx = MOST_POSSIBLY_NEEDED - 1;
        while (idx != 0)
        {
            Foo newFoo = FOO_STORE[--idx] = new Foo();
            newFoo._next = FOO_STORE[idx + 1];
        }
        FREE = last._next = FOO_STORE[0];
    }

    private Foo _next;

    // Note _bar is no longer readonly. We lose that advantage
    // as a cost of reusing objects. Even if Foo acts immutable,
    // it isn't really.
    private int _bar;

    public static Foo GetFoo(int bar)
    {
        // Pop the head of the free list and re-initialise it.
        Foo ret = FREE;
        FREE = ret._next;
        ret._bar = bar;
        return ret;
    }

    public void Release()
    {
        // Push this object back onto the free list for reuse.
        _next = FREE;
        FREE = this;
    }

    /* other code that makes this class actually interesting. */
}

public class UsesFoo
{
    public void FooUsedHere(int param)
    {
        Foo baz = Foo.GetFoo(param);
        // Do something here.
        baz.Release();
    }
}
Further complication can be added if you are multithreaded (though for really high performance in a non-interactive environment, you may want either a single thread or separate stores of Foo objects per thread), or if you cannot predict MOST_POSSIBLY_NEEDED in advance (the simplest fix is to create a new Foo() as needed but not release it for GC, which can easily be done in the above code by creating a new Foo if FREE._next is null).
If we allow for unsafe code, we can gain even more by making Foo a struct (so the array holds a contiguous area of memory), _next a pointer to Foo, and GetFoo() return a pointer.
Whether this is what these people are actually doing, I of course cannot say, but the above does prevent GC from activating. It will only be faster under very high-throughput conditions; if not, letting GC do its stuff is probably better (GC really does help you, despite 90% of questions about it treating it as a Big Bad).
There are other approaches that similarly avoid GC. In C++ the new and delete operators can be overloaded, which allows the default creation and destruction behaviour to change, and discussions of how and why one might do so might interest you.
A practical take-away from this applies when objects either hold resources other than memory that are expensive (e.g. connections to databases) or "learn" as they continue to be used (e.g. XmlNameTables). In such cases pooling objects is useful (ADO.NET connections do so behind the scenes by default), but a simple Queue is the way to go, since the extra memory overhead doesn't matter; a sketch follows. You can also abandon objects on lock contention (you're looking to gain performance, and lock contention will hurt it more than abandoning the object would), though I doubt that would work in their case.
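A minimal sketch of that simpler Queue-based pool (PooledFoo and its Reset method are hypothetical stand-ins, not from the code above):

using System.Collections.Generic;

public class PooledFoo
{
    public int Bar;
    public void Reset(int bar) { Bar = bar; } // re-initialise state on reuse
}

public class FooPool
{
    private readonly Queue<PooledFoo> pool = new Queue<PooledFoo>();

    public PooledFoo Get(int bar)
    {
        // Reuse a pooled instance when one exists; otherwise allocate.
        PooledFoo foo = pool.Count > 0 ? pool.Dequeue() : new PooledFoo();
        foo.Reset(bar);
        return foo;
    }

    public void Release(PooledFoo foo)
    {
        pool.Enqueue(foo);
    }
}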
From what I understood, the article doesn't say they don't use strings; it says they don't use immutable strings. The problem with immutable strings is that when you're doing parsing, most of the strings generated are just throw-away strings.
I'm guessing they're using some sort of pre-allocation combined with free lists of mutable strings.
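A sketch of what that might look like, assuming a pre-allocated, reusable scratch buffer per parser (MessageParser and its method names are illustrative):

using System.Text;

class MessageParser
{
    // One pre-allocated scratch buffer, reused for every field parsed.
    private readonly StringBuilder scratch = new StringBuilder(256);

    public StringBuilder ReadField(byte[] buffer, int offset, int length)
    {
        scratch.Length = 0; // reset without allocating a new buffer
        for (int i = offset; i < offset + length; i++)
        {
            scratch.Append((char)buffer[i]); // assumes single-byte characters
        }
        return scratch;
    }
}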
I worked for a while with a CEP product called StreamBase. One of their engineers told me that they were migrating their C++ code to Java because they were getting better performance, fewer bugs and better portability on the JVM by pretty much avoiding GC altogether. I imagine the arguments apply to the CLR as well.
It seemed counter-intuitive, but their product was blazingly fast.
Here's some information from their site:
StreamBase avoids garbage collection in two ways: Not using objects, and only using the minimum set of objects we need.
First, we avoid using objects by using Java primitive types (Boolean, byte, int, double, and long) to represent our data for processing. Each StreamBase data type is represented by one or more primitive type. By only manipulating the primitive types, we can store data efficiently in stack or array allocated regions of memory. We can then use techniques like parallel arrays or method calling to pass data around efficiently.
Second, when we do use objects, we are careful about their creation and destruction. We tend to pool objects rather than releasing them for garbage collection. We try to manage object lifecycle such that objects are either caught by the garbage collector in the young generation, or kept around forever.
Finally, we test this internally using a benchmarking harness that measures per-tuple garbage collection. In order to achieve our high speeds, we try to eliminate all per-tuple garbage collection, generally with good success.
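To make the "parallel arrays" idea quoted above concrete, here is a hedged sketch (StreamBase is a Java product; these names and fields are purely illustrative): one logical record per index, with each field kept in its own primitive array so no per-record objects are allocated.

class TupleStore
{
    // One logical tuple per index; fields live in parallel primitive arrays.
    private readonly long[] timestamps;
    private readonly double[] prices;
    private readonly int[] quantities;

    public TupleStore(int capacity)
    {
        timestamps = new long[capacity];
        prices = new double[capacity];
        quantities = new int[capacity];
    }

    public void Set(int i, long timestamp, double price, int quantity)
    {
        timestamps[i] = timestamp;
        prices[i] = price;
        quantities[i] = quantity;
    }
}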
99% of the time you will be wasting your boss's money when you try to achieve this. The article describes an extreme scenario where they need the last drop of performance. As you can read in the article, there are large parts of the .NET framework that can't be used when trying to be GC-free. Some of the most basic parts of the BCL use memory allocations (or 'produce garbage', as the paper calls it); you will need to find a way around those methods. And even when you need a blazingly fast application, you'd better first try to build an application/architecture that can scale out (use multiple machines) before walking the no-GC route. The sole reason for them to use the no-GC route is that they need absolutely low latency. IMO, when you need absolute speed but don't care about the absolute minimum response time, it will be hard to justify a no-GC architecture. Besides this, if you try to build a GC-free client application (such as a Windows Forms or WPF app), forget it: those presentation frameworks create new objects constantly.
But if you really want this, it is actually quite simple. Here is a simple how to:
Find out which parts of the .NET API can't be used (you can write a tool that analyzes the .NET assemblies using an introspection engine).
Write a program that verifies the code you or your developers write to ensure they don't allocate directly or use 'forbidden' .NET methods, using the safe list created in the previous point (FxCop is a great tool for this).
Create object pools that you initialize at startup time. The rest of the program can then reuse existing objects so that it never has to do any new operations.
If you need to manipulate strings, use byte arrays for this and store the byte arrays in a pool (WCF uses this technique as well). You will have to create an API that allows manipulating those byte arrays; see the sketch after this list.
And last but not least, profile, profile, profile.
Good luck
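As an example of the byte-array manipulation mentioned in step 4, here is a hedged sketch of parsing a decimal integer straight out of a pooled buffer, so no intermediate string (and hence no garbage) is created per message. The field layout (ASCII digits at a known offset) is an assumption:

static int ParseIntField(byte[] buffer, int offset, int length)
{
    int result = 0;
    for (int i = offset; i < offset + length; i++)
    {
        result = result * 10 + (buffer[i] - (byte)'0'); // ASCII digit to value
    }
    return result;
}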
I am an entry-level .NET developer using it to develop web sites. I started with classic ASP, and last year I jumped ship with a short C# book.
As I developed and learned more, I started to see that, coming from classic ASP, I had always used C# like a scripting language.
For example, in my last project I needed to encode video on the web server and wrote code like this:
public class Encoder
{
    public static bool Encode(string videopath)
    {
        // ...snip...
        return true;
    }
}
While searching samples related to my project I’ve seen people doing this
public class Encoder
{
    public static Encode(string videopath)
    {
        EncodedVideo encoded = new EncodedVideo();
        // ...snip...
        encoded.EncodedVideoPath = outputFile;
        encoded.Success = true;
        // ...snip...
    }
}

public class EncodedVideo
{
    public string EncodedVideoPath { get; set; }
    public bool Success { get; set; }
}
As I understand it, the second example is more object-oriented, but I don't see the point of using the EncodedVideo object.
Am I doing something wrong? Is it really necessary to use this sort of code in a web app?
Someone once explained OO to me with a soda can.
A soda can is an object; an object has many properties and many methods. For example:
SodaCan.Drink();
SodaCan.Crush();
SodaCan.PourSomeForMyHomies();
etc...
The purpose of OO design is, theoretically, to write a line of code once and have abstraction between objects.
This means that Coder.Consume(SodaCan.contents); is relevant to your question.
An encoded video is not the same thing as an encoder. An encoder returns an encoded video. An encoded video may use an encoder, but they are two separate objects: two different entities serving different functions that simply work together.
Much like me consuming a soda can does not mean that I am a soda can.
Neither example is really complete enough to evaluate. The second example seems to be more complex than the first, but without knowing how it will be used it's difficult to tell.
Object-oriented design is at its best when it allows you to either:
1) Keep related information and/or functions together (instead of using parallel arrays or the like).
Or
2) Take advantage of inheritance and interface implementation.
Your second example MIGHT be keeping the data together better, if it returns the EncodedVideo object AND the success or failure of the method needs to be tracked after the fact. In that case you would be replacing a combination of a boolean "success" variable and a path with a single object, clearly documenting the relation between the two pieces of data.
Another possibility not touched on by either example is using inheritance to better organize the encoding process. You could have a single base class that handles the "grunt work" of opening the file, copying the data, etc. and then inherit from that class for each different type of encoding you need to perform. In this case much of your code can be written directly against the base class, without needing to worry about what kind of encoding is actually being performed.
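A hedged sketch of that layout (these class and method names are illustrative, not from the question): the base class does the grunt work once, and each subclass supplies only the codec-specific step.

public abstract class VideoEncoderBase
{
    // Written once against the base class: open, transform, write.
    public EncodedVideo Encode(string inputPath, string outputPath)
    {
        byte[] raw = System.IO.File.ReadAllBytes(inputPath); // "grunt work"
        byte[] encoded = EncodeFrames(raw);                  // codec-specific step
        System.IO.File.WriteAllBytes(outputPath, encoded);
        return new EncodedVideo { EncodedVideoPath = outputPath, Success = true };
    }

    // Each type of encoding overrides only this step.
    protected abstract byte[] EncodeFrames(byte[] rawData);
}

public class H264Encoder : VideoEncoderBase
{
    protected override byte[] EncodeFrames(byte[] rawData)
    {
        // ...snip: actual H.264 encoding...
        return rawData;
    }
}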
Actually, the first looks better to me, but it shouldn't return anything (or it should return an encoded video object).
Usually we assume methods complete successfully unless there are exceptional errors; if exceptional errors are encountered, we throw an exception.
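A sketch of what that suggestion might look like, reusing the EncodedVideo class from the question (the output-path logic here is illustrative):

using System.IO;

public class Encoder
{
    // Returns the result; failure is signalled by throwing, not by a bool.
    public static EncodedVideo Encode(string videoPath)
    {
        if (!File.Exists(videoPath))
            throw new FileNotFoundException("Input video not found.", videoPath);

        string outputFile = Path.ChangeExtension(videoPath, ".encoded.mp4");
        // ...snip: actual encoding work...

        // Success becomes redundant once failures throw, but is kept for compatibility.
        return new EncodedVideo { EncodedVideoPath = outputFile, Success = true };
    }
}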
Object oriented programming is fundamentally about organization. You can program in an OO way even without an OO language like C#. By grouping related functions and data together, it is easier to deal with increasingly complex projects.
You aren't necessarily doing something wrong. The question of which paradigm works best is highly debatable and isn't likely to have a clear winner, as there are so many different ways to measure "good" code, e.g. maintainable, scalable, performant, reusable, modular, etc.
It isn't necessary, but it can be useful in some cases. Take a look at various MVC examples to see OO code. Generally, OO code has the advantage of being reusable, so that what was written for one application can be used over and over in others. For example, look at log4net for an example of a logging framework that many people use.
The way you structure an OO program (which objects you use and how you arrange them) really depends on many factors: the age of the project, the overall size of the project, the complexity of the problem, and a bit of personal taste.
The best advice I can think of that will wrap all the reasons for OO into one quick lesson is something I picked up learning design patterns: "Encapsulate the parts that change." The value of OO is to reuse elements that will be repeated without writing additional code. But obviously you only care to "wrap up" code into objects if it will actually be reused or modified in the future, thus you should figure out what is likely to change and make objects out of it.
In your example, the reason to use the second setup may be that you can reuse the EncodedVideo object elsewhere in the program. Anytime you need to deal with EncodedVideo, you don't concern yourself with "how do I encode and use video"; you just use the object you have and trust it to handle the logic. It may also be valuable to encapsulate the encoding logic if it's complex and likely to change. Then you isolate changes to just one place in the code, rather than the many places where you might have used the object.
(Brief aside: The particular example you posted isn't valid C# code. In the second example, the static method has no return type, though I assume you meant to have it return the EncodedVideo object.)
This is a design question, so the answer depends on what you need, meaning there's no right or wrong answer. The first method is simpler, but in the second case you encapsulate the encoding logic in the EncodedVideo class, and you can easily change the logic (based on the incoming video type, for instance) in your Encoder class.
I think the first example seems more simple, except I would avoid using statics whenever possible to increase testability.
public class Encoder
{
    private string videoPath;

    public Encoder(string videoPath)
    {
        this.videoPath = videoPath;
    }

    public bool Encode()
    {
        // ...snip...
        return true;
    }
}
Is OOP necessary? No.
Is OOP a good idea? Yes.
You're not necessarily doing something wrong. Maybe there's a better way, maybe not.
OOP, in general, promotes modularity, extensibility, and ease of maintenance. This goes for web applications, too.
In your specific Encoder/EncodedVideo example, I don't know if it makes sense to use two discrete objects to accomplish this task, because it depends on a lot of things.
For example, is the data stored in EncodedVideo only ever used within the Encode() method? Then it might not make sense to use a separate object.
However, if other parts of the application need to know some of the information that's in EncodedVideo, such as the path or whether the status is successful, then it's good to have an EncodedVideo object that can be passed around in the rest of the application. In this case, Encode() could return an object of type EncodedVideo rather than a bool, making that data available to the rest of your app.
Unless you want to reuse the EncodedVideo class for something else, then (from the code you've given) I think your method is perfectly acceptable for this task. Unless there's unrelated functionality in the EncodedVideo and Encoder classes, or they form a massive lump of code that should be split down, you're not really lowering the cohesion of your classes, which is fine. Assuming you don't need to reuse EncodedVideo and the classes are cohesive, splitting them would probably just create unnecessary classes and increase coupling.
Remember: 1. the OO philosophy can be quite subjective and there's no single right answer, 2. you can always refactor later :p