Need help with XNA's content pipeline (2Qs) - c#

I ask here about XNA and not on its official forums because people from my country are not allowed to sign in to the new XNA website.
Well, these are my questions:
I want to use some 2D images I create in Paint Shop Pro/Photoshop/Paint, but for some reason I need to use a web-safe palette and similar settings for them to be displayed correctly (I use transparency).
Could anyone please explain to me how I can use transparency & other settings (while creating & saving the image) so that XNA (4.0) can display them correctly?
By the way, it might be that I just need someone to explain to me how to set the GraphicsDevice's settings to work with a transparency layer/channel.
I really do try to do things as I am supposed to (in Microsoft's view) & thus I use the content pipeline for ALL of my content loading (including class initialization data files).
I use .txt files for storing my class initialization data & I edit them with good old Notepad (++ :P).
Now, the problem is that all I managed to do is load the .txt file as one really long string instead of creating a new instance of my GameDataFile class.
Because of that I was forced to do it in 2 steps:
Step 1:
string tempStrData = content.Load<string>("data/filename").Replace("\r", "");
/* Loads a string from a file (the string is the whole file!) */
Step 2:
GameDataFile gameDataFile = new GameDataFile(tempStrData.Split('\n'));
/* Sends the string to my GameDataFile class constructor, which knows how to handle that string and break it into its data elements (ints, strings, vectors, etc.) */
I want to upgrade it to be of the following form:
GameDataFile gameDataFile = content.Load<GameDataFile>("data/fileName");
I think I should do this using a custom content pipeline processor; any opinions on whether I'm right & how I should achieve that?
P.S. Please don't make me use public members, as I always set those to private, and I hate and strictly forbid myself from using the C#-only get-&-set properties.
Thanks In Advance, Tal A.

For your first question, set the BlendState to AlphaBlend when you begin your SpriteBatch:
spriteBatch.Begin(SpriteSortMode.Immediate, BlendState.AlphaBlend, null, null, null);
I save my images as PNGs in PhotoShop which allows transparency.
Edit: Unless you're referring to 3D textures. If so I'll have to revise my answer
Edit: As for question 2, this example on App Hub shows how to do it.
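In outline, the approach comes down to four small pieces: an importer, a processor, a writer (all in a Content Pipeline Extension Library), and a runtime reader in the game. Here is a rough sketch of that chain, assuming XNA 4.0; all class names, the GameDataContent holder, and the "MyGame" assembly name in GetRuntimeReader are made up for illustration - treat the linked sample as the authoritative version:

// --- Content Pipeline Extension Library project ---
using Microsoft.Xna.Framework.Content.Pipeline;
using Microsoft.Xna.Framework.Content.Pipeline.Serialization.Compiler;

// Design-time holder for the raw lines (name is made up).
public class GameDataContent
{
    public string[] Lines;
    public GameDataContent(string[] lines) { Lines = lines; }
}

[ContentImporter(".txt", DisplayName = "GameData Importer", DefaultProcessor = "GameDataProcessor")]
public class GameDataImporter : ContentImporter<string[]>
{
    public override string[] Import(string filename, ContentImporterContext context)
    {
        return System.IO.File.ReadAllLines(filename);   // one entry per text line
    }
}

[ContentProcessor(DisplayName = "GameData Processor")]
public class GameDataProcessor : ContentProcessor<string[], GameDataContent>
{
    public override GameDataContent Process(string[] input, ContentProcessorContext context)
    {
        return new GameDataContent(input);   // validation/parsing could also happen here
    }
}

[ContentTypeWriter]
public class GameDataWriter : ContentTypeWriter<GameDataContent>
{
    protected override void Write(ContentWriter output, GameDataContent value)
    {
        output.Write(value.Lines.Length);
        foreach (string line in value.Lines) output.Write(line);
    }

    public override string GetRuntimeReader(TargetPlatform targetPlatform)
    {
        // Fully qualified name of the runtime reader below, plus its assembly.
        return "MyGame.GameDataReader, MyGame";
    }
}

// --- Game project ---
using Microsoft.Xna.Framework.Content;

namespace MyGame
{
    public class GameDataReader : ContentTypeReader<GameDataFile>
    {
        protected override GameDataFile Read(ContentReader input, GameDataFile existingInstance)
        {
            int count = input.ReadInt32();
            string[] lines = new string[count];
            for (int i = 0; i < count; i++) lines[i] = input.ReadString();
            return new GameDataFile(lines);   // reuses your existing constructor
        }
    }
}

// After that, the one-liner from the question should work:
// GameDataFile gameDataFile = Content.Load<GameDataFile>("data/fileName");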

NetTopology 'found non-noded intersection' Exception when determining the difference between two specific geometries

Using NetTopologySuite in C#, I'm getting a 'found non-noded intersection' exception when determining the difference between two specific geometries.
These geometries are the result of using several routines like CascadedPolygonUnion.Union, Intersection, and Difference.
At some point, we have a MultiPolygon from which we want to cut out another geometry (Polygon):
We use this code to try and cut off the 'red' polygon:
Geometry difference = multiPolygon.Difference(geometryToRemove);
But then we get a NetTopologySuite.Geometries.TopologyException with the message:
found non-noded intersection between LINESTRING (240173.28029999882 493556.2806000002, 240173.28177031482 493556.28131837514) and LINESTRING (240173.28176154062 493556.2813140882, 240173.28176153247 493556.2813140842) [ (240173.28176153894, 493556.2813140874) ]
I asked this question also in the NetTopologySuite Discussions forum because we are close to a release date and I was hoping someone could give some extra insight (or ideas for a workaround), as this looks like a bug in the library because the polygons themselves seem valid.
The data regarding the polygons can be found here - we use the 'RDNew' data to perform the Difference action, but I also added the WGS84 versions of these polygons to be able to view them in tools like geojson.io.
Thanks to one of the maintainers of the library I got the answer.
Basically, I needed to upgrade to version 2.2 (which I had already done at first to see if this would resolve the problem).
But second, I needed to configure the application to use the 'NextGen' overlay generator introduced in version 2.2, which is not turned on by default.
To use the 'NextGen' overlay generator, add the following code at some startup point in your application:
var curInstance = NetTopologySuite.NtsGeometryServices.Instance;
NetTopologySuite.NtsGeometryServices.Instance = new NetTopologySuite.NtsGeometryServices(
    curInstance.DefaultCoordinateSequenceFactory,
    curInstance.DefaultPrecisionModel,
    curInstance.DefaultSRID,
    GeometryOverlay.NG, // RH: use 'NextGen' overlay generator
    curInstance.CoordinateEqualityComparer);
I use the current instance of NtsGeometryServices to get and reuse the current default instances of the other configurable parts.
But you're free to create new instances of the required parts (as mentioned in the original post at https://github.com/NetTopologySuite/NetTopologySuite/discussions/530#discussioncomment-888410 ).
There are also ways to use both overlay generators side by side (also mentioned in the original post), but I never tried this as we will be using the 'NextGen' version for the entire application.
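As a side note (my own sketch, not part of the original answer): the overlay generator travels with the geometry factory, so geometries have to be created, or re-created, through the newly configured instance for the 'NextGen' overlay to actually be used. The SRID 28992 (Dutch 'RD New') and the coordinates below are just example values:

using NetTopologySuite.Geometries;

// The NG-configured services instance set up above:
var services = NetTopologySuite.NtsGeometryServices.Instance;
var factory = services.CreateGeometryFactory(28992);   // example SRID (Dutch 'RD New')

// Geometries created (or re-created) through this factory use the 'NextGen' overlay:
var multiPolygon = factory.CreateMultiPolygon(new[]
{
    factory.CreatePolygon(new[]
    {
        new Coordinate(0, 0), new Coordinate(10, 0), new Coordinate(10, 10),
        new Coordinate(0, 10), new Coordinate(0, 0)
    })
});
var geometryToRemove = factory.CreatePolygon(new[]
{
    new Coordinate(4, 4), new Coordinate(6, 4), new Coordinate(6, 6),
    new Coordinate(4, 6), new Coordinate(4, 4)
});

Geometry difference = multiPolygon.Difference(geometryToRemove);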

How to know if an image is upside down before OCR? If a transform or orientation correction is needed, how do I know whether a 90°/180° transformation is right?

I am working with multi-page scanned text documents and I use the FineReader 12 SDK as the underlying OCR engine. Sometimes a document is scanned upside down or in a different orientation, which causes all of the recognized characters to come out as unrecognizable symbols.
Is there a way to know whether the document is in the right orientation before recognition/analysis/processing?
How do I apply the correct transform or orientation to the document instead of using trial and error?
The documents are always in English; can we enforce English as the detected language and perform the transform based on that?
Any help appreciated.
This problem can be solved using custom .ini processing profiles. You can automatically detect orientation and skew using the right properties, then apply or prohibit orientation correction and/or deskewing.
In your code, between engine initialization and recognition, call this method as described in the FRE documentation, section "Working with Profiles":
IEngine::LoadProfile
Create a new file document.ini somewhere in your project and pass it to this method call in order to tell the SDK to check for properties in this file before processing your files.
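For example, the call could look roughly like this (a minimal sketch; the engine variable and the profile path are placeholders, not from the original answer):

// 'engine' is assumed to be an already-initialized IEngine instance.
engine.LoadProfile(@"C:\MyProject\document.ini");   // apply the custom processing profile
// ...then run the usual analysis/recognition calls; the orientation and skew
// correction settings from the profile are applied during preprocessing.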
Add these lines in your freshly created file:
[PageProcessingParams]
PerformPreprocessing = TRUE               <- allows the engine to preprocess the image
PerformAnalysis = TRUE
PerformRecognition = TRUE

[PagePreprocessingParams]
CorrectGeometry = TSPV_Auto
CorrectInvertedImage = TRUE
CorrectOrientation = TRUE                 <- correct orientation automatically
CorrectSkew = TSPV_Yes                    <- correct skew automatically

[OrientationDetectionParams]
OrientationDetectionMode = ODM_Normal     <- detect orientation automatically
ProhibitClockwiseRotation = FALSE         <-\
ProhibitCounterclockwiseRotation = FALSE  <- > allow all orientations
ProhibitUpsideDownRotation = FALSE        <-/
If you do not want to use a file to set these properties for any reason, you can set them in your code instead. Have a look at the documentation describing the whole properties object tree for that. Using a file makes it much easier to see what you are doing without browsing through hundreds of lines of code.
For your language issue, I suggest using RecognizerParams and enforcing specific properties. Again, have a look at the documentation for custom profiles, as it is pretty powerful.
[RecognizerParams]
TextLanguage = English <- force english
LanguageDetectionMode = TSPV_No <- TSPV_Yes or TSPV_No are acceptable values
After doing this, you should be good to go, and all your image files should be close to 0° orientation for processing.
Choosing a language based on document orientation is a very specific workflow; the only option there is to code it yourself.
Good luck on your project!

Template Engine Capable of Altering after Render has been done

Hello,
I need to create a template which is "dynamic", and I'll explain what I mean by "dynamic":
I need a template that is rendered into text files (C++ code, to be exact).
The user will be able to change some things in the generated files.
After a while, a process is run to update the generated files; I need to be able to "spot" where the template regions were and update them accordingly.
My Effort
Currently, I use a "T4 Template" to create the initial render,
and in the template I embed C++-style comments around the regions I need to recognize later,
plus other code that finds those regions and regenerates what should go between those "comment blocks".
The problem is that the code that generates the boilerplate is not the same code that updates the regions, which costs me a lot of headaches and buggy features.
It is not very intuitive to write, and the users (the ones who use the generated code) need to know not to touch the "comment blocks".
Questions
How can I recognize locations/blocks in a generated file without "littering" the file with "comment"/"unimportant" text?
How can I unify the code that generates the "templated blocks" for both "Generation" and "Update"?
Later on, how can I make it work on non-code files too?
Edit
I guess I wasn't clear about what I am doing.
I am writing a tool in C# that generates C++ code.
Also, T4 is just what I used, but any tool/library can be used; C# libraries are preferred.
Any idea will be highly appreciated.
Thanks.
Now, I believe your question is totally ["open" and "opinion based"] on one side, and ["why is this code not working" without showing the code] on the other side... but I want to try pointing out some problems with the idea of "improvement" you have now.
Q2: How can I unify the code that generate the "templated blocks" both for "Generation" and "Update"
I'm strongly convinced that you should not, at least not now. Here's why:
'generate' and 'update' are happening in different directions; first is t4template->content, second is content->t4template
those two directions form different functionality
at least one of these directions requires complex logic not present in the other one
'generate' is based on T4 Engine, while 'update' will probably not be able to use it at all
..and probably many other reasons, but that's enough
Q3: Later on, How can i make it work on Non-Code files too
T4 Engine has no idea that what you generate now is a C++ file. T4 works only on a layer of "text files". If the process you have works now, you should be able to "generate" any text file already right now. The "update" part is a bit more tricky, because it depends on how you implemented it. If you assumed/used any correlation to C++ syntax, you've got a problem. (Guess why T4 Templates are called a 'text templating engine', agnostic to the actual generated code language.) If you kept it clean and worked as if on a free-form text file, then you're already safe to work on, well, free-form text files.
Q1: How can I recognize location/blocks in a generated file without "littering" the file with "comment"/"unimportant" text?
Well, basically, you can't and/or shouldn't. Consider the seemingly smart idea of keeping a hidden database that remembers text locations for every file. For every comment that you would put in the file, you put a row in the database saying file: BAR\FOO.CPP | FROM: line 120 char 1 | TO: line 131 char 15 | XXX: yyy | ZZZ: aaa. That's almost no different from having comments in the file, all information is preserved, and the file is clean now, right?
Nope. And that's because you want to detect what has changed. Let's take a highly contrived example; here's the generated file with such invisible markers managed by the database. Each # character denotes a marker, be it start/stop/meta-info, never mind which:
class FooBar : public #BaseClass#
{
public:
#void Blargh(Whizz& output);#
#int GetAge() const;#
private:
int #shoeSize#;
#
};
Those # are of course invisible; it's just information held elsewhere, and the user sees a clean file. Now, he edits it to this:
class FooBar : public BaseClass
{
public:
template<T>
void Yeeek(T& output);
int GetAge() const;
private:
int shoeSize;
};
Please note how "template" was added and the method renamed to "Yeeek". There were some markers out there; I didn't show them intentionally, just look at the "template<>" line. What if it was accidentally placed a line or a byte too far or too early, so one marker too many was skipped or included? Now the detector and updater may accidentally skip "template<>", and they will be totally happy to just rename the method. This is not a problem with the detector or updater. This is a problem of markers not being visible, so the user was not able to see where he should place his edit.
That's probably the most important point. But let's see something more algorithmic/technical. Let's try an even simpler edit. The user edits the file to:
class FooBarize : publ#ic BaseCl#ass
{
int goat;
# string cheese; #
p#ublic: #
void Blargh(Whizz& output);
i#nt GetA#ge() const;
p#rivate:
int shoeSize;
};
I overlaid those invisible markers from 'the external database of markers' back onto this edited file. What has happened? Simple. The user has added two more lines in an odd place (he doesn't see the markers, right?), and the database remembers the old places (i.e. 'line:char', but it could be 'byte', or really whatever). Now of course, the database may (and should!) also remember the old shape of the file, so it can see that e.g. the first # was after ":public" and the process can try to map it onto the new file... but then you already have a highly complex problem, and this edit was trivial. Of course, you can require the user to enter some information on how to update the markers... but hey, he doesn't see them, how can he do it? And since we wanted to hide the markers from him, we probably don't want to ask him about updating them either...
How about editing the file to:
struct FooBar : One,Two,Three,Four
{
void OhNoes();
};
I didn't care to overlay the markers, because it's utter nonsense. Now, how do we map it back to the template? Is OhNoes mappable to GetAge (const removed) or to Blargh (parameters removed)? How should the template's base class be updated? Which one of the new bases is the true base? Or maybe all of them? Neither you nor I can decide it, even with our combined human intelligence, to say nothing of an automated process.
Of course, you can leave it as a corner case; you can emit an error to the user and inform them that their edit went too far and is unanalyzable, and so on. But the complexity of reverse-mapping a change back to the model text is still there.
What I want to show you with these contrived examples is that if you want to detect and map changes back to the original template, you should keep these markers in the generated content. Having these markers in the code allows you to quickly and reliably detect:
which sections changed? (-> content between markers has changed)
which sections were offset by edits? (-> markers are now at a different position than before)
were any sections deleted? (-> both markers and content between removed)
(..)
It also allows the user to see which parts are special so he can place his edits in a reasonable way, which allows you to ignore and not support more corner cases than in the "invisible markers" case.
Finally, let's take a real-world example which you already know: the T4 Template. All those ugly <%!#!#^$^!%# littering your precious template text. Couldn't they be removed? Couldn't they be kept in a separate file that describes the transformation? Or at least at the beginning or end of the file? Yes, they could. But it would make editing a real pain - we're back to the 'invisible markers' problem: each of your edits to the content may require you to manually update the locations of some invisible markers.
Keep the markers in the generated content.
Keep your users aware of the generation and detection and special regions.
If it's too complex for them, change the users to a more technical group, or train your user base to be more technical. Or prevent them from editing the file. Give them some partial access so they can edit a part of the file, as an excerpt, not as a whole file. Limit their editing power to the absolute minimum. Maybe it will allow you to limit the number of visible markers, maybe even down to zero, maybe at the cost of splitting and downsizing the editable fragments.
I think you are going about it the wrong way. You have an XY problem here. Allowing your users to modify only part of the generated file and then trying to detect that part is a lot of headache, as you have seen.
Instead, the better solution is to leave the generated file completely unmodifiable and have some configuration available. For instance you can have a config file where users can add their own data members, initializers for them, etc.
This way you have a clear separation of the parts of your system.
The modifications done by the users are now trivially carried to the next iteration and you can easily always re-generate the output.
+------------------+
| Input: Template  | --------\
+------------------+         |
                             |
+------------------+         |  Generator code   +-------------------------+
| Input: Config    | --------+------------------>| Output: Generated code  |
+------------------+         |                   +-------------------------+
                             |
+------------------+         |
| Input: Config    | --------/
+------------------+
This system can be used to generate non-code also.
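To make the idea concrete, here is a small sketch of what such a config-driven generator could look like (all names, the XML layout, and the generator helper are hypothetical, not a prescription):

// A rough sketch: users edit only the config, never the generated file; the
// generator merges the config into every run, so regeneration never loses
// their additions.
//
// members.config:
//   <members>
//     <member type="int"    name="shoeSize" init="42" />
//     <member type="string" name="nickname" init="&quot;none&quot;" />
//   </members>

using System.Text;
using System.Xml.Linq;

static class MemberSectionGenerator
{
    public static string Generate(string configPath)
    {
        var sb = new StringBuilder();
        foreach (var m in XDocument.Load(configPath).Descendants("member"))
        {
            sb.AppendLine(string.Format("    {0} {1} = {2};",
                (string)m.Attribute("type"),
                (string)m.Attribute("name"),
                (string)m.Attribute("init")));
        }
        return sb.ToString();   // spliced into the template output, e.g. the private section
    }
}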

Parsing Complex PDF document with C#

See the attached K-1 document. I have attempted numerous tweaks with the iTextSharp library but haven't had success in loading the data correctly.
Ideally I would like to parse out the document similar to how humans would read them, one textbox at a time, reading its contents.
var reader = new PdfReader(FILE, Encoding.ASCII.GetBytes(password));
string[] lines;
var strategy = new LocationTextExtractionStrategy();
string currentPageText = PdfTextExtractor.GetTextFromPage(reader, 1, strategy);
lines = currentPageText.Split(new string[] {"\r\n", "\n"}, StringSplitOptions.None);
I also tried playing with Annotation parsing but didn't have luck.
I'm a newbie and probably looking in the wrong place. Can you help guide me in the right direction?
Thanks a lot.
You would like to parse the document similar to how a human would read it, one text box at a time, reading its contents. That means you first will have to try and automatically recognize those text boxes. Then you can extract text by those areas.
To recognize those text boxes automatically in your document, you have to extract the border lines enclosing the boxes. For this you will first have to find out how those border lines are created. They might be drawn using vector graphics as lines or rectangles, but they could also be part of a background bitmap image.
Unfortunately I don't have your IRS form at hand and so cannot analyze its internals. Let's assume the borders are created using vector graphics for now. Thus, you have to extract vector graphics.
To extract vector graphics with iText(Sharp), you make use of classes from the iText(Sharp) parser namespace by making them parse the document and feed the parsing events into a listener you create which collects the vector graphic operations:
You implement IExtRenderListener, in particular its ModifyPath and RenderPath methods, which respectively are called when additional path elements (e.g. lines or rectangles) are added to the current path or when the current path is rendered (stroked? filled?). Your implementation collects this information.
You parse your document into an instance of your listener, e.g. using PdfReaderContentParser.
You analyse the lines and rectangles found and derive the coordinates of the boxes they build.
You parse the same page in a LocationTextExtractionStrategy instance.
You retrieve the texts of the recognized text boxes by calling LocationTextExtractionStrategy.GetResultantText with a matching ITextChunkFilter argument for each box.
(Actually you can do the parsing into the instance of your listener and the LocationTextExtractionStrategy instance in one pass for a bit of optimization.)
All iText(Sharp) specific tasks are trivial, and the only other task, the analysis of the lines and rectangles found to derive the coordinates of the boxes, should be no big problem for a software developer proficient in C#.
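A skeletal sketch of such a listener could look like the following (assuming iTextSharp 5.x; the class name and what gets collected are made up, and the exact RenderPath signature may differ slightly between iTextSharp versions):

using System.Collections.Generic;
using iTextSharp.text.pdf.parser;

class BorderCollector : IExtRenderListener
{
    // Raw segment data of lines/rectangles added to the current path.
    public readonly List<IList<float>> Segments = new List<IList<float>>();

    public void ModifyPath(PathConstructionRenderInfo renderInfo)
    {
        if (renderInfo.Operation == PathConstructionRenderInfo.MOVETO ||
            renderInfo.Operation == PathConstructionRenderInfo.LINETO ||
            renderInfo.Operation == PathConstructionRenderInfo.RECT)
        {
            Segments.Add(renderInfo.SegmentData);
        }
    }

    // Called when the current path is stroked/filled; a fuller implementation would
    // inspect renderInfo.Operation and transform coordinates by the CTM here.
    // (In some iTextSharp versions this method returns void instead of Path.)
    public Path RenderPath(PathPaintingRenderInfo renderInfo)
    {
        return null;
    }

    public void ClipPath(int rule) { }

    // IRenderListener members, not needed for pure vector extraction.
    public void BeginTextBlock() { }
    public void EndTextBlock() { }
    public void RenderText(TextRenderInfo renderInfo) { }
    public void RenderImage(ImageRenderInfo renderInfo) { }
}

// Usage sketch:
// var reader = new PdfReader(FILE);
// var collector = new PdfReaderContentParser(reader).ProcessContent(1, new BorderCollector());
// ...derive box rectangles from collector.Segments, then extract text per box...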
The first question is whether this form is electronic or scanned; the latter would make the data extraction much harder, as it would also involve OCR.
In case you have an electronic PDF, and if all the forms are similar, why don't you just use the following strategy:
store the coordinates of each "box" in a config file
process documents and extract text from every "box" (i.e. region)
additionally process the extracted text with regular expressions to separate the name from the address (or maybe you can just set the region to read text line by line)
In case you have a few variations of the form, you may check the very first box to extract the name of the form and load the appropriate settings file (which contains the set of regions for that variation).
This approach should work with any PDF library.
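For illustration, reading a single configured "box" with iTextSharp could look roughly like this (the coordinates are hypothetical and would come from your settings file):

using iTextSharp.text;
using iTextSharp.text.pdf;
using iTextSharp.text.pdf.parser;

var reader = new PdfReader("form.pdf");
// One "box" from the settings file: llx, lly, urx, ury in PDF points (example values).
var box = new Rectangle(36, 700, 300, 750);
var strategy = new FilteredTextRenderListener(
    new LocationTextExtractionStrategy(),
    new RegionTextRenderFilter(box));
string boxText = PdfTextExtractor.GetTextFromPage(reader, 1, strategy);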
Take a look at the IvyPdf library and its template editor. It's written in C# and provides high-level functions to parse and extract data, so you don't have to deal with the internals of PDF documents. You can build fairly complex scenarios using it.
I don't think it can read annotations though.

How to translate a website into another language? (ASP.NET, C#)

I have developed a large business portal. I just realized I need my website in another language. I have researched the available solutions, such as:
Using a third-party control on my website. (Doesn't fit my design. Not useful from an SEO point of view. Don't want to show third-party brand names.)
Creating resource files for each language. (A lot of work is required to restructure pages to use text from resource files. And what about data entered by the user, like a business description?)
Are there any other options available?
I was thinking of a solution where, when a page is created on the server side, I could translate it before sending it back to the client. Is there any way I can do that (to translate everything, including data added from databases or through code, and without affecting the design)?
If you really need to translate your application, it's going to take a lot of hard, tedious work. There is no magic bullet.
The first thing you need to do is convert your plain text in your markup to asp:Localize controls. By using the Localize control, you can leave your existing <span> tags in place and just replace the text inside of them. There's really no way around this. Visual Studio's search and replace supports regular expression matching that may help you with this, or you can use Resharper (see below).
The first approach would be to download the open source shopping application nopCommerce and see how they handle their localization. They store their strings in a database and have a UI for editing languages. A similar approach may work well for you.
Alternatively, if you want to use Resource Files, there are two tools that I would recommend using in addition to Visual Studio: Resharper 5 (Localization Features screencast) and Zeta Resource Editor. These are the steps I would take to accomplish it using this method:
Use the "Generate Local Resource" tool in visual studio for each page
Use Resharper's "Move HTML to resource" on the text in your markup to make them into Localize controls.
Use Resharper to search out any localizable strings in your code behind and move them to the resource file as well.
Use the Globalization Rules of Code Analysis / FXCop to help find any additional problems you might face formatting numbers, dates, etc.
Once all text is in the resx files, use Zeta Resource Editor to load up all of your resx files, add new languages, and export for translation (or auto translate if you're brave enough).
I've used this approach on a site translated into 8 languages (and growing) with dozens of pages (and growing). However, this is not a user-editable site; the pages are solely controlled by the programmers.
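As a small illustration of reading those resources back from code-behind once the strings have been moved into .resx files (the resource keys and the "Strings" global resource file below are made up; this assumes you are inside a page's code-behind):

protected void Page_Load(object sender, System.EventArgs e)
{
    // Resource keys and the "Strings" global resource file are made up for this example.
    lblWelcome.Text = (string)GetLocalResourceObject("WelcomeLabel.Text");
    Page.Title = (string)GetGlobalResourceObject("Strings", "PageTitle");
}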
A large switch case? Use a dictionary/hashtable (a separate instance for each language); it is much, much more effective and faster.
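A minimal sketch of that idea (the keys and translations are made up):

using System.Collections.Generic;

static class Translations
{
    // One dictionary per language; lookups are O(1) instead of walking a giant switch.
    public static readonly Dictionary<string, string> Spanish = new Dictionary<string, string>
    {
        { "Welcome", "Bienvenido" },
        { "Log out", "Cerrar sesión" },
    };

    public static string Translate(string text, IDictionary<string, string> language)
    {
        string translated;
        return language.TryGetValue(text, out translated) ? translated : text;
    }
}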
To convert the page to Arabic or another language:
Go to:
1 - Page design
2 - Tools
3 - Generate Local Resource
4 - This produces an "App_LocalResources" folder that includes "filename.aspx.resx"
5 - Copy the file and rename it to "filename.aspx.ar.resx" to convert the page to Arabic or another language.
Hope this is helpful :)
I found a good solution; see http://www.nopcommerce.com/p/1784/nopcommerce-translator.aspx
The project is open source and the source repository is here: https://github.com/Marjani/NopCommerce-Translator
Good luck.
Without installing any third-party tool, API, or DLL, I am able to use App_LocalResources. I still use Google Translate for the words and sentences to be translated and copy and paste the results into the file, as you can see in one of the screenshots below (or you can have a human translator type the entries in manually). In your project folder (using MS Visual Studio as the editor), add an App_LocalResources folder and create the English and other-language (.resx) files. In my case, it's a Spanish (es-ES) translation. See the screenshot below.
Next, in your aspx, add the meta tags (meta:resourcekey) that match the entries in the App_LocalResources files: one for English and another for the Spanish file. See the screenshots below:
Spanish: (filename.aspx.es-ES.resx)
English: (filename.aspx.resx)
Then create a link on your masterpage file with a querystring that will switch the page translation and will be available on all pages:
<%--ENGLISH/SPANISH VERSION BUTTON--%>
<asp:HyperLink ID="eng_ver" runat="server" Text="English" Font-Underline="false"></asp:HyperLink> |
<asp:HyperLink ID="spa_ver" runat="server" Text="Español" Font-Underline="false"></asp:HyperLink>
<%--ENGLISH/SPANISH VERSION BUTTON--%>
In your master page code-behind, point the HyperLink controls at dynamic URLs:
// LOCALIZATION
string thispage = Request.Url.AbsolutePath;
eng_ver.NavigateUrl = thispage;
spa_ver.NavigateUrl = thispage + "?ver=es-ES";
// LOCALIZATION
Now, in your pages' code-behind, you can set a session variable so that all links and redirects stick to the desired translation by always appending a querystring to the URLs.
On PageLoad:
// LOCALIZATION
// dynamic querystring; append Session["add2url"] to the URLs you link or redirect to
if (Session["version"] != null)
{
    Session["add2url"] = "?ver=" + Session["version"]; // SPANISH version
}
else
{
    Session["add2url"] = ""; // ENGLISH as default
}
// LOCALIZATION
On Click Events sample:
protected void btnBack_Click(object sender, EventArgs e)
{
    Session["FileName.aspx"] = null;
    Response.Redirect("FileName.aspx" + Session["add2url"]);
}
I hope my descriptions were easy enough.
If you don't want to code more and Google Translate is feasible for you, you can try the Google Translate element; check the code below.
<script src="http://translate.google.com/translate_a/element.js?cb=googleTranslateElementInit"></script>
<script>
    function googleTranslateElementInit() {
        $.when(
            new google.translate.TranslateElement({pageLanguage: 'en', includedLanguages: 'en',
                layout: google.translate.TranslateElement.FloatPosition.TOP_LEFT}, 'google_translate_element')
        ).done(function () {
            var select = document.getElementsByClassName('goog-te-combo')[0];
            select.selectedIndex = 1;
            select.addEventListener('click', function () {
                select.dispatchEvent(new Event('change'));
            });
            select.click();
        });
    }

    $(window).on('load', function () {
        var select = document.getElementsByClassName('goog-te-combo')[0];
        select.click();
        var selected = document.getElementsByClassName('goog-te-gadget')[0];
        selected.hidden = true;
    });
</script>
Also, add the code below inside the <body> tag:
<div id="google_translate_element"></div>
It will certainly be more work to create resource files for each language - but this is the option I would opt for, as it gives you the opportunity to be more accurate. If you do it this way, you can have the text translated manually by someone who speaks the language (there are many companies out there that offer this kind of service).
Automatic translation systems are often good for giving a general impression of what something in another language means, but I would never use them when trying to portray a professional image, as often what they output just doesn't make sense. Nothing screams 'unprofessional!' like text that just doesn't make sense because it's been automatically translated.
I would take the resource file route over the translation option because the meaning of words in a language can be very contextual and even one mistake could undermine your site's credibility.
As you suggest, Visual Studio can generate the meta resource file keys for most controls containing text, but it may leave you having to do the rest manually; even so, I don't see an easier, more reliable solution.
I don't think localisation is an easy thing to automate anyway, as text held in the database often requires schema changes to allow for multiple languages, and the HTML often needs restructuring to deal with truncated or wrapped label and button text when, for example, you've translated into German or something.
Other considerations:
Culture settings - financial delimiters, date formats.
Right-to-left - some languages, like Arabic, are written right to left, meaning that the pages require rethinking of control positioning, images, etc.
Good luck whatever you go with.
I ended up doing it the hard way:
I wrote an extension method on the string class called TranslateInto.
In the page's PreRender method I grab all controls recursively, based on their type (the types that would have text).
I foreach through them and call text.TranslateInto(SupportedLanguages.CurrentLanguage).
In my TranslateInto method I have a ridiculously large switch statement with every string displayed to the user and its associated translation.
It's not very pretty, but it worked.
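A rough sketch of that approach (the enum values, translations, and handled control types are placeholders, and the real switch would of course be far larger):

using System.Web.UI;
using System.Web.UI.WebControls;

public enum SupportedLanguages { English, Spanish }

public static class StringTranslationExtensions
{
    public static string TranslateInto(this string text, SupportedLanguages language)
    {
        if (language == SupportedLanguages.English) return text;
        switch (text)
        {
            case "Save":   return "Guardar";
            case "Cancel": return "Cancelar";
            default:       return text;   // fall back to the untranslated string
        }
    }
}

public partial class SomePage : Page
{
    protected override void OnPreRender(System.EventArgs e)
    {
        base.OnPreRender(e);
        TranslateControls(this, SupportedLanguages.Spanish);
    }

    private static void TranslateControls(Control root, SupportedLanguages language)
    {
        foreach (Control child in root.Controls)
        {
            var label = child as Label;
            if (label != null) label.Text = label.Text.TranslateInto(language);

            var button = child as Button;
            if (button != null) button.Text = button.Text.TranslateInto(language);

            TranslateControls(child, language);   // recurse into child controls
        }
    }
}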
We work with a Translation CAT tool (Computer Assisted Translation) called MemoQ that allows us to translate the text while leaving all the tags and coding in place. This is very helpful when the order of words change when you translate from one language to another.
It is also very useful because it allows us to work with translators from around the world, without the need for them to have any technical expertise. It also allows us to have the translation proof read by a second translator.
We use this translation environment to translate html, xml, InDesign, Word, etc.
I think you should try Google Translate.
http://translate.google.com/translate_tools
Very easy and very very effective.
HTH
