What goes into rolling your own wiki using C# and SQL?

I'd like to understand how a wiki works, at least at a high level. When a user saves changes, does it always insert a new row in the database for that wiki article (10 revisions, 10 rows in the database)?

I agree with all the answers. Wikis normally handle every edit as a new record inside the database.
You may be interested in checking out the full layout diagram of the MediaWiki database; MediaWiki is the wiki engine behind Wikipedia.
Note that the full text of each revision is stored in a MEDIUMBLOB field in the text table.

I just wrote a wiki in C#, actually. One thing I would add to everyone else's comments is that you'll want to make sure you can compare two versions. For doing this in C# I strongly suggest the C# implementation of Google's diff_match_patch library. It's quite fast and quite easy to extend if you need more in the way of pretty-printing or handling of structured text like HTML.
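For a sense of what that looks like, a minimal sketch against the C# port (the namespace and lowercase class name follow the port's conventions but may vary by distribution; the two revision strings are made up):

```csharp
using System;
using System.Collections.Generic;
using DiffMatchPatch; // namespace of Google's C# port; adjust to your copy

class DiffDemo
{
    static void Main()
    {
        string oldRevision = "Wikis store every edit.";
        string newRevision = "Wikis store every edit as a new row.";

        var dmp = new diff_match_patch();
        List<Diff> diffs = dmp.diff_main(oldRevision, newRevision);
        dmp.diff_cleanupSemantic(diffs);               // merge char-level noise into readable chunks
        Console.WriteLine(dmp.diff_prettyHtml(diffs)); // basic <ins>/<del> markup for display
    }
}
```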

Every edit in the wiki becomes a new entry in the database.
That way revisions can be tracked; it is all about the community and tracking.
Behind the scenes the database stores the timestamp, the changes made, and so on.

Yes, it does. Otherwise it would be impossible to see the full page history, which is what's expected from a wiki implementation.

Yes.
...
Seems a bit short. Let's just say that you have to store the original article and then details about each change afterwards. So you might have an Article table and a Revision table; that way you can roll back to any prior state.
Of course, the design of the tables and the logic for separating the revised text from the original and storing it on its own can get fairly complex.
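A minimal sketch of the save path under that design, assuming an Article table plus a Revision table (the table and column names are illustrative, not a fixed schema):

```csharp
using System;
using System.Data.SqlClient;

static class WikiStore
{
    // Saving an edit never overwrites the article; it appends a Revision row.
    public static void SaveRevision(string connectionString, int articleId, int editorId, string newText)
    {
        const string sql = @"
            INSERT INTO Revision (ArticleId, EditorId, EditedAtUtc, Body)
            VALUES (@ArticleId, @EditorId, @EditedAtUtc, @Body);";

        using (var conn = new SqlConnection(connectionString))
        using (var cmd = new SqlCommand(sql, conn))
        {
            cmd.Parameters.AddWithValue("@ArticleId", articleId);
            cmd.Parameters.AddWithValue("@EditorId", editorId);
            cmd.Parameters.AddWithValue("@EditedAtUtc", DateTime.UtcNow);
            cmd.Parameters.AddWithValue("@Body", newText);

            conn.Open();
            cmd.ExecuteNonQuery();
        }
    }
}
```

Rolling back to a prior state is then just a matter of reading the Body of an earlier revision and appending it as the newest row.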

Here is the dev blog for TWiki, which might give you some useful information: http://twiki.org/cgi-bin/view/Blog/WebHome?category=Development.
Is SQL a requirement of the project? There is a lot of movement around NoSQL at the moment, and a wiki seems to fit nicely into a document-store database. Some information on this can be found at http://nosql-database.org/.
There is an implementation on CodePlex at http://wikiplex.codeplex.com/; it comes from another Stack Overflow post, asp.net mvc wiki.

You might want to check whether a version control engine can be used for the text parts (users etc. might still need a database), as most version control systems already implement all the necessary functions (history, diffing, log entries for changes, ...), which would save you a lot of work.
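For instance, a hedged sketch using LibGit2Sharp, one .NET binding for Git (the repository path, file name, and signature values here are placeholders):

```csharp
using System;
using System.IO;
using LibGit2Sharp;

static class PageHistory
{
    // Committing each save gives you history, diffs, and log entries for free.
    public static void CommitPage(string repoPath, string pageFile, string pageText, string userName)
    {
        File.WriteAllText(Path.Combine(repoPath, pageFile), pageText);

        using (var repo = new Repository(repoPath))
        {
            Commands.Stage(repo, pageFile);
            var who = new Signature(userName, userName + "@example.local", DateTimeOffset.Now);
            repo.Commit("Edit " + pageFile, who, who);
        }
    }
}
```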

Related

Is it possible for Lucene to monitor a Sql Table and keep itself updated?

I am trying to understand some basics of Lucene, the full-text search engine. More specifically, I am looking at Lucene.Net.
Today I have an old legacy .NET 4.8 web app. Some of it is MVC, but the newer parts follow a pretty nice API-first pattern. The app holds a lot of records (approximately half a million) with tons of different fields. The search functionality there is outdated, to say the least: it is a ton of old Linq2SQL queries that fan out into LIKE queries.
I would like to introduce a new and better way to search records, so I started looking at Lucene.Net. But I am trying to understand one key concept, and I can't seem to find the answer anywhere. I think it might be because it cannot be done, but I would like to make sure.
Is it possible to set up Lucene to monitor a SQL table or view so I don't have to maintain the Lucene index from within my code? The code of this app does not lend itself to easily keeping a Lucene index updated when things are added, changed or deleted, but the database is a good source of truth. I can live with a small delay in the index being up to date. Basically, I would like to define for each business model which fields are part of the index and what the id is, and then be able to query that index from the C# server-side code of my web app.
Is such a scenario even possible or am I asking too much?
It's totally possible, but not out of the box; you have to implement it if you want it. Fundamentally you need to implement three things:
1. A way to know every time a piece of relevant data in the SQL database changes.
2. A place to capture information about each change; call it a change log.
3. A routine that reads the change log, applies those changes to the Lucene.Net index, and then marks each change-log record as processed.
There are of course lots of different ways to handle each of these; a sketch of the third piece follows below.
This SO answer, Lucene.Net index updates, when a manual change is done in SQL Database, provides more details on one way this can be accomplished.
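As a rough illustration of that third piece, a hedged sketch against Lucene.Net 4.8 (the ChangeLogEntry shape and how you load and mark pending rows are assumptions standing in for your own data access):

```csharp
using System.Collections.Generic;
using Lucene.Net.Analysis.Standard;
using Lucene.Net.Documents;
using Lucene.Net.Index;
using Lucene.Net.Store;
using Lucene.Net.Util;

// Hypothetical shape of a change-log row; map it to your own table.
class ChangeLogEntry
{
    public string RecordId;
    public string Title;
    public string Body;
    public bool IsDelete;
}

static class IndexSync
{
    // One pass of the sync routine: apply pending change-log rows to the index.
    public static void ApplyPendingChanges(string indexPath, IEnumerable<ChangeLogEntry> pending)
    {
        using (var dir = FSDirectory.Open(indexPath))
        using (var analyzer = new StandardAnalyzer(LuceneVersion.LUCENE_48))
        using (var writer = new IndexWriter(dir, new IndexWriterConfig(LuceneVersion.LUCENE_48, analyzer)))
        {
            foreach (var change in pending)
            {
                var idTerm = new Term("id", change.RecordId);
                if (change.IsDelete)
                {
                    writer.DeleteDocuments(idTerm);
                    continue;
                }

                var doc = new Document
                {
                    new StringField("id", change.RecordId, Field.Store.YES),
                    new TextField("title", change.Title ?? "", Field.Store.YES),
                    new TextField("body", change.Body ?? "", Field.Store.NO)
                };
                writer.UpdateDocument(idTerm, doc); // insert-or-replace by id
            }

            writer.Commit();
            // ...then mark the processed rows in the change log.
        }
    }
}
```

For the first two pieces, triggers writing to a change-log table, SQL Server Change Tracking, or polling on a rowversion column are all workable options.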

ASP.NET Working with normalized database

First of all, sorry for what may seem like a dumb question, but I have zero experience in this area.
So at work I was given a database (which is way more normalized than needed), and for each insert/update/delete/select I have a separate stored procedure.
As someone with zero experience I started creating my own stored procedures and displaying text instead of IDs, and it was all going well until I realized I have to update these records at some point :).
So my question is: can you give me directions on how to display "eye-friendly" information in the GridView and at the same time be able to edit/update this information?
Currently what I am doing is just calling a stored procedure and databinding the GridView to it.
Thanks in advance!
Study time! I would like to explain this to you directly, but it is much more useful to point you at ready-made tutorials, which contain the information you need; this topic is rich enough that it would be hard to explain in just a few sentences/examples.
The best option is to use the tools/classes that .NET provides for you. Among them is the DataSet, which will help you enormously with the whole set of modifications: deletes/selects. A sketch follows below.
You can bind a DataSource to the GridView, which will autofill the data, and you can then allow certain kinds of modifications to it.
Another approach is to use Entity Framework, where you can modify data in whatever way you want; it will do a lot of the work for you.
The last topic you should be interested in is LINQ: simply get data and do queries/modifications in your app (not on the SQL server).
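For a feel of the DataSet route, a minimal hedged sketch (the stored procedure name, grid ID, and connection string are placeholders):

```csharp
using System;
using System.Data;
using System.Data.SqlClient;
using System.Web.UI;

public partial class OrdersPage : Page
{
    private readonly string connectionString = "..."; // read from web.config in practice

    protected void Page_Load(object sender, EventArgs e)
    {
        if (IsPostBack) return;

        // The procedure joins the lookup tables, so the grid shows
        // readable text instead of foreign-key IDs.
        var ds = new DataSet();
        using (var conn = new SqlConnection(connectionString))
        using (var cmd = new SqlCommand("dbo.GetOrdersForDisplay", conn) { CommandType = CommandType.StoredProcedure })
        using (var adapter = new SqlDataAdapter(cmd))
        {
            adapter.Fill(ds, "Orders"); // the adapter opens/closes the connection itself
        }

        OrdersGridView.DataSource = ds.Tables["Orders"];
        OrdersGridView.DataBind();
    }
}
```

Keep the ID column in the result (it can sit in the grid's DataKeyNames rather than a visible column), so your update stored procedure still has its key when you handle the RowUpdating event.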
Check links below:
What is ADO.NET?
http://www.entityframeworktutorial.net/what-is-entityframework.aspx
I also recommend this YouTube video; be careful about the video/sound quality, and try some other videos with a similar name.
http://tutorialspoint.com/linq/

How to think when saving application information (settings, data etc)?

I've been thinking about this for some time. Let's say I have an application where you can add and use reminders.
What is the best way to store this? In the past I've always used a text file, but it can get problematic if I later want to add another "field" to each reminder in the text file. Let's say I add a feature for recurring reminders.
What is the most flexible way? Text? XML? JSON? SQLite?
Use a database. Adding another field is as simple as adding another column to a table.
MySQL is a solid database and its flavour of SQL is easy to pick up for beginners. When I started out, I watched (and really enjoyed) this tutorial series:
https://www.youtube.com/watch?v=6pbxQQG25Jw
If you ever make something that needs a lot of scalability, you might want to look into PostgreSQL.
SQLite becomes a better option as your data model becomes more complex. The upgrade process (changing, adding, and removing tables) is a bit of work, and is required for your code to even refer to a new field in a query.
XML and JSON have the advantage of having parsers built into the standard libraries for most platforms these days, so you don't have to fix your parser every time you change your data model (as you would with plain text). XML can validate your model and let you know if the file does not comply with it. JSON is really just a serialization protocol and doesn't provide anything in terms of model validation, which makes it the most flexible of the plain-text options (IMO).
In terms of updating your model, your code should read in the file and allow for the new field to be missing or empty. If the field is mandatory, you should provide a default value and then write your model back out to the file so it's good to go the next time. This process is roughly the same for SQLite, but is just a bit more involved in terms of what you have to do to upgrade your model.
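A hedged sketch of that upgrade-in-place idea with JSON, using Newtonsoft.Json (the Reminder shape and the new Recurrence field are made up for the recurring-reminders example):

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using Newtonsoft.Json;

class Reminder
{
    public string Text { get; set; }
    public DateTime Due { get; set; }

    // New field: missing from old files, so it needs a sensible default.
    public string Recurrence { get; set; } = "none";
}

static class ReminderStore
{
    public static List<Reminder> Load(string path)
    {
        var reminders = JsonConvert.DeserializeObject<List<Reminder>>(File.ReadAllText(path))
                        ?? new List<Reminder>();

        // Write the model back out so old files pick up the new field
        // and are good to go next time.
        File.WriteAllText(path, JsonConvert.SerializeObject(reminders, Formatting.Indented));
        return reminders;
    }
}
```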

RavenDB: Convert a document property to another type

I'm currently developing an application where I change the document model a lot as I go forward (a small project to learn stuff like RavenDB). Some changes are not backwards compatible, which leads to JSON deserialization failures when I try to fetch documents.
Is there some way to convert a property from the old type to a new one during deserialization? I'm using Raven.Client.Lightweight as the client library.
Example:
I had a property named AllProperties in a class; it was a Dictionary<string,string>. I changed the type from the dictionary to a class called MetadataItemCollection.
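For illustration, one way to express that kind of conversion is a custom Newtonsoft.Json converter, since the Raven client serializes through a bundled copy of Newtonsoft.Json. A hedged sketch; the MetadataItemCollection.Add overload and the exact serializer namespace are assumptions that differ across client versions:

```csharp
using System;
using System.Collections.Generic;
using Newtonsoft.Json; // the Raven client bundles its own copy; adjust the namespace to match

public class MetadataItemCollectionConverter : JsonConverter
{
    public override bool CanConvert(Type objectType)
    {
        return objectType == typeof(MetadataItemCollection);
    }

    public override object ReadJson(JsonReader reader, Type objectType,
                                    object existingValue, JsonSerializer serializer)
    {
        // Old documents stored AllProperties as a plain dictionary.
        var dict = serializer.Deserialize<Dictionary<string, string>>(reader)
                   ?? new Dictionary<string, string>();

        var collection = new MetadataItemCollection();
        foreach (var pair in dict)
            collection.Add(pair.Key, pair.Value); // assumes such an Add overload exists

        return collection;
    }

    // Only intervene when reading; the new shape is written out normally.
    public override bool CanWrite { get { return false; } }

    public override void WriteJson(JsonWriter writer, object value, JsonSerializer serializer)
    {
        throw new NotSupportedException();
    }
}
```

The converter would then be registered through the document store's serialization conventions (the exact hook name varies by client version, so treat that as an assumption too).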
As with any other database solution, I suggest you use your favourite migrations framework for this kind of thing. You will probably want to do set-based operations on documents.
Interestingly, Ayende is going to publish two articles about RavenDB migrations in the next few days; however, Google has already indexed them, so you can access the articles here:
RavenDB Migrations: When to execute?
RavenDB Migrations: Rolling Updates
Ayende, please forgive me... ;)
If you are doing this during development, you are probably better off just deleting the old docs and recreating them.
If you are doing this in production, take a look at the posts that dlang linked; they discuss those specific issues.

MongoDB, C# and NoRM + Denormalization

I am trying to use MongoDB, C# and NoRM to work on some sample projects, but at this point I'm having a much harder time wrapping my head around the data model. With an RDBMS, related data is no problem. In MongoDB, however, I'm having a difficult time deciding what to do with it.
Let's use StackOverflow as an example... I have no problem understanding that the majority of data on a question page should be included in one document. Title, question text, revisions, comments... all good in one document object.
Where I start to get hazy is on the question of user data like username, avatar, reputation (which changes especially often)... Do you denormalize and update thousands of document records every time there is a user change or do you somehow link the data together?
What is the most efficient way to accomplish a user relationship without causing tons of queries to happen on each page load? I noticed the DbReference<T> type in NoRM, but haven't found a great way to use it yet. What if I have nullable optional relationships?
Thanks for your insight!
The balance that I have found is using SQL as the normalized database and Mongo as the denormalized copy, with an ESB to keep them in sync.

I use a concept that I call "prepared documents" and "stored documents". Stored documents are data that is kept only in Mongo; useful for data that isn't relational. The prepared documents contain data that can be rebuilt from the normalized database. They act as living caches in a way: they can be rebuilt from scratch if the data ever falls out of sync (for complicated documents this is an expensive process, because these documents require many queries to rebuild), and they can also be updated one field at a time. This is where the service bus comes in: it responds to events sent after the normalized database has been updated and then updates the relevant Mongo prepared documents.
Use each database to its strengths. Allow SQL to be the write database that ensures data integrity. Let Mongo be the read-only database that is blazing fast and can contain sub-documents, so that you need fewer queries.
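A rough sketch of that event-driven patch, shown with the official MongoDB C# driver for illustration rather than NoRM (the UserChanged event, document shapes, and collection name are made up):

```csharp
using MongoDB.Driver;

public class Author { public string Id { get; set; } public string Name { get; set; } public int Reputation { get; set; } }
public class Post { public string Id { get; set; } public Author Author { get; set; } }

// A made-up event published after the normalized SQL side commits a user change.
public class UserChanged
{
    public string UserId { get; set; }
    public string NewName { get; set; }
    public int NewReputation { get; set; }
}

public class UserChangedHandler
{
    private readonly IMongoCollection<Post> _posts;

    public UserChangedHandler(IMongoDatabase db)
    {
        _posts = db.GetCollection<Post>("posts");
    }

    // Patch the denormalized author fields in every prepared document that embeds them.
    public void Handle(UserChanged evt)
    {
        var filter = Builders<Post>.Filter.Eq(p => p.Author.Id, evt.UserId);
        var update = Builders<Post>.Update
            .Set(p => p.Author.Name, evt.NewName)
            .Set(p => p.Author.Reputation, evt.NewReputation);

        _posts.UpdateMany(filter, update);
    }
}
```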
** EDIT **
I just re-read your question and realized what you were actually asking for. I'm leaving my original answer in case it's helpful at all.
The way I would handle the Stack Overflow example you gave is to store the user id in each comment. You would load up the post, which has all of the comments in it. That's one query.
You would then traverse the comment data, pull out an array of user ids that you need to load, and load those as a batch query (using the Q.In() query operator). That's two queries total. You would then need to merge the data together into a final form; a sketch follows below. There is a balance to strike between doing it like this and using something like an ESB to manually update each document; use what works best for each individual scenario in your data structure.
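A hedged sketch of that two-query merge; the GetPostById/GetUsersByIds helpers are hypothetical stand-ins for your driver's calls (in NoRM the batch lookup would use the Q.In() operator, in the official driver an $in filter):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Comment { public string UserId { get; set; } public string Text { get; set; } }
public class Post { public string Id { get; set; } public List<Comment> Comments { get; set; } }
public class User { public string Id { get; set; } public string Name { get; set; } }

public static class PostPageLoader
{
    public static Tuple<Post, Dictionary<string, User>> Load(string postId)
    {
        Post post = GetPostById(postId);            // query 1: post + embedded comments

        var userIds = post.Comments
                          .Select(c => c.UserId)
                          .Distinct()
                          .ToList();

        var users = GetUsersByIds(userIds)          // query 2: batch user lookup
                        .ToDictionary(u => u.Id);

        return Tuple.Create(post, users);           // merge in memory for the view
    }

    // Hypothetical data access; replace with your driver's calls.
    static Post GetPostById(string id) { throw new NotImplementedException(); }
    static IEnumerable<User> GetUsersByIds(IEnumerable<string> ids) { throw new NotImplementedException(); }
}
```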
I think you need to strike a balance.
If I were you, I'd just reference the userid instead of their name/reputation in each post.
Unlike with an RDBMS, though, you would opt to have the comments embedded in the document.
Why do you want to avoid denormalization and updating "thousands of document records"? MongoDB is designed for denormalization. Stack Overflow handles millions of different pieces of data in the background, and some of that data can be stale for a short period; that's okay.
So the main idea of the above is that you should have denormalized documents in order to display them fast in the UI.
You can't query by a referenced document; one way or another, you need denormalization.
I also suggest having a look at the CQRS and event sourcing architectures, which will allow you to update all this data through a queue.
