How to replace text with index value in C#

I have a text file that contains the string "map" repeated more than 800 times. I would like to replace each occurrence with a numbered version: map0, map1, map2, ..., map800.
I tried the following, but it didn't work:
void Main() {
string text = File.ReadAllText(@"T:\File1.txt");
for (int i = 0; i < 2000; i++)
{
text = text.Replace("map", "map"+i);
}
File.WriteAllText(@"T:\File1.txt", text);
}
How can I achieve this?

This should work fine:
void Main() {
string text = File.ReadAllText(@"T:\File1.txt");
int num = 0;
text = (Regex.Replace(text, "map", delegate(Match m) {
return "map" + num++;
}));
File.WriteAllText(@"T:\File1.txt", text);
}

/// <summary>
/// Replaces each existing key within the original string by adding a number to it.
/// </summary>
/// <param name="original">The original string.</param>
/// <param name="key">The key we are searching for.</param>
/// <param name="offset">The number to start with. The default value is 0.</param>
/// <param name="increment">The amount added to the number after each match. The default value is 1.</param>
/// <returns>A new string in which each occurrence of the key is followed by a number that starts at "offset" and grows by "increment".</returns>
/// <example>
/// Assuming that we have an original string of "mapmapmapmap" and the key "map" we
/// would get "map0map1map2map3" as result.
/// </example>
public static string AddNumberToKeyInString(string original, string key, int offset = 0, int increment = 1)
{
if (original.Contains(key))
{
int counter = offset;
int position = 0;
int index;
// While we are within the length of the string and
// the "key" we are searching for exists at least once
while (position < original.Length && (index = original.Substring(position).IndexOf(key)) != -1)
{
// Insert the counter after the "key"; index is relative to position, so add both
original = original.Insert(position + index + key.Length, counter.ToString());
position += index + key.Length + counter.ToString().Length;
counter += increment;
}
}
return original;
}

It's because Replace changes every occurrence of map on every pass through the loop, including the ones that already have a number. If the original string was "map map map", after 10 iterations it becomes "map9876543210 map9876543210 map9876543210". You need to find each individual occurrence of map and replace only that one. Try using the IndexOf method.
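To see the compounding effect in isolation, here is a small sketch that repeats the loop from the question for a chosen number of iterations. The class and method names (ReplaceLoopDemo, Run) are made up for this illustration.

```csharp
public static class ReplaceLoopDemo
{
    // Repeats the loop from the question for the given number of iterations.
    public static string Run(string text, int iterations)
    {
        for (int i = 0; i < iterations; i++)
        {
            // Replace touches every "map" in the string, including ones already numbered.
            text = text.Replace("map", "map" + i);
        }
        return text;
    }
}
```

Calling ReplaceLoopDemo.Run("map map", 3) returns "map210 map210": every pass pushes a new digit in front of the digits from the earlier passes.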

Something along these lines should give you an idea of what you're trying to do:
static void Main(string[] args)
{
string text = File.ReadAllText(@"C:\temp\map.txt");
int mapIndex = text.IndexOf("map");
int hitCount = 0;
while (mapIndex >= 0)
{
// Number to append to this occurrence
string number = hitCount.ToString();
// Keep everything up to and including this "map" (3 characters), then add the number
text = text.Substring(0, mapIndex) + "map" + number + text.Substring(mapIndex + 3);
// Continue searching after the number that was just inserted
mapIndex = text.IndexOf("map", mapIndex + 3 + number.Length);
hitCount++;
}
File.WriteAllText(@"C:\temp\map1.txt", text);
}
Because strings are immutable, this isn't ideal for large files (1 MB+): every occurrence of "map" allocates a new copy of the entire string.
For an example file:
map hat dog
dog map cat
lost cat map
mapmapmaphat
map
You get the results:
map0 hat dog
dog map1 cat
lost cat map2
map3map4map5hat
map6
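For larger files, a single pass that appends to a StringBuilder avoids rebuilding the whole string on every hit. This is only a minimal sketch: the names MapNumbering and NumberOccurrences are made up for illustration, and it assumes an ordinal (case-sensitive) match is what you want.

```csharp
using System;
using System.Text;

public static class MapNumbering
{
    // Appends a running counter after every occurrence of key, scanning the input once.
    public static string NumberOccurrences(string text, string key, int start = 0)
    {
        if (string.IsNullOrEmpty(text) || string.IsNullOrEmpty(key))
            return text;

        var result = new StringBuilder(text.Length + 16);
        int counter = start;
        int position = 0;
        int hit;

        // IndexOf with a start index searches the original string, so nothing is copied per hit.
        while ((hit = text.IndexOf(key, position, StringComparison.Ordinal)) >= 0)
        {
            result.Append(text, position, hit - position); // text before the hit
            result.Append(key).Append(counter++);          // the key followed by its number
            position = hit + key.Length;
        }

        result.Append(text, position, text.Length - position); // remaining tail
        return result.ToString();
    }
}
```

With the file contents read into text, MapNumbering.NumberOccurrences(text, "map") produces the numbered text, which can then be passed to File.WriteAllText as before.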

Related

convert integer to hex as unit with prefix [duplicate]

How can I convert the following?
2934 (integer) to B76 (hex)
Let me explain what I am trying to do. I have User IDs in my database that are stored as integers. Rather than having users reference their IDs I want to let them use the hex value. The main reason is because it's shorter.
So not only do I need to go from integer to hex but I also need to go from hex to integer.
Is there an easy way to do this in C#?
// Store integer 182
int intValue = 182;
// Convert integer 182 as a hex in a string variable
string hexValue = intValue.ToString("X");
// Convert the hex string back to the number
int intAgain = int.Parse(hexValue, System.Globalization.NumberStyles.HexNumber);
from http://www.geekpedia.com/KB8_How-do-I-convert-from-decimal-to-hex-and-hex-to-decimal.html
HINT (from the comments):
Use .ToString("X4") to get exactly 4 digits with leading 0, or .ToString("x4") for lowercase hex numbers (likewise for more digits).
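Since the hex IDs would be typed in by users, it may be worth parsing them with int.TryParse so that bad input does not throw. A minimal sketch follows; the names HexId, ToHex and TryFromHex are made up for this example, and it assumes IDs are entered without a 0x prefix.

```csharp
using System;
using System.Globalization;

public static class HexId
{
    // Formats an ID as upper-case hex, left-padded with zeros to at least minDigits characters.
    public static string ToHex(int id, int minDigits = 1)
    {
        return id.ToString("X" + minDigits, CultureInfo.InvariantCulture);
    }

    // Returns false instead of throwing when the text is not valid hex.
    public static bool TryFromHex(string hex, out int id)
    {
        return int.TryParse(hex, NumberStyles.AllowHexSpecifier, CultureInfo.InvariantCulture, out id);
    }
}
```

HexId.ToHex(2934) gives "B76" and HexId.ToHex(2934, 4) gives "0B76"; HexId.TryFromHex("B76", out id) sets id to 2934, while input such as "hello" simply returns false.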
Use:
int myInt = 2934;
string myHex = myInt.ToString("X"); // Gives you hexadecimal
int myNewInt = Convert.ToInt32(myHex, 16); // Back to int again.
See How to: Convert Between Hexadecimal Strings and Numeric Types (C# Programming Guide) for more information and examples.
Try the following to convert it to hex
public static string ToHex(this int value) {
return String.Format("0x{0:X}", value);
}
And back again
public static int FromHex(string value) {
// strip the leading 0x
if ( value.StartsWith("0x", StringComparison.OrdinalIgnoreCase)) {
value = value.Substring(2);
}
return Int32.Parse(value, NumberStyles.HexNumber);
}
int valInt = 12;
Console.WriteLine(valInt.ToString("X")); // C ~ possibly single-digit output
Console.WriteLine(valInt.ToString("X2")); // 0C ~ always double-digit output
string HexFromID(int ID)
{
return ID.ToString("X");
}
int IDFromHex(string HexID)
{
return int.Parse(HexID, System.Globalization.NumberStyles.HexNumber);
}
I really question the value of this, though. Your stated goal is to make the value shorter, which it will be, but that isn't a goal in itself. You really mean either easier to remember or easier to type.
If you mean easier to remember, then you're taking a step backwards. We know it's still the same size, just encoded differently. But your users won't know that the letters are restricted to 'A-F', so the ID will occupy the same conceptual space for them as if the letters 'A-Z' were allowed. So instead of being like memorizing a telephone number, it's more like memorizing a GUID (of equivalent length).
If you mean easier to type, the user can no longer use the numeric keypad and must use the main part of the keyboard. It's likely to be more difficult to type, because it won't be a word their fingers recognize.
A much better option is to actually let them pick a real username.
To Hex:
string hex = intValue.ToString("X");
To int:
int intValue = int.Parse(hex, System.Globalization.NumberStyles.HexNumber);
I created my own solution for converting int to Hex string and back before I found this answer. Not surprisingly, it's considerably faster than the .net solution since there's less code overhead.
/// <summary>
/// Convert an integer to a string of hexadecimal digits.
/// </summary>
/// <param name="n">The int to convert to Hex representation</param>
/// <param name="len">number of digits in the hex string. Pads with leading zeros.</param>
/// <returns></returns>
private static String IntToHexString(int n, int len)
{
char[] ch = new char[len--];
for (int i = len; i >= 0; i--)
{
ch[len - i] = ByteToHexChar((byte)((uint)(n >> 4 * i) & 15));
}
return new String(ch);
}
/// <summary>
/// Convert a byte to a hexadecimal char
/// </summary>
/// <param name="b"></param>
/// <returns></returns>
private static char ByteToHexChar(byte b)
{
if (b < 0 || b > 15)
throw new Exception("IntToHexChar: input out of range for Hex value");
return b < 10 ? (char)(b + 48) : (char)(b + 55);
}
/// <summary>
/// Convert a hexadecimal string to a base 10 integer
/// </summary>
/// <param name="str"></param>
/// <returns></returns>
private static int HexStringToInt(String str)
{
int value = 0;
for (int i = 0; i < str.Length; i++)
{
value += HexCharToInt(str[i]) << ((str.Length - 1 - i) * 4);
}
return value;
}
/// <summary>
/// Convert a hex char to an integer.
/// </summary>
/// <param name="ch"></param>
/// <returns></returns>
private static int HexCharToInt(char ch)
{
if (ch < 48 || (ch > 57 && ch < 65) || ch > 70)
throw new Exception("HexCharToInt: input out of range for Hex value");
return (ch < 58) ? ch - 48 : ch - 55;
}
Timing code:
static void Main(string[] args)
{
int num = 3500;
long start = System.Diagnostics.Stopwatch.GetTimestamp();
for (int i = 0; i < 2000000; i++)
if (num != HexStringToInt(IntToHexString(num, 3)))
Console.WriteLine(num + " = " + HexStringToInt(IntToHexString(num, 3)));
long end = System.Diagnostics.Stopwatch.GetTimestamp();
Console.WriteLine(((double)end - (double)start)/(double)System.Diagnostics.Stopwatch.Frequency);
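// Restart the clock; otherwise the second figure would also include the first loop.
start = System.Diagnostics.Stopwatch.GetTimestamp();
for (int i = 0; i < 2000000; i++)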
if (num != Convert.ToInt32(num.ToString("X3"), 16))
Console.WriteLine(i);
end = System.Diagnostics.Stopwatch.GetTimestamp();
Console.WriteLine(((double)end - (double)start)/(double)System.Diagnostics.Stopwatch.Frequency);
Console.ReadLine();
}
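Results, in seconds for 2,000,000 round trips. The .Net figures were timed from the same start timestamp as the first loop, so they include the MyCode time: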
Digits : MyCode : .Net
1 : 0.21 : 0.45
2 : 0.31 : 0.56
4 : 0.51 : 0.78
6 : 0.70 : 1.02
8 : 0.90 : 1.25
PASCAL >> C#
http://files.hddguru.com/download/Software/Seagate/St_mem.pas
An old-school Pascal procedure converted to C#:
/// <summary>
/// Convert a number from decimal to hexadecimal
/// </summary>
/// <param name="w"></param>
/// <returns></returns>
public string MakeHex(int w)
{
// Lookup table of hex digits
char[] b = {'0','1','2','3','4','5','6','7','8','9','A','B','C','D','E','F'};
// Seven digits are produced, so only the low 28 bits of w are represented
char[] S = new char[7];
S[0] = b[(w >> 24) & 15];
S[1] = b[(w >> 20) & 15];
S[2] = b[(w >> 16) & 15];
S[3] = b[(w >> 12) & 15];
S[4] = b[(w >> 8) & 15];
S[5] = b[(w >> 4) & 15];
S[6] = b[w & 15];
return new string(S);
}
Print integer in hex-value with zero-padding (if needed) :
int intValue = 1234;
Console.WriteLine("{0,0:D4} {0,0:X3}", intValue);
https://learn.microsoft.com/en-us/dotnet/standard/base-types/how-to-pad-a-number-with-leading-zeros
Like @Joel C, I think this is an AB problem.
There’s an existing algorithm that I think suits the described need better: uuencode. I’m sure it has many public domain implementations, perhaps tweaked to eliminate characters that look very similar, like 0/O. It is likely to produce significantly shorter strings. I think this is what URL shorteners use.
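As a rough illustration of the "larger alphabet without look-alike characters" idea, here is a sketch of a base-32 encoding. To be clear, this is not uuencode itself; the alphabet, the class name ShortId and its methods are assumptions made for this example only.

```csharp
using System;
using System.Text;

public static class ShortId
{
    // 32 symbols: digits and upper-case letters without the look-alikes 0/O and 1/I.
    private const string Alphabet = "23456789ABCDEFGHJKLMNPQRSTUVWXYZ";

    public static string Encode(int value)
    {
        if (value < 0) throw new ArgumentOutOfRangeException(nameof(value));
        var sb = new StringBuilder();
        do
        {
            // Take the least significant symbol and put it in front of what we have so far.
            sb.Insert(0, Alphabet[value % Alphabet.Length]);
            value /= Alphabet.Length;
        } while (value > 0);
        return sb.ToString();
    }

    public static int Decode(string text)
    {
        if (string.IsNullOrEmpty(text)) throw new FormatException("Empty ID.");
        int result = 0;
        foreach (char c in text)
        {
            int digit = Alphabet.IndexOf(char.ToUpperInvariant(c));
            if (digit < 0) throw new FormatException("Unexpected character: " + c);
            result = checked(result * Alphabet.Length + digit);
        }
        return result;
    }
}
```

With this alphabet 2934 encodes to "4VQ", which is the same length as B76; the saving over hex only shows up for larger values (int.MaxValue takes 7 symbols instead of 8).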
int to hex:
int a = 72;
Console.WriteLine("{0:X}", a);
hex to int:
int b = 0xB76;
Console.WriteLine(b);

Removing HTML from messages safely

I need to output all of the plaintext within messages that may include valid and/or invalid HTML and possibly text that is superficially similar to HTML (i.e. non-HTML text within <...> such as: < why would someone do this?? >).
It is more important that I preserve all non-HTML content than it is to strip out all HTML, but ideally I would like to get rid of as much of the HTML as possible for readability.
I am currently using HTML Agility Pack, but I am having issues where non-HTML within < and > is also removed, for example:
my function:
text = HttpUtility.HtmlDecode(text);
HtmlAgilityPack.HtmlDocument doc = new HtmlAgilityPack.HtmlDocument();
doc.LoadHtml(text);
text = doc.DocumentNode.InnerText;
simple example input*:
this text has <b>weird < things</b> going on >
actual output (unacceptable, lost the word "things"):
this text has weird going on >
desired output:
this text has weird < things going on >
Is there a way to remove only legitimate HTML tags within HTML Agility Pack without stripping out other content that may include < and/or >? Or do I need to manually create a white-list of tags to remove like in this question? That is my fallback solution but I'm hoping there is a more complete solution built in to HTML Agility Pack (or another tool) that I just haven't been able to find.
*(real input often has a ton of unneeded HTML in it, I can give a longer example if that would be useful)
You could use this pattern to replace the HTML tags:
</?[a-zA-Z][a-zA-Z0-9 \"=_-]*?>
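Explanation:
< matches the opening angle bracket
/? matches an optional / (the tag may be a closing tag)
[a-zA-Z] matches a letter as the first character of the tag name
[a-zA-Z0-9 \"=_-]*? matches any number of letters, digits, spaces, or the characters "=_-
> matches the closing angle bracket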
Final Code:
using System;
using System.Text.RegularExpressions;
namespace Regular
{
class Program
{
static void Main(string[] args)
{
string yourText = "this text has <b>weird < things</b> going on >";
string newText = Regex.Replace(yourText, "</?[a-zA-Z][a-zA-Z0-9 \"=_-]*>", "");
Console.WriteLine(newText);
}
}
}
Outputs:
this text has weird < things going on >
@corey-ogburn's comment is not correct, as <[space]abc> would be replaced.
Since you only want to strip the tags from the string, I don't see why you'd need to check whether each tag has a matching start/end tag, but you could easily do that with a regex as well.
It's not always a good choice to use RegEx to parse HTML, but I think it'd be fine if you want to parse simple text.
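One limitation of the character class in the pattern above is that it stops at characters such as /, :, . and ', so tags like <br/> or <a href="http://..."> are left in place. Here is a sketch of a broader variant, assuming it is acceptable to treat any "<" followed by a letter and then anything other than angle brackets as a tag. TagStripper and Strip are illustrative names, not part of any library.

```csharp
using System.Text.RegularExpressions;

public static class TagStripper
{
    // "<" or "</", then a letter, then anything except angle brackets, then ">".
    // Requiring a letter right after "<" keeps text such as "< things" untouched.
    private static readonly Regex TagPattern =
        new Regex("</?[a-zA-Z][^<>]*>", RegexOptions.Compiled);

    public static string Strip(string text)
    {
        return string.IsNullOrEmpty(text) ? text : TagPattern.Replace(text, "");
    }
}
```

Like the original pattern, this still removes non-HTML text that happens to look like a tag (for example <note to self>), so a whitelist of tag names remains the safer option when that matters.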
I wrote this a really long time ago to do something similar. You might use it as a starting point:
You'll need:
using System;
using System.Collections.Generic;
And the code:
/// <summary>
/// Instances of this class strip HTML/XML tags from a string
/// </summary>
public class HTMLStripper
{
public HTMLStripper() { }
public HTMLStripper(string source)
{
m_source = source;
stripTags();
}
private const char m_beginToken = '<';
private const char m_endToken = '>';
private const char m_whiteSpace = ' ';
private enum tokenType
{
nonToken = 0,
beginToken = 1,
endToken = 2,
escapeToken = 3,
whiteSpace = 4
}
private string m_source = string.Empty;
private string m_stripped = string.Empty;
private string m_tagName = string.Empty;
private string m_tag = string.Empty;
private Int32 m_startpos = -1;
private Int32 m_endpos = -1;
private Int32 m_currentpos = -1;
private IList<string> m_skipTags = new List<string>();
private bool m_tagFound = false;
private bool m_tagsStripped = false;
/// <summary>
/// Gets or sets the source string.
/// </summary>
/// <value>
/// The source string.
/// </value>
public string source { get { return m_source; } set { clear(); m_source = value; stripTags(); } }
/// <summary>
/// Gets the string stripped of HTML tags.
/// </summary>
/// <value>
/// The string.
/// </value>
public string stripped { get { return m_stripped; } set { } }
/// <summary>
/// Gets or sets a value indicating whether [HTML tags were stripped].
/// </summary>
/// <value>
/// <c>true</c> if [HTML tags were stripped]; otherwise, <c>false</c>.
/// </value>
public bool tagsStripped { get { return m_tagsStripped; } set { } }
/// <summary>
/// Adds the name of an HTML tag to skip stripping (leave in the text).
/// </summary>
/// <param name="value">The value.</param>
public void addSkipTag(string value)
{
if (value.Length > 0)
{
// Trim start and end tokens from skipTags if present and add to list
CharEnumerator tmpScanner = value.GetEnumerator();
string tmpString = string.Empty;
while (tmpScanner.MoveNext())
{
if (tmpScanner.Current != m_beginToken && tmpScanner.Current != m_endToken) { tmpString += tmpScanner.Current; }
}
if (tmpString.Length > 0) { m_skipTags.Add(tmpString); }
}
}
/// <summary>
/// Clears this instance.
/// </summary>
public void clear()
{
m_source = string.Empty;
m_tag = string.Empty;
m_startpos = -1;
m_endpos = -1;
m_currentpos = -1;
m_tagsStripped = false;
}
/// <summary>
/// Clears all.
/// </summary>
public void clearAll()
{
this.clear();
m_skipTags.Clear();
}
/// <summary>
/// Strips the HTML tags.
/// </summary>
private void stripTags()
{
// Preserve source and make a copy for stripping
m_stripped = m_source;
// Find first tag
getNext();
// If there are any tags (if next tag is string.Empty we are at EOS)...
if (m_tagName != string.Empty)
{
do
{
// If the tag we found is not to be skipped...
if (!m_skipTags.Contains(m_tagName))
{
// Remove tag from string
m_stripped = m_stripped.Remove(m_startpos, m_endpos - m_startpos + 1);
m_tagsStripped = true;
}
// Get next tag, rinse and repeat (if next tag is string.Empty we are at EOS)
getNext();
} while (m_tagName != string.Empty);
}
}
/// <summary>
/// Steps the pointer to the next HTML tag.
/// </summary>
private void getNext()
{
m_tagFound = false;
m_tag = string.Empty;
m_tagName = string.Empty;
bool beginTokenFound = false;
CharEnumerator scanner = m_stripped.GetEnumerator();
// If we're not at the beginning of the string, move the enumerator to the appropriate location in the string
if (m_currentpos != -1)
{
Int32 index = 0;
do
{
scanner.MoveNext();
index += 1;
} while (index < m_currentpos + 1);
}
while (!m_tagFound && m_currentpos + 1 < m_stripped.Length)
{
// Find next begin token
while (scanner.MoveNext())
{
m_currentpos += 1;
if (evaluateChar(scanner.Current) == tokenType.beginToken)
{
m_startpos = m_currentpos;
beginTokenFound = true;
break;
}
}
// If a begin token is found, find next end token
if (beginTokenFound)
{
while (scanner.MoveNext())
{
m_currentpos += 1;
// If we find another begin token before finding an end token we are not in a tag
if (evaluateChar(scanner.Current) == tokenType.beginToken)
{
m_tagFound = false;
beginTokenFound = true;
break;
}
// If the char immediately following a begin token is a white space we are not in a tag
if (m_currentpos - m_startpos == 1 && evaluateChar(scanner.Current) == tokenType.whiteSpace)
{
m_tagFound = false;
beginTokenFound = true;
break;
}
// End token found
if (evaluateChar(scanner.Current) == tokenType.endToken)
{
m_endpos = m_currentpos;
m_tagFound = true;
break;
}
}
}
if (m_tagFound)
{
// Found a tag, get the info for this tag
m_tag = m_stripped.Substring(m_startpos, (m_endpos + 1) - m_startpos);
m_tagName = m_stripped.Substring(m_startpos + 1, m_endpos - m_startpos - 1);
// If this tag is to be skipped, we do not want to reset the position within the string
// Also, if we are at the end of the string (EOS) we do not want to reset the position
if (!m_skipTags.Contains(m_tagName) && m_currentpos != stripped.Length)
{
m_currentpos = -1;
}
}
}
}
/// <summary>
/// Evaluates the next character.
/// </summary>
/// <param name="value">The value.</param>
/// <returns>tokenType</returns>
private tokenType evaluateChar(char value)
{
tokenType returnValue = new tokenType();
switch (value)
{
case m_beginToken:
returnValue = tokenType.beginToken;
break;
case m_endToken:
returnValue = tokenType.endToken;
break;
case m_whiteSpace:
returnValue = tokenType.whiteSpace;
break;
default:
returnValue = tokenType.nonToken;
break;
}
return returnValue;
}
}

How to Trim exactly 1 whitespace after splitting the string

I have made a program that evaluates a string by splitting it at a pipe character. The strings are randomly generated, and sometimes whitespace is part of what needs to be evaluated.
HftiVfzRIDBeotsnU uabjvLPC  | LstHCfuobtv eVzDUBPn jIRfai
This string has the same length on either side (there are 2 spaces on the left side of the pipe, and the first of them is part of the data). My problem comes when I have to trim the space on both sides of the pipe (I do this after splitting).
Is there some way of making sure that I trim only a single space instead of all of them?
My code so far:
foreach (string s in str)
{
int bugCount = 0;
string[] info = s.Split('|');
string testCase = info[0].TrimEnd();
char[] testArr = testCase.ToCharArray();
string debugInfo = info[1].TrimStart();
char[] debugArr = debugInfo.ToCharArray();
int arrBound = debugArr.Count();
for (int i = 0; i < arrBound; i++)
if (testArr[i] != debugArr[i])
bugCount++;
if (bugCount <= 2 && bugCount != 0)
Console.WriteLine("Low");
if (bugCount <= 4 && bugCount != 0)
Console.WriteLine("Medium");
if (bugCount <= 6 && bugCount != 0)
Console.WriteLine("High");
if (bugCount > 6)
Console.WriteLine("Critical");
else
Console.WriteLine("Done");
}
Console.ReadLine();
You have 2 options.
If there is always 1 space before and after the pipe, split on {space}|{space}.
myInput.Split(new[]{" | "},StringSplitOptions.None);
Otherwise, instead of using TrimStart() and TrimEnd(), use Substring.
var split = myInput.Split('|');
var s1 = split[0].EndsWith(" ")
? split[0].Substring(0, split[0].Length - 1)
: split[0];
var s2 = split[1].StartsWith(" ")
? split[1].Substring(1) // to end of line
: split[1];
Note, there is some complexity here - if the pipe has no space around it, but the last/first character is a legitimate (data) space character the above will cut it off. You need more logic, but hopefully this will get you started!
There is no way to tell the Trim family of methods to stop after removing a given number of characters.
In the general case, you'd need to do it manually: inspect the parts returned by Split, check their first/last characters, and take a substring to get the correct part.
However, in your case there's a much simpler way: Split can also take a string as the separator, or even a set of strings:
string[] info = s.Split(new []{ " | " }, StringSplitOptions.None);
// or even
string[] info = s.Split(new []{ " | ", " |", "| ", "|" }, StringSplitOptions.None);
That should take care of the single spaces around the pipe | character by simply treating them as a part of the separator.
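A small sketch of that approach, wrapped in a helper so the effect on data spaces is easy to check. PipeSplitter and SplitPair are names made up for this example; the separators are listed longest first because, at any given position, the first separator in the array that matches is the one that gets used.

```csharp
using System;

public static class PipeSplitter
{
    // " | " is tried first, so exactly one space on each side belongs to the separator
    // and any further spaces stay in the data.
    public static string[] SplitPair(string line)
    {
        return line.Split(new[] { " | ", " |", "| ", "|" }, StringSplitOptions.None);
    }
}
```

For an input with two spaces before the pipe, such as "ab  | cd", this returns "ab " (with its data space intact) and "cd".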
This is a string extension that trims exactly count spaces (and leaves the string unchanged if fewer than count spaces are present), just in case.
public static class StringExtension
{
/// <summary>
/// Trim space at the end of string for count times
/// </summary>
/// <param name="input"></param>
/// <param name="count">number of space at the end to trim</param>
/// <returns></returns>
public static string TrimEnd(this string input, int count = 1)
{
string result = input;
if (count <= 0)
{
return result;
}
if (result.EndsWith(new string(' ', count)))
{
result = result.Substring(0, result.Length - count);
}
return result;
}
/// <summary>
/// Trim space at the start of string for count times
/// </summary>
/// <param name="input"></param>
/// <param name="count">number of space at the start to trim</param>
/// <returns></returns>
public static string TrimStart(this string input, int count = 1)
{
string result = input;
if (count <= 0)
{
return result;
}
if (result.StartsWith(new string(' ', count)))
{
result = result.Substring(count);
}
return result;
}
}
Usage in Main:
static void Main(string[] args)
{
string a = "1234  "; // two trailing spaces
string a1 = a.TrimEnd(1); // returns "1234 " (one space left)
string a2 = a.TrimEnd(2); // returns "1234"
string a3 = a.TrimEnd(3); // returns "1234  " (unchanged, fewer than 3 spaces)
string b = "  5678"; // two leading spaces
string b1 = b.TrimStart(1); // returns " 5678" (one space left)
string b2 = b.TrimStart(2); // returns "5678"
string b3 = b.TrimStart(3); // returns "  5678" (unchanged, fewer than 3 spaces)
}

OpenXML replace text in all document

I have the piece of code below. I'd like to replace the text "Text1" with "NewText", and that works. But when "Text1" is inside a table, it is no longer replaced.
I'd like to make this replacement throughout the whole document.
using (WordprocessingDocument doc = WordprocessingDocument.Open(String.Format("c:\\temp\\filename.docx"), true))
{
var body = doc.MainDocumentPart.Document.Body;
foreach (var para in body.Elements<Paragraph>())
{
foreach (var run in para.Elements<Run>())
{
foreach (var text in run.Elements<Text>())
{
if (text.Text.Contains("##Text1##"))
text.Text = text.Text.Replace("##Text1##", "NewText");
}
}
}
}
Your code does not work because the table element (w:tbl) is not contained in
a paragraph element (w:p). See the following MSDN article for more information.
The Text class (serialized as w:t) usually represents literal text within a Run element in a
word document. So you could simply search for all w:t elements (Text class) and replace your
tag if the text element (w:t) contains your tag:
using (WordprocessingDocument doc = WordprocessingDocument.Open("yourdoc.docx", true))
{
var body = doc.MainDocumentPart.Document.Body;
foreach (var text in body.Descendants<Text>())
{
if (text.Text.Contains("##Text1##"))
{
text.Text = text.Text.Replace("##Text1##", "NewText");
}
}
}
Borrowing from other answers in various places, here is a solution that overcomes four main obstacles:
Remove any high-range Unicode characters from the replacement string that Word cannot read (from bad user input).
Search for the find text across multiple runs or text elements within a paragraph (Word often breaks a single sentence into several text runs).
Allow line breaks in the replacement text, so that multi-line text can be inserted into the document.
Accept any node as the starting point of the search, so that the search can be restricted to that part of the document (such as the body, the header, the footer, a specific table, table row, or table cell).
I am sure advanced scenarios such as bookmarks and complex nesting will need further changes, but it works for the basic Word documents I have run into so far. It is much more helpful to me than disregarding runs altogether, or running a RegEx over the entire file with no ability to target a specific TableCell or document part.
Example Usage:
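var body = document.MainDocumentPart.Document.Body;
// ReplaceText works on one paragraph at a time; ToList() takes a snapshot because ReplaceText can add elements
foreach (var paragraph in body.Descendants<Paragraph>().ToList())
WordTools.ReplaceText(paragraph, replace, with);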
The code:
using System;
using System.Collections.Generic;
using System.Linq;
using DocumentFormat.OpenXml;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Wordprocessing;
namespace My.Web.Api.OpenXml
{
public static class WordTools
{
/// <summary>
/// Find/replace within the specified paragraph.
/// </summary>
/// <param name="paragraph"></param>
/// <param name="find"></param>
/// <param name="replaceWith"></param>
public static void ReplaceText(Paragraph paragraph, string find, string replaceWith)
{
var texts = paragraph.Descendants<Text>();
for (int t = 0; t < texts.Count(); t++)
{ // figure out which Text element within the paragraph contains the starting point of the search string
Text txt = texts.ElementAt(t);
for (int c = 0; c < txt.Text.Length; c++)
{
var match = IsMatch(texts, t, c, find);
if (match != null)
{ // now replace the text
string[] lines = replaceWith.Replace(Environment.NewLine, "\r").Split('\n', '\r'); // handle any lone n/r returns, plus newline.
int skip = lines[lines.Length - 1].Length - 1; // will jump to end of the replacement text, it has been processed.
if (c > 0)
lines[0] = txt.Text.Substring(0, c) + lines[0]; // has a prefix
if (match.EndCharIndex + 1 < texts.ElementAt(match.EndElementIndex).Text.Length)
lines[lines.Length - 1] = lines[lines.Length - 1] + texts.ElementAt(match.EndElementIndex).Text.Substring(match.EndCharIndex + 1);
txt.Space = new EnumValue<SpaceProcessingModeValues>(SpaceProcessingModeValues.Preserve); // in case your value starts/ends with whitespace
txt.Text = lines[0];
// remove any extra texts.
for (int i = t + 1; i <= match.EndElementIndex; i++)
{
texts.ElementAt(i).Text = string.Empty; // clear the text
}
// if 'with' contained line breaks we need to add breaks back...
if (lines.Count() > 1)
{
OpenXmlElement currEl = txt;
Break br;
// append more lines
var run = txt.Parent as Run;
for (int i = 1; i < lines.Count(); i++)
{
br = new Break();
run.InsertAfter<Break>(br, currEl);
currEl = br;
txt = new Text(lines[i]);
run.InsertAfter<Text>(txt, currEl);
t++; // skip to this next text element
currEl = txt;
}
c = skip; // new line
}
else
{ // continue to process same line
c += skip;
}
}
}
}
}
/// <summary>
/// Determine if the texts (starting at element t, char c) exactly contain the find text
/// </summary>
/// <param name="texts"></param>
/// <param name="t"></param>
/// <param name="c"></param>
/// <param name="find"></param>
/// <returns>null or the result info</returns>
static Match IsMatch(IEnumerable<Text> texts, int t, int c, string find)
{
int ix = 0;
for (int i = t; i < texts.Count(); i++)
{
for (int j = c; j < texts.ElementAt(i).Text.Length; j++)
{
if (find[ix] != texts.ElementAt(i).Text[j])
{
return null; // element mismatch
}
ix++; // match; go to next character
if (ix == find.Length)
return new Match() { EndElementIndex = i, EndCharIndex = j }; // full match with no issues
}
c = 0; // reset char index for next text element
}
return null; // ran out of text, not a string match
}
/// <summary>
/// Defines a match result
/// </summary>
class Match
{
/// <summary>
/// Last matching element index containing part of the search text
/// </summary>
public int EndElementIndex { get; set; }
/// <summary>
/// Last matching char index of the search text in last matching element
/// </summary>
public int EndCharIndex { get; set; }
}
} // class
} // namespace
public static class OpenXmlTools
{
// filters control characters but allows only properly-formed surrogate sequences
private static Regex _invalidXMLChars = new Regex(
@"(?<![\uD800-\uDBFF])[\uDC00-\uDFFF]|[\uD800-\uDBFF](?![\uDC00-\uDFFF])|[\x00-\x08\x0B\x0C\x0E-\x1F\x7F-\x9F\uFEFF\uFFFE\uFFFF]",
RegexOptions.Compiled);
/// <summary>
/// removes any unusual unicode characters that can't be encoded into XML which give exception on save
/// </summary>
public static string RemoveInvalidXMLChars(string text)
{
if (string.IsNullOrEmpty(text)) return "";
return _invalidXMLChars.Replace(text, "");
}
}
Maybe this solution is easier
using (WordprocessingDocument wordDoc = WordprocessingDocument.Open(document, true))
{
string docText = null;
//1. Copy all the file into a string
using (StreamReader sr = new StreamReader(wordDoc.MainDocumentPart.GetStream()))
docText = sr.ReadToEnd();
//2. Use regular expression to replace all text
Regex regexText = new Regex(find);
docText = regexText.Replace(docText, replace);
//3. Write the changed string into the file again
using (StreamWriter sw = new StreamWriter(wordDoc.MainDocumentPart.GetStream(FileMode.Create)))
sw.Write(docText);
}

Best way to reverse string in c#

Is this the right method to reverse a string? I'm planning to use it to reverse a string like: Products » X1 » X3 to X3 « X1 « Products
I want it to be a global function which can be used elsewhere.
public static string ReverseString(string input, string separator, string outSeparator)
{
string result = String.Empty;
string[] temp = Regex.Split(input, separator, RegexOptions.IgnoreCase);
Array.Reverse(temp);
for (int i = 0; i < temp.Length; i++)
{
result += temp[i] + " " + outSeparator + " ";
}
return result;
}
How about:
String.Join(" « ", "Products » X1 » X3".Split(new[]{" » "},
StringSplitOptions.None).Reverse().ToArray());
EDIT: The updated version will work if the components contain spaces (e.g. "Foo Products » X1 » X3")
Yes that seems to be ok.
About StringBuilder:
No need to use StringBuilder unless there are usually more than 4-5 elements after the split. If there are usually fewer than that, then string concatenation is fine.
You should use a StringBuilder rather than just string aggregation, especially if this is going to be used a lot.
You can also use String.Join() to put a delimited string array back together.
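Putting Split, Array.Reverse and String.Join together gives a reusable function along the lines the question asks for. This is a sketch under a couple of assumptions: the separator is matched literally rather than as a regular expression (unlike Regex.Split in the question), and the names BreadcrumbTools and ReverseSegments are made up for illustration.

```csharp
using System;

public static class BreadcrumbTools
{
    // Splits input on separator, reverses the pieces, and joins them with outSeparator.
    // Because String.Join only puts the separator between items, no trailing separator is added.
    public static string ReverseSegments(string input, string separator, string outSeparator)
    {
        string[] parts = input.Split(new[] { separator }, StringSplitOptions.None);
        Array.Reverse(parts);
        return string.Join(outSeparator, parts);
    }
}
```

BreadcrumbTools.ReverseSegments("Products » X1 » X3", " » ", " « ") returns "X3 « X1 « Products".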
I used the following:
/// <summary>
/// From BReusable
/// </summary>
/// <typeparam name="T"></typeparam>
/// <param name="items"></param>
/// <param name="toStringFunc"></param>
/// <param name="seperator"></param>
/// <returns></returns>
public static string ToJoinedString<T>(this IList<T> items, Func<int, T, string> toStringFunc, string seperator)
{
var sb = new StringBuilder();
for (int i = 0; i < items.Count(); i++)
{
sb.Append((i != 0 ? seperator : String.Empty) + toStringFunc(i,items[i]));
}
return sb.ToString();
}
public static string ToStringFromCharArray(this IEnumerable<char> items)
{
return items.ToList().ToJoinedString((i, x) => x.ToString(), string.Empty);
}
with stringValue.Reverse().ToStringFromCharArray();
