I have a sitemap file for search engines:
<?xml version="1.0" encoding="utf-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.sitemaps.org/schemas/sitemap/0.9 http://www.sitemaps.org/schemas/sitemap/0.9/sitemap.xsd">
<url>
<loc>http://site.com/</loc>
</url>
<url>
<loc>http://site.com/about</loc>
</url>
<url>
<loc>http://site.com/contacts</loc>
</url>
<url>
<loc>http://site.com/articles/article1.html</loc>
</url>
<url>
<loc>http://site.com/users/123</loc>
</url>
</urlset>
How to insert a new node?
When I use xDoc.Element("url"), xDoc.Element("urlset"), xDoc.Element("xml") or xDoc.Elements(...), I always get null. It's very strange.
The code below shows how to navigate within the xml and how to insert a new node
XDocument xDoc = XDocument.Load("sitemap.xml");
XNamespace ns = xDoc.Root.Name.Namespace;
// Navigation within the xml
XElement urlset = xDoc.Element(ns + "urlset");
Console.WriteLine(urlset.Name.LocalName); // -> "urlset"
IEnumerable<XElement> urls = urlset.Elements(ns + "url");
foreach (var url in urls)
{
XElement loc = url.Element(ns + "loc");
Console.WriteLine(loc.Value); // -> "http://site.com/", ...
}
// Inserting a new node under "urlset" node
urlset.Add(
new XElement(ns + "url",
new XElement(ns + "loc",
"http://site.com/questions/4183526")));
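To see why the lookups by local name alone return null, here is a minimal sketch. The inline XML string and the class name NamespaceLookupDemo are made up for illustration and stand in for the sitemap file; the point is only that the default xmlns declaration makes the namespace part of every element name.

```csharp
using System;
using System.Xml.Linq;

static class NamespaceLookupDemo
{
    // Hypothetical inline sitemap standing in for sitemap.xml.
    const string Xml =
        "<urlset xmlns=\"http://www.sitemaps.org/schemas/sitemap/0.9\">" +
        "<url><loc>http://site.com/</loc></url>" +
        "</urlset>";

    // Returns true when a lookup by local name alone finds the root element.
    public static bool FoundWithoutNamespace()
    {
        XDocument xDoc = XDocument.Parse(Xml);
        return xDoc.Element("urlset") != null;
    }

    // Returns true when the lookup includes the root's namespace.
    public static bool FoundWithNamespace()
    {
        XDocument xDoc = XDocument.Parse(Xml);
        XNamespace ns = xDoc.Root.Name.Namespace;
        return xDoc.Element(ns + "urlset") != null;
    }

    static void Main()
    {
        Console.WriteLine(FoundWithoutNamespace()); // False
        Console.WriteLine(FoundWithNamespace());    // True
    }
}
```

Reading the namespace from xDoc.Root.Name.Namespace, as the answer does, avoids hard-coding the sitemap URI.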
I'm trying to create an XML sitemap from a List. If the ImageName property of a list item is empty, the XML should omit the image block; if the property is not empty, it should be used to build the image block correctly.
This is what I'm currently using to build the XML:
string imageURL = "https://images.ontheedgebrands.com/images/";
XNamespace xsi = "http://www.w3.org/2001/XMLSchema-instance";
XNamespace gs = "http://www.sitemaps.org/schemas/sitemap/0.9";
XNamespace nsImage = "http://www.google.com/schemas/sitemap-image/1.1";
XDocument doc = new XDocument(
new XElement(gs + "urlset",
new XAttribute("xmlns", gs),
new XAttribute(XNamespace.Xmlns + "image", nsImage),
new XAttribute(XNamespace.Xmlns + "xsi", xsi),
new XAttribute(xsi + "schemaLocation", "http://www.sitemaps.org/schemas/sitemap/0.9 http://www.sitemaps.org/schemas/sitemap/0.9/sitemap.xsd"),
from rw in rwlist select
new XElement(gs + "url",
new XElement(gs + "loc", site + rw.SEOURL),
new XElement(nsImage + "image",
new XElement(nsImage + "loc", imageURL + rw.ImageName)),
new XElement(gs + "changefreq", "weekly"),
new XElement(gs + "priority", rw.Priority)
)));
doc.Save(file);
Depending on whether the rw.ImageName property of a list item is set and not empty, I want the XML to be built dynamically and look something like this:
<url>
<loc>https://www.budk.com/$10-$20-3231</loc>
<changefreq>weekly</changefreq>
<priority>1.0</priority>
</url>
<url>
<loc>https://www.budk.com/$20-$50-3232</loc>
<changefreq>weekly</changefreq>
<priority>1.0</priority>
</url>
<url>
<loc>https://www.budk.com/-308-Black-Lower-Receiver-Kit--80-Percent-36485</loc>
<image:image>
<image:loc>https://images.ontheedgebrands.com/images/A52-PO2331.jpg</image:loc>
</image:image>
<changefreq>weekly</changefreq>
<priority>1.0</priority>
</url>
<url>
<loc>https://www.budk.com/-40-Cal-Blowgun-Broadhead-Dart-25-Per-Pack-20739</loc>
<image:image>
<image:loc>https://images.ontheedgebrands.com/images/A08-SFBGBHD25.jpg</image:loc>
</image:image>
<changefreq>weekly</changefreq>
<priority>1.0</priority>
</url>
The first two objects in the list had an empty ImageName property, and the third and fourth objects had an ImageName, so they were built differently.
With the current code the XML looks like this. I don't want the first two XML blocks to contain an image element, because the ImageName property in the List is empty:
<url>
<loc>https://www.budk.com/$10-$20-3231</loc>
<image:image>
<image:loc>https://images.ontheedgebrands.com/images/</image:loc>
</image:image>
<changefreq>weekly</changefreq>
<priority>1.0</priority>
</url>
<url>
<loc>https://www.budk.com/$20-$50-3232</loc>
<image:image>
<image:loc>https://images.ontheedgebrands.com/images/</image:loc>
</image:image>
<changefreq>weekly</changefreq>
<priority>1.0</priority>
</url>
<url>
<loc>https://www.budk.com/-308-Black-Lower-Receiver-Kit--80-Percent-36485</loc>
<image:image>
<image:loc>https://images.ontheedgebrands.com/images/A52-PO2331.jpg</image:loc>
</image:image>
<changefreq>weekly</changefreq>
<priority>1.0</priority>
</url>
<url>
<loc>https://www.budk.com/-40-Cal-Blowgun-Broadhead-Dart-25-Per-Pack-20739</loc>
<image:image>
<image:loc>https://images.ontheedgebrands.com/images/A08-SFBGBHD25.jpg</image:loc>
</image:image>
<changefreq>weekly</changefreq>
<priority>1.0</priority>
</url>
You can use the fact that LINQ to XML ignores null values when adding them. So all you need to do is change this code that unconditionally creates the image element:
new XElement(nsImage + "image",
new XElement(nsImage + "loc", imageURL + rw.ImageName)),
... to this:
string.IsNullOrEmpty(rw.ImageName)
? null
: new XElement(nsImage + "image", new XElement(nsImage + "loc", imageURL + rw.ImageName)),
You could also potentially simplify the code a bit by setting up those XName values beforehand:
XNamespace nsImage = "http://www.google.com/schemas/sitemap-image/1.1";
XName imageXName = nsImage + "image";
XName imageLocXName = nsImage + "loc";
...
// In the argument list
string.IsNullOrEmpty(rw.ImageName)
? null
: new XElement(imageXName, new XElement(imageLocXName, imageURL + rw.ImageName)),
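As a small self-contained check of the null-skipping behaviour, the sketch below builds one url element with and without an image name. The helper BuildUrl, the class name and the example.com URLs are hypothetical and are not part of the original code.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

static class OptionalImageDemo
{
    static readonly XNamespace Gs = "http://www.sitemaps.org/schemas/sitemap/0.9";
    static readonly XNamespace NsImage = "http://www.google.com/schemas/sitemap-image/1.1";

    // Builds one <url> entry; imageName may be null or empty.
    public static XElement BuildUrl(string loc, string imageName)
    {
        return new XElement(Gs + "url",
            new XElement(Gs + "loc", loc),
            // A null content argument is skipped, so no image element is emitted.
            string.IsNullOrEmpty(imageName)
                ? null
                : new XElement(NsImage + "image",
                    new XElement(NsImage + "loc", imageName)),
            new XElement(Gs + "changefreq", "weekly"));
    }

    static void Main()
    {
        Console.WriteLine(BuildUrl("https://example.com/a", "").Elements().Count());      // 2
        Console.WriteLine(BuildUrl("https://example.com/b", "b.jpg").Elements().Count()); // 3
    }
}
```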
I have a sitemap in the format below.
I want to delete the complete <url> node whose <loc> matches a given value.
For example, where a node has a <loc> with the value http://www.my.com/en/flight1, I want to delete that <url> node together with its children: loc, then lastmod, then priority, then changefreq.
<url>
<loc>http://www.my.com/en/flight1
</loc>
<lastmod>2015-03-05</lastmod>
<priority>0.5</priority>
<changefreq>never</changefreq>
</url>
<url>
<loc>
http://www.my.com/en/flight2
</loc>
<lastmod>2015-03-05</lastmod>
<priority>0.5</priority>
<changefreq>never</changefreq>
</url>
<url>
<loc>
http://www.my.com/en/flight3
</loc>
<lastmod>2015-03-05</lastmod>
<priority>0.5</priority>
<changefreq>never</changefreq>
</url>
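If you're using C#, you should use System.Xml.Linq (XDocument).
You can remove a node like so:
XDocument document = XDocument.Load(/*URI*/);
// Namespace of the root element, or XNamespace.None if the file declares none
XNamespace ns = document.Root.Name.Namespace;
// ToList() takes a snapshot so that removing nodes does not disturb the enumeration;
// Trim() copes with the line breaks inside <loc> in the sample
var elements = document.Root.Elements(ns + "url")
.Where(e => e.Element(ns + "loc") != null && e.Element(ns + "loc").Value.Trim() == "http://www.my.com/en/flight1")
.ToList();
foreach (var url in elements)
{
url.Remove();
}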
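An alternative sketch uses the Remove() extension method that System.Xml.Linq provides for sequences of nodes, which copies the sequence before detaching anything. The helper name RemoveByLoc and the inline XML are assumptions made for the example, and the changes still need to be saved back with document.Save afterwards.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

static class RemoveUrlDemo
{
    // Removes every <url> whose trimmed <loc> text equals target;
    // returns the number of <url> elements that remain.
    public static int RemoveByLoc(XDocument document, string target)
    {
        XNamespace ns = document.Root.Name.Namespace;
        document.Root.Elements(ns + "url")
            .Where(u => ((string)u.Element(ns + "loc") ?? "").Trim() == target)
            .Remove(); // extension method with snapshot semantics
        return document.Root.Elements(ns + "url").Count();
    }

    static void Main()
    {
        // Hypothetical two-entry sitemap mirroring the question's layout.
        XDocument doc = XDocument.Parse(
            "<urlset>" +
            "<url><loc>http://www.my.com/en/flight1\n</loc><priority>0.5</priority></url>" +
            "<url><loc>\nhttp://www.my.com/en/flight2\n</loc><priority>0.5</priority></url>" +
            "</urlset>");
        Console.WriteLine(RemoveByLoc(doc, "http://www.my.com/en/flight1")); // 1
    }
}
```

Removing the url element removes its loc, lastmod, priority and changefreq children with it, so they do not need to be deleted one by one.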
I am trying to make a video sitemap for a video website, but I get an XmlException: "The ':' character, hexadecimal value 0x3A, cannot be included in a name." This is because of the colons (video:video) in the name.
XNamespace gs = "http://www.sitemaps.org/schemas/sitemap/0.9";
XDocument doc = new XDocument(
new XElement(gs + "urlset",
(from p in db.Videos
orderby p.video_id descending
select new XElement(gs + "url",
new XElement(gs + "loc", "http://www.example.com/video/" + p.video_id + "-" + p.video_query),
new XElement(gs + "video:video",
new XElement(gs + "video:thumbnail_loc", "http://cdn.example.com/thumb/" + p.video_image)
))).Take(50)));
doc.Save(@"C:\video_sitemap.xml");
Please tell me how to add colons in the name to generate dynamic xml sitemap using LINQ to SQL.
Thanks and Regards.
UPDATE:
The video sitemap should look like the example on the Google Video Sitemap page.
video here is the alias for a namespace. As per the example:
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
xmlns:video="http://www.google.com/schemas/sitemap-video/1.1">
<url>
<loc>http://www.example.com/videos/some_video_landing_page.html</loc>
<video:video>
...
</video:video>
</url>
</urlset>
So you just need two XNamespace values - one for the sitemap namespace and one for the video namespace:
XNamespace siteMapNs = "http://www.sitemaps.org/schemas/sitemap/0.9";
XNamespace videoNs = "http://www.google.com/schemas/sitemap-video/1.1";
XDocument doc = new XDocument(
new XElement(siteMapNs + "urlset",
(from p in db.Videos
orderby p.video_id descending
select new XElement(siteMapNs + "url",
new XElement(siteMapNs + "loc",
"http://www.example.com/video/" + p.video_id + "-" + p.video_query),
new XElement(videoNs + "video",
new XElement(videoNs + "thumbnail_loc",
"http://cdn.example.com/thumb/" + p.video_image)
)
)
).Take(50)
)
);
EDIT: If you really want this to use an alias of video for the namespace, you can declare it in your root element:
XDocument doc = new XDocument(
new XElement(siteMapNs + "urlset",
new XAttribute(XNamespace.Xmlns + "video", videoNs),
(from p in db.Videos
...
@Jon Skeet:
Your code generated this sitemap:
<?xml version="1.0" encoding="utf-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
<url>
<loc>http://www.example.com/video/1-video_query_1</loc>
<video xmlns="http://www.google.com/schemas/sitemap-video/1.1">
<thumbnail_loc>http://cdn.example.com/thumb/7665518872558731.jpg</thumbnail_loc>
</video>
</url>
<url>
<loc>http://www.example.com/video/2-video_query_2</loc>
<video xmlns="http://www.google.com/schemas/sitemap-video/1.1">
<thumbnail_loc>http://cdn.jigers.com/thumb/6921835997871337.jpg</thumbnail_loc>
</video>
</url>
But it should look like this:
<?xml version="1.0" encoding="utf-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9" xmlns:video="http://www.google.com/schemas/sitemap-video/1.1">
<url>
<loc>http://www.example.com/video/1-video_query_1</loc>
<video:video>
<video:thumbnail_loc>http://cdn.example.com/thumb/7665518872558731.jpg</video:thumbnail_loc>
</video:video>
</url>
<url>
<loc>http://www.example.com/video/2-video_query_2</loc>
<video:video>
<video:thumbnail_loc>http://cdn.jigers.com/thumb/6921835997871337.jpg</video:thumbnail_loc>
</video:video>
</url>
</urlset>
The video:video elements have colons, and there should be an xmlns:video="http://www.google.com/schemas/sitemap-video/1.1" namespace declaration on urlset.
Please take a look.
You seem to be confusing aliases with namespaces.
Figure out what the namespace for 'video' is. Create an XNamespace for it (like you did with gs).
Then do videoNamespace + "thumbnail_loc".
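The difference between an alias and a namespace can be checked directly. In this sketch (the class and member names are invented for the example) the same element is parsed once with a video prefix and once with a default namespace, and both end up with the same XName.

```csharp
using System;
using System.Xml.Linq;

static class PrefixVsNamespaceDemo
{
    const string VideoNs = "http://www.google.com/schemas/sitemap-video/1.1";

    // The same element written with a prefix and with a default namespace.
    public static readonly XElement Prefixed = XElement.Parse(
        "<video:video xmlns:video=\"" + VideoNs + "\" />");
    public static readonly XElement Defaulted = XElement.Parse(
        "<video xmlns=\"" + VideoNs + "\" />");

    // True when both spellings resolve to the same namespace-qualified name.
    public static bool SameName()
    {
        return Prefixed.Name == Defaulted.Name;
    }

    static void Main()
    {
        Console.WriteLine(Prefixed.Name); // {http://www.google.com/schemas/sitemap-video/1.1}video
        Console.WriteLine(SameName());    // True
    }
}
```

The prefix only affects how the document is written out; a namespace-aware XML parser should treat both forms as the same element.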
I'm trying to create a structure like
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="namespace1"
xmlns:image="namespace2">
<url>
<loc>http://www.example.com/foo.html</loc>
<image:image>
<image:loc>http://example.com/image.jpg</image:loc>
</image:image>
</url>
</urlset>
Any ideas on how to create the image elements using XLinq?
Thanks
You're looking for the XNamespace class.
For example:
XNamespace image = "namespace2";
var element = new XElement(image + "image",
new XElement(image + "loc", someUrl)
);
I'm not sure if you can get exactly what you're after, but this (run in LINQPad, hence the Dump() call):
XNamespace ns1 = "namespace1";
XNamespace ns2 = "namespace2";
new XElement(ns1 + "urlset",
new XElement(ns1 + "loc", "http://www.example.com/foo.htm"),
new XElement(ns2 + "image",
new XElement(ns2 + "loc", "http://example.com/image.jpg"))).Dump();
Should get you the equivalent.
<urlset xmlns="namespace1">
<loc>http://www.example.com/foo.htm</loc>
<image xmlns="namespace2">
<loc>http://example.com/image.jpg</loc>
</image>
</urlset>
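For the prefixed form asked about in the question, one possible sketch is to declare the image prefix on the root with an XNamespace.Xmlns attribute and to wrap loc and image in a url element. The class name ImageSitemapDemo is hypothetical; namespace1 and namespace2 are the placeholders from the question.

```csharp
using System;
using System.Xml.Linq;

static class ImageSitemapDemo
{
    public static XElement Build()
    {
        XNamespace ns1 = "namespace1";
        XNamespace ns2 = "namespace2";
        return new XElement(ns1 + "urlset",
            // Declaring the prefix on the root lets descendants in ns2
            // serialize with the image: prefix instead of a repeated xmlns.
            new XAttribute(XNamespace.Xmlns + "image", ns2),
            new XElement(ns1 + "url",
                new XElement(ns1 + "loc", "http://www.example.com/foo.html"),
                new XElement(ns2 + "image",
                    new XElement(ns2 + "loc", "http://example.com/image.jpg"))));
    }

    static void Main()
    {
        // Prints the element tree; the nested elements should appear as image:image and image:loc.
        Console.WriteLine(Build());
    }
}
```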
I have a LINQ to XML query that does not work if the urlset element of a Google sitemap I created is populated with attributes, but works fine if no attributes are present.
Can't query:
<?xml version="1.0" encoding="utf-8"?>
<urlset xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://www.sitemaps.org/schemas/sitemap/0.9
http://www.sitemaps.org/schemas/sitemap/0.9/sitemap.xsd"
xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
<url>
<loc>http://www.foo.com/index.htm</loc>
<lastmod>2010-05-11</lastmod>
<changefreq>monthly</changefreq>
<priority>1.0</priority>
</url>
<url>
<loc>http://www.foo.com/about.htm</loc>
<lastmod>2010-05-11</lastmod>
<changefreq>monthly</changefreq>
<priority>1.0</priority>
</url>
</urlset>
Can query:
<?xml version="1.0" encoding="utf-8"?>
<urlset>
<url>
<loc>http://www.foo.com/index.htm</loc>
<lastmod>2010-05-11</lastmod>
<changefreq>monthly</changefreq>
<priority>1.0</priority>
</url>
<url>
<loc>http://www.foo.com/about.htm</loc>
<lastmod>2010-05-11</lastmod>
<changefreq>monthly</changefreq>
<priority>1.0</priority>
</url>
</urlset>
The query:
XDocument xDoc = XDocument.Load(@"C:\Test\sitemap.xml");
var sitemapUrls = (from l in xDoc.Descendants("url")
select l.Element("loc").Value);
foreach (var item in sitemapUrls)
{
Console.WriteLine(item.ToString());
}
What would be the reason for this?
See the xmlns="..." attribute on the urlset element? It puts every element in the document into that namespace, so you need to specify the namespace in your query. Test the following modification of your code:
XDocument xDoc = XDocument.Load(@"C:\Test\sitemap.xml");
XNamespace ns = "http://www.sitemaps.org/schemas/sitemap/0.9";
var sitemapUrls = (from l in xDoc.Descendants(ns + "url")
select l.Element(ns + "loc").Value);
foreach (var item in sitemapUrls)
{
Console.WriteLine(item.ToString());
}