Image indexation and other new features of Simple XML sitemap 2.10

26 Sep 2017
sitemap.xml

New features of Simple XML sitemap

Version 2.10 of Simple XML sitemap is mainly a feature release with only a few minor bugs fixed. The new features are:

  • the implementation of the changefreq parameter
  • the ability to set an interval at which to regenerate the sitemap
  • the ability to customize XML output
  • the ability to add arbitrary links to the sitemap
  • image indexation

See the 8.x-2.10 release page for details.
A newer version has since been released; please make sure to visit the project page.

Image indexation

Simple XML sitemap can now create Google image sitemaps by indexing all images attached to entities. This includes images uploaded through the image field as well as inline images uploaded through the WYSIWYG editor. The inclusion of images can be set at the entity type and bundle level and overridden on a per-entity basis, which gives you fine-grained control.

Please bear in mind that all images attached to entities get indexed regardless of their file system location (public or private). Note also that the original images get indexed, not the derived image styles. Consider this before indexing entities with many high-resolution images, as it could increase traffic.

Indexation of custom link images has not made it into this release, but the feature is already available in the development version of the module.

Adding arbitrary links to the sitemap

Most use cases call for the inclusion of internal links, which can be achieved by adding entity links to the index. For non-entity pages like views, it has been possible to add custom links through the UI or the API. In both cases, however, the system only allows internal links that are accessible to anonymous users. The new version of the module provides a way to add any link to the index, even ones Drupal does not know about:

    /**
     * Use this hook to add arbitrary links to the sitemap.
     *
     * @param array &$arbitrary_links
     */
    function hook_simple_sitemap_arbitrary_links_alter(&$arbitrary_links) {

      // Add an arbitrary link.
      $arbitrary_links[] = [
        'url' => 'http://example.com',
        'priority' => '0.5',
        'lastmod' => '2012-10-12T17:40:30+02:00',
        'changefreq' => 'weekly',
        'images' => [
          ['path' => 'http://path-to-image.png'],
        ],
      ];
    }

As the example shows, all properties of the link like priority/lastmod/changefreq can be defined as well.
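As a rough illustration of the hook in a custom module, the sketch below lists a page together with the images found in one directory. The module name `mymodule`, the helper `mymodule_build_gallery_link()`, the directory and the URL are all hypothetical; only the hook name and the array keys are taken from the example above.

```php
<?php

/**
 * Hypothetical helper: builds one arbitrary link for a directory of images.
 *
 * @param string $directory
 *   File system path to scan (assumed to be served at $base_url).
 * @param string $base_url
 *   Public URL of the page that lists the images.
 *
 * @return array|null
 *   A link array in the shape used above, or NULL if no images were found.
 */
function mymodule_build_gallery_link(string $directory, string $base_url): ?array {
  if (!is_dir($directory)) {
    return NULL;
  }
  $images = [];
  // Keep only files with a common image extension.
  foreach (scandir($directory) ?: [] as $file) {
    if (preg_match('/\.(png|jpe?g|gif)$/i', $file)) {
      $images[] = ['path' => rtrim($base_url, '/') . '/' . rawurlencode($file)];
    }
  }
  if (!$images) {
    return NULL;
  }
  return [
    'url' => $base_url,
    'priority' => '0.5',
    'changefreq' => 'weekly',
    'images' => $images,
  ];
}

/**
 * Implements hook_simple_sitemap_arbitrary_links_alter().
 */
function mymodule_simple_sitemap_arbitrary_links_alter(&$arbitrary_links) {
  // Assumed locations; adjust both to your own setup.
  $link = mymodule_build_gallery_link('/var/www/html/gallery', 'http://example.com/gallery');
  if ($link !== NULL) {
    $arbitrary_links[] = $link;
  }
}
```

Keeping the directory scan in a separate helper means it can be tried outside of Drupal; the hook itself only appends the result.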

To alter links shortly before they get transformed to XML output, there is still the possibility to use the following:

    /**
     * Alter the generated link data before the sitemap is saved.
     * This hook gets invoked for every sitemap chunk generated.
     *
     * @param array &$links
     *   Array containing multilingual links generated for each path to be indexed.
     */
    function hook_simple_sitemap_links_alter(&$links) {

      // Remove German URL for a certain path in the hreflang sitemap.
      foreach ($links as $key => $link) {
        if ($link['path'] === 'node/1') {
          // Remove 'loc' URL if it points to a German site.
          if ($link['langcode'] === 'de') {
            unset($links[$key]);
          }
          // If this 'loc' URL points to a non-German site, make sure to remove
          // its German alternate URL. isset() avoids a notice if there is none.
          elseif (isset($link['alternate_urls']['de'])) {
            unset($links[$key]['alternate_urls']['de']);
          }
        }
      }
    }
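The same hook could also be used to attach images to a link that is already in the index. The following is a minimal sketch, assuming each link carries its images under an `images` key in the same `['path' => URL]` shape as in the arbitrary links example; the module name `mymodule`, the path `gallery` and the image URLs are made up for illustration.

```php
<?php

/**
 * Implements hook_simple_sitemap_links_alter().
 */
function mymodule_simple_sitemap_links_alter(&$links) {
  // Hypothetical mapping of internal paths to additional image URLs.
  $extra_images = [
    'gallery' => [
      'http://example.com/gallery/first.png',
      'http://example.com/gallery/second.png',
    ],
  ];

  foreach ($links as $key => $link) {
    // Skip links without a path or without extra images configured.
    if (!isset($link['path'], $extra_images[$link['path']])) {
      continue;
    }
    foreach ($extra_images[$link['path']] as $image_url) {
      // Assumption: images are stored as ['path' => URL] entries.
      $links[$key]['images'][] = ['path' => $image_url];
    }
  }
}
```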

Basic alteration of the XML output

The following two new hooks can now be used to alter the XML output:

    /**
     * Alters the sitemap attributes shortly before XML document generation.
     * Attributes can be added, changed and removed.
     *
     * @param array &$attributes
     */
    function hook_simple_sitemap_attributes_alter(&$attributes) {

      // Remove the xhtml attribute e.g. if no xhtml sitemap elements are present.
      unset($attributes['xmlns:xhtml']);
    }

    /**
     * Alters attributes of the sitemap index shortly before XML document generation.
     * Attributes can be added, changed and removed.
     *
     * @param array &$index_attributes
     */
    function hook_simple_sitemap_index_attributes_alter(&$index_attributes) {

      // Add some attribute to the sitemap index.
      $index_attributes['name'] = 'value';
    }
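To make the effect of the first hook more tangible, here is a small standalone sketch of how an attribute array could end up on the opening `<urlset>` tag. It is not the module's actual rendering code: `mymodule_render_urlset_open_tag()` is a hypothetical helper, and the default attribute list is an assumption based on the standard sitemap, xhtml and Google image namespaces.

```php
<?php

/**
 * Stand-in implementation of hook_simple_sitemap_attributes_alter().
 */
function mymodule_simple_sitemap_attributes_alter(&$attributes) {
  // Drop the xhtml namespace, as in the example above.
  unset($attributes['xmlns:xhtml']);
}

/**
 * Hypothetical helper: renders an opening <urlset> tag from attributes.
 */
function mymodule_render_urlset_open_tag(array $attributes): string {
  $tag = '<urlset';
  foreach ($attributes as $name => $value) {
    // Escape the value so that quotes cannot break the attribute.
    $tag .= ' ' . $name . '="' . htmlspecialchars($value, ENT_QUOTES | ENT_XML1, 'UTF-8') . '"';
  }
  return $tag . '>';
}

// Assumed defaults; the module may define these differently.
$attributes = [
  'xmlns' => 'http://www.sitemaps.org/schemas/sitemap/0.9',
  'xmlns:xhtml' => 'http://www.w3.org/1999/xhtml',
  'xmlns:image' => 'http://www.google.com/schemas/sitemap-image/1.1',
];
mymodule_simple_sitemap_attributes_alter($attributes);
echo mymodule_render_urlset_open_tag($attributes), PHP_EOL;
```

With the xhtml attribute removed, the printed tag only declares the sitemap and image namespaces.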

Other API changes

The API is now more forgiving, allowing missing link setting arguments when using some of its inclusion altering methods. Here is an example of the simple_sitemap.generator API in action:

    \Drupal::service('simple_sitemap.generator')
      ->saveSetting('remove_duplicates', TRUE)
      ->enableEntityType('node')
      ->setBundleSettings('node', 'page', ['index' => TRUE, 'priority' => 0.5])
      ->removeCustomLinks()
      ->addCustomLink('/some/view/page', ['priority' => 0.5])
      ->generateSitemap();

More documentation can be found here. I hope the new version of this module will be of great use to you!

All info about the project

Comments

One question, if you don't mind:

You say "add any link to the index, even ones Drupal does not know about", but then suggest using a hook. As a result, I'm a bit confused.

I use a directory on my website as a "personal Imgur" - i.e. upload images there through SFTP to post them somewhere, e.g. on forums.

Those are the images "Drupal does not know about", but I would like to have them indexed, as the content is usually rather interesting, unique and on topic (unlike memes on Imgur).

What do I need to do to make Drupal include those images in the sitemap? In most cases those images aren't displayed on my site (i.e. aren't used in the content of any node).

Thanks,
Dmitri

Usually you would not index images by themselves; instead, you would index images as part of a webpage. For this page, it looks like this:

    <url>
      <loc>http://gbyte.co/blog/image-indexation-new-features-simple-xml-sitemap-2.10</loc>
      <xhtml:link href="http://gbyte.co/blog/image-indexation-new-features-simple-xml-sitemap-2.10" hreflang="en" rel="alternate"/>
      <lastmod>2017-12-12T22:36:26+01:00</lastmod>
      <changefreq>weekly</changefreq>
      <priority>0.7</priority>
      <image:image>
        <image:loc>http://gbyte.co/sites/default/files/public/images/blog/sitemap_8_0_0.png</image:loc>
      </image:image>
      <image:image>
        <image:loc>http://gbyte.co/sites/default/files/public/inline-images/bundle_settings_2_0.png</image:loc>
      </image:image>
    </url>

So if you have an accessible index page of these images, you can use hook_simple_sitemap_arbitrary_links_alter to add that page and its images as shown above, or, if it is a routed page, just add it to the index and use hook_simple_sitemap_links_alter to add images to it.

But if you are serious about the image drop functionality, the best approach would be to build it in a way that makes Drupal aware of the images. I implemented this functionality on gbyte.co to share documents and images with my clients. There is some custom code involved, but most of the work is done by these modules:

  • ACL
  • Content Access
  • Download
  • User Protect

ACL and Content Access handle the permissions, Download makes it possible to download all the files attached to an entity. I also implemented group accounts where many people can use the same credentials to log in to their files. To prevent them from editing the account, the User Protect module can be utilized.

I would like to know exactly how XML sitemaps work. I am a .NET developer, but I am confused about how the XML part works exactly.
