500 Response on Robots.txt Fetch Can Impact Rich Results

Google’s John Mueller received feedback about a bug in how Search Console validates rich results. Google will drop images from rich results when the CDN hosting those images mishandles a request for a non-existent robots.txt file. The bug was that Search Console and Google’s Rich Results Test failed to alert the publisher to the error and still gave the structured data a successful validation.

A bug in a programming context is when a program behaves in an unexpected way. A bug is not always a coding problem. As in this case, it can be a failure to anticipate a problem, which in turn leads to unintended results.

The publisher who asked the question tried using Google tools to diagnose why their rich results were missing and was surprised to find that they weren’t helpful for this particular error.

Although this issue affected image previews in Google’s recipe rich results, it could cause problems in other situations as well.

So it is good to be aware of this problem as it may appear in other ways.

Image previews for recipe rich results are gone

The person asking the question gave some background on what happened:

“We fell into a tiger trap, I would say, in terms of rich recipe results.

We have hundreds of thousands of recipes cataloged and there is a lot of traffic coming from the recipe gallery.

And then… it stopped over a period of time.

And all the metadata checked out and Google Search Console was saying… This is all rich recipe content, it’s all good, it can be shown.

We finally noticed that in the preview, when previewing the result, the image was missing.

And it appears that there was a change at Google and that if a robots.txt file was required to retrieve images, nothing we could see in the tools was actually saying anything was invalid.

So, it seems a little awkward, when you check something to say “Is this a valid rich recipe result?” And it says yeah, it’s great, it’s absolutely amazing, we have all the meta.

And we checked all the URLs and all the images are correct, but it turns out behind the scenes, there’s a new requirement that you have a robots.txt file.”

John Mueller asked:

“How do you mean you should have a robots.txt file?”

The person who asked the question answered:

“What we found is, if you request a robots.txt file from our CDN, it gives you like a 500.

When we put the robots.txt file in there, the previews started showing up right away.

This includes crawling and putting it in a static location, I think.

So operationally, we found that adding a robots.txt file did the job.”

John Mueller responded:

“Yes, okay.

So from our point of view, it’s not that robots.txt is required. But it must have an appropriate result code.

So if you don’t have one, it should return 404.

If you have one, we can read that clearly.

But if you return a server error for a robots.txt file, our systems will assume that there may be a server problem and we won’t crawl.

And that’s kind of been the way things have been since the beginning.

But these kinds of issues especially when you’re on a CDN and it’s a separate hostname, sometimes it’s really hard to figure it out.

And I imagine the rich results test, at least as far as I know, it focuses on the content on the HTML page.

So the JSON-LD markup you have in there, it’s probably not checking to see if the images are actually fetchable.

And then, if it can’t be fetched, then of course we can’t use it in the carousel either.

So that might be something we need to figure out how to highlight it better.”
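Because the rich results test reportedly looks only at the markup on the page, a publisher may want to list the hostnames that serve the images named in that markup and check each one separately. The following is a minimal sketch of that idea, not a Google tool: it pulls image URLs out of JSON-LD blocks and derives the robots.txt URL for each image host. The class and function names are hypothetical, and it assumes images appear under an "image" key as a string, a list, or an object with a "url" field.

```python
import json
from html.parser import HTMLParser
from urllib.parse import urlsplit


class JsonLdCollector(HTMLParser):
    """Collects the raw text of every <script type="application/ld+json"> block."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self.blocks.append("")

    def handle_endtag(self, tag):
        if tag == "script":
            self._in_jsonld = False

    def handle_data(self, data):
        if self._in_jsonld:
            self.blocks[-1] += data


def _urls_from_image_value(value):
    """Yield URL strings from an "image" value (string, list, or object with "url")."""
    if isinstance(value, str):
        yield value
    elif isinstance(value, list):
        for item in value:
            yield from _urls_from_image_value(item)
    elif isinstance(value, dict) and isinstance(value.get("url"), str):
        yield value["url"]


def find_image_urls(node):
    """Recursively yield URL strings stored under any "image" key."""
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "image":
                yield from _urls_from_image_value(value)
            else:
                yield from find_image_urls(value)
    elif isinstance(node, list):
        for item in node:
            yield from find_image_urls(item)


def robots_urls_for_images(html):
    """Return the sorted robots.txt URLs of every host that serves a JSON-LD image."""
    collector = JsonLdCollector()
    collector.feed(html)
    robots_urls = set()
    for block in collector.blocks:
        try:
            data = json.loads(block)
        except json.JSONDecodeError:
            continue  # skip malformed JSON-LD rather than failing the whole page
        for url in find_image_urls(data):
            parts = urlsplit(url)
            if parts.scheme in ("http", "https") and parts.netloc:
                robots_urls.add(f"{parts.scheme}://{parts.netloc}/robots.txt")
    return sorted(robots_urls)
```

Calling robots_urls_for_images() on a recipe page's HTML returns one robots.txt URL per image host, which makes a separate CDN hostname easy to spot.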

A 500 CDN Robots.txt error response can cause problems

This is one of those SEO issues that are hard to diagnose but can cause serious problems, as the person asking the question discovered.

Requesting a non-existent robots.txt file should normally result in a 404 response code, which indicates that the file does not exist.

So if a robots.txt request generates a 500 response code, this is an indication that something on the server or CMS is configured incorrectly.
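The distinction Mueller described can be checked by hand. The sketch below is only an illustration, using Python's standard library: it requests a robots.txt URL and labels the status code along the lines discussed above. The function names, labels, and User-Agent string are made up for this example, and a CDN may answer a script differently than it answers Googlebot, so treat the result as a hint rather than proof.

```python
import urllib.error
import urllib.request


def classify_robots_status(status):
    """Label a robots.txt HTTP status along the lines Mueller described."""
    if 200 <= status < 300:
        return "readable"      # a robots.txt file exists and can be read
    if status == 404:
        return "absent"        # no robots.txt file, which is an acceptable answer
    if 500 <= status < 600:
        return "server-error"  # crawlers may assume a server problem and hold off
    return "other"             # anything else needs a manual look


def fetch_robots_status(robots_url, timeout=10):
    """Return the HTTP status code for robots_url (hypothetical helper).

    Network failures such as DNS errors or timeouts raise urllib.error.URLError.
    """
    request = urllib.request.Request(
        robots_url,
        headers={"User-Agent": "robots-status-check"},  # illustrative value only
    )
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # urllib raises HTTPError for 4xx and 5xx responses; the code is the status
        return err.code
```

For example, classify_robots_status(fetch_robots_status("https://cdn.example.com/robots.txt")) would return "server-error" for a CDN behaving like the one in this story, where cdn.example.com stands in for the real image hostname.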

The short term solution is to upload a robots.txt file.

But it might be a good idea to dive into your CMS or server to check for the underlying problem.

500 response code for a Robots.txt fetch

Losing image previews in recipe rich results because a CDN returns a 500 error response may be a rare issue.

A 500 server error response sometimes occurs when something unexpected or missing in the code causes the server to stop processing and return a 500 response code.

For example, if you edit a PHP file and forget to indicate the end of a section of code, this may cause the server to give up processing the code and issue a 500 response.

Whatever the reason for the error response when Google tried to fetch a robots.txt file, this is a good issue to keep in mind for that rare situation when it happens to you.

Citation

CDN for images, glitch results and rich recipes

Watch at the 51:45 minute mark
