Wednesday, April 11, 2012

Facebook is not able to scrape my URL

The HTML structure of my page is given below. I have added the Open Graph (og) meta tags, but Facebook is still unable to scrape any information from my site.

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Frameset//EN" "">
<html xmlns="" xmlns:fb="">
<meta http-equiv="Content-Type" content="text/html; charset=utf-8"></meta>
<title>My Site</title>
<meta content="This is my title" property="og:title">
<meta content="This is my description" property="og:description">
<meta content="" property="og:image">
<meta content="<MYPAGEID>" property="fb:page_id">

When I enter the URL in the Facebook debugger, I get the following messages:

Scrape Information
Response Code 404

Critical Errors That Must Be Fixed
Bad Response Code URL returned a bad HTTP response code.

Errors that must be fixed

Missing Required Property The 'og:url' property is required, but not present.
Missing Required Property The 'og:type' property is required, but not present.
Missing Required Property The 'og:title' property is required, but not present.

Open Graph Warnings That Should Be Fixed
Inferred Property The 'og:url' property should be explicitly provided, even if a value can be inferred from other tags.
Inferred Property The 'og:title' property should be explicitly provided, even if a value can be inferred from other tags.
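The missing-property errors can also be checked locally, independent of the 404. The following is a minimal sketch, not Facebook's actual validation logic: it parses a page's meta tags with Python's standard `html.parser` and lists which of the four properties the Open Graph protocol marks as required (`og:title`, `og:type`, `og:image`, `og:url`) are absent or empty. The `SAMPLE` markup and its `example.com` image URL are placeholders modelled on the snippet above, and the names `OGParser` and `missing_og_properties` are hypothetical.

```python
from html.parser import HTMLParser

# The four properties the Open Graph protocol lists as required.
REQUIRED = ("og:title", "og:type", "og:image", "og:url")


class OGParser(HTMLParser):
    """Collects <meta property="..." content="..."> pairs from a page."""

    def __init__(self):
        super().__init__()
        self.properties = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attrs = dict(attrs)
        prop = attrs.get("property")
        if prop:
            # A missing content attribute is treated as an empty value.
            self.properties[prop] = attrs.get("content") or ""


def missing_og_properties(html):
    """Return the required og: properties that are absent or empty."""
    parser = OGParser()
    parser.feed(html)
    parser.close()
    return [name for name in REQUIRED if not parser.properties.get(name)]


# Placeholder markup modelled on the page head shown in the question.
SAMPLE = """
<title>My Site</title>
<meta content="This is my title" property="og:title">
<meta content="This is my description" property="og:description">
<meta content="http://example.com/logo.png" property="og:image">
"""

print(missing_og_properties(SAMPLE))
```

For markup shaped like the snippet above, this reports `og:type` and `og:url`, which matches two of the properties the debugger complains about. The `og:title` error most likely comes from the 404 response itself, since the tag is present in the page.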

Why is Facebook not reading the meta tag information? The page is publicly accessible and is not hidden behind a login.


OK, I did a bit of debugging and this is what I found. I am using the PHP CodeIgniter framework, and I have an .htaccess rule in my directory that removes index.php from the URL.

So, when I feed the URL to the Facebook debugger without index.php, Facebook shows a 404, but when I feed it the URL with index.php, it is able to parse my page.
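To reproduce this outside the debugger, one option is to request both forms of the URL with the crawler's user-agent string and compare the status codes. The sketch below uses only the standard library and makes several assumptions: the `facebookexternalhit` user-agent string is the one Facebook's crawler is commonly seen sending, `with_front_controller` and `fetch_status` are hypothetical helper names, and the page URL is supplied on the command line.

```python
import sys
import urllib.error
import urllib.request
from urllib.parse import urlsplit, urlunsplit

# Assumption: user-agent string commonly sent by Facebook's crawler.
FACEBOOK_UA = (
    "facebookexternalhit/1.1 "
    "(+http://www.facebook.com/externalhit_uatext.php)"
)


def with_front_controller(url, script="index.php"):
    """Return url with /index.php inserted in front of its path."""
    parts = urlsplit(url)
    path = parts.path
    prefix = "/" + script
    if path == prefix or path.startswith(prefix + "/"):
        return url  # index.php is already part of the path
    if not path.startswith("/"):
        path = "/" + path
    return urlunsplit(parts._replace(path=prefix + path))


def fetch_status(url, user_agent=FACEBOOK_UA, timeout=10):
    """Return the HTTP status code the server sends for url."""
    request = urllib.request.Request(url, headers={"User-Agent": user_agent})
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # 4xx and 5xx responses are raised as exceptions; report the code.
        return err.code


# Only runs when a single http(s) URL is passed on the command line.
if (
    __name__ == "__main__"
    and len(sys.argv) == 2
    and sys.argv[1].startswith("http")
):
    clean_url = sys.argv[1]
    for candidate in (clean_url, with_front_controller(clean_url)):
        print(fetch_status(candidate), candidate)
```

If the clean URL returns 404 here as well, the problem lies in the server's rewrite handling rather than in anything specific to Facebook. If it returns 200, the server may be treating the crawler differently from this script.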

Now, how do I make Facebook scrape the content when the URL doesn't contain index.php?

These are my .htaccess rules:

<IfModule mod_rewrite.c>
RewriteEngine On
RewriteBase /

#Removes access to the system folder by users.
#Additionally this will allow you to create a System.php controller,
#previously this would not have been possible.
#'system' can be replaced if you have renamed your system folder.
RewriteCond %{REQUEST_URI} ^system.*
RewriteRule ^(.*)$ /index.php?/$1 [L]

#When your application folder isn't in the system folder
#This snippet prevents user access to the application folder
#Submitted by: Fabdrol
#Rename 'application' to your applications folder name.
RewriteCond %{REQUEST_URI} ^application.*
RewriteRule ^(.*)$ /index.php?/$1 [L]

#Checks to see if the user is attempting to access a valid file,
#such as an image or css document, if this isn't true it sends the
#request to index.php
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.*)$ index.php?/$1 [L]
</IfModule>

<IfModule !mod_rewrite.c>
# If we don't have mod_rewrite installed, all 404's
# can be sent to index.php, and everything works as normal.
# Submitted by: ElliotHaughin

ErrorDocument 404 /index.php
</IfModule>
