Wednesday, April 11, 2012

Facebook is not able to scrape my URL

The HTML structure of my page is given below. I have added the Open Graph (og) meta tags, but Facebook is still not able to scrape any information from my site.



<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Frameset//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-frameset.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xmlns:fb="http://www.facebook.com/2008/fbml">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<title>My Site</title>
<meta content="This is my title" property="og:title">
<meta content="This is my description" property="og:description">
<meta content="http://ia.media-imdb.com/images/rock.jpg" property="og:image">
<meta content="<MYPAGEID>" property="fb:page_id">
.......
</head>
<body>
.....


When I enter the URL in the Facebook debugger (https://developers.facebook.com/tools/debug), I get the following messages:



Scrape Information
Response Code 404

Critical Errors That Must Be Fixed
Bad Response Code URL returned a bad HTTP response code.


Errors that must be fixed

Missing Required Property The 'og:url' property is required, but not present.
Missing Required Property The 'og:type' property is required, but not present.
Missing Required Property The 'og:title' property is required, but not present.


Open Graph Warnings That Should Be Fixed
Inferred Property The 'og:url' property should be explicitly provided, even if a value can be inferred from other tags.
Inferred Property The 'og:title' property should be explicitly provided, even if a value can be inferred from other tags.
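
The "Missing Required Property" errors can be checked against the markup locally. Below is a minimal sketch in Python, assuming the four basic properties the Open Graph protocol names as required (og:title, og:type, og:image, og:url). The names OpenGraphCollector, missing_required and SAMPLE_HEAD are made up for this illustration, and the sample head is a trimmed copy of the one above.

```python
from html.parser import HTMLParser

# Basic properties the Open Graph protocol lists as required (assumption:
# the debugger checks at least these).
REQUIRED = ("og:title", "og:type", "og:image", "og:url")


class OpenGraphCollector(HTMLParser):
    """Collects <meta property="og:..." content="..."> pairs."""

    def __init__(self):
        super().__init__()
        self.properties = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        name = attributes.get("property")
        if name and name.startswith("og:"):
            self.properties[name] = attributes.get("content") or ""


def missing_required(html):
    """Return the required og: properties that are absent or empty."""
    collector = OpenGraphCollector()
    collector.feed(html)
    return [name for name in REQUIRED if not collector.properties.get(name)]


# Trimmed copy of the head section from the post.
SAMPLE_HEAD = """
<head>
<title>My Site</title>
<meta content="This is my title" property="og:title">
<meta content="This is my description" property="og:description">
<meta content="http://ia.media-imdb.com/images/rock.jpg" property="og:image">
</head>
"""

if __name__ == "__main__":
    print(missing_required(SAMPLE_HEAD))  # prints ['og:type', 'og:url']
```

For the markup above this reports og:type and og:url as absent, which matches two of the debugger's errors. The debugger also reports og:title as missing even though the tag is present, which would be consistent with the scraper receiving the 404 response rather than this page.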


Why is Facebook not reading the meta tag information? The page is publicly accessible and is not hidden behind a login.



UPDATE



OK, I did a bit of debugging and this is what I found. I am using the PHP CodeIgniter framework, and I have an .htaccess rule in my directory that removes index.php from the URL.



So, when I give the Facebook debugger (https://developers.facebook.com/tools/debug) the URL without index.php, Facebook shows a 404, but when I give it the URL with index.php, it is able to parse my page.
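
The same with/without index.php comparison can be reproduced outside the debugger. Below is a rough sketch in Python, assuming the crawler identifies itself with the facebookexternalhit User-Agent string. status_for is a hypothetical helper, and the URLs are taken from the command line because the real ones are not shown here.

```python
import sys
import urllib.error
import urllib.request

# User-Agent commonly sent by Facebook's crawler; used here only to mimic it.
CRAWLER_UA = "facebookexternalhit/1.1 (+http://www.facebook.com/externalhit_uatext.php)"


def status_for(url, open_url=urllib.request.urlopen, user_agent=CRAWLER_UA):
    """Return the HTTP status code the server sends for `url`.

    `open_url` is injectable so the function can be exercised without
    touching the network.
    """
    request = urllib.request.Request(url, headers={"User-Agent": user_agent})
    try:
        with open_url(request, timeout=10) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # urllib raises for 4xx/5xx responses; the code is still what we want.
        return error.code


if __name__ == "__main__":
    # Pass both forms of the page URL as arguments, e.g. with and
    # without index.php. With no arguments, nothing is requested.
    for url in sys.argv[1:]:
        print(status_for(url), url)
```

If the two forms of the URL return different status codes here as well, the difference comes from the server's handling of the request path rather than from the meta tags.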



Now, how do I make Facebook scrape the content when the URL doesn't contain index.php?



This is my .htaccess file:



<IfModule mod_rewrite.c>
RewriteEngine On
RewriteBase /

#Removes access to the system folder by users.
#Additionally this will allow you to create a System.php controller,
#previously this would not have been possible.
#'system' can be replaced if you have renamed your system folder.
RewriteCond %{REQUEST_URI} ^system.*
RewriteRule ^(.*)$ /index.php?/$1 [L]

#When your application folder isn't in the system folder
#This snippet prevents user access to the application folder
#Submitted by: Fabdrol
#Rename 'application' to your applications folder name.
RewriteCond %{REQUEST_URI} ^application.*
RewriteRule ^(.*)$ /index.php?/$1 [L]

#Checks to see if the user is attempting to access a valid file,
#such as an image or css document, if this isn't true it sends the
#request to index.php
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.*)$ index.php?/$1 [L]
</IfModule>

<IfModule !mod_rewrite.c>
# If we don't have mod_rewrite installed, all 404's
# can be sent to index.php, and everything works as normal.
# Submitted by: ElliotHaughin

ErrorDocument 404 /index.php
</IfModule>
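
To make the catch-all rule at the end of the first block concrete, here is a simplified model of it in Python. It is only an illustration of the pattern and substitution, not of Apache's full rewrite processing: it ignores query strings and the re-evaluation of rules after a rewrite. The function name rewrite and the example paths are hypothetical.

```python
import re

# Same pattern as in "RewriteRule ^(.*)$ index.php?/$1 [L]".
RULE_PATTERN = re.compile(r"^(.*)$")


def rewrite(path, exists_on_disk=False):
    """Model the final RewriteRule for one request.

    `path` is the request path relative to the .htaccess directory,
    without the leading slash. `exists_on_disk` stands in for the two
    RewriteCond checks (!-f and !-d).
    """
    if exists_on_disk:
        return path  # real files and directories are served as-is
    match = RULE_PATTERN.match(path)
    if match is None:
        return path
    # $1 is appended after "?/", so the route is passed as a query string.
    return "index.php?/" + match.group(1)


if __name__ == "__main__":
    print(rewrite("welcome/index"))                      # index.php?/welcome/index
    print(rewrite("css/site.css", exists_on_disk=True))  # css/site.css
```

In this model the route reaches index.php in the query string (index.php?/welcome/index), which is a different form from a URL typed with index.php in the path.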



