Author |
Message |
PHrEEkie
Subject Matter Expert
Joined: Feb 23, 2004
Posts: 358
|
Posted:
Mon Dec 06, 2010 10:09 am |
|
Guardian2003 wrote: | Code:
function is_spider(){
$spiders = array(
'Googlebot', 'Openbot', 'Yahoo', 'Slurp', 'msnbot',
'ia_archiver', 'Lycos', 'Scooter', 'AltaVista', 'Teoma', 'Gigabot',
'Googlebot-Mobile'
);
// Loop through each spider and check if it appears in
// the User Agent
foreach ($spiders as $spider)
{
if (eregi($spider, $_SERVER['HTTP_USER_AGENT']))
{ return TRUE; }
}
return FALSE;
}
| |
Nice and simple, easily updated with more, and again, there are a lot of websites that specialize in User Agent strings, so that list can be kept up to date. 'Googlebot' will already catch 'Googlebot-Mobile', so that entry can be removed unless you were going to process mobile bots separately somewhere else. Since the function just returns TRUE or FALSE, I assumed that isn't the case.
The only thing I would change is to drop the deprecated eregi function and use preg_match instead:
Code:
function is_spider() {
    $spiders = array(
        'Googlebot',
        'Openbot',
        'Yahoo',
        'Slurp',
        'msnbot',
        'ia_archiver',
        'Lycos',
        'Scooter',
        'AltaVista',
        'Teoma',
        'Gigabot',
    );
    // Loop through each spider and check if it appears in
    // the User Agent
    foreach ($spiders as $spider) {
        if (preg_match('/' . preg_quote($spider, '/') . '/i', $_SERVER['HTTP_USER_AGENT'])) {
            return TRUE;
        }
    }
    return FALSE;
}
|
The $spider variable is preg_quoted, just in case a bot name has special characters in it (as 'Googlebot-Mobile' would, i.e., '-').
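To make the effect of preg_quote concrete, here is a minimal sketch. The helper name `spider_pattern` and the bot name 'my.bot' are made up for illustration; only the pattern construction is taken from the function above.

```php
<?php
// Hypothetical helper (not part of RavenNuke): builds the same
// case-insensitive pattern the loop above builds for one bot name.
function spider_pattern($spider) {
    return '/' . preg_quote($spider, '/') . '/i';
}

// Without quoting, the dot in a name like 'my.bot' would match any
// character, so 'myXbot' would count as a hit. Quoted, it does not.
$unquoted = preg_match('/my.bot/i', 'myXbot');
$quoted   = preg_match(spider_pattern('my.bot'), 'myXbot');

// A name containing '!' is matched literally once quoted.
$yahoo = preg_match(
    spider_pattern('Yahoo! Slurp'),
    'Mozilla/5.0 (compatible; Yahoo! Slurp)'
);
```

For plain alphanumeric names the quoting changes nothing, so it costs little to keep it in as a safety net when the list is edited later.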
- Keith |
_________________ PHP - Breaking your legacy scripts one build at a time. |
|
|
fkelly
Former Moderator in Good Standing
Joined: Aug 30, 2005
Posts: 3312
Location: near Albany NY
|
Posted:
Mon Dec 06, 2010 10:30 am |
|
In my opinion, the best place for this would be in mainfile. We could have a function is_bot, similar to is_user or is_admin. However, in terms of ease of updating ... unless we are going to have people hacking code ... we should probably think about putting a $bot_array in rnconfig.php. Actually, I suppose that involves hacking code too .. you have to change rnconfig .. but at least that's where we centrally make changes.
I am going to put this thread into our internal tracking system and see what our leadership wants to do about getting this into a release, preferably 2.41. I can probably get the coding done within a few weeks elapsed time.
One question ... so we want to exclude bots from updating impressions. Do we want to have two impressions fields ... one without bots and one with? That would require a database change. Or do we just want to change the way we count the existing impressions field? I would vote for the latter in terms of simplicity of implementation. When we make database changes we always wind up with people getting out of sync for years to come (they don't run the rndb_upgrade script). |
|
|
|
spasticdonkey
RavenNuke(tm) Development Team
![](modules/Forums/images/avatars/48fb116845dfecf66294c.gif)
Joined: Dec 02, 2006
Posts: 1693
Location: Texas, USA
|
Posted:
Mon Dec 06, 2010 10:43 am |
|
sounds cool to me. I don't think we need to track bot impressions for the ad module... although the statistics module should be upgraded to use the new $bot_array ![Wink](modules/Forums/images/smiles/icon_wink.gif) |
|
|
|
Guardian2003
Site Admin
![](modules/Forums/images/avatars/125904890252d880f79f312.png)
Joined: Aug 28, 2003
Posts: 6799
Location: Ha Noi, Viet Nam
|
Posted:
Mon Dec 06, 2010 3:19 pm |
|
I placed it in mainfile as that is usually included in every script and therefore you should be able to call the function from anywhere. |
|
|
|
fkelly
|
Posted:
Tue Dec 07, 2010 9:37 am |
|
Notes to self and anyone else who wants to help.
Theme.php calls ads(0) which is the zero position of ads. It actually echoes out the results of that call.
The ads() function is in mainfile. Aside from detecting when an ad has used up its impressions and emailing the client, this function divides ads into three classes: code, image, and flash. The original design did not anticipate any clickurls for code and/or flash ads or at least it did not provide any way to put a click url into the banners table or to collect statistics on clicks. For code ads (html or javascript) it simply returns the ad code itself to theme.php where it is echoed in the header area of the page. For Flash it returns the flash image plus some parameters.
For image ads however, mainfile returns:
Code:$ads = '<center><a href="index.php?op=ad_click&bid='.$bid.'" target="_blank"><img src="'.$imageurl.'" border="0" alt="'.$alttext.'" title="'.$alttext.'" /></a></center>';
|
(I am going to replace the centers with divs with style = text-align:center btw)
Here is where it gets tricky. This image/ad code is echoed by theme.php and the rest of the page gets displayed (whatever the user is doing at that time, could be looking at news, displaying a calendar, whatever). So, for an image ad, sitting at the top of your page will be a link that says something like:
Quote: | http://localhost/rn240_extract/index.php?op=ad_click&bid=1 |
When you click on it, index.php will process it, right at the top with the following code:
Code:
if (isset($op) && ($op == 'ad_click') && isset($bid)) {
    $bid = intval($bid);
    $sql = 'SELECT `clickurl` FROM `' . $prefix . '_banner` WHERE `bid`=\'' . $bid . '\'';
    $result = $db->sql_query($sql);
    list($clickurl) = $db->sql_fetchrow($result, MYSQL_NUM);
    if ($result) {
        $result = $db->sql_query('UPDATE `' . $prefix . '_banner` SET `clicks`=clicks+1 WHERE `bid`=\'' . $bid . '\'');
        update_points(21);
    }
    Header('Location: ' . $clickurl);
    die();
}
|
I modified it already a bit to remove the freeresults.
So this is where clicks for image ads get updated. Now, if I modify mainfile to do this for code ads:
Code:if ($ad_class == 'code') {
$ads = '<div style="text-align:center"><a href="index.php?op=ad_click&bid='.$bid.'" target="_blank">'.$ad_code.'</a></div>';
|
And if I phpmyadmin a clickurl (remember the admin screens don't yet let me put a clickurl in for code ads) into the clickurl field for my code ad and if I remove the a href from my pre-existing code ad so that it just contains the text of the link while the actual url is in the clickurl field ... well then I can update clicks for my code ad.
--- and that's just stage 1, sort of understanding how this beast works -- but it is not really going to be a satisfactory solution. Because:
(a) it won't be backward compatible with existing code ads which might have the actual link in them;
(b) at best it would just collect data for one clickurl for a code ad;
(c) the whole "going through index.php" thing is a mess.
So here's what I'm thinking, at a very conceptual level to start with. To maintain backward compatibility we probably should just come up with a new ad class. Kind of a code ad on steroids. Let's call it form_ad. And we probably need a different table structure than the current banners table. This is because:
(a) we want to allow multiple clickurls in an ad. On my sites I already allow a mailto url plus a web site url on my code ads. I just have no way to collect clicks on them. The links are built into the ad code itself with a hrefs.
(b) we want to be able to collect clicks for each clickurl separately.
(c) to get around the limits of the current index.php and header redirection processing, we want to embed the ad in a form.
The way I'm picturing this is that theme.php would still call ads(). For a form_ad we'd return the form code. On the advertising admin screen we'd build up the form by allowing a title, some text, and then submit buttons for any links (clickurls). There could be any number of them. When the form was "submitted" it would go to a PHP program that was explicitly designed to collect clicks and impressions. That would update the banners table and, let's say, a forms_banners table for impressions and clicks and then go to whatever clickurl was selected. And what if there were no link for the ad? What if it were just a display ad? We'd just display a form with no submit button; they'd read it, click somewhere else on the page, and move on.
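A rough sketch of what such a form_ad could look like, purely to make the idea concrete. Every name here (`render_form_ad`, `count_click`, the `ad_form_click.php` handler, the `$links` layout) is hypothetical; nothing like it exists in RavenNuke yet, and the click tally uses a plain array where the real thing would use a table.

```php
<?php
// Hypothetical sketch of a "form_ad". $links maps a link id to
// array('label' => ..., 'url' => ...); an empty array gives a display-only ad.
function render_form_ad($bid, $title, $text, $links, $handler = 'ad_form_click.php') {
    $html  = '<form method="post" action="' . htmlspecialchars($handler, ENT_QUOTES) . '">';
    $html .= '<input type="hidden" name="bid" value="' . intval($bid) . '" />';
    $html .= '<h3>' . htmlspecialchars($title, ENT_QUOTES) . '</h3>';
    $html .= '<p>' . htmlspecialchars($text, ENT_QUOTES) . '</p>';
    // One submit button per clickurl. Only the link id is posted; the handler
    // would look the URL up, count the click, and then redirect.
    foreach ($links as $link_id => $link) {
        $html .= '<button type="submit" name="link_id" value="' . intval($link_id) . '">'
               . htmlspecialchars($link['label'], ENT_QUOTES) . '</button>';
    }
    return $html . '</form>';
}

// Per-link click tally, keyed by "bid:link_id" (in memory for this sketch).
function count_click(&$clicks, $bid, $link_id) {
    $key = intval($bid) . ':' . intval($link_id);
    $clicks[$key] = isset($clicks[$key]) ? $clicks[$key] + 1 : 1;
    return $clicks[$key];
}

$links = array(
    1 => array('label' => 'Web site', 'url' => 'http://example.com/'),
    2 => array('label' => 'Email us', 'url' => 'mailto:sales@example.com'),
);
$form = render_form_ad(7, 'Title & Co', 'Some ad text', $links);
```

Posting only the link id, rather than the URL itself, keeps the redirect target under the site's control and gives each clickurl its own counter.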
By the way, I intend to fix the bot situation too, but that seems trivial as compared to collecting clicks.
Enough for now. Comments welcome. Obviously this won't get done immediately and I'll post more as I make progress. |
|
|
|
Palbin
Site Admin
![](modules/Forums/images/avatars/Dilbert/Dilbert_-_Dogbert_King.gif)
Joined: Mar 30, 2006
Posts: 2583
Location: Pittsburgh, Pennsylvania
|
Posted:
Tue Dec 07, 2010 11:15 am |
|
fkelly wrote: | (I am going to replace the centers with divs with style = text-align:center btw) | Please use a class and not the style attribute. The classes are located in ravennuke.css.
Please move the below piece of code from index.php and the function in mainfile.php to a file within the ads module. Then just include that file in mainfile.php. Also wrap the include in an if file_exists() until we figure out how we will include files from modules that need to be in mainfile. This should not affect any functionality and will make it more modular.
Code:
if (isset($op) && ($op == 'ad_click') && isset($bid)) {
$bid = intval($bid);
$sql = 'SELECT `clickurl` FROM `' . $prefix . '_banner` WHERE `bid`=\'' . $bid . '\'';
$result = $db->sql_query($sql);
list($clickurl) = $db->sql_fetchrow($result, MYSQL_NUM);
if ($result) {
$result = $db->sql_query('UPDATE `' . $prefix . '_banner` SET `clicks`=clicks+1 WHERE `bid`=\'' . $bid . '\'');
update_points(21);
}
Header('Location: ' . $clickurl);
die();
}
|
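The file_exists() wrapper could look roughly like this. The helper name `include_if_present` and the path are assumptions made for this sketch; the thread does not say what the file would be called.

```php
<?php
// Hypothetical helper: include a module file only when it is present, so
// mainfile keeps working on sites where the module has been removed.
// Functions defined in the included file become global; plain variables
// in it would be local to this helper.
function include_if_present($path) {
    if (file_exists($path)) {
        include_once $path;
        return true;
    }
    return false;
}

// Assumed location, for illustration only.
$ads_loaded = include_if_present('modules/Advertising/includes/ads_functions.php');
```

Callers can then check `$ads_loaded` (or `function_exists('ads')`) before trying to show an ad.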
fkelly wrote: |
Code:if ($ad_class == 'code') {
$ads = '<div style="text-align:center"><a href="index.php?op=ad_click&bid='.$bid.'" target="_blank">'.$ad_code.'</div>';
|
And if I phpmyadmin a clickurl (remember the admin screens don't yet let me put a clickurl in for code ads) into the clickurl field for my code ad and if I remove the a href from my pre-existing code ad so that it is just contains the text of the link while the actual url is in the clickurl field ... well then I can update clicks for my code ad.
|
1. Why not just leave the code ads as is and only let them be time based? Just a thought, but feel free to disagree as I have not thought about it much and have never used this module.
2. I do not think this will validate if they put anything more than an image in the ad code.
fkelly wrote: |
So here's what I'm thinking, at a very conceptual level to start with. To maintain backward compatibility we probably should just come up with a new ad class. Kind of a code ad on steroids. Let's call it form_ad. And we probably need a different table structure than the current banners table. This is because:
(a) we want to allow multiple clickurls in an ad. On my sites I already allow a mailto url plus a web site url on my code ads. I just have no way to collect clicks on them. The links are built into the ad code itself with a hrefs.
(b) we want to be able to collect clicks for each clickurl separately.
(c) to get around the limits of the current index.php and header redirection processing, we want to embed the ad in a form.
The way I'm picturing this is that theme.php would still call ads(). For a form_ad we'd return the form code. On the advertising admin screen we'd build up the form by allowing a title, some text, and then submit buttons for any links (clickurls). There could be any number of them. When the form was "submitted" it would go to a PHP program that was explicitly designed to collect clicks and impressions. That would update the banners and let's say forms_banners for impressions and clicks and then go to whatever clickurl was selected And what if there were no link for the ad? What if it were just a display ad? We'd just display a form -- we can get by with no submit ? but they'd read it and click somewhere else on the page and move on.
By the way, I intend to fix the bot situation too, but that seems trivial as compared to collecting clicks.
|
Make sure said file is within the module. Obviously if you do something standalone that does bot recognition etc that can be in mainfile. |
_________________ "Debugging is twice as hard as writing the code in the first place. Therefore, if you write the code as cleverly as possible, you are, by definition, not smart enough to debug it." — Brian W. Kernighan. |
|
|
Guardian2003
|
Posted:
Tue Dec 07, 2010 11:19 am |
|
@ PHrEEkie - Yes eregi is deprecated, I just have not got around to updating the code that I posted.
I forgot to remove Googlebot-Mobile in that example because, as you rightly point out, a match on 'Googlebot' would catch it. The full function I'm using contains additional code that, amongst other things, counts 'mobile' spiders separately to give me an idea of how often those sorts of spiders crawl the site and which content they crawl - purely for data analysis. |
|
|
|
fkelly
|
Posted:
Tue Dec 07, 2010 11:29 am |
|
Quote: | 1. Why not just leave the code ads as is and only let them be time based |
I could live with just time based as the means for expiring ads. However, to help sell ads it helps to be able to give potential advertisers a sense of what number of impressions and how many clicks they could expect. I think we need to accumulate those statistics ... removing bots from the count and keeping count of clicks for non-image based ads where there is a link.
Thanks for the other suggestions Palbin, I will see what I can do with them. |
|
|
|
fkelly
|
Posted:
Fri Dec 10, 2010 3:09 pm |
|
I am going to break development of this up into stages. This reduces risk and the likelihood of errors.
Stage one is going to be to detect bots and exclude them from the count of impressions in ads. This will require changes in rnconfig.php, mainfile and also counter.php. We probably could use the "old" method in counter.php but it makes no sense to do so.
In rnconfig.php right after the allowablehtml array I am adding:
Code:$rn_bots = array('bot','yahoo', 'slurp', 'ia_archiver', 'lycos', 'scooter', 'altavista', 'teoma', 'spider', 'infoseek', 'crawl');
|
I am deliberately lowercasing them as I am going to lowercase the extracts from USER_AGENT ... all tests will be on lowercase values.
I had a previous version using John (Guardian's) code:
Code:$rn_bots = array('bot','yahoo', 'slurp', 'ia_archiver', 'lycos', 'scooter', 'altavista', 'teoma', 'spider', 'infoseek', 'crawl');
|
but looking at AWstats on Cpanel I saw that a lot of bots were identified only with the "bot" string ... plus if we use just "bot" it will pick up any "bot" with a "bot" in it. LOL. But seriously, is there a problem with that?
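One way to get a feel for the answer is to run the list against a few user agent strings. This is only a sketch: `first_bot_match` is a made-up helper, it uses strpos rather than preg_match, and the third sample is a constructed example of a browser whose device name happens to contain "bot".

```php
<?php
// The list as posted for rnconfig.php.
$rn_bots = array('bot', 'yahoo', 'slurp', 'ia_archiver', 'lycos', 'scooter',
                 'altavista', 'teoma', 'spider', 'infoseek', 'crawl');

// Hypothetical helper: returns the first list entry found in the lowercased
// user agent, or false when nothing matches.
function first_bot_match($agent, $bots) {
    $agent = strtolower($agent);
    foreach ($bots as $bot) {
        if (strpos($agent, $bot) !== false) {
            return $bot;
        }
    }
    return false;
}

$samples = array(
    'Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)',
    'msnbot/2.0b (+http://search.msn.com/msnbot.htm)',
    // Constructed example: an ordinary browser on a device named "CUBOT".
    'Mozilla/5.0 (Linux; Android 6.0; CUBOT_NOTE_S) AppleWebKit/537.36',
    'Mozilla/5.0 (Windows NT 6.1; rv:2.0) Gecko/20100101 Firefox/4.0',
);

foreach ($samples as $ua) {
    $hit = first_bot_match($ua, $rn_bots);
    echo ($hit === false ? 'counted' : 'skipped via "' . $hit . '"') . ' <- ' . $ua . "\n";
}
```

So the short entries do catch a wide range of crawlers, at the price of occasionally skipping a real visitor whose user agent contains the same letters. For impression counts that trade-off is probably acceptable, but it is worth knowing about.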
Next in mainfile, I am using a variant of Keith's code in a new function named is_bot. I am putting this right before the ads() function. The code for is_bot is:
Code:function is_bot($agent) {
    global $rn_bots;
    // $agent is the user agent string passed in by the caller
    foreach ($rn_bots as $bot) {
        if (preg_match('/' . preg_quote($bot, '/') . '/i', $agent)) {
            return TRUE;
        }
    }
    return FALSE;
}
|
Later in the ads() function of mainfile I am surrounding the incrementing of impressions with a test for is_bot. That section of code goes like this:
Code: list($bid, $impmade, $imageurl, $clickurl, $alttext) = $db->sql_fetchrow($result);
$bid = intval($bid);
$agent = strtolower($_SERVER['HTTP_USER_AGENT']);
if(!is_bot($agent)) {
$db->sql_query('UPDATE '.$prefix.'_banner SET impmade=impmade+1 WHERE bid=\''.$bid.'\'');
}
|
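Some requests arrive without any User-Agent header, in which case reading `$_SERVER['HTTP_USER_AGENT']` directly raises a notice. A guarded variant might look like the following sketch; `current_agent` and `should_count_impression` are hypothetical names, and treating an empty agent as a bot is an assumption of this example, not something decided in the thread.

```php
<?php
// Hypothetical helper: lowercased user agent, or '' when none was sent.
// $server defaults to $_SERVER but can be passed in for testing.
function current_agent($server = null) {
    if ($server === null) {
        $server = $_SERVER;
    }
    return isset($server['HTTP_USER_AGENT'])
        ? strtolower($server['HTTP_USER_AGENT'])
        : '';
}

// Decide whether an impression should be counted for this agent.
// Assumption: a request with no user agent is not counted, since
// mainstream browsers normally send one.
function should_count_impression($agent, $bots) {
    if ($agent === '') {
        return false;
    }
    foreach ($bots as $bot) {
        if (strpos($agent, $bot) !== false) {
            return false;
        }
    }
    return true;
}
```

In ads() the UPDATE would then run only when `should_count_impression(current_agent(), $rn_bots)` returns true.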
Finally, for this stage, counter.php (in the includes directory) is being modified to do this:
Code:if ((in_array('admin', $statisticsExcludeList) && is_admin($admin)) || (is_user($user) && in_array($userinfo['username'], $statisticsExcludeList))) {
    // Do not gather statistics
} else {
    // Yes, gather statistics
    /* Get the Browser data */
    $agent = strtolower($_SERVER['HTPP_USER_AGENT']);
    if (is_bot($agent)) {
        $browser = 'Bot';
    }
    else {
        // Substring tests: the agent string contains the browser name, it never equals it
        if (strpos($agent, 'navigator') !== false || strpos($agent, 'netscape') !== false) {
            $browser = 'Netscape';
        }
        elseif (strpos($agent, 'firefox') !== false) {
            $browser = 'FireFox';
        }
        elseif (strpos($agent, 'msie') !== false) {
            $browser = 'MSIE';
        }
        elseif (strpos($agent, 'lynx') !== false) {
            $browser = 'Lynx';
        }
        elseif (strpos($agent, 'opera') !== false) {
            $browser = 'Opera';
        }
        elseif (strpos($agent, 'webtv') !== false) {
            $browser = 'WebTV';
        }
        elseif (strpos($agent, 'konqueror') !== false) {
            $browser = 'Konqueror';
        }
        elseif (strpos($agent, 'chrome') !== false) {
            $browser = 'Chrome';
        }
        elseif (strpos($agent, 'safari') !== false) {
            $browser = 'Safari';
        }
        else $browser = 'Other';
    }
    /* Get the Operating System data */
|
The old code retrieved HTTP_USER_AGENT for every test; I think it's probably more efficient to just stuff it into a variable at the outset.
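With the agent held in one variable, the browser tests can also be driven from a lookup table. The sketch below is one possible shape, not the counter.php code; `detect_browser` is a hypothetical name. It relies on PHP arrays keeping insertion order, because the order of the tests matters: Chrome user agents also contain "safari", and later Netscape releases also contain "firefox".

```php
<?php
// Hypothetical table-driven variant of the browser tests.
// Keys are lowercase substrings, values are the display names.
function detect_browser($agent) {
    $browsers = array(
        'navigator' => 'Netscape',
        'netscape'  => 'Netscape',
        'firefox'   => 'FireFox',
        'msie'      => 'MSIE',
        'lynx'      => 'Lynx',
        'opera'     => 'Opera',
        'webtv'     => 'WebTV',
        'konqueror' => 'Konqueror',
        'chrome'    => 'Chrome',   // must come before safari
        'safari'    => 'Safari',
    );
    $agent = strtolower($agent);
    foreach ($browsers as $needle => $name) {
        if (strpos($agent, $needle) !== false) {
            return $name;
        }
    }
    return 'Other';
}
```

Adding a browser is then a one-line change to the array, as long as the new entry is placed before any shorter name it overlaps with.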
I have only tested this at the localhost level. I'll need to move it to a real server to try it out where I get a variety of user_agents. There are quite a number of articles about detecting bots with PHP that can be found by googling. I've looked at some and I've downloaded my log files into an editor ... but looking at my cpanel stats indicates that what we have here should vastly improve our statistics collection. |
|
|
|
rebelt
Worker
![Worker Worker](modules/Forums/images/ranks/3stars.gif)
Joined: May 07, 2006
Posts: 172
|
Posted:
Sat Jan 15, 2011 5:36 am |
|
I can't help with regard to coding, but would like to put an idea forward as an interim measure.
With regard to expiring ads, could you not use GCalendar with an alteration?
It has the facility to email notification of new events, could that be modified to email notification of due events, so you receive an email a couple of days before an event, then enter the expiry date as an event.
If GCalendar is already being used for actual events, then perhaps a duplicate?
Just a thought ![Very Happy](modules/Forums/images/smiles/icon_biggrin.gif) |
_________________ I wish I knew what I was doing LOL |
|
|
fkelly
|
Posted:
Sat Jan 15, 2011 8:06 am |
|
Glad you asked Rebelt. Coincidentally, yesterday I was testing something else on my home (localhost) system and found a bug in the counter.php code I posted on December 10. Where it says:
"$agent = strtolower($_SERVER['HTPP_USER_AGENT']); "
It should say: "$agent = strtolower($_SERVER['HTTP_USER_AGENT']);"
my bad (but I guess no one is testing or that would have hit them in the eye).
We really don't need to plug Gcalendar into this. Just store the expiration date in the ad and test whether the current date is past it. That part is easy to do. I just haven't found the time to start working on the administrative part of the Advertising Module to allow the "extra" fields to be stored for code type of ads. That will take a bit of do-jiggering. |
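The date test itself can be very small. A sketch under stated assumptions: `ad_has_expired` is a hypothetical name, and `$dateend` is assumed to arrive as a 'YYYY-MM-DD HH:MM:SS' string the way MySQL returns a datetime column; the actual column type of dateend is not shown in this thread.

```php
<?php
// Hypothetical helper: true when the ad's end date lies in the past.
// $now is a Unix timestamp and defaults to the current time.
function ad_has_expired($dateend, $now = null) {
    if ($now === null) {
        $now = time();
    }
    // An empty or all-zero date is read as "no expiry set".
    if (empty($dateend) || strpos($dateend, '0000-00-00') === 0) {
        return false;
    }
    $expires = strtotime($dateend);
    if ($expires === false) {
        return false; // unparseable date: leave the ad running
    }
    return $now > $expires;
}

// Example with a fixed "now" so the result does not depend on today's date.
$now = strtotime('2011-01-15 08:00:00');
$expired = ad_has_expired('2011-01-14 00:00:00', $now);
```

Because both values go through strtotime, they are interpreted in the same time zone, which keeps the comparison consistent regardless of the server setting.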
|
|
|
rebelt
|
Posted:
Sat Jan 15, 2011 8:37 am |
|
I understand it takes time, especially trying to fit it in with other "stuff".
I was just thinking of a mod to GCalendar as a short term solution.
Sends email when event (expiry date) is due, instead of an email for new events.
i.e. change when GCalendar sends an email. |
|
|
|
fkelly
|
Posted:
Sat Jan 15, 2011 9:55 am |
|
Thinking about your proposal after my previous post Rebelt .. first the mainfile function already sends an email when an ad expires. But second, we are trying to reduce interdependence between modules so that RN can be truly "modular". Someone should be able to install it without installing the calendar module (or the advertising, or any other module for that matter). So we don't want advertising to depend on Gcalendar or vice versa.
Take a look at the mainfile function ads() when you get a chance and you'll see what I mean. Right now the problem is that the admin input screens in advertising don't allow for expiration dates to be added (no pun intended) for all types of ads. |
|
|
|
bobbyg
Worker
![Worker Worker](modules/Forums/images/ranks/3stars.gif)
![](modules/Forums/images/avatars/47640777475ce61275311.jpg)
Joined: Dec 05, 2007
Posts: 212
Location: Tampa, Florida
|
Posted:
Sat Jan 15, 2011 3:26 pm |
|
Maybe I am missing it, but what about adding 'start-date' and 'end-date' as was in the old banner.php module? |
|
|
|
fkelly
|
Posted:
Sat Jan 15, 2011 4:28 pm |
|
The banner table has a dateend field. It is just inconsistently applied to different types of ads (code, images, flash). There is a date field but not a start-date field. Administratively when you are setting up an ad, you can activate it. It might be a convenience to have a start-date but there is a trade-off. Generally we try to minimize database changes because they tend to "break" the programs unless the site admin has made the database changes also before upgrading the code. We put the database changes into the upgrade script but not everyone runs the scripts. They should but the fact is that they don't. So, then we have to deal with questions in the Forums about "what happened to my system and how come advertising is broke?".
If there was a groundswell of opinion that we should have a start-date field we could add it. But heck, it's obvious that not many people are even paying attention to this thread since I had a typo type bug in the code and no one noticed. |
|
|
|
rebelt
|
Posted:
Mon Jan 24, 2011 2:14 pm |
|
fkelly wrote: | If there was a groundswell of opinion that we should have a start-date field we could add it. But heck, it's obvious that not many people are even paying attention to this thread since I had a typo type bug in the code and no one noticed. |
I wouldn't have noticed the typo bug.
Many of us end users are amazed at the way you guys are able to sort out these things. Although we don't say it enough.
Heck, my first site was made with Publisher. No comparison between that and what I have now, thanks to the guys on here. |
|
|
|
montego
Site Admin
![](modules/Forums/images/avatars/0c0adf824792d6d341ef4.gif)
Joined: Aug 29, 2004
Posts: 9457
Location: Arizona
|
Posted:
Sat Jan 29, 2011 8:56 pm |
|
No, start and end dates to me are a must. It's just that my advertising has fallen off to none now, due to what I perceive as a drop-off in *nuke interest and also my not having time to devote to promoting my site (and making fresh content). Just a reality. BUT, when I was actively working with advertisers on my site, impressions were never something they cared about. It was always either a time-based ad OR click-through. Unfortunately, I think both of these are sorely lacking in the current module. I just don't have the time to fix it.
And, just a comment about not wanting to update the DB. Don't you think we put in a pretty handy db upgrade script which takes care of all of this seamlessly fkelly? I mean, it was you and I maintaining that thing... I think it works pretty darn good if you ask me. |
|
|
|
|