Sean said: I think we may be OK now, guys. It turns out we were not attacked by a worm but by a huge number of search engine spiders (they hit us and pull information). For now I am blocking all of them. When things settle down I may reactivate a few of them to maintain our indexes out on the net. I had no idea there were hundreds of these bots hitting us like that; it was like a flood. I may keep the gallery offline a little longer. Thanks
You CAN control this via a few mechanisms, among them robots.txt control files, HTTP headers, and embedded metadata. Spiders that don't follow robots.txt or hit too fast you just block and be done.
Sean said: I think we may be OK now, guys. It turns out we were not attacked by a worm but by a huge number of search engine spiders (they hit us and pull information).
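As a rough illustration of the robots.txt mechanism, here is a minimal Python sketch that uses the standard library's `urllib.robotparser` to show how a well-behaved spider would read such a file. The rules, the `/gallery/` and `/forum/` paths, and the bot name `SomeOtherBot` are assumptions made up for the example, not the site's actual configuration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: paths and the choice of bots are illustrative only.
ROBOTS_TXT = """\
User-agent: msnbot
Disallow: /

User-agent: *
Crawl-delay: 10
Disallow: /gallery/
"""


def build_parser(text: str) -> RobotFileParser:
    """Parse robots.txt text the way a polite spider would before crawling."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# msnbot is shut out entirely; everyone else only loses the gallery.
print(parser.can_fetch("msnbot", "/forum/index.php"))        # False
print(parser.can_fetch("SomeOtherBot", "/forum/index.php"))  # True
print(parser.can_fetch("SomeOtherBot", "/gallery/photo1"))   # False
print(parser.crawl_delay("SomeOtherBot"))                    # 10
```

Note that robots.txt is only a request: it works for spiders that choose to honour it, and `Crawl-delay` is a widely recognised extension rather than part of the original convention. Anything that ignores the file still has to be blocked at the server.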
I don't quite understand what chromatics have to do with anything?
Aggie said: Sean, Tim is telling me this, so bear with me if it sounds a bit blonde.
Posts are not emails. Posts are data base entries.
No. Indexing and search are about information and have nothing to do with that. Relational database systems, in fact, are poorly suited to the task. Again you are confusing things, this time applications with organization and storage models.
This allows them to be searched and indexed.
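For readers wondering what a search index looks like when it is not a set of relational tables, one common structure is an inverted index: a map from each word to the posts that contain it. The Python sketch below is only an illustration with made-up post texts; it makes no claim about what this forum's software or any particular search engine actually uses.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lower-case the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(posts):
    """Map each word to the set of post ids in which it occurs."""
    index = defaultdict(set)
    for post_id, text in posts.items():
        for word in tokenize(text):
            index[word].add(post_id)
    return index


def search(index, query):
    """Return the ids of posts containing every word of the query."""
    words = tokenize(query)
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result


# Hypothetical posts, keyed by post id.
posts = {
    1: "Spiders ignore robots.txt",
    2: "Block the spiders",
    3: "Gallery offline",
}
index = build_index(posts)
print(search(index, "block spiders"))  # {2}
```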
You don't understand what "expiration" in HTTP means. If it's not set, then the data can expire immediately, since it has not been defined. If I don't know when data expires, then I also can't assume that it will never expire, or that the data expires once I get a copy (as might be the case in a ticket reservation system). If data is volatile, then this is what you may want: it gets clients to keep asking for the data. This is where the modification date/time enters the picture. One then asks whether the data has changed since the last time one asked and got it. But if that too is undefined, then one will probably need to assume that the data might have changed (most browsers allow one to set this on a per-session etc. basis, but it's in the hands of the client and not the server). There are also some features for a hash of the content to try to distinguish between changes, but I think I'm getting too deep into the fine details of designing spiders and search engines (which I do) and less into sites.
Expiration is generally against what your data base is supposed to do.
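To make the client side of the expiration and modification-date logic concrete, here is a minimal Python sketch of the decision a spider or browser might make about a cached copy. The function names are hypothetical, and the sketch deliberately ignores `Cache-Control` and other freshness rules that real HTTP clients also apply.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from typing import Dict, Mapping


def is_fresh(headers: Mapping[str, str], now: datetime) -> bool:
    """True only if the response has an Expires date that is still in the future."""
    expires = headers.get("Expires")
    if expires is None:
        # No expiry defined: the client cannot assume the copy is still valid.
        return False
    try:
        return parsedate_to_datetime(expires) > now
    except (TypeError, ValueError):
        # An unparsable date (for example "0") is treated as already expired.
        return False


def revalidation_headers(headers: Mapping[str, str]) -> Dict[str, str]:
    """Headers for a conditional GET; empty if no Last-Modified was given."""
    last_modified = headers.get("Last-Modified")
    if last_modified is None:
        # Modification date undefined: the client must simply fetch again.
        return {}
    return {"If-Modified-Since": last_modified}


now = datetime(2004, 6, 1, tzinfo=timezone.utc)
cached = {"Last-Modified": "Sat, 01 May 2004 12:00:00 GMT"}
print(is_fresh(cached, now))         # False, so the client has to ask again
print(revalidation_headers(cached))  # asks "changed since 1 May 2004?"
```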
That's why one needs to set an expiration date at a distant point in the future and set the modification date. It's up to the site administration to try to control how clients (and web spiders are in this capacity nothing other than clients with a code of behaviour) behave.
You don't want data to just expire and disappear into thin space. Least of all in a community like this.
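On the server side, the advice about a distant expiration date plus a modification date could look roughly like the following Python sketch. The one-year lifetime and the function names are assumptions for illustration; a real forum would set these headers in its web server or application framework.

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime, parsedate_to_datetime
from typing import Dict, Optional


def response_headers(last_modified: datetime, now: datetime,
                     lifetime: timedelta = timedelta(days=365)) -> Dict[str, str]:
    """Build Last-Modified and a far-future Expires (both datetimes must be UTC)."""
    return {
        "Last-Modified": format_datetime(last_modified, usegmt=True),
        "Expires": format_datetime(now + lifetime, usegmt=True),
    }


def status_for(if_modified_since: Optional[str], last_modified: datetime) -> int:
    """Answer a conditional GET: 304 if unchanged since the given date, else 200."""
    if if_modified_since is None:
        return 200
    try:
        since = parsedate_to_datetime(if_modified_since)
        # HTTP dates have one-second resolution, so drop microseconds.
        unchanged = last_modified.replace(microsecond=0) <= since
    except (TypeError, ValueError):
        # Unparsable or timezone-less date: send the full response.
        return 200
    return 304 if unchanged else 200


edited = datetime(2004, 5, 1, 12, 0, 0, tzinfo=timezone.utc)
now = datetime(2004, 6, 1, tzinfo=timezone.utc)
headers = response_headers(edited, now)
print(headers["Last-Modified"])                     # Sat, 01 May 2004 12:00:00 GMT
print(status_for(headers["Last-Modified"], edited)) # 304
```

With headers like these a spider that respects them has no reason to re-download an unchanged post, and when it does ask, a 304 reply costs the server very little.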
Bob F. said: MSNBOT seems a very hungry creature - you might want to use robots.txt to stop the sod crawling too deeply... This thread makes interesting reading (Dead Link Removed). <EDIT: Hmmm - that link does not work from here - it does from Google... - do a Google search on: MSNBOT voracious appetite... >
Skipping much of the dribble, I see a lack of understanding.
Valthonis said: Originally Posted by Aggie
I have to break this one down into its component flawed arguments to best show why this is a wrong interpretation.
POP3 is just a little protocol designed to pass around some e-mail messages that conform to a certain standard. One should never confuse a transfer protocol with the content of the messages being transferred. One should never confuse the syntactics of the message, its form, with its context. One should never confuse the grammar with the model. One should not confuse metaphor with concrete instance. A story is a story and pixies don't in real life fly.
Quote: You are confusing context with storage models.
--------- You are confusing context with method. Just because you view the forums LIKE an email system does not an email system make. You aren't logging on to a POP3 server to transmit a properly formatted message to
So? And they are ill-suited for many uses. RDBMSs are typically not terribly good at searching for something in anything.
Quote: The typical RDBMS system is designed to handle volatile and dynamic information.
--------- Firstly a definition of RDBMS for those too lazy to use google.
*Short for relational database management system and pronounced as separate letters, a type of database management system (DBMS) that stores data in the form of related tables. Relational databases are powerful because they require few assumptions about how data is related or how it will be extracted from the database. As a result, the same database can be viewed in many different ways.*
Databases in an environment such as APUG should handle volatile and dynamic information. The entries are specifically designed to be editable and thus dynamic.
I don't understand the relevance of popularity, the crux of your "de facto standard", as in Microsoft Windows to totalitarian and corrupt governments to ... I don't want to touch upon why PHP is currently popular nor the current trends of "development".
Quote: The model for these forums is little else other than a threaded mailing list or what we've come to call Usenet News lifted over to a web interface (at first mail->web but later also web->web). As a little aside I created the Mail->Web genre a good dozen years ago: see the w3c.org web museum.....
-------- The model for these forums is little else other than phpBB with several phpHacks and a skin that matches the design. This is the de facto standard for web forums but the administrator could have used a different