Wikipedia:WikiProject Spam

Shortcut: WP:WPSPAM
WikiProject Spam engaging spammers off the coast of Wikilandia

As Wikipedia grows in popularity, the temptation to misuse its editability to bring attention to other websites becomes nearly unbearable. At one end of the spectrum are professional spammers seeking to drive traffic to commercial sites. At the other end are webmasters of simple community sites who want to get more attention for their site. This potential for self-promotion on Wikipedia must be managed: Wikipedia is not a "repository for links" or a "vehicle for advertising". Wikipedia exists for the purpose of creating a collaboratively edited encyclopedia, not for any individual to promote a site in which they have an interest.

Those promoting sites by linking to them from Wikipedia formerly saw major search engine optimization (SEO) benefits, due to Wikipedia's popularity. The ability to promote a site's appearance in search engine results was considered too great an incentive for people to add extraneous links to articles. So in February 2007, the English Wikipedia instituted a policy that tags external links "NOFOLLOW."[1] This tells major search engines like Google not to count these links toward the ranking of the linked site. Many web site operators still seek to use Wikipedia to increase the number of inbound links to their sites, some out of ignorance either of how SEO works or of this policy change, others because they simply hope to draw individual readers to their site.

Currently link spammers enjoy a lot of advantages from the lack of cohesion in the spam fighting process. It is possible to sneak links into relatively unwatched articles successfully. Such links may lie unexamined for months, gaining the appearance of legitimacy from having remained in an article so long. When spam links are reverted, there is not much communication. Spammers can return and add links when the article is being watched by different editors who do not know their history of editing-with-an-agenda. Frequently spam contributors take advantage of Wikipedia's Assume good faith policy. They may engage in straw-man or special pleading arguments for inclusion of their links under the guise that they have only the welfare of Wikipedia at heart, usually in the presence of evidence to the contrary.

WikiProject Spam is a volunteer spam-fighting brigade. Our purpose is to develop standards and processes for recognizing, hunting down, and eliminating link spam; to streamline communication among those who watch over articles to prevent it; and to send a message, by our actions and effectiveness, that link spammers are fighting a war they cannot win.

If you would like to participate, we encourage you to add your name to the sign-up list. We encourage you to join in editing this page so we can grow toward consensus about the best way to fight link spam. You are welcome to relate any of your own current ongoing efforts to fight link spam on the talk page so that in the immediate future we can be aware of users that are acting with an agenda to promote an external site.

Removal how-to

There are a variety of facets for dealing with inappropriate links. This guide breaks the process into a number of steps. Most editors will want to complete the first step. Editors interested in doing a more thorough job should follow through with additional measures.

  1. Revert and warn the user: when new spam appears on your watchlist, the easiest way to remove it is to select the diff link, select the edit link above the left-hand column, add an appropriate edit summary, preview the changes, and finally save the page. If you come upon an article with spam, check the recent diffs (the last links) in the page history to see whether it was added recently in a way that damaged the article; if so, revert the changes to restore the article to its previous state. It is important to warn the user: a warning will likely stop the spamming, or at least establish a history of problematic edits. To warn the user, go back to your watchlist or the page history, select the Talk link associated with the editor, and add {{subst:uw-spam1}} ~~~~ to that page. If the user already has one spam warning, add {{subst:uw-spam2}} ~~~~; if two, {{subst:uw-spam3}} ~~~~; if three, {{subst:uw-spam4}} ~~~~. At this point the task is done, but to see whether the user added the same link to other articles, go on to step 2.
  2. Check the user's contributions: a user will often add the same link to multiple articles. This is often confirmation that the user is not editing in good faith. To check for this type of activity, select the contribs link (or, for anonymous users, the IP address) from your watchlist or an article's history. This shows all the other edits the user recently made, and selecting the diff link shows whether the same link has been added to other articles. If inappropriate links are found, revert as in step one; the user only needs to be warned once unless they have spammed since the last warning.
  3. Check for similar links: a crafty spammer hides spamming by using multiple accounts. This step involves finding all of the articles that contain a link to a particular site. If a link to www.example.com were discovered and removed in steps one and two, the next step is to use the linksearch command to find all articles that contain such links. One may enter www.example.com in the search box, but consider entering *.example.com because this will find not only www.example.com but also ads.example.com and any other domain that might have been used. The linksearch command is found in the Special pages list, which has a link from the toolbox of every page.
  4. Identifying the spammer: the process of finding links in step three reveals which articles they are in but does not indicate which editor added them. To find out, go to the article history and expand it to 500 entries. Check whether the link is present in the last (oldest) revision in the list. If so, select the previous 500 changes and check again, repeating the process to find the subset of changes where the link appeared. To find the exact edit where the link was added, check the version in the middle of the 500 entries. If the link is not present there, it was added in the edits above; otherwise it was added in the edits below. Check the middle of the appropriate half in the same manner. By using this divide-and-conquer method, the exact point of insertion can be found quickly. Often the edit summary includes the words External links, which can help pinpoint the edit. Once the edit is found, go back to step one and start cleaning up after this editor.
  5. Persistent spammers: if an active spammer continues adding links after a {{subst:uw-spam4}} warning, report this user to the administrators at the intervention against vandalism page.
  6. Document the target URL. Post an example of the link you have been removing on this project's talk page. If the site is spammed again, any future spam fighters who try a link search will realize that it is a repeat offense, and that further action may be needed.
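
The divide-and-conquer search in step 4 is a binary search over the page history. Here is a minimal sketch, assuming the revision texts have already been fetched into a list ordered oldest to newest (the reverse of how the history page displays them) and that the link stayed in the article once it was added. The function name and the `has_link` callback are hypothetical, not part of any Wikipedia tool.

```python
from typing import Callable, Sequence


def first_revision_with_link(
    revisions: Sequence[str],
    has_link: Callable[[str], bool],
) -> int:
    """Return the index of the earliest revision containing the link, or -1.

    Assumes `revisions` is ordered oldest to newest and that the link,
    once added, is present in every later revision.
    """
    # If the newest revision lacks the link, there is nothing to find.
    if not revisions or not has_link(revisions[-1]):
        return -1

    lo, hi = 0, len(revisions) - 1
    while lo < hi:
        mid = (lo + hi) // 2
        if has_link(revisions[mid]):
            # Link already present here: it was added at mid or earlier.
            hi = mid
        else:
            # Link absent here: it was added in a later revision.
            lo = mid + 1
    return lo


if __name__ == "__main__":
    history = [
        "Intro text.",
        "Intro text. More detail.",
        "Intro text. More detail. http://www.example.com",
        "Intro text, reworded. http://www.example.com",
    ]
    print(first_revision_with_link(history, lambda text: "example.com" in text))
```

With 500 revisions this needs about nine checks instead of up to 500. If the link was removed and re-added along the way, the monotonic assumption fails and the result should be verified by hand.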

See also the to-do list.

Standards

The number one rule for Project members is this code of honor: "I will never insert links to my own sites into Wikipedia's article space." Not only is Conflict of interest a guideline that is generally accepted among editors, but many of us who run websites are too committed to their success (however we define it) to judge impartially whether or not they belong in an article. Moreover, we are actively reverting self-promotional links added by other editors, some of whom view the addition of their links as sincere attempts to serve various communities. It is easier to gain the respect of these people if we hold ourselves to the highest possible standard and avoid any appearance of double standards or hypocrisy.

Tag 'em to stop 'em

Suspicious edits automatically deserve a {{subst:uw-spam1}} tag on the user's talk page, with spam or {{uw-spam1}} in the edit summary. This is important, first to drive home the message that spam is not welcome here, and second to alert us to repeat offenders: if they come back months later, there will be a record of their behavior. Placing the warning tag does not take much more effort than removing the spam itself, and it can really help prevent the spam from returning. Successive violations of the spam policy can be met with {{subst:uw-spam2}}, {{subst:uw-spam3}} and then {{subst:uw-spam4}} on the user's talk page. If a violation occurs after the fourth warning, report the offending user at the Administrator intervention against vandalism page.
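
The escalation ladder can be written down as a small lookup. This is only an illustrative sketch: the function name is hypothetical, and counting the prior spam warnings on the user's talk page is assumed to be done by hand.

```python
# Warning templates in escalation order, as listed on this page.
SPAM_WARNINGS = ["uw-spam1", "uw-spam2", "uw-spam3", "uw-spam4"]


def next_spam_action(prior_warnings: int) -> str:
    """Return the wikitext to post, or the reporting step, for a spam edit.

    `prior_warnings` is the number of spam warnings already on the
    user's talk page (an input the editor determines manually).
    """
    if prior_warnings < 0:
        raise ValueError("prior_warnings cannot be negative")
    if prior_warnings >= len(SPAM_WARNINGS):
        # All four warnings have been used: hand over to administrators.
        return "Report at Administrator intervention against vandalism"
    # Substitute the template and sign with four tildes.
    return "{{subst:" + SPAM_WARNINGS[prior_warnings] + "}} ~~~~"


if __name__ == "__main__":
    for count in range(5):
        print(count, next_spam_action(count))
```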

How to identify spam and spammers

  1. User is anonymous (an IP address)
  2. User:page and/or User_talk:page are red links
  3. No edit summary (other than, perhaps /* External links */)
  4. User has made only one edit, which consisted of inserting a link
  5. User has made multiple edits to related articles
  6. The majority of user's edits are to external links sections
  7. The link is a site that has Google/Yahoo ads (AdSense/SM).
  8. Edits are marked "minor"
  9. Link is trying to sell a product or service. You can use Microsoft's Detecting Online Commercial Intention Tool to help you with the determination.
  10. User adds links to the top of a section, above far more relevant sites
  11. User replaces an existing link or part of an existing link.
  12. The syntax of the added link does not match the syntax used in the rest of the list
  13. User adds links to inappropriate sections of articles ("References", "See also", "For more information")
  14. User adds links that have been previously removed, without discussing on the talk page.
  15. Following a link takes you to a site that does not mention the specific topic of the page containing the link.
  16. Link is unrelated, or only marginally related to the article. For example, link on a biography to a specific page on a genealogy site describing the person's genealogy, but not the person.
  17. User adds links to other Wikipedia articles where he/she has already placed spam links.
  18. User includes within the link description, "hosted on example.com" with a separate link to example.com.
  19. Link is mangled, or it took many edits to get the syntax right. The spammer may be new to Wikipedia and not be familiar with Wikipedia syntax for external links.
  20. Text of the link goes beyond describing the contents to actively encouraging you to read it. For example, including text such as, "Read more about [subject] in [this fascinating article]"
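
A few of the indicators above are mechanical enough to check from edit metadata alone. The sketch below covers only indicators 1, 3, 4 and 8. The `EditInfo` fields are hypothetical names chosen for illustration, not fields of any real Wikipedia tool. A match is a prompt to look closer, not proof of spam.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class EditInfo:
    """Hypothetical summary of one edit, filled in by the reviewer."""
    is_anonymous: bool          # user is an IP address
    edit_summary: str           # summary text as shown in the history
    marked_minor: bool          # edit flagged as "minor"
    user_edit_count: int        # total edits by this user
    added_external_link: bool   # this edit inserted an external link


def spam_indicators(edit: EditInfo) -> List[str]:
    """Return the indicators from the list above that this edit matches."""
    found = []
    if edit.is_anonymous:
        found.append("anonymous user")
    # An automatic section summary alone counts as no summary.
    if edit.edit_summary.strip() in ("", "/* External links */"):
        found.append("no edit summary")
    if edit.user_edit_count == 1 and edit.added_external_link:
        found.append("only edit inserted a link")
    if edit.marked_minor:
        found.append("marked minor")
    return found


if __name__ == "__main__":
    edit = EditInfo(True, "/* External links */", True, 1, True)
    print(spam_indicators(edit))
```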

Common spammer strawmen

Spammers will offer arguments like the following. These are strawman arguments, for the reasons listed.

  • "But you have links to commercial sites in the list."
    • Spamming is about promoting your own site or a site you love, not about commercial sites at all. Links to commercial sites are often appropriate. Links to sites for the purpose of using Wikipedia to promote your site are not.
  • "But you have links to other sites that people have added for self-promotion."
    • Those need to go, too. The fact that we haven't gotten around to it yet does not mean that we have some obligation to have your site.
  • "But you have a link to site Y, and my site is just like that."
    • We don't need to link to every site in existence that meets a certain criterion. Sometimes we just need one site representative of a category. (See also the comments about linking to web directories instead, so that Wikipedia does not become a web directory.)
  • "But these links have been here for a long time."
    • There are no binding decisions on Wikipedia, especially when the decision was never discussed on the talk page. Just because nobody noticed your spam a long time ago does not mean you now have a "right" to keep it in.
  • "My link is very unique."
    • It is more likely that the link they have added has no more information than the Wikipedia article itself.
  • "My site is non-commercial, so it's not spamming." (Similarly "nonprofit", "charitable", "opposes cruelty to puppies", etc.)
    • It doesn't matter--being noncommercial (etc.) doesn't confer a license to spam even when it's true, and these sites are often trying to sell something even if the business is organized as a nonprofit.

Assuming good faith

Assuming good faith is an important policy of Wikipedia, but does not require that you assume good intentions when there is evidence to the contrary. Link spamming behavior fits a definite profile. When editors meet this profile, they are engaging in activity which is detrimental to Wikipedia, no matter how sincere they may have been in their edits. We should develop responses to those who engage in this behavior which encourage them to reform into productive Wikipedians, but we should waste no time in protecting Wikipedia from the damaging behavior through reverts and blocks where necessary.

Regular clean-out of undiscussed links

Several editors go through certain articles every few days and remove any undiscussed external links. Call it quick and easy "house cleaning." To encourage sincere links, they leave this edit summary:

Regular clean-out of undiscussed links. Please come to Talk page if you want a link not to be cleaned out regularly.

One could easily start this strategy in any article by adding {{subst:Discuss links here}}~~~~ to its talk page. The plan is to discourage people whose sole intention is self-promotion.

Also, add commented-out warnings to the External links section of the articles themselves:

<!-- ATTENTION! Please do not add links without discussion and consensus on the talk page. Undiscussed links will be removed. -->

For this purpose the Template:NoMoreLinks has been created.

The strategy is used in the following articles:

This strategy is also helpful to deal with POV and conspiracy links:

What to do with linkfarms

  • Wikipedia policy states that Wikipedia is not a web directory of anything. Sometimes, the easiest and best way is to replace the link farm with a reference to a web directory, such as the Open Directory Project ({{dmoz}}) or the Yahoo! Directory ({{yahoo directory}}). For example, see my edits to Model United Nations and Online shopping directory. It works: check the date. Until now, no one has added links to these two pages. --Perfecto 03:39, 16 January 2006 (UTC)
  • One good thing spammers do is find us overlapping product or company lists in several articles (which they create themselves, sadly). For example, one of them helped me find overlapping link farms in Friendster, Social network service and Social network. I found several sites linked in all three! The solution is to put these farms together in one article, and then say "For links to so-and-so sites, see:" on the rest. After this, I find, they start leaving the other articles alone. --Perfecto 22:34, 8 February 2006 (UTC)

The Campaign

We would want a concerted viral marketing strategy involving

and a dash of mentions in help pages, FAQs and fixup templates.

Guidelines, policies, and essays

Templates

Spam warnings
Advertising warnings
Article tags
Policy & Project

{{subst:spam}} and related

These templates should be substituted ({{subst:Uw-spam}}, etc) as per WP:SUBST.

{{Cleanup-spam}}

{{Cleanup-spam}}, which I began, might be useful. See Wikipedia:Spam for more details. -- Perfecto 04:03, 6 December 2005 (UTC)

{{subst:WPSPAM-invite}}

Saw someone revert or remove linkspam? Invite the comrade here with {{subst:WPSPAM-invite}} placed on their User talk page. -- Perfecto 04:27, 30 December 2005 (UTC)

A souped-up alternative: {{subst:WPSPAM-invite-n}}. Λυδαcιτγ 23:00, 18 February 2007 (UTC)

Standardised edit summary

HorsePunchKid suggests a standardised edit summary to raise awareness both of the problem and this particular effort:

Removed link spam. Wikipedia is [[WP:NOT|NOT]] a link directory. Join [[Wikipedia:WikiProject Spam]] to help!

Perfecto uses the following:

Removed [[WP:EL|external link]] [[WP:SPAM|spam]]. ([[WP:WPSPAM|You can help!]])

Aude suggests:

Removed [[WP:EL|external link]] added by [[User talk:69.159.82.252|69.159.82.252]]. Wikipedia is [[WP:NOT|NOT]] a link directory. ([[WP:WPSPAM|WikiProject Spam]])
Substitute the IP address or user name as appropriate.

TheJabberwʘck suggests (for users of popups):

Reverted [[WP:EL|external link]] addition by [[Special:Contributions/<user>|<user>]] to version %s, using [[:en:Wikipedia:Tools/Navigation_popups|popups]]. Wikipedia is [[WP:NOT|NOT]] a link directory. ([[WP:WPSPAM|you can help!]])

These edit summaries help drive a concerted viral marketing strategy.
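
For editors who paste these summaries often, substituting the user name can be scripted. A minimal sketch based on Aude's wording follows. The function name is hypothetical, and the `[[WP:WPSPAM|WikiProject Spam]]` link is written with a single pipe on the assumption that this is the intended form.

```python
def removal_summary(user: str) -> str:
    """Build the edit summary for removing a link added by `user`.

    `user` is the IP address or user name of the editor who added it.
    """
    return (
        f"Removed [[WP:EL|external link]] added by "
        f"[[User talk:{user}|{user}]]. "
        "Wikipedia is [[WP:NOT|NOT]] a link directory. "
        "([[WP:WPSPAM|WikiProject Spam]])"
    )


if __name__ == "__main__":
    print(removal_summary("69.159.82.252"))
```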

Recognition

The coveted Spamstar of Glory is awarded to those who show strong contributions to tracking down and stopping spammers as well as cleaning up their links. Introduced on November 8, 2006 by A. B., it originally consisted of a nicely Photoshopped can of Hormel spam superimposed on a barnstar. Later, due to concerns about infringing on Hormel's trademark, the award was changed to the current design, adapted from The RickK Anti-Vandalism Barnstar.

The Spamstar of Glory
Presented to {{{1}}} for diligence in fighting spam on Wikipedia

List of the proud few awarded this distinctive honor to date

Participants

See the list of participants. You can sign up and help us fight spam on Wikipedia!
As of September 2008 we have over 300 participants.

Userbox

Participants may add this to their userpage instead of signing up.

Code: {{User WikiProject Spam}}
Results in: This user is a member of WikiProject Spam.

If you prefer not to use userboxes, you may add yourself directly to Category:WikiProject Spam members by placing the following code on your Userpage: [[Category:WikiProject Spam members|{{PAGENAME}}]]

Tools

  • Search tools
    • Special:Linksearch - find all external links to a particular site on en:Wikipedia, useful when a spam link is added by many different IP addresses or accounts.
    • Cross wiki linksearch - searches for a link across other language Wikipedias.
    • Archive link search - searches several common archives for variations on a URL that are not necessarily active links (for instance, exampleDOTcom). This is useful for finding previous conversations about particular sites.
  • To combat repeat offenders, you may request to have links added to the local spam blacklist (for links that have only been spammed on the English Wikipedia) or the Wikimedia Global spam blacklist (for links that have been spammed on more than one Wikimedia Foundation project).
  • For links that unestablished users tend to add inappropriately, but that do not qualify for the meta spam blacklist or the local spam blacklist, ask for User:XLinkBot to monitor them at User talk:XLinkBot/RevertList.
  • Watch the link addition feed in #wikipedia-en-spam. A bot there reports all newly added links and keeps track of serial spammers. For more info see /IRC Channels.
  • If no one is around to add something to the spam blacklist, contact users at #wikipedia-spam-t on freenode.
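
A wildcard link search such as `*.example.com` can also be expressed as a query against the MediaWiki web API. The sketch below only builds the request URL and does not send it. It assumes the English Wikipedia endpoint at `https://en.wikipedia.org/w/api.php` and the `exturlusage` list module with its `euquery` and `eulimit` parameters; check the API documentation of the wiki you are querying before relying on it.

```python
from urllib.parse import urlencode

# Assumed API endpoint for the English Wikipedia.
API_ENDPOINT = "https://en.wikipedia.org/w/api.php"


def linksearch_api_url(pattern: str, limit: int = 50) -> str:
    """Build an API URL listing pages that link to `pattern`.

    `pattern` may start with a wildcard, e.g. "*.example.com", so that
    www.example.com and ads.example.com are both covered.
    """
    params = {
        "action": "query",
        "list": "exturlusage",   # external link usage, as in Special:Linksearch
        "euquery": pattern,
        "eulimit": limit,
        "format": "json",
    }
    return API_ENDPOINT + "?" + urlencode(params)


if __name__ == "__main__":
    print(linksearch_api_url("*.example.com"))
```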