[Chart: Spam by content type, weekly from 1-Jan to 28-Oct, highlighting the changing techniques used in spam. Upper panel (0% to 100%): html, text_only. Lower panel, html and text only omitted (0% to 16%): other, zip, pdf, rfc822_bounce, png, audio, msword_doc, jpeg, gif.]
5 For more information on DarkMarket and the FBI operation, please visit: www.fbi.gov/pressrel/pressrel08/darkmarket101608.htm
6 For the original Washington Post article, please visit: http://www.washingtonpost.com/wp-dyn/content/article/2008/11/12/AR2008111202662_pf.html
Starting with the U.S. Presidential Primaries (January), world news headlines in 2008 offered fodder for spammers.
The year was full of world events, including Tibetan monk protests (March), the tragic earthquakes that struck China (May), the Russian-Georgian conflict and the 2008 Olympics in Beijing (August), and the credit crisis (September and October), culminating in the U.S. Presidential Election (November). Each event provided an opportunity for spammers and criminals to spread a message. In spam email subject lines alone, the spammers inadvertently foreshadowed the outcome of the U.S. Presidential election, with 85% of the election-related spam subjects mentioning Barack Obama compared with 15% referencing John McCain.
Holidays are also among spammers’ favorite events, including Saint Valentine’s Day in February, Halloween in October and Thanksgiving in November. Traditionally the end-of-harvest celebration, Halloween has become internationally recognized as a holiday involving creative costumes, tasty treats and haunted attractions. It came as no surprise that spammers sought to pump out Halloween-themed spam with subject lines advertising “Halloween sales.” However, the message content itself followed the same lines as tired, old spam emails selling familiar herbal remedies and sexual enhancement drugs.
3.2 New Techniques, changes etc
2008 saw an aggressive shift toward cyber-criminals abusing free, reputable, Web-based email and application service providers. In previous years, the majority of spam appearing to come from these domains was spoofed. However, that all changed in 2008, when spammers realized how much they could achieve with large numbers of genuine online accounts hosted by these major providers.
Keys to the Kingdom - Web-based Email and Application Service Providers
CAPTCHA-breaking reveals the keys to these online kingdoms and unlocks the door to creating a free email account that can be widely used for spamming and hosting spam content.
[Illustration of feature-rich functionality of major online accounts: “Keys to the kingdom” accounts at major services, giving access to Notebooks / favorites, Blogs, Blog comments, Discussion groups, Videos, Images, Calendars, Sites, IM, Docs and Email.]
In January, the use of free, reputable, Web-based email and application service providers for sending spam accounted for approximately 6.5% of all spam, peaking at 25% in September:
[Chart showing the proportion of spam sent from public Web mail accounts, titled “Spam originating from webmail accounts”, by month from Jan to Nov. Data labels: 25.0%, 13.8%, 13.3%, 12.0%, 11.2%, 10.9%, 6.5%, 5.7%, 5.2%, 4.8%, 3.3%.]
The value of CAPTCHA-breaking tools depends on which sites one wishes to use to generate a personal account. The most frequently occurring URLs or links included in spam messages relate to mainstream Web mail and application service account providers. These are accounts that have been created using CAPTCHA-breaking tools for the purpose of distributing spam content.
Search Engine Redirect Spam
In early 2008, MessageLabs Intelligence identified a significant hike in the proportion of spam abusing search engine redirects. Search engine spamming is a technique that allows the spammer to include in an email message a link constructed from a search engine query. When the link is followed, the browser is led to the spammer’s Web site.
This means that the spammers can send messages without directly using the URL for the spam site in the body of the message, which makes it more difficult for traditional anti-spam products to identify the message as spam. While spam filters may recognize known spam sites, they cannot reasonably block links to legitimate search engine sites without imposing significant impairment to legitimate users.
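The filtering problem described above can be illustrated with a minimal Python sketch. It assumes a hypothetical filter that checks only the host name of each link against a blocklist. The host names (search.example, spam-site.example), the query format and the function name are illustrative assumptions, not details taken from the MessageLabs analysis.

```python
from urllib.parse import urlsplit

# Hypothetical blocklist of known spam hosts (illustrative names only).
BLOCKED_HOSTS = {"spam-site.example"}

def is_blocked(link: str) -> bool:
    """Return True if the link's host name is on the blocklist."""
    host = urlsplit(link).hostname or ""
    return host.lower() in BLOCKED_HOSTS

direct_link = "http://spam-site.example/pills"
# Assumed shape of a redirect link: the visible host is the search engine,
# and the spam site is reached only through the query it carries.
redirect_link = "http://search.example/search?q=cheap+pills+spam-site&lucky=1"

print(is_blocked(direct_link))    # True
print(is_blocked(redirect_link))  # False
```

Adding search.example to the blocklist would catch the second link, but it would also block every legitimate link to that search engine, which is the trade-off the text describes.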
Many major Internet search engines were used to send these spam messages, which accounted for as much as 17% of spam in January. However, search engine redirect spam quickly diminished as anti-spam technology caught up and the search engine providers made it much harder for spammers to take advantage of this feature.
Hosted Applications Spam
In May 2008, MessageLabs Intelligence uncovered emails that contained links to hosted online documents created under accounts with a major hosted applications service provider. The spam content was contained in the hosted documents, rather than the email messages, making it harder for anti-spam systems to block these messages based on these domains without considerably impairing other legitimate users. Moreover, as the year progressed, spammers took similar advantage of other hosted applications.
A Flash in the Spam
Another new spam technique was uncovered by MessageLabs Intelligence in July 2008, in which bona fide, free image hosting sites were used to host very small but malicious Shockwave Flash (.SWF) files that, when viewed, caused the Web browser to redirect to another site. Using this technique, spammers could bypass many traditional anti-spam content filters, since the link in the message related to a legitimate Web site hosting the .SWF file.
[Figure: Example of spam containing a link to a .SWF file that redirects to a spam Web site]
When opened, the Flash program redirected the user’s Web browser to the real site using a command such as:
getURL("http://[spam Web site removed]/")
Although most examples of spam using this technique were pharmaceutical spam, MessageLabs also intercepted a number of spam messages recruiting money laundering “mules” with work-from-home job advertisements. This technique was also being used to distribute malware in 2008, where the redirect action would result in the user being taken to a Web site that downloaded malicious content.
[Figure: Example of source code to Flash .SWF redirect attack]
One particularly attractive aspect of using hosted Flash files seemed to be that, even after a few weeks, the Flash files were still available online because the hosting providers were unaware of their purpose and did not remove them quickly.
Accordingly, free Web-based email and application service providers and other Web-based consumer services will most likely continue to be targeted in 2009, until CAPTCHA mechanisms can no longer be broken automatically.
3.2.1 CAPTCHA Breaking
The challenge in designing a good CAPTCHA is that it needs to be simple enough for most humans to pass the test, but difficult enough so that automated computer programs cannot.
[Figure: Typical examples of CAPTCHAs]
For many years, CAPTCHAs have proven very useful for many reputable, Web-based email and application service providers, including social networking sites and online auction sites, for the purpose of deterring automated registration.
Nevertheless, cyber-criminals have not ceased trying to defeat CAPTCHA-based protection. It is not surprising that in 2008 CAPTCHA was almost comprehensively defeated using “bots” to automate the process of setting up email accounts and online profiles in a matter of minutes.
How does CAPTCHA-breaking work
There are several approaches to defeating the CAPTCHAs that protect many online services, most notably their sign-up processes. First, the spammer may hire the services of Mechanical Turks [7] or individuals who directly or indirectly create accounts that are subsequently traded online. Then there are those who wish to solve the CAPTCHAs using software.
CAPTCHA-breaking software may be designed specifically to target a particular Web site, with its developers creating a tool capable of defeating that site’s CAPTCHAs programmatically. An algorithm-based attack is very scalable once a reasonable level of accuracy is achieved, and the tool itself can then be sold. In some cases, accounts may be created using the tool and also sold.
[Diagram showing automated techniques used to break CAPTCHAs. At the account creation validation step of a major, free, reputable Web-based email and application service provider, the bot collects a screen capture of the CAPTCHA and attempts to solve it. Unsuccessful ~70% of the time: the bot learns more about the CAPTCHA system and starts over on a new CAPTCHA. Successful ~30% of the time: the bot creates an account, which is used for sending spam, and moves on to a new CAPTCHA to attempt to create another account.]
Mechanical Turks often take the form of specialized Web sites, or Trojan programs (first seen in 2007), purporting to be a game where the user needs to enter the correct code, namely a CAPTCHA from another Web site, to be permitted to play. Some sites pay the users tiny amounts of money for each CAPTCHA solved, and one example of such a Trojan invites the recipient to disrobe an attractive woman step-by-step by solving the CAPTCHA.
7 “Mechanical Turk” is the term originally applied to an 18th Century chess-playing automaton, which turned out to be a hoax and was actually operated by a human concealed within the machine.
[Figure: Example of CAPTCHA-breaking “Mechanical Turk” site]
Some anti-CAPTCHA tools target the audio alternative offered by sites for visually impaired visitors. Often the solution consists of a string of digits between 0 and 9; waveform analysis of the audio sometimes proves easier than analysis of the image-based CAPTCHA, as the digits are clearly distinguishable from the random background noise.
[Figure: Example of an audio CAPTCHA for visually impaired users]
MessageLabs Intelligence analysis indicates that in early 2008, these algorithms deployed against CAPTCHA systems were about 20-30% successful, improving significantly as the year progressed. Eventually, CAPTCHAs could be broken in a matter of seconds. When combined with the incredible computational horsepower available in hackers’ botnets, and the ability to make unlimited attempts, this success rate means that attackers could create as many email accounts as desired.
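To see why a 20-30% success rate is enough, each CAPTCHA attempt can be treated as an independent trial with success probability p. Independence is an assumption made here for illustration; the report does not state it. Under that assumption the expected number of attempts per account is 1/p, and the chance of at least one success in n attempts is 1 - (1 - p)^n. The function names below are illustrative.

```python
def expected_attempts(p: float) -> float:
    """Mean number of attempts until the first success (geometric distribution)."""
    return 1.0 / p

def prob_at_least_one(p: float, n: int) -> float:
    """Probability that at least one of n independent attempts succeeds."""
    return 1.0 - (1.0 - p) ** n

for p in (0.2, 0.3):
    print(f"p={p:.0%}: about {expected_attempts(p):.1f} attempts per account, "
          f"{prob_at_least_one(p, 10):.1%} chance of success within 10 attempts")
```

Under these assumptions a bot needs only about three to five attempts per account on average, and ten attempts succeed at least once roughly 89% of the time at p = 20% and roughly 97% of the time at p = 30%.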
CAPTCHA-breakers may also combine these two approaches using the “Mechanical Turks” to solve the CAPTCHAs initially and at the same time build a database of successful and failed attempts that can be used to train, test and tune an algorithm under development. As the success rate increases, the attackers can reduce or eliminate their use of expensive Mechanical Turks and turn to a botnet-powered operation.
Automation and Analysis Tools
Often the first stage in automatically breaking a CAPTCHA involves removing background noise from an image before isolating or segmenting the individual characters in the image. These segments may then be analyzed with OCR (Optical Character Recognition) techniques to identify them, before assembling the response.
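The first two stages described above, noise removal and segmentation, can be sketched on a toy bitmap. This is a simplified illustration using generic image-processing steps (isolated-pixel removal and a column projection), not the method of any particular tool; the bitmap, the function names and the choice of techniques are assumptions, and the OCR stage is omitted.

```python
from typing import List, Tuple

Bitmap = List[str]  # rows of '#' (ink) and '.' (background)

def remove_isolated_pixels(img: Bitmap) -> Bitmap:
    """Clear any ink pixel with no ink among its 8 neighbours (speckle noise)."""
    h, w = len(img), len(img[0])

    def ink(r: int, c: int) -> bool:
        return 0 <= r < h and 0 <= c < w and img[r][c] == "#"

    out = []
    for r in range(h):
        row = ""
        for c in range(w):
            neighbours = sum(
                ink(r + dr, c + dc)
                for dr in (-1, 0, 1) for dc in (-1, 0, 1)
                if (dr, dc) != (0, 0)
            )
            row += "#" if ink(r, c) and neighbours > 0 else "."
        out.append(row)
    return out

def segment_columns(img: Bitmap) -> List[Tuple[int, int]]:
    """Return (start, end) column ranges separated by ink-free columns."""
    w = len(img[0])
    has_ink = [any(row[c] == "#" for row in img) for c in range(w)]
    segments, start = [], None
    for c, flag in enumerate(has_ink + [False]):  # sentinel closes the last run
        if flag and start is None:
            start = c
        elif not flag and start is not None:
            segments.append((start, c))
            start = None
    return segments

# Toy image: two glyph-like shapes plus one stray noise pixel on the last row.
noisy = [
    "...........",
    ".##....#.#.",
    ".##....###.",
    ".##....#.#.",
    "...........",
    ".....#.....",
]

print(segment_columns(noisy))                          # [(1, 3), (5, 6), (7, 10)]
print(segment_columns(remove_isolated_pixels(noisy)))  # [(1, 3), (7, 10)]
```

Without the noise-removal step the stray pixel is treated as a third character, which is why cleaning the image usually comes before segmentation.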
However, some CAPTCHA systems are broken without using OCR. For example, poor implementations on some sites allow the reuse of session IDs for known CAPTCHAs. Some implementations compute a one-way hash (such as MD5) of the correct answer and pass it to the browser to validate the answer from the user. In some cases, this hash is easily broken and can also be used to improve the OCR techniques.
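The weakness of sending a hash of the answer to the browser can be sketched as follows. The example assumes, purely for illustration, a four-digit numeric answer and an unsalted MD5 digest; the report does not specify the answer format or how any particular site computed its hash. With only 10,000 possible answers, trying them all recovers the answer without any image analysis.

```python
import hashlib
from itertools import product
from typing import Optional

def md5_hex(text: str) -> str:
    """Hex MD5 digest of an ASCII string."""
    return hashlib.md5(text.encode("ascii")).hexdigest()

def recover_answer(leaked_hash: str, length: int = 4) -> Optional[str]:
    """Try every digit string of the given length until one matches the hash."""
    for digits in product("0123456789", repeat=length):
        candidate = "".join(digits)
        if md5_hex(candidate) == leaked_hash:
            return candidate
    return None

# Hypothetical value that a weak implementation might embed in the page.
leaked = md5_hex("4821")
print(recover_answer(leaked))  # 4821
```

A common mitigation is to keep the expected answer on the server, tied to a single-use session, so that nothing sent to the browser can be checked offline.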
[Figure: CAPTCHA-breaking using “hash” keys]
Web-based email and application service providers continuously update and modify CAPTCHA-creation techniques to foil known CAPTCHA-solving algorithms, creating an ongoing arms race between CAPTCHA developers and CAPTCHA attackers. Ultimately the CAPTCHA developer is limited by what a human user can solve. Reports of increasing difficulty with solving CAPTCHAs cast doubt on the long-term utility of CAPTCHAs as a security mechanism for protecting online email services from abuse.
Why is CAPTCHA breaking so valuable
Once the criminals have the ability to break CAPTCHAs, they can apply it to almost any online process that uses CAPTCHA techniques to deter automated sign-up. Typical targeted applications include, but are not limited to, the following:
• Free Email Account Registration: Spammers can use CAPTCHA-breaking to create email accounts in bulk with ease. The most popular targets include free, reputable, Web-based email and application service providers. With the created accounts, spammers can send out unlimited junk mail. Spam sent through the service providers’ email servers is also digitally signed correctly using whichever sender authentication scheme the provider has deployed.
For example, DomainKeys Identified Mail (DKIM) uses a digital signature included with the headers to indicate that the message is genuine and not spoofed, thus making the mail generated in this way harder to block using anti-spam methods based on the source IP address.
• Social Web Site Account Registration: Spammers are also very keen to create bulk accounts on social networking, blogging, and video and picture sharing sites, all of which use CAPTCHAs for user account registration.
Such social networking Web sites have rich user interaction features such as ‘Invite Friends,’ ‘Share This Video,’ ‘Send Message to Friends,’ ‘Comment on this Video,’ etc. Once signed up as a member, spammers can post spam comments to other members on videos, or send out junk messages to ‘Invited Friends’ and genuine email accounts, examples of which will be explored later in this report.
• Search Engine Ranking: Spammers can automatically post spam comments on blogs, chat forums and discussion boards that require a CAPTCHA to be solved when posting comments. One motivation for doing so is to boost Web site rankings on the major Internet search engines.
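For the DKIM signature mentioned under Free Email Account Registration, the sketch below parses the tag=value list of a DKIM-Signature header field so that the signing domain (d=) and selector (s=) can be read. The header value is a made-up, truncated example and the parser is a simplification that does not verify the signature. It only illustrates that the signature names the provider's own domain, so mail sent from a genuine account at that provider carries a valid signature regardless of who created the account.

```python
from typing import Dict

def parse_dkim_tags(header_value: str) -> Dict[str, str]:
    """Split a DKIM-Signature value into its tag=value pairs (no verification)."""
    tags = {}
    for part in header_value.split(";"):
        if "=" not in part:
            continue
        name, value = part.split("=", 1)
        # Simplification: drop all whitespace inside values, including folding.
        tags[name.strip()] = "".join(value.split())
    return tags

# Made-up example; real bh= and b= values are much longer base64 strings.
example = (
    "v=1; a=rsa-sha256; c=relaxed/relaxed; d=webmail.example; s=sel2008;\r\n"
    "\th=from:to:subject:date; bh=AAAA; b=BBBB"
)
tags = parse_dkim_tags(example)
print(tags["d"], tags["s"], tags["a"])  # webmail.example sel2008 rsa-sha256
```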