Google’s John Mueller posted a comment on Reddit describing what is often referred to as the “Japanese keyword hack.” In short, it is an automated process that looks for vulnerabilities in your CMS and uses them to inject content into your website.
John said, “It looks like you (or your hoster) already cleaned some of this up – it’s often called the ‘Japanese keyword hack’.”
The site owner noticed that over “20,000 pages in Japanese or Chinese have been indexed,” as reported in his Google Search Console property. He didn’t build that content; someone else injected it into his site without his knowledge.
Here is a partial screenshot of the site command for that infected site:
The site was hacked, and the site owner now needs to get the hack under control and remove the hacked content from the site.
Google even has a document on this hack saying, “The Japanese keywords hack typically creates new pages with autogenerated Japanese text on your site in randomly generated directory names (for instance, http://example.com/ltjmnjp/341.html). These pages are monetized using affiliate links to stores selling fake brand merchandise and then shown in Google Search.”
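To make that URL pattern concrete, here is a minimal sketch of how a site owner might scan a list of URLs (for example, an export from Search Console or a server log) for paths shaped like Google’s example. The regular expression, the length limits, and the function name find_suspicious_urls are my own assumptions for illustration, not anything Google publishes, and a real hack may use different patterns.

```python
import re
from urllib.parse import urlparse

# Assumed heuristic, modeled only on Google's example URL
# http://example.com/ltjmnjp/341.html: one all-lowercase directory
# followed by a numeric .html file. Real hacks may look different.
SUSPICIOUS_PATH = re.compile(r"^/[a-z]{5,12}/\d+\.html$")


def find_suspicious_urls(urls):
    """Return the URLs whose path matches the heuristic pattern."""
    return [u for u in urls if SUSPICIOUS_PATH.match(urlparse(u).path)]


# Hypothetical sample input; replace with your own URL list.
sample_urls = [
    "http://example.com/ltjmnjp/341.html",
    "http://example.com/about/",
    "http://example.com/blog/2024-review.html",
]

for url in find_suspicious_urls(sample_urls):
    print("check this URL:", url)
```

A pattern like this will also match legitimate URLs such as /products/123.html, so treat the output as a list of pages to review by hand, not as proof of a hack.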
Here’s an example of what one of these pages looks like according to Google:
John Mueller from Google explained that “since someone hacked your site, even if you’ve cleaned up the hacked traces, it’s important to understand how they did it, so that you can make sure that the old vulnerabilities are locked down.” As for how the hack works, he said, “they scan the web automatically, the hack is probably also mostly automatic – so someone will be back if you didn’t lock it down.”
“If it was through WordPress, make sure you automate updates & co,” he added, “or consider moving to a platform where you don’t have to manage hosting yourself (wordpress .com?, or other CMSs), if you find it a hassle.”
In terms of SEO and Google Search, John said he would “recommend making sure the important pages on the site are clean.”
John said those pages will “get recrawled / reindexed fairly quickly in Google (sometimes with some nudging through Search Console).”
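One rough way to spot-check that important pages are clean is to measure how much of a page’s text is written in Japanese or Chinese characters. The sketch below assumes the site is not normally written in those languages and that you have already fetched the page text yourself; the function names and the 0.3 threshold are hypothetical choices of mine, not a method John or Google described.

```python
def is_cjk(char):
    """True for Hiragana, Katakana, or CJK Unified Ideograph characters."""
    return "\u3040" <= char <= "\u30ff" or "\u4e00" <= char <= "\u9fff"


def cjk_ratio(text):
    """Fraction of non-whitespace characters that are Japanese/Chinese."""
    chars = [c for c in text if not c.isspace()]
    if not chars:
        return 0.0
    return sum(1 for c in chars if is_cjk(c)) / len(chars)


def looks_injected(text, threshold=0.3):
    """Assumed rule of thumb: flag text above an arbitrary CJK share."""
    return cjk_ratio(text) > threshold


# Hypothetical page texts standing in for your important pages.
pages = {
    "/": "Welcome to our shop for handmade furniture.",
    "/contact/": "ブランド コピー 激安 通販",
}

for path, text in pages.items():
    status = "review" if looks_injected(text) else "ok"
    print(path, status)
```

This only looks at visible characters, so it will miss injected links or scripts that contain no Japanese text; it is a quick first pass before a proper cleanup.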
For the other pages that are not recrawled and reindexed quickly, John said, “if there are hacked traces indexed for your site, but nobody sees them, you don’t need to do anything.” “Old pages will remain indexed for months, they don’t cause any problems if they tend not to be seen; it’s easy to spend a ton of time on them for no visible effects,” he added.
He also said not to worry about the links. “I wouldn’t worry about the links to your site, no need to disavow them. Focus on your site’s content,” he wrote. Google cares more about the content than the links, since Google can ignore those links anyway but it really can’t ignore the content.
If you are still worried about the links, John addressed one common case: “it’s just spammers linking to your search results. I’d block the search results from indexing (robots.txt or noindex). It’ll take a while for those to drop out of search either way. No need to disavow. Having them reported as “indexed but blocked” is fine, they’ll drop out over time. For new / other sites, I’d generally block search results pages from indexing, no need to wait until someone takes advantage of your site like this. And if it happens, try to fix it as soon as you can.”
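As an illustration of the robots.txt option John mentions, here is a small sketch that uses Python’s standard urllib.robotparser module to confirm that a rule blocks internal search result pages. The /search path and the example.com URLs are assumptions; your CMS may serve its search results under a different path.

```python
from urllib.robotparser import RobotFileParser

# Assumed robots.txt rules; /search is a hypothetical path for the
# site's internal search results pages.
ROBOTS_TXT = """\
User-agent: *
Disallow: /search
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def is_blocked(url, user_agent="Googlebot"):
    """True if the parsed robots.txt rules disallow this URL."""
    return not parser.can_fetch(user_agent, url)


for url in (
    "http://example.com/search?q=shoes",
    "http://example.com/about/",
):
    print(url, "blocked" if is_blocked(url) else "allowed")
```

The noindex route John also mentions is usually done with a robots meta tag (content="noindex") on the search results template. A page needs to be crawlable for that tag to be seen, so it is typically one approach or the other for a given URL.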
Being hacked is not fun. I hope none of you go through it, but chances are you will as you gain more experience.
Forum discussion at Reddit.