Does Google care where your content lives on your server? Everyone seems to have an opinion, and I'm just not sure that the location of the content makes a bit of difference. To satisfy my own curiosity and run an actual repeatable test, I will put up a piece of content with an absolutely ridiculous name, so that it can hopefully get indexed and show up as the first result in a Google search. Once the content is indexed, I'll run the search and record the results. My hope is that I will have a good result set that sets the record straight, and I'd be even happier if the test could be replicated.
For this test the only difference between item A and item B is the link - we are testing to see if the placement of our content behind a subdomain has any effect on its indexability or index placement. The content will be exactly the same - only the link and how the content is accessed will be different.
I created a page of gibberish text with our keywords strewn about. Our keywords are "snorkel, alien, chameleon", and the body copy tally for each keyword is "chameleon x 14", "alien x 16" and "snorkel x 14". The page features one image with the alt text "chameleon snorkel alien". The page has been styled minimally, but should be semantic and easily indexed. The meta description and keywords have been set, and so has the title tag. The title for the content will be exactly the same at both addresses: "Chameleon Snorkel Alien". Check out the page - it is glorious.
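A tally like the one above can be checked with a few lines of Python. This is only a sketch: the `sample` text is made up for illustration and is not the actual page copy, and `keyword_tally` is a hypothetical helper name.

```python
import re
from collections import Counter

# The three keywords used in the test page.
KEYWORDS = ("snorkel", "alien", "chameleon")

def keyword_tally(text, keywords=KEYWORDS):
    """Count whole-word, case-insensitive occurrences of each keyword."""
    words = re.findall(r"[a-z]+", text.lower())
    counts = Counter(words)
    return {kw: counts[kw] for kw in keywords}

# Illustrative stand-in text, NOT the real body copy of the page.
sample = "The alien wore a snorkel. A chameleon watched the alien snorkel away."
tally = keyword_tally(sample)
print(tally)
```

Running the same function over the real body copy should reproduce the per-keyword counts quoted above, provided plurals and other word forms are not meant to be counted.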
The page will be set up so that it is accessible both at the subdomain and at the root of the domain.
I will post a link on the index of the top level domain. I will also make two social media posts on Google+ that are identical except for the links included. The content should get indexed soon thereafter.
I will link to the page at the root URL "http://file-drive.com/alien-snorkel-chameleon.html" and at the subdomain URL "http://alien.file-drive.com/alien-snorkel-chameleon.html". The links on the index page and the text of the two social posts follow.
<a href="http://alien.file-drive.com/alien-snorkel-chameleon.html">Alien Snorkel Chameleon</a><br>
<a href="http://file-drive.com/alien-snorkel-chameleon.html">Alien Snorkel Chameleon</a>
Checkout This Sweet Alien Snorkel Chameleon http://alien.file-drive.com/alien-snorkel-chameleon.html
Checkout This Sweet Alien Snorkel Chameleon http://file-drive.com/alien-snorkel-chameleon.html
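To make the "only the link differs" claim concrete, here is a small sketch that splits the two URLs and compares their parts. The `describe` helper is hypothetical, and the registered domain is passed in by hand rather than looked up in a public suffix list.

```python
from urllib.parse import urlsplit

ROOT_URL = "http://file-drive.com/alien-snorkel-chameleon.html"
SUB_URL = "http://alien.file-drive.com/alien-snorkel-chameleon.html"

def describe(url, registered_domain="file-drive.com"):
    """Split a URL into host, subdomain label(s) and path.

    Assumes the registered domain is known in advance.
    """
    parts = urlsplit(url)
    host = parts.hostname or ""
    if host == registered_domain:
        subdomain = ""
    elif host.endswith("." + registered_domain):
        # Strip the trailing ".file-drive.com" to leave just the subdomain.
        subdomain = host[: -len(registered_domain) - 1]
    else:
        raise ValueError(f"{url} is not on {registered_domain}")
    return {"host": host, "subdomain": subdomain, "path": parts.path}

root = describe(ROOT_URL)
sub = describe(SUB_URL)
print(root)
print(sub)
print("same path:", root["path"] == sub["path"])
```

The path is identical for both URLs; only the host changes, which is the single variable this test is meant to isolate.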
The links on the index and the social posts were made within 15 minutes of each other. The page was available on the server about 30 minutes before.
Once the content is indexed I will go to Google, search "Chameleon Snorkel Alien" and record the results. There are three outcomes I am watching for.
Omitted Results: If Google sees the content as redundant it could remove one of the results. If this happens we should take note of which version of the content gets omitted. If the alien subdomain is retained but the root is omitted, that would suggest that the subdomain vs root domain structure is unimportant, and that content is the key to unique indexing.
Double Index: If Google truly sees the subdomain as a separate site from the root domain then there is a chance that both links to the same content will appear. If both links get indexed, it is important to take note of the link precedence. If the root link is displayed first, then the root level domain is more important than subdomains, so all important content should be placed in the root and use of subdomains should be avoided. If the subdomain result is higher, then perhaps a subdomain is just as important. A double index could also reveal whether the order in which the content is linked matters, and would warrant further testing.
Single Index: If Google only indexes one result then we should take note of which one is indexed. Since we posted both versions the same way, we should expect the same type of indexing for each. It could be that the other result is omitted as duplicate content and Google simply isn't saying so. Another possibility is that the root takes precedence and the subdomain is disregarded since it isn't the root. If the subdomain version is indexed but the root isn't, then the value placed on the root is unfounded and further testing might be needed.
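The three scenarios above can be written down as a small classifier, which makes it easier to log each manual search the same way. This is a sketch under assumptions: `classify_outcome` and the `omitted_notice` flag are hypothetical names, the result URLs are typed in by hand from the results page, and nothing here queries Google.

```python
from urllib.parse import urlsplit

# Hosts treated as "root" and "subdomain" for this test (an assumption:
# the www host is counted as the root).
ROOT_HOSTS = {"file-drive.com", "www.file-drive.com"}
SUB_HOST = "alien.file-drive.com"

def classify_outcome(result_urls, omitted_notice=False):
    """Map a list of result URLs onto the scenarios described above.

    omitted_notice: True if the results page said similar results
    were omitted.
    """
    hosts = {urlsplit(url).hostname for url in result_urls}
    has_root = bool(hosts & ROOT_HOSTS)
    has_sub = SUB_HOST in hosts
    if has_root and has_sub:
        return "double index"
    if has_root or has_sub:
        return "omitted results" if omitted_notice else "single index"
    return "not indexed"

print(classify_outcome(["http://alien.file-drive.com/alien-snorkel-chameleon.html"]))
```

Note that "omitted results" and "single index" can only be told apart by whether the results page mentions omitted entries, which is why that is passed in as a separate flag.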
Ultimately Google is somewhat of a black box and it can be extremely difficult to predict an outcome.
I don't believe that content on the root will be favored substantially above content on the subdomain. I believe that Google should treat it as it does all content: it will see the link, spider the content and index it accordingly. I've heard that Google will treat subdomains as completely separate sites, and if that is the case I would guess that Google will index the content twice. My bet is that Google will see the two links, see that they are on the same top level domain, see that the content is the same and then omit one of the references. My guess is that whichever link is hit first will remain and the second will be omitted.
Content Created: 2017-07-31 9:31 am
Content Uploaded: 2017-07-31 9:51 am
Index Link Uploaded: 2017-07-31 10:02 am
Social Posts: 2017-07-31 10:08 am
Google Search: 2017-07-31 10:20 am "chameleon snorkel alien" - not in results
Google Search: 2017-08-01 3:57 pm "chameleon snorkel alien" - social posts show up
Google Search: 2017-08-02 8:42 am "chameleon snorkel alien" - nothing shows up in results. I'm considering another link or post to support the pages. I'd like to avoid submitting them directly to Google, although that is on the table too. Will wait it out for the moment.
Google Search: 2017-08-07 3:11 pm "chameleon snorkel alien" - The Google+ social posts are back, but nothing else is listed. Getting these pages indexed is not easy without some major site linking to them. Adding links via https://www.google.com/webmasters/tools/submit-url?continue=/addurl - will see if that makes any difference.
Google Add Url: 2017-08-07 4:00 pm - Added both http://www.file-drive.com/alien-snorkel-chameleon.html and http://alien.file-drive.com/alien-snorkel-chameleon.html through the individual submission tool.
Google Search: 2017-08-08 8:24 am - searched "chameleon snorkel alien". Our result is number 1 in search. Which link is first? The subdomain link, and the naked domain link isn't listed. Will check back over the next few days to ensure that only one link is returned for the query. Below is a picture of the search result.
Google Search Site Index: 2017-08-08 8:31 am "site:file-drive.com" - I did a site index search to see what pages were indexed. Both pages show up as indexed. So why aren't they both showing up in search results?
Content Created: 2017-08-08 5pm - Created a new piece of content called Skunk Guitar Butler. I linked to the content two ways on the index.html of file-drive.com. The order of the links this time is http://www.file-drive.com/a123123.html and then http://guitar.file-drive.com/a123123.html. I want to see if Google gives precedence to whichever link comes first.
Content Uploaded: 2017-08-08 5:15 pm
Google Add Url request: 2017-08-08 5:25 pm - http://file-drive.com - I want to see if I can get the content indexed without submitting the actual content URLs to Google.com/addurl this time.
Google Search: 2017-08-09 11:13 am - "skunk guitar butler" - www.file-drive.com/a123123.html shows up first in results. That was fast! This is the first of the two links listed on the index, which supports the idea that whichever link is listed first influences which link is displayed in query results. The other link is not present in the query results.
Google Search: 2017-08-09 11:24 am - searched "chameleon snorkel alien". The index has changed again. Our result isn't 1st any longer; we are second. The subdomain link that was displaying yesterday is now gone, and our content from the root is displaying instead. The subdomain link is no longer the top link on the index page, which could explain the change in the results.
Google Site Index: 2017-08-09 11:24 am "site:file-drive.com" - All unique content is showing up, however our subdomain links have been omitted. This supports the idea that subdomains do not hold the same precedence as the root. I imagine that unique content at the subdomain would be treated evenly if there were no root link option, but if Google has a choice between root and subdomain then the subdomain will lose.
Google Search: 2017-09-05 8:24 am - "chameleon snorkel alien" and "skunk guitar butler" - Both Google searches returned number 1 results. The winning URL in each case is the root domain with the resource listed after it.
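As a rough way to read the log, the gap between each Add Url request and the first search that showed our page can be computed directly from the timestamps above. This is a sketch: the times are copied from the log and converted to 24-hour form by hand, and the gaps are only upper bounds, since the searches were manual spot checks rather than continuous monitoring.

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M"  # log times rewritten in 24-hour form

def elapsed(start, end):
    """Return the time between two log timestamps as a timedelta."""
    return datetime.strptime(end, FMT) - datetime.strptime(start, FMT)

# Test 1: both content URLs submitted 2017-08-07 4:00 pm,
# page first seen in results 2017-08-08 8:24 am.
first_test = elapsed("2017-08-07 16:00", "2017-08-08 08:24")

# Test 2: only http://file-drive.com submitted 2017-08-08 5:25 pm,
# page first seen in results 2017-08-09 11:13 am.
second_test = elapsed("2017-08-08 17:25", "2017-08-09 11:13")

for name, gap in (("chameleon snorkel alien", first_test),
                  ("skunk guitar butler", second_test)):
    hours = gap.total_seconds() / 3600
    print(f"{name}: first seen within {hours:.1f} hours of the request")
```

Both pages were first seen within a day of the request, whether the content URLs themselves or only the index page was submitted.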
I will update with results as they come in.
I think it is safe to reach a conclusion at this point. First off, if a link appears on existing, indexed content, Google will index it. If that content hasn't been crawled in ages then it might take a while. Posting the link on social media might give it a boost, but the social result may only be indexed temporarily and disappear once the post gets old. That seemed to happen here: the social posts were indexed really fast, then dropped out once they became old. The link from the root of the domain remained over time. For our pages to stay indexed, we had to link to the content from our root.
Once the pages were indexed it didn't seem to matter what the link looked like; it could be a subdomain or a root domain page. However, when Google was given the choice, it seemed to favor the root domain link. You could say that Google has a preference for the root domain link, but I can't agree completely. I imagine that Google is more interested in weeding out duplicate links and content. Though the links were different, the content was the same, so it dropped one of them, and when given a choice between a root level link and a subdomain link it removed the subdomain. That doesn't mean it didn't index the subdomain. In fact it did, but once it saw that the content was the same it removed the subdomain.
The reason for this might be that we had two pieces of content on the root, while each subdomain held only a single piece of content. If a subdomain had more indexed content than the root, the subdomain might win. Basically, if site A and site B have the same content, and A has 10 pages while B has 7, then A wins. I don't know, but it might be worth investigating.
We also created images for our pages, with the title of the article as the alt text. Both Google searches returned image results on the first page, so our content showed up first in both places, images and content. While the exposure is nice, a user who clicks the image would then have to click the image again, or the title in the Google image viewer, to reach our page. It seems like a no-brainer to use the alt text for the double exposure potential, even if a user has to click twice.