Gary Illyes, Google’s Webmaster Trends Analyst since 2011, is kind of a big deal. Many in the world of search are dubbing him the new Matt Cutts, but whether or not this is the case, there are few people on the planet who know as much about what’s going on at Google as Mr. Illyes.
When he’s not (quote, unquote) “creating a better search experience for users by helping webmasters create amazing websites”, crunching data or reinventing search, you might find Gary helping (or trolling) users on Twitter, jumping out of moving planes, or talking about chowing down on an Australian marsupial or two. Culinary tastes aside, Gary’s the man with all the answers on Google’s algorithms.
We caught up with Gary at Big Digital Adelaide to try to trip him up and spill the beans.
This post has been lightly edited for clarity. Gary’s opinions are all his own and don’t necessarily represent those of Kwasi Studios or Google.
Woj: Welcome to Australia, Gary.
Gary: Thank you for the very warm introduction.
Woj: No problem. So let’s go back to the beginning. Well, not too far back.
What’s your first memory of the internet?
Gary: 28K modem sound.
Woj: *Beep boop boop*
Gary: Yeah, exactly. With my brother, we learned to sing that, to whistle it, because we realised that it always followed the same pattern, except when there were troubles with the line and then the beeps were different. But yeah, that’s my first memory. And of course we were teenagers and we had to download pictures and how the pictures were rendering on the computer. I mean, pictures of cats.
Woj: Yeah, of course.
Gary: They took a very long time to load and I remember how frustrating that was. But we were mostly living in Romania and they have amazing internet.
Gary: Yeah, for a very, very long time, probably for 20 years by now. The first time I was in the U.S. they were still struggling with half a megabit DSL and by that time in Romania, 100 megabit synchronous connection was pretty common.
Woj: Oh, wow. And your background is being an online journalism teacher.
What are your thoughts about Google Podium?
Gary: Personally I think it’s an interesting way to present content. It’s a very cool idea, I think, and it can easily satisfy the information needs of the users. I don’t know where it will go or what will happen to it. We like to experiment with these kinds of things and just see how it spins out.
Woj: What will happen to old content (e.g. news) with Google’s Podium instant articles feature? Do you see the content published direct to the search results as having potential value beyond its initial seven day lifespan, and will it be searchable at all?
Gary: I would hope that, in some sense, the posts would be archived. I think that it’s very good for humanity to preserve knowledge. We wouldn’t know much about history if that wasn’t humanity’s priority to archive things. So I hope that we are continuing with that.
Woj: We’ve got to make sure that future generations have access to all these wonderful historical snapshots of our lives.
Gary: Yeah, it’s not like I’m publishing stuff about what I was doing in the early morning after coffee.
Woj: Yeah – and also, as per the earlier disclaimer: these thoughts are Gary’s and not representative of Google.
Gary: I mean, some of them are but in many cases, probably I will not have a good answer from Google’s side because I just don’t work on the product, for example, like Google Post. In those cases, instead of saying that I don’t have an answer and then stop, I would prefer to just say something that is my personal opinion.
Woj: Sure. So Google’s mission is to organise the world’s information and make it universally accessible and useful. Aside from minor punctuation, this mission statement has not changed since the company’s inauguration in 1998.
What has changed is the search engine has grown into a colossal machine learning monolith, almost always delivering the right answer and in one eighth of a second. This implies that Google doesn’t want low quality in their results, and the technology they use to separate the wheat from the chaff has improved.
What are some ways that webmasters can change their behaviour to focus on providing a better user experience?
Gary: Our goal was always to provide the most relevant results to our users and that hasn’t changed over the years. The correct answer…well, not correct, but the relevant answer could be from a local site as well. It doesn’t have to be from a high quality site, but in most cases it will be.
In general, I hope that publishers will try hard to create high quality sites instead of trying to sell Viagra in a Canadian casino without a prescription, and we will try to reward those sites by presenting them in our search results.
Basically you want to create high quality sites following our webmaster guidelines, and focus on the user, try to answer the user, try to satisfy the user, and all eyes will follow.
Woj: Which doesn’t include manipulation and all of these sort of tactics that SEOs often want to take as shortcuts.
Gary: Going out and buying 5,000 links from a Russian spammer, it’s not a good idea in general; but sure, it could work for a few hours.
Woj: Google’s very good at detecting patterns. The way I see it, Google is trying to replicate the real world. The real world is full of entities and relationships between them.
How important is it to give Google clues via semantic markup or is Google pretty good at working out the clues themselves?
Gary: I think the answer is both. We are pretty good at triangulating answers or facts in general. We have multiple sources of information to validate its correctness for the knowledge vault. Schema Markup, for example, is also a very important part of search.
Woj: Is that a signal that, if found, kind of precedes other signals? Is that advantageous?
Gary: It’s not very visible in general. For a rich snippet, for example, it is. For recipes, it is. For movie reviews, it is. But other than that, typically webmasters or content producers will not see a clear benefit. It’s more about making sure that search engines understand the content well.
So for example if you are talking in your pages about Apple, then that’s quite ambiguous unless you specify somehow which entity are you talking about. Are you talking about the fruit or the company? And one way to do that is to have Schema Markup on your pages.
Woj: So it’s beneficial when disambiguating terms?
Gary: Yes, it can be. Yes. And generally, it’s good for search engines because they will better understand the content of the page.
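Editor’s Note: To make the “Apple” example concrete, here is a minimal sketch of how a page might declare which entity it means using schema.org JSON-LD. The helper names `entity_jsonld` and `as_script_tag` are our own, the Wikipedia URLs are illustrative choices for the `sameAs` property, and nothing here reflects how Google actually consumes the markup.

```python
import json


def entity_jsonld(schema_type, name, same_as):
    """Build a schema.org JSON-LD object that identifies a single entity.

    `sameAs` points at a reference page for the entity, which is what
    tells two things with the same name apart.
    """
    return {
        "@context": "https://schema.org",
        "@type": schema_type,
        "name": name,
        "sameAs": same_as,
    }


def as_script_tag(data):
    """Wrap the JSON-LD in the script tag used to embed it in an HTML page."""
    body = json.dumps(data, indent=2)
    return '<script type="application/ld+json">\n' + body + "\n</script>"


# Same name, two different entities (illustrative reference URLs).
apple_company = entity_jsonld(
    "Organization", "Apple", "https://en.wikipedia.org/wiki/Apple_Inc."
)
apple_fruit = entity_jsonld(
    "Thing", "Apple", "https://en.wikipedia.org/wiki/Apple"
)

print(as_script_tag(apple_company))
```

Both objects share the name “Apple”; only the type and the `sameAs` reference distinguish the company from the fruit.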
Woj: RankBrain has been mentioned as the third most important ranking factor. You’ve said that it lets Google understand queries better. No effect on crawling or indexing or ranking.
Can you explain how RankBrain lets Google understand queries better? And how it fits within the core algo?
Gary: Is that all?
Gary: You want fries with that?
Woj: Yes, please. Do you have a diagram?
Gary: So yes, RankBrain. Its importance depends on which way you look at it. It’s an important ranking factor because it affects pretty much all queries. In many cases, it will not do anything for the query stack because the results are already ranked by the core ranking algorithm. But for queries that we haven’t seen before – really long complex queries – it can produce very good predictions about what will work best for the user.
What it does is look at the query based on previously fed training data and try to make a prediction from the results set in order to provide results that work best for a specific query. It’s also really good at getting negative queries right.
So for example, “Can I beat Mario without using a walk-through?” Traditionally, for our algorithm, it was quite hard to understand “without” in the query and typically it was dropping it. With RankBrain, we do a better job at these kinds of queries.
Lemme try one last time: Rankbrain lets us understand queries better. No affect on crawling nor indexing or replace anything in ranking
— Gary Illyes (@methode) March 18, 2016
Woj: So it’s learning from users?
Gary: It’s an offline learning algorithm. It’s refreshed every now and then with new training data.
Woj: Sure. And potentially your favourite question:
Why do you think Larry Kim’s thoughts on how Machine Learning-enabled algorithms leverage user engagement signals are far-fetched?
Gary: I don’t actually have anything against Larry personally. I typically think that people should try to stick to what they’re experts on and shouldn’t try to make predictions about something they don’t understand. If I’m good at AdWords, for example, or PPC in general, then I will try to stick to that. But I’m not. I’m good at search so I’m only talking about search. And I think that people should try to do that. Or if they want to talk about something that they don’t understand, then dig into the topic, learn how it works, and go from there.
Editor’s Note: Larry Kim has written a brief response to Gary’s comments, which you can read at the end of this post.
Eric Enge and his team wrote really good articles on machine learning and even RankBrain, and the way they did that was by doing months worth of research on machine learning and they took the time to actually write in to us and fact check what they were saying. Same happened with Jennifer Slegg and her Panda article. She wrote to us. We would much rather have accurate information out there, rather than really weird predictions that are shoved down people’s throats. I haven’t been following what Rand’s doing on machine learning, but I think in many cases his predictions and observations are very clear, very sane.
Is it true that Google considers the pogo stick effect, as well as click through rates when ranking results?
Gary: This is a very popular question. I get this pretty much at every single conference.
Clicks in general are a very noisy signal. I worked on trying to make observations from click data. It’s like a Gordian knot. Because there are tons of people who are scraping the results and trying to fetch ranking data, and for whatever reason, they also decide to click on things automatically. Links. It’s just a huge mess.
When we have controlled experiments, then obviously we have to look at click data. Before we launch a ranking change, typically what we do is to isolate 1% of the users and give them modified search results, modified by the new ranking algorithm or a piece of the algorithm and see how they like the new results. And in these instances, we do look for long clicks, short clicks, and so on. But in general, as I said, it’s a huge mess. When it comes to personalisation, we like to use click data because it’s…
Woj: It’s clearer.
Gary: It’s clearer. If you have a user who wants to mess up their own search results, it’s like, “Sure, go ahead.”
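Editor’s Note: The controlled experiment Gary describes (show modified results to about 1% of users, then compare long and short clicks) can be sketched roughly as follows. This is our own illustration, not Google’s implementation: the hash-based bucketing, the function names and the 30-second “long click” cut-off are all assumptions.

```python
import hashlib

EXPERIMENT_PERCENT = 1      # share of users who see the modified results
LONG_CLICK_SECONDS = 30.0   # hypothetical dwell-time cut-off for a "long click"


def in_experiment(user_id, experiment_name, percent=EXPERIMENT_PERCENT):
    """Deterministically place a user in one of 100 buckets.

    Hashing the experiment name together with the user id keeps a user in
    the same group on every visit, and keeps experiments independent.
    """
    key = f"{experiment_name}:{user_id}".encode("utf-8")
    bucket = int(hashlib.sha256(key).hexdigest(), 16) % 100
    return bucket < percent


def long_click_rate(dwell_times):
    """Fraction of clicks where the user stayed at least LONG_CLICK_SECONDS."""
    if not dwell_times:
        return 0.0
    long_clicks = sum(1 for t in dwell_times if t >= LONG_CLICK_SECONDS)
    return long_clicks / len(dwell_times)


# Roughly 1% of a made-up user population lands in the experiment group.
users = [f"user-{i}" for i in range(20_000)]
treated = [u for u in users if in_experiment(u, "new-ranking")]
print(f"{len(treated) / len(users):.2%} of users see the new ranking")

# Compare click quality between the groups (dwell times in seconds, invented).
print(long_click_rate([5, 45, 120, 10]))
```

In a real test, the long-click rate of the experiment group would be compared against the control group before deciding whether to launch the change.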
Woj: So I guess Google’s got a pretty accurate snapshot of the internet.
What percentage of the internet is human, as opposed to bots?
Gary: Okay, that’s an interesting question.
Typically the percentage for big data is 30 to 70.
Woj: Humans to robots?
Woj: Wow, and is it the same on mobile and desktop?
Gary: Yeah, it is actually.
Woj: Interesting. So there’s a lot of bots out there.
Gary: Oh yeah.
So we continuously try to throttle the bots that are scraping our results, and we sometimes even mess with the search results just to screw them. It’s not a huge problem for us, but it’s something we keep an eye on. If they’re not too aggressive we typically don’t take action against them. But if they get aggressive and damage the search experience for our users, then we block them.
What sort of things or signals might Google look at to work out a user’s intent?
Gary: Typically history. I think my best example is “apple” or “Amazon” or “orange” or “python.” I was just playing with a python a few days ago on Kangaroo Island and…
Woj: You were programming?
Gary: And I wasn’t programming.
Gary: Exactly. If we see that a user previously was more interested in the programming language than the animal, then we would favour programming related results when the user searches for “python.”
Woj: Is that data measured on cookie calls, Google+ accounts, Chrome usage, or a combination of all these different things?
Gary: To the best of my knowledge, it’s coming from search results.
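Editor’s Note: As a toy illustration of the “python” example, the sketch below guesses which sense of an ambiguous query a user prefers from their past searches and moves matching results to the top. The hint words, the sample history and the result list are invented for the example, and the approach is a deliberately simple stand-in rather than a description of Google’s personalisation.

```python
from collections import Counter

# Hypothetical hint words for each sense of the ambiguous query "python".
PYTHON_SENSES = {
    "programming": {"code", "django", "pip", "script", "tutorial"},
    "animal": {"snake", "reptile", "zoo", "habitat"},
}


def preferred_sense(history, senses):
    """Return the sense whose hint words appear most often in past queries.

    Returns None when the history gives no evidence either way.
    """
    counts = Counter()
    for query in history:
        words = set(query.lower().split())
        for sense, hints in senses.items():
            counts[sense] += len(words & hints)
    if not counts or max(counts.values()) == 0:
        return None
    return counts.most_common(1)[0][0]


def rerank(results, sense):
    """Move results matching the preferred sense to the front.

    sorted() is stable, so the original order is kept within each group.
    """
    return sorted(results, key=lambda r: r["sense"] != sense)


history = ["django tutorial", "pip install error", "zoo opening hours"]
results = [
    {"title": "Python (snake)", "sense": "animal"},
    {"title": "Python programming language", "sense": "programming"},
    {"title": "Python habitat guide", "sense": "animal"},
]

sense = preferred_sense(history, PYTHON_SENSES)
for result in rerank(results, sense):
    print(result["title"])
```

With no usable history, `preferred_sense` returns None and `rerank` leaves the original order untouched.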
In February last year, you mentioned that Google are working on being more transparent. Is this a priority for Google? And where do you think Google still needs to make progress on being transparent?
Gary: That’s a very broad question. I think transparency is very important, but we also have to make sure that we are not compromising our own operations by being transparent.
We were very transparent about what we’re doing with mobile. We continue to talk about mobile and what we are going to do with it, something that we typically never did. We never pre-announced anything until it worked because launching stuff at Google is not easy and there are many things that can go wrong. With mobile, we want things to be constantly moving forward.
And we did pre-announce things, be it the mobile friendly ranking changes, AMP, app indexing. At Google I/O, we had quite a few announcements that were pre-announcements actually. Whether we can improve? Of course. There’s always room for improvement. We’re working on it. We’re trying to involve more publications in our press reach for example, not just from the US, but from other countries. We’re expanding our press reach or blogger reach.
Are there any areas that Google’s not ever going to be transparent in?
Gary: Yes. If you think about it:
If we disclose stuff about the core ranking algorithm, then we would be in deep crap. A very good example of this is the PageRank algorithm which is part of the core ranking algorithm. We were 100% transparent about it from the start, and that’s how the spamming started.
Was it a good idea? We did start a new industry, I guess, but…
Woj: The link economy.
Woj: Is Google ever going to do something like a Google University or maybe work with universities to create some kind of accreditation? Not necessarily to understand how to reverse engineer the algorithm but at least make the web a safer and more user-friendly place?
Gary: We do sometimes go to universities and hold talks about how search works. But accreditation, I very strongly doubt that we would ever have that. I mean we have Google Partners accreditation, which is for AdWords and Analytics, but even that is abused by certain people who make claims that they can get businesses first position ranking because they’re Google partners.
Woj: Which is against the guidelines.
Gary: Yeah. You don’t want to work with those companies.
Woj: Yeah, it just differs across different device types, personalisation, proximity…
Are some types of businesses more likely to benefit from investing in and developing a mobile app rather than a mobile website?
Gary: I guess the answer is yes. There are always exceptions.
In my view, I think mobile websites are more important in general because in apps there’s always that “plus”, that additional step that you have to download the app and install it.
With instant apps, probably this will change a little but still you will have to download a piece of that app and have the phone install it before you will be able to access information from the app. With websites, you don’t have this. With websites, you have content as soon as possible. Hopefully as soon as you touch a result or a link in Facebook or whatever.
Woj: Thank you for bringing Accelerated Mobile Pages (AMP) to Australia on your travels. I saw Jennifer’s tweet while you were on Kangaroo Island. Good timing…
Under what circumstances should a business consider AMP?
Gary: I think AMP becomes more and more important for us. We live on a web that is utterly slow, especially if you’re trying to access content from other countries that don’t have local edge servers, then you will wait long seconds before anything will load. My favourite news site, for example, loads here in Australia in about 20 seconds. The fact that my data has to traverse two oceans and at least three continents, it slows down the page or the perceived load time. With AMP, content owners can avoid that because content will be cached locally or close to the user at least on edge servers and the amplified pages are also much, much slimmer. They are like a few hundred kilobytes instead of like eight megabytes with images.
Woj: It makes a big difference.
Gary: It makes a huge difference.
Woj: You’ve mentioned that voice searches are definitely growing.
How do phrases and terminology used in voice differ to written texts?
Gary: Well, if you think about it, when you search for something using text you tend to be very brief with your query because people don’t like to type, and also typing on a mobile phone is not a fun thing to do. But with voice queries, people actually say full questions or full sentences. So voice queries tend to be much longer, and they tend to be delivered using natural language, instead of just a few keywords. And that changes things. We’d like to think that we are able to retrieve the same results for short, keyword queries as we are for the same question delivered in natural language. We like to think that this wouldn’t affect queries that much. But this would be an interesting SEO experiment for someone to run.
What are your favourite ways to troll people on Twitter?
Woj: Have you ever said something and thought, “That’s a bit harsh.”
Gary: Yes, I tend to be pretty blunt. In general I don’t like hiding behind PR content I guess. I can say things nicely, but I think in general, it’s better if people can clearly see that I’m not happy, instead of hoping that they can read something into the tweet.
Woj: Yeah, often people would say they read the reverse implication of the answer. So it’s like the answer may not be direct, so they automatically correlate that with the opposite. I guess you must experience that a lot and how do you silence them or…
Gary: I don’t.
Woj: You don’t? You just let them play?
Gary: Yeah. Obviously if I’ve said something nasty, which doesn’t happen often, then I would apologise. But typically I don’t get angry or nasty for no reason.
Woj: Yeah. There was a bit of a gap between Matt Cutts when he ceased to be the public facing person on tweets.
Why did it take Google so long to appoint someone, such as yourself, as the new spokesperson?
Gary: So one thing is that Matt wasn’t replaced by a single person. Matt was replaced by a team. The other thing is that we were pretty much always there, but we didn’t get much exposure. We saw a need for having people out there who can answer webmasters’ and content owners’ questions, and we started being more active on Twitter and on social media. We ramped up on our official webmaster console account (WMC) and we have lots of people working on that; it’s not a single person doing it. In the same way, I’m not the only person who’s going to conferences. For whatever reason, people decided that they would retweet my tweets more often, but for example, John Mueller has a much larger follower base than I have.
Woj: Yeah, that’s interesting. Sorry, I’ve thrown in a few extra questions on the fly. I hope they’re all okay?
What do you think of other search engines like Wolfram Alpha and DuckDuckGo?
Gary: It’s always good to have competition because it helps us stay relevant and innovate more.
Woj: Good answer.
What’s something you want to learn personally?
Gary: Mandarin.
Woj: Mandarin?! Ah-ha! You’re going to work for Baidu!
Gary: That was a shrug. That’s why I didn’t say anything.
Finally, where do you see the internet in 2020? How will virtual reality and the Internet of Things impact our lives?
Gary: So those are two questions actually.
Gary: Internet of Things, I like the idea of having the internet on me and being able to get answers very fast.
What I’m afraid of is that people will become dumber because they don’t have to think as much because there’s the internet that will answer all their questions. I’m also afraid that people will become siloed, I guess, because of how information is presented.
I know that at least Google is paying attention to these things and we’re trying to make sure that the Internet of Things, from our side at least, will be epic and awesome. But there are many players, and I just hope that everyone is considering the negative effects of the Internet of Things.
As for Virtual Reality, I don’t know where VR is going. It’s definitely a very interesting topic and I do see lots of potential in it. It does remind me a little bit of The Matrix, and I want to make sure that we’re not in fact creating another Matrix.
Do you think there will ever be a virtual search experience?
Gary: It’s an interesting idea. I think it’s an idea that we definitely want to experiment with and see what can we do, if anything. I don’t know if anything will happen, maybe because I can’t see myself sitting with a VR headset on all day long to do searches. But I guess it would work in augmented.
Woj: Yeah, maybe in context with other apps?
I saw that you got to cuddle a wombat. Did you get to eat one?
Gary: No, I’m only here to analyze. I’m a trends analyst, so I’m analyzing why sometimes wombats are crossing the road. But other than that, no. But I learned that they are edible, in fact. And aboriginals are trading wombat meat and they are eating wombats.
Gary: I don’t know if I would like to try that but…
Woj: But you did try crocodile and kangaroo last night.
Gary: Yeah, so crocodile wasn’t a first for me. Well, actually I had alligator, not crocodile. But it’s pretty much the same. And yeah, Kangaroo, it’s a lean, good meat. But it’s not something that I would go out of my way to find.
Woj: Especially outside of Australia.
Gary: Well, actually you’d be surprised how many places there are that sell kangaroo meat. I know that in Zurich we have it in our office sometimes.
Woj: That’s probably something that people didn’t know.
Woj: Well thanks Gary. Thank you for your time. I really appreciate it.
Gary: Thank you for your questions.
Editor’s Note: We reached out to Larry Kim for a response to Illyes’s comments, which he has allowed us to publish here:
I’ve written a lot about machine learning based search algorithms recently, which can be summarized as follows:
- The new ML-algos by definition need training data in order to learn and produce better search results.
- Our research indicates that user engagement signals like CTR and dwell time are being used to audition keywords/content to determine if the intent of the user search was met or not. The research indicates that this impacts training data, which impacts future search results.
- Search marketers should therefore pay increased attention to user engagement metrics when optimizing content.
It’s unfortunate that Gary is resorting to personal attacks here rather than confirming or denying specific points. I think this is (a) unprofessional and (b) makes me more confident about my research as I kind of view these official Google PR endorsements as contrary indicators…