Introduction to How Google Works
Introduction
• In its infancy, the Internet wasn’t what you think of when you use it now. In fact,
it was nothing like the web of interconnected sites that has become one of the
greatest business facilitators of our time.
• To find a specific file in that scattered collection of files, users had to navigate through each one.
Sure, there were shortcuts. If you knew the right people — that would be the
people who knew the exact address of the file you were looking for — you could
go straight to the file. That’s assuming you knew exactly what you were looking
for.
History of Search Engines
•The whole process made finding files on the Internet a difficult,
time-consuming exercise in patience; but that was before a student
at McGill University in Montreal decided there had to be an easier
way.
• That student, Alan Emtage, created Archie in 1990. Archie helped solve this data-scatter problem by combining a script-based data gatherer with a regular-expression matcher that retrieved file names matching a user query.
• Archie's search capabilities weren't as sophisticated as the natural-language capabilities you find in most common search engines today, but at the time it got the job done.
First Search Engines
• The first real search engine, in the form that we know search engines today, didn't come into being until 1993. Developed by Matthew Gray, it was called Wandex.
• Wandex was the first program to both index and search the index of pages on the Web. This technology was the first program to crawl the Web, and it later became the basis for all search crawlers. After that, search engines took on a life of their own. From 1993 to 1998, the major search engines that you're probably familiar with today were created:
• Excite (1993)
• Yahoo! (1994)
• WebCrawler (1994)
• Lycos (1994)
• Infoseek (1995)
• AltaVista (1995)
• Inktomi (1996)
• Ask Jeeves (1997)
• Google (1997)
• MSN Search (1998)
• Bing followed later, in 2009.
Be Aware!
•There are thousands of bloggers and journalists
spreading volumes of information that simply isn't true.
If you followed all the advice about SEO written on
blogs, it's unlikely you would receive top listings in
Google, and there’s a risk you could damage your site
performance and make it difficult to rank at all.
Exercise
• Go to google.com
• Type: how to pass an exam
• After you do this, focus on the results:
• How many websites did Google find?
• How long did it take to find the results?
• Why did some websites appear on Google's top page while others appeared on page 10?
Anatomy of a Search Engine
• A search engine is a piece of software that uses algorithms to find and collect information
about web pages. The information collected is usually keywords or phrases that are possible
indicators of what is contained on the web page as a whole, the URL of the page, the code
that makes up the page, and links into and out of the page. That information is then indexed
and stored in a database.
•On the front end, the software has a user interface where users enter a search term — a word
or phrase — in an attempt to find specific information. When the user clicks a search button,
an algorithm then examines the information stored in the back-end database and retrieves
links to web pages that appear to match the search term the user entered.
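• To make the two halves concrete, here is a minimal Python sketch of what the slide describes: a back end that indexes keywords per URL, and a front-end lookup that matches a search term against that index. The pages, URLs, and query are invented for illustration; no real engine works at this scale or this simply.

# Minimal sketch: a keyword index (back end) and a query lookup (front end).
# All page data here is invented for illustration.
from collections import defaultdict

pages = {
    "http://example.com/seo": "seo search engine optimization guide",
    "http://example.com/cars": "mini car review and buying guide",
}

# Back end: map each keyword to the set of URLs containing it.
index = defaultdict(set)
for url, text in pages.items():
    for word in text.lower().split():
        index[word].add(url)

# Front end: return URLs matching every word of the search term.
def search(term):
    words = term.lower().split()
    if not words:
        return []
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return sorted(results)

print(search("seo guide"))  # ['http://example.com/seo']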
Anatomy of a Search Engine
1. Query interface
•The query interface is what most people are
familiar with, and it’s probably what comes to
mind when you hear the term "search engine."
The query interface is the page, or user
interface, that users see when they navigate to a
search engine to enter a search term.
How about Yahoo!, which is a portal rather than a pure search engine?
Anatomy of a Search Engine
2.Search engine results pages
• The other side of the query interface, and the only other part of a search engine visible to users, is the search engine results pages (SERPs): the collection of pages returned with search results after a user enters a search
term or phrase and clicks the Search button. This is also where you ultimately want
to end up; and the higher you are in the search results, the more traffic you can
expect to generate from search. Specifically, your goal is to end up on the first
page of results — in the top 10 or 20 results that are returned for a given search
term or phrase.
Anatomy of a Search Engine
2.Search engine results pages
•Based on previous exercise, What’s the first thing you do when the page appears?
•There is no magic bullet or formula that will garner you those rankings every time.
Instead, it takes hard work and consistent effort to push your site as high as possible
in SERPs. At the risk of sounding repetitive, that’s the information you’ll find
moving forward. There’s a lot of it, though, and to truly understand how to land
good placement in SERPs, you really need to understand how search engines work.
There is much more to them than what users see.
Anatomy of a Search Engine
3.Crawlers, spiders, and robots
• The query interface and search results pages truly are the only parts of a search
engine that the user ever sees.(Front end)
• In fact, what’s in the back end is the most important part of the search engine,
and it’s what determines how you show up in the front end.
• Spiders, crawlers, and robots are programs that systematically crawl the Web,
cataloguing data so that it can be searched. In the most basic sense, all three
programs — crawlers, spiders, and robots — are essentially the same. They all
collect information about each and every web URL.
Anatomy of a Search Engine
3.Crawlers, spiders, and robots
• As discussed on the previous slide, the back end of a search engine consists of three main parts. Search engine spiders follow links on the web to request pages that are either not yet indexed or have been updated since they were last indexed. These pages are crawled and added to the search engine index (also known as the catalogue). When you search using a major search engine, you are not actually searching the web; you are searching a slightly outdated index of content that roughly represents the content of the web. The third part is the query processor, which matches searches against the index (robot is simply another name for the spider and crawler programs that feed it).
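• As a rough illustration of that crawl-and-index loop, here is a toy Python crawler. It is a sketch under simplifying assumptions: it ignores robots.txt, politeness delays, and freshness checks, and it assumes the third-party requests and beautifulsoup4 packages are installed.

# Toy crawler: follow links breadth-first and record each page's text
# in a catalogue (the index). Real crawlers also honour robots.txt,
# rate limits, and re-crawl pages that have been updated.
from urllib.parse import urljoin
import requests
from bs4 import BeautifulSoup

def crawl(seed_url, max_pages=10):
    queue, seen, catalogue = [seed_url], set(), {}
    while queue and len(catalogue) < max_pages:
        url = queue.pop(0)
        if url in seen:
            continue
        seen.add(url)
        try:
            html = requests.get(url, timeout=5).text
        except requests.RequestException:
            continue
        soup = BeautifulSoup(html, "html.parser")
        catalogue[url] = soup.get_text(" ", strip=True)  # add page to the index
        for link in soup.find_all("a", href=True):       # follow links to new pages
            queue.append(urljoin(url, link["href"]))
    return catalogue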
Anatomy of a Search Engine
3.Crawlers, spiders, and robots
• In other words, the robot's crawling is what lets the search engine find a site when the end user types a word or phrase. This matching step is handled by the query processor.
• For more details, please watch this video:
• https://www.youtube.com/watch?v=LVV_93mBfSU
Anatomy of a Search Engine
4. Databases
• Every search engine contains or is connected to a system of databases where data
about each URL on the Web (collected by crawlers, spiders, or robots) is stored.
These databases are massive storage areas that contain multiple data points about
each URL.
• The data might be arranged in any number of different ways and is ranked
according to a method of ranking and retrieval that is usually proprietary to the
company that owns the search engine.
• You’ve probably heard of the method of ranking called PageRank (for Google) or
even the more generic term quality scoring.
• This ranking or scoring determination is one of the most complex and secretive
parts of SEO. Why?
Anatomy of a Search Engine
4. Databases
• Answer: because search engine companies change the
weight of the elements used to arrive at the score
according to usage patterns on the Web.
• The idea is to score pages based on the quality that site
visitors derive from the page, not only on how well
website designers can manipulate the elements that make
up the quality score
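• Google's exact weighting is proprietary, as the slide says, but the originally published PageRank idea is public: a page's rank is (1-d)/N plus d times the rank it receives from pages linking to it, each divided by their outgoing link count. Here is a toy power-iteration version in Python over an invented three-page link graph; real scoring layers many more signals on top.

# Toy PageRank: iterate PR(p) = (1-d)/N + d * sum(PR(q)/out(q)) until it
# stabilises. The link graph is invented, and every page here has at
# least one outgoing link; dangling pages are not handled.
def pagerank(links, d=0.85, iterations=50):
    n = len(links)
    ranks = {page: 1.0 / n for page in links}
    for _ in range(iterations):
        new_ranks = {}
        for page in links:
            incoming = sum(ranks[q] / len(links[q])
                           for q in links if page in links[q])
            new_ranks[page] = (1 - d) / n + d * incoming
        ranks = new_ranks
    return ranks

links = {"A": ["B", "C"], "B": ["C"], "C": ["A"]}  # who links to whom
print(pagerank(links))  # C, with two inbound links, collects the most rank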
Anatomy of a Search Engine
4. Databases
Cached web pages on Google
• Search results on Google often come with a "Cached" version of the page, which can be accessed by clicking the green arrow next to the URL. Clicking "Cached" takes you to the version of the page that Google saw when it last visited the site and indexed its content.
Recent Google updates and how to
survive them.
• Keywords are still vitally important in web page ranking. However, they're just one of dozens of elements considered; even so, keywords remain among the most important elements of SEO.
• Simply put, everybody wants to be in Google. Google is fighting to keep its search
engine relevant and must constantly evolve to continue delivering relevant results to
users.
• This hasn't been without its challenges. Just as with keyword stuffing, webmasters eventually caught on to another way of gaming the system: having the most 'anchor text' pointing to a page. If you are unfamiliar with this term, anchor text is the clickable text in external links pointing to a page.
Anchor Text Example
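• For instance, in the hypothetical link <a href="https://example.com/shoes">best running shoes</a>, the anchor text is "best running shoes": the clickable text of the link, which search engines treat as a hint about what the target page is about.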
Recent Google updates and how to
survive them.
• Anchor text created another loophole exploited by spammers. In many cases, well-
meaning marketers and business owners used this tactic to achieve high rankings in
the search results.
• Along came a new Google update in 2012, this time called 'Penguin'. Google's Penguin update punished sites with suspicious numbers of links using exact-match anchor text pointing to a page, completely delisting those sites from the search results.
• Many businesses that relied on search engine traffic lost all of their sales literally overnight, just because Google believed sites with hundreds of links containing just one phrase didn't acquire those links naturally.
How to recover from Google changes, or
to prevent being penalized by new
updates
• There are many techniques you should consider to increase your website's rank in search engines. Google's advice:
• If you want to stay at the top of Google, never rely on one tactic.
• Always ensure your search engine strategies rely on SEO best practices.
Authority, trust & relevance. Three
powerful SEO strategies explained.
• Today, Google has well over 200 factors, such as:
• How many links are pointing to your site
• How trustworthy these linking sites are
• How many social mentions your brand has
• How relevant your page is
• How old your site is
• How fast your site loads
• … and the list goes on
Does this mean it's impossible or
difficult to get top rankings in Google?
• The answer is a resounding no.
• Google’s algorithm is complex, but you don’t have to be a rocket scientist to
understand how it works.
• In fact, it can be ridiculously simple if you remember just three principles. With
these three principles you can determine why one site ranks higher than another, or
discover what you have to do to push your site higher than a competitor.
• These three principles summarize what Google is focusing on in its algorithm
now, and are the most powerful strategies SEO professionals are using to their
advantage to gain rankings.
• The three key principles are:
• 1. Trust
• 2. Authority
• 3. Relevance
1. Trust
• Trust is at the very core of Google's major changes and updates over the past several years. Google wants to keep poor-quality, shoddy sites out of the search results and keep high-quality, legit sites at the top.
• If your site has high-quality content and backlinks from reputable sources, your site is more likely to be considered a trustworthy source, and more likely to rank higher in the search results.
• Trust-related elements include:
• Domain names and URLs
• Page content
• Link structure
• Usability and accessibility
• Meta tags
• Page structure
2.Authority
• Previously the most popular SEO strategy, authority is still powerful, but now
best used in tandem with the other two principles. Authority is your site’s
overall strength in your market.
• Authority is almost a pure numbers game, for example: if your site has one
thousand social media followers and backlinks, and your competitors only
have fifty social media followers and backlinks, you’re probably going to
rank higher.
3.Relevance
• Google looks at the contextual relevance of a site and rewards relevant sites
with higher rankings. This levels the playing field a bit, and might explain
why a niche site or local business can often rank higher than a Wikipedia
article.
• You can use this to your advantage by building out your site with content relevant to your market, giving Google a 'nudge' to see that your site is relevant to that market.
• You can also rank higher with fewer links by focusing on building links from relevant sites. Increasing relevance like this is a powerful strategy and can lead to high rankings in competitive areas.
How Google ranks sites now—Google’s
top-10 ranking factors revealed.
• Fortunately, there are a handful of industry leaders who have figured it out, and regularly
publish their findings on the Internet. With these publications you can get a working
knowledge of what factors Google uses to rank sites. These surveys are typically updated
every second year, but these factors don’t change often, so you can use them to your
advantage by knowing which areas to focus on.
Google’s top-10 ranking factors
• The factors below are from the Searchmetrics Google Ranking Factors study released in 2015.
• If your competitors' pages have more of these factors than yours, it's likely they will rank higher. If your pages have more of them than your competitors', it's likely you will beat them.
• 1. Word count
• 2. Relevant keywords on page
• 3. Responsive design
• 4. User signals (click-through rate, time on site, bounce rate)
• 5. Domain SEO visibility (how strong the domain is in terms of links and authority)
• 6. Site speed
• 7. Referring domains (number of sites linking to your site)
• 8. Keyword in internal links
• 9. Content readability
• 10. Number of images
Google’s top-10 ranking factors (2019)
• 1. A Secure and Accessible Website
• 2. Page Speed (Including Mobile Page Speed)
• 3. Mobile Friendliness
• 4. Domain Age, URL, and Authority
• 5. Optimized Content
• 6. Technical SEO
• 7. User Experience (RankBrain)
• 8. Links
• 9. Social Signals
• 10. Real Business Information
1. A Secure and Accessible
Website
• Unsurprisingly, the first of SEO ranking factors has to do with having the
right kind of URL. Specifically, that’s a URL that Google’s bots can easily
reach and crawl.
• In other words, Google has to be able to visit the URL and look at the page
content to start to understand what that page is about.
• To help the bots out, you’ll need:
• A robots.txt file that tells Google where it can and can’t look for your site
information
• A sitemap, which lists all your pages; you can use an online sitemap generator to create one.
• HTTPS is also a factor: Google prefers secure pages when deciding what to index and rank.
1. A Secure and Accessible
Website
• How to create a /robots.txt file and where to put it
• The short answer: in the top-level directory of your web server.
• For example, given the page "http://www.example.com/shop/index.html", a crawler removes "/shop/index.html", replaces it with "/robots.txt", and ends up with "http://www.example.com/robots.txt".
What to put in robots.txt
• The "/robots.txt" file is a plain-text file containing one or more records (groups of rules). It usually contains a single record, built from the directives below.
• User-agent: [Required, one or more per group] The name of a search engine robot (web crawler software) that the rule applies to. This is the first line for any rule. Most Google user-agent names are listed in the Web Robots Database or in the Google list of user agents. The asterisk (*) matches all crawlers except the various AdsBot crawlers, which must be named explicitly.
What to put in robots.txt
# Example 1: Block only Googlebot
User-agent: Googlebot
Disallow: /

# Example 2: Block Googlebot and AdsBot
User-agent: Googlebot
User-agent: AdsBot-Google
Disallow: /

# Example 3: Block all crawlers except AdsBot (AdsBot crawlers must be named explicitly)
User-agent: *
Disallow: /
What to put in robots.txt
• Disallow: [At least one or more Disallow or Allow entries
per rule] A directory or page, relative to the root domain, that
should not be crawled by the user agent.
• Allow: [At least one or more Disallow or Allow entries per
rule] A directory or page, relative to the root domain, that
should be crawled by the user agent just mentioned.
Another example robots.txt file
• A robots.txt file consists of one or more groups, each beginning with a User-agent line that specifies which crawler the group applies to.
• Here is a file with two groups; inline comments explain each group:
# Block googlebot from example.com/directory1/... and example.com/directory2/...
# but allow access to directory2/subdirectory1/...
# All other directories on the site are allowed by default.
User-agent: googlebot
Disallow: /directory1/
Disallow: /directory2/
Allow: /directory2/subdirectory1/

# Block the entire site from anothercrawler.
User-agent: anothercrawler
Disallow: /
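• Python's standard library ships a parser for exactly these rules. Here is a small sketch that loads the example above and asks whether a given crawler may fetch a given URL. Note that parsers differ on Allow/Disallow precedence (Google uses the most specific match, Python's parser applies the first matching rule), so the Allow line is not exercised here.

# Check the example rules above with Python's built-in robots.txt parser.
from urllib.robotparser import RobotFileParser

rules = """\
User-agent: googlebot
Disallow: /directory1/
Disallow: /directory2/
Allow: /directory2/subdirectory1/

User-agent: anothercrawler
Disallow: /
""".splitlines()

rp = RobotFileParser()
rp.parse(rules)
print(rp.can_fetch("googlebot", "http://example.com/directory1/page"))  # False
print(rp.can_fetch("googlebot", "http://example.com/other/page"))       # True
print(rp.can_fetch("anothercrawler", "http://example.com/any/page"))    # False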
Useful robots.txt rules
2. Page Speed (Including Mobile
Page Speed)
• Page speed has been cited as one of the main SEO ranking factors for years.
Google wants to improve users’ experience of the web, and fast-loading web
pages will definitely do that.
• Google announced a search engine algorithm update focused on mobile page
speed that will start to affect sites from July 2018. If your site doesn’t load
fast on mobile devices, then it could be penalized.
• Accelerated Mobile Pages (AMP) let web pages load at lightning speed on users' mobile devices. The faster the load, the higher the rank, and the more likely users will see your content quickly. Apart from improving SERP position, fast loading also reduces bounce rate.
• Use Google’s mobile testing tool to see how your site stacks up.
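• For a quick feel from Python, you can time a server's response as in the sketch below (it assumes the third-party requests package is installed). This only measures response time; full page speed, including rendering, scripts, and images, is what tools like Google's PageSpeed Insights report.

# Rough server response timing for a URL; the URL is a placeholder.
import requests

def response_time(url):
    r = requests.get(url, timeout=10)
    return r.elapsed.total_seconds()  # time until the response arrived

print(response_time("https://example.com"))  # e.g. 0.12 (seconds)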
3. Mobile Friendliness
• While we’re on the subject of mobile, mobile-friendliness is another major
SEO ranking factor. More people use mobile devices than desktops to access
the web, and that’s one reason there’ve been changes in how Google ranks
search results.
• Things to look at include:
• Whether you have a responsive site that automatically resizes to fit the device (a separate mobile subdomain such as www.m.example.com is an alternative approach)
• Whether you’re using large fonts for easy readability on a small screen
• Accessibility and navigability, including making it easy to tap menus
• Ensuring that essential content isn’t hidden by ads
4. Domain Age, URL, and Authority
• Did you know that nearly 60% of the sites that have a top ten Google search
ranking are three years old or more? Data from an Ahrefs study of two
million pages suggests that very few sites less than a year old achieve that
ranking. So if you’ve had your site for a while, and have optimized it using
the tips in this article, that’s already an advantage.
4. Domain Age, URL, and Authority
• When it comes to search engine ranking factors, authority matters. As you’ll
see, that’s usually a combination of great content (see the next tip) and off-
page SEO signals like inbound links and social shares.
• You can check domain authority or page authority with
https://www.seoreviewtools.com
5. Optimized Content
• Google’s search algorithm relies on keywords. These are the words and
phrases searchers use when they’re looking for information. They’re also the
words and phrases that describe the topics your site is about. Ideally, those
will match up.
• It’s not just about the main keywords either; it’s also important to include
terms related to the main terms people are searching for. These are called LSI
(latent semantic indexing) keywords. They provide a kind of online word
association to help Google know which results to show.
• For example, using the right LSI keywords will tell Google that when
searchers type in “mini”, your page is relevant to the car, rather than the skirt,
and vice versa.
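• The "mini" example can be illustrated with a toy word-association sketch in Python. The texts are invented, and this is only the intuition, not Google's algorithm: the words surrounding a keyword reveal which meaning a page is about.

# Toy intuition for related terms: context words around "mini" tell the
# car page and the skirt page apart. Texts are invented.
STOPWORDS = {"the", "is", "a", "in", "with"}

def context_words(text, keyword):
    return set(text.lower().split()) - STOPWORDS - {keyword}

car_page = "the mini cooper is a compact car with nimble handling"
skirt_page = "the mini skirt is a fashion staple in summer wardrobes"

print(context_words(car_page, "mini"))    # {'cooper', 'compact', 'car', ...}
print(context_words(skirt_page, "mini"))  # {'skirt', 'fashion', 'staple', ...}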
6. Technical SEO
• We said earlier that getting the code right is one aspect of optimizing content for better search engine rankings. Here are some of the aspects you need to look at (a quick checking sketch follows the list):
• Use keyword phrases in page titles, which is where Google first looks to
determine which content is relevant to which search. You’ll see the page title
as the first line of a search result entry.
• Use header tags to show content hierarchy. If your title is formatted as h1,
then use h2 or h3 for subheads.
• Create a meta description that both entices readers and includes your
keyword phrase. Keep meta descriptions short and grabby – you have right
around 160 characters to convince searchers that this is the post they want.
• Use keyword phrases in image alt tags to show how the images are relevant
to the main content. Google also has an image search, which is another way
for people to find your content.
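• Here is the promised sketch: a quick Python check of those on-page elements for a URL. It assumes the third-party requests and beautifulsoup4 packages are installed, and the 160-character limit mirrors the meta-description guideline above.

# Quick on-page check: title, h1 count, meta description length, alt text.
import requests
from bs4 import BeautifulSoup

def audit(url):
    soup = BeautifulSoup(requests.get(url, timeout=10).text, "html.parser")
    title = soup.title.string if soup.title else None
    meta = soup.find("meta", attrs={"name": "description"})
    description = meta["content"] if meta and meta.has_attr("content") else ""
    return {
        "title": title,                                   # page title Google shows
        "h1_count": len(soup.find_all("h1")),             # heading hierarchy
        "description_ok": 0 < len(description) <= 160,    # the ~160-char guide above
        "images_missing_alt": len(soup.find_all("img", alt=False)),
    }

print(audit("https://example.com"))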
7. User Experience (RankBrain)
• For a while now, Google has been using artificial intelligence to better rank web pages. It calls that signal RankBrain. RankBrain folds in other signals that affect your search engine ranking (worked numbers follow the list). These are:
• Clickthrough rate – the percentage of people who click to visit your site after
an entry comes up in search results
• Bounce rate, especially pogosticking – the number of people who bounce
away again, which basically means your site didn’t give them what they
wanted
• Dwell time – how long they stay on your site after they’ve arrived.
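• As promised, here are the three metrics as plain arithmetic in Python; all numbers are invented for illustration.

# The three user-experience metrics above as plain arithmetic.
impressions = 1000       # times your result was shown in search
clicks = 50              # times it was clicked
single_page_visits = 20  # visitors who left without viewing a second page
total_visits = 50
time_on_site = [30, 120, 45]  # seconds per visit, for dwell time

ctr = clicks / impressions                         # 0.05 -> 5% clickthrough rate
bounce_rate = single_page_visits / total_visits    # 0.4 -> 40% bounce rate
avg_dwell = sum(time_on_site) / len(time_on_site)  # 65 seconds average dwell

print(f"CTR {ctr:.0%}, bounce {bounce_rate:.0%}, dwell {avg_dwell:.0f}s")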
8. Links
• As we said at the start, the web is built on links, so naturally, links are a crucial SEO ranking signal. There are three kinds of links to think about:
• Inbound links
• Outbound links
• Internal links
9. Social Signals
• When people share your content on social networks, that’s another sign that
it's valuable. CognitiveSEO's study of 23 million shares found a definitive
link between social shares and search engine ranking.
10. Real Business Information
• This tip is important for businesses targeting particular local areas. The
presence or absence of business information is one of the most crucial local
SEO ranking factors.