Web Browser

A web browser is application software that lets us view and explore information on the web. A user can request any web page simply by entering its URL in the address bar.

A web browser can display text, audio, video, animation and more. It is the browser's responsibility to interpret the text and commands contained in a web page.

Early web browsers were text-based, whereas graphical and voice-based browsers are now also available. The following are some of the most common web browsers:

Browser	Vendor
Internet Explorer	Microsoft
Google Chrome	Google
Mozilla Firefox	Mozilla
Netscape Navigator	Netscape Communications Corp.
Opera	Opera Software
Safari	Apple
SeaMonkey	Mozilla Foundation
K-Meleon	K-Meleon

Architecture

Many web browsers are available. All of them interpret and display information on the screen, but their capabilities and structure vary depending on the implementation. The basic components that every web browser must have are listed below:

Controller/Dispatcher
Interpreter
Client Programs

The controller works like the control unit in a CPU. It takes input from the keyboard or mouse, interprets it, and directs other services to act on the input it receives.

The interpreter receives information from the controller and executes the instructions line by line. Some interpreters are mandatory while others are optional. For example, an HTML interpreter is mandatory and a Java interpreter is optional.

A client program implements the specific protocol used to access a particular service. The following client programs are commonly used:

HTTP
SMTP
FTP
NNTP
POP
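
As a rough illustration of how a controller might hand a request to the right client program, the sketch below picks a protocol from the scheme of a URL. The `CLIENT_PROGRAMS` table and the `pick_client_program` function are hypothetical names invented for this example, and the mapping is an assumption (for instance, POP has no entry because it is not normally reached through a URL typed in the address bar). A real browser's dispatch logic is considerably more involved.

```python
from urllib.parse import urlsplit

# Hypothetical table: URL scheme -> client program (protocol) that handles it.
CLIENT_PROGRAMS = {
    "http": "HTTP",
    "https": "HTTP",   # assumption: HTTPS is treated as HTTP over a secure channel
    "ftp": "FTP",
    "news": "NNTP",
    "mailto": "SMTP",
}

def pick_client_program(url):
    """Return the name of the client program registered for the URL's scheme."""
    scheme = urlsplit(url).scheme.lower()
    if scheme not in CLIENT_PROGRAMS:
        raise ValueError(f"no client program registered for scheme {scheme!r}")
    return CLIENT_PROGRAMS[scheme]

print(pick_client_program("http://example.com/index.html"))
print(pick_client_program("ftp://example.com/file.txt"))
```

Keeping the mapping in a table means that an optional client program could be added without changing the dispatch function itself.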

Starting Internet Explorer

Internet Explorer is a web browser developed by Microsoft. It is installed by default with the Windows operating system; however, it can also be downloaded and upgraded.

To start Internet Explorer, follow these steps:

Click the Start button and then click Internet Explorer.

The Internet Explorer window will appear as shown in the following diagram:

Accessing Web Page

Accessing a web page is very simple. Just enter the URL in the address bar, as shown in the following diagram:

Navigation

A web page may contain hyperlinks. When we click one of these links, another web page opens. Hyperlinks can take the form of text or an image. When we move the mouse over a hyperlink, the pointer changes its shape to a hand.
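
To make the idea of hyperlinks more concrete, here is a minimal sketch of how a program could find the link targets in a page using Python's standard `html.parser` module. The `LinkCollector` class, the `extract_links` function and the sample `page` string are made up for this illustration; a browser's own HTML interpreter does far more than this.

```python
from html.parser import HTMLParser

class LinkCollector(HTMLParser):
    """Collect the href target of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the tag being opened.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

def extract_links(html):
    collector = LinkCollector()
    collector.feed(html)
    return collector.links

# One text hyperlink and one image hyperlink (hypothetical sample page).
page = ('<p>See <a href="/about.html">About</a> or '
        '<a href="https://example.com/"><img src="logo.png"></a>.</p>')
print(extract_links(page))  # ['/about.html', 'https://example.com/']
```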

Key Points

If you have visited several web pages and want to return to the previous one, just click the Back button.
You can open a new web page in the same tab, in a different tab, or in a new window.

Saving Webpage

You can save a web page for future use. To save a web page, follow the steps given below:

Click File > Save As. The Save Webpage dialog box appears.
Choose the location where you want to save the web page from the Save in: list box. Then choose the folder where you want to save it.
Specify the file name in the File name box.
Select the type from the Save as type list box:

Webpage, complete
Web Archive
Webpage, HTML only
Text File

From the Encoding list box, choose the character set to be used with your web page. By default, Western European is selected.
Click the Save button and the web page is saved.

Saving Web Elements

Web elements are the pictures, links and other items on a page. To save these elements, follow the steps given below:

Right-click the web page element you want to save. A menu of options appears. These options may vary depending on the element you want to save.

Save Picture As: This option lets you save the picture at a specific location under a name of your choice. When you click this option, a dialog box opens where you can specify the name and location.

Favourites

The Favourites option saves the addresses of web pages you visit often, so you do not need to remember the long and complex addresses of those websites.

To open any such web page, you just need to double-click its entry in the favourites list.

ADDING A WEB PAGE TO YOUR FAVOURITES

To add a website to your favourites list, follow the steps given below:

Open the web page that you want to add to your favourites.
Click the Favourites menu and then click the Add to Favourites option. The Add Favourites dialog box appears.

You can also click the Favourites button in the toolbar. The Favourites panel opens in the left corner of the Internet Explorer window. Click the Add button and the Add Favourites dialog box appears.

In the Add Favourites dialog box, the Name: text box contains the name of the web page that you want to add to favourites.
Click the Create in button and the Favourites folder appears. Move to the folder where you want to store the favourite by clicking the folder name.
Now click the OK button to save the favourite.

OPENING FAVOURITES

To open a favourite, follow the steps given below:

In the Favourites panel, move the mouse over the site that you want to open. Now click the address to open that site.

A favourite can also be opened from the Favourites menu by selecting the appropriate entry.

ORGANIZING FAVOURITES

Favourites can be organized by categorizing web pages, creating a folder for each category and then storing the web pages in those folders. To organize favourites, follow the steps given below:

Click Favourites menu > Organize Favourites. The Organize Favourites dialog box appears.
To organize the web pages, drag each web page to its folder. Similarly, to delete a favourite, click the Delete button.

Web Server

Overview

A web server is a computer where web content is stored. Web servers are mainly used to host websites, but there are other kinds of servers as well, such as gaming, storage, FTP and email servers.

A website is a collection of web pages, while the web server software is what responds to requests for web resources.

Web Server Working

A web server responds to a client request in one of the following two ways:

Sending the client the file associated with the requested URL.
Generating a response by invoking a script and communicating with a database.

Key Points

When a client sends a request for a web page, the web server searches for the requested page. If the page is found, the server sends it to the client with an HTTP response.
If the requested web page is not found, the web server sends the HTTP response Error 404 Not Found.
If the client has requested some other resource, the web server contacts the application server and data store to construct the HTTP response.
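
The decision described in the points above can be sketched as a small function. This is only an illustration under stated assumptions: `respond`, `static_files` and `scripts` are hypothetical names, the "script" is a plain Python function standing in for an application server and data store, and no real network traffic is involved.

```python
def respond(path, static_files, scripts):
    """Return (status_code, body) for a requested path.

    static_files: dict mapping a path to stored content
    scripts: dict mapping a path to a function that builds content on demand
    """
    if path in static_files:
        return 200, static_files[path]      # send the file stored for this URL
    if path in scripts:
        return 200, scripts[path]()         # generate the response by running a script
    return 404, "Error 404: Not Found"      # nothing matches the requested URL

# Hypothetical content for the example.
static_files = {"/index.html": "<h1>Home</h1>"}
scripts = {"/report": lambda: "<p>generated from the data store</p>"}

print(respond("/index.html", static_files, scripts))
print(respond("/report", static_files, scripts))
print(respond("/missing.html", static_files, scripts))
```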

Architecture

Web server architecture follows one of the following two approaches:

Concurrent Approach
Single-Process-Event-Driven Approach

Concurrent Approach

The concurrent approach allows the web server to handle multiple client requests at the same time. It can be achieved by the following methods:

Multi-process
Multi-threaded
Hybrid method

Multi-processing

In this method a single process (the parent process) starts several single-threaded child processes and distributes incoming requests among them. Each child process is responsible for handling a single request.

It is the responsibility of the parent process to monitor the load and decide whether processes should be killed or forked.

Multi-threaded

Unlike multi-processing, this method creates a single process that runs multiple threads, and each thread handles one request.
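
A minimal sketch of the multi-threaded idea, using Python's standard `concurrent.futures.ThreadPoolExecutor`. The `handle_request` and `serve_all` functions are hypothetical stand-ins: no real sockets are opened, and each "request" is just a number. It is meant to show one process sharing its work among several threads, not how any particular web server is implemented.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

def handle_request(request_id):
    """Pretend to serve one request and report which thread did the work."""
    return request_id, threading.current_thread().name

def serve_all(request_ids, max_threads=4):
    # One process, several worker threads; each thread handles one request at a time.
    with ThreadPoolExecutor(max_workers=max_threads) as pool:
        return list(pool.map(handle_request, request_ids))

results = serve_all(range(8))
for request_id, thread_name in results:
    print(request_id, "handled by", thread_name)
```

The thread names printed will vary from run to run, since the pool decides which worker picks up each request.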

Hybrid

It is a combination of the above two approaches. Multiple processes are created and each process starts multiple threads, each of which handles one connection. Using multiple threads in a single process puts less load on system resources.

Examples

The following table describes the leading web servers available today:

S.N.	Web Server	Description
1	Apache HTTP Server	This is the most popular web server in the world, developed by the Apache Software Foundation. Apache is open source software and can be installed on almost all operating systems, including Linux, UNIX, Windows, FreeBSD, Mac OS X and more. About 60% of web server machines run the Apache web server.
2	Internet Information Services (IIS)	Internet Information Services (IIS) is a high-performance web server from Microsoft. It runs on Windows NT/2000 and 2003 platforms (and possibly on newer Windows versions as well). IIS comes bundled with Windows NT/2000 and 2003; because it is tightly integrated with the operating system, it is relatively easy to administer.
3	Lighttpd	Lighttpd, pronounced "lighty", is a free web server that is distributed with the FreeBSD operating system. This open source web server is fast, secure and consumes much less CPU power. Lighttpd can also run on Windows, Mac OS X, Linux and Solaris.
4	Sun Java System Web Server	This web server from Sun Microsystems is suited to medium and large websites. Though the server is free, it is not open source. It runs on Windows, Linux and UNIX platforms. It supports various languages, scripts and technologies required for Web 2.0, such as JSP, Java Servlets, PHP, Perl, Python, Ruby on Rails, ASP and ColdFusion.
5	Jigsaw Server	Jigsaw (W3C's server) comes from the World Wide Web Consortium. It is open source and free, and runs on various platforms such as Linux, UNIX, Windows, Mac OS X and FreeBSD. Jigsaw is written in Java and can run CGI scripts and PHP programs.

Proxy Server

Overview

A proxy server is an intermediary server between the client and the internet. Proxy servers offer the following basic functionalities:

Firewall and network data filtering.
Network connection sharing
Data caching

Proxy servers let you stay anonymous on the network by hiding your IP address.

Purpose of Proxy Servers

Following are the reasons to use proxy servers:

Monitoring and Filtering
Improving performance
Translation
Accessing services anonymously
Security

Monitoring and Filtering

Proxy servers allow several kinds of filtering, such as:

Content filtering
Filtering encrypted data
Bypassing filters
Logging and eavesdropping

Improving performance

It speeds up the service by retrieving content from the cache, which was saved when the client made a previous request.

Translation

It helps to customize the source site for local users by excluding source content or substituting it with original local content. Traffic from global users is routed to the source website through the translation proxy.

Accessing services anonymously

Here the destination server receives the request from the anonymizing proxy server and therefore does not receive information about the end user.

Security

Because the proxy server hides the identity of the user, it helps protect against spam and hacker attacks.

Type of Proxies

The types of proxies are briefly described below:

Forward Proxies

Here the client asks a server on its internal network to forward its requests to the internet.

Open Proxies

Open proxies help clients conceal their IP address while browsing the web.

Reverse Proxies

Here requests are forwarded to one or more proxy servers, and the response from the proxy server is returned as if it came directly from the original server.

Architecture

The proxy server architecture is divided into several modules as shown in the following diagram:

Proxy user interface

This module controls and manages the user interface and provides an easy-to-use graphical interface, window and menu to the end user. The menu offers the following functionalities:

Start proxy
Stop proxy
Exit
Blocking URL
Blocking client
Manage log
Manage cache
Modify configuration

Proxy server listener

It is the port on which new requests from the client browser are listened for. This module also blocks the clients on the list given by the user.

Connection Manager

It contains the main functionality of the proxy server. It performs the following functions:

Read the request from the client's header.
Parse the URL and determine whether it is blocked.
Open a connection to the web server.
Read the reply from the web server.
If no copy of the page is found in the cache, download the page from the web server; otherwise check its last-modified date in the reply header and read the page from the cache or from the web server accordingly.
Finally, check whether caching is allowed and cache the page accordingly.
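
The flow listed above might be sketched roughly as follows. Everything here is an assumption made for illustration: the `ConnectionManager` class, its `handle` method and the `fetch` callable (which stands in for the real connection to the web server and returns the page body plus a flag saying whether caching is allowed) are hypothetical. The sketch also leaves out the last-modified date check and simply serves any cached copy.

```python
from urllib.parse import urlsplit

class ConnectionManager:
    def __init__(self, fetch, blocked_hosts=()):
        self.fetch = fetch                       # hypothetical callable: url -> (body, cacheable)
        self.blocked_hosts = set(blocked_hosts)  # hosts the user has chosen to block
        self.cache = {}                          # url -> cached page body

    def handle(self, url):
        """Return (status_code, body) for a client request."""
        host = urlsplit(url).hostname
        if host in self.blocked_hosts:
            return 403, "Blocked by proxy"
        if url in self.cache:
            return 200, self.cache[url]          # serve the cached copy
        body, cacheable = self.fetch(url)        # contact the web server
        if cacheable:
            self.cache[url] = body               # keep a copy for later requests
        return 200, body

# A fake web server that records how often it is contacted.
calls = []
def fake_fetch(url):
    calls.append(url)
    return "page for " + url, True

proxy = ConnectionManager(fake_fetch, blocked_hosts=["blocked.example"])
print(proxy.handle("http://example.com/a"))
print(proxy.handle("http://example.com/a"))   # second request is answered from the cache
print(proxy.handle("http://blocked.example/x"))
print(len(calls))                             # the fake server was contacted once
```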

Cache Manager

This module is responsible for storing, deleting, clearing and searching of web pages in the cache.

Log Manager

This module is responsible for viewing, clearing and updating the logs.

Configuration

This module helps to create configuration settings, which in turn let other modules apply the desired configuration, such as caching.

Search Engines

Introduction

A search engine is a huge database of internet resources such as web pages, newsgroups, programs and images. It helps to locate information on the World Wide Web.

A user can search for any information by passing a query in the form of keywords or a phrase. The search engine then looks for relevant information in its database and returns it to the user.

Search Engine Components

Generally there are three basic components of a search engine as listed below:

Web Crawler
Database
Search Interfaces

Web crawler

It is also known as a spider or bot. It is a software component that traverses the web to gather information.
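
The traversal a crawler performs can be pictured as a breadth-first walk over links. The sketch below is a toy under stated assumptions: `crawl` is a hypothetical function, and `toy_web` is an in-memory dictionary standing in for real pages, so nothing is downloaded.

```python
from collections import deque

def crawl(start, links_on_page, max_pages=100):
    """Visit pages breadth-first; links_on_page(url) returns the URLs found on that page."""
    seen = {start}
    queue = deque([start])
    visited = []
    while queue and len(visited) < max_pages:
        url = queue.popleft()
        visited.append(url)                 # "gather information" from this page
        for link in links_on_page(url):
            if link not in seen:            # never queue the same page twice
                seen.add(link)
                queue.append(link)
    return visited

# A tiny in-memory "web" (hypothetical data): page -> links found on it.
toy_web = {
    "a": ["b", "c"],
    "b": ["c", "d"],
    "c": ["a"],
    "d": [],
}
print(crawl("a", lambda url: toy_web.get(url, [])))  # ['a', 'b', 'c', 'd']
```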

Database

All the information gathered from the web is stored in the database. It holds a huge collection of web resources.

Search Interfaces

This component is an interface between user and the database. It helps the user to search through the database.

Search Engine Working

The web crawler, the database and the search interface are the major components that make a search engine work. Search engines use the Boolean operators AND, OR and NOT to restrict or widen the results of a search. The search engine performs the following steps:

The search engine looks for the keyword in the index of its predefined database instead of going directly to the web to search for the keyword.
The database it searches was filled in advance by the web crawler, the software component that gathers information from the web.
Once matching pages are found, the search engine shows the relevant web pages as results. These results generally include the title of the page, the size of the text portion, the first several sentences, and so on.

These search criteria may vary from one search engine to another. The retrieved information is ranked according to various factors such as frequency of keywords, relevancy of information, links etc.

The user can click on any of the search results to open it.
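
The effect of the Boolean operators mentioned above can be shown with Python sets. The `index` below is a hypothetical, hand-written lookup from keyword to the IDs of pages containing it; real search engines use far larger and more elaborate structures.

```python
# Hypothetical index: keyword -> set of page IDs that contain it.
index = {
    "web": {1, 2, 3},
    "browser": {1, 3},
    "server": {2, 3},
}

def pages_with(keyword):
    """Look the keyword up in the index; unknown keywords match no pages."""
    return index.get(keyword, set())

both = pages_with("web") & pages_with("browser")       # web AND browser
either = pages_with("browser") | pages_with("server")  # browser OR server
without = pages_with("web") - pages_with("server")     # web NOT server

print(sorted(both))     # [1, 3]
print(sorted(either))   # [1, 2, 3]
print(sorted(without))  # [1]
```

AND and NOT narrow the result set, while OR widens it.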

Architecture

The search engine architecture comprises the three basic layers listed below:

Content collection and refinement.
Search core
User and application interfaces

Search Engine Processing

Indexing Process

The indexing process comprises the following three tasks:

Text acquisition
Text transformation
Index creation

TEXT ACQUISITION

It identifies and stores documents for indexing.

TEXT TRANSFORMATION

It transforms documents into index terms or features.

INDEX CREATION

It takes the index terms created by text transformation and creates data structures to support fast searching.
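
The three indexing tasks can be sketched end to end on a couple of made-up documents. The names `STOP_WORDS`, `to_index_terms` and `build_index`, the stop-word list and the sample `documents` are all assumptions for illustration, and the data structure shown (an inverted index mapping each term to the documents containing it) is one common choice rather than the only one.

```python
import re
from collections import defaultdict

STOP_WORDS = {"a", "an", "the", "is", "of", "and"}   # small illustrative list

def to_index_terms(text):
    """Text transformation: lowercase, split into words, drop stop words."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return [w for w in words if w not in STOP_WORDS]

def build_index(documents):
    """Index creation: map each term to the set of document IDs containing it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for term in to_index_terms(text):
            index[term].add(doc_id)
    return dict(index)

# Text acquisition: the documents gathered for indexing (hypothetical).
documents = {
    1: "The web browser is a client",
    2: "A web server hosts the site",
}
index = build_index(documents)
print(sorted(index["web"]))      # [1, 2]
print(sorted(index["browser"]))  # [1]
```

Looking up a term is then a single dictionary access, which is what makes searching fast.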

Query Process

The query process comprises the following three tasks:

User interaction
Ranking
Evaluation

USER INTERACTION

It supports the creation and refinement of the user query and displays the results.

RANKING

It uses the query and the indexes to create a ranked list of documents.
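
As a simple illustration of ranking, the sketch below scores each document by how often the query words occur in it and lists the best match first. The `rank` function and the sample `documents` are hypothetical, and keyword frequency is only one of many factors a real search engine would weigh.

```python
from collections import Counter

def rank(query, documents):
    """Return document IDs sorted by how often the query words occur, best first."""
    query_terms = query.lower().split()
    scores = {}
    for doc_id, text in documents.items():
        counts = Counter(text.lower().split())
        score = sum(counts[term] for term in query_terms)
        if score > 0:                      # leave out documents with no match
            scores[doc_id] = score
    # Higher score first; ties are broken by the smaller document ID.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))

documents = {
    1: "web browser shows a web page",
    2: "web server sends a page",
    3: "proxy server cache",
}
print(rank("web page", documents))  # [1, 2]
```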

EVALUATION

It monitors and measures the effectiveness and efficiency. It is done offline.

Examples

The following are some of the search engines available today:

Search Engine	Description
Google	It was originally called BackRub. It is the most popular search engine globally.
Bing	It was launched in 2009 by Microsoft. It is the latest web-based search engine that also delivers Yahoo’s results.
Ask	It was launched in 1996 and was originally known as Ask Jeeves. It includes support for math, dictionary, and conversion questions.
AltaVista	It was launched by Digital Equipment Corporation in 1995. Since 2003, it has been powered by Yahoo technology.
AOL.Search	It is powered by Google.
LYCOS	It is a top-5 internet portal and the 13th largest online property according to Media Metrix.
Alexa	It is a subsidiary of Amazon and is used for providing website traffic information.
