Intranet (closed)

 

Foreword

Before you begin implementing the Intranet (closed) tag, make sure that you have selected the Client-side user ID plugin and set its mode to “always” in the Tag Composer interface.
 

Principle

When tagging a closed intranet (no external logins), data cannot be transmitted directly to our collection servers. A local collection server is therefore needed to capture the data, which is later imported into our system.

Diagram of the architecture to be set up for closed intranet measurement

The orange zone represents the company's internal network, in which the Intranet is located. This network is generally cut off from the Internet, so it cannot be reached from outside.
The green frame represents an internal network that is partially accessible from the Internet.

1) The Intranet user requests access to an Intranet page.
2) The Intranet server sends back the content of the page (text, images, etc.) along with the Intranet tag.
3) The Intranet tag is executed on the Intranet user’s workstation and retrieves certain information as it runs.
4) The Intranet user’s browser requests an image from the Intranet collection server, adding all of the collected information as parameters.
5) The Intranet collection server saves the request in a log file and sends back a 1 pixel by 1 pixel image.
6) The collection server sends the generated logs to the FTP server, which can be accessed externally.
7) The AT Internet calculation server requests the log files from the FTP server.
8) The FTP server accepts the request and sends the log files.
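
Step 4 can be pictured with the short sketch below. It is not the actual tag code: the helper name buildHitUrl and the sample parameter values are assumptions made for illustration. Only the collection address and the parameter names s and p are taken from the examples in this document.

```javascript
// Hypothetical helper (not part of the AT Internet library): builds the
// pixel request URL of step 4 from the collected information.
function buildHitUrl(collectionUrl, params) {
  const query = Object.keys(params)
    .map((key) => key + '=' + encodeURIComponent(params[key]))
    .join('&');
  return collectionUrl + '?' + query;
}

// Sample values only; a real hit carries many more parameters.
const hitUrl = buildHitUrl('http://myxiti.intranet.net/spacer.gif', {
  s: 11025,
  p: 'home'
});
console.log(hitUrl);

// In a browser, requesting the image is what sends the hit (step 4);
// the collection server then answers with the 1x1 pixel (step 5).
if (typeof Image !== 'undefined') {
  new Image().src = hitUrl;
}
```

Because the hit is an ordinary image request, the collection server only needs to serve the pixel and log the request line.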

Data collection takes place on the internal network, which has no external communication. With this procedure you can disable all access from the Internet, so the entire network remains safe from external access.

When the data collection process is complete, the collection server sends the data (in the form of hourly log files) to the FTP server, and not vice versa. Step 6 shows that the flow leaves the internal network and never enters it.

The FTP server is located in a “web” zone with the same level of security as the web servers that host the sites. FTP access to this server can nevertheless be restricted so that only our server may request information from it (IP filter on the AT Internet server).

 

Technical setup

 

Creating an account/client configuration (AT Internet)

  • Creation of a specific log dedicated to receiving the data: a specific site ID is assigned to the Intranet when the process is set up with the client.
  • Creation of the account and site: an area dedicated to receiving the data is set up, along with a robot that retrieves the information.
  • Sending of the technical implementation documents: the following documents are sent to the client's IT department:
    • utils/code iso pays.xls
    • utils/aide_configuration_Apache.txt
    • utils/aide_configuration_IIS.txt
    • utils/spacer.gif
 

Implementing the tags

After setting up a buffer server, you will have a collection address that takes the following format:
http://myxiti.intranet.net/spacer.gif

It can be broken down into three parts:

  1. myxiti: collection subdomain
  2. intranet.net: collection domain
  3. /spacer.gif: collection pixel

These three elements are necessary in order to correctly initialise the tracker:

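// Collection settings taken from the collection address
var config = {
  log: 'myxiti',            // collection subdomain
  domain: 'intranet.net',   // collection domain
  pixelPath: '/spacer.gif'  // collection pixel
};
// Initialise the tracker with these settings
var tag = new ATInternet.Tracker.Tag(config);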

Apart from this configuration, tagging is the same as for a regular website.
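
The breakdown of the collection address into its three parts can be sketched as follows. The helper name configFromCollectionUrl is hypothetical, and the sketch assumes that the collection subdomain is always the first label of the host name.

```javascript
// Hypothetical helper: splits a collection address into the three values
// used in the tracker configuration (log, domain, pixelPath).
// Assumption: the subdomain is the first label of the host name.
function configFromCollectionUrl(collectionUrl) {
  const url = new URL(collectionUrl);
  const firstDot = url.hostname.indexOf('.');
  return {
    log: url.hostname.slice(0, firstDot),      // collection subdomain
    domain: url.hostname.slice(firstDot + 1),  // collection domain
    pixelPath: url.pathname                    // collection pixel
  };
}

console.log(configFromCollectionUrl('http://myxiti.intranet.net/spacer.gif'));
```

The returned object has the same shape as the config object passed to the tracker.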

 

Implementing data collection and log management

 

Log files

Log files are hourly (one file per hour of the day) and must follow a set format so that they can be processed. They are processed by the same servers that handle the standard files managed by our collectors, and must therefore follow the same model. The format covers both the file name and the file content.

Log files must contain only hits that relate to tag calls. In the example given, this means that only the lines relating to the file “/spacer.gif” are measured. If the server is dedicated to collecting Intranet logs, all of its data (correctly formatted) must be extracted on an hourly basis. If the collection server also handles other data, you will need to filter on the appropriate file, “/spacer.gif”. In this second case, it may be useful to use a label that is easy to find in the logs, for example “/intranet_hit.xiti”.
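
When the collection server also logs other requests, the filtering step might look like the sketch below. The function name keepTagHits and the sample lines are illustrative, and the sketch assumes the space-separated line layout described later in this document, where the requested file is the third field.

```javascript
// Keeps only the log lines that are hits on the collection pixel.
// Assumption: space-separated fields, requested file in third position
// (time, IP, image, ...).
function keepTagHits(lines, pixelPath) {
  return lines.filter((line) => line.split(' ')[2] === pixelPath);
}

// Illustrative lines: one tag hit and one unrelated request.
const sampleLines = [
  '00:53:23 123.123.234.215 /spacer.gif s=11025&p=home ua http://myxiti.intranet.net/',
  '00:53:24 123.123.234.215 /favicon.ico - ua -'
];
console.log(keepTagHits(sampleLines, '/spacer.gif').length); // 1
```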

 

Configuring the logs

The files are stored on the collection server, so we strongly recommend that you adjust its log storage configuration if necessary. Doing so makes it possible to:

  • configure log files that are specific to the tag (hour by hour, depending on the type of server and the log rotation frequency)
  • format the string expected on each hit line
  • avoid post-collection data processing, which is costly to develop and maintain

Examples of Apache and IIS server configurations are given in the documentation:

  • utils/aide_configuration_Apache.txt
  • utils/aide_configuration_IIS.txt

Rules for naming log files to be supplied:

exYYMMDDHH.log (zipped into exYYMMDDHH.zip)
  • YY: year
  • MM: month
  • DD: day
  • HH: hour
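
As an illustration of the naming rule, the sketch below builds a file name from a date. The function name logFileName is hypothetical, and the use of the server's local time is an assumption, since the naming rule does not state a time zone.

```javascript
// Builds the hourly log file name exYYMMDDHH.log from a Date.
// Assumption: the server's local time is used.
function logFileName(date) {
  const two = (n) => String(n).padStart(2, '0');
  return 'ex' +
    two(date.getFullYear() % 100) +  // YY
    two(date.getMonth() + 1) +       // MM (getMonth() is zero-based)
    two(date.getDate()) +            // DD
    two(date.getHours()) +           // HH
    '.log';
}

// 21 April 2023, 05:00 local time
console.log(logFileName(new Date(2023, 3, 21, 5))); // ex23042105.log
```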

Each line must contain the following fields, in this order, separated by a space:

  • Time (hh:mm:ss)
  • IP
  • Image
  • Querystring
  • User-agent
  • Url_of_the_page_loaded

For the tag in the earlier example, this gives the following line:

00:53:23 123.123.234.215 /spacer.gif s=11025&s2=5&p=section_admin::gestion_PLV&di=1&hl=0x53x17&cn=modem&hm=0&lng=fr-ch&r=1680x1050xundefinedx32&idclient=6443799&idpays=fr&idprov=43&ref=http://www.google.fr mozilla/4.0+(compatible;+msie+6.0;+windows+nt+5.1;+funwebproducts;+.net+clr+1.1.4322) http://myxiti.intranet.net/admin/gestion_plv.asp
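
To check that generated lines follow this pattern, a small parser such as the one below can help. It is only a sketch: the function name parseHitLine is hypothetical, the sample line is a shortened version of the example above, and the sketch assumes that no field contains a space (in the example, spaces in the user agent are written as “+”).

```javascript
// Parses one hit line in the expected order:
// time, IP, image, query string, user agent, page URL (space-separated).
// Assumption: no field contains a space.
function parseHitLine(line) {
  const fields = line.trim().split(' ');
  if (fields.length !== 6) {
    throw new Error('Expected 6 space-separated fields, got ' + fields.length);
  }
  const [time, ip, image, queryString, userAgent, pageUrl] = fields;
  return { time, ip, image, params: new URLSearchParams(queryString), userAgent, pageUrl };
}

// Shortened version of the example line.
const hit = parseHitLine(
  '00:53:23 123.123.234.215 /spacer.gif s=11025&s2=5&p=section_admin::gestion_PLV ' +
  'mozilla/4.0+(compatible;+msie+6.0) http://myxiti.intranet.net/admin/gestion_plv.asp'
);
console.log(hit.ip, hit.params.get('p'));
```

A line that does not split into exactly six fields is rejected, which makes formatting problems visible before the files are transferred.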
 

IIS specifics

Some IIS servers generate logs in a noticeably different layout because of their internal configuration, which users cannot modify. The “IP user” column may appear in 4th position instead of the expected 2nd position: with IIS you can only select the columns to be logged, not the order in which they appear, so the logs cannot be delivered in the format expected by AT Internet. In this case, you will obtain lines like the example below, and this second log format is used instead:

Example: IIS format 2

00:53:23 /spacer.gif s=11025&s2=5&p=section_admin::gestion_PLV&di=1&hl=0x53x17&cn=modem&hm=0&lng=fr-ch&r=1680x1050xundefinedx32&idclient=6443799&idpays=fr&idprov=43&ref=http://www.google.fr 123.123.234.215 mozilla/4.0+(compatible;+msie+6.0;+windows+nt+5.1;+funwebproducts;+.net+clr+1.1.4322) http://myxiti.intranet.net/admin/gestion_plv.asp
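
The difference between the two layouts comes down to the position of the IP column, as the sketch below illustrates. The text above suggests that this second format is accepted as is, so no reordering should be needed; the hypothetical function iisFormat2ToStandard only shows how the columns map onto each other, using a shortened sample line.

```javascript
// Reorders an "IIS format 2" line (time, image, query string, IP,
// user agent, page URL) into the standard column order
// (time, IP, image, query string, user agent, page URL).
function iisFormat2ToStandard(line) {
  const [time, image, queryString, ip, userAgent, pageUrl] = line.trim().split(' ');
  return [time, ip, image, queryString, userAgent, pageUrl].join(' ');
}

console.log(iisFormat2ToStandard(
  '00:53:23 /spacer.gif s=11025&p=home 123.123.234.215 mozilla/4.0 http://myxiti.intranet.net/'
));
```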
 

File transfers between the client and AT Internet

The client places their files on an FTP server hosted by AT Internet.

The domain to whitelist is: corpoftp.xiti.com.

 

Transfer

The files sent to us must be zipped, without encryption, and named according to the pattern specified in the section “Rules for naming log files”.

The files from the previous day are retrieved from 5am (GMT +1), in chronological order:

  • At 5am (GMT +1), file 00 from the previous day is retrieved
  • Slightly later, file 01, and so on
  • Finally, file 23

By this time, the client must have made all files from the previous day available. Files that are not available will not be processed.
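
Before the retrieval time, it can be useful to check that all 24 hourly archives of the previous day are present. The sketch below assumes that the list of available file names has already been obtained by some other means; the function name missingHourlyFiles and the sample names are illustrative.

```javascript
// Lists the hourly archives (hours 00 to 23) of a given day that are not
// present in a list of available file names.
function missingHourlyFiles(year, month, day, availableNames) {
  const two = (n) => String(n).padStart(2, '0');
  const prefix = 'ex' + two(year % 100) + two(month) + two(day);
  const missing = [];
  for (let hour = 0; hour < 24; hour++) {
    const name = prefix + two(hour) + '.zip';
    if (!availableNames.includes(name)) {
      missing.push(name);
    }
  }
  return missing;
}

// Only the first two hours of 20 April 2023 have been deposited.
const available = ['ex23042000.zip', 'ex23042001.zip'];
console.log(missingHourlyFiles(2023, 4, 20, available).length); // 22
```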

It is strongly recommended that the hourly log files be made available to us every hour.

 

Example of an empty hourly log

When an hour has no hits, the client must still generate a file in the hourly log format containing the word “empty”. The file is then taken into account without triggering a warning about a possibly defective file.
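
A minimal sketch of this rule is given below. The function name hourlyLogContent is hypothetical, and two points are assumptions rather than documented facts: that the file contains nothing other than the word “empty”, and that hit lines end with a line feed.

```javascript
// Returns the content to write into one hourly log file.
// Assumptions: an hour without hits yields the single word "empty",
// and each hit line is terminated by a line feed.
function hourlyLogContent(hitLines) {
  return hitLines.length === 0 ? 'empty' : hitLines.join('\n') + '\n';
}

console.log(hourlyLogContent([])); // empty
```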

Transfers are definitively validated by a designated AT Internet staff member on the basis of one test day. From that date, an automated process can be put in place.

Last update: 21/04/2023