The Agent
require 'rubygems'
require 'mechanize'

agent = WWW::[Link]

# Disable keep_alive when running into problems
# with session timeouts or getting an EOFError
agent.keep_alive = false

# Setting the user agent:
agent.user_agent = 'Friendly Mechanize Script'

# Using one of the predefined user agents:
# 'Mechanize', 'Mac Mozilla', 'Linux Mozilla',
# 'Windows IE 6', 'iPhone', 'Linux Konqueror',
# 'Windows IE 7', 'Mac FireFox', 'Mac Safari',
# 'Windows Mozilla'
agent.user_agent_alias = 'Mac Safari'

# To verify server certificates:
# (A collection of certificates is available
# here: [Link] )
agent.ca_file = '[Link]'

# Don't follow HTTP redirects
agent.redirect_ok = false

# Follow refresh in meta tags
agent.follow_meta_refresh = true

# Enable logging
require 'logger'
[Link] = [Link]('[Link]')
Accessing page elements

# The current page
[Link]

# The HTML page content
[Link]

# forms, links, frames
[Link], [Link], frames

# Selecting by criteria follows the pattern:
# [Link](s)_with(:criteria => value)
# The plural form (.elements) returns an
# array, the singular form (.element) the
# first matching element or nil. Criteria
# is an attribute symbol and value may be
# a string or a regular expression. If no
# criteria attribute is given, :name will
# be used. e.g.:
page.form_with(:name => 'formName')
page.form_with('formName')
page.links_with(:text => /[0-9]*/)

Forms

# Submitting a form without a button:
[Link]

# Submitting a form with the default button
form.click_button()

# Submitting a form with a specific button
form.click_button(form.button_with(:name => 'OK'))

# Form elements
[Link], [Link], form.file_uploads,
form.radio_buttons, [Link]

# Form elements can be selected just like page elements
# [Link](s)_with(:criteria => value)
# e.g.:
form.field_with(:name => 'password')
form.field_with('password')
[Link](:value => /view_.*/)

# Field values can also be selected directly by their name
[Link] = 'secret'

# Setting field values
# field       : .value = 'something'
# checkbox    : .(un)check / .checked = true|false
# radio_button: .(un)check / .checked = true|false
# file_upload : .file_name = '/tmp/[Link]'
# e.g.:
form.field_with('foo').value = 'something'
form.checkbox_with(:value => 'blue').uncheck
form.radio_buttons[3].check

# Select lists / drop down fields:
form.field_with('color').options[2].select
form.field_with('color').[Link]{|o| [Link] == 'red'}.select
form.field_with('color').select_none
form.field_with('color').select_all
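
The matching rules behind the *_with selectors can be illustrated without a network connection. The following is only a toy model in plain Ruby, not Mechanize's own code: the Element struct and the helpers elements_with and element_with are hypothetical names invented for this sketch. It assumes that a bare value is shorthand for matching on :name and that a value is compared with ===, so a string must be equal while a regular expression must match.

```ruby
# Toy stand-in for a page element; NOT a Mechanize class.
Element = Struct.new(:name, :text, :value)

# Hypothetical helper modelling the plural *_with selectors:
# returns an array of all elements matching the criteria.
def elements_with(elements, criteria)
  # A bare value is treated as shorthand for :name => value.
  criteria = { :name => criteria } unless criteria.is_a?(Hash)
  elements.select do |element|
    # Regexp#=== tests for a match, String#=== tests for equality,
    # so a single comparison covers both kinds of value.
    criteria.all? { |attribute, value| value === element.send(attribute) }
  end
end

# Hypothetical helper modelling the singular *_with selectors:
# returns the first matching element or nil.
def element_with(elements, criteria)
  elements_with(elements, criteria).first
end

links = [
  Element.new('home',  'Home',   nil),
  Element.new('page2', 'Page 2', nil),
  Element.new('page3', 'Page 3', nil)
]

numbered = elements_with(links, :text => /[0-9]+/)  # regular expression value
by_name  = element_with(links, 'home')              # defaults to :name
missing  = element_with(links, :name => 'nope')     # no match

puts numbered.map(&:name).inspect
puts by_name.text
puts missing.inspect
```

With real Mechanize objects the same calls would read page.links_with(:text => /[0-9]+/) or form.field_with('password'); the model only shows why the plural form always yields an array (possibly empty) while the singular form may yield nil, which is worth checking before calling .click or .value on the result.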
Ruby / Mechanize
[Link]
Nokogiri
[Link]
Navigation/History

# Load a page
[Link]('[Link]

# Go back to the last page:
[Link]

# Follow a link by its text
agent.link_with(:text => 'click me').click

# Backup history, execute block and
# restore history
[Link] do
  ...
end

Parsing the page content

# Selecting elements from the document's DOM
nodes = [Link]('expression')
nodes = [Link] / 'expression'

# Selecting the first matching element or nil
node = [Link]('expression')

# 'expression' might be an XPath or CSS selector
nodes = [Link]('//h2/a[@class="title"]')
nodes = [Link]('.h2 [Link]')

# Navigating the document tree:
[Link]
[Link]

# Node content and attributes
[Link]
node.inner_html
[Link]['width']

# Found nodes can be searched the same way
rows = [Link] / 'table/tr'
value = rows[0].at('td[@class="value"]').text

Hello Mechanize!

require 'rubygems'
require 'mechanize'

agent = WWW::[Link]
[Link]('[Link]
[Link] = 'mechanize'
[Link].click_button
[Link].link_with(:text => /WWW::Mechanize/).click
[Link].link_with(:text => 'Files').click
links = [Link] / 'strong/a'
version = [Link] do |link|
  link['href'] =~ /shownotes.*release_id/
[Link]
puts "Hello Mechanize #{version}!"
Version 2010-01-30 (c) 2010 Tobias Grimm
Creative Commons License [Link]