Showing posts with label APIs. Show all posts

Tuesday, November 3, 2015

Using the wikipedia Python library (to search for oranges :)

By Vasudev Ram



Orange and orange juice image from Wikimedia Commons.

I had come across the wikipedia Python library some time ago. Note that I said "wikipedia Python library", not "Wikipedia Python API". That's because wikipedia is a Python library that wraps the Wikipedia API, providing a somewhat higher-level / easier-to-use interface to the programmer.

Here are a few basic ways of using this library:

First, install it with this command at your OS command line:
$ pip install wikipedia
(I am using $ as the command line prompt, so don't type it.)

Now the Python code snippets:
Import the library:
import wikipedia
Use the .page() method to search for a Wikipedia page:
print "1: Searching Wikipedia for 'Orange'"
try:
    print wikipedia.page('Orange')
except wikipedia.exceptions.DisambiguationError as e:
    print str(e)
    print 'DisambiguationError: The page name is ambiguous'
print
The output (partly truncated) is:
1: Searching Wikipedia for 'Orange'
"Orange" may refer to:
Orange (colour)
Orange (fruit)
Some other citrus or citrus-like fruit
Orange (manga)
Orange (2010 film)
Orange (2012 film)
Oranges (film)
The Oranges (film)
Orange Record Label
Orange (band)
Orange (Al Stewart album)
Orange (Jon Spencer Blues Explosion album)
"Orange" (song)
Between the Eyes
"L'Orange" (song)
DisambiguationError: The page name is ambiguous
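The exception object also carries the candidate titles as a list, so the program can pick one instead of just printing them. Here is a minimal sketch, assuming the library's DisambiguationError exposes the candidates through an options attribute; the helper names first_matching_option and page_for are made up for this illustration, and page_for needs network access, so it is defined but not called here:

```python
def first_matching_option(options, keyword):
    """Return the first candidate title containing keyword (case-insensitive), or None."""
    keyword = keyword.lower()
    for title in options:
        if keyword in title.lower():
            return title
    return None


def page_for(title, keyword):
    """Fetch a page; on ambiguity, retry with the candidate that mentions keyword."""
    import wikipedia  # third-party: pip install wikipedia (needs network access)
    try:
        return wikipedia.page(title)
    except wikipedia.exceptions.DisambiguationError as e:
        # e.options is assumed to hold the list of candidate page titles
        choice = first_matching_option(e.options, keyword)
        if choice is None:
            raise
        return wikipedia.page(choice)


# The helper can be tried offline with a few titles from the output above:
candidates = ['Orange (colour)', 'Orange (fruit)', 'Orange (manga)']
print(first_matching_option(candidates, 'fruit'))
```

With something like page_for('Orange', 'fruit'), the ambiguous title would be retried as 'Orange (fruit)'.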
Next, use the .page method with one of the results from above, which are actual page titles:
print "2: Searching Wikipedia for 'Orange_(fruit)'"
print wikipedia.page('Orange_(fruit)')
The output may not be what one expects:
2: Searching Wikipedia for 'Orange_(fruit)'
<WikipediaPage 'Orange (fruit)'>
That's because the return value from the above call is a WikipediaPage object, not the page content itself. To get the content we want, we have to access the 'content' attribute of the WikipediaPage object:
#print wikipedia.page('Orange_(fruit)').content
However, if we print it directly, we may get a Unicode error (the text contains non-ASCII characters such as ×), so we encode it to UTF-8:
result = wikipedia.page('Orange_(fruit)').content.encode('UTF8')
print "3: Result of searching Wikipedia for 'Orange_(fruit)':"
print result
orange_count = result.count('orange')
print
print "The Wikipedia page for 'Orange_(fruit)' has " + \
    "{} occurrences of the word 'orange'".format(orange_count)
Here are the first few lines of the output, followed by the count at the end:
3: Result of searching Wikipedia for 'Orange_(fruit)':
The orange (specifically, the sweet orange) is the fruit of the citrus species Citrus × sinensis in the family Rutaceae.
The fruit of the Citrus × sinensis is considered a sweet orange, whereas the fruit of the Citrus × aurantium is considered a bitter orange. The sweet orange reproduces asexually (apomixis through nucellar embryony); varieties of sweet orange arise through mutations.
The orange is a hybrid, between pomelo (Citrus maxima) and mandarin (Citrus reticulata). It has genes that are ~25% pomelo and ~75% mandarin; however, it is not a simple backcrossed BC1 hybrid, but hybridized over multiple generations. The chloroplast genes, and therefore the maternal line, seem to be pomelo. The sweet orange has had its full genome sequenced. Earlier estimates of the percentage of pomelo genes varying from ~50% to 6% have been reported.
Sweet oranges were mentioned in Chinese literature in 314 BC. As of 1987, orange trees were found to be the most cultivated fruit tree in the world. Orange trees are widely grown in tropical and subtropical climates for their sweet fruit. The fruit of the orange tree can be eaten fresh, or processed for its juice or fragrant peel. As of 2012, sweet oranges accounted for approximately 70% of citrus production.
In 2013, 71.4 million metric tons of oranges were grown worldwide, production being highest in Brazil and the U.S. states of Florida and California.

The Wikipedia page for 'Orange_(fruit)' has 172 occurrences of the word 'orange'
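Note that str.count is case-sensitive and matches substrings, so an 'Orange' at the start of a sentence is not counted, while the 'orange' inside 'oranges' is. If a whole-word, case-insensitive count is wanted instead, something like the following sketch could be used. It needs only the standard library; count_word is a hypothetical helper name and the sample text is made up:

```python
import re


def count_word(text, word):
    """Count whole-word, case-insensitive occurrences of word in text."""
    pattern = r'\b' + re.escape(word) + r'\b'
    return len(re.findall(pattern, text, flags=re.IGNORECASE))


sample = "Orange trees are common. The orange is sweet."
print(sample.count('orange'))         # substring match, case-sensitive
print(count_word(sample, 'orange'))   # whole word, case-insensitive
```

On this sample, the first count misses the capitalized 'Orange' and the second one includes it.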
- Enjoy.

- Vasudev Ram - Online Python training and programming

Signup to hear about new products and services I create.

Posts about Python  Posts about xtopdf

My ActiveState recipes

Saturday, March 7, 2015

PDFCrowd and its HTML to PDF API (for Python and other languages)

By Vasudev Ram


PDFcrowd is a web service that I came across recently. It allows users to convert HTML content to PDF. This can be done both via the PDFcrowd site - by entering either the content or the URL of an HTML page to be converted to PDF - or via the PDFcrowd API, which has support for multiple programming languages, including Python. I tried multiple approaches, and all worked fairly well.

A slightly modified version of a simple PDFcrowd API example from their site is shown below.

# Demo program to show how to use the PDFcrowd API
# to convert HTML content to PDF.
# Author: Vasudev Ram - www.dancingbison.com

import pdfcrowd

try:
    # create an API client instance
    # Dummy credentials used; to actually run the program, enter your own.
    client = pdfcrowd.Client("user_name", "api_key")
    client.setAuthor('author_name')
    # Dummy credentials used; to actually run the program, enter your own.
    client.setUserPassword('user_password')

    # Convert a web page and store the generated PDF in a file.
    pdf = client.convertURI('http://www.dancingbison.com')
    with open('dancingbison.pdf', 'wb') as output_file:
        output_file.write(pdf)
    
    # Convert a web page and store the generated PDF in a file.
    pdf = client.convertURI('http://jugad2.blogspot.in/p/about-vasudev-ram.html')
    with open('jugad2-about-vasudevram.pdf', 'wb') as output_file:
        output_file.write(pdf)

    # Convert an HTML string and save the result to a file.
    html = "My Small HTML File"
    with open('html.pdf', 'wb') as output_file:
        client.convertHtml(html, output_file)

except pdfcrowd.Error as why:
    print 'Failed:', why
I used three calls to the API. For the first two calls, the inputs were: 1) my web site, 2) the about page of my blog. The third call converted a short HTML string.

Screenshots of the results of those two calls are below. You can see that they correspond closely to the originals.

Screenshot of generated PDF of dancingbison.com site



Screenshot of generated PDF of About Vasudev Ram page on jugad2.blogspot.com blog



- Vasudev Ram - Online Python training and programming

Dancing Bison Enterprises


Friday, January 30, 2015

Twitter Adds Group Messaging and Native Video

By Vasudev Ram

Saw this via a tweet.

http://www.programmableweb.com/news/twitter-adds-group-messaging-and-native-video/elsewhere-web/2015/01/27


Should be interesting to check it out once ready. They say the group messaging feature will allow for messaging to small groups of up to 20 at a time. Could be useful for some applications.

- Vasudev Ram - Dancing Bison Enterprises



Wednesday, December 17, 2014

Tortilla, a Python API wrapper

By Vasudev Ram



tortilla is a Python library for wrapping APIs. Its headline says "Wrapping web APIs made easy."

It can be installed with:
pip install tortilla
I tried it out, and slightly modified an example given in its documentation, to give this:
# test_tortilla.py
import tortilla
github = tortilla.wrap('https://api.github.com')
user = github.users.get('redodo')
for key in user:
    print key, ":", user[key]
That code uses the GitHub API (wrapped by tortilla) to get the information for user redodo, who is the creator of tortilla.
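To give a feel for how such a wrapper can turn attribute access into URL paths, here is a toy sketch. It is not tortilla's actual implementation; the Endpoint class and its url method are invented for this illustration, and it only builds the URL without making any request:

```python
class Endpoint(object):
    """Toy URL builder: each attribute access appends one path segment."""

    def __init__(self, base_url):
        self._url = base_url.rstrip('/')

    def __getattr__(self, name):
        # Called only for attributes not found normally, e.g. .users
        if name.startswith('_'):
            raise AttributeError(name)
        return Endpoint(self._url + '/' + name)

    def url(self, *parts):
        """Return the full URL, with optional extra path segments appended."""
        return '/'.join([self._url] + [str(p) for p in parts])


github = Endpoint('https://api.github.com')
print(github.users.url('redodo'))
```

A real wrapper would go one step further and issue an HTTP GET for that URL, then parse the JSON response.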
Here is the output of running:
python test_tortilla.py
bio : None
site_admin : False
updated_at : 2014-12-17T16:39:55Z
gravatar_id : 
hireable : True
id : 2227416
followers_url : https://api.github.com/users/redodo/followers
following_url : https://api.github.com/users/redodo/following{/other_user}
blog : 
followers : 6
location : Kingdom of the Netherlands
type : User
email : dodo@gododo.co
public_repos : 9
events_url : https://api.github.com/users/redodo/events{/privacy}
company : 
gists_url : https://api.github.com/users/redodo/gists{/gist_id}
html_url : https://github.com/redodo
subscriptions_url : https://api.github.com/users/redodo/subscriptions
received_events_url : https://api.github.com/users/redodo/received_events
starred_url : https://api.github.com/users/redodo/starred{/owner}{/repo}
public_gists : 0
name : Hidde Bultsma
organizations_url : https://api.github.com/users/redodo/orgs
url : https://api.github.com/users/redodo
created_at : 2012-08-27T13:03:15Z
avatar_url : https://avatars.githubusercontent.com/u/2227416?v=3
repos_url : https://api.github.com/users/redodo/repos
following : 2
login : redodo
Adding:
print type(user)
to the end of test_tortilla.py, shows that the user object is of type bunch.Bunch.

Bunch is a Python module providing "a dictionary that supports attribute-style access, a la JavaScript."
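The idea behind such a dictionary can be sketched in a few lines. This is a simplified illustration, not Bunch's actual code; AttrDict is a made-up name, and the sample values are taken from the output above:

```python
class AttrDict(dict):
    """A dict whose keys can also be read and set as attributes."""

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails
        try:
            return self[name]
        except KeyError:
            raise AttributeError(name)

    def __setattr__(self, name, value):
        self[name] = value


user = AttrDict(login='redodo', followers=6)
print(user['login'])    # normal dictionary access
print(user.login)       # attribute-style access to the same key
user.location = 'Kingdom of the Netherlands'
print(user['location'])
```

This is why, in test_tortilla.py, both user['login'] and user.login would work on the returned object.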

Did you know that tortillas are roughly similar to rotis?

- Vasudev Ram - Dancing Bison Enterprises


Thursday, September 18, 2014

API Developer Weekly

By Vasudev Ram

Saw this today, via this tweet by John Musser, founder of ProgrammableWeb:

API Developer Weekly, as the name suggests, is a weekly about APIs for developers, sponsored by LaunchAny and Casey Software.

From their site:

"API Developer Weekly is a weekly newsletter that is hyper-focused on the business, design, development, and deployment of APIs for web and mobile apps."

Just signed up for it and will see how interesting it is, over the next few weeks. I've been working on designing and implementing APIs for a while now, and also evangelizing them to some extent - see here, for example:

Winners of Bit.ly API Contest announced

The people behind API Developer Weekly are also coming out with an API Design Book.

If you want to try it, you can sign up for it here:

API Developer Weekly

- Vasudev Ram - Dancing Bison Enterprises - Online Python training


Wednesday, January 16, 2013

Some Google command line tools

googlecl is a command line tool (written in Python) that lets you use some Google services, such as Blogger, Calendar, Contacts, Docs, Finance, Picasa and YouTube.

googlecl - Command line tools for the Google Data APIs - Google Project Hosting

You can use it to add a post to your blog, add a calendar entry, edit a document (the editor to use is a configurable option), download your contacts, upload a video, etc. For those used to the command line, it can be faster than accessing the corresponding service via the web interface.

gsutil allows you to manipulate your data on cloud storage providers such as Google Storage and Amazon S3. It uses the boto Python library, written by Mitch Garnaat, which I had blogged about earlier.

code.google.com/p/gsutil/

https://github.com/boto/boto#readme

bq is a Python command line tool to access Google BigQuery, which I had also blogged about before.

https://developers.google.com/bigquery/docs/cli_tool

If you are interested in creating command line tools for Unix or Linux systems, you may like to read my tutorial article on the topic, published on IBM developerWorks:

Developing a Linux command-line utility:

http://www.ibm.com/developerworks/linux/library/l-clutil/

Though it uses the C language, many of the concepts discussed in the article are applicable to writing command line tools in Python as well, because, when writing such tools, you will basically be using the C library and Unix features such as standard input, standard output, pipes and I/O redirection, but via Python.
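As a small illustration of that point, here is a sketch of a Unix-style filter written in Python. It reads lines from an input stream and writes numbered lines to an output stream, so it could sit in the middle of a pipeline. The function name number_lines is an arbitrary choice for this example, and an in-memory stream stands in for standard input so that the snippet runs on its own:

```python
import io
import sys


def number_lines(infile, outfile):
    """Copy infile to outfile, prefixing each line with its line number."""
    for lineno, line in enumerate(infile, 1):
        outfile.write('%6d  %s' % (lineno, line))


# A real filter would call: number_lines(sys.stdin, sys.stdout)
# and be used like:         ls | python number_lines.py | less
demo_input = io.StringIO(u"alpha\nbeta\n")
number_lines(demo_input, sys.stdout)
```

Because the function takes file-like objects rather than touching sys.stdin directly, the same code works with pipes, redirected files, or test data.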

- Vasudev Ram
www.dancingbison.com

Thursday, August 23, 2012

GeoNames, big geographical DB with web services

By Vasudev Ram


Seen via Smashing Magazine on Twitter.

GeoNames is a "geographical database that covers all countries and contains over eight million placenames that are available for download free of charge."

I did a quick search from the GeoNames home page for the city where I live. It returned a lot of data (with links), not just for the city itself, but also related data, such as the division / sub-division the city is in, and some well-known landmarks in the city, like hotels.

The site / web services (see below) could be useful for apps that need to be geographically-aware in some way. To get a quick feel for the GeoNames web services, try this demo API call in your browser. It uses the GeoNames postalCodeSearch service with a parameter postalcode=9011, to return geographical information for St. Gallen in Switzerland, as well as other places in the world (such as in Hungary, Norway, Argentina, etc.), which have that postal code.
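A sketch of building such a request URL in Python follows. The endpoint name postalCodeSearchJSON, the maxRows parameter, and the shared demo username are assumptions drawn from the GeoNames web services documentation; check the current documentation (and register your own username) before relying on them. The sketch only constructs the URL and does not fetch it:

```python
try:
    from urllib.parse import urlencode   # Python 3
except ImportError:
    from urllib import urlencode         # Python 2

# Assumed endpoint for the JSON variant of the postalCodeSearch service
GEONAMES_BASE = 'http://api.geonames.org/postalCodeSearchJSON'


def postal_code_search_url(postalcode, username='demo', max_rows=10):
    """Build a postalCodeSearch URL for the given postal code."""
    params = [
        ('postalcode', postalcode),
        ('maxRows', max_rows),
        ('username', username),   # 'demo' is assumed to be a shared, rate-limited account
    ]
    return GEONAMES_BASE + '?' + urlencode(params)


print(postal_code_search_url('9011'))
```

The resulting URL can be pasted into a browser or fetched with any HTTP client to inspect the response.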

About GeoNames

GeoNames team

GeoNames has web services available for use. Most of the services (see overview) seem to use XML or JSON as formats, though other formats such as CSV, TXT and RSS are also supported for a few of them.

The data is provided free, commercial usage is allowed, and there are limits on daily and hourly usage. NOTE: The data is provided "as-is-where-is", i.e. with no guarantees.

They also have commercial web services which provide better speed, etc. and have Service Level Agreements.

If you wish, you can just download the daily database export, and build your app using that, instead of using the web services.

- Vasudev Ram - Dancing Bison Enterprises

Sunday, August 19, 2012

DocRaptor, HTML to PDF convertor

By Vasudev Ram


DocRaptor is a tool for HTML to PDF conversion.
It supports HTTP POST requests using C#, Curl, jQuery, Node.js, PHP, Prototype.js, Python, Ruby, and Rails.

DocRaptor examples

DocRaptor Python examples

- Vasudev Ram - Dancing Bison Enterprises



Thursday, October 13, 2011

PyAudio and PortAudio - like ODBC for sound

By Vasudev Ram - dancingbison.com | @vasudevram | jugad2.blogspot.com

PyAudio and PortAudio are I/O libraries for sound (i.e. audio).

I'm stretching the analogy a bit here, but they made me think:

"... like ODBC for sound". (*)

PyAudio is a Python interface to PortAudio.

PyAudio:

http://people.csail.mit.edu/hubert/pyaudio/

Excerpt:

[ PyAudio provides Python bindings for PortAudio, the cross-platform audio I/O library. With PyAudio, you can easily use Python to play and record audio on a variety of platforms. ]

PortAudio:

http://www.portaudio.com/

PortAudio apps:

http://www.portaudio.com/apps.html

I installed PyAudio for Windows. Installation was trivial. It also automatically installed the PortAudio DLL (actually, the .pyd file).

I then tried a PyAudio Python program from the docs to play a small .WAV file. It worked.

(*) That's because PyAudio and PortAudio support both:

a) different platforms (Windows, Linux, Mac, UNIX)

b) different "Host APIs" on the same platform, where the different Host APIs have, obviously, different interfaces, but PortAudio (and hence PyAudio) hides those differences behind a uniform interface (to some extent).


UPDATE: If you are interested in checking out other Python multimedia libraries, you may also like to read this earlier post of mine about pyglet:

Playing an MP3 with pyglet and Python is easy

pyglet has some of the same benefits as PyAudio - cross-platform, easy to use, etc.

Posted via email

- Vasudev Ram @ Dancing Bison

Tuesday, May 3, 2011

Try the Python WinSound API

By Vasudev Ram - www.dancingbison.com


UPDATE: Added more info and a link to the WinSound docs to this post.


The Python WinSound API (the winsound module) is built into Python on the Windows platform. You can use it to play a sound of a specific frequency for a specific duration (in milliseconds). The API also lets you play a sound from a WAV file, or from the contents of a WAV file stored in a string, and also lets you play standard Windows registry sounds that are associated with different events/actions in Windows, such as displaying different kinds of dialogs (alert, question, exiting Windows sound, etc.). It also lets you play a sound repeatedly.
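A brief sketch of a few of those calls follows, using functions and flags documented for the winsound module. The file name example.wav is a placeholder, and clamp_frequency is a hypothetical helper added for this example (Beep documents a valid frequency range of 37 through 32,767 Hz). The import is guarded so that the snippet can at least be loaded on non-Windows systems, where demo() simply does nothing:

```python
try:
    import winsound   # available only on Windows
except ImportError:
    winsound = None


def clamp_frequency(freq):
    """Clamp freq to the range that winsound.Beep accepts (37..32767 Hz)."""
    return max(37, min(32767, int(freq)))


def demo():
    """Exercise a few winsound calls; returns False on non-Windows systems."""
    if winsound is None:
        return False
    winsound.Beep(clamp_frequency(440), 500)                  # 440 Hz for 500 ms
    winsound.PlaySound('example.wav', winsound.SND_FILENAME)  # play a WAV file
    winsound.PlaySound('SystemExit', winsound.SND_ALIAS)      # a registry sound
    winsound.MessageBeep(winsound.MB_ICONEXCLAMATION)         # a standard alert
    return True


# On Windows, call demo() to hear the sounds.
print(clamp_frequency(20))
```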


This is the Python docs page for the WinSound API


Here is a very simple program to try out the Python WinSound API.

Copy-paste the program below into a text editor file, save it with some filename like test_win_sound.py and then run it with Python like this:

C:\> python test_win_sound.py

Switch on your PC speakers first, if needed.

The program has a doubly nested loop that iterates over a few durations and within those, iterates over a few frequencies, and in each iteration, plays a sound at that frequency for that duration.

Here is the program:


#------------------------------------------------------------
# test_win_sound.py - testing the Python WinSound API.
# Author: Vasudev Ram - www.dancingbison.com
# Will work only on Windows.
#------------------------------------------------------------
# imports

import time
import winsound

#------------------------------------------------------------

def play_note():
    pass

#------------------------------------------------------------

def test_play_freq_list(freq_list, dur=500):
    for freq in freq_list:
        winsound.Beep(freq, dur)

#------------------------------------------------------------

def main():
    for dur in (100, 200, 300, 400):
        test_play_freq_list(range(300, 1200, 100), dur)
        time.sleep(2)

#------------------------------------------------------------

if __name__ == "__main__":
    main()

#------------------------------------------------------------


Posted via email
- Vasudev Ram - Dancing Bison Enterprises