
Wednesday, August 26, 2020

Jupyter: JUlia PYThon and R

it's "ggplot2", not "ggplot", but it is ggplot()

Did you know that the name of @projectJupyter's Jupyter Notebook (and JupyterLab) came from combining 3 programming languages: JUlia, PYThon and R?

Readers of my blog do not need an introduction to Python. But what about the other 2?

Today we will talk about R. Actually, R and Python, on the Raspberry Pi.

R Origin

R traces its origins to the S statistical programming language, developed in the 1970s at Bell Labs by John M. Chambers. He is also the author of books such as Computational Methods for Data Analysis (1977) and Graphical Methods for Data Analysis (1983). R is an open source implementation of that statistical language. It is compatible with S but also has enhancements over the original.

A quick getting started guide is available here: https://support.rstudio.com/hc/en-us/sections/200271437-Getting-Started

Installing Python

As a recap, in case you don't have Python 3 and a few basic modules, the installation goes as follows (open a terminal window first):

pi@raspberrypi: $ sudo apt install python3 python3-dev build-essential

pi@raspberrypi: $ sudo pip3 install jedi pandas numpy

Installing R

Installing R is equally easy:

pi@raspberrypi: $ sudo apt install r-recommended

We also need to install a few development packages:

pi@raspberrypi: $ sudo apt install libffi-dev libcurl4-openssl-dev libxml2-dev

This will allow us to install many packages in R. Now that R is installed, we can start it:

pi@raspberrypi: $ R

Installing packages

Once inside R, we can install packages using install.packages('name') where name is the name of the package. For example, to install ggplot2 (to install tidyverse, simply replace ggplot2 with tidyverse):

> install.packages('ggplot2')

To load it:

> library(ggplot2)

And we can now use it. We will use the mpg dataset, plot displacement vs highway miles per gallon, and set the color to the class of vehicle:

> ggplot(mpg, aes(displ, hwy, colour=class)) +
    geom_point()

Combining R and Python

We can go about this two ways: call R from Python, or call Python from R. Here, we will call Python from R.

First, we need to install reticulate (the package that interfaces with Python):

> install.packages('reticulate')

And load it:

> library(reticulate)

We can verify which Python binary reticulate is using:

> py_config()

Then we can use it to execute some Python code. For example, to import the os module and use os.listdir() from R, we do the following (R's $ works much like Python's .):

> os <- import("os")

> os$listdir(".")
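For comparison, here is what those two R calls correspond to on the Python side. This is plain Python rather than reticulate, and the variable name entries is only there for illustration:

```python
import os

# R's os$listdir(".") corresponds to attribute access with a dot in Python
entries = os.listdir(".")
print(entries)
```

In both cases the result is the list of names found in the current working directory.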

Or even enter a Python REPL:

> repl_python()
>>> import pandas as pd

>>>

Type exit to leave the Python REPL.

One more trick: Radian

We will now exit R (quit()) and install radian, a command line REPL for R that is fully aware of the reticulate and Python integration:

pi@raspberrypi: $ sudo pip3 install radian

pi@raspberrypi: $ radian

This is just like the R REPL, only better. And you can switch to python very quickly by typing ~:

r$> ~

As soon as the ~ is typed, radian enters the python mode by itself:

r$> reticulate::repl_python()

>>>

Hitting backspace at the beginning of the line switches back to the R REPL:

r$>

I'll cover more functionality in a future post.

Francois Dion

@f_dion

Tuesday, October 11, 2016

PyData Carolinas 2016 Tutorial: Datascience on the web

PyData Carolinas 2016

Don Jennings and I presented a tutorial at PyData Carolinas 2016: Datascience on the web.

The plan was as follows:

Description

Learn to deploy your research as a web application. You have been using Jupyter and Python to do some interesting research, build models, visualize results. In this tutorial, you’ll learn how to easily go from a notebook to a Flask web application which you can share.

Abstract

Jupyter is a great notebook environment for Python based data science and exploratory data analysis. You can share the notebooks via a github repository, as html or even on the web using something like JupyterHub. How can we turn the work we have done in the notebook into a real web application?

In this tutorial, you will learn to structure your notebook for web deployment, how to create a skeleton Flask application, add a model and add a visualization. While you have some experience with Jupyter and Python, you do not have any previous web application experience.

Bring your laptop and you will be able to do all of these hands-on things:

get to the virtual environment
review the Jupyter notebook
refactor for reuse
create a basic Flask application
bring in the model
add the visualization
profit!

Now that it has been presented, the artifacts are a github repo and a youtube video.

Github Repo

https://github.com/fdion/pydata/

After the fact

The unrefactored notebook is here while the refactored one is here.

Once you run through the whole refactored notebook, you will have train and test sets saved in data/ and a trained model in trained_models/. To make these available in the tutorial directory, you will have to run the publish.sh script. In a Unix-like environment (Mac, Linux, etc.):

chmod a+x publish.sh
./publish.sh

Video

The whole session is now on youtube: Francois Dion & Don Jennings Datascience on the web

Francois Dion
@f_dion

Friday, September 30, 2016

5 music things

5 in 5

I like to cover 5 things in 5 minutes for lightning talks. Or one thing. At the local
Python user group, sometimes questions or other circumstances turn these 5
in 5 more into a 5 in 10-15...

5 Music Things

Eventually, after a year or two, I'll revisit a subject. I recently noticed that I had
not talked about music related things in almost two and a half years, so I did
5 quick Jupyter notebooks and presented that. Interestingly enough, none of
these 5 things were covered back then. The github repo includes edited versions
of the notebooks, based on the interactions at the meeting during my presentation.

Requirements: All require the following

pip install jupyter

Alphabetically...

1 - Audio

Notebook

2 - libROSA

Here we will need to pip install matplotlib and numpy, and of course librosa.

Notebook

3 - music21

pip install music21

You'll need some external programs: Lilypond and Musescore

You also need launch scripts for each of them. On a mac, use the provided
launch scripts in the mac/ folder of this repo. Make sure you chmod a+x them.
Change the path in the notebook to reflect your own user path.

Notebook

4 - python-sonic

pip install python-sonic

You'll need one external program: Sonic Pi and to start it before running through
the notebook.

Notebook

5 - pyKnon

pip install pyknon

You'll need one external program: timidity

easily installed:

in Linux with apt-get install timidity
on a Mac with brew install timidity

This was mostly an excuse to demo that external command line tools like timidity
or sox can be used here.

Notebook

Have fun!

@f_dion - francois(dot)dion(at)gmail(dot)com

P.S.: Github repo at: https://github.com/fdion/5_music_things but for some strange reason, github will not render the first (0-StartHere) notebook. This blog post is basically that notebook, putting things in context.

Sunday, September 25, 2016

Something for your mind: Polymath Podcast Episode 001

Two topics will be covered:

Chipmusic, limitations and creativity

Numfocus (Open code = better science)

The numfocus interview was recorded at PyData Carolinas 2016. There will be a future episode covering the keynotes, tutorials, talks and lightning talks later this year. This interview was really more about open source and less about PyData.

The episode concludes with Learn more, on Claude Shannon and Harry Nyquist.

Something for your mind is available on

art·chiv.es

/'ärt,kīv/

at artchiv.es/s4ym/

Francois Dion
@f_dion

Sunday, September 18, 2016

Something for your mind: Polymath Podcast launched

Some episodes

will have more Art content, some will have more Business content, some will have more Science content, and some will be a nice blend of different things. But for sure, the show will live up to its name and provide you with “something for your mind”. It might raise more questions than it answers, and that is fine too.

Episode 000

Listen to Something for your mind on http://Artchiv.es

Francois Dion
@f_dion

Sunday, August 9, 2015

Computing at 80,000ft, future tech and the future of tech

Another exciting Winston-Salem Section meeting at CDI ! Wednesday, August 12, 2015 at 11:30am.

Presenter: Francois Dion

Originally from Montreal, Canada, Francois Dion is a Coder, Data Scientist, Entrepreneur, Hacker, Mentor, Musician, Polyglot, Photographer, Polymath and Sound Engineer. He is the founder of Dion Research LLC, an R&D firm specializing in Fully Integrated Software and Hardware (www.dionresearch.com) and works as a Data Scientist at Inmar, Inc. (www.inmar.com).

He is also the founder of the local Python user group (PYPTUG), a group he founded to “promote and advance computing, electronics and science in general in North Carolina using the Python programming language.”

Detail:

Behind the scenes: various aspects of electronics, cluster computing and data science in near space. A glimpse at future technology, and at the future of technology.

When
Date: 12-August-2015
Time: 11:30AM to 01:30PM (2.00 hours)
All times are: America/New_York


Where

Building: CDI

Center For Design Innovation

450 Design Ave.
Winston Salem, North Carolina
United States 27102


Tuesday, December 4, 2012

On fractals

I did say I'd talk about fractals, didn't I? I've been fascinated by them for a good 25 years now... A few weeks ago I attended a presentation at the local IEEE chapter. It didn't feature a lot of graphics, but instead it focused on practical applications of fractals to analyze lots of data. So I figured I'd bring them up at some point on this blog.

Today I'll mention some stuff just to whet your appetite.

Definition...

Or perhaps not. According to wikipedia:

"There is some disagreement amongst authorities about how the concept of a fractal should be formally defined. The general consensus is that theoretical fractals are infinitely self-similar, iterated and detailed mathematical constructs having fractal dimensions, of which many examples have been formulated and studied in great depth..."

Clear as mud, I'm sure. But still, fractals are quite interesting. Not just from a mathematical standpoint, but from a wide variety of angles.

Back when I was in school, one of my math teachers talked a good bit about fractals. The term had been around less than 10 years, but already it was quite popular in academic circles. I don't really remember anything he told us, but I will never forget the picture of the Mandelbrot set he showed us.

To him, fractals were fascinating equations. To us, they were fascinating graphics:

Actually... no, that is the Python code!

Yep, that's it! The Mandelbrot set

Regarding the source code, that's not how Python code looks usually. This is what is called obfuscated code. It is an art form in itself. The source was taken from the Preshing blog.

For a non obfuscated version of the code, and lots of explanations, check out: Python Patterns.

Sierpinski

Before you can run, you have to learn how to walk. Mandelbrot demanded a lot of horsepower back in the mid 80s. My Apple ][ was only 1MHz and my Mac, 8MHz. It would have taken days. So I played around with assembler and Pascal, doing this:


                               @                               
                              @ @                              
                             @   @                             
                            @ @ @ @                            
                           @       @                           
                          @ @     @ @                          
                         @   @   @   @                         
                        @ @ @ @ @ @ @ @                        
                       @               @                       
                      @ @             @ @                      
                     @   @           @   @                     
                    @ @ @ @         @ @ @ @                    
                   @       @       @       @                   
                  @ @     @ @     @ @     @ @                  
                 @   @   @   @   @   @   @   @                 
                @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @                
               @                               @               
              @ @                             @ @              
             @   @                           @   @             
            @ @ @ @                         @ @ @ @            
           @       @                       @       @           
          @ @     @ @                     @ @     @ @          
         @   @   @   @                   @   @   @   @         
        @ @ @ @ @ @ @ @                 @ @ @ @ @ @ @ @        
       @               @               @               @       
      @ @             @ @             @ @             @ @      
     @   @           @   @           @   @           @   @     
    @ @ @ @         @ @ @ @         @ @ @ @         @ @ @ @    
   @       @       @       @       @       @       @       @   
  @ @     @ @     @ @     @ @     @ @     @ @     @ @     @ @  
 @   @   @   @   @   @   @   @   @   @   @   @   @   @   @   @ 
@ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ @

It is a text representation of a Sierpinski triangle or gasket. It is a fractal that was described by the Polish mathematician Waclaw Sierpinski in 1915, before we even had a name for fractals (the term was coined by Benoit Mandelbrot in 1975).

Perhaps not that impressive nowadays, but it was a lot of assembler code to do this. In Pascal too, I seem to remember it was at least 60 lines.

On Rosettacode, you can find the following Python program that does the equivalent:

def sierpinski(n):
    d = ["*"]
    for i in xrange(n):
        sp = " " * (2 ** i)
        d = [sp+x+sp for x in d] + [x+" "+x for x in d]
    return d
 
print "\n".join(sierpinski(4))
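The listing above is Python 2 (xrange and the print statement). As a sketch, here is the same idea ported to Python 3. The mark parameter is my own addition, so the character can be changed; with mark="@" and n=5 the output should resemble the picture above:

```python
def sierpinski(n, mark="*"):
    """Return the rows of a text Sierpinski triangle with 2**n rows."""
    rows = [mark]
    for i in range(n):
        pad = " " * (2 ** i)
        # Top half: the previous triangle, centered.
        # Bottom half: two copies of it, side by side.
        rows = [pad + r + pad for r in rows] + [r + " " + r for r in rows]
    return rows


print("\n".join(sierpinski(4)))
```

Each pass through the loop doubles the number of rows, which is why such a short function can draw the whole figure.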

But don't be fooled by that short piece of code, it is a complex subject.

Other fractals that have been around for a long, long time, and don't require a lot of computing power, are the Cantor set, the Heighway Dragon and the Koch snowflake. All of them are examples of iterated function systems (IFS). The Mandelbrot and Julia sets are a different type of fractal: escape time fractals. Also, there have been some interesting links made between the Cantor set and the Fibonacci series, so it is a normal continuation from that theme, since I've had a few blog entries on Fibonacci: here, here and here.
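To make "escape time" a bit more concrete, here is a minimal, non-obfuscated sketch in Python 3. It is not the code from the Preshing blog or from Python Patterns; the function names, the 60x24 grid and the limit of 30 iterations are all my own choices for illustration:

```python
def escape_time(c, max_iter=30):
    """Count iterations of z -> z*z + c before |z| exceeds 2."""
    z = 0j
    for n in range(max_iter):
        if abs(z) > 2.0:
            return n
        z = z * z + c
    # Never escaped within max_iter: presumed to be inside the set
    return max_iter


def render(width=60, height=24, max_iter=30):
    """Return text rows; '@' marks points that did not escape."""
    rows = []
    for j in range(height):
        imag = -1.2 + 2.4 * j / (height - 1)
        row = ""
        for i in range(width):
            real = -2.2 + 3.2 * i / (width - 1)
            inside = escape_time(complex(real, imag), max_iter) == max_iter
            row += "@" if inside else " "
        rows.append(row)
    return rows


print("\n".join(render()))
```

Every point needs its own loop of multiplications, which is why this type of fractal demanded so much more horsepower than the Sierpinski triangle.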

Fractint

In 1988, the program Fract386 appeared on BBSes and Usenet. It was renamed Fractint the following year. That was pretty exciting: we could render Mandelbrot without hardware floating point math. It was fast! Well... compared to what we had up to that point...

Mandelbrot set

Fractint is a DOS program. It is also available for Windows (old windows 3.1...) and Linux now, but let's stay with this concept for a minute. How can we run DOS programs on the Raspberry Pi?

Video mode selection

Dosbox

Dosbox is an emulator for DOS on x86. It runs without problems on non-x86 platforms, such as the Raspberry Pi (ARM processor). By comparison, WINE does not work on the Pi.

$ sudo apt-get install dosbox

Once installed, run it once so the configuration file gets written, then exit. If you get issues with the keyboard mapping:

$ cd .dosbox
$ vi dosbox.conf
usescancodes=false

That should take care of it. Now create a dos directory and download fractint there:

$ cd
$ mkdir dos
$ cd dos
$ wget http://www.nahee.com/spanky/pub/fractals/programs/ibmpc/frain204.zip
$ unzip frain204.zip

Now, run dosbox and mount the dos folder:

mount c dos

You can now type c:, change (cd) to the fracti~1.04p folder and run Fractint!

We'll continue on fractals at a later date.

Hackerspace Charlotte, Fablocker, AMG

We are quite fortunate in this area, we have two hackerspaces really close by and a few more not too far. In Winston Salem, we have Fablocker. There is also the Alamance Makers Guild in Burlington, NC.

And one hour south of here is Hackerspace Charlotte. Just recently, in October, Eben Upton of the Raspberry Pi Foundation came to visit the area and stopped by HSC. A few of us from Fablocker made it there to hear that talk (and get our RPi signed).

His talk was recorded on video and was just posted today at: raspberrypi.org

On the video, I see my mobile office just right out the door, behind Eben.

I noticed the question session had not been posted. I have the audio, perhaps I'll post that.

And speaking of interesting connections, in the next video, recorded in Orlando, FL, Eben is wearing a Hackerspace Charlotte T-shirt (continuing on the same thread...): Orlando Sentinel

Tuesday, November 27, 2012

Minecraft + Python + Raspberry Pi =

>>> Minecraft Raspberry Pi Edition (raspberrypi.org)

Python

#!/usr/bin/env python
import minecraft as mc

mc.connect()

for x in range(-10,10):
    for y in range(-10,1):
        for z in range(-10,10):
            mc.setblock(x, y, z, mc.GLASS)

Minecraft + Python + Raspberry Pi = ☺

Updating a local mercurial project

In the previous article, we cloned the pyptug repository from bitbucket. But what if the repository changes (and it did, this morning)?

Follow me

First, make sure you have signed up to bitbucket and follow the repository ( https://bitbucket.org/fdion/pyptug ). That way you will get an email notification when there is a change.

Solid Hg

Next, you need to go into the pyptug directory and issue:

 fdion@srv:~/bitbucket$ cd pyptug/  
 fdion@srv:~/bitbucket/pyptug$ hg incoming http://bitbucket.org/fdion/pyptug  
 warning: bitbucket.org certificate with fingerprint 24:9c:45:8b:9c:aa:ba:55:4e:01:6d:58:ff:e4:28:7d:2a:14:ae:3b not verified (check hostfingerprints or web.cacerts config setting)  
 real URL is https://bitbucket.org/fdion/pyptug  
 comparing with http://bitbucket.org/fdion/pyptug  
 searching for changes  
 changeset:  1:8e0c4e63bbb1  
 user:    [email protected]  
 date:    Mon Nov 26 09:52:46 2012 -0500  
 summary:   pep8 compliance  
 changeset:  2:638251f3f511  
 user:    [email protected]  
 date:    Mon Nov 26 10:39:57 2012 -0500  
 summary:   pep8 and fo3 fix  
 changeset:  3:b09297ef88b2  
 user:    [email protected]  
 date:    Mon Nov 26 10:43:09 2012 -0500  
 summary:   pep8 compliance  
 changeset:  4:9e0101cf03f8  
 user:    [email protected]  
 date:    Mon Nov 26 10:45:04 2012 -0500  
 summary:   pep8 compliance  
 changeset:  5:89aa2b01e1be  
 tag:     tip  
 user:    [email protected]  
 date:    Tue Nov 27 06:24:25 2012 -0500  
 summary:   Missing if __name__ ==  
 fdion@srv:~/bitbucket/pyptug$ hg pull http://bitbucket.org/fdion/pyptug  
 warning: bitbucket.org certificate with fingerprint 24:9c:45:8b:9c:aa:ba:55:4e:01:6d:58:ff:e4:28:7d:2a:14:ae:3b not verified (check hostfingerprints or web.cacerts config setting)  
 real URL is https://bitbucket.org/fdion/pyptug  
 pulling from http://bitbucket.org/fdion/pyptug  
 searching for changes  
 adding changesets  
 adding manifests  
 adding file changes  
 added 5 changesets with 9 changes to 9 files  
 (run 'hg update' to get a working copy)  
 fdion@srv:~/bitbucket/pyptug$ hg update  
 9 files updated, 0 files merged, 0 files removed, 0 files unresolved

The hg incoming will basically tell us what the changes are, but will not download them. This is purely informational, you can skip that step if you want.

The hg pull command is what gets the changes to the pyptug repository pulled to your local machine.

Finally, hg update will apply those changes to a working copy. In the above example, 9 files have been changed over 5 changesets since I first published the repository.

Chchch...changes

If you are curious as to what was the change:

When running a web.py application through wsgi, and that means not only mod_wsgi or an external wsgi server, but also when running as python script.py [port] (since it will launch CherryPy), it is required to use:

if __name__ == "__main__":

to invoke app.run(). Else, it will lock itself by trying to open a socket on the same port twice.

hw2.py and hw3.py didn't have this. Although it did work in my automatic deployment test setup (which uses cgi instead of wsgi so I can test every application all at once), it didn't work in a standalone install.

Sunday, November 25, 2012

PYPTUG mercurial repository

A little experiment

web.py, the antiframework framework

I'm putting the code up before the actual talk (Monday, the 26th of November), for those who would like to follow along with their laptop. The code again is made available under an MIT license.

Assuming you already have mercurial:

$ hg clone https://bitbucket.org/fdion/pyptug

If you need to install mercurial first:

debian based

$ sudo apt-get install mercurial

or fedora

$ sudo yum install mercurial

Under Solaris, it is available in the package manager. On Windows or Mac you can get it here: http://mercurial.selenic.com/

All those .py scripts and directories will make more sense during the talk, as we dig through them.

I will also post screencam captures later, for those who are not local.

Saturday, November 24, 2012

It's more fun to compute

last track - "It's more fun to compute"

Ah, Computer World, by Kraftwerk. The last track is titled "It's more fun to compute". I'm thinking Hutter, Schneider and Bartos must have written this song in November, after raking and blowing leaves for a few days... All joking aside, I hope you have this album on your ipod.

So what is it exactly that makes computing more fun? More fun than what, I hear you ask... What do you think, is it more fun to compute?

Also, either tomorrow or Monday, I'll post the bitbucket Mercurial repository for the web.py presentation at PYPTUG that I'll be doing on Monday. In the interim, go ahead and install web.py. I have the instructions on the sidekick page. Basically:

pi@raspberrypi ~/Desktop $ sudo easy_install web.py
Searching for web.py
Best match: web.py 0.37
Processing web.py-0.37-py2.7.egg
web.py 0.37 is already the active version in easy-install.pth

Using /usr/local/lib/python2.7/dist-packages/web.py-0.37-py2.7.egg
Processing dependencies for web.py
Finished processing dependencies for web.py

If you've gone through the PyHack workshop #01, you already installed mercurial (hg). If not, you will need to install it on your Raspberry Pi before you can get the code:

pi@raspberrypi ~/Desktop $ sudo apt-get install mercurial

Monday, November 19, 2012

Fibospeak

espeak

Speaking (of) the Fibonacci numbers, did you figure it out?

It is actually quite simple to use espeak. On the Raspberry Pi, we can't directly use the espeak module in Python. We thus have to call the espeak application through the os module.

import os

def say(something):
    os.system('espeak -ven+f2 "{0}"'.format(something))

a, b = 0, 1
say(a)
while b < 50:
    say(b)
    a, b = b, a+b

And that is basically it!

Variations

If we wanted the numbers read in spanish:

   os.system('espeak -ves+f2 "{0}"'.format(something))

Or in French, male, female and Portuguese, male female:

   os.system('espeak -vfr+m2 "{0}"'.format(something))

   os.system('espeak -vfr+f2 "{0}"'.format(something))

   os.system('espeak -vpt+m2 "{0}"'.format(something)) 
   os.system('espeak -vpt+f2 "{0}"'.format(something))
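Rather than editing the command string for each language, the voice can become a parameter. The following is only a sketch, assuming Python 3.3 or later (for shutil.which); the names espeak_command and the voice argument are mine. It passes the arguments to espeak as a list through subprocess instead of os.system, so the text does not need shell quoting:

```python
import shutil
import subprocess


def espeak_command(text, voice="en+f2"):
    """Build the espeak argument list; voice is e.g. 'es+f2' or 'fr+m2'."""
    return ["espeak", "-v" + voice, str(text)]


def say(text, voice="en+f2"):
    # Only try to speak when the espeak binary is actually installed
    if shutil.which("espeak") is None:
        return False
    subprocess.call(espeak_command(text, voice))
    return True


print(espeak_command(13, "pt+m2"))
# prints: ['espeak', '-vpt+m2', '13']
```

Calling say(13, "fr+f2") would then read the number with the French female voice, provided espeak is installed.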

Under Linux, on a PC with a properly functioning espeak-python module, it would be a little different. After importing espeak, instead of using my function say(), we would use:

   espeak.synth(str(s))

Other options

On the raspberry pi forum, someone pointed to a different approach, which is to use the google translate web service.

Under Linux, but not on the Raspberry Pi (it just clicks instead of speaking), you could also use the Speech Dispatcher server (speechd, python-dispatcher).

Still unresolved

I'll follow up with the visual version tomorrow. There is still time to leave a solution in a comment.

See the next part: Fibovisual

Sunday, November 18, 2012

Python user group meeting

PYPTUG (PYthon Piedmont Triad User Group, covering the Greensboro, High Point and Winston Salem NC area) is meeting on Monday November 26th in Kernersville:

PYPTUG meeting

All levels of (in)experience welcome!

Saturday, November 17, 2012

Python Speak Hint

A hint for our current python challenge:

$ sudo apt-get install espeak

import pycairo

Depending which challenge you picked... The answer will probably vary if you try it under Raspbian versus Debian, it is all in the library.

Thursday, November 15, 2012

Fibonacci

When we say Fibonacci, some might think of the man, the mathematician, the Fibonacci numbers or even specifically of the following relation:

F(n) = F(n-1) + F(n-2), with F(0) = 0 and F(1) = 1

Or, if you watched (or read) David Mitchell's presentation on iPython at PyHack Workshop #01, you'll recall that it was part of it.

The code was:

 a, b = 0, 1  
 print a,  
 while b < 50:  
   print b,  
   a, b = b, a + b

Incredibly simple, isn't it? The result of which is:


0 1 1 2 3 5 8 13 21 34

I added the print a, since mathematically speaking, the Fibonacci sequence starts at 0, but most of the time it is displayed starting at 1 (1, 1, 2, 3 etc). I also increased the upper boundary to 50 (it was 10 in the demo).
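The code above is Python 2. For readers on Python 3, here is a sketch that produces the same output. The generator name fibonacci_below is my own, and it treats 50 as an upper bound on the values themselves:

```python
def fibonacci_below(limit):
    """Yield Fibonacci numbers, starting at 0, that are smaller than limit."""
    a, b = 0, 1
    while a < limit:
        yield a
        a, b = b, a + b


print(*fibonacci_below(50))
# prints: 0 1 1 2 3 5 8 13 21 34
```

Writing it as a generator keeps the sequence separate from what we do with it, which helps when the numbers have to go somewhere other than print.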

If you can't read the math, you can at least clearly see that any given member of the Fibonacci series is equal to the sum of the two preceding elements.

So, why am I talking about this?

I have two challenges for my readers. The first, is a Python programming challenge, while the other just requires some power of observation:

modify the above code to have the numbers read (audio) instead of printed, or visualize them graphically, but not as numbers.
look around. Where can you find occurrences of Fibonacci numbers?

Bucolic Mix

You'll see where we are going with this on our next mathematical intrigue, perhaps this weekend... In the meantime, do post your answers to the challenge in our comments section below.

See the next step: Fibospeak

Monday, November 12, 2012

PiQuizMachine

This article continues documenting one of the PYPTUG Workshops: PyHack Workshop #01, and goes into writing the PiQuizMachine code.

The machine

The PiQuizMachine

The Circuit

Each button controller is made from 1/2" PVC parts and a momentary mini push button, connected by a wire to a board.

On the board, one 10K Ohm resistor pulls up the GPIO to high, while the push button is connected to the GPIO on one end and to ground through a 1K Ohm resistor on the other end.

This last resistor is optional if you are certain you can avoid pressing the buttons whenever the GPIOs are configured as outputs instead of inputs...

This circuit is repeated for all 4 buttons

Source Code

Make sure you've installed Mercurial and pulled the code from bitbucket (as instructed in the previous article), and go into:


    $ cd fablocker/PyHack/workshop01
    $ cd trivia
    $ ls
    load.py  piquiz.py  piquizmachine.py  questions.txt  README.md short.txt

The load.py is the first piece of code we will review.

We have a text file (questions.txt) with all the questions and answers for the quiz game. We generated this file using a python script to web-scrape the data from a few pages of triviachamp.com (you can also do this by hand, selecting all the questions on a screen, and copy-pasting into a text file).

There is also a shorter version, with 2 questions / answers. We will use that first to figure out how we will load them into memory in our program. Here is the content of short.txt:

 Louis Leterrier - This film was released in 2010.Who directed Clash of the Titans?  
   
     a. Rodrigo Garcia  
     b. Louis Leterrier  
     c. Joe Carnahan  
     d. Iain Softley  
   
   
 Chicago - This team is part of the NFL.If you wanted to see the Bears play football, which city would you need to visit?  
   
     a. Chicago  
     b. Houston  
     c. San Diego  
     d. Denver

Wow... So what do we have here? We do not have the data nicely separated by a special delimiter or by a new line. We will have to figure out a way to ignore the blank lines, handle the slight variations and extract the data into various fields:

answer - trivia blah blah.question blah blah?
multiple choices of answers

The thing is, the trivia and question can have all kinds of punctuation marks such as comma, period and dash, and varying amounts of white space. Hand coding a function to handle all of this would be a lot of work, and quite boring.

There is something that can deal with this fairly easily. I'm referring again to Python's "batteries included", a library called re, for Regular Expressions.

Regular Expressions

Regular Expressions (abbreviated as re, regex or regexp) is a language designed to express simple to complex matching against data (a string or a file). It is about as opposite to Python as can be, as it is dense, hard to read and hard to debug.

But sometimes, it is the perfect solution to a problem, and it is fairly easy to use in the Python implementation (a full implementation on Python, just like in Perl - in some other languages, it is a partial implementation, and can be hard to use or at least way more involved).

At any rate, if you ever have a career in IT, chances are you will have to become familiar with them, from system administrators wanting to parse log files, to programmers doing EDI, to web developers doing URL matching (Django, Web.py, even config files for certain web servers), to loading and extracting data from a file that is not 100% structured to be read by a computer (as we will do here).

The official documentation of the re module can be found at docs.python.org

A quick cheat sheet on regular expressions can be found here at tartley.com

And for those who like to follow video tutorials, check out this google video tutorial by Nick Parlante, on youtube.

Let's now look at the code in load.py:

1:  #!/usr/bin/env python  
2:  # -*- coding: utf-8 -*-  
3:  """  
4:  load.py - just loading the question file  
5:  Loads questions and answers from quiz data file. It follows the format  
6:  from triviachamp.com  
7:  """  
8:  # vim: tabstop=8 expandtab shiftwidth=4 softtabstop=4  
9:    
10:  import re  
11:    
12:  with open('short.txt', 'r') as f:  
13:    for line in f:  
14:      if len(line) > 1:  
15:        match = re.search(r'^([\w ]+)-([\w ,-]+)\.(.+)', line)  
16:        if match:  
17:          answer = match.group(1).strip()  
18:          trivia = match.group(2).strip()  
19:          question = match.group(3).strip()  
20:          choices = []  
21:        else:  
22:          match = re.search(r'\s+(\w)\.([\w ,-]+)', line)  
23:          if match:  
24:            choices.append(match.group().strip())  
25:            choice = match.group(1)  
26:            description = match.group(2).strip()  
27:            if description == answer:  
28:              correct_response = choice  
29:            if choice == 'd':  
30:              print(question, choices[0], choices[1],  
31:                   choices[2], choices[3],  
32:                   correct_response, trivia)  
33:
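To see what the two patterns in the listing capture, here is a small sketch that applies them to two lines copied from short.txt. The variable names question_line and choice_line are mine; the patterns are the ones from load.py:

```python
import re

question_line = ("Louis Leterrier - This film was released in 2010."
                 "Who directed Clash of the Titans?")
choice_line = "     b. Louis Leterrier  "

# First pattern: answer, then a dash, then trivia up to a period, then the question
m = re.search(r'^([\w ]+)-([\w ,-]+)\.(.+)', question_line)
answer, trivia, question = (g.strip() for g in m.groups())
print(answer)    # prints: Louis Leterrier
print(trivia)    # prints: This film was released in 2010
print(question)  # prints: Who directed Clash of the Titans?

# Second pattern: leading white space, one letter, a period, then the description
c = re.search(r'\s+(\w)\.([\w ,-]+)', choice_line)
print(c.group(1), c.group(2).strip())  # prints: b Louis Leterrier
```

Each pair of parentheses in a pattern becomes a numbered group, which is how a single search gives back several fields at once.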

Lines 1 through 7 are typical of what we've done for the past several scripts, with the exception of line 2. On that line we specify that the characters of this script are encoded not in ASCII, not in ISO8859-1, but in UTF-8. In this particular case we do not need it, but if we had to use accented letters, or special glyphs (for a card game, the heart, spade, diamond and club glyphs, for example) then we would need this. Python 3 defaults to UTF-8, so it is a good idea to start learning about Unicode and UTF-8, even though we are writing Python 2.x code right now. Line 8 is simply some instructions for an editor named vim.

Line 10 is where we import the regular expression module we just discussed. This is part of the Python library, so no need to download anything. This follows the "batteries included" pattern of Python. For general programming, you typically dont need to download anything else. For domain specific applications however, you will need to download and install other modules (like web.py, matplotlib, pygame, scipy etc)

Line 12, we are opening a file named short.txt, in the 'r' (or read) mode.

12:  with open('short.txt', 'r') as f:

This is quite alien looking for someone with a background in another language, such as C. In fact, it is also possible to open a file this way:

f = open('short.txt', 'r')

However, by using the with statement, we get graceful housekeeping for free: the file is closed for us when the block ends, even if an exception is raised along the way. We don't have to write the try and finally ourselves, it is done implicitly. So just use this form.

Line 13, we have a for loop, and just like we did with the PINS in the previous article, we are getting items from an object. For PINS it was either a tuple or a list, both of which can be iterated by a for loop. In this case, we use a file object named f, which we obtained by calling open(). This is different than, say, the FILE pointer that fopen() returns in C/C++, which is only a reference to be passed to other functions to do stuff. In Python, it is an object that, when an iteration is requested, will give us one item. For files, it will be one line. We could have also used a different variable name: for blah in f would put a whole line in the variable blah.
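To make the comparison concrete, here is a minimal sketch of roughly what the with form saves us from writing. It uses a throwaway temporary file created just for this illustration (not short.txt), and should run on Python 2.7 as well as 3.

```python
import os
import tempfile

# Create a small throwaway file so the example is self-contained.
fd, path = tempfile.mkstemp(suffix='.txt')
with os.fdopen(fd, 'w') as out:
    out.write('first line\nsecond line\n')

# Long form: we are responsible for closing the file ourselves.
f = open(path, 'r')
try:
    lines_long = f.readlines()
finally:
    f.close()  # runs even if readlines() raised an exception

# Short form: the file is closed when the block ends, error or not.
with open(path, 'r') as g:
    lines_short = g.readlines()

os.remove(path)
print(lines_long == lines_short, f.closed, g.closed)
```

Both forms read the same lines and leave the file closed; the with form simply cannot forget the close.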

Line 14, we use a built-in, len() to tell us if we are dealing with an actual line with some data, or just a blank line, based on the length of the line.
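Lines 13 and 14 can be tried without any file on disk. In this small sketch, io.StringIO stands in for the object returned by open(), and the text in it is made up for the illustration.

```python
import io

# io.StringIO behaves like an open text file; the content is made up.
sample = io.StringIO(u"first line\n\nthird line\n")

kept = []
for line in sample:          # one line per iteration, newline included
    if len(line) > 1:        # a blank line is just "\n", so its length is 1
        kept.append(line)

print(kept)
```

The blank line is dropped because the only character it contains is the newline itself.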

Let me isolate the next 5 lines so we can focus on them:

15:        match = re.search(r'^([\w ]+)-([\w ,-]+)\.(.+)', line)  
16:        if match:  
17:          answer = match.group(1).strip()  
18:          trivia = match.group(2).strip()  
19:          question = match.group(3).strip()

Now, the scary part, line 15. The regular expression. You'll just have to trust me that it works as intended. In the workshop I couldn't go into all the details. Similarly, this article would be way too long if I did, but I'll try to do it anyway... The actual regular expression is this:

^([\w ]+)-([\w ,-]+)\.(.+)

The caret (^) is to start the match at the beginning of the line. The first group in parentheses will match a word character (\w: a letter, a digit or an underscore) or a space (anything listed between the two square brackets), while the + says to repeat it for as long as you can:

([\w ]+)

But we also follow this with a dash (-) and so it will stop the first group just before the dash. This separates the answer from the trivia about the answer. The square brackets delimit a set of what characters should match (think of it as a Python list with no commas). This is convenient, because in the second group, we want to get the trivia, which includes not only alphanumeric or 'word characters', but also spaces, commas and dashes:

([\w ,-]+)

The + repeats the match until the next rule, the period. It has to be escaped, because a period means match any character. We want it to match an actual period, so we escape it with a backslash (\). The last group uses the period to match any character and won't stop until the end of the line.

To actually use this whole regular expression in our Python code, we have to put it in a string. In Python we can use the single or double quote mark for strings ( 'a_string' , "also_a_string" ).

In this case however, not just any string, a raw string. We do this by adding an r at the beginning: r'a_string'. That way we do not have to escape the backslash. And we pass that, along with our line to the search() function of the re module.

This returns a match object only if there is an actual match, so we have to test for existence of such on line 16, before we can use it on lines 17-19. There, we get the data from the groups we defined (defined in the regular expression by the pairs of parentheses) and assign them to variables: answer, trivia and question.
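Here is a small sketch of lines 15-19 applied to a single line. The sample line is made up to fit the pattern (it is not taken from the real data file), so treat it as an assumption about the layout rather than actual quiz content.

```python
import re

# Hypothetical line: answer, dash, trivia, period, then the question.
line = "Paris-It is the capital of France. Which city hosts the Eiffel Tower?\n"

match = re.search(r'^([\w ]+)-([\w ,-]+)\.(.+)', line)
if match:
    answer = match.group(1).strip()
    trivia = match.group(2).strip()
    question = match.group(3).strip()
    print(answer)
    print(trivia)
    print(question)

# A raw string and its escaped equivalent are the same two characters.
print(r'\.' == '\\.')
```

Note that the second group stops at the first period because a period is not in its set of allowed characters, which is what hands the rest of the line to the third group.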

Wow, lots of explanation for only 5 lines of code! But that is the nature of the regular expression beast. As I said, often, it is not the answer, but when it is, we just have to live with its denseness...

Line 20 is just setting up (or clearing) a list to store the multiple choice answers that will follow. Yes, that is right, we haven't dealt with those yet...

Line 21 is the else tied to the existence of a match. Basically, if we couldn't match the first regular expression we wrote, that means we are probably dealing with one of the multiple choice answers. This next part could have been done without regular expressions, but since we spent all this time explaining them, let's use them again on line 22, this time with only 2 groups defined. The first will have the a-d letter and the second the description.

Line 23, we test again for a match. Line 24, we get the match.group() without specifying which of the groups we want. By doing this, we get the whole match. We further use the strip() function to remove any leading or trailing white spaces, and we then append this to our choices list. Initially it is empty, and we add to it the multiple choices until the last choice (d).

Let me repeat the last piece of code here:

25:            choice = match.group(1)  
26:            description = match.group(2).strip()  
27:            if description == answer:  
28:              correct_response = choice  
29:            if choice == 'd':  
30:              print(question, choices[0], choices[1],  
31:                   choices[2], choices[3],  
32:                   correct_response, trivia)

Here, on line 25 we get the letter (a-d) assigned to choice. We then get the description on line 26 (using strip() again to remove leading and trailing white spaces). We can then use this description to compare it (line 27) to the answer we picked up earlier. If the answer is on that line, we then assign to correct_response that letter.

Furthermore, if we are on the last line of the multiple choice answers, we are now ready to either store the whole group of question, choices, correct response and trivia, or in this case (lines 30-32) to print it.
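To watch both patterns work together without needing the data file, the same logic can be run against a few lines held in memory. This is only a sketch: the question and choices are invented, io.StringIO stands in for the opened file, and the entries list is a name introduced for the illustration.

```python
import io
import re

# Made-up data in the layout the two patterns expect (not the real file).
sample = io.StringIO(
    u"Paris-It is the capital of France. Which city hosts the Eiffel Tower?\n"
    u"    a. Rome\n"
    u"    b. Paris\n"
    u"    c. Madrid\n"
    u"    d. Berlin\n"
)

entries = []
for line in sample:
    if len(line) > 1:
        match = re.search(r'^([\w ]+)-([\w ,-]+)\.(.+)', line)
        if match:                      # a question line
            answer = match.group(1).strip()
            trivia = match.group(2).strip()
            question = match.group(3).strip()
            choices = []
        else:                          # probably a multiple choice line
            match = re.search(r'\s+(\w)\.([\w ,-]+)', line)
            if match:
                choices.append(match.group().strip())
                if match.group(2).strip() == answer:
                    correct_response = match.group(1)
                if match.group(1) == 'd':   # last choice: entry is complete
                    entries.append((question, choices,
                                    correct_response, trivia))

print(entries)
```

The choice lines fail the first pattern because they contain no dash, which is what sends them to the else branch.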

We could also have passed the file to the regular expression and written a single regular expression getting all the data we wanted on multiple lines at once, but the sheer complexity of it would have rendered this tutorial useless.

Simple Quiz

In piquiz.py, we keep things simple (relatively... !) again. It is a basis that can be evolved into something a bit more interesting. Just a straight script, a trivia game in 50 lines of code (a little bare-bones obviously, with no scoring):

1:  #!/usr/bin/env python  
2:  # -*- coding: utf-8 -*-  
3:  """  
4:  PiQuizMachine - A quiz game for the Raspberry Pi.  
5:  Loads questions and answers from quiz data file. It follows the format  
6:  from triviachamp.com  
7:  """  
8:  # vim: tabstop=8 expandtab shiftwidth=4 softtabstop=4  
9:    
10:  import re  
11:  import random  
12:    
13:  data = []  
14:  with open('questions.txt', 'r') as f:  
15:    for line in f:  
16:      if len(line) > 1:  
17:        match = re.search(r'^([\w ]+)-([\w ,-]+)\.(.+)', line)  
18:        if match:  
19:          answer = match.group(1).strip()  
20:          trivia = match.group(2).strip()  
21:          question = match.group(3).strip()  
22:          choices = []  
23:        else:  
24:          match = re.search(r'\s+(\w)\.([\w ,-]+)', line)  
25:          if match:  
26:            choices.append(match.group().strip())  
27:            choice = match.group(1)  
28:            description = match.group(2).strip()  
29:            if description == answer:  
30:              correct_response = choice  
31:            if choice == 'd':  
32:              entry = (question, choices[0], choices[1],  
33:                   choices[2], choices[3],  
34:                   correct_response, trivia)  
35:              data.append(entry)  
36:    
37:  random.shuffle(data)  
38:  for question,choice_a,choice_b,choice_c,choice_d,\  
39:      correct_response,trivia in data:  
40:    print question  
41:    print choice_a  
42:    print choice_b  
43:    print choice_c  
44:    print choice_d  
45:    team_answer = raw_input("Your answer:")  
46:    if team_answer == correct_response:  
47:      print "That is correct, the answer is",correct_response  
48:      print trivia  
49:    else:  
50:      print "Not correct."

Up to line 31, it is almost the same as we already discussed (but using the full size questions.txt file). On lines 32-35, instead of printing, we now build the entry and store it in a list named data, which we initialized empty on line 13.

We are now going to randomize or shuffle the list of questions on line 37. On line 11 we imported the random module which includes a shuffle function. This is very convenient for games, not just for a quiz, but really interesting for a card game. We could define a card deck as a list, then shuffle it.
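As a sketch of the card deck idea, here is one way a deck could be represented as a list and shuffled. The rank and suit names are illustrative choices, not something from the workshop code.

```python
import random

# A deck as a plain list of (rank, suit) tuples; the names are illustrative.
ranks = ['A', '2', '3', '4', '5', '6', '7', '8', '9', '10', 'J', 'Q', 'K']
suits = ['hearts', 'spades', 'diamonds', 'clubs']
deck = [(rank, suit) for suit in suits for rank in ranks]

shuffled = list(deck)      # copy, because shuffle() works in place
random.shuffle(shuffled)   # returns None; the list itself is reordered

hand = shuffled[:5]        # deal the top five cards
print(hand)
```

Because shuffle() reorders the list it is given, the copy keeps the original ordered deck available for a new game.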

On lines 38 and 39 (the backslash makes it as if it was a single line), we now loop through all the questions in the data list using a for loop. We then print the question and multiple choices on lines 40 through 44. I'm using the statement form of print, but make note that starting with Python 3, print is a function. The next code example uses the print() function syntax (works in Python 2.7 and 3). Lines 38-44: This can all be coded in a prettier way, but in order to keep this as simple as I can, I just did it by specifying all the fields so it is very obvious what we are doing.
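For the curious, one possible "prettier" version of lines 38-44 is sketched below. It assumes the same 7-field tuple layout as the data list and uses a single made-up entry; slicing is just one of several ways to do this.

```python
# One made-up entry in the same 7-field layout as the data list.
data = [("Which city hosts the Eiffel Tower?",
         "a. Rome", "b. Paris", "c. Madrid", "d. Berlin",
         "b", "It is the capital of France")]

for entry in data:
    question = entry[0]
    choices = entry[1:5]                    # the four choice strings
    correct_response, trivia = entry[5:]    # the last two fields
    print(question)
    for choice in choices:                  # replaces four separate prints
        print(choice)
```

The trade-off is that the field names no longer appear in the for statement, which is why the explicit version is easier to follow for a first read.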

Line 45 allows us to get an answer from a keyboard, with a prompt of "Your answer:". We check if the answer is correct on line 46 and if so print a message and the trivia tied to the answer on lines 47 and 48. If the answer was wrong (else) we print a different message on line 50.

So that is the basic core of a quiz program.

The Real Deal

Combining the code we've done in the button/quiz*.py scripts in the previous article, with the code above, we have all the ingredients to make an interactive quiz machine, one where each of the four teams or players gets a game controller, and will be able to "buzz" in first to answer, much as in TV games, such as Jeopardy or Family Feud.

This code was designed to teach about Python and GPIOs for a workshop and it is what I would call "squeaky clean". It is kept simple on purpose, yet demonstrates several key features of Python and the GPIOs.

The code was run through pep8 and pylint and is properly documented and formatted (even having 2 blank lines between functions, no use of ; etc), is quite verbose (several things were done in multiple lines but in normal use I would probably do as one) and yet, is less than 100 physical lines of code.

1:  #!/usr/bin/env python  
2:  # -*- coding: utf-8 -*-  
3:  """  
4:  PiQuizMachine - A quiz game for the Raspberry Pi.  
5:  Loads questions and answers from quiz data file. It follows the format  
6:  from triviachamp.com. Lock out through pushbutton controllers  
7:  """  
8:    
9:  __author__ = "Francois Dion"  
10:  __email__ = "[email protected]"  
11:    
12:  import re  
13:  import random  
14:  import RPi.GPIO as gpio  
15:    
16:  PINS = (22, 23, 24, 25) # list of pins as tuple  
17:  OFFSET = 21 # team number to GPIO pin offset  
18:    
19:    
20:  def loadtrivia(filename):  
21:    """ Load the trivia into a list, after extracting the fields """  
22:    data = []  
23:    with open(filename, 'r') as f:  
24:      for line in f:  
25:        if len(line) > 1:  
26:          match = re.search(r'^([\w ]+)-([\w ,-]+)\.(.+)', line)  
27:          if match:  
28:            answer = match.group(1).strip()  
29:            trivia = match.group(2).strip()  
30:            question = match.group(3).strip()  
31:            choices = []  
32:          else:  
33:            match = re.search(r'\s+(\w)\.([\w ,-]+)', line)  
34:            if match:  
35:              choices.append(match.group().strip())  
36:              choice = match.group(1)  
37:              description = match.group(2).strip()  
38:              if description == answer:  
39:                correct_response = choice  
40:              if choice == 'd':  
41:                entry = (question, choices[0], choices[1],  
42:                     choices[2], choices[3],  
43:                     correct_response, trivia)  
44:                data.append(entry)  
45:    return data  
46:    
47:    
48:  def getteam(lockedout):  
49:    """ figure out which team presses their button first """  
50:    poll = [pin for pin in PINS if pin - OFFSET not in lockedout]  
51:    while True:  
52:      buttons = [gpio.input(pin) for pin in poll] # list comprehension  
53:      if False in buttons: # at least one button was pressed  
54:        if buttons.count(False) == 1:  
55:          return buttons.index(False) + 1  
56:        else: # trouble, multiple buttons  
57:          teams = [i + 1 for i, b in enumerate(buttons) if b is False]  
58:          return random.choice(teams)  
59:    
60:    
61:  def main():  
62:    """ our main program """  
63:    data = loadtrivia('questions.txt')  
64:    
65:    gpio.setmode(gpio.BCM) # broadcom mode, by GPIO  
66:    for pin in PINS:  
67:      gpio.setup(pin, gpio.IN) # set pins as INput  
68:    random.shuffle(data)  
69:    for question, choice_a, choice_b, choice_c, choice_d, \  
70:        correct_response, trivia in data:  
71:      # if we wanted to make a graphical game using pygame  
72:      # we would replace the print statements below  
73:      print(question)  
74:      print(choice_a)  
75:      print(choice_b)  
76:      print(choice_c)  
77:      print(choice_d)  
78:      lockedout = [] # we start with no team locked out  
79:      while len(lockedout) < 4:  
80:        team = getteam(lockedout)  
81:        prompt = "Your answer, team {0}? ".format(team)  
82:        team_answer = raw_input(prompt) # get an answer  
83:        if team_answer == correct_response:  
84:          print("That is correct, the answer is:")  
85:          print(correct_response)  
86:          print(trivia)  
87:          print("")  
88:          lockedout = [1, 2, 3, 4]  
89:        else:  
90:          print("That is not the answer.")  
91:          lockedout.append(team)  
92:    
93:    
94:  if __name__ == "__main__":  
95:    try:  
96:      main()  
97:    except KeyboardInterrupt:  
98:      print("Goodbye")  
99:      gpio.cleanup()

Lines 1 to 16 should be familiar. I did add two variables for author and email. This is just a convention some people follow in their code. On line 17 I'm defining an offset between the team number (1,2,3,4) and the GPIOs (22,23,24,25).

Lines 20 through 45 are the code from the previous example, but put into a function that accepts a file name for the quiz data and that, upon execution, will return a list containing all our questions.

Lines 48 through 58 are our quiz4.py code from the previous article, put into a function, but with a twist:

On line 52, instead of using the PINS tuple directly, we filter it first on line 50 to see which pins we should really poll. When getteam() is called, a list of teams that have been locked out is provided. We are not even going to check these teams' button controllers, because they answered this question already, with a wrong answer.

50:    poll = [pin for pin in PINS if pin - OFFSET not in lockedout]

So, what is happening here merits an explanation. You've probably recognized this as a list comprehension (which we've introduced in the previous article). But it looks strange... Let's read it. We will add to this list a pin, from an iteration of the PINS tuple (containing 22,23,24,25), but we will do this only if the pin minus the offset (21) is not in the list of locked out teams. So, if we take 22 - 21, that is 1. If team 1 is locked out, pin 22 is skipped and the comprehension moves on to test the next value. So on and so forth.

I mentioned earlier that we had only scratched the surface of list comprehensions. Here we went a little deeper, but there is more to them.

Lines 51-58, we loop until a button is pressed. If only one button was pressed, we are good to go, and return which of the teams (1 - 4) pressed the button first. But trouble is looming on the horizon. It is possible for 2 or more buttons to be pressed at the exact same moment.
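The filter on line 50 is easy to try by hand. In this sketch the comprehension is wrapped in a small function, pins_to_poll, which is a name made up for the illustration; PINS and OFFSET are the values from the listing.

```python
PINS = (22, 23, 24, 25)  # same values as in the listing
OFFSET = 21

def pins_to_poll(lockedout):
    """Hypothetical helper wrapping line 50 so it can be tried by hand."""
    return [pin for pin in PINS if pin - OFFSET not in lockedout]

print(pins_to_poll([]))      # nobody locked out yet
print(pins_to_poll([1]))     # team 1 gave a wrong answer
print(pins_to_poll([1, 3]))  # teams 1 and 3 gave wrong answers
```

Each locked out team removes exactly one pin from the list that getteam() goes on to poll.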

We thus have to introduce some random process to select one of those that have been captured as pressed. We do that on lines 57 and 58. 58 uses the choice() function of the random module (imported on line 13), but 57 requires some explanation, for those who are just starting with Python:

57:          teams = [i + 1 for i, b in enumerate(buttons) if b is False]

We want to generate a list of all the teams that had pressed their buttons. buttons is a list containing something like [False, True, True, False], indicating here that team 1 and 4 pressed their button at the same time.

What I now want is a list containing [1, 4] to select randomly from it. So what we do is to use a built in function called enumerate() on that buttons list. This returns the index (starting at zero) and the value, so we capture this with a for i, b. We will assign i (the zero based index) + 1 (to get a team number) to the list, but only if b (the value) is False. There, that wasn't so bad, after all!
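The enumerate() step can also be tried without any hardware, as in the first half of the sketch below. The second half is an assumption rather than something from the workshop code: since buttons is built from the filtered poll list, a position in it only equals the team number minus one while no team is locked out, so the hypothetical pressed_teams() function maps back through the pin number instead.

```python
PINS = (22, 23, 24, 25)
OFFSET = 21

buttons = [False, True, True, False]   # False means pressed
teams = [i + 1 for i, b in enumerate(buttons) if b is False]
print(teams)

def pressed_teams(poll, buttons):
    """Hypothetical variant: derive the team from the pin, not the position."""
    return [pin - OFFSET for pin, b in zip(poll, buttons) if b is False]

# Team 1 is locked out, so only pins 23, 24 and 25 are polled.
poll = [23, 24, 25]
print(pressed_teams(poll, [False, True, True]))   # pin 23 was pressed
```

With all four pins polled the two approaches agree; they only differ once the polled list is shorter than PINS.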

We are now ready for our main() function, lines 61 to 91 (only 30 lines).

On line 63 we load the data from the file questions.txt.

Lines 65-67, we set up the GPIOs, as we've done in quiz4.py.

Lines 68-77 are the logic we had in our previous example, piquiz.py.

At line 78, we initialize the list of locked out teams to be empty, and start a loop on line 79 that will go on until we have 4 teams locked out by 4 wrong answers, or a right answer (which forces all 4 teams to be locked out).

Line 80 is where we poll the button controllers until we have somebody pressing a button.

81:        prompt = "Your answer, team {0}? ".format(team)  
82:        team_answer = raw_input(prompt) # get an answer

Lines 81 and 82 are asking for an answer from the team that pushed their button first. This answer is given using the keyboard. Line 81 could also have been written as:

prompt = "Your answer, team " + str(team) + "? "

This is closer to what is done in other languages, but each concatenation with + creates a new resulting object. As such, it is a less efficient way of doing it, and if used in a loop with lots of data, will consume a lot of memory and be slow.
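A quick sketch shows that the two forms build the same prompt. The join() part at the end is an addition for illustration: it is a common way to assemble many pieces in one go instead of chaining + in a loop.

```python
team = 2

with_format = "Your answer, team {0}? ".format(team)
with_concat = "Your answer, team " + str(team) + "? "
print(with_format == with_concat)

# When many pieces are involved, collecting them in a list and joining
# once avoids building an intermediate string for every +.
pieces = ["team {0}".format(n) for n in range(1, 5)]
summary = ", ".join(pieces)
print(summary)
```

For a single prompt the difference is negligible; it only starts to matter when strings are built up repeatedly.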

Lines 83-88 handle a correct answer (concluding by locking out all teams to force a new question), while lines 89-91 handle a wrong one, adding that one team to the list of locked out teams (passed to the function on line 80).
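The lock-out loop on lines 78-91 can be exercised without buttons or a keyboard. In this hardware-free sketch, play_question() is a made-up function: the buzz order stands in for getteam(), a dictionary stands in for raw_input(), and a return replaces the trick of locking out all four teams.

```python
def play_question(correct_response, buzz_order, answers):
    """Hypothetical, hardware-free version of lines 78-91.

    buzz_order: team numbers in the order they buzz in.
    answers: dict mapping team number to the answer that team gives.
    Returns the winning team, or None if everybody was wrong.
    """
    lockedout = []
    buzzers = iter(buzz_order)
    while len(lockedout) < 4:
        team = next(buzzers)            # stands in for getteam(lockedout)
        if answers[team] == correct_response:
            return team                 # right answer ends the question
        lockedout.append(team)          # wrong answer: this team is out
    return None

print(play_question('b', [3, 1, 2, 4], {1: 'b', 2: 'a', 3: 'c', 4: 'd'}))
```

Here team 3 buzzes first with a wrong answer and is locked out, then team 1 answers correctly.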

94:  if __name__ == "__main__":  
95:    try:  
96:      main()  
97:    except KeyboardInterrupt:  
98:      print("Goodbye")  
99:      gpio.cleanup()

Lines 94-99 repeat the safeguard pattern we established in our previous article, in quiz4.py on lines 21-25.
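A possible variation on this pattern, sketched here with stand-ins because RPi.GPIO only exists on the Pi, is to move the cleanup into a finally clause so that it also runs when main() ends normally. The cleanup(), fake_main() and run() names are made up for the illustration.

```python
calls = []

def cleanup():
    """Stand-in for gpio.cleanup(); RPi.GPIO only exists on the Pi."""
    calls.append("cleanup")

def fake_main():
    """Stand-in for main(); pretends the player pressed Ctrl-C."""
    raise KeyboardInterrupt

def run(main_function):
    try:
        main_function()
    except KeyboardInterrupt:
        print("Goodbye")
    finally:
        cleanup()   # runs after Ctrl-C and after a normal exit alike

run(fake_main)       # interrupted run
run(lambda: None)    # run that finishes normally
print(calls)
```

In the listing, cleanup only happens after Ctrl-C, which is fine for a game that otherwise runs out of questions and exits.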

There you go, a complete game with Raspberry Pi GPIO hardware integration, that can be played with friends, in less than 100 lines of python code.

Team 2 is smoking! Two correct answers in a row

Conclusion

I've avoided classes and methods on purpose. These would have complicated things too much for the workshop audience (which ranged from teenagers to adults and from some who had never programmed in python, to some who had written a good bit).

Several also had a background in BASIC and shell scripts, so I did a good bit as straight scripts, without functions, waiting to the end to introduce these concepts. I've also added a bit more code than in the workshop in order to provide a better reference after the fact.

I hope it is useful to others, and if you are in the area, make sure to keep an eye out for our next PyHack Workshop. Let me know if I've provided enough details, what things I could have explained more etc.
