Introduction to Cyberinfrastructure and Grid Computing

Spring, 2007

CS 696 Assignment #1: Due 05-Feb-07
(This page last updated on January 25, 2007)


Problem 1:
File I/O testing
  • Write this as a script file and run it from the command line
  • Use a text file (more than 5000 characters).
  • Process the following commands
    • Filter the data and find the frequency at which the ASCII characters ([a-z, A-Z], [0-9]) appear.
    • Store the results in a dictionary.
    • Emulate the Unix command "wc" to get the number of lines, words, and characters.
    • Find the top 50 words (most frequently occurring) on this Web page; print them out in sorted order and in reverse order.
Note: you will reuse these in Problem 3. Use the list, tuple, and dictionary objects in this problem.
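
One possible shape for the Problem 1 pieces is sketched below, assuming Python 3. The function names (`char_frequencies`, `wc`, `top_words`) are illustrative choices, not part of the assignment, and the sketch counts characters rather than bytes, so its third number can differ from the real "wc" on non-ASCII input. It also strips surrounding punctuation from words, which is an assumption about what counts as a "word".

```python
import string
import sys

# The set of characters the assignment asks about: [a-z, A-Z] and [0-9].
ALNUM = set(string.ascii_letters + string.digits)


def char_frequencies(text):
    """Return a dictionary mapping each ASCII letter or digit to its count."""
    freq = {}
    for ch in text:
        if ch in ALNUM:
            freq[ch] = freq.get(ch, 0) + 1
    return freq


def wc(text):
    """Return a (lines, words, chars) tuple, roughly like Unix wc."""
    lines = text.count("\n")    # wc counts newline characters
    words = len(text.split())   # whitespace-separated tokens
    chars = len(text)           # characters, not bytes (assumption)
    return (lines, words, chars)


def top_words(text, n=50):
    """Return a list of (word, count) tuples, most frequent first."""
    counts = {}
    for word in text.lower().split():
        word = word.strip(string.punctuation)
        if word:
            counts[word] = counts.get(word, 0) + 1
    # Sort by descending count, then alphabetically to break ties.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:n]


if __name__ == "__main__" and len(sys.argv) == 2:
    with open(sys.argv[1]) as handle:
        contents = handle.read()
    print(char_frequencies(contents))
    print(wc(contents))
    ranked = top_words(contents)
    print(ranked)                   # sorted order
    print(list(reversed(ranked)))   # reverse order
```

Saved under a hypothetical name such as `fileio.py`, the script would be run as `python fileio.py mytext.txt`.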


Problem 2:   
An ice cream store sells M flavours of ice cream. You want to buy N dishes of ice cream.
  • Write a script to find all the different ways in which you can fulfill your purchase. Use recursion to compute the permutations.
  • Assume:
    • M > N
    • Each dish has one scoop
    • Each combination of dishes is unique
  • Write the script as a module, import it into the interpreter, and run it.
  • Save results to a file
  • Test your code for different values of M, for (N<M, N=M, N>M).
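
The recursive step can be sketched as follows, under one reading of the problem: each flavour is used at most once and the order of the dishes matters. Both are assumptions, and a different reading (for example, unordered combinations) would need a different recursion. The names `permutations` and `save_results` are hypothetical. With this reading, N > M yields an empty list because the flavours run out before N dishes are filled.

```python
def permutations(flavours, n):
    """Return all ordered selections of n distinct items from flavours."""
    if n == 0:
        # One way to choose nothing: the empty selection.
        return [()]
    result = []
    for i, flavour in enumerate(flavours):
        # Remove the chosen flavour so it cannot be picked again.
        rest = flavours[:i] + flavours[i + 1:]
        for tail in permutations(rest, n - 1):
            result.append((flavour,) + tail)
    return result


def save_results(selections, path):
    """Write one selection per line to the file at path."""
    with open(path, "w") as handle:
        for selection in selections:
            handle.write(", ".join(selection) + "\n")
```

If the module were saved as `icecream.py` (a hypothetical file name), it could be tried from the interpreter with `import icecream` followed by `icecream.permutations(["vanilla", "mint", "mango"], 2)`.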

Problem 3: 
The TeraGrid (http://www.teragrid.org) hosts information and resources for the NSF computational science community. The TeraGrid User Portal provides dynamic information about computational and archival systems:
  • Use "HTML Scraping" to dynamically read the contents of this Web page.
  • Filter the data and identify "words" using common "white space" filters (blanks, CR, etc.)
    • You can make the filter case-insensitive.
    • Hint: look for ASCII words like 'the', 'teragrid'.
  • Emulate the Unix command "wc":
    • Find the top 20 words on this Web page.
HINT: Google for the term "HTML Scraping" and the Python modules urllib and popen2
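
A minimal scraping sketch, assuming Python 3, where the page is fetched with `urllib.request` and tags are stripped with the standard `html.parser` module (the `popen2` module named in the hint is not available in Python 3). The names `TextExtractor`, `fetch_page`, `html_to_text`, and `top_ascii_words` are illustrative, and treating only purely alphabetic ASCII tokens as words is an assumption.

```python
import string
from html.parser import HTMLParser
from urllib.request import urlopen


class TextExtractor(HTMLParser):
    """Collect the visible text of a page, skipping script and style."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if not self._skip:
            self.chunks.append(data)


def fetch_page(url):
    """Download a page and decode it, replacing undecodable bytes."""
    with urlopen(url) as response:
        return response.read().decode("utf-8", errors="replace")


def html_to_text(html):
    """Return the text content of an HTML string."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


def top_ascii_words(text, n=20):
    """Return the n most frequent ASCII words, case-insensitively."""
    counts = {}
    for token in text.lower().split():   # split on blanks, CR, LF, tabs
        word = token.strip(string.punctuation)
        if word.isalpha() and word.isascii():
            counts[word] = counts.get(word, 0) + 1
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:n]
```

A run against the live site would look like `top_ascii_words(html_to_text(fetch_page("http://www.teragrid.org")))`. The line, word, and character counts can come from the "wc" emulation written for Problem 1.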



  

Copyright ©, All rights reserved.
2007 SDSU & Mary Thomas, 5500 Campanile Drive, San Diego, CA 92182-7700 USA.
OpenContent license defines the copyright on this document.