The Search for Knowledge

The integrating practical exercise asks you to search for text labels on top of pillars spread across a huge surface. Analyzing the entire surface at once, however, would require too many computing resources and too much time, so you have to split the surface into smaller chunks and analyze each of them individually.

The surface can be explored with the Java sfkscanner application. The application takes five parameters: the four coordinates of the area to be analyzed (lower-left corner x y, upper-right corner x y) and the step (or granularity) to be used. It returns information on the contents of that area. An example of sfkscanner usage and output is shown below.

$ java -jar sfkscanner.jar 4250 2510 4350 2610 1

Scan area: 4250.000000, 2510.000000 - 4350.000000, 2610.000000 step: 1.000000
Exploring(4250.000000, 2510.000000, 4350.000000, 2610.000000, 10000)
No pillar found in this area.
$
$ java -jar sfkscanner.jar 4350 2610 4450 2710 1
Scan area: 4350.000000, 2610.000000 - 4450.000000, 2710.000000 step: 1.000000
Exploring(4350.000000, 2610.000000, 4450.000000, 2710.000000, 10000)
Found area: 4391.000000, 2611.000000 - 4427.000000, 2620.000000
But no plaque can be found. A smaller stepsize may help.
$
$ java -jar sfkscanner.jar 4390 2600 4420 2630 0.05
Scan area: 4390.000000, 2600.000000 - 4420.000000, 2630.000000 step: 0.050000
Exploring(4390.000000, 2600.000000, 4420.000000, 2630.000000, 360000)
Found area: 4390.500000, 2603.500000 - 4419.950000, 2620.350000
   Sub area: 4406.400000, 2611.300000 - 4411.500000, 2613.550000
Label area: 4406.400000, 2611.850000 - 4411.500000, 2613.150000
 
 
###########################   #####################      ############       ###########################
###########################   #####################      ############       ###########################
            ###               ###                                                       ###            
            ###               ###                     ###            ####               ###            
            ###               ###                     ###            ####               ###            
            ###               ###                     ###                               ###            
            ###               ###                     ###                               ###            
            ###               ###                     ###                               ###            
            ###               ###                     ###                               ###            
            ###               ###                     ###                               ###            
            ###               ###                                                       ###            
            ###               ##################         ######                         ###            
            ###               ##################         ######                         ###            
            ###               ###                                                       ###            
            ###               ###                              ######                   ###            
            ###               ###                              ######                   ###            
            ###               ###                                                       ###            
            ###               ###                                    ####               ###            
            ###               ###                                    ####               ###            
            ###               ###                                    ####               ###            
            ###               ###                                    ####               ###            
            ###               ###                                    ####               ###            
            ###               ###                                    ####               ###            
            ###               ###                     ###            ####               ###            
            ###               ###                                                       ###            
            ###               #####################      ############                   ###            
            ###               #####################      ############                   ###   
 

The area to scan is vast, but don't worry: past researchers have left some hints. The hints give a rough idea of where to begin looking for each pillar. Some are not very accurate, however, so you will still need to run the scanner several times to find the data. Iterate through subareas, shifting the starting coordinates to scan the next subarea, until you have searched the whole of the chosen area.
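
One way to iterate through the subareas is to generate the tile coordinates first and then pass them to the scanner one at a time. The following is a minimal sketch, not part of the exercise material: the function name `tiles`, the tile size of 100, and the output file names are assumptions made for illustration, and integer coordinates are assumed.

```shell
#!/bin/sh
# Print one "x1 y1 x2 y2" line per tile covering the rectangle
# (x_min, y_min) - (x_max, y_max). All arguments are integers.
# Usage: tiles x_min y_min x_max y_max tile_size
tiles() {
    # Refuse a non-positive tile size, which would loop forever.
    if [ "$5" -le 0 ]; then
        echo "tile size must be positive" >&2
        return 1
    fi
    t_y=$2
    while [ "$t_y" -lt "$4" ]; do
        t_y2=$((t_y + $5))
        # Clip the last row of tiles to the upper boundary.
        if [ "$t_y2" -gt "$4" ]; then t_y2=$4; fi
        t_x=$1
        while [ "$t_x" -lt "$3" ]; do
            t_x2=$((t_x + $5))
            # Clip the last column of tiles to the right boundary.
            if [ "$t_x2" -gt "$3" ]; then t_x2=$3; fi
            echo "$t_x $t_y $t_x2 $t_y2"
            t_x=$t_x2
        done
        t_y=$t_y2
    done
}

# Example use (commented out so that nothing is scanned by accident):
# tiles 4250 2510 4450 2710 100 | while read -r x1 y1 x2 y2; do
#     java -jar sfkscanner.jar "$x1" "$y1" "$x2" "$y2" 1 > "scan_${x1}_${y1}.txt"
# done
```

With the example arguments the function prints four tiles, two of which are the areas scanned in the sample session above.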

The sfkscanner Application

The sfkscanner application can be downloaded as dciss-integrating-practical.tgz. The archive contains four files:

BoxData.txt

issgc_sfk_nesc.jar

PillarsData2.txt.en

sfkscanner.jar

You need all four files in the same directory to run the scan, so when you submit your job to the Grid, don't forget to include all of them in your job description so that they will be uploaded to the CE.
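
For a gLite-style JDL, a job description that ships all four files could be generated with a small shell function such as the sketch below. The attribute names are standard JDL, but the function name `make_jdl`, the path /usr/bin/java, and the output file names scan.out and scan.err are assumptions; check the EMI User Tutorial for the settings that apply to your site.

```shell
#!/bin/sh
# Print a JDL job description for one scan to standard output.
# Usage: make_jdl x1 y1 x2 y2 step
make_jdl() {
    cat <<EOF
Executable = "/usr/bin/java";
Arguments = "-jar sfkscanner.jar $1 $2 $3 $4 $5";
StdOutput = "scan.out";
StdError = "scan.err";
InputSandbox = {"sfkscanner.jar", "issgc_sfk_nesc.jar", "BoxData.txt", "PillarsData2.txt.en"};
OutputSandbox = {"scan.out", "scan.err"};
EOF
}

# Example use (commented out): write one description per scan area.
# make_jdl 4250 2510 4350 2610 1 > scan_4250_2510.jdl
```

Listing every file in InputSandbox is what causes them to be uploaded alongside the job.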

Instructions

  • Download the sfkscanner application
  • Test the application using the coordinates shown above for the TEST pillar
  • Download the hint file
  • Find the coordinates you want to search from the hint file
  • Create a job description file and submit the sfkscanner application to the EMI grid
  • Collect your results
  • Refer back to the EMI User Tutorial for help on how to do any of these steps!

Hints

The hint file is available for download here.

Untar the file and open the HTML file to see a map of the approximate location of your pillars. You should use this map to get the lower and upper bounds for each scan area.

gLite

To run the exercise using gLite you may need to set up some environment variables and add the library file issgc_sfk_nesc.jar to your classpath. The easiest way to do this is to create a simple wrapper shell script which sets your CLASSPATH environment variable and then executes the Java application with the appropriate arguments. Alternatively, you can set this with the JDL Environment attribute.

Environment = {"CLASSPATH=$PWD/issgc_sfk_nesc.jar:$CLASSPATH"};
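
The wrapper-script approach described above might look like the following sketch. The script name run_sfk.sh and the helper function `sfk_classpath` are hypothetical, and the script assumes that all four application files sit in the job's working directory.

```shell
#!/bin/sh
# run_sfk.sh (hypothetical name): wrapper to be executed on the worker node.
# Usage: ./run_sfk.sh x1 y1 x2 y2 step

# Build a class path with the helper library from directory $1 first,
# followed by any CLASSPATH that is already set.
sfk_classpath() {
    if [ -n "${CLASSPATH:-}" ]; then
        echo "$1/issgc_sfk_nesc.jar:$CLASSPATH"
    else
        echo "$1/issgc_sfk_nesc.jar"
    fi
}

# Run the scanner only when scan arguments were given on the command line.
if [ "$#" -gt 0 ]; then
    CLASSPATH=$(sfk_classpath "$PWD")
    export CLASSPATH
    exec java -jar sfkscanner.jar "$@"
fi
```

Note that `java -jar` normally takes its class path from the JAR manifest rather than from CLASSPATH, so whether this setting has any effect depends on how sfkscanner.jar was packaged.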

You can run your jobs as a job collection to submit multiple jobs at a time.

ARC

Not all of the ARC resources are running a recent version of Java, so it is necessary to narrow down the list of available resources for this exercise.

Edit your ~/.arc/client.conf file and set

[common]
defaultservices=computing:ARC0:testbed8.grid.upjs.sk computing:ARC0:pgs03.grid.upjs.sk
 

This will ensure that your job is submitted to a CE which is capable of running it.

UNICORE

To run sfkscanner with UNICORE you need to specify the Java application and remember to include all of your input files in the job description.

The simplest way to perform the scan is probably to create multiple job description files with different coordinates and to run these as a batch job.

dCache

When you have found a pillar you should capture the output of the sfkscanner.jar application and upload this using dCache to the SE sligo.desy.de. You should put your files into the directory (or more properly, collection) /pnfs/desy.de/data/testers.eu-emi.eu/integrating/.

Your results files should have a name conforming to the format group_X_pillar_Y where X is your group number and Y is the pillar number (based on the order you find the pillars).
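
To avoid typos in the file names, the naming rule can be wrapped in a small helper, sketched below. The function name `result_name` is hypothetical, and the group and pillar numbers in the example are placeholders.

```shell
#!/bin/sh
# Print the result file name for a given group and pillar number.
# Usage: result_name GROUP PILLAR
result_name() {
    # Both arguments must be non-empty strings of digits.
    for n in "$1" "$2"; do
        case "$n" in
            ''|*[!0-9]*)
                echo "usage: result_name GROUP PILLAR" >&2
                return 1
                ;;
        esac
    done
    echo "group_${1}_pillar_${2}"
}

# Example use (commented out): capture the scanner output under that name.
# java -jar sfkscanner.jar 4390 2600 4420 2630 0.05 > "$(result_name 3 1)"
```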

Note: you can use whichever dCache command you prefer to upload the files!

Tips

The scanning process is quite memory intensive, so scanning a large area in one go, or scanning with a very small step size, may cause a Java memory error. It is best to start with a large step size, say 1, and then, once you have found something promising, reduce both the search area and the step size in order to find the data.
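
To judge whether a scan is likely to be too large before submitting it, you can estimate the number of sample points. In the sample output above, the last figure in each Exploring(...) line appears to equal (width / step) × (height / step); this is an inference from the two examples, not documented behaviour. The helper name `sample_count` is hypothetical.

```shell
#!/bin/sh
# Estimate the number of sample points for a scan.
# Usage: sample_count x1 y1 x2 y2 step
sample_count() {
    awk -v x1="$1" -v y1="$2" -v x2="$3" -v y2="$4" -v step="$5" 'BEGIN {
        # Points along x multiplied by points along y, rounded to an integer.
        printf "%.0f\n", ((x2 - x1) / step) * ((y2 - y1) / step)
    }'
}

# Example use (commented out):
# sample_count 4250 2510 4350 2610 1
# sample_count 4390 2600 4420 2630 0.05
```

For the two example calls the function prints 10000 and 360000, the same figures as in the sample session. Halving the step size therefore quadruples the number of points, which is why shrinking the area first matters.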

Some of the words you are looking for could contain spaces, or could be quite long. Make sure when you find something that you have the entire area of interest in your scan.

If the step size is too large, the word that you find may not be readable. Experiment with step sizes to get readable output.

Acknowledgements

The sfkscanner application and the integrating practical exercise were developed by the National e-Science Centre in Edinburgh as part of the International Summer School on Grid Computing (ISSGC) programme.